Feed System Basics
An Experienced Engineer’s Walkthrough for Backend Engineers
A feed (e.g. a news feed or timeline) shows the latest content from the accounts a user follows. When you first design one, the natural question is: “When a user opens the feed, where do we get the data?” There are two extremes:
- Pull: on each feed load, fetch the follow list, fetch recent posts from each followed account, then merge and sort.
- Push: when someone posts, write that post into every follower’s precomputed feed, so opening the feed is just a read of the user’s own list.
Both work at small scale; the trouble starts with celebrities (accounts with millions of followers). Pull causes a read hot spot (every follower’s read touches the celebrity’s content); push causes write amplification (one post triggers millions of writes). So in practice we use a hybrid: push for normal users, pull (or special handling) for celebrities.
This article is written as if I’m sitting next to you: we’ll go through Pull, Push, and Hybrid step by step, with diagrams and a comparison table, and see why the “celebrity problem” drives the design.
Lesson 1: The Two Extremes — Pull vs Push
Pull (Compute on Read)
- On feed load: Fetch the user’s follow list, then pull the latest content from each followed account, aggregate and sort (e.g. by time). The feed is computed at read time.
- Read: One read per followed account (or one per batch of accounts), so latency grows with the size of the follow list. Caching each author’s recent posts helps.
- Write: Simple — just write the post to the author’s stream; no fan-out.
- Hot spot: When a celebrity posts, every follower’s read pulls that post → many repeated reads of the same content. So pull can create a read hot spot for popular accounts.
Push (Fan-out on Write)
- On post: Push the new post into every follower’s feed list (write fan-out). Each user has a precomputed feed (e.g. list of post IDs or references).
- Read: Fast — just read the user’s own feed list (e.g. Redis sorted set, or a dedicated store).
- Write: Expensive for celebrities — one post triggers millions of writes (one per follower). So push can create a write amplification problem for popular accounts.
Lesson 1 Takeaway
- Pull = compute on read; simple write, but read cost scales with follows and can create read hot spots for celebrities.
- Push = precompute on write; fast read, but write amplification for celebrities. The real design challenge is celebrity handling.
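To make the asymmetry concrete, here is a back-of-the-envelope cost model. The function names and the example numbers (200 followers, 5 million followers, 300 follows) are illustrative assumptions, not measurements from any real system.

```python
def pull_reads_per_feed_load(follow_count: int, batch_size: int = 1) -> int:
    """Author lookups needed to build one feed on read (ceiling division for batching)."""
    return -(-follow_count // batch_size)


def push_writes_per_post(follower_count: int) -> int:
    """Feed-list writes triggered by one post under fan-out on write."""
    return follower_count


# Hypothetical accounts: a typical user and a celebrity.
typical_writes = push_writes_per_post(200)
celebrity_writes = push_writes_per_post(5_000_000)

# Hypothetical reader who follows 300 accounts.
reads_unbatched = pull_reads_per_feed_load(300)
reads_batched = pull_reads_per_feed_load(300, batch_size=50)
```

The point of the sketch is only the shape of the costs: pull cost is paid per feed load and grows with follows; push cost is paid per post and grows with followers.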
Lesson 2: Pull — Read Path
- Steps: Get follow list → for each (or batched) author, get recent posts (from cache or storage) → merge and sort → return feed.
- Storage: Follow graph in a relation table or graph DB; content (posts) in object storage + DB or dedicated store; cache per author to reduce read load.
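The read path above can be sketched roughly as follows. This is a minimal in-memory sketch, not a production design: `FOLLOWS`, `POSTS_BY_AUTHOR`, `Post`, and `pull_feed` are hypothetical names, and the dictionaries stand in for the follow graph and the per-author cache or storage.

```python
import heapq
from dataclasses import dataclass
from itertools import islice


@dataclass(frozen=True)
class Post:
    post_id: int
    author: str
    ts: int  # creation time, e.g. epoch seconds


# Hypothetical stand-ins for the follow graph and per-author streams.
# Each author's stream is kept newest-first.
FOLLOWS = {"alice": ["bob", "carol"]}
POSTS_BY_AUTHOR = {
    "bob": [Post(3, "bob", 300), Post(1, "bob", 100)],
    "carol": [Post(2, "carol", 200)],
}


def pull_feed(user: str, limit: int = 20, per_author: int = 20) -> list[Post]:
    """Compute the feed on read: one stream lookup per followed author, then merge."""
    streams = [POSTS_BY_AUTHOR.get(a, [])[:per_author] for a in FOLLOWS.get(user, [])]
    # Streams are already newest-first, so a k-way merge avoids a full re-sort.
    merged = heapq.merge(*streams, key=lambda p: (p.ts, p.post_id), reverse=True)
    return list(islice(merged, limit))
```

Calling `pull_feed("alice")` touches one stream per followed author, which is exactly the O(follows) read cost discussed in this lesson.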
Lesson 2 Takeaway
Pull means O(follows) reads per feed load (fewer round trips if the reads are batched). Cache the follow list and per-author content to reduce load; celebrities still cause many reads of the same content.
Lesson 3: Push — Write Path (Fan-out)
- Steps: Persist post → get follower list → for each follower, append the post (or a reference to it) to that follower’s feed. The feed store might be a Redis sorted set (score = timestamp) or a dedicated per-user feed table.
- Read: Single read of the user’s feed list (e.g. range query on sorted set by score).
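A minimal sketch of the write and read paths above, using plain Python lists in place of a real feed store. All names here (`FOLLOWERS`, `FEEDS`, `publish`, `read_feed`, `MAX_FEED_LEN`) are hypothetical, and the length cap is an assumption added so the lists stay bounded. With a Redis sorted set, the insert would correspond to `ZADD` and the read to a reverse range query.

```python
import bisect
from collections import defaultdict

# Hypothetical in-memory stand-ins; a real system would use a database and a feed store.
FOLLOWERS = {"bob": ["alice", "dave"]}
POSTS = {}                  # post_id -> (author, ts, body)
FEEDS = defaultdict(list)   # user -> list of (ts, post_id), oldest first
MAX_FEED_LEN = 1000         # assumed cap so feed lists do not grow without bound


def publish(post_id: int, author: str, ts: int, body: str) -> int:
    """Persist the post, then fan out a reference to every follower's feed."""
    POSTS[post_id] = (author, ts, body)
    followers = FOLLOWERS.get(author, [])
    for follower in followers:
        feed = FEEDS[follower]
        bisect.insort(feed, (ts, post_id))  # keep the list ordered by timestamp
        if len(feed) > MAX_FEED_LEN:
            del feed[0]                     # drop the oldest entry
    return len(followers)                   # number of feed writes this post caused


def read_feed(user: str, limit: int = 20) -> list[int]:
    """Single read of the user's own list, newest first."""
    feed = FEEDS.get(user, [])
    return [post_id for _, post_id in feed[::-1][:limit]]
```

The return value of `publish` makes the write cost visible: it equals the author's follower count, which is what becomes a problem for celebrities.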
Lesson 3 Takeaway
Push means O(followers) writes per post. Fast read, but for a celebrity with millions of followers, one post triggers millions of writes — so push alone does not scale for celebrities.
Lesson 4: Hybrid — The Practical Choice
- Idea: Use push for “normal” users (small fan-out) and pull (no fan-out) for celebrities above a follower threshold. When a user loads the feed, merge their precomputed feed (push) with posts pulled on demand from the celebrities they follow.
- Variants:
  - Push for recent content (e.g. the last 7 days), pull for older content (hot/cold split).
  - Or: push for small accounts, pull-only for accounts above N followers.
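The follower-threshold variant could look something like this. It is a sketch under stated assumptions: `CELEBRITY_THRESHOLD` and its value are made up for illustration, and the dictionaries are hypothetical stand-ins for the follow graph, the precomputed feed store, and the per-author streams.

```python
import heapq
from itertools import islice

CELEBRITY_THRESHOLD = 10_000  # assumed cutoff; the right value depends on the system

# Hypothetical in-memory state. Entries are (ts, post_id), newest first.
FOLLOWER_COUNT = {"bob": 120, "star": 2_000_000}
FOLLOWS = {"alice": ["bob", "star"]}
PRECOMPUTED_FEED = {"alice": [(300, 3), (100, 1)]}  # filled by fan-out from bob
AUTHOR_STREAM = {"star": [(250, 9), (50, 8)]}       # read on demand, never fanned out


def is_celebrity(author: str) -> bool:
    return FOLLOWER_COUNT.get(author, 0) >= CELEBRITY_THRESHOLD


def should_fan_out(author: str) -> bool:
    """Write path: only posts from non-celebrities are pushed to follower feeds."""
    return not is_celebrity(author)


def hybrid_feed(user: str, limit: int = 20) -> list[int]:
    """Read path: merge the pushed feed with streams pulled from followed celebrities."""
    sources = [PRECOMPUTED_FEED.get(user, [])]
    sources += [AUTHOR_STREAM.get(a, []) for a in FOLLOWS.get(user, []) if is_celebrity(a)]
    merged = heapq.merge(*sources, reverse=True)  # every source is newest-first
    return [post_id for _, post_id in islice(merged, limit)]
```

The read now costs one feed read plus one stream read per followed celebrity, which is usually a small number, while the celebrity's post costs a single write.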
Lesson 4 Takeaway
Hybrid = push for normal users, pull (or no fan-out) for celebrities. Balances read and write load and is the usual approach in production systems.
Lesson 5: Comparison and Storage
Strategy Comparison Table
| Pattern | Read | Write | Best for |
|---|---|---|---|
| Pull | Aggregate on read; can be slow | Simple write | Few follows, high real-time need |
| Push | Fast; read own list | Write fan-out; hard for celebrities | Small follower counts, few celebrities |
| Hybrid | Mixed (precomputed + on-demand) | Mixed (fan-out for non-celebrities) | Mix of normal accounts and celebrities with large follower counts |
Storage (Short)
- Feed list per user: Redis sorted set (score = timestamp), or MongoDB, or dedicated feed store.
- Follow graph: Relation table (user_id, follower_id) or graph DB.
- Content: Object storage for media + DB (or store) for metadata and references.
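As a rough illustration of the relation-table option for the follow graph, here is a sketch using SQLite from the Python standard library. The interpretation of the columns (`user_id` is the followed account, `follower_id` is the account following it), the index name, and the helper functions are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE follows (
        user_id     INTEGER NOT NULL,   -- the account being followed (assumed meaning)
        follower_id INTEGER NOT NULL,   -- the account that follows it
        PRIMARY KEY (user_id, follower_id)
    );
    -- Reverse index so "whom does X follow?" is also an index lookup.
    CREATE INDEX idx_follows_by_follower ON follows (follower_id, user_id);
""")
conn.executemany(
    "INSERT INTO follows (user_id, follower_id) VALUES (?, ?)",
    [(1, 10), (1, 11), (2, 10)],
)


def followers_of(user_id: int) -> list[int]:
    """Needed by push: who receives this author's post?"""
    rows = conn.execute(
        "SELECT follower_id FROM follows WHERE user_id = ? ORDER BY follower_id",
        (user_id,),
    )
    return [r[0] for r in rows]


def followed_by(follower_id: int) -> list[int]:
    """Needed by pull: whose posts does this reader need?"""
    rows = conn.execute(
        "SELECT user_id FROM follows WHERE follower_id = ? ORDER BY user_id",
        (follower_id,),
    )
    return [r[0] for r in rows]
```

Both directions are needed in a hybrid design, which is why the sketch indexes the table both ways.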
Ordering and Pagination
- Order: Usually reverse chronological (newest first).
- Pagination: Cursor-based (e.g. last_id, last_timestamp) to avoid deep offset scans and pages that shift (duplicated or skipped items) when new posts arrive between requests.
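Here is a small sketch of cursor-based pagination over a newest-first feed. `FEED`, `Cursor`, and `feed_page` are hypothetical names; the linear scan is for illustration only, and a real store would answer the same question with an indexed range query.

```python
from typing import Optional

# Hypothetical feed entries as (ts, post_id), newest first. Two posts share ts=200.
FEED = [(300, 7), (200, 6), (200, 5), (100, 2), (50, 1)]

Cursor = tuple[int, int]  # (last_timestamp, last_id) of the final item on the previous page


def feed_page(cursor: Optional[Cursor], page_size: int) -> tuple[list[Cursor], Optional[Cursor]]:
    """Return one page plus the cursor for the next page (None when exhausted)."""
    # Keep only entries strictly older than the cursor. Comparing the (ts, id) pair
    # breaks ties between posts that share a timestamp.
    candidates = [e for e in FEED if cursor is None or e < cursor]
    page = candidates[:page_size]
    has_more = len(candidates) > page_size
    return page, (page[-1] if has_more else None)
```

Because the cursor names a position in the ordering rather than an offset, a post that arrives at the head of the feed between two requests does not shift the contents of later pages.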
Consistency
- The feed can be eventually consistent: fan-out can run asynchronously after the post is persisted, so a follower may not see a new post until their next refresh.
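A minimal sketch of what asynchronous fan-out means in practice. The `deque` stands in for a message queue, and `create_post` and `run_fanout_worker` are hypothetical names; in a real system the worker would run in a separate process.

```python
from collections import defaultdict, deque

FOLLOWERS = {"bob": ["alice"]}  # hypothetical follow graph
POSTS = {}                      # post_id -> (author, ts)
FEEDS = defaultdict(list)       # user -> list of post_ids, newest first
FANOUT_QUEUE = deque()          # stand-in for a message queue


def create_post(post_id: int, author: str, ts: int) -> None:
    """Synchronous part: persist the post, enqueue the fan-out job, return."""
    POSTS[post_id] = (author, ts)
    FANOUT_QUEUE.append(post_id)


def run_fanout_worker() -> int:
    """Asynchronous part: drain the queue and update follower feeds."""
    processed = 0
    while FANOUT_QUEUE:
        post_id = FANOUT_QUEUE.popleft()
        author, _ = POSTS[post_id]
        for follower in FOLLOWERS.get(author, []):
            # Prepending assumes jobs are processed in creation order.
            FEEDS[follower].insert(0, post_id)
        processed += 1
    return processed
```

Between `create_post` returning and the worker running, the post exists but is not yet in any follower's feed. That window is the eventual consistency described above.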
Lesson 5 Takeaway
Handle celebrities separately (hybrid or pull-only). Cold/hot split: recent feed from cache/store; older on demand. Cursor-based pagination for stable, efficient listing.
Key Rules (Summary)
- Handle celebrities separately: Hybrid or pull-only to avoid write amplification.
- Hot/cold split: Serve the recent feed from cache; load older entries from storage on demand.
- Consistency: The feed can be eventually consistent; fan-out runs asynchronously, and followers see new posts on their next refresh.
What's Next
See Pagination, Redis ZSet, Cache Strategy. See High Concurrency Toolkit for high-concurrency design.