Feed System Basics

An Experienced Engineer’s Walkthrough for Backend Engineers

A feed (e.g. a news feed or timeline) shows the latest content from the accounts a user follows. When you first design one, the natural question is: “When a user opens the feed, where do we get the data?”

There are two extremes:

  • Pull: on each feed load, fetch the follow list, fetch recent posts from each followed account, then merge and sort.
  • Push: when someone posts, write that post into every follower’s precomputed feed, so opening the feed is just a read of that list.

Both work at small scale. The trouble starts with celebrities (accounts with millions of followers). Pull causes a read hot spot, because every follower’s feed load touches the celebrity’s content. Push causes write amplification, because one post triggers millions of writes. So in practice we use a hybrid: push for normal users, pull (or special handling) for celebrities.

This article is written as if I’m sitting next to you. We’ll go through Pull, Push, and Hybrid step by step, with diagrams and a comparison table, and see why the “celebrity problem” drives the design.


Lesson 1: The Two Extremes — Pull vs Push

Pull (Compute on Read)

  • On feed load: Fetch the user’s follow list, then pull the latest content from each followed account, aggregate and sort (e.g. by time). The feed is computed at read time.
  • Read: Many reads (one per follow, or batched); can be slow if the follow list is large. Caching per author helps.
  • Write: Simple — just write the post to the author’s stream; no fan-out.
  • Hot spot: When a celebrity posts, every follower’s read pulls that post → many repeated reads of the same content. So pull can create a read hot spot for popular accounts.

Push (Fan-out on Write)

  • On post: Push the new post into every follower’s feed list (write fan-out). Each user has a precomputed feed (e.g. list of post IDs or references).
  • Read: Fast — just read the user’s own feed list (e.g. Redis sorted set, or a dedicated store).
  • Write: Expensive for celebrities — one post triggers millions of writes (one per follower). So push can create a write amplification problem for popular accounts.

Lesson 1 Takeaway

  • Pull = compute on read; simple write, but read cost scales with follows and can create read hot spots for celebrities.
  • Push = precompute on write; fast read, but write amplification for celebrities. The real design challenge is celebrity handling.
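
To make the trade-off concrete, here is a back-of-envelope sketch in Python. The function names and all the numbers are illustrative assumptions, not measurements from any real system.

```python
def pull_reads_per_load(num_follows: int, batch_size: int = 1) -> int:
    """Author lookups needed to build one feed under pull (ceiling division)."""
    return -(-num_follows // batch_size)


def push_writes_per_post(num_followers: int) -> int:
    """Feed-list writes triggered by one post under push (one per follower)."""
    return num_followers


# A normal user (hypothetical): follows 200 accounts, has 300 followers.
print(pull_reads_per_load(200))         # 200 lookups per feed load
print(pull_reads_per_load(200, 50))     # 4 batched lookups
print(push_writes_per_post(300))        # 300 writes per post

# A celebrity with 5 million followers (hypothetical figure).
print(push_writes_per_post(5_000_000))  # 5000000 writes for a single post
```

The asymmetry is the point: pull cost grows with how many accounts a reader follows, while push cost grows with how many followers the author has.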

Lesson 2: Pull — Read Path

  • Steps: Get follow list → for each (or batched) author, get recent posts (from cache or storage) → merge and sort → return feed.
  • Storage: Follow graph in a relation table or graph DB; content (posts) in object storage + DB or dedicated store; cache per author to reduce read load.
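
The read path above can be sketched as follows. This is a minimal in-memory illustration: `FOLLOWS`, `POSTS_BY_AUTHOR`, `Post`, and `pull_feed` are hypothetical names standing in for the follow graph, the per-author content store, and the feed endpoint. It assumes each author's stream is already stored newest first.

```python
import heapq
from dataclasses import dataclass
from itertools import islice


@dataclass(frozen=True)
class Post:
    post_id: int
    author_id: str
    created_at: int  # Unix timestamp, seconds


# Hypothetical in-memory stand-ins for the follow graph and per-author streams.
FOLLOWS = {"alice": ["bob", "carol"]}
POSTS_BY_AUTHOR = {
    # Each stream is stored newest first.
    "bob": [Post(3, "bob", 300), Post(1, "bob", 100)],
    "carol": [Post(2, "carol", 200)],
}


def pull_feed(user_id: str, limit: int = 20) -> list[Post]:
    """Build the feed at read time: one stream lookup per followed author."""
    streams = [POSTS_BY_AUTHOR.get(a, []) for a in FOLLOWS.get(user_id, [])]
    # Each stream is already sorted newest first, so a k-way merge is enough.
    merged = heapq.merge(*streams, key=lambda p: p.created_at, reverse=True)
    return list(islice(merged, limit))


feed = pull_feed("alice")
print([p.post_id for p in feed])  # [3, 2, 1]
```

Because the merge is lazy, only as many posts as `limit` asks for are consumed from the streams. The number of stream lookups still grows with the size of the follow list.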

Lesson 2 Takeaway

Pull costs O(follows) reads per feed load (fewer round trips if the reads are batched). Cache the follow list and per-author content to reduce load; celebrities still cause many reads of the same content.
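
A per-author cache might look like the sketch below. `AuthorCache` is a hypothetical, deliberately tiny TTL cache; a production system would more likely use a shared cache such as Redis or Memcached. The `storage_reads` counter is there only to show how many reads fall through to storage.

```python
import time


class AuthorCache:
    """Tiny TTL cache for per-author recent posts (illustrative only)."""

    def __init__(self, loader, ttl_seconds: float = 60.0):
        self._loader = loader      # function: author_id -> list of posts
        self._ttl = ttl_seconds
        self._entries = {}         # author_id -> (expires_at, posts)
        self.storage_reads = 0     # how often we fell through to storage

    def get(self, author_id):
        now = time.monotonic()
        entry = self._entries.get(author_id)
        if entry is not None and entry[0] > now:
            return entry[1]        # cache hit
        self.storage_reads += 1
        posts = self._loader(author_id)
        self._entries[author_id] = (now + self._ttl, posts)
        return posts


cache = AuthorCache(loader=lambda author_id: [f"{author_id}-post-1"])
for _ in range(1000):              # 1000 followers loading their feeds
    cache.get("celebrity")
print(cache.storage_reads)         # 1
```

The cache absorbs the storage load, but the cache entry for the celebrity is still read once per follower, so the hot spot moves rather than disappears.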


Lesson 3: Push — Write Path (Fan-out)

  • Steps: Persist post → get follower list → for each follower, append the post (or a reference to it) to that follower’s feed. The feed store might be a Redis sorted set (score = timestamp) or a dedicated per-user feed table.
  • Read: Single read of the user’s feed list (e.g. range query on sorted set by score).
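
The write and read paths above can be sketched as follows. To stay runnable without a Redis server, this uses a sorted Python list per user as a stand-in for a sorted set. `FOLLOWERS`, `POSTS`, `FEEDS`, `publish`, and `read_feed` are hypothetical names.

```python
import bisect
from collections import defaultdict

# Hypothetical in-memory stand-ins. In production the feed store might be a
# Redis sorted set per user with score = timestamp.
FOLLOWERS = {"bob": ["alice", "dave"], "carol": ["alice"]}
POSTS = {}                   # post_id -> (author_id, text)
FEEDS = defaultdict(list)    # user_id -> ascending [(timestamp, post_id)]


def publish(author_id: str, post_id: int, text: str, timestamp: int) -> int:
    """Persist the post, then fan it out. Returns the number of feed writes."""
    POSTS[post_id] = (author_id, text)
    followers = FOLLOWERS.get(author_id, [])
    for follower_id in followers:
        # Store a reference (timestamp, post_id), not a copy of the post.
        bisect.insort(FEEDS[follower_id], (timestamp, post_id))
    return len(followers)


def read_feed(user_id: str, limit: int = 20) -> list[int]:
    """Single read of the user's own list, newest first."""
    if limit <= 0:
        return []
    newest_first = reversed(FEEDS[user_id][-limit:])
    return [post_id for _, post_id in newest_first]


publish("bob", 1, "hello", timestamp=100)
publish("carol", 2, "hi", timestamp=200)
print(read_feed("alice"))  # [2, 1]
print(read_feed("dave"))   # [1]
```

The return value of `publish` makes the cost visible: it equals the author's follower count, which is exactly the number that explodes for celebrities.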

Lesson 3 Takeaway

Push means O(followers) writes per post. Fast read, but for a celebrity with millions of followers, one post triggers millions of writes — so push alone does not scale for celebrities.


Lesson 4: Hybrid — The Practical Choice

  • Idea: Use push for “normal” users (small fan-out) and pull (or no fan-out) for celebrities above a follower threshold. When a user loads the feed, merge the precomputed feed (push) with an on-demand pull for the celebrities they follow.
  • Variants:
    • Push for recent content (e.g. the last 7 days), pull for older content (hot/cold split).
    • Or: push for small accounts, pull-only for accounts above N followers.
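
A minimal sketch of the threshold variant, assuming feed entries are `(timestamp, post_id)` pairs stored newest first. `CELEBRITY_THRESHOLD` and every name and number below are hypothetical; the right cutoff depends on the system.

```python
import heapq
from itertools import islice

CELEBRITY_THRESHOLD = 10_000  # hypothetical cutoff; tune per system

FOLLOWER_COUNT = {"bob": 120, "star": 2_000_000}
FOLLOWS = {"alice": ["bob", "star"]}

# Precomputed feed (filled by push) and per-author streams, newest first.
# Entries are (timestamp, post_id).
PUSHED_FEED = {"alice": [(300, 11), (100, 10)]}
AUTHOR_STREAM = {"star": [(250, 21), (50, 20)]}


def is_celebrity(author_id: str) -> bool:
    """The write path would skip fan-out for authors where this is True."""
    return FOLLOWER_COUNT.get(author_id, 0) >= CELEBRITY_THRESHOLD


def hybrid_feed(user_id: str, limit: int = 20) -> list[int]:
    """Merge the pushed feed with on-demand pulls for followed celebrities."""
    sources = [PUSHED_FEED.get(user_id, [])]
    for author_id in FOLLOWS.get(user_id, []):
        if is_celebrity(author_id):
            sources.append(AUTHOR_STREAM.get(author_id, []))
    merged = heapq.merge(*sources, reverse=True)  # all inputs are newest first
    return [post_id for _, post_id in islice(merged, limit)]


print(hybrid_feed("alice"))  # [11, 21, 10, 20]
```

The read does one lookup for the pushed feed plus one per followed celebrity, which is usually a small number even for users who follow many accounts.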

Lesson 4 Takeaway

Hybrid = push for normal users, pull (or no fan-out) for celebrities. Balances read and write load and is the usual approach in production systems.


Lesson 5: Comparison and Storage

Strategy Comparison Table

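Pattern | Read                            | Write                               | Best for
--------|---------------------------------|-------------------------------------|---------------------------------------
Pull    | Aggregate on read; can be slow  | Simple write                        | Few follows, high real-time need
Push    | Fast; read own list             | Write fan-out; hard for celebrities | Small follower counts, few celebrities
Hybrid  | Mixed (precomputed + on-demand) | Mixed (fan-out for non-celebrities) | Many celebrities, many followers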

Storage (Short)

  • Feed list per user: Redis sorted set (score = timestamp), or MongoDB, or dedicated feed store.
  • Follow graph: Relation table (user_id, follower_id) or graph DB.
  • Content: Object storage for media + DB (or store) for metadata and references.

Ordering and Pagination

  • Order: Usually reverse chronological (newest first).
  • Pagination: Cursor-based (e.g. last_id, last_timestamp) to avoid the cost of deep offset pagination and pages that shift when new posts arrive.
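
Cursor-based pagination over a newest-first feed can be sketched like this. The cursor is the `(last_timestamp, last_id)` pair of the last entry returned. `FEED` and `get_page` are hypothetical names, and the linear scan is for illustration only; a real store would answer this with an indexed range query.

```python
from typing import Optional

# Feed entries as (timestamp, post_id), sorted newest first. Two posts share
# timestamp 200, which is why the cursor carries post_id as a tie-breaker.
FEED = [(300, 7), (200, 6), (200, 5), (100, 4), (50, 3)]

Cursor = tuple[int, int]  # (last_timestamp, last_id)


def get_page(
    cursor: Optional[Cursor], page_size: int
) -> tuple[list[Cursor], Optional[Cursor]]:
    """Return entries strictly older than the cursor, plus the next cursor."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    if cursor is None:
        older = FEED
    else:
        # Tuple comparison orders by timestamp first, then post_id.
        older = [entry for entry in FEED if entry < cursor]
    page = older[:page_size]
    next_cursor = page[-1] if len(older) > page_size else None
    return page, next_cursor


page1, cur = get_page(None, 2)
print(page1, cur)   # [(300, 7), (200, 6)] (200, 6)
page2, cur = get_page(cur, 2)
print(page2, cur)   # [(200, 5), (100, 4)] (100, 4)
page3, cur = get_page(cur, 2)
print(page3, cur)   # [(50, 3)] None
```

Because each page is defined relative to the last entry seen rather than a numeric offset, a new post arriving at the top of the feed does not shift or duplicate entries on later pages.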

Consistency

  • The feed can be eventually consistent: fan-out can run asynchronously after the post is persisted, so a follower may not see the new post until the next refresh.
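
Asynchronous fan-out might look like the sketch below. It uses an in-process `queue.Queue` and a worker thread as a stand-in for a durable message queue and a pool of fan-out workers. `FOLLOWERS`, `FEEDS`, `create_post`, and `fanout_worker` are hypothetical names.

```python
import queue
import threading
from collections import defaultdict

FOLLOWERS = {"bob": ["alice", "dave"]}
FEEDS = defaultdict(list)        # user_id -> [post_id], newest last
fanout_queue = queue.Queue()     # stand-in for a durable message queue


def create_post(author_id: str, post_id: int) -> None:
    """Request path: persist (omitted here), enqueue, return immediately."""
    fanout_queue.put((author_id, post_id))


def fanout_worker() -> None:
    """Background path: drain the queue and write follower feeds."""
    while True:
        job = fanout_queue.get()
        if job is None:          # sentinel: shut down
            fanout_queue.task_done()
            break
        author_id, post_id = job
        for follower_id in FOLLOWERS.get(author_id, []):
            FEEDS[follower_id].append(post_id)
        fanout_queue.task_done()


worker = threading.Thread(target=fanout_worker, daemon=True)
worker.start()

create_post("bob", 42)           # returns before followers' feeds are updated
fanout_queue.join()              # only for the demo: wait for the worker
print(FEEDS["alice"])            # [42]

fanout_queue.put(None)
worker.join()
```

Between `create_post` returning and the worker finishing, a follower who loads the feed would not see post 42 yet. That window is the eventual consistency described above.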

Lesson 5 Takeaway

Handle celebrities separately (hybrid or pull-only). Cold/hot split: recent feed from cache/store; older on demand. Cursor-based pagination for stable, efficient listing.


Key Rules (Summary)

  • Handle celebrities separately: Hybrid or pull-only to avoid write amplification.
  • Cold/hot split: Recent feed from cache; older from storage; load on demand.
  • Consistency: The feed can be eventually consistent; fan-out runs asynchronously, and followers see new posts on the next refresh.

What's Next

See Pagination, Redis ZSet, Cache Strategy. See High Concurrency Toolkit for high-concurrency design.