Case 05 · FeedStream: social feed and video content distribution

The thesis in one line: this case drills amplification in content distribution — the hard part is not storing one post, but giving millions of users their own screen of content while handling creators with huge audiences, viral video, search, recommendation, and CDN cost.

🧪 Case track, case 5 · This case drills one thing
Drill architectural judgment for social feeds + content search + video distribution: when to push, when to pull, how to handle large creators and viral content, why video needs transcoding and CDN, and how search / recommendation / moderation avoid blocking the main path.
Important reminder: this is a teaching case, not any social platform's internal blueprint. The numbers are for order-of-magnitude reasoning. The goal is judgment, not a single correct answer.

After reading you should be able to	How this case trains it
Explain why Feed is not a simple post list query	Use read/write asymmetry and personalized home timelines
Decide between push, pull, and hybrid Feed	Use push for ordinary creators and pull for large creators
See why images / videos should not come from the business origin	Use transcoding, segments, CDN, prewarming, and origin protection
Put search, recommendation, and moderation into content distribution	Use recall / ranking / moderation recall and degradation

Opening: why Feed is not "query posts from people I follow"

Because the real product is not the publish button. It is the screen of content users see every time they open the app.

FeedStream is a text, image, and short-video community. Users follow creators, publish posts or videos, like, comment, search, and see recommendations. Ordinary creators have hundreds of followers. Top creators have millions. When a viral video launches, hundreds of thousands of viewers may open it within minutes.

At first glance, it looks like an ordinary content system:

users publish posts;
followers read home feeds;
people like and comment;
users search keywords;
creators upload videos;
the system recommends content.

But after launch, the real incidents are not "we cannot store posts." They are:

Home refresh is slow, one top creator post overloads fanout queues, viral video cache misses crush the origin, and unsafe content has already been distributed into millions of timelines.

So this chapter is not about "how to make a posts table." It asks a sharper question:

How do you distribute massive content to different people while controlling hotspots, ranking, media bandwidth, and content safety?

The pressure source is different from the previous cases:

StarArena fears inventory and payment state.
PatchDesk fears multi-tenant boundaries.
DocuMind fears answer trust.
SyncRoom fears realtime state convergence.
FeedStream fears one write being amplified into massive reads, massive distribution, and massive bandwidth.

Mini glossary before reading

This chapter repeats a few terms. Here they are in plain language:

Term	Plain-language meaning
Feed	A stream of content shown on the user's home screen.
Timeline	An ordered list of content, such as a creator profile timeline or a home timeline.
Fan-out	One item is distributed to many recipients.
Push model / fan-out on write	When content is published, write its ID into followers' inboxes. Reading becomes fast.
Pull model / fan-out on read	Publish once, then pull followed creators' content when the user refreshes.
Hybrid model	Ordinary creators use push; large creators use pull. This avoids huge write fanout.
Large creator	A creator with many followers. One post can create extreme fanout.
Inbox	A user's Feed candidate list. It usually stores content IDs, not full post bodies.
Recall	First fetch a rough candidate set from massive content.
Ranking / fine ranking	Score candidates and decide what appears first.
CDN	Content Delivery Network. Caches static content such as images and videos near users.
Origin fetch	CDN misses cache and has to fetch from origin storage. This is usually slower and more expensive.
Transcoding	Process one source video into multiple quality / bitrate versions.
ABR	Adaptive Bitrate. The player switches quality based on the current network.
Moderation recall	After content is judged unsafe, remove it not only from the post page, but also from Feed, search, recommendation, and caches.

1. Starting point: validate supply and consumption before building a recommendation empire

FeedStream version one has a simple goal: users can follow creators, see followed content, and publish text, images, and short videos.

The starting constraints look roughly like this:

Dimension	Starting phase
Daily active users	Fewer than 10,000
Creators	1,000-5,000
Daily content	5,000-20,000 items
Video share	10%-20%
Peak Feed reads	200-500 QPS
Team size	5-8 engineers
Core goal	Prove people publish, read, and interact
Must not fail	Home Feed cannot stay empty or slow; unsafe content cannot spread without control

The right architecture at this point is not a full recommendation platform. It is a content service + follow graph + simple Feed read path + object storage / CDN:

User publishes content
  │
  ▼
┌────────────────────────────────────────────┐
│ Content service                             │
│ writes post body, media URLs, author,       │
│ visibility, moderation state                │
└──────────────┬─────────────────────────────┘
               ▼
       ┌──────────────┐
       │ Content store │
       │ post / media  │
       └──────────────┘

User refreshes home
  │
  ▼
┌────────────────────────────────────────────┐
│ Feed read service                           │
│ load follow list → fetch recent content     │
│ → simple ranking → return                   │
└────────────────────────────────────────────┘

Images / videos → object storage → CDN

This is not under-designed. Without large creators or massive reads, the pull model is simple and good enough to validate the product.

2. Quantified assumptions: writes will not kill it first; reads and hotspots will

Run the numbers. Suppose FeedStream has grown for half a year:

Daily active users: 2,000,000
Monthly active users: 10,000,000
Creators: 500,000
New content per day: 1,000,000
Video content: 200,000 per day
Peak publishing: 2,000-5,000 per second
Peak Feed refresh: 50,000-150,000 QPS
Ordinary user followers: 100-2,000
Mid-tier creator followers: 10,000-200,000
Top creator followers: 1,000,000-20,000,000
Viral video: 500,000 plays in 10 minutes
Target: home refresh P95 < 500ms, publish-to-visible usually < 10s
Video target: first frame < 2s, rebuffering near 0, CDN hit rate > 95%
Moderation target: high-risk content reviewed before broad distribution; unsafe content removed from major surfaces within 5 minutes
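
The numbers above can be turned into a quick back-of-envelope check. This is a minimal sketch that uses the upper bounds from the list; `ASSUMED_FANOUT_WRITES_PER_S` is a hypothetical throughput figure that does not come from the case, and the CDN hit rate is simply the stated target treated as achieved.

```python
# Back-of-envelope arithmetic using the scenario numbers above.
PEAK_FEED_QPS = 150_000            # upper bound of peak Feed refresh
PEAK_PUBLISH_PER_S = 5_000         # upper bound of peak publishing
TOP_CREATOR_FOLLOWERS = 20_000_000
VIRAL_PLAYS = 500_000              # plays within the viral window
VIRAL_WINDOW_S = 10 * 60
CDN_HIT_RATE = 0.95                # stated target, treated as achieved

# Hypothetical: sustained inbox write throughput of the fanout workers.
ASSUMED_FANOUT_WRITES_PER_S = 100_000


def read_write_ratio() -> float:
    """Peak Feed refreshes per peak publish."""
    return PEAK_FEED_QPS / PEAK_PUBLISH_PER_S


def push_fanout_seconds(followers: int) -> float:
    """Time to push one post into every follower inbox at the assumed rate."""
    return followers / ASSUMED_FANOUT_WRITES_PER_S


def origin_plays_per_second() -> float:
    """Play starts per second that miss the CDN and reach origin."""
    plays_per_s = VIRAL_PLAYS / VIRAL_WINDOW_S
    return plays_per_s * (1 - CDN_HIT_RATE)
```

Under these assumptions, refreshes outnumber publishes 30 to 1 even at peak, pushing one post to 20,000,000 followers takes about 200 seconds (far beyond the 10-second publish-to-visible target), and a viral video still sends roughly 42 play starts per second to origin at a 95% hit rate.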

At this scale, writing content is not the largest problem. The dangerous parts are:

Reads massively outnumber writes: one post can be read thousands or even millions of times.
Follower distribution is extremely skewed: ordinary creators have hundreds of followers, top creators have millions.
Video bandwidth is real cost: every extra minute watched costs CDN and bandwidth money.
Unsafe content gets amplified: stronger Feed and recommendation make spread faster, and recall harder.

So FeedStream's architectural center of gravity is not "how to store posts." It is:

Make personalized reads fast, hotspot distribution controlled, media delivery edge-based, and moderation recall a system capability.

3. Trigger signals: when version one starts to be insufficient

Once version one is running, do not upgrade by feeling. Watch these signals:

Signal	What it looks like	Why this is architectural
Home refresh slows down	Users follow 1,000 people and each refresh aggregates too many sources	Pure pull creates read amplification
Large creator post backs up queues	One item must enter millions of follower inboxes	Pure push hits fanout explosion
Hot content body reads spike	Viral posts repeatedly fetch the same content body	Missing hot content cache
Video origin fetch rises	A new viral video misses CDN and origin bandwidth spikes	Missing prewarming / origin protection
New content is not searchable	Content stays invisible in search for minutes	Index freshness is insufficient
Recommendation quality swings	Users see duplicates, stale items, or low-quality content	Recall / ranking / dedupe chain is unstable
Unsafe content is hard to recall	Post is deleted, but Feed, search, or cache still shows it	Moderation state does not control every distribution surface
Like count drifts	Cached counters disagree with interaction logs	Derived counters lack a source of truth

These signals are not saying "the database is too small." They are saying: content distribution amplifies everything, including hotspots, cost, and mistakes.

4. Core tension: every user needs a different home page

FeedStream has four groups of core objects:

Content / media / author: what was published, where media lives, and moderation state.
Follow graph / visibility: who follows whom, who can see this content, blocks and privacy rules.
Timeline / recommendation candidates / ranking result: where the home screen comes from and why it is ordered this way.
Search / moderation / interaction signals: whether content can be searched, whether it can keep distributing, and how users respond.

If you look only at the simplest path, it feels like this:

User refreshes home → load people I follow → load their recent posts → rank and return

A real system must answer at every step:

If I follow many people, can every refresh query all of them?
When a top creator posts, should it be written to every follower inbox?
Should timeline entries store full text or only content IDs?
Can ranking finish within 500ms?
Where does video playback come from? What happens on CDN miss?
If content is later marked unsafe, how is it removed from timelines, search, recommendation, and CDN?

The new architectural statement becomes:

Feed is a personalized content distribution engine, not a posts table.

The key layers are:

Content authority layer: post body, media, moderation state, visibility
Distribution layer: timeline inbox, push / pull / hybrid fanout
Ranking layer: candidate recall, coarse ranking, fine ranking, dedupe, diversity
Media layer: transcoding, object storage, CDN, prewarming, origin protection
Safety layer: moderation, takedown, recall, abuse control, visibility filtering

5. Solution reasoning: should Feed push or pull?

This is the most important decision in the case. The core Feed question is: should home timelines be computed ahead of time, or when users refresh?

Option A: pure pull, compute on refresh

User refreshes
  └─▶ load everyone I follow
      └─▶ load their recent content
          └─▶ merge and rank

Benefit	Cost
Publishing is light; one write only	Refresh becomes expensive as follow count grows
Follow / unfollow takes effect immediately	High Feed QPS makes the read path hard to scale
Simple for MVP	Hard to keep P95 < 500ms

Option B: pure push, write into follower inboxes

User publishes
  └─▶ load follower list
      └─▶ write content ID into every follower Feed inbox

Benefit	Cost
Home reads are very fast: read own inbox	A top creator post creates millions of writes
Fits read-heavy systems	Fanout queues and timeline storage get heavy
Can precompute part of ranking	Follow graph changes need correction

Option C: hybrid push / pull

Ordinary creator publishes → push into follower inboxes
Top creator publishes       → write only to creator timeline
User refreshes home         → read inbox + pull followed top creators + rank merge

Benefit	Cost
Ordinary content reads fast	Feed read service merges multiple sources
Avoids top creator fanout explosion	Top creator content becomes a read hotspot and needs cache
Separates by follower threshold	Thresholds and mid-tier creator behavior need tuning

For the growth phase, FeedStream chooses hybrid push / pull. Ordinary creators fan out on write; top creators are pulled at read time; both paths merge in the Feed read service.

The key is not "push is better" or "pull is better." The key is:

Follower distribution is extremely skewed, so the architecture must split cases. Optimize for the cheap 99%, and isolate the explosive 1%.
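
The hybrid split can be sketched in a few lines. This is an illustration under stated assumptions, not the case's implementation: `PUSH_FOLLOWER_LIMIT`, `Post`, `fanout_strategy`, and `read_home_feed` are hypothetical names, the threshold value is arbitrary, and newest-first ordering stands in for the real ranking step.

```python
from dataclasses import dataclass

# Hypothetical threshold; the case notes that it needs tuning.
PUSH_FOLLOWER_LIMIT = 100_000


@dataclass(frozen=True)
class Post:
    post_id: str
    author_id: str
    created_at: int  # Unix seconds


def fanout_strategy(follower_count: int) -> str:
    """Ordinary creators push on write; top creators are pulled on read."""
    return "push" if follower_count < PUSH_FOLLOWER_LIMIT else "pull"


def read_home_feed(inbox: list[Post],
                   followed_top_creators: list[str],
                   author_timelines: dict[str, list[Post]],
                   limit: int = 20) -> list[Post]:
    """Merge pushed inbox entries with pulled top creator posts."""
    candidates = {p.post_id: p for p in inbox}  # keyed by post_id to dedupe
    for creator_id in followed_top_creators:
        for post in author_timelines.get(creator_id, []):
            candidates.setdefault(post.post_id, post)
    # Newest first stands in for the real ranking step.
    ranked = sorted(candidates.values(),
                    key=lambda p: p.created_at, reverse=True)
    return ranked[:limit]
```

The read path pays for the merge, which is the cost the Option C table lists; the write path never sees a million-entry fanout.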

6. Key architecture decisions: record the "why" with ADRs

ADR means Architecture Decision Record. Feed systems are often questioned later: "Why only store IDs in timelines? Why do top creators not fan out? Why async transcode video? Why does moderation affect distribution?" Those answers should be recorded before memory fades.

ADR-01: use hybrid Feed instead of pure push or pure pull

Context: reads are far more frequent than writes, so home refresh must be fast; but follower counts are extremely skewed, and top creator fanout can explode.
Decision: creators below a follower threshold fan out on write into follower inboxes; top creators write only to author timelines and are pulled when followers refresh.
Gave up: one distribution strategy for all accounts.
Gained: ordinary users get fast home reads, and top creator posts do not overload fanout queues.
Risk: Feed reads become more complex: merge, dedupe, rank, and cache top creator content.
Revisit when: mid-tier creators frequently hit threshold edges; use active followers, content heat, and account tier to tune dynamically.

ADR-02: timeline inbox stores content IDs; bodies and media are batch-filled at read time

Context: one hot post can enter millions of inboxes. Copying full text and media data into every inbox would explode storage.
Decision: inbox entries store post_id, author, time, and light ranking hints; Feed read service batch-fills body, counters, and media URLs.
Gave up: copying complete content into every inbox.
Gained: storage stays manageable, the content body has one authority copy, and moderation state is easier to apply consistently.
Risk: reads need an extra batch-fill step, and hot content body storage can become a hotspot.
Revisit when: batch-fill becomes a bottleneck; add short caches for hot bodies, counters, and whole Feed pages.
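
A minimal sketch of what ADR-02 implies for the data shape. `InboxEntry`, `batch_fill`, and the `"distributable"` state string are hypothetical names for illustration, and a plain dict stands in for a content store that would support a batched multi-get.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InboxEntry:
    post_id: str
    author_id: str
    created_at: int
    rank_hint: float = 0.0  # light ranking hint, not the post body


def batch_fill(entries: list[InboxEntry],
               content_store: dict[str, dict]) -> list[dict]:
    """Fill bodies from the authority copy; drop anything not distributable."""
    filled = []
    for entry in entries:
        # The dict lookup stands in for one batched multi-get.
        post = content_store.get(entry.post_id)
        if post is None or post["moderation"] != "distributable":
            continue  # deleted or taken down: the stale reference is skipped
        filled.append({"post_id": entry.post_id, **post})
    return filled
```

Because the inbox holds only references, a takedown needs to change one authority record for every reader to stop seeing the post, even before stale inbox entries are cleaned up.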

ADR-03: video upload transcodes asynchronously and distributes through object storage + CDN

Context: source videos are large, user networks vary, and origin distribution would cause buffering and runaway bandwidth cost.
Decision: upload stores the source and returns a "processing" status; background workers transcode the source into segments at multiple qualities plus a manifest; playback goes through CDN, and players use ABR to choose segment quality.
Gave up: synchronous transcoding inside upload requests and origin-serving video files directly.
Gained: upload does not block, playback is smoother, and bandwidth cost is governed by CDN hit rate.
Risk: upload-to-playable has delay; transcode queues and CDN origin fetch need monitoring.
Revisit when: viral content often saturates origin; add prewarming, request coalescing, and multi-CDN strategy.
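
To make the ABR part of this decision concrete, here is a deliberately simple throughput-based selector. The bitrate ladder and the `safety` margin are hypothetical values, and real players typically also weigh buffer level and recent switches, so treat this as a sketch of the idea rather than a player algorithm.

```python
# Hypothetical bitrate ladder in kilobits per second; real ladders vary.
RENDITIONS = [("480p", 1_000), ("720p", 2_500), ("1080p", 5_000)]


def pick_rendition(measured_kbps: float, safety: float = 0.8) -> str:
    """Pick the highest quality whose bitrate fits within a safety margin."""
    budget = measured_kbps * safety
    chosen = RENDITIONS[0][0]  # always fall back to the lowest quality
    for name, kbps in RENDITIONS:  # ladder is sorted ascending
        if kbps <= budget:
            chosen = name
    return chosen
```

The selector only works because transcoding has already produced every rung of the ladder, which is why the source cannot be served directly.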

ADR-04: moderation state controls Feed, search, recommendation, and CDN

Context: content platforms face illegal, harmful, infringing, or spam content; stronger distribution spreads mistakes faster.
Decision: content has one moderation state; high-risk content is reviewed before broad distribution; takedown events update Feed inboxes, search indexes, recommendation candidates, caches, and CDN; read path re-checks visibility.
Gave up: hiding unsafe content only on the post detail page.
Gained: unsafe content can be recalled from major surfaces, and visibility rules apply on both write and read sides.
Risk: recall is complex and one surface may be missed.
Revisit when: UGC grows; make moderation recall a tier-one capability with takedown regression tests.
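
A minimal sketch of the takedown path in ADR-04, assuming hypothetical names throughout (`Surface`, `take_down`, `is_visible`, and the state strings). The point it illustrates is the ordering: flip the single moderation state first, recall from each surface second, and keep a read-path check as the safety net.

```python
class Surface:
    """Hypothetical stand-in for an inbox store, search index, cache, or CDN."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.post_ids: set[str] = set()

    def remove(self, post_id: str) -> None:
        self.post_ids.discard(post_id)


def take_down(post_id: str, moderation: dict[str, str],
              surfaces: list[Surface]) -> list[str]:
    """Flip the moderation state, then recall from every surface."""
    moderation[post_id] = "taken_down"
    failed = []
    for surface in surfaces:
        try:
            surface.remove(post_id)
        except Exception:
            # Report for retry; the read-path check still blocks the post.
            failed.append(surface.name)
    return failed


def is_visible(post_id: str, moderation: dict[str, str]) -> bool:
    """Read-path re-check: covers the case where one surface was missed."""
    return moderation.get(post_id) == "distributable"
```

The returned list of failed surfaces is what a takedown regression test would assert to be empty.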

7. Structure and data flow after evolution

FeedStream is not post CRUD. It is content distribution, ranking, media delivery, and moderation recall.

Starting path

User posts → posts table
User refreshes → load follow list → load posts → return
Video playback → origin file URL

Problem: this path has no structural answer to read amplification, top creator fanout, video bandwidth, or moderation recall.

Evolved structure

User publishes
  │
  ▼
┌────────────────────────────────────────────────┐
│ Content service                                │
│ body / visibility / moderation / media meta    │
└──────┬───────────────────────┬─────────────────┘
       │                       │
       ▼                       ▼
┌───────────────┐       ┌────────────────────────┐
│ Content store │       │ Media pipeline         │
│ post authority│       │ upload → transcode     │
└──────┬────────┘       │ → object store → CDN   │
       │                └────────────────────────┘
       ▼
┌────────────────────────────────────────────────┐
│ Feed distribution service                      │
│ ordinary creator push → follower inboxes       │
│ top creator author timeline → read-time pull   │
└──────┬───────────────────────┬─────────────────┘
       ▼                       ▼
┌────────────────┐      ┌────────────────────────┐
│ Timeline       │      │ Search / ranking       │
│ inbox          │      │ index and features     │
│ user -> post_id│      │ recall / ranking       │
└──────┬─────────┘      └─────────┬──────────────┘
       ▼                          ▼
┌────────────────────────────────────────────────┐
│ Feed read service                              │
│ read inbox → pull top creators → batch-fill    │
│ → visibility filter → rank / dedupe / diversify│
│ → return one screen                            │
└────────────────────────────────────────────────┘

The core change is not "add recommendation." It is that each responsibility now has a clear owner:

Content service stores the authority copy, media references, moderation state, and visibility.
Feed distribution service decides push or pull and writes timeline inboxes asynchronously.
Timeline inbox stores only content IDs for low-latency reads.
Feed read service merges ordinary content, top creator content, and recommendation candidates, then fills, filters, ranks, and returns.
Media pipeline moves image / video bytes away from business origin into object storage and CDN.
Moderation recall reaches content, Feed, search, recommendation, caches, and CDN.

Follow one "ordinary creator publishes" flow end to end

1. Creator publishes an image post.
2. Content service writes the post authority copy, with status distributable or under review.
3. Feed distribution service loads follower list and sees 800 followers.
4. This is an ordinary account, so it asynchronously writes post_id into 800 follower timeline inboxes.
5. A follower refreshes home and reads latest post IDs from their inbox.
6. Feed read service batch-fills body, counters, and media URLs.
7. Ranking service scores candidates, then dedupes and diversifies.
8. User sees this screen of content.
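
Step 7's dedupe and diversify pass can be sketched as a single walk over the ranked candidates. `dedupe_and_diversify` and the `max_per_author` cap are hypothetical; the case does not specify which diversity rule FeedStream uses, so a per-author cap is just one simple option.

```python
def dedupe_and_diversify(ranked: list[dict],
                         max_per_author: int = 2) -> list[dict]:
    """Keep score order, drop repeated posts, cap each author's share."""
    seen_posts: set[str] = set()
    per_author: dict[str, int] = {}
    result = []
    for item in ranked:  # assumed sorted by score, best first
        if item["post_id"] in seen_posts:
            continue  # the same post arrived from two sources
        count = per_author.get(item["author_id"], 0)
        if count >= max_per_author:
            continue  # this author already fills their share of the screen
        seen_posts.add(item["post_id"])
        per_author[item["author_id"]] = count + 1
        result.append(item)
    return result
```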

Follow one "top creator viral video" flow end to end

1. Top creator uploads a short video.
2. Upload service stores the source video, enqueues transcoding, and returns a "processing" status.
3. Transcoding workers generate 1080p / 720p / 480p segments and manifest.
4. After moderation passes, content service marks the video distributable.
5. Because this is a top creator, Feed distribution does not write post_id into millions of follower inboxes. It writes only to the author timeline.
6. When followers refresh Feed, read service notices they follow this creator and pulls the author timeline.
7. Ranking merges this video into candidates; if score is high enough, it appears in home Feed.
8. Player requests manifest and video segments, usually hitting CDN.
9. If the video is predicted to be hot, top segments are prewarmed to CDN. On misses, origin fetches are coalesced to avoid a stampede.

Key points:

Top creator content avoids write fanout explosion.
Read-time pull of top creator content needs hotspot cache, or pressure merely moves to author timeline reads.
Video playback goes through object storage + CDN, not business services.
Unreviewed content must not enter Feed, search, or recommendation candidates.
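
The request coalescing mentioned in step 9 can be sketched as follows: concurrent misses for the same segment share one origin fetch instead of each going to origin. `CoalescingFetcher` and the `fetch_from_origin` callable are hypothetical, and a real CDN or cache layer would do this internally; the sketch only shows the mechanism and does not cache results after the fetch completes.

```python
import threading
from concurrent.futures import Future


class CoalescingFetcher:
    """Collapse concurrent misses for the same key into one origin fetch."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # hypothetical callable: key -> bytes
        self._lock = threading.Lock()
        self._in_flight: dict[str, Future] = {}

    def get(self, key: str):
        with self._lock:
            future = self._in_flight.get(key)
            leader = future is None
            if leader:
                future = Future()
                self._in_flight[key] = future
        if not leader:
            return future.result()  # followers wait for the leader's result
        try:
            future.set_result(self._fetch(key))
        except Exception as exc:
            future.set_exception(exc)  # followers see the same failure
        finally:
            with self._lock:
                self._in_flight.pop(key, None)
        return future.result()
```

With this in front of origin, a burst of misses on one hot segment costs one origin read rather than one per viewer.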

8. What if it breaks: failure scenarios and fallbacks

Failure	Direct result	Detection	Architectural fallback
Pure pull creates read amplification	Home refresh slows and follow aggregation overloads DB	Feed P95, follow count distribution, query fanout	Move ordinary users to fanout-on-write and precomputed timeline inboxes
Pure push meets a top creator	One post creates millions of writes and queues lag	fanout queue lag, fanout count per post	Top creators use read-time pull; tune follower threshold dynamically
Timeline stores full bodies	Viral content is copied millions of times and storage explodes	Timeline storage growth	Inbox stores only post_id; read path batch-fills content
Top creator timeline lacks cache	Followers refresh and overload author timeline	Author timeline QPS, hot post fill count	Short cache for hot author timelines and hot post bodies
Fanout blocks publishing	Publishing is slow or fails	Publish latency, fanout duration	Publish writes authority copy only; fanout runs async
CDN hit rate drops	Video first frame slows and origin bandwidth spikes	CDN hit ratio, origin egress	Hot prewarming, layered cache, request coalescing
Transcoding queue backs up	Videos take too long to become playable	Queue length, task duration	Elastic transcode workers; fewer qualities for cold content
Search index lags	New content cannot be searched, or old unsafe content still appears	Index freshness, stale state hits	Near-realtime indexing; prioritize takedown events
Recommendation repeats low-quality content	Users see duplicates or spammy content	Duplication rate, negative feedback, watch time	Deduping, diversity rules, quality filters, feedback loop
Unsafe content has spread	Feed, search, and cache still show it	Takedown regression, content-state scanning	Moderation state controls all paths; takedown recalls indexes and caches
Visibility checked only in UI	Private content enters unauthorized Feed	Permission penetration tests, reports	Enforce visibility on both fanout and read paths
Interaction counters drift	Like / comment counts disagree with logs	Counter reconciliation, abnormal jumps	Counters can be cached, but interaction logs can recompute truth
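
The last row's fallback, recomputing counters from interaction logs, can be sketched like this. The event shape (`user_id`, `post_id`, `type`) and the function names are assumptions for illustration; the case only states that the log is the source of truth and the cached counter is derived.

```python
from collections import Counter


def recompute_like_counts(interaction_log: list[dict]) -> Counter:
    """Rebuild like counts from the log, which is the source of truth."""
    liked: set[tuple[str, str]] = set()
    for event in interaction_log:  # assumed ordered by time
        pair = (event["user_id"], event["post_id"])
        if event["type"] == "like":
            liked.add(pair)  # a repeated like is idempotent
        elif event["type"] == "unlike":
            liked.discard(pair)
    return Counter(post_id for _, post_id in liked)


def find_drift(cached: dict[str, int], truth: Counter) -> dict[str, int]:
    """Return post_id -> (cached - truth) for every counter that disagrees."""
    posts = set(cached) | set(truth)
    return {p: cached.get(p, 0) - truth[p]
            for p in posts if cached.get(p, 0) != truth[p]}
```

A periodic job could run `find_drift` over recently active posts and overwrite any cached counter that disagrees with the recomputed value.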

Content distribution maturity is not measured by whether home Feed can show posts. It is measured by whether hotspots, unsafe content, origin fetches, and ranking degradation are structurally contained.

📌 Validate your reasoning against the templates

This case is not a rewrite of social Feed or video streaming templates. It puts the mutually amplifying paths of a content community into one system.

Reusable template / chapter	What this case reuses	What this case adds
Social Feed	Push / pull / hybrid, timeline inboxes, top creator fanout	Explains why ordinary creators push and top creators pull
Video Streaming	Transcoding, multi-bitrate, object storage, CDN, prewarming	Puts video delivery inside Feed hotspot scenarios
Search Engine	Inverted index, recall + ranking, index freshness	Shows site search and recommendation candidates both need indexing and filtering
Notification System	Async notifications, dedupe, rate limits	Interaction alerts must not block publishing or Feed reads
The mechanics of scale	Hotspots, fanout, caching, read/write amplification	Treats top creators, viral video, and CDN origin fetch as amplification problems
Security & Multi-Tenancy	Visibility, permissions, isolation	Applies privacy, block lists, and moderation state to distribution

Reading suggestion: read this case first, then return to the Social Feed template and Video Streaming template. Feed and video look separate, but hotspots and distribution cost tie them together.

🎯 Quick check

🤔 Why should FeedStream not use pure push for every creator?

A. Because the push model always has bad read performance
B. Because top creators have too many followers, and one post can create millions of writes; top creators should be pulled at read time
C. Because timelines cannot be precomputed

🤔 Why should timeline inboxes usually store post_id instead of full post bodies?

A. Because post bodies cannot be stored in databases
B. Because hot content may be referenced by millions of inboxes, and copying full bodies would explode storage and update cost
C. Because Feed does not need post bodies

🤔 A viral video launch saturates origin bandwidth. What is most directly missing?

A. The post table needs more columns
B. CDN hit rate, hot prewarming, or origin protection is missing
C. The follow graph is too complex

🤔 If unsafe content has entered Feed, search, and recommendation, is deleting only the post detail page enough?

A. Yes, nobody can view it after the detail page is gone
B. No, it must also be recalled from timelines, search indexes, recommendation candidates, caches, and CDN
C. No need; just wait for cache expiry

Case summary

Feed is not a posts-table query; it is a personalized distribution engine. Every user has a different home screen, and reads dominate writes.
Hybrid push / pull is the core large-scale Feed trade-off. Ordinary creators fan out on write for fast reads; top creators are pulled at read time to avoid fanout explosion.
Timelines store references; bodies are filled at read time. This controls storage, hot cache behavior, and moderation consistency.
Video content must leave the business origin. Transcoding, multi-bitrate segments, object storage, CDN, prewarming, and origin protection determine playback quality and bandwidth cost.
Recommendation and search are recall + ranking funnels. Do not recompute the whole world on each refresh; recall candidates first, then score a small set.
Moderation recall is part of distribution. Wherever the system distributes content, takedown must be able to reach.

Bridge forward: this case puts content Feed, search, recommendation, and video CDN into one community product. If the next case moves into AI Agent / coding Agent, the pressure changes again: tool calls, permissions, sandboxing, memory, context, and human approval.

Template cross-check: Social Feed · Video Streaming · Search Engine · Notification System
Methodology: 02 · The architect's thinking framework · 07 · Designing from 0 to 1 · 08 · ADRs & evolution
Hard parts: 13 · The mechanics of scale · 16 · Security & Multi-Tenancy
