System Design - Data Engineering
#01 Data Engineering

Build Amazon's Product Search Ranking (Premium)

Design a search ranking system that incorporates real-time inventory, pricing, and sales velocity signals into results, personalizes rankings per user, and re-indexes a product catalog of 500 million items.

#02 Data Engineering

Build Amazon's Review Aggregation Pipeline (Premium)

Design a review processing pipeline that deduplicates submissions, filters fraudulent reviews using behavioral signals, computes rolling star averages, and surfaces the most helpful reviews per product.

#03 Databases

Build a Change Data Capture Pipeline (Premium)

Design a change data capture (CDC) pipeline that streams every database write to downstream consumers in the order the writes occurred, preserves transactional boundaries, and guarantees that no event is dropped even if a consumer is temporarily offline.

#04 Databases

Build a Columnar Storage Engine with SSTable and Compaction (Premium)

Design a write-optimized storage engine that buffers writes in memory, flushes them to immutable SSTables on disk, and runs background compaction to merge files while keeping read amplification low.

#05 Security

Build a Content Moderation Pipeline (Premium)

Design a content moderation system that uses ML classifiers to automatically detect policy violations in text, images, and video, routes borderline cases to human reviewers, and minimizes both over-removal and under-removal.

#06 Data Engineering

Build Gmail's Spam Detection Pipeline (Premium)

Design a spam classification system that processes 300 billion emails per day, classifies each one in under 100ms before delivery, and continuously retrains models as spammers adapt their tactics.

#07 Data Engineering

Build Google Photos Duplicate Detection Pipeline (Premium)

Design a pipeline that detects near-duplicate photos across 28 billion uploads per week using perceptual similarity rather than byte-level matching, and does so without storing a full copy of every image.

#08 Scalability

Build Instagram Explore Real-Time Recommendation Engine (Premium)

Design a content discovery system that generates a personalized Explore grid for 2 billion users, refreshes recommendations as users scroll, and surfaces trending content within minutes of it going viral.

#09 Data Engineering

Build Instagram Stories Expiry and Archival Pipeline (Free)

Design a system that automatically expires 500 million Stories after 24 hours, moves them to cold archival storage, and lets users retrieve archived Stories on demand without impacting live traffic.

#10 Data Engineering

Build Netflix's Recommendation Engine (Premium)

Design a personalization engine that generates a unique homepage for 280 million users, blends collaborative filtering with content-based signals, and updates recommendations within hours of new viewing behavior.
