Build a Chaos Engineering Platform
Design a controlled failure injection platform that safely introduces latency, packet loss, and resource exhaustion into production services, enforces blast radius limits, and automatically halts experiments when SLOs degrade.
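As a starting point for the two safety requirements in this brief, the sketch below shows one way a blast radius check and an SLO-based halt might fit together. The `Experiment` class, its field names, and the 5% and 1% thresholds are hypothetical choices made for illustration, not part of the problem statement.

```python
from dataclasses import dataclass


@dataclass
class Experiment:
    """Hypothetical experiment record; all names and defaults are assumptions."""
    name: str
    target_hosts: list
    fleet_size: int
    max_blast_radius: float = 0.05   # assumed cap: at most 5% of the fleet
    max_error_rate: float = 0.01     # assumed SLO: at most 1% errors
    running: bool = False

    def start(self) -> None:
        # Refuse to start if the experiment would touch too much of the fleet.
        fraction = len(self.target_hosts) / self.fleet_size
        if fraction > self.max_blast_radius:
            raise ValueError(
                f"{self.name}: targets {fraction:.0%} of fleet, "
                f"limit is {self.max_blast_radius:.0%}"
            )
        self.running = True

    def observe(self, error_rate: float) -> bool:
        # Halt as soon as the observed error rate breaches the SLO threshold.
        if self.running and error_rate > self.max_error_rate:
            self.running = False
        return self.running


exp = Experiment("inject-latency", ["host-1", "host-2"], fleet_size=100)
exp.start()
exp.observe(0.005)   # within the SLO, experiment keeps running
exp.observe(0.02)    # SLO breached, experiment halts
```

A real platform would also need to undo the injected fault when it halts; this sketch only tracks the running state.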
Build a Distributed Tracing System
Design a tracing platform that stitches spans across 50+ microservices into a single request trace, stores billions of spans per day, and lets engineers query any trace by ID in under 1 second.
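To make "stitching" concrete, here is a minimal sketch that rebuilds one request trace from spans that arrive out of order, by linking each span to its parent. The `Span` fields and the `stitch` and `render` helpers are hypothetical names, and the sketch assumes each span carries a trace ID and a parent span ID, as is common in tracing systems.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """Hypothetical span record; field names are assumptions."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]   # None marks the root span
    service: str


def stitch(spans, trace_id):
    """Group the spans of one trace by their parent span ID."""
    children = defaultdict(list)
    for span in spans:
        if span.trace_id == trace_id:
            children[span.parent_id].append(span)
    return children


def render(children, parent_id=None, depth=0):
    """Walk the parent-to-children index and return an indented call tree."""
    lines = []
    for span in children.get(parent_id, []):
        lines.append("  " * depth + span.service)
        lines.extend(render(children, span.span_id, depth + 1))
    return lines


# Spans from different services, in arrival order rather than call order.
spans = [
    Span("t1", "c", "b", "db"),
    Span("t1", "a", None, "gateway"),
    Span("t2", "x", None, "other"),
    Span("t1", "b", "a", "orders"),
]
print("\n".join(render(stitch(spans, "t1"))))
```

At the scale in the brief, the in-memory scan would be replaced by a store keyed on trace ID so that a lookup reads only the spans of one trace.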
Build an Infrastructure Cost Attribution System
Design a cloud cost attribution system that tags every compute, storage, and network resource to the team and service that owns it, updates cost breakdowns in near real-time, and surfaces anomalous spend spikes.
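The core of this brief is a rollup by owner tags plus a spike check, which might look roughly like the sketch below. The line item shape, the `team` and `service` tag keys, the `unattributed` bucket, and the 2x threshold are all assumptions made for illustration.

```python
from collections import defaultdict


def attribute_costs(line_items):
    """Sum cost per (team, service); untagged resources go to 'unattributed'."""
    totals = defaultdict(float)
    for item in line_items:
        tags = item.get("tags", {})
        owner = (
            tags.get("team", "unattributed"),
            tags.get("service", "unattributed"),
        )
        totals[owner] += item["cost_usd"]
    return dict(totals)


def is_spike(history, today, factor=2.0):
    """Flag today's spend if it exceeds `factor` times the historical mean."""
    if not history:
        return False
    baseline = sum(history) / len(history)
    return today > factor * baseline


# Hypothetical billing line items.
items = [
    {"cost_usd": 10.0, "tags": {"team": "payments", "service": "ledger"}},
    {"cost_usd": 5.0, "tags": {"team": "payments", "service": "ledger"}},
    {"cost_usd": 3.0, "tags": {}},
]
totals = attribute_costs(items)
```

A mean-based threshold is only the simplest possible detector; seasonality and planned launches would need a more careful baseline.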
Build a Real-Time Log Aggregation Pipeline
Design a log ingestion and querying system that handles 1 million events per second, supports full-text search with sub-second latency, and retains 90 days of logs without breaking the budget.
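A quick back-of-envelope calculation shows why the budget constraint matters. The ingest rate and retention window come from the brief; the 500-byte average event size and the 10x compression ratio are assumptions picked only to make the arithmetic concrete.

```python
SECONDS_PER_DAY = 86_400


def retention_footprint(events_per_second, retention_days,
                        avg_event_bytes, compression_ratio):
    """Return (event count, raw bytes, compressed bytes) for the full window."""
    events = events_per_second * SECONDS_PER_DAY * retention_days
    raw_bytes = events * avg_event_bytes
    return events, raw_bytes, raw_bytes / compression_ratio


events, raw_bytes, compressed_bytes = retention_footprint(
    events_per_second=1_000_000,   # from the brief
    retention_days=90,             # from the brief
    avg_event_bytes=500,           # assumption
    compression_ratio=10,          # assumption
)
print(f"{events:,} events")
print(f"{raw_bytes / 1e15:.3f} PB raw")
print(f"{compressed_bytes / 1e12:.1f} TB compressed")
```

Under these assumptions the window holds about 7.8 trillion events, roughly 3.9 PB raw or about 389 TB compressed, before counting search indexes or replicas.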