| |
| 🔗 Stories, Tutorials & Articles |
| |
|
| |
| Failure is inevitable: Learning from a large outage, and building for reliability in depth at |
| |
| |
Datadog ditched its “never fail” mindset after a March 2023 meltdown knocked out half its Kubernetes nodes and took major user features down with them. The fix? A full-stack rethink built around graceful degradation.
The team added disk-based persistence at intake, live-data prioritization, QoS-aware retry logic, and localized failover for control plane calls. In other words: no more all-or-nothing. If it breaks, it bends instead. |
|
| |
|
| |
|
| |
| The story of how we almost got hacked |
| |
| |
Team Invictus caught a BEC attempt using WeTransfer to slip in a fake Microsoft 365 login page powered by EvilProxy. Classic Adversary-in-the-Middle move, but dressed up with a slick delivery package.
Digging deeper, the team mapped the attacker’s setup and found something bigger: a credential grab campaign they’re calling VendorVandals. Think phishing lures disguised as procurement emails, blasted out from hijacked inboxes. Fully scripted and built to scale. |
|
| |
|
| |
|
| |
| You’ll never see attrition referenced in an RCA ✅ |
| |
| |
Lorin Hochstein argues that while high-profile engineer attrition is often speculated to contribute to major outages, it is universally absent from public Root Cause Analyses (RCAs). This exclusion occurs because public RCAs aim to reassure customers by focusing on technical fixes, whereas attrition is a complex, business-related organizational issue.
Internally, attrition may be discussed as a risk factor, but it is rarely documented as a direct cause, as traditional RCA methods fail to account for systemic, risk-increasing contributors. Ultimately, organizational factors like attrition play a role in every major incident, but remain unstated due to the narrow focus of formal incident reviews. |
|
| |
|
| |
|
| |
| Comparing AWS Lambda Arm64 vs x86_64 Performance Across Multiple Runtimes in Late 2025 |
| |
| |
A new open-source benchmark looked at 183,000 AWS Lambda invocations, and arm64 beats x86_64 across the board in both cost and speed.
Rust on arm64 with SHA-256 tuned in assembly? It clocks in 4–5× faster than x86 in CPU-heavy tasks. Rust cold starts are snappy too: 5–8× quicker than Node.js and Python. |
|
| |
|
| |
|
| |
| Declarative Action Architecture |
| |
| |
| The Declarative Action Architecture (DAA) is a scalable E2E testing pattern that separates concerns across three distinct layers. The Test Layer is 100% declarative: it states what is being tested without any procedural logic, so tests read like documentation. The core Action Layer implements the execution logic by translating the declarative steps. It follows a mandatory rule of self-verification (an assertion is built into every action) and composes larger actions from smaller, reusable ones. Finally, the Physical Layer acts as a "dumb" driver, handling pure execution and system interaction (like API calls or WebDriver commands) without any business logic or assertions. |
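To make the three layers concrete, here is a minimal Python sketch. It is not taken from the article: the shopping-cart scenario and every name in it (`FakeShopDriver`, `Actions`, `run_test`, `CART_TEST`) are hypothetical, and an in-memory dictionary stands in for the real API or WebDriver calls.

```python
class FakeShopDriver:
    """Physical layer (hypothetical): pure execution, no business logic, no assertions."""

    def __init__(self):
        self._carts = {}  # in-memory stand-in for a real backend

    def post_cart_item(self, user, sku):
        self._carts.setdefault(user, []).append(sku)

    def get_cart(self, user):
        return list(self._carts.get(user, []))


class Actions:
    """Action layer: executes declarative steps; every action verifies itself."""

    def __init__(self, driver):
        self.driver = driver

    def add_to_cart(self, user, sku):
        self.driver.post_cart_item(user, sku)
        # Self-verification: the action fails unless its effect is observable.
        assert sku in self.driver.get_cart(user), f"{sku} missing from cart"

    def fill_cart(self, user, skus):
        # Composition: a larger action built from the smaller one above.
        for sku in skus:
            self.add_to_cart(user, sku)
        assert set(skus) <= set(self.driver.get_cart(user))


def run_test(steps, actions):
    """Dispatch each declarative step to the action with the same name."""
    for step in steps:
        params = {key: value for key, value in step.items() if key != "action"}
        getattr(actions, step["action"])(**params)


# Test layer: plain data that says what is tested, with no procedural logic.
CART_TEST = [
    {"action": "fill_cart", "user": "alice", "skus": ["sku-1", "sku-2"]},
    {"action": "add_to_cart", "user": "alice", "sku": "sku-3"},
]
```

Calling `run_test(CART_TEST, Actions(FakeShopDriver()))` runs the scenario. Because verification lives inside the actions, the test data needs no assertions of its own, and a driver that silently drops a request makes the corresponding action fail.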
|
| |
|
| |
|
| |
| Advancing Our Chef Infrastructure: Safety Without Disruption |
| |
| |
Slack pulled back the curtain on Slack AI, its LLM-powered assistant built with a fortress mindset. Every customer gets their own isolated environment. Any data passed to vendor LLMs? It's ephemeral. Gone before it can stick.
No fine-tuning. No exporting data outside Slack. And there’s a whole middle-layer filter/audit setup watching every prompt like a hawk.
Why it matters: It’s a blueprint for threading LLMs into enterprise SaaS without handing the keys to your data. |
|
| |
|
| |
|
| |
| Why we're leaving serverless |
| |
| |
| Unkey slashed its latency by 6x by moving from Cloudflare Workers to stateful Go servers. The move simplified the architecture and enabled self-hosting and platform independence. Serverless limitations had forced elaborate caching workarounds and data pipeline nightmares, which pushed the team toward a new, high-speed solution. |
|
| |
|
| |