Inside this Issue
Big orgs are quietly turning AI from a chat toy into real infrastructure: graph-shaped ML metadata at Netflix, agent knowledge systems at Meta, and plugin-driven code review at Cloudflare. On the other end of the spectrum, you have one person and a laptop trying to run a local model without it wrecking their workflow, plus AWS productizing MCP into something you can actually depend on.
🎬 Democratizing Machine Learning at Netflix: Building the Model Lifecycle Graph
🧠 How We Built an AI Second Brain for 60K Knowledge Workers
🛡️ Orchestrating AI Code Review at scale
💻 Running local models on an M4 with 24GB memory
☁️ The AWS MCP Server is now generally available
Steal the patterns, skip the hype, and ship something sturdier this week.
Until next time!
FAUN.dev() Team
Stories, Tutorials & Articles

netflixtechblog.com
Netflix's Saish Sali, Nipun Kumar, and Sura Elamurugu describe the Metadata Service (MDS), a graph layer built to connect siloed ML tooling (model registry, pipeline orchestrator, experimentation platform, feature store, dataset platform, identity) across personalization, studio, payments, and ads.
The system assigns every ML asset a global AIP URI, ingests thin change events from each source over Kafka and SNS/SQS, then hydrates the full state from the source of truth so out-of-order or dropped events self-correct, with Datomic holding entities and reified edges and Elasticsearch powering search.
Background enrichment jobs walk multi-hop chains (model to pipeline run to A/B test cell to experiment) to materialize cross-system relationships, turning queries like "which experiments are running this model" or impact analysis on a feature change into a single graph traversal.
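The self-correcting ingestion pattern is worth stealing: because a thin event only carries a URI and always triggers a full re-read from the owning system, replaying a stale, duplicate, or out-of-order event converges to the same state. A minimal sketch of that idea plus a multi-hop traversal, with entirely hypothetical names (this is not Netflix's actual MDS API):

```python
# Sketch of "thin event -> full hydration" ingestion plus a multi-hop
# traversal, in the spirit of Netflix's MDS. All names are hypothetical.

# Stand-ins for the sources of truth (model registry, pipeline
# orchestrator, experimentation platform, ...), keyed by global URI.
SOURCES = {
    "aip://model/42": {"type": "model", "edges": ["aip://pipeline-run/7"]},
    "aip://pipeline-run/7": {"type": "pipeline_run", "edges": ["aip://ab-cell/3"]},
    "aip://ab-cell/3": {"type": "ab_test_cell", "edges": ["aip://experiment/99"]},
    "aip://experiment/99": {"type": "experiment", "edges": []},
}

graph = {}  # materialized entities and edges, keyed by global URI

def on_change_event(uri: str) -> None:
    """A thin event carries only the URI; we hydrate the full record from
    the source of truth, so replayed or dropped-then-replayed events
    converge to the same state."""
    graph[uri] = dict(SOURCES[uri])

def reachable(uri: str, want_type: str) -> list[str]:
    """Walk multi-hop chains (model -> pipeline run -> A/B cell -> experiment)."""
    seen, stack, hits = set(), [uri], []
    while stack:
        node = stack.pop()
        if node in seen or node not in graph:
            continue
        seen.add(node)
        if graph[node]["type"] == want_type:
            hits.append(node)
        stack.extend(graph[node]["edges"])
    return hits

# Ingest events in a scrambled order; hydration makes ordering irrelevant.
for uri in ["aip://ab-cell/3", "aip://model/42", "aip://experiment/99",
            "aip://pipeline-run/7", "aip://model/42"]:  # duplicate on purpose
    on_change_event(uri)

print(reachable("aip://model/42", "experiment"))  # -> ['aip://experiment/99']
```

The "which experiments are running this model" query from the post becomes exactly this kind of one-shot traversal once the cross-system edges are materialized.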

medium.com
Meta built an internal AI agent system called the AI Second Brain that now counts over 63,000 installs and roughly 10,000 daily active users across engineering, PM, design, legal, finance, comms, and sales, growing from zero in about three months after a non-technical PM's adoption post. The architecture pairs Tiago Forte's PARA folder framework (Projects, Areas, Resources, Archives) with a root CLAUDE.md plus per-project CLAUDE.md files for progressive disclosure, an infrastructure layer of internal MCP servers and CLIs that give the agent scoped, authenticated access to docs, meeting transcripts, task trackers, and code review, and a library of community-written skills expressed as plain Markdown workflows. The post credits four lessons: invest in the tool-access infrastructure layer before applications, prefer progressive disclosure over context dumping, make bootstrap low-friction to drive viral adoption, and keep skills composable Markdown, which turned the plugin into a platform users extended themselves.
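Progressive disclosure here just means the agent loads the root CLAUDE.md plus the single CLAUDE.md nearest the folder it is working in, rather than dumping the whole knowledge tree into context. A rough sketch of that lookup; the folder layout and function names are illustrative, not Meta's implementation:

```python
from pathlib import Path
import tempfile

def context_files(root: Path, workdir: Path) -> list[Path]:
    """Root CLAUDE.md first, then the nearest per-project CLAUDE.md found
    walking up from workdir. Everything else stays undisclosed until needed."""
    picked = []
    root_md = root / "CLAUDE.md"
    if root_md.exists():
        picked.append(root_md)
    d = workdir
    while d != root and d != d.parent:
        md = d / "CLAUDE.md"
        if md.exists():
            picked.append(md)
            break  # nearest one wins; don't pile up every ancestor's context
        d = d.parent
    return picked

# Tiny PARA-style tree: Projects/launch carries its own CLAUDE.md.
base = Path(tempfile.mkdtemp())
(base / "Projects" / "launch").mkdir(parents=True)
(base / "CLAUDE.md").write_text("global conventions")
(base / "Projects" / "launch" / "CLAUDE.md").write_text("launch specifics")

files = context_files(base, base / "Projects" / "launch")
print([f.read_text() for f in files])  # -> ['global conventions', 'launch specifics']
```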

aws.amazon.com
AWS now offers AWS MCP Server as a managed remote MCP server in US East (N. Virginia) and Europe (Frankfurt). MCP-compatible clients can use existing IAM credentials to access more than 15,000 AWS API operations.
For GA, AWS added IAM context keys, documentation retrieval without authentication, lower token use, server-side Python execution in a sandbox with no network access, and separate CloudWatch and CloudTrail visibility for MCP calls. AWS service teams also maintain Skills for the server.
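Because the server authenticates with ordinary IAM credentials, the usual least-privilege tooling applies: what an MCP client can reach among those 15,000 operations is bounded by a standard IAM policy on the role or user the client uses (the new GA context keys add further conditions on top; their names are omitted here). An illustrative, not AWS-published, policy limiting a client to a few read-only S3 and CloudWatch calls:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadOnlyForMcpClient",
      "Effect": "Allow",
      "Action": [
        "s3:ListAllMyBuckets",
        "s3:GetBucketLocation",
        "cloudwatch:GetMetricData",
        "cloudwatch:ListMetrics"
      ],
      "Resource": "*"
    }
  ]
}
```

Since MCP calls surface separately in CloudWatch and CloudTrail, you can audit whether the policy is actually as tight as intended.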

blog.cloudflare.com
Cloudflare engineers built an AI code review platform on OpenCode.
They split GitLab integration, model providers, prompts, and policy into separate plugins. A coordinator assigns up to seven domain reviewers across security, performance, code quality, documentation, release checks, and AGENTS.md compliance.
They stream review events as JSONL, route work by risk tier, protect each model with circuit breakers and fallback chains, and let Workers KV override model choices. They also track incremental re-review state and deduplicate prompt context to control cost and latency.
In the first 30 days, Cloudflare ran 131,246 reviews across 48,095 merge requests in 5,169 repos. The team reported a 3m39s median runtime and a $0.98 median cost per run.
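The reliability piece is the classic circuit breaker plus a fallback chain: after N consecutive failures a provider is skipped until its cooldown elapses, and the request falls through to the next model. A compact sketch; the provider names and thresholds are made up, not Cloudflare's configuration:

```python
import time

class Breaker:
    """Trip open after `threshold` consecutive failures; probe again after `cooldown`s."""
    def __init__(self, threshold=3, cooldown=60.0):
        self.threshold, self.cooldown = threshold, cooldown
        self.failures, self.opened_at = 0, None

    def available(self, now=None):
        now = time.monotonic() if now is None else now
        if self.opened_at is None:
            return True
        return now - self.opened_at >= self.cooldown  # half-open: allow one probe

    def record(self, ok):
        if ok:
            self.failures, self.opened_at = 0, None
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()

def review(prompt, providers, breakers):
    """Try providers in order, skipping any whose breaker is open."""
    for name, call in providers:
        br = breakers[name]
        if not br.available():
            continue
        try:
            result = call(prompt)
            br.record(True)
            return name, result
        except Exception:
            br.record(False)
    raise RuntimeError("all providers failed or are circuit-open")

# Hypothetical chain: the primary always times out, the fallback answers.
providers = [
    ("primary", lambda p: (_ for _ in ()).throw(TimeoutError())),
    ("fallback", lambda p: f"LGTM: {p}"),
]
breakers = {name: Breaker() for name, _ in providers}
print(review("diff #1", providers, breakers))  # -> ('fallback', 'LGTM: diff #1')
```

Routing by risk tier then just means picking a different provider list (or threshold) per tier before calling `review`.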

jola.dev
Local LLMs work best as supervised coding assistants. The writer ran Qwen 3.5 9B (Q4) in LM Studio on a 24GB MacBook Pro and got about 40 tokens per second, with thinking mode, tool use, and a 128K context window. The author saw mixed results: Qwen helped with simple Elixir linter edits, then failed a basic git conflict by leaving conflict markers in place and trying to continue the rebase.
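If you want to reproduce the setup, LM Studio exposes an OpenAI-compatible HTTP server on localhost (port 1234 by default), so any OpenAI-style client works against it. A minimal sketch; the model identifier is a placeholder for whatever you have loaded, and `ask` only works while the local server is running:

```python
import json
import urllib.request

def build_request(prompt: str, model: str = "local-model") -> dict:
    """Chat-completions payload in the OpenAI wire format LM Studio accepts."""
    return {
        "model": model,  # LM Studio routes to whichever model is loaded
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }

def ask(prompt: str, base: str = "http://localhost:1234/v1") -> str:
    """POST to the local server and pull out the first completion."""
    req = urllib.request.Request(
        f"{base}/chat/completions",
        data=json.dumps(build_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

payload = build_request("Fix this Elixir linter warning: ...")
print(payload["messages"][0]["role"])  # -> user
```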
⚙️ Tools, Apps & Software

github.com
Open-source omnichannel chatbot for agentic workflows via APIs, CLI, and MCP. An alternative to Wati, ManyChat, and Respond.io.

github.com
Utilyze measures how efficiently your GPU is doing useful work, not just whether it's busy. It runs live against your workload with negligible overhead.

github.com
DeepSeek 4 Flash local inference engine for Metal

github.com
Manage multiple Claude Code and OpenCode agents from a TUI or the web, with easy access on mobile. Also supports Mistral Vibe, Codex CLI, Gemini CLI, Pi.dev, Copilot CLI, and Factory Droid. Uses tmux and git worktrees.

github.com
Turn any technical book PDF into a Claude Code skill: ready to study, reference, and use while you work.