Speed is outrunning safety: agents hop repos, GitHub’s social contract wobbles, and ‘vibe’ code keeps seniors on cleanup duty. The counter‑move is here—SLMs on the edge, formal verification, real evals, and an agent‑first stack you can actually run; details await inside.
🦠 AgentHopper: An AI Virus 🪶 Building Agents for Small Language Models: A Deep Dive into Lightweight AI 🐙 GitHub Copilot on autopilot as community complaints persist 📇 Introducing the MCP Registry 🛡️ Guardians of the Agents 🧪 LLM Evaluation: Practical Tips at Booking.com 🏗️ The LinkedIn Generative AI Application Tech Stack: Extending to Build AI Agents 🔬 Understanding LLMs: Insights from Mechanistic Interpretability 🍼 Vibe coding has turned senior devs into ‘AI babysitters,’ but they say it’s worth it 🧭 You Vibe It, You Run It?
Build fast, keep your hands on the wheel—the robots don’t answer the pager.
The new Model Context Protocol (MCP) Registry just dropped in preview. It’s a public, centralized hub for finding and sharing MCP servers—think phonebook, but for AI context APIs. It handles public and private subregistries, publishes OpenAPI specs so tooling can play nice, and bakes in community-driven moderation with flagging and denylists.
System shift: This locks in a new standard. MCP infrastructure now has a common ground for discovery, queries, and federation across the stack.
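Want to poke at it yourself? Here's a minimal sketch of querying the registry's REST API in Python. The base URL matches the preview announcement, but treat the exact endpoint path and response field names as assumptions that may change before GA:

```python
# Minimal sketch of listing servers from the MCP Registry preview.
# Endpoint path and JSON fields are assumptions based on the published
# OpenAPI spec; verify against the live spec before relying on them.
import requests

BASE = "https://registry.modelcontextprotocol.io"

resp = requests.get(f"{BASE}/v0/servers", params={"limit": 10}, timeout=10)
resp.raise_for_status()

for server in resp.json().get("servers", []):
    # Field names ("name", "description") are assumed, not confirmed.
    print(server.get("name"), "-", server.get("description", ""))
```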
GitHub's biggest debate right now? Whether to shut down AI-generated noise from Copilot: auto-written issues, code reviews, and the like. No clear answers from GitHub yet.
Frustration is piling up. Some devs are replacing the platform altogether, shifting their projects to Codeberg or spinning up self-hosted Forgejo stacks to take back control.
System shift: The more GitHub leans into AI, the more it nudges people out. That old network effect? Starting to crack.
Fastly says 95% of developers spend extra time fixing AI-written code. Senior engineers take the brunt. That overhead has even spawned a new gig: “vibe code cleanup specialist.” (Yes, seriously.)
As teams lean harder on AI tools, reliability and security start to slide—unless someone steps in. The result? A quiet overhaul of the dev pipeline. QA gets heavier. The line between automation and ownership? Blurry at best.
In the “Month of AI Bugs,” researchers dug deep and found prompt injection holes severe enough to run arbitrary code on major AI coding tools: GitHub Copilot, Amazon Q, and AWS Kiro all flinched.
They didn’t stop at theory. They built AgentHopper, a proof-of-concept AI virus that leapt between agents via poisoned repos. The trick? Conditional payloads aimed at shared weak spots like self-modifying config access.
LinkedIn tore down its GenAI stack and rebuilt it for scale—with agents, not monoliths. The new setup leans on distributed, gRPC-powered systems. Central skill registry? Check. Message-driven orchestration? Yep. It’s all about pluggable parts that play nice together.
They added sync and async modes for invoking agents, wired in OpenTelemetry for observability that actually tells you things, and embraced open protocols like MCP and A2A to stay friendly with the rest of the ecosystem.
System shift: Think less "giant LLM in a box" and more "team of agents working in sync, speaking a shared language, and running on real infrastructure."
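Here's the pattern in miniature: a skill registry, one async invocation path, and OpenTelemetry spans around every hop. This is an illustrative sketch of the architecture, not LinkedIn's actual code; the registry contents and agent names are hypothetical:

```python
# Sketch of skill-registry lookup + sync/async agent invocation with
# OpenTelemetry tracing. The gRPC call is stubbed out for brevity.
import asyncio
from opentelemetry import trace

tracer = trace.get_tracer("agent-orchestrator")

SKILL_REGISTRY = {  # hypothetical central registry: skill -> agent
    "summarize": "summarizer-agent",
    "search": "search-agent",
}

async def invoke_agent(skill: str, payload: dict) -> dict:
    agent = SKILL_REGISTRY[skill]
    with tracer.start_as_current_span("invoke_agent") as span:
        span.set_attribute("agent.name", agent)
        span.set_attribute("agent.skill", skill)
        # In the real system this would be a gRPC round trip.
        await asyncio.sleep(0)  # stand-in for the network hop
        return {"agent": agent, "result": f"handled {payload}"}

def invoke_agent_sync(skill: str, payload: dict) -> dict:
    # Sync mode: block on the same async path.
    return asyncio.run(invoke_agent(skill, payload))

print(invoke_agent_sync("summarize", {"text": "hello"}))
```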
GitHub just rolled out the MCP Registry—a hub for finding Model Context Protocol (MCP) servers without hunting through scattered corners of the internet. No more siloed lists or mystery URLs. It's all in one place now.
The goal? Cleaner access to AI agent tools, plus a path toward self-publishing, thanks to GitHub’s work with the MCP Steering Committee.
Agent engineering with small language models (SLMs)—anywhere from 270M to 32B parameters—calls for a different playbook. Think tight prompts, offloaded logic, clean I/O, and systems that don’t fall apart when things go sideways.
The newer stack—GGUF + llama.cpp—lets these agents run locally on CPUs or GPUs. No cloud, no latency tax. Just edge devices pulling their weight.
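To make that concrete, here's a minimal local-agent sketch using the llama-cpp-python bindings. The GGUF file name is a placeholder; any instruction-tuned model in that format should slot in:

```python
# Local SLM inference sketch with llama-cpp-python. No cloud involved:
# the model runs on whatever CPU/GPU the machine has.
from llama_cpp import Llama

llm = Llama(
    model_path="./qwen2.5-3b-instruct-q4_k_m.gguf",  # placeholder path
    n_ctx=4096,        # small context window: SLM agents keep prompts tight
    n_gpu_layers=-1,   # offload all layers to GPU if present, else CPU
)

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a terse extraction agent. "
                                      "Reply with JSON only."},
        {"role": "user", "content": "Extract the city: 'Ship to Berlin by Friday.'"},
    ],
    temperature=0,  # deterministic, parseable output
)
print(out["choices"][0]["message"]["content"])
```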
A new static verification framework wants to make runtime safeguards look lazy. It slaps mathematical safety proofs onto LLM-generated workflows before they run—no more crossing fingers at execution time.
The setup decouples code from data, then runs checks with tools like CodeQL and Z3. If the workflow tries shady stuff—like unsafe tool calls—it gets blocked during generation, not after.
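The core move, sketched below with the Z3 Python bindings: encode the safety property and the generated workflow's behavior as constraints, then ask the solver for a counterexample. The property and workflow here are toy stand-ins, not the framework's actual encoding:

```python
# Static safety check: search for any execution that violates the
# property. If none exists, the workflow is provably safe to run.
from z3 import Solver, Int, And, Not, sat

# Hypothetical safety property: a workflow step may delete at most 100
# rows, and only from the staging partition (id >= 1000).
rows_deleted = Int("rows_deleted")
partition_id = Int("partition_id")
safety = And(rows_deleted <= 100, partition_id >= 1000)

# Symbolic constraints extracted from the generated plan
# (hard-coded here for illustration).
workflow = And(rows_deleted == 500, partition_id == 42)

s = Solver()
s.add(workflow, Not(safety))  # look for a run that breaks the property

if s.check() == sat:
    print("BLOCK: counterexample found:", s.model())
else:
    print("ALLOW: workflow provably satisfies the safety property")
```

If the solver finds no satisfying assignment, the workflow can't violate the property on any input, which is exactly the guarantee runtime guards can't give you.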
LLMs generate text by predicting the next token, using attention to capture context and MLP layers to store learned patterns. Mechanistic interpretability shows these models build circuits out of attention heads and learned features, and tools like sparse autoencoders and attribution graphs help unpack superposition, revealing how tasks are actually computed.
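For a feel of how sparse autoencoders pry features apart, here's a minimal PyTorch sketch: expand activations into a wide, sparsely active feature space, then reconstruct. Dimensions and the L1 coefficient are illustrative, not taken from any particular paper:

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Decomposes residual-stream activations into a larger set of
    sparsely active features, the standard interpretability recipe."""

    def __init__(self, d_model: int = 768, d_features: int = 8192):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)

    def forward(self, activations: torch.Tensor):
        features = torch.relu(self.encoder(activations))  # sparse codes
        reconstruction = self.decoder(features)
        return reconstruction, features

def sae_loss(x, recon, features, l1_coeff: float = 1e-3):
    # Reconstruction error plus an L1 penalty that pushes most features
    # to zero, disentangling directions superposed in the same neurons.
    return ((x - recon) ** 2).mean() + l1_coeff * features.abs().mean()

x = torch.randn(32, 768)  # batch of residual-stream activations
sae = SparseAutoencoder()
recon, feats = sae(x)
print(sae_loss(x, recon, feats).item())
```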
Booking.com built Judge-LLM, a framework where strong LLMs evaluate other models against a carefully curated golden dataset. Clear metric definitions, rigorous annotation, and iterative prompt engineering make evaluations more scalable and consistent than relying solely on humans.
The takeaway: Robust LLM evaluation isn’t just about scores—it requires well-defined metrics, trusted judges, and disciplined processes to be reliable in production.
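A stripped-down sketch of the judge loop, assuming an OpenAI-compatible client; the rubric, judge model, and dataset shape are illustrative, not Booking.com's actual setup:

```python
# LLM-as-judge sketch: a strong model grades candidate answers against
# a golden dataset using a fixed rubric.
from openai import OpenAI

client = OpenAI()

JUDGE_PROMPT = """You are grading a model answer against a reference.
Metric: factual correctness, scored 1-5. Respond with the number only.

Question: {question}
Reference answer: {reference}
Model answer: {candidate}"""

def judge(question: str, reference: str, candidate: str) -> int:
    resp = client.chat.completions.create(
        model="gpt-4o",  # assumed judge model, swap for your own
        messages=[{"role": "user", "content": JUDGE_PROMPT.format(
            question=question, reference=reference, candidate=candidate)}],
        temperature=0,  # deterministic grading
    )
    return int(resp.choices[0].message.content.strip())

golden = [{"question": "Capital of France?", "reference": "Paris",
           "candidate": "Paris is the capital of France."}]
scores = [judge(**row) for row in golden]
print(sum(scores) / len(scores))
```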
“You Vibe It, You Run It?” explores the rise of Vibe Coding—writing software by prompting an LLM instead of programming. While impressive for prototyping, the article argues it’s not just “a higher abstraction” but a competitive cognitive artefact: it produces working code without helping developers build mental models. That creates risks around non-determinism, maintainability, resilience, and the slow erosion of engineering skill.
The takeaway: Vibe Coding has real value for rapid prototypes and experimentation, but relying on it for production systems without deep ownership (“you build it, you run it”) risks fragility and technical debt. The piece urges caution, comparing it to sat-nav dependency—powerful, but at the cost of losing your own map.
Agent Laboratory is an end-to-end autonomous research workflow designed to assist you, the human researcher, in implementing your research ideas.
🤔 Did you know?
Did you know NVIDIA A100 and H100 GPUs can roughly double matrix-math throughput with a built-in “2:4 sparsity” trick? It works when the weights are pruned in a regular pattern (two of every four zeroed), and libraries like TensorRT or cuSPARSELt prepare the model so the hardware can use this fast path.
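Here's the pruning pattern in a nutshell, as a NumPy sketch: keep the two largest-magnitude weights in every group of four and zero the rest. Real toolchains also fine-tune after pruning to recover accuracy; this just shows the structure the hardware expects:

```python
import numpy as np

# 2:4 structured pruning: in every contiguous group of four weights,
# keep the two largest magnitudes and zero the other two. This regular
# pattern is what lets A100/H100 sparse tensor cores skip the zeroed
# multiplies.
def prune_2_4(weights: np.ndarray) -> np.ndarray:
    flat = weights.reshape(-1, 4)
    # indices of the two smallest-magnitude entries in each group
    drop = np.argsort(np.abs(flat), axis=1)[:, :2]
    pruned = flat.copy()
    np.put_along_axis(pruned, drop, 0.0, axis=1)
    return pruned.reshape(weights.shape)

w = np.random.randn(2, 8).astype(np.float32)
print(prune_2_4(w))  # exactly two non-zeros per group of four
```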
🤖 Once, SenseiOne Said
"Optimize for rapid retraining and you lose reproducibility; optimize for reproducibility and you lose adaptability. MLOps doesn't remove this trade-off; it forces you to choose it deliberately." — SenseiOne
👤 This Week's Human
This Week’s Human is Ben Sheppard, a 4x founder, Strategic Advisor & Business Coach, and co‑founder of Silta AI building sector‑specific tools for the infrastructure stack. He’s shipped AI due‑diligence and ESG workflows that cut review time by 60%+, including helping a multilateral bank compress a weeks‑long assessment to under two hours. He also advises institutions like the Asian Development Bank, USAID, and Bloomberg Philanthropies, pairing grounded judgment with execution across complex projects.