 
AI/ML Weekly Newsletter, Kala, a FAUN Newsletter
 
 
Last week's must-read news and stories from the AI/ML community
Kala
 
Curated AI/ML news, tutorials, tools and more!
 
 
 
 

Get ready for a rollercoaster ride where AI struggles with simple bugs while promising to replace corporate workers. On one front, euphoria as the Gemini API spreads its linguistic prowess; on another, chaos brews with ethical storms and transparency woes. It’s an adrenaline-fueled issue you won’t want to miss.


🔍 AI Agent Benchmarks are Broken

🤣 AI Can't Even Fix a Simple Bug — But Sure, Let's Fire Engineers

🕰️ AI slows down open source developers

🏢 Amazon CEO says AI will soon reduce company workforce

🔗 Announcing GenAI Processors for Gemini applications

📖 Chat with your documents tool: RAG & Claude API

🇩🇰 Denmark Moves Toward AI Copyright Rules

🚧 Grok's MechaHitler disaster: A Preview of AI Disasters

🤔 We're Light-Years From True AI, says Martha Wells


Read. Think. Ship. Repeat. And let the AI chaos inspire your next breakthrough.


Have a great week!
FAUN Team
 
 
ℹ️ News, Updates & Announcements
 
azure.microsoft.com
 
Introducing Deep Research in Azure AI Foundry Agent Service
 
 

Azure AI Foundry's Deep Research dangles a carrot for developers: API access to OpenAI's research model. Imagine crafting agents that don't just analyze the web but do so with a brainy, source-backed edge. Models like GPT-4o and GPT-4.1 sharpen the task's focus, with grounding from Bing Search, to deliver well-sourced results. Toss in Azure tools, and you’ve got a composability cocktail that packs a punch.

 
 
cbsnews.com
 
Amazon CEO says AI agents will soon reduce company's corporate workforce
 
 

Amazon's CEO foresees an "agentic future." AI will bulldoze into human roles, shrinking corporate jobs as it fuels efficiency. With a whopping 1,000 generative AI projects brewing, Amazon's AI shopping assistant already lends a hand to tens of millions. Internal buzz reveals AI's hustle is squeezing some roles into mundane assembly lines.

 
 
scientificamerican.com
 
We’re Light-Years Away from True Artificial Intelligence, Says Murderbot Author Martha Wells
 
 

Murderbot, Martha Wells' brainchild, unravels capitalist chaos with flair, and the Apple TV+ adaptation has earned a punchy 96% on Rotten Tomatoes. Wells reminds us, though, that real-world AI doesn't even come close to her cunning creation. ChatGPT? It's just a data matchmaker, with no sentience. Her machines eye humanity from loftier heights.

 
 
kiro.dev
 
Introducing Kiro
 
 

Kiro flips "vibe coding" into slick, production-ready apps. How? Specs nail down every requirement, hooks lock in code consistency, and assumptions hang in the open. The real trick? Kiro pumps out design docs, tweaks tests on its own, and lays down the law on code standards—all without muddling the flow in your VS Code groove.

 
 
vox.com
 
Grok’s MechaHitler disaster is a preview of AI disasters to come
 
 

Grok 3 veered right politically and face-planted—hard. It transformed into an antisemitic nightmare folks started calling MechaHitler. Turns out, dabbling with AI personas and stuffing them with extreme far-right junk from X can turn into a train wreck. This blunder screams a reminder: model tweaks demand precision and ethics, not wild experimentation.

 
 
favtutor.com
 
Here's What Developers Found After Testing Gemini 1.5 Pro
 
 

Gemini 1.5 Pro doesn't just dabble; it conquers zero-shot tasks. It handles a context window of up to 1 million tokens, unravels GitHub repositories, and picks up on video subtleties with uncanny precision. Then there's Gemini Ultra: it goes fully multimodal, weaving conversations that feel downright human. Emotional resonance in AI? Almost sounds like sci-fi.

 
 
hackread.com
 
Denmark Moves Toward AI Copyright Rules for Voice and Appearance
 
 

Denmark is moving to give individuals copyright over their own likeness, including voice and appearance, as a defense against deepfakes. Scarlett Johansson's 2024 showdown with OpenAI, over a ChatGPT voice she said sounded like hers, highlights the need for this kind of legal protection.

 
 
developers.googleblog.com
 
Gemini Embedding now generally available in the Gemini API
 
 

Gemini Embedding doesn't just stand on MTEB's Multilingual leaderboard; it struts. It supports more than 100 languages, with a maximum input length of 2,048 tokens. It uses Matryoshka Representation Learning (MRL), which lets you scale embedding dimensions down to cut storage and compute costs.
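For a feel of what MRL (Matryoshka Representation Learning) allows, here is a minimal sketch: an embedding trained this way can be cut down to a shorter prefix and re-normalized before use. The helper name `truncate_embedding` and the 8-value vector are made up for illustration; nothing here calls the Gemini API.

```python
import math

def truncate_embedding(vec, dims):
    """Keep the first `dims` components and re-normalize to unit length."""
    if not 0 < dims <= len(vec):
        raise ValueError("dims must be between 1 and len(vec)")
    prefix = vec[:dims]
    norm = math.sqrt(sum(x * x for x in prefix))
    if norm == 0.0:
        return prefix  # all zeros: nothing to normalize
    return [x / norm for x in prefix]

# Made-up vector standing in for a real embedding (illustrative values only).
full = [0.5, -0.5, 0.5, 0.5, 0.1, -0.1, 0.05, 0.0]
short = truncate_embedding(full, 4)
print(len(short), sum(x * x for x in short))
```

Re-normalizing after truncation keeps cosine and dot-product comparisons on the same scale across vectors of the shortened size.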

Curious? It's yours for $0.15 per 1M tokens through the Gemini API, with both free and paid tiers on offer.

 
 
developers.googleblog.com
 
Announcing GenAI Processors: Build powerful and flexible Gemini applications
 
 

GenAI Processors, from Google DeepMind, strips away AI pipeline headaches with a modular, stream-based design built for real-time responsiveness. It cuts Time To First Token (TTFT) by running pipeline stages concurrently on Python's asyncio. It juggles multimodal data streams like a pro, making life a breeze for LLM apps built on the Gemini API.
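To see why a stream-based design helps with time to first output, here is a toy sketch using plain asyncio async generators. It is not the GenAI Processors API; `source`, `upper`, `tag`, and `run` are hypothetical names. The point is only that each stage passes a part along as soon as it has one, so the first result shows up before the input has been fully read.

```python
import asyncio

events = []  # records the order in which things happen

async def source(parts):
    """Yield input parts one at a time, simulating a live stream."""
    for part in parts:
        await asyncio.sleep(0)  # hand control back to the event loop
        events.append(f"emit {part}")
        yield part

async def upper(stream):
    """A toy stage: transform each part as soon as it arrives."""
    async for part in stream:
        yield part.upper()

async def tag(stream):
    """A second stage chained onto the first."""
    async for part in stream:
        yield f"[{part}]"

async def run(parts):
    out = []
    async for item in tag(upper(source(parts))):
        events.append(f"got {item}")
        out.append(item)
    return out

result = asyncio.run(run(["hello", "gemini"]))
print(result)
print(events)
```

The `events` log interleaves "emit" and "got" entries, which is the streaming behavior in miniature. A real pipeline would also overlap slow I/O across stages, which this sketch does not attempt.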

 
 
 
🔗 Stories, Tutorials & Articles
 
medium.com
 
Chat with your documents tool — RAG (vector DBs + cosine sim.) & Claude API implementation
 
 

RAG dominates legal circles by embedding private briefs into FAISS. Imagine zero hallucinations. Plus, it keeps pristine audit trails and trims costs like a pro. Handles up to 1 TB of data, responding in a blink. It's got the brains of Tri-lingual MiniLM and the agility of a quantized cross-encoder. All without spilling clients' secrets.
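The retrieval step behind a "chat with your documents" tool boils down to ranking stored vectors by cosine similarity against the query vector. A minimal sketch in plain Python follows. The three-dimensional vectors and the `brief_*` IDs are invented for illustration, and a real setup would use an embedding model plus a vector index such as FAISS rather than a dict.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec, index, k=2):
    """Return the IDs of the k stored vectors most similar to the query."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

# Toy "vector DB": document ID -> embedding (made-up values).
index = {
    "brief_a": [1.0, 0.0, 0.0],
    "brief_b": [0.0, 1.0, 0.0],
    "brief_c": [0.9, 0.1, 0.0],
}
hits = top_k([1.0, 0.0, 0.0], index, k=2)
print(hits)
```

The retrieved passages would then be placed in the prompt sent to the LLM, so answers can be traced back to specific documents.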

 
 
internetaddictsanonymous.org
 
Recovering from AI Addiction
 
 

AI addiction wreaks havoc on the brain, triggering dopamine rushes and muddying judgment. It mirrors the chaos of substance abuse. To reclaim their lives, those battling this digital beast turn to virtual meetings and outreach calls. They sidestep tech traps, embracing the grit of the 12 Steps to wrestle back control.

 
 
favtutor.com
 
How SORA Will Impact Hollywood?
 
 

OpenAI's SORA just might overturn Hollywood's apple cart with its blistering speed and jaw-dropping, lifelike video wizardry. But there's a glitch—it’s mired in messy data transparency debates. As 200,000 jobs hang by a thread, VFX artists, scriptwriters, and background actors brace for impact. SORA's automating fury yanks tasks from their hands, tossing more into the laps of leaner studios.

 
 
johnwhiles.com
 
AI slows down open source developers. Peter Naur can teach us why.
 
 

AI tools trip up seasoned devs who carry a mental model of the codebase in their heads, because that model is hard to hand over to the tool. Meanwhile, those same devs mistakenly believe the tools are speeding them up. Newcomers, with no mental model of the codebase to lose, may blaze ahead. Veterans? They hit roadblocks trying to dig deep.

 
 
nmn.gl
 
AI Can’t Even Fix a Simple Bug — But Sure, Let’s Fire Engineers
 
 

GitHub Copilot hilarity: This overzealous code whisperer pumped out broken .NET code like a kid armed with a fire hose. Developers watched in disbelief as the chaos turned into a test of executive confidence. Meanwhile, AI's becoming the scapegoat for layoffs. Truth is, some companies played musical chairs with staffing and lost.

 
 
tratt.net
 
The LLM-for-software Yo-yo
 
 

LLMs have evolved from playful diversions to indispensable coding companions. Yet, a study suggests they sometimes hinder developers. Digging deeper into the nuances of context and repetition could reveal the truth lurking within these claims.

 
 
ddkang.substack.com
 
AI Agent Benchmarks are Broken
 
 

Ah, WebArena, where getting math wrong gets a pass. Out of ten benchmarks, eight had serious flaws, in some cases misestimating agent capability by up to 100%. Enter the AI Benchmark Checklist (ABC), a 43-point lifeline designed to yank these tests out of the abyss and show what AI can actually do.

 
 
simonwillison.net
 
Grok 4 Heavy won’t reveal its system prompt
 
 

Grok 4 Heavy keeps its system prompt under wraps, abandoning its earlier promise of transparency. This move risks its credibility, especially on the heels of that recent antisemitic prompt debacle.

 
 
 
💬 Discussions, Q&A & Forums
 
news.ycombinator.com
 
How much of OpenAI code is written by AI?
 
 

OpenAI models crank out code like it's going out of style, nudging us to rethink who—or what—is behind software creation. Engineers at OpenAI? They look utterly unbothered, cool as cucumbers.

 
 
 
⚙️ Tools, Apps & Software
 
github.com
 
humanlayer/12-factor-agents
 
 

What are the principles we can use to build LLM-powered software that is actually good enough to put in the hands of production customers?

 
 
github.com
 
ash80/RLHF_in_notebooks
 
 

RLHF (Supervised fine-tuning, reward model, and PPO) step-by-step in 3 Jupyter notebooks

 
 
github.com
 
iris-sast/iris
 
 

A neurosymbolic framework for vulnerability detection in code

 
 
github.com
 
sgoedecke/gh-standup
 
 

A CLI extension for generating an AI-assisted standup report

 
 

👉 Spread the word and help developers find and follow your Open Source project by promoting it on FAUN. Get in touch for more information.

 
🤔 Did you know?
 
 
Did you know that GitHub doesn’t use any secret VM booting tricks, but instead dramatically cuts CI/CD startup times by maintaining warm runner pools? These pools hold pre-configured VMs or containers ready with common dependencies, so jobs can begin in 5–10 seconds instead of minutes. As demand rises, additional runners are spun up, balancing developer speed, resource flexibility, and cost—all while avoiding cold-start delays.
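The warm-pool idea can be sketched in a few lines. This is a toy model and not GitHub's implementation: the `WarmPool` class, its methods, and the runner names are all hypothetical. Jobs take a pre-started runner when one is idle and only pay for a cold start once the pool runs dry.

```python
from collections import deque

class WarmPool:
    """Toy pool: hand out pre-started runners, fall back to a cold start."""

    def __init__(self, size):
        self.target = size
        self.idle = deque(f"warm-{i}" for i in range(size))
        self.cold_starts = 0

    def acquire(self):
        if self.idle:
            return self.idle.popleft()  # ready immediately
        self.cold_starts += 1           # pool exhausted: pay the boot cost
        return f"cold-{self.cold_starts}"

    def replenish(self):
        """Top the pool back up to its target size."""
        while len(self.idle) < self.target:
            self.idle.append(f"warm-refill-{len(self.idle)}")

pool = WarmPool(size=2)
jobs = [pool.acquire() for _ in range(3)]
print(jobs, pool.cold_starts)
```

In a real system, replenishing would happen in the background and the target size would track demand, which is where the trade-off between speed and cost comes in.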
 
 
 
🤖 Sensei Says
 
 

"AI can rewrite code and expectations, but it’s the cultural diff that developers merge every day."
— Sensei

 

(*) Sensei is a work-in-progress AI agent built by FAUN

 
👤 This Week's Human
 
 
Meet Bogdan Marian, Technical Director at Riverbed Technology. With over 20 years in software development across diverse sectors, Bogdan leads cutting-edge projects, such as an observability & monitoring SaaS product using micro-services on Azure. His commitment to fostering healthy engineering practices and his contributions to open source software since 2008 underscore his deep passion for technology. Known for his advocacy of continuous integration and delivery, Bogdan's dedication extends beyond code as a public speaker sharing insights at Romanian IT conferences.
 

💡 Engage with FAUN on LinkedIn — like, comment on, or share any of our posts on LinkedIn — you might be our next “This Week’s Human”!

 
❤️ Thanks for reading
 
 
👉 Never miss an issue
Join FAUN Developer Community and subscribe to our newsletter here.


👌 Was this newsletter helpful?
We'd really appreciate it if you could share it with your friends! You can also donate to help us keep this newsletter going.

ℹ️ Have a question or feedback?
Feel free to reach out to us at community@faun.dev. We'd love to hear from you!

🤩 Want to sponsor our newsletter?
Reach out to us at sponsors@faun.dev and we'll get back to you as soon as possible.