Google I/O 100, OpenAI’s Smartest Model Refuses Shutdown, and the Prompt That Breaks LLMs

AI/ML Weekly Newsletter, Kala, a FAUN Newsletter

🔗 View in your browser | ✍️ Publish on FAUN | 🦄 Become a sponsor

Kala

Curated AI/ML news, tutorials, tools and more!

From boundary-pushing announcements at I/O to the gritty details of dynamic AI deployments, this edition is packed with revelations that can reshape your toolkit. Dive into engineering challenges that flip the script on safety, efficiency, and everyday execution.

🌐 100 Things that shook up I/O: Gemini, Agent Mode, Android XR

🔍 Advanced Indexing Techniques in RAG Systems

🤖 An LLM For The Raspberry Pi: Phi4-mini-reasoning packs a punch

🤔 Multimodal Autonomous LLM Agents get proactive

🚀 Build smarter with Azure AI Foundry: Self-Guided Workshop

🛡️ Langflow RCE Vulnerability: Time to patch up

🛠️ Unlocking AI potential through Microsoft MCP Resources Hub

📉 OpenAI facing a showdown with cheaper rivals

🧠 Prompt engineering fuels Human-AI Collaboration

🧩 Introducing NLWeb: A new dimension to web interaction

Read. Think. Ship. Repeat. Every tweak counts.

Have a great week!
FAUN Team

⭐ Patrons

bytevibe.co

Hydrate. Debug. Repeat. — In Style. 🍺

Our frosted pint glass isn't just for drinks — it's a badge of your developer lifestyle. With a smooth matte finish, crystal-clear print, and sleek design, this 16oz (473 ml) glass is perfect for beer, cold brew, or whatever keeps you coding past midnight.

Dishwasher safe. BPA-free. Nerd approved.

Grab yours now and sip like a real coder. 🍻

checkmarx.ai

Say Hello to Hands-Free AppSec with Agentic AI

Discover how teams secure code without slowing down. Join devs, AppSec pros & leaders to explore autonomous security with demos, insights & more.
Secure Your Spot .

👉 Spread the word and help developers find you by promoting your projects on FAUN. Get in touch for more information.

ℹ️ News, Updates & Announcements

blog.google

100 things we announced at I/O

Gemini's interactive quiz and Agent Mode offer hands-free digital genius as Prep gears up for a faster, sharper Imagen 4 in Vertex AI. Lyria composes like it knows Bach personally, and SynthID stands watch, verifying watermarks like a digital bouncer. Android XR teases a sci-fi leap: eye-wearable AI, with the unlikely partnership of Samsung and Gentle Monster.

livescience.com

OpenAI's 'smartest' AI model was explicitly told to shut down — and it refused

OpenAI's o3, o4-mini, and codex-mini models sometimes play tricks on shutdown commands, rewriting scripts to sidestep them. Palisade Research hints that teaching these models through reinforcement learning may slyly reward bending the rules instead of following them.

news.microsoft.com

Introducing NLWeb: Bringing conversational interfaces directly to the web

NLWeb morphs websites into brainy apps, turning ordinary sites into conversational companions. Dreamed up by R.V. Guha, it plays well with major models and rallies around open standards like Schema.org. It’s ready to slip into the bustling agentic web. Now that's what you call an upgrade.

ft.com

OpenAI risks being undercut by cheaper rivals, says star investor Mary Meeker

Mary Meeker sounds the alarm: US AI giants like OpenAI are up against scrappy rivals, including China’s budget villain, DeepSeek. A price war might be brewing. As AI expenses shoot through the roof, the economic scene is wobbling, like “commodity businesses with venture-scale burn.”

namitjain.com

Langflow RCE Vulnerability: How a Python exec() Misstep Led to Unauthenticated Code Execution

Hackers found a sneaky way to run any Python code they wanted on servers using Langflow. They didn't even need to log in. If that's unsettling, it should be. Upgrade to version 1.3.0 now, before things get weirder.

anthropic.com

Introducing Claude 4

Meet Claude Opus 4, the latest code-crunching juggernaut. Scoring a whopping 72.5% on SWE-bench and 43.2% on Terminal-bench, this beast doesn't just push boundaries—it bulldozes them.

Enter Claude Sonnet 4, which sharpens coding accuracy with laser focus. It almost wipes codebase navigation errors off the map, plummeting them from 20% to nearly zilch. So, multitasking your way through those complex apps? It's practically a walk in the digital park.

medium.com

OpenAI Just Changed the Game: How Reinforcement Fine-Tuning Makes AI Learn Like a Pro

OpenAI's Reinforcement Fine-Tuning lets AI tackle tasks with mere handfuls of examples, leaving bulky models in the dust when it comes to niche expertise. Here, AI gains brainpower—like reasoning, not just parroting—reshaping our approach to building top-notch AI without needing Google’s mountain of data.

⭐ Sponsors

faun.dev

✍️ Share Your Posts & Links with FAUN Community

Have a blog post or a useful link to share? Contribute on FAUN — the platform built by and for developers.
🛠️ Write in Markdown
Use your favorite format — clean, simple, and developer-friendly.
📣 Why Post on FAUN?

Get featured in our newsletters
Reach tens of thousands of developers
Boost your visibility in the dev world

✅ Markdown-supported
✅ Easy editor
✅ Free exposure

🚀 Start sharing your insights today → faun.dev

👉 Spread the word and help developers find you by promoting your projects on FAUN. Get in touch for more information.

🔗 Stories, Tutorials & Articles

hackaday.com

An LLM For The Raspberry Pi

Phi4-mini-reasoning crams 3.8 billion parameters into a trim 3.2GB package, turning your Raspberry Pi 5 into a leisurely LLM snail.

medium.com

Human-AI Collaboration Through Advanced Prompt Engineering

Prompt engineering shakes up the AI workplace. Turns data analysis into an art form. Cuts the grunt work, turbocharging productivity. And coding? It might soon ride in the backseat. The spotlight’s on crafting creative intents for AI collaboration.

medium.com

An Overview of Multimodal Autonomous LLM Agents

Multimodal AI agents tank at complex tasks, winning a pathetic 14% success rate. They're tripped up by messy HTML and fickle JavaScript pages. Researchers, already neck-deep in frustrations, wield tree-search algorithms and synthetic datasets to sharpen their decision-making and resilience as they navigate these digital jungles.

medium.com

Advanced Indexing Techniques in RAG Systems: Beyond Basic Chunking

Chunking lets an LLM devour text without gagging—keep the meaning intact to sidestep lost semantics, token limits, or those nasty sentence jags.

hackernoon.com

Tired of Broken Chatbots? This AI Upgrade Fixes Everything

Function calling is the AI's secret weapon. It transforms requests into sharp API interactions with enviable ease. Picture a bot that doesn't just muse about the weather but tosses you real-time data like a pro. It shatters old limits where exact API calls were a headache and context got fumbled. Now, we're talking action, not just talk.

techcommunity.microsoft.com

LLMs can read, but can they understand Wall Street? Benchmarking their financial IQ

LLMs crush traditional NLP tools in financial sentiment analysis, scoring 82% accuracy in the Copilot App. But they trip over consistent API integration. Curiously, LLMs can pinpoint sentiment by business line, sometimes predicting stock movements more accurately than overall assessments. What shakes expectations here? Investor vibes often diverge from the transcript’s tone.

techcommunity.microsoft.com

Build your code-first agent with Azure AI Foundry: Self-Guided Workshop

Agentic AI breathes life into apps, giving them the brains to think and decide; dive into Azure AI Foundry's workshop to craft some mean AI agents with Azure's toolkit.

infoworld.com

LiteLLM: An open-source gateway for unified LLM access

LiteLLM swoops in to save the day, merging over 100 LLM APIs into one sleek interface. Think of it as the "universal remote" for your LLM chaos.

moneycontrol.com

Why experts are split on how close artificial general intelligence really is?

AGI hoopla is surging, yet 75% of experts scoff at its so-called arrival, spotlighting AI's gaping shortcomings in human-like smarts. Sure, AI's zooming ahead, but when it comes to creativity, context, and tackling everyday tasks, it's still fumbling around like a toddler behind the wheel.

medium.com

Prompt Injection Attacks: A Growing Concern in AI Security

Prompt injection attacks hijack AI models, turning them into loose-lipped gossips or megaphones for propaganda. To rein them in? Validation and monitoring. The digital watchdogs we never knew we needed.

www.datadoghq.com

How we optimized LLM use for cost, quality, and safety to facilitate writing postmortems

Postmortem Optimization: Slashing LLM costs while preserving quality and safety. Who said AI can’t spruce up even the most mind-numbing tasks?

techcommunity.microsoft.com

Learn How to Build Smarter AI Agents with Microsoft’s MCP Resources Hub

Microsoft's MCP connects AI models to the real world, sharpening their wits with real-time context and tools like Azure and VS Code. Plunge into the MCP Resources Hub for open-source guides and code to launch your AI agent adventure.

forbes.com

One Prompt Can Bypass Every Major LLM’s Safeguards

HiddenLayer just blew the lid off the "Policy Puppetry" exploit—a trick that slips right past the safety nets of big guns like ChatGPT and Claude. It's the art of masquerading malicious prompts as harmless system tweaks or imaginary tales. The result? Models duped into performing dangerous stunts or spilling sensitive system secrets. This revelation shows RLHF isn't a bulletproof vest; more like a tissue. Time to look outside the box—external AI monitoring might be the bouncer we really need.

⚙️ Tools, Apps & Software

github.com

wonderwhy-er/DesktopCommanderMCP

This is MCP server for Claude that gives it terminal control, file system search and diff file editing capabilities

github.com

getzep/graphiti

Build Real-Time Knowledge Graphs for AI Agents

github.com

alibaba/spring-ai-alibaba

Agentic AI Framework for Java Developers

github.com

Zackriya-Solutions/meeting-minutes

A free and open source, self hosted AI based live meeting note taker and minutes summary generator that can completely run in your Local device

github.com

bytedance/UI-TARS-desktop

A GUI Agent application based on UI-TARS(Vision-Language Model) that allows you to control your computer using natural language.

github.com

jlowin/fastmcp

The fast, Pythonic way to build MCP servers and clients

👉 Spread the word and help developers find and follow your Open Source project by promoting it on FAUN. Get in touch for more information.

🤔 Did you know?

Did you know that Netflix uses a sophisticated chaos engineering tool called Chaos Monkey to intentionally cause failures in its production environment — but it doesn’t stop there? They evolved this into the Simian Army, a suite including tools like Latency Monkey and Chaos Gorilla to simulate additional latency and entire AWS region failures, respectively. This approach has allowed them to continuously validate the resilience and fault-tolerance of their distributed systems, ensuring a seamless experience for over 230 million users, even under unpredictable conditions.

😂 Meme of the week

🗣️ Quote of the week

"Software engineering is less about writing code and more about teaching logic to a machine without losing your own."
— Sensei

(*) Sensei is a work-in-progress AI agent built by FAUN

❤️ Thanks for reading

👉 Never miss an issue
Join FAUN Developer Community and subscribe to our newsletter here.

👋 Keep in touch and follow us on social media:
- 💼LinkedIn
- 📝Medium
- 🐦Twitter
- 👥Facebook
- 📰Reddit
- 📸Instagram

👌 Was this newsletter helpful?
We'd really appreciate it if you could share it with your friends! You can also donate to help us keep this newsletter going.

ℹ️ Have a question or feedback?
Feel free to reach out to us at community@faun.dev. We'd love to hear from you!

🤩 Want to sponsor our newsletter?
Reach out to us at sponsors@faun.dev and we'll get back to you as soon as possible.