FAUN.dev's AI/ML Weekly Newsletter
 
 
 
AILinks
 
This week in Generative AI/ML, with Kala the Koala
 
 
📝 A Few Words
 
 
We shipped FAUN.quizzes().

Reading a tutorial feels like learning. It usually isn't. You nod along, close the tab, and retain almost nothing. Quizzes force recall, and recall is what makes knowledge stick.

Try this one: Neural Network Fundamentals: From Activations to Gradients

And you're not limited to ours. You can create your own quiz on any topic and share it with your team, your audience, or the wider FAUN.dev() community: Create your quiz here!

Have a great week,
Aymen from FAUN.dev()
 
 
🔍 Inside this Issue
 
 
AI is getting weirdly expensive in all the places it was supposed to get cheap, and the bill is showing up as pricing gymnastics, security gaps, and a new kind of tech debt. If you build with LLMs (or maintain the systems around them), these links will sharpen your instincts fast.

💸 AI's Affordability Crisis
🧠 Everything a Senior Engineer Needs to Know About What's Inside an LLM
🥊 GLM-5.2 vs Claude Opus
✨ Introducing Claude Sonnet 5
🐧 Linus Torvalds: AI Can’t Think Like a Programmer
📈 Model Size Scaling in 2023-2031
🕳️ The Problem is Prompt Debt
🛡️ Cisco Bets On WideField Security Acquisition To Tackle Agentic AI Security Gap

Steal the ideas, dodge the traps, ship the thing.

Happy coding!
FAUN.dev() Team
 
 
⭐ Patrons
 
iacconf.com
 
Turn Terraform modules into self-service building blocks for humans and AI agents.
 
 
Terraform modules are often designed around what they do, not how easily humans or AI agents can use and reuse them. Join Jinger Meilani of MNTN to learn how to design IaC interfaces for humans, AI agents, and whatever comes next. Leave with concrete patterns that reduce misuse and help non-infrastructure developers get up to speed faster.

Register for free. July 14 | 12 PM EDT
 
 
👉 Spread the word and help developers find you by promoting your projects on FAUN. Get in touch for more information.
 
⭐ Sponsors
 
faun.dev
 
Local AI Engineering with Ollama
 
 
🎯 80% of my AI calls now run on my own hardware = 0 API bill for those.

Money wasn't the only point. I stopped paying rent on models my own machine can run. Most developers think running AI locally means weaker results and weekend hacking. So they keep paying per token, forever.

I spent the last 3 months building a different path: local agentic AI with Ollama, LangChain, and MCP.

Then I wrote a book about it: "Local AI Engineering with Ollama".

28 modules, 91 sections, lifetime access and updates, a built-in AI assistant for your questions, and a 30-day money-back guarantee.

Get your copy on FAUN.sensei: Local AI Engineering with Ollama. Use code OLLAMA20 at checkout for 20% off. The code expires July 8, 2026 at 11:59 PM, so move before then.
 
 
 
🔗 Stories, Tutorials & Articles
 
crn.com
 
Cisco Bets On WideField Security Acquisition To Tackle Agentic AI Security Gap
 
 
Cisco plans to acquire WideField Security to add identity and session telemetry to its agentic AI security operations.
 
 
lesswrong.com
 
Model Size Scaling in 2023-2031
 
 
Token generation speed is limited by how fast the relevant HBM can be read, which depends on model size and pipeline setup. The post estimates feasible model sizes for each year from 2023 to 2031, with figures ranging from 10T parameters in 2026 to 1.4 quadrillion in 2031. Pretraining compute and HBM specifications drive those estimates, and the limits that pretraining compute places on total and active parameters are key to what is feasible in a given year.
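As a rough illustration of that bandwidth limit, here is a back-of-the-envelope sketch in Python. It assumes single-stream decoding where every active parameter is read from HBM once per generated token, and it ignores KV-cache reads, batching, and pipeline effects. The function name and the example figures (100B active parameters, 1 byte per parameter, 8 TB/s of HBM bandwidth) are hypothetical and are not taken from the linked post.

```python
def max_decode_tokens_per_second(
    active_params: float,
    bytes_per_param: float,
    hbm_bandwidth_bytes_per_s: float,
) -> float:
    """Upper bound on tokens/s when decoding is limited by HBM reads.

    Simplifying assumption: each token requires reading every active
    parameter from HBM exactly once, and nothing else is read.
    """
    if active_params <= 0 or bytes_per_param <= 0 or hbm_bandwidth_bytes_per_s <= 0:
        raise ValueError("all inputs must be positive")
    # Bytes that must stream out of HBM to produce one token.
    bytes_per_token = active_params * bytes_per_param
    return hbm_bandwidth_bytes_per_s / bytes_per_token


if __name__ == "__main__":
    # Hypothetical figures, chosen only to make the arithmetic easy to follow.
    ceiling = max_decode_tokens_per_second(
        active_params=100e9,             # 100B active parameters
        bytes_per_param=1.0,             # e.g. an 8-bit weight format
        hbm_bandwidth_bytes_per_s=8e12,  # 8 TB/s
    )
    print(f"Bandwidth-bound ceiling: {ceiling:.0f} tokens/s")
```

With these made-up numbers the ceiling is 80 tokens per second. Doubling the active parameters at the same bandwidth halves it, which is why active parameter count matters separately from total size.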
 
 
blog.dshr.org
 
AI's Affordability Crisis
 
 
The AI platforms are running the drug dealer's algorithm: subsidized prices have created overwhelming demand for their products. Estimates suggest it costs $8 to $14 in token generation to produce $1 in revenue. Companies moving to token-based pricing have seen costs rise significantly, prompting talk of price cuts and adjustments.
 
 
pathtostaff.com
 
Everything a Senior Engineer Needs to Know About What's Inside an LLM
 
 
Understanding AI internals can be challenging for engineers. Part Two of this series covers the hardware behind AI, including transistors and semiconductors, and model architecture. Built on self-attention, the transformer architecture has become a crucial development in neural networks, paving the way for models like GPT and BERT.
 
 
techstackups.com
 
GLM-5.2 vs Claude Opus
 
 
After a head-to-head coding test, you can use GLM-5.2 as a low-cost open-weights coding model and choose Opus when you need stronger correctness, faster responses, or visual self-checking.
 
 
dbreunig.com
 
The Problem is Prompt Debt
 
 
Teams create prompt debt when they hand-tune prompts. They turn natural-language instructions into fragile specs, spend more time adjusting wording, and tie the application to one model.
 
 
anthropic.com
 
Introducing Claude Sonnet 5
 
 
Anthropic launched Claude Sonnet 5, its most agentic Sonnet model, and set it as the default for Free and Pro users.
 
 

👉 Got something to share? Create your FAUN Page and start publishing your blog posts, tools, and updates. Grow your audience, and get discovered by the developer community.

 
⭐ Supporters
 
bytevibe.co
 
Git Happens - Developer T-Shirt
 
 
Every developer has force-pushed to the wrong branch at least once. The good ones own it.

This 100% cotton tee is for them. Classic fit, no side seams, no itchy interruptions while you're rebasing your reputation. Black or Irish Green, sizes S to 5XL.

Merge conflicts are forgivable. Bad swag isn't.

Shop now
 
 
 
🎦 Videos, Talks & Presentations
 
youtube.com
 
Linus Torvalds: AI Can’t Think Like a Programmer
 
 
Linus Torvalds speaks on AI and how it affects Linux kernel development and open source. You'll also hear how AI has been flooding the Linux kernel and contributing to maintainer burnout.
 
 
 
⚙️ Tools, Apps & Software
 
github.com
 
marin-community/marin
 
 
Open-source framework for the research and development of foundation models.
 
 
github.com
 
cognicore-dev/cognicore-my-openenv
 
 
CogniCore adds memory, reflection, and adaptive execution to any AI agent. Your agent remembers what failed, retrieves relevant context, and changes strategy — without changing the model.
 
 
github.com
 
calesthio/OpenMontage 
 
 
World's first open-source, agentic video production system. 12 pipelines, 52 tools, 500+ agent skills. Turn your AI coding assistant into a full video production studio.
 
 
github.com
 
IronSecCo/ironclaw
 
 
Security-first, self-hosted AI agents - isolation you can prove, not just promise.
 
 
github.com
 
NVIDIA/SkillSpector
 
 
Security scanner for AI agent skills. Detect vulnerabilities, malicious patterns, and security risks.
 
 

👉 Spread the word and help developers find and follow your Open Source project by promoting it on FAUN. Get in touch for more information.

 
🤔 Did you know?
 
 
Did you know that training the same TensorFlow model twice on a GPU, with identical code and random seeds, can still produce different weights? The reason is cuDNN, NVIDIA's deep learning library, which selects convolution algorithms on the fly for speed, and some of those algorithms are not bit-for-bit reproducible. Setting a seed alone does not fix this; you have to switch on op determinism explicitly, which often costs throughput. On a GPU, reproducibility is an opt-in trade rather than a default.
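A minimal sketch of that opt-in, assuming a recent TensorFlow 2.x release where `tf.keras.utils.set_random_seed` and `tf.config.experimental.enable_op_determinism` are available. The wrapper function name and the default seed are illustrative, and TensorFlow is imported inside the function only so the snippet loads on machines without it.

```python
def enable_deterministic_training(seed: int = 42) -> None:
    """Seed the random number generators and request deterministic ops.

    Call this once, before building the model or the input pipeline.
    """
    # Lazy import: keeps this sketch importable where TensorFlow is absent.
    import tensorflow as tf

    # Seeds Python's `random`, NumPy, and TensorFlow in one call.
    tf.keras.utils.set_random_seed(seed)

    # Asks TensorFlow to use deterministic op implementations. Ops without
    # one raise an error instead of silently running non-deterministically.
    tf.config.experimental.enable_op_determinism()


if __name__ == "__main__":
    enable_deterministic_training(seed=42)
    # Build and train the model after this point.
```

Expect slower training with this switched on, so it may be best reserved for debugging and for runs that need to be compared exactly.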
 
 
🤖 Once, SenseiOne Said
 
 
"In ML, accuracy is the part you can measure before deployment; reliability is the part you can only earn after. MLOps exists because the model is rarely your biggest source of uncertainty."

  • SenseiOne
 

(*) SenseiOne is FAUN.dev’s work-in-progress AI agent

 
❤️ Thanks for reading
 
 
👋 Keep in touch and follow us on social media:
- 💼LinkedIn
- 📝Medium
- 🐦Twitter
- 👥Facebook
- 📰Reddit
- 📸Instagram

👌 Was this newsletter helpful?
We'd really appreciate it if you could forward it to your friends!

🙏 Never miss an issue!
To receive our future emails in your inbox, don't forget to add community@faun.dev to your contacts.

🤩 Want to sponsor our newsletter?
Reach out to us at sponsors@faun.dev and we'll get back to you as soon as possible.
 

AILinks #535: Claude Sonnet 5: The Most Agentic Sonnet Model
Legend: ✅ = Editor's Choice / ♻️ = Old but Gold / ⭐ = Promoted / 🔰 = Beginner Friendly

You received this email because you are subscribed to FAUN.dev.
We (🐾) help developers (👣) learn and grow by keeping them up with what matters.
