A Few Words
80% of my AI calls now run on my own hardware, which means no API bill for those calls.
Money wasn't the only point. I stopped paying rent on models my own machine can run. Most developers think running AI locally means weaker results and weekend hacking. So they keep paying per token, forever.
I spent the last 3 months building a different path: local agentic AI with Ollama, LangChain, and MCP.
Then I wrote a book about it: "Local AI Engineering with Ollama". It ships soon. The 25 chapters are done, but I have a question for you.
What do you most want me to go deep on?
1 - Fine-tuning your own model with QLoRA and shipping it to Ollama.
2 - Building a chat app that turns into a tool-calling agent.
3 - Wiring local models to your tools over MCP.
4 - Picking and sizing hardware before you waste money on a GPU.
5 - Building local RAG that answers from your own docs.
Everyone who replies to this email with feedback gets a 50% discount when it ships. I'll DM you!
Have a great day,
Aymen.
Inside this Issue
Agents are getting first-class seats in the dev workflow, and the fine print is getting louder: identity, attribution, and who is on the hook when things go sideways. Pair that with browser-native prompt injection and a few big model moves, and this issue turns into a quick tour of where the next security and platform headaches are coming from.
Announcing Stack Overflow for Agents
ChatGPhish: The Page Is the Payload
Making a vintage LLM from scratch
OpenAI to acquire Ona
Statement on the US government directive to suspend access to Fable 5 and Mythos 5
Take the ideas, dodge the traps, ship the work.
Stay safe out there.
FAUN.dev() Team
Stories, Tutorials & Articles

openai.com
OpenAI acquires Ona to bring secure cloud execution technology to Codex, which now has over 5 million users per week. Ona's technology will allow Codex to work persistently in a customer's cloud environment.

anthropic.com
Anthropic staff disabled Fable 5 and Mythos 5 for all customers after U.S. officials issued an export-control directive that barred foreign nationals from accessing the models, citing a suspected jailbreak.

stackoverflow.blog
Stack Overflow's team opened the beta for "Stack Overflow for Agents", an API-first knowledge exchange that lets coding agents use Stack Overflow through human-owned accounts.
The beta points to a clear model: developers connect agents to their own accounts, and Stack Overflow's team can link agent use back to a person rather than an anonymous bot. That setup gives agents access to coding knowledge while keeping account ownership, reputation, and oversight tied to humans.

crlf.link
Croqaz shows how he built Vintage LLM, a Llama-style model trained on English books, newspapers, and other texts published before 1900. He covers corpus selection, cleaning, tokenizer choices, training setup, evaluation, and how pre-20th-century English affects model behavior.

permiso.io
By appending a payload to any web page summarized by ChatGPT, an attacker can leak the victim's IP address and User-Agent and launch phishing attacks using live links and images inside the assistant UI. This browser-based prompt injection raises the stakes for phishing and tracking, bypassing traditional defenses.
Tools, Apps & Software

github.com
Agent multiplexer that lives in your terminal.

github.com
Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.

github.com
Leaked system prompts for ChatGPT, Claude, Gemini, Grok, Perplexity, Cursor, Lovable, Replit, and more!

github.com
The Zero-Dependency, Sub-Millisecond AI Memory System for Hermes Agents and Everyone Else!

github.com
Self-hosted LLM observability: traces, cost, latency, agents, tool calling, RAG. Python SDK + OpenTelemetry + REST.
Did you know?
Did you know that saving a model's weights, the numbers it has learned so far, is not enough to correctly resume machine learning training after a crash or preemption? The trainer also keeps running notes that matter: the optimizer state, which remembers which direction the model was moving and how fast it was improving, and the RNG state, which tracks where it had gotten to in shuffling the training data. Reload only the weights and drop those notes, and training keeps going but quietly takes a different route than it would have, so you end up with a different model than a clean run would have produced. That is why the hard part is not saving the data, it is restarting so the run behaves as if it never stopped.
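To make that concrete, here is a minimal sketch in plain Python, not tied to any real training framework. Everything in it is an assumption made for illustration: the toy dataset, the one-weight model, the hand-written momentum update, and the helper names (`train_epochs`, `save_checkpoint`, `resume_full`, `resume_weights_only`). It trains for four epochs without interruption, then repeats the run with a stop after two epochs, resuming once from a full checkpoint and once from the weights alone.

```python
import random

# Hypothetical toy problem: learn w in y = w * x, where the true w is 3.
DATA = [(x / 10.0, 3.0 * (x / 10.0)) for x in range(1, 11)]
LR, MOMENTUM = 0.05, 0.9


def new_state(seed=0):
    # Weights, optimizer state (momentum velocity), and the RNG used for shuffling.
    return {"w": 0.0, "velocity": 0.0, "rng": random.Random(seed)}


def train_epochs(state, epochs):
    for _ in range(epochs):
        order = list(range(len(DATA)))
        state["rng"].shuffle(order)  # data order depends on the RNG state
        for i in order:
            x, y = DATA[i]
            grad = 2.0 * (state["w"] * x - y) * x  # gradient of squared error
            state["velocity"] = MOMENTUM * state["velocity"] - LR * grad
            state["w"] += state["velocity"]
    return state


def save_checkpoint(state):
    # A full checkpoint keeps all three pieces, not just the weight.
    return {
        "w": state["w"],
        "velocity": state["velocity"],
        "rng_state": state["rng"].getstate(),
    }


def resume_full(ckpt):
    rng = random.Random()
    rng.setstate(ckpt["rng_state"])  # continue the shuffle sequence where it stopped
    return {"w": ckpt["w"], "velocity": ckpt["velocity"], "rng": rng}


def resume_weights_only(ckpt, seed=0):
    # Drops the "running notes": velocity restarts at zero and the RNG is reseeded.
    return {"w": ckpt["w"], "velocity": 0.0, "rng": random.Random(seed)}


# Uninterrupted reference run: 4 epochs.
reference = train_epochs(new_state(), 4)["w"]

# Interrupted run: 2 epochs, checkpoint, then 2 more epochs after resuming.
ckpt = save_checkpoint(train_epochs(new_state(), 2))
full = train_epochs(resume_full(ckpt), 2)["w"]
weights_only = train_epochs(resume_weights_only(ckpt), 2)["w"]

print("full resume matches uninterrupted run:", full == reference)
print("weights-only resume matches uninterrupted run:", weights_only == reference)
```

In this sketch the full resume lands on the same weight as the uninterrupted run, while the weights-only resume keeps training but finishes somewhere else. Real frameworks carry more state than this (learning-rate schedules, data loader position, per-device RNGs), so treat the three fields here as the smallest illustrative set rather than a complete checklist.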