Blog Posts
Large Language Models (LLMs) don’t start out as friendly assistants. They begin as vast, raw systems trained on enormous datasets—powerful but unpolis...
Ray is a distributed execution engine. Its job is to take a messy cluster of machines and make it feel like one giant computer....
vLLM is a library for running LLMs on GPUs, designed for fast and efficient inference....
Agents don’t just think — they move data between systems....
Evaluating agents is messy. Traditional software is deterministic — same input, same output. Agents don’t work that way. They reason in loops, call to...
Multi-agent systems (MAS) are emerging as a serious pattern for tackling the limits of single agents....
We've covered a lot of topics in the past few posts, but one thing is still missing: the concept of agents....
When we talk about memory in LLM agents, we’re not talking about neurons or synapses — we’re talking about tokens, context windows, and clever hacks t...
In part one, I wrote about why structured outputs matter and why just asking an LLM to “return JSON” doesn’t cut it....
When you build with LLMs, you quickly run into a recurring issue:...
Listing some technical books that I highly recommend (and have actually read)....