Roey Ben Chaim

Tinkerer, creator, and technology enthusiast. Drinking coffee and building stuff.

Blog Posts

Training LLMs 101
September 17, 2025

Large Language Models (LLMs) don’t start out as friendly assistants. They begin as vast, raw systems trained on enormous datasets—powerful but unpolis...

Ray for LLM Inference
September 12, 2025

Ray is a distributed execution engine. Its job is to take a messy cluster of machines and make it feel like one giant computer....

vLLM: LLM Inference That Doesn’t Waste Your GPU
September 12, 2025

vLLM is a library for running LLMs on GPUs. It is designed to be fast and efficient, which makes it a strong choice for GPU inference....

Three Practical Ways to Detect Sensitive Data
September 11, 2025

Agents don’t just think — they move data between systems....

Evals: How to Evaluate Agents
September 9, 2025

Evaluating agents is messy. Traditional software is deterministic — same input, same output. Agents don’t work that way. They reason in loops, call to...

Why Multi-Agent Systems Matter
September 7, 2025

Multi-agent systems (MAS) are emerging as a serious pattern for tackling the limits of single agents....

From Zero to Agent: ReAct, Reflection, and Planning
September 6, 2025

We've covered a lot of topics in the past few posts, but one thing that is missing is the concept of agents....

How Agents Remember: On Memory and the Art of Context Engineering
September 5, 2025

When we talk about memory in LLM agents, we’re not talking about neurons or synapses — we’re talking about tokens, context windows, and clever hacks t...

Structured Outputs in Practice: Instructor vs PydanticAI vs BAML
September 5, 2025

In part one, I wrote about why structured outputs matter and why just asking an LLM to “return JSON” doesn’t cut it....

Structured Output
September 4, 2025

When you build with LLMs, you quickly run into a recurring issue:...

Engineering Books
August 9, 2025

Listing some technical books that I highly recommend (and have actually read)....