Last updated June 30 at 8 AM.
The 2026 AI Engineer World’s Fair brings together the people at the cutting edge of building the models, tools, and infrastructure behind the current agent boom, and I’ll be there for every moment. I’ll update this post throughout the conference with the talks worth knowing about, the patterns emerging across the expo floor, and the ideas that matter beyond any individual announcement.
Day 1: Intelligence is becoming table stakes
Day one was workshop day, and the shift from last year was obvious. Nobody was trying to prove that agents can write code, operate a browser, or do useful work. The sessions assumed the capability and focused on making that work dependable.
1. Trust is becoming an engineering discipline
Laurie Voss of Arize ran “From Vibes to Production: Evaluating and Shipping AI Agents That Work.” Atlassian focused on how orchestration changes the software development lifecycle. Nnenna Ndukwe of Qodo taught teams to build quality gates into agentic coding workflows. Charles Frye of Modal ran “What Is an Inference Engine, Anyway?”, digging into the infrastructure required to turn model capability into something fast, efficient, and usable in production. Accenture and Datadog won best title of the day with “Build a Platform, Unleash an Agent on It… and Watch It Burn!”
Evaluation came roaring back. Braintrust ran an observability workshop, IBM hosted “Evals in AI: A Deep Dive,” and Weights & Biases walked through building an end-to-end agent evaluation pipeline.
When a model only produced text that a human reviewed, evaluation was informal. A developer threw together some tests, eyeballed the results, and decided to ship. That stops working when an agent reviews critical code, updates a customer relationship management system, approves refunds, or changes production systems. At that point, an eval becomes both a measuring stick and a governor.
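To make the "measuring stick and governor" idea concrete, here is a minimal sketch of an eval used as a release gate. Nothing here comes from a specific workshop or product: `EvalCase`, `pass_rate`, `release_gate`, the exact-match scoring, and the 0.95 threshold are all hypothetical choices for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    """One test case: an input prompt and the output we expect (hypothetical shape)."""
    prompt: str
    expected: str


def pass_rate(agent: Callable[[str], str], cases: list[EvalCase]) -> float:
    """The measuring stick: fraction of cases where the agent's output matches exactly."""
    if not cases:
        raise ValueError("an eval needs at least one case")
    passed = sum(1 for case in cases if agent(case.prompt).strip() == case.expected)
    return passed / len(cases)


def release_gate(
    agent: Callable[[str], str],
    cases: list[EvalCase],
    threshold: float = 0.95,  # assumed bar; a real team would choose its own
) -> bool:
    """The governor: the agent ships only if the score clears the threshold."""
    return pass_rate(agent, cases) >= threshold
```

Exact string matching is the simplest possible scorer. Real evals for agents that take actions would likely need graded or rubric-based scoring, but the gate itself keeps the same shape: measure, compare to a bar, then allow or block.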
Think about what the calculator did for accounting. It made arithmetic trivial, so the real work moved to approvals, controls, records, and audits. Last year, we got the calculator. This year, we are building the accounting department.
2. Context engineering has grown beyond retrieval-augmented generation
Reliable context has become its own discipline, and retrieval-augmented generation (RAG) is now one piece of it.
Neo4j argued that “RAG Needs a Map.” Unblocked ran “Beyond RAG: Build a Relational Context Engine from Scratch.” Towards AI focused on compaction, memory, and cost, while Elastic declared that “Vector Isn’t Enough.”
The debate has moved past whether models need external information. The question now is how that information should be represented, connected, compressed, refreshed, secured, retrieved, and paid for.
A basic RAG demo proves that a model can find an answer in a document. A production context system must decide which documents matter, how they relate, who can see them, and which source to trust when two disagree. That is the difference between answering a question and knowing a business well enough to affect its bottom line.
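The decisions in that paragraph can be sketched in a few lines. This is an illustration under stated assumptions, not any vendor's design: the `Document` fields, the role-based visibility check, and the integer `trust` score are all hypothetical, and topic matching stands in for real retrieval.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """A candidate context document (hypothetical schema)."""
    doc_id: str
    topic: str
    text: str
    allowed_roles: frozenset[str]  # who can see it
    trust: int  # assumed ranking: higher means more authoritative


def select_context(docs: list[Document], topic: str, role: str) -> list[Document]:
    """Keep documents that match the topic and are visible to the caller's role,
    ordered from most to least trusted."""
    visible = [d for d in docs if d.topic == topic and role in d.allowed_roles]
    return sorted(visible, key=lambda d: d.trust, reverse=True)


def trusted_source(docs: list[Document], topic: str, role: str) -> Document | None:
    """When visible sources disagree, prefer the most trusted one."""
    ranked = select_context(docs, topic, role)
    return ranked[0] if ranked else None
```

The ordering of the checks is the design choice worth noting: access control runs before ranking, so a highly trusted document the caller cannot see never influences the answer.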
3. The agent economy is building its back office
The expo floor was full of companies selling quality, verification, observability, incident response, code remediation, payments, and production infrastructure. Individually, they looked like separate products. Together, they looked like the departments of an established corporation.
The expo floor was not selling new genius employees. It was selling the back office required to employ millions of them.
The workshops hinted at one crucial part of that back office: authority. Docker covered the move from approval loops to autonomous agents. PayPal showed how it designs command-line tools for agents. Microsoft focused on protecting hosted agents.
Once an agent decides what should happen, something still has to determine whether it can happen, whose authority it carries, which policies apply, and what gets recorded. That is the action layer, and it is the problem we work on at Arcade.dev.
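Those four questions map onto a small amount of code. The sketch below is a toy, and it is not Arcade.dev's API or anyone else's: `ActionRequest`, `ActionLayer`, the delegation table, and the refund limit policy are all hypothetical names and rules invented for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class ActionRequest:
    """What an agent wants to do, and for whom (hypothetical shape)."""
    agent_id: str
    on_behalf_of: str  # whose authority the agent carries
    action: str
    amount: float = 0.0


@dataclass
class ActionLayer:
    """Toy action layer: delegation check, one policy, and an audit log."""
    grants: dict[str, set[str]]  # user -> actions that user has delegated
    refund_limit: float = 100.0  # assumed policy for illustration
    audit_log: list[dict] = field(default_factory=list)

    def authorize(self, req: ActionRequest) -> bool:
        # Can it happen, and under whose authority?
        if req.action not in self.grants.get(req.on_behalf_of, set()):
            allowed, reason = False, "not delegated"
        # Which policies apply?
        elif req.action == "refund" and req.amount > self.refund_limit:
            allowed, reason = False, "over refund limit"
        else:
            allowed, reason = True, "ok"
        # What gets recorded? Every decision, including denials.
        self.audit_log.append(
            {
                "agent": req.agent_id,
                "user": req.on_behalf_of,
                "action": req.action,
                "allowed": allowed,
                "reason": reason,
            }
        )
        return allowed
```

Logging denials as well as approvals is deliberate: the record of what an agent was stopped from doing is often as useful in an audit as the record of what it did.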
Day one made one thing clear: as intelligence becomes more available, the advantage shifts toward the systems that make it useful and trustworthy.


