The Elephant in the AI Agent Room (and why your demos work but your products don’t)

Read: 3 min
Date: Dec 1, 2025
Author: Massimo Falvo

Why "Cognitive Overload" isn’t just a human problem and how Anthropic is teaching us to design better interactions.

As an experience designer, I spend my life fighting cognitive overload. We know that if we show a user 100 buttons at once, they won’t press any of them, or they’ll press the wrong one. It’s called the Paradox of Choice.

And yet, paradoxically, that is exactly what we have done so far with our AI Agents. And the results show.

Anyone who has really tried knows this: you run the "Book me a flight" test and it looks like magic. Then you try to build a real 10-step workflow and, invariably, by the sixth or seventh step, the agent falls apart.

Reading a brilliant technical analysis by Cordero Core brought the point into focus for me: the reason our real products crumble is a gigantic Design problem, not a computing-power problem.


The Firehose Problem (aka: Bad UX)

Until yesterday, the standard way of letting models talk to tools (the MCP protocol) involved loading the technical documentation of every available tool into the model's context before it even understood the user's intent.

In UX terms, it’s like walking into a post office to ship a package and having the entire internal operations manual thrown at you before you can even say "Good morning."

The interface becomes a "firehose" of information. The model isn't just confused; it is technically drowning in noise.
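
To put rough numbers on that noise, here is a minimal sketch. Everything in it is hypothetical: the tool names, the catalogue size and the per-tool token cost are invented for illustration, not taken from any real MCP server.

```python
# Hypothetical illustration of the "firehose": every tool definition
# (name, description, input schema) is pushed into the context before
# the user has said a word. All figures below are invented.

tools = [
    {"name": f"tool_{i}", "description": "...", "schema_tokens": 350}
    for i in range(100)  # a plausible enterprise tool catalogue
]

# Tokens spent describing tools before any actual work happens:
upfront_tokens = sum(t["schema_tokens"] for t in tools)
print(upfront_tokens)  # 35000: the "operations manual" as a greeting
```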

And this drowning has devastating mathematical consequences. If a model has 90% accuracy on a single decision (a great grade!), when you chain 5 of these decisions together in a chaotic environment, the total reliability drops to just under 60%. We went from top marks to a failing grade. This is why complex flows fail: we designed a system destined to derail.
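
The arithmetic behind that number is simple compounding: assuming independent steps that each succeed with the same probability, the chain's reliability is just the product of the per-step accuracies.

```python
# Reliability of a chained workflow, assuming independent steps
# that each succeed with the same probability.
per_step_accuracy = 0.90
steps = 5

chain_reliability = per_step_accuracy ** steps  # 0.9 ** 5
print(f"{chain_reliability:.1%}")  # 59.0%: top marks per step, failing grade overall
```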


The Turning Point: From "Everything Now" to "Progressive Disclosure"

This is where my perspective as a designer comes into play. The solution quietly introduced by Anthropic (so-called "Skills") is nothing more than the brilliant application of a core interaction principle: Progressive Disclosure.

Instead of overloading the model, we design a tiered structure that mimics the onboarding of a new human colleague (there is a short code sketch after the list):

  1. The Index (Wayfinding): At the start, the Agent sees only the names of the available skills in the "menu" ("Marketing", "Data Analysis"). Nothing else.

  2. Activation (On-demand): Only when truly needed does the Agent choose and "open" the specific skill.

  3. Focus (The Recipe): Only at that precise moment does the AI download the detailed technical instructions for that specific task, ignoring the rest of the world.
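
For readers who think in code, here is the same three-tier idea as a minimal Python sketch. The skill names, the file layout and the helper functions are all hypothetical, invented for illustration; only the shape of the interaction matters: names first, full instructions only on activation.

```python
# Progressive disclosure for agent skills: a hypothetical sketch.
# Skill names, file layout and helpers are invented for illustration.

from pathlib import Path

SKILL_INDEX = {
    # 1. The Index (Wayfinding): only names and one-line summaries sit in context.
    "marketing": "Draft and review marketing copy.",
    "data_analysis": "Run analyses on tabular data.",
}

def choose_skill(user_request: str) -> str:
    # 2. Activation (On-demand): in a real system the model picks from the
    # index; here a keyword match stands in to keep the sketch runnable.
    return "data_analysis" if "data" in user_request.lower() else "marketing"

def load_instructions(skill_name: str) -> str:
    # 3. Focus (The Recipe): the detailed instructions are read only now,
    # so the rest of the catalogue never touches the context window.
    # Assumes a skills/<name>/SKILL.md file exists on disk.
    return Path(f"skills/{skill_name}/SKILL.md").read_text(encoding="utf-8")

# Usage: the prompt starts from the tiny index, not the full manual.
request = "Summarise this quarter's sales data"
skill = choose_skill(request)
prompt = f"{request}\n\n---\n{load_instructions(skill)}"
```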


We Need to Do "Service Design" for Machines

This evolution (technically RAG-MCP) tells us something fundamental for our industry: we can no longer limit ourselves to writing prompts. We must become knowledge architects.

We need to curate the AI's experience exactly as we curate a user's experience. You wouldn't give a new hire the keys to the entire archive on day one; you would give them a guide, context, and the right tools at the right time.

The real challenge for us today is not "how smart is the model," but "how well-designed is the interaction between the model and its tools."

We are moving from trying to build an omniscient (and confused) AI to designing a system of orchestrated specialists. The future of Agents lies not in brute force. It lies in the elegance of the architecture.

And elegance, in design, means removing everything that isn't needed, at the moment it isn't needed.
