Surface Areas for Agents
As tools like OpenClaw and other AI agents attract attention, many people are rushing to test them without first adjusting their mental model of how these systems actually work. One of the most important things to understand when dealing with agents is surface areas.
At this stage in 2026, we should think of an agent less as an intern and more as software that can reason, but only through the tools and interfaces it has been given. Let’s think about surfaces for a moment.
If we think about humans, what are the surface areas we use for our daily work? A computer monitor to see, keyboard and mouse to control the computer, pen and paper to write on, speakers to listen to audio. What about an agent?
An agent’s usable surface area includes at least:
- Chat surfaces: WhatsApp, Telegram, Slack, Discord, Teams, Signal, etc.
- Web surfaces: browser automation, portals, dashboards, forms
- System surfaces: CLI, filesystem, local apps, OS controls
- API surfaces: first-party APIs, partner APIs, MCP servers, webhooks
- Workflow surfaces: cron, triggers, queues
- Human-in-the-loop surfaces: interacting with a person for review
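The core constraint behind this list can be sketched in a few lines of code. This is a hypothetical model, not any real agent framework's API: an agent object that can only act through the surfaces it has been explicitly given.

```python
# A minimal sketch of the idea that an agent can only act through
# the surfaces (tools) it has been given. All names here are
# hypothetical, not taken from any real agent framework.

class Agent:
    def __init__(self):
        self.tools = {}  # surface name -> callable

    def register(self, name, fn):
        """Expose a surface to the agent."""
        self.tools[name] = fn

    def act(self, name, *args):
        """The agent can only reach what was registered."""
        if name not in self.tools:
            raise PermissionError(f"no surface named {name!r}")
        return self.tools[name](*args)


agent = Agent()
agent.register("read_file", lambda path: open(path).read())
# This agent has a filesystem surface, but no browser surface,
# so agent.act("open_browser", ...) would raise PermissionError.
```

However capable the model inside is, `act` is the only door out, which is the whole point of thinking in surfaces.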
You do not need to master these terms immediately to understand the key point: agents can only work through the routes that software makes available to them. One way to think about surfaces is as the different apps and tools that we use.
Let’s use web surfaces as an example. A human can open Instagram, glance at an image, and understand the vibe instantly. An agent, by contrast, may need a browser tool, permission to log in, knowledge of the page structure, and a way to interpret images or dynamic content.
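To make that concrete, here is what a human’s single glance might look like when spelled out as the structured tool calls an agent would have to emit. The tool names and argument shapes below are purely illustrative, loosely modelled on common tool-calling schemas, not any real browser-automation API.

```python
# Illustrative only: a human's glance at a feed, decomposed into the
# structured tool calls an agent would need. Tool names and argument
# shapes are hypothetical.

steps = [
    {"tool": "browser.open", "args": {"url": "https://instagram.com/some_account"}},
    {"tool": "browser.login", "args": {"credential_ref": "vault:instagram"}},
    {"tool": "browser.screenshot", "args": {"selector": "article img"}},
    {"tool": "vision.describe", "args": {"image": "<screenshot bytes>"}},
]

for step in steps:
    print(step["tool"])  # each line is a separate surface the agent must have
```

Four tool calls, each requiring its own permission and its own interface, to replicate one second of human attention.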
Another example would be system surfaces. I am often frustrated that such intelligence is unable to properly utilise the computer I’ve given it, but that is understandable: current computer systems are designed for human users.
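A system surface, by contrast, maps fairly directly onto something agents handle well: text in, text out. A minimal sketch, assuming a POSIX-style environment, of giving an agent a command-line surface via Python’s subprocess module:

```python
import subprocess

def run_command(cmd: list[str]) -> str:
    """A CLI surface: the agent submits a command and reads text back,
    much as a human would in a terminal."""
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    return result.stdout

# The agent "sees" the system only through the text that comes back:
print(run_command(["echo", "hello from the system surface"]))
```

This is part of why CLI-first tools suit agents: the entire interface is already text, with no pixels to interpret or cursor to steer.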
Humans and agents
What often gets overlooked is how humans interact with agents themselves. Humans need a way to interact with agents, and that interaction is its own kind of surface area. This includes tools like OpenClaw, ChatGPT, and similar interfaces.
Depending on what you’re trying to do (coding, image generation, file management), you need to think about how the agent can communicate. While the most common approach is to type into ChatGPT, we also use other methods, such as sending voice messages and having long voice calls with our agents, as voice discussion is generally a faster way to communicate. Unlike older voice-command systems, modern agents can usually handle more natural speech, including pauses, filler words, and half-finished thoughts.
Convergence
Surfaces themselves are increasingly converging. Tools like Claude Cowork let agents better use our computers by reading what is displayed on the screen and controlling the mouse cursor.
In the other direction, tools like Claude Code and OpenCode bring humans back to the command line interface (CLI), interacting with computers by typing and reading text, similar to agents.
I believe we’ll eventually reach a point where applications and services that serve both agents and humans will have to offer multiple optimised surface areas. Just as we currently focus on accessibility for humans, there will be increasing focus on accessibility for agents.
Conclusion
This gives you an introduction to an important part of understanding AI agents, and a small step towards improving the interactions between humans and agents.
The next time you prepare a task for an agent to perform, or an objective for it to meet, take the time to also consider the surface areas it has access to, and its capabilities within those surfaces. Do that well, and you’ll start to see much more satisfactory outcomes from your interactions with agents.