---
title: 'MCP Is Not the Problem: The Missing Execution Surface for Agents'
date: '2026-03-13 01:39:47'
draft: false
summary: 'The real gap is not another round of MCP vs API vs CLI, but separating
  three layers: capability description, execution surface, and workflow / orchestration.'
slug: mcp-is-not-the-problem
syndication:
- platform: X / Twitter
  url: https://x.com/jolestar/article/2032270366509511030
tags:
- mcp
- agent
- infrastructure
topics:
- ai
- software-engineering
type: post
---

Over the past few days, MCP has become the center of yet another argument about agent tooling.

Some say MCP is dead. Some say serious teams are moving back to APIs and CLIs. Others argue that MCP itself is fine, and that the real problem is bad setup, bad auth, or bad tool-loading strategies.

I think all of these reactions are circling the same issue.

I do not think MCP itself is the core problem.

What keeps happening, in my view, is that we collapse several distinct concerns into one argument. When people say “MCP sucks”, they are often describing failures in discovery, authentication, portability, context management, and execution. MCP was never meant to solve all of those things by itself.

## The missing piece is the execution surface

The most useful way I have found to think about this is to separate three concerns:

- capability description
- execution surface
- workflow / orchestration

Once you separate them, a lot of the confusion disappears.

Many discussions frame the problem as:

- MCP vs CLI
- MCP vs API
- structured tool protocols vs direct code execution

But these are not clean oppositions.

An API is a service interface. A CLI is a calling surface. MCP is a capability description and transport standard. These things operate at different levels.

In practice, most teams do not actually care whether the underlying thing is called MCP, REST, GraphQL, JSON-RPC, or a local CLI. What they care about is this:

- Can the agent discover what is available without loading everything into context at once?
- Can it authenticate reliably?
- Can it execute calls in a stable, repeatable way?
- Can the same setup work across machines, users, and environments?
- Can the result be consumed by software, not just by a human reading text?

Those are questions about the execution surface.

The model I currently find most useful is:

- MCP and other schema protocols: capability description
- an execution surface: discovery, auth, invocation, output contract
- SKILLs / workflows: orchestration, state, task completion

Each part of the stack does a different job.
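To make the separation concrete, here is a minimal sketch in Python. The interface names and method signatures are mine, not part of MCP or any other spec; the point is only which layer owns which question.

```python
from dataclasses import dataclass
from typing import Any, Protocol


@dataclass
class Capability:
    """Layer 1: what a service says it can do (MCP, OpenAPI, etc.)."""
    name: str
    description: str
    input_schema: dict[str, Any]


class ExecutionSurface(Protocol):
    """Layer 2: one predictable way to discover, authenticate, and invoke."""
    def discover(self, query: str) -> list[Capability]: ...
    def invoke(self, name: str, args: dict[str, Any]) -> dict[str, Any]: ...


def run_workflow(surface: ExecutionSurface) -> dict[str, Any]:
    """Layer 3: decides when and in what order to call capabilities."""
    caps = surface.discover("issues")        # pull only what is needed
    return surface.invoke(caps[0].name, {})  # structured result, not prose
```

Notice that the workflow never sees a transport, an auth token, or a schema format. Those all live below the surface.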

## 1. Capability description

This is where MCP is strong.

It gives services a standard way to describe tools, resources, prompts, and other callable capabilities. In this article, that is the role of MCP I care about most.

OpenAPI, GraphQL introspection, OpenRPC, and gRPC reflection belong in the same broader family: they are all ways for a remote system to describe what it can do.
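For concreteness, a capability description in the MCP style is roughly this shape, with a `name`, a `description`, and a JSON Schema `inputSchema`. The field names follow the MCP tool definition; the tool itself is invented for illustration.

```python
# A capability description answers "what exists?", nothing more:
# no auth, no transport details, no invocation strategy.
tool = {
    "name": "search_issues",  # hypothetical example tool
    "description": "Search issues in a project tracker.",
    "inputSchema": {          # standard JSON Schema
        "type": "object",
        "properties": {
            "query": {"type": "string"},
            "limit": {"type": "integer", "default": 10},
        },
        "required": ["query"],
    },
}
```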

## 2. Execution surface

This is the missing middle in many current setups.

The execution surface is what turns “a thing that can describe itself” into “a thing an agent can use reliably”.

In practical terms, it should handle:

- progressive discovery
- auth binding
- signing when needed
- a stable invocation contract
- structured output
- environment portability

This is the part that often gets hand-waved away as “just use a CLI” or “just load the tools”. It does not have to look like a shell command. It could be a daemon-backed runtime, a local API, a typed tool environment, or a CLI. The point is not the outer shape. The point is that the agent gets one predictable way to discover, authenticate, and invoke capabilities without protocol-specific glue leaking into prompts and workflows.
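One way to picture the "stable invocation contract" and "structured output" items, purely as a sketch and not any particular tool's format, is a normalizing wrapper that gives every backend the same result envelope:

```python
from typing import Any, Callable

# Hypothetical per-protocol adapters; in practice these would wrap
# REST, GraphQL, MCP, etc. behind one call signature.
Adapter = Callable[[str, dict[str, Any]], Any]


def invoke(adapter: Adapter, operation: str, args: dict[str, Any]) -> dict[str, Any]:
    """Stable contract: every call returns the same envelope, so workflows
    never have to parse protocol-specific error text."""
    try:
        data = adapter(operation, args)
        return {"ok": True, "data": data}
    except Exception as exc:  # normalize failures into the contract too
        return {"ok": False,
                "error": {"type": type(exc).__name__, "message": str(exc)}}
```

The envelope shape is arbitrary; what matters is that there is exactly one, regardless of which protocol sits underneath.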

When this surface is missing, teams usually fall back into one of two states:

- raw integration
- bespoke wrappers everywhere

Raw integration means the agent has to remember protocol-specific rules, auth quirks, and argument formats. That does not scale.

Bespoke wrappers are often better in the short term, but they create drift. Every service ends up maintaining its own tool surface. Every new environment requires new setup. Every schema change becomes a migration problem.

This is why both of these statements can be true at the same time:

- “MCP alone is not enough.”
- “MCP itself is still useful.”

MCP is useful when you want a standardized way to expose capabilities. It is not, by itself, the full runtime an agent needs.

## 3. Workflow / orchestration

This is where SKILLs, task templates, and higher-level agent instructions live.

A workflow does not just say “here is a tool”. It says:

- when to call it
- in what order
- how to recover from failure
- how to close the loop on a task

That is a different concern.

If you skip the execution surface, workflows become brittle. They end up embedding auth assumptions, machine-local names, and protocol-specific calling details that should not live in prompts.
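As a sketch of that division of labor, with all names hypothetical, a workflow written against a stable invocation contract can express retry and recovery without knowing anything about the wire:

```python
from typing import Any, Callable

# The execution surface, seen from the workflow's side:
# one call signature, one result envelope.
Invoke = Callable[[str, dict[str, Any]], dict[str, Any]]


def export_task(invoke: Invoke, max_retries: int = 2) -> dict[str, Any]:
    """A workflow decides order and recovery; it never touches auth,
    hostnames, or wire formats -- those live below the surface."""
    for _attempt in range(max_retries + 1):
        result = invoke("export_report", {"format": "json"})  # hypothetical op
        if result.get("ok"):
            return result
    return {"ok": False, "error": "gave up after retries"}
```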

## Why this matters for agents specifically

Humans tolerate rough interfaces surprisingly well: they can infer missing details, correct mistakes manually, switch tools mid-task, and hold implicit environment knowledge in their heads.

Agents are much worse at all of those things unless the interface is designed explicitly for them.

That is why this middle surface matters so much in agent systems.

One possible shape for an agent-friendly surface is a progressive CLI:

```bash
tool-host -h
tool-host <operation> -h
tool-host <operation> key=value
tool-host <operation> '{"..."}'
```

This is not just a CLI preference. It is a progressive disclosure model. Instead of injecting everything up front, the agent pulls only the detail it needs at each step. That keeps context smaller, reduces guesswork, and makes the interface more composable.

The same model could also sit behind a daemon, a local API, or another runtime surface. The important thing is the contract, not the terminal.
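One way to picture that contract in code, with invented operation names, is help served as data one level at a time:

```python
from typing import Optional

# Progressive disclosure as data: the agent asks for detail one level
# at a time instead of loading every schema into context up front.
# All operation names here are illustrative.
TOOLS = {
    "issues.search": {
        "summary": "Search issues",
        "args": {"query": "string (required)", "limit": "integer"},
    },
    "issues.close": {
        "summary": "Close an issue",
        "args": {"id": "string (required)"},
    },
}


def help_for(operation: Optional[str] = None) -> dict:
    """Level 1: list operations with one-line summaries.
    Level 2: one operation's argument details."""
    if operation is None:
        return {op: spec["summary"] for op, spec in TOOLS.items()}
    return TOOLS[operation]
```

Whether the agent reaches this through a shell, a daemon, or a local API, the context cost is the same: it only ever holds the level of detail it asked for.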

## Where UXC fits

This is the framing that led me to build UXC.

UXC is not trying to replace MCP. It is not trying to replace APIs. It is not trying to claim that every remote system should become a local shell command.

What it tries to do is simpler: give agents one stable execution surface across OpenAPI, GraphQL, gRPC, MCP, and JSON-RPC.

The goal is to make discovery, auth, and invocation feel like one consistent contract:

```bash
uxc <host> -h
uxc <host> <operation_id> -h
uxc <host> <operation_id> key=value
uxc <host> <operation_id> '{"..."}'
```

Then, when a stable local name is useful, `uxc link` can turn an endpoint into a reusable local command.

That matters a lot for SKILLs and reusable agent workflows, because a workflow should not have to depend on whatever local MCP server name a user happened to configure on their machine.

The broader point is not “everyone should use UXC”. The broader point is that agent tooling needs this missing middle. If you do not provide it, teams will keep rebuilding it in fragmented, incompatible ways.

## So, is MCP good or bad?

I do not think that is the right question.

A better question is: what should each part of the stack be responsible for?

My current answer is:

- capability protocols should describe what exists
- the execution surface should make those capabilities discoverable, authenticated, portable, and callable
- workflows should decide when to call them, in what order, and how to recover when things fail

That missing middle is often what decides whether an agent integration feels elegant or miserable.

So I do not think “MCP is dead”. I also do not think “just use APIs” is the whole answer.

The conclusion I keep coming back to is simple: we need better execution surfaces for agents.

Once that exists, the MCP vs API vs CLI debate becomes much less dramatic. Protocols can do what they are good at. Workflows can stay focused on tasks. And agents no longer have to carry protocol-specific glue in their prompts just to get useful work done.

That, to me, is the more interesting direction for agent infrastructure.

Project: [https://github.com/holon-run/uxc](https://github.com/holon-run/uxc)
