Enterprise MCP: lessons from GitHub's MCP server launch

Sam Morrow and his colleagues started building GitHub’s MCP server as an internal side project. One year and a bumpy start later, it’s the most-used remote MCP server in the world. Sam has much to teach us about scaling tool design, auth, and what agents actually need from an enterprise server.

This article is adapted from MCP MVP, a video series spotlighting the people building the agentic future. This episode features Sam Morrow, Lead Developer of the GitHub MCP Server.

In early 2025, a product manager named Toby Padilla, founder of Charm (the tool that makes Go CLIs beautiful), posted a message in an internal Go channel at GitHub: “Is anyone interested in making an MCP server?”

Sam Morrow was a software engineer on the Code Scanning team. He and two colleagues, William and Javier, volunteered their spare cycles. They grabbed the unofficial MCP-Go SDK (later transitioning to the official Go SDK after it launched) and started building. Javier even vibe-coded the first couple of tools.

Around the same time, GitHub ran an internal “mobility march” that let engineers join Copilot teams. Sam applied to Agent Services, a team working on parts of what would become GitHub Copilot coding agent.

Then, through an accident of shifting schedules, the GitHub MCP server’s open-source repo launched the same day VS Code shipped Agent mode, a release originally scheduled for the day before. Every “here’s what you can do with Agent mode” blog post pointed at GitHub’s MCP server as the example!

GitHub’s CEO posted about it. Satya Nadella reposted the news. GitHub’s MCP Server repo became the most-starred repo on GitHub for the week! But trouble was following on the heels of success.

When adoption != success

The viral launch drove awareness, but awareness isn’t always the best measure of success. The team hadn’t planned for such an overwhelming amount of attention. Developers and engineers challenged the team’s assumptions about how users would interact with the server, and feedback flooded in.

The team had taken a granular approach to covering GitHub’s API surface, and the server shipped with over 100 tools for issues, PRs, repos, Actions, code scanning, gists, and more. In early 2025, most agents’ models struggled with even 20 tools in context. Users didn’t cherry-pick the server’s tools as expected; they loaded them all into their agent’s context and became frustrated when the agent struggled.

The team learned that mapping tools too closely to REST API endpoints would require an agent to chain multiple atomic CRUD (Create-Read-Update-Delete) operations for common workflows, something agents still struggle with. Sam acknowledged the trade-off: “One-to-one mapping with APIs was a poor goal. We were trying to capture intent.” But consolidating tools limits what’s possible, and GitHub users range from those who only use repos and PRs to those running complex multi-project workflows with GitHub Projects.

The issue wasn’t that the GitHub MCP server had too many tools, though. It was that users and agents needed better ways to select the right tools to match their intents.

Advice for enterprise MCP server builders

Over the past year, Sam’s team has built and rebuilt the server through real-world adoption at massive scale. In a recent interview with RL Nabors for Arcade.dev’s MCP MVP series, Sam shared the following advice for building enterprise MCP servers today.

Start prompt-first

When designing MCP tools, start by identifying the prompts users are already using.

Agents will reach for tools that “sound like a solution” before chaining multiple tools together. For instance, a social media agent tasked with unfollowing an account’s mutuals will be more successful with an “unfollow-users-followers” tool than with orchestrating “get-followers-by-user” twice and calling “unfollow-user” on each member of the overlap. (Code Mode might perform such an operation with ease, but for actions requiring authorization, authorizable MCP tools are one of the more secure options for a remote server.)
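As a rough illustration of the difference, here is a minimal Python sketch. Everything in it is hypothetical: the in-memory FOLLOWERS and FOLLOWING data, the function names, and the choice to expose unfollow_mutual_followers as a single tool. It is not any real social media API, only one way an intent-shaped tool might wrap endpoint-shaped ones.

```python
# Hypothetical in-memory stand-in for a social graph; not a real API.
FOLLOWERS = {
    "me": {"ana", "bo", "cy"},
    "rival": {"bo", "cy", "dee"},
}
FOLLOWING = {"me": {"ana", "bo", "cy", "dee"}}


def get_followers_by_user(user: str) -> set[str]:
    """Endpoint-shaped tool: one lookup per call."""
    return set(FOLLOWERS.get(user, set()))


def unfollow_user(actor: str, target: str) -> None:
    """Endpoint-shaped tool: one mutation per call."""
    FOLLOWING[actor].discard(target)


def unfollow_mutual_followers(actor: str, other: str) -> list[str]:
    """Intent-shaped tool: runs the whole workflow in one call.

    The agent no longer has to plan two lookups, an intersection,
    and a loop of mutations; it only has to pick this tool.
    """
    mutuals = get_followers_by_user(actor) & get_followers_by_user(other)
    for target in sorted(mutuals):
        unfollow_user(actor, target)
    return sorted(mutuals)


removed = unfollow_mutual_followers("me", "rival")
```

The atomic functions still exist, so nothing is lost; the intent-shaped tool is simply the one whose name matches the prompt a user would actually type.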

Sam’s team tried prototyping dynamic tool selection using embeddings to match the agent’s intent to relevant toolsets at runtime. It worked, but they shelved it: “We decided we were the wrong level in the stack to solve that problem.” The token cost of dynamically expanding from 3 tools to 21 was real, and it felt like something the agent harness or a tool search solution like Anthropic’s tool search feature should handle.

Group your tools semantically and make the groups composable

The GitHub MCP server introduced toolsets, semantic groupings like issues, repos, pull_requests, actions, and code_security. Toolsets map to how people actually think about GitHub’s products. With some JSON configuration, engineers can narrow the server’s 108 tools down to the exact subset their workflows use.

Toolsets are composable. Want read-only access to issues? Set issues plus read-only mode to get the exact intersection.

Toolsets are robust. If the MCP server’s surface area changes, for example when four tools are consolidated into two, the toolset still works.

Instead of creating a tool for every endpoint in your API, create meaningful, composable toolsets that match common workflow patterns.
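The composition described above (a toolset combined with read-only mode) can be sketched as a simple filter. This is a minimal sketch, not the GitHub MCP server’s implementation: the Tool dataclass, the REGISTRY contents, and select_tools are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    toolset: str
    read_only: bool


# Hypothetical registry; tool names and groupings are illustrative only.
REGISTRY = [
    Tool("list_issues", "issues", read_only=True),
    Tool("create_issue", "issues", read_only=False),
    Tool("get_file_contents", "repos", read_only=True),
    Tool("merge_pull_request", "pull_requests", read_only=False),
]


def select_tools(enabled_toolsets: set[str], read_only: bool = False) -> list[str]:
    """Return names of tools in the enabled toolsets.

    When read_only is True, write tools are dropped, so the result is
    the intersection of "in these toolsets" and "does not mutate".
    """
    return [
        tool.name
        for tool in REGISTRY
        if tool.toolset in enabled_toolsets and (tool.read_only or not read_only)
    ]


issues_read_only = select_tools({"issues"}, read_only=True)
issues_and_repos = select_tools({"issues", "repos"})
```

Because callers name a toolset rather than individual tools, the registry behind it can be consolidated or extended without breaking their configuration.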

Write error messages for agents, not terminals

Most MCP servers inherit CLI-style error messages that are terse, technical, and difficult to parse even for humans who deeply know the system. An agent sees “fatal: could not sign commit” and immediately attempts to disable commit signing because it’s not clear what the next steps are.

Think of the agent as a junior engineer encountering these issues for the first time. If error messages assume a naive debugger and are written to be understood by a human, the agent’s model has a much easier time following the golden path.

Sam advises explaining what happened in plain language and suggesting next steps. Instead of “403 Forbidden,” use “Tried to commit, but the user needs to authorize commit signing. Please ask the user to approve the SSH agent prompt.” If it’s a terminal failure, say so explicitly, explain why, and tell the agent to move on rather than retrying.

“They really take the steering,” Sam said of models reading error responses.

An agent that gets a useful error message will often self-correct. An agent that gets a cryptic one will start “attacking the problem” by trying creative workarounds that take it further from the user’s intent.
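One way to apply this advice is a small helper that always states what happened, what to do next, and whether retrying makes sense. The sketch below is an assumption, not code from the GitHub MCP server: the agent_error helper and its wording are hypothetical, while the result shape (a content list of text items plus an isError flag) follows the tool result format in the MCP specification.

```python
def agent_error(what_happened: str, next_step: str, retryable: bool) -> dict:
    """Build a tool error result written for a model, not a terminal."""
    if retryable:
        guidance = "You may retry once that step is done."
    else:
        guidance = "Retrying will not help. Do not retry; tell the user and move on."
    text = f"{what_happened} {next_step} {guidance}"
    # Shape of an MCP tool result that reports a tool-level error.
    return {"isError": True, "content": [{"type": "text", "text": text}]}


recoverable = agent_error(
    "Tried to commit, but the user needs to authorize commit signing.",
    "Please ask the user to approve the SSH agent prompt.",
    retryable=True,
)
terminal = agent_error(
    "The repository is archived, so it cannot accept new commits.",
    "No action on your side can change this.",
    retryable=False,
)
```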

Filter tools by what the auth token can actually do

One of the GitHub MCP server’s more interesting innovations is filtering tools based on the authentication token’s actual scopes—something Pulse MCP’s newsletter recently highlighted.

If you connect with a classic personal access token that has no write scopes, the server only shows 20 read-only tools. Connect with a more privileged token and you get 40, including write operations. If a token is missing just the gist scope, you lose gist tools but keep everything else. For GitHub Actions tokens—which have no user identity at all—the server removes user-specific tools because they’ll never work in that context.

“You know exactly what tools are going to work because they’re what you get,” Sam said. “And conversely, you’re not going to get tools that will never work.”
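The filtering idea can be sketched as a subset check between the scopes a tool needs and the scopes a token carries. This is a simplified sketch under stated assumptions: the REQUIRED_SCOPES mapping and the visible_tools function are hypothetical, and the real server’s scope rules are more detailed than a single set comparison.

```python
# Hypothetical mapping from tool name to the token scopes it needs.
REQUIRED_SCOPES = {
    "list_issues": {"repo"},
    "create_issue": {"repo"},
    "create_gist": {"gist"},
    "get_me": {"read:user"},
}


def visible_tools(token_scopes: set[str]) -> list[str]:
    """List only the tools whose required scopes the token fully covers."""
    return sorted(
        name
        for name, needed in REQUIRED_SCOPES.items()
        if needed <= token_scopes  # subset test: every needed scope is present
    )


without_gist = visible_tools({"repo", "read:user"})
no_user_identity = visible_tools({"repo"})
```

In this sketch, a token missing only the gist scope loses create_gist and keeps everything else, and a token with no user scope never sees get_me.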

For the remote server, the team went further with OAuth scope challenges: if your token lacks a needed scope, the server returns a special payload that tells the agent harness to prompt you to authorize it. Approve the scope and the original tool call succeeds, with no retry needed. It’s progressive authorization: start with minimal scopes and escalate as the agent actually needs more.

Use MCP tools’ `annotations` property

Sam is emphatic about one underused part of the MCP spec: annotations. These are metadata hints on MCP tools that tell the agent harness things like “this tool is read-only,” “this tool is destructive,” “this tool’s output comes from the open internet and might contain prompt injection.”

“It’s good for your agentic system to know when something destructive is going to happen,” Sam said. In VS Code and the Copilot CLI, annotations trigger confirmation dialogs for write and destructive operations, providing a moment between the model deciding to call a tool and the call executing, keeping the human in the loop.
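To make the mechanism concrete, here is a minimal sketch of how a harness might turn annotations into a confirmation decision. The hint names and their defaults (readOnlyHint, destructiveHint, idempotentHint, openWorldHint) come from the MCP specification; the confirmation_level policy and the example tool definitions are assumptions, not how VS Code or the Copilot CLI actually decide.

```python
# Annotation hints and their default values per the MCP specification.
SPEC_DEFAULTS = {
    "readOnlyHint": False,
    "destructiveHint": True,
    "idempotentHint": False,
    "openWorldHint": True,
}


def confirmation_level(tool: dict) -> str:
    """Hypothetical harness policy: decide how loudly to ask the user."""
    hints = {**SPEC_DEFAULTS, **tool.get("annotations", {})}
    if hints["readOnlyHint"]:
        return "none"
    return "destructive" if hints["destructiveHint"] else "write"


# Illustrative tool definitions, trimmed to the fields used here.
list_issues = {"name": "list_issues", "annotations": {"readOnlyHint": True}}
create_issue = {
    "name": "create_issue",
    "annotations": {"readOnlyHint": False, "destructiveHint": False},
}
unannotated = {"name": "delete_file"}

levels = [confirmation_level(t) for t in (list_issues, create_issue, unannotated)]
```

Note what the defaults imply: a tool with no annotations at all is treated as potentially destructive and open-world, so annotating read-only tools is what lets a harness skip the prompt.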

Sam is co-authoring a specification enhancement proposal with a colleague from OpenAI to improve the annotation system. The current vocabulary isn’t granular enough for emerging use cases like MCP Apps, where knowing whether an action is reversible matters as much as knowing whether it’s destructive.

Design for the loop, not the model

Sam wants more builders to internalize that the model is not the agent. The software is not the agent. The MCP server is definitely not the agent. The agent is the loop.

“You actually are just running stuff on a loop,” he said. “And the model decides the next thing to do.” Inside that loop, before the model decides and before that decision is executed, there are moments where the harness can intercept a tool call, check its annotations, prompt the user, inspect the response for open-world content, and decide whether to inject it into context.

These beats have “nothing to do with models, but really important in terms of protecting the user.” Understanding them helps you design an MCP server holistically, incorporating the model, the agent harness, the user, and the set of trust boundaries between them.
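The loop and its interception points can be sketched in a few lines. This is a toy sketch under stated assumptions: run_agent_loop, the scripted stand-in for a model, and the read_only and open_world flags on each tool are all hypothetical, and a real harness does far more.

```python
def run_agent_loop(decide, tools, confirm, max_turns=5) -> list[str]:
    """Toy agent loop: the model decides, the harness guards each step."""
    context: list[str] = []
    for _ in range(max_turns):
        call = decide(context)  # the model picks the next action
        if call is None:
            break
        name, args = call
        tool = tools[name]
        # Beat 1: before executing, check the tool's hints and ask the user.
        if not tool["read_only"] and not confirm(name, args):
            context.append(f"{name}: user declined; do not retry")
            continue
        result = tool["run"](**args)
        # Beat 2: before injecting, mark open-world content as untrusted.
        if tool["open_world"]:
            result = f"[untrusted content] {result}"
        context.append(f"{name}: {result}")
    return context


# Hypothetical tools and a scripted "model" that replays fixed decisions.
TOOLS = {
    "fetch_page": {"read_only": True, "open_world": True,
                   "run": lambda url: f"page at {url}"},
    "delete_branch": {"read_only": False, "open_world": False,
                      "run": lambda branch: f"deleted {branch}"},
}
SCRIPT = [
    ("fetch_page", {"url": "https://example.com"}),
    ("delete_branch", {"branch": "tmp"}),
    None,
]


def scripted_model(context: list[str]):
    return SCRIPT[len(context)]


transcript = run_agent_loop(scripted_model, TOOLS, confirm=lambda name, args: False)
```

Neither guard involves the model: both run in the harness, between one decision and the next.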

This also means the “right” tool design depends on the execution context. Copilot CLI can write large tool responses to temporary files and then let the agent grep them. This lets the agent search for a needle in a haystack without overwhelming its context. A chat-based agent can’t do that.
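The spill-then-search pattern might look like the sketch below. It is an illustration only, not how Copilot CLI is implemented: spill_to_file and grep_file are hypothetical helpers built on Python’s standard library.

```python
import re
import tempfile


def spill_to_file(response: str) -> str:
    """Write a large tool response to a temp file and return its path."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as handle:
        handle.write(response)
        return handle.name


def grep_file(path: str, pattern: str, limit: int = 5) -> list[str]:
    """Return up to `limit` matching lines, each prefixed by its line number."""
    regex = re.compile(pattern)
    matches: list[str] = []
    with open(path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if regex.search(line):
                matches.append(f"{number}: {line.rstrip()}")
                if len(matches) >= limit:
                    break
    return matches


# A 10,000-line response with one interesting line buried in it.
lines = [f"log line {i}" for i in range(1, 10001)]
lines[4241] = "ERROR: build failed"
path = spill_to_file("\n".join(lines))
hits = grep_file(path, r"^ERROR")
```

Only the matching line reaches the model’s context; the other 9,999 stay on disk.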

“All these best practices that come and go are questioned every day depending on the execution context,” Sam said. “Check out the advanced features of the agents you’re actually using.”

What’s coming next

The GitHub MCP server ships new features behind an insiders flag. Just add /insiders to the remote server URL or set the header. It is low-effort to join and low-cost to leave. File issues or start discussions at github.com/github/github-mcp-server.

Sam is also proposing a new AAIF working group to develop better tool annotations. If you’re interested in shaping the standards themselves, the MCP specification repo and its Discord are where that work happens. The community is small enough that your input will be read, and responsive enough that you’ll get feedback.

“If you post a comment, people will read it and they will respond to you,” he said of the MCP community. “Even David,” one of MCP’s co-creators, “you’ll see him in the Discord all the time.”

MCP MVP is a video series from Arcade spotlighting the builders shaping the agentic ecosystem. Watch the full interview with Sam Morrow →

Building enterprise MCP servers and want to handle auth without personal access tokens in JSON files? Check out Arcade’s GitHub tools for OAuth-native MCP integration.
