MCP & Skills · vExpertAI Academy

The pain that drives this chapter

The Spiceworks threads and IETF mailing list discussions we collected during research surface the same complaint in different words: vendor APIs are not consistent. Cisco's NX-OS REST API speaks YANG one way; Juniper's NETCONF speaks YANG another way; Arista has eAPI; Palo Alto has XML; Fortinet has its own thing; Aruba has its own. The standards exist (RESTCONF, gNMI, NETCONF), but the implementations disagree on enough details that you can't reuse the same tool across vendors. Build an automation for Cisco; it doesn't work on Juniper. Hire someone who knows Juniper; they re-implement everything for Arista when you swap vendors. The vendor lock-in problem is not just about hardware; it's about every line of code your team writes against any vendor's API.

In chapter 05 you built tools as Python functions inside a notebook. That solved the demo. It does not solve the production problem. Five teams in your company building five agents would write the same show_interface function five times, each calling a slightly different vendor library. When you switch from Claude to GPT, or to a local Llama model, every agent has to be rewritten because the providers' tool-use APIs differ. When you onboard a new device vendor, every existing agent has to learn the new API.

The question this chapter answers: how do you write a tool ONCE and have it work with every LLM, in every agent, on every team — including agents that don't exist yet?

The answer, in 2026, is MCP — the Model Context Protocol. MCP is an open standard, originally proposed by Anthropic in late 2024, that defines how an LLM agent talks to a tool server. By 2026 it is supported by all major LLM providers (Anthropic, OpenAI, Google), all major agent frameworks, and a growing list of vendor servers. The bet of this chapter: MCP is the boring infrastructure that will outlast every framework fad.

The complementary concept is Skills — Anthropic's mechanism for packaging task-specific knowledge that a model can reach for. Skills and MCP solve different problems but are often confused. We will untangle them.

What MCP actually is (without the hype)

MCP is a protocol. That word matters. It is not a framework, not a library, not a model. It is an agreed-upon shape of messages over a transport. You can implement it in any language. The specification defines stdio and HTTP-based transports, and custom transports such as WebSockets are possible. The protocol defines three kinds of things a server can expose: tools (functions the agent can call), resources (data the agent can read), and prompts (parameterized prompt templates). It also defines sampling, which runs in the other direction: the server asks the client to run an LLM completion on its behalf. Most useful servers today implement just tools and resources.

The architecture in three pieces:

MCP server. A process that exposes tools/resources/prompts. You write it once. It runs as a separate service. It can run on your laptop (for personal use), on a server (for team use), on the device itself (for embedded automation). The server knows how to do real work — call netmiko, query NetBox, send to Slack — and exposes that work through a standardized protocol.

MCP client. The LLM-side of the protocol. The client discovers what tools the server offers, presents them to the LLM, dispatches calls when the LLM asks. Claude Code, Claude Desktop, Cursor, Continue, OpenHands — all are MCP clients. Most agent frameworks (LangGraph, CrewAI, OpenAI Agents) have MCP client adapters.

Protocol. JSON-RPC 2.0 messages over a chosen transport. initialize to handshake. tools/list to discover. tools/call to invoke. Results come back as structured JSON. That's the whole protocol surface a beginner needs to know.
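To make those three calls concrete, here is a rough sketch of the JSON-RPC envelopes a client might send, plus one possible tool result. The protocol version string, client name, device name, and interface name are illustrative assumptions, and the sketch leaves out the initialized notification that follows the handshake.

```python
import json


def make_request(request_id, method, params=None):
    """Build a JSON-RPC 2.0 request envelope."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message


# 1. Handshake. The version string and client info are example values.
initialize = make_request(1, "initialize", {
    "protocolVersion": "2025-06-18",
    "capabilities": {},
    "clientInfo": {"name": "example-client", "version": "0.1.0"},
})

# 2. Discovery: ask the server which tools it offers.
list_tools = make_request(2, "tools/list")

# 3. Invocation: call one tool by name with JSON arguments.
call_tool = make_request(3, "tools/call", {
    "name": "show_interface",
    "arguments": {"hostname": "edge-isp-a", "interface": "Ethernet1"},
})

# One possible reply to request 3 (stub text, not real device output).
call_result = {
    "jsonrpc": "2.0",
    "id": 3,
    "result": {
        "content": [{"type": "text", "text": "Ethernet1 is up, line protocol is up"}],
        "isError": False,
    },
}

for message in (initialize, list_tools, call_tool, call_result):
    print(json.dumps(message))
```

The response carries the same id as the request it answers, which is how the client matches results to calls.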

The diagram is simpler than people make it:

[LLM (Claude, GPT, Llama, ...)]
        │
        │ (provider-specific)
        ▼
[MCP Client (Claude Code, your agent)]
        │
        │ JSON-RPC over stdio/HTTP
        ▼
[MCP Server (your network tools)]
        │
        ▼
[Real systems: netmiko, NetBox, Prometheus, ...]

The point: anything below the protocol line is yours, anything above it is somebody else's. Swap the LLM, swap the client, swap the agent framework — the server keeps working. Build the server once, use it everywhere.

Why "write a tool once" is the entire value proposition

Look back at chapter 05. The notebook defined five tools as Python functions, with a hand-rolled dispatcher, all in-process. That works for a demo. The pain hits when:

You build a second agent and want the same tools. You copy-paste the tool code into the second project. Now you have two copies to maintain.
You switch the second agent from Claude to GPT. The tool-use API surface differs. You rewrite the dispatcher.
You move the agent from a notebook to a long-running service. The in-process function calls need to become network calls. You build a transport.
Your security team asks: "who can call show-running-config? where is that authorization enforced?" In-process tools have no answer.

MCP solves all four by separating tool implementation from tool consumption. The implementation lives in a server with its own process boundary, authentication, logging, and rate limits. The consumption is whatever LLM-driven thing wants to use it. The server doesn't know or care which LLM is calling — it speaks JSON-RPC.

This is exactly the lesson the broader software industry learned with REST APIs in the 2010s. The microservices wave was not really about microservices; it was about boundaries. MCP is the microservices version of LLM tooling. The protocol exists so that the boundary can be drawn somewhere, and it turns out the right place to draw it is between "the model and its client" and "the operational systems and the world."

The vendor inconsistency problem, solved sideways

MCP does not standardize the operational systems behind the server. Cisco still speaks IOS, Juniper still speaks Junos, Arista still speaks EOS. MCP sits above that layer.

You can build a single MCP server called network-show-tools that exposes show_interface(hostname, interface). Behind that single tool definition, your server's implementation can detect the device vendor and call netmiko for Cisco, ncclient for Juniper, pyeapi for Arista. The LLM doesn't know any of this; it sees one tool with one signature. The vendor adapter logic lives in your server, written once, by you, in the language you choose.
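A minimal sketch of that adapter idea, under stated assumptions: the inventory, device names, vendor keys, and stub adapter functions below are all hypothetical, and each stub only reports the command it would run instead of calling netmiko, ncclient, or pyeapi.

```python
# Hypothetical inventory mapping hostnames to vendor platform keys.
INVENTORY = {
    "edge-isp-a": "cisco_ios",
    "core-jnp-1": "juniper_junos",
    "leaf-ar-1": "arista_eos",
}


def _cisco_show_interface(hostname, interface):
    # A real server might use netmiko here; this stub only names the command.
    return {"command": f"show interfaces {interface}"}


def _juniper_show_interface(hostname, interface):
    # A real server might use ncclient (NETCONF) here.
    return {"command": f"show interfaces {interface} extensive"}


def _arista_show_interface(hostname, interface):
    # A real server might use pyeapi (eAPI) here.
    return {"command": f"show interfaces {interface}"}


ADAPTERS = {
    "cisco_ios": _cisco_show_interface,
    "juniper_junos": _juniper_show_interface,
    "arista_eos": _arista_show_interface,
}


def show_interface(hostname, interface):
    """The one tool the LLM sees: same signature for every vendor."""
    vendor = INVENTORY.get(hostname)
    if vendor is None:
        raise KeyError(f"unknown device: {hostname}")
    raw = ADAPTERS[vendor](hostname, interface)
    # Normalize to one output shape so the agent never sees vendor differences.
    return {"hostname": hostname, "interface": interface, "vendor": vendor, **raw}


print(show_interface("core-jnp-1", "ge-0/0/0"))
```

Adding a fourth vendor means adding one adapter function and one dictionary entry; the tool signature the agents depend on does not change.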

This is the version of the answer to vendor inconsistency that beats "wait for vendors to standardize" — they have been failing to standardize for thirty years. Instead, you abstract over the inconsistency yourself, expose a clean interface, and stop carrying the abstraction in every agent. The vendors can keep diverging; the agents don't notice.

The notebook will show this concretely: a show_interface MCP tool whose backend is a stub-netmiko that pretends to support three vendors. Real code, real shape, swap netmiko-stub for real netmiko and you're done.

Skills: the other Anthropic concept that's not MCP

Beginners often conflate MCP and Skills. They are related but distinct.

Skills are packages of expertise that Claude can reach for. Think of a skill as a folder containing: a SKILL.md (the description of what the skill does and when to use it), plus optional code, prompts, references, or examples. Anthropic ships some skills built in (xlsx for spreadsheets, pdf for PDFs, frontend-design for UI). Anyone can author skills. When a Claude session encounters a task that matches a skill description, the skill's contents get loaded into context and Claude follows them.

The differences from MCP:

MCP is a protocol; Skills is a content format. MCP defines how messages flow. Skills defines how knowledge is packaged.
MCP runs code; Skills loads instructions. An MCP tool call executes a function on a server. A Skill activation injects markdown into the prompt.
MCP is model-agnostic; Skills is Claude-specific. MCP is an open standard that any LLM provider can adopt. Skills, as of 2026, is Anthropic's mechanism, though similar concepts exist elsewhere.

When to reach for each:

Use MCP when you need to call code: query a database, push a config, send a notification, hit a vendor API. Anything that has side effects or returns live data.

Use Skills when you need to teach Claude a workflow: "how to write a network postmortem in our team's format," "how to triage a customer ticket through our process," "how to format a runbook for our wiki." Anything that's instructions rather than action.

Many real workflows want both. A "diagnose customer issue and write postmortem" task uses MCP to query monitoring and devices, and Skills to format the postmortem the way your team writes them. The two compose cleanly.

The notebook focuses on MCP — it's the harder concept and the higher-leverage one for engineers. Skills get a worked example in chapter 07 (Claude Code), which uses Skills heavily.

The shape of a good MCP server

If you build one MCP server, do these three things well.

Tool descriptions are the API. The LLM picks tools by reading their descriptions. A bad description ("calls show_interface") causes wrong-tool selection. A good description tells the LLM exactly when and why to call this tool, what it returns, and what it won't do. "Returns the operational status of one interface on one device. Includes line protocol, MTU, MAC, last input/output timestamps, and input/output error counters. Read-only. Does NOT return the configuration — for config, use show_running_config_section." Notice the "does NOT" — telling the model what a tool is not for is as important as telling it what it is for.

Input schemas are how the model learns to call correctly. Define every parameter with type, description, and constraints (enum values, regex patterns, ranges). The model uses these to construct calls. Sparse schemas produce malformed calls and confused models.
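As a sketch of what a description plus input schema can look like, the tool definition below follows the name / description / inputSchema layout used in MCP tool listings. The regex patterns and the tiny validator are illustrative assumptions; a real server would more likely lean on a JSON Schema library than on hand-rolled checks.

```python
import re

SHOW_INTERFACE_TOOL = {
    "name": "show_interface",
    "description": (
        "Returns the operational status of one interface on one device. "
        "Includes line protocol, MTU, MAC, last input/output timestamps, and "
        "input/output error counters. Read-only. Does NOT return the "
        "configuration; for config, use show_running_config_section."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "hostname": {
                "type": "string",
                "description": "Inventory name of the device, e.g. edge-isp-a.",
                "pattern": "^[a-z0-9-]+$",  # assumed naming convention
            },
            "interface": {
                "type": "string",
                "description": "Full interface name, e.g. Ethernet1 or ge-0/0/0.",
                "pattern": "^[A-Za-z-]+[0-9/.]+$",  # assumed, deliberately strict
            },
        },
        "required": ["hostname", "interface"],
        "additionalProperties": False,
    },
}


def validate_arguments(tool, arguments):
    """Minimal check of required, unknown, type, enum, and pattern rules."""
    schema = tool["inputSchema"]
    problems = []
    for name in schema.get("required", []):
        if name not in arguments:
            problems.append(f"missing required parameter '{name}'")
    for name, value in arguments.items():
        spec = schema["properties"].get(name)
        if spec is None:
            problems.append(f"unknown parameter '{name}'")
            continue
        if spec.get("type") == "string" and not isinstance(value, str):
            problems.append(f"parameter '{name}' must be a string")
            continue
        if "enum" in spec and value not in spec["enum"]:
            problems.append(f"parameter '{name}' must be one of {spec['enum']}")
        if "pattern" in spec and not re.search(spec["pattern"], value):
            problems.append(f"parameter '{name}' does not match {spec['pattern']}")
    return problems


print(validate_arguments(SHOW_INTERFACE_TOOL, {"hostname": "edge-isp-a"}))
```

The strict interface pattern doubles as a guard: a value such as "Ethernet1; reload" is rejected before it gets anywhere near a device.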

Errors are part of the API. When a tool fails — bad input, device unreachable, permission denied — return a structured error the model can reason about. "Device 'edge-isp-z' not found. Did you mean 'edge-isp-a' or 'edge-isp-b'?" lets the model recover. A bare RuntimeError traceback does not. Treat error messages with the same care you treat success outputs.
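One way to produce that kind of recoverable error is sketched below. The device list is hypothetical, the suggestions come from the standard library's difflib, and the result shape (a content list plus an isError flag) mirrors how MCP tool results report failures.

```python
import difflib

# Hypothetical inventory; a real server would query its source of truth.
KNOWN_DEVICES = ["edge-isp-a", "edge-isp-b", "core-sw-1"]


def device_not_found_result(hostname):
    """Build an error the model can act on, with near-miss suggestions."""
    suggestions = sorted(difflib.get_close_matches(hostname, KNOWN_DEVICES, n=3))
    text = f"Device '{hostname}' not found."
    if suggestions:
        quoted = " or ".join(f"'{name}'" for name in suggestions)
        text += f" Did you mean {quoted}?"
    return {"content": [{"type": "text", "text": text}], "isError": True}


def show_version(hostname):
    if hostname not in KNOWN_DEVICES:
        return device_not_found_result(hostname)
    # Stub success path; a real tool would query the device here.
    text = f"{hostname}: stub version output"
    return {"content": [{"type": "text", "text": text}], "isError": False}


print(show_version("edge-isp-z")["content"][0]["text"])
```

Returning the failure as a normal result, not as an exception, keeps the message inside the conversation where the model can read it and retry with a corrected hostname.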

Beyond these three: keep tools narrow (one tool one job, not "do-everything"), make read tools idempotent, gate write tools behind explicit confirmation parameters (the chapter 05 pattern), log every call with arguments and result.
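The confirmation gate for write tools could look roughly like the following. This is an assumed design, not necessarily the exact chapter 05 pattern: a dry run returns a token bound to the specific hostname and config text, and the real apply only proceeds when that same token comes back. The secret value and the push step are placeholders.

```python
import hashlib
import hmac

# Placeholder only; a real server would load this from a secret store.
SERVER_SECRET = b"replace-me"


def _token_for(hostname, config_text):
    """Derive a short token tied to this exact device and change."""
    payload = f"{hostname}\n{config_text}".encode()
    return hmac.new(SERVER_SECRET, payload, hashlib.sha256).hexdigest()[:16]


def apply_config_change(hostname, config_text, dry_run=True, confirmation_token=None):
    expected = _token_for(hostname, config_text)
    if dry_run:
        # Default path: show what would change and hand back the token.
        return {
            "applied": False,
            "preview": config_text.splitlines(),
            "confirmation_token": expected,
        }
    if confirmation_token is None or not hmac.compare_digest(confirmation_token, expected):
        return {
            "applied": False,
            "error": "Missing or invalid confirmation_token. Run with dry_run=true first.",
        }
    # A real server would push the config to the device here.
    return {"applied": True, "hostname": hostname}


preview = apply_config_change("edge-isp-a", "interface Ethernet1\n description uplink")
print(preview["preview"])
```

Because the token is derived from the change itself, a token issued for one config cannot be reused to push a different one.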

Security: the part that bites a year in

MCP makes it easy to plug agents into anything. The mistake teams make is plugging too much in.

Authentication. An MCP server should authenticate its clients. The specification describes an optional OAuth-based authorization flow for HTTP transports but does not require it, so in practice you choose the mechanism (token, OAuth, mTLS) and you must add one. Without it, anyone on the network can invoke your tools. "It runs on my laptop" is fine for personal use; it is not fine for shared servers.

Authorization. Not every client should call every tool. The server enforces who can call what. "Junior on-call engineers can call read tools; senior engineers can call write tools with confirmation; only the change-management process can call mass-write tools." The authorization model belongs in the server, not in the agent — the agent will say whatever the user prompts it to say.
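A server-side check along those lines might be sketched as follows. The role names, the tool classification, and the apply_config_to_group tool are hypothetical; the point is only that the decision is made from the authenticated client's role, never from anything the model says.

```python
# Hypothetical roles and the classes of tool each may call.
ROLE_PERMISSIONS = {
    "junior_oncall": {"read"},
    "senior_engineer": {"read", "write"},
    "change_management": {"read", "write", "mass_write"},
}

# Hypothetical classification of tools by blast radius.
TOOL_CLASSES = {
    "show_interface": "read",
    "apply_config_change": "write",
    "apply_config_to_group": "mass_write",
}


def is_authorized(role, tool_name):
    """Deny by default: unknown roles and unknown tools are rejected."""
    tool_class = TOOL_CLASSES.get(tool_name)
    if tool_class is None:
        return False
    return tool_class in ROLE_PERMISSIONS.get(role, set())


def dispatch(role, tool_name, arguments, handlers):
    """Check authorization before the handler ever runs."""
    if not is_authorized(role, tool_name):
        text = f"Role '{role}' is not permitted to call '{tool_name}'."
        return {"content": [{"type": "text", "text": text}], "isError": True}
    return handlers[tool_name](**arguments)


handlers = {"show_interface": lambda hostname, interface: {"status": "up"}}
print(dispatch("junior_oncall", "show_interface",
               {"hostname": "edge-isp-a", "interface": "Ethernet1"}, handlers))
print(dispatch("junior_oncall", "apply_config_change", {}, handlers))
```

The role comes from the authentication layer, so a prompt that claims "I am a senior engineer" changes nothing.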

Secret handling. Tools that need credentials (SSH keys, API tokens, vendor passwords) load them from a secret store, not from environment variables visible to the LLM. The LLM should never see a secret; it should call a tool that uses a secret on its behalf. Treat the LLM as a low-trust user.

Prompt injection (chapter 05 mentioned this). If your MCP server returns data that came from an untrusted source — log lines, ticket descriptions, customer emails — that data may contain instructions trying to redirect the agent. Strip or sanitize when possible; design your tools so their outputs can't carry executable instructions.

Audit. Every tool call is logged with timestamp, client identity, tool name, full arguments, result. Logs go somewhere the agent cannot reach. This is your post-incident reconstruction. Production MCP servers without audit logs are a CISO's nightmare and your future regret.
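A small wrapper can capture those fields for every call. In this sketch the sink is any callable that accepts one JSON line; here it is a list for demonstration, whereas a production server would ship the lines to a log store the agent cannot reach. The wrapper name and record layout are assumptions.

```python
import json
from datetime import datetime, timezone


def audited(tool_name, handler, sink):
    """Wrap a tool handler so every call emits one JSON audit record."""
    def wrapper(client_id, arguments):
        record = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "client": client_id,
            "tool": tool_name,
            "arguments": arguments,
        }
        try:
            result = handler(**arguments)
            record["result"] = result
            return result
        except Exception as exc:
            # Failures are audited too, then re-raised unchanged.
            record["error"] = f"{type(exc).__name__}: {exc}"
            raise
        finally:
            sink(json.dumps(record, default=str))
    return wrapper


audit_lines = []
show_version = audited(
    "show_version",
    lambda hostname: {"hostname": hostname, "version": "stub"},
    audit_lines.append,
)
show_version("alice", {"hostname": "edge-isp-a"})
print(audit_lines[0])
```

Writing the record in a finally block means a crashed tool call still leaves a trace, which is usually the call you most want to reconstruct.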

None of this is exotic. It's the same disciplines you'd apply to any internal API server. The mistake is treating the MCP layer as a toy because the demos look casual.

Where MCP is going in 2026

Three patterns to watch.

Vendor-published MCP servers. Cisco, Juniper, Arista, and Palo Alto have all shipped or are shipping official MCP servers in 2025-2026. They wrap their existing APIs in the MCP standard. You can pull them into your agent without writing adapter code. This is the closest thing to a universal vendor API that the industry has ever had, not because the vendors agreed, but because the consumer (the LLM) demanded it.

Composable servers. Your agent talks to a network-tools MCP server, an incident-management MCP server, and a documentation MCP server, all at once. The agent picks tools from all three. The servers don't know about each other. This is the production pattern: one specialized server per concern, composed at the agent level.
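Composition at the agent level can be pictured as a registry that merges the tool lists of several servers and routes each call back to its owner. The StubServer class below stands in for a real MCP client connection (which would issue tools/list and tools/call); the server names, tool names, and the dotted naming scheme are assumptions.

```python
class StubServer:
    """Stand-in for one MCP server connection."""

    def __init__(self, name, tools):
        self.name = name
        self._tools = tools  # tool name -> callable

    def list_tools(self):
        return sorted(self._tools)

    def call_tool(self, tool, arguments):
        return self._tools[tool](**arguments)


def build_registry(servers):
    """Merge tool lists, prefixing each tool with its server name."""
    registry = {}
    for server in servers:
        for tool in server.list_tools():
            registry[f"{server.name}.{tool}"] = (server, tool)
    return registry


def call(registry, qualified_name, arguments):
    """Route a call to whichever server owns the tool."""
    server, tool = registry[qualified_name]
    return server.call_tool(tool, arguments)


network = StubServer("network", {"show_version": lambda hostname: f"{hostname}: stub"})
incidents = StubServer("incidents", {"open_ticket": lambda title: {"ticket": 1, "title": title}})
docs = StubServer("docs", {"search": lambda query: [f"runbook about {query}"]})

registry = build_registry([network, incidents, docs])
print(sorted(registry))
print(call(registry, "incidents.open_ticket", {"title": "Ethernet1 flapping"}))
```

The prefix avoids collisions when two servers both expose a tool called search, and none of the servers needs to know the others exist.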

Self-hosted vs. cloud servers. Some MCP servers run on your laptop (personal); some run in your data center (shared team); some are SaaS (Anthropic hosts a few; vendor SaaS is growing). The choice depends on what credentials and data the server touches. Read-only Wikipedia? Cloud is fine. Push-config to your core router? Self-hosted, behind your perimeter.

The architectural direction is clear: distinct servers per domain, agents that compose them, protocol stability over time. This is the maturing version of the agent ecosystem. Chapter 06 of this course is here because the protocol matters more than any specific implementation — when CrewAI replaces LangGraph or Claude replaces GPT, the MCP server keeps working.

What the notebook will give you

The notebook builds an MCP-shaped tool layer for network operations. We don't run a real MCP server, because that requires a separate process and adds Colab friction, but we implement the exact shape an MCP server uses, with the same tool definitions, schemas, and dispatch model. Swap the in-notebook dispatcher for a thin wrapper around the mcp Python library and you have a real MCP server in roughly 20 to 30 lines.

Setup. Stub netmiko that pretends to support three vendors. Mock devices with different vendor types.

Define tools. Six tools with proper MCP-style schemas:
- list_devices: read-only enumeration.
- show_version(hostname): read-only, vendor-detects.
- show_interface(hostname, interface): read-only, vendor-detects.
- show_running_config_section(hostname, section): read-only.
- search_configs(query): RAG-style, reuses chapter 04 logic conceptually.
- apply_config_change(hostname, config_text, dry_run, confirmation_token): write tool with chapter 05 safety pattern.

Vendor adapter. Each read tool detects the device vendor and dispatches to the right backend stub. The agent sees one consistent tool surface; the server hides the inconsistency.

Agent integration. Final cells use the chapter 05 agent loop with these new tools. The investigation now spans simulated multi-vendor devices, and you watch the agent pick tools that hide the vendor differences.

Compare to MCP for real. A markdown cell shows the minimal pip install mcp plus roughly 20 to 30 lines of Python that would convert this notebook's tool layer into a real MCP server you could run as a process. You won't run it in Colab, but you'll know exactly how.
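For orientation, the conversion might look something like the sketch below. It is not the notebook's actual cell. It assumes the official mcp Python SDK and its FastMCP helper, and it imports the SDK inside build_server so the tool function can still be imported and tested where the package is not installed. The tool body is a stub.

```python
def show_interface(hostname: str, interface: str) -> str:
    """Return the operational status of one interface on one device. Read-only."""
    # Stub backend; the notebook's vendor adapter would go here.
    return f"{hostname} {interface}: up/up (stub data)"


def build_server():
    """Register the tool with an MCP server (requires: pip install mcp)."""
    from mcp.server.fastmcp import FastMCP

    server = FastMCP("network-show-tools")
    # tool() returns a decorator; applying it registers the function and
    # derives the input schema from the type hints and docstring.
    server.tool()(show_interface)
    return server


# To run as a real MCP server over stdio, call from a script:
#     build_server().run()
print(show_interface("edge-isp-a", "Ethernet1"))
```

Keeping the tool as a plain function means the same code serves the in-notebook dispatcher and the real server; only the registration step differs.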

What comes next

Chapter 07 is Claude Code — the most immediately useful container for everything you've built so far. Claude Code is a CLI that lets you point Claude at your codebase or environment and have it work. It's an MCP client by default (so all your servers from chapter 06 plug in directly), it understands Skills (so your team workflows package up nicely), and it ships with safety patterns the chapter 05 / chapter 06 disciplines feed into. By the end of chapter 07 you'll have a working Claude Code session that uses your local MCP server to investigate a (mock) network problem, end-to-end.

For now: run the notebook. Define a tool with a proper schema. Watch the agent pick the right tool based on the description. Internalize that the protocol is the boundary — everything below the boundary is yours to design, everything above it composes with everyone else's work. That boundary is the most important architectural concept in the rest of the course.

Field exercise: take ONE existing read-only script in your team's collection. Wrap it as an MCP tool using the schema in the notebook. Even if you don't run it as a real server yet, write the tool definition exactly as MCP wants it. You'll discover gaps in the script's interface (vague parameters, ambiguous return values) that have always been there and that the LLM forces you to fix.

Wrong way to use this chapter: treat MCP as a framework to learn. Right way: treat MCP as a contract between two things — the LLM that wants tools and the operations world that has them — and put effort into the tool definitions, not the protocol mechanics.

Pain anchored: T5 (tooling fragmentation, no unified query) + T11 (YANG/API inconsistency across vendors). MCP is the layer at which you abstract over the inconsistency once, on your terms. Maps to: chapter 06-mcp-and-skills. Pairs with the polished MCP chapters already in this folder (00-syllabus.md through 10-capstone.md, plus tools-mcp-skills-production-ai-agents.md).