MCP Hosts, Servers & Endpoints: The Complete Guide to AI Agent Integration
The agentic web is no longer a thought experiment. Developers are running autonomous AI agents on cheap VPS instances, home servers, and Raspberry Pis — agents that discover tools at runtime, call APIs, query databases, and execute code. The protocol making this possible is MCP: the Model Context Protocol.
This post explains how MCP works, surveys the agent landscape in mid-2026, and walks through running your own MCP endpoint with suckless-mcp, a minimal Rust gateway that turns any CLI script into an AI-callable skill.
What Is MCP?
The Model Context Protocol is an open standard that lets AI agents discover and use external tools dynamically. Think of it as USB for AI: write your tool once, expose it over MCP, and any compatible agent can plug in and use it — without documentation, without custom integration code, without vendor lock-in.
Before MCP, every AI integration was bespoke. OpenAI had function calling. ChatGPT had plugins. Everyone else had curl. The glue code was your problem.
With MCP, the protocol handles discovery. The agent asks your server “what can you do?” Your server describes its tools and their input schemas. The agent figures out when and how to call them. That’s the entire contract.
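To make that contract concrete, here is a rough sketch of the discovery exchange as JSON-RPC 2.0 messages. The method name `tools/list` and the `name` / `description` / `inputSchema` fields follow the MCP specification; the `weather` tool and its schema are only illustrative, and a real session normally starts with an initialization handshake that is left out here.

```python
import json

# JSON-RPC 2.0 request an MCP client sends to ask "what can you do?"
list_request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

# Shape of a typical reply. The tool shown is illustrative, not a real server's output.
list_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {
                "name": "weather",
                "description": "Get current weather for any city",
                "inputSchema": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            }
        ]
    },
}


def tool_names(response: dict) -> list:
    """Return the names of the tools a server advertised."""
    return [tool["name"] for tool in response["result"]["tools"]]


print(json.dumps(list_request))
print(tool_names(list_response))
```

The agent never sees your code, only these descriptions and schemas, which is why the wording of a tool description matters so much.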
The Four Layers
┌─────────────────────────────────────────────────────────┐
│ AI AGENT (Claude, GPT, Hermes…) │
│ "What's on parcel 3086?" │
└────────────────────┬────────────────────────────────────┘
│ decides to call a tool
▼
┌─────────────────────────────────────────────────────────┐
│ MCP HOST / CLIENT │
│ Claude Desktop, Cursor, OpenClaw, ZeroClaw │
│ "I need to call the cadastre tool…" │
└────────────────────┬────────────────────────────────────┘
│ HTTP + JSON-RPC
▼
┌─────────────────────────────────────────────────────────┐
│ MCP SERVER (Gateway) │
│ suckless-mcp, mcp-server-python │
│ "Here are my tools and their schemas" │
└────────────────────┬────────────────────────────────────┘
│ spawns subprocess
▼
┌─────────────────────────────────────────────────────────┐
│ YOUR CLI TOOLS (Skills) │
│ weather.py, db_query.sh, cadastre.py │
│ "--city Ljubljana --output json" │
└─────────────────────────────────────────────────────────┘
| Term | What it means | Analogy |
|---|---|---|
| MCP Host | The AI application (Claude Desktop, Cursor) | Your phone’s OS |
| MCP Client | The part of the host that speaks MCP | Bluetooth stack |
| MCP Server | The tool gateway (suckless-mcp) | USB hub |
| MCP Endpoint | The URL agents connect to (/mcp) | USB port |
| Skill | Your CLI tool + skill.toml manifest | USB device |
| Tool Discovery | Agent asks “what can you do?” | Plug-and-play |
The Agent Landscape in 2026
Not all agents are equal. Some are polished consumer products with a fixed model under the hood. Others are runtimes where the LLM is just one swappable component. Knowing the difference matters when you’re deciding which agents your MCP server needs to serve well.
Claude (Anthropic)
Claude is both the model and, through Claude Desktop and the Claude API, an MCP host. It’s the most widely used agent for MCP tool use in practice. Claude’s strength is careful tool selection — it reads skill.toml descriptions seriously and tends not to call tools it doesn’t need. The MCP integration in Claude Desktop is mature: you point it at your endpoint, optionally add a Bearer token, and tools appear automatically.
Claude is a closed model. You use the API, you pay per token, you don’t swap out the weights. For most use cases that’s fine — it’s very capable. For air-gapped or cost-sensitive deployments, look elsewhere.
OpenAI Codex / GPT with MCP
OpenAI’s models gained native MCP support in 2025. Codex — now focused on coding agent tasks — can connect to MCP endpoints and use tools for file operations, shell execution, and external APIs. The integration works similarly to Claude: configure the endpoint, the agent discovers tools at runtime.
Like Claude, these are closed models. The LLM is not pluggable. You’re on OpenAI’s infrastructure and pricing.
OpenClaw
OpenClaw started in November 2025 as “Clawdbot,” a weekend project by Austrian developer Peter Steinberger for running AI agents inside messaging apps. It went viral in January 2026 — reaching 9,000 stars in the first 24 hours after rebranding, and over 247,000 stars within six weeks.
OpenClaw is an open-source autonomous AI agent framework that has created an entirely new software category. It runs locally, connects to messaging platforms (Telegram, Slack, WhatsApp, Discord), and supports MCP servers for tool use. The LLM is pluggable: you configure which provider and model to use. Point it at Claude, GPT-4, or a local Ollama instance — OpenClaw doesn’t care.
The default deployment runs agents with the same privileges as the host user, which means a compromised agent can read SSH keys, access cloud credentials, and exfiltrate sensitive data. The large plugin ecosystem is powerful, but it comes with surface area to audit. Use it when you want a full-featured, extensible agent and are comfortable managing the permissions carefully.
ZeroClaw
ZeroClaw was launched on February 13, 2026 as a high-performance Rust alternative to OpenClaw. Built as a single Rust binary, it is much lighter than a full local agent stack and easier to run on small machines, cheap VPS instances, or always-on home servers.
ZeroClaw talks to LLM providers like Anthropic, OpenAI, Ollama, and 20+ others, connects via 30+ channels, and acts through tools such as shell, browser, HTTP, hardware, and custom MCP servers. The LLM is fully pluggable — swap between providers in config. It reports 10ms cold starts on ARM64 edge nodes using 5MB RAM.
ZeroClaw is easier to reason about than OpenClaw because it has a smaller runtime surface. Choose ZeroClaw if you want a minimal local agent with fewer moving parts. The suckless philosophy — do one thing, do it fast — maps naturally onto suckless-mcp. The two are a good pairing.
Hermes Agent
Nous Research released Hermes Agent on February 25, 2026. Seven weeks later it crossed 95,000 GitHub stars, making it the fastest-growing open-source agent framework of the year.
Hermes Agent supports MCP for connecting to any external MCP server, and exposes four plugin hooks: pre_llm_call, post_llm_call, on_session_start, and on_session_end. What sets Hermes apart is persistent learning: it writes what it learns about your codebase and workflows to files on disk, so context accumulates across sessions rather than resetting.
It supports OpenAI, Anthropic, and local models via Ollama — you bring your own API key. It’s deployable on a $5/month VPS, and you only pay LLM API costs. The LLM is pluggable; Hermes is model-agnostic by design, a reflection of its origins at Nous Research, a lab that trains and releases open-weight models.
The LLM-Pluggable Pattern
OpenClaw, ZeroClaw, and Hermes share a design decision that’s worth calling out explicitly: the language model is a configuration option, not a product dependency.
You point these agents at an endpoint — Anthropic API, OpenAI, an Ollama instance running Llama or Mistral, or any OpenAI-compatible server — and the agent does the rest. This matters for several reasons:
- Cost control: run cheap models for routine tasks, expensive ones for hard ones
- Privacy: air-gap with a local model, no data leaves your machine
- Vendor independence: switch providers without rewriting your tooling
- Compliance: regulated environments can use approved model deployments
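The reasons above boil down to "the model is a config value". A minimal sketch of that idea follows, assuming a hypothetical `ModelEndpoint` record and placeholder URLs and model names; none of this is taken from OpenClaw, ZeroClaw, or Hermes configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEndpoint:
    """Hypothetical description of one LLM backend an agent could use."""
    base_url: str
    model: str
    local: bool  # True when no data leaves the machine


# Illustrative values only: URLs and model names are placeholders.
ENDPOINTS = {
    "routine": ModelEndpoint("http://localhost:11434/v1", "small-local-model", True),
    "hard": ModelEndpoint("https://llm.example.com/v1", "large-hosted-model", False),
}


def pick_endpoint(task: str, require_local: bool = False) -> ModelEndpoint:
    """Choose a backend by task difficulty, honoring a privacy constraint."""
    choice = ENDPOINTS.get(task, ENDPOINTS["routine"])
    if require_local and not choice.local:
        # Privacy wins over capability: fall back to the local model.
        choice = ENDPOINTS["routine"]
    return choice


print(pick_endpoint("hard").model)
print(pick_endpoint("hard", require_local=True).model)
```

The tools on the MCP side stay the same whichever endpoint gets picked.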
This is the direction the ecosystem is moving. Closed-model agents like Claude Desktop are excellent products, but the open agent runtimes treat the LLM as infrastructure — interchangeable, configurable, yours to decide.
Real-World Example: GeoAgentic
Here’s how this works in production. I run GeoAgentic, a chatbot and MCP server that exposes Slovenian cadastral (land registry) data to AI agents.
User: "What's on parcel 3086 in Ljubljana?"
↓
Agent discovers: slovenia-cadastre tool
↓
Calls: slovenia-cadastre --command parcel --parcel-num 3086 --ko-id 1725
↓
Returns:
{
"parcel": "3086",
"area_m2": 2278,
"addresses": [{"street": "Trg republike", "number": 1}]
}
No API documentation. No custom client. Just natural language into structured data.
Any MCP-compatible agent — Claude, ZeroClaw running Llama, Hermes with GPT-4o — can use this endpoint identically. The protocol abstracts away which model is thinking.
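On the wire, the call in the exchange above is a JSON-RPC `tools/call` request, and the tool's JSON comes back wrapped as a text content item. The sketch below shows roughly what that looks like; the argument names (`command`, `parcel_num`, `ko_id`) are guesses derived from the CLI flags, not GeoAgentic's published schema.

```python
import json


def make_tool_call(request_id: int, name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def wrap_result(request_id: int, payload: dict) -> dict:
    """Wrap a tool's JSON output as a text content item in the reply."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "result": {
            "content": [{"type": "text", "text": json.dumps(payload)}],
            "isError": False,
        },
    }


# Argument names are assumptions based on the flags shown in the example.
call = make_tool_call(
    7,
    "slovenia-cadastre",
    {"command": "parcel", "parcel_num": "3086", "ko_id": "1725"},
)
reply = wrap_result(7, {"parcel": "3086", "area_m2": 2278})
print(json.dumps(call, indent=2))
print(json.dumps(reply, indent=2))
```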
suckless-mcp: Why It Exists
The MCP server ecosystem has a weight problem. Most gateways require Python, Node, Docker, or all three. They come with dependency trees, config DSLs, and abstractions that obscure what’s actually happening.
suckless-mcp is the opposite: one static Rust binary, zero runtime dependencies. Every skill is a CLI script with a skill.toml manifest. The gateway spawns it, captures the JSON output, and returns it to the agent.
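The spawn-and-capture loop is simple enough to sketch in a few lines of Python. This is not suckless-mcp's actual code (that is Rust); `build_argv` and `run_skill` are hypothetical helpers that only illustrate the idea of mapping named arguments to `--flags`, running the script, and parsing stdout as JSON.

```python
import json
import subprocess
import sys


def build_argv(entrypoint: list, flags: dict, arguments: dict) -> list:
    """Translate named arguments into `--flag value` pairs."""
    argv = list(entrypoint)
    for name, value in arguments.items():
        if name not in flags:
            raise ValueError(f"unknown input: {name}")
        argv += [flags[name], str(value)]
    return argv


def run_skill(argv: list, timeout_secs: int = 30) -> dict:
    """Run the skill without a shell and parse its stdout as JSON."""
    proc = subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout_secs, check=True
    )
    return json.loads(proc.stdout)


# Stand-in skill: a one-line Python program that echoes its --city value.
ECHO = "import sys, json; print(json.dumps({'city': sys.argv[2]}))"
argv = build_argv(
    [sys.executable, "-c", ECHO], {"city": "--city"}, {"city": "Ljubljana"}
)
print(run_skill(argv))
```

Passing the arguments as a list rather than a shell string means agent-supplied values are never interpreted by a shell, which is one reason the flags-only convention is worth keeping.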
| Traditional MCP Server | suckless-mcp |
|---|---|
| Python + 50 deps | One Rust binary |
| Complex config syntax | TOML + --flags |
| Needs Docker/k8s | Runs as systemd service |
| Each tool = new server | One gateway, all skills |
| Auth all-or-nothing | Per-tool public flag |
The public flag deserves a mention. Since 2026.05, suckless-mcp supports per-tool authentication: mark a skill public = true and any agent can call it without a Bearer token. Private skills require a key. Anonymous agents see only the public tools in tools/list; authenticated agents see everything. This lets you run a mixed endpoint — open data tools alongside sensitive internal ones — on a single server.
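The visibility rule described above fits in two small functions. This is a sketch of the logic as the paragraph describes it, not the gateway's implementation; the tool records and helper names are made up for illustration.

```python
def visible_tools(tools: list, authenticated: bool) -> list:
    """Tools a caller should see in tools/list."""
    return [t for t in tools if authenticated or t.get("public", False)]


def may_call(tool: dict, authenticated: bool) -> bool:
    """Public tools are open to anyone; private ones need a valid key."""
    return authenticated or tool.get("public", False)


# Hypothetical skill records: one open-data tool, one internal tool.
TOOLS = [
    {"name": "weather", "public": True},
    {"name": "db_query", "public": False},
]

print([t["name"] for t in visible_tools(TOOLS, authenticated=False)])
print([t["name"] for t in visible_tools(TOOLS, authenticated=True)])
```

Treating a missing `public` key as private is the safer default in a sketch like this; whether suckless-mcp does the same is an assumption.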
Step-by-Step: Your Own MCP Endpoint
Install
curl -fsSL https://raw.githubusercontent.com/roverbird/suckless-mcp/main/install.sh -o install.sh
chmod +x install.sh
./install.sh
suckless-mcp --status
The installer puts the binary at /usr/local/bin/, creates /opt/skills/, generates config at /etc/suckless-mcp/config.toml and keys at /etc/suckless-mcp/keys.toml, sets up a suckless system user, and installs a systemd service.
Create a Skill
Every skill is a folder with two files:
sudo mkdir -p /opt/skills/weather
weather.py — your actual tool:
#!/usr/bin/env python3
import argparse
import json
parser = argparse.ArgumentParser()
parser.add_argument("--city", required=True)
args = parser.parse_args()
# call whatever API or database you need
result = {
"city": args.city,
"temperature_c": 18,
"conditions": "partly cloudy"
}
print(json.dumps(result))
skill.toml — the manifest the agent reads:
name = "weather"
description = "Get current weather for any city"
public = true # no API key required
[runtime]
entrypoint = "weather.py"
timeout_secs = 30
[inputs.city]
type = "string"
flag = "--city"
required = true
description = "City name (e.g., Ljubljana, Maribor)"
sudo chmod +x /opt/skills/weather/weather.py
Two rules for skills: only --flags, never positional args; only JSON on stdout, never debug text. Everything else is up to you.
Test Locally
# direct test
python3 /opt/skills/weather/weather.py --city Ljubljana
# through suckless-mcp
suckless-mcp --skills --name weather
Add Authentication for Private Skills
suckless-mcp --keys-add --id myagent --key "$(openssl rand -hex 32)"
# private skill — requires Bearer token
name = "db_query"
public = false
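For a feel of what checking a Bearer token involves on the server side, here is a minimal sketch. It is an assumption about how such a check is commonly written, not suckless-mcp's code; `extract_bearer` and `is_authorized` are hypothetical names, and the key store is a plain dict standing in for keys.toml.

```python
import hmac
from typing import Optional


def extract_bearer(header: Optional[str]) -> Optional[str]:
    """Pull the token out of an `Authorization: Bearer <token>` header."""
    if not header:
        return None
    scheme, _, token = header.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return None
    return token


def is_authorized(header: Optional[str], valid_keys: dict) -> bool:
    """Compare the presented token against every stored key in constant time."""
    token = extract_bearer(header)
    if token is None:
        return False
    matches = [
        hmac.compare_digest(token.encode(), key.encode())
        for key in valid_keys.values()
    ]
    return any(matches)


KEYS = {"myagent": "abc123"}  # placeholder key, never hard-code real ones
print(is_authorized("Bearer abc123", KEYS))
print(is_authorized("Bearer wrong", KEYS))
```

`hmac.compare_digest` is used instead of `==` so that comparison time does not leak how much of a guessed key was correct.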
Expose via Caddy
mcp.yourdomain.com {
header {
Strict-Transport-Security "max-age=31536000"
X-Content-Type-Options "nosniff"
}
reverse_proxy localhost:8080
}
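Rate limiting belongs at the Caddy layer. Note that the Caddyfile above only sets up HTTPS, security headers, and the reverse proxy; stock Caddy does not ship a rate limiter, so you need to add a rate-limiting plugin for that. suckless-mcp handles auth and tool dispatch. Each layer does one thing.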
Connect Your Agent
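Claude Desktop (~/.config/Claude/claude_desktop_config.json on Linux; on macOS the file lives at ~/Library/Application Support/Claude/claude_desktop_config.json):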
{
"mcpServers": {
"my-skills": {
"url": "https://mcp.yourdomain.com/mcp",
"headers": {
"Authorization": "Bearer your-secret-key"
}
}
}
}
ZeroClaw (config.toml):
[[mcp_servers]]
url = "https://mcp.yourdomain.com/mcp"
auth_token = "your-secret-key"
For public endpoints, omit the auth entirely. The agent will still discover and call your public tools.
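If you want to poke at an endpoint without an agent in the loop, a hand-built request is enough. The sketch below only constructs the HTTP request (it does not send it) and shows how the Bearer header is simply left out for public access. The URL and key are the placeholders used above, the `Accept` header assumes the streamable HTTP transport, and a real server may expect an initialization handshake first.

```python
import json
import urllib.request
from typing import Optional


def build_request(url: str, method: str, token: Optional[str] = None,
                  params: Optional[dict] = None) -> urllib.request.Request:
    """Build a JSON-RPC POST for an MCP endpoint, with optional Bearer auth."""
    body = {"jsonrpc": "2.0", "id": 1, "method": method}
    if params is not None:
        body["params"] = params
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json, text/event-stream",
    }
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        url, data=json.dumps(body).encode(), headers=headers, method="POST"
    )


private = build_request("https://mcp.yourdomain.com/mcp", "tools/list",
                        token="your-secret-key")
public = build_request("https://mcp.yourdomain.com/mcp", "tools/list")
print(private.get_method(), private.full_url)
print(private.has_header("Authorization"), public.has_header("Authorization"))
```

To actually send it, pass the request to `urllib.request.urlopen` and read the response body.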
What to Do and What Not to Do
Do:
- Write CLI tools with --flags only (no positional args)
- Output pure JSON, nothing else on stdout
- Use skill.toml to describe inputs clearly; the agent reads this description to decide when to call your tool
- Set public = true for safe, read-only, unauthenticated tools
- Rate limit at the proxy layer (Caddy), not in the tool
Don’t:
- Print debug output to stdout (breaks JSON parsing)
- Put secrets in skill.toml (use environment variables)
- Expose dangerous operations (rm -rf, unrestricted shell) as public skills
- Use positional args (sys.argv[1]); agents pass --flags
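Several of these rules can be shown in one small skill skeleton: named flags via argparse, diagnostics on stderr, secrets from the environment, and a single JSON document on stdout. This is a sketch; `WEATHER_API_KEY` is a hypothetical variable name and the returned values are dummy data.

```python
import argparse
import json
import os
import sys


def log(message: str) -> None:
    """Diagnostics go to stderr so stdout stays pure JSON."""
    print(message, file=sys.stderr)


def main(argv: list) -> str:
    parser = argparse.ArgumentParser()
    parser.add_argument("--city", required=True)  # named flag, never positional
    args = parser.parse_args(argv)

    # Hypothetical secret name: read it from the environment, not skill.toml.
    api_key = os.environ.get("WEATHER_API_KEY", "")
    log(f"looking up {args.city} (api key set: {bool(api_key)})")

    # Dummy payload standing in for a real API or database lookup.
    return json.dumps({"city": args.city, "temperature_c": 18})


# A real skill would call main(sys.argv[1:]); fixed arguments keep the demo runnable.
print(main(["--city", "Ljubljana"]))
```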
The Bigger Picture
MCP is not another API format. It’s the protocol that lets the agentic layer decouple from any particular AI product. Claude calls your tools the same way ZeroClaw does. Hermes running Llama locally calls them the same way GPT-4o does. The MCP endpoint doesn’t know or care what model is on the other end.
The agent runtimes that treat the LLM as a pluggable component — OpenClaw, ZeroClaw, Hermes — are betting that the model is not the moat. The moat is the tools, the data, and the automation you’ve built. MCP is how you make those assets available to every agent at once.
GeoAgentic is a small example of this: Slovenian cadastral data, queryable by any agent worldwide, with no per-agent integration work. Write the tool once. Expose it over MCP. Done.
Quick Reference
Install suckless-mcp:
curl -fsSL https://raw.githubusercontent.com/roverbird/suckless-mcp/main/install.sh | sudo bash
Create a skill:
sudo mkdir -p /opt/skills/my-skill
# add skill.toml + script
sudo systemctl restart suckless-mcp
Connect any agent:
{ "mcpServers": { "my-skills": { "url": "https://yourdomain.com/mcp" } } }
Test GeoAgentic live:
- Endpoint: geoagentic.app/mcp (no key required)
- Ask any MCP agent: “What’s on parcel 3086 in Ljubljana?”
suckless-mcp — only your tools matter
Questions? Open an issue on the suckless-mcp GitHub repo or get in touch.
