2026-03-27 17:54:38 +08:00

local-mcp

local-mcp is a localhost-first MCP server. Its primary responsibility is to deliver the next pending user instruction to an agent through the get_user_request tool. It also provides a responsive web UI for managing the instruction queue and monitoring server/agent activity.

This document is the implementation plan for the project.

1. Goals

  • Provide a single MCP tool, get_user_request, that returns at most one instruction per call.
  • Give the user a polished local web UI to add, edit, remove, review, and monitor instructions.
  • Preserve queue integrity so consumed instructions are clearly visible but no longer editable/deletable.
  • Support configurable waiting/default-response behavior when no instruction is available.
  • Show live server status and inferred agent connectivity in the UI.
  • Keep the stack lightweight, maintainable, debuggable, and friendly to local development.

2. Technology Stack

Backend

  • Language/runtime: Python 3.11+
  • MCP integration: official Python MCP SDK
  • HTTP server/API layer: FastAPI
  • ASGI server: Uvicorn
  • Persistence: SQLite via Python standard library sqlite3
  • Concurrency/state coordination: asyncio + standard library synchronization primitives where needed
  • Logging/error handling: Python logging, structured request logs, centralized exception handling
  • Configuration: environment variables + small local config file (.json or .toml)

Why this backend stack

  • The MCP SDK is the correct dependency for exposing the MCP tool cleanly.
  • FastAPI + Uvicorn is a small, pragmatic backend stack that simplifies routing, validation, health endpoints, and server-sent updates without introducing a heavy framework.
  • SQLite keeps the system local-first, dependency-light, and durable enough for instruction history and settings.
  • Most supporting concerns remain in the Python standard library, which keeps third-party dependencies minimal.

Frontend

  • UI technology: plain HTML, CSS, and JavaScript only
  • Realtime updates: Server-Sent Events (preferred) with polling fallback if necessary
  • Styling: local CSS files with design tokens and component-specific stylesheets
  • Client architecture: modular vanilla JS organized by feature (api.js, state.js, events.js, instructions.js, etc.)
  • Assets: all fonts/icons/scripts/styles stored locally in the repository; no CDN usage

Mandatory frontend implementation instruction

Any future frontend implementation work must first read and follow:

  • .github/instructions/frontend-design.instructions.md

This instruction file is mandatory for the UI because it requires a distinctive, production-grade, non-generic frontend. The implementation should not default to generic dashboard aesthetics.

3. Product/Architecture Plan

Core backend responsibilities

  1. Expose the MCP tool get_user_request.
  2. Maintain an instruction queue with durable storage.
  3. Mark instructions as consumed atomically when delivered to an agent.
  4. Expose local HTTP endpoints for the web UI.
  5. Stream status/instruction updates to the browser in real time.
  6. Infer agent connectivity from recent MCP tool activity.
  7. Persist and serve server configuration such as wait timeout and default empty response.

Core frontend responsibilities

  1. Show queued and consumed instructions in separate, clearly labeled sections.
  2. Allow add/edit/delete only for instructions that are still pending.
  3. Cross out and grey out consumed instructions.
  4. Show server status, inferred agent status, last fetch time, and configuration values.
  5. Update live as instruction state changes.
  6. Remain usable and visually polished on desktop and smaller screens.

Suggested repository layout

local-mcp/
├─ main.py
├─ README.md
├─ requirements.txt
├─ app/
│  ├─ __init__.py
│  ├─ config.py
│  ├─ database.py
│  ├─ logging_setup.py
│  ├─ models.py
│  ├─ services/
│  │  ├─ instruction_service.py
│  │  ├─ status_service.py
│  │  └─ event_service.py
│  ├─ api/
│  │  ├─ routes_instructions.py
│  │  ├─ routes_status.py
│  │  └─ routes_config.py
│  └─ mcp_server.py
├─ static/
│  ├─ index.html
│  ├─ css/
│  │  ├─ base.css
│  │  ├─ layout.css
│  │  └─ components.css
│  ├─ js/
│  │  ├─ api.js
│  │  ├─ app.js
│  │  ├─ events.js
│  │  ├─ instructions.js
│  │  └─ status.js
│  └─ assets/
└─ data/
   └─ local_mcp.sqlite3

4. Data Model Plan

instructions

  • id - string/UUID primary key
  • content - text, required
  • status - enum: pending, consumed
  • created_at - datetime
  • updated_at - datetime
  • consumed_at - nullable datetime
  • consumed_by_agent_id - nullable string
  • position - integer for stable queue order

settings

  • default_wait_seconds - integer; seconds the tool waits before returning an empty/default response. Set exclusively by the user via the web UI.
  • default_empty_response - text, nullable
  • agent_stale_after_seconds - integer

agent_activity

  • agent_id - string primary key
  • last_seen_at - datetime
  • last_fetch_at - datetime
  • last_result_type - enum: instruction, empty, default_response
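
The three tables above could be created with the standard library sqlite3 module roughly as follows. This is a minimal sketch, not the project's actual schema: storing datetimes as ISO-8601 text, keeping settings as a single row with id = 1, and the init_db helper name are all assumptions made for illustration.

```python
import sqlite3

# Column names follow the data model above; types and constraints are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS instructions (
    id                   TEXT PRIMARY KEY,      -- UUID string
    content              TEXT NOT NULL,
    status               TEXT NOT NULL DEFAULT 'pending'
                         CHECK (status IN ('pending', 'consumed')),
    created_at           TEXT NOT NULL,         -- ISO-8601 text (assumption)
    updated_at           TEXT NOT NULL,
    consumed_at          TEXT,
    consumed_by_agent_id TEXT,
    position             INTEGER NOT NULL       -- stable queue order
);
CREATE TABLE IF NOT EXISTS settings (
    id                        INTEGER PRIMARY KEY CHECK (id = 1),  -- single row (assumption)
    default_wait_seconds      INTEGER NOT NULL,
    default_empty_response    TEXT,
    agent_stale_after_seconds INTEGER NOT NULL
);
CREATE TABLE IF NOT EXISTS agent_activity (
    agent_id         TEXT PRIMARY KEY,
    last_seen_at     TEXT NOT NULL,
    last_fetch_at    TEXT NOT NULL,
    last_result_type TEXT NOT NULL
                     CHECK (last_result_type IN ('instruction', 'empty', 'default_response'))
);
"""


def init_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create any missing tables (hypothetical helper)."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

The CHECK constraints keep the two enum columns limited to the values listed above, so a bad status is rejected by the database rather than only by application code.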

5. Detailed API Design

All routes are local-only and intended for localhost usage.

5.1 MCP tool contract

Tool: get_user_request

Purpose

  • Return the next pending instruction, if one exists.
  • If none exists, wait for a configurable duration, then return the configured empty or default response.
  • Record agent activity so the UI can infer whether an agent is currently connected/recently active.
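
The wait-then-return behavior described above might be sketched with asyncio.Event as follows. The QueueSignal class and its method names are hypothetical; the 86400 s cap is the server maximum stated in the behavior rules. This sketch only covers the waiting part, not queue access.

```python
import asyncio

MAX_WAIT_SECONDS = 86400  # absolute server maximum from the behavior rules


class QueueSignal:
    """Hypothetical helper: lets a waiting tool call wake up as soon as
    the web UI adds an instruction, instead of sleeping the full timeout."""

    def __init__(self) -> None:
        self._event = asyncio.Event()

    def notify(self) -> None:
        # Called after a new instruction has been stored.
        self._event.set()

    async def wait(self, wait_seconds: float) -> bool:
        """Return True if woken by notify(), False if the wait timed out."""
        timeout = max(0.0, min(wait_seconds, MAX_WAIT_SECONDS))
        try:
            await asyncio.wait_for(self._event.wait(), timeout)
        except asyncio.TimeoutError:
            return False
        self._event.clear()  # reset so the next wait blocks again
        return True
```

A tool handler would typically try to claim an instruction first, call wait() only when the queue is empty, and then try to claim once more when wait() returns True.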

Suggested input schema

{
  "agent_id": "optional-string"
}

Suggested output schema when an instruction is delivered

{
  "status": "ok",
  "result_type": "instruction",
  "instruction": {
	"id": "uuid",
	"content": "user instruction text",
	"consumed_at": "ISO-8601 timestamp"
  },
  "remaining_pending": 3,
  "waited_seconds": 0
}

Suggested output schema when queue is empty

{
  "status": "ok",
  "result_type": "empty",
  "instruction": null,
  "response": "",
  "remaining_pending": 0,
  "waited_seconds": 10
}

Suggested output schema when a default response is returned

{
  "status": "ok",
  "result_type": "default_response",
  "instruction": null,
  "response": "call this tool `get_user_request` again to fetch latest user input...",
  "remaining_pending": 0,
  "waited_seconds": 10
}
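
The three output shapes above could be produced by one small builder. This is a sketch under one assumption that the plan does not spell out: a null or empty default_empty_response yields result_type "empty", and any non-empty value yields "default_response". The function name build_tool_response is hypothetical.

```python
from typing import Any, Optional


def build_tool_response(
    instruction: Optional[dict],
    default_response: Optional[str],
    remaining_pending: int,
    waited_seconds: int,
) -> dict[str, Any]:
    """Build the get_user_request result in one of the three documented shapes."""
    if instruction is not None:
        return {
            "status": "ok",
            "result_type": "instruction",
            "instruction": {
                "id": instruction["id"],
                "content": instruction["content"],
                "consumed_at": instruction["consumed_at"],
            },
            "remaining_pending": remaining_pending,
            "waited_seconds": waited_seconds,
        }
    # Assumption: an empty or missing default means the plain "empty" result.
    has_default = bool(default_response)
    return {
        "status": "ok",
        "result_type": "default_response" if has_default else "empty",
        "instruction": None,
        "response": default_response or "",
        "remaining_pending": remaining_pending,
        "waited_seconds": waited_seconds,
    }
```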

Behavior rules

  • Deliver the oldest pending instruction first.
  • Mark the delivered instruction as consumed in the same transaction used to claim it.
  • Never return more than one instruction per call.
  • default_wait_seconds is fully server-controlled (set by the user via the web UI). Agents cannot override it.
  • Clamp the effective wait time to an absolute server maximum of 86400 s (24 hours).
  • Update the agent activity record on every call, even when no instruction is returned.
  • The UI should infer "agent connected" if the latest activity is within agent_stale_after_seconds.
  • Agent implementations should continue calling this tool instead of ending their work session on their own, so they can pick up newly added instructions without missing critical follow-up requests.
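
The first three rules (oldest first, claim and consume in one transaction, one instruction per call) could look roughly like this with sqlite3. It is a sketch, not the project's implementation: the helper names are hypothetical, the table is reduced to the columns the claim needs, and BEGIN IMMEDIATE is one possible way to serialize concurrent fetches.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Optional


def open_queue(path: str = ":memory:") -> sqlite3.Connection:
    # isolation_level=None puts sqlite3 in autocommit mode, so the explicit
    # BEGIN/COMMIT statements below control the transaction.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS instructions (
               id TEXT PRIMARY KEY,
               content TEXT NOT NULL,
               status TEXT NOT NULL DEFAULT 'pending',
               consumed_at TEXT,
               consumed_by_agent_id TEXT,
               position INTEGER NOT NULL)"""
    )
    return conn


def claim_next_instruction(
    conn: sqlite3.Connection, agent_id: Optional[str]
) -> Optional[dict]:
    """Return the oldest pending instruction and mark it consumed, or None."""
    now = datetime.now(timezone.utc).isoformat()
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = conn.execute(
            "SELECT id, content FROM instructions "
            "WHERE status = 'pending' ORDER BY position LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE instructions SET status = 'consumed', consumed_at = ?, "
                "consumed_by_agent_id = ? WHERE id = ?",
                (now, agent_id, row[0]),
            )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    if row is None:
        return None
    return {"id": row[0], "content": row[1], "consumed_at": now}
```

Because the SELECT and UPDATE run inside the same write transaction, a second connection calling the function at the same time has to wait for the lock and then sees the row as already consumed.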

5.2 HTTP API for the web UI

GET /healthz

Returns service health.

Response

{
  "status": "ok",
  "server_time": "ISO-8601 timestamp"
}

GET /api/status

Returns current server and agent summary.

Response

{
  "server": {
	"status": "up",
	"started_at": "ISO-8601 timestamp"
  },
  "agent": {
	"connected": true,
	"last_seen_at": "ISO-8601 timestamp",
	"last_fetch_at": "ISO-8601 timestamp",
	"agent_id": "copilot-agent"
  },
  "queue": {
	"pending_count": 2,
	"consumed_count": 8
  },
  "settings": {
	"default_wait_seconds": 10,
	"default_empty_response": "call this tool `get_user_request` again to fetch latest user input...",
	"agent_stale_after_seconds": 30
  }
}
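
The agent.connected flag in the response above could be derived from last_seen_at and agent_stale_after_seconds as in this sketch. The function name is hypothetical, and treating the boundary (exactly agent_stale_after_seconds old) as still connected is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_agent_connected(
    last_seen_at: Optional[datetime],
    stale_after_seconds: int,
    now: Optional[datetime] = None,
) -> bool:
    """Infer connectivity from the most recent get_user_request call."""
    if last_seen_at is None:
        return False  # no agent has ever called the tool
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_seen_at <= timedelta(seconds=stale_after_seconds)
```

Passing now explicitly keeps the check easy to test; production code would normally leave it unset.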

GET /api/instructions

Returns all instructions in queue order.

Query params

  • status=pending|consumed|all (default all)

Response

{
  "items": [
	{
	  "id": "uuid",
	  "content": "Implement logging",
	  "status": "pending",
	  "created_at": "ISO-8601 timestamp",
	  "updated_at": "ISO-8601 timestamp",
	  "consumed_at": null,
	  "consumed_by_agent_id": null,
	  "position": 1
	}
  ]
}

POST /api/instructions

Creates a new pending instruction.

Request

{
  "content": "Add a new status indicator"
}

Response: 201 Created

{
  "item": {
	"id": "uuid",
	"content": "Add a new status indicator",
	"status": "pending",
	"created_at": "ISO-8601 timestamp",
	"updated_at": "ISO-8601 timestamp",
	"consumed_at": null,
	"consumed_by_agent_id": null,
	"position": 3
  }
}

PATCH /api/instructions/{instruction_id}

Edits a pending instruction only.

Request

{
  "content": "Reword an existing pending instruction"
}

Rules

  • Return 409 Conflict if the instruction has already been consumed.
  • Return 404 Not Found if the instruction does not exist.
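
One way to apply these rules is a guarded UPDATE that only touches pending rows, followed by a lookup to tell 404 from 409. This is a sketch: the function name is hypothetical, returning 200 on success is an assumption (the plan does not state the success code for PATCH), and bumping updated_at is omitted for brevity.

```python
import sqlite3


def edit_pending_instruction(
    conn: sqlite3.Connection, instruction_id: str, content: str
) -> int:
    """Apply an edit and return the HTTP status code the route would send."""
    cur = conn.execute(
        "UPDATE instructions SET content = ? WHERE id = ? AND status = 'pending'",
        (content, instruction_id),
    )
    conn.commit()
    if cur.rowcount == 1:
        return 200  # assumption: plain 200 OK on a successful edit
    # Nothing was updated: either the row is missing or already consumed.
    exists = conn.execute(
        "SELECT 1 FROM instructions WHERE id = ?", (instruction_id,)
    ).fetchone()
    return 409 if exists else 404
```

Putting the status check inside the WHERE clause means an instruction consumed between the UI loading and the user saving is still rejected.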

DELETE /api/instructions/{instruction_id}

Deletes a pending instruction only.

Rules

  • Return 409 Conflict if the instruction has already been consumed.
  • Return 204 No Content on success.

GET /api/config

Returns editable runtime settings.

Response

{
  "default_wait_seconds": 10,
  "default_empty_response": "call this tool `get_user_request` again to fetch latest user input...",
  "agent_stale_after_seconds": 30
}

PATCH /api/config

Updates runtime settings.

Request

{
  "default_wait_seconds": 15,
  "default_empty_response": "",
  "agent_stale_after_seconds": 45
}

GET /api/events

Server-Sent Events endpoint for live UI updates.

Event types

  • instruction.created
  • instruction.updated
  • instruction.deleted
  • instruction.consumed
  • status.changed
  • config.updated

SSE payload example

{
  "type": "instruction.consumed",
  "timestamp": "ISO-8601 timestamp",
  "data": {
	"id": "uuid",
	"consumed_by_agent_id": "copilot-agent"
  }
}
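
On the wire, a payload like the one above would be wrapped in a Server-Sent Events frame. The sketch below shows one possible framing; the format_sse name and the choice to repeat the event type in the SSE event field are assumptions.

```python
import json


def format_sse(event_type: str, timestamp: str, data: dict) -> str:
    """Serialize one event as an SSE frame (event + data fields, blank-line terminated)."""
    payload = json.dumps({"type": event_type, "timestamp": timestamp, "data": data})
    # json.dumps emits a single line by default, so one data: field is enough.
    return f"event: {event_type}\ndata: {payload}\n\n"
```

If the event field is set like this, the browser's EventSource delivers the frame to listeners registered with addEventListener for that event name rather than to onmessage.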

6. UI/UX Plan

Layout priorities

  • A strong local-control dashboard feel rather than a generic admin template
  • Clear separation between pending work and already-consumed history
  • High-visibility connection/status strip for server and agent state
  • Fast creation flow for new instructions
  • Mobile-friendly stacking without losing queue readability

Required screens/sections

  • Header with project identity and server status
  • Agent activity panel with last seen/fetch information
  • Composer form for new instructions
  • Pending instructions list with edit/delete actions
  • Consumed instructions list with crossed-out styling and metadata
  • Settings panel for wait timeout/default response behavior

Frontend quality bar

  • Follow .github/instructions/frontend-design.instructions.md before implementing any UI.
  • Use only local assets.
  • Build a visually distinctive interface with careful typography, color, spacing, motion, and responsive behavior.
  • Keep accessibility in scope: semantic HTML, keyboard support, visible focus states, sufficient contrast.

7. Logging, Reliability, and Error Handling Plan

  • Log startup, shutdown, configuration load, database initialization, and MCP registration.
  • Log each instruction lifecycle event: created, updated, deleted, consumed.
  • Log each get_user_request call with agent id, wait time, and result type.
  • Return structured JSON errors for API failures.
  • Protect queue consumption with transactions/locking so two simultaneous fetches cannot consume the same instruction.
  • Validate payloads and reject empty or whitespace-only instructions.
  • Handle browser reconnects for SSE cleanly.
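
Payload validation and structured errors from the list above might be sketched as follows. The helper names and the error body shape ({"error": {"code", "message"}}) are assumptions; the plan only requires that errors are structured JSON and that empty or whitespace-only instructions are rejected.

```python
from typing import Any


def normalize_content(raw: Any) -> str:
    """Return trimmed instruction text, or raise ValueError if it is unusable."""
    if not isinstance(raw, str) or not raw.strip():
        raise ValueError("content must be a non-empty, non-whitespace string")
    return raw.strip()


def error_body(code: str, message: str) -> dict:
    """Hypothetical structured error shape shared by all API failures."""
    return {"error": {"code": code, "message": message}}


def validate_create_request(payload: dict) -> tuple[int, dict]:
    """Return (HTTP status, response body) for a POST /api/instructions payload."""
    try:
        content = normalize_content(payload.get("content"))
    except ValueError as exc:
        return 422, error_body("invalid_content", str(exc))
    return 201, {"content": content}
```

The 422 status is what FastAPI uses for its own validation failures, which is why it is chosen here; 400 would be an equally reasonable choice.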

8. Todo List

  • Project setup

    • Create the backend package structure under app/.
    • Add requirements.txt with only the required dependencies.
    • Replace the placeholder contents of main.py with the application entrypoint.
    • Add a local configuration strategy for defaults and runtime overrides.
  • Data layer

    • Create SQLite schema for instructions, settings, and agent_activity.
    • Add startup migration/initialization logic.
    • Implement queue ordering and atomic consumption behavior.
    • Seed default settings on first run.
  • MCP server

    • Register the get_user_request tool using the official MCP Python SDK.
    • Implement one-at-a-time delivery semantics.
    • Implement wait-until-timeout behavior when the queue is empty.
    • Return empty/default responses based on configuration.
    • Record agent activity on every tool call.
  • HTTP API

    • Implement GET /healthz.
    • Implement GET /api/status.
    • Implement GET /api/instructions.
    • Implement POST /api/instructions.
    • Implement PATCH /api/instructions/{instruction_id}.
    • Implement DELETE /api/instructions/{instruction_id}.
    • Implement GET /api/config.
    • Implement PATCH /api/config.
    • Implement GET /api/events for SSE.
  • Frontend

    • Read and follow .github/instructions/frontend-design.instructions.md before starting UI work.
    • Create static/index.html and split CSS/JS into separate folders/files.
    • Build the instruction composer.
    • Build the pending instruction list with edit/delete controls.
    • Build the consumed instruction list with crossed-out/greyed-out styling.
    • Build the live server/agent status panel.
    • Build the settings editor for timeout/default-response behavior.
    • Wire SSE updates into the UI so changes appear in real time.
    • Make the interface responsive and keyboard accessible.
  • Observability and robustness

    • Add centralized logging configuration.
    • Add structured error responses and exception handling.
    • Add queue-consumption concurrency protection.
    • Add validation for invalid edits/deletes of consumed instructions.
    • Add tests for empty-queue, timeout, and consume-once behavior.
  • Improvements (post-launch)

    • Replace 1-second polling wait loop with asyncio.Event-based immediate wakeup.
    • Min-wait is a floor only when the queue is empty — a new instruction immediately wakes any waiting tool call (verified with timing test in tests/test_wakeup.py).
    • Enrich SSE events with full item payloads (no extra re-fetch round-trips).
    • Auto-refresh relative timestamps in the UI every 20 s.
    • Document title badge showing pending instruction count.
    • SSE reconnecting indicator in the header.
    • Dark / light theme toggle defaulting to OS colour-scheme preference.
    • default_wait_seconds changed to fully server-controlled (agents can no longer override wait time).
    • Non-blocking server.ps1 management script (start / stop / restart / status / logs).
    • Non-blocking server.sh bash management script — identical feature set for macOS / Linux.
    • MCP stateless/stateful mode configurable via MCP_STATELESS env var (default true).
    • Per-agent generation counter prevents abandoned (timed-out) coroutines from silently consuming instructions meant for newer calls.
    • tests/test_wakeup.py covers both immediate-wakeup timing and concurrent-call generation safety.
    • Optional Bearer-token authentication via API_TOKEN env var (disabled by default); web UI prompts for token on first load.
  • Documentation and developer experience

    • Document local run instructions.
    • Document the MCP tool contract clearly.
    • Document the HTTP API with request/response examples.
    • Document how agent connectivity is inferred.
    • Document how the frontend design instruction must be used during UI implementation.

9. Running the Server

Prerequisites

  • Python 3.11+
  • pip

Install dependencies

pip install -r requirements.txt

Start the server

python main.py

Or use the included management scripts (recommended — non-blocking):

PowerShell (Windows)

.\server.ps1 start      # start in background, logs to logs/
.\server.ps1 stop       # graceful stop
.\server.ps1 restart    # stop + start
.\server.ps1 status     # PID, memory, tail logs
.\server.ps1 logs       # show last 40 stdout lines
.\server.ps1 logs -f    # follow logs live
.\server.ps1 logs 100   # show last 100 lines

Bash (macOS / Linux)

chmod +x server.sh      # make executable once
./server.sh start       # start in background, logs to logs/
./server.sh stop        # graceful stop
./server.sh restart     # stop + start
./server.sh status      # PID, memory, tail logs
./server.sh logs        # show last 40 stdout lines
./server.sh logs -f     # follow logs live
./server.sh logs 100    # show last 100 lines

The server starts on http://localhost:8000 by default.

URL                        | Description
http://localhost:8000/     | Web UI
http://localhost:8000/mcp  | MCP streamable-HTTP endpoint
http://localhost:8000/docs | FastAPI interactive API docs

Environment variable overrides

Variable                  | Default                | Description
HOST                      | 0.0.0.0                | Bind address
HTTP_PORT                 | 8000                   | HTTP port
DB_PATH                   | data/local_mcp.sqlite3 | SQLite database path
LOG_LEVEL                 | INFO                   | Logging level
DEFAULT_WAIT_SECONDS      | 10                     | Default tool wait timeout (seconds)
DEFAULT_EMPTY_RESPONSE    | call this tool `get_user_request` again to fetch latest user input... | Default response when queue is empty
AGENT_STALE_AFTER_SECONDS | 30                     | Seconds of inactivity before agent shown as idle
MCP_STATELESS             | true                   | true for stateless sessions (survives restarts, recommended); false for stateful
API_TOKEN                 | (empty)                | When set, all /api/* and /mcp requests require Authorization: Bearer <token>; web UI prompts for the token on first load
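
Reading these overrides could look like the sketch below. Variable names and defaults are taken from the table above; the Settings dataclass, the load_settings function, and the rule that only the string "true" (case-insensitive) enables MCP_STATELESS are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping

DEFAULT_RESPONSE = "call this tool `get_user_request` again to fetch latest user input..."


@dataclass(frozen=True)
class Settings:
    host: str
    http_port: int
    db_path: str
    log_level: str
    default_wait_seconds: int
    default_empty_response: str
    agent_stale_after_seconds: int
    mcp_stateless: bool
    api_token: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from environment variables, falling back to the documented defaults."""
    return Settings(
        host=env.get("HOST", "0.0.0.0"),
        http_port=int(env.get("HTTP_PORT", "8000")),
        db_path=env.get("DB_PATH", "data/local_mcp.sqlite3"),
        log_level=env.get("LOG_LEVEL", "INFO"),
        default_wait_seconds=int(env.get("DEFAULT_WAIT_SECONDS", "10")),
        default_empty_response=env.get("DEFAULT_EMPTY_RESPONSE", DEFAULT_RESPONSE),
        agent_stale_after_seconds=int(env.get("AGENT_STALE_AFTER_SECONDS", "30")),
        # Assumption: anything other than "true" is treated as stateful mode.
        mcp_stateless=env.get("MCP_STATELESS", "true").strip().lower() == "true",
        api_token=env.get("API_TOKEN", ""),
    )
```

Accepting the environment as a mapping makes the loader testable without touching the real process environment, for example load_settings({"HTTP_PORT": "9000"}).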

Configuring an MCP client (agent)

Point the agent's MCP client to the streamable-HTTP transport:

{
  "mcpServers": {
    "local-mcp": {
      "url": "http://localhost:8000/mcp",
      "transport": "streamable-http"
    }
  }
}

If API_TOKEN is set, include the token as a request header:

{
  "mcpServers": {
    "local-mcp": {
      "url": "http://localhost:8000/mcp",
      "transport": "streamable-http",
      "headers": {
        "Authorization": "Bearer <your-token>"
      }
    }
  }
}
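
On the server side, checking that header might look like the following sketch. The function name is hypothetical and this is not the project's actual middleware; it only illustrates the rule that authentication is skipped when API_TOKEN is empty and otherwise requires a matching Bearer token.

```python
import hmac
from typing import Optional


def is_authorized(authorization_header: Optional[str], api_token: str) -> bool:
    """Return True if the request may proceed under the API_TOKEN rule."""
    if not api_token:
        return True  # authentication is disabled by default
    if not authorization_header:
        return False
    scheme, _, presented = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(presented.strip().encode(), api_token.encode())
```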

The agent should call get_user_request aggressively and continuously and should not end the working session. Every call returns the next pending instruction (if any). When the queue is empty, the tool waits up to default_wait_seconds before returning an empty/default response, so the agent should loop and call again.
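
The calling pattern can be sketched independently of any MCP client library. In the sketch below, call_tool stands in for whatever function the agent uses to invoke get_user_request and handle_instruction for the agent's own work; both names, and the max_calls bound that keeps the example finite, are assumptions. A real agent would loop without a bound.

```python
from typing import Callable


def run_agent_loop(
    call_tool: Callable[[], dict],
    handle_instruction: Callable[[str], None],
    max_calls: int,
) -> int:
    """Call the tool repeatedly and return how many instructions were handled."""
    handled = 0
    for _ in range(max_calls):
        result = call_tool()
        if result.get("result_type") == "instruction":
            handle_instruction(result["instruction"]["content"])
            handled += 1
        # For "empty" and "default_response" there is nothing to do:
        # the loop simply calls the tool again.
    return handled
```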

10. Implementation Notes for Future Work

  • Prefer small, explicit modules over monolithic files.
  • Keep the project localhost-first and avoid remote asset dependencies.
  • Treat the MCP tool and the web UI as two views over the same instruction queue.
  • Optimize for correctness of queue semantics first, then refine the visual and realtime experience.