local-mcp
local-mcp is a localhost-first MCP server whose primary responsibility is to deliver the latest user instruction to an agent through the get_user_request tool, while also providing a responsive web UI for managing the instruction queue and monitoring server/agent activity.
This document is the implementation plan for the project.
1. Goals
- Provide a single MCP tool, `get_user_request`, that returns at most one instruction per call.
- Give the user a polished local web UI to add, edit, remove, review, and monitor instructions.
- Preserve queue integrity so consumed instructions are clearly visible but no longer editable/deletable.
- Support configurable waiting/default-response behavior when no instruction is available.
- Show live server status and inferred agent connectivity in the UI.
- Keep the stack lightweight, maintainable, debuggable, and friendly to local development.
2. Recommended Tech Stack
Backend
- Language/runtime: Python 3.11+
- MCP integration: official Python MCP SDK
- HTTP server/API layer: FastAPI
- ASGI server: Uvicorn
- Persistence: SQLite via Python standard library `sqlite3`
- Concurrency/state coordination: `asyncio` + standard library synchronization primitives where needed
- Logging/error handling: Python `logging`, structured request logs, centralized exception handling
- Configuration: environment variables + small local config file (`.json` or `.toml`)
Why this backend stack
- The MCP SDK is the correct dependency for exposing the MCP tool cleanly.
- FastAPI + Uvicorn is a small, pragmatic backend stack that simplifies routing, validation, health endpoints, and server-sent updates without introducing a heavy framework.
- SQLite keeps the system local-first, dependency-light, and durable enough for instruction history and settings.
- Most supporting concerns remain in the Python standard library, which keeps third-party dependencies minimal.
Frontend
- UI technology: plain HTML, CSS, and JavaScript only
- Realtime updates: Server-Sent Events (preferred) with polling fallback if necessary
- Styling: local CSS files with design tokens and component-specific stylesheets
- Client architecture: modular vanilla JS organized by feature (`api.js`, `state.js`, `events.js`, `instructions.js`, etc.)
- Assets: all fonts/icons/scripts/styles stored locally in the repository; no CDN usage
Mandatory frontend implementation instruction
Any future frontend implementation work must first read and follow:
.github/instructions/frontend-design.instructions.md
This instruction file is mandatory for the UI because it requires a distinctive, production-grade, non-generic frontend. The implementation should not default to generic dashboard aesthetics.
3. Product/Architecture Plan
Core backend responsibilities
- Expose the MCP tool `get_user_request`.
- Maintain an instruction queue with durable storage.
- Mark instructions as consumed atomically when delivered to an agent.
- Expose local HTTP endpoints for the web UI.
- Stream status/instruction updates to the browser in real time.
- Infer agent connectivity from recent MCP tool activity.
- Persist and serve server configuration such as wait timeout and default empty response.
Core frontend responsibilities
- Show queued and consumed instructions in separate, clearly labeled sections.
- Allow add/edit/delete only for instructions that are still pending.
- Cross out and grey out consumed instructions.
- Show server status, inferred agent status, last fetch time, and configuration values.
- Update live as instruction state changes.
- Remain usable and visually polished on desktop and smaller screens.
Suggested repository layout
local-mcp/
├─ main.py
├─ README.md
├─ requirements.txt
├─ app/
│ ├─ __init__.py
│ ├─ config.py
│ ├─ database.py
│ ├─ logging_setup.py
│ ├─ models.py
│ ├─ services/
│ │ ├─ instruction_service.py
│ │ ├─ status_service.py
│ │ └─ event_service.py
│ ├─ api/
│ │ ├─ routes_instructions.py
│ │ ├─ routes_status.py
│ │ └─ routes_config.py
│ └─ mcp_server.py
├─ static/
│ ├─ index.html
│ ├─ css/
│ │ ├─ base.css
│ │ ├─ layout.css
│ │ └─ components.css
│ ├─ js/
│ │ ├─ api.js
│ │ ├─ app.js
│ │ ├─ events.js
│ │ ├─ instructions.js
│ │ └─ status.js
│ └─ assets/
└─ data/
   └─ local_mcp.sqlite3
4. Data Model Plan
instructions
- `id` - string/UUID primary key
- `content` - text, required
- `status` - enum: `pending`, `consumed`
- `created_at` - datetime
- `updated_at` - datetime
- `consumed_at` - nullable datetime
- `consumed_by_agent_id` - nullable string
- `position` - integer for stable queue order
settings
- `default_wait_seconds` - integer; seconds the tool waits before returning an empty/default response; set exclusively by the user via the web UI
- `default_empty_response` - text, nullable
- `agent_stale_after_seconds` - integer
agent_activity
- `agent_id` - string primary key
- `last_seen_at` - datetime
- `last_fetch_at` - datetime
- `last_result_type` - enum: `instruction`, `empty`, `default_response`
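As a rough illustration, the three tables above could be created with the standard library `sqlite3` module along the following lines. The column types, the `CHECK` constraints, the single-row shape of `settings`, and the `init_db` helper name are assumptions made for this sketch; the plan itself only fixes the field names and their broad types.

```python
import sqlite3

# Hypothetical DDL derived from the field lists above.
# Datetimes are stored as ISO-8601 text (an assumption of this sketch).
SCHEMA = """
CREATE TABLE IF NOT EXISTS instructions (
    id                   TEXT PRIMARY KEY,
    content              TEXT NOT NULL,
    status               TEXT NOT NULL DEFAULT 'pending'
                         CHECK (status IN ('pending', 'consumed')),
    created_at           TEXT NOT NULL,
    updated_at           TEXT NOT NULL,
    consumed_at          TEXT,
    consumed_by_agent_id TEXT,
    position             INTEGER NOT NULL
);
CREATE TABLE IF NOT EXISTS settings (
    default_wait_seconds      INTEGER NOT NULL,
    default_empty_response    TEXT,
    agent_stale_after_seconds INTEGER NOT NULL
);
CREATE TABLE IF NOT EXISTS agent_activity (
    agent_id         TEXT PRIMARY KEY,
    last_seen_at     TEXT NOT NULL,
    last_fetch_at    TEXT,
    last_result_type TEXT CHECK (last_result_type IN
                         ('instruction', 'empty', 'default_response'))
);
"""


def init_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create any missing tables (safe to call on every startup)."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

Because every statement uses `IF NOT EXISTS`, calling `init_db` on each start doubles as a minimal initialization step.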
5. Detailed API Design
All routes are local-only and intended for localhost usage.
5.1 MCP tool contract
Tool: get_user_request
Purpose
- Return the next pending instruction, if one exists.
- If none exists, wait for a configurable duration, then return the server-controlled default response.
- Record agent activity so the UI can infer whether an agent is currently connected/recently active.
Suggested input schema
{
"agent_id": "optional-string"
}
Suggested output schema when an instruction is delivered
{
"status": "ok",
"result_type": "instruction",
"instruction": {
"id": "uuid",
"content": "user instruction text",
"consumed_at": "ISO-8601 timestamp"
},
"remaining_pending": 3,
"waited_seconds": 0
}
Suggested output schema when queue is empty
{
"status": "ok",
"result_type": "empty",
"instruction": null,
"response": "",
"remaining_pending": 0,
"waited_seconds": 10
}
Suggested output schema when a default response is returned
{
"status": "ok",
"result_type": "default_response",
"instruction": null,
"response": "call this tool `get_user_request` again to fetch latest user input...",
"remaining_pending": 0,
"waited_seconds": 10
}
Behavior rules
- Deliver the oldest pending instruction first.
- Mark the delivered instruction as consumed in the same transaction used to claim it.
- Never return more than one instruction per call.
- `default_wait_seconds` is fully server-controlled (set by the user via the web UI). Agents cannot override it.
- Clamp `actual_wait` to an absolute server maximum (86400 s).
- Update the agent activity record on every call, even when no instruction is returned.
- The UI should infer "agent connected" if the latest activity is within `agent_stale_after_seconds`.
- Agent implementations should continue calling this tool instead of ending their work session on their own, so they can pick up newly added instructions without missing critical follow-up requests.
5.2 HTTP API for the web UI
GET /healthz
Returns service health.
Response
{
"status": "ok",
"server_time": "ISO-8601 timestamp"
}
GET /api/status
Returns current server and agent summary.
Response
{
"server": {
"status": "up",
"started_at": "ISO-8601 timestamp"
},
"agent": {
"connected": true,
"last_seen_at": "ISO-8601 timestamp",
"last_fetch_at": "ISO-8601 timestamp",
"agent_id": "copilot-agent"
},
"queue": {
"pending_count": 2,
"consumed_count": 8
},
"settings": {
"default_wait_seconds": 10,
"default_empty_response": "call this tool `get_user_request` again to fetch latest user input...",
"agent_stale_after_seconds": 30
}
}
GET /api/instructions
Returns all instructions in queue order.
Query params
- `status=pending|consumed|all` (default `all`)
Response
{
"items": [
{
"id": "uuid",
"content": "Implement logging",
"status": "pending",
"created_at": "ISO-8601 timestamp",
"updated_at": "ISO-8601 timestamp",
"consumed_at": null,
"consumed_by_agent_id": null,
"position": 1
}
]
}
POST /api/instructions
Creates a new pending instruction.
Request
{
"content": "Add a new status indicator"
}
Response: 201 Created
{
"item": {
"id": "uuid",
"content": "Add a new status indicator",
"status": "pending",
"created_at": "ISO-8601 timestamp",
"updated_at": "ISO-8601 timestamp",
"consumed_at": null,
"consumed_by_agent_id": null,
"position": 3
}
}
PATCH /api/instructions/{instruction_id}
Edits a pending instruction only.
Request
{
"content": "Reword an existing pending instruction"
}
Rules
- Return `409 Conflict` if the instruction has already been consumed.
- Return `404 Not Found` if the instruction does not exist.
DELETE /api/instructions/{instruction_id}
Deletes a pending instruction only.
Rules
- Return `409 Conflict` if the instruction has already been consumed.
- Return `204 No Content` on success.
GET /api/config
Returns editable runtime settings.
Response
{
"default_wait_seconds": 10,
"default_empty_response": "call this tool `get_user_request` again to fetch latest user input...",
"agent_stale_after_seconds": 30
}
PATCH /api/config
Updates runtime settings.
Request
{
"default_wait_seconds": 15,
"default_empty_response": "",
"agent_stale_after_seconds": 45
}
GET /api/events
Server-Sent Events endpoint for live UI updates.
Event types
- `instruction.created`
- `instruction.updated`
- `instruction.deleted`
- `instruction.consumed`
- `status.changed`
- `config.updated`
SSE payload example
{
"type": "instruction.consumed",
"timestamp": "ISO-8601 timestamp",
"data": {
"id": "uuid",
"consumed_by_agent_id": "copilot-agent"
}
}
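On the wire, each SSE message is a block of `field: value` lines terminated by a blank line. The helper below is a sketch of how the payload above might be framed; the `format_sse` name and the choice to also send the type in the SSE `event:` field are assumptions, not part of the plan.

```python
import json
from datetime import datetime, timezone


def format_sse(event_type: str, data: dict) -> str:
    """Serialize one event as a text/event-stream frame."""
    payload = {
        "type": event_type,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "data": data,
    }
    # json.dumps emits a single line, so one "data:" field is enough.
    # The trailing blank line marks the end of the frame.
    return f"event: {event_type}\ndata: {json.dumps(payload)}\n\n"
```

Note that when an `event:` field is present, a browser `EventSource` dispatches the message to listeners registered for that event name rather than to `onmessage`.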
6. UI/UX Plan
Layout priorities
- A strong local-control dashboard feel rather than a generic admin template
- Clear separation between pending work and already-consumed history
- High-visibility connection/status strip for server and agent state
- Fast creation flow for new instructions
- Mobile-friendly stacking without losing queue readability
Required screens/sections
- Header with project identity and server status
- Agent activity panel with last seen/fetch information
- Composer form for new instructions
- Pending instructions list with edit/delete actions
- Consumed instructions list with crossed-out styling and metadata
- Settings panel for wait timeout/default response behavior
Frontend quality bar
- Follow `.github/instructions/frontend-design.instructions.md` before implementing any UI.
- Use only local assets.
- Build a visually distinctive interface with careful typography, color, spacing, motion, and responsive behavior.
- Keep accessibility in scope: semantic HTML, keyboard support, visible focus states, sufficient contrast.
7. Logging, Reliability, and Error Handling Plan
- Log startup, shutdown, configuration load, database initialization, and MCP registration.
- Log each instruction lifecycle event: created, updated, deleted, consumed.
- Log each `get_user_request` call with agent id, wait time, and result type.
- Return structured JSON errors for API failures.
- Protect queue consumption with transactions/locking so two simultaneous fetches cannot consume the same instruction.
- Validate payloads and reject empty or whitespace-only instructions.
- Handle browser reconnects for SSE cleanly.
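Two of the points above, payload validation and structured JSON errors, are small enough to sketch directly. The helper names and the exact error body shape are hypothetical, and trimming surrounding whitespace before storing is an assumption; the plan only requires that empty or whitespace-only instructions are rejected.

```python
def normalize_content(raw: object) -> str:
    """Return trimmed instruction text, or raise ValueError if it is empty."""
    if not isinstance(raw, str) or not raw.strip():
        raise ValueError("content must be a non-empty string")
    return raw.strip()


def error_body(code: str, message: str) -> dict:
    """Hypothetical structured error shape for API failures."""
    return {"error": {"code": code, "message": message}}
```

A handler for `POST /api/instructions` could catch the `ValueError` and respond with `error_body("invalid_content", str(exc))` alongside a 4xx status.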
8. Todo List
- Project setup
  - Create the backend package structure under `app/`.
  - Add `requirements.txt` with only the required dependencies.
  - Replace the placeholder contents of `main.py` with the application entrypoint.
  - Add a local configuration strategy for defaults and runtime overrides.
- Data layer
  - Create SQLite schema for `instructions`, `settings`, and `agent_activity`.
  - Add startup migration/initialization logic.
  - Implement queue ordering and atomic consumption behavior.
  - Seed default settings on first run.
- MCP server
  - Register the `get_user_request` tool using the official MCP Python SDK.
  - Implement one-at-a-time delivery semantics.
  - Implement wait-until-timeout behavior when the queue is empty.
  - Return empty/default responses based on configuration.
  - Record agent activity on every tool call.
- HTTP API
  - Implement `GET /healthz`.
  - Implement `GET /api/status`.
  - Implement `GET /api/instructions`.
  - Implement `POST /api/instructions`.
  - Implement `PATCH /api/instructions/{instruction_id}`.
  - Implement `DELETE /api/instructions/{instruction_id}`.
  - Implement `GET /api/config`.
  - Implement `PATCH /api/config`.
  - Implement `GET /api/events` for SSE.
- Frontend
  - Read and follow `.github/instructions/frontend-design.instructions.md` before starting UI work.
  - Create `static/index.html` and split CSS/JS into separate folders/files.
  - Build the instruction composer.
  - Build the pending instruction list with edit/delete controls.
  - Build the consumed instruction list with crossed-out/greyed-out styling.
  - Build the live server/agent status panel.
  - Build the settings editor for timeout/default-response behavior.
  - Wire SSE updates into the UI so changes appear in real time.
  - Make the interface responsive and keyboard accessible.
- Observability and robustness
  - Add centralized logging configuration.
  - Add structured error responses and exception handling.
  - Add queue-consumption concurrency protection.
  - Add validation for invalid edits/deletes of consumed instructions.
  - Add tests for empty-queue, timeout, and consume-once behavior.
- Improvements (post-launch)
  - Replace 1-second polling wait loop with `asyncio.Event`-based immediate wakeup.
  - Min-wait is a floor only when the queue is empty: a new instruction immediately wakes any waiting tool call (verified with timing test in `tests/test_wakeup.py`).
  - Enrich SSE events with full item payloads (no extra re-fetch round-trips).
  - Auto-refresh relative timestamps in the UI every 20 s.
  - Document title badge showing pending instruction count.
  - SSE reconnecting indicator in the header.
  - Dark / light theme toggle defaulting to OS colour-scheme preference.
  - `default_wait_seconds` changed to fully server-controlled (agents can no longer override wait time).
  - Non-blocking `server.ps1` management script (start / stop / restart / status / logs).
  - Non-blocking `server.sh` bash management script with an identical feature set for macOS / Linux.
  - MCP stateless/stateful mode configurable via `MCP_STATELESS` env var (default `true`).
  - Per-agent generation counter prevents abandoned (timed-out) coroutines from silently consuming instructions meant for newer calls.
  - `tests/test_wakeup.py` covers both immediate-wakeup timing and concurrent-call generation safety.
  - Optional Bearer-token authentication via `API_TOKEN` env var (disabled by default); web UI prompts for token on first load.
- Documentation and developer experience
  - Document local run instructions.
  - Document the MCP tool contract clearly.
  - Document the HTTP API with request/response examples.
  - Document how agent connectivity is inferred.
  - Document how the frontend design instruction must be used during UI implementation.
9. Running the Server
Prerequisites
- Python 3.11+
- pip
Install dependencies
pip install -r requirements.txt
Start the server
python main.py
Or use the included management scripts (recommended — non-blocking):
PowerShell (Windows)
.\server.ps1 start # start in background, logs to logs/
.\server.ps1 stop # graceful stop
.\server.ps1 restart # stop + start
.\server.ps1 status # PID, memory, tail logs
.\server.ps1 logs # show last 40 stdout lines
.\server.ps1 logs -f # follow logs live
.\server.ps1 logs 100 # show last 100 lines
Bash (macOS / Linux)
chmod +x server.sh # make executable once
./server.sh start # start in background, logs to logs/
./server.sh stop # graceful stop
./server.sh restart # stop + start
./server.sh status # PID, memory, tail logs
./server.sh logs # show last 40 stdout lines
./server.sh logs -f # follow logs live
./server.sh logs 100 # show last 100 lines
The server starts on http://localhost:8000 by default.
| URL | Description |
|---|---|
| http://localhost:8000/ | Web UI |
| http://localhost:8000/mcp | MCP streamable-HTTP endpoint |
| http://localhost:8000/docs | FastAPI interactive API docs |
Environment variable overrides
| Variable | Default | Description |
|---|---|---|
| `HOST` | `0.0.0.0` | Bind address |
| `HTTP_PORT` | `8000` | HTTP port |
| `DB_PATH` | `data/local_mcp.sqlite3` | SQLite database path |
| `LOG_LEVEL` | `INFO` | Logging level |
| `DEFAULT_WAIT_SECONDS` | `10` | Default tool wait timeout |
| `DEFAULT_EMPTY_RESPONSE` | ``call this tool `get_user_request` again to fetch latest user input...`` | Default response when queue is empty |
| `AGENT_STALE_AFTER_SECONDS` | `30` | Seconds of inactivity before agent shown as idle |
| `MCP_STATELESS` | `true` | `true` for stateless sessions (survives restarts, recommended); `false` for stateful |
| `API_TOKEN` | (empty) | When set, all `/api/*` and `/mcp` requests require `Authorization: Bearer <token>`; web UI prompts for the token on first load |
Configuring an MCP client (agent)
Point the agent's MCP client to the streamable-HTTP transport:
{
"mcpServers": {
"local-mcp": {
"url": "http://localhost:8000/mcp",
"transport": "streamable-http"
}
}
}
If API_TOKEN is set, include the token as a request header:
{
"mcpServers": {
"local-mcp": {
"url": "http://localhost:8000/mcp",
"transport": "streamable-http",
"headers": {
"Authorization": "Bearer <your-token>"
}
}
}
}
The agent should call `get_user_request` continuously and should not end the working session on its own. Every call returns the next pending instruction, if any. When the queue is empty, the tool waits up to `default_wait_seconds` before returning an empty/default response, so the agent should loop and call again.
10. Implementation Notes for Future Work
- Prefer small, explicit modules over monolithic files.
- Keep the project localhost-first and avoid remote asset dependencies.
- Treat the MCP tool and the web UI as two views over the same instruction queue.
- Optimize for correctness of queue semantics first, then refine the visual and realtime experience.