MCP Screenshot

An MCP server that captures screenshots and performs OCR text recognition.

Features

  • Screenshot capture (left half, right half, full screen)
  • OCR text recognition (supports Japanese and English)
  • Multiple output formats (JSON, Markdown, vertical, horizontal)

OCR Engines

This server uses two OCR engines:

  1. yomitoku 

    • Primary OCR engine
    • High-accuracy Japanese text recognition
    • Runs as an API server
  2. Tesseract.js 

    • Fallback OCR engine
    • Used when yomitoku is unavailable
    • Supports both Japanese and English recognition
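The primary/fallback arrangement described above can be sketched roughly as follows. This is a minimal illustration, not the server's actual code: the `OcrEngine` type, the `OcrResult` shape, and the `recognizeWithFallback` function are hypothetical names, and the sketch assumes that "unavailable" shows up as a rejected promise from the primary engine.

```typescript
// Hypothetical engine signature: takes image bytes, resolves to recognized text.
type OcrEngine = (image: Uint8Array) => Promise<string>;

// Hypothetical result shape that records which engine produced the text.
interface OcrResult {
  text: string;
  engine: "primary" | "fallback";
}

// Try the primary engine first (e.g. a yomitoku API call). If it throws,
// for instance because the API server is not running, use the fallback
// engine (e.g. Tesseract.js) instead.
async function recognizeWithFallback(
  image: Uint8Array,
  primary: OcrEngine,
  fallback: OcrEngine,
): Promise<OcrResult> {
  try {
    return { text: await primary(image), engine: "primary" };
  } catch {
    return { text: await fallback(image), engine: "fallback" };
  }
}
```

Passing the engines in as functions keeps the sketch independent of any particular OCR library; a real implementation would wrap the HTTP call and the Tesseract.js call behind the same signature.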

Installation

npx -y @kazuph/mcp-screenshot

Claude Desktop Configuration

Add the following configuration to your claude_desktop_config.json. OCR_API_URL is the base URL of the yomitoku API:

{
  "mcpServers": {
    "screenshot": {
      "command": "npx",
      "args": ["-y", "@kazuph/mcp-screenshot"],
      "env": {
        "OCR_API_URL": "http://localhost:8000"
      }
    }
  }
}

Environment Variables

Variable Name    Description              Default Value
OCR_API_URL      yomitoku API base URL    http://localhost:8000
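As an illustration of how a default like this is typically applied, the sketch below resolves the base URL from an environment map. The function name `resolveOcrApiUrl` is hypothetical, and both the trimming of whitespace and the removal of trailing slashes are assumptions made for the example, not documented behavior of the server.

```typescript
// Default taken from the table above.
const DEFAULT_OCR_API_URL = "http://localhost:8000";

// Hypothetical helper: returns the configured base URL, or the default
// when OCR_API_URL is unset or blank.
function resolveOcrApiUrl(env: Record<string, string | undefined>): string {
  const raw = env["OCR_API_URL"]?.trim();
  const url = raw ? raw : DEFAULT_OCR_API_URL;
  // Assumption: drop trailing slashes so request paths can be appended safely.
  return url.replace(/\/+$/, "");
}
```

In a Node.js process this would be called as `resolveOcrApiUrl(process.env)`; taking the map as a parameter keeps the helper easy to test.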

Usage Example

You can use it by instructing Claude like this:

Please take a screenshot of the left half of the screen and recognize the text in it.

Tool Specification

capture

Takes a screenshot and performs OCR.

Options:

  • region: Screenshot area ('left'/'right'/'full', default: 'left')
  • format: Output format ('json'/'markdown'/'vertical'/'horizontal', default: 'markdown')
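For readers calling the tool from their own MCP client rather than through Claude, the options above map onto a `tools/call` request roughly as sketched here. The `buildCaptureRequest` helper and the `CaptureOptions` type are hypothetical; filling in the defaults on the client side is an illustrative choice, since the server is documented to apply the same defaults when an option is omitted.

```typescript
// Allowed values, copied from the option list above.
type Region = "left" | "right" | "full";
type Format = "json" | "markdown" | "vertical" | "horizontal";

// Hypothetical options type: both fields are optional.
interface CaptureOptions {
  region?: Region;
  format?: Format;
}

// Build a JSON-RPC 2.0 "tools/call" request for the capture tool,
// making the documented defaults explicit.
function buildCaptureRequest(id: number, options: CaptureOptions = {}) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: {
      name: "capture" as const,
      arguments: {
        region: options.region ?? "left",
        format: options.format ?? "markdown",
      },
    },
  };
}
```

For example, `buildCaptureRequest(1, { region: "full", format: "json" })` produces a request for a full-screen capture with JSON output, and `buildCaptureRequest(2)` requests the left half as Markdown.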

License

MIT

Author

kazuph
