Image Analysis (img:)
Learn how to use the img: command to analyze screenshots, extract text, and understand visual content with AI.
What is the img: Command?
The img: command captures a screenshot and lets you ask AI about it. It combines Tesseract OCR, which extracts text from the image, with a vision-capable AI model, which interprets the visual content. It's like having AI eyes that can see and understand your screen.
Requirements
The img: command requires two things to work:
Vision-capable AI Model
You need a model that can understand images, such as llama3.2-vision:11b.
How to get it: Download from the Models page in Typilot
Tesseract OCR
OCR (Optical Character Recognition) extracts text from screenshots so the AI can read it.
How to get it: Install Tesseract using the platform instructions below
Installing Tesseract OCR
Tesseract OCR is free, open-source software that reads text from images. Here's how to install it on your platform:
Windows
- Visit: github.com/UB-Mannheim/tesseract/wiki
- Download the Windows installer
- Run the installer and follow the wizard
- Note the installation path (usually C:\Program Files\Tesseract-OCR)
- Add Tesseract to your system PATH if not done automatically
Verify installation:
Open Command Prompt: tesseract --version
macOS
- Open Terminal
- Install via Homebrew: brew install tesseract
- Or download from the Tesseract website
- Verify installation is complete
Verify installation:
In Terminal: tesseract --version
Linux
- Ubuntu/Debian: sudo apt-get install tesseract-ocr
- Fedora: sudo dnf install tesseract
- Arch: sudo pacman -S tesseract
- Verify the installation
Verify installation:
In terminal: tesseract --version
How to Use img:
1. Make sure you have a vision-capable model downloaded (such as llama3.2-vision:11b) and Tesseract OCR installed.
2. Press your activation shortcut (Ctrl+Space by default) anywhere on your system.
3. Type "img:" followed by your question (e.g., "img: describe this screenshot").
4. Press the send key (period by default). Typilot captures a screenshot automatically.
5. Wait for Tesseract to extract the text. The AI then analyzes both the image and the extracted text.
6. The AI response appears right where you're typing.
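The flow in the steps above can be illustrated with a small sketch. This is not Typilot's actual implementation: the function names parse_img_command and build_prompt, and the prompt layout, are hypothetical and only show how a typed "img:" command and OCR output might be combined before being sent to a vision model.

```python
from typing import Optional

PREFIX = "img:"  # the command prefix described in the steps above


def parse_img_command(typed: str) -> Optional[str]:
    """Return the question that follows 'img:', or None if this is not an img: command."""
    text = typed.strip()
    if not text.lower().startswith(PREFIX):
        return None
    question = text[len(PREFIX):].strip()
    # An empty question ("img:" on its own) is treated as no command.
    return question or None


def build_prompt(question: str, ocr_text: str) -> str:
    """Combine the question with OCR text (hypothetical layout, for illustration only)."""
    ocr_text = ocr_text.strip()
    if not ocr_text:
        # Nothing was recognized, so the model gets only the question and the image.
        return question
    return f"{question}\n\nText extracted from the screenshot:\n{ocr_text}"


if __name__ == "__main__":
    question = parse_img_command("img: what does this error dialog mean?")
    if question is not None:
        print(build_prompt(question, "File not found: report.csv"))
```

Passing the OCR text alongside the image gives the model an exact transcription to quote from, which tends to matter for small text such as error codes.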
Common Use Cases
Analyze Screenshots
Take a screenshot and ask the AI to explain what it sees. Perfect for understanding complex UIs or error messages.
Example:
img: describe what's happening in this screenshot
Extract Text from Images
OCR extracts text from images, then AI can format, analyze, or summarize it.
Example:
img: extract and format the table data from this image
Understand Error Messages
Capture error dialogs or system messages and get explanations in plain language.
Example:
img: what does this error dialog mean?
Analyze UI Elements
Great for documenting interfaces or understanding unfamiliar applications.
Example:
img: what buttons and options are visible on this screen?
Troubleshooting
- Ensure you have a vision-capable model (such as llama3.2-vision:11b) downloaded
- Verify Tesseract OCR is installed and in your system PATH
- Check Settings to ensure the Tesseract path is correct
- Try restarting Typilot after installing Tesseract
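If you prefer to script the PATH check from the list above, the following is a minimal sketch using only the Python standard library. It assumes the executable is named tesseract, as in the verification commands earlier on this page. The helper names find_tesseract and tesseract_version are made up for this example and are not part of Typilot.

```python
import shutil
import subprocess
from typing import Optional


def find_tesseract() -> Optional[str]:
    """Return the full path of the tesseract executable found on PATH, or None."""
    return shutil.which("tesseract")


def tesseract_version(executable: str) -> Optional[str]:
    """Run `<executable> --version` and return the first output line, or None on failure."""
    try:
        result = subprocess.run(
            [executable, "--version"],
            capture_output=True,
            text=True,
            timeout=10,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        # The executable is missing, not runnable, or did not respond in time.
        return None
    # Some builds print the version banner to stderr, so fall back to it.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("tesseract was not found on PATH")
    else:
        print(f"Found: {path}")
        print(f"Version: {tesseract_version(path)}")
```

If the script reports that tesseract was not found even though it is installed, the install directory is most likely missing from PATH, or the terminal was opened before PATH was updated.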
Tip: The screenshot captures text around your last cursor position or current input area. The auto-screenshot feature focuses on this zone to provide the most relevant context, so ensure the text you want analyzed is visible and unobstructed.
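To make the idea of a capture zone around the cursor concrete, here is a rough sketch of how such a region could be computed. Everything in it is an assumption for illustration: the function name capture_region, the default zone size of 800 by 600 pixels, and the clamping behavior are hypothetical, and Typilot's real capture logic may differ.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def capture_region(
    cursor_x: int,
    cursor_y: int,
    screen_w: int,
    screen_h: int,
    zone_w: int = 800,  # hypothetical default zone width
    zone_h: int = 600,  # hypothetical default zone height
) -> Box:
    """Return a box centered on the cursor, shifted as needed to stay on screen."""
    # The zone can never be larger than the screen itself.
    w = min(zone_w, screen_w)
    h = min(zone_h, screen_h)
    # Center on the cursor, then clamp so the box does not cross a screen edge.
    left = min(max(cursor_x - w // 2, 0), screen_w - w)
    top = min(max(cursor_y - h // 2, 0), screen_h - h)
    return (left, top, left + w, top + h)


if __name__ == "__main__":
    # Cursor in the middle of a 1920x1080 screen.
    print(capture_region(960, 540, 1920, 1080))
    # Cursor near the top-left corner: the box is pushed back onto the screen.
    print(capture_region(10, 10, 1920, 1080))
```

Under this sketch, text that sits far from the cursor falls outside the box, which is why keeping the relevant content near your input area helps.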