Image Analysis (img:)

Learn how to use the img: command to analyze screenshots, extract text, and understand visual content with AI.

What is the img: Command?

The img: command captures a screenshot and lets you ask AI about it. It combines Tesseract OCR (which extracts text from the image) with a vision-capable AI model (which interprets the visual content). It's like having AI eyes that can see and understand your screen.
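
To make the two-stage idea concrete, here is a minimal Python sketch of the OCR half and of how the extracted text might be combined with a question. It is not Typilot's actual implementation, which this page does not describe. The function names and the prompt layout are assumptions made for illustration; the only external piece is the standard Tesseract command-line call, where passing stdout as the output name prints the recognized text instead of writing a file.

```python
import subprocess


def extract_text(image_path: str) -> str:
    """Run the Tesseract CLI on an image and return the recognized text."""
    # "stdout" as the output base makes Tesseract print the text
    # instead of writing it to a .txt file.
    result = subprocess.run(
        ["tesseract", image_path, "stdout"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if Tesseract fails
    )
    return result.stdout.strip()


def build_prompt(question: str, ocr_text: str) -> str:
    """Combine the question with OCR text (hypothetical prompt layout)."""
    if not ocr_text:
        # Nothing was recognized, so send the question on its own.
        return question
    return f"{question}\n\nText extracted from the screenshot:\n{ocr_text}"
```

With Tesseract installed, build_prompt("what does this error dialog mean?", extract_text("screenshot.png")) would produce the text that accompanies the image when it is sent to the vision model. The file name screenshot.png is a placeholder.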

Requirements

The img: command requires two things to work:

Vision-capable AI Model (Required)

You need a model that can understand images, such as llama3.2-vision:11b.

How to get it: Download it from the Models page in Typilot

Tesseract OCR (Required)

OCR (Optical Character Recognition) extracts text from screenshots so the AI can read it.

How to get it: Install Tesseract (see the platform instructions below)

Installing Tesseract OCR

Tesseract OCR is free, open-source software that reads text from images. Here's how to install it on your platform:

Windows

Visit: github.com/UB-Mannheim/tesseract/wiki
Download the Windows installer
Run the installer and follow the wizard
Note the installation path (usually C:\Program Files\Tesseract-OCR)
Add Tesseract to your system PATH if not done automatically

Verify installation:

In Command Prompt: tesseract --version

macOS

Open Terminal
Install via Homebrew: brew install tesseract
Or download it from the Tesseract website

Verify installation:

In Terminal: tesseract --version

Linux

Ubuntu/Debian: sudo apt-get install tesseract-ocr
Fedora: sudo dnf install tesseract
Arch: sudo pacman -S tesseract

Verify installation:

In terminal: tesseract --version
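
The same check can be scripted so that it behaves identically on Windows, macOS, and Linux. The sketch below is only an illustration and is not part of Typilot; the function names are made up for this example. It looks for the tesseract executable on PATH and, if found, returns the first line of tesseract --version.

```python
import shutil
import subprocess
from typing import Optional


def find_tesseract(binary: str = "tesseract") -> Optional[str]:
    """Return the full path of the binary if it is on PATH, else None."""
    return shutil.which(binary)


def tesseract_version(path: str) -> str:
    """Return the first line printed by `<path> --version`."""
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Read stderr as a fallback in case the version banner is printed there.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else ""


def report() -> str:
    """Summarize whether Tesseract is reachable from this environment."""
    path = find_tesseract()
    if path is None:
        return "Tesseract not found on PATH"
    return f"Found {path}: {tesseract_version(path)}"
```

Calling print(report()) gives a one-line summary. If it reports that Tesseract is not found right after installing, the PATH change may not have reached the current session yet; opening a new terminal usually resolves that.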

How to Use img:

1. Make sure you have a vision-capable model downloaded (like llama3.2-vision:11b) and Tesseract OCR installed
2. Press your activation shortcut (Ctrl+Space by default) anywhere on your system
3. Type "img:" followed by your question (e.g., "img: describe this screenshot")
4. Press the send key (period by default) and Typilot will capture a screenshot automatically
5. Wait for Tesseract to extract the text; the AI then analyzes both the image and the extracted text
6. The AI response appears right where you're typing!
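
For readers curious how a prefix command like this can be told apart from ordinary text, here is a small Python sketch. It is purely illustrative: parse_img_command is a hypothetical helper, and the rules it applies (case-insensitive prefix, surrounding whitespace ignored, empty question rejected) are assumptions rather than documented Typilot behavior.

```python
from typing import Optional

PREFIX = "img:"


def parse_img_command(typed: str) -> Optional[str]:
    """Return the question that follows the img: prefix.

    Returns None when the text is not an img: command or when
    no question follows the prefix.
    """
    stripped = typed.strip()
    if not stripped.lower().startswith(PREFIX):
        return None
    question = stripped[len(PREFIX):].strip()
    return question or None
```

Under these assumptions, parse_img_command("img: describe this screenshot") yields the question text, while ordinary input such as "hello" yields None.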

Common Use Cases

Analyze Screenshots

Take a screenshot and ask the AI to explain what it sees. Perfect for understanding complex UIs or error messages.

Example:

img: describe what's happening in this screenshot

Extract Text from Images

OCR extracts the text from the image, and the AI can then format, analyze, or summarize it.

Example:

img: extract and format the table data from this image

Understand Error Messages

Capture error dialogs or system messages and get explanations in plain language.

Example:

img: what does this error dialog mean?

Analyze UI Elements

Great for documenting interfaces or understanding unfamiliar applications.

Example:

img: what buttons and options are visible on this screen?

Troubleshooting

Problem: img: command doesn't work

Ensure you have a vision-capable model (such as llama3.2-vision:11b) downloaded
Verify Tesseract OCR is installed and in your system PATH
Check Settings to ensure Tesseract path is correct
Try restarting Typilot after installing Tesseract
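
If the Tesseract path in Settings is the suspect, a quick way to sanity-check it outside Typilot is a short script like the one below. This is a sketch, not a Typilot feature: check_tesseract_path is a hypothetical helper that only confirms the path points at an existing, executable file.

```python
import os
from typing import List


def check_tesseract_path(path: str) -> List[str]:
    """Return a list of problems with a configured path (empty if none)."""
    problems = []
    if not path:
        problems.append("no path configured")
    elif not os.path.isfile(path):
        # Covers both a missing file and a path that points at a folder.
        problems.append(f"not a file: {path}")
    elif not os.access(path, os.X_OK):
        problems.append(f"not executable: {path}")
    return problems
```

On Windows, for example, check_tesseract_path(r"C:\Program Files\Tesseract-OCR\tesseract.exe") returns an empty list when the installer used its usual location. A common mistake this catches is entering the installation folder instead of the executable inside it.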

Tip: The screenshot captures text around your last cursor position or current input area. The auto-screenshot feature focuses on this zone to provide the most relevant context, so ensure the text you want analyzed is visible and unobstructed.