OCR — Image & PDF to Text

Extract editable text from photos, screenshots, and scanned PDFs. 35+ languages, camera capture, paste-from-clipboard, image enhancement, page-mode control, and Word / PDF / TXT / ZIP exports — 100% in your browser.

🌐 Language & Recognition

Primary language

Language model downloads once on first use, then it's cached.

Secondary language (optional)

Mix two languages in one document (e.g. English + Urdu).

Recognition quality

Also upscales small images & photos for sharper recognition.

Page layout mode

Use "Single line" for signs/receipts, "Sparse" for screenshots.

✨ Image Enhancement

Auto-enhance (grayscale + contrast boost)
Black & white (binarize)

⚙️ Output Settings

Output mode

Smart hybrid (use PDF text layer if present)
Smart reflow (join wrapped lines, keep lists & receipts)
Fix hyphenated line breaks
Include confidence scores

⚡ First-run heads-up: On the first extraction, the selected language model (~5–10 MB) downloads to your browser and is cached. Later runs in the same language skip the download and start right away.

🔒 All OCR processing happens in your browser via Tesseract.js. Your files never leave your device.

How to Use the OCR Tool

Add images or PDFs — Drag & drop one or many files, scan directly with your camera on mobile, or just paste a screenshot with Ctrl+V. Scanned PDFs, photos of documents, screenshots, and receipts all work. Remove any file individually from the preview list.
Pick the language(s) — Choose the document's language (35+ supported). Add a secondary language to mix two scripts in one file.
Tune recognition — Set quality (Fast / Balanced / Best — higher quality also upscales small photos for sharper recognition) and a page-layout mode (Auto, Single line, Sparse…). Optionally enable image enhancement or binarization for faded or noisy scans.
Extract — Click the button. The first run downloads the language model (~5–10 MB), then OCR runs fully offline.
Copy or download — Use "Copy as Rich Text" to paste into Word with formatting, or export as Word (.doc), PDF (with full Urdu / Arabic / Hindi / Chinese support), TXT, Markdown, JSON, or a ZIP of one text file per upload.
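
The language and layout choices in the steps above can be sketched as two small helpers. This is a minimal illustration, not the tool's actual code: the mode names, `psmFor`, and `buildLangString` are hypothetical. Only the page segmentation mode (PSM) numbers and the "+" syntax for combining languages come from Tesseract itself.

```typescript
// Hypothetical option names for the page-layout dropdown.
type LayoutMode = "auto" | "single-line" | "sparse";

// Tesseract page segmentation mode (PSM) number for each layout choice.
const PSM_BY_MODE: Record<LayoutMode, number> = {
  "auto": 3,         // fully automatic page segmentation (Tesseract's default)
  "single-line": 7,  // treat the image as one text line, e.g. a sign
  "sparse": 11,      // find as much text as possible in no particular order
};

function psmFor(mode: LayoutMode): number {
  return PSM_BY_MODE[mode];
}

// Tesseract accepts several languages joined with "+", e.g. "eng+urd".
function buildLangString(primary: string, secondary?: string): string {
  if (!secondary || secondary === primary) return primary;
  return `${primary}+${secondary}`;
}

console.log(psmFor("sparse"), buildLangString("eng", "urd"));
```

Keeping the mapping in one table makes it easy to add further modes later without touching the recognition call.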

What is OCR?

OCR (Optical Character Recognition) is the technology that turns pictures of text — photos, screenshots, scanned documents — into real editable text you can search, copy, and paste. Without OCR, a scanned PDF or a photo of a page is just an image: a computer sees pixels, not words. Our tool uses Tesseract.js, the browser port of the industry-standard Tesseract OCR engine, to recognize text directly on your device. Nothing is uploaded.

OCR is also the missing piece for our PDF to Text tool: regular PDF-to-Text only works on PDFs that already contain a text layer. Scanned PDFs are images — they need OCR first. This tool's Smart hybrid mode handles both automatically.

Common Use Cases

📑 Scanned documents: Turn scanned contracts, IDs, certificates, or old archives into editable text.

📸 Photos of pages: Snap a phone photo of a textbook, sign, or whiteboard (or use the built-in camera button) and pull the text out.

🧾 Receipts & invoices: Extract amounts and line items from receipt photos; smart reflow keeps each line intact.

🖼️ Screenshots: Paste a screenshot straight from your clipboard with Ctrl+V and grab quotes, errors, or chat text with Sparse mode.

🌍 Translation prep: OCR first, then paste into a translation tool for foreign-language documents.

♿ Accessibility: Convert image-based content into screen-reader-friendly, selectable text.

Why Choose Our OCR Tool?

✅ 100% private — runs in your browser, files never uploaded
✅ 35+ languages — English, Arabic, Urdu, Hindi, Persian, Chinese, Tamil & more
✅ Dual-language OCR — mix two scripts in a single document
✅ Camera capture & clipboard paste — scan with your phone or Ctrl+V a screenshot
✅ Image + PDF support — PNG, JPG, WebP, BMP, and scanned/multi-page PDFs
✅ Smart hybrid mode — uses the PDF text layer when present, OCRs only image pages
✅ Auto-upscaling — small photos & screenshots are sharpened to OCR-grade resolution
✅ Image enhancement — grayscale, contrast boost, and binarize for faded scans
✅ Smart reflow — joins wrapped paragraphs but preserves lists, headings & receipt lines
✅ Unicode PDF export — Urdu, Arabic, Hindi & Chinese export correctly, with RTL support
✅ 6 export formats — Word, PDF, TXT, Markdown, JSON, and per-file ZIP
✅ Dark mode, fullscreen, live search, confidence scores — and no signup, watermarks, or limits
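
As a rough illustration of what Smart reflow and the hyphen fix do, here is a minimal sketch. It assumes a simple rule set that may differ from the tool's own: blank lines end a paragraph, lines starting with a bullet or a number stay on their own line, and a trailing hyphen followed by a lowercase letter is treated as a broken word. The function name `reflow` is hypothetical, and receipt-line detection is not covered.

```typescript
// Lines that start a list item: "-", "*", "•", "1." or "1)".
const LIST_ITEM = /^(?:[-*•]|\d+[.)])\s+/;

function reflow(text: string): string {
  const out: string[] = [];
  let current = "";
  const flush = (): void => {
    if (current !== "") out.push(current);
    current = "";
  };

  for (const raw of text.split("\n")) {
    const line = raw.trim();
    if (line === "") {
      // Blank line: close the paragraph and keep a single empty separator.
      flush();
      if (out.length > 0 && out[out.length - 1] !== "") out.push("");
    } else if (LIST_ITEM.test(line) || current === "") {
      // A list item (or the first line of a paragraph) starts a new output line.
      flush();
      current = line;
    } else if (/[A-Za-z]-$/.test(current) && /^[a-z]/.test(line)) {
      // Word split across lines: drop the hyphen and join without a space.
      // Limitation: a real compound such as "well-" + "known" is merged too.
      current = current.slice(0, -1) + line;
    } else {
      // Ordinary wrapped line: join with a single space.
      current += " " + line;
    }
  }
  flush();
  return out.join("\n").trimEnd();
}

console.log(reflow("OCR turns pic-\ntures of text into\neditable text.\n\n- first item\n- second item"));
```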

Frequently Asked Questions

Is this OCR tool free?

Yes — completely free, no signup, no daily limit, no watermarks.

Is my image or PDF uploaded anywhere?

No. The OCR engine (Tesseract.js) runs entirely in your browser. Your files never leave your device.

Can I scan with my phone camera or paste a screenshot?

Yes — tap Scan with camera on mobile to photograph a document directly, or press Ctrl+V (or the Paste button) to drop in a screenshot from your clipboard. Both go straight into the file list.

Why is the first run slow?

The language model (~5–10 MB) downloads once on first use. After that it's cached and OCR runs quickly. Adding a secondary language downloads a second small model the first time.

How accurate is OCR, and how can I improve it?

Accuracy depends on image quality. Clear, high-contrast scans typically reach 95%+. Small or low-resolution photos are upscaled automatically, and setting quality to Best upscales them further. For blurry, faded, or noisy images, also enable Auto-enhance or Binarize and pick the page-layout mode that matches the document.
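
To show what Auto-enhance and Binarize mean in practice, here is a minimal sketch of the three operations on raw pixel data. It assumes RGBA input laid out like a canvas `ImageData.data` array, standard luminance weights, and a fixed threshold of 128. The function names are hypothetical, and the tool's own enhancement pipeline may work differently.

```typescript
// Convert RGBA pixels to one gray value per pixel using luminance weights.
function toGrayscale(rgba: Uint8ClampedArray): Uint8ClampedArray {
  const gray = new Uint8ClampedArray(rgba.length / 4);
  for (let i = 0; i < gray.length; i++) {
    const r = rgba[4 * i];
    const g = rgba[4 * i + 1];
    const b = rgba[4 * i + 2];
    gray[i] = 0.299 * r + 0.587 * g + 0.114 * b; // rounded and clamped on store
  }
  return gray;
}

// Contrast boost: stretch the darkest value to 0 and the brightest to 255.
function stretchContrast(gray: Uint8ClampedArray): Uint8ClampedArray {
  let min = 255;
  let max = 0;
  for (let i = 0; i < gray.length; i++) {
    if (gray[i] < min) min = gray[i];
    if (gray[i] > max) max = gray[i];
  }
  if (max <= min) return gray.slice(); // flat or empty image: nothing to stretch
  const out = new Uint8ClampedArray(gray.length);
  for (let i = 0; i < gray.length; i++) {
    out[i] = ((gray[i] - min) * 255) / (max - min);
  }
  return out;
}

// Binarize: every pixel becomes pure black (0) or pure white (255).
function binarize(gray: Uint8ClampedArray, threshold: number = 128): Uint8ClampedArray {
  return gray.map((v) => (v >= threshold ? 255 : 0));
}

// One faded "ink" pixel and one grayish "paper" pixel.
const pixels = new Uint8ClampedArray([60, 60, 60, 255, 200, 200, 200, 255]);
console.log(Array.from(binarize(stretchContrast(toGrayscale(pixels)))));
```

A fixed threshold is the simplest choice; unevenly lit photos usually benefit from an adaptive threshold instead.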

Does PDF export work for Urdu, Arabic, Hindi or Chinese?

Yes. The PDF exporter detects non-Latin scripts and renders them through your browser's own text engine — including correct right-to-left layout for Urdu and Arabic — so the downloaded PDF shows the text exactly as recognized.

Can I paste the result into Word with formatting?

Yes — use "Copy as Rich Text", then paste into Word, Google Docs, or LibreOffice. You can also download directly as a .doc Word file or a PDF.

Can OCR read handwriting?

Tesseract is optimized for printed text. Handwriting accuracy is limited — clean printed text gives the best results.

What does "Smart hybrid" mode do?

For PDFs, it first checks whether each page already has a text layer. If it does, the text is extracted directly, which is fast and returns the embedded text exactly, with no recognition errors. If a page is image-only (scanned), it falls back to OCR automatically.
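
The per-page decision can be sketched as follows. This is an assumption-laden illustration rather than the tool's implementation: the names `choosePageSource` and `planDocument` are hypothetical, and the `minChars` threshold of 20 is an invented cutoff for treating a near-empty text layer as "no text layer".

```typescript
type PageSource = "text-layer" | "ocr";

// Decide how to read one page, given whatever text its PDF text layer holds.
// minChars is an assumed cutoff so that stray characters do not count as text.
function choosePageSource(textLayer: string, minChars: number = 20): PageSource {
  const visible = textLayer.replace(/\s+/g, ""); // ignore whitespace-only layers
  return visible.length >= minChars ? "text-layer" : "ocr";
}

// Plan a whole document: one decision per page, in page order.
function planDocument(pageTextLayers: string[]): PageSource[] {
  // Wrap the call so map's index argument is not mistaken for minChars.
  return pageTextLayers.map((text) => choosePageSource(text));
}

console.log(planDocument(["This page has a real embedded text layer.", "", "  \n "]));
```

Deciding page by page lets a mixed PDF, for example a typed report with a scanned appendix, use direct extraction where possible and OCR only where needed.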

Can I extract from many files at once?

Yes. Drop multiple images and PDFs together, then export everything as one combined document or as a ZIP containing one text file per upload.

Does it work on mobile?

Yes. The page is fully responsive, with camera capture, dark mode, and fullscreen. OCR also works on iOS and Android, though it may be slower than on desktop because phone CPUs are generally less powerful.
