PDF Extractor Pro
Overview
Extract text and tables from digital and scanned PDFs: fully local, no cloud, no APIs.
# PDF Text & Table Extractor: 100% Local OCR

Extract text and tables from both digital and scanned PDF files directly inside your browser. Everything runs locally on your device: no servers, no cloud processing, no API keys, and no data uploads. Works completely offline after installation and OCR language data caching.

## Privacy First

* 100% local processing
* No backend or cloud services
* No external API calls
* No account required
* No tracking
* Works offline after setup

Your PDF files never leave your computer.

---

## Features

### Digital PDF Text Extraction

Extract text from standard PDFs while preserving the original reading order using PDF.js.

### OCR for Scanned PDFs

Convert image-based PDFs into searchable text using built-in OCR.

* English OCR support
* Automatically detects scanned pages
* No manual page selection required

### Automatic Table Detection

Extract tables without manually drawing boxes or selecting regions. Uses coordinate-based row and column grouping to detect table structures automatically.

### Multiple Export Formats

* **JSON**: Structured output with text blocks, tables, and OCR content
* **CSV**: Table data only
* **Plain Text**: Text with table rendering
* **Excel (.xlsx)**: Separate sheets for text blocks, tables, and OCR pages

### Multiple Input Methods

#### Upload PDF Files

Process PDF documents directly from your computer.

#### Extract PDFs From Browser Tabs

Open any PDF URL in Chrome and extract its contents without downloading files manually.

---

# How to Use

## Upload a PDF

1. Click the extension icon.
2. Press **Upload PDF**.
3. Select a `.pdf` file.
4. Wait while pages are processed (OCR pages usually take 2 to 5 seconds each).
5. Choose an export format.
6. Download the extracted results.

---

## Extract From the Current Browser Tab

1. Open any PDF in Chrome.
2. Click the extension icon.
3. The **Current Tab** option becomes available automatically.
4. Click it to process the PDF directly from the active tab.

---

## OCR Mode

### ON (Default)

* Scanned pages are processed with Tesseract OCR.
* Image pages are converted into searchable text.

### OFF

* Faster processing.
* Image-based pages are skipped.
* Useful when working with text-based PDFs only.

> During the first OCR run, the extension downloads the English language model (`eng.traineddata`, ~10 MB) and stores it locally. Once cached, all future OCR operations work completely offline.

---

# Export Formats

### JSON

Full structured output:

```json
{
  "pages": [
    {
      "textBlocks": [],
      "tables": [],
      "ocrText": ""
    }
  ]
}
```

### CSV

Exports detected tables with separate sections for each table and page.

### Plain Text

Exports all extracted text with readable table formatting.

### Excel (.xlsx)

Creates separate worksheets for text blocks, tables, and OCR content.

---

## Perfect For

* Researchers
* Students
* Accountants
* Data analysts
* Office work
* Archiving documents
* Converting scanned PDFs
* Extracting tables from reports and invoices

Fast, private, and fully local PDF text and table extraction for Chrome.
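
The coordinate-based row and column grouping mentioned under Automatic Table Detection can be illustrated with a minimal sketch. This is not the extension's actual code: the `TextItem` shape, the `groupIntoRows` function, and the tolerance value are all hypothetical, and columns are approximated simply by sorting each row left to right. It assumes PDF-style coordinates where `y` grows upward.

```typescript
// Hypothetical shape for a positioned text fragment (an assumption for this sketch).
interface TextItem {
  text: string;
  x: number; // left edge, in page units
  y: number; // baseline, in page units (larger values are higher on the page)
}

// Group fragments into rows: a fragment joins the current row when its y value
// is within `tolerance` of the row's first fragment. Each row is then ordered
// left to right so cells line up by position.
function groupIntoRows(items: TextItem[], tolerance = 2): string[][] {
  const sorted = [...items].sort((a, b) => b.y - a.y); // top of page first
  const rows: TextItem[][] = [];
  let current: TextItem[] = [];
  let currentY = Number.NaN;

  for (const item of sorted) {
    if (current.length > 0 && Math.abs(currentY - item.y) <= tolerance) {
      current.push(item);
    } else {
      current = [item];
      currentY = item.y;
      rows.push(current);
    }
  }

  return rows.map((row) =>
    [...row].sort((a, b) => a.x - b.x).map((cell) => cell.text)
  );
}

// Sample data: two rows whose baselines differ slightly within each row.
const sample: TextItem[] = [
  { text: "Qty", x: 120, y: 700 },
  { text: "Item", x: 40, y: 700.5 },
  { text: "Widget", x: 40, y: 680 },
  { text: "3", x: 120, y: 681 },
];

const table = groupIntoRows(sample);
console.log(JSON.stringify(table)); // [["Item","Qty"],["Widget","3"]]
```

A real detector would likely also cluster x positions across rows to decide where column boundaries fall and to handle empty cells; the fixed tolerance here is only a placeholder for that kind of tuning.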
Details
- Version: 1.0.0
- Updated: June 29, 2026
- Offered by: Liminal Vault
- Size: 749 KiB
- Languages: English
- Developer email: contact@liminalvault.com
- Non-trader: This developer has not identified itself as a trader. For consumers in the European Union, please note that consumer rights do not apply to contracts between you and this developer.
Privacy
PDF Extractor Pro has disclosed the following information regarding the collection and usage of your data. More detailed information can be found in the developer's privacy policy.
PDF Extractor Pro handles the following:
This developer declares that your data is:
- Not being sold to third parties, outside of the approved use cases
- Not being used or transferred for purposes that are unrelated to the item's core functionality
- Not being used or transferred to determine creditworthiness or for lending purposes