Page to Markdown Scraper

Extension · Tools · 13 users

Overview

Capture page content and OCR text from any page or PDF. Saves as Markdown/ZIP for ChatGPT and LLMs. 100% local processing.

Page to Markdown Scraper captures web page content and saves it as a clean Markdown file in a ZIP, ready to upload to ChatGPT, Claude, Gemini, or any other LLM.

VERSION 1.5.0 - NOW WITH BUILT-IN OCR

Three capture modes:

MANUAL MODE
Capture the current page with one click. Text, images, and metadata are extracted and downloaded immediately as a ZIP file.

AUTO MODE
Start a recording session, then browse normally. Every page you visit is captured automatically. A red border and REC badge show that recording is active. Stop the session to download everything in a single ZIP.

OCR MODE (NEW)
Extract text from screenshots using built-in optical character recognition. Works on content that normal scraping cannot reach: images, scanned documents, PDFs, canvas elements, and pages that block text selection.
- OCR Viewport: Capture and extract text from the visible screen area
- OCR Full Page: Auto-scrolls the entire page or PDF, capturing and recognising each section
- PDF support: Automatically detects PDF files and scrolls through the full document
- Keyboard shortcut: Ctrl+Shift+O (Cmd+Shift+O on Mac) for quick viewport capture
- Results panel: Shows word count and confidence score with a one-click download button

PRIVACY
- All processing happens locally in your browser. Nothing is sent to any server.
- OCR runs via Tesseract.js (WebAssembly) entirely on your device. No cloud APIs.
- No analytics, no tracking, no accounts, no data collection.
- Open source: https://github.com/SlambertDK/page-to-markdown-extension

OUTPUT
- content.md: Clean Markdown optimised for LLM upload
- images/ folder: All page images and OCR screenshots
- README.txt: Usage instructions
- Everything in a single ZIP file

DISCLAIMER
You are responsible for ensuring you have the right to capture and use any content. A terms disclaimer is shown before every capture session.
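As a rough illustration of the output described above, the sketch below shows one way captured pages could be rendered into a single content.md that references files in the images/ folder, along with a word count of the kind the results panel reports. It is not taken from the extension's source: the `CapturedPage` shape, the function names `buildContentMarkdown` and `countWords`, and the exact Markdown layout are all assumptions made for illustration.

```typescript
// Hypothetical shape of one captured page; the field names are assumptions,
// not the extension's actual data model.
interface CapturedPage {
  title: string;
  url: string;
  text: string;
  imageFiles: string[]; // file names stored under images/ in the ZIP
}

// Count whitespace-separated words, as a results panel might report for OCR text.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Render captured pages as one Markdown document (the role content.md plays).
// Pages are separated by a horizontal rule; images are linked relative to images/.
function buildContentMarkdown(pages: CapturedPage[]): string {
  return pages
    .map((page) => {
      const images = page.imageFiles
        .map((name) => `![${name}](images/${name})`)
        .join("\n");
      const parts = [`# ${page.title}`, `Source: ${page.url}`, page.text.trim()];
      if (images !== "") parts.push(images);
      return parts.join("\n\n");
    })
    .join("\n\n---\n\n");
}
```

In Auto mode, a session with several pages would pass them all to `buildContentMarkdown` in visit order. Keeping image links relative to images/ means the Markdown still resolves after the ZIP is unpacked anywhere.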

Details

  • Version
    1.5.0
  • Updated
    February 21, 2026
  • Offered by
    henriklambert1979
  • Size
    3.47MiB
  • Languages
    English
  • Developer
    Email
    henriklambert@proton.me
  • Non-trader
    This developer has not identified itself as a trader. For consumers in the European Union, please note that consumer rights do not apply to contracts between you and this developer.

Privacy

The developer has disclosed that it will not collect or use your data.

This developer declares that your data is

  • Not being sold to third parties, outside of the approved use cases
  • Not being used or transferred for purposes that are unrelated to the item's core functionality
  • Not being used or transferred to determine creditworthiness or for lending purposes