#python #document_analysis #layout_analysis #ocr #parser #pdf #pdf_converter #pdf_parser #python #vlm_ocr
Dolphin is a smart AI tool that can analyze and understand complex document images, like pages with text, tables, formulas, and pictures. It works in two steps: first, it figures out the layout and reading order of the page; then, it quickly parses each element using special prompts. This makes it fast and accurate for turning document images into structured data like JSON or Markdown. You can use pre-trained models and easy code to process single pages, PDFs, or specific elements. This helps you save time and effort when extracting information from complicated documents efficiently.
https://github.com/bytedance/Dolphin
Dolphin is a smart AI tool that can analyze and understand complex document images, like pages with text, tables, formulas, and pictures. It works in two steps: first, it figures out the layout and reading order of the page; then, it quickly parses each element using special prompts. This makes it fast and accurate for turning document images into structured data like JSON or Markdown. You can use pre-trained models and easy code to process single pages, PDFs, or specific elements. This helps you save time and effort when extracting information from complicated documents efficiently.
https://github.com/bytedance/Dolphin
GitHub
GitHub - bytedance/Dolphin: The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025. - bytedance/Dolphin