#python #agent #agents #ai_search #chatbot #chatgpt #data_pipelines #deep_learning #document_parser #document_understanding #genai #graph #graphrag #llm #nlp #pdf_to_text #preprocessing #rag #retrieval_augmented_generation #table_structure_recognition #text2sql
RAGFlow is an open-source retrieval-augmented generation (RAG) engine that helps businesses answer questions accurately by combining large language models with deep document understanding. It extracts information from complex data formats, such as Word documents, Excel files, and web pages, and backs its answers with grounded citations. You can try a demo online or set it up on your own server using Docker; setup takes a few steps: cloning the repository, building the Docker image, and configuring the system settings. Key features include template-based chunking, reduced hallucinations, and compatibility with multiple data sources. The result is reliable, explainable answers that fit into existing workflows and business systems.
https://github.com/infiniflow/ragflow
RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs - infiniflow/ragflow
#python #ai #convert #documents #pdf #tables
Docling is a tool that helps you convert different types of documents (like PDF, DOCX, PPTX, and more) into Markdown or JSON format quickly and easily. It can read complex PDFs, extract metadata, and even work with scanned documents using OCR. Docling also integrates well with other tools for powerful question-answering applications. You can install it using `pip install docling` and start converting documents right away. This makes it easier to manage and use your documents in various formats, saving you time and effort.
https://github.com/DS4SD/docling
Get your documents ready for gen AI. Contribute to docling-project/docling development by creating an account on GitHub.
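A minimal sketch of the "install and start converting" step described above, assuming `docling` is installed and that its `DocumentConverter` class and `export_to_markdown()` method behave as in the project's quick-start. The helper names `markdown_path_for` and `convert_to_markdown` and the `converted` output folder are illustrative and not part of Docling.

```python
from pathlib import Path


def markdown_path_for(source: str, out_dir: str = "converted") -> Path:
    """Map an input file path to a .md path inside out_dir (hypothetical helper)."""
    return Path(out_dir) / (Path(source).stem + ".md")


def convert_to_markdown(source: str, out_dir: str = "converted") -> Path:
    """Convert one document with Docling and write the Markdown to disk."""
    # Imported lazily so this module still loads where docling is not installed.
    from docling.document_converter import DocumentConverter

    result = DocumentConverter().convert(source)
    target = markdown_path_for(source, out_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(result.document.export_to_markdown(), encoding="utf-8")
    return target
```

Calling `convert_to_markdown("report.pdf")` would then write `converted/report.md`; the first run may take a while because Docling downloads its models.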
#java #docker #pdf #pdf_converter #pdf_editor #pdf_manipulation #pdf_merger #pdf_ocr #pdf_tools #pdf_web_apps #pdfmerger
Stirling-PDF is a powerful tool for managing PDF files locally on your computer or server. It allows you to perform various operations like splitting, merging, converting, and editing PDFs without sending your files to external servers, ensuring your data stays private. You can add images, rotate pages, compress files, and even convert PDFs to other formats like Word or images. The tool supports multiple languages and has features like dark mode, custom download options, and API integration for advanced users. It's easy to set up using Docker and offers customizable settings and security features like login authentication. This makes it a versatile and secure solution for all your PDF needs.
https://github.com/Stirling-Tools/Stirling-PDF
#1 PDF Application on GitHub that lets you edit PDFs on any device anywhere - Stirling-Tools/Stirling-PDF
#python #ai4science #document_analysis #extract_data #layout_analysis #ocr #parser #pdf #pdf_converter #pdf_extractor_llm #pdf_extractor_pretrain #pdf_extractor_rag #pdf_parser
MinerU is a tool that converts PDFs into machine-readable formats like Markdown or JSON. Here are the key benefits and features:
- **Text Cleaning**: Removes headers, footers, and other unnecessary elements so the text is semantically coherent and in human-readable order, even for complex layouts.
- **Structure Preservation**: Extracts images, image descriptions, tables, and table titles.
- **Formula Conversion**: Recognizes formulas and converts them to LaTeX.
- **Table Conversion**: Recognizes tables and converts them to LaTeX or HTML format.
- **OCR Support**: Handles scanned PDFs through OCR.
- **Flexible Output**: Supports multiple output formats and various visualization results.
- **GPU and CPU Compatibility**: Works in both CPU and GPU environments, compatible with Windows, Linux, and Mac.
You can try MinerU through an online demo, a quick CPU demo, or by using a GPU for faster processing. For detailed usage, refer to the command line options, API integration, and deployment guides provided.
https://github.com/opendatalab/MinerU
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows. - opendatalab/MinerU
#ruby #daisyui #document_signing #documents #e_signature #github_catalyst #hotwired_turbo #legaltech #open_source #pdf #pdf_sign #pdf_signature #ruby_on_rails #self_hosted #tailwindcss #vuejs #webpack
DocuSeal is a free and open-source platform that helps you fill and sign documents online easily. You can create PDF forms with various field types like signatures, dates, and checkboxes, and these forms can be filled and signed on any device. It offers features like automated emails, multiple language support, and integration with cloud storage services. The platform is mobile-optimized and has tools for user management and API integrations. This makes it convenient for businesses to integrate document signing into their apps, reducing costs and ensuring security and compliance. You can try it out with a live demo or deploy it quickly using various hosting options.
https://github.com/docusealco/docuseal
Open source DocuSign alternative. Create, fill, and sign digital documents ✍️ - docusealco/docuseal
#python #docx #llm #parser #pdf #powerpoint
MegaParse is a powerful tool that helps you parse different types of documents like text, PDFs, PowerPoint presentations, and Word documents without losing any information. It is fast, efficient, and supports many file formats. You can use it for free since it is open source. To use MegaParse, you just need to install it with a simple command and set up some additional tools depending on your needs. This tool benefits you by making it easy to extract data from various documents quickly and accurately, saving you time and effort.
https://github.com/QuivrHQ/MegaParse
File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs. - QuivrHQ/MegaParse
#lua #cbz #djvu #djvu_reflow #ebook #ebook_reader #eink #epub #ereader #fb2 #kindle #kobo #luajit #opds #pdf #pdf_reflow #pocketbook #reader #reflow #remarkable_tablet #ubuntu_touch
KOReader is a powerful document viewer designed for e-ink readers and other devices. It supports many file formats like PDF, EPUB, and more, and allows you to customize the reading experience with adjustable margins, line spacing, and fonts. It's fast, even on older devices, and integrates with tools like calibre and Google Translate. KOReader is also optimized for e-ink devices with features like easy zoom and no animations. This makes reading comfortable and efficient, giving you a better experience overall.
https://github.com/koreader/koreader
An ebook reader application supporting PDF, DjVu, EPUB, FB2 and many more formats, running on Cervantes, Kindle, Kobo, PocketBook and Android devices - koreader/koreader
#javascript #book #cb7 #cbr #cbt #cbz #comic #docx #ebook #epub #fb2 #html #markdown #mobi #pdf #reader #rtf #txt #xml
Koodo Reader is a powerful ebook reader that works on many platforms like Windows, macOS, Linux, and even the web. It supports many file formats such as EPUB, PDF, MOBI, and more. You can customize how your books look by changing font size, color, and background. It also has features like text-to-speech, translation, and night mode. You can save your books to cloud services like OneDrive, Google Drive, and Dropbox, making it easy to access your books on different devices. This makes reading convenient and enjoyable anywhere you go.
https://github.com/koodo-reader/koodo-reader
A modern ebook manager and reader with sync and backup capacities for Windows, macOS, Linux, Android, iOS and Web - koodo-reader/koodo-reader
#python #chinese #english #japanese #korean #latex #openai #pdf #pdf2zh #russian #translation
PDFMathTranslate translates scientific PDF papers while keeping formulas, charts, and other layout elements intact. You can run it from the command line, through an interactive user interface, or with Docker. It supports multiple languages and several translation services, including Google and DeepL. You can also try it online without installing anything. This makes it easier to read and work with papers written in other languages.
https://github.com/Byaidu/PDFMathTranslate
[EMNLP 2025 Demo] PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with the layout fully preserved; supports services such as Google/DeepL/Ollama/OpenAI and provides CLI/GUI/MCP/Docker/Zotero - PDFMathTranslate/PDFMathTrans...
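As a rough illustration of the command-line route mentioned above, the sketch below assembles a `pdf2zh` invocation from Python. It assumes the `-li` (source language), `-lo` (target language), and `-s` (translation service) flags of the 1.x command line; newer releases may differ, so check `pdf2zh --help` before relying on it. The function name `build_pdf2zh_command` and the file name `paper.pdf` are hypothetical.

```python
import shlex


def build_pdf2zh_command(pdf_path, lang_in="en", lang_out="zh", service=None):
    """Return the argument list for a pdf2zh run (flags assumed from the 1.x CLI)."""
    cmd = ["pdf2zh", pdf_path, "-li", lang_in, "-lo", lang_out]
    if service:
        # Assumed flag: selects the translation backend, e.g. "google" or "deepl".
        cmd += ["-s", service]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is translated by accident.
    print(shlex.join(build_pdf2zh_command("paper.pdf", service="deepl")))
```

Passing the returned list to `subprocess.run(cmd, check=True)` would execute the translation once the tool is installed.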
#typescript #digital_signature #document_signing #docusign_alternative #e_signature #esign #esignature #next_auth #nextjs #open_source #pades_standard #pdf #pdf_sign #pdf_signature #postgresql #prisma #self_hosted #signing
Documenso is an open-source alternative to DocuSign, allowing you to sign documents digitally in a secure and transparent way. You can self-host it, which means you have full control over how it works and can review the code. This builds trust because you aren't relying on a third-party provider. Joining the community helps in creating a more open and trustworthy signing tool. You can test it locally, provide feedback, and even contribute to its development. This gives you flexibility and control over your document signing process.
https://github.com/documenso/documenso
The Open Source DocuSign Alternative. Contribute to documenso/documenso development by creating an account on GitHub.