GitHub Trends
10.1K subscribers
15.3K links
See what the GitHub community is most excited about today.

A bot automatically fetches new repositories from https://github.com/trending and sends them to the channel.
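
As a rough illustration of how such a bot might turn a trending repository into a channel post, here is a minimal Python sketch. The helper names (`to_hashtag`, `format_post`, `select_new`) are hypothetical and are not taken from the bot's actual source. The hashtag rules (language first, then topics in alphabetical order, hyphens replaced by underscores) are only inferred from the posts in this channel.

```python
def to_hashtag(name: str) -> str:
    """Turn a language or topic name into a hashtag (assumed rule: lowercase, '-' -> '_')."""
    return "#" + name.strip().lower().replace("-", "_").replace(" ", "_")


def format_post(language: str, topics: list[str], summary: str, url: str) -> str:
    """Build the post text: hashtag line, summary, repository URL."""
    tags: list[str] = []
    # Language first, then topics alphabetically; skip tags already present.
    for name in [language, *sorted(topics)]:
        tag = to_hashtag(name)
        if tag not in tags:
            tags.append(tag)
    return f"{' '.join(tags)}\n\n{summary}\n\n{url}"


def select_new(trending_urls: list[str], seen: set[str]) -> list[str]:
    """Return only repositories not posted before, and remember them in `seen`."""
    fresh = [u for u in trending_urls if u not in seen]
    seen.update(fresh)
    return fresh
```

Fetching the trending page and sending the message to Telegram are left out on purpose, since the bot's real implementation of those steps is not described here.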

Author and maintainer: https://github.com/katursis
#python #agent #agents #ai_search #chatbot #chatgpt #data_pipelines #deep_learning #document_parser #document_understanding #genai #graph #graphrag #llm #nlp #pdf_to_text #preprocessing #rag #retrieval_augmented_generation #table_structure_recognition #text2sql

RAGFlow is an open-source retrieval-augmented generation (RAG) engine that helps businesses answer questions accurately using large language models and deep document understanding. It extracts information from complex data formats, such as Word documents, Excel files, and web pages, and backs its answers with grounded citations. You can try a demo online or set it up on your own server using Docker. The setup is relatively straightforward: clone the repository, build the Docker image, and configure the system settings. Key features include template-based chunking, reduced hallucinations, and compatibility with multiple data sources. Users get reliable, explainable answers, a streamlined workflow, and integration with their business systems.

https://github.com/infiniflow/ragflow
#python #ai #convert #documents #pdf #tables

Docling is a tool that helps you convert different types of documents (like PDF, DOCX, PPTX, and more) into Markdown or JSON format quickly and easily. It can read complex PDFs, extract metadata, and even work with scanned documents using OCR. Docling also integrates well with other tools for powerful question-answering applications. You can install it using `pip install docling` and start converting documents right away. This makes it easier to manage and use your documents in various formats, saving you time and effort.
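
A minimal sketch of what a conversion could look like in Python after `pip install docling`. The `DocumentConverter` class with its `convert` and `export_to_markdown` calls follows Docling's published quick-start as best understood here. The `convert_to_markdown` wrapper and its `converter` parameter are hypothetical additions, there only so the function can be exercised without the library installed.

```python
def convert_to_markdown(source: str, converter=None) -> str:
    """Convert a document (local path or URL) to Markdown text.

    `converter` is a hypothetical injection point; when omitted, a Docling
    DocumentConverter is created (this requires `pip install docling`).
    """
    if converter is None:
        # Imported lazily so the module loads even without Docling installed.
        from docling.document_converter import DocumentConverter

        converter = DocumentConverter()
    result = converter.convert(source)
    return result.document.export_to_markdown()
```

With Docling installed, `print(convert_to_markdown("report.pdf"))` would print the Markdown for a local file, where `report.pdf` is a placeholder path.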

https://github.com/DS4SD/docling
#java #docker #pdf #pdf_converter #pdf_editor #pdf_manipulation #pdf_merger #pdf_ocr #pdf_tools #pdf_web_apps #pdfmerger

Stirling-PDF is a powerful tool for managing PDF files locally on your computer or server. It allows you to perform various operations like splitting, merging, converting, and editing PDFs without sending your files to external servers, ensuring your data stays private. You can add images, rotate pages, compress files, and even convert PDFs to other formats like Word or images. The tool supports multiple languages and has features like dark mode, custom download options, and API integration for advanced users. It's easy to set up using Docker and offers customizable settings and security features like login authentication. This makes it a versatile and secure solution for all your PDF needs.

https://github.com/Stirling-Tools/Stirling-PDF
#python #ai4science #document_analysis #extract_data #layout_analysis #ocr #parser #pdf #pdf_converter #pdf_extractor_llm #pdf_extractor_pretrain #pdf_extractor_rag #pdf_parser

MinerU is a tool that converts PDFs into machine-readable formats like Markdown or JSON. Here are the key benefits and features:
- **Clean Text**: MinerU removes headers, footers, and other unnecessary elements to ensure the text is semantically coherent and in human-readable order, even for complex layouts.
- **Structure Preservation**: It extracts images, image descriptions, tables, and table titles.
- **Table Conversion**: Recognizes tables and converts them to LaTeX or HTML format.
- **Output Options**: Supports multiple output formats and various visualization results.
- **GPU and CPU Compatibility**: Works in both CPU and GPU environments, and is compatible with Windows, Linux, and Mac.

You can try MinerU through an online demo, a quick CPU demo, or by using a GPU for faster processing. For detailed usage, refer to the command line options, API integration, and deployment guides provided.
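
To make the header and footer removal mentioned above more concrete, here is a toy heuristic in Python. It is not MinerU's method, which relies on layout analysis models. It only drops a page's first or last line when that exact line repeats across most pages. The function name `strip_repeated_edges` and the `min_share` threshold are hypothetical.

```python
from collections import Counter
from math import ceil


def strip_repeated_edges(pages: list[list[str]], min_share: float = 0.6) -> list[list[str]]:
    """Drop first/last lines that repeat verbatim on at least `min_share` of pages.

    `pages` is a list of pages, each given as a list of text lines.
    """
    # A line must appear on at least two pages to count as a header or footer.
    threshold = max(2, ceil(min_share * len(pages)))
    first = Counter(p[0].strip() for p in pages if p)
    last = Counter(p[-1].strip() for p in pages if p)
    headers = {line for line, n in first.items() if n >= threshold}
    footers = {line for line, n in last.items() if n >= threshold}

    cleaned = []
    for page in pages:
        lines = list(page)
        if lines and lines[0].strip() in headers:
            lines = lines[1:]
        if lines and lines[-1].strip() in footers:
            lines = lines[:-1]
        cleaned.append(lines)
    return cleaned
```

Footers that change from page to page, such as page numbers, are not caught by this exact-match rule, which is one reason real tools use layout models instead.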

https://github.com/opendatalab/MinerU
#ruby #daisyui #document_signing #documents #e_signature #github_catalyst #hotwired_turbo #legaltech #open_source #pdf #pdf_sign #pdf_signature #ruby_on_rails #self_hosted #tailwindcss #vuejs #webpack

DocuSeal is a free and open-source platform that helps you fill and sign documents online easily. You can create PDF forms with various field types like signatures, dates, and checkboxes, and these forms can be filled and signed on any device. It offers features like automated emails, multiple language support, and integration with cloud storage services. The platform is mobile-optimized and has tools for user management and API integrations. This makes it convenient for businesses to integrate document signing into their apps, reducing costs and ensuring security and compliance. You can try it out with a live demo or deploy it quickly using various hosting options.

https://github.com/docusealco/docuseal
#python #docx #llm #parser #pdf #powerpoint

MegaParse is a powerful tool that helps you parse different types of documents like text, PDFs, PowerPoint presentations, and Word documents without losing any information. It is fast, efficient, and supports many file formats. You can use it for free since it is open source. To use MegaParse, you just need to install it with a simple command and set up some additional tools depending on your needs. This tool benefits you by making it easy to extract data from various documents quickly and accurately, saving you time and effort.

https://github.com/QuivrHQ/MegaParse
#lua #cbz #djvu #djvu_reflow #ebook #ebook_reader #eink #epub #ereader #fb2 #kindle #kobo #luajit #opds #pdf #pdf_reflow #pocketbook #reader #reflow #remarkable_tablet #ubuntu_touch

KOReader is a powerful document viewer designed for e-ink readers and other devices. It supports many file formats like PDF, EPUB, and more, and allows you to customize the reading experience with adjustable margins, line spacing, and fonts. It's fast, even on older devices, and integrates with tools like calibre and Google Translate. KOReader is also optimized for e-ink devices with features like easy zoom and no animations. This makes reading comfortable and efficient, giving you a better experience overall.

https://github.com/koreader/koreader
#javascript #book #cb7 #cbr #cbt #cbz #comic #docx #ebook #epub #fb2 #html #markdown #mobi #pdf #reader #rtf #txt #xml

Koodo Reader is a powerful ebook reader that works on many platforms like Windows, macOS, Linux, and even the web. It supports many file formats such as EPUB, PDF, MOBI, and more. You can customize how your books look by changing font size, color, and background. It also has features like text-to-speech, translation, and night mode. You can save your books to cloud services like OneDrive, Google Drive, and Dropbox, making it easy to access your books on different devices. This makes reading convenient and enjoyable anywhere you go.

https://github.com/koodo-reader/koodo-reader
#python #chinese #english #japanese #korean #latex #openai #pdf #pdf2zh #russian #translation

PDFMathTranslate helps you translate scientific PDF papers while keeping formulas, charts, and other important parts intact. You can use it in several ways: from the command line, through an interactive user interface, or with Docker. It supports multiple languages and various translation services like Google, DeepL, and more. You can even try it online without installing anything. This makes it easier to understand and work with documents in different languages, saving you time and effort.

https://github.com/Byaidu/PDFMathTranslate
#typescript #digital_signature #document_signing #docusign_alternative #e_signature #esign #esignature #next_auth #nextjs #open_source #pades_standard #pdf #pdf_sign #pdf_signature #postgresql #prisma #self_hosted #signing

Documenso is an open-source alternative to DocuSign, allowing you to sign documents digitally in a secure and transparent way. You can self-host it, which means you have full control over how it works and can review the code. This builds trust because you aren't relying on a third-party provider. Joining the community helps in creating a more open and trustworthy signing tool. You can test it locally, provide feedback, and even contribute to its development. This gives you flexibility and control over your document signing process.

https://github.com/documenso/documenso