#html #data_pipelines #deep_learning #document_ai #document_image_analysis #document_image_processing #document_parser #document_parsing #docx #donut #information_retrieval #langchain #machine_learning #ml #natural_language_processing #nlp #ocr #pdf #pdf_to_json #pdf_to_text #preprocessing
https://github.com/Unstructured-IO/unstructured
GitHub - Unstructured-IO/unstructured: Convert documents to structured data effortlessly.
Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models.
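A minimal usage sketch, assuming the library's auto-partition entry point; the file name is hypothetical, and the element attributes shown (category, text) follow the documented element interface.
```python
# Minimal sketch: partition a document into typed elements with unstructured.
# "report.pdf" is a hypothetical input file.
from unstructured.partition.auto import partition

elements = partition(filename="report.pdf")  # detects the file type and picks a parser

for el in elements:
    # each element carries a category (Title, NarrativeText, Table, ...) and its text
    print(f"{el.category}: {el.text[:80]}")
```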
#python #document_ai #document_image_analysis #document_layout_analysis #document_parser #document_understanding #layoutlm #nlp #ocr #publaynet #pubtabnet #pytorch #table_detection #table_recognition #tensorflow
https://github.com/deepdoctection/deepdoctection
GitHub - deepdoctection/deepdoctection: A Repo For Document AI
deepdoctection orchestrates deep-learning models for document layout analysis, table detection and recognition, and OCR in configurable document-processing pipelines.
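A minimal sketch of the analyzer workflow, assuming deepdoctection's default quick-start pipeline; the PDF path is hypothetical.
```python
# Minimal sketch: run deepdoctection's default pipeline (layout, tables, OCR) on a PDF.
# "paper.pdf" is a hypothetical input path.
import deepdoctection as dd

analyzer = dd.get_dd_analyzer()            # assembles the default detection/OCR pipeline
df = analyzer.analyze(path="paper.pdf")    # lazy dataflow over document pages
df.reset_state()                           # must be called before iterating

for page in df:
    print(page.text)                       # page text in reading order
    for table in page.tables:
        print(table.csv)                   # recognized table content, row by row
```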
#python #beit #beit_3 #bitnet #deepnet #document_ai #foundation_models #kosmos #kosmos_1 #layoutlm #layoutxlm #llm #minilm #mllm #multimodal #nlp #pre_trained_model #textdiffuser #trocr #unilm #xlm_e
Microsoft is developing advanced AI models through large-scale self-supervised pre-training across tasks, languages, and modalities. Models such as Foundation Transformers (Magneto) and Kosmos-2.5 are designed to generalize across language understanding, vision, speech, and multimodal interaction, and they deliver state-of-the-art results in document AI, speech recognition, machine translation, and more. Companion tools such as TorchScale and Aggressive Decoding improve the stability, efficiency, and speed of model training and deployment.
https://github.com/microsoft/unilm
GitHub - microsoft/unilm: Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
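As a concrete illustration, a hedged sketch of running one unilm-family model (TrOCR) through Hugging Face transformers rather than this repository's own training code; the checkpoint is a published Microsoft model, and the input image path is hypothetical.
```python
# Sketch: printed-text OCR with TrOCR via Hugging Face transformers.
# "line.png" is a hypothetical cropped text-line image.
from PIL import Image
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-printed")
model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-printed")

image = Image.open("line.png").convert("RGB")
pixel_values = processor(images=image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```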