GitHub Trends
See what the GitHub community is most excited about today.

A bot automatically fetches new repositories from https://github.com/trending and sends them to the channel.

Author and maintainer: https://github.com/katursis
#javascript #deep_learning #ocr #tesseract #webassembly

Tesseract.js is a JavaScript library that extracts text from images and supports almost any language. It runs in the browser and on the server with Node.js, and you can install it with a script tag, webpack, or npm. It converts images to text quickly and accurately across multiple languages and image formats, which is useful for tasks such as scanning documents and recognizing text in video. The library is also lightweight, with reduced file sizes and memory usage, which helps it load and run faster.

https://github.com/naptha/tesseract.js
#python #image_processing #ocr #pdf #tesseract

OCRmyPDF is a tool that makes scanned PDF files searchable. It adds an OCR text layer to the PDF so you can search for words or copy and paste text from the document. It supports many languages, fixes misrotated or crooked pages, and optimizes the file size. It runs on Linux, Windows, and macOS, and uses multiple CPU cores to speed up processing. This makes scanned documents easier to work with and easier to find later.

https://github.com/ocrmypdf/OCRmyPDF
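
For readers who want to try the features listed in the OCRmyPDF post, here is a minimal Python sketch that assembles an `ocrmypdf` command line and runs it. The helper names `build_ocrmypdf_command` and `run_ocr` are hypothetical and not part of OCRmyPDF, and the choice of defaults is an assumption. The flags used are the CLI's documented `--language`, `--rotate-pages`, `--deskew`, `--optimize`, and `--jobs` options.

```python
import shutil
import subprocess


def build_ocrmypdf_command(src, dst, languages=("eng",), jobs=None):
    """Build an argv list for the ocrmypdf CLI without running it.

    Hypothetical helper for illustration; language codes are Tesseract codes.
    """
    cmd = [
        "ocrmypdf",
        "--language", "+".join(languages),  # several languages are joined with "+"
        "--rotate-pages",                   # fix misrotated pages
        "--deskew",                         # straighten crooked pages
        "--optimize", "1",                  # lossless file-size optimization
    ]
    if jobs is not None:
        cmd += ["--jobs", str(jobs)]        # cap the number of CPU cores used
    cmd += [str(src), str(dst)]
    return cmd


def run_ocr(src, dst, **options):
    """Run ocrmypdf on src and write the searchable PDF to dst."""
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf is not installed or not on PATH")
    subprocess.run(build_ocrmypdf_command(src, dst, **options), check=True)
```

Calling `run_ocr("scan.pdf", "scan_ocr.pdf", languages=("eng", "deu"), jobs=4)` would process a placeholder file named `scan.pdf`, assuming OCRmyPDF and the matching Tesseract language data are installed. Keeping the command construction separate from the subprocess call makes the arguments easy to inspect before anything runs.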