DeepSeek-OCR: Optical Character Recognition for Visual-Text Compression
By pierre
Summary
DeepSeek-OCR is a GitHub repository for optical character recognition (OCR) focused on visual-text compression. Its README gives installation instructions for a CUDA 11.8 and PyTorch 2.6.0 environment, including setup for vLLM inference and a note on running vLLM and Transformers code in the same environment. The repository also provides model downloads, a link to the paper, and implementation details for exploring the boundaries of visual-text compression.
Key quotes
Explore the boundaries of visual-text compression.
Our environment is cuda11.8+torch2.6.0.
Clone this repository and navigate to the DeepSeek-OCR folder
Note: if you want vLLM and transformers codes to run in the same environment, you don't need to worry about this installation error
Note: change the INPUT_PATH/OUTPUT_PATH and other settings in th
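The environment quoted above ("cuda11.8+torch2.6.0") can be checked against a local install before running anything. The following is a minimal sketch, not part of the repository: the helper names parse_env and env_matches are hypothetical, and it assumes that exact version equality is the desired check.

```python
import re

# Environment string as quoted from the repository README.
QUOTED_ENV = "cuda11.8+torch2.6.0"


def parse_env(env: str) -> dict:
    """Split 'cuda11.8+torch2.6.0' into {'cuda': '11.8', 'torch': '2.6.0'}."""
    parts = {}
    for chunk in env.split("+"):
        match = re.fullmatch(r"([a-z]+)([\d.]+)", chunk)
        if match is None:
            raise ValueError(f"unrecognised component: {chunk!r}")
        parts[match.group(1)] = match.group(2)
    return parts


def env_matches(torch_version: str, cuda_version, env: str = QUOTED_ENV) -> bool:
    """Return True when the installed versions equal the quoted ones.

    Hypothetical helper: exact equality is an assumption, not a rule
    stated by the repository.
    """
    wanted = parse_env(env)
    # PyTorch wheels report versions like '2.6.0+cu118'; drop the build suffix.
    base_torch = torch_version.split("+")[0]
    return base_torch == wanted["torch"] and cuda_version == wanted["cuda"]


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        # torch.version.cuda is None on CPU-only builds.
        print(env_matches(torch.__version__, torch.version.cuda))
```

Run as a script, it prints whether the local PyTorch build matches the quoted environment; the comparison functions themselves do not need PyTorch and can be reused in a setup check.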
