Improving OCR on Low-Quality Documents with AuraSR-v2 and MiniCPM-V 2.6
Welcome, fellow learners! In this video, we'll explore how to combine two newly released open-source models to achieve better OCR results on low-quality scanned documents. The first model, AuraSR, is a GAN-based super-resolution model that enhances the quality of scanned document images. The second model is MiniCPM-V 2.6, a recently released multimodal LLM, which we'll use to extract text from the upscaled document images.
Notebook - https://colab.research.google.com/drive/11_0W59kZBoSf7aSeB_tc-SAX06kMu-xX?usp=sharing
MiniCPM-V 2.6 - https://huggingface.co/openbmb/MiniCPM-V-2_6
AuraSR-v2 - https://huggingface.co/fal/AuraSR-v2
#ocr #superresolution #aurasr #minicpm #documentscanning #machinelearning #deeplearning #opensource #imageprocessing #gan #llm #generativeadversarialnetworks #lowqualityimages #4xresolution
Watch on YouTube ↗
(saves to browser)
Sign in to unlock AI tutor explanation · ⚡30
More on: Multimodal LLMs
View skill →
🎓
Tutor Explanation
DeepCamp AI