Collections

Collections including paper arxiv:2404.05669

Papers - Image - OCR - Binarization - Sauvola

NAF-DPM: A Nonlinear Activation-Free Diffusion Probabilistic Model for Document Enhancement

Paper • 2404.05669 • Published Apr 8 • 1

Papers - Image - OCR - Binarization - Otsu

NAF-DPM: A Nonlinear Activation-Free Diffusion Probabilistic Model for Document Enhancement

Paper • 2404.05669 • Published Apr 8 • 1
TorMentor: Deterministic dynamic-path, data augmentations with fractals

Paper • 2204.03776 • Published Apr 7, 2022 • 1

Papers - Image - Augmentation - Binarization - NAF-DPM

NAF-DPM: A Nonlinear Activation-Free Diffusion Probabilistic Model for Document Enhancement

Paper • 2404.05669 • Published Apr 8 • 1

Papers - Image - LPIPS

Dynamic Typography: Bringing Words to Life

Paper • 2404.11614 • Published Apr 17 • 43
Scene Coordinate Reconstruction: Posing of Image Collections via Incremental Learning of a Relocalizer

Paper • 2404.14351 • Published Apr 22 • 5
BlenderAlchemy: Editing 3D Graphics with Vision-Language Models

Paper • 2404.17672 • Published Apr 26 • 18
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation

Paper • 2406.06525 • Published Jun 10 • 64

Papers - Image - Ordinary Differential Equations (ODE)

ConsistencyDet: Robust Object Detector with Denoising Paradigm of Consistency Model

Paper • 2404.07773 • Published Apr 11 • 1
Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis

Paper • 2404.13686 • Published Apr 21 • 27
Align Your Steps: Optimizing Sampling Schedules in Diffusion Models

Paper • 2404.14507 • Published Apr 22 • 21
NAF-DPM: A Nonlinear Activation-Free Diffusion Probabilistic Model for Document Enhancement

Paper • 2404.05669 • Published Apr 8 • 1

Papers - Document - OCR

Noise-Aware Training of Layout-Aware Language Models

Paper • 2404.00488 • Published Mar 30 • 7
FormNet: Structural Encoding beyond Sequential Modeling in Form Document Information Extraction

Paper • 2203.08411 • Published Mar 16, 2022 • 1
FormNetV2: Multimodal Graph Contrastive Learning for Form Document Information Extraction

Paper • 2305.02549 • Published May 4, 2023 • 6
ETC: Encoding Long and Structured Inputs in Transformers

Paper • 2004.08483 • Published Apr 17, 2020 • 1

Papers - Image - Historical

Insightful analysis of historical sources at scales beyond human capabilities using unsupervised Machine Learning and XAI

Paper • 2310.09091 • Published Oct 13, 2023 • 2
Evolution and Transformation of Scientific Knowledge over the Sphaera Corpus: A Network Study

Paper • 2004.00520 • Published Apr 1, 2020 • 2
NAF-DPM: A Nonlinear Activation-Free Diffusion Probabilistic Model for Document Enhancement

Paper • 2404.05669 • Published Apr 8 • 1

Papers - Image - Fine-tuning

DocLLM: A layout-aware generative language model for multimodal document understanding

Paper • 2401.00908 • Published Dec 31, 2023 • 181
Visual Instruction Tuning

Paper • 2304.08485 • Published Apr 17, 2023 • 13
Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering

Paper • 2403.09622 • Published Mar 14 • 16
Lumiere: A Space-Time Diffusion Model for Video Generation

Paper • 2401.12945 • Published Jan 23 • 86

Papers - Image - OCR Handwriting

Vulnerability Analysis of Transformer-based Optical Character Recognition to Adversarial Attacks

Paper • 2311.17128 • Published Nov 28, 2023 • 2
Data Generation for Post-OCR correction of Cyrillic handwriting

Paper • 2311.15896 • Published Nov 27, 2023 • 3
An End-to-End OCR Framework for Robust Arabic-Handwriting Recognition using a Novel Transformers-based Model and an Innovative 270 Million-Words Multi-Font Corpus of Classical Arabic with Diacritics

Paper • 2208.11484 • Published Aug 20, 2022 • 3
Transformer based Urdu Handwritten Text Optical Character Reader

Paper • 2206.04575 • Published Jun 9, 2022 • 2
