Docling: Get your documents ready for Gen AI
Docling: Simplify Document Processing for Gen AI
Ideally, feeding documents into a Generative AI model would be as simple as uploading a file. In reality, parsing complex formats like PDFs with tables, formulas, and multi-column layouts is a major bottleneck. You often end up with garbled text and lost structure.
Docling solves this frustration by providing an easy way to parse diverse document formats into a unified, expressive representation.
Key Features
🗂️ Diverse Format Support
Docling isn’t limited to just text files. It handles a wide array of formats, ensuring you can process almost anything you throw at it:
- Documents: PDF, DOCX, PPTX, XLSX, HTML
- Media: Images (PNG, TIFF, JPEG), Audio (WAV, MP3) via ASR
- Standard: Markdown, AsciiDoc
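With that many formats in play, a mixed folder can go through one converter in a single pass. The sketch below is a minimal illustration, assuming Docling is installed with `pip install docling`. The suffix set is my own guess at the extensions behind the formats listed above, and `pick_convertible` and `convert_folder` are hypothetical helpers, not part of Docling. Audio is left out because ASR may need extra setup.

```python
from pathlib import Path

# Assumed suffixes for the formats listed above (audio omitted on purpose).
SUPPORTED_SUFFIXES = {
    ".pdf", ".docx", ".pptx", ".xlsx", ".html",
    ".png", ".tiff", ".jpeg", ".jpg",
    ".md", ".adoc",
}


def pick_convertible(paths):
    """Keep only paths whose suffix is in the assumed set (case-insensitive)."""
    return [Path(p) for p in paths if Path(p).suffix.lower() in SUPPORTED_SUFFIXES]


def convert_folder(folder):
    """Convert each matching file in `folder`; return {file name: Markdown}."""
    # Imported lazily so the filtering helper works without Docling installed.
    from docling.document_converter import DocumentConverter

    files = pick_convertible(
        p for p in sorted(Path(folder).iterdir()) if p.is_file()
    )
    converter = DocumentConverter()
    return {
        result.input.file.name: result.document.export_to_markdown()
        for result in converter.convert_all(files)
    }
```

Calling `convert_folder("docs/")` on a directory of your own would return one Markdown string per file; files with other suffixes are skipped rather than rejected.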
📑 Advanced Understanding
The real magic of Docling lies in its ability to understand the structure of a document. It doesn’t just extract text; it preserves formatting and context:
- Layout Analysis: Respects page reading order and multi-column layouts.
- Table Structure: Accurately reconstructs tables.
- Formula & Code Recognition: Identifies and preserves technical content.
- OCR Support: Powerful OCR for scanned PDFs and images.
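One way to see this structure for yourself is to count what kinds of items the converted document contains. The following is a rough sketch, assuming Docling is installed and that each item yielded by the document's `iterate_items()` carries a `label` such as text, table, formula, or code. `count_labels` and `label_report` are hypothetical helper names introduced here for illustration.

```python
from collections import Counter


def count_labels(doc) -> Counter:
    """Count the items of a converted document by their label."""
    counts = Counter()
    # iterate_items() yields (item, nesting level) pairs in reading order.
    for item, _level in doc.iterate_items():
        label = getattr(item, "label", None)
        # Labels may be enum members; fall back to the raw value otherwise.
        counts[str(getattr(label, "value", label))] += 1
    return counts


def label_report(source: str) -> Counter:
    """Convert `source` (local path or URL) and count its item labels."""
    # Imported lazily so count_labels can be used without Docling installed.
    from docling.document_converter import DocumentConverter

    result = DocumentConverter().convert(source)
    return count_labels(result.document)
```

Running `label_report` on a paper with tables and equations should show separate entries for those item types instead of one undifferentiated block of text.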
🤖 Seamless Integration
Docling is designed to fit right into your existing AI workflows. It offers plug-and-play integrations with popular frameworks:
- Agentic AI: LangChain, LlamaIndex, Crew AI, Haystack.
- Visual Models: Support for VLMs like GraniteDocling.
- MCP Server: Connect to any agent using the Model Context Protocol.
Get Started
Docling works on macOS, Linux, and Windows. Installation is a breeze with pip:
pip install docling
Python Usage
To convert a document, simply use the DocumentConverter:
from docling.document_converter import DocumentConverter
source = "https://arxiv.org/pdf/2408.09869" # Local path or URL
converter = DocumentConverter()
result = converter.convert(source)
print(result.document.export_to_markdown())
CLI Usage
Docling also includes a handy command-line interface for quick conversions:
docling https://arxiv.org/pdf/2206.01062
Why This Tool Rocks
- It Just Works: Handles the messy reality of document parsing so you can focus on building your AI application.
- Open Source: Transparent, customizable, and free to use under the MIT license.
- Unified Output: Converts everything into a consistent format, making downstream processing much simpler.
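To make the unified-output point concrete: once every source becomes Markdown, one small function can prepare any of them for retrieval or prompting. The sketch below is a deliberately simple stand-in for a real chunker. It assumes the exported Markdown uses `#`-style headings, and `split_by_heading` is a hypothetical helper, not a Docling API.

```python
import re

HEADING = re.compile(r"^#{1,6}\s+(.*)$")


def split_by_heading(markdown: str) -> list[tuple[str, str]]:
    """Split Markdown into (heading, body) pairs at '#'-style headings."""
    sections = []
    heading, body = "", []

    def flush():
        text = "\n".join(body).strip()
        # Skip the empty section that precedes the first heading.
        if heading or text:
            sections.append((heading, text))

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            heading, body = match.group(1).strip(), []
        else:
            body.append(line)
    flush()
    return sections


sample = "# Title\n\nIntro text.\n\n## Method\n\nDetails here.\n"
for heading, body in split_by_heading(sample):
    print(f"{heading}: {body}")
```

The same function works on the string returned by `result.document.export_to_markdown()`, whether the source was a PDF, a DOCX, or a scanned image. It does not special-case `#` lines inside code blocks, so treat it as a starting point.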
Crepi il lupo! 🐺