Burrow: Open-Source RAG-as-a-Service Platform

Burrow: Production-Grade RAG Without the Complexity

Building a production-ready RAG (Retrieval-Augmented Generation) system is hard. You need to parse complex documents, chunk them intelligently, generate embeddings, and implement sophisticated retrieval, and that’s before you even think about scaling or infrastructure. Burrow (https://burrow-io.github.io/) is an open-source RAG-as-a-service platform that handles all of this for you, deploying as a serverless solution in your own AWS environment.

Think of it as the missing middleware between your documents and your AI applications.

Key Features

📄 State-of-the-Art Document Processing

Burrow uses Docling, IBM’s advanced document parsing library, to handle complex layouts that break traditional parsers:

Layout-Aware Parsing - Vision models detect headers, columns, tables, and reading order
HybridChunker - Intelligent chunking that respects document structure and semantic boundaries
OCR Support - Extracts text from images and scanned documents
Table Preservation - Converts tables to Markdown format while maintaining row/column relationships
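
To make "chunking that respects document structure" a bit more concrete, here is a minimal standard-library sketch. It is not Docling's HybridChunker, which works on Docling's own parsed document model; the function name `chunk_by_heading` and the `max_chars` budget are assumptions made purely for illustration. It only shows the core idea: a chunk never crosses a heading boundary, and paragraphs are packed together until a size budget is reached.

```python
def chunk_by_heading(markdown: str, max_chars: int = 500) -> list[dict]:
    """Split Markdown into chunks that never cross a heading boundary.

    Assumes blocks (headings, paragraphs) are separated by blank lines.
    """
    chunks: list[dict] = []
    heading = ""
    buffer: list[str] = []

    def flush() -> None:
        # Emit the paragraphs collected so far as one chunk under the current heading.
        text = "\n\n".join(buffer).strip()
        if text:
            chunks.append({"heading": heading, "text": text})
        buffer.clear()

    for block in markdown.split("\n\n"):
        block = block.strip()
        if not block:
            continue
        if block.startswith("#"):
            # A new heading closes the current chunk and starts a new section.
            flush()
            heading = block.lstrip("#").strip()
            continue
        # Start a new chunk when adding this paragraph would exceed the budget.
        current_len = sum(len(b) for b in buffer)
        if buffer and current_len + len(block) > max_chars:
            flush()
        buffer.append(block)

    flush()
    return chunks
```

Keeping the heading next to each chunk means a retriever can show (or embed) the section title along with the text, which is one reason structure-aware chunking tends to beat fixed-size splitting on long manuals.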

🔍 Production-Ready Retrieval

Advanced retrieval techniques exposed through a simple REST API:

Hybrid Search - Combines vector similarity with keyword matching for better results
Reranking - Cohere reranking model improves result relevance (optional, toggleable)
Metadata Filtering - Narrow searches by document type, date, or custom fields
Two API Modes - /retrieve for raw chunks, /query for synthesized LLM responses
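
The post does not say how Burrow merges vector and keyword results, so treat the following as an illustration of the general idea rather than Burrow's implementation. Reciprocal rank fusion (RRF) is one common way to combine two ranked lists; the function name and the chunk IDs below are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of chunk IDs into a single ranking.

    Each list contributes 1 / (k + rank) per item, so items that rank well
    in more than one list rise to the top. k = 60 is a conventional default.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical result lists from a vector search and a keyword search.
vector_hits = ["a", "b", "c"]
keyword_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

A reranker, when enabled, would then typically rescore only the top of this fused list, which keeps the more expensive model call small.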

⚡ Serverless & Scalable

Built on AWS for automatic scaling and cost efficiency:

Event-Driven Architecture - ECS Fargate tasks spin up per document, scale to zero when idle
95%+ Cost Reduction - Compared to always-on infrastructure
Bursty Workload Support - Handles hundreds of documents during peaks, minimizes costs during quiet periods
Data Sovereignty - Everything stays in your AWS account
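
How the event-driven flow might look in code: the sketch below takes an S3 "Object Created" event in the shape EventBridge delivers it and turns it into the container overrides you would pass when launching a per-document ECS task. The post does not show Burrow's actual wiring, so the container name `ingest-worker` and the environment variable names `DOC_BUCKET` and `DOC_KEY` are hypothetical, and no AWS call is made here.

```python
from typing import Optional


def extract_upload(event: dict) -> Optional[tuple[str, str]]:
    """Return (bucket, key) for an S3 'Object Created' event, else None."""
    if event.get("source") != "aws.s3":
        return None
    if event.get("detail-type") != "Object Created":
        return None
    detail = event.get("detail", {})
    bucket = detail.get("bucket", {}).get("name")
    key = detail.get("object", {}).get("key")
    if not bucket or not key:
        return None
    return bucket, key


def build_task_overrides(bucket: str, key: str, container: str = "ingest-worker") -> dict:
    """Build ECS-style container overrides telling one task which document to process.

    The container name and variable names are assumptions for illustration.
    """
    return {
        "containerOverrides": [
            {
                "name": container,
                "environment": [
                    {"name": "DOC_BUCKET", "value": bucket},
                    {"name": "DOC_KEY", "value": key},
                ],
            }
        ]
    }


# Example event, trimmed to the fields the sketch reads.
event = {
    "source": "aws.s3",
    "detail-type": "Object Created",
    "detail": {"bucket": {"name": "docs"}, "object": {"key": "manual.pdf"}},
}
upload = extract_upload(event)
if upload is not None:
    print(build_task_overrides(*upload))
```

Because each document gets its own short-lived task, nothing runs between uploads, which is where the scale-to-zero savings come from.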

🖥️ Management Dashboard

A web UI for teams to manage the pipeline:

Document Upload - Drag-and-drop or API-based ingestion
Status Monitoring - Real-time progress tracking via Server-Sent Events
Multi-User Support - Team access with JWT authentication
OpenAPI Documentation - Built-in API docs for easy integration
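
If you want to consume the status stream outside the dashboard, Server-Sent Events are simple enough to parse by hand. Below is a minimal standard-library sketch of the wire format (blank line ends an event, `:` starts a comment, `event:` and `data:` fields). The event name `status` and the JSON payload fields are hypothetical; check the OpenAPI docs of your deployment for the real ones.

```python
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per Server-Sent Event from an iterable of text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith(":"):
            # Comment line, often used as a keep-alive ping.
            continue
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # The format strips one leading space.
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


# Hypothetical stream, as it might arrive line by line from an HTTP response.
stream = [
    "event: status\n",
    'data: {"document_id": "doc-1", "state": "parsing"}\n',
    "\n",
    ": keep-alive\n",
    'data: {"document_id": "doc-1", "state": "done"}\n',
    "\n",
]
for evt in parse_sse(stream):
    print(evt["event"], evt["data"])
```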

Tech Stack

Frontend: Tailwind CSS + Single-page application
Pipeline API: Python/FastAPI on ECS
RAG API: Python/FastAPI on ECS (separate service for scaling)
Document Processing: Docling for parsing and chunking
Embeddings: AWS Bedrock Titan Text Embeddings V2
Vector Store: Aurora Serverless PostgreSQL with pgvector
Reranking: Cohere Rerank via Bedrock
Storage: S3 for documents, DynamoDB for metadata
Orchestration: EventBridge for event-driven triggers

Platforms

Burrow deploys to your own AWS environment:

☁️ AWS Cloud
🐳 Docker (for local development)

Get Started

Prerequisites

AWS account with appropriate permissions
Docker and Docker Compose (for local testing)
AWS CLI configured

Deployment

Burrow uses infrastructure-as-code for automated deployment. Visit the official repository for detailed setup instructions:

# Clone the repository
git clone https://github.com/burrow-io/burrow.git
cd burrow

# Follow the deployment guide in the repository
# for AWS infrastructure setup

Using the API

Once deployed, you can interact with Burrow via REST API:

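import requests

BASE_URL = 'https://your-burrow-instance.com'
HEADERS = {'Authorization': 'Bearer YOUR_TOKEN'}

# Upload a document (the context manager closes the file handle afterwards)
with open('manual.pdf', 'rb') as f:
    response = requests.post(
        f'{BASE_URL}/api/documents',
        files={'file': f},
        headers=HEADERS,
    )
response.raise_for_status()  # Fail loudly on 4xx/5xx responses

# Query your knowledge base
response = requests.post(
    f'{BASE_URL}/api/query',
    json={
        'query': 'What is the travel expense policy?',
        'hybrid_search': True,  # Combine vector and keyword search
        'rerank': True,         # Apply the optional reranking step
    },
    headers=HEADERS,
)
response.raise_for_status()
print(response.json())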
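
The feature list also mentions a /retrieve mode for raw chunks and metadata filtering. The sketch below builds such a request with the standard library without sending it, so you can inspect it first. The `/api/retrieve` path is a guess based on the `/api/query` URL in this post, and the `filters` and `top_k` field names are hypothetical; confirm both against the built-in OpenAPI documentation.

```python
import json
import urllib.request
from typing import Optional


def build_retrieve_request(
    base_url: str,
    token: str,
    query: str,
    filters: Optional[dict] = None,
    top_k: int = 5,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for raw chunk retrieval.

    Path and payload field names are assumptions, not Burrow's documented API.
    """
    payload: dict = {"query": query, "top_k": top_k}
    if filters:
        payload["filters"] = filters
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/retrieve",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_retrieve_request(
    "https://your-burrow-instance.com",
    "YOUR_TOKEN",
    "What is the travel expense policy?",
    filters={"document_type": "policy"},
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

To actually send it, pass `req` to `urllib.request.urlopen` and parse the JSON body of the response.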

🔗 Website: burrow.io

🔗 GitHub: github.com/burrow-io

Why This Tool Rocks

Complex Document Handling: Layout-aware parsing actually understands document structure, so no more broken tables or scrambled columns
Cost-Effective Scaling: Event-driven architecture means you only pay when processing documents, not for idle servers
No Vendor Lock-in: Self-hosted in your AWS account with open-source components throughout
Production-Ready: Built-in hybrid search, reranking, and metadata filtering, techniques that usually require significant engineering effort
Team-Friendly: Web UI for non-technical users, REST API for developers, with proper multi-user support

Crepi il lupo! 🐺