Burrow: Open-Source RAG-as-a-Service Platform
Burrow: Production-Grade RAG Without the Complexity
Building a production-ready RAG (Retrieval-Augmented Generation) system is hard. You need to parse complex documents, chunk them intelligently, generate embeddings, and implement sophisticated retrieval-and that’s before you even think about scaling or infrastructure. Burrow https://burrow-io.github.io/ is an open-source RAG-as-a-service platform that handles all of this for you, deploying as a serverless solution in your own AWS environment.
Think of it as the missing middleware between your documents and your AI applications.
Key Features
📄 State-of-the-Art Document Processing
Burrow uses Docling, IBM’s advanced document parsing library, to handle complex layouts that break traditional parsers:
- Layout-Aware Parsing - Vision models detect headers, columns, tables, and reading order
- HybridChunker - Intelligent chunking that respects document structure and semantic boundaries
- OCR Support - Extracts text from images and scanned documents
- Table Preservation - Converts tables to Markdown format while maintaining row/column relationships
🔍 Production-Ready Retrieval
Advanced retrieval techniques exposed through a simple REST API:
- Hybrid Search - Combines vector similarity with keyword matching for better results
- Reranking - Cohere reranking model improves result relevance (optional, toggleable)
- Metadata Filtering - Narrow searches by document type, date, or custom fields
- Two API Modes -
/retrievefor raw chunks,/queryfor synthesized LLM responses
⚡ Serverless & Scalable
Built on AWS for automatic scaling and cost efficiency:
- Event-Driven Architecture - ECS Fargate tasks spin up per document, scale to zero when idle
- 95%+ Cost Reduction - Compared to always-on infrastructure
- Bursty Workload Support - Handles hundreds of documents during peaks, minimizes costs during quiet periods
- Data Sovereignty - Everything stays in your AWS account
🖥️ Management Dashboard
A web UI for teams to manage the pipeline:
- Document Upload - Drag-and-drop or API-based ingestion
- Status Monitoring - Real-time progress tracking via Server-Sent Events
- Multi-User Support - Team access with JWT authentication
- OpenAPI Documentation - Built-in API docs for easy integration
Tech Stack
- Frontend: Tailwind CSS + Single-page application
- Pipeline API: Python/FastAPI on ECS
- RAG API: Python/FastAPI on ECS (separate service for scaling)
- Document Processing: Docling for parsing and chunking
- Embeddings: AWS Bedrock Titan Text Embeddings V2
- Vector Store: Aurora Serverless PostgreSQL with pgvector
- Reranking: Cohere Rerank via Bedrock
- Storage: S3 for documents, DynamoDB for metadata
- Orchestration: EventBridge for event-driven triggers
Platforms
Burrow deploys to your own AWS environment:
- ☁️ AWS Cloud
- 🐳 Docker (for local development)
Get Started
Prerequisites
- AWS account with appropriate permissions
- Docker and Docker Compose (for local testing)
- AWS CLI configured
Deployment
Burrow uses infrastructure-as-code for automated deployment. Visit the official repository for detailed setup instructions:
# Clone the repository
git clone https://github.com/burrow-io/burrow.git
cd burrow
# Follow the deployment guide in the repository
# for AWS infrastructure setupUsing the API
Once deployed, you can interact with Burrow via REST API:
import requests
# Upload a document
files = {'file': open('manual.pdf', 'rb')}
response = requests.post(
'https://your-burrow-instance.com/api/documents',
files=files,
headers={'Authorization': 'Bearer YOUR_TOKEN'}
)
# Query your knowledge base
response = requests.post(
'https://your-burrow-instance.com/api/query',
json={
'query': 'What is the travel expense policy?',
'hybrid_search': True,
'rerank': True
},
headers={'Authorization': 'Bearer YOUR_TOKEN'}
)🔗 Website: burrow.io
🔗 GitHub: github.com/burrow-io
Why This Tool Rocks
- Complex Document Handling: Layout-aware parsing actually understands document structure-no more broken tables or scrambled columns
- Cost-Effective Scaling: Event-driven architecture means you only pay when processing documents, not for idle servers
- No Vendor Lock-in: Self-hosted in your AWS account with open-source components throughout
- Production-Ready: Built-in hybrid search, reranking, and metadata filtering-techniques that usually require significant engineering effort
- Team-Friendly: Web UI for non-technical users, REST API for developers, with proper multi-user support
Crepi il lupo! 🐺