Burrow: Open-Source RAG-as-a-Service Platform

Burrow: Production-Grade RAG Without the Complexity

Building a production-ready RAG (Retrieval-Augmented Generation) system is hard. You need to parse complex documents, chunk them intelligently, generate embeddings, and implement sophisticated retrieval, and that’s before you even think about scaling or infrastructure. Burrow (https://burrow-io.github.io/) is an open-source RAG-as-a-service platform that handles all of this for you, deploying as a serverless solution in your own AWS environment.

Think of it as the missing middleware between your documents and your AI applications.

Key Features

📄 State-of-the-Art Document Processing

Burrow uses Docling, IBM’s advanced document parsing library, to handle complex layouts that break traditional parsers:

  • Layout-Aware Parsing - Vision models detect headers, columns, tables, and reading order
  • HybridChunker - Intelligent chunking that respects document structure and semantic boundaries
  • OCR Support - Extracts text from images and scanned documents
  • Table Preservation - Converts tables to Markdown format while maintaining row/column relationships
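
To make structure-aware chunking concrete, here is a minimal sketch in plain Python. It is not Docling's HybridChunker and not Burrow's code; the function name `chunk_markdown` and the `max_chars` budget are assumptions made purely for illustration. It treats blank-line-separated blocks as atomic units, starts a new chunk at every heading, and never cuts a table block in half.

```python
def chunk_markdown(text: str, max_chars: int = 500) -> list[str]:
    """Illustrative chunker: new chunk at each heading, table blocks kept whole."""
    # Blocks are separated by blank lines. A Markdown table sits on
    # consecutive lines, so it always ends up inside a single block.
    blocks = [b.strip() for b in text.split("\n\n") if b.strip()]

    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for block in blocks:
        starts_section = block.startswith("#")
        too_big = size + len(block) > max_chars
        # Do not strand a heading on its own just because the next block is large.
        only_heading = len(current) == 1 and current[0].startswith("#")
        if current and (starts_section or (too_big and not only_heading)):
            chunks.append("\n\n".join(current))
            current, size = [], 0
        current.append(block)
        size += len(block)
    if current:
        chunks.append("\n\n".join(current))
    return chunks


sample = (
    "# Expenses\n\nIntro text.\n\n"
    "| item | limit |\n|---|---|\n| hotel | 200 |\n\n"
    "# Travel\n\nMore."
)
for chunk in chunk_markdown(sample, max_chars=1000):
    print(repr(chunk))
```

A real chunker would also count tokens rather than characters and carry heading context into each chunk; this sketch only shows why block boundaries matter.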

🔍 Production-Ready Retrieval

Advanced retrieval techniques exposed through a simple REST API:

  • Hybrid Search - Combines vector similarity with keyword matching for better results
  • Reranking - Cohere reranking model improves result relevance (optional, toggleable)
  • Metadata Filtering - Narrow searches by document type, date, or custom fields
  • Two API Modes - /retrieve for raw chunks, /query for synthesized LLM responses
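
The feature list does not say how Burrow merges vector and keyword results. One widely used approach is Reciprocal Rank Fusion (RRF), sketched below as an illustration of the general idea rather than as Burrow's actual implementation. The function name, the chunk IDs, and the constant `k = 60` are assumptions for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked ID lists; each hit contributes 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a vector search and a keyword search.
vector_hits = ["chunk-a", "chunk-b", "chunk-c"]
keyword_hits = ["chunk-b", "chunk-d", "chunk-a"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Chunks that appear in both lists rise to the top, which is the practical benefit of hybrid search: a passage that matches both semantically and lexically outranks one that matches only one way.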

⚡ Serverless & Scalable

Built on AWS for automatic scaling and cost efficiency:

  • Event-Driven Architecture - ECS Fargate tasks spin up per document, scale to zero when idle
  • 95%+ Cost Reduction - Claimed savings relative to always-on infrastructure, because processing tasks run only when documents arrive
  • Bursty Workload Support - Handles hundreds of documents during peaks, minimizes costs during quiet periods
  • Data Sovereignty - Everything stays in your AWS account
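
As a rough sketch of what "one Fargate task per document" could look like, the function below turns an S3 "Object Created" event delivered by EventBridge into the parameters for an ECS RunTask call. The wiring is an assumption: the source only states that the pipeline is event-driven. The function name, the environment variable names `DOCUMENT_BUCKET` and `DOCUMENT_KEY`, and all resource names in the example are hypothetical.

```python
def build_run_task_params(
    event: dict,
    cluster: str,
    task_definition: str,
    container_name: str,
    subnets: list[str],
) -> dict:
    """Map an EventBridge S3 'Object Created' event to ECS RunTask parameters."""
    detail = event["detail"]
    bucket = detail["bucket"]["name"]
    key = detail["object"]["key"]
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "launchType": "FARGATE",
        "count": 1,  # one task per uploaded document
        "networkConfiguration": {
            "awsvpcConfiguration": {"subnets": subnets, "assignPublicIp": "DISABLED"}
        },
        "overrides": {
            "containerOverrides": [
                {
                    "name": container_name,
                    # Hypothetical variable names telling the task what to process.
                    "environment": [
                        {"name": "DOCUMENT_BUCKET", "value": bucket},
                        {"name": "DOCUMENT_KEY", "value": key},
                    ],
                }
            ]
        },
    }


# Trimmed-down example event; real events carry more fields.
sample_event = {
    "detail-type": "Object Created",
    "detail": {"bucket": {"name": "my-docs"}, "object": {"key": "manual.pdf"}},
}
params = build_run_task_params(
    sample_event, "burrow-cluster", "doc-processor", "processor", ["subnet-123"]
)
# With boto3 installed, these could be passed on as:
#   boto3.client("ecs").run_task(**params)
```

Because a task is only started when an event arrives, nothing runs (and no task is billed) while the bucket is quiet, which is what "scale to zero" means in practice.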

🖥️ Management Dashboard

A web UI for teams to manage the pipeline:

  • Document Upload - Drag-and-drop or API-based ingestion
  • Status Monitoring - Real-time progress tracking via Server-Sent Events
  • Multi-User Support - Team access with JWT authentication
  • OpenAPI Documentation - Built-in API docs for easy integration
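
Server-Sent Events are plain text over HTTP: each event is a group of `event:` and `data:` lines terminated by a blank line. The sketch below is a minimal parser for that wire format, useful if you want to follow document status outside the dashboard. The event name `status` and the JSON fields in the sample are assumptions; the actual payload Burrow emits is not documented here.

```python
import json
from typing import Iterable, Iterator


def _field_value(line: str, name: str) -> str:
    """Return the text after 'name:', dropping one optional leading space."""
    value = line[len(name) + 1:]
    return value[1:] if value.startswith(" ") else value


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield {'event': ..., 'data': ...} dicts from raw SSE lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("event:"):
            event = _field_value(line, "event")
        elif line.startswith("data:"):
            data.append(_field_value(line, "data"))


# Hypothetical stream for one document moving through the pipeline.
stream = [
    "event: status\n",
    'data: {"document_id": "doc-1", "state": "processing"}\n',
    "\n",
    ": keep-alive\n",
    "event: status\n",
    'data: {"document_id": "doc-1", "state": "completed"}\n',
    "\n",
]
events = list(parse_sse(stream))
for e in events:
    print(e["event"], json.loads(e["data"])["state"])
```

Against a live server, the same parser can consume `requests.get(url, stream=True).iter_lines(decode_unicode=True)`, where `url` is whatever status endpoint your deployment exposes.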

Tech Stack

  • Frontend: Tailwind CSS + Single-page application
  • Pipeline API: Python/FastAPI on ECS
  • RAG API: Python/FastAPI on ECS (separate service for scaling)
  • Document Processing: Docling for parsing and chunking
  • Embeddings: AWS Bedrock Titan Text Embeddings V2
  • Vector Store: Aurora Serverless PostgreSQL with pgvector
  • Reranking: Cohere Rerank via Bedrock
  • Storage: S3 for documents, DynamoDB for metadata
  • Orchestration: EventBridge for event-driven triggers

Platforms

Burrow deploys to your own AWS environment:

  • ☁️ AWS Cloud
  • 🐳 Docker (for local development)

Get Started

Prerequisites

  • AWS account with appropriate permissions
  • Docker and Docker Compose (for local testing)
  • AWS CLI configured

Deployment

Burrow uses infrastructure-as-code for automated deployment. Visit the official repository for detailed setup instructions:

# Clone the repository
git clone https://github.com/burrow-io/burrow.git
cd burrow

# Follow the deployment guide in the repository
# for AWS infrastructure setup

Using the API

Once deployed, you can interact with Burrow via REST API:

import requests

BASE_URL = 'https://your-burrow-instance.com'
HEADERS = {'Authorization': 'Bearer YOUR_TOKEN'}

# Upload a document; the context manager closes the file handle afterwards
with open('manual.pdf', 'rb') as f:
    response = requests.post(
        f'{BASE_URL}/api/documents',
        files={'file': f},
        headers=HEADERS,
        timeout=60,
    )
response.raise_for_status()  # fail fast on HTTP errors

# Query your knowledge base
response = requests.post(
    f'{BASE_URL}/api/query',
    json={
        'query': 'What is the travel expense policy?',
        'hybrid_search': True,
        'rerank': True
    },
    headers=HEADERS,
    timeout=60,
)
response.raise_for_status()
print(response.json())
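
The /retrieve mode and metadata filtering mentioned earlier can be combined in the same way. The helper below is a hedged sketch of how such a request body might be assembled: `query`, `hybrid_search`, and `rerank` mirror the example above, while `filters`, `top_k`, and the `/api/retrieve` path are assumptions, so check the built-in OpenAPI docs of your deployment for the real schema.

```python
from typing import Optional


def build_retrieve_payload(
    query: str,
    *,
    filters: Optional[dict] = None,
    hybrid_search: bool = True,
    rerank: bool = False,
    top_k: int = 5,
) -> dict:
    """Assemble a JSON body for a raw-chunk retrieval request (hypothetical schema)."""
    if not query.strip():
        raise ValueError("query must not be empty")
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    payload = {
        "query": query,
        "hybrid_search": hybrid_search,
        "rerank": rerank,
        "top_k": top_k,  # assumed field name
    }
    if filters:
        payload["filters"] = dict(filters)  # assumed field name
    return payload


payload = build_retrieve_payload(
    "What is the travel expense policy?",
    filters={"document_type": "policy"},
    rerank=True,
)
print(payload)

# Sending it would look like this (path is an assumption, by analogy with /api/query):
#   requests.post("https://your-burrow-instance.com/api/retrieve",
#                 json=payload,
#                 headers={"Authorization": "Bearer YOUR_TOKEN"},
#                 timeout=60)
```

Use /retrieve when your application wants to build its own prompt from the raw chunks, and /query when you want Burrow to return a synthesized answer.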

🔗 Website: burrow.io

🔗 GitHub: github.com/burrow-io

Why This Tool Rocks

  • Complex Document Handling: Layout-aware parsing actually understands document structure, so no more broken tables or scrambled columns
  • Cost-Effective Scaling: Event-driven architecture means you only pay when processing documents, not for idle servers
  • No Vendor Lock-in: Self-hosted in your AWS account with open-source components throughout
  • Production-Ready: Built-in hybrid search, reranking, and metadata filtering, techniques that usually require significant engineering effort
  • Team-Friendly: Web UI for non-technical users, REST API for developers, with proper multi-user support

Crepi il lupo! 🐺