audio-kb: Record, Store and Search Audio Notes from the Terminal
audio-kb: Your Personal Audio Notes Search Engine
Ever wish you could just ask your audio notes a question and get an instant answer? audio-kb (https://github.com/run-llama/audio-kb) is a CLI tool from the makers of LlamaIndex that lets you record, store, and search audio notes from your terminal, using state-of-the-art AI for transcription and semantic search.
Whether it’s meeting recordings, voice memos, or podcast highlights, audio-kb transcribes everything and makes it searchable using vector embeddings. Just ask a question in natural language and find exactly what you said.
Key Features
🎙️ Record or Import Audio
Capture voice notes your way:
- Record Directly - Just run audio-kb process and speak into your microphone
- Import Files - Process existing MP3 files with audio-kb process --file audio.mp3
- Save Recordings - Save terminal recordings to a specific file for future reference
📝 Automatic Transcription
Your audio gets processed by LlamaParse, which extracts the full text content:
- Accurate Transcription - LlamaParse handles the audio-to-text conversion
- Chunking - Content is split into manageable pieces for better search
- Embedded Storage - Each chunk is converted to a 3072-dimensional vector
🔍 Semantic Search
Find exactly what you need using natural language:
audio-kb search "What did I say I would buy tonight for dinner?"
audio-kb search "What are the movies I said I would watch?" --limit 3
audio-kb search "What is the name of the main character?" --json
- Vector Search - Uses Gemini Embedding 2 to convert your query into vectors
- Cosine Similarity - Matches your query against stored audio content
- Flexible Output - Plain text or JSON output
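To make the matching step concrete, here is a minimal sketch of cosine-similarity ranking in plain Python. It is an illustration under assumptions, not audio-kb's code: the three-dimensional vectors stand in for real 3072-dimensional embeddings, `top_matches` is a hypothetical helper, and the scan is an exact brute-force comparison rather than the approximate HNSW lookup that SurrealDB performs.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def top_matches(
    query_vec: Sequence[float],
    stored: Sequence[Tuple[str, Sequence[float]]],
    limit: int = 3,
) -> List[Tuple[float, str]]:
    """Score every stored (text, vector) pair and keep the best `limit`."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in stored]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


# Toy data: hypothetical chunks with made-up 3-dimensional "embeddings".
notes = [
    ("buy pasta for dinner", [1.0, 0.0, 0.0]),
    ("watch a movie this weekend", [0.0, 1.0, 0.0]),
    ("buy tomatoes too", [0.9, 0.1, 0.0]),
]
results = top_matches([1.0, 0.0, 0.0], notes, limit=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

The `limit` argument plays the same role as the `--limit` flag: it only truncates the ranked list, it does not change how the scores are computed.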
🗄️ Local Vector Storage
All your data stays local with SurrealDB:
- HNSW Index - Fast approximate nearest neighbor search
- On-Disk Storage - Persistent storage with the RocksDB backend
- Local Only - Transcripts and vectors are stored on your machine; only transcription (LlamaParse) and embedding (Gemini) call external APIs
- Docker Support - Run SurrealDB locally or via Docker
How It Works
The pipeline is fully automated:
- Input - Record from microphone or import an MP3 file
- Transcription - LlamaParse extracts text content from audio
- Chunking - Text is split into overlapping chunks
- Embedding - Gemini Embedding 2 converts chunks to 3072-dim vectors
- Storage - Vectors are stored in a local SurrealDB instance with HNSW indexing
- Search - Your query gets embedded and matched via cosine similarity
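The chunking step is the easiest one to picture in code. The sketch below splits text into overlapping windows, assuming sizes are counted in characters. That is an assumption made for illustration: the splitter audio-kb actually uses may count tokens or respect sentence boundaries, and `split_with_overlap` is a hypothetical name.

```python
from typing import List


def split_with_overlap(
    text: str, chunk_size: int = 1024, chunk_overlap: int = 200
) -> List[str]:
    """Split `text` into windows of `chunk_size` characters, where each
    window repeats the last `chunk_overlap` characters of the previous one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


print(split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2))
# -> ['abcd', 'cdef', 'efgh', 'ghij']
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.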
Platforms
- 🖥️ macOS
- 🐧 Linux
- 🪟 Windows
Get Started
Installation
Install from GitHub (Recommended):
uv tool install git+https://github.com/run-llama/audio-kb
audio-kb --help
Install from Source:
git clone https://github.com/run-llama/audio-kb
cd audio-kb/
uv pip install -e .
Set Up SurrealDB
# Install SurrealDB CLI
curl -sSf https://install.surrealdb.com | sh
# Run locally with on-disk backup
surreal start --user root --pass some-password rocksdb://slides.db
Or use Docker:
docker run --rm --pull always -p 8000:8000 -v $(pwd)/mydata:/mydata \
surrealdb/surrealdb:latest start --user root --pass some-password \
rocksdb:/mydata/mydatabase.db
Configure
Create a config.json in your working directory:
{
"database": {
"url": "http://localhost:8000",
"user": "root",
"password": "some-password",
"namespace": "audio-kb",
"database": "audio"
},
"llama_cloud": {
"llama_cloud_api_key": "$LLAMA_CLOUD_API_KEY"
},
"embedding": {
"api_key": "$GOOGLE_API_KEY",
"model_name": "gemini-embedding-001"
},
"splitter": {
"chunk_size": 1024,
"chunk_overlap": 200
}
}
Set the required environment variables:
export LLAMA_CLOUD_API_KEY=your_key_here
export GOOGLE_API_KEY=your_key_here
Usage
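Before running the commands, it can be worth checking that every `$NAME` placeholder in config.json has a matching environment variable. The sketch below is one way to do that, not a description of audio-kb's internals: how the tool itself resolves these placeholders is not stated here, and `resolve_placeholders` and `load_config` are hypothetical helpers.

```python
import json
import os
from typing import Any, Mapping, Optional


def resolve_placeholders(value: Any, env: Mapping[str, str]) -> Any:
    """Recursively replace strings of the form "$NAME" with env["NAME"].

    Raises KeyError if a referenced variable is not set, so a missing key
    is caught before any recording or search is attempted.
    """
    if isinstance(value, dict):
        return {key: resolve_placeholders(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [resolve_placeholders(item, env) for item in value]
    if isinstance(value, str) and value.startswith("$"):
        name = value[1:]
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]
    return value  # numbers, plain strings, etc. pass through unchanged


def load_config(
    path: str = "config.json", env: Optional[Mapping[str, str]] = None
) -> Any:
    """Read the JSON config and resolve its placeholders against `env`."""
    with open(path, encoding="utf-8") as handle:
        raw = json.load(handle)
    return resolve_placeholders(raw, os.environ if env is None else env)
```

Calling `load_config()` from the directory that holds config.json either returns the resolved settings or fails with the name of the variable that still needs to be exported.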
# Record and process audio from microphone
audio-kb process
# Process an existing MP3 file
audio-kb process --file meeting.mp3
# Record and save to specific file
audio-kb process --recording-file my_note.mp3
# Search your audio notes
audio-kb search "What did I say about the project?"
# Search with JSON output
audio-kb search "meeting notes" --json
# Limit results
audio-kb search "action items" --limit 5
🔗 GitHub: github.com/run-llama/audio-kb
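Since `--json` is meant for scripting, the search command can also be wrapped from Python. The sketch below rests on a few assumptions: audio-kb is on your PATH, `--json` prints a single JSON document to stdout, and `--json` can be combined with `--limit`. The shape of that JSON is not documented here, so `run_search` (a hypothetical helper) returns whatever it parses without assuming any field names.

```python
import json
import subprocess
from typing import Any, List, Optional


def build_search_command(query: str, limit: Optional[int] = None) -> List[str]:
    """Assemble the argument list for `audio-kb search ... --json`."""
    command = ["audio-kb", "search", query, "--json"]
    if limit is not None:
        command += ["--limit", str(limit)]
    return command


def run_search(query: str, limit: Optional[int] = None) -> Any:
    """Run the CLI and parse its stdout as JSON.

    Raises subprocess.CalledProcessError if the command exits with an error
    and json.JSONDecodeError if the output is not valid JSON.
    """
    completed = subprocess.run(
        build_search_command(query, limit),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)


print(build_search_command("action items", limit=5))
```

Passing the query as a list element rather than through a shell string avoids quoting problems with questions that contain apostrophes or quotes.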
Why This Tool Rocks
- Terminal-First - Everything from recording to searching happens in your CLI
- Semantic Search - Find exactly what you said using natural language queries
- State-of-the-Art AI - Powered by LlamaParse for transcription and Gemini Embedding 2 for embeddings
- Local Storage - SurrealDB keeps your data on your machine, not the cloud
- Flexible Input - Record live or import existing audio files
- JSON Output - Script-friendly for automation and integration
- Vector Similarity - Cosine similarity search for accurate matching
- From LlamaIndex - Built by the creators of the popular LlamaIndex framework
- Free & Open Source - MIT licensed
Crepi il lupo! 🐺