audio-kb: Record, Store and Search Audio Notes from the Terminal

audio-kb: Your Personal Audio Notes Search Engine

Ever wish you could just ask your audio notes a question and get an instant answer? audio-kb (https://github.com/run-llama/audio-kb) is a CLI tool from the makers of LlamaIndex that lets you record, store, and search audio notes from your terminal, using AI transcription and embedding-based semantic search.

Whether it’s meeting recordings, voice memos, or podcast highlights, audio-kb transcribes everything and makes it searchable using vector embeddings. Just ask a question in natural language and find exactly what you said.

Key Features

🎙️ Record or Import Audio

Capture voice notes your way:

Record Directly - Just run audio-kb process and speak into your microphone
Import Files - Process existing MP3 files with audio-kb process --file audio.mp3
Save Recordings - Keep a microphone recording at a path you choose with audio-kb process --recording-file my_note.mp3

📝 Automatic Transcription

Your audio gets processed by LlamaParse, which extracts the full text content:

Accurate Transcription - LlamaParse handles the audio-to-text conversion
Chunking - Content is split into manageable pieces for better search
Embedded Storage - Each chunk is converted to a 3072-dimensional vector
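
To make the chunking step concrete, here is a minimal sketch of overlapping chunks. It is an illustration only: it splits on characters, while the real tool may split on tokens or sentences, and the function name split_with_overlap is hypothetical. The defaults mirror the chunk_size and chunk_overlap values in the sample config.json.

```python
def split_with_overlap(text: str, chunk_size: int = 1024, chunk_overlap: int = 200) -> list[str]:
    """Split text into fixed-size character windows that share an overlap.

    Illustrative only: audio-kb's actual splitter is not shown in this post.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, so a query about it has something to match.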

🔍 Semantic Search

Find exactly what you need using natural language:

audio-kb search "What did I say I would buy tonight for dinner?"
audio-kb search "What are the movies I said I would watch?" --limit 3
audio-kb search "What is the name of the main character?" --json

Vector Search - Uses Gemini Embedding 2 to convert your query into vectors
Cosine Similarity - Matches your query against stored audio content
Flexible Output - Plain text or JSON output
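
For readers who have not met cosine similarity before, the sketch below shows the calculation on plain Python lists. It is a generic textbook version, not code from audio-kb, and the short vectors in the usage note stand in for the 3072-dimensional embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here: a zero vector matches nothing
    return dot / (norm_a * norm_b)
```

A score near 1 means the query and the chunk point in almost the same direction in embedding space, a score near 0 means they are unrelated. For example, cosine_similarity([1, 2, 3], [2, 4, 6]) is 1 up to floating-point rounding because the second vector is just a scaled copy of the first.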

🗄️ Local Vector Storage

All your data stays local with SurrealDB:

HNSW Index - Fast approximate nearest neighbor search
On-Disk Storage - Persistent storage with the RocksDB backend
Local Storage - Transcripts and vectors are stored on your machine; transcription and embedding still call the LlamaParse and Gemini APIs
Docker Support - Run SurrealDB locally or via Docker

How It Works

The pipeline is fully automated:

Input - Record from microphone or import an MP3 file
Transcription - LlamaParse extracts text content from audio
Chunking - Text is split into overlapping chunks
Embedding - Gemini Embedding 2 converts chunks to 3072-dim vectors
Storage - Vectors are written to a local SurrealDB instance with an HNSW index
Search - Your query gets embedded and matched via cosine similarity
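
The storage and search steps above can be pictured with a small in-memory stand-in. This is a sketch under stated assumptions, not the tool's implementation: TinyVectorStore is a hypothetical class, it scans every stored row (exact search), and it takes ready-made vectors instead of calling an embedding model. The real tool delegates this to SurrealDB, whose HNSW index avoids the full scan at the cost of approximate results.

```python
import heapq
import math


class TinyVectorStore:
    """In-memory stand-in for a vector table (hypothetical, not SurrealDB)."""

    def __init__(self) -> None:
        self._rows: list[tuple[list[float], str]] = []  # (unit vector, chunk text)

    @staticmethod
    def _normalize(vec: list[float]) -> list[float]:
        norm = math.sqrt(sum(x * x for x in vec))
        if norm == 0:
            raise ValueError("cannot index or query a zero vector")
        return [x / norm for x in vec]

    def add(self, vector: list[float], text: str) -> None:
        # Normalize once at write time so search reduces to a dot product.
        self._rows.append((self._normalize(vector), text))

    def search(self, query_vector: list[float], limit: int = 3) -> list[tuple[float, str]]:
        q = self._normalize(query_vector)
        scored = (
            (sum(a * b for a, b in zip(q, vec)), text)
            for vec, text in self._rows
        )
        # Highest cosine score first, at most `limit` results.
        return heapq.nlargest(limit, scored, key=lambda pair: pair[0])
```

Because every stored vector has unit length, the dot product in search equals the cosine similarity, and the limit argument plays the same role as the --limit flag on the command line.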

Platforms

🖥️ macOS
🐧 Linux
🪟 Windows

Get Started

Installation

Install from GitHub (Recommended):

uv tool install git+https://github.com/run-llama/audio-kb
audio-kb --help

Install from Source:

git clone https://github.com/run-llama/audio-kb
cd audio-kb/
uv pip install -e .

Set Up SurrealDB

# Install SurrealDB CLI
curl -sSf https://install.surrealdb.com | sh

# Run locally with on-disk storage
surreal start --user root --pass some-password rocksdb://slides.db

Or use Docker:

docker run --rm --pull always -p 8000:8000 -v "$(pwd)/mydata:/mydata" \
  surrealdb/surrealdb:latest start --user root --pass some-password \
  rocksdb:/mydata/mydatabase.db

Configure

Create a config.json in your working directory:

{
  "database": {
    "url": "http://localhost:8000",
    "user": "root",
    "password": "some-password",
    "namespace": "audio-kb",
    "database": "audio"
  },
  "llama_cloud": {
    "llama_cloud_api_key": "$LLAMA_CLOUD_API_KEY"
  },
  "embedding": {
    "api_key": "$GOOGLE_API_KEY",
    "model_name": "gemini-embedding-001"
  },
  "splitter": {
    "chunk_size": 1024,
    "chunk_overlap": 200
  }
}

Set the required environment variables:

export LLAMA_CLOUD_API_KEY=your_key_here
export GOOGLE_API_KEY=your_key_here
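
The sample config refers to the API keys as $LLAMA_CLOUD_API_KEY and $GOOGLE_API_KEY, which suggests the placeholders are filled from the environment. How audio-kb resolves them internally is not shown here; the sketch below is one plausible way to do it with the standard library, using the hypothetical helper names expand_env and load_config.

```python
import json
import os


def expand_env(value):
    """Recursively replace $VAR / ${VAR} placeholders in every string value."""
    if isinstance(value, dict):
        return {key: expand_env(item) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item) for item in value]
    if isinstance(value, str):
        # os.path.expandvars leaves unknown variables untouched.
        return os.path.expandvars(value)
    return value  # numbers, booleans and null pass through unchanged


def load_config(path: str = "config.json") -> dict:
    """Read the JSON config and resolve environment placeholders (assumed behavior)."""
    with open(path, encoding="utf-8") as handle:
        return expand_env(json.load(handle))
```

Keeping the keys in environment variables means config.json can be committed or shared without leaking credentials. If a variable is unset, the placeholder string is left as written, so a value that still starts with $ after loading is a quick sign that an export is missing.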

Usage

# Record and process audio from microphone
audio-kb process

# Process an existing MP3 file
audio-kb process --file meeting.mp3

# Record and save to specific file
audio-kb process --recording-file my_note.mp3

# Search your audio notes
audio-kb search "What did I say about the project?"

# Search with JSON output
audio-kb search "meeting notes" --json

# Limit results
audio-kb search "action items" --limit 5

🔗 GitHub: github.com/run-llama/audio-kb

Why This Tool Rocks

Terminal-First - Everything from recording to searching happens in your CLI
Semantic Search - Find exactly what you said using natural language queries
State-of-the-Art AI - Powered by LlamaParse for transcription and Gemini Embedding 2 for embeddings
Local Storage - SurrealDB keeps your data on your machine, not the cloud
Flexible Input - Record live or import existing audio files
JSON Output - Script-friendly for automation and integration
Vector Similarity - Cosine similarity search for accurate matching
From LlamaIndex - Built by the creators of the popular LlamaIndex framework
Free & Open Source - MIT licensed

Crepi il lupo! 🐺