ColGrep for Codex and OpenCode: Global macOS Setup
Overview
colgrep is a local semantic code search CLI from LightOn’s NextPlaid stack. It keeps the familiar grep workflow, but adds semantic ranking and hybrid regex + semantic search.
Why this matters for AI agents:
- Runs 100% local (no code upload to remote search APIs)
- Understands intent, not only exact keyword matches
- Supports incremental index updates, so it stays fast while code changes
This guide also covers the common ONNX Runtime error (often misspelled “Onyx”), why it happens, and how to fix it.
1. Install ColGrep Globally on macOS
Use the official installer (this installs to ~/.cargo/bin):
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/lightonai/next-plaid/releases/latest/download/colgrep-installer.sh | sh
Load it in the current shell:
source ~/.cargo/env
Verify:
which colgrep
colgrep --version
If the installer could not write to your shell startup files, add this line manually to ~/.zprofile (or ~/.zshrc):
. "$HOME/.cargo/env"
2. Install Agent Integrations
Official commands:
colgrep --install-codex
colgrep --install-opencode
This writes integration instructions into:
~/.codex/AGENTS.md
~/.config/opencode/AGENTS.md
Restart Codex/OpenCode after installation.
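To confirm that the integration step wrote something, you can look for the marker comments in each file. This is a minimal sketch: has_colgrep_block and check_colgrep_integrations are hypothetical helper names, and it assumes the installer writes the same `<!-- COLGREP_START -->` / `<!-- COLGREP_END -->` markers that appear in the paste-ready blocks in Sections 6 and 7.

```bash
# has_colgrep_block FILE
# Succeeds only when FILE exists and contains both marker comments.
has_colgrep_block() {
  [ -f "$1" ] &&
    grep -qF '<!-- COLGREP_START -->' "$1" &&
    grep -qF '<!-- COLGREP_END -->' "$1"
}

# Print one status line per integration file named in this guide.
check_colgrep_integrations() {
  for f in "$HOME/.codex/AGENTS.md" "$HOME/.config/opencode/AGENTS.md"; do
    if has_colgrep_block "$f"; then
      echo "ok: $f"
    else
      echo "missing: $f"
    fi
  done
}
```

Run check_colgrep_integrations after the two install commands. A "missing" line means the file is absent or does not contain both markers.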
3. Add a Conda-Safe Wrapper (Recommended)
If conda is active in your shell, it can inject an incompatible ONNX runtime and cause colgrep to fail during indexing.
Add this once:
cat >> ~/.zshrc <<'EOF'
# >>> colgrep compatibility helpers >>>
# Avoid conda ONNX runtime conflicts with colgrep.
colgrep() {
env -u CONDA_PREFIX -u ORT_DYLIB_PATH command colgrep "$@"
}
# Optional explicit names / shortcuts
alias cgrep='colgrep'
alias raw_colgrep='command colgrep'
alias cg='colgrep'
alias cgs='colgrep status'
alias cgi='colgrep init .'
alias cgc='colgrep clear'
# <<< colgrep compatibility helpers <<<
EOF
source ~/.zshrc
4. First Use in Any Project
You install colgrep once for your Mac user, then index each repo you care about:
cd /path/to/project
colgrep init -y
colgrep status
colgrep "find retry logic for API calls"
colgrep -e "async fn" "error handling" --include="*.rs"
Useful note: on macOS, indexes are stored outside your repos in:
~/Library/Application Support/colgrep/indices/
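If you work across several repositories, the per-project init step can be wrapped in a small loop. This is a sketch rather than a colgrep feature: index_repos is a hypothetical helper name, and it assumes `colgrep init -y` behaves as in the commands above.

```bash
# index_repos DIR...
# Run `colgrep init -y` inside each directory given as an argument.
index_repos() {
  for repo in "$@"; do
    if [ ! -d "$repo" ]; then
      echo "skip: $repo is not a directory" >&2
      continue
    fi
    # Use a subshell so the caller's working directory is unchanged.
    ( cd "$repo" && colgrep init -y ) || echo "failed: $repo" >&2
  done
}
```

For example, `index_repos ~/code/api ~/code/web` indexes both projects in turn (the paths are placeholders). Directories that do not exist are skipped with a message on stderr.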
5. Troubleshooting the ONNX/“Onyx” Exit 101 Error
Symptom
You may see this while indexing:
🤖 Model: lightonai/LateOn-Code-edge
📂 Building index...
101
No index found for /path/to/project
Run `colgrep <query>` to create one.
Why it happens
When conda is active, colgrep can pick up ONNX runtime from your conda environment (for example, libonnxruntime.1.20.1.dylib) instead of the version it expects (1.23.0). That mismatch can crash model initialization and return exit code 101.
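Before applying a fix, it can help to see whether either variable is set in the shell where colgrep failed. The helpers below are a small diagnostic sketch. report_ort_env and ort_env_is_clean are hypothetical names, and they only inspect the two variables this guide unsets, not the ONNX runtime itself.

```bash
# Print the two variables this guide unsets, or "(unset)" for each.
report_ort_env() {
  printf 'CONDA_PREFIX=%s\n' "${CONDA_PREFIX:-(unset)}"
  printf 'ORT_DYLIB_PATH=%s\n' "${ORT_DYLIB_PATH:-(unset)}"
}

# Succeed only when neither variable is set to a non-empty value.
ort_env_is_clean() {
  [ -z "${CONDA_PREFIX:-}" ] && [ -z "${ORT_DYLIB_PATH:-}" ]
}
```

If report_ort_env shows a conda path, the fixes that follow are likely to apply.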
Quick fix (one-shot)
env -u CONDA_PREFIX -u ORT_DYLIB_PATH colgrep init .
env -u CONDA_PREFIX -u ORT_DYLIB_PATH colgrep status
Permanent fix
Use the wrapper from Section 3. After that, plain colgrep ... commands automatically run with CONDA_PREFIX and ORT_DYLIB_PATH unset.
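To check that the Section 3 wrapper is loaded in the current shell, you can test whether colgrep resolves to a shell function instead of a binary path. This sketch relies on `command -v` printing the bare name for functions, which holds in zsh, bash, and POSIX sh. colgrep_wrapper_loaded is a hypothetical helper name.

```bash
# Succeed when `colgrep` resolves to a shell function (the wrapper)
# rather than to an executable path or to nothing at all.
colgrep_wrapper_loaded() {
  [ "$(command -v colgrep 2>/dev/null)" = "colgrep" ]
}
```

If the check fails in a new terminal, confirm that the block from Section 3 is in ~/.zshrc and that the file is being sourced.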
If you cannot download ONNX runtime from GitHub
If your network blocks runtime download, install a matching ONNX runtime in your current env and point ORT_DYLIB_PATH to it:
pip install -U "onnxruntime==1.23.0"
export ORT_DYLIB_PATH="$CONDA_PREFIX/lib/python3.12/site-packages/onnxruntime/capi/libonnxruntime.1.23.0.dylib"
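colgrep init .
Adjust the Python version in the path if your environment does not use python3.12. If the Section 3 wrapper is loaded, run command colgrep init . (or raw_colgrep init .) for this step, because the wrapper unsets ORT_DYLIB_PATH.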
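Instead of hardcoding the Python version, you can search the environment for the library. This is a sketch under one assumption: that the pip package places the library under an onnxruntime/capi directory, as in the path above. find_ort_dylib is a hypothetical helper name.

```bash
# find_ort_dylib ENV_ROOT
# Print the first libonnxruntime*.dylib found in an onnxruntime/capi
# directory below ENV_ROOT. Returns non-zero when nothing is found.
find_ort_dylib() {
  [ -n "${1:-}" ] || { echo "usage: find_ort_dylib ENV_ROOT" >&2; return 2; }
  match=$(find "$1" -type f \
    -path '*/onnxruntime/capi/libonnxruntime*.dylib' 2>/dev/null |
    sort | head -n 1)
  [ -n "$match" ] && printf '%s\n' "$match"
}
```

For example, `export ORT_DYLIB_PATH="$(find_ort_dylib "$CONDA_PREFIX")"` sets the variable to whichever matching file is found. Check that the version in the file name is the one colgrep expects.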
6. Paste-Ready Full AGENTS.md for Codex
Run this as-is to write ~/.codex/AGENTS.md. The command overwrites any existing file, so back up your current instructions first:
cat > ~/.codex/AGENTS.md <<'EOF'
<!-- COLGREP_START -->
# Semantic Code Search
This repository has `colgrep` installed - a semantic code search CLI.
**Use `colgrep` as your PRIMARY search tool** instead of `Search / Grep / Glob`.
## Quick Reference
```bash
# Basic semantic search
colgrep "<natural language query>" --results 10 # Basic search
colgrep "<query>" -k 25 # Exploration (more results)
colgrep "<query>" ./src/parser # Search in specific folder
colgrep "<query>" ./src/main.rs # Search in specific file
colgrep "<query>" ./src/main.rs ./src/lib.rs # Search in multiple files
colgrep "<query>" ./crate-a ./crate-b # Search multiple directories
# File filtering
colgrep --include="*.rs" "<query>" # Include only .rs files
colgrep --include="src/**/*.rs" "<query>" # Recursive glob pattern
colgrep --include="*.{rs,md}" "<query>" # Multiple file types (brace expansion)
colgrep --exclude="*.test.ts" "<query>" # Exclude test files
colgrep --exclude-dir=vendor "<query>" # Exclude vendor directory
# Pattern-only search (no semantic query needed)
colgrep -e "<pattern>" # Search by pattern only
colgrep -e "async fn" --include="*.rs" # Pattern search with file filter
# Hybrid search (text + semantic)
colgrep -e "<text>" "<semantic query>" # Hybrid: text + semantic
colgrep -e "<regex>" -E "<semantic query>" # Hybrid with extended regex (ERE)
colgrep -e "<literal>" -F "<semantic query>" # Hybrid with fixed string (no regex)
colgrep -e "<word>" -w "<semantic query>" # Hybrid with whole word match
# Output options
colgrep -l "<query>" # List files only
colgrep -n 6 "<query>" # Show 6 context lines (use -n for more context)
colgrep --json "<query>" # JSON output
```
## Grep-Compatible Flags
| Flag | Description | Example |
| --------------- | ------------------------------------------- | -------------------------------------------- |
| `-e <PATTERN>` | Text pattern pre-filter | `colgrep -e "async" "concurrency"` |
| `-E` | Extended regex (ERE) for `-e` | `colgrep -e "async\|await" -E "concurrency"` |
| `-F` | Fixed string (no regex) for `-e` | `colgrep -e "foo[bar]" -F "query"` |
| `-w` | Whole word match for `-e` | `colgrep -e "test" -w "testing"` |
| `-k, --results` | Number of results to return | `colgrep --results 20 "query"` |
| `-n, --lines` | Number of context lines (default: 6) | `colgrep -n 10 "query"` |
| `-l` | List files only | `colgrep -l "authentication"` |
| `-r` | Recursive (default, for compatibility) | `colgrep -r "query"` |
| `--include` | Include files matching pattern (repeatable) | `colgrep --include="*.py" "query"` |
| `--exclude` | Exclude files matching pattern | `colgrep --exclude="*.min.js" "query"` |
| `--exclude-dir` | Exclude directories | `colgrep --exclude-dir=node_modules "query"` |
**Notes:**
- `-F` takes precedence over `-E` (like grep)
- Default exclusions always apply: `.git`, `node_modules`, `target`, `.venv`, `__pycache__`
- When running from a subdirectory, results are restricted to that subdirectory. To search the full project, specify `.` or `..` as the path
- Multiple `--include` patterns use OR logic (matches if file matches any pattern)
- Brace expansion is supported: `*.{rs,md,py}` expands to match all three types
- When `conda` is active, use `env -u CONDA_PREFIX -u ORT_DYLIB_PATH colgrep ...` to avoid ONNX runtime conflicts during indexing
## When to Use What
| Task | Tool |
| ------------------------------- | -------------------------------------------- |
| Find code by intent/description | `colgrep "query" -k 10` |
| Explore/understand a system | `colgrep "query" -k 25` (increase k) |
| Search by pattern only | `colgrep -e "pattern"` (no semantic query) |
| Know text exists, need context | `colgrep -e "text" "semantic query"` |
| Literal text with special chars | `colgrep -e "foo[0]" -F "semantic query"` |
| Whole word match | `colgrep -e "test" -w "testing utilities"` |
| Search in a specific file | `colgrep "query" ./src/main.rs` |
| Search in multiple files | `colgrep "query" ./src/main.rs ./src/lib.rs` |
| Search specific file type | `colgrep --include="*.ext" "query"` |
| Search multiple file types | `colgrep --include="*.{rs,md,py}" "query"` |
| Exclude test files | `colgrep --exclude="*_test.go" "query"` |
| Exclude vendor directories | `colgrep --exclude-dir=vendor "query"` |
| Search in specific directories | `colgrep --include="src/**/*.rs" "query"` |
| Search multiple directories | `colgrep "query" ./src ./lib ./api` |
| Search CI/CD configs | `colgrep --include="**/.github/**/*" "q" .` |
| Need more context lines | `colgrep -n 10 "query"` |
| Exact string/regex match only | Built-in `Grep` tool |
| Find files by name | Built-in `Glob` tool |
## Key Rules
1. **Default to `colgrep`** for any code search
2. **Increase `--results`** (or `-k`) when exploring (20-30 results)
3. **Use `-e`** for hybrid text+semantic filtering
4. **Use `-E`** with `-e` for extended regex (alternation `|`, quantifiers `+?`, grouping `()`)
5. **Use `-F`** with `-e` when pattern contains regex special characters you want literal
6. **Use `-w`** with `-e` to avoid partial matches (e.g., "test" won't match "testing")
7. **Use `--exclude`/`--exclude-dir`** to filter out noise (tests, vendors, generated code)
8. **Use brace expansion** for multiple file types (e.g., `--include="*.{rs,md,py}"`)
9. **Agents should use `colgrep`** - when spawning Task/Explore agents, they should also use colgrep instead of Grep
## Need Help?
Run `colgrep --help` for complete documentation on all flags and options.
<!-- COLGREP_END -->
EOF
7. Paste-Ready Full AGENTS.md for OpenCode
Run this as-is to write ~/.config/opencode/AGENTS.md. The command overwrites any existing file, so back up your current instructions first:
cat > ~/.config/opencode/AGENTS.md <<'EOF'
# OpenCode Agent Tools
<!-- COLGREP_START -->
# Semantic Code Search
This repository has `colgrep` installed - a semantic code search CLI.
**Use `colgrep` as your PRIMARY search tool** instead of `Search / Grep / Glob`.
## Quick Reference
```bash
# Basic semantic search
colgrep "<natural language query>" --results 10 # Basic search
colgrep "<query>" -k 25 # Exploration (more results)
colgrep "<query>" ./src/parser # Search in specific folder
colgrep "<query>" ./src/main.rs # Search in specific file
colgrep "<query>" ./src/main.rs ./src/lib.rs # Search in multiple files
colgrep "<query>" ./crate-a ./crate-b # Search multiple directories
# File filtering
colgrep --include="*.rs" "<query>" # Include only .rs files
colgrep --include="src/**/*.rs" "<query>" # Recursive glob pattern
colgrep --include="*.{rs,md}" "<query>" # Multiple file types (brace expansion)
colgrep --exclude="*.test.ts" "<query>" # Exclude test files
colgrep --exclude-dir=vendor "<query>" # Exclude vendor directory
# Pattern-only search (no semantic query needed)
colgrep -e "<pattern>" # Search by pattern only
colgrep -e "async fn" --include="*.rs" # Pattern search with file filter
# Hybrid search (text + semantic)
colgrep -e "<text>" "<semantic query>" # Hybrid: text + semantic
colgrep -e "<regex>" -E "<semantic query>" # Hybrid with extended regex (ERE)
colgrep -e "<literal>" -F "<semantic query>" # Hybrid with fixed string (no regex)
colgrep -e "<word>" -w "<semantic query>" # Hybrid with whole word match
# Output options
colgrep -l "<query>" # List files only
colgrep -n 6 "<query>" # Show 6 context lines (use -n for more context)
colgrep --json "<query>" # JSON output
```
## Grep-Compatible Flags
| Flag | Description | Example |
| --------------- | ------------------------------------------- | -------------------------------------------- |
| `-e <PATTERN>` | Text pattern pre-filter | `colgrep -e "async" "concurrency"` |
| `-E` | Extended regex (ERE) for `-e` | `colgrep -e "async\|await" -E "concurrency"` |
| `-F` | Fixed string (no regex) for `-e` | `colgrep -e "foo[bar]" -F "query"` |
| `-w` | Whole word match for `-e` | `colgrep -e "test" -w "testing"` |
| `-k, --results` | Number of results to return | `colgrep --results 20 "query"` |
| `-n, --lines` | Number of context lines (default: 6) | `colgrep -n 10 "query"` |
| `-l` | List files only | `colgrep -l "authentication"` |
| `-r` | Recursive (default, for compatibility) | `colgrep -r "query"` |
| `--include` | Include files matching pattern (repeatable) | `colgrep --include="*.py" "query"` |
| `--exclude` | Exclude files matching pattern | `colgrep --exclude="*.min.js" "query"` |
| `--exclude-dir` | Exclude directories | `colgrep --exclude-dir=node_modules "query"` |
**Notes:**
- `-F` takes precedence over `-E` (like grep)
- Default exclusions always apply: `.git`, `node_modules`, `target`, `.venv`, `__pycache__`
- When running from a subdirectory, results are restricted to that subdirectory. To search the full project, specify `.` or `..` as the path
- Multiple `--include` patterns use OR logic (matches if file matches any pattern)
- Brace expansion is supported: `*.{rs,md,py}` expands to match all three types
- When `conda` is active, use `env -u CONDA_PREFIX -u ORT_DYLIB_PATH colgrep ...` to avoid ONNX runtime conflicts during indexing
## When to Use What
| Task | Tool |
| ------------------------------- | -------------------------------------------- |
| Find code by intent/description | `colgrep "query" -k 10` |
| Explore/understand a system | `colgrep "query" -k 25` (increase k) |
| Search by pattern only | `colgrep -e "pattern"` (no semantic query) |
| Know text exists, need context | `colgrep -e "text" "semantic query"` |
| Literal text with special chars | `colgrep -e "foo[0]" -F "semantic query"` |
| Whole word match | `colgrep -e "test" -w "testing utilities"` |
| Search in a specific file | `colgrep "query" ./src/main.rs` |
| Search in multiple files | `colgrep "query" ./src/main.rs ./src/lib.rs` |
| Search specific file type | `colgrep --include="*.ext" "query"` |
| Search multiple file types | `colgrep --include="*.{rs,md,py}" "query"` |
| Exclude test files | `colgrep --exclude="*_test.go" "query"` |
| Exclude vendor directories | `colgrep --exclude-dir=vendor "query"` |
| Search in specific directories | `colgrep --include="src/**/*.rs" "query"` |
| Search multiple directories | `colgrep "query" ./src ./lib ./api` |
| Search CI/CD configs | `colgrep --include="**/.github/**/*" "q" .` |
| Need more context lines | `colgrep -n 10 "query"` |
| Exact string/regex match only | Built-in `Grep` tool |
| Find files by name | Built-in `Glob` tool |
## Key Rules
1. **Default to `colgrep`** for any code search
2. **Increase `--results`** (or `-k`) when exploring (20-30 results)
3. **Use `-e`** for hybrid text+semantic filtering
4. **Use `-E`** with `-e` for extended regex (alternation `|`, quantifiers `+?`, grouping `()`)
5. **Use `-F`** with `-e` when pattern contains regex special characters you want literal
6. **Use `-w`** with `-e` to avoid partial matches (e.g., "test" won't match "testing")
7. **Use `--exclude`/`--exclude-dir`** to filter out noise (tests, vendors, generated code)
8. **Use brace expansion** for multiple file types (e.g., `--include="*.{rs,md,py}"`)
9. **Agents should use `colgrep`** - when spawning Task/Explore agents, they should also use colgrep instead of Grep
## Need Help?
Run `colgrep --help` for complete documentation on all flags and options.
<!-- COLGREP_END -->
EOF
8. Recommended Agent Workflow
- Start with a semantic query.
- Add -e when you know part of a pattern and want hybrid filtering.
- Keep output lean first, then open files for deeper reads.
This usually gives better relevance while reducing context noise for the agent.
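The workflow above can be sketched as a small helper. search_step is a hypothetical name. Combining -l with -e is an assumption based on the flag tables in Sections 6 and 7, so check colgrep --help before relying on it.

```bash
# search_step QUERY [PATTERN]
# Lean first pass: list matching files only (-l).
# With PATTERN, add -e for hybrid text + semantic filtering.
search_step() {
  if [ -n "${2:-}" ]; then
    colgrep -l -e "$2" "$1"
  else
    colgrep -l "$1"
  fi
}
```

A typical sequence is `search_step "find retry logic for API calls"`, then `search_step "error handling" "async fn"` once you know part of the pattern. After that, open the listed files for a deeper read.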
References
- Hugging Face announcement: LateOn-Code & ColGrep
- Official repository: lightonai/next-plaid
- Official CLI docs: ColGrep README