ColGrep for Codex and OpenCode: Global macOS Setup


Overview

colgrep is a local semantic code search CLI from LightOn’s NextPlaid stack. It keeps the familiar grep workflow, but adds semantic ranking and hybrid regex + semantic search.

Why this matters for AI agents:

  • Runs 100% local (no code upload to remote search APIs)
  • Understands intent, not only exact keyword matches
  • Supports incremental index updates, so it stays fast while code changes

This guide also covers the common ONNX runtime error (often misspelled “Onyx”): why it happens and how to fix it.

1. Install ColGrep Globally on macOS

Use the official installer (this installs to ~/.cargo/bin):

curl --proto '=https' --tlsv1.2 -LsSf https://github.com/lightonai/next-plaid/releases/latest/download/colgrep-installer.sh | sh

Load it in the current shell:

source ~/.cargo/env

Verify:

which colgrep
colgrep --version
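
If you want a check that explains itself when it fails, a minimal sketch follows. The helper name check_on_path is made up for this guide and is not part of colgrep; it only tests whether a command is reachable from the current shell.

```bash
# Sketch: report whether a command is reachable on PATH.
# check_on_path is a hypothetical helper name, not a colgrep feature.
check_on_path() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1 found at: $(command -v "$1")"
  else
    echo "$1 not found on PATH; try: source ~/.cargo/env" >&2
    return 1
  fi
}

check_on_path colgrep || true
```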

If the installer could not update your shell startup files, add this line manually to ~/.zprofile (or ~/.zshrc):

. "$HOME/.cargo/env"

2. Install Agent Integrations

Official commands:

colgrep --install-codex
colgrep --install-opencode

These commands write integration instructions into:

  • ~/.codex/AGENTS.md
  • ~/.config/opencode/AGENTS.md

Restart Codex/OpenCode after installation.
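
To confirm that both files were written, you can look for the colgrep marker in each. This sketch assumes the installer writes the same COLGREP_START marker that appears in the paste-ready files later in this guide; has_colgrep_block is a hypothetical helper name.

```bash
# Sketch: check whether a file contains the colgrep instruction block.
# Assumes the block is delimited by a COLGREP_START marker, as in the
# paste-ready AGENTS.md files shown in this guide.
has_colgrep_block() {
  [ -f "$1" ] && grep -q 'COLGREP_START' "$1"
}

for f in "$HOME/.codex/AGENTS.md" "$HOME/.config/opencode/AGENTS.md"; do
  if has_colgrep_block "$f"; then
    echo "ok: $f"
  else
    echo "missing colgrep block: $f"
  fi
done
```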

3. Add a Conda-Safe Wrapper (Recommended)

If conda is active in your shell, it can inject an incompatible ONNX runtime and cause colgrep to fail during indexing.

Run this once; it appends to ~/.zshrc, so running it again would add a duplicate block:

cat >> ~/.zshrc <<'EOF'

# >>> colgrep compatibility helpers >>>
# Avoid conda ONNX runtime conflicts with colgrep.
colgrep() {
  env -u CONDA_PREFIX -u ORT_DYLIB_PATH command colgrep "$@"
}

# Optional explicit names / shortcuts
alias cgrep='colgrep'
alias raw_colgrep='command colgrep'
alias cg='colgrep'
alias cgs='colgrep status'
alias cgi='colgrep init .'
alias cgc='colgrep clear'
# <<< colgrep compatibility helpers <<<
EOF

source ~/.zshrc
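
The wrapper relies on env -u, which starts the child process with the named variables removed. The sketch below shows that effect with a throwaway variable (CONDA_PREFIX_DEMO is invented for illustration), so it is safe to run whether or not conda is installed.

```bash
# Sketch: show that `env -u NAME` hides NAME from the child process only.
# CONDA_PREFIX_DEMO is a made-up variable used purely for this demo.
export CONDA_PREFIX_DEMO="/opt/fake-conda"

with_var=$(sh -c 'echo "${CONDA_PREFIX_DEMO:-unset}"')
without_var=$(env -u CONDA_PREFIX_DEMO sh -c 'echo "${CONDA_PREFIX_DEMO:-unset}"')

echo "normal child: $with_var"      # prints: normal child: /opt/fake-conda
echo "env -u child: $without_var"   # prints: env -u child: unset
```

Your interactive shell keeps its variables, so conda itself continues to work. To confirm the wrapper is loaded, type colgrep in a new shell should report a function rather than a file path.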

4. First Use in Any Project

You install colgrep once for your Mac user, then index each repo you care about:

cd /path/to/project
colgrep init -y
colgrep status
colgrep "find retry logic for API calls"
colgrep -e "async fn" "error handling" --include="*.rs"

Useful note: indexes are stored outside your repos on macOS in:

~/Library/Application Support/colgrep/indices/
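
Because that path contains a space, it needs quoting in scripts. A small sketch for checking how much disk the indexes use follows; list_indices is a hypothetical helper, and it makes no assumption about how colgrep names the entries inside the directory.

```bash
# Sketch: print disk usage for each entry under the index directory.
# list_indices is a hypothetical helper name, not a colgrep command.
index_root="$HOME/Library/Application Support/colgrep/indices"

list_indices() {
  # $1: index root directory
  if [ ! -d "$1" ]; then
    echo "no index directory yet: $1"
    return 0
  fi
  find "$1" -mindepth 1 -maxdepth 1 -exec du -sh {} +
}

list_indices "$index_root"
```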

5. Troubleshooting the ONNX/“Onyx” Exit 101 Error

Symptom

You may see this while indexing:

🤖 Model: lightonai/LateOn-Code-edge
📂 Building index...
101
No index found for /path/to/project
Run `colgrep <query>` to create one.

Why it happens

When conda is active, colgrep can pick up ONNX runtime from your conda environment (for example, libonnxruntime.1.20.1.dylib) instead of the version it expects (1.23.0). That mismatch can crash model initialization and return exit code 101.
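
To see whether your shell is in this situation, you can print the two variables involved and list any ONNX runtime libraries inside the active conda environment. This is a diagnostic sketch only; find_ort_dylibs is a hypothetical helper, and the file name pattern is taken from the example above.

```bash
# Sketch: list ONNX runtime dylibs under a directory.
# find_ort_dylibs is a hypothetical helper name, not a colgrep command.
find_ort_dylibs() {
  # $1: directory to search, e.g. "$CONDA_PREFIX"
  find "$1" -name 'libonnxruntime*.dylib' 2>/dev/null
}

echo "CONDA_PREFIX=${CONDA_PREFIX:-<unset>}"
echo "ORT_DYLIB_PATH=${ORT_DYLIB_PATH:-<unset>}"

if [ -n "${CONDA_PREFIX:-}" ]; then
  find_ort_dylibs "$CONDA_PREFIX"
fi
```

A listed version other than the one colgrep expects suggests the mismatch described above.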

Quick fix (one-shot)

env -u CONDA_PREFIX -u ORT_DYLIB_PATH colgrep init .
env -u CONDA_PREFIX -u ORT_DYLIB_PATH colgrep status

Permanent fix

Use the wrapper from Section 3. After that, plain colgrep ... commands automatically run with CONDA_PREFIX and ORT_DYLIB_PATH unset.

If you cannot download ONNX runtime from GitHub

If your network blocks runtime download, install a matching ONNX runtime in your current env and point ORT_DYLIB_PATH to it:

pip install -U "onnxruntime==1.23.0"
export ORT_DYLIB_PATH="$CONDA_PREFIX/lib/python3.12/site-packages/onnxruntime/capi/libonnxruntime.1.23.0.dylib"
colgrep init .

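Adjust the Python version path if your env is not python3.12. If the Section 3 wrapper is active, run command colgrep init . (or the raw_colgrep alias) here instead, because the wrapper unsets ORT_DYLIB_PATH before colgrep starts.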
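
Instead of hardcoding the Python version, you can ask Python where the package lives. This sketch assumes python is the interpreter of the env you installed onnxruntime into, and that the library sits in the package's capi directory as in the path above; pick_ort_dylib is a hypothetical helper.

```bash
# Sketch: locate the ONNX runtime dylib without hardcoding python3.x.
# pick_ort_dylib is a hypothetical helper name.
pick_ort_dylib() {
  # $1: onnxruntime package directory; prints the first matching dylib, if any
  find "$1/capi" -name 'libonnxruntime*.dylib' 2>/dev/null | head -n 1
}

if ort_dir=$(python -c 'import onnxruntime, os; print(os.path.dirname(onnxruntime.__file__))' 2>/dev/null); then
  ort_lib=$(pick_ort_dylib "$ort_dir")
  if [ -n "$ort_lib" ]; then
    export ORT_DYLIB_PATH="$ort_lib"
    echo "ORT_DYLIB_PATH=$ORT_DYLIB_PATH"
  else
    echo "no libonnxruntime dylib found under $ort_dir/capi" >&2
  fi
else
  echo "onnxruntime is not importable from this python" >&2
fi
```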

6. Paste-Ready Full AGENTS.md for Codex

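Run this in a terminal to write ~/.codex/AGENTS.md. The cat > redirect replaces any existing file, so back it up first if it holds custom instructions, and create ~/.codex with mkdir -p if it does not exist yet: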

cat > ~/.codex/AGENTS.md <<'EOF'
<!-- COLGREP_START -->
# Semantic Code Search

This repository has `colgrep` installed - a semantic code search CLI.

**Use `colgrep` as your PRIMARY search tool** instead of `Search / Grep / Glob`.

## Quick Reference

```bash
# Basic semantic search
colgrep "<natural language query>" --results 10   # Basic search
colgrep "<query>" -k 25                           # Exploration (more results)
colgrep "<query>" ./src/parser                    # Search in specific folder
colgrep "<query>" ./src/main.rs                   # Search in specific file
colgrep "<query>" ./src/main.rs ./src/lib.rs      # Search in multiple files
colgrep "<query>" ./crate-a ./crate-b             # Search multiple directories

# File filtering
colgrep --include="*.rs" "<query>"                # Include only .rs files
colgrep --include="src/**/*.rs" "<query>"         # Recursive glob pattern
colgrep --include="*.{rs,md}" "<query>"           # Multiple file types (brace expansion)
colgrep --exclude="*.test.ts" "<query>"           # Exclude test files
colgrep --exclude-dir=vendor "<query>"            # Exclude vendor directory

# Pattern-only search (no semantic query needed)
colgrep -e "<pattern>"                            # Search by pattern only
colgrep -e "async fn" --include="*.rs"            # Pattern search with file filter

# Hybrid search (text + semantic)
colgrep -e "<text>" "<semantic query>"            # Hybrid: text + semantic
colgrep -e "<regex>" -E "<semantic query>"        # Hybrid with extended regex (ERE)
colgrep -e "<literal>" -F "<semantic query>"      # Hybrid with fixed string (no regex)
colgrep -e "<word>" -w "<semantic query>"         # Hybrid with whole word match

# Output options
colgrep -l "<query>"                              # List files only
colgrep -n 6 "<query>"                            # Show 6 context lines (use -n for more context)
colgrep --json "<query>"                          # JSON output
```

## Grep-Compatible Flags

| Flag            | Description                                 | Example                                      |
| --------------- | ------------------------------------------- | -------------------------------------------- |
| `-e <PATTERN>`  | Text pattern pre-filter                     | `colgrep -e "async" "concurrency"`           |
| `-E`            | Extended regex (ERE) for `-e`               | `colgrep -e "async\|await" -E "concurrency"` |
| `-F`            | Fixed string (no regex) for `-e`            | `colgrep -e "foo[bar]" -F "query"`           |
| `-w`            | Whole word match for `-e`                   | `colgrep -e "test" -w "testing"`             |
| `-k, --results` | Number of results to return                 | `colgrep --results 20 "query"`               |
| `-n, --lines`   | Number of context lines (default: 6)        | `colgrep -n 10 "query"`                      |
| `-l`            | List files only                             | `colgrep -l "authentication"`                |
| `-r`            | Recursive (default, for compatibility)      | `colgrep -r "query"`                         |
| `--include`     | Include files matching pattern (repeatable) | `colgrep --include="*.py" "query"`           |
| `--exclude`     | Exclude files matching pattern              | `colgrep --exclude="*.min.js" "query"`       |
| `--exclude-dir` | Exclude directories                         | `colgrep --exclude-dir=node_modules "query"` |

**Notes:**

- `-F` takes precedence over `-E` (like grep)
- Default exclusions always apply: `.git`, `node_modules`, `target`, `.venv`, `__pycache__`
- When running from a subdirectory, results are restricted to that subdirectory. To search the full project, specify `.` or `..` as the path
- Multiple `--include` patterns use OR logic (matches if file matches any pattern)
- Brace expansion is supported: `*.{rs,md,py}` expands to match all three types
- When `conda` is active, use `env -u CONDA_PREFIX -u ORT_DYLIB_PATH colgrep ...` to avoid ONNX runtime conflicts during indexing

## When to Use What

| Task                            | Tool                                         |
| ------------------------------- | -------------------------------------------- |
| Find code by intent/description | `colgrep "query" -k 10`                      |
| Explore/understand a system     | `colgrep "query" -k 25` (increase k)         |
| Search by pattern only          | `colgrep -e "pattern"` (no semantic query)   |
| Know text exists, need context  | `colgrep -e "text" "semantic query"`         |
| Literal text with special chars | `colgrep -e "foo[0]" -F "semantic query"`    |
| Whole word match                | `colgrep -e "test" -w "testing utilities"`   |
| Search in a specific file       | `colgrep "query" ./src/main.rs`              |
| Search in multiple files        | `colgrep "query" ./src/main.rs ./src/lib.rs` |
| Search specific file type       | `colgrep --include="*.ext" "query"`          |
| Search multiple file types      | `colgrep --include="*.{rs,md,py}" "query"`   |
| Exclude test files              | `colgrep --exclude="*_test.go" "query"`      |
| Exclude vendor directories      | `colgrep --exclude-dir=vendor "query"`       |
| Search in specific directories  | `colgrep --include="src/**/*.rs" "query"`    |
| Search multiple directories     | `colgrep "query" ./src ./lib ./api`          |
| Search CI/CD configs            | `colgrep --include="**/.github/**/*" "q" .`  |
| Need more context lines         | `colgrep -n 10 "query"`                      |
| Exact string/regex match only   | Built-in `Grep` tool                         |
| Find files by name              | Built-in `Glob` tool                         |

## Key Rules

1. **Default to `colgrep`** for any code search
2. **Increase `--results`** (or `-k`) when exploring (20-30 results)
3. **Use `-e`** for hybrid text+semantic filtering
4. **Use `-E`** with `-e` for extended regex (alternation `|`, quantifiers `+?`, grouping `()`)
5. **Use `-F`** with `-e` when pattern contains regex special characters you want literal
6. **Use `-w`** with `-e` to avoid partial matches (e.g., "test" won't match "testing")
7. **Use `--exclude`/`--exclude-dir`** to filter out noise (tests, vendors, generated code)
8. **Use brace expansion** for multiple file types (e.g., `--include="*.{rs,md,py}"`)
9. **Agents should use `colgrep`** - when spawning Task/Explore agents, they should also use colgrep instead of Grep

## Need Help?

Run `colgrep --help` for complete documentation on all flags and options.

<!-- COLGREP_END -->
EOF

7. Paste-Ready Full AGENTS.md for OpenCode

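Run this in a terminal to write ~/.config/opencode/AGENTS.md. The cat > redirect replaces any existing file, so back it up first if it holds custom instructions, and create ~/.config/opencode with mkdir -p if it does not exist yet: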

cat > ~/.config/opencode/AGENTS.md <<'EOF'
# OpenCode Agent Tools

<!-- COLGREP_START -->
# Semantic Code Search

This repository has `colgrep` installed - a semantic code search CLI.

**Use `colgrep` as your PRIMARY search tool** instead of `Search / Grep / Glob`.

## Quick Reference

```bash
# Basic semantic search
colgrep "<natural language query>" --results 10   # Basic search
colgrep "<query>" -k 25                           # Exploration (more results)
colgrep "<query>" ./src/parser                    # Search in specific folder
colgrep "<query>" ./src/main.rs                   # Search in specific file
colgrep "<query>" ./src/main.rs ./src/lib.rs      # Search in multiple files
colgrep "<query>" ./crate-a ./crate-b             # Search multiple directories

# File filtering
colgrep --include="*.rs" "<query>"                # Include only .rs files
colgrep --include="src/**/*.rs" "<query>"         # Recursive glob pattern
colgrep --include="*.{rs,md}" "<query>"           # Multiple file types (brace expansion)
colgrep --exclude="*.test.ts" "<query>"           # Exclude test files
colgrep --exclude-dir=vendor "<query>"            # Exclude vendor directory

# Pattern-only search (no semantic query needed)
colgrep -e "<pattern>"                            # Search by pattern only
colgrep -e "async fn" --include="*.rs"            # Pattern search with file filter

# Hybrid search (text + semantic)
colgrep -e "<text>" "<semantic query>"            # Hybrid: text + semantic
colgrep -e "<regex>" -E "<semantic query>"        # Hybrid with extended regex (ERE)
colgrep -e "<literal>" -F "<semantic query>"      # Hybrid with fixed string (no regex)
colgrep -e "<word>" -w "<semantic query>"         # Hybrid with whole word match

# Output options
colgrep -l "<query>"                              # List files only
colgrep -n 6 "<query>"                            # Show 6 context lines (use -n for more context)
colgrep --json "<query>"                          # JSON output
```

## Grep-Compatible Flags

| Flag            | Description                                 | Example                                      |
| --------------- | ------------------------------------------- | -------------------------------------------- |
| `-e <PATTERN>`  | Text pattern pre-filter                     | `colgrep -e "async" "concurrency"`           |
| `-E`            | Extended regex (ERE) for `-e`               | `colgrep -e "async\|await" -E "concurrency"` |
| `-F`            | Fixed string (no regex) for `-e`            | `colgrep -e "foo[bar]" -F "query"`           |
| `-w`            | Whole word match for `-e`                   | `colgrep -e "test" -w "testing"`             |
| `-k, --results` | Number of results to return                 | `colgrep --results 20 "query"`               |
| `-n, --lines`   | Number of context lines (default: 6)        | `colgrep -n 10 "query"`                      |
| `-l`            | List files only                             | `colgrep -l "authentication"`                |
| `-r`            | Recursive (default, for compatibility)      | `colgrep -r "query"`                         |
| `--include`     | Include files matching pattern (repeatable) | `colgrep --include="*.py" "query"`           |
| `--exclude`     | Exclude files matching pattern              | `colgrep --exclude="*.min.js" "query"`       |
| `--exclude-dir` | Exclude directories                         | `colgrep --exclude-dir=node_modules "query"` |

**Notes:**

- `-F` takes precedence over `-E` (like grep)
- Default exclusions always apply: `.git`, `node_modules`, `target`, `.venv`, `__pycache__`
- When running from a subdirectory, results are restricted to that subdirectory. To search the full project, specify `.` or `..` as the path
- Multiple `--include` patterns use OR logic (matches if file matches any pattern)
- Brace expansion is supported: `*.{rs,md,py}` expands to match all three types
- When `conda` is active, use `env -u CONDA_PREFIX -u ORT_DYLIB_PATH colgrep ...` to avoid ONNX runtime conflicts during indexing

## When to Use What

| Task                            | Tool                                         |
| ------------------------------- | -------------------------------------------- |
| Find code by intent/description | `colgrep "query" -k 10`                      |
| Explore/understand a system     | `colgrep "query" -k 25` (increase k)         |
| Search by pattern only          | `colgrep -e "pattern"` (no semantic query)   |
| Know text exists, need context  | `colgrep -e "text" "semantic query"`         |
| Literal text with special chars | `colgrep -e "foo[0]" -F "semantic query"`    |
| Whole word match                | `colgrep -e "test" -w "testing utilities"`   |
| Search in a specific file       | `colgrep "query" ./src/main.rs`              |
| Search in multiple files        | `colgrep "query" ./src/main.rs ./src/lib.rs` |
| Search specific file type       | `colgrep --include="*.ext" "query"`          |
| Search multiple file types      | `colgrep --include="*.{rs,md,py}" "query"`   |
| Exclude test files              | `colgrep --exclude="*_test.go" "query"`      |
| Exclude vendor directories      | `colgrep --exclude-dir=vendor "query"`       |
| Search in specific directories  | `colgrep --include="src/**/*.rs" "query"`    |
| Search multiple directories     | `colgrep "query" ./src ./lib ./api`          |
| Search CI/CD configs            | `colgrep --include="**/.github/**/*" "q" .`  |
| Need more context lines         | `colgrep -n 10 "query"`                      |
| Exact string/regex match only   | Built-in `Grep` tool                         |
| Find files by name              | Built-in `Glob` tool                         |

## Key Rules

1. **Default to `colgrep`** for any code search
2. **Increase `--results`** (or `-k`) when exploring (20-30 results)
3. **Use `-e`** for hybrid text+semantic filtering
4. **Use `-E`** with `-e` for extended regex (alternation `|`, quantifiers `+?`, grouping `()`)
5. **Use `-F`** with `-e` when pattern contains regex special characters you want literal
6. **Use `-w`** with `-e` to avoid partial matches (e.g., "test" won't match "testing")
7. **Use `--exclude`/`--exclude-dir`** to filter out noise (tests, vendors, generated code)
8. **Use brace expansion** for multiple file types (e.g., `--include="*.{rs,md,py}"`)
9. **Agents should use `colgrep`** - when spawning Task/Explore agents, they should also use colgrep instead of Grep

## Need Help?

Run `colgrep --help` for complete documentation on all flags and options.

<!-- COLGREP_END -->
EOF
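
Both heredoc commands above replace the target file. If you already keep your own instructions in either AGENTS.md, a sketch like the following, run beforehand, keeps a timestamped copy. The helper name backup_if_exists and the .bak.<timestamp> suffix are arbitrary choices for this guide.

```bash
# Sketch: keep a timestamped copy of a file before overwriting it.
# backup_if_exists is a hypothetical helper name.
backup_if_exists() {
  # $1: file to back up; copied to "<file>.bak.<timestamp>" when it exists
  if [ -f "$1" ]; then
    cp "$1" "$1.bak.$(date +%Y%m%d%H%M%S)"
  fi
}

# Make sure the target directories exist, then back up any current files.
mkdir -p "$HOME/.codex" "$HOME/.config/opencode"
backup_if_exists "$HOME/.codex/AGENTS.md"
backup_if_exists "$HOME/.config/opencode/AGENTS.md"
```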

8. Recommended Agent Workflow

  1. Start with a semantic query.
  2. Add -e when you know part of a pattern and want hybrid filtering.
  3. Keep output lean first, then open files for deeper reads.

This usually gives better relevance while reducing context noise for the agent.
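
The three steps can be sketched as a small shell function. It uses only flags listed in the quick reference above (-k, -e, -l); the function name search_workflow and its two arguments are illustrative, not part of colgrep.

```bash
# Sketch: the semantic -> hybrid -> lean-output sequence as one function.
# search_workflow is a hypothetical helper name.
search_workflow() {
  # $1: semantic query, $2: text pattern for the hybrid step
  colgrep "$1" -k 10      # 1. semantic query
  colgrep -e "$2" "$1"    # 2. hybrid: text pattern pre-filter plus semantic query
  colgrep -l "$1"         # 3. lean output: file names only
}

# Example call (placeholder query and pattern):
#   search_workflow "retry logic for API calls" "retry"
```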
