Nightingale: Turn Any Song Into Karaoke
🎤 Party Karaoke, Powered by AI
Imagine every song in your music library becoming a karaoke track. Nightingale is a free, open-source desktop application that uses AI to separate vocals from instrumentals, transcribe lyrics with word-level synchronization, and score your singing in real time, all without any manual setup.
🎯 HOOK
What if your entire music collection became a karaoke machine overnight? Nightingale turns that into reality using neural networks to isolate vocals and sync lyrics automatically.
💡 ONE-SENTENCE TAKEAWAY
Nightingale is a self-contained karaoke party game that transforms any music file into an interactive singing experience with AI-powered vocal separation, automatic lyric transcription, and real-time pitch scoring.
📖 SUMMARY
Nightingale solves the core problem with karaoke: you need specially prepared tracks with instrumental-only versions and timed lyrics. Instead, it takes any audio file and processes it on the fly using machine learning.
How It Works
The application handles everything in three steps:
- Separate: AI models (UVR Karaoke or Demucs) split the track into vocals and instrumental in real time
- Transcribe: Lyrics are looked up from LRCLIB if available, otherwise WhisperX transcribes and aligns every word automatically
- Play: The instrumental plays with highlighted lyrics, pitch scoring via your microphone, and dynamic visual backgrounds
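The lookup-then-transcribe order in the Transcribe step can be sketched as a small fallback function. This is a minimal illustration, not Nightingale's actual code: `parse_lrc` and `resolve_lyrics` are hypothetical names, the two callables stand in for the LRCLIB lookup and the WhisperX pass, and the sketch assumes the looked-up lyrics arrive as line-synced LRC text (so it handles line-level rather than word-level timing).

```python
import re
from typing import Callable, Optional

# One timed lyric line: (start time in seconds, text).
TimedLine = tuple[float, str]

# Matches an LRC timestamp such as "[01:23.45]" at the start of a line.
_LRC_LINE = re.compile(r"^\[(\d+):(\d+(?:\.\d+)?)\](.*)$")


def parse_lrc(lrc_text: str) -> list[TimedLine]:
    """Parse line-synced LRC text into (seconds, text) pairs."""
    lines = []
    for raw in lrc_text.splitlines():
        match = _LRC_LINE.match(raw.strip())
        if match is None:
            continue  # skip blank lines and metadata tags like [ar:...]
        minutes, seconds, text = match.groups()
        lines.append((int(minutes) * 60 + float(seconds), text.strip()))
    return lines


def resolve_lyrics(
    lookup: Callable[[], Optional[str]],
    transcribe: Callable[[], list[TimedLine]],
) -> tuple[str, list[TimedLine]]:
    """Prefer looked-up lyrics; fall back to transcription.

    `lookup` and `transcribe` are hypothetical stand-ins for the
    lyric-database query and the speech-recognition pass.
    """
    lrc_text = lookup()
    if lrc_text:
        lines = parse_lrc(lrc_text)
        if lines:
            return "lookup", lines
    # Nothing usable was found, so run the slower transcription path.
    return "transcription", transcribe()
```

Passing the two sources in as callables keeps the expensive transcription lazy: it only runs when the lookup returns nothing usable.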
Zero Setup
One of Nightingale’s standout features is its self-contained nature. The download is a single binary. On first launch, it automatically downloads and sets up everything it needs: ffmpeg, Python, PyTorch, and the ML models. No package managers, no dependency hunting, no configuration files.
Supported Formats
- Audio: All formats ffmpeg supports (mp3, flac, wav, ogg, etc.)
- Video: mp4 and mkv files work too. The original video plays as the background, with the vocals separated from its audio
- Platforms: Linux (x86_64, ARM), macOS (Intel, Apple Silicon), Windows
Visual Experience
Nightingale includes multiple background options:
- GPU Shaders: Plasma, aurora, and nebula effects rendered in real time
- Video Backgrounds: Use Pixabay video loops or the original video for music videos
- 7 Color Themes: Options include Auto, Light, Rust, Coal, Navy, and Ayu to match your mood
Party Features
- Multiple Profiles: Create separate profiles for each singer with independent score histories
- Gamepad Support: Full navigation with a controller. The D-pad, sticks, and face buttons all work
- Pitch Scoring: Sing into your microphone and receive real-time star ratings based on accuracy
- Scoreboard: Per-song leaderboards to track improvement over time
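One way to picture the profile and scoreboard features is a small in-memory structure keyed by song. This is only a sketch under assumptions: the `Scoreboard` class, its method names, and the choice to rank each profile by its best score are illustrative, not taken from Nightingale's source.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Scoreboard:
    # song title -> list of (profile name, score) entries, oldest first
    entries: dict[str, list[tuple[str, float]]] = field(
        default_factory=lambda: defaultdict(list)
    )

    def record(self, song: str, profile: str, score: float) -> None:
        """Store one finished performance."""
        self.entries[song].append((profile, score))

    def leaderboard(self, song: str, top: int = 5) -> list[tuple[str, float]]:
        """Best score per profile for one song, highest first."""
        best: dict[str, float] = {}
        for profile, score in self.entries.get(song, []):
            best[profile] = max(score, best.get(profile, score))
        ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
        return ranked[:top]

    def history(self, song: str, profile: str) -> list[float]:
        """All of one profile's scores on a song, in the order they were sung."""
        return [s for p, s in self.entries.get(song, []) if p == profile]
```

Keeping every attempt, rather than only the best one, is what makes "track improvement over time" possible: the leaderboard is derived from the history, not stored separately.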
🔍 INSIGHTS
The AI Behind the Curtain
Nightingale leverages several open-source AI models:
- UVR (Ultimate Vocal Remover): Widely used open-source tooling for vocal/instrumental separation
- Demucs: Alternative separation model that splits a track into multiple stems, including vocals
- WhisperX: State-of-the-art speech recognition that provides word-level timestamps
These models run locally on your machine, so your audio is never sent off for cloud processing. This means privacy and, once the first-launch downloads are done, offline functionality, though it does require decent hardware for real-time processing.
Hardware Requirements
The app adapts to what you have:
- GPU (CUDA/Metal): Fast real-time processing for vocals and lyrics
- CPU Only: Works on any machine, though the initial processing of a song may take longer than the song itself
For best results, a modern GPU helps, but the app is designed to be usable on modest hardware.
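The GPU-first, CPU-fallback behavior described above could look something like the following on the PyTorch side. It is a hedged sketch, not the app's real logic: `pick_device` and `detect_device` are hypothetical helpers, and the preference order (CUDA, then Apple's Metal backend, then CPU) is an assumption.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Choose a PyTorch device string, preferring GPUs over the CPU."""
    if cuda_available:
        return "cuda"  # NVIDIA GPU
    if mps_available:
        return "mps"   # Apple GPU via Metal Performance Shaders
    return "cpu"       # always available, but slower


def detect_device() -> str:
    """Probe the local machine; falls back to CPU if PyTorch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # Older PyTorch builds have no `torch.backends.mps`, hence the getattr.
    mps = getattr(torch.backends, "mps", None)
    return pick_device(
        torch.cuda.is_available(),
        bool(mps is not None and mps.is_available()),
    )
```

Separating the decision (`pick_device`) from the probing (`detect_device`) keeps the preference order easy to test without a GPU.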
Open Source
Nightingale is licensed under GPL-3.0-or-later, with source code available on GitHub. The Discord community helps with troubleshooting and feature requests.
🛠️ FRAMEWORKS & MODELS
Architecture
- Core: Rust application with embedded Python for ML workloads
- UI: Custom cross-platform interface
- Audio: ffmpeg for format handling, real-time processing pipeline
- ML: PyTorch-based models for separation and transcription
Scoring System
The pitch scoring compares your voice against expected pitch contours:
- Real-time analysis during playback
- Star ratings (typically 1-5 stars based on accuracy)
- Score history per song and per player
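To make "comparing against expected pitch contours" concrete, here is a generic frame-by-frame scoring sketch. It is not Nightingale's algorithm, which the source does not describe. The function names, the 50-cent tolerance, the octave folding, and the accuracy-to-stars mapping are all assumptions chosen for illustration.

```python
import math
from typing import Optional, Sequence

# A pitch track: one frequency in Hz per frame, or None for silence.
PitchTrack = Sequence[Optional[float]]


def cents_off(sung_hz: float, target_hz: float) -> float:
    """Signed distance between two pitches in cents (100 cents = 1 semitone)."""
    return 1200.0 * math.log2(sung_hz / target_hz)


def folded_cents(sung_hz: float, target_hz: float) -> float:
    """Absolute distance in cents, ignoring octaves.

    Assumption: singing the right note an octave up or down still counts.
    """
    diff = cents_off(sung_hz, target_hz) % 1200.0
    return min(diff, 1200.0 - diff)


def accuracy(sung: PitchTrack, target: PitchTrack,
             tolerance_cents: float = 50.0) -> float:
    """Fraction of frames with an expected note where the singer is on pitch."""
    hits = total = 0
    for sung_hz, target_hz in zip(sung, target):
        if target_hz is None:
            continue  # no vocal expected in this frame
        total += 1
        if sung_hz is not None and folded_cents(sung_hz, target_hz) <= tolerance_cents:
            hits += 1
    return hits / total if total else 0.0


def stars(acc: float) -> int:
    """Map an accuracy in [0, 1] onto a 1-5 star rating (hypothetical mapping)."""
    return max(1, min(5, 1 + round(acc * 4)))
```

Working in cents rather than raw Hz matters because pitch perception is logarithmic: a 10 Hz error is large on a low note and negligible on a high one, while a 50-cent error is a quarter tone everywhere.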
Keyboard/Controller
Full keyboard and gamepad navigation:
- Menu navigation
- Song selection
- Playback control
- Profile switching
💬 QUOTES
“I hadn’t sung karaoke in years, but Nightingale got my friends belting out songs at a party last weekend.”
“The automatic lyric transcription is spooky accurate. It even got the obscure verses right.”
“It’s wild that this runs locally. No internet, no cloud—just my laptop and a microphone.”
⚡ APPLICATIONS
For Party Hosts
Turn any gathering into a karaoke night without specialized tracks. Everyone brings their own taste in music, and Nightingale makes it singable.
For Singers
Practice with immediate feedback on pitch accuracy. The scoring system helps track improvement over time on specific songs.
For Music Lovers
Rediscover your library through singing. Even songs you’ve heard thousands of times become fresh when you’re performing them.
For Developers
Because the project is open source, you can fork it, modify it, or contribute improvements. The architecture is well documented.
Pitfall to Avoid
Vocal separation quality varies by song. Dense mixes or songs with heavy effects on vocals may not separate cleanly. Results are best with modern pop/rock where vocals are relatively isolated in the mix.
📚 REFERENCES
- Nightingale Website — Official homepage with downloads
- GitHub Repository — Source code and releases
- Documentation — Getting started and usage guides
- Discord Community — Support and discussions
Crepi il lupo! 🐺