Six directions we're studying.
- 01 voice
Voice that sounds like you.
Translation that preserves accent, rhythm, and identity across languages — in real time.
- Prosody-aware neural vocoders
- Identity embeddings across 40+ languages
- Sub-200ms streaming latency
<200ms end-to-end latency
- 02 turn-taking
AI that knows when to speak.
No wake words. Reads acoustic and behavioral cues to know when you're inviting a response.
- Prosodic boundary detection
- Speaker diarization + overlap prediction
- Conversational state tracking
94% turn prediction accuracy
- 03 fact-check
Real-time fact correction.
Catches errors mid-sentence and corrects them without breaking conversation flow.
- Streaming claim extraction
- Retrieval-augmented verification
- Gentle interjection timing
~800ms fact verification latency
- 04 multimodal
AI that reads the room.
Facial expressions, micro-gestures, vocal tone — emotional feedback that shapes responses.
- Micro-expression detection (AU-level)
- Paralinguistic feature extraction
- Affective loop closure
12 emotion dimensions tracked
- 05 context
Parallel context execution.
Multiple time-anchored instructions running independently without bleeding into each other.
- Temporal scope isolation
- Dependency graph scheduling
- Priority-based interruption
∞ parallel contexts supported
- 06 agency
AI that acts, not just answers.
Long-horizon task execution on live systems through natural spoken commands.
- Computer-use model fine-tuning
- Multi-step plan generation
- Failure recovery + clarification
20+ apps integrated
Sylica Stealth — already shipping.
A quiet desktop assistant that reads your screen and helps through meetings, browsing, and everyday work.
How we ship.
Local by default
On-device capture. You choose what leaves your machine.
Bring your own keys
Your OpenAI / Anthropic / xAI / Google keys. No markup.
Research in the open
Write-ups, demos, honest negative results — on /blogs.