Research Division

Active Research

Six directions we are pursuing to move beyond the chat box. Each area includes technical documentation, current results, and open questions.

R01 Active research

Voice Identity Preservation in Neural Translation

Preserving speaker characteristics across language boundaries

Most neural translation systems optimize for semantic accuracy and discard paralinguistic features: accent, prosody, rhythm, and vocal identity. We are researching how to preserve these characteristics when translating speech across languages, so that real-time translated speech sounds like the speaker rather than a generic synthetic voice.

speech, translation, +2

Speech Systems Team

R02 Active research

Implicit Turn-Taking in Conversational AI

Inferring speech readiness without explicit wake words

Wake words and push-to-talk create friction in human-AI interaction. Humans detect turn-taking cues from prosody, breathing patterns, and conversational context. We are building models that predict when a user has finished speaking or is inviting a contribution from the AI, enabling conversational interfaces that work without wake words or push-to-talk.

interaction, acoustics, +2

Interaction Lab

R03 Pilot system

Real-Time Fact Correction in Generated Speech

Detecting and correcting factual errors without breaking flow

Large language models hallucinate. In spoken dialogue, the cost of an error is high and the window for correction is short. We are researching how to detect likely factual errors in real-time generated speech and issue corrections that feel like natural repair sequences rather than interruptions.

factuality, rag, +2

Knowledge Systems

R04 Exploratory

Multimodal Room Reading for Contextual Response

Integrating facial expression, gesture, and vocal tone

Current AI systems process text or speech but ignore the rich contextual signals humans use: facial expressions, micro-gestures, gaze direction, and vocal tone. We are researching how to integrate these modalities to infer emotional state, engagement level, and conversational context, using them to shape more appropriate AI responses.

multimodal, vision, +2

Perception Team

R05 Active research

Parallel Context Execution for Multi-Task AI

Independent time-anchored instruction streams

Most conversational AI handles one thing at a time. Human cognition maintains multiple parallel threads: tracking background tasks, monitoring ongoing processes, and handling interruptions. We are formalizing a model where AI can maintain multiple independent context threads, executing them in parallel without cross-contamination.

systems, parallelism, +2

Systems Research

R06 Pilot system

Conversational Computer Use and Task Execution

Bridging natural language dialogue with system action

The gap between conversational AI and computer use is large: one answers questions, the other performs actions. We are researching how to bridge this gap by enabling AI to understand natural language task descriptions, formulate multi-step plans, execute actions on live systems, and handle failures through dialogue.

agents, computer-use, +2

Agent Systems

Research conducted at Sylica AI Labs. For collaboration inquiries, contact our research team.