Voice & Communication

Voice Arena

Next.js playground for four English voice personas with Web Speech API, RMS VAD, and Azure Edge TTS

Persona-Driven Voice Interaction

Voice Arena enables natural conversations with four distinct AI personas, each with its own thinking style and communication pattern. Built on Next.js with browser-native speech recognition and Azure's neural text-to-speech.

Each conversation flows through browser speech-to-text, RMS-based voice activity detection, persona-aware prompt routing, and Azure Edge TTS for spoken responses.
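
The flow above can be sketched as typed stages. This is an illustrative outline only: the function and type names below are assumptions, not the project's actual identifiers, and the real stages call browser and network APIs.

```typescript
// Hypothetical shape of one conversation turn; names are illustrative.
type Stage<In, Out> = (input: In) => Promise<Out>;

interface PipelineDeps {
  transcribe: Stage<string, string>;                       // Web Speech API STT (audio in reality)
  route: Stage<{ persona: string; text: string }, string>; // persona-aware LLM call
  speak: Stage<string, Uint8Array>;                        // Azure Edge TTS -> MPEG bytes
}

async function runTurn(
  deps: PipelineDeps,
  persona: string,
  utterance: string,
): Promise<Uint8Array> {
  const text = await deps.transcribe(utterance);     // 1. speech-to-text
  const reply = await deps.route({ persona, text }); // 2. persona routing
  return deps.speak(reply);                          // 3. text-to-speech
}
```

VAD is not a stage in this chain; it gates the chain, deciding when a captured utterance is complete and ready to hand off.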

BYOK (Bring Your Own Key) Model

API: DeepSeek-compatible
TTS: Azure Edge (AvaNeural)
STT: Web Speech API
Framework: Next.js

4 Distinct Personas

Each persona maintains an isolated memory stack, enabling head-to-head comparisons

Navigator
Tactical Mentor

Focuses on execution clarity, breaking down complex problems into actionable steps. Provides structured guidance and practical frameworks for implementation.

Edge Tester
Sharp Critic

Probes assumptions and exposes failure modes. Challenges ideas with rigorous analysis to identify weaknesses before they become problems.

Pattern Sage
Systems Thinker

Connects long arcs and identifies recurring patterns across domains. Synthesizes high-level insights from complex interconnected systems.

Creative Spark
Energetic Ideator

Unlocks alternative perspectives and generates novel solutions. Brings enthusiasm and creative approaches to problem-solving.

Core Capabilities

Web Speech API

Browser-native speech-to-text with automatic fallback to text input when unsupported.
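
Feature detection drives the fallback. The helper below is a sketch (the project's actual hook may differ); in Chromium-based browsers the constructor is vendor-prefixed as `webkitSpeechRecognition`, and when neither form exists the UI falls back to plain text input. Passing the scope explicitly keeps the helper testable outside a browser.

```typescript
// Hypothetical feature-detection helper for the Web Speech API.
// Returns the recognition constructor, or null to trigger text fallback.
function getSpeechRecognition(scope: any): (new () => any) | null {
  return scope?.SpeechRecognition ?? scope?.webkitSpeechRecognition ?? null;
}

// In the browser: const Recognition = getSpeechRecognition(window);
```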

RMS-Based VAD

Lightweight voice activity detection with tunable RMS thresholds and hangover time.
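
The idea can be sketched in a few lines: compute the root-mean-square energy of each audio frame, and keep the "speaking" state alive for a hangover window after the last loud frame so natural pauses don't cut the user off. The real implementation lives in hooks/useVoiceActivity.ts and may differ; the default values below are assumptions.

```typescript
// Root-mean-square energy of one audio frame.
function rms(samples: Float32Array): number {
  let sum = 0;
  for (let i = 0; i < samples.length; i++) sum += samples[i] * samples[i];
  return Math.sqrt(sum / samples.length);
}

// Sketch of RMS-based VAD with a hangover timer.
class RmsVad {
  private lastVoiceMs = -Infinity;

  constructor(
    private threshold = 0.02,  // assumed default; tune per environment
    private hangoverMs = 300,  // stay "speaking" this long after the last loud frame
  ) {}

  /** Returns true while the user is considered to be speaking. */
  update(frame: Float32Array, nowMs: number): boolean {
    if (rms(frame) >= this.threshold) this.lastVoiceMs = nowMs;
    return nowMs - this.lastVoiceMs <= this.hangoverMs;
  }
}
```

Raising the threshold rejects more background noise; lengthening the hangover tolerates longer mid-sentence pauses.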

Persona Routing

Intelligent prompt routing to DeepSeek-compatible endpoints with persona-specific context.
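
Concretely, routing means prepending a persona-specific system prompt before the conversation history. The sketch below assumes an OpenAI/DeepSeek-compatible chat-completions body; the persona prompt texts are illustrative, not the project's actual prompts.

```typescript
type Role = "system" | "user" | "assistant";
interface ChatMessage { role: Role; content: string; }

// Hypothetical persona prompts; the real ones ship with the app.
const PERSONA_PROMPTS: Record<string, string> = {
  navigator: "You are Navigator, a tactical mentor focused on execution clarity.",
  "edge-tester": "You are Edge Tester, a sharp critic who probes assumptions.",
  "pattern-sage": "You are Pattern Sage, a systems thinker connecting long arcs.",
  "creative-spark": "You are Creative Spark, an energetic ideator.",
};

function buildChatRequest(persona: string, history: ChatMessage[], userText: string) {
  return {
    model: process.env.LLM_MODEL ?? "deepseek-chat", // mirrors the env defaults
    messages: [
      { role: "system" as Role, content: PERSONA_PROMPTS[persona] },
      ...history,
      { role: "user" as Role, content: userText },
    ],
  };
}
```

The resulting body is POSTed to `LLM_BASE_URL` with the `LLM_API_KEY` bearer token.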

Azure Edge TTS

Neural text-to-speech with AvaNeural voice. MPEG audio output with configurable formats.
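
A TTS request wraps the reply text in SSML naming the voice. The builder below follows the standard Speech SSML shape with the `en-US-AvaNeural` voice; how the SSML is transported to the Edge TTS service is out of scope for this sketch.

```typescript
// Hypothetical SSML builder for an Edge TTS request.
function buildSsml(text: string, voice = "en-US-AvaNeural"): string {
  // Escape XML-significant characters so arbitrary reply text is safe.
  const escaped = text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
  return (
    `<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">` +
    `<voice name="${voice}">${escaped}</voice></speak>`
  );
}
```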

Isolated Memory

Each persona maintains separate conversation history for comparative analysis.
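
Isolation can be as simple as keying each history by persona id, so turns never leak between personas. A minimal sketch (class and field names are assumptions):

```typescript
interface Turn { role: "user" | "assistant"; content: string; }

// One independent conversation stack per persona id.
class PersonaMemory {
  private stacks = new Map<string, Turn[]>();

  append(persona: string, turn: Turn): void {
    const stack = this.stacks.get(persona) ?? [];
    stack.push(turn);
    this.stacks.set(persona, stack);
  }

  history(persona: string): Turn[] {
    return this.stacks.get(persona) ?? [];
  }
}
```

Asking two personas the same question and diffing their histories is what makes the head-to-head comparison possible.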

Browser-Native

Built entirely on web standards. No native apps or plugins required.

Technical Specifications

Framework
Next.js
Speech Recognition
Web Speech API
Voice Activity Detection
RMS-based VAD
Text-to-Speech
Azure Edge TTS (AvaNeural)
LLM Backend
DeepSeek API Compatible
Audio Format
MPEG (Configurable)
Deployment
Vercel/Node.js
Personas
4 Isolated Stacks

Getting Started

Quick setup guide for Voice Arena

1. Install Dependencies

npm install

2. Configure Environment

Create .env.local with your API credentials:

LLM_API_KEY=<DeepSeek-compatible API key>
LLM_BASE_URL=<optional, defaults to DeepSeek>
LLM_MODEL=<optional, defaults to deepseek-chat>
EDGE_TRUSTED_TOKEN=<optional cached token>

3. Start Development Server

npm run dev

4. Tune VAD Parameters (Optional)

Adjust RMS_THRESHOLD and HANGOVER_MS in hooks/useVoiceActivity.ts for your environment.

Ready to Explore Voice Personas?

Experience natural conversations with four distinct AI personas