Zuah — Urdu Voice AI

CAPABILITIES

Everything You Need for Urdu Voice AI

From zero-shot cloning to real-time deepfake detection, Zuah handles the full pipeline.

Zero-Shot Voice Cloning

Clone any Urdu voice from just 3-10 seconds of reference audio. No speaker training required.

OmniVoice

Deepfake Detection

Multi-model fusion of Wav2Vec2, AASIST, and Conformer for anti-spoofing, measured at 95.45% accuracy on the project's evaluation set.

95.45% Accuracy
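
One common reading of "multi-model fusion" is score-level fusion: each detector reports a probability that the clip is spoofed, and the scores are combined into a single verdict. The sketch below assumes a weighted average with equal weights and a 0.5 decision threshold. The function name, the weights, and the threshold are illustrative assumptions, not taken from Zuah's code, which may instead fuse features or use a learned combiner.

```python
from typing import Dict, Optional


def fuse_scores(
    fake_probs: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
    threshold: float = 0.5,
) -> Dict[str, object]:
    """Combine per-model fake probabilities into one verdict (hypothetical helper)."""
    if not fake_probs:
        raise ValueError("need at least one model score")
    if weights is None:
        # Assumption: every model counts equally unless told otherwise.
        weights = {name: 1.0 for name in fake_probs}
    total = sum(weights[name] for name in fake_probs)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Weighted average of the per-model fake probabilities.
    fused = sum(fake_probs[name] * weights[name] for name in fake_probs) / total
    label = "fake" if fused >= threshold else "real"
    return {"fake_prob": fused, "label": label}


# Made-up scores for the three detectors named above.
result = fuse_scores({"wav2vec2": 0.9, "aasist": 0.8, "conformer": 0.7})
```

With these made-up scores the fused probability is about 0.8, so the clip is labelled fake. Passing unequal weights lets a stronger detector dominate the verdict.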

Batch Processing

Pair up to 10 reference voices with 10 Urdu scripts and run all jobs sequentially with per-job progress tracking.

Up to 10x10
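
The batch flow can be sketched as a job list processed one at a time, with a progress event emitted per job. This sketch assumes "10x10" means every reference voice is combined with every script (up to 100 jobs); if the pairing is one-to-one, `zip` would replace `product`. `build_jobs`, `run_batch`, and `clone_fn` are hypothetical names, and the cloning call itself is stubbed out.

```python
from itertools import product
from typing import Callable, Dict, Iterator, List, Tuple

MAX_ITEMS = 10  # limit stated on the page for voices and for scripts


def build_jobs(voices: List[str], scripts: List[str]) -> List[Tuple[str, str]]:
    """Combine every voice with every script (assumed reading of '10x10')."""
    if len(voices) > MAX_ITEMS or len(scripts) > MAX_ITEMS:
        raise ValueError(f"at most {MAX_ITEMS} voices and {MAX_ITEMS} scripts")
    return list(product(voices, scripts))


def run_batch(
    jobs: List[Tuple[str, str]],
    clone_fn: Callable[[str, str], str],
) -> Iterator[Dict[str, object]]:
    """Run jobs sequentially, yielding one progress event per finished job."""
    total = len(jobs)
    for index, (voice, script) in enumerate(jobs, start=1):
        try:
            output = clone_fn(voice, script)
            status = "done"
        except Exception as exc:  # one failed job should not stop the batch
            output, status = str(exc), "failed"
        yield {"job": index, "total": total, "status": status, "output": output}


# Stand-in for the real cloning call, which would hit the backend.
jobs = build_jobs(["a.wav", "b.wav"], ["script 1", "script 2", "script 3"])
events = list(run_batch(jobs, lambda voice, script: f"{voice} + {script}"))
```

Because `run_batch` is a generator, a caller can update a progress indicator after each yielded event instead of waiting for the whole batch.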

Urdu RTL Support

Full right-to-left Urdu script input using Noto Nastaliq Urdu typography with RTL-aware placeholders.

Bilingual

Real-Time Confidence Bars

Animated Real % and Fake % bars update after each analysis, giving instant visual confidence feedback.

Live Results
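
The two bars can be derived from a single detector score. The sketch below assumes the backend returns a fake probability between 0 and 1 and that the bars should always sum to 100. `confidence_bars` is a hypothetical helper, and the rounding to two decimals is an assumption.

```python
from typing import Dict


def confidence_bars(fake_prob: float) -> Dict[str, float]:
    """Turn a fake probability into Real % and Fake % bar values."""
    # Clamp so an out-of-range score cannot produce a bar beyond 0-100%.
    p = min(1.0, max(0.0, fake_prob))
    fake_pct = round(p * 100, 2)
    # Derive Real % from Fake % so the two bars sum to 100.
    real_pct = round(100 - fake_pct, 2)
    return {"real_pct": real_pct, "fake_pct": fake_pct}


bars = confidence_bars(0.25)
```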

GPU-Accelerated Backend

Flask + PyTorch on Google Colab GPU, tunneled via ngrok for zero-infrastructure browser access.

CUDA / Colab

DEVELOPMENT PROCESS

Software Development Life Cycle

How Zuah was designed, built, tested, and deployed

Weeks 1-2

Requirements Gathering

Identified gap: no unified Urdu voice AI platform
Defined dual goals: cloning + detection
Surveyed ASVspoof 2019 benchmark for detection baseline
Chose zero-shot approach to work around the scarcity of Urdu speech datasets

Weeks 3-4

System Design

Client-server architecture via ngrok tunnel
Selected Wav2Vec2, AASIST, Conformer for detection fusion
OmniVoice selected for zero-shot Urdu TTS
Flask REST API designed (/health, /clone, /predict)
Single-file HTML frontend (no build pipeline)
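
The three-route API surface can be sketched as follows. The source uses Flask; to stay dependency-free this sketch swaps in a plain WSGI callable with a route table. The HTTP methods (GET for /health, POST for /clone and /predict), the JSON response shapes, and all handler names are assumptions, and the model calls are left as placeholders.

```python
import json
from typing import Callable, Dict, Tuple

Handler = Callable[[dict], Tuple[str, dict]]


def health(environ: dict) -> Tuple[str, dict]:
    """Liveness check the frontend could poll before enabling its controls."""
    return "200 OK", {"status": "ok"}


def not_implemented(environ: dict) -> Tuple[str, dict]:
    """Placeholder: the real handler would run OmniVoice or the detector."""
    return "501 Not Implemented", {"error": "model call omitted in this sketch"}


# Route table mirroring the three endpoints named in the design.
ROUTES: Dict[Tuple[str, str], Handler] = {
    ("GET", "/health"): health,
    ("POST", "/clone"): not_implemented,
    ("POST", "/predict"): not_implemented,
}


def app(environ: dict, start_response: Callable) -> list:
    """Minimal WSGI application dispatching on (method, path)."""
    key = (environ.get("REQUEST_METHOD", "GET"), environ.get("PATH_INFO", "/"))
    handler = ROUTES.get(key)
    if handler is None:
        status, payload = "404 Not Found", {"error": "unknown route"}
    else:
        status, payload = handler(environ)
    body = json.dumps(payload).encode("utf-8")
    headers = [("Content-Type", "application/json"),
               ("Content-Length", str(len(body)))]
    start_response(status, headers)
    return [body]


def call(method: str, path: str) -> Tuple[str, dict]:
    """Invoke the app in-process, without a server, for quick checks."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app({"REQUEST_METHOD": method, "PATH_INFO": path}, start_response))
    return captured["status"], json.loads(body)
```

In Flask, each entry in the route table would become a function decorated with `@app.route`, with the method restriction passed through its `methods` argument.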

Weeks 5-10

Implementation

Flask backend with PyTorch model loading + CUDA inference
OmniVoice integrated to turn reference audio + script into synthesized WAV
Vanilla JS frontend: drag-drop, batch jobs, RTL Urdu UI
ngrok tunnel configuration for Colab deployment
Custom audio player and confidence bar visualizations

Weeks 11-13

Testing & Evaluation

Detection accuracy: 95.45% on evaluation set
Tested on ASVspoof 2019 subsets (LA and PA partitions)
Cross-browser testing (Chrome, Firefox, Edge, Safari)
Latency benchmarking: 10-30 s per clone, ~2 s per detection
UI testing: drag-drop, batch cloning, RTL input
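
Latency figures like those in the list above can be collected with a small timing harness. This is a minimal sketch: `benchmark` is a hypothetical helper, and the timed callable here is a short sleep standing in for a real clone or detect request.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(fn: Callable[[], object], runs: int = 5) -> Dict[str, float]:
    """Time repeated calls to fn and summarise wall-clock latency in seconds."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    timings = []
    for _ in range(runs):
        start = time.perf_counter()  # monotonic, high-resolution clock
        fn()
        timings.append(time.perf_counter() - start)
    return {
        "min_s": min(timings),
        "median_s": statistics.median(timings),
        "max_s": max(timings),
    }


# Stand-in workload; a real run would wrap a request to the backend.
stats = benchmark(lambda: time.sleep(0.01), runs=3)
```

Reporting a range (min to max) alongside the median matches how the clone latency is quoted as 10-30 s rather than as a single number.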

Week 14

Deployment

Google Colab GPU as backend runtime (free tier)
ngrok tunnel: static subdomain for persistent URL
Frontend: single index.html (GitHub Pages / local)
Models loaded from Google Drive on Colab startup
End-to-end FYP demo: generate-then-detect workflow

Hear the Truth.
Speak with Intelligence.

Two Powerful AI Capabilities.
One Unified System.