Hear the Truth.
Speak with Intelligence.
سچ سنو۔ ذہانت سے بولو۔
Zuah is an AI-powered Urdu voice system that clones any voice from 3 seconds of audio and detects deepfakes with 95.45% accuracy - powered by Wav2Vec2, AASIST, Conformer, and OmniVoice.
WORKFLOW
From Audio to Intelligence in 4 Steps
Upload Reference Audio
Drop 1-10 audio files (.wav .mp3 .flac .m4a), each 3-10 seconds.
Enter Your Urdu Scripts
Type Urdu text in right-to-left (RTL) rows. Add up to 10 rows dynamically.
Generate Cloned Voices
Requests go to POST /clone through the ngrok tunnel; jobs run one at a time, with progress shown.
Listen, Download, or Detect
Download WAV or send directly to the detection panel.
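The clone step above (POST /clone, sequential jobs with progress) can be sketched in Python as follows. This is a minimal illustration, not the project's actual client: the function names, the `send` callback, and the progress callback are all hypothetical, and the real request fields depend on the backend.

```python
from typing import Callable, List, Optional, Sequence


def clone_url(base_url: str) -> str:
    """Join the ngrok base URL and the /clone path without doubling slashes."""
    return base_url.rstrip("/") + "/clone"


def run_clone_jobs(
    base_url: str,
    scripts: Sequence[str],
    send: Callable[[str, str], object],
    on_progress: Optional[Callable[[int, int], None]] = None,
) -> List[object]:
    """Send one job per script, strictly in order, reporting progress after each.

    `send(url, text)` is a hypothetical callback that performs the actual
    POST (for example with an HTTP library) and returns the result.
    """
    url = clone_url(base_url)
    total = len(scripts)
    results = []
    for done, text in enumerate(scripts, start=1):
        results.append(send(url, text))  # the next job starts only after this returns
        if on_progress is not None:
            on_progress(done, total)
    return results
```

Keeping the network call behind `send` makes the ordering and progress logic testable without a live Colab backend.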
Upload Audio
Upload any file or use cloned output from the demo.
Analyze via /predict
Flask backend returns verdict and confidence scores.
View REAL/FAKE Verdict
Animated result card with shield or alert icon.
Review Confidence Bars
Real % and Fake % bars animate to final values.
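As a rough sketch of how a verdict and the two confidence bars could be derived from a pair of scores: the function below assumes the backend returns one non-negative score for "real" and one for "fake". That response shape and the function name are assumptions, not the documented /predict format.

```python
def confidence_bars(real_score: float, fake_score: float) -> dict:
    """Turn two non-negative scores into a verdict and percentages summing to 100."""
    if real_score < 0 or fake_score < 0:
        raise ValueError("scores must be non-negative")
    total = real_score + fake_score
    if total <= 0:
        raise ValueError("scores must sum to a positive value")
    real_pct = round(100 * real_score / total, 2)
    fake_pct = round(100 - real_pct, 2)  # derived so the two bars always add up to 100
    verdict = "REAL" if real_pct >= fake_pct else "FAKE"
    return {"verdict": verdict, "real_pct": real_pct, "fake_pct": fake_pct}
```

The returned percentages are the final values the bars would animate toward.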
INTERACTIVE DEMO
Try Zuah Live
Requires your ngrok backend URL to be active
Contact: +923000781176 · muhammadzaid@gmail.com
Running on Google Colab? Paste the ngrok URL from your notebook output.
How to get your ngrok URL
- Open your Zuah notebook in Google Colab
- Run all cells - last cell starts Flask
- Copy the ngrok HTTPS URL
- Paste above and click Connect
!ngrok config add-authtoken YOUR_TOKEN
Get free ngrok token
Clone a Voice (OmniVoice)
Live Recording
Record 3–10 seconds directly from your microphone.
Drop audio files here or click to browse
.wav .mp3 .flac .m4a · 3-10 seconds · Max 10 files
Reference Transcription (optional)
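The upload limits stated above (four formats, 3-10 seconds per file, 1-10 files) could be checked before sending anything to the backend. The sketch below is illustrative only: `validate_uploads` is a hypothetical helper, and it assumes each file's duration has already been measured elsewhere.

```python
import os
from typing import List, Sequence, Tuple

ALLOWED_EXTENSIONS = {".wav", ".mp3", ".flac", ".m4a"}
MAX_FILES = 10
MIN_SECONDS, MAX_SECONDS = 3.0, 10.0


def validate_uploads(files: Sequence[Tuple[str, float]]) -> List[str]:
    """Return a list of problems for (filename, duration_seconds) pairs; empty means OK."""
    errors = []
    if not files:
        errors.append("at least one file is required")
    if len(files) > MAX_FILES:
        errors.append(f"too many files: {len(files)} (max {MAX_FILES})")
    for name, seconds in files:
        ext = os.path.splitext(name)[1].lower()  # case-insensitive extension check
        if ext not in ALLOWED_EXTENSIONS:
            errors.append(f"{name}: unsupported format")
        if not MIN_SECONDS <= seconds <= MAX_SECONDS:
            errors.append(f"{name}: duration must be 3-10 seconds")
    return errors
```

Collecting every problem in one pass, rather than stopping at the first, lets the interface show all issues at once.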
Detect Deepfake (95.45% accuracy)
Drop audio or click to browse
History (last 5)
SAVED AUDIO
Audio Library
Original uploads, live recordings, and generated clones are saved when python local_storage_server.py is running on port 8765.
No saved audio yet. Generate a clone or record a sample in the demo.
FAQ
Frequently Asked Questions
Zuah is an Urdu voice AI platform combining zero-shot voice cloning (OmniVoice) and deepfake detection (Wav2Vec2 + AASIST + Conformer) in a single browser-based application.
The frontend is a single HTML file - just open it in any modern browser. You do need a running backend (Google Colab + ngrok) to use the clone and detect features.
Zuah achieves 95.45% accuracy on our evaluation set, using a fusion of Wav2Vec2 waveform features, AASIST spectral-temporal graphs, and Conformer mel-spectrogram encoding.
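To illustrate what "fusion" of several feature extractors can mean, here is a deliberately simplified sketch: three embedding vectors are concatenated and passed through a single logistic layer. This is not Zuah's architecture. The function name, the weights, and the single-layer head are all assumptions made for illustration; the real model is not described in enough detail here to reproduce.

```python
import math
from typing import Sequence


def fuse_and_score(
    wav2vec2_emb: Sequence[float],
    aasist_emb: Sequence[float],
    conformer_emb: Sequence[float],
    weights: Sequence[float],
    bias: float = 0.0,
) -> float:
    """Concatenate three embeddings and return a sigmoid score in (0, 1).

    Hypothetical convention: a higher score means "more likely fake".
    """
    fused = list(wav2vec2_emb) + list(aasist_emb) + list(conformer_emb)
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused embedding length")
    logit = sum(x * w for x, w in zip(fused, weights)) + bias
    return 1.0 / (1.0 + math.exp(-logit))
```

In a trained system the weights would be learned jointly with the encoders; here they are plain inputs so the arithmetic stays visible.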
OmniVoice is optimized for Urdu. While it may produce output for other languages, best results are with Urdu Nastaliq script input.
The minimum is 3 seconds. Optimal quality is achieved with 6-8 seconds of clear, noise-free speech from the target speaker.
No. Audio is processed in memory on Google Colab and returned to your browser; nothing is stored on the backend. The Audio Library saves files only when local_storage_server.py is running.
Yes. The detection panel works independently of cloning: upload any audio file to check whether it is real or AI-generated.
On the free ngrok plan the URL changes between sessions. To get a persistent URL, create a free ngrok account and use a static domain, or use ngrok's reserved subdomain feature.
Upload: .wav, .mp3, .flac, .m4a for both cloning and detection. Output: synthesized voices are returned as .wav files.
Yes. The codebase is available on GitHub (link in footer). The models (OmniVoice, etc.) are subject to their own licenses.