Publication: Automated Detection of Deepfake Audio in Real-Time VoIP Communication
| dc.contributor.author | Chandrasiri, D.D.C.M. | |
| dc.date.accessioned | 2026-02-10T07:13:00Z | |
| dc.date.issued | 2025-12 | |
| dc.description.abstract | With the increasing sophistication of AI-generated deepfake audio, real-time voice communication systems such as Voice over IP (VoIP) are at heightened risk of misuse through impersonation, fraud, and misinformation. Existing detection methods rely primarily on computationally expensive deep learning models trained on static data, which are impractical for live applications constrained by low latency and limited resources. This research addresses that gap by investigating the viability of a lightweight, efficient Random Forest (RF) classifier for real-time deepfake audio detection in VoIP environments. In the proposed system, raw audio is segmented into 2-second chunks, and each chunk is transformed into an 800-dimensional feature vector comprising Mel-Frequency Cepstral Coefficients (MFCCs), Chroma, Spectral Contrast, and Zero-Crossing Rate. After iterative training on combined standard and 'in-the-wild' datasets to improve generalization, the final RF model achieved an overall accuracy of 93.77% on an independent test set. The system also demonstrated an end-to-end processing latency of approximately 76 milliseconds, well below the 200 ms target. The final model met the False Positive Rate (FPR) objective of less than 5%, with a measured FPR of 2.85% on independent data. These findings show that a computationally efficient, classical machine learning approach can achieve both high accuracy and low latency, making it a viable and practical solution for enhancing the security and trustworthiness of real-time voice interactions against emerging deepfake threats. | |
| dc.identifier.uri | https://rda.sliit.lk/handle/123456789/4587 | |
| dc.language.iso | en | |
| dc.publisher | Sri Lanka Institute of Information Technology | |
| dc.subject | Automated Detection | |
| dc.subject | Deepfake Audio | |
| dc.subject | Real-Time VoIP | |
| dc.subject | VoIP Communication | |
| dc.title | Automated Detection of Deepfake Audio in Real-Time VoIP Communication | |
| dc.type | Thesis | |
| dspace.entity.type | Publication |
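The abstract describes splitting raw audio into 2-second chunks, extracting features such as Zero-Crossing Rate, and evaluating the classifier by its False Positive Rate. The following is a minimal illustrative sketch of those three steps using only the Python standard library. It is not the thesis's implementation: the function names, the 8 kHz sample rate, and the label convention (0 = genuine, 1 = deepfake) are assumptions made for the example, and the MFCC, Chroma, and Spectral Contrast features are omitted because they would normally come from a signal-processing library.

```python
import math

CHUNK_SECONDS = 2.0  # chunk length stated in the abstract


def split_into_chunks(samples, sample_rate, chunk_seconds=CHUNK_SECONDS):
    """Split a mono signal into fixed-length chunks, dropping any short tail."""
    chunk_len = int(sample_rate * chunk_seconds)
    n_full = len(samples) // chunk_len
    return [samples[i * chunk_len:(i + 1) * chunk_len] for i in range(n_full)]


def zero_crossing_rate(chunk):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(chunk) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(chunk, chunk[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(chunk) - 1)


def false_positive_rate(y_true, y_pred, positive=1):
    """FPR = FP / (FP + TN); 'positive' is the assumed deepfake label."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return fp / (fp + tn) if (fp + tn) else 0.0


# Hypothetical input: 5 seconds of a 440 Hz tone at an assumed 8 kHz sample rate.
SAMPLE_RATE = 8000
signal = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(5 * SAMPLE_RATE)]

chunks = split_into_chunks(signal, SAMPLE_RATE)
zcr_per_chunk = [zero_crossing_rate(c) for c in chunks]
```

In a full pipeline, each chunk's per-feature values would be concatenated into one vector and passed to the trained classifier; the FPR helper would then be applied to predictions on held-out genuine and deepfake samples.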
Files
Original bundle
- Name: Automated Detection of Deepfake Audio in Real-Time VoIP Communication 1-9.pdf
- Size: 299.61 KB
- Format: Adobe Portable Document Format

- Name: Automated Detection of Deepfake Audio in Real-Time VoIP Communication.pdf
- Size: 841.08 KB
- Format: Adobe Portable Document Format

License bundle
- Name: license.txt
- Size: 1.69 KB
- Format: Item-specific license agreed upon to submission