International Conference on Advancements in Computing [ICAC]

Permanent URI for this community: https://rda.sliit.lk/handle/123456789/312

The International Conference on Advancements in Computing (ICAC) is organized by the Faculty of Computing of the Sri Lanka Institute of Information Technology (SLIIT) as an open forum where academics and industry professionals present their latest research findings and practical deployments in computing.

The primary objective of ICAC is to promote innovative research that addresses real-world challenges and contributes to the social well-being of communities. The conference provides a dynamic platform for researchers from around the world to present groundbreaking findings, exchange ideas, and establish meaningful collaborations.

https://icac.lk

Search Results

  • Publication (Embargo)
    Deepfake Audio Detection: A Deep Learning Based Solution for Group Conversations
    (2020 2nd International Conference on Advancements in Computing (ICAC), SLIIT, 2020-12-10) Wijethunga, R.L.M.A.P.C.; Matheesha, D.M.K.; Al Noman, A.; De Silva, K.H.V.T.A.; Tissera, M.; Rupasinghe, L.
    Recent advancements in deep learning and related technologies have led to improvements in areas such as computer vision, bio-informatics, and speech recognition. This research focuses on a problem involving synthetic speech and speaker diarization. Developments in audio have resulted in deep learning models capable of replicating natural-sounding voices, known as text-to-speech (TTS) systems. This technology could be misused for malicious purposes such as deepfakes, impersonation, or spoofing attacks. We propose a system capable of distinguishing between real and synthetic speech in group conversations. We built Deep Neural Network models and integrated them into a single solution using different datasets, including but not limited to UrbanSound8K (5.6GB), Conversational (12.2GB), AMI-Corpus (5GB), and FakeOrReal (4GB). Our proposed approach consists of four main components. The speech-denoising component cleans and preprocesses the audio using Multilayer Perceptron and Convolutional Neural Network architectures, with 93% and 94% accuracy respectively. Speaker diarization was implemented using two different approaches: Natural Language Processing for text conversion with 93% accuracy, and a Recurrent Neural Network model for speaker labeling with 80% accuracy and a Diarization Error Rate of 0.52. The final component distinguishes between real and fake audio using a CNN architecture with 94% accuracy. With these findings, this research will contribute immensely to the domain of speech analysis.