SLIIT Conference and Symposium Proceedings

Permanent URI for this community: https://rda.sliit.lk/handle/123456789/295

All SLIIT faculties annually conduct international conferences and symposiums. Publications from these events are included in this collection.

Search Results

Now showing 1 - 2 of 2
  • Evaluating the Threshold of Authenticity in Deepfake Audio and Its Implications Within Criminal Justice
    (SLIIT, Faculty of Engineering, 2024-10) Rodgers, J; Jones, K.O; Robinson, C; Chandler-Crnigoj, S; Burrell, H; McColl, S
    Deepfake technology has advanced considerably in recent years, and the world has already seen cases where it has been used maliciously. After a deepfake of UK independent financial advisor and poverty champion Martin Lewis was released on social media, a theory was proposed that a deepfake may be accompanied by additional media to increase the perceived authenticity of the file, for instance ambient noise, or processing to match how the deepfake would sound if it had been recorded on a specific device such as a cellular/mobile phone. Focussing on deepfake audio, a critical listening experiment was conducted in which participants were asked to identify the deepfake audio file from each of several sets of three files. The audio files were created in three ways: real voices with additional sounds added; volunteers recording their voices, which were then put through a deepfake generation system; and voices taken from publicly available podcasts, which were also applied to the deepfake software – the latter set mimics the use of web-accessible voice recordings of prominent or famous people, such as the Prime Minister of the UK. The results show that participants were able to successfully detect one third of the deepfake audio files presented; however, they also incorrectly marked another one third of the real files as deepfakes, whilst the remaining third were missed. The results also gave no definitive confirmation that audio and/or forensic professionals had any greater ability to detect deepfake audio files than other participants. The false-positive result may also reinforce the scepticism and lack of trust created by what is known as the “Liar’s Dividend”. The paper details how the files were created, the testing methodology, and the experimental results. Furthermore, a discussion of future research directions and the effects that deepfakes may have on the criminal justice system is presented.
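    As a hedged illustration (not the authors’ code or data), the headline rates in the abstract above can be reproduced from raw response tallies; the counts below are hypothetical placeholders chosen only to be consistent with the reported “one third” findings:

    ```python
    # Hypothetical tallies from a listening test in which each set of three
    # files contains one deepfake and two real recordings.
    deepfake_trials = 90        # deepfake files presented (hypothetical count)
    deepfakes_flagged = 30      # deepfakes correctly identified
    real_trials = 90            # real files judged (hypothetical count)
    real_flagged_as_fake = 30   # false positives: real audio marked as deepfake

    detection_rate = deepfakes_flagged / deepfake_trials
    false_positive_rate = real_flagged_as_fake / real_trials

    print(f"Detection rate:      {detection_rate:.0%}")
    print(f"False-positive rate: {false_positive_rate:.0%}")
    ```

    The false-positive rate matters as much as the detection rate here: marking genuine recordings as fake is exactly the effect the “Liar’s Dividend” describes.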
  • Deepfake Audio Detection: A Deep Learning Based Solution for Group Conversations
    (2020 2nd International Conference on Advancements in Computing (ICAC), SLIIT, 2020-12-10) Wijethunga, R.L.M.A.P.C.; Matheesha, D.M.K.; Al Noman, A.; De Silva, K.H.V.T.A.; Tissera, M.; Rupasinghe, L.
    The recent advancements in deep learning and related technologies have led to improvements in various areas such as computer vision, bioinformatics, and speech recognition. This research focuses on the problems of synthetic speech and speaker diarization. Developments in audio have resulted in deep learning models capable of replicating natural-sounding voices, also known as text-to-speech (TTS) systems. This technology can be manipulated for malicious purposes such as deepfakes, impersonation, or spoofing attacks. We propose a system capable of distinguishing between real and synthetic speech in group conversations. We built Deep Neural Network models and integrated them into a single solution using different datasets, including but not limited to UrbanSound8K (5.6 GB), Conversational (12.2 GB), AMI-Corpus (5 GB), and FakeOrReal (4 GB). Our proposed approach consists of four main components. The speech-denoising component cleans and preprocesses the audio using Multilayer Perceptron and Convolutional Neural Network architectures, with 93% and 94% accuracy respectively. Speaker diarization was implemented using two different approaches: Natural Language Processing for text conversion, with 93% accuracy, and a Recurrent Neural Network model for speaker labelling, with 80% accuracy and a 0.52 Diarization Error Rate. The final component distinguishes between real and fake audio using a CNN architecture with 94% accuracy. With these findings, this research contributes to the domain of speech analysis.
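    The final component described above is a CNN binary classifier over audio features. As a minimal sketch only (the layer shapes, features, and weights here are illustrative assumptions, not the authors’ architecture), a real-versus-fake scorer of that general form can be expressed as a 1-D convolution, global average pooling, and a logistic output unit:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def conv1d_relu(x, kernels):
        """Valid-mode 1-D convolution of a feature sequence with each kernel, then ReLU."""
        k = kernels.shape[1]
        windows = np.lib.stride_tricks.sliding_window_view(x, k)  # (T-k+1, k)
        return np.maximum(windows @ kernels.T, 0.0)               # (T-k+1, n_filters)

    def classify(features, kernels, weights, bias):
        """Global-average-pool the convolution maps, then apply a logistic unit."""
        pooled = conv1d_relu(features, kernels).mean(axis=0)      # (n_filters,)
        logit = pooled @ weights + bias
        return 1.0 / (1.0 + np.exp(-logit))                       # P(audio is synthetic)

    # Hypothetical inputs: a 100-frame feature contour and randomly
    # initialised (untrained) weights, purely to show the data flow.
    features = rng.standard_normal(100)
    kernels = rng.standard_normal((8, 5))   # 8 filters of width 5
    weights = rng.standard_normal(8)
    score = classify(features, kernels, weights, bias=0.0)
    print(f"P(fake) = {score:.2f}")  # a trained model would be thresholded, e.g. at 0.5
    ```

    In practice the paper’s component would be trained on labelled real/fake audio (e.g. the FakeOrReal dataset) and operate on richer features than this single contour; the sketch only shows the forward pass of the classifier shape being described.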