Publication:
Robust Speech Analysis Framework Using CNN

Type:

Article

Date

2021-12-09

Publisher

2021 3rd International Conference on Advancements in Computing (ICAC), SLIIT

Abstract

Voice is the main component of human communication and a primary means of learning about and recognizing a person's behavior. By listening to a voice, humans can recognize the speaker's identity, speech fluency, accent, emotions, and stress level. When speech fluency is poor, it is difficult to understand what the speaker is saying, and fluency varies from person to person. Specific information carried in a person's voice makes it possible to recognize emotion, stress level, and identity, and every person has unique vocal features that distinguish them from others. The proposed framework identifies a speaker's identity, emotions, speaking fluency, and stress level from their voice. It is built with machine learning techniques, with an emphasis on deep learning. The deep learning algorithm used is a Convolutional Neural Network (CNN); Fast Fourier transform (FFT), Mel-frequency cepstral coefficients (MFCC), and Random Forest are the other techniques used. The proposed AI-based framework provides comparatively accurate results in a user-friendly way.
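
As an illustration of the Fast Fourier transform step named in the abstract, the following is a minimal sketch of turning one audio frame into a magnitude spectrum. It is not the authors' implementation: the function names, the 16 kHz sample rate, the 512-sample frame length, the Hann window, and the synthetic 440 Hz test tone are all assumptions chosen for the example, and a practical pipeline would normally call a library FFT routine rather than the hand-written one shown here.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # transform of even-indexed samples
    odd = fft(x[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def magnitude_spectrum(frame):
    """Hann-window a frame and return magnitudes of the non-negative frequency bins."""
    n = len(frame)
    windowed = [
        s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
        for i, s in enumerate(frame)
    ]
    return [abs(c) for c in fft(windowed)[: n // 2 + 1]]


# Assumed values for illustration only; the source does not state them.
SAMPLE_RATE = 16000  # samples per second
FRAME_LEN = 512      # samples per frame (power of two)

# Synthetic 440 Hz tone standing in for one frame of recorded speech.
tone = [math.sin(2 * math.pi * 440.0 * i / SAMPLE_RATE) for i in range(FRAME_LEN)]

spectrum = magnitude_spectrum(tone)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
peak_hz = peak_bin * SAMPLE_RATE / FRAME_LEN  # centre frequency of the strongest bin
```

Spectra computed this way, one per frame, are the kind of frequency-domain representation from which features such as MFCC are typically derived before being passed to a classifier.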

Keywords

speaker identification, stress analysis, speech emotion analysis, speaker fluency analysis, audio analysis, CNN
