
Academia.edu

Speech Processing

11,869 papers
28,102 followers
Speech processing is the interdisciplinary field that focuses on the analysis, synthesis, and recognition of human speech. It encompasses various techniques and technologies for converting spoken language into a machine-readable format, enabling applications such as speech recognition, speech synthesis, and speaker identification.
This review examines how researchers have approached speech errors across a wide range of studies, drawing on work from databases such as Scopus, Web of Science, and Google Scholar. Rather than beginning with fixed categories, patterns... more
This paper presents a method for extracting MFCC parameters from a normalised power spectrum density. The underlying spectral normalisation method is based on the fact that the speech regions with less energy need more robustness, since... more
Normally, voice activity detection (VAD) refers to speech processing algorithms for detecting the presence or absence of human speech in segments of audio signals. In this paper, however, we focus on speech detection algorithms that take... more
Recent developments in large vocabulary continuous speech recognition (LVCSR) have shown the effectiveness of discriminative training approaches, employing the following three representative techniques: discriminative Gaussian training... more
A lower-error and lower-variance n X ?Z multiplier is suitably proposed for VLSI design. Considering next lower significant stage in P,-' column and useful error-compensation model in the least significant part, and utilizing a near... more
Given the importance of speech in our day-to-day activities and almost all forms of communication, it is crucial to address issues that hinder effective communication. Often, in critical security and military applications, the... more
A novel technique based on dynamic stochastic resonance (DSR) in discrete cosine transform (DCT) domain has been proposed in this paper for the enhancement of dark as well as low-contrast images. In conventional DSR-based techniques, the... more
In many functional magnetic resonance imaging (fMRI) studies blind humans were found to show cross-modal reorganization engaging the visual system in non-visual tasks. For example, blind people can manage to understand (synthetic) spoken... more
by A Mars
This paper proposes an acoustic phonetic study of the foreign accents in the Arabic language. To analyze on a large scale of the connected variations, the contribution of the automatic tools acoustico-phonetic decoding tools along the... more
In this paper, we propose an emotion recognition system from speech signal using both spectral and prosodic features. Most traditional systems have focused on spectral features or prosodic features. Since both the spectral and the... more
Several techniques have been introduced over the years to mitigate the effect of howling sound production in sound reinforcement applications. The algorithmic computational complexity of the howling controllers directly affects their... more
This article presents a full end-to-end pipeline for Arabic Dialect Identification (ADI) using intonation patterns and acoustic representations. Recent approaches to language and dialect identification use linguistic-aware deep... more
The autonomous vehicle market is experiencing significant growth, with indications of transitioning from the "trough of disillusionment" to the "slope of enlightenment" on the Gartner hype cycle chart. Fundamental technologies... more
This paper investigates the performance of a selection of state-of-the-art array signal-processing techniques for the purpose of predicting the binaural listening experiments from the equalization and cancellation (EC) paper by Durlach... more
This study delves into designing intelligent voice assistants through the implementation of open-source speech recognition algorithms. Developers can build AI-powered voice interfaces by utilizing technologies such as Whisper, DeepSpeech,... more
In this paper a methodology for extraction of features in emotional speech recognition is presented. Different emotional states of a speaker produce physiological changes in the human speech system, which is reflected in the variation of... more
Employing the methodology of Editorial Criticism, this article seeks to demonstrate that Book II of the Psalter (i.e., Psalms 42-72) consists of three parallel, compositional arcs that take the form of a journey. Based on keyword links,... more
First application; Classical Fock space; Transfer packet and transfer value; Solution as double continued fractions; Second application; Calculus in Sweedler's duals; Conclusion
In numerous applications that rely on audible voice signals, including speech recognition, audio recording, and telecommunications, the suppression of background noise is an essential component. The present study introduces an innovative... more
When the audio and visual portions of a speech stimulus are presented synchronously, the resulting enhancement in intelligibility is generally much larger than the one obtained when the audio and visual stimuli are presented sequentially.... more
In any signal, noise is an undesired quantity; however, most of the time every signal gets mixed with noise at different levels of its processing and application, due to which the information contained by the signal gets distorted and makes... more
Speech is a form of communication that most people come across in their day-to-day lives. Speech can be used for many purposes, such as speech communication, speech recognition, and speaker identification. In all of these applications a noise... more
In recent years, the established link between the various human communication production domains has become more widely utilised in the field of speech processing. In this work, a state of the art Semi Adaptive Appearance Model (SAAM)... more
To automate assessments of beginning readers, especially those still learning English, we have investigated the types of knowledge sources that teachers use and have tried to incorporate them into an automated system. We describe a set of... more
Recent advancements in self-supervised speech-representation learning for automatic speech recognition (ASR) approaches have significantly improved the results on many benchmarks with low-cost data labeling. In this paper, we train two... more
SiCRNN: A Siamese Approach for Sleep Apnea Identification via Tracheal Microphone Signals
Listening to speech in noise depletes cognitive resources, affecting speech processing. The present study investigated how remaining resources or cognitive spare capacity (CSC) can be deployed by young adults with normal hearing. We... more
The articulatory phonology study requires the simultaneous recording of the speech wave and as many articulatory parameters as possible. To this end, for many years, we have developed the integrated PHONART workstation for speech... more
3rd International Conference on Speech and NLP (SPNLP 2025) will provide an excellent international forum for sharing knowledge and results in theory, methodology and applications of speech and Natural Language Processing (NLP).
A codec for wideband 12kHz speech and audio in a video conferencing application is proposed in this paper. The codec is based on warped linear predictive coding algorithm which utilizes the auditory Bark frequency resolution. The... more
Frequency-warped signal processing techniques are attractive to many wideband speech and audio applications since they have a clear connection to the frequency resolution of human hearing. A warped version of linear predictive coding... more
Electromagnetic articulography (EMA) is a relatively exact and efficient method used in the study of speech production physiology. It allows precise estimation of the movement trajectories of speech articulators such as the tongue, lips, and jaw by... more
Speech recognition refers to the capability of software or hardware to receive a speech signal, identify the speaker’s features in the speech signal, and recognize the speaker thereafter. In general, the speech recognition process... more
It is argued in this article that the common interpretation of Ps 1 as a call for obedience, a view exemplified by Walter Brueggemann's influential article, "Bounded by Obedience and Praise: The Psalms as Canon," does not quite capture... more
In this work, handwritten word recognition problem is modeled in the framework of hidden Markov model (HMM). The states of HMM are identified with the letters of the alphabet. The optimum symbols are then generated by experimental study... more
This paper investigates the effects of personality traits on listening-oriented dialogue to gain insight into building automated listening agents. The analysis of the frequency of dialogue act and the dialogue flow using Hidden Markov... more
Cues for Binaural Perception (Indicios de Percepción Binaural). Guillermo Jardon, 2022. Abstract: The following article is an adaptation of chapters 1, 2, and 3 of the book Head Related Transfer Function and Acoustic Virtual Reality by Kazuhiro Iida (Springer, 2019), where... more
Signal processing methods can improve the quality and intelligibility of oesophageal speech. Current methods show only moderate improvement leaving potential for better results. Quantifying parameters of oesophageal speech relative to... more
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or... more
Artificial Intelligence hasn’t left a facet of life untouched. As it makes its mark in every step of an organization, its application in the field of Human Resources (HR) needs critical analysis. Recruitment is the backbone of a... more
The adoption of electronic learning (e-learning) as a method of disseminating knowledge in the global educational system is growing at a rapid rate, and has created a shift in the knowledge acquisition methods from the conventional... more
The adoption of accent-based automatic speech recognition (ASR) to remove the limitations of accent variation among e-learning participants from different accent backgrounds has been considered a milestone. Several accent-based... more
The fast development of multimedia computing has led to demand for digital speech and images. The manipulation, storage, and transmission of speech and images in their raw form is very expensive and significantly slows the... more
The interaction of humans and robots has the potential to set new grounds in industrial applications as well as in service robotics because it combines the strengths of humans, such as flexibility and adaptability, and the strengths of... more
Variations of the basic string-alignment algorithm are commonly used for the detection and classification of speech-recognition errors. In this procedure, reference and system-output hypothesis speech transcriptions are first aligned... more
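As a rough illustration of the basic string-alignment idea mentioned in the abstract above, the following is a minimal sketch of a word-level Levenshtein alignment between a reference and a hypothesis transcription. It is not the method of the listed paper, whose variations are not visible in the truncated abstract; the function name `align_errors`, the unit edit costs, and the label names are assumptions chosen for illustration.

```python
def align_errors(reference, hypothesis):
    """Align two transcriptions word by word and label each step.

    Hypothetical helper: unit costs for substitution, deletion, and
    insertion are an assumption, not taken from the listed paper.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    n, m = len(ref), len(hyp)

    # cost[i][j] = minimum number of edits turning ref[:i] into hyp[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the bottom-right corner, preferring the diagonal.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])):
            label = "correct" if ref[i - 1] == hyp[j - 1] else "substitution"
            ops.append((label, ref[i - 1], hyp[j - 1]))
            i -= 1
            j -= 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            ops.append(("deletion", ref[i - 1], None))
            i -= 1
        else:
            ops.append(("insertion", None, hyp[j - 1]))
            j -= 1
    ops.reverse()
    return ops


if __name__ == "__main__":
    for op in align_errors("the cat sat on the mat", "the cat sit on mat"):
        print(op)
```

Counting the non-"correct" labels and dividing by the number of reference words gives the usual word error rate; when several alignments share the same minimum cost, the labels depend on the tie-breaking order in the traceback.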
Alisher Navoiy, the great figure of Turkic literature, devoted his whole life, work, and energy to the struggle for human happiness. The poet's immortal works encompass all the important issues of the time and environment in which he lived,... more