In this paper we propose the use of an HMM-based phonetic aligner together with a speech-synthesis-based one to improve the accuracy of the global alignment system. We also present a phone duration-independent measure to evaluate the...
This paper presents the results of our effort in improving the accuracy of a DTW-based automatic phonetic aligner. The adopted model assumes that the phonetic segment sequence is already known and so the goal is only to align the spoken...
Speech is the powerful engine of communication among human beings and language is meant for communicating with the world. This has motivated new researchers to study automatic speech recognition and expand a computer system so it can...
Offline recognition of isolated handwritten digits by...
In Machine Learning (ML) supervised classification problems, it is often beneficial to crunch down the input data, mapping it to an initial pre-processing stage, to improve the performance of pattern recognition systems, such as...
This essay explores the motif of "the Pit" as a third, thematic focus for the Psalter (alongside "refuge" and "pathway"). Especially as it comes to fullest expression in Psalm 88, the Pit can be seen as the "black hole" at the center of...
There have been wide and fundamental changes in the field of speech and language neuropsychology since the publication of Paul Broca’s (1824–1880) epoch-making work on “aphasie” (siège du langage articulé) in 1863 (Broca, 1863). This...
Speech summarization has become an essential tool for efficiently managing and accessing the growing volume of spoken and audiovisual content. However, despite its increasing importance, speech summarization is still not clearly defined...
The accuracy and computational complexity of keyword spotting (KWS) systems are heavily influenced by the choice of audio features in speech signals. This paper introduces a novel approach for audio feature extraction in KWS by leveraging...
Functional MRI was used to investigate the characteristics of the human cerebral response to dynamic ripples. Dynamic ripples are sound stimuli containing regular spectrotemporal modulations, which are of major importance in speech...
This study identifies, categorizes, and analyzes speech errors committed by EFL students and explores their implications for English teaching. Data were collected through content analysis, following a qualitative...
This review examines how researchers have approached speech errors across a wide range of studies, drawing on work from databases such as Scopus, Web of Science, and Google Scholar. Rather than beginning with fixed categories, patterns...
Transcription systems are crucial for linguistic research, language preservation, and computational applications. However, existing systems, such as the International Phonetic Alphabet (IPA), often lack phonemic specificity and require...
This paper presents a method for extracting MFCC parameters from a normalised power spectral density. The underlying spectral normalisation method is based on the fact that the speech regions with less energy need more robustness, since...
Normally, voice activity detection (VAD) refers to speech processing algorithms for detecting the presence or absence of human speech in segments of audio signals. In this paper, however, we focus on speech detection algorithms that take...
This paper proposes an efficient method based on the steered-response power (SRP) technique for sound source localization using microphone arrays: the volumetric SRP (V-SRP). As compared to the SRP, by deploying a sparser volumetric grid,...
Recent developments in large vocabulary continuous speech recognition (LVCSR) have shown the effectiveness of discriminative training approaches, employing the following three representative techniques: discriminative Gaussian training...
A lower-error and lower-variance n × n multiplier is proposed for VLSI design. Considering the next lower significant stage in the P_{n-1} column and a useful error-compensation model in the least significant part, and utilizing a near...
One of the practical issues of the voice activity detection (VAD) algorithms that are commonly installed in today's call centres lies in the tremendous variability of call recording conditions. Particularly, the intermittent occurrence...
Given the importance of speech in our day-to-day activities and almost all forms of communication, it is crucial to address issues that hinder effective communication. Often, in critical security and military applications, the...
A novel technique based on dynamic stochastic resonance (DSR) in discrete cosine transform (DCT) domain has been proposed in this paper for the enhancement of dark as well as low-contrast images. In conventional DSR-based techniques, the...
In many functional magnetic resonance imaging (fMRI) studies blind humans were found to show cross-modal reorganization engaging the visual system in non-visual tasks. For example, blind people can manage to understand (synthetic) spoken...
In recent years, deep learning approaches have gained significant interest as a way of building hierarchical representations from unlabeled data. Additionally, in the field of sound direction-of-arrival (DOA) estimation, the binaural...
This paper presents an acoustic-phonetic study of foreign accents in the Arabic language. To analyze the associated variations on a large scale, the contribution of automatic acoustic-phonetic decoding tools along the...
In this paper, we propose an emotion recognition system from speech signal using both spectral and prosodic features. Most traditional systems have focused on spectral features or prosodic features. Since both the spectral and the...
Several techniques have been introduced over the years to mitigate the effect of howling sound production in sound reinforcement applications. The algorithmic computational complexity of the howling controllers directly affects their...
It has been observed that in an HLH (High-Low-High) tone sequence, the second H tone is lowered in pitch compared with an HHH tone sequence, a phenomenon termed downstep. To calculate the downstep effect and test its scope, we compared...
The article presents a theoretical generalization and a new solution to the scientific problem of forming the terminological apparatus of soil science in the course of scientific analysis and preparation for the professional tasks of future specialists. It determines...
This article presents a full end-to-end pipeline for Arabic Dialect Identification (ADI) using intonation patterns and acoustic representations. Recent approaches to language and dialect identification use linguistic-aware deep...
The autonomous vehicle market is experiencing significant growth, with indications of transitioning from the "trough of disillusionment" to the "slope of enlightenment" on the Gartner hype cycle chart. Fundamental technologies...
This paper investigates the performance of a selection of state-of-the-art array signal-processing techniques for the purpose of predicting the binaural listening experiments from the equalization and cancellation (EC) paper by Durlach...
This study examines the design of intelligent voice assistants through the implementation of open-source speech recognition algorithms. Developers can build AI-powered voice interfaces by utilizing technologies such as Whisper, DeepSpeech,...
In this paper, a methodology for feature extraction in emotional speech recognition is presented. Different emotional states of a speaker produce physiological changes in the human speech system, which are reflected in the variation of...
Employing the methodology of Editorial Criticism, this article seeks to demonstrate that Book II of the Psalter (i.e., Psalms 42-72) consists of three parallel, compositional arcs that take the form of a journey. Based on keyword links,...
In 1953, C. Baur edited three pseudo-Chrysostomian festal homilies which, according to him, came from the circle of Nestorius, if not from Nestorius himself. One is entitled On the Holy Pascha, the other two On the Ascension of...
This paper documents a submission to the shared task on scene segmentation hosted at KONVENS 2021 (Zehe et al., 2021b). The aim of this shared task was to find methods for segmenting narrative texts into different scenes, that is, segments of text...
First application; Classical Fock space; Transfer packet and transfer value; Solution as double continued fractions; Second application; Calculus in Sweedler's duals; Conclusion
In numerous applications that rely on audible voice signals, including speech recognition, audio recording, and telecommunications, the suppression of background noise is an essential component. The present study introduces an innovative...
When the audio and visual portions of a speech stimulus are presented synchronously, the resulting enhancement in intelligibility is generally much larger than the one obtained when the audio and visual stimuli are presented sequentially....
In any signal, noise is an undesired quantity; however, most of the time every signal gets mixed with noise at different levels of its processing and application, due to which the information contained by the signal gets distorted and makes...
Speech is a form of communication that most people come across in their day-to-day life. Speech can be used for many purposes, such as speech communication, speech recognition, and speaker identification. In all of these applications a noise...
Pseudo two-dimensional modeling for the recognition of printed Arabic character strings
We describe a stochastic model of the PHMM type (Pseudo 2D Hidden Markov Model) for the global recognition of printed Arabic character strings. The model is applied directly to the image without prior segmentation. The...
In recent years, the established link between the various human communication production domains has become more widely utilised in the field of speech processing. In this work, a state-of-the-art Semi Adaptive Appearance Model (SAAM)...
Connected Word Recognition (CWR) is used in many applications, such as voice-dialing telephones, automatic data entry, and automated banking systems. This paper presents a novel architecture for CWR based on synergic Hidden Markov...
To automate assessments of beginning readers, especially those still learning English, we have investigated the types of knowledge sources that teachers use and have tried to incorporate them into an automated system. We describe a set of...