Speech Processing Research Papers

Automatic Phonetic Alignment and Its Confidence Measures

2025, Lecture Notes in Computer Science

In this paper we propose the use of an HMM-based phonetic aligner together with a speech-synthesis-based one to improve the accuracy of the global alignment system. We also present a phone duration-independent measure to evaluate the... more

DTW-based phonetic alignment using multiple acoustic features

by Instituto Cervantes

2025, Proceedings of Eurospeech

This paper presents the results of our effort in improving the accuracy of a DTW-based automatic phonetic aligner. The adopted model assumes that the phonetic segment sequence is already known and so the goal is only to align the spoken... more

Human Factors and Aging: Identifying and Compensating for Age-related Deficits in Sensory and Cognitive Function

by F. Schieber

2025

Recent Advances in Audio-Visual Speech Recognition: Deep Learning Perspective

by Diksha Pawar

2025, First International Conference on Advances in Computer Vision and Artificial Intelligence Technologies (ACVAIT 2022), Atlantis Press

Speech is the powerful engine of communication among human beings and language is meant for communicating with the world. This has motivated new researchers to study automatic speech recognition and expand a computer system so it can... more

Bilingual language learning: An ERP study relating early brain responses to speech, language input, and later word production

by Harriett Romo

2025, Journal of Phonetics

Reconnaissance hors ligne des chiffres manuscrits isolés par l'approche Neuro-Génétique

by Bachir Djebbar

2025, Revue de l'Information Scientifique et Technique

Speech Feature Extraction for Emotion Recognition Using Machine Learning

by Arthur N dos Santos

2025

In Machine Learning (ML) supervised classification problems, it is often beneficial to crunch down the input data, mapping it to an initial pre-processing stage, to improve the performance of pattern recognition systems, such as... more

Incertitudes Stochastiques sur des Modèles de Markov Cachés: Application dans l'Aide à la Décision pour une Maintenance Préventive Industrielle

by Frédéric KRATZ

2025

The Black Hole at the Center of the Psalms

by Brent A Strawn

2025, Interpretation

This essay explores the motif of "the Pit" as a third, thematic focus for the Psalter (alongside "refuge" and "pathway"). Especially as it comes to fullest expression in Psalm 88, the Pit can be seen as the "black hole" at the center of... more

Changing Perspectives in Speech and Language Neuropsychology, 1863-2023

by Frank Stahnisch

2025, Frontiers in Psychology

There have been wide and fundamental changes in the field of speech and language neuropsychology since the publication of Paul Broca’s (1824–1880) epoch-making work on “aphasie” (siège du langage articulé) in 1863 (Broca, 1863). This Research Topic surveys the efforts to understand the relationship between human behavior and brain function with respect to language, cognition, and memory with a focus on activities from the 1860s to 1960s in Europe and North America. The reviewed period begins with the groundbreaking work of Broca in France, John Hughlings Jackson (1835–1911) in Great Britain, and Carl Wernicke (1848–1905) in Germany to identify the neuropathological sources of selective impairments in language (Figure 1) (Levelt, 2013).
Efforts continued throughout the second half of the 19th century, leading to increased activity in the wake of the two World Wars. One hundred years later, interest resurged in the earlier ideas when new approaches were initiated by individuals such as Wilder Penfield (1891–1976) in Canada (Penfield, 1949) and Norman Geschwind (1926–1984) in the United States (Geschwind, 1970). Historiographical approaches to the understanding of these efforts have considered the applied models and metaphors of speech and language neuropsychology, methodological approaches in the clinic and laboratory, as well as the status of evidence, the flow of ideas and people, along with interdisciplinary exchanges with anthropology, education, linguistics, medicine, and sociocultural contexts (Eling, 1994).
The scholarly collaboration showcased in this Research Topic has generated novel insights and stimulating perspectives on the emergence of speech and language neuropsychology. For instance, the twenty contributing authors have investigated phases and events of the long-standing debate between localizationists and holists in the field of neuropsychology. They have analyzed how and why new concepts and theories have emerged, including for clinical and rehabilitation purposes (Stahnisch and Hoffmann, 2010). Furthermore, the limits imposed by certain models on basic and clinical research since Broca’s and Wernicke’s times were investigated (Tremblay and Dick, 2016).

From Speech to Summary: A Comprehensive Survey of Speech Summarization

by Fabian Retkowski

2025

Speech summarization has become an essential tool for efficiently managing and accessing the growing volume of spoken and audiovisual content. However, despite its increasing importance, speech summarization is still not clearly defined... more

Automatic Audio Feature Extraction for Keyword Spotting

by Paola Vitolo

2025, IEEE Signal Processing Letters

The accuracy and computational complexity of keyword spotting (KWS) systems are heavily influenced by the choice of audio features in speech signals. This paper introduces a novel approach for audio feature extraction in KWS by leveraging... more

Spectrotemporal features of the auditory cortex: the activation in response to dynamic ripples

by W. Backes

2025, NeuroImage

Functional MRI was used to investigate the characteristics of the human cerebral response to dynamic ripples. Dynamic ripples are sound stimuli containing regular spectrotemporal modulations, which are of major importance in speech... more

English as a Foreign Language (EFL) Students' Speech Errors: Implications for English Language Teaching

by Roniel Fortuna

2025, Formosa Journal of Multidisciplinary Research (FJMR)

This study identifies, categorizes, and analyzes speech errors committed by EFL students and explores their implications for English teaching. Data were collected through content analysis, following a qualitative... more

Revisiting the speech error phenomenon: A thematic narrative review

by Roniel Fortuna

2025, International Journal of Science and Research Archive

This review examines how researchers have approached speech errors across a wide range of studies, drawing on work from databases such as Scopus, Web of Science, and Google Scholar. Rather than beginning with fixed categories, patterns... more

NAVLIPI: A Universal Phonemic Transcription System for the World’s Languages

by Sam Robert

2025, Journal of Indian Languages and Indian Literature in English

Transcription systems are crucial for linguistic research, language preservation, and computational applications. However, existing systems, such as the International Phonetic Alphabet (IPA), often lack phonemic specificity and require... more

Spectral Normalisation MFCC Derived Features for Robust Speech Recognition

by Adriano Tavares

2025, 9th Conference Speech and …

This paper presents a method for extracting MFCC parameters from a normalised power spectrum density. The underlined spectral normalisation method is based on the fact that the speech regions with less energy need more robustness, since... more

Inferring Speech Activity from Encrypted Skype Traffic

by Chin-Laung Lei

2025

Normally, voice activity detection (VAD) refers to speech processing algorithms for detecting the presence or absence of human speech in segments of audio signals. In this paper, however, we focus on speech detection algorithms that take... more

Efficient Steered-Response Power Methods for Sound Source Localization Using Microphone Arrays

by Tadeu Diniz Ferreira

2025

This paper proposes an efficient method based on the steered-response power (SRP) technique for sound source localization using microphone arrays: the volumetric SRP (V-SRP). As compared to the SRP, by deploying a sparser volumetric grid,... more

Audio segmentation for meetings speech processing

by Nelson Morgan

2025

Combining Discriminative Feature, Transform, and Model Training for Large Vocabulary Speech Recognition

by Nelson Morgan

2025, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07

Recent developments in large vocabulary continuous speech recognition (LVCSR) have shown the effectiveness of discriminative training approaches, employing the following three representative techniques: discriminative Gaussian training... more

Design of a lower-error fixed-width multiplier for speech processing application

by Wu-Shiung Feng

2025

A lower-error and lower-variance n × n multiplier is suitably proposed for VLSI design. Considering next lower significant stage in P,-' column and useful error-compensation model in the least significant part, and utilizing a near... more

Voice activity detection under the highly fluctuant recording conditions of call centres

by Ivan Mica

2025

One of the practical issues of the voice activity detection (VAD) algorithms that are commonly installed in today's call centres lies in the tremendous variability of call recording conditions. Particularly, the intermittent occurrence... more

Speech Intelligibility Improvement through Optimized Voice Transformation in Transfer Learning Framework

by Ritujoy Biswas

2025, Ph.D. Thesis

Given the importance of speech in our day-to-day activities and almost all forms of communication, it is crucial to address issues that hinder effective communication. Often, in critical security and military applications, the intelligibility of speech is of higher importance than the quality. In such cases, it is crucial that the speech utterance is comprehended for the exact message it was meant to convey, while the pleasantness of the speech and how good it sounds assumes secondary importance. In cases where noise degrades speech intelligibility, it is vital to develop measures that allow speech utterances to retain their intelligibility despite the ambient noise. When the voice is modified in a certain way, its intelligibility improves over that of the original voice in the presence of noise. This is known as the Lombard effect. Of the several techniques to achieve Lombard speech, one effective approach is formant shifting, where the formant frequencies are relocated to regions in the spectrum where the noise cannot mask the phonetic information they contain. This shift was guided by a trapezoidal voice transformation function (TVTF) that mapped the original locations of formant frequencies in the spectrum to new locations. However, when done empirically, formant shifting results in artifacts.

Hence, the objective of this thesis is to ascertain techniques to optimize such shifts in formants in order to maximize the boost in intelligibility in a near-end noisy environment. Such optimization should encompass factors like the language being spoken, the presence of realistic noises that are frequently encountered, as well as variation in noise intensities in terms of signal SNR. To that end, as the first contribution of this thesis, we propose optimizing the shaping parameters of the TVTF via a genetic optimization technique called comprehensive learning particle swarm optimization (CLPSO). Such optimization was specific to a certain combination of language, noise type, and SNR level. Next, we propose a joint enhancement scheme where we combine several voice modification techniques with formant shifting to preserve speech quality while boosting intelligibility. These include time-scale modification, energy redistribution, and smoothing of formant contours. Although the performance increased, the optimization, which was already computationally intensive, took even longer to converge due to the increased dimension of the input vector corresponding to the additional techniques used in Lombard speech generation. To address this, as yet another contribution, a statistical configuration of the VTF is proposed: the Gaussian voice transformation function (GVTF), which had just three parameters to be optimized (instead of five in the TVTF). This reduced the convergence time and resulted in a smoother contour of the shifted formants across the frames.

In case of changes in the ambient conditions of SNR levels, language, and/or noise type, our contributions focus on the proposition of transfer learning on all environmental factors like SNR, languages, and noise types. The transfer learning across SNR levels was achieved through a Gaussian process regression, where the known VTF parameters at some SNR levels were used to estimate the unknown parameters for other SNR levels. In case of a change in the language, the comparative analysis of pitch and formant frequencies between the source and target languages was used to modify the shaping parameters of the VTF to conform to the target language. However, the transfer across noises was only made possible through GVTF. This is handled through comparative analysis of the Gaussian approximations of the noise magnitude spectra of the source and target noises. Finally, we establish a mechanism for combined transfer learning across languages and noises for dealing with cases where both those conditions change.

To demonstrate the efficacy of the proposed algorithms in real time, the entire optimization cycle, using TVTF and GVTF, along with all transfer learning mechanisms, has been built into an application user interface developed with App Designer in MATLAB. As a practical application of the optimization used in this thesis, the importance of an optimal dataset generated through CLPSO has been demonstrated by training a lightweight ANN on a resource-constrained Raspberry Pi to suggest the optimal microphone location in a room relative to a speaker.

Dark and low-contrast image enhancement using dynamic stochastic resonance in discrete cosine transform domain

by Rajib K Jha

2025, APSIPA Transactions on Signal and Information Processing

A novel technique based on dynamic stochastic resonance (DSR) in discrete cosine transform (DCT) domain has been proposed in this paper for the enhancement of dark as well as low-contrast images. In conventional DSR-based techniques, the... more

Network Modeling for Functional Magnetic Resonance Imaging (fMRI) Signals during Ultra-Fast Speech Comprehension in Late-Blind Listeners

by Hermann Ackermann

2025, PLOS ONE

In many functional magnetic resonance imaging (fMRI) studies blind humans were found to show cross-modal reorganization engaging the visual system in non-visual tasks. For example, blind people can manage to understand (synthetic) spoken... more

Unsupervised feature learning on monaural DOA estimation using convolutional deep belief networks

by Craig Jin

2025

In recent years, deep learning approaches have gained significant interest as a way of building hierarchical representations from unlabeled data. Additionally, in the field of sound direction-of-arrival (DOA) estimation, the binaural... more

Foreign accent classification for Arabic speech learning

by A Mars

2025, world-comp.org

This paper proposes an acoustic phonetic study of the foreign accents in the Arabic language. To analyze on a large scale of the connected variations, the contribution of the automatic tools acoustico-phonetic decoding tools along the... more

Speech emotion recognition system using both spectral and prosodic features

by cr srinivasan

2025

In this paper, we propose an emotion recognition system from speech signal using both spectral and prosodic features. Most traditional systems have focused on spectral features or prosodic features. Since both the spectral and the... more

On the Use of Forward Gain Switching for Acoustic Howling Control

by Oluwatobi Balogun

2025, 2024 IEEE NIGERCON

Several techniques have been introduced over the years to mitigate the effect of howling sound production in sound reinforcement applications. The algorithmic computational complexity of the howling controllers directly affects their... more

Downstep and its interaction with focus and boundary in Mandarin Chinese

by Frank Kügler

2025

It has been observed that in a HLH (High-Low-High) tone sequence, the second H tone is lowered in pitch, as compared to a HHH tone sequence, which was termed as downstep. To calculate the downstep effect and test its scope, we compared... more

ФОРМУВАННЯ ТЕРМІНОЛОГІЧНОГО АПАРАТУ ҐРУНТОЗНАВСТВА В ПРОЦЕСІ НАУКОВОГО АНАЛІЗУ ТА ПІДГОТОВКИ ДО ВИКОНАННЯ ПРОФЕСІЙНИХ ЗАВДАНЬ МАЙБУТНІХ ФАХІВЦІВ

by Валентина Володимирівна Оніпко

2025

The article presents a theoretical generalization and a new solution to the scientific problem of forming the terminological apparatus of soil science in the course of scientific analysis and of preparing future specialists to perform professional tasks. The methodological foundations for forming the terminological apparatus in soil science are defined. Scientific studies and literature sources on methods and approaches to studying soil science terminology are analyzed.
The role of scientific analysis in managing and optimizing the formation of the terminological apparatus is highlighted, namely: standardization of terminology, which makes it possible to identify key terms, ensures an unambiguous understanding of concepts and their definitions, and facilitates communication among specialists; adaptation to new discoveries and methods, which may lead to updating or expanding the terminological apparatus; avoidance of synonymy and confusion, which can arise from using different terms for the same concepts or from vague definitions; and ensuring accuracy and clarity. An optimized terminological apparatus can promote the development of soil science, growing interest in research in this field, and the introduction of new approaches into practice. It is shown that forming the terminological apparatus in the Soil Science course is a system, an integrative unity of components such as studying basic concepts, understanding interrelationships, practical skills, continual replenishment of knowledge, and applying interdisciplinary knowledge.
The relationship between forming the terminological apparatus and the successful performance of professional tasks by future specialists in soil science, agronomy, geodesy, and land management is substantiated. It is emphasized that the terminological apparatus of soil science helps students develop comprehensive strategies in agronomy and geodesy that meet the demands of the modern labor market, and that it supports their successful professional integration after graduation by providing the tools needed for effective work. It enables future specialists to communicate clearly with colleagues, understand scientific literature, carry out various analytical tasks, and identify problems in soil science and develop effective solutions to them. Clarity and unification of terminology contribute to the accuracy and reliability of research results and to higher professional status.

Learning Intonation Pattern Embeddings for Arabic Dialect Identification

by Elsayed Issa

2025, Interspeech 2020

This article presents a full end-to-end pipeline for Arabic Dialect Identification (ADI) using intonation patterns and acoustic representations. Recent approaches to language and dialect identification use linguistic-aware deep... more

Environmental Sound Recognition in Embedded Systems: Bridging Experiments in Passenger Vehicles to Autonomous Vehicle Applications in Smart Cities

by Andre L Florentino

2025, Master's thesis

The autonomous vehicle market is experiencing significant growth, with indications of transitioning from the "trough of disillusionment" to the "slope of enlightenment" on the Gartner hype cycle chart. Fundamental technologies encompassing extensive data analytics, computational capabilities, and sensor fusion techniques have already been established, and all stakeholders in this industry are persistently exploring novel approaches to enhance the overall perception of end users in terms of safety and trustworthiness. In this context, this project aims to develop and implement an Environmental Sound Recognition (ESR) algorithm in an embedded system for deployment in autonomous vehicles for Smart Cities in 2025, targeting advanced functionalities for early warning systems. Due to hardware constraints, a regular passenger vehicle was used, embedding the ESR algorithm in a Raspberry Pi with a microphone array. The limited literature on ESR algorithms for vehicles primarily focuses on siren detection without real-time inferences, and to address this, a dataset benchmarking study confirmed classifiers’ accuracy, leading to the creation of a new dataset tailored to autonomous vehicles. This new dataset provided a comprehensive baseline where several classifiers were trained and evaluated for accuracy, memory usage, and prediction time, with CNN 2D using aggregated features emerging as the top-performing model, achieving an average accuracy of 80% in the sliding window process. During the indoor experiment, the total prediction time attained an average of 47.6 ms, validating the algorithm’s performance with weighted F1-scores close to or better than cross-validation results. In the final phase of the methodology, real-world tests conducted in a passenger vehicle yielded similar results. However, inconsistencies were observed in certain classes due to insufficient sample diversity and environmental noise, which affected their accuracy. 
The results of this project indicate that its general objective was successfully achieved, contributing to understanding of ESR algorithms in embedded systems within passenger vehicles, and it is ready for integration into the electric and electronic architecture of autonomous vehicles for Smart Cities. Additionally, upon conducting further experiments across various vehicle categories to assess cabin insulation effects, this project could potentially enhance safety features for drivers with hearing impairments by adapting the ESR algorithm as an add-on feature in regular passenger vehicles.

Comparison between the equalization and cancellation model and state of the art beamforming techniques

by Jesper Udesen

2025

This paper investigates the performance of a selection of state-of-the-art array signal-processing techniques for the purpose of predicting the binaural listening experiments from the equalization and cancellation (EC) paper by Durlach... more

Building an Intelligent Voice Assistant Using Open-Source Speech Recognition Systems

by Venkata Baladari

2025, Journal of Scientific and Engineering Research

This study delves into designing intelligent voice assistants through the implementation of open-source speech recognition algorithms. Developers can build AI-powered voice interfaces by utilizing technologies such as Whisper, DeepSpeech,... more

Development of Indonesian large vocabulary continuous speech recognition system within A-STAR project

by Puji Lestari

2025

Emotional recognition applying speech signal processing

by Alvaro Angel Orozco Gutierrez

2025

In this paper, a methodology for extraction of features in emotional speech recognition is presented. Different emotional states of a speaker produce physiological changes in the human speech system, which are reflected in the variation of... more

JOURNEY TO THE KINGDOM: THE THREE-FOLD COMPOSITIONAL ARC OF BOOK II OF THE PSALTER

by Jerod Gilcher

2025

Employing the methodology of Editorial Criticism, this article seeks to demonstrate that Book II of the Psalter (i.e., Psalms 42-72) consists of three parallel, compositional arcs that take the form of a journey. Based on keyword links,... more

Un discours anonyme (CPG 4739): Témoin d'une tradition homilétique pour l'ascension

by Helene Grelier

2025, Questions Liturgiques/Studies in Liturgy

In 1953, C. Baur edited three pseudo-Chrysostomian festal homilies which, according to him, come from the circle of Nestorius, if not from Nestorius himself. One is entitled On Holy Easter, the other two On the Ascension of Our Lord Jesus Christ. According to Baur, their thought and expression reflect the Christological debates that began between Eustathius of Antioch and Diodore of Tarsus, continued with Theodore of Mopsuestia, and reached a climax with Nestorius. The attribution of these texts to Chrysostom, however, would be due to historical and ecclesial circumstances: Theodosius had ordered the works of Nestorius to be burned following his condemnation. The manuscript tradition attests that, quite early on, works of Nestorius were transmitted under the patronage of a name famous in Antioch, John Chrysostom, in order to escape destruction. According to C. Baur, these three homilies could belong to this category of texts. However, the two homilies on the Ascension differ greatly, both in their scriptural coloring and in the features of the feast that they emphasize, one being marked by a strain of anti-Jewish polemic, unlike the other. One may legitimately suppose that they were therefore written in quite distinct circumstances or for quite distinct audiences. This is why we have chosen not to examine them together and to present a study of homily CPG 4739 in its own right [6]. C. Baur edited the text from a selection of twelve manuscripts, after consulting seventeen, while also being aware of the existence of Messina S. Salvatore 3 (12th c.) and of the Istanbul manuscript, Megalê tou genous Scholê, 62 (1), dated 1373 [7]. The text appears in collections of liturgical homilies in honor of the feasts of Christ, and even of Mary, in the midst of which there is usually a sub-corpus on the Ascension [8].
Homily CPG 4739 is most often associated with the homilies of Basil of Seleucia (CPG 6659) [9] and of Ps.-Eusebius of Alexandria (CPG 5528) [10], and to a lesser extent with another pseudo-Chrysostomian homily (CPG 4533) [11] and with one by Chrysostom himself (CPG 4342) [12]. We reproduce below the text edited by C. Baur, together with the title under which the manuscript tradition transmitted the homily under the name of John Chrysostom, and add a translation.
[6] The manuscript tradition has very rarely associated the two texts. According to our searches in the Pinakes database, only the manuscript Berlin gr. 77, Phillipps 1481 (12th c.) presents the two homilies one after the other.
[7] The Pinakes database lists other manuscripts not mentioned by C. Baur: Meteora 138 (16th c.); Messina S. Salv. 092 (11th c.), Bibliotheca Regionale Universitaria, folios 174-187; Messina S. Salv. 026 (12th-13th c.), folios 43v-45v; Athena 0282 (16th c.), EBE, folios 275-277v; Oxford, Holkham gr. 022 (15th-16th c.), Bodl. Libr., folios 134v-136v; Paris, gr. 1186 (14th c.), BnF, folios 200-201v; Tyrnabo 33 (16th c.), Dêmotikê bibl., folios 312v-315. Conversely, several manuscripts used by C. Baur are not listed in Pinakes: Cod. Ottobon. gr. 411 (14th-15th c.); Vat. gr. 1190 (16th c.); Athos, Monè Batopèdiou, 633 (1422); Venedig, Marcian. gr. 43 (15th c.); Vaticanus gr. 564 (12th c.); Paris gr. 1175 (11th c.); Saloniki, monè tôn Blataïôn 6 (10th c.).
[8] We refer to the summary description of each manuscript given by C. Baur in the introduction to his edition, Drei unedierte Festpredigten aus der Zeit der nestorianischen Streitigkeiten, pp. 111-112, and to the Pinakes database for the details of each manuscript mentioned.
[9] According to the Pinakes database, it appears in nine manuscripts together with the homily on the Ascension by Basil of Seleucia.
[10] According to the same tool, it is found in twelve manuscripts together with the text of Ps.-Eusebius of Alexandria.
[11] The two texts appear together in seven manuscripts.
[12] The two homilies are found together in seven manuscripts.

Twin BERT Contextualized Sentence Embedding Space Learning and Gradient-Boosted Decision Tree Ensembles for Scene Segmentation in German Literature

by Sebastian Gombert

2025

This paper documents a submission to the shared task on scene segmentation hosted at KONVENS 2021 (Zehe et al., 2021b). The aim of this shared task was to find methods for segmenting narrative texts into different scenes - segments of text... more

Rational expressions: Two applications in Combinatorial Physics

by Christophe Tollu

2025

Outline: First application; Classical Fock space; Transfer packet and transfer value; Solution as double continued fractions; Second application; Calculus in Sweedler's duals; Conclusion.

Digital Signal Processing For Noise Suppression In Voice Signals

by Muthukumaran Vaithianathan

2025, Digital Signal Processing For Noise Suppression In Voice Signals

In numerous applications that rely on audible voice signals, including speech recognition, audio recording, and telecommunications, the suppression of background noise is an essential component. The present study introduces an innovative...

The effects of temporal asynchrony on the intelligibility of accelerated speech

by Brian Simpson

2025

When the audio and visual portions of a speech stimulus are presented synchronously, the resulting enhancement in intelligibility is generally much larger than the one obtained when the audio and visual stimuli are presented sequentially...

A study on different linear and non-linear filtering techniques of speech and speech recognition

by Minajul Haque

2025, ADBU Journal of Engineering Technology (AJET)

In any signal, noise is an undesired quantity; however, most of the time every signal gets mixed with noise at different levels of its processing and application, due to which the information contained by the signal gets distorted and makes...

A review on speech filtering and its different techniques

by Minajul Haque

2025, ADBU Journal of Engineering Technology (AJET)

Speech is a form of communication that most people come across in their day-to-day life. Speech can be used for many purposes, such as speech communication, speech recognition, and speaker identification. In all of these applications a noise...

Modélisation pseudo bidimensionnelle pour la reconnaissance de chaînes de caractères arabes imprimés

by Abdel Belaid

2025

We describe a stochastic model of the PHMM type (Pseudo 2D Hidden Markov Model) for the global recognition of printed Arabic character strings. The model is applied directly to the image, without prior segmentation. The notion of duration is used both horizontally and vertically in order to model, respectively, the horizontal ligatures and their elongations, and the overlapping of characters. The extension of these models to the recognition of Tunisian city names relies on syntactic rules.

Keywords: horizontal and vertical ligatures, character strings, duration, PHMM.

Several solutions have been proposed for Arabic script recognition; a list of existing work can be found in [2]. As for any cursive script, both global and analytical methods have been tested. However, because of certain characteristics of Arabic script (Table 1), the solution is not trivial. Indeed, the morphological study of Arabic shows that it is difficult to perform recognition at the character level, for the following reasons:

- An Arabic character can have up to 4 different shapes depending on its position in the word.
- Several groups of characters share the same body but differ in the number and/or placement of their diacritical dots. These dots lie above or below the baseline, at different places in the image depending on the character and, in some cases, on its position in the word. They are sensitive to noise: they can be attached to the body, and are often confused with noise.
- Some Arabic characters include a loop that can take different shapes. The loop is very often filled in or open.

All these reasons led us to turn to the character string, conventionally called a PAW (Piece of Arabic Word); a PAW can correspond to the word or to part of the word. PAWs have a structure that is easy to isolate (separation into connected components, contour extraction...), and they can occupy different positions in the word without changing shape. Moreover, they characterize the whole morphology of Arabic script: they contain both isolated characters and ligatured characters. However, PAWs contain very specific horizontal and vertical ligatures:

- Horizontal ligatures are variable; they appear as an elongation of the base ligature, which complicates the segmentation process.
- Vertical ligatures, although rare, pose significant problems when they occur. The vertical overlap of characters often alters the morphology of some of them, and the baseline is no longer horizontal.

In view of these problems, we opted for global modeling at the PAW level. However, this choice is penalized by the large number of PAWs in an open-vocabulary setting: it would be inconceivable to envisage a
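The abstract notes that PAWs are easy to isolate by separation into connected components. As an illustration only, and not the authors' implementation, the following Python sketch labels 8-connected ink regions in a binary image given as a list of rows of 0/1 values. The function names `connected_components` and `bounding_box`, the image representation, and the choice of 8-connectivity are all assumptions made for this example. Note that diacritical dots would come out as their own components and would still have to be attached to a character body in a later step, which is not shown.

```python
from collections import deque


def connected_components(image):
    """Return the 8-connected foreground regions of a binary image.

    `image` is a list of rows containing 0 (background) or 1 (ink).
    Each region is returned as a list of (row, col) pixel coordinates.
    This is a hypothetical helper for illustration, not code from the paper.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited ink pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def bounding_box(pixels):
    """Return (top, left, bottom, right) of a list of (row, col) pixels."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return min(ys), min(xs), max(ys), max(xs)
```

A candidate PAW image could then be cropped from each bounding box and passed to a recognizer. Because Arabic is written right to left, sorting the boxes by their right edge in descending order would be one plausible way to recover reading order, though the abstract does not say how the authors order them.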

Maximising Audiovisual Correlation with Automatic Lip Tracking and Vowel Based Segmentation

by Quóc Anh Nguyễn

2025, Lecture Notes in Computer Science

In recent years, the established link between the various human communication production domains has become more widely utilised in the field of speech processing. In this work, a state-of-the-art Semi Adaptive Appearance Model (SAAM)...

A New Connected Word Recognition Using Synergic HMM and DTW

by Mohammad Mosleh

2025

Connected Word Recognition (CWR) is used in many applications, such as voice-dialing telephones, automatic data entry, and automated banking systems. This paper presents a novel architecture for CWR based on synergic Hidden Markov...

Assessment of emerging reading skills in young native speakers and language learners

by THẢO MINH DƯƠNG

2025, Speech Communication

To automate assessments of beginning readers, especially those still learning English, we have investigated the types of knowledge sources that teachers use and have tried to incorporate them into an automated system. We describe a set of...
