Articles
- Mingote, Victoria; Miguel, Antonio; Ortega, Alfonso; Lleida, Eduardo. Class token and knowledge distillation for multi-head self-attention speaker verification systems. DIGITAL SIGNAL PROCESSING. 2023. DOI: 10.1016/j.dsp.2022.103859
- Ribas, Dayana; Pastor, Miguel A.; Miguel, Antonio; Martinez, David; Ortega, Alfonso; Lleida, Eduardo. Automatic voice disorder detection using self-supervised representations. IEEE ACCESS. 2023. DOI: 10.1109/ACCESS.2023.3243986
- Mingote, Victoria; Viñals, Ignacio; Gimeno, Pablo; Miguel, Antonio; Ortega, Alfonso; Lleida, Eduardo. Multimodal Diarization Systems by Training Enrollment Models as Identity Representations. APPLIED SCIENCES (SWITZERLAND). 2022. DOI: 10.3390/app12031141
- Mingote, V.; Miguel, A.; Ribas, D.; Ortega, A.; Lleida, E. aDCF loss function for deep metric learning in end-to-end text-dependent speaker verification systems. IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING. 2022. DOI: 10.1109/TASLP.2022.3145307
- Gimeno, P.; Ribas, D.; Ortega, A.; Miguel, A.; Lleida, E. Unsupervised adaptation of deep speech activity detection models to unseen domains. APPLIED SCIENCES (SWITZERLAND). 2022. DOI: 10.3390/app12041832
- Llombart, J.; Ribas, D.; Miguel, A.; Vicente, L.; Ortega, A.; Lleida, E. Progressive loss functions for speech enhancement with deep neural networks. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING. 2021. DOI: 10.1186/s13636-020-00191-3
- Gimeno, Pablo; Viñals, Ignacio; Ortega, Alfonso; Miguel, Antonio; Lleida, Eduardo. Multiclass audio segmentation based on recurrent neural networks for broadcast domain data. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING. 2020. DOI: 10.1186/s13636-020-00172-6
- Prieto, S.; Ortega, A.; López-Espejo, I.; Lleida, E. Shouted speech compensation for speaker verification robust to vocal effort conditions. INTERSPEECH (USB). 2020. DOI: 10.21437/Interspeech.2020-1402
- Gimeno, P.; Mingote, V.; Ortega, A.; Miguel, A.; Lleida, E. Partial AUC optimisation using recurrent neural networks for music detection with limited training data. INTERSPEECH (USB). 2020. DOI: 10.21437/Interspeech.2020-1108
- Mingote, V.; Miguel, A.; Ortega, A.; Lleida, E. Training speaker enrollment models by network optimization. INTERSPEECH (USB). 2020. DOI: 10.21437/Interspeech.2020-2325
- Mingote, V.; Miguel, A.; Ortega, A.; Lleida, E. Optimization of the area under the ROC curve using neural network supervectors for text-dependent speaker verification. COMPUTER SPEECH AND LANGUAGE. 2020. DOI: 10.1016/j.csl.2020.101078
- Mingote, V.; Castan, D.; Mclaren, M.; Nandwana, M.K.; Ortega, A.; Lleida, E.; Miguel, A. Language recognition using triplet neural networks. INTERSPEECH (USB). 2019. DOI: 10.21437/Interspeech.2019-2437
- Viñals, I.; Ribas, D.; Mingote, V.; Llombart, J.; Gimeno, P.; Miguel, A.; Ortega, A.; Lleida, E. Phonetically-aware embeddings, wide residual networks with time-delay neural networks and self attention models for the 2018 NIST speaker recognition evaluation. INTERSPEECH (USB). 2019. DOI: 10.21437/Interspeech.2019-2417
- Mingote, V.; Miguel, A.; Ribas, D.; Ortega, A.; Lleida, E. Optimization of false acceptance/rejection rates and decision threshold for end-to-end text-dependent speaker verification systems. INTERSPEECH (USB). 2019. DOI: 10.21437/Interspeech.2019-2550
- Viñals, I.; Gimeno, P.; Ortega, A.; Miguel, A.; Lleida, E. Vivolab speaker diarization system for the Dihard 2019 challenge. INTERSPEECH (USB). 2019. DOI: 10.21437/Interspeech.2019-2462
- Llombart, J.; Ribas, D.; Miguel, A.; Vicente, L.; Ortega, A.; Lleida, E. Progressive speech enhancement with residual connections. INTERSPEECH (USB). 2019. DOI: 10.21437/Interspeech.2019-1748
- Llombart, J.; Ribas, D.; Miguel, A.; Vicente, L.; Ortega, A.; Lleida, E. Speech enhancement with wide residual networks in reverberant environments. INTERSPEECH (USB). 2019. DOI: 10.21437/Interspeech.2019-1745
- Lleida, Eduardo; Ortega, Alfonso; Miguel, Antonio; Bazán-Gil, Virginia; Perez, Carmen; Gómez, Manuel; de Prada, Alberto. Albayzin 2018 Evaluation: The IberSpeech-RTVE Challenge on Speech Technologies for Spanish Broadcast Media. APPLIED SCIENCES (SWITZERLAND). 2019. DOI: 10.3390/app9245412
- Viñals, Ignacio; Ortega, Alfonso; Villalba, Jesús; Miguel, Antonio; Lleida, Eduardo. Unsupervised adaptation of PLDA models for broadcast diarization. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING. 2019. DOI: 10.1186/s13636-019-0167-7
- Viñals, Ignacio; Ortega, Alfonso; Miguel, Antonio; Lleida, Eduardo. An analysis of the short utterance problem for speaker characterization. APPLIED SCIENCES (SWITZERLAND). 2019. DOI: 10.3390/app9183697
- Mingote, Victoria; Miguel, Antonio; Ortega, Alfonso; Lleida, Eduardo. Supervector extraction for encoding speaker and phrase information with neural networks for text-dependent speaker verification. APPLIED SCIENCES (SWITZERLAND). 2019. DOI: 10.3390/app9163295
- Lleida, E.; Rodriguez-Fuentes, L.J. Speaker and language recognition and characterization: Introduction to the CSL special issue. COMPUTER SPEECH AND LANGUAGE. 2018. DOI: 10.1016/j.csl.2017.12.001
- Viñals, I.; Gimeno, P.; Ortega, A.; Miguel, A.; Lleida, E. Estimation of the number of speakers with variational Bayesian PLDA in the dihard diarization challenge. INTERSPEECH (USB). 2018. DOI: 10.21437/Interspeech.2018-1841
- Cabello, L.; Lleida, E.; Simon, J.; Miguel, A.; Ortega, A. Text-to-Pictogram Summarization for Augmentative and Alternative Communication. PROCESAMIENTO DEL LENGUAJE NATURAL. 2018. DOI: 10.26342/2018-61-1
- Villalba, J.; Ortega, A.; Miguel, A.; Lleida, E. Analysis of speech quality measures for the task of estimating the reliability of speaker verification decisions. SPEECH COMMUNICATION. 2016. DOI: 10.1016/j.specom.2016.01.005
- Villalba López, Jesús; Ortega Giménez, Alfonso; Miguel Artiaga, Antonio; Lleida Solano, Eduardo. Bayesian Networks to Model the Variability of Speaker Verification Scores in Adverse Environments. IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING. 2016. DOI: 10.1109/TASLP.2016.2607343
- Martínez, D.; Lleida, E.; Green, P.; Christensen, H.; Ortega, A.; Miguel, A. Intelligibility assessment and speech recognizer word accuracy rate prediction for dysarthric speakers in a factor analysis subspace. ACM TRANSACTIONS ON ACCESSIBLE COMPUTING. 2015. DOI: 10.1145/2746405
- García, P.; Lleida, E.; Castan, D.; Marcos, J. M.; Romero, D. Context-aware communicator for all. LECTURE NOTES IN COMPUTER SCIENCE. 2015. DOI: 10.1007/978-3-319-20678-3_41
- Castán, D.; Tavarez, D.; Lopez-Otero, P.; Franco-Pedroso, J.; Delgado, H.; Navas, E.; Docio-Fernández, L.; Ramos, D.; Serrano, J.; Ortega, A.; Lleida, E. Albayzín-2014 evaluation: audio segmentation and classification in broadcast news domains. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING. 2015. DOI: 10.1186/s13636-015-0076-3
- Garcia, José Enrique; Ortega Giménez, Alfonso; Miguel Artiaga, Antonio; Lleida Solano, Eduardo. Low bit rate compression methods of feature vectors for distributed speech recognition. SPEECH COMMUNICATION. 2014. DOI: 10.1016/j.specom.2013.11.007
- Castán, D.; Ortega, A.; Miguel, A.; Lleida, E. Audio segmentation-by-classification approach based on factor analysis in broadcast news domain. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING. 2014. DOI: 10.1186/s13636-014-0034-5
- Martínez González, David; Burget, Lukas; Stafylakis, Themos; Lei, Yun; Kenny, Patrick; Lleida, Eduardo. Unscented Transform for iVector-Based Noisy Speaker Recognition. PROCEEDINGS OF THE IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING. 2014. DOI: 10.1109/ICASSP.2014.6854361
- Martínez González, David; Ribas, Dayana; Lleida, Eduardo; Ortega, Alfonso; Miguel, Antonio. Suprasegmental information modelling for autism disorder spectrum and specific language impairment classification. PROCEEDINGS OF THE ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, INTERSPEECH. 2013
- Vaquero, C.; Ortega, A.; Miguel, A.; Lleida, E. Quality assessment for speaker diarization and its application in speaker characterization. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING. 2013. DOI: 10.1109/TASL.2012.2236317
- Justo, R.; Saz, O.; Miguel, A.; Torres, M. I.; Lleida, E. Improving language models in speech-based human-machine interaction. INTERNATIONAL JOURNAL OF ADVANCED ROBOTIC SYSTEMS. 2013. DOI: 10.5772/55407
Book chapters
- Bottleneck Based Front-End for Diarization Systems. Viñals Bailo, Ignacio; Villalba Lopez, Jesús; Ortega Giménez, Alfonso; Miguel Artiaga, Antonio; Lleida Solano, Eduardo. ADVANCES IN SPEECH AND LANGUAGE TECHNOLOGIES FOR IBERIAN LANGUAGES: IBERSPEECH 2016. 2016
- A preliminary study of Acoustic Events Classification with Factor Analysis in Meeting Rooms. Ortega Giménez, Alfonso; Castán, Diego; Miguel, Antonio; Lleida, Eduardo. ADVANCES IN SPEECH AND LANGUAGE TECHNOLOGIES FOR IBERIAN LANGUAGES. 2014
- Confidence Measures in Automatic Speech Recognition for Error Detection in Restricted Domains. Ortega Giménez, Alfonso; Olcoz, Julia; Miguel, Antonio; Lleida, Eduardo. ADVANCES IN SPEECH AND LANGUAGE TECHNOLOGIES FOR IBERIAN LANGUAGES. IBERSPEECH 2014. 2014
- Unsupervised Accent Modeling for Language Identification. Martínez González, David; Villalba, Jesús; Lleida, Eduardo; Ortega, Alfonso. ADVANCES IN SPEECH AND LANGUAGE TECHNOLOGIES FOR IBERIAN LANGUAGES. 2014
- Evaluation of a New Beam-Search Formant Tracking Algorithm in Noisy Environments. Ribas Gonzalez, Dayana; García Laínez, Enrique; Ortega Giménez, Alfonso; Miguel Artiaga, Antonio; Lleida Solano, Eduardo; Calvo de Lara, José Ramón. ADVANCES IN SPEECH AND LANGUAGE TECHNOLOGIES FOR IBERIAN LANGUAGES. 2012
- Factor Analysis Segmentation and Classification in Broadcast News Domain. Castán Lavilla, Diego; Ortega Giménez, Alfonso; Lleida Solano, Eduardo. ADVANCES IN SPEECH AND LANGUAGE TECHNOLOGIES FOR IBERIAN LANGUAGES. 2012
- Reliability Estimation of the Speaker Verification Decisions Using Bayesian Networks to Combine Information from Multiple Speech Quality Measures. Villalba Lopez, Jesús; Lleida Solano, Eduardo; Ortega Giménez, Alfonso; Miguel Artiaga, Antonio. ADVANCES IN SPEECH AND LANGUAGE TECHNOLOGIES FOR IBERIAN LANGUAGES. 2012
- Score Level versus Audio Level Fusion for Voice Pathology Detection on the Saarbrücken Voice Database. Martínez González, David; Lleida, Eduardo; Ortega, Alfonso; Miguel, Antonio. ADVANCES IN SPEECH AND LANGUAGE TECHNOLOGIES FOR IBERIAN LANGUAGES. 2012
- Voice Pathology Detection on the Saarbrücken Voice Database with Calibration and Fusion of Scores Using MultiFocal Toolkit. Martínez González, David; Lleida, Eduardo; Ortega, Alfonso; Miguel, Antonio; Villalba, Jesús. ADVANCES IN SPEECH AND LANGUAGE TECHNOLOGIES FOR IBERIAN LANGUAGES. 2012
- ViVoLab UZ Language Recognition System for Albayzin 2010 LRE. Martínez González, David; Villalba, Jesús; Miguel, Antonio; Ortega, Alfonso; Lleida, Eduardo. PROCEEDINGS OF VI JORNADAS DE TECNOLOGÍA DEL HABLA AND II IBERIAN SLTECH WORKSHOP. 2010
- Cross-Probability Model Based on Gmm for Feature Vector Normalization. Buera Rodriguez, Luis; Miguel Artiaga, Antonio; Saz Torralba, Oscar; Lleida Solano, Eduardo; Ortega Giménez, Alfonso. IN-VEHICLE CORPUS AND SIGNAL PROCESSING FOR DRIVER BEHAVIOR.
Projects
- TIN2014-54288-C4-2-R: PROCESADO DE AUDIO, HABLA Y LENGUAJE PARA ANÁLISIS DE INFORMACIÓN MULTIMEDIA-UZ. 01/01/15 - 30/09/18
- IRIS / Towards Natural Interaction and Communication (G.A.no. 610986). 01/01/14 - 31/12/17
Contracts
- SISTEMA DE RECONOCIMIENTO DE VOZ DE LOS SUBTÍTULOS EMITIDOS EN LOS PROGRAMAS DEL TIEMPO PARA LA UNIDAD DE TELETEXTO DE LA CORPORACIÓN RTVE EN TORRESPAÑA, MADRID. 27/02/13 - 29/04/14
- SISTEMA DE SUPERVISIÓN DE SUBTÍTULOS EMITIDOS MEDIANTE RECONOCIMIENTO DE VOZ PARA LA UNIDAD DE TELETEXTO DE LA SME TVE EN TORRESPAÑA. 15/08/12 - 16/08/13
- SISTEMA AUTOMÁTICO DE SUBTITULADO DIFERIDO ASISTIDO DE GUIONES POR RECONOCIMIENTO DE VOZ. 01/01/12 - 31/12/12
Thesis supervision
- Subspace Gaussian Mixture Models for Language Identification and Dysarthric Speech Intelligibility Assessment. Universidad de Zaragoza. Sobresaliente cum laude. 22/09/15
- Advances on Speaker Recognition in non Collaborative Environments. Universidad de Zaragoza. Sobresaliente "Cum Laude". 27/11/14
- Discriminative methods for model optimization in speaker verification. Universidad de Zaragoza. Sobresaliente "Cum Laude". 29/05/14
- Aplicación de las tecnologías del habla en la educación de la voz infantil alterada. Universidad de Zaragoza. Sobresaliente cum laude. 17/12/10
- Personalización y adaptación on-line y variaciones de la voz en sistemas de reconocimiento del habla. Universidad de Zaragoza. Sobresaliente "Cum Laude". 09/12/09
- Acoustic Modeling Advances for Speech Recognition. Universidad de Zaragoza. Sobresaliente "Cum Laude". 12/12/08
- Normalización y adaptación a entornos acústicos para la robustez en sistemas de reconocimiento automático del habla. Universidad de Zaragoza. Sobresaliente "Cum Laude". 03/12/07
- Improvements in speech recognition for embedded devices by taking advantage of lip reading techniques. Universidad de Zaragoza. Sobresaliente "Cum Laude". 26/09/06
- Sistema de refuerzo de luz para el interior de un vehículo a motor. Universidad de Zaragoza. Sobresaliente "Cum Laude". 20/12/05
- Segregación de fuentes sonoras para reconocimiento robusto del habla. Universidad de Zaragoza. Sobresaliente "Cum Laude". 23/11/00
Conference participation
- 18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017. Participant - Oral presentation. Tied Hidden Factors in Neural Networks for End-to-End Speaker Recognition. Stockholm. 29/08/17
- 18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017. Participant - Oral presentation. Domain Adaptation of PLDA models in Broadcast Diarization by means of Unsupervised Speaker Clustering. Stockholm. 29/08/17
- IEEE Automatic Speech Recognition and Understanding (ASRU 2015). Participant - Oral presentation. Variational Bayesian PLDA for Speaker Diarization in the MGB Challenge. Arizona. 12/12/15
- 16th Annual Conference of the International Speech Communication Association, INTERSPEECH 2015. Participant - Oral presentation. Spoofing Detection with DNN and One-class SVM for the ASVspoof 2015 Challenge. Dresden. 09/09/15
- 15th Annual Conference of the International Speech Communication Association, INTERSPEECH 2014. Participant - Oral presentation. Factor Analysis with Sampling Methods for Text Dependent Speaker Recognition. Singapore. 02/09/14
- 14th Annual Conference of the International Speech Communication Association, INTERSPEECH 2013. Participant - Oral presentation. A New Bayesian Network to Assess the Reliability of Speaker Verification Decisions. Lyon. 28/08/13
- 14th Annual Conference of the International Speech Communication Association, INTERSPEECH 2013. Participant - Oral presentation. The I3A Speaker Recognition System for NIST SRE12: Post-evaluation Analysis. Lyon. 28/08/13
- 14th Annual Conference of the International Speech Communication Association, INTERSPEECH 2013. Participant - Oral presentation. Suprasegmental Information Modelling for Autism Disorder Spectrum and Specific Language Impairment Classification. Lyon. 28/08/13
- SLAM 2013 Speech, Language and Audio in Multimedia. Participant - Oral presentation. Broadcast News Segmentation with Factor Analysis System. Marseille. 25/08/13
- 24th EAEEIE Annual Conference (EAEEIE), 2013. Participant - Oral presentation. Collaborative learning in international teams on Technologies to Reduce the Access Barrier in Human Computer Interaction (TrabHCI) Erasmus Intensive Programme. Chania. 25/05/13
- IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2013). Participant - Oral presentation. Prosodic features and formant modeling for an ivector-based language recognition system. Vancouver. 12/05/13
- IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2013). Participant - Poster. Segmentation-by-classification system based on factor analysis. Vancouver. 12/05/13
Teaching at UNIZAR over the last six academic years
- Tecnologías del habla y del lenguaje. Máster Universitario en Ingeniería de Telecomunicación. Academic year 2024-25
- Ingeniería acústica. Graduado en Ingeniería de Tecnologías y Servicios de Telecomunicación. Academic years 2021-22 to 2024-25
- Procesado de audio e imagen. Graduado en Ingeniería de Tecnologías y Servicios de Telecomunicación. Academic years 2020-21 to 2024-25
- Prácticas externas 6. Graduado en Ingeniería de Tecnologías y Servicios de Telecomunicación. Academic year 2023-24
- Trabajo fin de grado (sistemas de telecomunicación). Graduado en Ingeniería de Tecnologías y Servicios de Telecomunicación. Academic year 2023-24
- Trabajo fin de máster. Máster Universitario en Ingeniería de Telecomunicación. Academic year 2023-24
- Tecnologías del habla. Máster Universitario en Ingeniería de Telecomunicación. Academic years 2021-22 to 2023-24
- Circuitos y sistemas. Graduado en Ingeniería de Tecnologías y Servicios de Telecomunicación. Academic years 2019-20 to 2021-22
- Comunicaciones audiovisuales. Graduado en Ingeniería de Tecnologías y Servicios de Telecomunicación. Academic years 2019-20 to 2020-21
- Tecnologías del habla. Máster Universitario en Ingeniería de Telecomunicación. Academic years 2019-20 to 2020-21