Articles
- Mingote, Victoria; Gimeno, Pablo; Vicente, Luis; Khurana, Sameer; Laurent, Antoine; Duret, Jarod. Direct text to speech translation system using acoustic units. IEEE SIGNAL PROCESSING LETTERS. 2023. DOI: 10.1109/LSP.2023.3313513
- Mingote, Victoria; Viñals, Ignacio; Gimeno, Pablo; Miguel, Antonio; Ortega, Alfonso; Lleida, Eduardo. Multimodal Diarization Systems by Training Enrollment Models as Identity Representations. APPLIED SCIENCES (SWITZERLAND). 2022. DOI: 10.3390/app12031141
- Gimeno, P.; Ribas, D.; Ortega, A.; Miguel, A.; Lleida, E. Unsupervised adaptation of deep speech activity detection models to unseen domains. APPLIED SCIENCES (SWITZERLAND). 2022. DOI: 10.3390/app12041832
- Gimeno, P; Mingote, V; Ortega, A; Miguel, A; Lleida, E. Generalizing AUC Optimization to Multiclass Classification for Audio Segmentation With Limited Training Data. IEEE SIGNAL PROCESSING LETTERS. 2021. DOI: 10.1109/LSP.2021.3084501
- Mingote, Victoria; Viñals, Ignacio; Gimeno, Pablo; Miguel, Antonio; Ortega, Alfonso; Lleida, Eduardo. ViVoLAB Multimodal Diarization System for RTVE 2020 Challenge. IBERSPEECH 2021. 2021. DOI: 10.21437/IberSPEECH.2021-16
- Gimeno, P.; Mingote, V.; Ortega, A.; Miguel, A.; Lleida, E. Partial AUC optimisation using recurrent neural networks for music detection with limited training data. INTERSPEECH (USB). 2020. DOI: 10.21437/Interspeech.2020-1108
- Gimeno, Pablo; Viñals, Ignacio; Ortega, Alfonso; Miguel, Antonio; Lleida, Eduardo. Multiclass audio segmentation based on recurrent neural networks for broadcast domain data. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING. 2020. DOI: 10.1186/s13636-020-00172-6
- Viñals, I.; Gimeno, P.; Ortega, A.; Miguel, A.; Lleida, E. Vivolab speaker diarization system for the Dihard 2019 challenge. INTERSPEECH (USB). 2019. DOI: 10.21437/Interspeech.2019-2462
- Viñals, I.; Ribas, D.; Mingote, V.; Llombart, J.; Gimeno, P.; Miguel, A.; Ortega, A.; Lleida, E. Phonetically-aware embeddings, wide residual networks with time-delay neural networks and self attention models for the 2018 NIST speaker recognition evaluation. INTERSPEECH (USB). 2019. DOI: 10.21437/Interspeech.2019-2417
Projects
- T36_23R: VIVOLAB. 01/01/23 - 31/12/25
- ESPERANTO / Exchanges for SPEech ReseArch aNd TechnOlogies (G.A. No. 101007666). 01/01/21 - 31/12/25
- T36_20R: Vivolab. 01/01/20 - 31/12/22
Contracts
- INDEXADO AUTOMÁTICO Y LA RECUPERACIÓN DE DOCUMENTOS AUDIOVISUALES. 27/03/23 - 31/12/25
- LABORATORIO DE TECNOLOGÍAS DEL HABLA. 01/11/15 - 31/10/25
Conference participation
- 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW). Participant - Poster. Cross-Lingual Transfer Learning for Low-Resource Speech Translation. Seoul. 14/04/24
- Iberspeech 2016. Participant - Oral presentation. Automatic Text-to-Audio Alignment of Multimedia Broadcast Content. Lisbon. 20/11/16
Research stays
- ELSA Corp. Lisbon. Portugal. 06/01/19 - 14/04/19
Teaching at UNIZAR over the last six academic years