{"id":76,"date":"2020-07-24T18:50:01","date_gmt":"2020-07-24T15:50:01","guid":{"rendered":"https:\/\/sites.uef.fi\/speech\/?page_id=76"},"modified":"2026-02-10T15:22:54","modified_gmt":"2026-02-10T13:22:54","slug":"publications","status":"publish","type":"page","link":"https:\/\/sites.uef.fi\/speech\/publications\/","title":{"rendered":"Publications"},"content":{"rendered":"\n<h3 class=\"wp-block-heading\">2026(updated 10.02.2026)<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Xi Xuan, Davide Carbone, Ruchi Pandey, Wenxin Zhang, Tomi H Kinnunen, &#8220;<a href=\"https:\/\/arxiv.org\/abs\/2602.02980\">WST-X Series: Wavelet Scattering Transform for Interpretable Speech Deepfake Detection<\/a>&#8220;, <i>IEEE Signal Processing Letters<\/i>.<\/li>\n\n\n\n<li>Xi Xuan, Xuechen Liu, Wenxin Zhang, Yi-Cheng Lin, Xiaojian Lin, Tomi Kinnunen, &#8220;<a href=\"https:\/\/arxiv.org\/abs\/2510.05305\">WaveSP-Net: Learnable Wavelet-Domain Sparse Prompt Tuning for Speech Deepfake Detection<\/a>&#8220;, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Proceedings, 2026.<\/li>\n\n\n\n<li>Oguzhan Kurnaz, Jagabandhu Mishra, Tomi Kinnunen, Cemal Hanilci, &#8220;<a href=\"https:\/\/arxiv.org\/abs\/2602.01722\">Joint Optimization of ASV and CM tasks: BTUEF Team&#8217;s Submission for WildSpoof Challenge<\/a>&#8220;<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">2025<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Manasi Chhibber,&nbsp;Jagabandhu Mishra,&nbsp;Hyejin Shim,&nbsp;Tomi H. 
Kinnunen, &#8220;<a href=\"https:\/\/ieeexplore.ieee.org\/document\/10889868\">An Explainable Probabilistic Attribute Embedding Approach for Spoofed Speech Characterization<\/a>&#8221; , IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Proceedings, 2025.<\/li>\n\n\n\n<li>Anton Firc,&nbsp;Manasi Chibber,&nbsp;Jagabandhu Mishra,&nbsp;Vishwanath Pratap Singh,&nbsp;Tomi Kinnunen,&nbsp;Kamil Malinka, &#8220;<a href=\"https:\/\/www.isca-archive.org\/interspeech_2025\/firc25_interspeech.html\">STOPA: A Database of Systematic VariaTion Of DeePfake Audio for Open-Set Source Tracing and Attribution<\/a>&#8220;, Interspeech, 2025.<\/li>\n\n\n\n<li>Satu Hopponen,&nbsp;Tomi Kinnunen,&nbsp;Alexandre Nikolaev,&nbsp;Rosa Gonz\u00e1lez Hautam\u00e4ki,&nbsp;Lauri Tavi,&nbsp;Einar Meister, &#8220;<a href=\"https:\/\/www.isca-archive.org\/interspeech_2025\/hopponen25_interspeech.html\">FROST-EMA: Finnish and Russian Oral Speech Dataset of Electromagnetic Articulography Measurements with L1, L2 and Imitated L2 Accents<\/a>&#8220;, Interspeech, 2025.<\/li>\n\n\n\n<li>Parismita Gogoi,&nbsp;Vishwanath Pratap Singh,&nbsp;Seema Khadirnaikar,&nbsp;Soma Siddhartha,&nbsp;Sishir Kalita,&nbsp;Jagabandhu Mishra,&nbsp;Md Sahidullah,&nbsp;Priyankoo Sarmah,&nbsp;S. R. M. Prasanna, &#8220;<a href=\"https:\/\/www.isca-archive.org\/interspeech_2025\/gogoi25_interspeech.html\">Leveraging AM and FM Rhythm Spectrograms for Dementia Classification and Assessment<\/a>&#8220;, Interspeech, 2025.<\/li>\n\n\n\n<li>Vishwanath Pratap Singh,&nbsp;Md. 
Sahidullah,&nbsp;Tomi Kinnunen, &#8220;<a href=\"https:\/\/www.isca-archive.org\/interspeech_2025\/pratapsingh25_interspeech.html\">Causal Structure Discovery for Error Diagnostics of Children&#8217;s ASR<\/a>&#8220;, Interspeech, 2025.<\/li>\n\n\n\n<li>Edem Ahadzi,&nbsp;Vishwanath Pratap Singh,&nbsp;Tomi Kinnunen,&nbsp;Ville Hautamaki, &#8220;<a href=\"https:\/\/www.isca-archive.org\/interspeech_2025\/ahadzi25_interspeech.html\">Continuous Learning for Children&#8217;s ASR: Overcoming Catastrophic Forgetting with Elastic Weight Consolidation and Synaptic Intelligence<\/a>&#8220;, Interspeech, 2025.<\/li>\n\n\n\n<li>Xi Xuan,&nbsp;Yang Xiao,&nbsp;Rohan Kumar Das,&nbsp;Tomi Kinnunen, &#8220;<a href=\"https:\/\/arxiv.org\/abs\/2508.04143\">Multilingual Source Tracing of Speech Deepfakes: A First Benchmark<\/a>&#8220;, Interspeech SPSC 2025 &#8211; 5th Symposium on Security and Privacy in Speech Communication<\/li>\n\n\n\n<li>Manasi Chhibber, Jagabandhu Mishra, Tomi H. Kinnunen, &#8220;<a href=\"https:\/\/arxiv.org\/abs\/2509.24674\"><a href=\"https:\/\/arxiv.org\/abs\/2509.24674\">Advancing Zero-Shot Open-Set Speech Deepfake Source Tracing<\/a><\/a>&#8220;<\/li>\n\n\n\n<li>O\u011fuzhan Kurnaz, Jagabandhu Mishra, Tomi H Kinnunen, Cemal Hanil\u00e7i, &#8220;<a href=\"https:\/\/arxiv.org\/abs\/2510.01818\">Joint Optimization of Speaker and Spoof Detectors for Spoofing-Robust Automatic Speaker Verification<\/a>&#8220;<br><\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">2024<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>J. Mishra and S. R. M. Prasanna, &#8220;<a href=\"https:\/\/ieeexplore.ieee.org\/abstract\/document\/10596692\">Implicit Self-Supervised Language Representation for Spoken Language Diarization<\/a>&#8220;, IEEE\/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2024.<\/li>\n\n\n\n<li>Jagabandhu Mishra, S.R. 
Mahadeva Prasanna, &#8220;<a href=\"https:\/\/www.sciencedirect.com\/science\/article\/pii\/S1051200424003038\">Generative attention based framework for implicit language change detection<\/a>&#8220;, Digital Signal Processing, 2024.<\/li>\n\n\n\n<li>Vishwanath Pratap Singh, Federico Malato, Ville Hautamaki, Md Sahidullah, Tomi Kinnunen, &#8220;<a href=\"https:\/\/arxiv.org\/abs\/2406.09999\">ROAR: Reinforcing Original to Augmented Data Ratio Dynamics for Wav2Vec2. 0 Based ASR<\/a>&#8220;, Interspeech, 2024.<\/li>\n\n\n\n<li>Mishra, J., Prasanna, S.R.M., &#8220;<a href=\"https:\/\/link.springer.com\/article\/10.1007\/s00034-024-02743-w\">Spoken Language Change Detection Inspired by Speaker Change Detection<\/a>&#8220;,&nbsp;Circuits Syst Signal Process,&nbsp;2024.<\/li>\n\n\n\n<li>Vishwanath Pratap Singh, Md Sahidullah, Tomi Kinnunen, &#8220;<a href=\"https:\/\/arxiv.org\/abs\/2402.15214\">ChildAugment: Data Augmentation Methods for Zero-Resource Children&#8217;s Speaker Verification<\/a>&#8220;, The Journal of the Acoustical Society of America (JASA) (accepted), 2024.<\/li>\n\n\n\n<li>  Xuechen Liu, Md Sahidullah, Kong Aik Lee, Tomi Kinnunen, &#8220;<a href=\"https:\/\/ieeexplore.ieee.org\/abstract\/document\/10415203\/\">Generalizing Speaker Verification for Spoof Awareness in the Embedding Space<\/a>&#8220;, IEEE\/ACM Transactions on Audio, Speech, and Language Processing (TASLP), <em>vol. 32, pp. 
1261&nbsp;&#8211; 1273<\/em>, 2024.<\/li>\n\n\n\n<li>Yi Ma, Kong Aik Lee,Ville Hautam\u00e4ki, Meng Ge, Haizhou Li, &#8220;<a href=\"https:\/\/doi.org\/10.1109\/ICASSP48485.2024.10446174\">Gradient Weighting for Speaker Verification in Extremely Low Signal-to-Noise Ratio<\/a>&#8220;, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Proceedings, 2024 .<\/li>\n\n\n\n<li>Xin Wang, Tomi Kinnunen, Kong Aik Lee, Paul-Gauthier No\u00e9, Junichi Yamagishi, &#8220;<a href=\"https:\/\/doi.org\/10.21437\/Interspeech.2024-422\">Revisiting and Improving Scoring Fusion for Spoofing-aware Speaker Verification Using Compositional Data Analysis<\/a>&#8220;, Interspeech, 2024.<\/li>\n\n\n\n<li>Ivan Kukanov, Janne Laakkonen, Tomi Kinnunen, Ville Hautam\u00e4ki, &#8220;<a href=\"https:\/\/doi.org\/10.1109\/SLT61566.2024.10832350\">Meta-Learning Approaches for Improving Detection of Unseen Speech Deepfakes<\/a>&#8220;, IEEE Spoken Language Technology Workshop, 2024.<\/li>\n\n\n\n<li>Hye-jin Shim, Md Sahidullah, Jee-weon Jung, Shinji Watanabe, Tomi Kinnunen, &#8220;<a href=\"https:\/\/doi.org\/10.21437\/SynData4GenAI.2024-12\">Beyond Silence: Bias Analysis through Loss and Asymmetric Approach in Audio Anti-Spoofing<\/a>&#8220;, Synthetic Data\u2019s Transformative Role in Foundational Speech Models, 2024.<\/li>\n\n\n\n<li>Hye-jin Shim, Jee-weon Jung, Tomi Kinnunen, Nicholas Evans, Jean-Fran\u00e7ois Bonastre, Itshak Lapidot, &#8220;<a href=\"https:\/\/doi.org\/10.21437\/odyssey.2024-23\">a-DCF: an architecture agnostic metric with application to spoofing-robust speaker verification<\/a>&#8220;, The Speaker and Language Recognition Workshop (Odyssey 2024).<\/li>\n\n\n\n<li>Tomi Kinnunen, Rosa Gonzalez Hautam\u00e4ki, Xin Wang, Junichi Yamagishi, &#8220;<a href=\"https:\/\/doi.org\/10.21437\/Interspeech.2024-1704\">Speaker Detection by the Individual Listener and the Crowd: Parametric Models Applicable to Bonafide and Deepfake Speech<\/a>&#8220;, 
Interspeech, 2024.<\/li>\n\n\n\n<li>Tomi Kinnunen, Lee H, Aik Kong, Hemlata Tak, Nicholas Evans, Andreas Nautsch, &#8220;<a href=\"https:\/\/doi.org\/10.1109\/TPAMI.2023.3313648\">t-EER: Parameter-Free Tandem Evaluation of Countermeasures and Biometric Comparators<\/a>&#8220;, IEEE transactions on pattern analysis and machine intelligence, 2024.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">2023<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>T. H. Kinnunen, K. A. Lee, H. Tak, N. Evans, A. Nautsch, &#8220;<a href=\"https:\/\/ieeexplore.ieee.org\/document\/10246406\">t-EER: Parameter-Free Tandem Evaluation of Countermeasures and Biometric Comparators<\/a>&#8220;, <em>IEEE Transactions on Pattern Analysis and Machine Intelligence<\/em>, doi: 10.1109\/TPAMI.2023.3313648.<\/li>\n\n\n\n<li>Xuechen Liu, Xin Wang, Md Sahidullah, Jose Patino, H\u00e9ctor Delgado, Tomi Kinnunen, Massimiliano Todisco, Junichi Yamagishi, Nicholas Evans, Andreas Nautsch, Kong Aik Lee, &#8220;<a class=\"c-link\" href=\"https:\/\/arxiv.org\/abs\/2210.02437\" target=\"_blank\" rel=\"noopener noreferrer\" data-stringify-link=\"https:\/\/arxiv.org\/abs\/2210.02437\" data-sk=\"tooltip_parent\">ASVspoof 2021: Towards Spoofed and Deepfake Speech Detection in the Wild<\/a>&#8220;,&nbsp;<i data-stringify-type=\"italic\">IEEE\/ACM Transactions on Audio, Speech and Language Processing, &nbsp;vol. 31, pp. 2507-2522, 2023<\/i>.<\/li>\n\n\n\n<li>Xuechen Liu, Md Sahidullah, Kong Aik Lee, Tomi Kinnunen, &#8220;<a class=\"c-link\" data-stringify-link=\"https:\/\/arxiv.org\/abs\/2303.01126\" data-sk=\"tooltip_parent\" href=\"https:\/\/arxiv.org\/abs\/2303.01126\" target=\"_blank\" rel=\"noopener noreferrer\">Speaker-Aware Anti-Spoofing<\/a>&#8220;, Proc. 
Interspeech, 2498-2502, Dublin, Ireland, 2023.<\/li>\n\n\n\n<li>R Tao, KA Lee, RK Das, V Hautam\u00e4ki, H Li, &#8220;<a href=\"https:\/\/scholar.google.com\/citations?view_op=view_citation&amp;hl=en&amp;user=esQWyTcAAAAJ&amp;sortby=pubdate&amp;citation_for_view=esQWyTcAAAAJ:D_sINldO8mEC\">Self-Supervised Training of Speaker Encoder With Multi-Modal Diverse Positive Pairs<\/a>&#8220;, IEEE\/ACM Transactions on Audio, Speech, and Language Processing (TASLP), <em>vol. 31. pp. 1706&nbsp;&#8211; 1719<\/em>, 2023.<\/li>\n\n\n\n<li>Sung Hwan Mun, Hye-jin Shim, Hemlata Tak, Xin Wang, Xuechen Liu, Md Sahidullah, Myeonghun Jeong, Min Hyun Han, Massimiliano Todisco, Kong Aik Lee, Junichi Yamagishi, Nicholas Evans, Tomi Kinnunen, Nam Soo Kim, Jee-weon Jung, &#8220;<a class=\"c-link\" href=\"https:\/\/arxiv.org\/abs\/2305.19051\" target=\"_blank\" rel=\"noopener noreferrer\" data-stringify-link=\"https:\/\/arxiv.org\/abs\/2305.19051\" data-sk=\"tooltip_parent\">Towards single integrated spoofing-aware speaker verification embeddings<\/a>&#8220;, Proc. Interspeech, 3989-3993, Dublin, Ireland, 2023.<\/li>\n\n\n\n<li>Hye-jin Shim, Rosa Gonz\u00e1lez Hautam\u00e4ki, Md Sahidullah, Tomi Kinnunen, &#8220;<a class=\"c-link\" href=\"https:\/\/arxiv.org\/abs\/2306.00044\" target=\"_blank\" rel=\"noopener noreferrer\" data-stringify-link=\"https:\/\/arxiv.org\/abs\/2306.00044\" data-sk=\"tooltip_parent\">How to Construct Perfect and Worse-than-Coin-Flip Spoofing Countermeasures: A Word of Warning on Shortcut Learning<\/a>&#8220;, Proc. Interspeech, 785-789, Dublin, Ireland, 2023.<\/li>\n\n\n\n<li>Hye-jin Shim, Jee-weon Jung, Tomi Kinnunen, &#8220;<a class=\"c-link\" href=\"https:\/\/arxiv.org\/abs\/2305.19953\" target=\"_blank\" rel=\"noopener noreferrer\" data-stringify-link=\"https:\/\/arxiv.org\/abs\/2305.19953\" data-sk=\"tooltip_parent\">Multi-Dataset Co-Training with Sharpness-Aware Optimization for Audio Anti-spoofing<\/a>&#8220;, Proc. 
Interspeech, 3804-3808, Dublin, Ireland, 2023.<\/li>\n\n\n\n<li>Vishwanath Pratap Singh, Md Sahidullah, Tomi Kinnunen, &#8220;<a class=\"c-link\" href=\"https:\/\/arxiv.org\/abs\/2306.07501\" target=\"_blank\" rel=\"noopener noreferrer\" data-stringify-link=\"https:\/\/arxiv.org\/abs\/2306.07501\" data-sk=\"tooltip_parent\">Speaker Verification Across Ages: Investigating Deep Speaker Embedding Sensitivity to Age Mismatch in Enrollment and Test Speech<\/a>&#8220;, Proc. Interspeech, 1948-1952, Dublin, Ireland, 2023.<\/li>\n\n\n\n<li>M. Anderson, T. Kinnunen, N. Harte, &#8220;Learnable frontends that do not learn: Quantifying sensitivity to filterbank initialisation&#8221;, IEEE ICASSP, Rhodes island, Greece, 2023.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">2022<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>A. Kanervisto, V. Hautam\u00e4ki, T. Kinnunen, J. Yamagishi, &#8220;<a href=\"https:\/\/ieeexplore.ieee.org\/document\/9664367\">Optimizing Tandem Speaker Verification and Anti-Spoofing Systems<\/a>,&#8221; <em>IEEE\/ACM Transactions on Audio, Speech, and Language Processing<\/em>, vol. 30, pp. 477-488, 2022.<\/li>\n\n\n\n<li>J.-w. Jung, H. Tak, H.-j. Shim, H.-S. Heo, B.-J. Lee, S.-W. Chung, H.-J. Yu, N. Evans, T. Kinnunen, &#8220;<a href=\"https:\/\/www.isca-speech.org\/archive\/pdfs\/interspeech_2022\/jung22c_interspeech.pdf\">SASV 2022: The First Spoofing-Aware Speaker Verification Challenge<\/a>&#8220;, <i>Proc. Interspeech<\/i>, 2893-2897, Incheon, Korea, 2022.<\/li>\n\n\n\n<li>R. Tao, K.A. Lee, R.K. Das, V. Hautam\u00e4ki, H.&nbsp; Li, &#8220;<a href=\"https:\/\/arxiv.org\/abs\/2110.03869?context=eess.SP\">Self-supervised Speaker Recognition with Loss-gated Learning<\/a>&#8220;, <em>IEEE ICASSP<\/em>, Singapore, 2022.<\/li>\n\n\n\n<li>X. Liu, M. Sahidullah, T. 
Kinnunen, &#8220;<a href=\"https:\/\/arxiv.org\/abs\/2202.05236\">Learnable Nonlinear Compression for Robust Speaker Verification<\/a>&#8220;, <em>IEEE ICASSP<\/em>, Singapore, 2022.<\/li>\n\n\n\n<li>X. Liu, M. Sahidullah, T. Kinnunen, &#8220;<a href=\"https:\/\/arxiv.org\/abs\/2203.10992\">Spoofing-aware Speaker Verification with Unsupervised Domain Adaptation<\/a>&#8220;, Speaker Odyssey 2022, Beijing, China.<\/li>\n\n\n\n<li>A. Sholokhov, X. Liu, M. Sahidullah, T. Kinnunen, &#8220;<a href=\"https:\/\/arxiv.org\/abs\/2205.00288\">Baselines and Protocols for Household Speaker Recognition<\/a>&#8220;, Speaker Odyssey 2022, Beijing, China.<\/li>\n\n\n\n<li>H.-j. Shim, H. Tak, X. Liu, H.-S. Heo, J.-W. Jung, J.S. Chung, S.-W. Chung, H.-J. Yu, B.-J. Lee, M. Todisco, H. Delgado, K.A. Lee, M. Sahidullah,&nbsp;T. Kinnunen, N. Evans, &#8220;<a href=\"https:\/\/arxiv.org\/abs\/2204.09976\">Baseline Systems for the First Spoofing-Aware Speaker Verification Challenge: Score and Embedding Fusion<\/a>&#8220;,&nbsp;<i>Speaker Odyssey 2022<\/i>, Beijing, China.<\/li>\n\n\n\n<li>S. Ghimire,&nbsp;T. Kinnunen, R. Gonz\u00e1lez Hautam\u00e4ki, &#8220;<a href=\"https:\/\/arxiv.org\/pdf\/2205.04923.pdf\">Gamified Speaker Comparison by Listening<\/a>&#8220;,&nbsp; <i>Speaker Odyssey 2022<\/i>, Beijing, China.<\/li>\n\n\n\n<li>L. Tavi, T. Kinnunen, R. Gonz\u00e1lez Hautam\u00e4ki,&nbsp;&#8220;<a class=\"gsc_oci_title_link\" href=\"https:\/\/www.sciencedirect.com\/science\/article\/pii\/S0167639322000498\" data-clk=\"hl=en&amp;sa=T&amp;ei=dBVEY875LorVmQH8sJ74DQ\">Improving speaker de-identification with functional data analysis of f0 trajectories<\/a>&#8220;, Speech Communication, Volume 140, Pages 1-10, 2022.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">2021<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Y. Ma, K.A. 
Lee, V.&nbsp; Hautam\u00e4ki, H.&nbsp; Li, <a href=\"https:\/\/arxiv.org\/pdf\/2110.00940.pdf\">&#8220;PL-EERSR: Perceptual Loss Based End-to-End Robust Speaker Representation Extraction<\/a>&#8220;, <i>IEEE Automatic Speech Recognition and Understanding workshop<\/i>, 2021<\/li>\n\n\n\n<li>K. Hechmi, T.N. Trong, V. Hautam\u00e4ki, T. Kinnunen, \u201d<a href=\"https:\/\/arxiv.org\/pdf\/2109.13510.pdf\">VoxCeleb Enrichment for Age and Gender Recognition<\/a>\u201d,&nbsp; <i>IEEE Automatic Speech Recognition and Understanding workshop<\/i>, 2021<\/li>\n\n\n\n<li>X. Liu, M. Sahidullah, T. Kinnunen, \u201c<a href=\"https:\/\/arxiv.org\/pdf\/2109.12058.pdf\">Optimized Power Normalized Cepstral Coefficients towards Robust Deep Speaker Verification<\/a>\u201d, to appear in <i>IEEE Automatic Speech Recognition and Understanding workshop, 2021<\/i><\/li>\n\n\n\n<li>X. Liu, M. Sahidullah, T. Kinnunen, \u201c<a href=\"https:\/\/arxiv.org\/pdf\/2109.12056.pdf\">Parameterized Channel Normalization for Far-field Deep Speaker Verification<\/a>\u201d, <i>IEEE Automatic Speech Recognition and Understanding workshop, 2021<\/i><\/li>\n\n\n\n<li>X. Liu, M. Sahidullah, T. Kinnunen, \u201c<a href=\"https:\/\/arxiv.org\/pdf\/2110.10983.pdf\">Optimizing Multi-Taper Features for Deep Speaker Verification<\/a>\u201d, <i>IEEE Signal Processing Letters, <\/i>28: 2187&#8211;2191, October 2021.<\/li>\n\n\n\n<li>L. Tavi, T. Kinnunen, E. Meister, R Gonz\u00e1lez-Hautam\u00e4ki, A. Malmi, \u201d<a href=\"http:\/\/cs.joensuu.fi\/pages\/tkinnu\/webpage\/pdf\/Articulatory_analysis_of_voice_disguise__a_pilot_study.pdf\">Articulation During Voice Disguise: A Pilot Study<\/a>\u201d, <i>Proc. Speech and Computer<\/i> (SPECOM\u201921), Springer LNAI 12997, pp. 680\u2013691, St. Petersburg, Russia, September 2021.<\/li>\n\n\n\n<li>T. Kinnunen, A. Nautsch, M. Sahidullah, N. Evans, X. Wang, M. Todisco, H. Delgado, J. Yamagishi, K.A. 
Lee, \u201c<a href=\"https:\/\/arxiv.org\/pdf\/2106.06362.pdf\">Visualizing Classifier Adjacency Relations: A Case Study in Speaker Verification and Voice Anti-Spoofing<\/a>\u201d, <i>Proc. Interspeech<\/i>, 4299-4303, Brno, Czech Republic, 2021.<\/li>\n\n\n\n<li>A. Kanervisto, C. Scheller, Y. Schraner and V. Hautam\u00e4ki, &#8220;<a href=\"https:\/\/arxiv.org\/abs\/2107.00703\">Distilling Reinforcement Learning Tricks for Video Games<\/a>&#8220;, <em>IEEE Conference on Games, <\/em>Virtual, 2021.<\/li>\n\n\n\n<li>B. Chettri, R. Gonz\u00e1lez Hautam\u00e4ki, M. Sahidullah, T. Kinnunen, \u201c<a href=\"https:\/\/arxiv.org\/pdf\/2103.14602.pdf\">Data Quality as Predictor of Voice Anti-Spoofing Generalization<\/a>\u201d, <i>Proc. Interspeech<\/i>, 1659-1663, Brno, Czech Republic, 2021.<\/li>\n\n\n\n<li>J. Turkia, L. Meht\u00e4talo, U. Schwab, and V. Hautam\u00e4ki, &#8220;<a href=\"https:\/\/www.nature.com\/articles\/s41598-021-91437-3\">Mixed-Effect Bayesian Network Reveals Personal Effects of Nutrition<\/a>&#8220;, <em>Scientific Reports<\/em>, Vol. 11, No. 12016, 2021.<\/li>\n\n\n\n<li>K.A. Lee, V. Vestman, and T. Kinnunen, \u201c<a href=\"https:\/\/doi.org\/10.1016\/j.softx.2021.100697\">ASVtorch Toolkit: Speaker Verification with Deep Neural Networks<\/a>\u201d, SoftwareX, Volume 14, 100697, June 2021.<\/li>\n\n\n\n<li>K. Ishihara, A. Kanervisto, J.&nbsp; Miura and V. Hautam\u00e4ki, &#8220;<a href=\"https:\/\/arxiv.org\/abs\/2104.10753\">Multi-task Learning with Attention for End-to-end Autonomous Driving<\/a>&#8220;, <em>CVPR 2021 Workshop on Autonomous Driving<\/em>, 2021.<\/li>\n\n\n\n<li>X. Liu, M. Sahidullah, T. Kinnunen, \u201c<a href=\"https:\/\/arxiv.org\/pdf\/2102.10322.pdf\">Learnable MFCCs for Speaker Verification<\/a>\u201d, <i>Proc. IEEE Int. Symp. Circuits and Systems<\/i> (ISCAS 2021), Daegu, Korea, May 2021<\/li>\n\n\n\n<li>A. Nautsch, X. Wang, N. Evans, T. Kinnunen, V. Vestman, M. Todisco, H. Delgado, M. Sahidullah, J. Yamagishi, K.A. 
Lee, \u201c<a href=\"https:\/\/arxiv.org\/pdf\/2102.05889.pdf\">ASVspoof 2019: Spoofing Countermeasures for the Detection of Synthesized, Converted and Replayed Speech<\/a>\u201d, <i>IEEE Transactions on Biometrics, Behavior, and Identity Science<\/i>, 3(2): 252&#8211;265, April 2021<\/li>\n\n\n\n<li>M. Sahidullah, A.K. Sarkar, V. Vestman, X. Liu, R. Serizel, T. Kinnunen, Z.-H. Tan, E. Vincent, \u201c<a href=\"https:\/\/arxiv.org\/pdf\/2007.13118.pdf\">UIAI System for Short-Duration Speaker Verification Challenge 2020<\/a>\u201d, <i>Proc. IEEE Spoken Language Technology Workshop<\/i> (SLT 2021), Shenzhen, China, January 2021<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">2020<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>R. K. Das, T. Kinnunen, W.-C. Huang, Z. Ling, J. Yamagishi, Y. Zhao, X. Tian, T. Toda, \u201c<a href=\"https:\/\/arxiv.org\/abs\/2009.03554\">Predictions of Subjective Ratings and Spoofing Assessments of Voice Conversion Challenge 2020 Submissions<\/a>\u201d, <i>Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge<\/i>, pp. 99&#8211;120, 2020.<\/li>\n\n\n\n<li>Y. Zhao, W.-C. Huang, X. Tian, J. Yamagishi, R.K. Das, T. Kinnunen, Z. Ling, T. Toda, \u201c<a href=\"https:\/\/arxiv.org\/abs\/2008.12527\">Voice Conversion Challenge 2020: Intra-lingual semi-parallel and cross-lingual voice conversion<\/a>\u201d, <i>Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge<\/i>, pp. 80&#8211;98, 2020.<\/li>\n\n\n\n<li>A. Sholokhov, T. Kinnunen, V. Vestman, K.A. Lee, \u201c<a href=\"https:\/\/arxiv.org\/pdf\/2008.03590.pdf\">Extrapolating False Alarm Rates in Automatic Speaker Verification<\/a>\u201d, <i>Proc. Interspeech 2020<\/i>, pp. 4218&#8211;4222, Shanghai, China, October 2020<\/li>\n\n\n\n<li>R. K. Das, X. Tian, T. Kinnunen, H. Li, \u201c<a href=\"https:\/\/arxiv.org\/pdf\/2004.08849.pdf\">The Attacker&#8217;s Perspective on Automatic Speaker Verification: An Overview<\/a>\u201d, <i>Proc. 
Interspeech 2020<\/i>, pp. 4213&#8211;4217, Shanghai, China, October 2020<\/li>\n\n\n\n<li>R. Gonz\u00e1lez Hautam\u00e4ki and T. Kinnunen, \u201c<a href=\"https:\/\/arxiv.org\/pdf\/2008.04578.pdf\">Why Did the x-Vector System Miss a Target Speaker? Impact of Acoustic Mismatch Upon Target Score on VoxCeleb Data<\/a>\u201d, <i>Proc. Interspeech 2020<\/i>, pp. 4313&#8211;4317, Shanghai, China, October 2020<\/li>\n\n\n\n<li>X. Liu, M. Sahidullah, T. Kinnunen, \u201c<a href=\"https:\/\/arxiv.org\/pdf\/2007.15283.pdf\">A Comparative Re-Assessment of Feature Extractors for Deep Speaker Embeddings<\/a>\u201d, <i>Proc. Interspeech 2020,<\/i> pp. 3221&#8211;3225, Shanghai, China, October 2020<\/li>\n\n\n\n<li>T. Kinnunen, H. Delgado, N. Evans, K.A. Lee, V. Vestman, A. Nautsch, M. Todisco, X. Wang, M. Sahidullah, J. Yamagishi, D.A. Reynolds, \u201c<a href=\"http:\/\/cs.joensuu.fi\/pages\/tkinnu\/webpage\/pdf\/Tandem_t_DCF_IEEE_ACM_TASLP.pdf\">Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification: Fundamentals<\/a>\u201d, <i>IEEE\/ACM Transactions on Audio, Speech, and Language Processing<\/i><\/li>\n\n\n\n<li>I. Kukanov, TN. Trong, V. Hautam\u00e4ki, SM. Siniscalchi, VM. Salerno, KA. Lee.&nbsp; &#8220;<a href=\"http:\/\/cs.uef.fi\/~villeh\/TASLP-MFoM.pdf\">Maximal figure-of-merit framework to detect multi-label phonetic features for spoken language recognition<\/a>&#8221; <em>IEEE\/ACM transactions on audio, speech, and language processing<\/em> 28: 682-695. 2020<\/li>\n\n\n\n<li>A. Kanervisto,&nbsp; C. Scheller, V. Hautam\u00e4ki, &#8220;<a href=\"https:\/\/arxiv.org\/abs\/2004.00980\">Action Space Shaping in Deep Reinforcement Learning<\/a>&#8220;, <em>IEEE Conference on Games 2020<\/em><\/li>\n\n\n\n<li>A. Kanervisto, J. Pussinen, Ville Hautam\u00e4ki, &#8220;<a href=\"https:\/\/arxiv.org\/abs\/2004.00981\">Benchmarking End-to-End Behavioural Cloning on Video Games<\/a>&#8220;, <em>IEEE Conference on Games 2020<\/em><\/li>\n\n\n\n<li>X. Wang, J. 
Yamagishi, M. Todisco, H. Delgado, A. Nautsch, N. Evans, M. Sahidullah, V. Vestman, T. Kinnunen, K.A. Lee, L. Juvela, P. Alku, Y.-H. Peng, H.-T. Hwang, Y. Tsao, H.-M. Wang, S. L. Maguer, M. Becker, F. Henderson, R. Clark, Y. Zhang, Q. Wang, Y. Jia, K. Onuma, K. Mushika, T. Kaneda, Y. Jiang, L.-J. Liu, Y.-C. Wu, W.-C. Huang, T. Toda, K. Tanaka, H. Kameoka, I. Steiner, D. Matrouf, J. -F. Bonastre, A. Govender, S. Ronanki, J.-X. Zhang, Z.-H. Ling, \u201c<a href=\"https:\/\/arxiv.org\/abs\/1911.01601\">ASVspoof 2019: a large-scale public database of synthetic, converted and replayed speech<\/a>\u201d, <i>Computer Speech &amp; Language<\/i>, 64, November 2020<\/li>\n\n\n\n<li>A. Kanervisto, J. Karttunen, V. Hautam\u00e4ki, &#8220;<a href=\"https:\/\/arxiv.org\/abs\/2005.03374\">Playing Minecraft with Behavioural Cloning<\/a>&#8220;, <em>PMLR post proceedings &#8211; Competition Track@NeurIPS2019<\/em>, 2020<\/li>\n\n\n\n<li>B. Chettri, T. Kinnunen, E. Benetos, \u201c<a href=\"http:\/\/cs.joensuu.fi\/pages\/tkinnu\/webpage\/pdf\/VAE_replay_spoofing_CSL.pdf\">Deep Generative Variational Autoencoding for Replay Spoof Detection in Automatic Speaker Verification<\/a>\u201d,<em> Computer Speech &amp; Language<\/em>, 63: 1&#8211;18, September 2020<\/li>\n\n\n\n<li>V. Vestman, K.A. Lee, T. Kinnunen, \u201c<a href=\"http:\/\/cs.joensuu.fi\/pages\/tkinnu\/webpage\/pdf\/Odyssey2020__Neural_i_vector.pdf\">Neural i-Vectors<\/a>\u201d, <i><a href=\"http:\/\/www.odyssey2020.org\/\">Proc. Odyssey 2020<\/a>, <\/i>pp. 67&#8211;74, Tokyo, Japan, Nov. 2020<\/li>\n\n\n\n<li>B. Chettri, T. Kinnunen, E. Benetos, \u201c<a href=\"http:\/\/cs.joensuu.fi\/pages\/tkinnu\/webpage\/pdf\/odyssey2020_subband_spoof.pdf\">Subband modeling for Spoofing Detection in Automatic Speaker Verification<\/a>\u201d, <i><a href=\"http:\/\/www.odyssey2020.org\/\">Proc. Odyssey 2020<\/a>, <\/i>pp. 341&#8211;348, Tokyo, Japan, Nov. 2020<\/li>\n\n\n\n<li>A. Kanervisto, V. Hautam\u00e4ki, T. Kinnunen, J. 
Yamagishi, \u201c<a href=\"http:\/\/cs.joensuu.fi\/pages\/tkinnu\/webpage\/pdf\/Odyssey2020_reinforce_tDCF.pdf\">An Initial Investigation on Optimizing Tandem Speaker Verification and Countermeasure Systems Using Reinforcement Learning<\/a>\u201d, <i><a href=\"http:\/\/www.odyssey2020.org\/\">Proc. Odyssey 2020<\/a>,<\/i> pp. 151&#8211;158, Tokyo, Japan, Nov. 2020<\/li>\n\n\n\n<li>J. Karttunen, A. Kanervisto, V. Kyrki, Ville Hautam\u00e4ki, &#8220;<a href=\"http:\/\/cs.uef.fi\/~villeh\/Turtlebot_ICASSP.pdf\">From Video Game to Real Robot: The Transfer Between Action Spaces<\/a>&#8220;,<em> IEEE ICASSP<\/em>, Virtual conference, May 2020<\/li>\n\n\n\n<li>A. Sholokhov, T. Kinnunen, V. Vestman, K.A. Lee, \u201c<a href=\"https:\/\/arxiv.org\/pdf\/1911.01182.pdf\">Voice Biometrics Security: Extrapolating False Alarm Rate via Hierarchical Bayesian Modeling of Speaker Verification Scores<\/a>\u201d, <em>Computer Speech &amp; Language<\/em>, 60: 1&#8211;19, March 2020<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">2019<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>R. Gonz\u00e1lez Hautam\u00e4ki and T. Kinnunen, \u201c<a href=\"http:\/\/cs.uef.fi\/pages\/tkinnu\/webpage\/pdf\/ASRU2019_framing.pdf\">Towards Controlling False Alarm &#8212; Miss Trade-Off in Perceptual Speaker Comparison via Non-Neutral Listening Task Framing<\/a>\u201d, Proc. IEEE ASRU, December 2019, Singapore<\/li>\n\n\n\n<li>A. Sholokhov, T. Kinnunen, V. Vestman, K.A. Lee, \u201c<a href=\"https:\/\/arxiv.org\/pdf\/1911.01182.pdf\">Voice Biometrics Security: Extrapolating False Alarm Rate via Hierarchical Bayesian Modeling of Speaker Verification Scores<\/a>\u201d, Computer Speech &amp; Language, 60: 1&#8211;19, March 2020<\/li>\n\n\n\n<li>A. Kato and T. 
Kinnunen, \u201c<a href=\"http:\/\/cs.uef.fi\/pages\/tkinnu\/webpage\/pdf\/Statistical_Regression_Models_for_Noise_Robust_F0_Estimation.pdf\">Statistical Regression Models for Noise Robust F0 Estimation Using Recurrent Deep Neural Networks<\/a>\u201d, IEEE\/ACM Transactions on Audio, Speech, and Language Processing, 27(2):2336&#8211;2349, December 2019.<\/li>\n\n\n\n<li>R. Gonz\u00e1lez Hautam\u00e4ki, V. Hautam\u00e4ki, T. Kinnunen, \u201c<a href=\"http:\/\/cs.uef.fi\/pages\/tkinnu\/webpage\/pdf\/Disguise_JASA_accepted.pdf\">On Limits of Automatic Speaker Verification: Explaining Degraded Recognizer Score Through Acoustic Changes Resulting from Voice Disguise<\/a>\u201d, Journal of the Acoustic Society of America, 146(1): 693&#8211;704, July 2019<\/li>\n\n\n\n<li>V. Vestman, T. Kinnunen, R. Gonzalez Hautam\u00e4ki, M. Sahidullah, \u201c<a href=\"http:\/\/cs.uef.fi\/pages\/tkinnu\/webpage\/pdf\/mimicry_attack2019_CSL.pdf\">Voice Mimicry Attacks Assisted by Automatic Speaker Verification\u201d<\/a>, Computer Speech &amp; Language, 59: 36&#8211;54, January 2020.<\/li>\n\n\n\n<li>V. Vestman, K. A. Lee, T. Kinnunen, T. Koshinaka, \u201c<a href=\"http:\/\/cs.uef.fi\/pages\/tkinnu\/webpage\/pdf\/Interspeech19__pytorch_i_vectors.pdf\">Unleashing the Unused Potential of I-Vectors Enabled by GPU Acceleration\u201d<\/a>, <i>Proc. Interspeech 2019<\/i>, pp. 351&#8211;355, Graz, Austria, September 2019<\/li>\n\n\n\n<li>M. Todisco, X. Wang, V. Vestman, M. Sahidullah, H. Delgado, A. Nautsch, J. Yamagishi, N. Evans, T. Kinnunen, K. A. Lee, \u201c<a href=\"http:\/\/cs.uef.fi\/pages\/tkinnu\/webpage\/pdf\/Interspeech2019_ASVSpoof2019.pdf\">ASVspoof 2019: Future Horizons in Spoofed and Fake Audio Detection<\/a>\u201d,&nbsp; <i>Proc. Interspeech 2019<\/i>, pp. 
1008&#8211;1012, Graz, Austria, September 2019<\/li>\n\n\n\n<li>Bilal Soomro, Anssi Kanervisto, Trung Ngo Trong and Ville Hautam\u00e4ki, &#8220;<a href=\"http:\/\/cs.uef.fi\/~villeh\/IS2019-debugging.pdf\">Towards Debugging Deep Neural Networks by Generating Speech Utterances<\/a>&#8220;, <i>Proc. Interspeech 2019<\/i>,,&nbsp; pp.&nbsp; 3213-3217, Graz, Austria, September 2019. <a href=\"https:\/\/github.com\/bilalsoomro\/debugging-deep-neural-networks\">Github<\/a><\/li>\n\n\n\n<li>K. A. Lee, V. Hautam\u00e4ki, T. Kinnunen, H. Yamamoto, K. Okabe, V. Vestman, J. Huang, G. Ding, H. Sun, A. Larcher, R. K. Das, H. Li, M. Rouvier, P. Bousquet, W. Rao, Q. Wang, C. Zhang, F. Bahmaninezhad, H. Delgado, M. Todisco, Q. Wang, L. Guo, T. Koshinaka, J. Zhang, K. Shinoda, T. N. Trong, M. Sahidullah, F. Lu, Y. Tang, M. Tu, K. K. Teh, H. D. Tran, K. K. George, I. Kukanov, F. Desnous, J. Yang, E. Y\u0131lmaz, L. Xu, J. Bonastre, C. Xu, Z. H. Lim, E. S. Chng, S. Ranjan, J. H. L. Hansen, J. Patino, N. Evans, \u201c<a href=\"http:\/\/cs.uef.fi\/pages\/tkinnu\/webpage\/pdf\/i4u_interspeech_2019.pdf\">I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences<\/a>\u201d, <i>Proc. Interspeech 2019,<\/i> pp. 1497&#8211;1501, Graz, Austria, September 2019<\/li>\n\n\n\n<li>X. Wu, E. Granger, T. Kinnunen, X. Feng, A. Hadid, \u201cAudio-Visual Kinship Verification in the Wild\u201d,&nbsp; 12th IAPR International Conference On Biometrics (ICB 2019), Crete, Greece, June 2019. [<a href=\"http:\/\/cs.joensuu.fi\/pages\/tkinnu\/webpage\/pdf\/audio_visual_kinship_ICB2019.pdf\">PDF<\/a>]<\/li>\n\n\n\n<li>T. Kinnunen, R. Gonz\u00e1lez Hautam\u00e4ki, V. Vestman, M. Sahidullah, \u201cCan We Use Speaker Recognition Technology to Attack Itself? Enhancing Mimicry Attacks Using Automatic Target Speaker Selection\u201d, Proc. IEEE ICASSP, pp. 
6146&#8211;6150, Brighton, UK, May 2019 [<a href=\"http:\/\/cs.joensuu.fi\/pages\/tkinnu\/webpage\/pdf\/ICASSP19_can_we_use_ASV_to_attack_itself.pdf\">PDF<\/a>]<\/li>\n\n\n\n<li>V. Vestman, B. Soomro, A. Kanervisto, V. Hautam\u00e4ki, T. Kinnunen, \u201cWho Do I sound Like? Showcasing Speaker Recognition Technology by YouTube Voice Search\u201d, Proc. IEEE ICASSP, pp. 5781&#8211;5785, Brighton, UK, May 2019 [<a href=\"http:\/\/cs.joensuu.fi\/pages\/tkinnu\/webpage\/pdf\/ICASSP19_who_do_I_sound_like.pdf\">PDF<\/a>]<\/li>\n\n\n\n<li>E. Jokinen, R. Saeidi, T. Kinnunen, P. Alku, &#8220;Vocal Effort Compensation for MFCC Feature Extraction in a Shouted Versus Normal Speaker Recognition Task&#8221;, <em>Computer Speech &amp; Language<\/em>, 53: 1-11, January 2019 IF=1.776 JF=<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">2018<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>M. Sahidullah, H. Delgado, M. Todisco, T. Kinnunen, N. Evans, J. Yamagishi, K.A. Lee, \u201cIntroduction to Voice Presentation Attack Detection and Recent Advances\u201d, book chapter in Handbook of Biometric Anti-Spoofing: Presentation Attack Detection, Springer, S. Marcel, M.S. Nixon, J. Fierrez, N. Evans (Eds.), Springer, 2018 [<a href=\"http:\/\/cs.joensuu.fi\/pages\/tkinnu\/webpage\/pdf\/voicePAD-springer2018.pdf\">PDF<\/a>]<\/li>\n\n\n\n<li>F. Fang, J. Yamagishi, I. Echizen, M. Sahidullah, T. Kinnunen, &#8220;Transforming Acoustic Characteristics to Deceive Playback Spoofing Countermeasures of Speaker Verification Systems&#8221;, Proc&nbsp;IEEE Int. Workshop on Information Forensics and Security (WIFS 2018), Hong Kong, China, 2018 [<a href=\"http:\/\/cs.joensuu.fi\/pages\/tkinnu\/webpage\/pdf\/fang_transforming_WIFS2018.pdf\">PDF<\/a>]<\/li>\n\n\n\n<li>M. Todisco, H. Delgado, K.A. Lee, M. Sahidullah, N. Evans, T. Kinnunen, J. Yamagishi, &#8220;Integrated Presentation Attack Detection and Automatic Speaker Verification: Common Features and Gaussian Back-end Fusion&#8221;, <i>Proc. 
Interspeech 2018, <\/i>pp. 77-81, Hyderabad, India, September 2018<i> <\/i>[<a href=\"http:\/\/cs.joensuu.fi\/pages\/tkinnu\/webpage\/pdf\/integrated-presentation-attack-INTERSPEECH2018.pdf\">PDF<\/a>] JF=1<\/li>\n\n\n\n<li>A. Kato, T. Kinnunen, &#8220;Waveform to Single Sinusoid Regression to Estimate the F0 Contour from Noisy Speech Using Recurrent Deep Neural Networks&#8221;, <i>Proc. Interspeech 2018, <\/i>pp. 327-331, Hyderabad, India, September 2018<i> <\/i>[<a href=\"http:\/\/cs.joensuu.fi\/pages\/tkinnu\/webpage\/pdf\/waveform-to-sinusoid-INTERSPEECH2018.pdf\">PDF<\/a>] JF=1<\/li>\n\n\n\n<li>S. Sieranoja, M. Sahidullah, T. Kinnunen, J. Komulainen, A. Hadid, &#8220;Audiovisual Synchrony Detection with Optimized Audio Features&#8221;, accepted to <a href=\"http:\/\/www.icsip.org\/\"><i>IEEE 3rd Int. Conference on Signal and Image Processing (ICSIP 2018)<\/i><\/a>, Shenzhen, China, July 2018 [<a href=\"http:\/\/cs.joensuu.fi\/pages\/tkinnu\/webpage\/pdf\/audiovisual_synchrony_2018.pdf\">PDF<\/a>] JF=0<\/li>\n\n\n\n<li>T. N. Trong, V. Hautam\u00e4ki, and K. Jokinen, &#8220;<a href=\"http:\/\/cs.uef.fi\/~villeh\/staircase-network-structural.pdf\">Staircase Network: structural language identification via hierarchical attentive units<\/a>&#8220;, <a href=\"http:\/\/www.odyssey2018.org\/\"><i>Proc. Odyssey 2018<\/i><\/a>, pp. 60-67, Les Sables d&#8217;Olonne, France, June 2018 JF=1<\/li>\n\n\n\n<li>T. Kinnunen, K.A. Lee, H. Delgado, N. Evans, M. Todisco, M. Sahidullah, J. Yamagishi, D.A. Reynolds,&nbsp; &#8220;t-DCF: a Detection Cost Function for the Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification&#8221;, <a href=\"http:\/\/www.odyssey2018.org\/\"> <i>Proc. Odyssey 2018<\/i><\/a>, pp. 312-319, Les Sables d&#8217;Olonne, France, June 2018 [<a href=\"http:\/\/cs.joensuu.fi\/pages\/tkinnu\/webpage\/pdf\/tDCF_Odyssey2018.pdf\">PDF<\/a>] JF=1<\/li>\n\n\n\n<li>T. Kinnunen, J. Lorenzo-Trueba, J. Yamagishi, T. Toda, D. Saito, F. 
Villavicencio, Z. Ling, &#8220;A Spoofing Benchmark for the 2018 Voice Conversion Challenge: Leveraging from Spoofing Countermeasures for Speech Artifact Assessment&#8221;, <a href=\"http:\/\/www.odyssey2018.org\/\"><i>Proc. Odyssey 2018<\/i><\/a>, pp. 187-194, Les Sables d&#8217;Olonne, France, June 2018 [<a href=\"http:\/\/cs.joensuu.fi\/pages\/tkinnu\/webpage\/pdf\/Spoofing_benchmark_VCC18.pdf\">PDF (original)<\/a>], [<a href=\"https:\/\/arxiv.org\/pdf\/1804.08438.pdf\">PDF (corrected version from arXiv, with a bug fix)<\/a>] JF=1<\/li>\n\n\n\n<li>J. Lorenzo-Trueba, J. Yamagishi, T. Toda, D. Saito, F. Villavicencio, T. Kinnunen, Z. Ling, &#8220;The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods&#8221;, <a href=\"http:\/\/www.odyssey2018.org\/\"><i>Proc. Odyssey 2018<\/i><\/a>, pp. 195-202, Les Sables d&#8217;Olonne, France, June 2018 [<a href=\"http:\/\/cs.joensuu.fi\/pages\/tkinnu\/webpage\/pdf\/VCC18_overview_Odyssey2018.pdf\">PDF<\/a>] [<a href=\"http:\/\/dx.doi.org\/10.7488\/ds\/2337\">The data and challenge results are available here<\/a>] JF=1<\/li>\n\n\n\n<li>V. Vestman and T. Kinnunen, &#8220;Supervector Compression Strategies to Speed up I-Vector System Development&#8221;, <a href=\"http:\/\/www.odyssey2018.org\/\"><i>Proc. Odyssey 2018<\/i><\/a>, pp. 357-364, Les Sables d&#8217;Olonne, France, June 2018 [<a href=\"http:\/\/cs.joensuu.fi\/pages\/tkinnu\/webpage\/pdf\/supervector_compression_Odyssey2018.pdf\">PDF<\/a>] JF=1<\/li>\n\n\n\n<li>R. Gonzalez Hautam\u00e4ki, A. Kanervisto, V. Hautam\u00e4ki, T. Kinnunen, &#8220;Perceptual Evaluation of the Effectiveness of Voice Disguise by Age Modification&#8221;, <a href=\"http:\/\/www.odyssey2018.org\/\"><i>Proc. 
<\/i><\/a><a href=\"http:\/\/www.odyssey2018.org\/\"><i>Odyssey 2018<\/i><\/a>, pp. 320-326, Les Sables d&#8217;Olonne, France, June 2018 [<a href=\"http:\/\/cs.joensuu.fi\/pages\/tkinnu\/webpage\/pdf\/Perceptual_effectiveness_Odyssey2018.pdf\">PDF<\/a>] JF=1<\/li>\n\n\n\n<li>A. Kato and T. Kinnunen, &#8220;A Regression Model of Recurrent Deep Neural Networks for Noise Robust Estimation of the Fundamental Frequency Contour of Speech&#8221;, <a href=\"http:\/\/www.odyssey2018.org\/\"><i>Proc. <\/i><\/a><a href=\"http:\/\/www.odyssey2018.org\/\"><i>Odyssey 2018<\/i><\/a>, pp. 275-282, Les Sables d&#8217;Olonne, France, June 2018 [<a href=\"http:\/\/cs.joensuu.fi\/pages\/tkinnu\/webpage\/pdf\/F0regression_Odyssey2018.pdf\">PDF<\/a>] JF=1<\/li>\n\n\n\n<li>J. Lorenzo-Trueba, F. Fang, X. Wang, I. Echizen, J. Yamagishi, T. Kinnunen, &#8220;Can we steal your vocal identity from the Internet? Initial investigation of cloning Obama\u2019s voice using GAN, WaveNet and low-quality found data&#8221;, <a href=\"http:\/\/www.odyssey2018.org\/\"><i>Proc. <\/i><\/a><a href=\"http:\/\/www.odyssey2018.org\/\"><i>Odyssey 2018<\/i><\/a>, pp. 240-247, Les Sables d&#8217;Olonne, France, June 2018 [<a href=\"http:\/\/cs.joensuu.fi\/pages\/tkinnu\/webpage\/pdf\/Obama_Odyssey2018.pdf\">PDF<\/a>] JF=1<\/li>\n\n\n\n<li>H. Delgado, M. Todisco, M. Sahidullah, N. Evans, T. Kinnunen, K.A. Lee, J. Yamagishi, &#8220;ASVspoof 2017 Version 2.0: meta-data analysis and baseline enhancements&#8221;,&nbsp; <a href=\"http:\/\/www.odyssey2018.org\/\"><i>Proc. <\/i><\/a><a href=\"http:\/\/www.odyssey2018.org\/\"><i>Odyssey 2018<\/i><\/a>, pp. 296-303, Les Sables d&#8217;Olonne, France, June 2018 [<a href=\"http:\/\/cs.joensuu.fi\/pages\/tkinnu\/webpage\/pdf\/ASVspoof2.0_Odyssey2018.pdf\">PDF<\/a>] JF=1<\/li>\n\n\n\n<li>T. Lepp\u00e4nen, H. Vrzakova, R. Bednarik, A. Kanervisto, A. Elomaa, A. Huotarinen, P. Bartczak, M. Fraunberg, J. 
J\u00e4\u00e4skel\u00e4inen, &#8220;Augmenting Microsurgical Training: Microsurgical Instrument Detection Using Convolutional Neural Networks&#8221;, <em>Proc. CBMS 2018<\/em>, pp. 211-216, Karlstad, June 2018 JF=1<\/li>\n\n\n\n<li>T. N. Trong, K. Jokinen and V. Hautam\u00e4ki, &#8220;<a href=\"http:\/\/cs.uef.fi\/~villeh\/IWSDS-2018_paper_20.pdf\">Enabling Spoken Dialogue Systems for low-resourced languages &#8211; end-to-end dialect recognition for North Sami<\/a>&#8221;, IWSDS 2018, Singapore, May 2018 [Best paper award]<\/li>\n\n\n\n<li>I. Kukanov, V. Hautam\u00e4ki and Kong Aik Lee, &#8220;<a href=\"http:\/\/cs.uef.fi\/~villeh\/MFoM-ICASSP2017.pdf\">Maximal Figure-of-Merit Embedding for Multi-label Audio Classification<\/a>&#8221;, Proc. ICASSP 2018, pp. 136-140, Calgary, Canada, April 2018 JF=1<\/li>\n\n\n\n<li>V. Vestman, D. Gowda, M. Sahidullah, P. Alku, and T. Kinnunen, &#8220;Speaker Recognition from Whispered Speech: a Tutorial Survey and an Application of Time-Varying Linear Prediction&#8221;, <em>Speech Communication<\/em>, 99: 62-79, May 2018 IF=1.585 JF=2<\/li>\n\n\n\n<li>M. Sahidullah, D. Thomsen, R. Gonzalez Hautam\u00e4ki, T. Kinnunen, Z.-H. Tan, R. Parts, and M. Pitk\u00e4nen, &#8220;Robust Voice Liveness Detection and Speaker Verification Using Throat Microphones&#8221;, <i>IEEE\/ACM Trans. on Audio, Speech, and Language Processing<\/i>, 26(1): 44-56, January 2018 [<a href=\"http:\/\/cs.joensuu.fi\/pages\/tkinnu\/webpage\/pdf\/IEEE_TASLP_throat_mic_ASV_liveness.pdf\">PDF<\/a>] IF=2.95 JF=2<\/li>\n\n\n\n<li>A. Sholokhov, M. Sahidullah, T. Kinnunen, &#8220;Semi-Supervised Speech Activity Detection with an Application to Automatic Speaker Verification&#8221;, <em>Computer Speech &amp; Language<\/em>, 47: 132-156, January 2018 IF=1.90 JF=2<\/li>\n<\/ol>\n\n\n\n<p>The group separated from the <a href=\"http:\/\/www.uef.fi\/web\/machine-learning\">machine learning group<\/a> in 2018. 
Articles published before 2018 can be found <a href=\"http:\/\/www.uef.fi\/web\/machine-learning\/publications\">here.<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>2026(updated 10.02.2026) 2025 2024 2023 2022 2021 2020 2019 2018 Group has separated from machine learning group on 2018. Articles published before 2018 can be found here.<\/p>\n","protected":false},"author":24,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"_acf_changed":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"class_list":["post-76","page","type-page","status-publish","hentry"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Publications - Computational speech group<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/sites.uef.fi\/speech\/publications\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Publications - Computational speech group\" \/>\n<meta property=\"og:description\" content=\"2026(updated 10.02.2026) 2025 2024 2023 2022 2021 2020 2019 2018 Group has separated from machine learning group on 2018. Articles published before 2018 can be found here.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/sites.uef.fi\/speech\/publications\/\" \/>\n<meta property=\"og:site_name\" content=\"Computational speech group\" \/>\n<meta property=\"article:modified_time\" content=\"2026-02-10T13:22:54+00:00\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"15 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/sites.uef.fi\\\/speech\\\/publications\\\/\",\"url\":\"https:\\\/\\\/sites.uef.fi\\\/speech\\\/publications\\\/\",\"name\":\"Publications - Computational speech group\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/sites.uef.fi\\\/speech\\\/#website\"},\"datePublished\":\"2020-07-24T15:50:01+00:00\",\"dateModified\":\"2026-02-10T13:22:54+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/sites.uef.fi\\\/speech\\\/publications\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/sites.uef.fi\\\/speech\\\/publications\\\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/sites.uef.fi\\\/speech\\\/publications\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/sites.uef.fi\\\/speech\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Publications\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/sites.uef.fi\\\/speech\\\/#website\",\"url\":\"https:\\\/\\\/sites.uef.fi\\\/speech\\\/\",\"name\":\"Computational speech group\",\"description\":\"\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/sites.uef.fi\\\/speech\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Publications - Computational speech group","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/sites.uef.fi\/speech\/publications\/","og_locale":"en_US","og_type":"article","og_title":"Publications - Computational speech group","og_description":"2026(updated 10.02.2026) 2025 2024 2023 2022 2021 2020 2019 2018 Group has separated from machine learning group on 2018. Articles published before 2018 can be found here.","og_url":"https:\/\/sites.uef.fi\/speech\/publications\/","og_site_name":"Computational speech group","article_modified_time":"2026-02-10T13:22:54+00:00","twitter_card":"summary_large_image","twitter_misc":{"Est. reading time":"15 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/sites.uef.fi\/speech\/publications\/","url":"https:\/\/sites.uef.fi\/speech\/publications\/","name":"Publications - Computational speech group","isPartOf":{"@id":"https:\/\/sites.uef.fi\/speech\/#website"},"datePublished":"2020-07-24T15:50:01+00:00","dateModified":"2026-02-10T13:22:54+00:00","breadcrumb":{"@id":"https:\/\/sites.uef.fi\/speech\/publications\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/sites.uef.fi\/speech\/publications\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/sites.uef.fi\/speech\/publications\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/sites.uef.fi\/speech\/"},{"@type":"ListItem","position":2,"name":"Publications"}]},{"@type":"WebSite","@id":"https:\/\/sites.uef.fi\/speech\/#website","url":"https:\/\/sites.uef.fi\/speech\/","name":"Computational speech 
group","description":"","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/sites.uef.fi\/speech\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"}]}},"_links":{"self":[{"href":"https:\/\/sites.uef.fi\/speech\/wp-json\/wp\/v2\/pages\/76","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sites.uef.fi\/speech\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/sites.uef.fi\/speech\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/sites.uef.fi\/speech\/wp-json\/wp\/v2\/users\/24"}],"replies":[{"embeddable":true,"href":"https:\/\/sites.uef.fi\/speech\/wp-json\/wp\/v2\/comments?post=76"}],"version-history":[{"count":1,"href":"https:\/\/sites.uef.fi\/speech\/wp-json\/wp\/v2\/pages\/76\/revisions"}],"predecessor-version":[{"id":1285,"href":"https:\/\/sites.uef.fi\/speech\/wp-json\/wp\/v2\/pages\/76\/revisions\/1285"}],"wp:attachment":[{"href":"https:\/\/sites.uef.fi\/speech\/wp-json\/wp\/v2\/media?parent=76"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}