Publications

2026 (updated 10.02.2026)

Xi Xuan, Davide Carbone, Ruchi Pandey, Wenxin Zhang, Tomi H Kinnunen, “WST-X Series: Wavelet Scattering Transform for Interpretable Speech Deepfake Detection“, IEEE Signal Processing Letters.
Xi Xuan, Xuechen Liu, Wenxin Zhang, Yi-Cheng Lin, Xiaojian Lin, Tomi Kinnunen, “WaveSP-Net: Learnable Wavelet-Domain Sparse Prompt Tuning for Speech Deepfake Detection“, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Proceedings, 2026.
Oguzhan Kurnaz, Jagabandhu Mishra, Tomi Kinnunen, Cemal Hanilci, “Joint Optimization of ASV and CM tasks: BTUEF Team’s Submission for WildSpoof Challenge“

2025

Manasi Chhibber, Jagabandhu Mishra, Hyejin Shim, Tomi H. Kinnunen, “An Explainable Probabilistic Attribute Embedding Approach for Spoofed Speech Characterization”, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Proceedings, 2025.
Anton Firc, Manasi Chhibber, Jagabandhu Mishra, Vishwanath Pratap Singh, Tomi Kinnunen, Kamil Malinka, “STOPA: A Database of Systematic VariaTion Of DeePfake Audio for Open-Set Source Tracing and Attribution”, Interspeech, 2025.
Satu Hopponen, Tomi Kinnunen, Alexandre Nikolaev, Rosa González Hautamäki, Lauri Tavi, Einar Meister, “FROST-EMA: Finnish and Russian Oral Speech Dataset of Electromagnetic Articulography Measurements with L1, L2 and Imitated L2 Accents“, Interspeech, 2025.
Parismita Gogoi, Vishwanath Pratap Singh, Seema Khadirnaikar, Soma Siddhartha, Sishir Kalita, Jagabandhu Mishra, Md Sahidullah, Priyankoo Sarmah, S. R. M. Prasanna, “Leveraging AM and FM Rhythm Spectrograms for Dementia Classification and Assessment“, Interspeech, 2025.
Vishwanath Pratap Singh, Md. Sahidullah, Tomi Kinnunen, “Causal Structure Discovery for Error Diagnostics of Children’s ASR“, Interspeech, 2025.
Edem Ahadzi, Vishwanath Pratap Singh, Tomi Kinnunen, Ville Hautamaki, “Continuous Learning for Children’s ASR: Overcoming Catastrophic Forgetting with Elastic Weight Consolidation and Synaptic Intelligence“, Interspeech, 2025.
Xi Xuan, Yang Xiao, Rohan Kumar Das, Tomi Kinnunen, “Multilingual Source Tracing of Speech Deepfakes: A First Benchmark“, Interspeech SPSC 2025 – 5th Symposium on Security and Privacy in Speech Communication
Manasi Chhibber, Jagabandhu Mishra, Tomi H. Kinnunen, “Advancing Zero-Shot Open-Set Speech Deepfake Source Tracing“
Oğuzhan Kurnaz, Jagabandhu Mishra, Tomi H Kinnunen, Cemal Hanilçi, “Joint Optimization of Speaker and Spoof Detectors for Spoofing-Robust Automatic Speaker Verification“

2024

J. Mishra and S. R. M. Prasanna, “Implicit Self-Supervised Language Representation for Spoken Language Diarization“, IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2024.
Jagabandhu Mishra, S.R. Mahadeva Prasanna, “Generative attention based framework for implicit language change detection“, Digital Signal Processing, 2024.
Vishwanath Pratap Singh, Federico Malato, Ville Hautamaki, Md Sahidullah, Tomi Kinnunen, “ROAR: Reinforcing Original to Augmented Data Ratio Dynamics for Wav2Vec2.0 Based ASR”, Interspeech, 2024.
Mishra, J., Prasanna, S.R.M., “Spoken Language Change Detection Inspired by Speaker Change Detection”, Circuits, Systems, and Signal Processing, 2024.
Vishwanath Pratap Singh, Md Sahidullah, Tomi Kinnunen, “ChildAugment: Data Augmentation Methods for Zero-Resource Children’s Speaker Verification“, The Journal of the Acoustical Society of America (JASA) (accepted), 2024.
Xuechen Liu, Md Sahidullah, Kong Aik Lee, Tomi Kinnunen, “Generalizing Speaker Verification for Spoof Awareness in the Embedding Space“, IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), vol. 32, pp. 1261 – 1273, 2024.
Yi Ma, Kong Aik Lee, Ville Hautamäki, Meng Ge, Haizhou Li, “Gradient Weighting for Speaker Verification in Extremely Low Signal-to-Noise Ratio”, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Proceedings, 2024.
Xin Wang, Tomi Kinnunen, Kong Aik Lee, Paul-Gauthier Noé, Junichi Yamagishi, “Revisiting and Improving Scoring Fusion for Spoofing-aware Speaker Verification Using Compositional Data Analysis“, Interspeech, 2024.
Ivan Kukanov, Janne Laakkonen, Tomi Kinnunen, Ville Hautamäki, “Meta-Learning Approaches for Improving Detection of Unseen Speech Deepfakes“, IEEE Spoken Language Technology Workshop, 2024.
Hye-jin Shim, Md Sahidullah, Jee-weon Jung, Shinji Watanabe, Tomi Kinnunen, “Beyond Silence: Bias Analysis through Loss and Asymmetric Approach in Audio Anti-Spoofing“, Synthetic Data’s Transformative Role in Foundational Speech Models, 2024.
Hye-jin Shim, Jee-weon Jung, Tomi Kinnunen, Nicholas Evans, Jean-François Bonastre, Itshak Lapidot, “a-DCF: an architecture agnostic metric with application to spoofing-robust speaker verification“, The Speaker and Language Recognition Workshop (Odyssey 2024).
Tomi Kinnunen, Rosa Gonzalez Hautamäki, Xin Wang, Junichi Yamagishi, “Speaker Detection by the Individual Listener and the Crowd: Parametric Models Applicable to Bonafide and Deepfake Speech“, Interspeech, 2024.
Tomi Kinnunen, Kong Aik Lee, Hemlata Tak, Nicholas Evans, Andreas Nautsch, “t-EER: Parameter-Free Tandem Evaluation of Countermeasures and Biometric Comparators”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024.

2023

T. H. Kinnunen, K. A. Lee, H. Tak, N. Evans, A. Nautsch, “t-EER: Parameter-Free Tandem Evaluation of Countermeasures and Biometric Comparators“, IEEE Transactions on Pattern Analysis and Machine Intelligence, doi: 10.1109/TPAMI.2023.3313648.
Xuechen Liu, Xin Wang, Md Sahidullah, Jose Patino, Héctor Delgado, Tomi Kinnunen, Massimiliano Todisco, Junichi Yamagishi, Nicholas Evans, Andreas Nautsch, Kong Aik Lee, “ASVspoof 2021: Towards Spoofed and Deepfake Speech Detection in the Wild“, IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 31, pp. 2507-2522, 2023.
Xuechen Liu, Md Sahidullah, Kong Aik Lee, Tomi Kinnunen, “Speaker-Aware Anti-Spoofing“, Proc. Interspeech, 2498-2502, Dublin, Ireland, 2023.
R Tao, KA Lee, RK Das, V Hautamäki, H Li, “Self-Supervised Training of Speaker Encoder With Multi-Modal Diverse Positive Pairs”, IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), vol. 31, pp. 1706 – 1719, 2023.
Sung Hwan Mun, Hye-jin Shim, Hemlata Tak, Xin Wang, Xuechen Liu, Md Sahidullah, Myeonghun Jeong, Min Hyun Han, Massimiliano Todisco, Kong Aik Lee, Junichi Yamagishi, Nicholas Evans, Tomi Kinnunen, Nam Soo Kim, Jee-weon Jung, “Towards single integrated spoofing-aware speaker verification embeddings“, Proc. Interspeech, 3989-3993, Dublin, Ireland, 2023.
Hye-jin Shim, Rosa González Hautamäki, Md Sahidullah, Tomi Kinnunen, “How to Construct Perfect and Worse-than-Coin-Flip Spoofing Countermeasures: A Word of Warning on Shortcut Learning“, Proc. Interspeech, 785-789, Dublin, Ireland, 2023.
Hye-jin Shim, Jee-weon Jung, Tomi Kinnunen, “Multi-Dataset Co-Training with Sharpness-Aware Optimization for Audio Anti-spoofing“, Proc. Interspeech, 3804-3808, Dublin, Ireland, 2023.
Vishwanath Pratap Singh, Md Sahidullah, Tomi Kinnunen, “Speaker Verification Across Ages: Investigating Deep Speaker Embedding Sensitivity to Age Mismatch in Enrollment and Test Speech“, Proc. Interspeech, 1948-1952, Dublin, Ireland, 2023.
M. Anderson, T. Kinnunen, N. Harte, “Learnable frontends that do not learn: Quantifying sensitivity to filterbank initialisation”, IEEE ICASSP, Rhodes Island, Greece, 2023.

2022

A. Kanervisto, V. Hautamäki, T. Kinnunen, J. Yamagishi, “Optimizing Tandem Speaker Verification and Anti-Spoofing Systems,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 477-488, 2022.
J.-w. Jung, H. Tak, H.-j. Shim, H.-S. Heo, B.-J. Lee, S.-W. Chung, H.-J. Yu, N. Evans, T. Kinnunen, “SASV 2022: The First Spoofing-Aware Speaker Verification Challenge“, Proc. Interspeech, 2893-2897, Incheon, Korea, 2022.
R. Tao, K.A. Lee, R.K. Das, V. Hautamäki, H. Li, “Self-supervised Speaker Recognition with Loss-gated Learning“, IEEE ICASSP, Singapore, 2022.
X. Liu, M. Sahidullah, T. Kinnunen, “Learnable Nonlinear Compression for Robust Speaker Verification“, IEEE ICASSP, Singapore, 2022.
X. Liu, M. Sahidullah, T. Kinnunen, “Spoofing-aware Speaker Verification with Unsupervised Domain Adaptation“, Speaker Odyssey 2022, Beijing, China.
A. Sholokhov, X. Liu, M. Sahidullah, T. Kinnunen, “Baselines and Protocols for Household Speaker Recognition“, Speaker Odyssey 2022, Beijing, China.
H.-j. Shim, H. Tak, X. Liu, H.-S. Heo, J.-W. Jung, J.S. Chung, S.-W. Chung, H.-J. Yu, B.-J. Lee, M. Todisco, H. Delgado, K.A. Lee, M. Sahidullah, T. Kinnunen, N. Evans, “Baseline Systems for the First Spoofing-Aware Speaker Verification Challenge: Score and Embedding Fusion“, Speaker Odyssey 2022, Beijing, China.
S. Ghimire, T. Kinnunen, R. González Hautamäki, “Gamified Speaker Comparison by Listening“, Speaker Odyssey 2022, Beijing, China.
L. Tavi, T. Kinnunen, R. González Hautamäki, “Improving speaker de-identification with functional data analysis of f0 trajectories“, Speech Communication, Volume 140, Pages 1-10, 2022.

2021

Y. Ma, K.A. Lee, V. Hautamäki, H. Li, “PL-EERSR: Perceptual Loss Based End-to-End Robust Speaker Representation Extraction“, IEEE Automatic Speech Recognition and Understanding workshop, 2021
K. Hechmi, T.N. Trong, V. Hautamäki, T. Kinnunen, “VoxCeleb Enrichment for Age and Gender Recognition”, IEEE Automatic Speech Recognition and Understanding workshop, 2021
X. Liu, M. Sahidullah, T. Kinnunen, “Optimized Power Normalized Cepstral Coefficients towards Robust Deep Speaker Verification”, to appear in IEEE Automatic Speech Recognition and Understanding workshop, 2021
X. Liu, M. Sahidullah, T. Kinnunen, “Parameterized Channel Normalization for Far-field Deep Speaker Verification”, IEEE Automatic Speech Recognition and Understanding workshop, 2021
X. Liu, M. Sahidullah, T. Kinnunen, “Optimizing Multi-Taper Features for Deep Speaker Verification”, IEEE Signal Processing Letters, 28: 2187–2191, October 2021.
L. Tavi, T. Kinnunen, E. Meister, R. González Hautamäki, A. Malmi, “Articulation During Voice Disguise: A Pilot Study”, Proc. Speech and Computer (SPECOM’21), Springer LNAI 12997, pp. 680–691, St. Petersburg, Russia, September 2021.
T. Kinnunen, A. Nautsch, M. Sahidullah, N. Evans, X. Wang, M. Todisco, H. Delgado, J. Yamagishi, K.A. Lee, “Visualizing Classifier Adjacency Relations: A Case Study in Speaker Verification and Voice Anti-Spoofing”, Proc. Interspeech, 4299-4303, Brno, Czech Republic, 2021.
A. Kanervisto, C. Scheller, Y. Schraner and V. Hautamäki, “Distilling Reinforcement Learning Tricks for Video Games“, IEEE Conference on Games, Virtual, 2021.
B. Chettri, R. González Hautamäki, M. Sahidullah, T. Kinnunen, “Data Quality as Predictor of Voice Anti-Spoofing Generalization”, Proc. Interspeech, 1659-1663, Brno, Czech Republic, 2021.
J. Turkia, L. Mehtätalo, U. Schwab, and V. Hautamäki, “Mixed-Effect Bayesian Network Reveals Personal Effects of Nutrition“, Scientific Reports, Vol. 11, No. 12016, 2021.
K.A. Lee, V. Vestman, and T. Kinnunen, “ASVtorch Toolkit: Speaker Verification with Deep Neural Networks”, SoftwareX, Volume 14, 100697, June 2021.
K. Ishihara, A. Kanervisto, J. Miura and V. Hautamäki, “Multi-task Learning with Attention for End-to-end Autonomous Driving“, CVPR 2021 Workshop on Autonomous Driving, 2021.
X. Liu, M. Sahidullah, T. Kinnunen, “Learnable MFCCs for Speaker Verification”, Proc. IEEE Int. Symp. Circuits and Systems (ISCAS 2021), Daegu, Korea, May 2021
A. Nautsch, X. Wang, N. Evans, T. Kinnunen, V. Vestman, M. Todisco, H. Delgado, M. Sahidullah, J. Yamagishi, K.A. Lee, “ASVspoof 2019: Spoofing Countermeasures for the Detection of Synthesized, Converted and Replayed Speech”, IEEE Transactions on Biometrics, Behavior, and Identity Science, 3(2): 252–265, April 2021
M. Sahidullah, A.K. Sarkar, V. Vestman, X. Liu, R. Serizel, T. Kinnunen, Z.-H. Tan, E. Vincent, “UIAI System for Short-Duration Speaker Verification Challenge 2020”, Proc. IEEE Spoken Language Technology Workshop (SLT 2021), Shenzhen, China, January 2021

2020

R. K. Das, T. Kinnunen, W.-C. Huang, Z. Ling, J. Yamagishi, Y. Zhao, X. Tian, T. Toda, “Predictions of Subjective Ratings and Spoofing Assessments of Voice Conversion Challenge 2020 Submissions”, Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge, pp. 99–120, 2020.
Y. Zhao, W.-C. Huang, X. Tian, J. Yamagishi, R.K. Das, T. Kinnunen, Z. Ling, T. Toda, “Voice Conversion Challenge 2020: Intra-lingual semi-parallel and cross-lingual voice conversion”, Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge, pp. 80–98, 2020.
A. Sholokhov, T. Kinnunen, V. Vestman, K.A. Lee, “Extrapolating False Alarm Rates in Automatic Speaker Verification”, Proc. Interspeech 2020, pp. 4218–4222, Shanghai, China, October 2020
R. K. Das, X. Tian, T. Kinnunen, H. Li, “The Attacker’s Perspective on Automatic Speaker Verification: An Overview”, Proc. Interspeech 2020, pp. 4213–4217, Shanghai, China, October 2020
R. González Hautamäki and T. Kinnunen, “Why Did the x-Vector System Miss a Target Speaker? Impact of Acoustic Mismatch Upon Target Score on VoxCeleb Data”, Proc. Interspeech 2020, pp. 4313–4317, Shanghai, China, October 2020
X. Liu, M. Sahidullah, T. Kinnunen, “A Comparative Re-Assessment of Feature Extractors for Deep Speaker Embeddings”, Proc. Interspeech 2020, pp. 3221–3225, Shanghai, China, October 2020
T. Kinnunen, H. Delgado, N. Evans, K.A. Lee, V. Vestman, A. Nautsch, M. Todisco, X. Wang, M. Sahidullah, J. Yamagishi, D.A. Reynolds, “Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification: Fundamentals”, IEEE/ACM Transactions on Audio, Speech, and Language Processing
I. Kukanov, T.N. Trong, V. Hautamäki, S.M. Siniscalchi, V.M. Salerno, K.A. Lee, “Maximal figure-of-merit framework to detect multi-label phonetic features for spoken language recognition”, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 28: 682-695, 2020
A. Kanervisto, C. Scheller, V. Hautamäki, “Action Space Shaping in Deep Reinforcement Learning“, IEEE Conference on Games 2020
A. Kanervisto, J. Pussinen, Ville Hautamäki, “Benchmarking End-to-End Behavioural Cloning on Video Games“, IEEE Conference on Games 2020
X. Wang, J. Yamagishi, M. Todisco, H. Delgado, A. Nautsch, N. Evans, M. Sahidullah, V. Vestman, T. Kinnunen, K.A. Lee, L. Juvela, P. Alku, Y.-H. Peng, H.-T. Hwang, Y. Tsao, H.-M. Wang, S. L. Maguer, M. Becker, F. Henderson, R. Clark, Y. Zhang, Q. Wang, Y. Jia, K. Onuma, K. Mushika, T. Kaneda, Y. Jiang, L.-J. Liu, Y.-C. Wu, W.-C. Huang, T. Toda, K. Tanaka, H. Kameoka, I. Steiner, D. Matrouf, J. -F. Bonastre, A. Govender, S. Ronanki, J.-X. Zhang, Z.-H. Ling, “ASVspoof 2019: a large-scale public database of synthetic, converted and replayed speech”, Computer Speech & Language, 64, November 2020
A. Kanervisto, J. Karttunen, V. Hautamäki, “Playing Minecraft with Behavioural Cloning“, PMLR post proceedings – Competition Track@NeurIPS2019, 2020
B. Chettri, T. Kinnunen, E. Benetos, “Deep Generative Variational Autoencoding for Replay Spoof Detection in Automatic Speaker Verification”, Computer Speech & Language, 63: 1–18, September 2020
V. Vestman, K.A. Lee, T. Kinnunen, “Neural i-Vectors”, Proc. Odyssey 2020, pp. 67–74, Tokyo, Japan, Nov. 2020
B. Chettri, T. Kinnunen, E. Benetos, “Subband modeling for Spoofing Detection in Automatic Speaker Verification”, Proc. Odyssey 2020, pp. 341–348, Tokyo, Japan, Nov. 2020
A. Kanervisto, V. Hautamäki, T. Kinnunen, J. Yamagishi, “An Initial Investigation on Optimizing Tandem Speaker Verification and Countermeasure Systems Using Reinforcement Learning”, Proc. Odyssey 2020, pp. 151–158, Tokyo, Japan, Nov. 2020
J. Karttunen, A. Kanervisto, V. Kyrki, Ville Hautamäki, “From Video Game to Real Robot: The Transfer Between Action Spaces“, IEEE ICASSP, Virtual conference, May 2020
A. Sholokhov, T. Kinnunen, V. Vestman, K.A. Lee, “Voice Biometrics Security: Extrapolating False Alarm Rate via Hierarchical Bayesian Modeling of Speaker Verification Scores”, Computer Speech & Language, 60: 1–19, March 2020

2019

R. González Hautamäki and T. Kinnunen, “Towards Controlling False Alarm — Miss Trade-Off in Perceptual Speaker Comparison via Non-Neutral Listening Task Framing”, Proc. IEEE ASRU, December 2019, Singapore
A. Kato and T. Kinnunen, “Statistical Regression Models for Noise Robust F0 Estimation Using Recurrent Deep Neural Networks”, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 27(2):2336–2349, December 2019.
R. González Hautamäki, V. Hautamäki, T. Kinnunen, “On Limits of Automatic Speaker Verification: Explaining Degraded Recognizer Score Through Acoustic Changes Resulting from Voice Disguise”, Journal of the Acoustical Society of America, 146(1): 693–704, July 2019
V. Vestman, T. Kinnunen, R. Gonzalez Hautamäki, M. Sahidullah, “Voice Mimicry Attacks Assisted by Automatic Speaker Verification”, Computer Speech & Language, 59: 36–54, January 2020.
V. Vestman, K. A. Lee, T. Kinnunen, T. Koshinaka, “Unleashing the Unused Potential of I-Vectors Enabled by GPU Acceleration”, Proc. Interspeech 2019, pp. 351–355, Graz, Austria, September 2019
M. Todisco, X. Wang, V. Vestman, M. Sahidullah, H. Delgado, A. Nautsch, J. Yamagishi, N. Evans, T. Kinnunen, K. A. Lee, “ASVspoof 2019: Future Horizons in Spoofed and Fake Audio Detection”, Proc. Interspeech 2019, pp. 1008–1012, Graz, Austria, September 2019
Bilal Soomro, Anssi Kanervisto, Trung Ngo Trong and Ville Hautamäki, “Towards Debugging Deep Neural Networks by Generating Speech Utterances”, Proc. Interspeech 2019, pp. 3213-3217, Graz, Austria, September 2019. [GitHub]
K. A. Lee, V. Hautamäki, T. Kinnunen, H. Yamamoto, K. Okabe, V. Vestman, J. Huang, G. Ding, H. Sun, A. Larcher, R. K. Das, H. Li, M. Rouvier, P. Bousquet, W. Rao, Q. Wang, C. Zhang, F. Bahmaninezhad, H. Delgado, M. Todisco, Q. Wang, L. Guo, T. Koshinaka, J. Zhang, K. Shinoda, T. N. Trong, M. Sahidullah, F. Lu, Y. Tang, M. Tu, K. K. Teh, H. D. Tran, K. K. George, I. Kukanov, F. Desnous, J. Yang, E. Yılmaz, L. Xu, J. Bonastre, C. Xu, Z. H. Lim, E. S. Chng, S. Ranjan, J. H. L. Hansen, J. Patino, N. Evans, “I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences”, Proc. Interspeech 2019, pp. 1497–1501, Graz, Austria, September 2019
X. Wu, E. Granger, T. Kinnunen, X. Feng, A. Hadid, “Audio-Visual Kinship Verification in the Wild”, 12th IAPR International Conference On Biometrics (ICB 2019), Crete, Greece, June 2019. [PDF]
T. Kinnunen, R. González Hautamäki, V. Vestman, M. Sahidullah, “Can We Use Speaker Recognition Technology to Attack Itself? Enhancing Mimicry Attacks Using Automatic Target Speaker Selection”, Proc. IEEE ICASSP, pp. 6146–6150, Brighton, UK, May 2019 [PDF]
V. Vestman, B. Soomro, A. Kanervisto, V. Hautamäki, T. Kinnunen, “Who Do I Sound Like? Showcasing Speaker Recognition Technology by YouTube Voice Search”, Proc. IEEE ICASSP, pp. 5781–5785, Brighton, UK, May 2019 [PDF]
E. Jokinen, R. Saeidi, T. Kinnunen, P. Alku, “Vocal Effort Compensation for MFCC Feature Extraction in a Shouted Versus Normal Speaker Recognition Task”, Computer Speech & Language, 53: 1-11, January 2019 IF=1.776 JF=

2018

M. Sahidullah, H. Delgado, M. Todisco, T. Kinnunen, N. Evans, J. Yamagishi, K.A. Lee, “Introduction to Voice Presentation Attack Detection and Recent Advances”, book chapter in Handbook of Biometric Anti-Spoofing: Presentation Attack Detection, S. Marcel, M.S. Nixon, J. Fierrez, N. Evans (Eds.), Springer, 2018 [PDF]
F. Fang, J. Yamagishi, I. Echizen, M. Sahidullah, T. Kinnunen, “Transforming Acoustic Characteristics to Deceive Playback Spoofing Countermeasures of Speaker Verification Systems”, Proc. IEEE Int. Workshop on Information Forensics and Security (WIFS 2018), Hong Kong, China, 2018 [PDF]
M. Todisco, H. Delgado, K.A. Lee, M. Sahidullah, N. Evans, T. Kinnunen, J. Yamagishi, “Integrated Presentation Attack Detection and Automatic Speaker Verification: Common Features and Gaussian Back-end Fusion”, Proc. Interspeech 2018, pp. 77-81, Hyderabad, India, September 2018 [PDF] JF=1
A. Kato, T. Kinnunen, “Waveform to Single Sinusoid Regression to Estimate the F0 Contour from Noisy Speech Using Recurrent Deep Neural Networks”, Proc. Interspeech 2018, pp. 327-331, Hyderabad, India, September 2018 [PDF] JF=1
S. Sieranoja, M. Sahidullah, T. Kinnunen, J. Komulainen, A. Hadid, “Audiovisual Synchrony Detection with Optimized Audio Features”, accepted to IEEE 3rd Int. Conference on Signal and Image Processing (ICSIP 2018), Shenzhen, China, July 2018 [PDF] JF=0
T. N. Trong, V. Hautamäki, and K. Jokinen, “Staircase Network: structural language identification via hierarchical attentive units“, Proc. Odyssey 2018, pp. 60-67, Les Sables d’Olonne, France, June 2018 JF=1
T. Kinnunen, K.A. Lee, H. Delgado, N. Evans, M. Todisco, M. Sahidullah, J. Yamagishi, D.A. Reynolds, “t-DCF: a Detection Cost Function for the Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification”, Proc. Odyssey 2018, pp. 312-319, Les Sables d’Olonne, France, June 2018 [PDF] JF=1
T. Kinnunen, J. Lorenzo-Trueba, J. Yamagishi, T. Toda, D. Saito, F. Villavicencio, Z. Ling, “A Spoofing Benchmark for the 2018 Voice Conversion Challenge: Leveraging from Spoofing Countermeasures for Speech Artifact Assessment”, Proc. Odyssey 2018, pp. 187-194, Les Sables d’Olonne, France, June 2018 [PDF (original)], [PDF (corrected version from arXiv with a bug fix)] JF=1
J. Lorenzo-Trueba, J. Yamagishi, T. Toda, D. Saito, F. Villavicencio, T. Kinnunen, Z. Ling, “The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods”, Proc. Odyssey 2018, pp. 195-202, Les Sables d’Olonne, France, June 2018 [PDF] [The data and challenge results are available here] JF=1
V. Vestman and T. Kinnunen, “Supervector Compression Strategies to Speed up I-Vector System Development”, Proc. Odyssey 2018, pp. 357-364, Les Sables d’Olonne, France, June 2018 [PDF] JF=1
R. Gonzalez Hautamäki, A. Kanervisto, V. Hautamäki, T. Kinnunen, “Perceptual Evaluation of the Effectiveness of Voice Disguise by Age Modification”, Proc. Odyssey 2018, pp. 320-326, Les Sables d’Olonne, France, June 2018 [PDF] JF=1
A. Kato and T. Kinnunen, “A Regression Model of Recurrent Deep Neural Networks for Noise Robust Estimation of the Fundamental Frequency Contour of Speech”, Proc. Odyssey 2018, pp. 275-282, Les Sables d’Olonne, France, June 2018 [PDF] JF=1
J. Lorenzo-Trueba, F. Fang, X. Wang, I. Echizen, J. Yamagishi, T. Kinnunen, “Can we steal your vocal identity from the Internet? Initial investigation of cloning Obama’s voice using GAN, WaveNet and low-quality found data”, Proc. Odyssey 2018, pp. 240-247, Les Sables d’Olonne, France, June 2018 [PDF] JF=1
H. Delgado, M. Todisco, M. Sahidullah, N. Evans, T. Kinnunen, K.A. Lee, J. Yamagishi, “ASVspoof 2017 Version 2.0: meta-data analysis and baseline enhancements”, Proc. Odyssey 2018, pp. 296-303, Les Sables d’Olonne, France, June 2018 [PDF] JF=1
T. Leppänen, H. Vrzakova, R. Bednarik, A. Kanervisto, A. Elomaa, A. Huotarinen, P. Bartczak, M. Fraunberg, J. Jääskeläinen, “Augmenting Microsurgical Training: Microsurgical Instrument Detection Using Convolutional Neural Networks”, Proc. CBMS 2018, pp. 211-216, Karlstad, June 2018 JF=1
T. N. Trong, K. Jokinen and V. Hautamäki, “Enabling Spoken Dialogue Systems for low-resourced languages – end-to-end dialect recognition for North Sami”, IWSDS 2018, Singapore, May 2018 [Best paper award]
I. Kukanov, V. Hautamäki and Kong Aik Lee, “Maximal Figure-of-Merit Embedding for Multi-label Audio Classification“, Proc. ICASSP 2018, pp. 136-140, Calgary, Canada, April 2018 JF=1
V. Vestman, D. Gowda, M. Sahidullah, P. Alku, and T. Kinnunen, “Speaker Recognition from Whispered Speech: a Tutorial Survey and an Application of Time-Varying Linear Prediction”, Speech Communication, 99: 62-79, May 2018 IF=1.585 JF=2
M. Sahidullah, D. Thomsen, R. Gonzalez Hautamäki, T. Kinnunen, Z.-H. Tan, R. Parts, and Martti Pitkänen, “Robust Voice Liveness Detection and Speaker Verification Using Throat Microphones”, IEEE/ACM Trans. on Audio, Speech, and Language Processing, 26(1): 44-56, January 2018 [PDF] IF=2.95 JF=2
A. Sholokhov, M. Sahidullah, T. Kinnunen, “Semi-Supervised Speech Activity Detection with an Application to Automatic Speaker Verification”, Computer Speech & Language, 47:132-156, January 2018 IF=1.90 JF=2

The group separated from the machine learning group in 2018. Articles published before 2018 can be found here.