10:00-12:00 Pre-conference workshop, Inés Matres, Mietta Lennes, Masoud Fatemi: FIN-CLARIAH tools to make sense of web data
12:30-13:15 Opening of DRDHum 2024, Michael Pace-Sigge; Esa Penttilä, HoS; Laura Hirsto, 1st Vice-Dean
13:20-14:20 Plenary 1: ANNA FOKA, Intro & Chair Jenni Merovuo
14:30-15:35 Block 1
Machine Learning, Chair Michael Pace-Sigge
Tony Berber Sardinha: Assessing the Linguistic Characteristics of AI-Generated Texts Across Different Registers
Erik Henriksson, Amanda Myntti & Veronika Laippala: Using deep learning to examine cross-linguistic similarities of registers
Old Texts & Archives, Chair Michael Rießler
Ágnes Telek: Can the archives become as cool as a museum? – Data Circulation in the Budapest Time Machine
Janine Siewert: A Dialectometric Study of Low Saxon Syntactic Variation through Time
Speech
Jaakko Kauramäki, Satu Saalasti & Kerttu Huttunen: Predicting Language Outcomes Across Diverse Longitudinal Cohorts: A Machine Learning Approach back in Speech Therapy for People with Aphasia
Eugenia Rykova: AI-based Personalized Feedback in Speech Therapy for People with Aphasia (via zoom)
Text re-generation
María do Campo Bayón & Pilar Sánchez-Gijón: Avoiding Generatese: the optimization of NLG Systems through fit-for-purpose data collections
Koldo Garai: Quasi-Parallel Corpora for Less-Resourced Languages: Parallelized Translations of Plato’s Faidon in Basque and Finnish
15:35-15:55 Coffee Break
15:55-17:35 Block 2
Machine Learning cont., Chair Michael Pace-Sigge
Katarzyna Wiśniewska & Benedikt Perak: In Search of the Invisible: GPT in An Investigation of Hidden Semantic Information
Marja Laasonen, Rosa González Hautamäki, Federico Malato, Jade Plym, Sini Smolander, Eva Arkkila, Pekka Lahti-Nuuttila, Sari Kunnari, Penny Levickis, Cristina McKean & Patricia Eadie: Predicting Language Outcomes Across Diverse Longitudinal Cohorts: A Machine Learning Approach
Pyry Kantanen, Kati Kataja & Leo M Lahti: Sentiment analysis for detecting suicidal youths’ positive and negative encounters with public service providers
Dana Roemling: Digital Methodologies in Forensic Linguistic Authorship Analysis: Social Media Data and Computational Approaches in Geolinguistic Profiling
Music and AI
Leo M Lahti, Pyry Kantanen & Akewak Jeba: Towards Open Source Ecosystem for European Music Data
William Matthew Randall: Event-based Experience Sampling of Music Listening with the MuPsych app
Maria Claudia Nunes Delfino & Tony Berber Sardinha: Artificial Melodies: Investigating the Limits of AI in Replicating Human Songwriting
Visual Aspects, Chair Satu Saalasti
Michele Varini: The Wanders of the Invisible World. Astrology and magic-superstitious beliefs on social networks
Juhana Venäläinen: Hiking with Machine’s Eyes: A Computer Vision Exploration of Nature Photography in Instagram
Anca Serbanescu: Navigating the ethical and legal dimensions of Human-AI co-creativity in Interaction Design
Literature 1
Julia Matveeva, Osma Suominen & Leo M Lahti: Automating data curation for the Finnish national bibliography Fennica
Kati Launis, Aino Mäkikalli, Viola Parente-Čapková, Veli-Matti Pynttäri, Leo Lahti & Osma Suominen: Data-rich History for 19th Century Literature in Finland
Lauri Luoto, Leo M Lahti & Kati Launis: Defining the core characters and events of a fictional narrative by two mode social network analysis
17:35 Poster Session
Minna Mundoli, Reetta Viljanen, Jaakko Kauramäki, Katja Dindar, Satu Saalasti & Kerttu Huttunen: Collecting digital research data using smart devices from deaf and hard of hearing children training speechreading
Riikka Marttila, Sanna Joska, Mikko Lipsanen, Atte Föhr & Ilkka Jokipii: Handwritten Text Recognition (HTR) model for historical documents from 17th to 20th centuries – Using TrOCR
Mirkka Forssell, Marjo Tourula, Anna Liisa Suominen, Hanna Tenhunen & Elias Vaattovaara: The three universities’ cooperated management studies in the specialist training in medicine and dentistry
Adélie Laruncet: Grasping the ‘Freedom of Speech’ Argument on Social Media, Between Circulation and Escalation: Cross-Contributions of Digital Methods and Social Psychology
Sari Karjalainen: Encounters between the worlds of visual arts, easy language and AI
Gabriele Lieber, Sandra Reimann, Michael Rießler & Amelie Tahedl: Letter_2_Santa.py – Tapping Big Data from the Arctic Circle
Anna Kajander: Automated tool for sharing experiential knowledge: the case of Human Science section in the Digital Citizen Science Center of University of Jyväskylä
Kimmo Katajala, Jenni Merovuo, Antti Härkönen & Kasper Kepsu: Towns on the Eastern Border of the Swedish Great Power
Tommi Jauhiainen, Erik Henriksson, Heidi Jauhiainen & Marja Vierros: Data Sources for Automatic Classification and Analysis of Texts from Egyptian Antiquity
Jaakko Kauramäki, Satu Saalasti & Kerttu Huttunen: Practical solutions for digitally administering and scoring of a children’s speechreading test
Jenna Saarni & Otto Tarkka: Quantitative and qualitative approach to Finnish Twitter during the Covid-19 pandemic: Topics, attitudes, and emotions
Marilisa Shimazumi & Tony Berber Sardinha: Exploring the Potential of AI-Generated Texts to Replace Human-Written Content in Language Education
Olga Kellert: How good is AI at Natural Language Understanding and Inferencing?
Linguistics: Northern European languages, Chair Alexandre Nikolaev
Saara Hellström: Comparing French and Swedish web registers using multilingual word vectors
Sergei Kruk: Ambiguous grammatical forms and power relations. A statistical analysis of Latvian corpora
Political Discourses 1, Chair Michael Pace-Sigge
Ester Di Silvestro & Marco Venuti: A Comparative Corpus-based Discursive News Values Analysis of Liz Truss’ and Rishi Sunak’s representation in the British Press
Tibor Vocásek & Raquel Amaro: Regulation of AI? Comparing Czech and Portuguese Media Imaginaries with CADS
Social Media, Chair Temitayo Olatoye
Selcen Erten Johansson: Finland’s navigation towards NATO: How is it portrayed in Turkish digital media?
Federica Silvestri: Linguistic Practices and Identity Construction on Social Media: Italian Americans on Instagram
12:00-13:30 Lunch
13:30-15:10 Block 4
Learning and Teaching 2
Martin Laun, Katharina Hirt, Eva L. Wyss & Fabian Wolff: The Trust Divide: Chatbots’ Superior Performance and Skeptical Students
Martin Laun, Katharina Hirt, Eva L. Wyss & Fabian Wolff: Mitigating heterogeneity in the classroom? – Chatbots as support in nursing training
Jenny Tarvainen, Ida Toivanen & Ari Huhta: AI Literacy for study and working life – University students’ experiences from the pilot course
Linguistics: Northern European languages cont., Chair Jarmo H. Jantunen
Alexandre Nikolaev, Harald R. Baayen & Yu-Ying Chuang: Analyzing Finnish Inflectional Classes through Discriminative Lexicon Models
Lea Meriläinen & Heli Paulasto: A cancer of Finnish or a great happiness to us all? A corpus-assisted discourse study on the English language in the Suomi24 discussion forum
Kirsi Sandberg, Juho Karvinen, Aarne Ranta & Jyrki Nummenmaa: Word proximity and dependencies in parliamentary discourse in Finnish parliament
Political Discourses 2, Chair Jenni Merovuo
Tony Berber Sardinha, Maria Claudia Nunes Delfino, Ana Bocorny, Deise Prina Dutra, Simone Sarmento & Paula Tavares Pinto: Multi-Dimensional Collocational Analysis of Discourses around COVID-19 Therapies
Risto Turunen: Finding Patterns across Multiple Time Series Datasets: Democracy in the Twentieth-century Political Discourses in the United Kingdom, Sweden, and Finland
Social Media cont., Chair Juhana Venäläinen
Ilia Moshnikov & Eugenia Rykova: Tweets in Karelian: from data collection to the content analysis
Mingyao Song: Enhancing TikTok Content Success Prediction through Multimodal Fusion
Reeta Karjalainen: Manual data collection & qualitative analysis for social media data – “luddite” meme researcher insecurities in the age of AI
15:10-15:30 Coffee Break
15:30-16:30 Plenary 3: TONY McENERY, Intro & Chair Michael Pace-Sigge
16:35 Round Table, Chairs Michael Pace-Sigge & Temitayo Olatoye
20:00 Conference Dinner
10:00-11:45 Block 5
Learning and Teaching 3
Lingzhi Nie: Chinese vocabulary teaching in Spain: a proposal for the localisation of the International Standard for Chinese Language Levels
Päivi Kousa & Jenny Tarvainen: Automatic Language Proficiency Assessment of Written Texts: Training a CEFR classifier in L2-Finnish
Yanni Sun: Unwanted in the homeland? The image of Chinese international students on Chinese social media Zhihu
Literature 2, Chair Kati Launis
Jasmine Westerlund & Asko Nivala: The Atlas of Finnish Literature 1870–1940
Anna Biström: The forgotten 33%? Finland-Swedish literature from a database perspective
Gordana Galić Kakkonen: NLP-based Topical Analysis and Comparison of “Molokai” by Alan Brennert and “Night Calypso” by Lawrence Scott
Political Discourses 3
Kimmo Elo: How to identify ‘umbrella’ concepts not spoken out? Exploring German and Finnish plenary debates on ‘Democracy’ (1990-2020) with a TNA
Ella Lillqvist: Imaginaries of ownership and sustainability: A corpus-assisted study
Tendai Chari: Emerging Indigenous Language Usage Practices in Digital Newspaper Readers’ Comments in Zimbabwe
Identity
Laura Sofia Pensabene: #WOMENINSTEM: A Corpus-based Multimodal Critical Discourse Analysis of STEM Identity Construction and Advocacy Performance on Insta
Jarmo Harri Jantunen: Corpus-assisted critical discourse analysis on LGBTQ+ segregation and internal migration in Finland
Daria Kosinova: Gendered recruiting in social media: a case study in network marketing
11:45 Coffee Break
12:40-13:40 Plenary 4: MICHAELA MAHLBERG, Intro Mikko Laitinen, Chair Michael Pace-Sigge
13:45-14:30 Closing of Conference
Pre-Conference Workshop: FIN-CLARIAH tools to make sense of web data
Tuesday morning, 10:15 – 12:00, open to all attendees
If you would like to attend, please tick the relevant box on the registration form. Note: Please bring your own laptop for the practical exercises.
This 2-hour workshop presents the results, services, and ongoing work produced within FIN-CLARIAH (https://www.kielipankki.fi/organization/fin-clariah/). This research infrastructure is currently funded by the Research Council of Finland, and its activities aim to foster data-intensive and digital research in the Social Sciences and Humanities (SSH). The workshop introduces datasets and tools that are freely available to SSH researchers, including newspapers, periodicals, and other publications from the National Library of Finland, machine-readable records from the National Archives, Finnish parliamentary speeches, Twitch game streams, and social media data from the Nordic region. The tools and interfaces have been built to evaluate, subset, enrich, and analyze large-scale SSH datasets. Current efforts are directed towards supporting visual and multimodal research and developing transformer models for SSH research. In addition to introducing the available resources, the workshop will focus on a selection of tools and datasets for social media and web data.
Overall, the FIN-CLARIAH consortium comprises two components, FIN-CLARIN and DARIAH-FI. The Language Bank of Finland (Kielipankki) provides centralized services for sharing and reusing materials and tools in the research community. In turn, the DARIAH-FI consortium consists of SSH research teams with high demands and expertise in data-intensive research, committed to making what they develop (datasets, tools and methods) available to wider research communities.
This workshop offers an experimental, hands-on setting that complements the conference theme of digital applications at the advent of ML and AI. After an overview of FIN-CLARIAH and its core services, a practical section presents four resources for researchers as brief tutorials, with time for attendees to try the showcased resources on their own laptops and to put questions to the presenters. The workshop is open to all conference participants.