Interspeech 2025

Interspeech 2025 Satellite Events

This page is constantly updated with new events as soon as they get approved, so make sure to check it often to learn more about the latest approved Satellite Events at Interspeech 2025!

The 6th Clarity Workshop on Improving Speech-in-Noise for Hearing Devices (Clarity-2025)

The Clarity Workshop focuses on improving speech understanding in noisy environments through advances in speech processing, auditory perception, and assistive technology. It brings together researchers, engineers, and industry professionals working on innovations in hearing aids, hearables, and AI-driven speech enhancement. Speech intelligibility in noisy settings is a major challenge—not only for individuals with hearing impairment but also for users of hearables, communication headsets, and consumer audio devices. Background noise increases listening effort and reduces communication clarity, affecting over 360 million people with disabling hearing loss as well as professionals in noisy workplaces and users of augmented listening devices. Recent advances in machine learning, digital signal processing (DSP), and edge AI have enabled real-time speech enhancement on hearing aids, hearables, and earbuds. Low-power AI models now run on-device, while cloud-assisted processing enhances performance without draining battery life. Speech separation, statistical speech modeling, and acoustic scene analysis are also being integrated into consumer listening devices, improving speech clarity in noise. The workshop provides a platform for discussing the next generation of speech-in-noise solutions, exploring how emerging technologies can enhance speech intelligibility across diverse applications.

Website: https://claritychallenge.org/clarity2025-workshop

4th COG-MHEAR International Audio-Visual Speech Enhancement Challenge (AVSEC-4)

The 4th COG-MHEAR International Audio-Visual Speech Enhancement Challenge (AVSEC-4) brings together researchers from interdisciplinary fields working on audio-visual speech technologies and hearing research. With opportunities for oral and poster presentations, we expect to provide a space to reflect on the scope and limitations of current audio-visual speech technologies, their potential applications, and the challenges of real-world deployment and standardised evaluation. Featuring keynote talks and a panel session with invited academic and industry experts, we aim to provide valuable insights into future research directions in the field. Results of AVSEC-4 will be announced during the workshop.

Website: https://challenge.cogmhear.org/#/getting-started/call-for-papers

DiSS 2025 - Disfluency in Spontaneous Speech

DiSS workshop – Disfluency in Spontaneous Speech, will have its 12th edition this year and will be held on September 4th-5th, in Lisbon, Portugal, at ISCTE IUL, Lisbon. This year’s theme is “Disfluencies in the Age of AI: A Multidisciplinary View“. To support the multidisciplinary perspective adopted as a motto for this workshop, the organizing and technical programme committee have distinct backgrounds and also represent different regions of the globe. The submissions are also encouraged from all fields that deal with disfluency, paralinguistics, and other related phenomena. This includes, among others: psychology, neuropsychology and neurocognition, psycholinguistics, linguistics, speech production and perception, conversational AI, gesture analysis, computational linguistics, speech technology, dialogue systems, human-centered AI, brain-computer Interfaces, healthcare, and generative AI. Continuing the motto of the multidisciplinarity of our workshop, the keynote speakers also reflect different areas of expertises, with extensive experience in the study of Disfluencies and paralinguistic events. Petra Wagner and Simon Beltz, both from Bielefeld University, covering a phonetics approach of disfluencies, and Gopala Anumanchipalli from the University of California, Berkeley, tackling automatic processing of disfluencies and its applications. The workshop continues the long tradition of studies in Disfluencies and is an ISCA endorsed event. The proceedings will be made available through the ISCA archive.

Website: https://diss2025.inesc-id.pt

11th Doctoral Consortium @ Interspeech 2025

ISCA’s Student Advisory Committee invites doctoral students in speech-related fields to join the 11th Doctoral Consortium at Interspeech 2025 on August 16 at TU Delft. Participants will receive expert mentoring on their dissertation projects during a dedicated 30-minute slot. DC is ideal whether you’re just starting or deep into your research. We especially welcome submissions from students preparing for qualification exams. Submit a concise two-page abstract (with an optional additional page for references) by May 27, 2025, 23:59 AOE. For further details, contact Fabian Ritter-Gutierrez at sac@isca-speech.org or visit www.isca-students.org.

Website: https://sites.google.com/view/dc-2025/about

SLaTE-2025 - Workshop on Speech and Language Technology in Education

SLaTE 2025, the 10th Workshop on Speech and Language Technology in Education (SLaTE) will be held on August 22-24, 2025, at Radboud University in Nijmegen, Netherlands. The SLaTE 2025 workshop is a satellite workshop of Interspeech 2025, and is organized in conjunction with WOCCI 2025, the Workshop on Child Computer Interaction. SLaTE 2025 focuses on advancing the use of speech and natural language processing technologies in educational settings. Key areas of focus include speech technology for education, natural language processing for education, spoken dialogue systems, applications of speech and/or NLP for education, intelligent tutoring systems using NLP or speech, development of language resources for educational applications, and assessment of SLaTE methods and applications. At SLaTE-2025, there will also be a special session about the "Speak & Improve Challenge 2025" on spoken language assessment and feedback. The "Speak & Improve" corpus, that will be made available, offers unique opportunities to advance spoken language processing technologies. Two types of papers can be submitted to the SLaTE 2025 workshop: [1] Full 5 page papers; Experimental papers, Discursive or position papers, etc. [2] Short 2 pages papers; Demo papers, Early-career/Student papers, etc. Papers submitted to the workshop are peer reviewed by an international panel of experts; and accepted papers are published in the ISCA archive with individual DOIs.

Website: https://sites.google.com/view/slate-2025/

Speech Synthesis Workshop 2025

The Speech Synthesis Workshop (SSW) is the main meeting place for research and innovation in speech synthesis—the process of generating speech signals from text input. Text-to-Speech (TTS) technology is a key component of numerous applications, including speech-to-speech translation, digital assistants, conversational agents, and social robots. While early research focused on basic intelligibility, contemporary systems now achieve remarkable naturalness. Current research frontiers include emotional expression, speaking style control, and efficient deployment for the world’s languages. SSW welcomes contributions not only in core TTS technology but also from researchers in related fields, including phoneticians, phonologists, linguists, and neuroscientists, as well as experts in multimodal human-machine interaction. Since 1990, SSWs have been held every three years under the auspices of ISCA’s special interest group SynSIG. In 2019, it was decided to hold an SSW every two years, as the technology is advancing more rapidly. The 13th ISCA Speech Synthesis Workshop will be held in Leeuwarden, Netherlands, from Sunday, August 24, to Tuesday, August 26, 2025. This year's theme is "Scaling Down: Sustainable Synthesis for Language Diversity," and the Blizzard Challenge edition associated with SSW aligns with this theme by challenging participants to synthesize speech in Bildts, a Dutch Low Saxon dialect historically spoken in Friesland.

Website: https://www.rug.nl/cf/studeren-bij-cf/ssw13-speech-synthesis-workshop-2025/?lang=en

5th Symposium on Security and Privacy in Speech Communication (SPSC Symposium)

Speech and voice are media through which we express ourselves. Speech communication is not only critical to convey emotions and identify oneself, today speech and voice is employed to interact with technology, like virtual assistants and smart devices. How can we strengthen security and privacy for speech representation types in user-centric human/machine interaction? Interdisciplinary exchange is in high demand. The need to better understand and develop user-centric security solutions and privacy safeguard in speech communication is of growing importance for commercial, forensic, and government applications. The SPSC Symposium is a platform to seek better designed services and products, as well as better informed policy papers for legislators and governance. The symposium is organized by the ISCA SPSC special interest group. The fifth edition of the SPSC Symposium aims at bringing together researchers and practitioners across multiple disciplines – more specifically: signal processing, cryptography, security, human-computer interaction, law ethics, and anthropology. We’re planning to have a keynote speaker on ethics/law and a tutorial.

Website: https://spsc-symposium.de/

The 28th International Conference on Text, Speech, and Dialogue (TSD 2025)

The 28th International Conference on Text, Speech and Dialogue (TSD 2025) will take place on August 25–28, 2025, in Erlangen, Germany. Organized by the University of West Bohemia, Masaryk University, and Friedrich-Alexander-Universität Erlangen–Nürnberg, the conference serves as a major forum for the presentation of cutting-edge research in the processing of both written and spoken language.

TSD 2025 invites contributions on a broad range of topics, including but not limited to automatic speech recognition and synthesis, multilingual and expressive language generation, dialogue systems, sentiment and credibility analysis, large language models, corpora and lexicons, multimodal interaction, and applications such as machine translation, question answering, and assistive technologies. Emphasis is placed on multilingual and cross-lingual approaches, self-learning systems, and real-world deployments.

Accepted papers will be published in Springer’s Lecture Notes in Artificial Intelligence (LNAI) series and indexed in major scientific databases (SCOPUS, DBLP, CPCI, etc.). The program will feature oral and poster sessions, invited talks, demos, and a rich social agenda. The submission deadline is May 30, 2025.

Website: https://www.kiv.zcu.cz/tsd2025/

WOCCI 2025 – Workshop on Child Computer Interaction

Workshop on Child Computer Interaction (WOCCI) 2025 aims to unite together researchers and practitioners from academia and industry focused on child-machine interaction, including computer, robotics and multi-modal interfaces. Children are unique at both the acoustic/linguistic level and the interaction level. WOCCI provides a unique opportunity for bringing together different research communities in diverse fields—cognitive science, computer vision, robotics, speech processing, linguistics, and applications in medicine and education— to connect and collaborate. Participants can showcase innovative systems that will shape the future of child-centred computer interaction. As we face increasing challenges in education and health, technology plays a crucial role in supporting the well-being of our society. Noticeable examples are remedial treatments for children with or without disabilities and individualized attention. WOCCI will be a platform for sharing the latest advancements in core technologies, experimental systems, and prototypes. The 7th WOCCI 2025 will take place in Nijmegen on August 22, 2025, in conjunction with the 10th Workshop on Speech and Language Technology in Education (SLaTE 2025).

Website: https://sites.google.com/view/wocci-isca-is25

Workshop on Multilingual Conversational Speech Language Model

Large Language Models (LLMs) have demonstrated remarkable capabilities in a wide range of downstream tasks, serving as powerful foundation models for language understanding and generation. Furthermore, there has been significant attention on utilizing LLMs in speech and audio processing tasks such as Automatic Speech Recognition (ASR), Audio Captioning, and emerging areas like Spoken Dialogue Models. However, real-world conversational speech data is critical for the development of robust LLM-based Spoken Dialogue Models, as it encapsulates the complexity of human communication, including natural pauses, interruptions, speaker overlaps, and diverse conversational styles. The limited availability of such data, especially in multilingual settings, poses a significant challenge to advancing the field.

The importance of real-world conversational speech extends beyond technological advancement—it is essential for building AI systems that can understand and respond naturally in multilingual, dynamic, and context-rich environments. This is especially crucial for next-generation human-AI interaction systems, where spoken dialogue serves as a primary mode of communication. Thus, this workshop aims to bridge the gap by hosting the challenge of building multilingual conversational speech language models together with the release of a real-world multilingual conversational speech dataset.

The challenge consists of two tasks, both of which require participants to explore the development of speech language model:

Task 1: Multilingual Conversational Speech Recognition
- Objective: Develop a multilingual LLM based ASR model
- This task focuses on optimizing transcription accuracy in a multilingual setting.
Task 2: Multilingual Conversational Speech Diarization and Recognition
- Objective: Develop a system for both speaker diarization (identifying who is speaking when), and recognition (transcribing speech to text).
- Both pipeline-based and end-to-end systems are encouraged, providing flexibility in system design and implementation.

Website: https://www.nexdata.ai/competition/mlc-slm

Young Female* Researchers in Speech Workshop (YFRSW)

YFRSW is a workshop for female* Bachelor’s and Master’s students currently working in speech science and technology. The workshop aims to promote interest in research in our field among women* who have not yet committed to pursuing a PhD in speech science or technology, but who have already gained research experience at their universities through individual or group projects. For the application process and registration information, please visit our website.

*The workshop is open for marginalized genders, including women, as well as non-binary and gender non-conforming people who are comfortable in a space that is centered on women’s experiences in the speech science and technology community. We aim to offer an inclusive and accessible program. If you are unsure if this workshop is for you, please don’t hesitate to reach out to us via email (youngfemaleresearchersinspeech@gmail.com)!

Website: https://sites.google.com/view/yfrsw-2025/

Interspeech 2025

PCO: TU Delft Events

Delft University of Technology

Communication Department

Prometheusplein 1

2628 ZC Delft

The Netherlands

Email: pco@interspeech2025.org

X (formerly Twitter): @ISCAInterspeech

Bluesky: @interspeech.bsky.social

Interspeech 2025 is working under the privacy policy of TU Delft

Interspeech 2025 Satellite Events​​​

Interspeech 2025 Satellite Events