The ASRU 2025 Hackathon, held alongside the IEEE Automatic Speech Recognition and Understanding (ASRU) Workshop 2025, brought together students, researchers, and professionals to collaborate, innovate, and solve real-world challenges in speech and language technologies. The intensive two-day event focused on advancing the frontiers of speech and language research.

Focus Areas

The hackathon concentrated on cutting-edge topics at the intersection of speech processing, large language models (LLMs), and multimodal AI. Over 48 hours, teams designed and prototyped innovative solutions to real-world challenges.

Event Details

Dates: 5–6 December 2025
Location: The Executive Ballroom in the Campus Center at the University of Hawaii
Format: In-person, team-based (at least one member of each team must attend in person)
Recognition: The top selected teams will present their work during the main ASRU 2025 program and receive special awards.

 

  • This project evaluates the reasoning capabilities of Large Language Models (LLMs) on long-form meeting understanding. Using the NOTSOFAR-1 dataset, we compare reasoning over reference transcripts, over end-to-end multi-talker ASR (E2E MT-ASR) transcripts, and from state-of-the-art audio-understanding models. The evaluation covers multiple reasoning tasks, including meeting summarization and information retrieval/question answering. It assesses the feasibility of true end-to-end meeting understanding and asks whether existing metrics adequately capture faithful, speaker-aware outputs that reflect who said what (a minimal sketch of this comparison appears after the project list below).

    • Sathvik Udupa
    • Alexander Polok
    • Samuele Cornell
  • Our project is a real-time speech and LLM-powered co-pilot that addresses the challenge of missed non-verbal cues in remote patient check-ins. It performs parallel analyses of the transcript (what is said) and the prosody (how it is said). When the system detects a word-tone mismatch, it provides the Community Health Worker (CHW) with a discreet, real-time alert and an LLM-generated, context-aware suggestion (a minimal sketch of this check appears after the project list below). This tool moves beyond simple transcription to provide an "empathy-aware" layer, empowering CHWs to catch hidden distress and improve patient outcomes.

    • Quang Loc Lam
    • Akib Sadmanee
    • Fahim Yasir
  • This project proposes an Accent-Aware Emotion Understanding Assistant that leverages Large Audio-Language Models (LALMs) to perform direct speech-to-understanding without relying solely on intermediate transcription. The system captures both linguistic content and paralinguistic cues (tone, pauses, intensity, and rhythm) and produces empathetic reflections. By interpreting affective and contextual meaning directly from the acoustic signal, this approach promotes fairness and inclusivity for accented and multilingual speakers. The prototype will demonstrate how LALMs can power culturally sensitive, emotionally intelligent spoken interfaces (a minimal sketch of this prompting approach appears after the project list below).

    • Chibuzor Okocha
    • Detravious Brinkley