BENCHMARKING CLINICAL SPEECH RECOGNITION AND INFORMATION EXTRACTION
Speaker: Hanna Suominen (ANU)
Seminar Date: Tuesday April 4, 12:00pm
Brief abstract: Over a tenth of preventable adverse events in health care are caused by failures in information flow. These failures are tangible in clinical handover: even with good verbal handover, between two-thirds and all of the information is lost after 3-5 shifts if notes are taken by hand or not at all.
Objective: We study automated speech recognition (ASR) and information extraction (IE) as a way to fill out a handover form for clinical proofing and sign-off.
Methods: First, we introduced a Web app at http://nicta-stct.s3-website-ap-southeast-2.amazonaws.com to demonstrate the software system design and workflow, using a form with 50 mutually exclusive categories. Second, to lower the entry barrier and encourage novelty in this field, we provided an open dataset and open-source software at https://www.nicta.com.au/nicta-synthetic-nursing-handover-open-data-soft… for the spoken and written natural language processing (NLP) tasks needed to fill out the form. These ASR and IE methods were evaluated against those of the participants of the CLEF eHealth evaluation laboratories in 2015 and 2016, using cross-validation techniques to measure processing correctness and the statistical significance of the differences between the methods.
Results: The data provided were a simulation of nursing handover, recorded using a mobile device and built from 301 simulated patient records and handover scripts spoken by an Australian registered nurse. ASR recognised up to 10,244 (73%) of 14,095 test words correctly. IE trained, validated, and tested on 101, 100, and 100 records, respectively, achieved an F1 of 81% in the category for irrelevant text and up to 100% in the 35 nonempty categories of the form (38% on macro-average).
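As a rough illustration of how the two kinds of figures reported in the Results are computed, the sketch below derives the ASR word-correct percentage from the stated counts and shows a macro-averaged F1 over mutually exclusive categories. The function names and the toy labels are hypothetical, and the exact scoring rules used in the CLEF eHealth evaluation may differ from this simplified version.

```python
from collections import Counter


def word_correct_percentage(correct, total):
    """Percentage of test words recognised correctly."""
    return 100.0 * correct / total


def per_category_f1(gold, predicted):
    """F1 per category for single-label (mutually exclusive) predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted category gains a false positive
            fn[g] += 1  # gold category gains a false negative
    scores = {}
    for cat in set(gold) | set(predicted):
        denom = 2 * tp[cat] + fp[cat] + fn[cat]
        scores[cat] = 2 * tp[cat] / denom if denom else 0.0
    return scores


def macro_f1(scores):
    """Unweighted mean of per-category F1, so rare categories count equally."""
    return sum(scores.values()) / len(scores)


# Counts taken from the abstract: 10,244 of 14,095 test words correct.
asr_pct = word_correct_percentage(10244, 14095)
print(f"ASR word-correct: {asr_pct:.1f}%")

# Hypothetical toy labels (not from the dataset), one category per word.
gold = ["NA", "NA", "Name", "Age", "NA", "Name"]
pred = ["NA", "Name", "Name", "NA", "NA", "Name"]
scores = per_category_f1(gold, pred)
print(scores, f"macro-F1 = {macro_f1(scores):.3f}")
```

Because the macro-average weights every category equally, a few categories with low F1 can pull the average well below the score of the best or most frequent categories, which is consistent with the gap between the per-category and macro-averaged figures reported in the Results.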
Conclusions: The significance of this study hinges on opening our data, together with the related performance benchmarks and some processing software, to the research and development community for studying clinical documentation and NLP.
Short Bio: Hanna Suominen was awarded her MSc by Research (incl. BPhil) in applied mathematics, PhD in computer science, and Adj/Prof in computer science at the University of Turku (UTU), Finland, in 2005, 2009, and 2013, respectively. She joined The Australian National University (ANU) as a Senior Lecturer and Data61 as the Team Leader of Natural Language Processing within the Machine Learning Research Group, after working at Data61/NICTA as a Senior Researcher and Researcher and at the University of Turku as a Turku Centre for Computer Science (TUCS) Graduate Research Assistant, Coordinator, and Lecturer. Her more than 100 publications have appeared in A* journals and won best-paper, 10%-elite-PhD, and top-method awards. This work has also led to real-life products and won competitive grants, together with business-plan and teaching-excellence awards. Hanna's research interests are in developing and evaluating statistical machine learning methods for text analytics and health. Her aspiration is to bridge the gap between computing and the health and social sciences.