Work in Progress

Conversational AI in health: Design considerations from a Wizard-of-Oz dermatology case study with users, clinicians and a medical LLM

Published: 11 May 2024

Abstract

Although skin concerns are common, access to specialist care is limited. Artificial intelligence (AI)-assisted tools to support medical decisions may provide patients with feedback on their concerns while also helping ensure the most urgent cases are routed to dermatologists. Although AI-based conversational agents have been explored recently, how they are perceived by patients and clinicians is not well understood. We conducted a Wizard-of-Oz study involving 18 participants with real skin concerns. Participants were randomly assigned to interact with either a clinician agent (portrayed by a dermatologist) or an LLM agent (supervised by a dermatologist) via synchronous multimodal chat. In both conditions, participants found the conversation helpful in understanding their medical situation and alleviating their concerns. Through qualitative coding of the conversation transcripts, we provide insight into the importance of empathy and effective information-seeking. We conclude with design considerations for future AI-based conversational agents in healthcare settings.

Supplemental Material

PDF File - Supplemental Material
  • A.1 Participant pre-interaction survey
  • A.2 Participant post-interaction survey
  • A.3 Participant follow-up survey
  • A.4 Clinician post-interaction survey
  • A.5 Qualitative codes

Cited By

  • (2024) ARAS: LLM-Supported Augmented Reality Assistance System for Pancreatic Surgery. Companion of the 2024 on ACM International Joint Conference on Pervasive and Ubiquitous Computing, 176-180. DOI: 10.1145/3675094.3677543. Online publication date: 5-Oct-2024
  • (2024) The Value-Sensitive Conversational Agent Co-Design Framework. International Journal of Human–Computer Interaction, 1-32. DOI: 10.1080/10447318.2024.2426737. Online publication date: 25-Nov-2024

Published In

CHI EA '24: Extended Abstracts of the CHI Conference on Human Factors in Computing Systems
May 2024
4761 pages
ISBN: 9798400703317
DOI: 10.1145/3613905
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Artificial Intelligence
  2. Chatbot
  3. Dermatology
  4. Large Language Models
  5. Medical
  6. Wizard-of-Oz

Qualifiers

  • Work in progress
  • Research
  • Refereed limited

Conference

CHI '24

Acceptance Rates

Overall Acceptance Rate 6,164 of 23,696 submissions, 26%

