Evaluating Conversational AI Systems for Responsible Integration in Education: A Comprehensive Framework
  • Utkarsh Mittal,
  • Utkarch Mittal,
  • Namjae Cho
Utkarsh Mittal

Corresponding Author: [email protected]

Utkarch Mittal
School of Business, Hanyang University
Namjae Cho
School of Business, Hanyang University

Abstract

As conversational AI systems such as ChatGPT become more advanced, researchers are exploring ways to use them in education. However, effective methods are needed to evaluate these systems before they are allowed to help teach students. This study proposes a detailed framework for testing conversational AI against three criteria. First, specialized benchmarks measure skills that include giving clear explanations, adapting to context during long dialogues, and maintaining a consistent teaching personality. Second, adaptive standards check whether the systems meet ethical requirements for privacy, fairness, and transparency; these standards are regularly updated to match societal expectations. Third, evaluations are conducted from three perspectives: technical accuracy on test datasets, performance during simulations with groups of virtual students, and feedback from real students and teachers using the system. The framework provides a robust methodology for identifying the strengths and weaknesses of conversational AI before it is deployed in schools. It emphasizes assessments tailored to the critical qualities of dialogic intelligence, user-centric metrics that capture real-world impact, and ethical alignment through participatory design. Responsible innovation with AI assistants requires evidence that they can enhance accessible, engaging, and personalized education without disrupting teaching effectiveness or student agency.
20 Aug 2024: Submitted to Data Science and Machine Learning
22 Aug 2024: Published in Data Science and Machine Learning