AI Grading: Faster Feedback for Teachers
AI Grading Systems Offer Speed, But Accuracy Is a Concern
Updated May 27, 2025
A recent study indicates that while artificial intelligence can accelerate the grading process for educators, it might do so at the expense of precision. The research highlights the challenges of assessing complex student work using AI, especially in subjects emphasizing argumentation, investigation, and data analysis.
Xiaoming Zhai, associate professor and director of the AI4STEM Education Center at the University of Georgia, noted the time constraints teachers face. He said that grading complex tasks takes time, which means students may not get timely feedback. The study compared Large Language Models (LLMs) to human graders.
The study presented the LLM Mixtral with middle school students’ written responses, including a question asking them to model particle behavior when heat energy is transferred. The LLM then created rubrics to evaluate the students’ work and assign scores.
Researchers discovered that while LLMs grade quickly, they often rely on shortcuts, such as identifying keywords, which reduces accuracy. Supplying LLMs with detailed rubrics that mirror human analytical thought could improve their performance, the study suggests. These rubrics should specify what the grader should look for in a student’s response.
“The train has left the station, but it has just left the station,” Zhai said, emphasizing the need for further development in AI grading.
Traditionally, LLMs are trained using both student answers and human scores. This study, however, took a different approach and instructed LLMs to develop their own rubrics. While these AI-generated rubrics showed some similarities to human-created ones, the LLMs often lacked the reasoning capabilities of humans, relying instead on shortcuts such as "over-inferring."
Zhai explained that LLMs might incorrectly assume a student’s understanding based solely on the presence of certain keywords, without evaluating the student’s underlying logic. For example, mentioning a temperature increase might lead the LLM to assume the student understands particle movement, even if their writing doesn’t demonstrate that understanding.
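The failure mode Zhai describes can be sketched in a few lines of code. The following toy Python example is not the study's code; the function names, keywords, and rubric criterion are invented for illustration. It contrasts a naive scorer that awards credit for a keyword alone with a stand-in for a rubric criterion that also demands the student link heat to particle motion:

```python
# Hypothetical illustration of the "keyword shortcut" failure mode.
# Names and criteria are invented; real rubric scoring requires human
# judgment or an LLM guided by a human-written rubric.

def keyword_score(response: str) -> int:
    """Naive scorer: awards a point if the response merely mentions temperature."""
    return 1 if "temperature" in response.lower() else 0

def rubric_score(response: str) -> int:
    """Toy rubric criterion: awards a point only if the response also
    connects heat to particle motion, not just the temperature change."""
    text = response.lower()
    mentions_heat = "temperature" in text or "heat" in text
    explains_particles = "particle" in text and (
        "move" in text or "faster" in text or "speed" in text
    )
    return 1 if mentions_heat and explains_particles else 0

shallow = "The temperature goes up when you add heat."
reasoned = "Adding heat makes the particles move faster, so the temperature goes up."

print(keyword_score(shallow), rubric_score(shallow))    # keyword scorer over-infers: 1 0
print(keyword_score(reasoned), rubric_score(reasoned))  # both award credit: 1 1
```

The shallow answer mentions the right keyword but never demonstrates the underlying reasoning, so a keyword-driven grader gives it full credit while the rubric criterion does not, which is exactly the over-inference the researchers observed.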
The researchers caution against completely replacing human graders, despite the speed advantages of LLMs. Human-made rubrics, which reflect instructor expectations, substantially improve AI accuracy: without them, LLMs have an accuracy rate of only 33.5%, which increases to just over 50% with human rubrics.
Improved accuracy could make educators more receptive to using AI to streamline grading, freeing up time for other tasks.
"Many teachers told me, 'I had to spend my weekend giving feedback, but by using automatic scoring, I do not have to do that. Now, I have more time to focus on more meaningful work rather than some labor-intensive work,'" Zhai said.
What’s next
Future research will likely focus on refining AI algorithms and integrating detailed, human-like rubrics to enhance the accuracy of AI grading systems, possibly leading to wider adoption in educational settings.
