Author: Condor, Aubrey

Title: Exploring Automatic Short Answer Grading as a Tool to Assist in Human Rating

Cord-id: va82gnps

Document date: 2020_6_10

Document: This project proposes using BERT (Bidirectional Encoder Representations from Transformers) as a tool to assist educators with automated short answer grading (ASAG) as opposed to replacing human judgement in high-stakes scenarios. Many educators are hesitant to give authority to an automated system, especially in assessment tasks such as grading constructed response items. However, evaluating free-response text can be time and labor costly for one rater, let alone multiple raters. In addition, some degree of inconsistency exists within and between raters for assessing a given task. Recent advances in Natural Language Processing have resulted in subsequent improvements for technologies that rely on artificial intelligence and human language. New, state-of-the-art models such as BERT, an open source, pre-trained language model, have decreased the amount of training data needed for specific tasks and in turn, have reduced the amount of human annotation necessary for producing a high-quality classification model. After training BERT on expert ratings of constructed responses, we use subsequent automated grading to calculate Cohen's Kappa as a measure of inter-rater reliability between the automated system and the human rater. For practical application, when the inter-rater reliability metric is unsatisfactory, we suggest that the human rater(s) use the automated model to call attention to ratings where a second opinion might be needed to confirm the rater's correctness and consistency of judgement.
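The agreement check described in the abstract can be illustrated with a minimal sketch. It computes Cohen's Kappa between a human rater's scores and a model's scores, then lists the responses where the two disagree so that they can be reviewed again. The function names (`cohens_kappa`, `flag_for_second_opinion`) and the sample ratings are hypothetical and do not come from the paper; the model predictions here are hard-coded stand-ins rather than output from a trained BERT classifier.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's Kappa for two equal-length sequences of categorical ratings."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)

    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: sum over labels of the product of marginal proportions.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; kappa is undefined,
        # so this sketch treats it as perfect agreement by convention.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def flag_for_second_opinion(human, model):
    """Indices of responses where the human and model ratings differ."""
    return [i for i, (h, m) in enumerate(zip(human, model)) if h != m]


# Hypothetical ratings (1 = correct, 0 = incorrect); not data from the paper.
human_ratings = [1, 0, 1, 1, 0, 0, 1, 0]
model_ratings = [1, 0, 1, 0, 0, 0, 1, 1]

kappa = cohens_kappa(human_ratings, model_ratings)
flagged = flag_for_second_opinion(human_ratings, model_ratings)
print(f"kappa = {kappa:.2f}, review responses at indices {flagged}")
```

What counts as an "unsatisfactory" Kappa is not specified in the abstract, so any cutoff would be a choice made by the person applying the method.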
