
What is Automated Essay Scoring?

Automated essay scoring (AES) is an important application of machine learning and artificial intelligence to the field of psychometrics and assessment.  In fact, it’s been around far longer than “machine learning” and “artificial intelligence” have been buzzwords in the general public!  The field of psychometrics has been doing such groundbreaking work for decades.

So how does AES work, and how can you apply it?

What is automated essay scoring?

The first and most critical thing to know is that there is not an algorithm that "reads" the student essays. Instead, you need to train an algorithm. That is, if you are a teacher and don't want to grade your essays, you can't just throw them into an essay scoring system. You have to actually grade the essays (or at least a large sample of them) and then use that data to fit a machine learning algorithm. Data scientists use the term "train the model," which sounds complicated, but if you have ever done simple linear regression, you have experience with training models.

There are three steps for automated essay scoring:

  • Establish your data set. Begin by gathering a substantial collection of student essays, ensuring a diverse range of topics and writing styles. Each essay should be meticulously graded by human experts to create a reliable and accurate benchmark. This data set forms the foundation of your automated scoring system, providing the necessary examples for the machine learning model to learn from.
  • Determine the features. Identify the key features that will serve as predictor variables in your model. These features might include grammar, syntax, vocabulary usage, coherence, structure, and argument strength. Carefully selecting these attributes is crucial as they directly impact the model’s ability to assess essays accurately. The goal is to choose features that are indicative of overall writing quality and are relevant to the scoring criteria.
  • Train the machine learning model. Use the established data set and selected features to train your machine learning model. This involves feeding the graded essays into the model, allowing it to learn the relationship between the features and the assigned grades. Through iterative training and validation, the model adjusts its parameters to improve accuracy. Continuous refinement and testing ensure that the model can reliably score new, unseen essays with a high degree of precision.
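To make the three steps concrete, here is a minimal sketch in Python, assuming scikit-learn and pandas are installed. The file name and column names are hypothetical stand-ins for your own graded data, and simple TF-IDF word weights stand in for richer hand-crafted features.

```python
# Minimal sketch of the three-step workflow; file and column names are hypothetical.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Step 1: the data set -- essays that human raters have already scored
data = pd.read_csv("graded_essays.csv")            # columns: essay_text, human_score

# Step 2: the features -- TF-IDF word weights stand in here for richer
# features such as grammar, coherence, or structure measures
vectorizer = TfidfVectorizer(max_features=2000)
X = vectorizer.fit_transform(data["essay_text"])
y = data["human_score"]

# Step 3: train the model, holding out some essays to check accuracy
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = Ridge().fit(X_train, y_train)
print("Predicted scores for unseen essays:", model.predict(X_test)[:5])
```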

Here’s an extremely oversimplified example:

  • You have a set of 100 student essays, which you have scored on a scale of 0 to 5 points.
  • The essay is on Napoleon Bonaparte, and you want students to know certain facts, so you want to give them "credit" in the model if they use words like Corsica, Consul, Josephine, Emperor, Waterloo, Austerlitz, and St. Helena. You might also add other features such as word count, number of grammar errors, and number of spelling errors.
  • You create a map of which students used each of these words, as 0/1 indicator variables. You can then fit a multiple regression with 7 predictor variables (did they use each of the 7 words) and the 5-point scale as your criterion variable. That model can then be used to predict each student's score from just their essay text.

Obviously, this example is too simple to be of use, but the same general idea is applied in massive, complex studies. The establishment of the core features (predictor variables) can be much more sophisticated, and the models are going to be far more complex than multiple regression (neural networks, random forests, support vector machines).

Here’s an example of the very start of a data matrix for features, from an actual student essay.  Imagine that you also have data on the final scores, 0 to 5 points.  You can see how this is then a regression situation.
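To make the regression setup concrete, here is a minimal sketch in Python of the toy Napoleon example: building the 0/1 indicator matrix and fitting the regression. The essays, scores, and keyword hits below are invented placeholders, and scikit-learn is assumed.

```python
# Sketch of the toy Napoleon example: 0/1 keyword indicators as predictor
# variables and the human 0-5 score as the criterion. All data is made up.
import numpy as np
from sklearn.linear_model import LinearRegression

keywords = ["corsica", "consul", "josephine", "emperor",
            "waterloo", "austerlitz", "st. helena"]

def keyword_indicators(essay_text):
    """Return a 0/1 vector indicating whether each keyword appears."""
    text = essay_text.lower()
    return [1 if word in text else 0 for word in keywords]

essays = [
    "Napoleon was born in Corsica and was exiled to St. Helena ...",
    "He crowned himself Emperor after serving as First Consul ...",
    "The defeat at Waterloo ended his rule ...",
]                                  # in practice, all 100 graded essays
scores = [5, 3, 2]                 # human-assigned scores, 0 to 5

X = np.array([keyword_indicators(e) for e in essays])   # the feature data matrix
model = LinearRegression().fit(X, scores)

# Predict a new student's score from the essay text alone
new_essay = "Josephine and the Emperor ..."
print(model.predict([keyword_indicators(new_essay)]))
```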

How do you score the essay?

If the essays are on paper, then automated essay scoring won't work unless you have extremely good optical character recognition software that converts them into a digital database of text. Most likely, you have delivered the exam as an online assessment and already have the database. If so, your platform should include functionality to manage the scoring process, including multiple custom rubrics. An example from our FastTest platform is provided below.

FastTest essay marking

Some rubrics you might use:

  • Supporting arguments
  • Organization
  • Vocabulary / word choice

How do you pick the Features?

This is one of the key research problems. In some cases, it might be something similar to the Napoleon example. Suppose you had a complex item on accounting, where examinees review reports and spreadsheets and need to summarize a few key points. You might pull out a few key terms (e.g., "mortgage amortization") or numbers (e.g., 2.375%) and treat them as features. I saw a presentation at Innovations In Testing 2022 that did exactly this. Think of it as giving the students "points" for using those keywords, though because you are using complex machine learning models, it is not simply a single unit of credit; each keyword contributes to a regression-like model with a positive slope.

In other cases, you might not know. Maybe it is an item on an English test being delivered to English language learners, and you ask them to write about what country they want to visit someday. You have no idea what they will write about. But what you can do is tell the algorithm to find the words or terms that are used most often, and try to predict the scores with those. Maybe words like "jetlag" or "edification" show up in essays from students who tend to get high scores, while words like "clubbing" or "someday" tend to be used by students with lower scores. The AI might also pick up on spelling errors. I worked as an essay scorer in grad school, and I can't tell you how many times I saw kids use "ludacris" (the name of an American rap artist) instead of "ludicrous" when trying to describe an argument; they had literally never seen the word used or spelled correctly. Maybe the AI model learns to give that a negative weight. That's the next section!
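As a sketch of that idea, you could let a vectorizer find the most common terms and then inspect which ones receive positive or negative weights. Everything below (essays, scores) is a made-up placeholder, and scikit-learn is assumed.

```python
# Let the algorithm find frequently used words and learn which ones predict
# higher or lower scores. Essays and scores below are invented placeholders.
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import Ridge

essays = ["I want to visit Japan someday because ...",
          "Travel is edification for the mind, despite the jetlag ...",
          "It would be ludacris to skip the clubbing in Ibiza ..."]
scores = [3, 5, 1]                       # human-assigned scores

# Keep only the most common words/terms as candidate features
vectorizer = CountVectorizer(max_features=500)
X = vectorizer.fit_transform(essays)

model = Ridge().fit(X, scores)

# Inspect which terms pulled scores up or down (positive vs. negative weights)
weights = pd.Series(model.coef_, index=vectorizer.get_feature_names_out())
print(weights.sort_values())
```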

How do you train a model?

Well, if you are familiar with data science, you know there are TONS of models, and many of them have a bunch of parameterization options. This is where more research is required: which model works best on your particular essay prompt without taking five days to run on your data set? That's for you to figure out. There is a trade-off between simplicity and accuracy. A complex model might be more accurate but take days to run, while a simpler model might take two hours with a 5% drop in accuracy. It's up to you to evaluate.

If you have experience with Python or R, you know that there are many packages that provide this analysis out of the box – it is a matter of selecting a model that works.
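One way to evaluate that trade-off is to time and cross-validate a few candidate models on the same feature matrix. The sketch below uses synthetic data in place of real essay features and assumes scikit-learn; the models shown are just examples.

```python
# Compare a few candidate models on speed and accuracy with cross-validation.
# The feature matrix here is synthetic; in practice it would be your essay features.
import time
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.random((500, 50))                      # 500 essays, 50 features (placeholder)
y = np.clip(X[:, :5].sum(axis=1) + rng.normal(0, 0.3, 500), 0, 5)   # fake 0-5 scores

for name, model in [("Ridge regression", Ridge()),
                    ("Random forest", RandomForestRegressor(n_estimators=200)),
                    ("Neural network", MLPRegressor(max_iter=1000))]:
    start = time.perf_counter()
    r2 = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean R^2 = {r2:.3f}, time = {time.perf_counter() - start:.1f}s")
```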

How effective is automated essay scoring?

Well, as psychometricians love to say, "it depends." You need to do the model-fitting research for each prompt and rubric, and it will work better for some than for others. The general consensus in the research is that AES algorithms work as well as a second human, and therefore serve very well in that role. But you shouldn't use them as the only score; of course, that's impossible in many cases.

Here's a graph from some research we did on our algorithm, showing the correlation between human and AES scores. The three lines represent the proportion of the sample used in the training set; we saw decent results from only 10% in this case! Some of the models correlated above 0.80 with humans, even though this is a small data set. We found that the Cubist model took a fraction of the time needed by complex models like neural networks or random forests; in this case it might be sufficiently powerful.

Automated essay scoring results
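The analysis behind a graph like that can be sketched in a few lines: train on varying proportions of the sample and correlate the model's predictions with the human scores on the held-out essays. The data below is synthetic, and scikit-learn and SciPy are assumed.

```python
# Correlate machine predictions with human scores for several training-set
# proportions. Data here is synthetic; in practice, use your graded essays.
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.random((400, 30))                                 # placeholder essay features
y = np.clip(X[:, :4].sum(axis=1) + rng.normal(0, 0.4, 400), 0, 5)   # fake human scores

for train_frac in (0.10, 0.25, 0.50):
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, train_size=train_frac, random_state=0)
    preds = Ridge().fit(X_train, y_train).predict(X_test)
    print(f"Trained on {train_frac:.0%}: human-machine r = {pearsonr(y_test, preds)[0]:.2f}")
```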

How can I implement automated essay scoring without writing code from scratch?

There are several products on the market. Some are standalone, while others are integrated with a human-based essay scoring platform. ASC's platform for automated essay scoring is SmartMarq; click here to learn more. It currently takes a standalone approach, as you see below, making it extremely easy to use. It is also being integrated into our online assessment platform, alongside human scoring, to provide an efficient and easy way of obtaining a second or third rater for QA purposes.

Want to learn more? Contact us to request a demonstration.

SmartMarq automated essay scoring
