Classify your Essays according to the Essay Prompts using AI

An essay prompt is a cue that suggests a starting point or a potential topic idea for the essay you need to complete. Essay prompts are a trigger for ideas regarding a topic or issue.

Students respond to those prompts (given by professors) by writing an essay. Generally, prompts serve as a starting point for an original essay, report, journal entry, story, poem, or other form of writing.

A prompt can be anything: a single line or a picture. How the essay develops depends on the student’s creativity in interpreting the prompt. The response is then graded on writing, reasoning and analytical skills.

Stick to the given Essay Prompts

Many applications provide a few different prompts to write about. This helps students, as they don’t have to brainstorm topics themselves. However, there are times when a student has a pre-planned story to write about that doesn’t conform to any of the given prompts.

To get around this, they write one paragraph that satisfies the prompt and devote the remaining portion to something else.

Another major problem some students face is misunderstanding the main topic of the prompt.

Colleges provide a prompt for a reason: they want to test your writing on that particular topic. You need to comply with it and make sure that the essay fits the given prompt. When you deviate from the prompt or misunderstand it, your essay loses marks.

The Common App Essay Prompts for 2022–2023

These are the 7 prompts to help students write an effective college essay.

#1 Prompt : Background Essay

#2 Prompt : Setback Essay

#3 Prompt : Challenger Essay

#4 Prompt : Gratitude Essay

#5 Prompt : Accomplishment Essay

#6 Prompt : Passion Essay

#7 Prompt : Custom: Topic of your Choice

Using AI to make you adhere to the Essay prompts

At textify.ai, we applied AI to find a solution to this problem. In the following sections, we look at how we used state-of-the-art NLP algorithms to find a probabilistic score for your essay against a prompt.

Dataset preparation

The essays were labelled according to the seven prompts listed above. To avoid ambiguity, each pair of essay topic and essay text was assigned a label from 0 to 6, stored in a target attribute.

DataFrame consisting of Essay Topic, Essay Text and the Target attributes
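As a rough illustration of this labelling step, the sketch below maps prompt names to the integers 0 to 6 and attaches them as a target field. The short prompt names come from the list above; the field names (`essay_topic`, `essay_text`, `prompt`, `target`) and the sample rows are hypothetical, not taken from our actual dataset.

```python
# Short prompt names, in the order of the seven prompts listed earlier.
PROMPTS = [
    "Background",
    "Setback",
    "Challenger",
    "Gratitude",
    "Accomplishment",
    "Passion",
    "Custom",
]

# Map each prompt name to an integer label 0-6, and back again.
LABEL_OF = {name: idx for idx, name in enumerate(PROMPTS)}
PROMPT_OF = {idx: name for name, idx in LABEL_OF.items()}


def add_target(rows):
    """Return copies of the rows with an integer 'target' field added.

    Each row is a dict with a 'prompt' key holding one of the PROMPTS names.
    """
    return [{**row, "target": LABEL_OF[row["prompt"]]} for row in rows]


# Hypothetical sample rows, for illustration only.
rows = [
    {"essay_topic": "Two cultures", "essay_text": "Growing up ...", "prompt": "Background"},
    {"essay_topic": "Robotics club", "essay_text": "I lose track of time ...", "prompt": "Passion"},
]
labelled = add_target(rows)
print([row["target"] for row in labelled])  # [0, 5]
```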

The target class of the dataset was unbalanced, with a disproportionate number of essays belonging to the ‘Background’ prompt. This has to be taken into account during model building. Prompts can also overlap within the dataset, since one essay may plausibly fit more than one prompt.

The total count of prompts in the entire dataset represented as a pie chart
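One common way to account for this kind of imbalance is to weight each class by the inverse of its frequency. The sketch below is only an illustration of that idea, not necessarily what we used; the label column is made up, and the formula follows the usual "balanced" heuristic, n_samples / (n_classes * count).

```python
from collections import Counter


def class_weights(labels, num_classes):
    """Inverse-frequency weights: n_samples / (num_classes * count[c])."""
    counts = Counter(labels)
    n_samples = len(labels)
    return [
        n_samples / (num_classes * counts[c]) if counts[c] else 0.0
        for c in range(num_classes)
    ]


# Made-up label column in which label 0 dominates.
labels = [0] * 6 + [1] * 2 + [2] * 2
weights = class_weights(labels, num_classes=3)

print(Counter(labels))
print([round(w, 3) for w in weights])
```

Rare classes receive larger weights, so mistakes on them count for more if the weights are passed to a weighted loss.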

Setting up the Model

We create a config class that holds the model name, the model’s hyperparameters and the file paths. To improve accuracy, we adjust the hyperparameters in this config class.
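A config class of this kind might look like the minimal sketch below. Only the number of labels, the 25 epochs and the 25% test split come from this article; the class name, field names, paths and remaining hyperparameter values are assumed placeholders.

```python
from dataclasses import dataclass, replace


@dataclass
class Config:
    # Model name as written in this article; the exact hub identifier is not given.
    model_name: str = "RoBERTa-tiny-cased"
    num_labels: int = 7       # seven prompts, labels 0-6
    epochs: int = 25          # number of training epochs used in this article
    test_size: float = 0.25   # share of the data held out for testing

    # Everything below is an assumed placeholder, not our actual setting.
    max_length: int = 512
    batch_size: int = 16
    learning_rate: float = 2e-5
    data_path: str = "data/essays.csv"
    model_path: str = "models/essay_classifier"


config = Config()

# Trying a different hyperparameter setting only touches the config object.
tuned = replace(config, learning_rate=1e-5, batch_size=32)
print(tuned)
```

Keeping every setting in one place means an experiment can be reproduced from the config alone.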

We perform multiclass classification on the essay topic and the essay text, using the RoBERTa-tiny-cased model from Hugging Face.

Training the Model

Training is carried out on the embeddings extracted after tokenizing the essay texts. We split the data into train embeddings and test embeddings (25%) to validate the model’s performance. Because the labels are imbalanced, we split the dataset in a stratified fashion, so that each split keeps roughly the same label proportions as the full dataset.
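To make the idea of a stratified split concrete, here is a small standard-library sketch that holds out 25% of the indices within each label. The function name, seed and toy labels are our own illustration, not the code used for the model.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_size=0.25, seed=42):
    """Return (train_idx, test_idx) with per-label proportions preserved."""
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # Hold out the same fraction of every label.
        n_test = round(len(indices) * test_size)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy imbalanced labels: eight of class 0, four each of classes 1 and 2.
labels = [0] * 8 + [1] * 4 + [2] * 4
train_idx, test_idx = stratified_split(labels, test_size=0.25)
print(len(train_idx), len(test_idx))  # 12 4
```

With scikit-learn available, `train_test_split(X, y, test_size=0.25, stratify=y)` does the same job in one call.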

Evaluation of the Model

After training the model for 25 epochs, we check the loss and accuracy on the train and test sets for evaluation.

A plot to represent the model loss and model accuracy vs the number of training epochs.

From the above results, we can see that the model performed quite well. The RoBERTa model achieves high accuracy without any hyperparameter tuning.

Classification Metrics

The prompts were unbalanced in the target attribute, with the Background prompt appearing more frequently than the others. Accuracy alone can therefore be misleading, because a model that favours the majority class still scores well on it. The F1 score is the preferred choice here.
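The toy example below shows why accuracy can flatter a model on imbalanced labels while F1 does not. It computes a macro-averaged F1 by hand; the choice of macro averaging and the made-up predictions are assumptions for illustration, not a description of our evaluation code.

```python
def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_classes


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up case: 8 essays of class 0, 2 of class 1, and a model that always predicts 0.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10

print(accuracy(y_true, y_pred))               # 0.8
print(round(macro_f1(y_true, y_pred, 2), 3))  # 0.444
```

The always-predict-majority model reaches 80% accuracy but a macro F1 of only about 0.44, because it never identifies the minority class.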

We got an F1 score of 0.98. Let’s proceed to check the model’s performance on unseen data.

Prediction on the Unseen Data

We’ve preprocessed our data and built, trained, and saved our model. Now we can begin making predictions with it. Once the input string has been formatted as a dictionary of tensors, we can pass it to the predict method. This returns an array of probabilities, one for each output label.

The probabilities predicted by the model for an unseen essay

Here, we can see the probabilities of the essay text belonging to each of the prompts.

We take the three highest probabilities from the returned array for each essay text and append them to a data frame. The top_1 column denotes the prompt with the highest probability score.

The final dataframe depicting the top 3 prompts in order of highest probability
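Selecting the top three prompts from a probability array can be sketched as follows. The probabilities are made up for illustration, and the helper name `top_k_prompts` is hypothetical; only the `top_1` to `top_3` naming follows the table above.

```python
PROMPTS = ["Background", "Setback", "Challenger", "Gratitude",
           "Accomplishment", "Passion", "Custom"]


def top_k_prompts(probabilities, k=3):
    """Return the k (prompt, probability) pairs with the highest probability."""
    ranked = sorted(
        zip(PROMPTS, probabilities),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:k]


# Made-up probabilities for one essay, one value per label 0-6.
probs = [0.05, 0.62, 0.20, 0.03, 0.06, 0.03, 0.01]
top = top_k_prompts(probs)

# One row of the final table: top_1 is the most likely prompt.
row = {f"top_{rank}": name for rank, (name, _) in enumerate(top, start=1)}
print(row)
# {'top_1': 'Setback', 'top_2': 'Challenger', 'top_3': 'Accomplishment'}
```

Reporting three candidates instead of one is useful when prompts overlap, since an essay may reasonably fit more than one of them.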

Conclusion

This brings us to the end of the full multi-class classification transformer walkthrough, from start to finish. This was the complete procedure we followed to classify essays based on the essay prompts. Try it out here!
