The goal of the CommonLit Readability Prize (CLRP) is to predict the reading difficulty of the given text.
The main columns are:
Unique ID: A unique identifier for each excerpt.
Excerpt: A passage of text (about 700 characters long) drawn from several educational materials.
Target Score: The difficulty score of each excerpt, as rated by high school teachers.
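To make the column layout concrete, here is a minimal sketch that parses a few made-up rows and scores a trivial mean-prediction baseline. The column names `id`, `excerpt`, and `target`, the sample rows, and the use of RMSE as the error measure are all assumptions for illustration; the text above does not specify them.

```python
import csv
import io
import math

# Hypothetical sample data: these rows and column names are illustrative only.
SAMPLE_CSV = """id,excerpt,target
a1,"The cat sat on the mat.",0.5
b2,"Photosynthesis converts light energy into chemical energy.",-1.5
c3,"The dog ran fast.",1.0
"""


def load_rows(text):
    """Parse CSV text into a list of dicts with a float target score."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {"id": r["id"], "excerpt": r["excerpt"], "target": float(r["target"])}
        for r in reader
    ]


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


rows = load_rows(SAMPLE_CSV)
targets = [r["target"] for r in rows]

# Baseline: predict the mean target score for every excerpt.
mean_target = sum(targets) / len(targets)
baseline = rmse(targets, [mean_target] * len(targets))
print(f"{len(rows)} excerpts, mean target {mean_target:.2f}, baseline RMSE {baseline:.4f}")
```

Any model worth keeping should beat this constant-prediction baseline on held-out excerpts.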
Data Cleaning: https://www.kaggle.com/ananduk1993/clrp-data-clean-excerpt-stats
Feature Engineering+Baseline Prediction: https://www.kaggle.com/ananduk1993/clrp-feature-engineering-baseline-prediction
Fine-tuning RoBERTa: https://www.kaggle.com/ananduk1993/cl-roberta-large-lightgbm-better-finetune
Ensemble of Top Models: https://www.kaggle.com/ananduk1993/ensemble-top-models