Code and data for the paper: "The Rising Entropy of English in the Attention Economy".
Contact repo owner for more information.
Note: The folder gutenberg under utilities is from another project, found at https://github.com/pgcorpus/gutenberg. The tools in this folder were used to clean text in a standardised way.
The project is tested on Python 3.9 and 3.10. Before running it, install the package dependencies, for example with:
pip3 install -r requirements.txt
The data/results folder includes results from measures on the corpora.
If you wish to replicate the analysis, you will first need to download the various text corpora. Links to these are in the main paper; you can also search for them online or contact the repo owner. For each corpus, save the files in data/copora/<corpus_name>/raw.
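The folder layout described above could be prepared with a short script such as the sketch below. It is not part of the repository: the function name make_raw_dirs and the corpus names in the usage example are hypothetical, and the base path is copied verbatim from the sentence above, so check it against the actual folder name in the repository before relying on it.

```python
from pathlib import Path


def make_raw_dirs(corpus_names, base="data/copora"):
    """Create <base>/<corpus_name>/raw for each corpus name.

    `base` is copied verbatim from the README text (an assumption);
    adjust it if the repository uses a different folder name.
    Returns the list of directories that now exist.
    """
    created = []
    for name in corpus_names:
        raw_dir = Path(base) / name / "raw"
        # parents=True builds intermediate folders; exist_ok=True makes reruns safe.
        raw_dir.mkdir(parents=True, exist_ok=True)
        created.append(raw_dir)
    return created


if __name__ == "__main__":
    # Hypothetical corpus names, for illustration only.
    for path in make_raw_dirs(["corpus_a", "corpus_b"]):
        print(f"Place the downloaded files in: {path}")
```

Run it from the repository root, then copy each downloaded corpus into its matching raw folder.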