This repo contains a Python script that reads source text files and counts the number of occurrences of each word. It then saves the words and their counts in a specified format for use in other projects. Some pre-generated files are included for quick use, along with a data structure for easy manipulation and use of the generated word lists.
- Place a variety of text files in source_texts/.
- Run the program: `python word_frequencies.py`
- The created word_frequencies.json file contains the words and their frequencies in the source texts.
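The counting step can be sketched roughly as follows. This is a minimal illustration, not the actual contents of word_frequencies.py: the function names `count_words` and `count_directory`, the `*.txt` file pattern, and the tokenization rule (lowercase runs of letters and apostrophes) are all assumptions.

```python
import re
from collections import Counter
from pathlib import Path

# Assumed tokenization rule: a word is a run of letters or apostrophes.
WORD_RE = re.compile(r"[a-z']+")


def count_words(text):
    """Return a Counter mapping each lowercase word in text to its count."""
    return Counter(WORD_RE.findall(text.lower()))


def count_directory(source_dir="source_texts"):
    """Sum word counts over every .txt file directly inside source_dir."""
    totals = Counter()
    for path in sorted(Path(source_dir).glob("*.txt")):
        # errors="ignore" skips bytes that are not valid UTF-8.
        text = path.read_text(encoding="utf-8", errors="ignore")
        totals.update(count_words(text))
    return totals
```

Calling `count_directory()` with no arguments would scan source_texts/ in the current directory; how the real script tokenizes and which files it picks up may differ.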
A class containing the required functions to use and modify a stored word frequency list. See file for functions and documentation.
A word frequencies list generated from a random set of fiction and non-fiction texts downloaded from textfiles.com.
- Line 1: A JSON-encoded list of unique words.
- Line 2: A dictionary of the occurrence count of each unique word.
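A file in the two-line layout described above could be read with something like the sketch below. It assumes that line 2 is also JSON-encoded (a JSON object mapping each word to its count), which the description does not state explicitly. The names `parse_word_frequencies` and `load_word_frequencies` are hypothetical and not part of this repo.

```python
import json


def parse_word_frequencies(lines):
    """Parse the two-line layout from an iterable of text lines.

    Line 1 is decoded as a JSON list of unique words.
    Line 2 is decoded as a JSON object of word -> count (assumed encoding).
    """
    it = iter(lines)
    words = json.loads(next(it))
    counts = json.loads(next(it))
    return words, counts


def load_word_frequencies(path="word_frequencies.json"):
    """Open path and parse its first two lines."""
    with open(path, encoding="utf-8") as f:
        return parse_word_frequencies(f)
```

For example, `words, counts = load_word_frequencies()` followed by `counts.get("the", 0)` would look up the count for one word, returning 0 if it is absent.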