The word-counter-binary-tree from manasesjesus

manasesjesus / word-counter-binary-tree Goto Github PK

View Code? Open in Web Editor NEW

Read text from a file, split it into words and build a binary tree where each leaf node represents a unique word.

License: MIT License

Java 100.00%

word-counter-binary-tree's Introduction

Word Counter Binary Tree - Code Challenge

Write a Java-based program (command line) that reads text from a file, splits it into words at spaces and newline characters and constructs an (unbalanced) binary tree where each leaf node represents a unique word.

The tree construction shall start by creating a node for each unique word, where a node has a field to keep track of the occurrence count. The algorithm starts with the two least occurring nodes and creates a parent node. The parent node gets assigned an occurrence count that is the sum of the word occurrences. The process then repeats, i.e., it locates the two nodes with the least occurrence count, creates a parent node, and so on, until all nodes are part of the tree.

Finally, print the tree to the console (very basic output is sufficient).

For example, the text “She had had to address address problems” results in this tree (note that there are multiple variants):

Solution

The application receives a filename as a parameter. It is then parsed to get the words and number of occurrences stored in a map which is used to build the tree. Optionally, elements to be ignored can be passed as parameters, e.g. ( ) . ,

By default it is not case sensitive, but it is possible to call the WordSplitter ignoreCase method and set it to false.

Execution:
WordCounterBinaryTree filename [elements to be ignored]

Examples

Execution and output 1:

$ WordCounterBinaryTree  resources/test01.txt  

7
├──4
│  ├──1  [ address ]
│  └––3
│     ├──1  [ (had) ]
│     └––2
│        ├──1  [ she ]
│        └––1  [ had, ]
└––3
   ├──1  [ (address); ]
   └––2
      ├──1  [ problems... ]
      └––1  [ to ]

Execution and output 2:

$ WordCounterBinaryTree  resources/test01.txt ( ) . , ; 

7 
├──3 
│  ├──1  [ problems ] 
│  └––2 
│     ├──1  [ to ] 
│     └––1  [ she ] 
└––4 
   ├──2  [ address ] 
   └––2  [ had ]

Recommend Projects

manasesjesus / word-counter-binary-tree Goto Github PK

word-counter-binary-tree's Introduction

Word Counter Binary Tree - Code Challenge

Solution

Examples

word-counter-binary-tree's People

Contributors

Stargazers

Watchers

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent