Minimum Spanning Tree:

MST is a program aimed at genreating a randomly connected, undirected, weighted graph, using both an adjacency matrix and adjacency list implementation. Then finding the minimum spanning tree within the graph. This project is split up into four different parts:

Graph Generation
Edge Sorting
Kruskal's Algorithm
Prim's Algorithm (w/ Priority Queue)

Included are expected outputs of what the program should generate.

Graph Generation:

The graph is implemented in the Maze.java class which returns a a connected graph of "n" vertices's, where "n" is specified in the input file.

To determine if two vertices's should be connected by an edge, a random number is generated and if that number is less than or equal to the "p" value specified by input, then those two edges are connected with weight "w" which is a random number between 1 and "n" generated by two times the seed value.

To ensure the generated graph is connected, a depth-first traversal is used to count the number of vertices's visited. if the count does not equal the number of vertices's specified by the "n" value the procedure restarts.

The graph generation is timed and outputted.

Sorting Edges:

Sorting edges is done using three differing algorithms, Insertion Sort, Counting Sort, and QuickSort. To sort the edges, the adjacency list and matrix must first be converted into a list of edges which is done through the getList and getMatrix methods in the Maze.java file. The edge class implements the Comparable interface so that if the sort being used is unstable, such as Quicksort, then the program can force the sort to be stable. To determine which edge has greater priority first the weights are looked at if there is no deciding factor there, then the row numbers from the adjacency matrix are used, and if they are in the same row the column number is used to break the tie.

The conversion from adjacency list and matrix to a list of edges as well as the accompanied sorting of that list are timed and outputted. Note that graph generation is not factored into the timing.

Kruskal's Algorithm:

Implementation of Kruskal's algorithm with path compression and union-by-rank for cycle detection, using insertion sort, count sort, and quicksort to sort the edges of the graph, where the graph is implemented as both an adjacency matrix and adjancency list.

After the edges are sorted, and assuming no cycles are formed, choose an edge with the smallest weight from the list of the sorted edges, remove it from the list and add it to the minimum spanning tree. Repeat this procedure until "n" minus one edges have been added to the MST. Then check that no cycles have been formed using the union-find algorithm with path compression.

For Cycle checking there are two cases which could occur given an Edge with a left and right Vertex L and R:

L and R are not in the new partitioned tree, therefore adding that edge WILL NOT create a cycle
L and R are already in the new partitioned tree, therefore adding that edge WILL create a cycle (skip this edge)

To quickly determine if two Vertices's are in the same subtree, Union-Find Data structure is used. Union-Find works by saying initially you have N subtrees each of which contain exactly one Vertex. When two subtrees are joined, the number of total trees decreases by one. To implement a Union-Find data structure two functions are required:

Find(Vertex): Which given a vertex finds the root of that subtree
Union(Subtree1, Subtree2): Which given two subtrees combines the two trees. (Note. Union assumes that the trees have different roots)

Union Find can find a cycle in a graph, but its time complexity is at worst O(N) which although not the worst, could be better.

To make the Union-Find data structure more efficient, union by rank is used. Union by rank is just fancy verbiage to indicate that given two subtrees as before, instead of just combining them in some random way, combine them in a way that you don't create a straight line, but rather tree within a tree. To do this you find the bigger subtree and add the smaller subtree to the bigger one (a tree within a tree). This seems simple enough, but this small change to just the Union function takes the worst case of this function O(N) and makes it O(Log N).

Overall O(Log N) is way better then O(N), but there is one last thing that can be done to make this run even faster then that, Path Compression. Path Compression works by modifying that Find function from earlier. Again Path Compression is fancy verbiage to say that after you find the root of a given subtree, you can make it easier for everybody else who is trying to find the root by just saying that the parent of that node is the root. For example in a two node tree there is no difference, but in a tree with three subtrees the node on the bottom left points directly to the top and so does every other node in the tree. Overall this changes the runtime of cycle detection from O(N) in worst case initially to a very small constant. For example it would take about thirty years to determine if a cycle existed using a billion items using depth-first search, but only about six seconds using Union-by-rank with path compression.

The converting of the adjacency matrix and adjacency list into an array of edges, as well as cycle-checking, and the actual construction of the minimum spanning tree are timed and outputted.

Prim's Algorithm:

Just like Kruskal's algorithm, Prim's algorithm will find the minimum spanning tree, but through a different process. Prim's Algorithms works more effectively when dealing with a very dense graph ( Graph where the number of edges is closer to the maximum). This is due to how it generates the MST. Prims method focuses more on vertices as opposed to Kruskal's which focuses on edges.

Prims works by first choosing any random vertex in the Graph and adding it the MST.
To choose the next vertex to add the MST, Look at all the neighbors from that initial vertex and add whichever vertex has the smallest edge.
On each pass, choose the vertex (among those not yet in the MST) that has the smallest edge weight connected to any of the verices already in the MST.

Overall Prim's runtime is proportional to M log M (if using a priority queue implemented as a binary heap).

Error Conditions and Messages

Missing input file:
- Input file not found
n or seed is not an integer:
- n and seed must be integers
n is less than 2:
- n must be greater than 1
p is not a real number:
- p must be a real number
p is less than 0 or greater than 1:
- p must be between 0 and 1

michaelrinos / mst-in-c Goto Github PK

mst-in-c's Introduction

Minimum Spanning Tree:

Graph Generation:

Sorting Edges:

Kruskal's Algorithm:

Prim's Algorithm:

Error Conditions and Messages

mst-in-c's People

Contributors

Watchers

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent