The dsci553-foundations-and-applications-of-data-mining's intro from minglyubyte

DSCI 553: Foundations and Applications of Data Mining

Description

Data mining is a fundamental skill for massive data analysis. At a high level, it allows the analyst to discover patterns in data and turn them into a usable product. The course teaches data mining algorithms for analyzing very large data sets. It has an applied focus: it prepares students to use data mining techniques to solve real-world problems.

Homeworks

| No. | Main Application | Programming | Tags | Version | Score |
| --- | --- | --- | --- | --- | --- |
| 1 | Spark Data Exploration | Python & Spark & Scala | `MapReduce` `Spark` `Pyspark` | Python 3.6, JDK 1.8, Scala 2.11, and Spark 2.4.4 | 7.7/7 |
| 2 | Frequent itemsets | Python & Spark | `SON` `Limited-Pass` `MapReduce` | Python 3.6, JDK 1.8, Scala 2.12, and Spark 3.1.2 | 7/7 |
| 3 | Collaborative-filtering recommendation systems | Python & Spark | `LSH` `MinHash` `XGBregressor` | Python 3.6, JDK 1.8, Scala 2.12, and Spark 3.1.2 | 7/7 |
| 4 | Community Detection | Python & Spark | `Girvan-Newman algorithm` `Betweenness` `Graph` `LPA` `GraphFrames` | Python 3.6, JDK 1.8, Scala 2.12, and Spark 3.1.2 | 7/7 |
| 5 | Mining Stream Data | Python & Spark | `Bloom Filtering` `Flajolet-Martin algorithm` `Fixed Size Sampling` | Python 3.6, JDK 1.8, Scala 2.12, and Spark 3.1.2 | 7/7 |
| 6 | Clustering | Python & Spark | `KMeans` `BFR Algorithm` | Python 3.6, JDK 1.8, Scala 2.12, and Spark 3.1.2 | 7/7 |
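
For readers unfamiliar with the tags above, here is a rough illustration of one of them, `Bloom Filtering` from homework 5. It is a minimal pure-Python sketch and not the code from this repository: the class name `BloomFilter`, the method names, the default sizes, and the salted SHA-256 hashing scheme are all assumptions chosen for illustration. A Bloom filter never reports a false negative, but it can report false positives.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array with k hash functions (illustrative sketch only)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        # Sizes are arbitrary assumptions, not values taken from the homework.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, item):
        # Derive k bit positions by salting a SHA-256 hash with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, item):
        # Set every bit position derived from the item.
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        # False means definitely unseen; True means possibly seen.
        return all(self.bits[pos] for pos in self._positions(item))


# Hypothetical stream of user IDs.
seen = BloomFilter()
for user_id in ["u1", "u2", "u3"]:
    seen.add(user_id)
```

Calling `seen.might_contain("u1")` returns `True` for any ID that was added. For an ID that was never added the answer is usually `False`, though a false positive is possible when its bit positions collide with those of earlier items.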
