
finloop / data-science-notebooks


Collection of my data science notebooks.

License: MIT License

Jupyter Notebook 99.71% Python 0.29%
forecasting prophet time-series data-science fpgrowth wikipedia

data-science-notebooks's Introduction

About

This repo contains analyses of several datasets. The current work focuses on time series forecasting and anomaly detection.

Datasets

Wikipedia

Drawing a graph of page links

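import urllib3
import networkx as nx
from wikipedia.parser import get_graph

# Connection pool used to fetch the pages
pool = urllib3.PoolManager()

# Build the link graph starting from the "Data mining" article
G = get_graph(pool, url="https://en.wikipedia.org/wiki/Data_mining", deep=1)
# Draw the graph with the nodes arranged on a circle
nx.draw(G, nx.circular_layout(G), with_labels=True)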

Data mining graph
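
The internals of get_graph are not shown in this README. As a rough sketch of the idea, assuming it walks page links breadth-first up to a depth limit, the example below uses a hypothetical in-memory link map (TOY_LINKS) instead of real HTTP requests. The function collect_edges, the page names and the exact meaning of `deep` here are illustrative assumptions, not the repository's API.

```python
from collections import deque

# Hypothetical link map standing in for fetched pages (not real Wikipedia data).
TOY_LINKS = {
    "Data_mining": ["Data_set", "Statistics"],
    "Data_set": ["Data", "Statistics"],
    "Statistics": ["Data"],
    "Data": [],
}


def collect_edges(links, start, deep):
    """Breadth-first walk from `start`; only pages at most `deep` hops away are expanded."""
    edges = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth > deep:
            continue  # too far from the start page, do not follow its links
        for target in links.get(page, []):
            edges.append((page, target))
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return edges


edges = collect_edges(TOY_LINKS, "Data_mining", deep=1)
for source, target in edges:
    print(source, "->", target)
```

An edge list like this can be passed to nx.Graph(edges) and drawn with nx.draw in the same way as above.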

Finding philosophy page

In this experiment, I test the hypothesis that repeatedly following the first link of any Wikipedia article eventually leads to the Philosophy article.

For more information on the subject, see my dev.to article.

crawl(pool, "https://en.wikipedia.org/wiki/Data_mining", phrase="Philosophy", deep=30, n=1, verbose=True)
30 Entering Data_mining
29 Entering Data_set

...

   [('https://en.wikipedia.org/wiki/Thought',
     [('https://en.wikipedia.org/wiki/Ideas',
       ['https://en.wikipedia.org/wiki/Philosophy'])])])])])])])])])])])])])])])])])])])])])])])])])])

Experiment and code
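
The crawl function itself is not reproduced here. The sketch below illustrates only the first-link idea, assuming each page is reduced to its first link. FIRST_LINK is a made-up map (it does not reflect real Wikipedia pages), and follow_first_links and max_steps are hypothetical names chosen for this example.

```python
# Hypothetical first-link map: page -> first link in its body (toy data).
FIRST_LINK = {
    "Data_mining": "Data_set",
    "Data_set": "Data",
    "Data": "Information",
    "Information": "Abstraction",
    "Abstraction": "Philosophy",
    "Loop_a": "Loop_b",
    "Loop_b": "Loop_a",
}


def follow_first_links(first_link, start, target="Philosophy", max_steps=30):
    """Follow first links from `start`; return (path, reached_target)."""
    path = [start]
    visited = {start}
    while path[-1] != target and len(path) <= max_steps:
        nxt = first_link.get(path[-1])
        if nxt is None or nxt in visited:
            return path, False  # dead end or cycle: the target is unreachable
        visited.add(nxt)
        path.append(nxt)
    return path, path[-1] == target


path, found = follow_first_links(FIRST_LINK, "Data_mining")
print(" -> ".join(path), "| reached Philosophy:", found)
```

Tracking visited pages matters because first-link chains can loop; without it the walk would only stop at the step limit.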

E-commerce dataset from a Brazilian retail store

Predictions (dataset sampled daily)

Prophet: prediction of order volume with confidence intervals

Notebooks:

Animations:

Smoothed with 3-day moving average, yearly seasonality
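
As a minimal sketch of the two preprocessing steps mentioned in this section (daily sampling and 3-day smoothing), the example below counts orders per day and applies a trailing moving average. The timestamps are invented, the helper names are hypothetical, and whether the notebooks use a trailing or a centred window is an assumption.

```python
from collections import Counter
from datetime import datetime, timedelta


def daily_counts(timestamps):
    """Count events per calendar day; days without events get a count of 0."""
    counts = Counter(ts.date() for ts in timestamps)
    first, last = min(counts), max(counts)
    days = [first + timedelta(days=i) for i in range((last - first).days + 1)]
    return [(day, counts.get(day, 0)) for day in days]


def moving_average(values, window=3):
    """Trailing moving average; points without a full window are skipped."""
    return [
        sum(values[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Invented order timestamps, for illustration only.
orders = [
    datetime(2018, 1, 1, 9, 0),
    datetime(2018, 1, 1, 17, 30),
    datetime(2018, 1, 2, 12, 0),
    datetime(2018, 1, 4, 8, 0),
    datetime(2018, 1, 4, 8, 5),
    datetime(2018, 1, 4, 20, 0),
]

volume = [count for _, count in daily_counts(orders)]
smoothed = moving_average(volume, window=3)
print(volume, smoothed)
```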

User interactions database

This dataset contains data from a news website. Each CSV file contains information about sessions, clicks on articles, time of interaction, etc.

The notebook frequency_analysis.ipynb analyses the distribution of the page on which a given article appears. For each session I added the order of its articles, then filtered for that one article and created a histogram of its order for each hour.

Histograms of given article (119592) order in sessions
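
The steps described above can be sketched roughly as follows, assuming the click log can be reduced to (session, article, time) rows. The CLICKS rows are invented, and order_histograms is a hypothetical helper, not code from the notebook.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Invented click log: (session_id, article_id, click time).
CLICKS = [
    ("s1", 119592, datetime(2020, 1, 1, 8, 5)),
    ("s1", 7, datetime(2020, 1, 1, 8, 1)),
    ("s2", 7, datetime(2020, 1, 1, 9, 0)),
    ("s2", 42, datetime(2020, 1, 1, 9, 2)),
    ("s2", 119592, datetime(2020, 1, 1, 9, 4)),
    ("s3", 119592, datetime(2020, 1, 1, 8, 30)),
]


def order_histograms(clicks, article_id):
    """Map hour of day -> Counter of the article's 1-based order within its session."""
    sessions = defaultdict(list)
    for session_id, article, ts in clicks:
        sessions[session_id].append((ts, article))

    histograms = defaultdict(Counter)
    for events in sessions.values():
        events.sort()  # chronological order within the session
        for order, (ts, article) in enumerate(events, start=1):
            if article == article_id:
                histograms[ts.hour][order] += 1
    return histograms


hist = order_histograms(CLICKS, 119592)
for hour in sorted(hist):
    print(hour, dict(hist[hour]))
```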

Image processing

Animation of how Sobel edge detection works: Edge detection with Sobel
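
For readers without the animation, here is a minimal pure-Python sketch of the Sobel operator, assuming a greyscale image stored as a list of rows. It applies the two standard 3x3 Sobel kernels and combines them into a gradient magnitude. The function name sobel and the tiny test image are illustrative, and this is not the code used to produce the animation.

```python
import math

# Standard Sobel kernels for horizontal (KX) and vertical (KY) intensity changes.
KX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
KY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel(image):
    """Gradient magnitude for interior pixels of a 2D list; border pixels stay 0."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += KX[dy][dx] * pixel
                    gy += KY[dy][dx] * pixel
            out[y][x] = math.hypot(gx, gy)
    return out


# Toy image with a vertical edge between the second and third columns.
image = [[0, 0, 1, 1] for _ in range(4)]
magnitude = sobel(image)
for row in magnitude:
    print(row)
```

Interior pixels next to the edge get a large magnitude, while a flat image produces zeros everywhere.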
