
ADAGA

Structure

This project contains two versions of ADAGA, the change point detection algorithm presented in "Adaptive Gaussian Process Change Point Detection", by Edoardo Caldarelli, Philippe Wenk, Stefan Bauer, and Andreas Krause (Proceedings of the 39th International Conference on Machine Learning, Baltimore, Maryland, USA, PMLR 162, 2022). The directory inducing_points_version contains the implementation based on inducing points, and qff_version contains the implementation based on quadrature Fourier features (QFFs) and the exact linear kernel. This demo uses pip version 20.2.4.

The directory time_series contains the 6 real-world and 3 synthetic time series used in our experiments.

Create the virtual environment

The virtual environment for running the project can be created with the following commands, run from the main directory of the project code. Firstly, we create the virtual environment via venv, using a Python 3.7 interpreter:

python3.7 -m venv env

Then we activate the environment:

source env/bin/activate

Now, we install the required packages:

pip install -r requirements.txt
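
Before running the scripts, it can help to confirm that the interpreter in use is the one from the virtual environment. The following is a minimal sketch that uses only the standard library. The helper names in_virtualenv and python_version are illustrative and not part of the project.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def python_version() -> str:
    """Return the running interpreter version as 'major.minor'."""
    return f"{sys.version_info.major}.{sys.version_info.minor}"


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("Python version:", python_version())
```

If the first line reports False, the environment has most likely not been activated in the current shell.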

Sources

Code

The implementation of GP regression with QFFs is based on code implemented by Emmanouil Angelis, ETH Zurich (Angelis et al., "SLEIPNIR: Deterministic and Provably Accurate Feature Expansion for Gaussian Process Regression with Derivatives", 2020).

Datasets

The 6 real-world time series datasets were downloaded from https://github.com/alan-turing-institute/TCPD.

Change point detection

To reproduce our change point detection experiments, we can run the following steps.

Firstly, we must extract the 6 real-world datasets used in our experiments. To do so, we can run the script extract_time_series.py. We have to make sure that, at this step, we are in the code directory:

python -m extract_time_series

Then, we can generate the 3 synthetic series, also used in our experiments, by using the script generate_simulated_cp_data.py. Note that this command creates 10 different noisy realizations of each series. We have to make sure that we are still in the code directory:

python -m generate_simulated_cp_data
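
To illustrate what "noisy realizations" of a series with change points in the mean look like, here is a minimal standalone sketch. It is not the code of generate_simulated_cp_data.py. The series length, the change point locations, the mean levels, the noise level, and the function names are all assumptions chosen for illustration.

```python
import random


def piecewise_mean(n_steps, change_points, levels):
    """Noise-free signal whose mean jumps at each index in change_points."""
    if len(levels) != len(change_points) + 1:
        raise ValueError("need exactly one level per segment")
    signal = []
    segment = 0
    for t in range(n_steps):
        # Move to the next segment once a change point is reached.
        while segment < len(change_points) and t >= change_points[segment]:
            segment += 1
        signal.append(levels[segment])
    return signal


def noisy_realizations(signal, n_realizations=10, noise_std=0.5, seed=0):
    """Return n_realizations copies of signal with independent Gaussian noise."""
    realizations = []
    for k in range(n_realizations):
        rng = random.Random(seed + k)  # one reproducible stream per realization
        realizations.append([y + rng.gauss(0.0, noise_std) for y in signal])
    return realizations


# Hypothetical example: 300 steps, 2 change points in the mean.
clean = piecewise_mean(n_steps=300, change_points=[100, 200], levels=[0.0, 2.0, -1.0])
series = noisy_realizations(clean)
print(len(series), len(series[0]))
```

Each realization shares the same underlying change points and differs only in the noise draw.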

The extracted time series' values (obs.csv), along with the time steps (tsteps.csv), are saved in one directory per series, e.g., at the path ./time_series/run_log for the Run Log series, or ./time_series/mean_0 for the first noisy realization of the synthetic series with 2 change points in the mean.
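
The saved files can be inspected with a few lines of Python. The sketch below assumes that each file holds a univariate series of plain numeric cells, with an optional header line. The actual layout of obs.csv and tsteps.csv may differ, so the helpers read_column and load_series are illustrative only.

```python
import csv
from pathlib import Path


def read_column(csv_path):
    """Read every numeric cell of a CSV file into one flat list of floats."""
    values = []
    with open(csv_path, newline="") as handle:
        for row in csv.reader(handle):
            for cell in row:
                cell = cell.strip()
                if not cell:
                    continue
                try:
                    values.append(float(cell))
                except ValueError:
                    # Assumption: non-numeric cells are header labels; skip them.
                    continue
    return values


def load_series(series_dir):
    """Return (tsteps, obs) for a directory such as ./time_series/run_log."""
    series_dir = Path(series_dir)
    tsteps = read_column(series_dir / "tsteps.csv")
    obs = read_column(series_dir / "obs.csv")
    if len(tsteps) != len(obs):
        raise ValueError(f"{len(tsteps)} time steps but {len(obs)} observations")
    return tsteps, obs
```

For example, load_series("./time_series/run_log") would return the time steps and observations of the Run Log series once the extraction step has been run.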

Inducing points

To use the inducing point approximation, we move into its directory from the code directory:

cd inducing_points_version

We can now partition the desired series. For instance,

python -m regionalize_time_series --dataset "run_log" --kernel "Linear"

regionalizes the Run Log time series with the parameters used in our experiments.

The real-world datasets available are run_log, businv, gdp_japan, gdp_argentina, gdp_iran.

The synthetic datasets available are mean (with change points in the mean), var (with change points in the noise variance), and per (with change points in the temporal correlation of the samples).

Valid kernels are Linear (for businv and run_log datasets), and RBF, RQ, Matern52, Periodic (for the remaining datasets).
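
The dataset and kernel combinations listed above can be checked before launching a run. The following sketch encodes only the names stated in this README. The helpers valid_kernels and build_command are hypothetical and do not exist in the project.

```python
# Dataset and kernel names exactly as listed in this README.
REAL_WORLD = ["run_log", "businv", "gdp_japan", "gdp_argentina", "gdp_iran"]
SYNTHETIC = ["mean", "var", "per"]
LINEAR_DATASETS = {"run_log", "businv"}
OTHER_KERNELS = ["RBF", "RQ", "Matern52", "Periodic"]


def valid_kernels(dataset):
    """Return the kernels this README lists as valid for the given dataset."""
    if dataset not in REAL_WORLD + SYNTHETIC:
        raise ValueError(f"unknown dataset: {dataset!r}")
    return ["Linear"] if dataset in LINEAR_DATASETS else list(OTHER_KERNELS)


def build_command(dataset, kernel):
    """Build the argument list for the inducing points version of the script."""
    if kernel not in valid_kernels(dataset):
        raise ValueError(f"kernel {kernel!r} is not listed for dataset {dataset!r}")
    return ["python", "-m", "regionalize_time_series",
            "--dataset", dataset, "--kernel", kernel]


print(" ".join(build_command("run_log", "Linear")))
```

The returned list can be passed to subprocess.run from within the inducing_points_version directory.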

QFFs (and exact linear kernel)

To use the QFF approximation (or the exact linear kernel), we move into its directory from the code directory:

cd qff_version

Then, we proceed as before. The command

python -m regionalize_time_series --dataset "run_log"

regionalizes the Run Log time series with the parameters used in our experiments.

Note that, in our paper, the exact linear kernel is used with the businv and run_log datasets only. Conversely, all the other datasets are processed with the RBF kernel only (approximated with QFFs). Since the kernel is thus determined by the dataset, we do not state it in the shell command.

ADAGA's output

The information about the start and end of the regions is saved as a list of dictionaries, in .npy files, at "inducing_points_version/regions_time_series". Each element in the list corresponds to one region.

If the QFFs are used, the results directory is "qff_version/regions_time_series".
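
The saved regions can be read back with NumPy and turned into a list of change point locations. The sketch below makes two assumptions. First, it treats the start of every region after the first as a change point. Second, it uses "start" and "end" as dictionary keys, which are hypothetical; the actual keys should be checked by printing one element of the loaded list.

```python
def load_regions(npy_path):
    """Load the list of region dictionaries stored in one .npy file."""
    import numpy as np  # imported here so the helper below works without NumPy

    # A saved list of dicts comes back as an object array; allow_pickle=True
    # is required because np.load refuses to unpickle objects by default.
    return np.load(npy_path, allow_pickle=True).tolist()


def change_points(regions, start_key="start"):
    """Return the start of every region except the first, in sorted order.

    start_key is a hypothetical key name; adapt it to the actual output.
    """
    starts = sorted(region[start_key] for region in regions)
    return starts[1:]


# Hypothetical regions, for illustration only.
example = [
    {"start": 0, "end": 99},
    {"start": 100, "end": 199},
    {"start": 200, "end": 299},
]
print(change_points(example))  # prints [100, 200]
```

Because allow_pickle=True executes the unpickling machinery, load_regions should only be used on files produced by a trusted run.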
