
Porting WEASEL 2.0 to pyts

aglenis commented on May 26, 2024
Porting WEASEL 2.0 to pyts

from pyts.

Comments (3)

patrickzib commented on May 26, 2024

Thanks for taking the effort!

A couple of comments:

  1. n_bins_arg=4
    In WEASEL 2.0 the alphabet size is fixed to 2.

See: Alphabet Size

  2. window_sizes_arg=[0.1, 0.3, 0.5, 0.7, 0.9]
    I suppose 0.9 means 0.9 times the time series length? Please go for much smaller numbers. I use, e.g., win_size in np.arange(4, 44) or win_size in np.arange(4, 24). In combination with dilation, this adds up to very large receptive fields.

See: Window sizes

  3. first_difference = False
    I do not see first differences being used. Please randomly choose whether to apply first differences, too.

See: Ensemble

  4. The number of parameter configurations should be between 50 and 100, each choosing from the range of window_sizes, first_differences, and dilation factors.

See: Ensemble

  5. WEASEL has a novel feature selection strategy based on variance.

See: SFA with Variance

  6. strategy_arg='uniform'
    Not sure what uniform refers to. I randomly choose between equi-width and equi-depth binning.

See: [Binning Strategy](https://github.com/patrickzib/dictionary/blob/63633eeaa52680f3a1eb016ec95ea0ca2c5430b9/weasel/classification/dictionary_based/_weasel_v2.py#L125)
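To make the points above concrete, here is a minimal sketch of drawing such parameter configurations. It is not taken from the linked implementation: the function name `sample_configs`, the dictionary keys, and in particular the log-uniform way of drawing the dilation are assumptions made for illustration. The receptive field of a dilated window is computed as `(window_size - 1) * dilation + 1`, which shows why small window sizes already cover a large part of the series.

```python
import numpy as np


def sample_configs(series_length, n_configs=50, seed=0):
    """Draw random ensemble configurations (illustrative sketch, hypothetical helper)."""
    if series_length < 4:
        raise ValueError("series_length must be at least 4")
    rng = np.random.default_rng(seed)
    window_sizes = np.arange(4, 44)  # small absolute window sizes, as suggested above
    configs = []
    for _ in range(n_configs):
        # A window cannot be longer than the series itself.
        w = min(int(rng.choice(window_sizes)), series_length)
        # Largest dilation for which the dilated window still fits in the series.
        max_dilation = max(1, (series_length - 1) // (w - 1))
        # Assumption: draw the dilation log-uniformly between 1 and max_dilation.
        d = int(2 ** rng.uniform(0, np.log2(max_dilation)))
        d = min(max(d, 1), max_dilation)
        configs.append(
            {
                "window_size": w,
                "dilation": d,
                "first_difference": bool(rng.integers(0, 2)),
                "binning": str(rng.choice(["equi-width", "equi-depth"])),
                "alphabet_size": 2,  # fixed to 2 in WEASEL 2.0, per the comment above
                "receptive_field": (w - 1) * d + 1,
            }
        )
    return configs


configs = sample_configs(series_length=200, n_configs=60, seed=1)
```

Each entry in `configs` would then parameterize one transformer of the ensemble; how the resulting features are combined is outside the scope of this sketch.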

Hope this helps. IMO, the most critical parts are the alphabet size, the window sizes, the first differences, and the variance-based selection in SFA.
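As a rough illustration of what variance-based selection in SFA could look like, the sketch below keeps the real-valued Fourier components whose values vary most across a set of windows. This is an assumption-laden simplification, not the code from the linked repository: the helper name `select_by_variance`, the interleaving of real and imaginary parts, and the absence of any normalization are all choices made for this example.

```python
import numpy as np


def select_by_variance(windows, n_coefs):
    """Keep the n_coefs real-valued Fourier components with the highest variance.

    Hypothetical helper for illustration; `windows` has shape (n_windows, window_size).
    """
    spectrum = np.fft.rfft(windows, axis=1)
    # Interleave real and imaginary parts: column 2k is Re(X_k), column 2k+1 is Im(X_k).
    parts = np.empty((windows.shape[0], 2 * spectrum.shape[1]))
    parts[:, 0::2] = spectrum.real
    parts[:, 1::2] = spectrum.imag
    variances = parts.var(axis=0)
    # Indices of the highest-variance components, returned in ascending order.
    keep = np.sort(np.argsort(variances)[::-1][:n_coefs])
    return parts[:, keep], keep


# Toy data: sinusoids at frequency 3 with random phase plus a little noise.
rng = np.random.default_rng(0)
t = np.arange(32)
phases = rng.uniform(0, 2 * np.pi, size=(50, 1))
windows = np.sin(2 * np.pi * 3 * t / 32 + phases) + 0.01 * rng.normal(size=(50, 32))
features, keep = select_by_variance(windows, n_coefs=2)
```

In this toy setting the selected components are the real and imaginary parts of the third Fourier coefficient, since that is where the windows differ from each other.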


johannfaouzi commented on May 26, 2024

Hi,

Sorry for the delayed response, I saw the notification and forgot about it...

First, thanks @aglenis for the effort and thanks @patrickzib for the feedback! I will need to look at the paper and the source code to provide more detailed feedback, but I will answer some points first.

  • Performing dilation just to get the indices sounds suboptimal to me. You can get the indices with a closed formula.
  • The default window sizes seem to be from my implementation of WEASEL in pyts (I don't remember the default values in the original implementation of WEASEL, but in general I prefer relative values to absolute values for hyper-parameters).
  • The first difference seems to be used with X_train_trend and X_test_trend.
  • The strategy argument has different values in pyts: uniform stands for equi-width (the bins all have the same width), while quantile stands for equi-depth (the same number of values fall in each bin).
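For the first and third points, one possible reading of "a closed formula" is sketched below: the indices of every dilated window are computed directly with broadcasting, without building a dilated copy of the series first. The helper name `dilated_window_indices` is hypothetical, and the first difference is shown with plain `np.diff`; neither is taken from the code under discussion.

```python
import numpy as np


def dilated_window_indices(series_length, window_size, dilation):
    """Index matrix of shape (n_windows, window_size) for dilated sliding windows.

    Window i covers positions i, i + dilation, ..., i + (window_size - 1) * dilation.
    """
    span = (window_size - 1) * dilation + 1  # receptive field of one window
    if span > series_length:
        raise ValueError("dilated window does not fit in the series")
    starts = np.arange(series_length - span + 1)[:, None]
    offsets = dilation * np.arange(window_size)[None, :]
    return starts + offsets


x = np.arange(10.0) ** 2
idx = dilated_window_indices(len(x), window_size=3, dilation=2)
windows = x[idx]     # fancy indexing extracts all dilated windows at once
x_diff = np.diff(x)  # first difference of the series, one value shorter than x
```

The same index matrix can be reused for every series of equal length, so it only needs to be computed once per (window size, dilation) pair.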

In general, I like having more hyper-parameters (even if the values are fixed in the original paper) because it might be useful to change these values for other datasets (many people have their own datasets and don't work on the UCR/UEA archive), but I try to keep the default values as close as possible to the ones in the original publication.
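To illustrate the difference between the two strategy values mentioned above, here is a small NumPy-only sketch of how the bin edges differ. It deliberately avoids the pyts API; the function `bin_edges` is a hypothetical helper, and only the naming (uniform for equi-width, quantile for equi-depth) follows the description above.

```python
import numpy as np


def bin_edges(values, n_bins, strategy):
    """Interior bin edges for 'uniform' (equi-width) or 'quantile' (equi-depth) binning."""
    values = np.asarray(values, dtype=float)
    if strategy == "uniform":
        # Bins of equal width between the minimum and the maximum.
        edges = np.linspace(values.min(), values.max(), n_bins + 1)
    elif strategy == "quantile":
        # Bins containing (approximately) the same number of values.
        edges = np.quantile(values, np.linspace(0, 1, n_bins + 1))
    else:
        raise ValueError("strategy must be 'uniform' or 'quantile'")
    return edges[1:-1]


values = np.array([0.0, 1.0, 2.0, 3.0, 100.0, 101.0])
uniform_symbols = np.digitize(values, bin_edges(values, 2, "uniform"))
quantile_symbols = np.digitize(values, bin_edges(values, 2, "quantile"))
```

With two bins (an alphabet of size 2), the uniform edge sits halfway between the minimum and the maximum, while the quantile edge sits at the median, so skewed data is split very differently by the two strategies.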

I'm very interested in adding WEASEL 2.0 to pyts, so I will further look into your code and also start working on this on my own, and we'll see what we get!


TonyBagnall commented on May 26, 2024

FYI, WEASEL 2.0 is in aeon, and we have run it to check the results:
https://github.com/aeon-toolkit/aeon/blob/main/aeon/classification/dictionary_based/_weasel_v2.py
