tsai

Home Page: https://timeseriesai.github.io/tsai/

License: Apache License 2.0


Description

State-of-the-art Deep Learning library for Time Series and Sequences.

tsai is an open-source deep learning package built on top of Pytorch & fastai, focused on state-of-the-art techniques for time series tasks like classification, regression, forecasting, and imputation…

tsai is currently under active development by timeseriesAI.

What’s new:

Here are some of the most significant additions to tsai over the last few releases:

  • New models: PatchTST (Accepted by ICLR 2023), RNN with Attention (RNNAttention, LSTMAttention, GRUAttention), TabFusionTransformer, …
  • New datasets: we have increased the number of datasets you can download using tsai:
    • 128 univariate classification datasets
    • 30 multivariate classification datasets
    • 15 regression datasets
    • 62 forecasting datasets
    • 9 long term forecasting datasets
  • New tutorials: PatchTST. Based on some of your requests, we are planning to release additional tutorials on data preparation and forecasting.
  • New functionality: sklearn-style pipeline transforms, walk-forward cross-validation, reduced RAM requirements, and many other features for more accurate time series forecasts.
  • Pytorch 2.0 support.

Installation

Pip install

You can install the latest stable version from pip using:

pip install tsai

If you plan to develop tsai yourself, or want to be on the cutting edge, you can use an editable install. First install PyTorch, and then:

git clone https://github.com/timeseriesAI/tsai
pip install -e "tsai[dev]"

Note: starting with tsai 0.3.0, only hard dependencies are installed by default. Soft dependencies, which are required only for selected tasks, are not installed (this is the recommended approach: if you need a dependency that is not installed, tsai will ask you to install it when necessary). If you still want to install tsai with all its dependencies, you can do so by running:

pip install tsai[extras]

Conda install

You can also install tsai using conda (note that if you replace conda with mamba the install process will be much faster and more reliable):

conda install -c timeseriesai tsai

Documentation

The documentation is available at https://timeseriesai.github.io/tsai/.

Available models:

tsai includes many state-of-the-art models, such as InceptionTimePlus, TSTPlus, PatchTST, the ROCKET family (ROCKET, MiniRocket), and the RNN attention models (RNNAttention, LSTMAttention, GRUAttention), plus other custom models like TransformerModel, TabFusionTransformer, …

How to start using tsai?

To get to know the tsai package, we’d suggest you start with this notebook in Google Colab: 01_Intro_to_Time_Series_Classification. It provides an overview of a time series classification task.

We have also developed many other tutorial notebooks.

To use tsai in your own notebooks, the only thing you need to do after you have installed the package is to run this:

from tsai.all import *
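To check that everything is set up correctly, a quick sanity check (my_setup is a small tsai utility that prints your environment; if your installed version doesn't expose it, printing tsai.__version__ works too):

from tsai.all import *

my_setup()  # prints library versions and basic hardware info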

Examples

These are just a few examples of how you can use tsai:

Binary, univariate classification

Training:

from tsai.basics import *

X, y, splits = get_classification_data('ECG200', split_data=False)
tfms = [None, TSClassification()]
batch_tfms = TSStandardize()
clf = TSClassifier(X, y, splits=splits, path='models', arch="InceptionTimePlus", tfms=tfms, batch_tfms=batch_tfms, metrics=accuracy, cbs=ShowGraph())
clf.fit_one_cycle(100, 3e-4)
clf.export("clf.pkl") 

Inference:

from tsai.inference import load_learner

clf = load_learner("models/clf.pkl")
probas, target, preds = clf.get_X_preds(X[splits[1]], y[splits[1]])

Multi-class, multivariate classification

Training:

from tsai.basics import *

X, y, splits = get_classification_data('LSST', split_data=False)
tfms = [None, TSClassification()]
batch_tfms = TSStandardize(by_sample=True)
mv_clf = TSClassifier(X, y, splits=splits, path='models', arch="InceptionTimePlus", tfms=tfms, batch_tfms=batch_tfms, metrics=accuracy, cbs=ShowGraph())
mv_clf.fit_one_cycle(10, 1e-2)
mv_clf.export("mv_clf.pkl")

Inference:

from tsai.inference import load_learner

mv_clf = load_learner("models/mv_clf.pkl")
probas, target, preds = mv_clf.get_X_preds(X[splits[1]], y[splits[1]])

Multivariate Regression

Training:

from tsai.basics import *

X, y, splits = get_regression_data('AppliancesEnergy', split_data=False)
tfms = [None, TSRegression()]
batch_tfms = TSStandardize(by_sample=True)
reg = TSRegressor(X, y, splits=splits, path='models', arch="TSTPlus", tfms=tfms, batch_tfms=batch_tfms, metrics=rmse, cbs=ShowGraph(), verbose=True)
reg.fit_one_cycle(100, 3e-4)
reg.export("reg.pkl")

Inference:

from tsai.inference import load_learner

reg = load_learner("models/reg.pkl")
raw_preds, target, preds = reg.get_X_preds(X[splits[1]], y[splits[1]])

The ROCKETs (RocketClassifier, RocketRegressor, MiniRocketClassifier, MiniRocketRegressor, MiniRocketVotingClassifier and MiniRocketVotingRegressor) are somewhat different models. They are not actually deep learning models (although they use convolutions) and are used in a different way.

⚠️ You’ll also need to install sktime to be able to use them. You can install it separately:

pip install sktime

or use:

pip install tsai[extras]

Training:

from sklearn.metrics import mean_squared_error, make_scorer
from tsai.data.external import get_Monash_regression_data
from tsai.models.MINIROCKET import MiniRocketRegressor

X_train, y_train, *_ = get_Monash_regression_data('AppliancesEnergy')
rmse_scorer = make_scorer(mean_squared_error, greater_is_better=False)
reg = MiniRocketRegressor(scoring=rmse_scorer)
reg.fit(X_train, y_train)
reg.save('MiniRocketRegressor')

Inference:

from sklearn.metrics import mean_squared_error
from tsai.data.external import get_Monash_regression_data
from tsai.models.MINIROCKET import load_minirocket

*_, X_test, y_test = get_Monash_regression_data('AppliancesEnergy')
reg = load_minirocket('MiniRocketRegressor')
y_pred = reg.predict(X_test)
mean_squared_error(y_test, y_pred, squared=False)

Forecasting

You can use tsai for forecasting in the following scenarios:

  • univariate or multivariate time series input
  • univariate or multivariate time series output
  • single or multi-step ahead

You’ll need to:

  • prepare X (the time series input) and the target y (see documentation)
  • select PatchTST or one of tsai’s models ending in Plus (TSTPlus, InceptionTimePlus, TSiTPlus, etc.)

The model will auto-configure a head to yield an output with the same shape as the target input y.

Single step

Training:

from tsai.basics import *

ts = get_forecasting_time_series("Sunspots").values
X, y = SlidingWindow(60, horizon=1)(ts)
splits = TimeSplitter(235)(y) 
tfms = [None, TSForecasting()]
batch_tfms = TSStandardize()
fcst = TSForecaster(X, y, splits=splits, path='models', tfms=tfms, batch_tfms=batch_tfms, bs=512, arch="TSTPlus", metrics=mae, cbs=ShowGraph())
fcst.fit_one_cycle(50, 1e-3)
fcst.export("fcst.pkl")

Inference:

from tsai.inference import load_learner

fcst = load_learner("models/fcst.pkl", cpu=False)
raw_preds, target, preds = fcst.get_X_preds(X[splits[1]], y[splits[1]])
raw_preds.shape
# torch.Size([235, 1])

Multi-step

This example shows how to build a 3-step-ahead univariate forecast.

Training:

from tsai.basics import *

ts = get_forecasting_time_series("Sunspots").values
X, y = SlidingWindow(60, horizon=3)(ts)
splits = TimeSplitter(235, fcst_horizon=3)(y) 
tfms = [None, TSForecasting()]
batch_tfms = TSStandardize()
fcst = TSForecaster(X, y, splits=splits, path='models', tfms=tfms, batch_tfms=batch_tfms, bs=512, arch="TSTPlus", metrics=mae, cbs=ShowGraph())
fcst.fit_one_cycle(50, 1e-3)
fcst.export("fcst.pkl")

Inference:

from tsai.inference import load_learner
fcst = load_learner("models/fcst.pkl", cpu=False)
raw_preds, target, preds = fcst.get_X_preds(X[splits[1]], y[splits[1]])
raw_preds.shape
# torch.Size([235, 3])

Input data format

The input format for all time series models and image models in tsai is the same. An np.ndarray (or array-like object like zarr, etc) with 3 dimensions:

[# samples x # variables x sequence length]
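For instance, a minimal sketch of synthetic data in this format (all shapes and values are illustrative):

import numpy as np

X = np.random.randn(100, 3, 60).astype(np.float32)  # 100 samples, 3 variables, sequence length 60
y = np.random.randint(0, 2, 100)                    # one target per sample
# X.shape -> (100, 3, 60), i.e. [# samples x # variables x sequence length]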

The input format for tabular models in tsai (like TabModel, TabTransformer and TabFusionTransformer) is a pandas dataframe. See example.

How to contribute to tsai?

We welcome contributions of all kinds: enhancements, bug fixes, documentation, tutorial notebooks, …

We have created a guide to help you start contributing to tsai. You can read it here.

Enterprise support and consulting services:

Want to make the most out of timeseriesAI/tsai in a professional setting? Let us help. Send us an email to learn more: [email protected]

Citing tsai

If you use tsai in your research please use the following BibTeX entry:

@Misc{tsai,
    author =       {Ignacio Oguiza},
    title =        {tsai - A state-of-the-art deep learning library for time series and sequential data},
    howpublished = {Github},
    year =         {2023},
    url =          {https://github.com/timeseriesAI/tsai}
}

tsai's People

Contributors

cversek, davnn, dependabot[bot], deven-gqc, deven367, dnth, filipj8, geoheil, imilas, jeffhcross, jmp75, ksachdeva, mvccn, oguiza, radi-cho, rkainkaryam, samlloydig, talesa, tcapelle, vrodriguezf, williamsdoug, yangtzech


tsai's Issues

Can't use GPU runtime on Google Colab

Hi,

Thanks for making this wonderful package.
There is one small problem I encountered. I was trying to run 06_TS_to_image_classification.ipynb.
It runs well on a CPU runtime, but when I switch to a GPU runtime it gives TypeError: expected CPU (got CUDA).
I tried manually calling .to(device) or .cuda(), but neither worked.
Would love to hear about your suggestions.

Error with the 'feature' column when creating a databunch

Hi,

I found a weird error when creating the databunch following the steps of the first notebook:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2656             try:
-> 2657                 return self._engine.get_loc(key)
   2658             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'feature'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
4 frames
<ipython-input-45-c74cce7b4c94> in <module>()
      3 target_colname = 'target' # *
      4 valid_pct = 0.2 # *
----> 5 db = (TimeSeriesList.from_df(df = df, path = '.', cols=df.columns.values[3:], feat='feature')
      6       .split_by_rand_pct(valid_pct= valid_pct, seed=seed)
      7       .label_from_df(cols=target_colname, label_cls=CategoryList)

/content/timeseriesAI/fastai_timeseries/exp/nb_TSBasicData.py in from_df(cls, df, path, cols, feat, processor, **kwargs)
    204         assert inputs.isna().sum().sum(
    205         ) == 0, f"You have NaN values in column(s) {cols} of your dataframe, please fix it."
--> 206         inputs = df2array(inputs, feat)
    207         res = cls(
    208             items=inputs,

/content/timeseriesAI/fastai_timeseries/exp/nb_TSBasicData.py in df2array(df, feat)
    297     if feat is None:
    298         return df.values[:, None]
--> 299     for i, ch in enumerate(df[feat].unique()):
    300         data_i = df[df[feat] == ch].values[:, None]
    301         if i == 0: data = data_i

/usr/local/lib/python3.6/dist-packages/pandas/core/frame.py in __getitem__(self, key)
   2925             if self.columns.nlevels > 1:
   2926                 return self._getitem_multilevel(key)
-> 2927             indexer = self.columns.get_loc(key)
   2928             if is_integer(indexer):
   2929                 indexer = [indexer]

/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2657                 return self._engine.get_loc(key)
   2658             except KeyError:
-> 2659                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2660         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2661         if indexer.ndim > 1 or indexer.size > 1:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'feature'

Best!

Key error when building the dataset

I was building my own dataset according to the tutorial but encountered the following error:

(26994, 27, 100) (26994,) ((#22495) [0,1,2,3,4,5,6,7,8,9...], (#4499) [22495,22496,22497,22498,22499,22500,22501,22502,22503,22504...])
Traceback (most recent call last):
  File "train2.py", line 16, in <module>
    dsets = TSDatasets(X, y, tfms=tfms, splits=splits, inplace=True)
  File "/home/wqiu/.cache/pypoetry/virtualenvs/dzts-kaskTdXw-py3.7/src/tsai/tsai/data/core.py", line 159, in __init__
    self.ptls = L([tl if not self.inplace else tl[:] if type(tl[0]).__name__ == 'memmap' else tensor(stack(tl[:])) for tl in self.tls])
  File "/home/wqiu/.cache/pypoetry/virtualenvs/dzts-kaskTdXw-py3.7/src/tsai/tsai/data/core.py", line 159, in <listcomp>
    self.ptls = L([tl if not self.inplace else tl[:] if type(tl[0]).__name__ == 'memmap' else tensor(stack(tl[:])) for tl in self.tls])
  File "/home/wqiu/.cache/pypoetry/virtualenvs/dzts-kaskTdXw-py3.7/lib/python3.7/site-packages/fastai2/data/core.py", line 266, in __getitem__
    return self._after_item(res) if is_indexer(idx) else res.map(self._after_item)
  File "/home/wqiu/.cache/pypoetry/virtualenvs/dzts-kaskTdXw-py3.7/lib/python3.7/site-packages/fastcore/foundation.py", line 377, in map
    return self._new(map(g, self))
  File "/home/wqiu/.cache/pypoetry/virtualenvs/dzts-kaskTdXw-py3.7/lib/python3.7/site-packages/fastcore/foundation.py", line 327, in _new
    def _new(self, items, *args, **kwargs): return type(self)(items, *args, use_list=None, **kwargs)
  File "/home/wqiu/.cache/pypoetry/virtualenvs/dzts-kaskTdXw-py3.7/lib/python3.7/site-packages/fastcore/foundation.py", line 47, in __call__
    res = super().__call__(*((x,) + args), **kwargs)
  File "/home/wqiu/.cache/pypoetry/virtualenvs/dzts-kaskTdXw-py3.7/lib/python3.7/site-packages/fastcore/foundation.py", line 318, in __init__
    items = list(items) if use_list else _listify(items)
  File "/home/wqiu/.cache/pypoetry/virtualenvs/dzts-kaskTdXw-py3.7/lib/python3.7/site-packages/fastcore/foundation.py", line 254, in _listify
    if is_iter(o): return list(o)
  File "/home/wqiu/.cache/pypoetry/virtualenvs/dzts-kaskTdXw-py3.7/lib/python3.7/site-packages/fastcore/foundation.py", line 220, in __call__
    return self.fn(*fargs, **kwargs)
  File "/home/wqiu/.cache/pypoetry/virtualenvs/dzts-kaskTdXw-py3.7/lib/python3.7/site-packages/fastai2/data/core.py", line 229, in _after_item
    def _after_item(self, o): return self.tfms(o)
  File "/home/wqiu/.cache/pypoetry/virtualenvs/dzts-kaskTdXw-py3.7/lib/python3.7/site-packages/fastcore/transform.py", line 187, in __call__
    def __call__(self, o): return compose_tfms(o, tfms=self.fs, split_idx=self.split_idx)
  File "/home/wqiu/.cache/pypoetry/virtualenvs/dzts-kaskTdXw-py3.7/lib/python3.7/site-packages/fastcore/transform.py", line 140, in compose_tfms
    x = f(x, **kwargs)
  File "/home/wqiu/.cache/pypoetry/virtualenvs/dzts-kaskTdXw-py3.7/lib/python3.7/site-packages/fastcore/transform.py", line 72, in __call__
    def __call__(self, x, **kwargs): return self._call('encodes', x, **kwargs)
  File "/home/wqiu/.cache/pypoetry/virtualenvs/dzts-kaskTdXw-py3.7/lib/python3.7/site-packages/fastcore/transform.py", line 82, in _call
    return self._do_call(getattr(self, fn), x, **kwargs)
  File "/home/wqiu/.cache/pypoetry/virtualenvs/dzts-kaskTdXw-py3.7/lib/python3.7/site-packages/fastcore/transform.py", line 86, in _do_call
    return x if f is None else retain_type(f(x, **kwargs), x, f.returns_none(x))
  File "/home/wqiu/.cache/pypoetry/virtualenvs/dzts-kaskTdXw-py3.7/lib/python3.7/site-packages/fastcore/dispatch.py", line 98, in __call__
    return f(*args, **kwargs)
  File "/home/wqiu/.cache/pypoetry/virtualenvs/dzts-kaskTdXw-py3.7/lib/python3.7/site-packages/fastai2/data/transforms.py", line 228, in encodes
    def encodes(self, o): return TensorCategory(self.vocab.o2i[o])
KeyError: -0.040000000000000036

0.2.14 error

The following code works on version 0.2.11 but fails on version 0.2.14, even though nothing in the code has changed:

X, y, splits = combine_split_data([X_train, X_test], [y_train, y_test])
tfms = [None, [Categorize()]]
batch_tfms = TSStandardize(by_var=True)
dls = get_ts_dls(X, y, tfms=tfms, splits=splits, bs=64, batch_tfms=batch_tfms)
learn = ts_learner(dls, TST, loss_func=LabelSmoothingCrossEntropyFlat(), res_dropout=0.3, fc_dropout=0.9, metrics=[RocAucBinary(), accuracy], cbs=ShowGraph())
start = time.time()
learn.load('md1')
probas, targets, preds = learn.get_X_preds(X, with_decoded=True)

The error:

File "f:\Anaconda3\lib\site-packages\tsai\learner.py", line 216, in get_X_preds
return self.get_preds(dl=self.dls.new_dl(X, y=y), **kwargs)

File "f:\Anaconda3\lib\site-packages\fastai\learner.py", line 250, in get_preds
if reorder and hasattr(dl, 'get_idxs'): res = nested_reorder(res, tensor(idxs).argsort())

File "f:\Anaconda3\lib\site-packages\fastai\torch_core.py", line 697, in nested_reorder
elif is_listy(t): return type(t)(nested_reorder(t_, idxs) for t_ in t)

File "f:\Anaconda3\lib\site-packages\fastai\torch_core.py", line 697, in
elif is_listy(t): return type(t)(nested_reorder(t_, idxs) for t_ in t)

File "f:\Anaconda3\lib\site-packages\fastai\torch_core.py", line 696, in nested_reorder
if isinstance(t, (Tensor,L)): return t[idxs]

IndexError: index 124 is out of bounds for dimension 0 with size 124

CUDA cannot allocate memory error

I am trying to use timeseriesAI with my own dataset (1D sequences), and learn.fit_one_cycle() gives a "Cannot allocate memory" error.

A couple of questions:

  1. Is the timeseriesAI databunch (using TSList) memory heavy? My np array is only about 150M, but if I load the databunch with the suggested approach, almost 6-7 GB of RAM is taken away.

  2. Is there anything I can do?

best regards,
~anoop

Scaling validation and test using the train statistics

Hi,

I was taking a look at the ROCKET notebook and I've seen that the validation and test sets are normalized using their own mean/std instead of the statistics from the training set. I am not an expert on this, but I remember from the fastai course that they recommended using only the statistics from the training set to avoid bias.

Is this done on purpose in this repo?

Thanks!
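For reference, a minimal numpy sketch of the practice described in this question (computing per-variable statistics on the training split only and reusing them for validation/test; all array names and shapes are illustrative):

import numpy as np

X_train = np.random.randn(100, 3, 60)  # [# samples x # variables x sequence length]
X_valid = np.random.randn(20, 3, 60)

mean = X_train.mean(axis=(0, 2), keepdims=True)       # per-variable stats from train only
std = X_train.std(axis=(0, 2), keepdims=True) + 1e-8  # epsilon avoids division by zero
X_train_norm = (X_train - mean) / std
X_valid_norm = (X_valid - mean) / std                 # reuse the train statistics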

How to create/generate a file like "NATOPS_TRAIN.arff"

Thank you for the detailed introduction! However, I am not sure how to create/generate a file like "NATOPS_TRAIN.arff". I know how to convert my data (in csv) to a file like "NATOPSDimension1_Train.arff", which contains only one dimension. How can I combine the different dimensions together? Thank you!
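One way to think about combining dimensions, independently of the .arff format: stack the per-dimension arrays into the 3-D layout tsai expects (a minimal sketch with illustrative shapes; see the Input data format section below):

import numpy as np

dim_1 = np.random.randn(180, 51)  # one array per dimension: [# samples x sequence length]
dim_2 = np.random.randn(180, 51)
dim_3 = np.random.randn(180, 51)
X = np.stack([dim_1, dim_2, dim_3], axis=1)  # -> [# samples x # variables x sequence length]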

when installing tsai : ModuleNotFoundError: No module named 'sktime.utils.load_data'

I am getting an error when installing tsai in colab:
stable = True # True: latest version, False: stable version

import sys
ISCOLAB = 'google.colab' in sys.modules
if ISCOLAB:
    if stable:
        !pip install tsai -q
    else:
        !pip install git+https://github.com/timeseriesAI/tsai.git -q

import tsai
from tsai.all import *
print('tsai     :', tsai.__version__)
print('fastai   :', fastai.__version__)
print('fastcore :', fastcore.__version__)
print('torch    :', torch.__version__)

The error is:

ModuleNotFoundError: No module named 'sktime.utils.load_data'

Any help?

assertion error of minibatch length for regression objective

When trying to perform a regression using:

window_length = 120
stride = None
get_x = ['foo', 'bar', 'baz']
get_y = 'y_float'
horizon = 8

X, y = SlidingWindowPanel(window_length, ['id_1', 'id_2'], stride, get_x=get_x, get_y=get_y, horizon=horizon,
                          seq_first=True, sort_by=['hour'], ascending=True, check_leakage=True,
                          return_key=False, verbose=True)(df)

splits = get_splits(y, valid_size=.2, stratify=True, random_state=47, shuffle=False)
# tfms = [None, [Categorize()]]  << classification works just fine
tfms = [None, [ToFloat(), ToNumpyTensor()]]
dsets = TSDatasets(X, y, tfms=tfms, splits=splits)

the code fails with:

AssertionError: ==:
2048
16384

However, a classification task with:

%%time

splits = get_splits(y, valid_size=.2, stratify=True, random_state=47, shuffle=False)
tfms  = [None, [Categorize()]]
dsets = TSDatasets(X, y, tfms=tfms, splits=splits)
dsets

and an integer class label works just fine.

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-9-a480b725f69a> in <module>
     18                    )#.to_fp16()
     19     start = time.time()
---> 20     learn.fit_one_cycle(300, 1e-4)
     21     elapsed = time.time() - start
     22     vals = learn.recorder.values[-1]

~/development/conda_envs/my_env/lib/python3.8/site-packages/fastai/callback/schedule.py in fit_one_cycle(self, n_epoch, lr_max, div, div_final, pct_start, wd, moms, cbs, reset_opt)
    110     scheds = {'lr': combined_cos(pct_start, lr_max/div, lr_max, lr_max/div_final),
    111               'mom': combined_cos(pct_start, *(self.moms if moms is None else moms))}
--> 112     self.fit(n_epoch, cbs=ParamScheduler(scheds)+L(cbs), reset_opt=reset_opt, wd=wd)
    113 
    114 # Cell

~/development/conda_envs/my_env/lib/python3.8/site-packages/fastai/learner.py in fit(self, n_epoch, lr, wd, cbs, reset_opt)
    204             self.opt.set_hypers(lr=self.lr if lr is None else lr)
    205             self.n_epoch = n_epoch
--> 206             self._with_events(self._do_fit, 'fit', CancelFitException, self._end_cleanup)
    207 
    208     def _end_cleanup(self): self.dl,self.xb,self.yb,self.pred,self.loss = None,(None,),(None,),None,None

~/development/conda_envs/my_env/lib/python3.8/site-packages/fastai/learner.py in _with_events(self, f, event_type, ex, final)
    153 
    154     def _with_events(self, f, event_type, ex, final=noop):
--> 155         try:       self(f'before_{event_type}')       ;f()
    156         except ex: self(f'after_cancel_{event_type}')
    157         finally:   self(f'after_{event_type}')        ;final()

~/development/conda_envs/my_env/lib/python3.8/site-packages/fastai/learner.py in _do_fit(self)
    195         for epoch in range(self.n_epoch):
    196             self.epoch=epoch
--> 197             self._with_events(self._do_epoch, 'epoch', CancelEpochException)
    198 
    199     def fit(self, n_epoch, lr=None, wd=None, cbs=None, reset_opt=False):

~/development/conda_envs/my_env/lib/python3.8/site-packages/fastai/learner.py in _with_events(self, f, event_type, ex, final)
    153 
    154     def _with_events(self, f, event_type, ex, final=noop):
--> 155         try:       self(f'before_{event_type}')       ;f()
    156         except ex: self(f'after_cancel_{event_type}')
    157         finally:   self(f'after_{event_type}')        ;final()

~/development/conda_envs/my_env/lib/python3.8/site-packages/fastai/learner.py in _do_epoch(self)
    190     def _do_epoch(self):
    191         self._do_epoch_train()
--> 192         self._do_epoch_validate()
    193 
    194     def _do_fit(self):

~/development/conda_envs/my_env/lib/python3.8/site-packages/fastai/learner.py in _do_epoch_validate(self, ds_idx, dl)
    186         if dl is None: dl = self.dls[ds_idx]
    187         self.dl = dl
--> 188         with torch.no_grad(): self._with_events(self.all_batches, 'validate', CancelValidException)
    189 
    190     def _do_epoch(self):

~/development/conda_envs/my_env/lib/python3.8/site-packages/fastai/learner.py in _with_events(self, f, event_type, ex, final)
    153 
    154     def _with_events(self, f, event_type, ex, final=noop):
--> 155         try:       self(f'before_{event_type}')       ;f()
    156         except ex: self(f'after_cancel_{event_type}')
    157         finally:   self(f'after_{event_type}')        ;final()

~/development/conda_envs/my_env/lib/python3.8/site-packages/fastai/learner.py in all_batches(self)
    159     def all_batches(self):
    160         self.n_iter = len(self.dl)
--> 161         for o in enumerate(self.dl): self.one_batch(*o)
    162 
    163     def _do_one_batch(self):

~/development/conda_envs/my_env/lib/python3.8/site-packages/fastai/learner.py in one_batch(self, i, b)
    177         self.iter = i
    178         self._split(b)
--> 179         self._with_events(self._do_one_batch, 'batch', CancelBatchException)
    180 
    181     def _do_epoch_train(self):

~/development/conda_envs/my_env/lib/python3.8/site-packages/fastai/learner.py in _with_events(self, f, event_type, ex, final)
    155         try:       self(f'before_{event_type}')       ;f()
    156         except ex: self(f'after_cancel_{event_type}')
--> 157         finally:   self(f'after_{event_type}')        ;final()
    158 
    159     def all_batches(self):

~/development/conda_envs/my_env/lib/python3.8/site-packages/fastai/learner.py in __call__(self, event_name)
    131     def ordered_cbs(self, event): return [cb for cb in sort_by_run(self.cbs) if hasattr(cb, event)]
    132 
--> 133     def __call__(self, event_name): L(event_name).map(self._call_one)
    134 
    135     def _call_one(self, event_name):

~/development/conda_envs/my_env/lib/python3.8/site-packages/fastcore/foundation.py in map(self, f, gen, *args, **kwargs)
    152     def range(cls, a, b=None, step=None): return cls(range_of(a, b=b, step=step))
    153 
--> 154     def map(self, f, *args, gen=False, **kwargs): return self._new(map_ex(self, f, *args, gen=gen, **kwargs))
    155     def argwhere(self, f, negate=False, **kwargs): return self._new(argwhere(self, f, negate, **kwargs))
    156     def filter(self, f=noop, negate=False, gen=False, **kwargs):

~/development/conda_envs/my_env/lib/python3.8/site-packages/fastcore/basics.py in map_ex(iterable, f, gen, *args, **kwargs)
    654     res = map(g, iterable)
    655     if gen: return res
--> 656     return list(res)
    657 
    658 # Cell

~/development/conda_envs/my_env/lib/python3.8/site-packages/fastcore/basics.py in __call__(self, *args, **kwargs)
    644             if isinstance(v,_Arg): kwargs[k] = args.pop(v.i)
    645         fargs = [args[x.i] if isinstance(x, _Arg) else x for x in self.pargs] + args[self.maxi+1:]
--> 646         return self.func(*fargs, **kwargs)
    647 
    648 # Cell

~/development/conda_envs/my_env/lib/python3.8/site-packages/fastai/learner.py in _call_one(self, event_name)
    135     def _call_one(self, event_name):
    136         assert hasattr(event, event_name), event_name
--> 137         [cb(event_name) for cb in sort_by_run(self.cbs)]
    138 
    139     def _bn_bias_state(self, with_bias): return norm_bias_params(self.model, with_bias).map(self.opt.state)

~/development/conda_envs/my_env/lib/python3.8/site-packages/fastai/learner.py in <listcomp>(.0)
    135     def _call_one(self, event_name):
    136         assert hasattr(event, event_name), event_name
--> 137         [cb(event_name) for cb in sort_by_run(self.cbs)]
    138 
    139     def _bn_bias_state(self, with_bias): return norm_bias_params(self.model, with_bias).map(self.opt.state)

~/development/conda_envs/my_env/lib/python3.8/site-packages/fastai/callback/core.py in __call__(self, event_name)
     42                (self.run_valid and not getattr(self, 'training', False)))
     43         res = None
---> 44         if self.run and _run: res = getattr(self, event_name, noop)()
     45         if event_name=='after_fit': self.run=True #Reset self.run to True at each end of fit
     46         return res

~/development/conda_envs/my_env/lib/python3.8/site-packages/fastai/learner.py in after_batch(self)
    456         if len(self.yb) == 0: return
    457         mets = self._train_mets if self.training else self._valid_mets
--> 458         for met in mets: met.accumulate(self.learn)
    459         if not self.training: return
    460         self.lrs.append(self.opt.hypers[-1]['lr'])

~/development/conda_envs/my_env/lib/python3.8/site-packages/fastai/learner.py in accumulate(self, learn)
    378     def accumulate(self, learn):
    379         bs = find_bs(learn.yb)
--> 380         self.total += learn.to_detach(self.func(learn.pred, *learn.yb))*bs
    381         self.count += bs
    382     @property

~/development/conda_envs/my_env/lib/python3.8/site-packages/fastai/metrics.py in accuracy(inp, targ, axis)
     99 def accuracy(inp, targ, axis=-1):
    100     "Compute accuracy with `targ` when `pred` is bs * n_classes"
--> 101     pred,targ = flatten_check(inp.argmax(dim=axis), targ)
    102     return (pred == targ).float().mean()
    103 

~/development/conda_envs/my_env/lib/python3.8/site-packages/fastai/torch_core.py in flatten_check(inp, targ)
    781     "Check that `out` and `targ` have the same number of elements and flatten them."
    782     inp,targ = inp.contiguous().view(-1),targ.contiguous().view(-1)
--> 783     test_eq(len(inp), len(targ))
    784     return inp,targ

~/development/conda_envs/my_env/lib/python3.8/site-packages/fastcore/test.py in test_eq(a, b)
     33 def test_eq(a,b):
     34     "`test` that `a==b`"
---> 35     test(a,b,equals, '==')
     36 
     37 # Cell

~/development/conda_envs/my_env/lib/python3.8/site-packages/fastcore/test.py in test(a, b, cmp, cname)
     23     "`assert` that `cmp(a,b)`; display inputs and `cname or cmp.__name__` if it fails"
     24     if cname is None: cname=cmp.__name__
---> 25     assert cmp(a,b),f"{cname}:\n{a}\n{b}"
     26 
     27 # Cell

AssertionError: ==:
2048
16384

regression task memory error

classification is working fine for me. However, after 50 iterations the data loader runs out of memory on a 1D regression task:

RuntimeError: DataLoader worker (pid 140809) is killed by signal: Bus error. It is possible that dataloader's workers are out of shared memory. Please try to raise your shared memory limit.

I will try to further reduce the batch size, but so far this has not helped.

Interestingly, neither the GPU nor the host memory looks occupied at all. Nowhere near their limits.

CUDA out of memory error with ROCKET

I've been trying to use the ROCKET method to generate the random features for a much larger dataset, and I've been getting a CUDA out of memory error. Is there any way to apply the transformation in batches, rather than loading the entire dataset into CUDA?
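A hedged sketch of one way to do this: chunk the input and move only one batch to the GPU at a time. Here transform_fn stands for whatever GPU transform is being applied (e.g. a fitted ROCKET feature extractor called on a batch of tensors); it is a placeholder, not a tsai API:

import torch

def transform_in_batches(X, transform_fn, batch_size=256):
    feats = []
    for i in range(0, len(X), batch_size):
        xb = torch.as_tensor(X[i:i + batch_size]).cuda()  # move one chunk at a time
        feats.append(transform_fn(xb).cpu())              # bring features back to CPU
        del xb
        torch.cuda.empty_cache()                          # release GPU memory between chunks
    return torch.cat(feats)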

[getting started] data loading of custom data

I am just getting started with deep learning and directly with deep learning on time-series. As a result, I am still rather confused about how to get started loading custom data. I hope here is the right place to ask for help.

I have data of the following shape:

(screenshot of a JupyterLab dataframe omitted)

I.e. many (long) multivariate time series from IoT devices.

Following along with https://colab.research.google.com/github/timeseriesAI/tsai/blob/master/tutorial_nbs/00c_Time_Series_data_preparation.ipynb#scrollTo=txlPio00mFa0 I want to create a sliding window to be able to use tsAI. However, I need to generate the sliding window for each device - otherwise, it will not make much sense.

The two main questions are:

  1. how to efficiently apply this for each device? I.e. device_id?
  2. how to ensure that horizon is set correctly i.e. the y is binary, i.e. 0 if there is no anomaly and 1 if there is an anomaly in the window

To generate an example time-series with example labels:

import pandas as pd
import numpy as np

import random
random_seed = 47
np.random.seed(random_seed)
random.seed(random_seed)

def generate_df_for_device(n_observations, n_metrics, device_id, geo_id, topology_id, cohort_id):
        df = pd.DataFrame(np.random.randn(n_observations,n_metrics), index=pd.date_range('2020', freq='H', periods=n_observations))
        df.columns = [f'metrik_{c}' for c in df.columns]
        df['geospatial_id'] = geo_id
        df['topology_id'] = topology_id
        df['cohort_id'] = cohort_id
        df['device_id'] = device_id
        return df
    
def generate_multi_device(n_observations, n_metrics, n_devices, cohort_levels, topo_levels):
    results = []
    for i in range(1, n_devices +1):
        r = random.randrange(1, n_devices)
        cohort = random.randrange(1, cohort_levels)
        topo = random.randrange(1, topo_levels)
        df_single_dvice = generate_df_for_device(n_observations, n_metrics, i, r, topo, cohort)

        ################### generating anomaly label START
        df_single_dvice = df_single_dvice.reset_index().rename(columns={'index':'hour'}).sort_values(['hour'])
        anomalies = df_single_dvice.sample(frac=0.003).index.values
        #print(len(anomalies))
        for anon in anomalies:
            a = 4
            mask = df_single_dvice.iloc[anon - 2 : anon + a].index
            df_single_dvice.loc[mask, 'is_anomaly'] = 1
        if len(anomalies) == 0:
            df_single_dvice['is_anomaly'] = 0
        else:
            df_single_dvice.is_anomaly = df_single_dvice.is_anomaly.fillna(0)
        df_single_dvice = df_single_dvice.set_index(['hour'])
        ################### generating anomaly label END
        results.append(df_single_dvice)
    return pd.concat(results)

# hourly data, 1 week of data
n_observations = 14 * 24
n_metrics = 3
n_devices = 20
cohort_levels = 3
topo_levels = 5

df = generate_multi_device(n_observations, n_metrics, n_devices, cohort_levels, topo_levels)
df = df.sort_index()
df = df.reset_index()
print(df.is_anomaly.value_counts(normalize=True))
df.head()

I have a second dataset called labels which holds details when an anomaly for a specific device occurred but:

  • they are not complete. Not for each true anomaly, a label was created
  • they are not always correct. Sometimes labels are wrong
  • their time boundaries might be off. This is why we extend the start boundary of the label by 8 to 32 hours into the past
  • lastly but for this minimal example most importantly, the labels were generated randomly (not associated with the data (for simplicity). This is not the case for the real labels

Then:


import tsai
from tsai.all import *
print('tsai       :', tsai.__version__)
print('fastai     :', fastai.__version__)
print('fastcore   :', fastcore.__version__)
print('torch      :', torch.__version__)

# window_length is usually selected based on prior domain knowledge or by trial and error
window_length = 48                               # 48 hours (for each device)

# None for non-overlapping (stride = window_length) (default = 1). This depends on how often you want to predict once the model is trained
stride = 24                                      # I guess 24? i.e. I want to have an overlapping time window of  24 hours for the 48-hour sliding window               

start = 0                                        # use all data since the first time stamp (default = 0)

# None mean X comes from all columns except the target. If you only need specific cols pass their names in a list
get_x = ['metrik_0', 'metrik_1', 'metrik_2']     # omitting the others for now. Possibly utilize them later via preprocessing re-weighting via i.e. CLR (centered log-ratio)               

get_y = 'is_anomaly'                             # In multivariate time series, you must indicate which is/are the y columns

# !!!!
# 0 means y is taken from the last timestamp of the time sequence (default = 0)
horizon = 0  # probably it is best to start with the simplified assumption: if there is any `is_anomaly == 1` value in the whole window, it is classified as an anomaly. Therefore, horizon=0 is wrong - probably need a custom function?
# !!!!

seq_first = True
                            
X, y = SlidingWindow(window_length, stride=stride, start=start, get_x=get_x,  get_y=get_y, horizon=horizon, seq_first=seq_first)(df)
splits = get_splits(y, valid_size=.2, stratify=True, random_state=23, shuffle=False)
tfms  = [None, [Categorize()]]
dsets = TSDatasets(X, y, tfms=tfms, splits=splits)
dsets

This is giving some output - but as far as I understand it is not the one I want. In particular, my two requirements are not satisfied:

  1. how to efficiently apply this for each device? I.e. device_id?
  2. how to ensure that horizon is set correctly i.e. the y is binary, i.e. 0 if there is no anomaly and 1 if there is an anomaly in the window

How can I change the data loading code to accommodate my needs?

Edit:

For the second requirement, it looks like something along the lines of:

y = np.array([0,1,1,0]) # this is a dummy y
np.sum(stack(y).squeeze().sum() > 0)

replacing the line:

y = stack(y).squeeze() if y != [] else None

would get the job done. But I am stuck on (1) and also not sure whether this approach to (2) is correct for the tsai library.
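For what it's worth, a minimal sketch of requirement (2), deriving a window-level binary label from per-timestep labels (y_windows and its shape are illustrative):

import numpy as np

y_windows = np.random.randint(0, 2, (10, 48))  # per-timestep is_anomaly flags for 10 windows of length 48
y = (y_windows.max(axis=-1) > 0).astype(int)   # 1 if any anomaly occurs anywhere in the window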

Loading train, validation and test data correctly

Hi, I have 3 separate pandas dataframes with the train, validation and test time series. How do I correctly load them into the dataloaders for training?
My code is below:

X_train, y_train = SlidingWindow(window_length, get_x=columns[:-1], get_y='Ah')(train_df)
X_valid, y_valid = SlidingWindow(window_length, get_x=columns[:-1], get_y='Ah')(valid_df)
X_test, y_test = SlidingWindow(window_length, get_x=columns[:-1], get_y='Ah')(test_df)

train_dsets = TSDatasets(X_train, y_train, tfms=tfms)
valid_dsets = TSDatasets(X_valid, y_valid, tfms=tfms)
test_dsets = TSDatasets(X_test, y_test, tfms=tfms)

dls = TSDataLoaders.from_dsets(train_dsets, valid_dsets, test_dsets, bs=[128, 128, 128])

Is this the right way?
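An alternative, hedged sketch using combine_split_data (a tsai helper that also appears elsewhere in this document; it merges pre-split arrays into a single dataset plus split indices). It reuses X_train, y_train, X_valid, y_valid and tfms from the snippet above:

X, y, splits = combine_split_data([X_train, X_valid], [y_train, y_valid])
dsets = TSDatasets(X, y, tfms=tfms, splits=splits)
dls = TSDataLoaders.from_dsets(dsets.train, dsets.valid, bs=[128, 128])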

Is the predicted result erratic?

I use a TST model to predict on new data, but the results seem to be inconsistent every time I run it, and the length of the new data slice also seems to affect the prediction results. Is there any way I can make the predictions deterministic? This is very important to me. Thank you.
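A common first step (a generic PyTorch sketch, not tsai-specific) is to fix the random seeds before building the dataloaders and model; this makes runs more reproducible, though it does not guarantee bit-identical results on all hardware:

import random
import numpy as np
import torch

seed = 42
random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)
torch.backends.cudnn.deterministic = True  # trade some speed for reproducibility
torch.backends.cudnn.benchmark = False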

Suppress progress bar on predict/get_preds

First of all, thanks for supplying this useful repository, which enables everyone to work with implementations of cutting-edge time series concepts.
I am using a TST model to make live predictions in a bokeh animation displayed in a Jupyter notebook. I create the test data for every prediction with a SlidingWindowPanel, where the progress bar can be disabled by setting verbose=False, and then get a prediction for the generated test data with get_preds. The get_preds method does not seem to have an option for setting verbosity.
The problem that arises is that even though the progress bar of get_preds disappears after it is done, it creates a new blank row each time it is called, leading to an endless and growing white space under the animation.
I tried suppressing the progress bar by redirecting or suppressing stdout, without success. I read about a utils function from fastai to suppress all fastprogress bars, but it seems this feature was removed in fastai 2.x.

Am I missing an option to set the verbosity of get_preds to False? If not, do you have any idea how I could manually suppress the progress bar(s)?
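One possible workaround (a hedged sketch: no_bar is a fastai v2 Learner context manager that temporarily disables fastprogress bars, assuming your fastai version provides it; X_test is illustrative):

with learn.no_bar():
    probas, targets, preds = learn.get_X_preds(X_test)  # no progress bar is rendered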

batch size question

Hi, I am looking at TSDataLoaders and wondering what the role of the validation batch size is. I am used to setting the training batch size, which defines the number of updates (1 per batch) in each epoch. But in the past, for example with OOB error, the validation set was either computed all at once or sampled. Here's what I am doing. Or possibly I am misinterpreting the two elements fed into bs=... as being training and validation.

dls = TSDataLoaders.from_dsets(
    ds.train, ds.valid, bs=[train_bs, valid_bs], shuffle_train=False
)

thanks for your work.

fast preprocessing

I love your new SlidingWindowPanel, but as you can easily imagine, for even a moderate number of devices the current implementation using Python loops and numpy (i.e. single-threaded) is rather slow.

What do you think about adding an n_jobs parameter similar to scikit-learn's, which would use the CPU to auto-parallelize (perhaps via dask)? A sketch of this idea follows.
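A hedged sketch of the kind of parallelization suggested here, using joblib to window each device independently (it assumes the df, window_length, get_x and get_y defined in the earlier snippets, and concatenates the per-device results):

import numpy as np
from joblib import Parallel, delayed
from tsai.all import SlidingWindow

def window_one_device(g):
    # one sliding-window pass per device group
    return SlidingWindow(window_length, get_x=get_x, get_y=get_y)(g)

results = Parallel(n_jobs=-1)(delayed(window_one_device)(g) for _, g in df.groupby('device_id'))
X = np.concatenate([r[0] for r in results])
y = np.concatenate([r[1] for r in results])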

core.py defined in root and in data

The file core.py exists both in the root directory and in the data folder of the tsai package. However, this is only the case when the package is installed from pip (version 0.1.0), not in the GitHub repository.

There are differences between, e.g., the TSTensor classes defined in the two core.py files.

Is there any reason for having the two versions?

Default value of pct_start in nb01

Hi,

I see that the default value of pct_start is .7 in the first notebook of this great repository, but the one in fastai's fit_one_cycle is .3.

What is the reason for that? Is it something specific to the dataset of the notebook, or something that has been found empirically for TSC over the UCR datasets?

Best!

Shared Memory Error running tutorial notebook 00b_How_to_use_numpy_arrays_in_fastai

Tutorial notebook fails in cell 38 with the error: RuntimeError: DataLoader worker (pid 23402) is killed by signal: Bus error. It is possible that dataloader's workers are out of shared memory. Please try to raise your shared memory limit. The above exception was the direct cause of the following exception.

Is there a recommended shared memory configuration for the current release of tsai?

Gist containing notebook with error output: https://gist.github.com/williamsdoug/dc3fd738457496aa45f34aadc0479616

tsai : 0.2.14
fastai : 2.2.5
fastcore : 1.3.19
torch : 1.7.1
cudatoolkit 10.2

OS: Ubuntu 20.04.2 LTS (Focal Fossa)
GPU: NVIDIA RTX2080ti (driver version: 460.32.03)

Loading custom model class in ts_learner

Hi! Does anyone know how to load a custom model with ts_learner? For example, I have a custom model class as follows:

class MySuperCustomModel(Module):
    def __init__(self, c_in, c_out, nf=32, nb_filters=None, **kwargs):
        nf = ifnone(nf, nb_filters) # for compatibility
        self.inceptionblock = InceptionBlock(c_in, nf, **kwargs)
        self.gap = GAP1d(1)
        self.fc = nn.Linear(nf * 4, c_out)

    def forward(self, x):
        x = self.inceptionblock(x)
        x = self.gap(x)
        x = self.fc(x)
        return x.clamp(max=1.0)

Can I get MySuperCustomModel to work with ts_learner?
The code below doesn't work:

learn = ts_learner(dls, MySuperCustomModel, pretrained=True, weights_path=f'data/TSBERT/unsupervised_weights.pth')

Thanks!
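For reference, a hedged and generic PyTorch sketch of loading pretrained weights into a custom model (this is not tsai's built-in pretrained mechanism; the path comes from the question and the checkpoint layout is an assumption):

import torch

learn = ts_learner(dls, MySuperCustomModel)  # build the learner without pretrained flags
state = torch.load('data/TSBERT/unsupervised_weights.pth', map_location='cpu')
state = state.get('model', state) if isinstance(state, dict) else state  # unwrap if needed (assumption)
learn.model.load_state_dict(state, strict=False)  # strict=False tolerates head mismatches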

Requirements not properly installed

Hi! I am trying to use your library.

I first tried installing from git, and when launching a Jupyter notebook with from tsai.all import * I got some missing packages.

I then installed from pip; same thing. Here they are (the list is NOT exhaustive, because I already have fastai and other libraries, so some requirements were already installed):

pywavelet
imbalanced-learn
numba

[for now]

Thank you

TSBERT not working with lr_find

When running the TSBERT notebook I tried adding a lr_find. lr_find ran and the output looked reasonable, but the subsequent fit_one_cycle gave the error below.

learn = ts_learner(udls100, InceptionTimePlus, cbs=[ShowGraph(), TSBERT(target_dir='./data/TSBERT', fname=f'{dsid}')])
lr_min, lr_steep = learn.lr_find()
learn.fit_one_cycle(200, 1e-2)
ValueError                                Traceback (most recent call last)
<ipython-input-8-91d50d80d902> in <module>()
      2 learn = ts_learner(udls100, InceptionTimePlus, cbs=[ShowGraph(), TSBERT(target_dir='./data/TSBERT', fname=f'{dsid}')])
      3 lr_min, lr_steep = learn.lr_find()
----> 4 learn.fit_one_cycle(200, 1e-2)

11 frames
/usr/local/lib/python3.6/dist-packages/fastai/callback/schedule.py in fit_one_cycle(self, n_epoch, lr_max, div, div_final, pct_start, wd, moms, cbs, reset_opt)
    110     scheds = {'lr': combined_cos(pct_start, lr_max/div, lr_max, lr_max/div_final),
    111               'mom': combined_cos(pct_start, *(self.moms if moms is None else moms))}
--> 112     self.fit(n_epoch, cbs=ParamScheduler(scheds)+L(cbs), reset_opt=reset_opt, wd=wd)
    113 
    114 # Cell

/usr/local/lib/python3.6/dist-packages/fastai/learner.py in fit(self, n_epoch, lr, wd, cbs, reset_opt)
    209             self.opt.set_hypers(lr=self.lr if lr is None else lr)
    210             self.n_epoch = n_epoch
--> 211             self._with_events(self._do_fit, 'fit', CancelFitException, self._end_cleanup)
    212 
    213     def _end_cleanup(self): self.dl,self.xb,self.yb,self.pred,self.loss = None,(None,),(None,),None,None

/usr/local/lib/python3.6/dist-packages/fastai/learner.py in _with_events(self, f, event_type, ex, final)
    158 
    159     def _with_events(self, f, event_type, ex, final=noop):
--> 160         try: self(f'before_{event_type}');  f()
    161         except ex: self(f'after_cancel_{event_type}')
    162         self(f'after_{event_type}');  final()

/usr/local/lib/python3.6/dist-packages/fastai/learner.py in _do_fit(self)
    200         for epoch in range(self.n_epoch):
    201             self.epoch=epoch
--> 202             self._with_events(self._do_epoch, 'epoch', CancelEpochException)
    203 
    204     def fit(self, n_epoch, lr=None, wd=None, cbs=None, reset_opt=False):

/usr/local/lib/python3.6/dist-packages/fastai/learner.py in _with_events(self, f, event_type, ex, final)
    160         try: self(f'before_{event_type}');  f()
    161         except ex: self(f'after_cancel_{event_type}')
--> 162         self(f'after_{event_type}');  final()
    163 
    164     def all_batches(self):

/usr/local/lib/python3.6/dist-packages/fastai/learner.py in __call__(self, event_name)
    139 
    140     def ordered_cbs(self, event): return [cb for cb in self.cbs.sorted('order') if hasattr(cb, event)]
--> 141     def __call__(self, event_name): L(event_name).map(self._call_one)
    142 
    143     def _call_one(self, event_name):

/usr/local/lib/python3.6/dist-packages/fastcore/foundation.py in map(self, f, gen, *args, **kwargs)
    152     def range(cls, a, b=None, step=None): return cls(range_of(a, b=b, step=step))
    153 
--> 154     def map(self, f, *args, gen=False, **kwargs): return self._new(map_ex(self, f, *args, gen=gen, **kwargs))
    155     def argwhere(self, f, negate=False, **kwargs): return self._new(argwhere(self, f, negate, **kwargs))
    156     def filter(self, f=noop, negate=False, gen=False, **kwargs):

/usr/local/lib/python3.6/dist-packages/fastcore/basics.py in map_ex(iterable, f, gen, *args, **kwargs)
    664     res = map(g, iterable)
    665     if gen: return res
--> 666     return list(res)
    667 
    668 # Cell

/usr/local/lib/python3.6/dist-packages/fastcore/basics.py in __call__(self, *args, **kwargs)
    649             if isinstance(v,_Arg): kwargs[k] = args.pop(v.i)
    650         fargs = [args[x.i] if isinstance(x, _Arg) else x for x in self.pargs] + args[self.maxi+1:]
--> 651         return self.func(*fargs, **kwargs)
    652 
    653 # Cell

/usr/local/lib/python3.6/dist-packages/fastai/learner.py in _call_one(self, event_name)
    143     def _call_one(self, event_name):
    144         if not hasattr(event, event_name): raise Exception(f'missing {event_name}')
--> 145         for cb in self.cbs.sorted('order'): cb(event_name)
    146 
    147     def _bn_bias_state(self, with_bias): return norm_bias_params(self.model, with_bias).map(self.opt.state)

/usr/local/lib/python3.6/dist-packages/fastai/callback/core.py in __call__(self, event_name)
     42                (self.run_valid and not getattr(self, 'training', False)))
     43         res = None
---> 44         if self.run and _run: res = getattr(self, event_name, noop)()
     45         if event_name=='after_fit': self.run=True #Reset self.run to True at each end of fit
     46         return res

/usr/local/lib/python3.6/dist-packages/tsai/callback/core.py in after_epoch(self)
     69         val_losses = [v[1] for v in rec.values]
     70         x_bounds = (0, (self.n_epoch - len(self.nb_batches)) * self.nb_batches[0] + len(rec.losses))
---> 71         y_min = min((min(rec.losses), min(val_losses)))
     72         y_max = max((max(rec.losses), max(val_losses)))
     73         margin = (y_max - y_min) * .05

ValueError: min() arg is an empty sequence

On TSDatasets getter RecursionError: maximum recursion depth exceeded

Here's my dataset setup:

X_feature = 'Close'
y_feature = 'Close'

# y_offset=1 means X(t) :-> y=X(t+1), y_offset=2 means X(t) :-> y=X(t+2), ..
# 
y_offset, window_length, horizon = 1,1,1
tfms  = [None, [ToFloat(), ToNumpyTensor()]]
train_bs, valid_bs = 64, 128

training_end_ts = datetime(days.iloc[-2]['tsYear'],days.iloc[-2]['tsMonth'],days.iloc[-2]['tsDay'], 23, 59, 0) # less than or equal to this date
reqd_cols = set([X_feature, y_feature, 'isNewDay',])
end_day_val = 0. # TODO: Experiment with different values
df = data_1m[reqd_cols]

print('Training end timestamp', training_end_ts)
noise_reduction_threshold = 3 #decimals
first_daily_obs = list(reversed(np.where(df['isNewDay'])[0]))
df = insert_end_of_day_rows(
    df=df, 
    first_daily_obs=first_daily_obs,
    end_day_val=end_day_val
)

del df['isNewDay']
cond = (df.index <= training_end_ts)        
X = df[X_feature].values
y = shift(input=df[y_feature].values, shift=y_offset, cval=np.NaN)

train_idx = np.where( cond)[0]
valid_idx = np.where(~cond)[0]

splits = (list(train_idx),list(valid_idx))
print('training:\t', len(train_idx), '\ntesting:\t', len(valid_idx))


X1, y1 = SlidingWindow(window_length, horizon=horizon)(X)
splits1 = (L(splits[0]), L(splits[1][:-horizon]))
ds = TSDatasets(X=X1, y=y1, tfms=tfms, splits=splits1)
dls = TSDataLoaders.from_dsets(
    ds.train, ds.valid, bs=[train_bs, valid_bs], shuffle_train=False
)
ds[0]

returns

---------------------------------------------------------------------------
RecursionError                            Traceback (most recent call last)
/opt/conda/lib/python3.6/site-packages/IPython/core/formatters.py in __call__(self, obj)
    700                 type_pprinters=self.type_printers,
    701                 deferred_pprinters=self.deferred_printers)
--> 702             printer.pretty(obj)
    703             printer.flush()
    704             return stream.getvalue()

/opt/conda/lib/python3.6/site-packages/IPython/lib/pretty.py in pretty(self, obj)
    386                 if cls in self.type_pprinters:
    387                     # printer registered in self.type_pprinters
--> 388                     return self.type_pprinters[cls](obj, self, cycle)
    389                 else:
    390                     # deferred printer

/opt/conda/lib/python3.6/site-packages/IPython/lib/pretty.py in inner(obj, p, cycle)
    564                 p.text(',')
    565                 p.breakable()
--> 566             p.pretty(x)
    567         if len(obj) == 1 and type(obj) is tuple:
    568             # Special case for 1-item tuples.

/opt/conda/lib/python3.6/site-packages/IPython/lib/pretty.py in pretty(self, obj)
    403                         if cls is not object \
    404                                 and callable(cls.__dict__.get('__repr__')):
--> 405                             return _repr_pprint(obj, self, cycle)
    406 
    407             return _default_pprint(obj, self, cycle)

/opt/conda/lib/python3.6/site-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle)
    693     """A pprint that just redirects to the normal repr function."""
    694     # Find newlines and replace them with p.break_()
--> 695     output = repr(obj)
    696     lines = output.splitlines()
    697     with p.group():

/ws/forks/tsai/tsai/data/core.py in __repr__(self)
     65 
     66     def __repr__(self):
---> 67         if self.numel() == 1: return f'{self}'
     68         elif self.ndim >= 3:
     69             return f'TSTensor(samples:{self.shape[-3]}, vars:{self.shape[-2]}, len:{self.shape[-1]})'

/opt/conda/lib/python3.6/site-packages/torch/tensor.py in __format__(self, format_spec)
    530         from torch.overrides import has_torch_function, handle_torch_function
    531         if type(self) is not Tensor and has_torch_function(relevant_args):
--> 532             return handle_torch_function(Tensor.__format__, relevant_args, self, format_spec)
    533         if self.dim() == 0:
    534             return self.item().__format__(format_spec)

/opt/conda/lib/python3.6/site-packages/torch/overrides.py in handle_torch_function(public_api, relevant_args, *args, **kwargs)
   1058         # Use `public_api` instead of `implementation` so __torch_function__
   1059         # implementations can do equality/identity comparisons.
-> 1060         result = overloaded_arg.__torch_function__(public_api, types, args, kwargs)
   1061 
   1062         if result is not NotImplemented:

/opt/conda/lib/python3.6/site-packages/fastai/torch_core.py in __torch_function__(self, func, types, args, kwargs)
    317 #         if func.__name__[0]!='_': print(func, types, args, kwargs)
    318 #         with torch._C.DisableTorchFunction(): ret = _convert(func(*args, **(kwargs or {})), self.__class__)
--> 319         ret = super().__torch_function__(func, types, args=args, kwargs=kwargs)
    320         if isinstance(ret, TensorBase): ret.set_meta(self, as_copy=True)
    321         return ret

/opt/conda/lib/python3.6/site-packages/torch/tensor.py in __torch_function__(cls, func, types, args, kwargs)
    993 
    994         with _C.DisableTorchFunction():
--> 995             ret = func(*args, **kwargs)
    996             return _convert(ret, cls)
    997 

/opt/conda/lib/python3.6/site-packages/torch/tensor.py in __format__(self, format_spec)
    533         if self.dim() == 0:
    534             return self.item().__format__(format_spec)
--> 535         return object.__format__(self, format_spec)
    536 
    537     def __ipow__(self, other):  # type: ignore[misc]

/ws/forks/tsai/tsai/data/core.py in __repr__(self)
     65 
     66     def __repr__(self):
---> 67         if self.numel() == 1: return f'{self}'
     68         elif self.ndim >= 3:
     69             return f'TSTensor(samples:{self.shape[-3]}, vars:{self.shape[-2]}, len:{self.shape[-1]})'

... last 2 frames repeated, from the frame below ...

/opt/conda/lib/python3.6/site-packages/torch/tensor.py in __format__(self, format_spec)
    533         if self.dim() == 0:
    534             return self.item().__format__(format_spec)
--> 535         return object.__format__(self, format_spec)
    536 
    537     def __ipow__(self, other):  # type: ignore[misc]

RecursionError: maximum recursion depth exceeded

This dataset creation works, in the sense that I am able to train a model successfully (at least it appears normal compared to other experiences I have had in the training phase), but when I try to access the elements of the TSDatasets object, I get a recursion error. When I load the test datasets, such as NATOPS from the tutorials, I have no problem.

freeze() doesn't set requires_grad to False

While playing around with the TSBert notebook, I noticed that the model's parameters don't seem to get frozen when loading pretrained weights. Even if you call freeze() on the Learner, the parameters do NOT get frozen! I used the count_parameters() method (which, as I checked, just counts the parameters with requires_grad==True) to confirm this behavior.

Code sample:

dsid = 'LSST'
X, y, splits = get_UCR_data(dsid, split_data=False)

tfms = [None, TSClassification()]
batch_tfms = [TSStandardize(by_sample=True)]
dls = get_ts_dls(X, y, splits=splits, tfms=tfms, batch_tfms=batch_tfms)

learn = ts_learner(dls, InceptionTimePlus, fc_dropout=.1, metrics=accuracy)

learn.freeze()
print("Trainable parameters frozen:\t", count_parameters(learn.model))  # outputs 457614
learn.unfreeze()
print("Trainable parameters unfrozen:\t", count_parameters(learn.model))  # outputs 457614
for p in learn.model.parameters():
    p.requires_grad = False
print("All parameters frozen manually:\t", count_parameters(learn.model))  # outputs 0

Numba error on Rocket notebook 5

As also noted in this other user's post to the forum https://forums.fast.ai/t/time-series-sequential-data-study-group/29686/588, I'm also seeing the numba error when attempting to execute the Rocket notebook.

I've tried numba versions 0.45.1, 0.46 and 0.47, all with the same result.
The error seems to blame this line: candidate_lengths = np.array(([7, 9, 11])), but when I execute that in its own nb cell I don't get an error.

Offending nb line and error:

kernels = generate_kernels(seq_len, 10000)

NotImplementedError Traceback (most recent call last)
~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/targets/base.py in get_constant_generic(self, builder, ty, val)
498 try:
--> 499 impl = self._get_constants.find((ty,))
500 return impl(self, builder, ty, val)

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/targets/base.py in find(self, sig)
49 if out is None:
---> 50 out = self._find(sig)
51 self._cache[sig] = out

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/targets/base.py in _find(self, sig)
58 else:
---> 59 raise NotImplementedError(self, sig)
60

NotImplementedError: (<numba.targets.base.OverloadSelector object at 0x7f84b4955890>, (reflected list(int64),))

During handling of the above exception, another exception occurred:

NotImplementedError Traceback (most recent call last)
~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/errors.py in new_error_context(fmt_, *args, **kwargs)
661 try:
--> 662 yield
663 except NumbaError as e:

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/lowering.py in lower_block(self, block)
257 loc=self.loc, errcls_=defaulterrcls):
--> 258 self.lower_inst(inst)
259

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/lowering.py in lower_inst(self, inst)
300 ty = self.typeof(inst.target.name)
--> 301 val = self.lower_assign(ty, inst)
302 self.storevar(val, inst.target.name)

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/lowering.py in lower_assign(self, ty, inst)
476 const = self.context.get_constant_generic(self.builder, valty,
--> 477 pyval)
478 # cast it to the variable type

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/targets/base.py in get_constant_generic(self, builder, ty, val)
501 except NotImplementedError:
--> 502 raise NotImplementedError("Cannot lower constant of type '%s'" % (ty,))
503

NotImplementedError: Cannot lower constant of type 'reflected list(int64)'

During handling of the above exception, another exception occurred:

LoweringError Traceback (most recent call last)
in
----> 1 kernels = generate_kernels(seq_len, 10000)

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/dispatcher.py in _compile_for_args(self, *args, **kws)
393 e.patch_message(''.join(e.args) + help_msg)
394 # ignore the FULL_TRACEBACKS config, this needs reporting!
--> 395 raise e
396
397 def inspect_llvm(self, signature=None):

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/dispatcher.py in _compile_for_args(self, *args, **kws)
350 argtypes.append(self.typeof_pyval(a))
351 try:
--> 352 return self.compile(tuple(argtypes))
353 except errors.TypingError as e:
354 # Intercept typing error that may be due to an argument

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/compiler_lock.py in _acquire_compile_lock(*args, **kwargs)
30 def _acquire_compile_lock(*args, **kwargs):
31 with self:
---> 32 return func(*args, **kwargs)
33 return _acquire_compile_lock
34

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/dispatcher.py in compile(self, sig)
691
692 self._cache_misses[sig] += 1
--> 693 cres = self._compiler.compile(args, return_type)
694 self.add_overload(cres)
695 self._cache.save_overload(sig, cres)

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/dispatcher.py in compile(self, args, return_type)
74
75 def compile(self, args, return_type):
---> 76 status, retval = self._compile_cached(args, return_type)
77 if status:
78 return retval

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/dispatcher.py in _compile_cached(self, args, return_type)
88
89 try:
---> 90 retval = self._compile_core(args, return_type)
91 except errors.TypingError as e:
92 self._failed_cache[key] = e

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/dispatcher.py in _compile_core(self, args, return_type)
106 args=args, return_type=return_type,
107 flags=flags, locals=self.locals,
--> 108 pipeline_class=self.pipeline_class)
109 # Check typing error if object mode is used
110 if cres.typing_error is not None and not flags.enable_pyobject:

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/compiler.py in compile_extra(typingctx, targetctx, func, args, return_type, flags, locals, library, pipeline_class)
970 pipeline = pipeline_class(typingctx, targetctx, library,
971 args, return_type, flags, locals)
--> 972 return pipeline.compile_extra(func)
973
974

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/compiler.py in compile_extra(self, func)
388 self.lifted = ()
389 self.lifted_from = None
--> 390 return self._compile_bytecode()
391
392 def compile_ir(self, func_ir, lifted=(), lifted_from=None):

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/compiler.py in _compile_bytecode(self)
901 """
902 assert self.func_ir is None
--> 903 return self._compile_core()
904
905 def _compile_ir(self):

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/compiler.py in _compile_core(self)
888 self.define_pipelines(pm)
889 pm.finalize()
--> 890 res = pm.run(self.status)
891 if res is not None:
892 # Early pipeline completion

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/compiler_lock.py in _acquire_compile_lock(*args, **kwargs)
30 def _acquire_compile_lock(*args, **kwargs):
31 with self:
---> 32 return func(*args, **kwargs)
33 return _acquire_compile_lock
34

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/compiler.py in run(self, status)
264 # No more fallback pipelines?
265 if is_final_pipeline:
--> 266 raise patched_exception
267 # Go to next fallback pipeline
268 else:

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/compiler.py in run(self, status)
255 try:
256 event("-- %s" % stage_name)
--> 257 stage()
258 except _EarlyPipelineCompletion as e:
259 return e.result

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/compiler.py in stage_nopython_backend(self)
762 """
763 lowerfn = self.backend_nopython_mode
--> 764 self._backend(lowerfn, objectmode=False)
765
766 def stage_compile_interp_mode(self):

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/compiler.py in _backend(self, lowerfn, objectmode)
701 self.library.enable_object_caching()
702
--> 703 lowered = lowerfn()
704 signature = typing.signature(self.return_type, *self.args)
705 self.cr = compile_result(

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/compiler.py in backend_nopython_mode(self)
688 self.calltypes,
689 self.flags,
--> 690 self.metadata)
691
692 def _backend(self, lowerfn, objectmode):

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/compiler.py in native_lowering_stage(targetctx, library, interp, typemap, restype, calltypes, flags, metadata)
1141 lower = lowering.Lower(targetctx, library, fndesc, interp,
1142 metadata=metadata)
-> 1143 lower.lower()
1144 if not flags.no_cpython_wrapper:
1145 lower.create_cpython_wrapper(flags.release_gil)

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/lowering.py in lower(self)
175 if self.generator_info is None:
176 self.genlower = None
--> 177 self.lower_normal_function(self.fndesc)
178 else:
179 self.genlower = self.GeneratorLower(self)

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/lowering.py in lower_normal_function(self, fndesc)
216 # Init argument values
217 self.extract_function_arguments()
--> 218 entry_block_tail = self.lower_function_body()
219
220 # Close tail of entry block

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/lowering.py in lower_function_body(self)
241 bb = self.blkmap[offset]
242 self.builder.position_at_end(bb)
--> 243 self.lower_block(block)
244
245 self.post_lower()

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/lowering.py in lower_block(self, block)
256 with new_error_context('lowering "{inst}" at {loc}', inst=inst,
257 loc=self.loc, errcls_=defaulterrcls):
--> 258 self.lower_inst(inst)
259
260 def create_cpython_wrapper(self, release_gil=False):

~/anaconda3/envs/fastai/lib/python3.7/contextlib.py in exit(self, type, value, traceback)
128 value = type()
129 try:
--> 130 self.gen.throw(type, value, traceback)
131 except StopIteration as exc:
132 # Suppress StopIteration unless it's the same exception that

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/errors.py in new_error_context(fmt_, *args, **kwargs)
668 from numba import config
669 tb = sys.exc_info()[2] if config.FULL_TRACEBACKS else None
--> 670 six.reraise(type(newerr), newerr, tb)
671
672

~/anaconda3/envs/fastai/lib/python3.7/site-packages/numba/six.py in reraise(tp, value, tb)
657 if value.traceback is not tb:
658 raise value.with_traceback(tb)
--> 659 raise value
660
661 else:

LoweringError: Failed in nopython mode pipeline (step: nopython mode backend)
Cannot lower constant of type 'reflected list(int64)'

File "fastai_timeseries/exp/rocket_functions.py", line 16:
def generate_kernels(input_length, num_kernels, kss=[7, 9, 11], pad=True, dilate=True):
candidate_lengths = np.array((kss))
^

[1] During: lowering "kss = arg(2, name=kss)" at /home/scottcha/src/timeseriesAI/fastai_timeseries/exp/rocket_functions.py (16)

This should not have happened, a problem has occurred in Numba's internals.
You are currently using Numba version 0.45.1.

Please report the error message and traceback, along with a minimal reproducer
at: https://github.com/numba/numba/issues/new

If more help is needed please feel free to speak to the Numba core developers
directly at: https://gitter.im/numba/numba

Thanks in advance for your help in improving Numba!
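For what it's worth, the following workaround compiles for me (a sketch based on my reading of the error, not an official fix): numba's nopython mode apparently cannot lower a Python list default argument (the 'reflected list(int64)'), while a tuple default lowers fine.

import numpy as np
from numba import njit

@njit
def pick_lengths(num_kernels, kss=(7, 9, 11)):  # tuple default instead of a list
    candidate_lengths = np.array(kss)           # a tuple lowers cleanly as a constant
    out = np.empty(num_kernels, dtype=np.int64)
    for i in range(num_kernels):
        out[i] = candidate_lengths[np.random.randint(0, len(kss))]
    return out

lengths = pick_lengths(5)  # compiles without the LoweringError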

Mixed precision training

Do we currently have support for mixed precision training? I seem to get NaN loss when training with learn.to_fp16(). I'm running on an RTX 3090, and I'm not sure whether there's an issue with CUDA 11 support for the new 30-series cards.
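For context, this is roughly how I'm enabling it (a sketch; it assumes a dls built as in the other examples). As far as I know, to_fp16() wraps training in autocast plus dynamic loss scaling, so NaN losses may point to a loss-scale overflow rather than a tsai bug:

learn = ts_learner(dls, InceptionTimePlus, metrics=accuracy).to_fp16()
learn.fit_one_cycle(5, lr_max=1e-3)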

seq_len for multivariate time series

Hi, I am seeing in the first notebook that the attribute seq_len when loading the NATOPS dataset is reported as 52, but the real length of the sequences is 51 according to the UCR archive.

However, for the ChlorineConcentration dataset it is correct. Is this a problem with multivariate datasets? Maybe the feature column is being counted as part of seq_len?
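A quick way to double-check (a sketch; the last axis of the returned array should be the sequence length):

X, y, splits = get_UCR_data('NATOPS', split_data=False)
print(X.shape)  # (samples, vars, seq_len); the UCR archive lists seq_len=51 for NATOPS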

On individual predict

I trained the model with this code:
#----------------------------------

X, y, splits = combine_split_data([X_train, X_test], [y_train, y_test])
tfms = [None, [Categorize()]]
dsets = TSDatasets(X, y, tfms=tfms, splits=splits)
dls = TSDataLoaders.from_dsets(dsets.train, dsets.valid, bs=64, batch_tfms=TSStandardize(by_var=True))
model = TST(dls.vars, dls.c, dls.len, res_dropout=0.3, fc_dropout=0.9)
learn = Learner(dls, model, loss_func=LabelSmoothingCrossEntropyFlat(), metrics=[RocAucBinary(), accuracy], cbs=ShowGraphCallback2())
learn.fit_one_cycle(15, lr_max=1e-3)
learn.save('md1')

#------------------------------------
Then I want to use this model alone to predict new, unlabeled data X2, whose structure is the same as the X used for training. From the examples it seems data can only be added to an existing setup. Can we use it like this, i.e. without splitting into train/test and without a y?

dsets = TSDatasets(X2)
dls = TSDataLoaders.from_dsets(dsets, bs=[64, 128])
model = TST(dls.vars, dls.c, dls.len, res_dropout=0.3, fc_dropout=0.9)
learn = Learner(dls, model, loss_func=LabelSmoothingCrossEntropyFlat(), metrics=[RocAucBinary(), accuracy], cbs=ShowGraphCallback2())
learn.fit_one_cycle(15, lr_max=1e-3)
learn.load('md1')

But it fails at learn.load('md1').
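For reference, the pattern I've seen in the tutorial notebooks reuses the trained learner and the training dataloaders instead of building a new Learner (a sketch; that add_unlabeled is the right tsai call for data without a y is my assumption):

test_ds = dsets.add_unlabeled(X2)   # reuses the training tfms; no y needed
test_dl = dls.valid.new(test_ds)
probas, _ = learn.get_preds(dl=test_dl)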

time series as images: fitting shape for weight

I was following the tutorial on converting time series to images. However, I get this error:

Expected 3-dimensional input for 3-dimensional weight [32, 47, 3], but got 4-dimensional input of size [64, 47, 224, 224] instead

for the following code:

epochs = 10

def construct_learner():
    model = create_model(xresnet1d34, dls=dls)  # , **k
    return Learner(dls, model,
                   metrics=[accuracy, fbeta_score, precision, cohen_kappa, average_precision_score],
                   loss_func=CrossEntropyLossFlat(weight=class_weights))

As I understand it, this is not transfer learning but rather training the network from scratch, so the dimensions should fit. Is this correct?
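My hedged reading of the error (an assumption from the shapes, not a confirmed diagnosis): xresnet1d34 is a 1D CNN that expects [batch, vars, seq_len] input, whereas the image batches are 4-D [batch, channels, height, width], so a 2D image model should match instead, e.g.:

from fastai.vision.all import xresnet34
model = xresnet34(c_in=47, n_out=dls.c)  # 2D CNN; c_in matches the 47 channels in the error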

preprocessing functions for panel data

How can I apply the preprocessing functions to panel-based time series data, i.e. time series of multiple things (devices)?

Should I set the batch size to 1 (painfully slow), or is it necessary to re-code these functions to generate a standardization per panel element, or an image for each one?
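To make the question concrete, this is the kind of per-panel standardization I mean (a sketch with hypothetical column names; each device's series is normalized with its own statistics before windowing, rather than per batch):

import pandas as pd

def standardize_per_panel(df, group_col, feature_cols, eps=1e-8):
    # compute each device's own mean/std and standardize its rows with them
    g = df.groupby(group_col)
    mean = g[feature_cols].transform('mean')
    std = g[feature_cols].transform('std')
    out = df.copy()
    out[feature_cols] = (df[feature_cols] - mean) / (std + eps)
    return out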

How do you make a prediction for an arbitrary timestamp?

Sorry, this is a really noob question. I've gone through tutorial 4 on time series regression and have successfully trained a learner. What I can't figure out is how to make a prediction for a future date. Here's what I tried:

from datetime import datetime
import traceback

epoch = datetime.utcfromtimestamp(0)

def unix_time_seconds(dt):
    return (dt - epoch).total_seconds()

df_json = {"unix_timestamp":{"6":1451606400.0,"1558":1454284800.0,"3110":1456790400.0,"4662":1459468800.0,"6214":1462060800.0,"7766":1464739200.0,"9318":1467331200.0,"10870":1470009600.0,"12422":1472688000.0,"13974":1475280000.0,"15526":1477958400.0,"17078":1480550400.0,"18630":1483228800.0,"20182":1485907200.0,"21734":1488326400.0,"23286":1491004800.0,"24838":1493596800.0,"26390":1496275200.0,"27942":1498867200.0,"29494":1501545600.0,"31046":1504224000.0,"32598":1506816000.0,"34150":1509494400.0,"35702":1512086400.0,"37254":1514764800.0,"38806":1517443200.0,"40358":1519862400.0,"41910":1522540800.0,"43462":1525132800.0,"45014":1527811200.0,"46566":1530403200.0,"48118":1533081600.0,"49670":1535760000.0,"51222":1538352000.0,"52774":1541030400.0,"54326":1543622400.0,"55878":1546300800.0,"57430":1548979200.0,"58982":1551398400.0,"60534":1554076800.0,"62086":1556668800.0,"63638":1559347200.0,"65190":1561939200.0,"66742":1564617600.0,"68294":1567296000.0,"69846":1569888000.0,"71398":1572566400.0,"72950":1575158400.0,"74502":1577836800.0,"76054":1580515200.0,"77606":1583020800.0},"Usage_MWh":{"6":5.34858,"1558":3.78055,"3110":3.4831,"4662":3.74901,"6214":3.02347,"7766":7.63334,"9318":5.62975,"10870":5.51058,"12422":4.36067,"13974":3.29915,"15526":2.76066,"17078":2.94552,"18630":2.7777,"20182":2.76716,"21734":5.78573,"23286":4.8537129444,"24838":3.1271232778,"26390":2.8168842646,"27942":2.8774968882,"29494":2.8774968882,"31046":2.7846744079,"32598":2.8774968882,"34150":2.7846744079,"35702":2.8774968882,"37254":2.8774968882,"38806":2.5990294474,"40358":2.8774968882,"41910":2.5288257571,"43462":2.5724465738,"45014":3.3510199005,"46566":3.060722109,"48118":2.6989527056,"49670":2.5984900474,"51222":3.4421489889,"52774":3.4093083871,"54326":3.5249084516,"55878":3.1468401143,"57430":3.0175142015,"58982":3.3731491579,"60534":3.0313829708,"62086":3.1152347778,"63638":3.2681218106,"65190":3.4173398852,"66742":2.897951582,"68294":3.3056545,"69846":3.1457436,"71398":2.646408469,"72950":2.5245141129,"74502":2.7281552182,"76054":6.5999127071,"77606":7.7869288929}}
df = pd.DataFrame.from_dict(df_json)

window_length = 5
stride = None
horizon = 1
X, y = SlidingWindow(window_length, stride=stride, horizon=horizon, get_x=['unix_timestamp'], get_y='Usage_MWh')(df)
itemify(X, y)

splits = RandomSplitter()(X) 
tfms  = [None, [ToFloat(), ToNumpyTensor()]]
dsets = TSDatasets(X, y, tfms=tfms, splits=splits)
dls   = TSDataLoaders.from_dsets(dsets.train, dsets.valid)

model = InceptionTime(dls.vars, 1)
learn = Learner(dls, model, loss_func=MSELossFlat())

try:
  learn.fit_one_cycle(5)
  # learn.recorder.plot_metrics()

  valid_preds, valid_targets = learn.get_preds(ds_idx=1)
  valid_preds.flatten(), valid_targets.data

  y_pred = valid_targets.data.tolist()

  y_true = dsets.valid.items[1]

  future_date = datetime.strptime('2020-12-01', '%Y-%m-%d')
  seconds = unix_time_seconds(future_date)
  learn.predict(tensor([seconds]))

except Exception:
  print("ouch")
  traceback.print_exc()

I get

AssertionError: Expected an input of type in 
  - <class 'numpy.ndarray'>
 but got <class 'torch.Tensor'>

Sorry again for the really dumb question. It's probably a lack of fastai knowledge. I'll do a pull request afterwards and add the answer to the tutorial notebook.
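My current guess from the AssertionError (hedged, not a confirmed answer): predict expects a numpy array shaped like one training sample, i.e. (n_vars, window_length), rather than a 1-element tensor:

import numpy as np

new_window = np.array(X[-1], copy=True)      # (1, window_length), same shape as a training sample
new_window = np.roll(new_window, -1, axis=-1)
new_window[..., -1] = seconds                # slide the window forward onto the future timestamp
learn.predict(new_window)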

Torch size issue and regression inference

  1. I have prepared my data as per the regression example. During fit_one_cycle, there is a warning: "Using a target size (torch.Size([16])) that is different to the input size (torch.Size([16, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size."

     Is there a way to unsqueeze(-1) the target, so as to give it the extra 1 dimension? I suspect that as a result my model is not training across epochs.

  2. With respect to regression inference (i.e. on unknown y), if I follow the classification example, "test_probs, b = learn.get_preds(dl=test_dl, save_preds=None)", what is meant by test_probs? Also, a second output "b" is returned, which is None; what does that mean? Unless I'm doing regression inference wrongly.
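One hedged way to address the first point (assuming y is a 1-D float array) is to give the targets a trailing unit dimension before building the datasets, so they match the model's [bs, 1] output:

import numpy as np
y = np.asarray(y, dtype=np.float32).reshape(-1, 1)  # [n] -> [n, 1] to match the model output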

Regardless of the above, it is a great library and I really recommend that my peers try it out too!

handling of non equally (too short sized) time series

Your code at https://github.com/timeseriesAI/tsai/blob/master/tsai/data/preparation.py#L83 explicitly enforces a minimum time series length: it fails whenever

window_length + stride + horizon > seq_len

And I think this is a good thing (in general). However, it does not allow you to remove offending time series (i.e. in SlidingWindowPanel some individual devices might not have enough observations).

What do you think about adding a switch to auto-ignore them (i.e. catch the exception and simply remove these devices)? Something like the pre-filter below is what I have in mind.
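Until such a switch exists, a pre-filter along these lines works for me (a sketch; panel_device_id is the grouping column from my setup, and window_length/stride/horizon are the values later passed to SlidingWindowPanel):

min_len = window_length + stride + horizon
sizes = df.groupby('panel_device_id')['panel_device_id'].transform('size')
df = df[sizes >= min_len]  # drop devices too short to yield a single window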

Change imports for fastai2

fastai2 was renamed to fastai as of 21 Aug 2020. Need to update imports accordingly (note: I have submitted an associated pull request as an initial fix).

Import Error in Intro_to_Time_Series_Classification

Running the library imports cell raises the following error:

/usr/local/lib/python3.6/dist-packages/tsai/all.py in <module>()
----> 1 from .imports import *
      2 from .utils import *
      3 from .data.all import *
      4 from .metrics import *
      5 from .learner import *

/usr/local/lib/python3.6/dist-packages/tsai/imports.py in <module>()
      1 import fastai2
      2 from fastai2.imports import *
----> 3 from fastai2.data.all import *
      4 from fastai2.torch_core import *
      5 from fastai2.learner import *

/usr/local/lib/python3.6/dist-packages/fastai2/data/all.py in <module>()
      3 from .load import *
      4 from .external import *
----> 5 from .transforms import *
      6 from .block import *

/usr/local/lib/python3.6/dist-packages/fastai2/data/transforms.py in <module>()
    228 
    229 # Cell
--> 230 class Categorize(DisplayedTransform):
    231     "Reversible transform of category string to `vocab` id"
    232     loss_func,order,store_attrs=CrossEntropyLossFlat(),1,'vocab,add_na'

/usr/local/lib/python3.6/dist-packages/fastai2/data/transforms.py in Categorize()
    230 class Categorize(DisplayedTransform):
    231     "Reversible transform of category string to `vocab` id"
--> 232     loss_func,order,store_attrs=CrossEntropyLossFlat(),1,'vocab,add_na'
    233     def __init__(self, vocab=None, sort=True, add_na=False):
    234         store_attr(self, self.store_attrs+',sort')

/usr/local/lib/python3.6/dist-packages/fastcore/utils.py in _f(*args, **kwargs)
    470         init_args.update(log)
    471         setattr(inst, 'init_args', init_args)
--> 472         return inst if to_return else f(*args, **kwargs)
    473     return _f
    474 

/usr/local/lib/python3.6/dist-packages/fastai2/layers.py in __init__(self, axis, *args, **kwargs)
    304     y_int = True
    305     @use_kwargs_dict(keep=True, weight=None, ignore_index=-100, reduction='mean')
--> 306     def __init__(self, *args, axis=-1, **kwargs): super().__init__(nn.CrossEntropyLoss, *args, axis=axis, **kwargs)
    307     def decodes(self, x):    return x.argmax(dim=self.axis)
    308     def activation(self, x): return F.softmax(x, dim=self.axis)

/usr/local/lib/python3.6/dist-packages/fastcore/utils.py in _f(*args, **kwargs)
    470         init_args.update(log)
    471         setattr(inst, 'init_args', init_args)
--> 472         return inst if to_return else f(*args, **kwargs)
    473     return _f
    474 

/usr/local/lib/python3.6/dist-packages/fastai2/layers.py in __init__(self, loss_cls, axis, flatten, floatify, is_2d, *args, **kwargs)
    279     activation=decodes=noops
    280     def __init__(self, loss_cls, *args, axis=-1, flatten=True, floatify=False, is_2d=True, **kwargs):
--> 281         store_attr(self, "axis,flatten,floatify,is_2d")
    282         self.func = loss_cls(*args,**kwargs)
    283         functools.update_wrapper(self, self.func)

/usr/local/lib/python3.6/dist-packages/fastcore/utils.py in store_attr(names, self, but, **attrs)
     95     args,varargs,keyw,locs = inspect.getargvalues(fr)
     96     if self is None: self = locs[args[0]]
---> 97     if not hasattr(self, '__stored_args__'): self.__stored_args__ = {}
     98     if attrs: return _store_attr(self, **attrs)
     99 

AttributeError: 'str' object has no attribute '__stored_args__'

ROCKET out of memory for create_rocket_features

When trying to compute the ROCKET features, it fails for me with a CUDA out of memory error:

X_train, y_train = create_rocket_features(dls.train, model)
X_valid, y_valid = create_rocket_features(dls.valid, model)
X_train.shape, X_valid.shape

RuntimeError: CUDA out of memory. Tried to allocate 2.46 GiB (GPU 0; 15.90 GiB total capacity; 2.46 GiB already allocated; 2.46 GiB free; 3.41 GiB reserved in total by PyTorch)

for ROCKET on an Nvidia P100.

The data is loaded using:

dls = TSDataLoaders.from_dsets(dsets.train, dsets.valid, bs=[64, 128], batch_tfms=[TSStandardize(by_var=True)], num_workers=0, drop_last=False, shuffle_train=False)

model = create_model(ROCKET, dls=dls, n_kernels=500, kss=[7])  # n_kernels=10_000, kss=[7, 9, 11] by default

I already tried reducing the number of kernels and the kss, but it still fails with out-of-memory, even when reducing the batch size to 8 or 16.

NOTICE: on disk the numpy array is approximately 70GB in size.

Maybe I am creating too many windows? For a dataframe with approx. 50 columns and 14 million records (panel data), 6 GB in size according to pandas, I use:

window_length = 48
get_x = ['x1', ... 'x50']
get_y = 'target'
              
def y_func(o): return (o.sum(axis=1) > 0).astype(int)
X, y = SlidingWindowPanel(window_length, ['panel_device_id'], stride=5, get_x=get_x, get_y=get_y, y_func=y_func,
                          horizon=0, seq_first=True, sort_by=['hour'], ascending=True, check_leakage=True,
                          return_key=False, verbose=True)(df)

to generate sliding windows of length 48 hours with a stride of 5 hours. Perhaps I should decrease the number of windows? But I find it strange that neither the batch size nor the reduction of features helped to solve the problem.
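A chunked alternative that avoids holding all features on the GPU at once (a sketch; it assumes model is the ROCKET module returned by create_model and mirrors what create_rocket_features presumably does in one pass):

import torch

feats, targets = [], []
with torch.no_grad():
    for xb, yb in dls.train:
        feats.append(model(xb).cpu())  # move each batch's features off the GPU
        targets.append(yb.cpu())
X_train = torch.cat(feats).numpy()
y_train = torch.cat(targets).numpy()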

Include FCN and ResCNN in `tsai.models`

Currently, only InceptionTime and ResNet are included in tsai.models; in timeseriesAI, FCN and ResCNN are also integrated (which is great).

It would be great to have them included as well.

I would offer to do this myself and create a PR for those two architectures, if that is convenient for you, @oguiza.

Great package!

On TSDatasets getter, RecursionError: maximum recursion depth exceeded

Here's my dataset setup:

X_feature = 'Close'
y_feature = 'Close'

# y_offset=1 means X(t) :-> y=X(t+1), y_offset=2 means X(t) :-> y=X(t+2), ...
y_offset, window_length, horizon = 1, 1, 1
tfms  = [None, [ToFloat(), ToNumpyTensor()]]
train_bs, valid_bs = 64, 128

training_end_ts = datetime(days.iloc[-2]['tsYear'],days.iloc[-2]['tsMonth'],days.iloc[-2]['tsDay'], 23, 59, 0) # less than or equal to this date
reqd_cols = set([X_feature, y_feature, 'isNewDay',])
end_day_val = 0. # TODO: Experiment with different values
df = data_1m[reqd_cols]

print('Training end timestamp', training_end_ts)
noise_reduction_threshold = 3 #decimals
first_daily_obs = list(reversed(np.where(df['isNewDay'])[0]))
df = insert_end_of_day_rows(
    df=df, 
    first_daily_obs=first_daily_obs,
    end_day_val=end_day_val
)

del df['isNewDay']
cond = (df.index <= training_end_ts)        
X = df[X_feature].values
y = shift(input=df[y_feature].values, shift=y_offset, cval=np.NaN)

train_idx = np.where( cond)[0]
valid_idx = np.where(~cond)[0]

splits = (list(train_idx),list(valid_idx))
print('training:\t', len(train_idx), '\ntesting:\t', len(valid_idx))


X1, y1 = SlidingWindow(window_length, horizon=horizon)(X)
splits1 = (L(splits[0]), L(splits[1][:-horizon]))
ds = TSDatasets(X=X1, y=y1, tfms=tfms, splits=splits1)
dls = TSDataLoaders.from_dsets(
    ds.train, ds.valid, bs=[train_bs, valid_bs], shuffle_train=False
)
ds[0]

returns

---------------------------------------------------------------------------
RecursionError                            Traceback (most recent call last)
/opt/conda/lib/python3.6/site-packages/IPython/core/formatters.py in __call__(self, obj)
    700                 type_pprinters=self.type_printers,
    701                 deferred_pprinters=self.deferred_printers)
--> 702             printer.pretty(obj)
    703             printer.flush()
    704             return stream.getvalue()

/opt/conda/lib/python3.6/site-packages/IPython/lib/pretty.py in pretty(self, obj)
    386                 if cls in self.type_pprinters:
    387                     # printer registered in self.type_pprinters
--> 388                     return self.type_pprinters[cls](obj, self, cycle)
    389                 else:
    390                     # deferred printer

/opt/conda/lib/python3.6/site-packages/IPython/lib/pretty.py in inner(obj, p, cycle)
    564                 p.text(',')
    565                 p.breakable()
--> 566             p.pretty(x)
    567         if len(obj) == 1 and type(obj) is tuple:
    568             # Special case for 1-item tuples.

/opt/conda/lib/python3.6/site-packages/IPython/lib/pretty.py in pretty(self, obj)
    403                         if cls is not object \
    404                                 and callable(cls.__dict__.get('__repr__')):
--> 405                             return _repr_pprint(obj, self, cycle)
    406 
    407             return _default_pprint(obj, self, cycle)

/opt/conda/lib/python3.6/site-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle)
    693     """A pprint that just redirects to the normal repr function."""
    694     # Find newlines and replace them with p.break_()
--> 695     output = repr(obj)
    696     lines = output.splitlines()
    697     with p.group():

/ws/forks/tsai/tsai/data/core.py in __repr__(self)
     65 
     66     def __repr__(self):
---> 67         if self.numel() == 1: return f'{self}'
     68         elif self.ndim >= 3:
     69             return f'TSTensor(samples:{self.shape[-3]}, vars:{self.shape[-2]}, len:{self.shape[-1]})'

/opt/conda/lib/python3.6/site-packages/torch/tensor.py in __format__(self, format_spec)
    530         from torch.overrides import has_torch_function, handle_torch_function
    531         if type(self) is not Tensor and has_torch_function(relevant_args):
--> 532             return handle_torch_function(Tensor.__format__, relevant_args, self, format_spec)
    533         if self.dim() == 0:
    534             return self.item().__format__(format_spec)

/opt/conda/lib/python3.6/site-packages/torch/overrides.py in handle_torch_function(public_api, relevant_args, *args, **kwargs)
   1058         # Use `public_api` instead of `implementation` so __torch_function__
   1059         # implementations can do equality/identity comparisons.
-> 1060         result = overloaded_arg.__torch_function__(public_api, types, args, kwargs)
   1061 
   1062         if result is not NotImplemented:

/opt/conda/lib/python3.6/site-packages/fastai/torch_core.py in __torch_function__(self, func, types, args, kwargs)
    317 #         if func.__name__[0]!='_': print(func, types, args, kwargs)
    318 #         with torch._C.DisableTorchFunction(): ret = _convert(func(*args, **(kwargs or {})), self.__class__)
--> 319         ret = super().__torch_function__(func, types, args=args, kwargs=kwargs)
    320         if isinstance(ret, TensorBase): ret.set_meta(self, as_copy=True)
    321         return ret

/opt/conda/lib/python3.6/site-packages/torch/tensor.py in __torch_function__(cls, func, types, args, kwargs)
    993 
    994         with _C.DisableTorchFunction():
--> 995             ret = func(*args, **kwargs)
    996             return _convert(ret, cls)
    997 

/opt/conda/lib/python3.6/site-packages/torch/tensor.py in __format__(self, format_spec)
    533         if self.dim() == 0:
    534             return self.item().__format__(format_spec)
--> 535         return object.__format__(self, format_spec)
    536 
    537     def __ipow__(self, other):  # type: ignore[misc]

/ws/forks/tsai/tsai/data/core.py in __repr__(self)
     65 
     66     def __repr__(self):
---> 67         if self.numel() == 1: return f'{self}'
     68         elif self.ndim >= 3:
     69             return f'TSTensor(samples:{self.shape[-3]}, vars:{self.shape[-2]}, len:{self.shape[-1]})'

... last 2 frames repeated, from the frame below ...

/opt/conda/lib/python3.6/site-packages/torch/tensor.py in __format__(self, format_spec)
    533         if self.dim() == 0:
    534             return self.item().__format__(format_spec)
--> 535         return object.__format__(self, format_spec)
    536 
    537     def __ipow__(self, other):  # type: ignore[misc]

RecursionError: maximum recursion depth exceeded

This works, but when I try to access the elements of the TSDatasets object, I get a recursion error. When I load the test datasets, such as NATOPS from the tutorials, I have no problem.
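My hedged reading of the traceback: for a 1-element TSTensor, __repr__ returns f'{self}', torch's __format__ then falls back to object.__format__, which calls __repr__ again, hence the infinite recursion. Converting to numpy sidesteps the display path in the meantime:

x, y = ds[0]
print(x.numpy())  # numpy's repr avoids TSTensor.__repr__ entirely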

Failing Tutorial notebooks

I've been using the notebooks in tsai/tutorial_nbs/ as a form of tsai installation verification. For each notebook that fails on my local system, I have reproduced the error on Google Colab in both the stable and unstable configurations. Since a number of notebooks fail, I will create a separate reply to this issue for each failing notebook. Note: none of these failing notebooks are blockers for me, but I thought it would be good to have these errors logged for future debugging purposes.

Colab Configuration - Stable:
tsai : 0.2.14
fastai : 2.2.5
fastcore : 1.3.19
torch : 1.7.0+cu101

Colab Configuration - Unstable (master):
tsai : 0.2.15
fastai : 2.2.5
fastcore : 1.3.19
torch : 1.7.0+cu101

how to name class labels?

Hi,

How would you give names to the class labels in this package? I normally use a dictionary like this to convert indices into names:

lbl_dict = dict([
    ('-1.0', 'class 1'),
    ('0.0',  'class 2'),
    ('1.0',  'class 3'),
    ('2.0',  'class 4')
])

and then I add the function lbl_dict.get as part of the y tfms (see the sketch below). It used to work for me in other fastai pipelines when creating the Datasets object, but using TSDatasets gives me a weird error about retain_type that I haven't managed to understand. What is the easiest way to do this here?
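For reference, this is the kind of pipeline I mean (a sketch of my intent; whether TSDatasets accepts it in this form is exactly my question):

tfms = [None, [lbl_dict.get, Categorize()]]
dsets = TSDatasets(X, y, tfms=tfms, splits=splits)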

Thank you! ;)

Errors in notebook 01 after last commit

Hi,

I've noticed two errors in the first notebook after the last commit:

  1. In the first databunch, the use of normalize should be replaced by scale.
  2. In the function from_df, the feat argument raises an error unless the feature column is part of the cols argument.

Best!

RNN_FCNPlus model TypeError

Commit db9c860 (updated all Plus models to use pretrained weights) introduces an error when using LSTM_FCNPlus (and other models in RNN_FCNPlus.py). The error is:

TypeError: dropout(): argument 'input' (position 1) must be Tensor, not tuple

I've recreated this problem on different OSs, CUDA versions, and pytorch versions, so I don't think version info is necessary. I can provide more information in case this turns out not to be a simple fix.
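A minimal reproduction of what I believe is happening (hedged; inferred from the error text, not from the actual commit): nn.LSTM returns an (output, (h_n, c_n)) tuple, so piping the raw return value into dropout raises exactly this TypeError:

import torch
import torch.nn as nn

rnn = nn.LSTM(input_size=3, hidden_size=8, batch_first=True)
ret = rnn(torch.randn(2, 10, 3))  # tuple: (output, (h_n, c_n))
ok = nn.Dropout(0.5)(ret[0])      # works: dropout on the output Tensor
# nn.Dropout(0.5)(ret)            # raises: dropout(): argument 'input' ... must be Tensor, not tuple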

Thanks for all the work you do!

Error creating multivariate dataloader

Hi, I was trying to follow the tutorial notebook on how to prepare data:

https://github.com/timeseriesAI/tsai/blob/master/tutorial_nbs/00c_Time_Series_data_preparation.ipynb

I opened this in Google Colab, and have tried with and without the stable flag. My versions in Colab from the top cell are:
tsai : 0.2.13
fastai : 2.1.10
fastcore : 1.3.13
torch : 1.7.0+cu101

Under the End-to-End examples / Single multivariate time series section, I can run the first cell fine and see the df. However, when I run the second cell to create the dataloader, I get the following error:

    ---------------------------------------------------------------------------
    AssertionError                            Traceback (most recent call last)
    <ipython-input-3-dbbb5a3104e2> in <module>()
          7 seq_first = True
          8 
    ----> 9 X, y = SlidingWindow(window_length, stride=stride, start=start, get_x=get_x,  get_y=get_y, horizon=horizon, seq_first=seq_first)(df)
         10 splits = get_splits(y, valid_size=.2, stratify=True, random_state=23, shuffle=False)
         11 tfms  = [None, [Categorize()]]
    
    /usr/local/lib/python3.6/dist-packages/tsai/data/preparation.py in SlidingWindow(window_len, stride, start, get_x, get_y, y_func, horizon, seq_first, sort_by, ascending, check_leakage)
         93     if min_horizon <= 0 and y_func is None and get_y != [] and check_leakage:
         94         assert get_x is not None and  get_y is not None and len([y for y in _get_y if y in _get_x]) == 0,  \
    ---> 95         'you need to change either horizon, get_x, get_y or use a y_func to avoid leakage'
         96     stride = ifnone(stride, window_len)
         97 
    
    AssertionError: you need to change either horizon, get_x, get_y or use a y_func to avoid leakage

Any suggestions?
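A hedged workaround based on the assertion shown in preparation.py (not necessarily the notebook's intended fix): either predict a strictly future step or derive y via a y_func, so the target no longer overlaps get_x, e.g.:

X, y = SlidingWindow(window_length, stride=stride, start=start, get_x=get_x,
                     get_y=get_y, horizon=1, seq_first=seq_first)(df)  # horizon > 0 skips the leakage check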
