tubo213 / kaggle-child-mind-institute-detect-sleep-states

License: MIT License

Python 13.63% Jupyter Notebook 86.37%

kaggle-child-mind-institute-detect-sleep-states's Introduction

Child Mind Institute - Detect Sleep States

This repository contains code for the Kaggle competition Child Mind Institute - Detect Sleep States.

Discussion url

Build Environment

1. Install rye

install documentation

MacOS

curl -sSf https://rye-up.com/get | bash
echo 'source "$HOME/.rye/env"' >> ~/.zshrc
source ~/.zshrc

Linux

curl -sSf https://rye-up.com/get | bash
echo 'source "$HOME/.rye/env"' >> ~/.bashrc
source ~/.bashrc

Windows
See the install documentation.

2. Create virtual environment

rye sync

3. Activate virtual environment

. .venv/bin/activate

Set path

Rewrite run/conf/dir/local.yaml to match your environment

data_dir: 
processed_dir: 
output_dir: 
model_dir: 
sub_dir: ./

Prepare Data

1. Download data

cd data
kaggle competitions download -c child-mind-institute-detect-sleep-states
unzip child-mind-institute-detect-sleep-states.zip

2. Preprocess data

rye run python run/prepare_data.py -m phase=train,test

Train Model

The following command trains the model that scored 0.714 on the leaderboard (LB).

rye run python run/train.py downsample_rate=2 duration=5760 exp_name=exp001 dataset.batch_size=32

Because the project uses hydra, you can run experiments by overriding parameters on the command line. The following command runs four experiments, with downsample_rate set to 2, 4, 6, and 8.

rye run python run/train.py -m downsample_rate=2,4,6,8

Upload Model

rye run python tools/upload_dataset.py

Inference

The following command runs inference with the LB 0.714 model.

rye run python run/inference.py dir=kaggle exp_name=exp001 weight.run_name=single downsample_rate=2 duration=5760 model.params.encoder_weights=null pp.score_th=0.005 pp.distance=40 phase=test
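The pp.score_th and pp.distance overrides read like peak-picking parameters: a minimum score for a detection and a minimum spacing between detections. The sketch below is a plain-Python stand-in that assumes this meaning. The function name pick_peaks, the greedy strategy, and the toy scores are illustrative assumptions, not the repository's post-processing code.

```python
def pick_peaks(scores, score_th, distance):
    """Greedy peak picking (illustrative; not the repository's implementation).

    Keeps local maxima with score >= score_th, highest first, and drops any
    candidate closer than `distance` steps to an already kept peak.
    """
    last = len(scores) - 1
    candidates = [
        i
        for i in range(len(scores))
        if scores[i] >= score_th
        and (i == 0 or scores[i] >= scores[i - 1])
        and (i == last or scores[i] > scores[i + 1])
    ]
    kept = []
    for i in sorted(candidates, key=lambda i: scores[i], reverse=True):
        if all(abs(i - j) >= distance for j in kept):
            kept.append(i)
    return sorted(kept)


# Toy per-step event scores (made up for the example).
scores = [0.0, 0.2, 0.9, 0.3, 0.0, 0.6, 0.1, 0.001, 0.0]
peaks = pick_peaks(scores, score_th=0.005, distance=4)
```

Under this reading, a low threshold such as 0.005 keeps many weak candidates, and the distance setting is what prevents several detections from clustering around one event.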

Implemented models

The model is built with two components: feature_extractor and decoder.

The available feature extractors and decoders are listed below.

Feature Extractor

CNNSpectrogram
LSTMFeatureExtractor
PANNsFeatureExtractor
SpecFeatureExtractor

Decoder

MLPDecoder
LSTMDecoder
TransformerDecoder
TransformerCNNDecoder
UNet1DDecoder

Model

Spec2DCNN: Segmentation through UNet.
Spec1D: Segmentation without UNet.
DETR2DCNN: Uses UNet to detect sleep as in DETR. This model is still under development.
CenterNet: Detects onset and offset separately, like CenterNet, using UNet.
TransformerAutoModel:
- Segmentation using Hugging Face's AutoModel. Does not use feature_extractor or decoder.
- Since the Internet is not available during inference, you need to create a config dataset and specify its path in model_name.

The correspondence table between each model and dataset is as follows.

model	dataset
Spec1D	seg
Spec2DCNN	seg
DETR2DCNN	detr
CenterNet	centernet
TransformerAutoModel	seg

The command to train CenterNet with feature_extractor=CNNSpectrogram, decoder=UNet1DDecoder is as follows

rye run python run/train.py model=CenterNet dataset=centernet feature_extractor=CNNSpectrogram decoder=UNet1DDecoder
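The correspondence table above can be encoded as a small lookup that is checked before training starts, so a mismatched pair fails early with a readable message. This is a minimal sketch: check_model_dataset is a hypothetical helper, not part of the repository.

```python
# Model-to-dataset pairs copied from the correspondence table above.
MODEL_TO_DATASET = {
    "Spec1D": "seg",
    "Spec2DCNN": "seg",
    "DETR2DCNN": "detr",
    "CenterNet": "centernet",
    "TransformerAutoModel": "seg",
}


def check_model_dataset(model: str, dataset: str) -> None:
    """Raise ValueError when the pair does not match the table (hypothetical helper)."""
    expected = MODEL_TO_DATASET.get(model)
    if expected is None:
        raise ValueError(f"unknown model: {model!r}")
    if dataset != expected:
        raise ValueError(
            f"model={model} expects dataset={expected}, got dataset={dataset}"
        )


check_model_dataset("CenterNet", "centernet")  # passes silently
```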

kaggle-child-mind-institute-detect-sleep-states's Issues

The validation set loss and score of your model with LB 0.71?

May I ask for the validation set loss and score of your model with LB 0.71? I trained a model following your configuration. The loss decreased steadily and converged, but the score only fluctuated around 0.002.

resize to nn.Upsample

The output is currently restored to the original length with resize; restore it with nn.Upsample instead.

https://github.com/tubo213/kaggle-child-mind-institute-detect-sleep-states/blob/1ec3407320edbfabce67055ab4e65fdf354d4d5f/src/models/spec1D.py#L51C1-L51C1

About the implementation of CutMix for 1D signals

Hi. Thanks for sharing great code.
I would like to discuss CutMix for 1D inputs.

In my understanding, taking the square root is necessary only for 2D inputs, because in a 2D image lambda determines the cut-out rate of the area.
When it is applied to 1D signals, I think the square root is not necessary, because lambda determines the cut-out rate of the frame length. The current implementation:

import numpy as np


def get_rand_1dbbox(n_timesteps: int, lam: float) -> tuple[int, int]:
    """Get random 1D bounding box.

    Args:
        n_timesteps (int): Number of timesteps.
        lam (float): Lambda value.

    Returns:
        tuple[int, int]: (start, end) of the bounding box.
    """
    cut_rat = np.sqrt(1.0 - lam)
    cut_len = int(n_timesteps * cut_rat)

    start = np.random.randint(0, n_timesteps - cut_len)
    end = start + cut_len

    return start, end

Thus, fixing as below looks more natural to me.

def get_rand_1dbbox(n_timesteps: int, lam: float) -> tuple[int, int]:
    """Get random 1D bounding box.

    Args:
        n_timesteps (int): Number of timesteps.
        lam (float): Lambda value.

    Returns:
        tuple[int, int]: (start, end) of the bounding box.
    """
    cut_rat = 1.0 - lam
    cut_len = int(n_timesteps * cut_rat)

    start = np.random.randint(0, n_timesteps - cut_len)
    end = start + cut_len

    return start, end

https://github.com/tubo213/kaggle-child-mind-institute-detect-sleep-states/blob/main/src/augmentation/cutmix.py#L15
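To make the difference concrete, the sketch below compares the two cut ratios for one value of lambda. It assumes the usual CutMix convention that lam is the weight kept for the original sample, so a fraction 1 - lam of the signal should be replaced. The helper names are illustrative, and the stdlib random module stands in for np.random. The sketch also adds 1 to the upper bound of the start position, since np.random.randint(0, 0) raises when cut_len equals n_timesteps; that guard is an addition of this example, not part of either version above.

```python
import math
import random


def cut_len_sqrt(n_timesteps: int, lam: float) -> int:
    """Cut length as in the current code: ratio is sqrt(1 - lam)."""
    return int(n_timesteps * math.sqrt(1.0 - lam))


def cut_len_linear(n_timesteps: int, lam: float) -> int:
    """Cut length as proposed in this issue: ratio is 1 - lam."""
    return int(n_timesteps * (1.0 - lam))


def rand_1dbbox(n_timesteps: int, cut_len: int) -> tuple[int, int]:
    """Random window of cut_len steps; the +1 keeps cut_len == n_timesteps valid."""
    start = random.randrange(0, n_timesteps - cut_len + 1)
    return start, start + cut_len


n_timesteps, lam = 1000, 0.75
sqrt_fraction = cut_len_sqrt(n_timesteps, lam) / n_timesteps      # 0.5
linear_fraction = cut_len_linear(n_timesteps, lam) / n_timesteps  # 0.25
start, end = rand_1dbbox(n_timesteps, cut_len_linear(n_timesteps, lam))
```

With lam = 0.75 the square-root version replaces half of the sequence, while the linear version replaces a quarter, which matches 1 - lam.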

[BUG] Error during the validation

I'm getting the following error:

Epoch 0: 100% 119/119 [02:51<00:00,  1.44s/it, v_num=0]
Validation: 0it [00:00, ?it/s]
Validation:   0% 0/151 [00:00<?, ?it/s]
Validation DataLoader 0:   0% 0/151 [00:00<?, ?it/s]
Validation DataLoader 0:  13% 20/151 [00:08<00:56,  2.32it/s]
Validation DataLoader 0:  26% 40/151 [00:14<00:39,  2.78it/s]
Validation DataLoader 0:  40% 60/151 [00:20<00:30,  2.98it/s]
Validation DataLoader 0:  53% 80/151 [00:25<00:22,  3.09it/s]
Validation DataLoader 0:  66% 100/151 [00:31<00:16,  3.15it/s]
Validation DataLoader 0:  79% 120/151 [00:37<00:09,  3.22it/s]
Validation DataLoader 0:  93% 140/151 [00:42<00:03,  3.27it/s]
Validation DataLoader 0: 100% 151/151 [00:46<00:00,  3.28it/s]
Error executing job with overrides: []
Traceback (most recent call last):
  File "/content/kaggle-child-mind-institute-detect-sleep-states/run/train.py", line 73, in main
    trainer.fit(model, datamodule=datamodule)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 532, in fit
    call._call_and_handle_interrupt(
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 43, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 571, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 980, in _run
    results = self._run_stage()
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1023, in _run_stage
    self.fit_loop.run()
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 202, in run
    self.advance()
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 355, in advance
    self.epoch_loop.run(self._data_fetcher)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 134, in run
    self.on_advance_end()
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 249, in on_advance_end
    self.val_loop.run()
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/loops/utilities.py", line 181, in _decorator
    return loop_run(self, *args, **kwargs)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/loops/evaluation_loop.py", line 122, in run
    return self.on_run_end()
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/loops/evaluation_loop.py", line 244, in on_run_end
    self._on_evaluation_epoch_end()
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/loops/evaluation_loop.py", line 326, in _on_evaluation_epoch_end
    call._call_lightning_module_hook(trainer, hook_name)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 146, in _call_lightning_module_hook
    output = fn(*args, **kwargs)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/src/modelmodule.py", line 97, in on_validation_epoch_end
    preds=preds[:, :, [1, 2]],
IndexError: index 2 is out of bounds for axis 2 with size 2

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
Epoch 0: 100% 119/119 [03:57<00:00,  2.00s/it, v_num=0]

bandpass filter

Apparently filtering was quite important in G2Net.

https://www.kaggle.com/competitions/g2net-gravitational-wave-detection/discussion/275341

Abstract the feature extractor and decoder

Create feature extractors and decoders by inheriting from AbstractFeatureExtractor and AbstractDecoder.

Separate the encoder and abstract the model

The models are currently split into Spec1D and Spec2DCNN.
Abstracting UNet and pooling as an encoder would let 1D and 2D be handled uniformly.

predict offset

Currently the signal is only downsampled, but it would probably be better to also predict an offset, as CenterNet does.

https://arxiv.org/abs/1904.07850
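A minimal sketch of the idea, assuming the offset is the fractional position of the event inside its downsampled bin. The function names and the example step are made up for illustration, and nothing here is taken from the repository.

```python
def encode_event(step: int, downsample_rate: int) -> tuple[int, float]:
    """Split a full-resolution step into a coarse bin index and a fractional offset."""
    index = step // downsample_rate
    offset = (step % downsample_rate) / downsample_rate  # in [0, 1)
    return index, offset


def decode_event(index: int, offset: float, downsample_rate: int) -> int:
    """Recover the full-resolution step from the bin index and predicted offset."""
    return round((index + offset) * downsample_rate)


index, offset = encode_event(step=1235, downsample_rate=4)
restored = decode_event(index, offset, downsample_rate=4)
without_offset = decode_event(index, 0.0, downsample_rate=4)
```

Without the offset, the decoded position can be off by up to downsample_rate - 1 steps; here the event at step 1235 would be reported at step 1232.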

DETR

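It can output start and end at the same time, and the number of outputs can be specified, so it seems like a good fit.

It might also learn the annotation rules well (a sleep period must last at least one hour, and there is only one sleep period per day).
https://qiita.com/DeepTama/items/937e13f6beda79be17d8

I doubt the accuracy will be very high, but it looks interesting, so I want to try it.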

audio augmentation

https://github.com/iver56/audiomentations

ensemble

Running run/prepare_data.py killed

This may be a question rather than an issue.
The process was killed partway through, before the task finished:

$ rye run python run/prepare_data.py -m phase=train,test

[2023-11-12 23:53:26,135][HYDRA] Launching 2 jobs locally
[2023-11-12 23:53:26,135][HYDRA] #0 : phase=train
Killed

No other error messages appeared.
It might be a problem with my environment, but is there a minimum requirement (e.g., RAM) for running this command?
FYI: the spec of my environment:
g4dn.xlarge (AWS instance) - 4 cores - 16 GB RAM - 1 GPU - 40Gi Disk

Could you tell me which version of Transformers you are using?

defaults:
  - _self_
  - dir: local
  - model: TransformerAutoModel #

Abstract the model

Create AbstractModel.

Methods

forward: returns a dict of loss and logits.
predict: returns the occurrence probability of each event at every step. Does not compute gradients.
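The interface described above might look roughly like the following. This is a framework-free sketch in plain Python: the method signatures, the _logits helper, and the toy ScaleModel subclass are assumptions made for illustration. In PyTorch, predict would typically run under torch.no_grad() to skip gradient computation.

```python
import math
from abc import ABC, abstractmethod


class AbstractModel(ABC):
    """Framework-free sketch of the proposed interface (signatures are assumptions)."""

    @abstractmethod
    def _logits(self, x: list[float]) -> list[float]:
        """Compute raw per-step scores."""

    @abstractmethod
    def forward(self, x: list[float], labels: list[float]) -> dict:
        """Return a dict with "loss" and "logits"."""

    def predict(self, x: list[float]) -> list[float]:
        """Return per-step probabilities; no loss or gradient is computed here."""
        return [1.0 / (1.0 + math.exp(-z)) for z in self._logits(x)]


class ScaleModel(AbstractModel):
    """Toy subclass: logit = 2 * x, loss = mean binary cross-entropy."""

    def _logits(self, x):
        return [2.0 * v for v in x]

    def forward(self, x, labels):
        logits = self._logits(x)
        probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
        loss = -sum(
            y * math.log(p) + (1 - y) * math.log(1 - p)
            for y, p in zip(labels, probs)
        ) / len(labels)
        return {"loss": loss, "logits": logits}


model = ScaleModel()
output = model.forward([0.0, 1.0], labels=[0.0, 1.0])
probs = model.predict([0.0, 1.0])
```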

cross validation

Only out-of-fold is implemented so far, so cross validation should be implemented as well.

CV and LB seem to be correlated, so this is low priority.
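One way to build the folds is sketched below, under the assumption that splits should be made per series_id so that the same series never appears in both train and validation. The group_kfold function and the ids are made up for the example, and the round-robin assignment is only one possible choice.

```python
def group_kfold(series_ids: list[str], n_splits: int) -> list[tuple[list[str], list[str]]]:
    """Split unique series ids into n_splits (train, valid) pairs, round-robin."""
    unique_ids = sorted(set(series_ids))
    folds = [unique_ids[i::n_splits] for i in range(n_splits)]
    splits = []
    for k in range(n_splits):
        valid = folds[k]
        train = [s for i, fold in enumerate(folds) if i != k for s in fold]
        splits.append((train, valid))
    return splits


series_ids = ["a", "a", "b", "c", "c", "d", "e"]  # made-up ids, one per event row
splits = group_kfold(series_ids, n_splits=2)
```

Each series lands in exactly one validation fold, so every series receives a prediction from a model that did not train on it.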

Record parameters in wandb

Parameters are not currently recorded in wandb, so record them.

https://wandb.ai/adrishd/hydra-example/reports/Configuring-W-B-Projects-with-Hydra--VmlldzoxNTA2MzQw

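Reference the train config in inference

Use the following code to load the train config and then load the model.
The inference config should specify which experiment to use and the postprocess settings.

from hydra import compose, initialize


def get_cfg():
    # Compose the training config so inference can rebuild the same model.
    initialize(config_path="run/conf")
    cfg = compose(
        config_name="train",
    )
    return cfg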

[BUG] CenterNet error

I'm getting the following error when I try to use CenterNet:

Epoch 0:   0% 0/119 [00:00<?, ?it/s]
Error executing job with overrides: []
Traceback (most recent call last):
  File "/content/kaggle-child-mind-institute-detect-sleep-states/run/train.py", line 73, in main
    trainer.fit(model, datamodule=datamodule)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 532, in fit
    call._call_and_handle_interrupt(
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 43, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 571, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 980, in _run
    results = self._run_stage()
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1023, in _run_stage
    self.fit_loop.run()
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 202, in run
    self.advance()
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 355, in advance
    self.epoch_loop.run(self._data_fetcher)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 133, in run
    self.advance(data_fetcher)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 219, in advance
    batch_output = self.automatic_optimization.run(trainer.optimizers[0], kwargs)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 188, in run
    self._optimizer_step(kwargs.get("batch_idx", 0), closure)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 266, in _optimizer_step
    call._call_lightning_module_hook(
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 146, in _call_lightning_module_hook
    output = fn(*args, **kwargs)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/core/module.py", line 1276, in optimizer_step
    optimizer.step(closure=optimizer_closure)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/core/optimizer.py", line 161, in step
    step_output = self._strategy.optimizer_step(self._optimizer, closure, **kwargs)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/strategies/strategy.py", line 231, in optimizer_step
    return self.precision_plugin.optimizer_step(optimizer, model=model, closure=closure, **kwargs)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/plugins/precision/amp.py", line 76, in optimizer_step
    closure_result = closure()
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 142, in __call__
    self._result = self.closure(*args, **kwargs)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 128, in closure
    step_output = self._step_fn()
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 315, in _training_step
    training_step_output = call._call_strategy_hook(trainer, "training_step", *kwargs.values())
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 294, in _call_strategy_hook
    output = fn(*args, **kwargs)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/pytorch_lightning/strategies/strategy.py", line 380, in training_step
    return self.model.training_step(*args, **kwargs)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/src/modelmodule.py", line 53, in training_step
    output = self.model(batch["feature"], batch["label"], do_mixup, do_cutmix)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/src/models/base.py", line 41, in forward
    loss = self.loss_fn(logits, labels)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/kaggle-child-mind-institute-detect-sleep-states/src/models/centernet.py", line 51, in forward
    nonzero_idx_onset = labels[:, 4].nonzero().view(-1)
IndexError: index 4 is out of bounds for dimension 1 with size 3

Add a license

Can you add a license to this repository?

Cannot run run/prepare_data.py

When I ran the following command, I got this error message:

$ rye run python -m run/prepare_data.py phase=train,test

/home/jovyan/workspace/kaggle-child-mind-institute-detect-sleep-states/.venv/bin/python: Error while finding module specification for 'run/prepare_data.py' (ModuleNotFoundError: No module named 'run/prepare_data'). Try using 'run/prepare_data' instead of 'run/prepare_data.py' as the module name.

$ rye run python -m run/prepare_data phase=train,test

/home/jovyan/workspace/kaggle-child-mind-institute-detect-sleep-states/.venv/bin/python: No module named run/prepare_data

How can I work around this?

Series without positive examples are not used for training

kaggle-child-mind-institute-detect-sleep-states/src/dataset/seg.py

Line 71 in 870a474

series_id = self.event_df.at[idx, "series_id"]
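The effect can be illustrated with toy data. Because the sample index comes from event rows, a series with no events can never be drawn. The rows and ids below are made up, and the alternative at the end is one possible approach rather than the repository's fix.

```python
# Made-up rows standing in for event_df; "e3" has no annotated events.
event_rows = [
    {"series_id": "e1", "event": "onset"},
    {"series_id": "e1", "event": "wakeup"},
    {"series_id": "e2", "event": "onset"},
]
all_series_ids = ["e1", "e2", "e3"]

# Indexing the dataset by event rows can only ever yield these series.
sampled = {row["series_id"] for row in event_rows}
never_sampled = [s for s in all_series_ids if s not in sampled]

# One possible alternative (an assumption, not the repository's fix):
# index the dataset by series id so event-free series are included.
dataset_index = list(all_series_ids)
```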

all data training

Training with the MLP decoder crashes

TypeError: ModuleList.__init__() takes from 1 to 2 positional arguments but 6 were given

To fix this, remove the asterisk in mlpdecoder.py, line 28:

self.mlp = nn.ModuleList(self.mlp)
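The error can be reproduced without PyTorch. The sketch below uses ModuleListLike, a hypothetical stand-in whose constructor takes a single optional iterable, as nn.ModuleList does. The five placeholder strings are an assumption chosen to match the "6 were given" count in the message (five items plus self).

```python
class ModuleListLike:
    """Stand-in with the same constructor shape: one optional iterable argument."""

    def __init__(self, modules=None):
        self.modules = list(modules) if modules is not None else []


# Placeholders for five layers (strings instead of real modules).
layers = ["linear1", "relu1", "linear2", "relu2", "linear3"]

try:
    ModuleListLike(*layers)  # unpacks into 5 positional arguments (6 with self)
    error_message = ""
except TypeError as exc:
    error_message = str(exc)

container = ModuleListLike(layers)  # pass the list itself as the single argument
```

Star-unpacking suits containers that accept modules as separate arguments, whereas a constructor with this shape expects the list itself.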

Create a Config class

The DictConfig loaded by hydra is untyped, which is unpleasant to work with, so add type hints with a dataclass.

https://hydra.cc/docs/tutorials/structured_config/config_groups/
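A minimal sketch of the idea using only the standard library. The TrainConfig field set is illustrative, borrowed from the training command in this README, with dataset.batch_size flattened to batch_size for brevity. The to_train_config helper is hypothetical. In the project itself, Hydra's structured configs (linked above) would be the natural mechanism.

```python
from dataclasses import dataclass, fields


@dataclass
class TrainConfig:
    """Typed view of a few overrides used in this README (field set is illustrative)."""

    exp_name: str
    downsample_rate: int
    duration: int
    batch_size: int


def to_train_config(raw: dict) -> TrainConfig:
    """Build TrainConfig from a plain dict, coercing each value to its annotated type."""
    kwargs = {f.name: f.type(raw[f.name]) for f in fields(TrainConfig)}
    return TrainConfig(**kwargs)


cfg = to_train_config(
    {"exp_name": "exp001", "downsample_rate": "2", "duration": 5760, "batch_size": 32}
)
```

With typed fields, editors and type checkers can flag a misspelled key or a wrong type where the config is used, instead of the mistake surfacing at run time.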