$ python run_training.py --logfile ../logs/intel.log --save_model_dir ../saved_models/intel --data_dir ../data/disease-prediction --intel
Some weights of the model checkpoint at emilyalsentzer/Bio_ClinicalBERT were not used when initializing BertForSequenceClassification: ['cls.seq_relationship.weight', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.decoder.weight', 'cls.seq_relationship.bias', 'cls.predictions.bias']
This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at emilyalsentzer/Bio_ClinicalBERT and are newly initialized: ['classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
/home/ubuntu/anaconda3/envs/disease_pred_intel/lib/python3.8/site-packages/intel_extension_for_pytorch/optim/_optimizer_utils.py:207: UserWarning: Does not suport fused step for <class 'torch.optim.adam.Adam'>, will use non-fused step
warnings.warn("Does not suport fused step for " + str(type(optimizer)) + ", will use non-fused step")
Epoch 1: 0%| | 0/133 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "run_training.py", line 221, in <module>
    main(FLAGS)
  File "run_training.py", line 99, in main
    train(
  File "/home/ubuntu/reference_kits/disease-prediction/src/utils/train.py", line 50, in train
    for _, (batch, labels) in tqdm(
  File "/home/ubuntu/anaconda3/envs/disease_pred_intel/lib/python3.8/site-packages/tqdm/std.py", line 1195, in __iter__
    for obj in iterable:
  File "/home/ubuntu/anaconda3/envs/disease_pred_intel/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 530, in __next__
    data = self._next_data()
  File "/home/ubuntu/anaconda3/envs/disease_pred_intel/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 570, in _next_data
    data = self._dataset_fetcher.fetch(index) # may raise StopIteration
  File "/home/ubuntu/anaconda3/envs/disease_pred_intel/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/ubuntu/anaconda3/envs/disease_pred_intel/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/ubuntu/reference_kits/disease-prediction/src/utils/process_data.py", line 94, in __getitem__
    encoding = self.tokenizer(
  File "/home/ubuntu/anaconda3/envs/disease_pred_intel/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 2489, in __call__
    raise ValueError(
ValueError: text input must of type str (single example), List[str] (batch or single pretokenized example) or List[List[str]] (batch of pretokenized examples).
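
The final ValueError indicates that the dataset's `__getitem__` in process_data.py passed the tokenizer something other than a `str`. One possible cause, which is only an assumption here since the log does not show the data, is a missing or non-text entry in the text column (for example a NaN float or `None`). A minimal sketch of how such entries could be found and coerced before tokenization follows; `find_non_str`, `to_text`, and the sample values are hypothetical and not taken from the reference kit.

```python
import math


def find_non_str(samples):
    """Return the indices of entries that are not plain str."""
    return [i for i, s in enumerate(samples) if not isinstance(s, str)]


def to_text(value):
    """Coerce one entry to str (hypothetical helper).

    None and NaN become an empty string; anything else that is not
    already a str is converted with str().
    """
    if value is None:
        return ""
    if isinstance(value, float) and math.isnan(value):
        return ""
    return value if isinstance(value, str) else str(value)


# Made-up rows standing in for the text column of the training data.
samples = ["fever and cough", float("nan"), None, 42]

print(find_non_str(samples))  # [1, 2, 3]
cleaned = [to_text(s) for s in samples]
```

If the check reports any indices, coercing or dropping those rows before they reach `self.tokenizer(...)` may avoid the error. Whether empty strings are acceptable training inputs depends on the dataset.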