Comments (8)
I solved the problem by using transformers==4.32.0.
Using either 4.36.2 (latest) or 4.28.1 (specified in requirements.txt) caused some errors.
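For anyone comparing their environment against the versions mentioned here, a minimal sketch that reads the installed transformers version. The helper name installed_version and the WORKING_VERSION constant are made up for this illustration; only importlib.metadata from the standard library is used.

```python
from importlib import metadata
from typing import Optional

# Version this comment reports as working; treat it as a data point, not a guarantee.
WORKING_VERSION = "4.32.0"


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


found = installed_version("transformers")
if found is None:
    print("transformers is not installed in this environment")
elif found != WORKING_VERSION:
    print(f"transformers {found} found; this comment used {WORKING_VERSION}")
else:
    print(f"transformers {found} matches the version used in this comment")
```

If the versions differ, pinning with pip install transformers==4.32.0 reproduces the setup described in this comment.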
from mplug-owl.
Update your transformers library.
from mplug-owl.
I updated to the latest version (transformers==4.36.2) but still have the problem.
from mplug-owl.
For the same snippet I got the following error:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In[8], line 6
3 query = "Describe the image."
5 model_name = get_model_name_from_path(model_path)
----> 6 tokenizer, model, image_processor, context_len = load_pretrained_model(model_path, None, model_name, load_8bit=False, load_4bit=False, device="cuda")
File /projectnb/ivc-ml/appledora/mPLUGOwl/mPLUGOwl2/mplug_owl2/model/builder.py:117, in load_pretrained_model(model_path, model_base, model_name, load_8bit, load_4bit, device_map, device, **kwargs)
115 use_fast = False
116 tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False, trust_remote_code=True)
--> 117 model = AutoModelForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True, **kwargs)
120 vision_tower = model.get_model().vision_model
121 # vision_tower.to(device=device, dtype=torch.float16)
File /projectnb/ivc-ml/appledora/condaenvs/.conda/envs/mplug_owl2/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py:493, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
491 elif type(config) in cls._model_mapping.keys():
492 model_class = _get_model_class(config, cls._model_mapping)
--> 493 return model_class.from_pretrained(
494 pretrained_model_name_or_path, *model_args, config=config, **hub_kwargs, **kwargs
495 )
496 raise ValueError(
497 f"Unrecognized configuration class {config.__class__} for this kind of AutoModel: {cls.__name__}.\n"
498 f"Model type should be one of {', '.join(c.__name__ for c in cls._model_mapping.keys())}."
499 )
File /projectnb/ivc-ml/appledora/condaenvs/.conda/envs/mplug_owl2/lib/python3.10/site-packages/transformers/modeling_utils.py:2700, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, config, cache_dir, ignore_mismatched_sizes, force_download, local_files_only, token, revision, use_safetensors, *model_args, **kwargs)
2697 init_contexts.append(init_empty_weights())
2699 with ContextManagers(init_contexts):
-> 2700 model = cls(config, *model_args, **model_kwargs)
2702 # Check first if we are `from_pt`
2703 if use_keep_in_fp32_modules:
File /projectnb/ivc-ml/appledora/mPLUGOwl/mPLUGOwl2/mplug_owl2/model/modeling_mplug_owl2.py:218, in MPLUGOwl2LlamaForCausalLM.__init__(self, config)
216 def __init__(self, config):
217 super(LlamaForCausalLM, self).__init__(config)
--> 218 self.model = MPLUGOwl2LlamaModel(config)
220 self.lm_head = nn.Linear(config.hidden_size, config.vocab_size, bias=False)
222 # Initialize weights and apply final processing
File /projectnb/ivc-ml/appledora/mPLUGOwl/mPLUGOwl2/mplug_owl2/model/modeling_mplug_owl2.py:205, in MPLUGOwl2LlamaModel.__init__(self, config)
204 def __init__(self, config: MPLUGOwl2Config):
--> 205 super(MPLUGOwl2LlamaModel, self).__init__(config)
File /projectnb/ivc-ml/appledora/mPLUGOwl/mPLUGOwl2/mplug_owl2/model/modeling_mplug_owl2.py:36, in MPLUGOwl2MetaModel.__init__(self, config)
34 def __init__(self, config):
35 super(MPLUGOwl2MetaModel, self).__init__(config)
---> 36 self.vision_model = MplugOwlVisionModel(
37 MplugOwlVisionConfig(**config.visual_config["visual_model"])
38 )
39 self.visual_abstractor = MplugOwlVisualAbstractorModel(
40 MplugOwlVisualAbstractorConfig(**config.visual_config["visual_abstractor"]), config.hidden_size
41 )
File /projectnb/ivc-ml/appledora/mPLUGOwl/mPLUGOwl2/mplug_owl2/model/visual_encoder.py:403, in MplugOwlVisionModel.__init__(self, config)
400 self.config = config
401 self.hidden_size = config.hidden_size
--> 403 self.embeddings = MplugOwlVisionEmbeddings(config)
404 self.encoder = MplugOwlVisionEncoder(config)
405 if config.use_post_layernorm:
File /projectnb/ivc-ml/appledora/mPLUGOwl/mPLUGOwl2/mplug_owl2/model/visual_encoder.py:105, in MplugOwlVisionEmbeddings.__init__(self, config)
95 self.cls_token = None
97 self.patch_embed = nn.Conv2d(
98 in_channels=3,
99 out_channels=self.hidden_size,
(...)
102 bias=False,
103 )
--> 105 if self.cls_token:
106 self.num_patches = (self.image_size // self.patch_size) ** 2
107 self.position_embedding = nn.Parameter(torch.randn(1, self.num_patches + 1, self.hidden_size))
RuntimeError: Boolean value of Tensor with more than one value is ambiguous
I have the following transformers version:
transformers 4.31.0
I later upgraded to 4.32.0 as suggested, but the error persists.
from mplug-owl.
Was anyone able to fix this?
from mplug-owl.
Hello, in visual_encoder.py (line 105 in the traceback) you can change the condition "if self.cls_token:" to "if self.cls_token is not None:". It works for me.
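To show why the identity check avoids the error, here is a minimal sketch that does not need PyTorch. FakeTensor is a hypothetical stand-in that only imitates how a multi-element torch.Tensor refuses truth-testing; truthy_check and none_check are illustrative names, not functions from the mPLUG-Owl code.

```python
class FakeTensor:
    """Hypothetical stand-in for torch.Tensor (assumption, not the real class).

    It only imitates one behaviour: truth-testing a tensor with more than
    one element raises RuntimeError.
    """

    def __init__(self, values):
        self.values = list(values)

    def __bool__(self):
        if len(self.values) > 1:
            raise RuntimeError(
                "Boolean value of Tensor with more than one value is ambiguous"
            )
        return bool(self.values and self.values[0])


def truthy_check(cls_token):
    # Mirrors "if self.cls_token:", which calls __bool__ on the tensor.
    return True if cls_token else False


def none_check(cls_token):
    # Mirrors "if self.cls_token is not None:", which never calls __bool__.
    return cls_token is not None


token = FakeTensor([0.1, 0.2, 0.3])

print(none_check(None))    # cls_token disabled
print(none_check(token))   # cls_token present, no exception

try:
    truthy_check(token)
except RuntimeError as err:
    print(f"truthy check failed: {err}")
```

The identity check only asks whether the attribute was set, so it behaves the same whether cls_token is None or a parameter of any shape.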
from mplug-owl.
Yes, this issue was introduced by mPLUG-Owl2.1, which disables the cls_token in the visual encoder. We fixed it in the latest commit.
from mplug-owl.
Yes, I got it running last week too by turning off the cls_token check. Glad that it is now officially handled too!
from mplug-owl.