
Comments (6)

cndn commented on July 16, 2024

Hey @TapendraBaduwal The small model is vanilla UnitY, without the pretraining components so far due to the size limit, but we will explore adding them. We exported the model with TorchScript and shared the .ptl files (compatible with the torch lite interpreter) here: https://github.com/facebookresearch/seamless_communication/blob/main/docs/m4t/on_device_README.md.
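
For reference, a minimal sketch of loading one of those .ptl exports in Python (the file name is illustrative; on mobile you would use the lite interpreter loader instead of torch.jit.load):

import torch
from torch.jit.mobile import _load_for_lite_interpreter

# The .ptl export can be opened with the regular TorchScript loader on desktop,
# or with the lite interpreter loader that mirrors the mobile runtimes.
model = torch.jit.load("unity_on_device.ptl")
lite_model = _load_for_lite_interpreter("unity_on_device.ptl")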

cndn commented on July 16, 2024

Hey @TapendraBaduwal I confirmed the issue. Working on a fix.

cndn commented on July 16, 2024

Conclusion: the PyTorch runtime in Python only worked with torch <= 1.11.0. I just got UnitY-Small-S2T to work with 2.0 (will update the HF checkpoint soon), but not UnitY-Small yet. Updated the doc: https://github.com/facebookresearch/seamless_communication/blob/main/docs/m4t/on_device_README.md
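
A quick runtime check reflecting that version note (a sketch only; the cutoff is just the behavior reported above):

import torch

# UnitY-Small (full) reportedly only loads under torch <= 1.11.0 in Python,
# while the UnitY-Small-S2T export also works with 2.0.
major, minor = (int(x) for x in torch.__version__.split(".")[:2])
if (major, minor) > (1, 11):
    print(f"torch {torch.__version__}: use the S2T export, or downgrade to <= 1.11.0 for the full model")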

TapendraBaduwal commented on July 16, 2024

@cndn
import torchaudio
import torch

TEST_AUDIO_PATH = "/home/tapendra/Desktop/seamless_communication/jfk.wav"
TGT_LANG = "eng"
audio_input, _ = torchaudio.load(TEST_AUDIO_PATH) # Load waveform using torchaudio
s2t_model = torch.jit.load("/home/tapendra/Desktop/seamless_communication/unity_on_device_s2t.ptl") # Load exported S2T model
text = s2t_model(audio_input, tgt_lang=TGT_LANG) # Forward call with tgt_lang specified for ASR or S2TT
print(text)

Error:
File "/home/tapendra/Desktop/seamless_communication/sp2tt.py", line 10, in
text = s2t_model(audio_input, tgt_lang=TGT_LANG) # Forward call with tgt_lang specified for ASR or S2TT
File "/home/tapendra/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
File "code/torch.py", line 26, in forward
_5 = torch.tensor(-1)
sample = annotate(Dict[str, Optional[Dict[str, Tensor]]], {"net_input": _2, "others": {"lang_tag": _5}})
_8 = (self.generator).generate(sample, None, )
~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
pred_tokens, texts, = _8
return texts[0]
File "code/torch/fairseq/sequence_generator_multi_decoder.py", line 14, in generate
sample: Dict[str, Optional[Dict[str, Tensor]]],
prefix_tokens: Optional[Tensor]=None) -> Tuple[List[List[Dict[str, Tensor]]], List[str]]:
_0 = (self)._generate(sample, prefix_tokens, None, None, False, )
~~~~~~~~~~~~~~~ <--- HERE
return _0
def _generate(self: torch.fairseq.sequence_generator_multi_decoder.MultiDecoderSequenceGenerator,
File "code/torch/fairseq/sequence_generator_multi_decoder.py", line 49, in _generate
src_tokens, src_lengths = _3, _3
_7 = (self.generator_mt.search).init_constraints(constraints, self.generator_mt.beam_size, )
encoder_outs = (self.generator_mt.model).forward_encoder(net_input0, )
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
_8 = self.generator_mt.model.single_model
_9 = _8.target_letter_decoder
File "code/torch/fairseq/sequence_generator_v2.py", line 749, in forward_encoder
else:
_353 = annotate(List[Dict[str, List[Tensor]]], [])
_354 = (getattr(self.models, "0").encoder).forward_torchscript(net_input, )
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
_355 = torch.append(_353, _354)
_352 = _353
File "code/torch/fairseq/models/speech_to_text/xm_transformer.py", line 28, in forward_torchscript
def forward_torchscript(self: torch.fairseq.models.speech_to_text.xm_transformer.Wav2VecEncoderWithAdaptor,
net_input: Dict[str, Tensor]) -> Dict[str, List[Tensor]]:
_3 = (self).forward(net_input["src_tokens"], net_input["src_lengths"], )
~~~~~~~~~~~~~ <--- HERE
return _3
def reorder_encoder_out(self: torch.fairseq.models.speech_to_text.xm_transformer.Wav2VecEncoderWithAdaptor,
File "code/torch/fairseq/models/speech_to_text/xm_transformer.py", line 19, in forward
_0 = torch.fairseq.data.data_utils.lengths_to_padding_mask
padding_mask = _0(src_lengths, )
out = (self.w2v_encoder).forward(src_tokens, padding_mask, )
~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
x = out["encoder_out"]
padding_mask0 = out["padding_mask"]
File "code/torch/fairseq/models/wav2vec/wav2vec2_asr.py", line 25, in forward
else:
_3 = False
res = (_0).extract_features(_1, _2, _3, None, )
~~~~~~~~~~~~~~~~~~~~ <--- HERE
x = res["x"]
padding_mask0 = res["padding_mask"]
File "code/torch/fairseq/models/w2vbert/w2vbert.py", line 283, in extract_features
mask: bool=False,
layer: Optional[int]=None) -> Dict[str, Tensor]:
res = (self).forward(source, padding_mask, mask, True, layer, None, None, None, )
~~~~~~~~~~~~~ <--- HERE
return res
class ConformerEncoder(Module):
File "code/torch/fairseq/models/w2vbert/w2vbert.py", line 116, in forward
else:
features2, unmasked_features0, padding_mask9 = features1, unmasked_features, padding_mask6
features4 = (self.post_extract_proj).forward(features2, )
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
features5 = (self.dropout_input).forward(features4, )
unmasked_features2 = (self.dropout_features).forward(unmasked_features0, )
File "code/torch/torch/nn/modules/linear.py", line 13, in forward
input: Tensor) -> Tensor:
_0 = torch.torch.nn.functional.linear
_1 = _0(torch.to(input, 6), torch.to(self.weight, 6), self.bias, )
~~ <--- HERE
return _1
File "code/torch/torch/nn/functional.py", line 4, in linear
weight: Tensor,
bias: Optional[Tensor]=None) -> Tensor:
return torch.linear(input, weight, bias)
~~~~~~~~~~~~ <--- HERE
def dropout(input: Tensor,
p: float=0.5,

Traceback of TorchScript, original code (most recent call last):
File "/private/home/dnn/anaconda3/envs/ilia/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 153, in generate
(default: self.eos)
"""
return self._generate(sample, prefix_tokens)
~~~~~~~~~~~~~~ <--- HERE
File "/private/home/dnn/seamless_main_20230228/fairseq-py/fairseq/models/fairseq_encoder.py", line 50, in forward_torchscript
"""
if torch.jit.is_scripting():
return self.forward(
~~~~~~~~~~~~ <--- HERE
src_tokens=net_input["src_tokens"],
src_lengths=net_input["src_lengths"],
File "/private/home/dnn/seamless_main_20230228/fairseq-py/fairseq/models/wav2vec/wav2vec2_asr.py", line 518, in forward
# x = x.transpose(0, 1)
# else:
res = self.w2v_model.extract_features(source=w2v_args['source'], padding_mask=w2v_args['padding_mask'], mask=(self.apply_mask and self.training))
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
x = res["x"]
padding_mask = res["padding_mask"]
File "/private/home/dnn/seamless_main_20230228/fairseq-py/fairseq/models/w2vbert/w2vbert.py", line 884, in extract_features
def extract_features(self, source, padding_mask, mask:bool=False, layer:Optional[int]=None):
res = self.forward(
~~~~~~~~~~~~ <--- HERE
source, padding_mask, mask=mask, features_only=True, layer=layer
)
File "/private/home/dnn/anaconda3/envs/ilia/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 96, in forward
def forward(self, input: Tensor) -> Tensor:
return F.linear(input.float(), self.weight.float(), self.bias)
~~~~~~~~ <--- HERE
File "/private/home/dnn/anaconda3/envs/ilia/lib/python3.8/site-packages/torch/nn/functional.py", line 1847, in linear
if has_torch_function_variadic(input, weight):
return handle_torch_function(linear, (input, weight), input, weight, bias=bias)
return torch._C._nn.linear(input, weight, bias)
~~~~~~~~~~~~~~~~~~~ <--- HERE
RuntimeError: self and mat2 must have the same dtype
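
For what it's worth, that final RuntimeError is torch's generic check that the tensors passed to linear/matmul share a dtype. A standalone toy reproduction (illustrative only; the exact wording can vary between torch versions):

import torch
import torch.nn.functional as F

x = torch.randn(2, 4, dtype=torch.float32)  # float32 activations
w = torch.randn(3, 4, dtype=torch.float64)  # float64 weights
F.linear(x, w)  # raises a RuntimeError such as "self and mat2 must have the same dtype"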

cndn commented on July 16, 2024

@TapendraBaduwal Could you try this S2T model first? https://dl.fbaipublicfiles.com/seamlessM4T/models/small/unity_on_device_s2t.ptl I'm working on the full model and will upload the HF paths when it's done.
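
In case it helps, an end-to-end sketch of trying that S2T checkpoint, mirroring the snippet earlier in the thread (paths and the 16 kHz mono input are assumptions):

# e.g. first: wget https://dl.fbaipublicfiles.com/seamlessM4T/models/small/unity_on_device_s2t.ptl
import torch
import torchaudio

audio, sample_rate = torchaudio.load("jfk.wav")        # assumed 16 kHz mono input
s2t_model = torch.jit.load("unity_on_device_s2t.ptl")  # the S2T-only export linked above
text = s2t_model(audio, tgt_lang="eng")                # tgt_lang selects the ASR / S2TT output language
print(text)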

from seamless_communication.

TapendraBaduwal commented on July 16, 2024

@cndn Thank you for the update.
