declare-lab / meld

MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversation

License: GNU General Public License v3.0

Python 100.00%
emotion-recognition sentiment-analysis multimodal-sentiment-analysis multimodal-interactions dialogue-systems conversational-ai chatbot personality-traits personality-profiling emotion

meld's People

Contributors

devamanyu, gmn0105, nmder, soujanyaporia, tae898

meld's Issues

How do I convert a video to the data format required for this model?

I've got the bimodal_weights_emotion.hdf5 model from baseline, and the model can recognize the emotion from MELD dataset well. But I don't know how to recognize emotion from my own video. How do I convert a video to the data format required for this model?

I'm a beginner in multimodal emotion recognition. I'd really appreciate it if you could give me some tips.

Using pre-trained models with my own audio/video files.

Hello, I have downloaded the pre-trained models from the link provided in the repository, but baseline.py doesn't provide any way to use my own audio/video (.mp3/.mp4) files directly. The authors load pickle files instead.

Can somebody share a script so I can use the model with my own audio/video files?

Any help would be really appreciated.

Error on the function method test_model

Whenever the code below is called,

def test_model(self):
    model = load_model(self.PATH)
    intermediate_layer_model = Model(
    input=model.input, output=model.get_layer("utter").output)

tensorflow throws an error

Traceback (most recent call last):
  File "baseline/baseline.py", line 288, in <module>
    model.train_model()
  File "baseline/baseline.py", line 228, in train_model
    self.test_model()
  File "baseline/baseline.py", line 235, in test_model
    intermediate_layer_model = Model(input=model.input, output=model.get_layer("utter").output)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/tracking/base.py", line 457, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py", line 262, in __init__
    'name', 'autocast'})
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/utils/generic_utils.py", line 778, in validate_kwargs
    raise TypeError(error_message, kwarg)
TypeError: ('Keyword argument not understood:', 'input')

Perhaps utter is not the right name of the layer?

Running on Python 3.6 with the following packages:

tensorboard==2.3.0
tensorboard-plugin-wit==1.7.0
tensorboardcolab==0.0.22
tensorflow==2.3.0
tensorflow-addons==0.8.3
tensorflow-datasets==2.1.0
tensorflow-estimator==2.3.0
tensorflow-gcs-config==2.3.0
tensorflow-hub==0.9.0
tensorflow-metadata==0.24.0
tensorflow-privacy==0.2.2
tensorflow-probability==0.11.0
Keras==2.4.3
Keras-Preprocessing==1.1.2
keras-vis==0.4.1
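
The TypeError in the traceback points at the keyword names rather than at the layer name: in the Keras that ships with TensorFlow 2.x, Model takes inputs and outputs (plural). A minimal sketch of the corrected call follows. The function name build_utterance_extractor and its model_cls parameter are hypothetical, chosen so the snippet runs without TensorFlow installed; in baseline.py one would pass tensorflow.keras.Model, or simply rename the two keywords in place.

```python
def build_utterance_extractor(model, model_cls, layer_name="utter"):
    """Wrap `model` so that it outputs the activations of `layer_name`.

    `model_cls` is expected to be a Keras Model class (assumption:
    tensorflow.keras.Model). Note the plural keyword names.
    """
    return model_cls(
        inputs=model.input,
        outputs=model.get_layer(layer_name).output,
    )
```

If the layer name were wrong, get_layer would raise a ValueError about a missing layer instead of the keyword TypeError shown above.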

About visual_features.tar.gz

Hello! I noticed that there is a "visual_features.tar.gz" file in the features file url you provided. I have downloaded it and decompressed it, and the file structure is as follows:

  • matlab_resnet_faces
    • train
      • frame_1_1.mat
      • frame_1_2.mat
      • frame_x_x.mat
        ...
    • dev
    • test

Because there is no README file, I cannot tell the meaning or source of these files. Can you explain it to me? Hope to hear from you. Thank you!

baseline.py not working

from  baseline.data_helpers import Dataloader

ModuleNotFoundError: No module named 'baseline.data_helpers'; 'baseline' is not a package
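
The message "'baseline' is not a package" typically appears when baseline/baseline.py is run as a script: the script's own directory goes first on sys.path, so "import baseline" finds baseline.py itself rather than the directory. One possible workaround, sketched below, is to try the sibling module name first and fall back to the package path. load_attr is a hypothetical helper, not part of the repository.

```python
import importlib

def load_attr(candidates, attr):
    """Return `attr` from the first module in `candidates` that provides it."""
    last_error = None
    for name in candidates:
        try:
            return getattr(importlib.import_module(name), attr)
        except (ImportError, AttributeError) as exc:
            last_error = exc  # remember why this candidate failed
    raise ImportError(f"none of {candidates} provides {attr!r}") from last_error

# Intended use inside baseline.py (not executed here):
# Dataloader = load_attr(["data_helpers", "baseline.data_helpers"], "Dataloader")
```

The simpler alternative is to change the import line to "from data_helpers import Dataloader" when running the script from inside the baseline directory.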

Baseline results=0

Hi, I have tried the bc_LSTM baseline with bimodal in emotion classification, but the F1-score and accuracy of 'fear' and 'disgust' are always zero, so I can't reproduce the result in paper.

The command I use:

python baseline.py -classify emotion -modality bimodal -train

The results:

      precision    recall  f1-score   support

   0     0.7322    0.7795    0.7551      1256
   1     0.4799    0.4662    0.4729       281
   2     0.0000    0.0000    0.0000        50
   3     0.2781    0.2019    0.2340       208
   4     0.4813    0.5448    0.5111       402
   5     0.0000    0.0000    0.0000        68
   6     0.3832    0.4377    0.4087       345

The emotion labels:

Emotion - {'neutral': 0, 'surprise': 1, 'fear': 2, 'sadness': 3, 'joy': 4, 'disgust': 5, 'anger': 6}.

I know the main strategy is to adjust the class weights. To be honest, I'm new to TensorFlow and don't know what code to add to achieve this. Could you please give me some suggestions?

Best wishes
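
Zero scores on 'fear' and 'disgust' are consistent with class imbalance: the table lists 50 and 68 utterances for them against 1256 for 'neutral'. A common starting point is inverse-frequency weighting. The sketch below computes "balanced" weights, n_samples / (n_classes * count). The function name is hypothetical, and the counts are the test-set supports from the table above, used purely for illustration; real weights should be computed from the training labels.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * count)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: total / (len(counts) * n) for cls, n in sorted(counts.items())}

# Illustration only: supports copied from the report above (test split).
supports = {0: 1256, 1: 281, 2: 50, 3: 208, 4: 402, 5: 68, 6: 345}
labels = [cls for cls, n in supports.items() for _ in range(n)]
weights = balanced_class_weights(labels)
```

Note that another report in this list hits a ValueError when such a dictionary is passed as class_weight with the model's 3-dimensional targets, so the weights may have to be applied through sample_weight instead.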

Using pre-trained models with our own audio files.

Hi, thanks for making this open source project.

I'm trying to use the pre-trained models you provide on my own audio files in order to extract the emotion and sentiment labels, but the baseline.py does not seem to provide a way to use my own files.

Moreover, the baseline.py file loads .pkl instead of .wav or .mp4 files. How would I go about using my own files and generating similar .pkl files to use with the pre-trained models?

Thanks.
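
The authors' feature pipeline is not documented in this list, so only the first step can be sketched with any confidence: pulling a mono WAV track out of a video with ffmpeg, so that an audio feature extractor can read it (a later report here mentions openSMILE). The function names and the 16 kHz sample rate are assumptions, and ffmpeg has to be installed separately.

```python
import subprocess
from pathlib import Path

def ffmpeg_extract_audio_cmd(video_path, wav_path, sample_rate=16000):
    """Build an ffmpeg command that writes the audio track as mono WAV."""
    return [
        "ffmpeg", "-y",            # overwrite the output without asking
        "-i", str(video_path),     # input video
        "-vn",                     # drop the video stream
        "-ac", "1",                # mono
        "-ar", str(sample_rate),   # resample (16 kHz is an assumption)
        str(wav_path),
    ]

def extract_audio(video_path, wav_dir):
    """Run ffmpeg (must be on PATH) and return the path of the WAV file."""
    wav_path = Path(wav_dir) / (Path(video_path).stem + ".wav")
    subprocess.run(ffmpeg_extract_audio_cmd(video_path, wav_path), check=True)
    return wav_path
```

Turning the WAV files into vectors the pre-trained models accept would still require the same feature extraction and selection the authors used, which is exactly what this issue asks about.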

Hello

I am learning emotion recognition, so I downloaded your project, but I cannot open http://bit.ly/MELD-features. Can you send the files to me? I cannot open the other download links either. Sincerely, thank you.

audio features

You mention that the audio features were extracted with openSMILE, starting from an initial set of 6373 features, and that feature selection was then performed.

Which config file was used for feature extraction? Is it Compare_2016? How exactly did you do the feature selection? Would it be possible to provide the indices or names of the selected features? Also, in audio_emotion.pkl, 122 of the 300 selected features are all zeros, so they do not provide any information.
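
The all-zero dimensions can be listed directly, which makes the observation easy to reproduce. The sketch below assumes the features are available as rows of equal length, one row per utterance; how audio_emotion.pkl is laid out is not stated here, so loading is left out and the data is a toy example.

```python
def constant_zero_columns(rows):
    """Return indices of feature dimensions that are zero in every row."""
    return [j for j, column in enumerate(zip(*rows)) if not any(column)]

# Toy data: dimensions 1 and 3 never carry a non-zero value.
toy = [
    [0.5, 0.0, 1.2, 0.0],
    [0.0, 0.0, 0.7, 0.0],
]
zero_dims = constant_zero_columns(toy)
```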

data preprocessing code of MELD

Thank you for releasing the baseline code and the feature files.
Can you open the feature extraction code for speech and text of the MELD dataset?
Thank you.

Really poor results using the given features

I am trying to build a multimodal model using the MELD dataset. After a large number of attempts, using either the provided features or features I extracted myself (openSMILE and wav2vec for audio, and simple textual approaches), I always get poor results, far from the ones reported in the paper. Today I decided to load the audio and text features obtained by the bcLSTM model, concatenate them as shown in the baseline file, and use them as input for a new bcLSTM. Basically, I copied what the authors did, and the results are still bad. Has anyone else faced this problem, or are the provided features actually that bad? I also applied the same methods and models to the features I processed myself, and the results are still bad. I am doing my master's thesis on multimodal emotion recognition, and this dataset is not helping at all.

guidance for developing a LSTM model

Hi,
I am struggling to find any support for developing a model for a multimodal dataset.
Can you please guide me or give me some references for using this dataset and developing an LSTM or CNN model?
I am new to this field. So far I have been able to develop models for images and text separately, but I am having trouble using a merged input (image+text or image+audio).

Please provide some direction or explain with respect to the baseline model you provided.
Thanks

error: bad character range |-t at position 12

When I run python baseline.py -classify [Sentiment|Emotion] -modality [text|audio|bimodal] [-train|-test], it raises an error. How can I fix it? Thanks.

error Traceback (most recent call last)
in
----> 1 get_ipython().run_line_magic('run', 'baseline.py -classify [Sentiment|Emotion] -modality [text|audio|bimodal] [-train|-test]')

F:\Anaconda\lib\site-packages\IPython\core\interactiveshell.py in run_line_magic(self, magic_name, line, _stack_depth)
2325 kwargs['local_ns'] = self.get_local_scope(stack_depth)
2326 with self.builtin_trap:
-> 2327 result = fn(*args, **kwargs)
2328 return result
2329

in run(self, parameter_s, runner, file_finder)

F:\Anaconda\lib\site-packages\IPython\core\magic.py in (f, *a, **k)
185 # but it's overkill for just that one bit of state.
186 def magic_deco(arg):
--> 187 call = lambda f, *a, **k: f(*a, **k)
188
189 if callable(arg):

F:\Anaconda\lib\site-packages\IPython\core\magics\execution.py in run(self, parameter_s, runner, file_finder)
736 else:
737 # tilde and glob expansion
--> 738 args = shellglob(map(os.path.expanduser, arg_lst[1:]))
739
740 sys.argv = [filename] + args # put in the proper filename

F:\Anaconda\lib\site-packages\IPython\utils\path.py in shellglob(args)
324 unescape = unescape_glob if sys.platform != 'win32' else lambda x: x
325 for a in args:
--> 326 expanded.extend(glob.glob(a) or [unescape(a)])
327 return expanded
328

F:\Anaconda\lib\glob.py in glob(pathname, recursive)
19 zero or more directories and subdirectories.
20 """
---> 21 return list(iglob(pathname, recursive=recursive))
22
23 def iglob(pathname, *, recursive=False):

F:\Anaconda\lib\glob.py in _iglob(pathname, recursive, dironly)
55 yield from _glob2(dirname, basename, dironly)
56 else:
---> 57 yield from _glob1(dirname, basename, dironly)
58 return
59 # os.path.split() returns the argument itself as a dirname if it is a

F:\Anaconda\lib\glob.py in _glob1(dirname, pattern, dironly)
83 if not _ishidden(pattern):
84 names = (x for x in names if not _ishidden(x))
---> 85 return fnmatch.filter(names, pattern)
86
87 def _glob0(dirname, basename, dironly):

F:\Anaconda\lib\fnmatch.py in filter(names, pat)
50 result = []
51 pat = os.path.normcase(pat)
---> 52 match = _compile_pattern(pat)
53 if os.path is posixpath:
54 # normcase on posix is NOP. Optimize it away from the loop.

F:\Anaconda\lib\fnmatch.py in _compile_pattern(pat)
44 else:
45 res = translate(pat)
---> 46 return re.compile(res).match
47
48 def filter(names, pat):

F:\Anaconda\lib\re.py in compile(pattern, flags)
250 def compile(pattern, flags=0):
251 "Compile a regular expression pattern, returning a Pattern object."
--> 252 return _compile(pattern, flags)
253
254 def purge():

F:\Anaconda\lib\re.py in _compile(pattern, flags)
302 if not sre_compile.isstring(pattern):
303 raise TypeError("first argument must be string or compiled pattern")
--> 304 p = sre_compile.compile(pattern, flags)
305 if not (flags & DEBUG):
306 if len(_cache) >= _MAXCACHE:

F:\Anaconda\lib\sre_compile.py in compile(p, flags)
762 if isstring(p):
763 pattern = p
--> 764 p = sre_parse.parse(p, flags)
765 else:
766 pattern = None

F:\Anaconda\lib\sre_parse.py in parse(str, flags, state)
946
947 try:
--> 948 p = _parse_sub(source, state, flags & SRE_FLAG_VERBOSE, 0)
949 except Verbose:
950 # the VERBOSE flag was switched on inside the pattern. to be

F:\Anaconda\lib\sre_parse.py in _parse_sub(source, state, verbose, nested)
441 start = source.tell()
442 while True:
--> 443 itemsappend(_parse(source, state, verbose, nested + 1,
444 not nested and not items))
445 if not sourcematch("|"):

F:\Anaconda\lib\sre_parse.py in _parse(source, state, verbose, nested, first)
832 sub_verbose = ((verbose or (add_flags & SRE_FLAG_VERBOSE)) and
833 not (del_flags & SRE_FLAG_VERBOSE))
--> 834 p = _parse_sub(source, state, sub_verbose, nested + 1)
835 if not source.match(")"):
836 raise source.error("missing ), unterminated subpattern",

F:\Anaconda\lib\sre_parse.py in _parse_sub(source, state, verbose, nested)
441 start = source.tell()
442 while True:
--> 443 itemsappend(_parse(source, state, verbose, nested + 1,
444 not nested and not items))
445 if not sourcematch("|"):

F:\Anaconda\lib\sre_parse.py in _parse(source, state, verbose, nested, first)
596 if hi < lo:
597 msg = "bad character range %s-%s" % (this, that)
--> 598 raise source.error(msg, len(this) + 1 + len(that))
599 setappend((RANGE, (lo, hi)))
600 else:

error: bad character range |-t at position 12
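
The square brackets in that command are placeholders meaning "choose one", not literal syntax. Going by the traceback, IPython's %run glob-expands every argument, so [-train|-test] is read as a character class in which |-t is a range that runs backwards. The sketch below reproduces the same regex error with the re module and then builds a concrete command. The values picked (emotion, text, -train) are just an example, taken from commands that appear elsewhere in this list.

```python
import re

def regex_error(pattern):
    """Return the re.error message for `pattern`, or None if it compiles."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return str(exc)
    return None

# '|' (code 124) sorts after 't' (code 116), so the range '|-t' is invalid.
message = regex_error("[|-t]")

# Concrete invocation: one value per bracketed placeholder.
command = ["python", "baseline.py", "-classify", "emotion",
           "-modality", "text", "-train"]
```

In a notebook the equivalent would be %run baseline.py -classify emotion -modality text -train.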

Mismatch in data_emotion.p

In MELD/data_emotion.p, why is word_idx_map[','] = 6459, whereas W.shape = (6336, 300)?
Is this a bug?
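
Whether this is a bug depends on whether every index in word_idx_map is meant to address a row of W. A quick check is sketched below. It assumes word_idx_map is a dict from token to integer row index and that W has shape (rows, dim), as the question implies; loading the pickle is not shown because its layout is not described here, and the toy values only mirror the numbers quoted above.

```python
def out_of_range_tokens(word_idx_map, n_rows):
    """Return tokens whose index cannot be used to look up a row of W."""
    return sorted(
        token for token, idx in word_idx_map.items()
        if not 0 <= idx < n_rows
    )

# Toy stand-ins for word_idx_map and W.shape[0] from the report above.
toy_map = {"hello": 1, "world": 2, ",": 6459}
bad = out_of_range_tokens(toy_map, n_rows=6336)
```

If the real file yields a non-empty list, an embedding lookup for those tokens would index past the end of W.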

Error in train.tar.gz

Got this error while untar

gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
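
"Unexpected end of file" from gzip usually means the archive was truncated, for example by an interrupted download. This can be confirmed before extraction by decompressing the stream to its end. gzip_is_complete is a hypothetical helper name.

```python
import gzip
import zlib

def gzip_is_complete(path, chunk_size=1 << 20):
    """Return True if the gzip stream at `path` decompresses to its end.

    Returns False for truncated, corrupt, or unreadable files.
    """
    try:
        with gzip.open(path, "rb") as stream:
            while stream.read(chunk_size):
                pass  # discard the data; only successful reading matters
    except (EOFError, OSError, zlib.error):
        return False
    return True
```

If the check fails on a fresh download as well, the copy on the server is the likely culprit rather than the local transfer.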

How can I know which face on the video sample is speaking to extract my own visual features ??

First of all, I would like to congratulate you all on the great effort that went into creating the MELD dataset. I would also like to ask whether it is possible to obtain the facial landmarks (or any other kind of information) that would allow me to extract the face of the person actively speaking, as you did when extracting the features you provide.

The reason is that I would like to explore my own visual features.

Thanks in advance. Best regards from Valencia,

David

Wrong video files

I used this link to download the audio files for this data set.

However, there are a few problems with at least a few of the video files and/or their transcriptions:

  • dia309_utt0.mp4: transcription contains description of scene which needs to be removed ("She doesn't hear him and keeps running, Chandler starts chasing her as the theme to")

  • test_splits_wav/dia220_utt0.mp4: file is wrongly cut (video is 4min long - transcription is way off as it's Ross and Julie meeting Rachel at the airport, not Phoebe talking to Joey )

  • test_splits_wav/dia38_utt4.mp4: file is wrongly cut (video is 5min long)

  • train_splits_wav/dia309_utt0.mp4: file is wrongly cut

In addition, I was able to verify that some of the old problems reported here still persist (e.g. dia793_utt0.mp4).

Have they been solved? Have I perhaps downloaded an old version of the data set?

Data download link

The download link to the audio data doesn't appear to be working. Am I missing something?

No audio modal in the test set.

Hello, prof.
I've downloaded the raw data, and it seems that the samples in the test set, such as ./output_repeated_splits_test/dia263_utt7.mp4, have no sound.
Is the RAW audio modal data for the test set available?
Thank you very much!
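
Whether a clip has no audio stream at all, as opposed to a silent one, can be checked with ffprobe. The sketch below parses the JSON that ffprobe prints for -show_streams. The function names are hypothetical, and ffprobe (part of ffmpeg) has to be installed separately.

```python
import json
import subprocess

def count_audio_streams(ffprobe_json):
    """Count streams whose codec_type is 'audio' in ffprobe's JSON output."""
    streams = json.loads(ffprobe_json).get("streams", [])
    return sum(1 for s in streams if s.get("codec_type") == "audio")

def probe_audio_streams(video_path):
    """Run ffprobe (must be on PATH) and count the audio streams of a file."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_streams", "-of", "json",
         str(video_path)],
        capture_output=True, text=True, check=True,
    )
    return count_audio_streams(result.stdout)
```

A result of 0 for a file such as dia263_utt7.mp4 would mean the audio track is absent from the container, so it cannot be recovered from that copy.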

class weight {0: 4.0, 1: 15.0, 2: 15.0, 3: 3.0, 4: 1.0, 5: 6.0, 6: 3.0}, it still is zero

Dear professor

I am really sorry to disturb you. I added class_weight={0: 4.0, 1: 15.0, 2: 15.0, 3: 3.0, 4: 1.0, 5: 6.0, 6: 3.0},
but the precision is still zero for most classes:

              precision    recall  f1-score   support

           0     0.4919    0.9889    0.6570      1256
           1     0.0000    0.0000    0.0000       281
           2     0.0000    0.0000    0.0000        50
           3     0.0000    0.0000    0.0000       208
           4     0.0000    0.0000    0.0000       402
           5     0.0000    0.0000    0.0000        68
           6     0.4353    0.1072    0.1721       345

   micro avg     0.4900    0.4900    0.4900      2610
   macro avg     0.1325    0.1566    0.1184      2610
weighted avg     0.2942    0.4900    0.3389      2610

Weighted FScore:
(0.2942449206381084, 0.4900383141762452, 0.3388985544501019, None)

How to fix it?
thanks, best wishes

BC-LSTM text unimodal checkpoint is broken

I could run audio unimodal and bimodal BC-LSTM pretrained model, but got the following error when running text unimodal BC-LSTM.

Model initiated for Sentiment classification
Loading data
Labels used for this classification:  {'neutral': 0, 'positive': 1, 'negative': 2}
Traceback (most recent call last):
  File "baseline.py", line 312, in <module>
    model.test_model()
  File "baseline.py", line 234, in test_model
    model = load_model(self.PATH)
  File "/Users/xiaoyu/.pyenv/versions/3.8.7/lib/python3.8/site-packages/keras/saving/save.py", line 200, in load_model
    return hdf5_format.load_model_from_hdf5(filepath, custom_objects,
  File "/Users/xiaoyu/.pyenv/versions/3.8.7/lib/python3.8/site-packages/keras/saving/hdf5_format.py", line 180, in load_model_from_hdf5
    model = model_config_lib.model_from_config(model_config,
  File "/Users/xiaoyu/.pyenv/versions/3.8.7/lib/python3.8/site-packages/keras/saving/model_config.py", line 52, in model_from_config
    return deserialize(config, custom_objects=custom_objects)
  File "/Users/xiaoyu/.pyenv/versions/3.8.7/lib/python3.8/site-packages/keras/layers/serialization.py", line 208, in deserialize
    return generic_utils.deserialize_keras_object(
  File "/Users/xiaoyu/.pyenv/versions/3.8.7/lib/python3.8/site-packages/keras/utils/generic_utils.py", line 674, in deserialize_keras_object
    deserialized_obj = cls.from_config(
  File "/Users/xiaoyu/.pyenv/versions/3.8.7/lib/python3.8/site-packages/keras/engine/training.py", line 2397, in from_config
    functional.reconstruct_from_config(config, custom_objects))
  File "/Users/xiaoyu/.pyenv/versions/3.8.7/lib/python3.8/site-packages/keras/engine/functional.py", line 1273, in reconstruct_from_config
    process_layer(layer_data)
  File "/Users/xiaoyu/.pyenv/versions/3.8.7/lib/python3.8/site-packages/keras/engine/functional.py", line 1255, in process_layer
    layer = deserialize_layer(layer_data, custom_objects=custom_objects)
  File "/Users/xiaoyu/.pyenv/versions/3.8.7/lib/python3.8/site-packages/keras/layers/serialization.py", line 208, in deserialize
    return generic_utils.deserialize_keras_object(
  File "/Users/xiaoyu/.pyenv/versions/3.8.7/lib/python3.8/site-packages/keras/utils/generic_utils.py", line 674, in deserialize_keras_object
    deserialized_obj = cls.from_config(
  File "/Users/xiaoyu/.pyenv/versions/3.8.7/lib/python3.8/site-packages/keras/layers/core.py", line 1005, in from_config
    function = cls._parse_function_from_config(
  File "/Users/xiaoyu/.pyenv/versions/3.8.7/lib/python3.8/site-packages/keras/layers/core.py", line 1057, in _parse_function_from_config
    function = generic_utils.func_load(
  File "/Users/xiaoyu/.pyenv/versions/3.8.7/lib/python3.8/site-packages/keras/utils/generic_utils.py", line 789, in func_load
    code = marshal.loads(raw_code)
ValueError: bad marshal data (unknown type code)

The command I'm running is:
python baseline.py -classify sentiment -modality text -test.
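
The last frames of the traceback show Keras restoring a layer's Python function through marshal.loads. The marshal format for code objects is tied to the Python version, so a checkpoint written under one interpreter can fail under another (3.8.7 here) with exactly this error. The sketch below only illustrates that kind of round trip within a single interpreter; it mimics the idea rather than reproducing Keras internals, and both function names are hypothetical.

```python
import base64
import marshal
import types

def dump_function_code(func):
    """Serialize a function's bytecode to ASCII text (interpreter-specific)."""
    return base64.b64encode(marshal.dumps(func.__code__)).decode("ascii")

def load_function_code(text, name="restored"):
    """Rebuild a function from text produced by dump_function_code.

    Only expected to work on the Python version that wrote `text`.
    """
    code = marshal.loads(base64.b64decode(text))
    return types.FunctionType(code, globals(), name)

double = lambda x: x * 2
restored = load_function_code(dump_function_code(double))
```

Two practical options follow from this: load the checkpoint with the Python version it was saved under (not stated in this report), or rebuild the architecture in code and call load_weights on the file instead of load_model.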

Is there any face region information in videos?

Hi, thanks for sharing a good dataset.

I want to crop the face region of the person speaking in each video frame, but many frames contain two or more people.

Is there any face region information for the videos? If any such information (xy coordinates, etc.) exists, please share it.

Thank you

Failed to load pretrained models

Hi,
I tried to reload the pretrained models, but failed. I assume it is a Keras version problem. Could you describe the running environment? Sorry, I cannot find it in the README.
Thanks.

Multiple issues in the dataset.

  1. Audio

There is a disturbance in audio which would have affected the audio features.

Few Examples:
dia793_utt0.mp4
dia164_utt5.mp4
dia682_utt1.mp4
dia529_utt2.mp4
dia1029_utt1.mp4
dia1008_utt1.mp4

Mostly all videos with size > 2.5 MB (around 200 videos in train_set)

  2. Video and text are not matching.

For example

a) Dialogue 241: in utterance 1, the sync between the text and the video breaks.
Utterance 2 in the text is "I asked him.", while video dia241_utt2.mp4 contains just the word "now", and the sync issue continues from there.

b) Dialogue 757, utterance 7 is also not synced with the text.

c) Dialogue 485, utterance 0 is "Hey, this- Heyy..." in the text, but the video is a long clip.

There are many more video-text sync issues.

Is this dataset usable?
Please help me with this.
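
The size heuristic from this report (clips above roughly 2.5 MB) can be turned into a quick scan for candidates. The threshold comes from the report; the function name and the flat folder layout are assumptions, and flagged files still need to be checked by hand.

```python
from pathlib import Path

def oversized_clips(folder, limit_bytes=2_500_000, suffix=".mp4"):
    """Return (name, size) pairs for clips above `limit_bytes`, largest first."""
    hits = [
        (p.name, p.stat().st_size)
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix == suffix and p.stat().st_size > limit_bytes
    ]
    return sorted(hits, key=lambda item: item[1], reverse=True)
```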

ValueError: `class_weight` not supported for 3+ dimensional targets. with class_weight

Hi,

I got the error ValueError: `class_weight` not supported for 3+ dimensional targets when I ran baseline.py (text only, emotion classification) with the class_weight from the README, as shown below.
I didn't make any changes to the bc-LSTM model. Should I write a new loss function that takes class_weight into account, or something similar?
Could you advise me on how to use class_weight without problems?
Thank you in advance.

using command:
python baseline.py -classify emotion -modality text -train

fit parameter:

history = model.fit(self.train_x, self.train_y,
                            epochs=self.epochs,
                            batch_size=self.batch_size,
                            sample_weight=self.train_mask,
                            shuffle=True,
                            callbacks=[early_stopping, checkpoint],
                            validation_data=(
                                self.val_x, self.val_y, self.val_mask),
                            class_weight={0: 4.0, 1: 15.0, 2: 15.0,
                                          3: 3.0, 4: 1.0, 5: 6.0, 6: 3.0}
                            )
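
Keras rejects class_weight here because the targets are 3-dimensional, as the error says. A commonly used workaround is to fold the class weights into the per-utterance sample_weight that the call already passes. The NumPy sketch below assumes train_y is one-hot with shape (dialogues, utterances, classes) and train_mask has shape (dialogues, utterances); weighted_mask is a hypothetical helper, not the authors' method.

```python
import numpy as np

def weighted_mask(one_hot_targets, mask, class_weight):
    """Scale a (dialogues, utterances) mask by the weight of each true class."""
    n_classes = one_hot_targets.shape[-1]
    lookup = np.array([class_weight[c] for c in range(n_classes)], dtype=float)
    class_ids = one_hot_targets.argmax(axis=-1)   # (dialogues, utterances)
    return lookup[class_ids] * mask                # padded steps stay at 0

# Toy batch: 1 dialogue, 3 utterance slots (the last one is padding), 2 classes.
targets = np.array([[[1, 0], [0, 1], [0, 0]]])
mask = np.array([[1.0, 1.0, 0.0]])
weights = weighted_mask(targets, mask, {0: 1.0, 1: 15.0})
```

The result would be passed as sample_weight in place of self.train_mask, with the class_weight argument removed from the fit call.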

Videos are not well-aligned with the texts.

This issue was apparently already brought up in #9.

The alignment is pretty bad. It's hard for me to go multimodal at the moment, because of this issue.

I have two questions:

  1. Has this been fixed? Or are you planning on using a better alignment tool?
  2. Can I have access to the original Friends videos? I wonder if I can cut the videos into utterances myself using ASR.

The Dataloader class is not getting loaded

The following error is popping up

Traceback (most recent call last):
  File "C:/Users/Sanil Andhare/.PyCharm2019.1/FFP/baseline/baseline.py", line 10, in <module>
    from baseline.data_helpers import Dataloader
  File "C:\Users\Sanil Andhare\.PyCharm2019.1\FFP\baseline\baseline.py", line 10, in <module>
    from baseline.data_helpers import Dataloader
ModuleNotFoundError: No module named 'baseline.data_helpers'; 'baseline' is not a package

Process finished with exit code 1

Absence of Audio in the Test File

Hello Mr/Ms

I have found an issue where there seems to be no audio in the test dataset which is confusing. I was wondering if maybe there was a problem or maybe there is a separate audio file for this. I hope to hear from you soon.

About the sequence length and sentence length.

Hi, Meld crew.
I tried running the baseline code with text-only sentiment classification, and it worked. However, I have one question about baseline.py (line 124), concerning the input_length of the Embedding layer. I think it should be sentence_length instead of sequence_length, since the 2nd dimension of concatenated_tensor is a negative number (-48) in my case.

Baseline results

Hi, I have tried the bc_LSTM baseline with bimodal in emotion classification, but the F1-score and accuracy of 'fear' and 'disgust' are always zero, so I can't reproduce the result in paper.

The command I use:

python baseline.py -classify emotion -modality bimodal -train

The results:

          precision    recall  f1-score   support

       0     0.7322    0.7795    0.7551      1256
       1     0.4799    0.4662    0.4729       281
       2     0.0000    0.0000    0.0000        50
       3     0.2781    0.2019    0.2340       208
       4     0.4813    0.5448    0.5111       402
       5     0.0000    0.0000    0.0000        68
       6     0.3832    0.4377    0.4087       345

The emotion labels:

Emotion - {'neutral': 0, 'surprise': 1, 'fear': 2, 'sadness': 3, 'joy': 4, 'disgust': 5, 'anger': 6}.

Is there something wrong with my understanding?

Cannot download the raw data.

First of all, sorry if this is not the correct place to post this question.

I tried the following commands several times but could not download the raw data.

$ wget http://web.eecs.umich.edu/~mihalcea/downloads/MELD.Raw.tar.gz
--2022-10-17 14:00:13--  http://web.eecs.umich.edu/~mihalcea/downloads/MELD.Raw.tar.gz
Resolving web.eecs.umich.edu (web.eecs.umich.edu)... 141.212.113.214
Connecting to web.eecs.umich.edu (web.eecs.umich.edu)|141.212.113.214|:80... connected.
HTTP request sent, awaiting response... Read error (Connection reset by peer) in headers.
Retrying.

--2022-10-17 14:02:13--  (try: 2)  http://web.eecs.umich.edu/~mihalcea/downloads/MELD.Raw.tar.gz
Connecting to web.eecs.umich.edu (web.eecs.umich.edu)|141.212.113.214|:80... connected.
HTTP request sent, awaiting response... Read error (Connection reset by peer) in headers.
Retrying.

--2022-10-17 14:04:15--  (try: 3)  http://web.eecs.umich.edu/~mihalcea/downloads/MELD.Raw.tar.gz
Connecting to web.eecs.umich.edu (web.eecs.umich.edu)|141.212.113.214|:80... connected.
HTTP request sent, awaiting response... Read error (Connection reset by peer) in headers.
Retrying.

--2022-10-17 14:06:20--  (try: 4)  http://web.eecs.umich.edu/~mihalcea/downloads/MELD.Raw.tar.gz
Connecting to web.eecs.umich.edu (web.eecs.umich.edu)|141.212.113.214|:80... connected.
HTTP request sent, awaiting response... Read error (Connection reset by peer) in headers.
Retrying.

--2022-10-17 14:07:29--  (try: 5)  http://web.eecs.umich.edu/~mihalcea/downloads/MELD.Raw.tar.gz
Connecting to web.eecs.umich.edu (web.eecs.umich.edu)|141.212.113.214|:80... connected.
HTTP request sent, awaiting response...^C

And I confirmed that I also cannot access Prof. Mihalcea's website, where the data is located.

Could you please check if you can access this download link?
Thank you.

Use of the dataset for training models in relation to the license

Do you have any expectations for how the GPL v3 license should apply to weights in a model trained using the data as part of a training set, with no use of the software or the pretrained models? I was looking to include this as training data either in Stanford's stanza package, which has an apache v2 license, or CoreNLP software, which has a separate license for commercial applications.

Missing video in dev set and utility of additional videos

Dear Prof. Poria,
I've downloaded the raw dataset from the official website because I want to extract multimodal features myself. However, I find that the video 'dia110_utt7.mp4' (Sr No. 1153 in 'dev_sent_emo.csv') does not exist in the 'dev_splits_complete' folder. Could you verify this problem and update the dataset?

Besides, I notice that the video folders contain additional videos which have no corresponding annotations in the csv files, for example 'dia66_utt9', 'dia49_utt5', 'dia49_utt4', 'dia66_utt10' in the dev set and 'dia108_utt2', 'final_videos_testdia101_utt0' in the test set. Could you tell me what these videos are used for?

Thanks very much!
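
Both observations, annotated clips without a file and files without an annotation, can be listed mechanically by comparing the CSV with the folder. The sketch assumes the dia<Dialogue_ID>_utt<Utterance_ID>.mp4 naming seen in this report and CSV columns named Dialogue_ID and Utterance_ID; the column names and the function name are assumptions here.

```python
import csv
from pathlib import Path

def compare_annotations_to_videos(csv_path, video_dir):
    """Return (missing, extra): rows without a file, files without a row."""
    # Only the ID columns matter, so undecodable bytes in the text are replaced.
    with open(csv_path, newline="", encoding="utf-8", errors="replace") as handle:
        expected = {
            f"dia{row['Dialogue_ID']}_utt{row['Utterance_ID']}.mp4"
            for row in csv.DictReader(handle)
        }
    present = {p.name for p in Path(video_dir).glob("*.mp4")}
    return sorted(expected - present), sorted(present - expected)
```

Called with dev_sent_emo.csv and the dev_splits_complete folder, the first list would be expected to contain dia110_utt7.mp4 if the report above still holds.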

missing training data

Hi there,
I've downloaded MELD.Raw.tar.gz. The dev and test data are OK, but the training data is missing.
When I untar train.tar.gz, it always shows the following error:
gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
After that, I get a 'split' directory which contains 4648 files, so I believe the training data is definitely incomplete.
I tried many times, always with the same result.
I would appreciate some help with this.

ValueError: "input_length" is 33, but received input has shape (None, 50)

I am trying to run the baseline.py file to test the model for Emotion classification and text modality by using the already trained models (the source of which were provided in the README file).

I am using the following command:
python baseline.py -classify Emotion -modality text -test

The error that I am getting after running this command is as follows:

Using TensorFlow backend.
Model initiated for Emotion classification
Loading data
Labels used for this classification:  {'neutral': 0, 'surprise': 1, 'fear': 2, 'sadness': 3, 'joy': 4, 'disgust': 5, 'anger': 6}
Traceback (most recent call last):
  File "baseline.py", line 286, in <module>
    model.test_model()
  File "baseline.py", line 234, in test_model
    model = load_model(self.PATH)
  File "C:\Program Files\Python37\lib\site-packages\keras\engine\saving.py", line 492, in load_wrapper
    return load_function(*args, **kwargs)
  File "C:\Program Files\Python37\lib\site-packages\keras\engine\saving.py", line 584, in load_model
    model = _deserialize_model(h5dict, custom_objects, compile)
  File "C:\Program Files\Python37\lib\site-packages\keras\engine\saving.py", line 274, in _deserialize_model
    model = model_from_config(model_config, custom_objects=custom_objects)
  File "C:\Program Files\Python37\lib\site-packages\keras\engine\saving.py", line 627, in model_from_config
    return deserialize(config, custom_objects=custom_objects)
  File "C:\Program Files\Python37\lib\site-packages\keras\layers\__init__.py", line 168, in deserialize
    printable_module_name='layer')
  File "C:\Program Files\Python37\lib\site-packages\keras\utils\generic_utils.py", line 147, in deserialize_keras_object
    list(custom_objects.items())))
  File "C:\Program Files\Python37\lib\site-packages\keras\engine\network.py", line 1075, in from_config
    process_node(layer, node_data)
  File "C:\Program Files\Python37\lib\site-packages\keras\engine\network.py", line 1025, in process_node
    layer(unpack_singleton(input_tensors), **kwargs)
  File "C:\Program Files\Python37\lib\site-packages\keras\engine\base_layer.py", line 506, in __call__
    output_shape = self.compute_output_shape(input_shape)
  File "C:\Program Files\Python37\lib\site-packages\keras\layers\embeddings.py", line 136, in compute_output_shape
    (str(self.input_length), str(input_shape)))
ValueError: "input_length" is 33, but received input has shape (None, 50)

Can someone please help me resolve this issue?

The meanings about the features.

image
Can you explain what features each file represents? I'm a little confused by the file names. After reading the code in baseline.py and data_helpers.py, I see that you use a CNN to extract the textual features. Does that refer to the video features? In your fusion model, I can't find a video branch. What does text_glove_average_emotion.pkl mean? And what is the difference between audio_embeddings_feature_selection_emotion.pkl and audio_emotion.pkl?
