zkmkarlsruhe / language-identification

Spoken Language Identification on Common Voice and AudioSet using Deep Learning

License: Other

Dockerfile 0.63% Python 97.71% Shell 1.66%

spoken-language-identification language-identification lid common-voice audioset intelligent-museum zkm

language-identification's Issues

large audio file language processing

Hi,
First of all, thanks for the valuable repo.
I have some audio files, about 15 minutes long on average, in which several people speak different languages.
How can I use your pretrained model to handle such files?
Best regards
@bytosaur
@danomatika
@loelkes
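
One possible approach, offered as a sketch rather than the maintainers' answer: the error messages quoted in the other issues show that the model takes inputs of shape (None, 80000, 1), so a long recording could be cut into windows of 80000 samples and classified window by window. The function name `split_into_windows` and the `hop` parameter are made up for this illustration, and the sample rate is an assumption (at 16 kHz, 80000 samples would be 5 seconds).

```python
import numpy as np

# Samples per model input, taken from the expected shape (None, 80000, 1)
# quoted in the error messages on this page.
WINDOW = 80000


def split_into_windows(audio, window=WINDOW, hop=None):
    """Split mono 1-D audio into windows shaped (n_windows, window, 1).

    Hypothetical helper: `hop` defaults to `window` (no overlap).
    The last partial window is zero-padded to the full length.
    """
    audio = np.asarray(audio, dtype=np.float32).reshape(-1)
    hop = window if hop is None else hop
    chunks = []
    for start in range(0, len(audio), hop):
        chunk = audio[start:start + window]
        # Pad the tail with zeros so every chunk has exactly `window` samples.
        chunk = np.pad(chunk, (0, window - len(chunk)))
        chunks.append(chunk)
    if not chunks:
        return np.zeros((0, window, 1), dtype=np.float32)
    # Add the trailing channel axis the model expects.
    return np.stack(chunks)[..., np.newaxis]


# Example: 200000 samples become three windows, the last one half padded.
windows = split_into_windows(np.ones(200000))
print(windows.shape)
```

The resulting array could then be passed to the loaded model in a single `model.predict(windows)` call, giving one prediction per window and therefore a rough timeline of which language is spoken when. Windows that contain two speakers or mostly padding would likely give unreliable predictions.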

Model Training

Hi! Can you add detailed steps on how to train your model using a custom dataset?

Cannot train with batch size > 1

Hi! I am having this issue when training with batch size > 1:

tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument:  Cannot add tensor to the batch: number of elements does not match. Shapes are: [tensor]: [119130,1], [batch]: [80000,1]
	 [[node IteratorGetNext (defined at train.py:117) ]]
  (1) Invalid argument:  Cannot add tensor to the batch: number of elements does not match. Shapes are: [tensor]: [119130,1], [batch]: [80000,1]
	 [[node IteratorGetNext (defined at train.py:117) ]]
	 [[Shape/_4]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_14011]

Function call stack:
train_function -> train_function
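
The message says that one example has 119130 samples while the batch was started with an example of 80000 samples, so the clips in the dataset do not all have the same length. A minimal sketch of one way around this, assuming every clip may simply be cropped or zero-padded to 80000 samples before batching (the helper `pad_or_crop` is hypothetical and not part of the repository; whether the repository's own pipeline or config already offers an equivalent step is not stated here):

```python
import numpy as np

TARGET_LEN = 80000  # length of the batch elements reported in the error


def pad_or_crop(audio, target_len=TARGET_LEN):
    """Return audio shaped (target_len, 1), cropping or zero-padding as needed.

    Hypothetical helper for illustration only.
    """
    audio = np.asarray(audio, dtype=np.float32)
    if audio.ndim == 1:
        audio = audio[:, np.newaxis]  # add the channel axis
    n = audio.shape[0]
    if n >= target_len:
        return audio[:target_len]
    # Pad only along the time axis, leaving the channel axis untouched.
    return np.pad(audio, ((0, target_len - n), (0, 0)))


# The two shapes from the error message can now be stacked into one batch.
batch = np.stack([
    pad_or_crop(np.ones((119130, 1))),
    pad_or_crop(np.ones((80000, 1))),
])
print(batch.shape)
```

With a batch size of 1 the mismatch never surfaces because no two examples are stacked, which would explain why training only fails for larger batches.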

Issue with the Google Colab

Hello, I have an issue with the Google Colab file. Here is the error from executing the last cell:

ValueError Traceback (most recent call last)

in ()
----> 1 prediction = model.predict(audio)

9 frames

/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py in wrapper(*args, **kwargs)
992 # invariant: func_outputs contains only Tensors, CompositeTensors,
993 # TensorArrays and Nones.
--> 994 func_outputs = nest.map_structure(convert, func_outputs,
995 expand_composites=True)
996

ValueError: in user code:

/usr/local/lib/python3.7/dist-packages/keras/engine/training.py:1586 predict_function  *
    return step_function(self, iterator)
/usr/local/lib/python3.7/dist-packages/keras/engine/training.py:1576 step_function  **
    outputs = model.distribute_strategy.run(run_step, args=(data,))
/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py:1286 run
    # /job:localhost/replica:0/task:0/device:GPU:0
/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py:2849 call_for_each_replica
    `ReplicaContext`, which can only be called inside the function passed to
/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py:3632 _call_for_each_replica
    
/usr/local/lib/python3.7/dist-packages/keras/engine/training.py:1569 run_step  **
    outputs = model.predict_step(data)
/usr/local/lib/python3.7/dist-packages/keras/engine/training.py:1537 predict_step
    return self(x, training=False)
/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py:1037 __call__
    outputs = call_fn(inputs, *args, **kwargs)
/usr/local/lib/python3.7/dist-packages/keras/engine/functional.py:415 call
    inputs, training=training, mask=mask)
/usr/local/lib/python3.7/dist-packages/keras/engine/functional.py:550 _run_internal_graph
    outputs = node.layer(*args, **kwargs)
/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py:1037 __call__
    outputs = call_fn(inputs, *args, **kwargs)
/usr/local/lib/python3.7/dist-packages/keras/engine/sequential.py:369 call
    return super(Sequential, self).call(inputs, training=training, mask=mask)
/usr/local/lib/python3.7/dist-packages/keras/engine/functional.py:415 call
    inputs, training=training, mask=mask)
/usr/local/lib/python3.7/dist-packages/keras/engine/functional.py:550 _run_internal_graph
    outputs = node.layer(*args, **kwargs)
/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py:1037 __call__
    outputs = call_fn(inputs, *args, **kwargs)
/usr/local/lib/python3.7/dist-packages/keras/saving/saved_model/utils.py:68 return_outputs_and_add_losses
    outputs, losses = fn(*args, **kwargs)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/def_function.py:885 __call__
    else:
/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/def_function.py:924 _call
    "\n"
/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py:3038 __call__
    seen_names.add(proposal)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py:3463 _maybe_define_function
    @tf.contrib.eager.defun
/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py:3308 _create_graph_function
    TypeError: If the function inputs include non-hashable objects
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py:1007 func_graph_from_py_func
    for arg in (nest.flatten(func_args, expand_composites=True) +
/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/def_function.py:668 wrapped_fn
    
/usr/local/lib/python3.7/dist-packages/tensorflow/python/saved_model/function_deserialization.py:294 restored_function_body
    def load_function_def_library(library, load_shared_name_suffix=None):

ValueError: Could not find matching function to call loaded from the SavedModel. Got:
  Positional arguments (1 total):
    * Tensor("x:0", shape=(None, 80000, 2), dtype=float32)
  Keyword arguments: {}

Expected these arguments to match one of the following 1 option(s):

Option 1:
  Positional arguments (1 total):
    * TensorSpec(shape=(None, 80000, 1), dtype=tf.float32, name='x')
  Keyword arguments: {}
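
The last lines of the traceback show the mismatch: the input has shape (None, 80000, 2), that is two channels, while the SavedModel expects (None, 80000, 1). A likely cause is a stereo recording. A minimal sketch of a downmix to mono, assuming that averaging the channels is acceptable for this model (the helper `to_mono` is hypothetical and not taken from the notebook):

```python
import numpy as np


def to_mono(audio):
    """Average the channel axis so the last dimension becomes 1.

    Accepts (samples,), (samples, channels) or (batch, samples, channels).
    Hypothetical helper for illustration only.
    """
    audio = np.asarray(audio, dtype=np.float32)
    if audio.ndim == 1:
        audio = audio[:, np.newaxis]  # already mono, just add a channel axis
    return audio.mean(axis=-1, keepdims=True)


# A stereo batch shaped like the one in the traceback.
stereo = np.zeros((1, 80000, 2), dtype=np.float32)
stereo[..., 0] = 1.0
mono = to_mono(stereo)
print(mono.shape)
```

Passing `mono` instead of the stereo array to `model.predict` should satisfy the expected `TensorSpec(shape=(None, 80000, 1))`.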

Question about preprocessing

First of all, thank you for sharing your models!
I was wondering whether you trim silence from the audio during preprocessing. Your model works pretty well when there is voice right away, but if someone lingers at the beginning of the audio, it predicts noise. How do you think one should approach this issue?
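
Whether the repository trims silence during preprocessing is not answered here. As a sketch of one common workaround, leading silence could be removed with a simple energy threshold before the clip is passed to the model. The function name, the frame length of 400 samples and the RMS threshold of 0.01 are all assumptions chosen for illustration and would need tuning on real recordings.

```python
import numpy as np


def trim_leading_silence(audio, frame_len=400, threshold=0.01):
    """Drop leading frames whose RMS energy is at or below `threshold`.

    Hypothetical helper: `frame_len` and `threshold` are illustrative
    defaults, not values taken from the repository.
    """
    audio = np.asarray(audio, dtype=np.float32).reshape(-1)
    n_frames = len(audio) // frame_len
    if n_frames == 0:
        return audio  # too short to analyse
    frames = audio[:n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    loud = np.flatnonzero(rms > threshold)
    if loud.size == 0:
        return audio  # nothing exceeds the threshold; leave unchanged
    # Keep everything from the first loud frame onward.
    return audio[loud[0] * frame_len:]


# 1600 silent samples followed by 800 samples of signal.
clip = np.concatenate([np.zeros(1600), 0.5 * np.ones(800)])
trimmed = trim_leading_silence(clip)
print(len(trimmed))
```

A dedicated voice activity detector would likely be more robust than a fixed threshold when the background is noisy rather than silent.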
