karodievas / car-sound-classification-with-keras

Car engine sound classification with the Keras deep learning library, using spectral analysis pictures of sounds

License: MIT License

Python 100.00%

car-engine-sound car-sound-classification tensorflow predictor python-predictor spectro-analysis-pictures image-classification

car-sound-classification-with-keras's Introduction

Car sound classification with Keras

Car engine sound classification with the Keras deep learning library, using Morlet pictures of sounds

Purpose

To detect the sound of a specific car engine.

About

The goal is to create a deep learning algorithm that can detect the sound of a specific car engine. This library is still in development. Later versions do not include training data because of its size. In Data/audio/original_source you can find m4a files of different car engines. Later I will provide a file with their representation. To use that data, convert it to wav files and pass them to the training data generation step.
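
The m4a to wav conversion can be scripted around ffmpeg, which is already in the requirements. The sketch below is not part of the repository: the function names are hypothetical, and it assumes ffmpeg is on PATH and that mono output is wanted because the slicing step currently supports mono only.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src, dst_dir):
    """Return the ffmpeg argument list that converts one m4a file to a mono wav."""
    src = Path(src)
    dst = Path(dst_dir) / (src.stem + ".wav")
    # -y overwrites an existing output file, -ac 1 downmixes to a single channel.
    return ["ffmpeg", "-y", "-i", str(src), "-ac", "1", str(dst)]


def convert_all(src_dir, dst_dir):
    """Convert every .m4a file found directly in src_dir (hypothetical helper)."""
    Path(dst_dir).mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(src_dir).glob("*.m4a")):
        # check=True raises CalledProcessError if ffmpeg exits with an error.
        subprocess.run(build_ffmpeg_command(src, dst_dir), check=True)
```

For example, convert_all("Data/audio/original_source", "Data/raw/audi") would write one wav file per m4a file into the audi category folder.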

Possible usage

An automatic garden door system triggered by the sound of a car engine

Requirements

Anaconda 3
Keras
Python 3.6
pip >= 9.*
Librosa
ffmpeg
tensorflow
matplotlib

Installation

python setup.py install --user

Usage

In the folder Data/raw, put your data in separate folders, one per category. For example:

Data/raw/audi - some wav files
Data/raw/other - some wav files

You can have as many categories as you want. Just remember to modify Models/KerasModel.py, TrainModel.py and Predictor.py, because they are configured for binary classification.
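
What usually changes when going from two categories to more is the output layer, the loss and the generator's class mode. The helper below is a hypothetical sketch, not code from the repository: it assumes the binary setup uses a single sigmoid output with binary_crossentropy and a Keras image generator with class_mode, and it only returns the standard Keras setting names for each case.

```python
def output_settings(num_classes):
    """Return the usual Keras settings for a classifier with num_classes categories."""
    if num_classes < 2:
        raise ValueError("need at least two classes")
    if num_classes == 2:
        # Binary case: one sigmoid unit that outputs the probability of one class.
        return {
            "units": 1,
            "activation": "sigmoid",
            "loss": "binary_crossentropy",
            "class_mode": "binary",
        }
    # Multi-class case: one softmax unit per category and one-hot labels.
    return {
        "units": num_classes,
        "activation": "softmax",
        "loss": "categorical_crossentropy",
        "class_mode": "categorical",
    }
```

With four folders under Data/raw, output_settings(4) suggests a final layer of 4 softmax units trained with categorical_crossentropy.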

To generate data from wav files:

python Main.py

It will generate the whole required structure for you:

Slices the audio source (currently mono only)
Puts half of the slices in the train directory and half in the validation directory
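
The two steps above can be sketched as follows. This is an illustration, not the code in Main.py: the function names are hypothetical, the slice length is whatever you choose, dropping the short tail is an assumption, and with an odd number of slices the extra one goes to validation here.

```python
def slice_bounds(n_samples, slice_len):
    """Start and end sample indices of consecutive full-length slices.

    A trailing piece shorter than slice_len is dropped (an assumption).
    """
    return [(start, start + slice_len)
            for start in range(0, n_samples - slice_len + 1, slice_len)]


def split_half(paths):
    """Split slice files into a train half and a validation half."""
    ordered = sorted(paths)
    middle = len(ordered) // 2
    return ordered[:middle], ordered[middle:]
```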

To train the network:

python TrainModel.py

Before running, update these parameters to suit your needs:

nb_train_samples
nb_validation_samples
nb_epoch
batch_size

Only improved weights are saved. At the end of training you will get two tables showing how your training went.
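
The sample counts and the batch size together determine how many batches make up one epoch. The fit_generator call quoted in the issues passes nb_train_samples/batch_size, which is a float in Python 3. The helper below is a hypothetical sketch of an integer alternative, assuming you want every sample seen once per epoch, so the last partial batch counts as a step.

```python
import math


def steps_for(n_samples, batch_size):
    """Number of batches needed to cover n_samples once, as an integer."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Round up so the final, smaller batch is not skipped.
    return math.ceil(n_samples / batch_size)
```

For example, 3700 samples with a batch size of 32 give steps_for(3700, 32) == 116.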

For prediction:

python Predictor.py path/to/your/weight_file path/to/image_you_want_to_predict

Conclusion

It is better to use binary classification (only two classes), because you can concentrate more training data on each class and the model will be more accurate. For me it gives ~0.8 accuracy or ~0.2 loss. I had around 3700 pictures for each class in validation and train.

I have also tried it with many classes. I had 47 classes with about 100-200 pictures each, which is a really small amount, and the network trains very slowly. The results were very poor: only ~4.1 accuracy, which is basically nothing. So this model needs more data or a different approach.

car-sound-classification-with-keras's Issues

Accuracy is High but Validation Accuracy is Bad

Hi, I have been working on the channel ordering issue and finally managed to train the model, but the results are very bad. Do you have any suggestions on how I can improve them?

The results are linked below:

https://colab.research.google.com/drive/1xHghjvOEVZc3bGtLfgRNw6Xq51d7U_XA

https://colab.research.google.com/drive/1G_oh_f-2TxF8oa6bVpn-qXlRyOfAPAAq

TrainModel.py Errors

Hi there,

I'm trying to use my Raspberry Pi to build a sound classifier based on your project. I ran into some trouble when running TrainModel.py after Main.py finished.

K.set_image_dim_ordering("th"): I searched the forum and it is a version problem. I replaced it with K.image_data_format()=='channels_first' and it works.
Then I got an error at line 22: Model = kerasModel.get_model(img_width,img_height)
It shows ValueError: Negative dimension size caused by subtracting 2 from 1 for 'max_pooling2d_1/MaxPool' (op: 'MaxPool') with input shapes: [?,1,501,32]

Could you help me with that? I used around 200 samples and have already changed the number of train and validation samples.
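
One possible reading of this error: K.image_data_format()=='channels_first' only compares the current setting and does not change it (the Keras 2 setter is K.set_image_data_format('channels_first')), so an input declared as (3, width, height) may be read as channels last, with the 3 treated as a spatial size. The sketch below only does the shape arithmetic. It assumes the model starts with a 3x3 convolution without padding followed by 2x2 max pooling, which is inferred from the reported shape [?,1,501,32] and not confirmed by the source. The function names and the 503x503 image size are hypothetical.

```python
def conv_valid(size, kernel):
    """Output size of a convolution without padding and with stride 1."""
    return size - kernel + 1


def pool_valid(size, pool):
    """Output size of max pooling without padding, stride equal to the window."""
    return (size - pool) // pool + 1


def trace_first_block(input_shape, data_format, kernel=3, pool=2):
    """Follow the two spatial axes through one conv layer and one pooling layer."""
    if data_format == "channels_first":
        spatial = input_shape[1:]   # (channels, rows, cols)
    else:
        spatial = input_shape[:2]   # (rows, cols, channels)
    after_conv = tuple(conv_valid(s, kernel) for s in spatial)
    if min(after_conv) < pool:
        # This is the situation TensorFlow reports as a negative dimension size.
        raise ValueError(f"cannot pool {after_conv} with a window of {pool}")
    return tuple(pool_valid(s, pool) for s in after_conv)
```

Read as channels first, (3, 503, 503) passes through the block. Read as channels last, the axis of size 3 shrinks to 1 after the convolution and the pooling step fails, matching the 1 and 501 in the reported shape.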

Value Error related to Channel Order

Hi, I have changed the script. The first epoch runs successfully, but then the same ValueError appears.

ValueError: Error when checking input: expected conv2d_1_input to have shape (3, 496, 369) but got array with shape (496, 369, 3)

I tried running the code in Jupyter, and apparently this is the part that raises the error.

history = model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples/batch_size,
    epochs=nb_epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples/batch_size,
    callbacks=[check_pointer])

Do I simply change the channel order to channels last instead of channels first? Or is there a better way?
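
The message says the model was built for (3, 496, 369), which is channels first, while the generator delivers (496, 369, 3), which is channels last. A common way to keep the two in agreement is to derive the model's input shape from the active data format instead of hard-coding it. The helper below is a hypothetical sketch, not the repository's code. In a Keras script the data_format argument would come from keras.backend.image_data_format().

```python
def input_shape_for(data_format, img_width, img_height, channels=3):
    """Return the input_shape tuple that matches the given Keras data format."""
    if data_format == "channels_first":
        return (channels, img_width, img_height)
    if data_format == "channels_last":
        return (img_width, img_height, channels)
    raise ValueError(f"unknown data format: {data_format!r}")
```

With this, input_shape_for("channels_last", 496, 369) gives (496, 369, 3), the shape of the arrays in the error message.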

No documentation.

Your project seems interesting. Is it actually public? There's absolutely no documentation on how to run it.

Crash running under anaconda

$ python predictor.py

Using TensorFlow backend.
Traceback (most recent call last):
File "predictor.py", line 21, in
model.add(MaxPooling2D(pool_size=(2, 2)))
File "/Users/pato/anaconda/lib/python2.7/site-packages/keras/models.py", line 332, in add
output_tensor = layer(self.outputs[0])
File "/Users/pato/anaconda/lib/python2.7/site-packages/keras/engine/topology.py", line 572, in call
self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
File "/Users/pato/anaconda/lib/python2.7/site-packages/keras/engine/topology.py", line 635, in add_inbound_node
Node.create_node(self, inbound_layers, node_indices, tensor_indices)
File "/Users/pato/anaconda/lib/python2.7/site-packages/keras/engine/topology.py", line 166, in create_node
output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0]))
File "/Users/pato/anaconda/lib/python2.7/site-packages/keras/layers/pooling.py", line 160, in call
dim_ordering=self.dim_ordering)
File "/Users/pato/anaconda/lib/python2.7/site-packages/keras/layers/pooling.py", line 210, in _pooling_function
pool_mode='max')
File "/Users/pato/anaconda/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 2866, in pool2d
x = tf.nn.max_pool(x, pool_size, strides, padding=padding)
File "/Users/pato/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/nn_ops.py", line 1793, in max_pool
name=name)
File "/Users/pato/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 1598, in _max_pool
data_format=data_format, name=name)
File "/Users/pato/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
op_def=op_def)
File "/Users/pato/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2397, in create_op
set_shapes_for_outputs(ret)
File "/Users/pato/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1757, in set_shapes_for_outputs
shapes = shape_func(op)
File "/Users/pato/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1707, in call_with_requiring
return call_cpp_shape_fn(op, require_shape_fn=True)
File "/Users/pato/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 610, in call_cpp_shape_fn
debug_python_shape_fn, require_shape_fn)
File "/Users/pato/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 675, in _call_cpp_shape_fn_impl
raise ValueError(err.message)
ValueError: Negative dimension size caused by subtracting 2 from 1 for 'MaxPool' (op: 'MaxPool') with input shapes: [?,1,254,32].

Predictor.py arguments

According to the code, the arguments are:
img_path = sys.argv[1]
weights = sys.argv[2]
So it is not just the img path, as given in the readme. Please provide a full example.

ValueError: Error when checking input: expected conv2d_1_input to have shape (3, 496, 369) but got array with shape (496, 369, 3)

I believe this error is about channel ordering, but I can't find any way to resolve it after searching on the Internet.

Can you help me with this issue? Which part of the code do I need to amend?

KerasModel.pdf

TrainModel.pdf

Predictor.py Issues

Hi,

May I know how to use Predictor.py properly? I tried the following ways, but they do not seem to work.

Categorizing the Original Audio Files

Hi, may I know how I should categorize the original audio sources? And where did you get these original audio sources?

I hope to get your help and guidance, as I am doing this for my Final Year Project and I am a newbie in deep learning.

Thanks in advance.

Loss function to use

Hello,

Do you have any idea which loss function I should use if I have 4 folders for 4 different car brands? Can I use the same binary_crossentropy function, or should I change it to multi-class cross-entropy?

I would much appreciate a reply.

With regards,
Teoh