karodievas / car-sound-classification-with-keras

Car engine sound classification with Keras Deep learning library using spectro analysis pictures of sounds

License: MIT License

Python 100.00%
car-engine-sound car-sound-classification tensorflow predictor python-predictor spectro-analysis-pictures image-classification

car-sound-classification-with-keras's Introduction

Car sound classification with Keras

Car engine sound classification with Keras Deep learning library using Morlet pictures of sounds

Purpose

  • To detect the engine sound of a specific car.

About

The goal is to create a deep learning algorithm that can detect the engine sound of a specific car. This library is still in development. Later versions do not include training data because of its size. In Data/audio/original_source you can find m4a files of different car engines. A file with a representation of this data will be provided later. To use the data, convert it to wav files and pass them to the train data generation step.
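The m4a to wav conversion mentioned above is not described further, so the following is only a minimal sketch. It assumes ffmpeg (listed under Requirements) is available on PATH. The function names and the Data/raw/other target folder are hypothetical, and downmixing to mono with -ac 1 is an assumption.

```python
import subprocess
from pathlib import Path


def ffmpeg_command(src, dst_dir):
    """Build the ffmpeg call that converts one m4a file to a mono wav file."""
    src = Path(src)
    dst = Path(dst_dir) / (src.stem + ".wav")
    # -y overwrites an existing output file, -ac 1 downmixes to one channel.
    return ["ffmpeg", "-y", "-i", str(src), "-ac", "1", str(dst)]


def convert_all(src_dir="Data/audio/original_source", dst_dir="Data/raw/other"):
    """Convert every m4a file in src_dir; dst_dir is a hypothetical target."""
    Path(dst_dir).mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(src_dir).glob("*.m4a")):
        # check=True raises CalledProcessError if ffmpeg fails on a file.
        subprocess.run(ffmpeg_command(src, dst_dir), check=True)
```

Calling convert_all() once before python Main.py would leave the wav files in the chosen category folder.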

Possible usage

  • An automatic garden door system triggered by car engine sound

Requirements

  • Anaconda 3
  • Keras
  • Python 3.6
  • pip >= 9.*
  • Librosa
  • ffmpeg
  • tensorflow
  • matplotlib

Installation

  • python setup.py install --user

Usage

In the folder Data/raw, put your data into separate folders, one per category. For example:

  • Data/raw/audi: some wav files
  • Data/raw/other: some wav files

You can have as many categories as you want. Just do not forget to modify Models/KerasModel.py, TrainModel.py and Predictor.py, because these are configured for binary classification.
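What typically changes between a binary and a multi-class Keras setup can be sketched as follows. This is not the content of Models/KerasModel.py. The helper names are hypothetical and the rmsprop optimizer is an assumption. Keras is imported lazily so the settings helper can be used on its own.

```python
def head_settings(num_classes):
    """Return output-layer and generator settings for a given class count."""
    if num_classes < 2:
        raise ValueError("need at least two classes")
    if num_classes == 2:
        # One sigmoid unit: the output is the probability of class 1.
        return {"units": 1, "activation": "sigmoid",
                "loss": "binary_crossentropy", "class_mode": "binary"}
    # One softmax unit per class, trained on one-hot labels.
    return {"units": num_classes, "activation": "softmax",
            "loss": "categorical_crossentropy", "class_mode": "categorical"}


def add_head(model, num_classes):
    """Append the output layer to a Keras Sequential model and compile it."""
    from keras.layers import Dense  # imported lazily; requires Keras

    cfg = head_settings(num_classes)
    model.add(Dense(cfg["units"], activation=cfg["activation"]))
    # The optimizer is an assumption, not taken from the repository.
    model.compile(loss=cfg["loss"], optimizer="rmsprop", metrics=["accuracy"])
    # Pass the returned value as class_mode to flow_from_directory.
    return cfg["class_mode"]
```

The same class_mode has to be used for both the train and the validation generator, otherwise the label shape will not match the output layer.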

To generate data from wav files:

  • python Main.py

It generates the whole directory structure for you:

  • Slices the audio source (currently mono only)
  • Puts half of the slices in the train directory and half in the validation directory
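The two steps above can be sketched with plain index arithmetic. This is an illustration only, not the code in Main.py: the function names are hypothetical, and dropping a trailing partial slice, keeping the original order, and sending the odd item to validation are all assumptions.

```python
def slice_bounds(total_frames, frames_per_slice):
    """Return (start, end) frame ranges for full-length slices only."""
    count = total_frames // frames_per_slice
    return [(i * frames_per_slice, (i + 1) * frames_per_slice)
            for i in range(count)]


def split_half(items):
    """Put the first half of the items in train, the rest in validation."""
    middle = len(items) // 2
    return items[:middle], items[middle:]


bounds = slice_bounds(10, 3)
train, validation = split_half(bounds)
print(bounds)      # [(0, 3), (3, 6), (6, 9)]
print(train)       # [(0, 3)]
print(validation)  # [(3, 6), (6, 9)]
```

Splitting consecutive slices of one recording between train and validation makes the two sets closely related, which is worth keeping in mind when reading the validation accuracy.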

To train the network:

  • python TrainModel.py

Before running, update these parameters to suit your needs:

  • nb_train_samples
  • nb_validation_samples
  • nb_epoch
  • batch_size
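The sample counts and the batch size together determine how many batches make up one epoch. A minimal sketch of that arithmetic follows. The numbers are example values only, steps_for is a hypothetical helper, and rounding up (so the last partial batch is still used) is an assumption about the desired behaviour, not necessarily what TrainModel.py does.

```python
import math


def steps_for(nb_samples, batch_size):
    """Number of batches needed to see nb_samples images once."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return math.ceil(nb_samples / batch_size)


nb_train_samples = 3700       # example value only
nb_validation_samples = 3700  # example value only
batch_size = 32               # example value only

steps_per_epoch = steps_for(nb_train_samples, batch_size)
validation_steps = steps_for(nb_validation_samples, batch_size)
print(steps_per_epoch, validation_steps)  # 116 116
```

Passing whole numbers for the step counts avoids relying on how a given Keras version treats a fractional value.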

Only improved weights are saved. When training ends, you get two tables showing how the training went.
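Saving only improved weights is commonly done in Keras with the ModelCheckpoint callback. The sketch below shows one possible configuration, not necessarily the one in TrainModel.py: the file name is hypothetical, and treating a lower validation loss as "improved" is an assumption.

```python
def checkpoint_options(path="best.weights.h5"):
    """Keyword arguments for a checkpoint that keeps only improved weights."""
    return {
        "filepath": path,           # hypothetical file name
        "monitor": "val_loss",      # assumption: lower validation loss is better
        "save_best_only": True,     # skip epochs that did not improve
        "save_weights_only": True,  # store weights, not the whole model
        "verbose": 1,
    }


def make_checkpoint(path="best.weights.h5"):
    """Build the Keras callback; pass it to training via callbacks=[...]."""
    from keras.callbacks import ModelCheckpoint  # imported lazily; requires Keras

    return ModelCheckpoint(**checkpoint_options(path))
```

The resulting weight file is what Predictor.py would then be pointed at.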

For prediction:

  • python Predictor.py path/to/your/weight_file path/to/image_you_want_to_predict

Conclusion

It is better to use binary classification (only two classes), because you can concentrate more training data on each class and the model will be more accurate. For me it gives ~0.8 accuracy (~0.2 loss). I had around 3700 pictures for each class in validation and train.

I have also tried multi-class classification. I had 47 classes with about 100-200 pictures each, which is a really small amount, and the network trains very slowly. The results were very poor: only ~4.1 accuracy, which is basically nothing. So this model needs more data or a different approach.

car-sound-classification-with-keras's People

Contributors

karodievas

car-sound-classification-with-keras's Issues

TrainModel.py Errors

Hi there,

I'm trying to use my Raspberry Pi to build a sound classifier based on your project. I ran into some trouble while trying to run TrainModel.py after Main.py finished.

  • K.set_image_dim_ordering("th"): I searched the forum and it's a version problem. I replaced it with K.image_data_format()=='channels_first' and it works.
  • Then I got an error at line 22: Model = kerasModel.get_model(img_width,img_height)
    It shows ValueError: Negative dimension size caused by subtracting 2 from 1 for 'max_pooling2d_1/MaxPool' (op: 'MaxPool') with input shapes: [?,1,501,32]

Could you help me with that? I used around 200 samples and have already changed the number of train samples and validation samples.
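A possible reading of this error, offered as a sketch and not as a confirmed diagnosis: the expression K.image_data_format()=='channels_first' only compares the current setting and does not change it, so the backend may still be using channels-last. An input declared as (3, width, height) would then be read as 3 rows. The arithmetic below assumes a 3x3 convolution with 32 filters followed by 2x2 max pooling, which matches the reported shape but is not taken from the model source. The width 503 is inferred from 501 + 2 and the height 100 is a placeholder.

```python
def valid_length(size, window, stride=1):
    """Output length along one axis for 'valid' padding (<= 0 means invalid)."""
    return (size - window) // stride + 1


# Input declared as (channels, width, height) = (3, 503, 100) but read as
# channels-last, i.e. rows=3, cols=503, channels=100.
rows, cols = 3, 503

# Assumed first layer: 3x3 convolution with 32 filters, 'valid' padding.
rows, cols = valid_length(rows, 3), valid_length(cols, 3)
print(rows, cols)  # 1 501, matching the reported [?,1,501,32]

# Assumed next layer: 2x2 max pooling with stride 2.
pooled_rows = valid_length(rows, 2, stride=2)
print(pooled_rows)  # 0: no room for a 2-wide window, hence the ValueError
```

Two things that may help, neither verified against this repository: call K.set_image_data_format('channels_first') (with from keras import backend as K) before the model is built, or keep channels-last and declare the input with the channel axis last.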

Value Error related to Channel Order

Hi, I have changed the script. The first epoch is successfully run but the same value error appeared.

ValueError: Error when checking input: expected conv2d_1_input to have shape (3, 496, 369) but got array with shape (496, 369, 3)

I have tried running the codes on Jupyter and apparently this is the part where it has errors.

history = model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples/batch_size,
    epochs=nb_epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples/batch_size,
    callbacks=[check_pointer])

Do I simply change the channel order to channels last instead of channels first? Or is there a better way?
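The message says the model was built for (3, 496, 369) while the generator delivers (496, 369, 3), so the model's declared input shape and the active image data format disagree. One way to keep them in step is to derive the shape from whatever format Keras reports. This is a sketch with hypothetical helper names. The axis order mirrors the error message and is not checked against the repository's code.

```python
def input_shape_for(data_format, img_width, img_height, channels=3):
    """Input shape tuple matching the given Keras image data format."""
    if data_format == "channels_first":
        return (channels, img_width, img_height)
    if data_format == "channels_last":
        return (img_width, img_height, channels)
    raise ValueError("unknown data format: %r" % (data_format,))


def current_input_shape(img_width, img_height):
    """Ask Keras which format is active and build the matching shape."""
    from keras import backend as K  # imported lazily; requires Keras

    return input_shape_for(K.image_data_format(), img_width, img_height)


print(input_shape_for("channels_last", 496, 369))   # (496, 369, 3)
print(input_shape_for("channels_first", 496, 369))  # (3, 496, 369)
```

With the TensorFlow backend, channels-last is the default, so building the model with the channels-last shape is usually the simpler route.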

No documentation.

Your project seems interesting. Is it actually public ? There's absolutely no documentation on how to run it.

Crash running under anaconda

$ python predictor.py

Using TensorFlow backend.
Traceback (most recent call last):
File "predictor.py", line 21, in
model.add(MaxPooling2D(pool_size=(2, 2)))
File "/Users/pato/anaconda/lib/python2.7/site-packages/keras/models.py", line 332, in add
output_tensor = layer(self.outputs[0])
File "/Users/pato/anaconda/lib/python2.7/site-packages/keras/engine/topology.py", line 572, in call
self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
File "/Users/pato/anaconda/lib/python2.7/site-packages/keras/engine/topology.py", line 635, in add_inbound_node
Node.create_node(self, inbound_layers, node_indices, tensor_indices)
File "/Users/pato/anaconda/lib/python2.7/site-packages/keras/engine/topology.py", line 166, in create_node
output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0]))
File "/Users/pato/anaconda/lib/python2.7/site-packages/keras/layers/pooling.py", line 160, in call
dim_ordering=self.dim_ordering)
File "/Users/pato/anaconda/lib/python2.7/site-packages/keras/layers/pooling.py", line 210, in _pooling_function
pool_mode='max')
File "/Users/pato/anaconda/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 2866, in pool2d
x = tf.nn.max_pool(x, pool_size, strides, padding=padding)
File "/Users/pato/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/nn_ops.py", line 1793, in max_pool
name=name)
File "/Users/pato/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 1598, in _max_pool
data_format=data_format, name=name)
File "/Users/pato/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
op_def=op_def)
File "/Users/pato/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2397, in create_op
set_shapes_for_outputs(ret)
File "/Users/pato/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1757, in set_shapes_for_outputs
shapes = shape_func(op)
File "/Users/pato/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1707, in call_with_requiring
return call_cpp_shape_fn(op, require_shape_fn=True)
File "/Users/pato/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 610, in call_cpp_shape_fn
debug_python_shape_fn, require_shape_fn)
File "/Users/pato/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 675, in _call_cpp_shape_fn_impl
raise ValueError(err.message)
ValueError: Negative dimension size caused by subtracting 2 from 1 for 'MaxPool' (op: 'MaxPool') with input shapes: [?,1,254,32].

Predictor.py arguments

According to the code, the arguments are:
img_path = sys.argv[1]
weights = sys.argv[2]
So not just the img path as provided in the readme. Please provide a full example.
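Confusion about positional argument order can be avoided with named arguments. The following is a hypothetical alternative interface, not the current behaviour of Predictor.py. The flag names --weights and --image are invented for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse named arguments so their order on the command line is irrelevant."""
    parser = argparse.ArgumentParser(description="Predict the class of one image.")
    parser.add_argument("--weights", required=True,
                        help="path to the weight file")
    parser.add_argument("--image", required=True,
                        help="path to the image to classify")
    return parser.parse_args(argv)


args = parse_args(["--image", "path/to/image.png", "--weights", "path/to/weights.h5"])
print(args.weights)  # path/to/weights.h5
print(args.image)    # path/to/image.png
```

With this scheme, a missing argument produces a usage message instead of an IndexError on sys.argv.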

Predictor.py Issues

Hi,

May I know how shall I properly use the Predictor.py? I tried the following ways but it seems not working.

Error

Categorizing the Original Audio Files

Hi, may I know how I should categorize the original audio sources? And where did you get these original audio sources?

I hope to get your help and guidance, as I am doing this for my Final Year Project and I am a newbie in deep learning.

Thanks in advance.

Loss function to use

Hello,

Do you have any idea which loss function shall I use if I have 4 folders for 4 different car brands? Can I use the same binary_crossentropy function or shall I change it to Multi-Class Cross-Entropy?

I appreciate much if you reply.

With regards,
Teoh
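As general Keras practice (not a reply from the author): with four mutually exclusive classes, the usual choice is a softmax output with one unit per class, one-hot labels, and categorical_crossentropy, while binary_crossentropy fits a single sigmoid output. The quantity that loss computes for one sample can be sketched in plain Python. The probabilities below are made-up example values.

```python
import math


def categorical_crossentropy(one_hot, probs, eps=1e-7):
    """Cross-entropy between a one-hot label and predicted class probabilities."""
    # eps guards against log(0) when a predicted probability is exactly zero.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))


# Four classes, true class is index 2 (example values only).
label = [0, 0, 1, 0]
confident = [0.05, 0.05, 0.85, 0.05]
uniform = [0.25, 0.25, 0.25, 0.25]

print(round(categorical_crossentropy(label, confident), 3))  # 0.163
print(round(categorical_crossentropy(label, uniform), 3))    # 1.386
```

The uniform case equals ln(4), which is the loss a four-class model shows when it has learned nothing, and a useful baseline when reading training logs.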
