You can try our image classifier app here
Link to our pages: https://jorgeluisgalarraga.github.io/kick-and-punch-image-classification/
This project began as a group idea for a class project. At first, however, we could not get good accuracy.
We built the dataset from scratch by watching MMA videos on YouTube and collecting fight images from the internet. The dataset contains four classes: kick, kicknt, punch, and punchnt. The "nt" suffix means "not touch".
The dataset can be downloaded from Roboflow. We used this tool to label the images by class and to split the dataset, and we found it interesting and easy to use.
So far, we have tried several Keras Applications models: VGG16, ResNet50V2, InceptionResNetV2, MobileNetV2, and EfficientNet.
With these five models and 4 classes, we got the following results:
- VGG16 with 100,356 trainable parameters and 14,815,044 total parameters, no dropout but with data augmentation
- ResNet50V2 with 401,412 trainable parameters and 23,966,212 total parameters, no dropout but with data augmentation
- InceptionResNetV2 with 3,985,412 trainable parameters and 55,125,732 total parameters, with a dropout of 0.5, average pooling, and a fully connected layer
- MobileNetV2
- EfficientNet

Note that some of the models have no dropout, because we found that adding dropout made performance worse in each case.
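
The trainable parameter counts listed for VGG16 and ResNet50V2 are consistent with a frozen base followed by a single dense layer on the flattened feature map. The sketch below checks that arithmetic. It assumes 224x224 inputs (so both bases output a 7x7 feature map, with 512 and 2048 channels respectively) and a flatten-then-dense head with one output per class; these are assumptions inferred from the numbers, and `dense_head_params` is a hypothetical helper, not part of Keras.

```python
def dense_head_params(feature_map_shape, num_classes):
    """Weights plus biases of one dense layer applied to a flattened feature map."""
    height, width, channels = feature_map_shape
    flattened = height * width * channels
    # One weight per (input feature, class) pair, plus one bias per class.
    return flattened * num_classes + num_classes


# Assumed output shapes of the frozen bases for 224x224 RGB inputs.
ASSUMED_FEATURE_MAPS = {
    "VGG16": (7, 7, 512),
    "ResNet50V2": (7, 7, 2048),
}

for name, shape in ASSUMED_FEATURE_MAPS.items():
    print(name, dense_head_params(shape, num_classes=4))
```

Under these assumptions the helper reproduces the trainable counts listed for VGG16 and ResNet50V2 with four classes.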
After running a number of experiments, we decided to remove two classes, kicknt and punchnt. These classes were so similar to kick and punch that the models could confuse them during training. The dataset can be found on Roboflow Universe.
To compare against the same models, we trained them under similar conditions; for these experiments, however, we included a dropout of 0.5. With these five models and two classes, we got the following results:
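
Dropping the two classes can also be done on a local copy of the dataset. Below is a minimal sketch, assuming a folder-per-class layout such as `train/kick/`, `train/kicknt/`, and so on. The directory layout and the helper `copy_selected_classes` are assumptions for illustration, not necessarily the procedure used for the published dataset.

```python
import shutil
from pathlib import Path

KEEP_CLASSES = {"kick", "punch"}  # classes kept for the two-class experiments


def copy_selected_classes(src_root, dst_root, keep=KEEP_CLASSES):
    """Copy only the class folders named in `keep` from every split.

    Assumes the layout <src_root>/<split>/<class>/<image files>.
    Returns the number of files copied.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    copied = 0
    for split_dir in sorted(p for p in src_root.iterdir() if p.is_dir()):
        for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
            if class_dir.name not in keep:
                continue  # skips kicknt and punchnt
            target = dst_root / split_dir.name / class_dir.name
            target.mkdir(parents=True, exist_ok=True)
            for image in class_dir.iterdir():
                if image.is_file():
                    shutil.copy2(image, target / image.name)
                    copied += 1
    return copied
```

For example, `copy_selected_classes("dataset_4_classes", "dataset_2_classes")` would write a two-class copy next to the original; both paths are hypothetical.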
- VGG16 with 50,178 trainable parameters and 14,764,866 total parameters
- ResNet50V2 with 200,706 trainable parameters and 23,756,506 total parameters
- InceptionResNetV2 with 3,074 trainable parameters and 54,339,810 total parameters
- MobileNetV2
- EfficientNet
We can see that overall performance increased by about 30% on average across the five models, which supports our assumption that the four classes were too similar.
After seeing the improvement in accuracy, we decided to run some experiments with the YOLOv8n classification model.
To keep the comparison fair, we tried both the four-class and the two-class datasets.
Let's compare the results:
YoloV8n 4 classes | YoloV8n 2 classes |
---|---|
In conclusion, we got the best accuracy overall when using two classes and YOLOv8.