Xavier Giro-i-Nieto | Aniol Lidon | Marc Bolaños | Maite Garolera | Mariella Dimiccoli | Petia Radeva |
A joint collaboration between:
Image Processing Group at Universitat Politecnica de Catalunya (UPC) | Computer Vision Group at Universitat de Barcelona (UB) |
Accepted as an oral presentation at the 2nd Lifelogging Tools and Applications Workshop (LTA'17) at ACM Multimedia 2017.
With the rapid increase in the number of wearable camera users in recent years, and in the amount of data they produce, there is a strong need for automatic retrieval and summarization techniques. This work addresses the problem of automatically summarizing egocentric photo streams captured with a wearable camera by taking an image retrieval perspective. After non-informative images are removed by a new CNN-based filter, the remaining images are ranked by relevance to ensure semantic diversity and finally re-ranked by a novelty criterion to reduce redundancy. To assess the results, a new evaluation metric is proposed that takes into account the non-uniqueness of the solution. Experiments on a database of 7,110 images from 6 different subjects, evaluated by experts, yielded 95.74% expert satisfaction and a Mean Opinion Score of 4.57 out of 5.0.
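The three stages named in the abstract (filter, rank by relevance, re-rank by novelty) can be illustrated with a minimal sketch. This is not the code from the paper: the function name `summarize`, the thresholds, the cosine similarity on feature vectors, and the greedy relevance-minus-redundancy score are all assumptions chosen for illustration, and the per-image informativeness and relevance scores are taken as given rather than computed by a CNN.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def summarize(features, informativeness, relevance, k, min_info=0.5, trade_off=0.5):
    """Return the indices of up to k images, in summary order.

    All parameters are hypothetical stand-ins:
    features        -- one feature vector per image
    informativeness -- one score per image (assumed output of some filter)
    relevance       -- one score per image (assumed output of some ranker)
    """
    # Stage 1: drop images scored as non-informative.
    candidates = [i for i, s in enumerate(informativeness) if s >= min_info]

    # Stage 2: rank the survivors by relevance, best first.
    candidates.sort(key=lambda i: relevance[i], reverse=True)

    # Stage 3: greedy novelty re-ranking. Each pick trades relevance against
    # similarity to the images that were already selected.
    selected = []
    while candidates and len(selected) < k:
        def score(i):
            redundancy = max(
                (cosine(features[i], features[j]) for j in selected),
                default=0.0,
            )
            return trade_off * relevance[i] - (1 - trade_off) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With `trade_off=1.0` the sketch reduces to a plain relevance ranking; lower values penalize near-duplicates more strongly, which is the intuition behind re-ranking for novelty.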
An arXiv pre-print is already available.
Please cite with the following Bibtex code:
@inproceedings{lidon2017semantic,
title={Semantic Summarization of Egocentric Photo Stream Events},
author={Lidon, Aniol and Bola{\~n}os, Marc and Dimiccoli, Mariella and Radeva, Petia and Garolera, Maite and Gir{\'o}-i-Nieto, Xavier},
booktitle = {Proceedings of the Second Workshop on Lifelogging Tools and Applications},
series = {LTA '17},
year={2017},
location = {Mountain View, CA, USA},
publisher = {ACM},
address = {New York, NY, USA}
}
You may also want to refer to our publication with the more human-friendly style:
Lidon, Aniol, Marc Bolaños, Mariella Dimiccoli, Petia Radeva, Maite Garolera, and Xavier Giró-i-Nieto. "Semantic Summarization of Egocentric Photo Stream Events." In Proceedings of the Second Workshop on Lifelogging Tools and Applications (LTA '17). ACM, New York, NY, USA.
This work is an extension of the master's thesis by Aniol Lidon for the Master in Computer Vision Barcelona, class of 2015.
The EDUB-Seg dataset has been used in this paper. Please cite this publication if you use it. You might also want to check its project page.
Three examples of the top 5 images obtained before introducing diversity (odd rows) and after introducing it (even rows).
We would like to especially thank Albert Gil Moreno and Josep Pujal from our technical support team at the Image Processing Group at the UPC.
Albert Gil | Josep Pujal |
We gratefully acknowledge the support of NVIDIA Corporation with the donation of the GeForce GTX Titan Z and Titan X used in this work.
The Image Processing Group at UPC (SGR1421) and the Computer Vision Group at UB (SGR1219) are both Consolidated Research Groups recognized and sponsored by the Catalan Government (Generalitat de Catalunya) through its AGAUR office. Mariella Dimiccoli is supported by a Beatriu de Pinos grant, Marie-Curie COFUND action.
This work has been developed in the framework of the projects BigGraph TEC2013-43935-R and TIN2012-38187-C03-01, funded by the Spanish Ministerio de Economía y Competitividad and the European Regional Development Fund (ERDF).
If you have any general questions about our work or code that may be of interest to other researchers, please use the issues section of this GitHub repo. Alternatively, drop us an e-mail at mailto:[email protected].