
Comments (4)

PietroVitiello commented on September 18, 2024

Hello BrightMoonStar,
Thanks for checking the repo out!

For this project I was investigating some low-level action representations, and for that I used very common architectures without importing external work from anyone in particular. Therefore, there aren't many specific references related to the code that I can give you. That said, in case you are interested, I think these are possibly relevant works:

CNN (just an example; this is one of the very first papers): Y. LeCun, L. Bottou, Y. Bengio and P. Haffner, "Gradient-based learning applied to document recognition," in Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998, doi: 10.1109/5.726791.
ResNet: He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
UNet: Ronneberger, Olaf, Philipp Fischer, and Thomas Brox. "U-net: Convolutional networks for biomedical image segmentation." Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, October 5-9, 2015, Proceedings, Part III 18. Springer International Publishing, 2015.
Autoencoder: According to Schmidhuber, this paper (Ballard, Dana H. "Modular Learning in Neural Networks." AAAI Conference on Artificial Intelligence (1987)) formally introduced them. The concept of autoencoding had certainly been used before, but never for pretraining neural nets.
Action Image: A paper that proposes an interesting action representation that is effectively purely visual is Khansari, Mohi, et al. "Action image representation: Learning scalable deep grasping policies with zero real world data." 2020 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2020.

There are certainly many more papers relevant to the topic of action representations and, more generally, robot learning. However, to answer your question about references specifically related to the code, I think these are the main ones. If you are interested in more, I can share all of the references from my thesis.
I hope this might help you!

from actionrepresentation.

BrightMoonStar commented on September 18, 2024

Thank you very much for your reply. Regarding the statement in the readme.md file, "To leverage the use of the motion image we propose the MI-Net which makes use of an autoencoder and attention mechanism", I tried to find the paper that proposed MI-Net but failed. Could you point me to the paper that proposed MI-Net? Thank you again!


PietroVitiello commented on September 18, 2024

Hello BrightMoonStar,
I am proposing the MI-Net in this repo. However, it was only my MSc project, and I haven't submitted any paper about it. I suggest you take a look at the MI-Net architecture directly in src/Learning/Models/MotionIMG; this might make its functioning clearer.
I think the main takeaway of this repo should be the idea of the Motion Image as an action representation that can be predicted by the network. The specific architecture I used is not the most important part, as it is actually very simple and could easily be improved, so feel free to give it a try!


PietroVitiello commented on September 18, 2024

Closed due to inactivity. Feel free to reopen in case needed.
