movilan's Introduction

  1. Get familiar with the concepts (paper: https://arxiv.org/abs/2101.07891) (explanation slides: https://github.com/Homagn/MOVILAN/blob/main/MOVILAN_detailed_explanation.pptx)

  2. Set up Docker on your system. If you don't have nvidia-docker, follow the instructions here: https://github.com/Homagn/Dockerfiles/blob/main/Docker-knowhows/nvidia-docker-setup

  3. Build the necessary environment or pull a prebuilt one. To build the Docker image, download the source code from GitHub, navigate to the Dockerfile location in MOVILAN/, and run:

    sudo nvidia-docker build -t homagni/vision_language:latest .

    OR

    Pull the prebuilt docker image like this:

    docker pull homagni/vision_language:latest

  4. Download the necessary model weights and data. Download the zip file from Google Drive (https://drive.google.com/file/d/1Spz3o5wmYUIMyXsYl3tKYYTMapzkca1_/view?usp=sharing), then extract its contents into your MOVILAN/ source folder like this:

    alfred_model_1000_modification -> language_understanding/alfred_model_1000_modification

    data -> mapper/data

    nn_weights -> mapper/nn_weights

    unet_weights.pth -> cross_modal/unet_weights.pth

    prehash.npy -> cross_modal/prehash.npy

    descriptions.json -> cross_modal/data/descriptions.json
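The placement above can also be scripted. The following is a minimal sketch, not part of the MOVILAN code: the function name `place_downloads` and the two directory arguments are hypothetical, and it assumes the zip has already been extracted so that the six items sit directly inside `extracted_dir`.

```python
import shutil
from pathlib import Path

# Mapping taken from step 4: item in the extracted archive -> path relative to MOVILAN/
PLACEMENT = {
    "alfred_model_1000_modification": "language_understanding/alfred_model_1000_modification",
    "data": "mapper/data",
    "nn_weights": "mapper/nn_weights",
    "unet_weights.pth": "cross_modal/unet_weights.pth",
    "prehash.npy": "cross_modal/prehash.npy",
    "descriptions.json": "cross_modal/data/descriptions.json",
}


def place_downloads(extracted_dir, repo_dir):
    """Move each extracted item to its destination and report what happened.

    Returns a dict mapping each item name to "moved", "missing" (not found in
    extracted_dir) or "exists" (destination already present, left untouched).
    """
    extracted, repo = Path(extracted_dir), Path(repo_dir)
    status = {}
    for name, rel_target in PLACEMENT.items():
        src = extracted / name
        dst = repo / rel_target
        if not src.exists():
            status[name] = "missing"
        elif dst.exists():
            # Do not overwrite or nest inside an existing destination.
            status[name] = "exists"
        else:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(dst))
            status[name] = "moved"
    return status
```

Calling `place_downloads("/path/to/extracted", "/path/to/MOVILAN")` and checking the returned dict for `"missing"` entries is a quick way to confirm that nothing was left out before starting the container.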

  5. Run the docker instance

    (in a terminal in linux)

    xhost +

    (then, on a new line)

    (NOTE- replace /home/homagni/Desktop/MOVILAN/ with the location where you have downloaded the source code)

    sudo nvidia-docker run --rm -ti --mount type=bind,source=/home/homagni/Desktop/MOVILAN/,target=/ai2thor --net=host --ipc=host -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix --env="QT_X11_NO_MITSHM=1" homagni/vision_language

    (Now you'll be inside the terminal of the docker instance)

    (run the test code)

    cd /ai2thor

    python3 main_interactive.py

    (it should open an ai2thor instance and run our algorithm on an instruction from the ALFRED dataset)
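Because the only part of the docker command in step 5 that changes between machines is the source path, it can help to generate the command rather than edit it by hand. The sketch below only builds and prints the command; it does not run docker. The function name `build_run_command` and the `~/MOVILAN` location are hypothetical, while the flags are copied from the command above. `shlex.join` needs Python 3.8 or newer.

```python
import os
import shlex

IMAGE = "homagni/vision_language"  # image name used in steps 3 and 5


def build_run_command(source_dir, display=None):
    """Return the step-5 docker command as an argv list for a given checkout."""
    # Bind mounts need an absolute host path.
    source = os.path.abspath(os.path.expanduser(source_dir))
    if display is None:
        # Mirrors "-e DISPLAY=$DISPLAY" from the shell version.
        display = os.environ.get("DISPLAY", "")
    return [
        "sudo", "nvidia-docker", "run", "--rm", "-ti",
        "--mount", f"type=bind,source={source},target=/ai2thor",
        "--net=host", "--ipc=host",
        "-e", f"DISPLAY={display}",
        "-v", "/tmp/.X11-unix:/tmp/.X11-unix",
        "--env=QT_X11_NO_MITSHM=1",
        IMAGE,
    ]


# Print a copy-pasteable command line for a hypothetical checkout location.
print(shlex.join(build_run_command("~/MOVILAN")))
```

The printed line can be pasted into the terminal after `xhost +`.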

EXTRA NOTES:

In mapper/params.py, set debug_viz to True or False depending on whether you want to see the robot's internal map state.

The code produces a lot of log output describing the various stages of decision making. To save it to a log file, try:

python3 main_batchrun.py > SomeFile.txt
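The shell redirect above saves standard output only. If error messages should land in the same file, one option is a small Python wrapper like the sketch below. The helper name `run_with_log` is hypothetical and not part of MOVILAN; it relies only on the standard `subprocess` module.

```python
import subprocess


def run_with_log(argv, log_path):
    """Run a command, writing both stdout and stderr to log_path.

    Returns the exit code of the command.
    """
    with open(log_path, "w") as log:
        result = subprocess.run(argv, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode


# Example call, to be run from /ai2thor inside the container:
# run_with_log(["python3", "main_batchrun.py"], "SomeFile.txt")
```

A non-zero return value means the run ended with an error, and the traceback should then be at the end of the log file.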

VIEWING EXPERT TRAJECTORIES

cd /ai2thor/robot/

(replace the 1s below with the room number and task number you want)

python3 master_execution.py --room 1 --task 1 --gendata
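To view several rooms and tasks in sequence, the command lines can be generated instead of typed one at a time. This is a sketch only: the helper name `expert_trajectory_commands` is hypothetical, the flags are the ones shown above, and which room and task numbers are valid depends on the dataset.

```python
from itertools import product


def expert_trajectory_commands(rooms, tasks):
    """Build one master_execution.py argv list per (room, task) pair."""
    return [
        ["python3", "master_execution.py",
         "--room", str(room), "--task", str(task), "--gendata"]
        for room, task in product(rooms, tasks)
    ]


# Print the commands for two example rooms and one task.
for argv in expert_trajectory_commands([1, 2], [1]):
    print(" ".join(argv))
```

Each printed line can be run from /ai2thor/robot/ inside the container.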

DEBUGGING PIPELINE

For new rooms, objects may not be identifiable in the map. Go to /ai2thor/mapper/datagen.py and follow the instructions in the comments at the end of the file to generate and correct maps.

Go to /ai2thor/log_instructions.py to generate a list of the existing instructions in the dataset.

(ERRORS ?)

If the display does not open from the docker instance (for example, if you are using a Linux Azure VM and running docker inside it), see mviereck/x11docker#186 and (probably the last instruction of) https://github.com/stas-pavlov/azure-glx-rendering

https://unix.stackexchange.com/questions/403424/x11-forwarding-from-a-docker-container-in-remote-server

Using the --privileged flag, as described here (https://answers.ros.org/question/301056/ros2-rviz-in-docker-container/), is able to make gazebo work with a display from docker in the Azure cloud.
