Everything-LLMs-And-Robotics

The world's largest GitHub Repository for the intersection of LLMs (multimodal included!) + Robotics

Heavily Inspired by Awesome-LLM-Robotics

License: BSD 3-Clause "New" or "Revised" License

Logistics

If you want to make a change to this repository, click here

Why I made this: Go here.

What Does This Repository Have?

LLMs Educational Resources

  • START HERE: "Transformers from Scratch", Brandon Rohrer, [Website]

  • Stanford Transformers Class: "CS25: Transformers United", Stanford, 2022, [Website]

  • Andrej Karpathy GPT Tutorial: "Let's build GPT: from scratch, in code, spelled out.", Andrej Karpathy, 2023, [YouTube Video]
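For orientation, here is a minimal sketch of the scaled dot-product self-attention that these tutorials build up to (plain NumPy, single head, no masking or training; purely illustrative, not any tutorial's actual code):

import numpy as np

def self_attention(x, w_q, w_k, w_v):
    # Project token embeddings into queries, keys, and values.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Scaled similarity between every pair of tokens.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Row-wise softmax turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output token is a weighted mix of all value vectors.
    return weights @ v

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                    # 4 tokens, 8-dim embeddings
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)  # (4, 8)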

Robotics Educational Resources

  • AI-Enabled Robotics Class: "CS199: Stanford Robotics Independent Study", Stanford, 2023, [Website]

LLMs + Robotics Educational Resources

  • Google's 2022 Research: "Google Research, 2022 & beyond: Robotics", Google, 2023, [Website]

  • Controlling Robots Via Large Language Models: "Controlling Robots Via Large Language Models", Sanjiban Choudhury, CS 4756/5756, Cornell, 2023 [Slides]

Reasoning

  • LLM Designs Robots: "Can Large Language Models Design a Robot?", arXiv, Mar 2023. [Paper]

  • PaLM-E: "PaLM-E: An Embodied Multimodal Language Model", arXiv, Mar 2023. [Paper] [Website] [Demo]

  • RT-1: "RT-1: Robotics Transformer for Real-World Control at Scale", arXiv, Dec 2022. [Paper] [Code] [Website]

  • ProgPrompt: "Generating Situated Robot Task Plans using Large Language Models", arXiv, Sep 2022. [Paper] [Code (doesn't really exist here)] [Website]

  • Code-As-Policies: "Code as Policies: Language Model Programs for Embodied Control", arXiv, Sep 2022. [Paper] [Code] [Website] (see the sketch after this list)

  • SayCan: "Do As I Can, Not As I Say: Grounding Language in Robotic Affordances", arXiv, Apr 2022. [Paper] [Code] [Website]

  • Socratic: "Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language", arXiv, Apr 2022. [Paper] [Code] [Website]

  • PIGLeT: "PIGLeT: Language Grounding Through Neuro-Symbolic Interaction in a 3D World", ACL, Jun 2021. [Paper] [Code] [Website]
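
The Code-as-Policies entry above illustrates a pattern that recurs throughout this list: prompt an LLM with the robot's control primitives plus a worked example, let it emit Python policy code, then execute that code. Below is a rough sketch of the idea; the llm() callable and the detect/move_to/pick/place primitives are hypothetical stand-ins, not the API from any listed paper:

# The robot primitives (detect, move_to, pick, place) and llm() here are
# hypothetical stand-ins used only to show the shape of the approach.

PROMPT_TEMPLATE = """# Available robot functions:
#   detect(name) -> (x, y) position of a named object
#   move_to(x, y); pick(); place()
# instruction: put the apple next to the bowl
x, y = detect("apple"); move_to(x, y); pick()
x, y = detect("bowl"); move_to(x + 0.1, y); place()
# instruction: {instruction}
"""

def generate_policy(instruction, llm):
    # The LLM completes the prompt with Python that calls the primitives.
    return llm(PROMPT_TEMPLATE.format(instruction=instruction))

def run_policy(code, primitives):
    # Execute generated code against a restricted namespace. Real systems
    # validate and sandbox model-written code before running it on hardware.
    exec(code, {"__builtins__": {}}, dict(primitives))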

Planning

  • LLM-GROP: "Task and Motion Planning with Large Language Models for Object Rearrangement", arXiv, Mar 2023 [Paper]

  • Bio Lab Task Planning: "LLMs can generate robotic scripts from goal-oriented instructions in biological laboratory automation", arXiv, Apr 2023 [Paper]

  • PromptCraft Robotics: "ChatGPT for Robotics: Design Principles and Model Abilities", Microsoft, 2023. [Paper] [Website] [Code]

  • CLARIFY: "Errors are Useful Prompts: Instruction Guided Task Programming with Verifier-Assisted Iterative Prompting", arXiv, Mar 2023. [Paper] [Code] [Website]

  • LM-Nav: "Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action", arXiv, Jul 2022. [Paper] [Pytorch Code] [Website]

  • InnerMonologue: "Inner Monologue: Embodied Reasoning through Planning with Language Models", arXiv, Jul 2022. [Paper] [Website]

  • Housekeep: "Housekeep: Tidying Virtual Households using Commonsense Reasoning", arXiv, May 2022. [Paper] [Pytorch Code] [Website]

  • LID: "Pre-Trained Language Models for Interactive Decision-Making", arXiv, Feb 2022. [Paper] [Pytorch Code] [Website]

  • ZSP: "Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents", ICML, Jan 2022. [Paper] [Pytorch Code] [Website]

Manipulation

  • MOO: "Open-World Object Manipulation using Pre-trained Vision-Language Models", arXiv, Mar 2023. [Paper] [Website]

  • TidyBot: "TidyBot: Personalized Robot Assistance with Large Language Models", arXiv, May 2023. [Paper] [Website]

  • DIAL: "Robotic Skill Acquisition via Instruction Augmentation with Vision-Language Models", arXiv, Nov 2022. [Paper] [Website]

  • CLIP-Fields: "CLIP-Fields: Weakly Supervised Semantic Fields for Robotic Memory", arXiv, Oct 2022. [Paper] [PyTorch Code] [Website]

  • VIMA: "VIMA: General Robot Manipulation with Multimodal Prompts", arXiv, Oct 2022. [Paper] [Pytorch Code] [Website]

  • Perceiver-Actor: "A Multi-Task Transformer for Robotic Manipulation", CoRL, Sep 2022. [Paper] [Pytorch Code] [Website]

  • LaTTe: "LaTTe: Language Trajectory TransformEr", arXiv, Aug 2022. [Paper] [TensorFlow Code] [Website]

  • Robots Enact Malignant Stereotypes: "Robots Enact Malignant Stereotypes", FAccT, Jun 2022. [Paper] [Website] [Washington Post] [Wired] (code access on request)

  • ATLA: "Leveraging Language for Accelerated Learning of Tool Manipulation", CoRL, Jun 2022. [Paper]

  • ZeST: "Can Foundation Models Perform Zero-Shot Task Specification For Robot Manipulation?", L4DC, Apr 2022. [Paper]

  • LSE-NGU: "Semantic Exploration from Language Abstractions and Pretrained Representations", arXiv, Apr 2022. [Paper]

  • Embodied-CLIP: "Simple but Effective: CLIP Embeddings for Embodied AI", CVPR, Nov 2021. [Paper] [Pytorch Code]

  • CLIPort: "CLIPort: What and Where Pathways for Robotic Manipulation", CoRL, Sep 2021. [Paper] [Pytorch Code] [Website]

Instructions and Navigation

  • Text2Motion: "Text2Motion: From Natural Language Instructions to Feasible Plans", arXiv, Mar 2023 [Paper]

  • ChatGPT Robot Collaboration: "Improved Trust in Human-Robot Collaboration with ChatGPT", arXiv, Apr 2023. [Paper]

  • ADAPT: "ADAPT: Vision-Language Navigation with Modality-Aligned Action Prompts", CVPR, May 2022. [Paper]

  • Pre-Trained Vision Models for Control: "The Unsurprising Effectiveness of Pre-Trained Vision Models for Control", ICML, Mar 2022. [Paper] [Pytorch Code] [Website]

  • CoW: "CLIP on Wheels: Zero-Shot Object Navigation as Object Localization and Exploration", arXiv, Mar 2022. [Paper]

  • Recurrent VLN-BERT: "A Recurrent Vision-and-Language BERT for Navigation", CVPR, Jun 2021 [Paper] [Pytorch Code]

  • VLN-BERT: "Improving Vision-and-Language Navigation with Image-Text Pairs from the Web", ECCV, Apr 2020 [Paper] [Pytorch Code]

  • Interactive Language: "Interactive Language: Talking to Robots in Real Time", arXiv, Oct 2022 [Paper] [Website]

Simulation Frameworks

  • MineDojo: "MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge", arXiv, Jun 2022. [Paper] [Code] [Website] [Open Database]

  • Habitat 2.0: "Habitat 2.0: Training Home Assistants to Rearrange their Habitat", NeurIPS, Dec 2021. [Paper] [Code] [Website]

  • BEHAVIOR: "BEHAVIOR: Benchmark for Everyday Household Activities in Virtual, Interactive, and Ecological Environments", CoRL, Nov 2021. [Paper] [Code] [Website]

  • iGibson 1.0: "iGibson 1.0: a Simulation Environment for Interactive Tasks in Large Realistic Scenes", IROS, Sep 2021. [Paper] [Code] [Website]

  • ALFRED: "ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks", CVPR, Jun 2020. [Paper] [Code] [Website]

  • BabyAI: "BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning", ICLR, May 2019. [Paper] [Code]

Perception

  • LGX: "Can an Embodied Agent Find Your "Cat-shaped Mug"? LLM-Based Zero-Shot Object Navigation", arXiv, Mar 2023. [Paper]

  • Robots Acquire Skills With VLMs: "Robotic Skill Acquisition via Instruction Augmentation with Vision-Language Models", arXiv, Nov 2022. [Paper]

  • From Occlusion to Insight: "From Occlusion to Insight: Object Search in Semantic Shelves using Large Language Models", arXiv, Feb 2023. [Paper]

Project Demos

  • RobotGPT Pt. 2: "Twitter Video of Voice-Input LLM-Powered Robot Arm", Orangewood Labs, 2023, [Video]

  • SPOT GPT: "Boston Dynamics Integration of ChatGPT into SPOT Robot", Boston Dynamics, 2023, [Video]

  • RobotGPT: "Orangewood Labs RoboGPT Demo", Orangewood Labs, 2023, [Video]

  • Mona: "Vitruvian Works Robot Demonstration", Vitruvian Works, 2023, [Video]

  • Ameca: "Ameca Expressions with GPT-3 / 4", Engineered Arts, 2023, [Video]

  • Sarcastic Robot: "Sarcastic Robot powered by GPT-4", Gabrael Levine (Hackathon Project), 2023, [Video]

  • DroneFormer: "DroneFormer: Controlling UAVs with natural language!", Brian Wu (Hackathon Project), Stanford University, 2023 [Video]

Thoughtful Twitter Threads

  • Bitter Lesson 2.0: @hausman_k, 2023 [Thread]

Citation

If you find this repository useful, please consider citing this list:

@misc{rintamaki2023everythingllmsandroboticsrepo,
    title={Everything-LLMs-And-Robotics},
    author={Jacob Rintamaki},
    howpublished={GitHub repository},
    url={https://github.com/jrin771/Everything-LLMs-And-Robotics},
    year={2023},
}
