opengl_cse167x's Introduction

Fueled by a passion for math and computer science, I've tackled diverse projects in machine learning, computer vision, algorithms, and programming. Here are some highlights from my work in computer vision, followed by lower-level programming projects like compilers:

Highlights from My Projects
Vision Question Answering (VQA)
VQA is a computer vision task that involves answering natural-language questions about an image. My work focused on the Bottom-Up and Top-Down Attention mechanism, optimizing object-level attention. The improved implementation achieved 63.61% accuracy, surpassing the original.
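To illustrate the top-down attention step, here is a minimal sketch rather than the project's code. It assumes each detected object region already has a feature vector and that the question is encoded as a single vector. A plain dot product stands in for the learned scoring function, and the names `attend`, `regions`, and `question` are hypothetical.

```python
import math


def attend(region_features, question_vector):
    """Return softmax attention weights and the attended (weighted-average) feature."""
    # Score each region by similarity to the question (assumption: plain dot product).
    scores = [sum(f * q for f, q in zip(feat, question_vector))
              for feat in region_features]
    # Numerically stable softmax turns scores into weights that sum to 1.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The attended feature is the weighted average of the region features.
    dim = len(region_features[0])
    attended = [sum(w * feat[i] for w, feat in zip(weights, region_features))
                for i in range(dim)]
    return weights, attended


regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # hypothetical object-level features
question = [2.0, 0.0]                            # hypothetical question embedding
weights, attended = attend(regions, question)
```

The region most aligned with the question receives the largest weight, so the attended feature is dominated by the objects the question is about.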
Vision Language Navigation (VLN)
VLN is a task in which agents learn to navigate by following natural language instructions. In contrast to VQA, a VLN model encodes text and outputs actions while observing new information at each step. I explored robot navigation in multi-floor buildings.
Adaptive Zoom Mechanism for Vision Language Navigation
I integrated an Adaptive Zoom mechanism into VLN, enabling agents to locate large landmarks with wide-FOV vision and identify smaller or distant objects with magnified vision.
Trajectory Encoding for Vision Language Navigation
I designed a model that leverages pre-exploration information in 3D buildings, achieving a 45.8% success rate. (The state of the art was 46.5% at the time.)
Simultaneous Localization And Mapping (SLAM)
SLAM enables robotic mapping and navigation by constructing an environment map and tracking the robot's location simultaneously. I utilized a particle filter model to create a texture map from a differential-drive robot's two-minute activity in a building.
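The predict, update, and resample cycle of a particle filter can be sketched as follows. This is a deliberately simplified illustration, not the project's code: it tracks only a 1D position instead of a differential-drive pose, builds no map, and feeds in noise-free measurements. The function name `particle_filter_step` and the noise parameters are assumptions chosen for the example.

```python
import math
import random


def particle_filter_step(particles, control, measurement, rng,
                         motion_noise=0.1, sensor_noise=0.5):
    """Run one predict / update / resample cycle for a 1D position estimate."""
    # Predict: move every particle by the control input plus random motion noise.
    moved = [p + control + rng.gauss(0.0, motion_noise) for p in particles]
    # Update: weight each particle by the Gaussian likelihood of the measurement.
    weights = [math.exp(-0.5 * ((measurement - p) / sensor_noise) ** 2)
               for p in moved]
    if sum(weights) == 0.0:
        # Every particle is far from the measurement; fall back to uniform weights.
        weights = [1.0] * len(moved)
    # Resample: draw particles with probability proportional to their weights.
    return rng.choices(moved, weights=weights, k=len(moved))


rng = random.Random(0)
particles = [rng.uniform(0.0, 10.0) for _ in range(500)]  # unknown start position
true_position = 2.0
for _ in range(20):
    true_position += 0.5  # the simulated robot advances 0.5 units per step
    particles = particle_filter_step(particles, 0.5, true_position, rng)
estimate = sum(particles) / len(particles)
```

In a full SLAM setting each particle would carry a 2D pose, and the measurement likelihood would come from comparing sensor readings against the map built so far.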
Corner Detection and Sparse Stereo Matching with Epipolar Geometry
Employing epipolar geometry, the geometry of stereo vision, I detected and identified corresponding features in images taken from two distinct camera positions.
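The epipolar constraint says that a point x1 in the first image and its match x2 in the second satisfy x2ᵀ F x1 = 0, so the match must lie on the line F x1. A minimal sketch of using that constraint to pick a match is shown below. The fundamental matrix `F` and the candidate corners are made-up values for a rectified pair, and the helper names are hypothetical, not taken from the project.

```python
def epipolar_line(F, point):
    """Return the line (a, b, c) in image 2 on which the match of `point` must lie."""
    x, y = point
    homogeneous = (x, y, 1.0)
    return tuple(sum(F[r][c] * homogeneous[c] for c in range(3)) for r in range(3))


def epipolar_distance(F, point1, point2):
    """Return the distance in pixels from point2 to the epipolar line of point1."""
    a, b, c = epipolar_line(F, point1)
    x, y = point2
    return abs(a * x + b * y + c) / (a * a + b * b) ** 0.5


# Assumed fundamental matrix for a rectified pair (pure horizontal translation),
# for which corresponding points share the same image row.
F = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]

corner = (120.0, 80.0)                                    # corner in image 1
candidates = [(95.0, 80.0), (97.0, 86.0), (140.0, 75.0)]  # corners in image 2
best = min(candidates, key=lambda p: epipolar_distance(F, corner, p))
```

In practice the epipolar distance only narrows the search to a line; a patch similarity score is usually still needed to choose among candidates that lie near the same line.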
Photometric Stereo for Surface Reconstruction and Phong Illumination for Surface Rendering
I used photometric stereo, a computer vision technique, to estimate surface normals under varying lighting conditions. The reconstructed surface was then rendered using the Phong reflection model, a computer graphics method for calculating local illumination.
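For a single pixel, photometric stereo reduces to a small linear system: under a Lambertian model, each observed intensity equals the light direction dotted with the albedo-scaled normal. The sketch below solves that system for exactly three lights. It is an illustration under stated assumptions (no shadows, no specular highlights, known unit light directions), not the project's code, and the light directions and intensities are made-up values.

```python
def det3(m):
    """Return the determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, b):
    """Solve the 3x3 linear system m @ x = b with Cramer's rule."""
    d = det3(m)
    solution = []
    for col in range(3):
        # Replace one column of m with b and take the ratio of determinants.
        replaced = [[b[r] if c == col else m[r][c] for c in range(3)]
                    for r in range(3)]
        solution.append(det3(replaced) / d)
    return solution


def normal_and_albedo(light_dirs, intensities):
    """Recover the unit normal and albedo of one Lambertian pixel from three lights."""
    g = solve3(light_dirs, intensities)  # g = albedo * normal
    albedo = sum(v * v for v in g) ** 0.5
    return [v / albedo for v in g], albedo


# Assumed unit light directions (one per row) and the intensities they produce
# for a pixel facing the camera with albedo 0.5.
lights = [[0.0, 0.0, 1.0], [0.6, 0.0, 0.8], [0.0, 0.6, 0.8]]
observed = [0.5, 0.4, 0.4]
normal, albedo = normal_and_albedo(lights, observed)
```

With more than three lights the same system is typically solved in the least-squares sense, which makes the estimate more robust to noise and shadowed measurements.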
