ai-narrator's Introduction

ChatGPT Vision and ElevenLabs

Someone at work shared a Tweet about the original project, and I decided to make a clone in Node.js.

demo.mov

Requirements

How it works

Unlike the original project, this one uses Socket.io to stream the audio to the client. The client captures a frame from the webcam, converts the image to base64, and sends it to the server. The server forwards the image to the ChatGPT Vision API, which returns a description of the image based on the prompt. The server then sends that description to the ElevenLabs API to get an audio file of the description and, finally, sends the audio to the client over Socket.io.
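The flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `describeImage`, `synthesizeSpeech`, `attachNarrator`, and the `frame`, `audio`, and `narration-error` event names are all hypothetical. The two API calls are passed in as functions so the pipeline can be run without network access or API keys.

```javascript
// Sketch of the server-side pipeline: base64 frame -> description -> audio -> client.
// describeImage, synthesizeSpeech, and the event names are hypothetical stand-ins.

// Strip the "data:image/...;base64," prefix that canvas.toDataURL() produces.
function stripDataUrlPrefix(dataUrl) {
  const comma = dataUrl.indexOf(",");
  return dataUrl.startsWith("data:") && comma !== -1
    ? dataUrl.slice(comma + 1)
    : dataUrl;
}

// Run one frame through the pipeline. The API calls are injected so the flow
// can be exercised with stubs.
async function narrateFrame(dataUrl, { describeImage, synthesizeSpeech, emit }) {
  const imageBase64 = stripDataUrlPrefix(dataUrl);
  const description = await describeImage(imageBase64); // vision API call (assumed wrapper)
  const audio = await synthesizeSpeech(description);    // text-to-speech call (assumed wrapper)
  emit("audio", { description, audio });
  return description;
}

// Wiring into a Socket.io server instance (assumed shape of the real server).
function attachNarrator(io, services) {
  io.on("connection", (socket) => {
    socket.on("frame", (dataUrl) => {
      narrateFrame(dataUrl, {
        ...services,
        emit: (event, payload) => socket.emit(event, payload),
      }).catch((err) => socket.emit("narration-error", err.message));
    });
  });
}
```

With real wrappers around the two APIs, something like `attachNarrator(io, { describeImage, synthesizeSpeech })` would connect the pipeline to each client that joins. Keeping the API calls injectable also makes it easy to swap either service later.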

How to use

  1. Clone the repo.
  2. Run npm install.
  3. Create a .env file using the .env.example file as a template.
  4. Run npm run start:client to start the client.
  5. Run npm run start:server to start the server.
  6. Go to http://localhost:8080 and enjoy!
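
A missing key in the .env file from step 3 usually only shows up later as a failed API call. As a small sketch, the server could check for the keys at startup. The variable names below are assumptions for illustration; use whatever names .env.example actually lists.

```javascript
// Hypothetical variable names; replace with the keys listed in .env.example.
const REQUIRED_KEYS = ["OPENAI_API_KEY", "ELEVENLABS_API_KEY"];

// Return the names of required variables that are unset or blank.
function missingEnv(keys, env = process.env) {
  return keys.filter((key) => !env[key] || env[key].trim() === "");
}

// Fail fast with a readable message instead of a failed API call later.
function assertEnv(keys, env = process.env) {
  const missing = missingEnv(keys, env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}
```

Calling `assertEnv(REQUIRED_KEYS)` once at server startup, after the .env file has been loaded, would stop the server with a clear message if a key is absent.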

ai-narrator's People

Contributors

leo1994
