emcf / thepipe
Feed PDFs, URLs, Slides, YouTube, GitHub, and more into Vision-Language models with one line of code ⚡
Home Page: https://thepi.pe
License: MIT License
Accepting feature requests in this thread. Please feel free to suggest!
The roadmap so far includes:
Looking to support extraction of mp4, mov, webm, and avi files, as well as YouTube, for a Vision-Language model (not a video model).
Video files vary widely in duration, so given the context window limitations of LLMs in 2024, this implementation would likely extract a fixed number of frames from each video rather than a variable number of frames at a constant frame rate.
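To make the fixed-count idea concrete, here is a minimal sketch of how frame positions could be chosen. It is not thepipe's implementation: the function name `sample_frame_indices` and its parameters are hypothetical, and it only computes which frame indices to read, leaving the actual decoding to whatever video library is used.

```python
from __future__ import annotations


def sample_frame_indices(total_frames: int, num_samples: int) -> list[int]:
    """Pick up to num_samples frame indices spread evenly across a video.

    Hypothetical helper for illustration only; not part of thepipe.
    """
    if total_frames <= 0 or num_samples <= 0:
        return []
    # Never ask for more frames than the video contains.
    num_samples = min(num_samples, total_frames)
    # Split the video into equal-width segments and take each segment's midpoint.
    step = total_frames / num_samples
    return [int(step * i + step / 2) for i in range(num_samples)]


if __name__ == "__main__":
    # A 30-second clip and a one-hour video (both at 30 fps) yield the same
    # number of frames, so the prompt size stays bounded regardless of duration.
    print(len(sample_frame_indices(30 * 30, 8)))
    print(len(sample_frame_indices(60 * 60 * 30, 8)))
```

The trade-off is that long videos are sampled much more sparsely than short ones, which is the cost of keeping the image count constant.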
Audio is not standard in commercial multimodal models today. Because of this, I am also looking to provide the option to transcribe audio from the video.
Apologies there is no docker image yet! 😅
I am on the case.
Looking to support mp3 and wav files.
Audio is not standard in commercial multimodal models in 2024. Because of this, I am also looking to transcribe audio to text, probably via Whisper.
I have added the env var THEPIPE_API_KEY to my .env file, to my .bashrc, and at the command line. It is not being accepted.
{response['error']}")
ValueError: Valid environment variable THEPIPE_API_KEY not found. You may need to restart if you have set your API key. Visit https://thepi.pe/docs to learn more.
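One way to narrow this down is to check whether the variable is actually visible to the Python process that runs the library. The sketch below is a generic diagnostic, not part of thepipe; the helper name `check_env_var` is hypothetical. Two general causes are worth ruling out: a variable assigned in .bashrc without `export` is not inherited by child processes, and a .env file is only read if something explicitly loads it (for example, a dotenv loader).

```python
import os


def check_env_var(name, environ=None):
    """Report whether an environment variable is visible to this process.

    Hypothetical diagnostic helper; it never prints the value itself.
    """
    env = os.environ if environ is None else environ
    value = env.get(name)
    if value is None:
        return f"{name} is not set in this process"
    if not value.strip():
        return f"{name} is set but empty"
    return f"{name} is set ({len(value)} characters)"


if __name__ == "__main__":
    # Run this with the same interpreter and shell session that raised the error.
    print(check_env_var("THEPIPE_API_KEY"))
```

If this reports that the variable is not set, the problem lies in how the shell or .env file exposes it rather than in the key itself.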