leptonai / leptonai

A Pythonic framework to simplify AI service building

Home Page: https://lepton.ai/

License: Apache License 2.0

Languages: Python 95.10%, Shell 4.22%, Dockerfile 0.68%
Topics: artificial-intelligence, cloud, deep-learning, gpu, machine-learning, python

leptonai's Introduction

Lepton AI

A Pythonic framework to simplify AI service building

Homepage | API Playground | Examples | Documentation | CLI References | Twitter | Blog

The LeptonAI Python library allows you to build an AI service from Python code with ease. Key features include:

  • A Pythonic abstraction Photon, allowing you to convert research and modeling code into a service with a few lines of code.
  • Simple abstractions to launch models like those on HuggingFace in a few lines of code.
  • Prebuilt examples for common models such as Llama, SDXL, Whisper, and others.
  • AI-tailored batteries included, such as autobatching, background jobs, etc.
  • A client to automatically call your service like native Python functions.
  • Pythonic configuration specs to be readily shipped in a cloud environment.

Getting started with a one-liner

Install the library with:

pip install -U leptonai

This installs the leptonai Python library, as well as the command-line interface lep. You can then launch a HuggingFace model, say gpt2, in one line of code:

lep photon run --name gpt2 --model hf:gpt2 --local

If you have access to the Llama2 model (apply for access here) and you have a reasonably sized GPU, you can launch it with:

# hint: you can also write `-n` and `-m` for short
lep photon run -n llama2 -m hf:meta-llama/Llama-2-7b-chat-hf --local

(Be sure to use the -hf version for Llama2, which is compatible with huggingface pipelines.)

You can then access the service with:

from leptonai.client import Client, local
c = Client(local(port=8080))
# Use the following to print the doc
print(c.run.__doc__)
print(c.run(inputs="I enjoy walking with my cute dog"))

Fully managed Llama2 models and CodeLlama models can be found in the playground.

Many standard HuggingFace pipelines are supported - find out more details in the documentation. Not all HuggingFace models are supported though, as many of them contain custom code and are not standard pipelines. If you find a popular model you would like to support, please open an issue or a PR.

Checking out more examples

You can find more examples in the examples repository. For example, launch the Stable Diffusion XL model with:

git clone git@github.com:leptonai/examples.git
cd examples
lep photon run -n sdxl -m advanced/sdxl/sdxl.py --local

Once the service is running, you can access it with:

from leptonai.client import Client, local

c = Client(local(port=8080))

img_content = c.run(prompt="a cat launching rocket", seed=1234)
with open("cat.png", "wb") as fid:
    fid.write(img_content)

or access the mounted Gradio UI at http://localhost:8080/ui. Check the README file for more details.

A fully managed SDXL is hosted at https://dashboard.lepton.ai/playground/sdxl with API access.

Writing your own photons

Writing your own photon is simple: write a Python Photon class and decorate functions with @Photon.handler. As long as your input and output are JSON serializable, you are good to go. For example, the following code launches a simple echo service:

# my_photon.py
from leptonai.photon import Photon

class Echo(Photon):
    @Photon.handler
    def echo(self, inputs: str) -> str:
        """
        A simple example to return the original input.
        """
        return inputs

You can then launch the service with:

lep photon run -n echo -m my_photon.py --local

Then, you can use your service as follows:

from leptonai.client import Client, local

c = Client(local(port=8080))

# will print available paths
print(c.paths())
# will print the doc for c.echo. You can also use `c.echo?` in Jupyter.
print(c.echo.__doc__)
# will actually call echo.
c.echo(inputs="hello world")
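Handlers are not limited to plain strings: as noted above, any JSON-serializable input and output works, so richer payloads such as lists and dicts can be exchanged directly. A minimal sketch under that assumption (the class and handler names here are hypothetical, following the same pattern as the echo example):

# stats_photon.py
from typing import Dict, List

from leptonai.photon import Photon

class Stats(Photon):
    @Photon.handler
    def summarize(self, numbers: List[float]) -> Dict[str, float]:
        """
        Return simple statistics for a list of numbers.
        """
        return {
            "count": float(len(numbers)),
            "total": float(sum(numbers)),
            "mean": sum(numbers) / len(numbers) if numbers else 0.0,
        }

It can be launched and called the same way as the echo photon, e.g. lep photon run -n stats -m stats_photon.py --local followed by c.summarize(numbers=[1, 2, 3]).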

For more details, check out the documentation and the examples.

Contributing

Contributions and collaborations are welcome and highly appreciated. Please check out the contributor guide for how to get involved.

License

The Lepton AI Python library is released under the Apache 2.0 license.

Developer Note: early development of LeptonAI was in a separate mono-repo, which is why you may see commits from the leptonai/lepton repo. We intend to use this open source repo as the source of truth going forward.

leptonai's People

Contributors

0xgyuho, atomicvar, awliu2, axissun1, bddppq, bobmayuze, ccding, eltociear, ganler, gyuho, hsuanxyz, johnmasoner, sjy, suyuee, vra, vthinkxie, wong2, xiang90, xudong963, yangqing


leptonai's Issues

[BUG] pip install -U leptonai error

Describe the bug
pip install -U leptonai fails with an error requiring the Rust package manager to be installed, which is not user-friendly.

Detail:
Collecting hf-transfer (from leptonai)
Using cached https://mirrors.aliyun.com/pypi/packages/a1/a9/9ba1c6d974555246fbdfc7268005152bce6d9283333560e7916bd3642a7a/hf_transfer-0.1.5.tar.gz (21 kB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... error
error: subprocess-exited-with-error

× Preparing metadata (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [6 lines of output]

  Cargo, the Rust package manager, is not installed or is not on PATH.
  This package requires Rust and Cargo to compile extensions. Install it through
  the system's package manager or via https://rustup.rs/
  
  Checking for Rust toolchain....
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

Environment (please complete the following information):

  • OS: macOS
  • Python version: 3.9.6
  • pip version: 24.0

[BUG] Public endpoint gives "upstream connect error"

Describe the bug
Trying to run the test curl command at this page:

https://www.lepton.ai/playground/openvoice

And I get this error:

upstream connect error or disconnect/reset before headers. reset reason: connection failure

If I remove the env variable, it gives an "Unauthorized" message, so I don't think it is an authentication issue.

To Reproduce

export LEPTON_API_TOKEN="xxxxxxx"

curl 'https://openvoice.lepton.run/run' \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $LEPTON_API_TOKEN" \
  -d '{
     "reference_speaker": "https://www.lepton.ai/playground/openvoice/inputs/speaker_1.mp3",
     "text": "Hi, can you hear me?",
     "emotion": "friendly"
   }'\
  -o output.mp3

cat output.mp3

Expected behavior
The curl command generates a correct mp3 file with a recording of the text.

Screenshots

$ curl 'https://openvoice.lepton.run/run' \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $LEPTON_API_TOKEN" \
  -d '{
     "reference_speaker": "https://www.lepton.ai/playground/openvoice/inputs/speaker_1.mp3",
     "text": "Hi, can you hear me?",
     "emotion": "friendly"
   }'\
  -o output.mp3
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   254  100    91  100   163     16     29  0:00:05  0:00:05 --:--:--    22
$ cat output.mp3 
upstream connect error or disconnect/reset before headers. reset reason: connection failure

Environment (please complete the following information):

  • OS: Linux Mint Debian Edition 6

Latest code on the main branch doesn't pass the black formatter

Describe the bug
The black code formatter fails (produces changes) when run on the codebase.

To Reproduce

git clone https://github.com/leptonai/leptonai-sdk
cd leptonai-sdk
pip install -e . 
pip install -e .[test]
pip install -e .[lint]
black .
git diff

Expected behavior
No files changed.


Environment (please complete the following information):

  • OS: macOS
  • Python version: 3.8.17
  • LeptonAI sdk version: lep, version 0.9.8.post1.dev1+g500f37c
  • black version: 23.3.0 (compiled: yes)


Typo in https://www.lepton.ai/docs/build_you_own_model/anatomy_of_a_photon#adding-extra-files

Hello team,

I hope you're doing well. I've found a small spelling mistake and thought it would be helpful to bring it to your attention for correction.
Details:
Location: https://www.lepton.ai/docs/build_you_own_model/anatomy_of_a_photon#adding-extra-files
Incorrect Spelling: pricate
Correct Spelling: private

Thank you for your hard work on this project. I appreciate the opportunity to contribute, even in a small way.
Best regards,
lh

[External cause] OMP: Error #15 multiple copies of the OpenMP runtime

Describe the bug
multiple copies of the OpenMP runtime

To Reproduce
Follow https://www.lepton.ai/docs/overview/quickstart
Run up to lep photon run --name mygpt2 --local

Expected behavior
GPT2 runs locally

Screenshots

% lep photon create --name mygpt2 --model hf:gpt2

Photon mygpt2 created.
(base) weiwen@Weis-MacBook-Air wenwei202.github.io % lep photon run --name mygpt2 --local

Launching photon on port: 8080
2023-09-16 08:40:25.612 | INFO     | leptonai.photon.hf.hf:pipeline:167 - Creating pipeline for text-generation(model=gpt2, revision=11c5a3d5).
HuggingFace download might take a while, please be patient...
OMP: Error #15: Initializing libiomp5.dylib, but found libomp.dylib already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
zsh: abort      lep photon run --name mygpt2 --local
(base) weiwen@Weis-MacBook-Air wenwei202.github.io % curl -X 'POST' \
  'http://0.0.0.0:8080/run' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "inputs": "Once upon a time"
}'
curl: (7) Failed to connect to 0.0.0.0 port 8080 after 0 ms: Connection refused
(base) weiwen@Weis-MacBook-Air wenwei202.github.io % KMP_DUPLICATE_LIB_OK=TRUE lep photon run --name mygpt2 --local

Launching photon on port: 8080
2023-09-16 08:46:43.152 | INFO     | leptonai.photon.hf.hf:pipeline:167 - Creating pipeline for text-generation(model=gpt2, revision=11c5a3d5).
HuggingFace download might take a while, please be patient...
Downloading (…)11c5a3d5/config.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 665/665 [00:00<00:00, 53.6kB/s]
Downloading (…)/11c5a3d5/vocab.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.04M/1.04M [00:00<00:00, 7.18MB/s]
Downloading (…)/11c5a3d5/merges.txt: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 456k/456k [00:00<00:00, 5.59MB/s]
Downloading (…)5a3d5/tokenizer.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.36M/1.36M [00:00<00:00, 4.20MB/s]
Downloading model.safetensors: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 548M/548M [00:08<00:00, 62.7MB/s]
zsh: segmentation fault  KMP_DUPLICATE_LIB_OK=TRUE lep photon run --name mygpt2 --local
(base) weiwen@Weis-MacBook-Air wenwei202.github.io % /Users/weiwen/opt/anaconda3/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

Environment (please complete the following information):

  • OS: macOS (MacBook Air)
  • Python version: 3.9.13 (/Users/weiwen/opt/anaconda3/bin/python)
  • LeptonAI sdk version: lep, version 0.9.4


Get error "There is no item named 'ui\\404.html' in the archive" when deploying

I use the official code from the search_with_lepton repo, and I can run it locally on Windows 11 following these steps:

export BING_SEARCH_V7_SUBSCRIPTION_KEY=YOUR_BING_SUBSCRIPTION_KEY
cd web && npm install && npm run build
BACKEND=BING python search_with_lepton.py

But when I deploy it to Lepton AI with the command:
lep photon run -n search-with-lepton-modified -m search_with_lepton.py --env BACKEND=BING --env BING_SEARCH_V7_SUBSCRIPTION_KEY=YOUR_BING_SUBSCRIPTION_KEY
I get the error in the replica logs. I then downloaded the latest version of the photon (search-with-lepton-modified@latest) and found that the file 404.html does exist under the ui folder, so it really confuses me.

[BUG] worker_max_concurrency does not work

Describe the bug
Even if worker_max_concurrency in the Worker class is set to a number greater than 1, on_task still processes requests from the queue one by one.

To Reproduce

from leptonai.photon import Worker, Photon
import time
from loguru import logger



class LeptonWorker(Worker):
    worker_max_concurrency = 2
    handler_max_concurrency = 2

    def init(self):
        logger.info(f'[Before Init Worker Class] Worker concurrency is set to {self.worker_max_concurrency}, \n handler concurrency is set to {self.handler_max_concurrency}')
        super().init()

    def on_task(
        self,
        task_id: str,
    ):
        logger.info(f'[On Task loop] Task id {task_id} get into on_task!')
        time.sleep(15)
        logger.info(f'[On Task loop] Task id {task_id} done!')
        return {"res": None}

    @Photon.handler
    def run(
        self,
    ):
        return self.task_post({})

Expected behavior
When worker_max_concurrency=2, two on_task calls should be able to run simultaneously.


Environment (please complete the following information):

  • OS: Ubuntu 20.04
  • Python version: 3.10
  • LeptonAI library version: lep, version 0.18.2


HF enhancement

Is your feature request related to a problem? Please describe.
N/A

Describe the solution you'd like
From former colleagues using the SDK:

  • support passing hf url directly in --model
  • explicitly print out the HF download path so users are not concerned about large file downloads

Describe alternatives you've considered
N/A

Additional context
N/A

[BUG] issues with 'pad_token_id'

Describe the bug
I've been learning leptonai with the official tutorials.

To Reproduce
Following the steps of the official tutorials, whenever I post a request through either curl or the Python code below, I always get the 'pad_token_id' error.

from leptonai.client import Client, local
import os
#api_token = os.environ.get('LEPTON_API_TOKEN')
#client = Client('uzlymnrx', 'mygpt2', token=api_token)
client = Client(local(port=8080))
client.healthz()
client.paths()
print(client.run(inputs='Once upon a time'))

Expected behavior
Receiving the response of the gpt2 model:

"Once upon a time, the warring powers of North Africa, Europe and Asia had their own different doctrines concerning its most sacred affairs, and for a while the idea of a common treaty was still in vogue. It has long been reported that after"


Environment (please complete the following information):

  • OS: macOS Sonoma 14.0
  • Python version: 3.10.0
  • LeptonAI library version: lep, version 0.11.0

Additional context
I also tried deploying the gpt2 model in the cloud; there is no error at all and everything works. But I always get the error locally.

local api endpoints

We are running locally on Linux.
When we use curl we can send "inputs": "Translate this text to French".
Are there other endpoints exposed?

We tried "question": "Who is Frederick Douglass?"
and received {"detail": "Not Found"}.

Thanks !
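For reference, the client described in the README above can enumerate the paths a locally running photon exposes and print each handler's docstring, which is one way to discover what endpoints exist. A sketch, assuming the photon is running locally on port 8080:

from leptonai.client import Client, local

c = Client(local(port=8080))
# list all the paths (endpoints) exposed by the running photon
print(c.paths())
# print the documented signature of the run handler, if present
print(c.run.__doc__)

For standard HuggingFace photons, the exposed handler and its accepted parameters depend on the underlying pipeline, so paths() plus the docstrings are the quickest way to see what inputs are accepted.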

Explicitly add timeout option to Client

Is your feature request related to a problem? Please describe.
Nope. Feedback obtained from user.

Describe the solution you'd like
Right now, there isn't a way for the client to specify a timeout. We should probably extend the client constructor to specify a timeout option that applies to the function calls in the client.

Describe alternatives you've considered
N/A

Additional context
N/A
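Until such an option exists, one possible interim workaround is to adjust the timeout on the client's underlying HTTP session. This is only a sketch: it assumes the client keeps an httpx session in a private _session attribute (as the traceback in the bus-error report elsewhere on this page suggests), which is not a documented API and may change.

import httpx
from leptonai.client import Client, local

c = Client(local(port=8080))
# assumption: the leptonai Client holds an httpx.Client in `_session`;
# raise its default timeout to 120 seconds for long-running handlers
c._session.timeout = httpx.Timeout(120.0)
print(c.run(inputs="a slow request"))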

Add `--extra-deps` to the `photon create` cli

There are a few cases (for example, HuggingFace models) where dependencies need to be specified. While one can create a photon file to do so, it is definitely preferable to just have an additional dependency flag in the CLI.
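For reference, the photon-file route mentioned above looks roughly like the following. This is a sketch that assumes the requirement_dependency class attribute for declaring pip dependencies; check the documentation for the exact attribute name and semantics.

# my_model.py
from leptonai.photon import Photon

class MyModel(Photon):
    # assumption: pip packages the photon needs at runtime are declared here
    requirement_dependency = ["transformers", "torch"]

    @Photon.handler
    def run(self, inputs: str) -> str:
        return inputs

A dedicated --extra-deps flag would avoid having to write such a wrapper file for the common HuggingFace case.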

Gradio has a deprecated warning in photon/hf/

/home/jiayq/Documents/code/lepton/sdk/leptonai/photon/hf/hf.py:349: GradioDeprecationWarning: The `style` method is deprecated. Please set these arguments in the constructor instead.
  txt = gr.Textbox(

[BUG] Step 2 of Quickstart: local server crashed with error "zsh: bus error lep photon run --name mygpt2 --local"

  1. Following https://www.lepton.ai/docs/overview/quickstart
  2. Everything normal, with the local server up and running at 0.0.0.0:8080
  3. Executed the first request at step 2. I tried command-line curl, browser-based execution, and Lepton's Python clients. All crashed the server.

environment:

  • Mac with Apple M1
  • python 3
  • lepton 0.16.0

server side

(base) user1@btf ~ % lep photon run --name mygpt2 --local

Launching photon on port: 8080
2024-02-02 12:27:02.451 | INFO | leptonai.photon.hf.hf:pipeline:213 - Creating pipeline for text-generation(model=gpt2, revision=11c5a3d5).
HuggingFace download might take a while, please be patient...
2024-02-02 12:27:02.452 | INFO | leptonai.photon.hf.hf:pipeline:218 - Note: HuggingFace caches the downloaded models in ~/.cache/huggingface/ (or C:\Users\<username>\.cache\huggingface\ on Windows). If you have already downloaded the model before, the download should be much faster. If you run out of disk space, you can delete the cache folder.
2024-02-02 12:27:05.760 | WARNING | leptonai.photon.photon:_mount_route:928 - Skip mounting ui as gradio is not installed
2024-02-02 12:27:05.762 | INFO | leptonai.photon.photon:_uvicorn_run:820 - Setting up signal handlers for graceful incoming traffic shutdown after 5 seconds.
2024-02-02 12:27:05,763 - INFO:
If you are using standard photon, a few urls that may be helpful:
- http://0.0.0.0:8080/docs OpenAPI documentation
- http://0.0.0.0:8080/redoc Redoc documentation
- http://0.0.0.0:8080/openapi.json Raw OpenAPI schema
- http://0.0.0.0:8080/metrics Prometheus metrics

If you are using python clients, here is an example code snippet:
from leptonai.client import Client, local
client = Client(local(port=8080))
client.healthz() # checks the health of the photon
client.paths() # lists all the paths of the photon
client.method_name? # If client has a method_name method, get the docstring
client.method_name(...) # calls the method_name method
If you are using ipython, you can use tab completion by typing client. and then press tab.

2024-02-02 12:27:05,767 - INFO: Started server process [30338]
2024-02-02 12:27:05,767 - INFO: Waiting for application startup.
2024-02-02 12:27:05.768 | INFO | leptonai.photon.photon:uvicorn_startup:763 - Starting photon app - running startup prep code.
2024-02-02 12:27:05,768 - INFO: Application startup complete.
2024-02-02 12:27:05,768 - INFO: Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)
2024-02-02 12:28:58,488 - INFO: 127.0.0.1:49342 - "GET /openapi.json HTTP/1.1" 200 OK
Setting pad_token_id to eos_token_id:50256 for open-end generation.
zsh: bus error lep photon run --name mygpt2 --local

client side

# qs.py
# Via python client
from leptonai import Client
url = "http://localhost:8080"
client = Client(url)
print(client.run(inputs='Once upon a time'))

(base) user1@btf lepton % python qs.py
Traceback (most recent call last):
File "/Users/user1/miniforge3/lib/python3.10/site-packages/httpx/_transports/default.py", line 67, in map_httpcore_exceptions
yield
File "/Users/user1/miniforge3/lib/python3.10/site-packages/httpx/_transports/default.py", line 231, in handle_request
resp = self._pool.handle_request(req)
File "/Users/user1/miniforge3/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 268, in handle_request
raise exc
File "/Users/user1/miniforge3/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 251, in handle_request
response = connection.handle_request(request)
File "/Users/user1/miniforge3/lib/python3.10/site-packages/httpcore/_sync/connection.py", line 103, in handle_request
return self._connection.handle_request(request)
File "/Users/user1/miniforge3/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 133, in handle_request
raise exc
File "/Users/user1/miniforge3/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 111, in handle_request
) = self._receive_response_headers(**kwargs)
File "/Users/user1/miniforge3/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 176, in _receive_response_headers
event = self._receive_event(timeout=timeout)
File "/Users/user1/miniforge3/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 226, in _receive_event
raise RemoteProtocolError(msg)
httpcore.RemoteProtocolError: Server disconnected without sending a response.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/Users/user1/work/mycode/lepton/qs.py", line 5, in
print(client.run(inputs='Once upon a time'))
File "/Users/user1/miniforge3/lib/python3.10/site-packages/leptonai/client.py", line 468, in _method
res = self._post(
File "/Users/user1/miniforge3/lib/python3.10/site-packages/leptonai/client.py", line 409, in _post
return self._session.post(f"{self.url}/{path.lstrip('/')}", *args, **kwargs)
File "/Users/user1/miniforge3/lib/python3.10/site-packages/httpx/_client.py", line 1146, in post
return self.request(
File "/Users/user1/miniforge3/lib/python3.10/site-packages/httpx/_client.py", line 828, in request
return self.send(request, auth=auth, follow_redirects=follow_redirects)
File "/Users/user1/miniforge3/lib/python3.10/site-packages/httpx/_client.py", line 915, in send
response = self._send_handling_auth(
File "/Users/user1/miniforge3/lib/python3.10/site-packages/httpx/_client.py", line 943, in _send_handling_auth
response = self._send_handling_redirects(
File "/Users/user1/miniforge3/lib/python3.10/site-packages/httpx/_client.py", line 980, in _send_handling_redirects
response = self._send_single_request(request)
File "/Users/user1/miniforge3/lib/python3.10/site-packages/httpx/_client.py", line 1016, in _send_single_request
response = transport.handle_request(request)
File "/Users/user1/miniforge3/lib/python3.10/site-packages/httpx/_transports/default.py", line 230, in handle_request
with map_httpcore_exceptions():
File "/Users/user1/miniforge3/lib/python3.10/contextlib.py", line 153, in exit
self.gen.throw(typ, value, traceback)
File "/Users/user1/miniforge3/lib/python3.10/site-packages/httpx/_transports/default.py", line 84, in map_httpcore_exceptions
raise mapped_exc(message) from exc
httpx.RemoteProtocolError: Server disconnected without sending a response.

Reduce dependency when installing leptonai

Is your feature request related to a problem? Please describe.
Installing leptonai via poetry increased the image size 10x, resulting in longer deployment times.

Describe the solution you'd like
Can we have a core or lighter version of leptonai, so I can choose one with or without additional dependencies like CUDA?

Additional context

I have attached a screenshot of the image sizes built with/without the leptonai lib.

[BUG] whisperx API translates other languages automatically

Describe the bug
The whisperx API automatically translates other languages when the audio file is sent as a base64 string.

To Reproduce
fetch('https://whisperx.lepton.run/run', {
    method: 'POST',
    headers: {
        'Authorization': `Bearer ${token}`,
        // Add other headers required by the API
        'Content-Type': 'application/json'
    },
    body: JSON.stringify({
        input: base64audiostring,
        language: model.language
    })
});

Expected behavior
It should only transcribe, not translate to English.



how to revoke a token?

Nice project!

Is your feature request related to a problem? Please describe.
I want to allow someone to log in using my token (lep login -c xxx:xxxx), but after a certain period of time I may want to revoke this privilege.

Describe the solution you'd like
Perhaps adding a feature to revoke the token?

Describe alternatives you've considered
It seems that lep logout --purge only clears the local permission information and does not invalidate the token.


a team workspace to share credits and running jobs

Is your feature request related to a problem? Please describe.

My team is using Lepton to develop an LLM agent. However, we quickly ran out of the free credits. When I went to top up more credits for our team to share, I found there is no team workspace.


Describe the solution you'd like
A team workspace that can share running jobs and credits.

Describe alternatives you've considered
At the very least, my team members should be able to share my credits rather than me topping up for them one by one.


cc @leeeizhang

Have you considered supporting multiple Photon subclasses in a single Python script?

Is your feature request related to a problem? Please describe.

I tried to create multiple photons and got:

Failed to create photon: Found multiple sub classes of Photon in test.py: ['Counter', 'Counter1']
Describe the solution you'd like

Now:

c = Client(local(8080))
print(f"Add 3, result: {c.add(x=3)}") # This will return 3
print(f"Add 5, result: {c.add(x=5)}") # This will return 8
print(f"Sub 2, result: {c.sub(x=2)}") # This will return 6
curl -X POST -H "Content-Type: application/json" -d '{"x": 3}' http://localhost:8080/add

Hope:

c = Client(local(8080, 'Counter'))
print(f"C Add 3, result: {c.add(x=3)}") # This will return 3
print(f"C Add 5, result: {c.add(x=5)}") # This will return 8
print(f"C Sub 2, result: {c.sub(x=2)}") # This will return 6

c1 = Client(local(8080, 'Counter1'))
print(f"C1 Add 3, result: {c1.add(x=3)}") # This will return 3
print(f"C1 Add 5, result: {c1.add(x=5)}") # This will return 8
print(f"C1 Sub 2, result: {c1.sub(x=2)}") # This will return 6
curl -X POST -H "Content-Type: application/json" -d '{"x": 3}' http://localhost:8080/Counter/add

I'm not sure the need is justified in terms of security and privacy.

Mason

Build example on github integration

Basically, if I have a github repository that is my codebase, and the codebase builds and launches deployments, we should show how one could integrate that with the github workflow. Specifically, as a minimal example, say we have such a repo:

/
- photon.py
- requirements.txt
- .github/
    - workflows/

and usually we run

lep photon create -n myphoton -m photon.py
lep photon push -n myphoton
lep photon run -n myphoton

or

lep deployment update -n myphoton --id NEW_ID

We should show how a user can integrate this into the github workflow and automate it.

Original issue at https://github.com/leptonai/lepton/issues/2656

Reorganize storage files through web UI

Is your feature request related to a problem? Please describe.
When I want to manage and reorganize the files uploaded to my storage (https://dashboard.lepton.ai/workspace/<myworkspace>/storage), there is no drag-and-drop functionality to easily change a file's location, and I also cannot edit file names.

Describe the solution you'd like

  1. Drag-and-drop functionality to reorganize files/folders.
  2. Ability to rename files.

Incorrect error reported when file not found

> lep photon create -n test -m nonexisteng.py 
Failed to create photon: No module named 'nonexisteng'

It seems a bit odd that we return a "no module" error; it should be "file not found" or something similar.

[BUG] KV isn't working, even though the LEPTON_API_TOKEN environment variable is set.

Describe the bug

I'm using KV in Photon, and I've set the LEPTON_API_TOKEN environment variable when creating the Deployment, but it still doesn't work properly.

RuntimeError: It seems that you are not logged in to any workspace yet. Please log in and then retry. To log in, use `lep login` from the commandline.

To Reproduce

Creating a KV in Photon results in an error.

KV(namespace, create_if_not_exists=True, wait_for_creation=True)

Expected behavior

Reading and writing KV normally.

Environment (please complete the following information):
lep, version 0.18.1

[BUG] Got pip error when follow the contribution guide

Describe the bug
Following the install steps defined in contribution.md (I am using Python 3.12 via conda), an error is thrown after executing:

pip install .[test]
ERROR: Could not find a version that satisfies the requirement torch (from leptonai) (from versions: none)
ERROR: No matching distribution found for torch

To Reproduce
Execute pip install .[test] under a Python env >= 3.12.

Expected behavior
No errors.


Environment (please complete the following information):

  • OS: MacOS
  • Python version: 3.12.0
  • LeptonAI library version: current main branch

Additional context
Similar issues have been found, due to Python versions >= 3.12:

pytorch/pytorch#114127

[question] how to allow public to access deployments?

I followed the quickstart and everything went well. Then I edited the deployment, setting "Access Tokens = Enable Public Access".

By doing this, I thought the public could use my demo without access tokens. The endpoint is indeed open to the public. However:

With endpoint "https://mgjao2r9-mygpt2.tin.lepton.run/run" in my browser , I got "{"detail":"Method Not Allowed"}"
With endpoint "https://mgjao2r9-mygpt2.tin.lepton.run" in my browser, I got "{"detail":"Not Found"}"

`lep login` on a headless machine?

Is your feature request related to a problem? Please describe.
I usually work on a remote Linux box in the terminal. Is there any way to make lep login work there?

Describe the solution you'd like
Heroku solved this issue with a -i flag. However, if you are planning to support MFA, an alternative method would be preferable.

[BUG] Sorry, we might be overloaded, try again later.

Hello,

I don't know where I'm making a mistake, but after following the documentation for the backend I hosted on your platform, once I launch the front end I get this message after asking a question:

Sorry, we might be overloaded, try again later.

Can anyone help me? Thank you


Add report issue / feedback link in document

Is your feature request related to a problem? Please describe.
Yes. Encountered issues following QuickStart and want to submit feedback.

Describe the solution you'd like
An explicit url in the QuickStart document on where to submit issues instead of having an imperceptible Github icon at the far corner.

Describe alternatives you've considered
Guessing and Googling "Lepton AI github".


inference with photons

On Twitter, the Lepton AI account states that for Llama your optimized runtime runs up to 90 times faster than the transformer baselines. Is this only implemented in your playground, or is it something done through the photons? It states that the "magic" is done underneath, but I haven't found anything in the website documentation about this. Or is it expected that developers will use vLLM for inference optimization? If that's the case, why would we need to use the container when we could simply use vLLM at its source? It may also be that those optimizations only come from the cloud-based part of Lepton; if so, it seems a little disingenuous not to state that.

[BUG] Connection error

Describe the bug

Failed to create photon:
(MaxRetryError('HTTPSConnectionPool(host='huggingface.co', port=443): Max
retries exceeded with url: /api/models/mistralai/Mistral-7B-Instruct-v0.1
(Caused by SSLError(SSLCertVerificationError(1, "[SSL:
CERTIFICATE_VERIFY_FAILED] certificate verify failed: Hostname mismatch,
certificate is not valid for 'huggingface.co'. (_ssl.c:997)")))'), '(Request
ID: 92e7ab3a-6a8f-4b86-8626-9c40352a8499)')
To Reproduce

lep photon create --name mistral-7b --model hf:mistralai/Mistral-7B-Instruct-v0.1



Environment (please complete the following information):

  • OS: MacOS
  • Python version: 3.10
  • LeptonAI library version: lep, version 0.11.0

Additional context
located in China and [redacted by repository admin to protect users' privacy] able to access huggingface.co.
