Comments (5)
Hey Keming, I'm interested in taking a look at this issue. I briefly looked into some Rust crates for this feature and found this crate. It seems to support a Redis cache, a sized cache, and a timed cache (although I don't believe it has a combined timed + sized cache). My first thought would be to add an axum middleware to handle the caching logic. What are your thoughts on this?
from mosec.
> Hey Keming, I'm interested in taking a look at this issue. I briefly looked into some Rust crates for this feature and found this crate. It seems to support a Redis cache, a sized cache, and a timed cache (although I don't believe it has a combined timed + sized cache). My first thought would be to add an axum middleware to handle the caching logic. What are your thoughts on this?
I think this PR should come with a benchmark. I don't know if this lib fits our requirements:
- multiple routes
- local & remote cache
- cache TTL
- cache size limit
I don't know how it handles the cache key, since the key/value could be a huge image (like 3 x 1000 x 1000 f32). The benchmark should include different key/value types, like a simple string, an image, an embedding, etc.
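To make the TTL and size-limit requirements concrete, here is a minimal sketch of a local cache bounded by both entry count and entry age, using only the Rust standard library. `TtlSizedCache` and its methods are hypothetical names, not part of mosec or of the crate mentioned above. It evicts in insertion order (FIFO) rather than LRU, and it does not cover the multi-route or remote-cache requirements.

```rust
use std::collections::{HashMap, VecDeque};
use std::time::{Duration, Instant};

/// Hypothetical in-memory cache limited by entry count and entry age.
pub struct TtlSizedCache {
    capacity: usize,
    ttl: Duration,
    // key -> (cached response bytes, insertion time)
    entries: HashMap<u64, (Vec<u8>, Instant)>,
    // keys in insertion order, oldest first; drives eviction
    order: VecDeque<u64>,
}

impl TtlSizedCache {
    pub fn new(capacity: usize, ttl: Duration) -> Self {
        Self {
            capacity,
            ttl,
            entries: HashMap::new(),
            order: VecDeque::new(),
        }
    }

    /// Returns the cached bytes, or None if the key is missing or expired.
    pub fn get(&mut self, key: u64) -> Option<&[u8]> {
        let expired = match self.entries.get(&key) {
            Some((_, inserted_at)) => inserted_at.elapsed() >= self.ttl,
            None => return None,
        };
        if expired {
            // Drop the stale entry lazily, on access.
            self.entries.remove(&key);
            self.order.retain(|k| *k != key);
            return None;
        }
        self.entries.get(&key).map(|(value, _)| value.as_slice())
    }

    /// Inserts a value, evicting the oldest entries once over capacity.
    pub fn insert(&mut self, key: u64, value: Vec<u8>) {
        if self.capacity == 0 {
            return;
        }
        if self.entries.insert(key, (value, Instant::now())).is_some() {
            // Overwrite: the key is re-queued as the newest entry.
            self.order.retain(|k| *k != key);
        }
        self.order.push_back(key);
        while self.entries.len() > self.capacity {
            match self.order.pop_front() {
                Some(oldest) => {
                    self.entries.remove(&oldest);
                }
                None => break,
            }
        }
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

A benchmark could compare a structure like this against the crate's caches for each payload size. Note that the limit here counts entries, not bytes, so a cache full of 12 MB images would need a byte-based budget instead.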
from mosec.
Good point. Do you think the cache should be aware of the exact content type?
from mosec.
> Good point. Do you think the cache should be aware of the exact content type?
No, because we don't really parse the HTTP request body on the Rust side. I listed different types of data only because their sizes differ.
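Since the body is treated as opaque bytes, one option for the cache key is to hash the route together with the raw body, so the key stays a fixed size even for a huge image. The sketch below is only an illustration: `cache_key` is a hypothetical helper, and `DefaultHasher` is used only because it ships with the standard library. Its algorithm is unspecified and may change between Rust releases, so a remote cache would likely need a stable digest instead, and a 64-bit key can collide.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical helper: derive a fixed-size cache key from the route
/// and the raw, unparsed request body.
pub fn cache_key(route: &str, body: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    // Include the route so identical bodies sent to different
    // endpoints do not share a cache entry.
    route.hash(&mut hasher);
    body.hash(&mut hasher);
    hasher.finish()
}
```

Hashing cost grows linearly with body size, which is one reason the benchmark should cover both small strings and large images.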
from mosec.
For the benchmark, you can check https://github.com/tensorchord/inference-benchmark/tree/main/benchmark
from mosec.