jozu-ai / kitops
Tools for easing the handoff between AI/ML and App/SRE teams.
Home Page: https://KitOps.ml
License: Apache License 2.0
Describe the problem you're trying to solve
It should be possible to use a corporate proxy with the kit CLI.
Describe the solution you'd like
kit CLI should respect the HTTP_PROXY* environment settings. It should also be possible to set proxies through CLI flags.
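A minimal sketch of the precedence this request implies, written in Python for brevity rather than in the CLI's own language. The function name `resolve_proxy` and the `flag_value` parameter are hypothetical stand-ins for whatever flag the CLI might add; NO_PROXY handling is left out.

```python
import os
from urllib.parse import urlsplit


def resolve_proxy(target_url, flag_value=None, env=None):
    """Pick a proxy for target_url: explicit flag first, then environment."""
    env = os.environ if env is None else env
    # Assumption: a value passed on the command line wins over the environment.
    if flag_value:
        return flag_value
    scheme = urlsplit(target_url).scheme.lower()
    var = "HTTPS_PROXY" if scheme == "https" else "HTTP_PROXY"
    # Accept both upper- and lower-case spellings, as many tools do.
    return env.get(var) or env.get(var.lower()) or None
```

Calling `resolve_proxy("https://registry.example.com")` with no flag falls back to whatever `HTTPS_PROXY` is set to in the current environment, or `None` if it is unset.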
Describe the problem you're trying to solve
When editing Kitfiles, it should be possible to get validation and code assist.
Describe the solution you'd like
Provide a JSON Schema that can be referenced by editors that support JSON Schema based validation.
The Schema should eventually be moved to schema store once kitfile API is more stable.
Describe the problem you're trying to solve
As with macOS, we should also sign the Windows binaries.
Describe the solution you'd like
Describe the problem you're trying to solve
It's unclear when you do a pack or unpack how long it will take or whether it's even still working. For large models and datasets there could be quite a long delay in building the ModelKit.
Describe the solution you'd like
Some kind of status update including a description of what it's doing at various stages / steps.
Describe alternatives you've considered
n/a
Additional context
none
Describe the problem your feature would solve
It can sometimes be difficult to understand what a model is meant to be used for and what datasets are / why they were included.
Describe the solution you'd like
A readme included in the ModelKit would make it easy for the Kitfile producer and consumer to understand the context for the ModelKit and the assets it is packaging.
Describe alternatives you've considered
Descriptions in the Kitfile are okay, but are either very short and keep the Kitfile readable, or are very long and wreck the readability.
Additional context
[none]
Describe the problem you're trying to solve
The new Quick Start does a good job of getting someone familiar with the basic Kit commands, but it doesn't go into Kitfile authorship or other critical commands.
Describe the solution you'd like
A Next Steps with Kit doc that follows from the Quick Start would help new users to build their own ModelKits and learn how to take advantage of tagging and registry management.
Currently, kit defaults to storing everything within $HOME/.kitops. Since the .kitops directory is not intended for direct user access (and instead serves as general storage for OCI artifacts, credentials, etc.), it makes more sense to use more standard directories for each operating system:
$HOME/.local/share/kitops
%LOCALAPPDATA%/kitops
In addition, it would be convenient to respect an environment variable -- e.g. $KITOPS_HOME -- for overriding this default (in addition to the --config flag). This would allow using an alternate storage path for multiple commands (or setting a default value system wide) without needing to add the --config flag to each command.
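The lookup order proposed above could be sketched as follows (Python used for illustration only). The function name, the flag-over-environment precedence, and the fallback to $HOME/.kitops on platforms not listed above are all assumptions, not decisions recorded in this issue.

```python
def storage_root(platform, env, config_flag=None):
    """Return the storage directory for a platform name and environment mapping."""
    # Assumption: an explicit --config value beats everything else.
    if config_flag:
        return config_flag
    # The proposed environment override comes next.
    if env.get("KITOPS_HOME"):
        return env["KITOPS_HOME"]
    if platform == "win32" and env.get("LOCALAPPDATA"):
        return env["LOCALAPPDATA"] + "/kitops"
    home = env.get("HOME", "")
    if platform.startswith("linux"):
        return home + "/.local/share/kitops"
    # Assumption: any other platform keeps the current default.
    return home + "/.kitops"
```

In a real program this would be called as `storage_root(sys.platform, os.environ)`; taking both as parameters keeps the function easy to test.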
Currently, ModelKits are stored using one OCI spec index per repository, using the folder structure
<storage-root>
└── <registry>
    └── <organization>
        ├── <repository1>
        │   ├── blobs
        │   ├── index.json
        │   └── oci-layout
        └── <repository2>
            ├── blobs
            ├── index.json
            └── oci-layout
As the OCI image index spec does not leave easy room for multiple repositories within one index, tagging the same image into two separate repositories currently uses double the storage. In other words, executing
kit tag my-image:mytag my-other-image:mytag
results in the blobs for my-image being copied to another directory.
Note this issue isn't present for ModelKits within the same repository -- i.e. my-image:tag1 and my-image:tag2 will share storage as expected.
Since blobs are content-addressable and there are no auth concerns with locally-stored modelkits, it makes sense to store each blob only once, and reference them from multiple different indexes. This would cut down on storage requirements for ModelKits while keeping a relatively pure OCI image index structure.
Alternatively, we could abandon using the image index structure for local storage and instead implement an alternate way of tracking references to ModelKits in local storage. This would avoid the need for potentially awkward workarounds to manage accessing and removing blobs locally.
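To make the first option concrete, here is a toy in-memory model in Python of a shared blob pool with one tag index per repository. `LocalStore` and its methods are hypothetical, and it simplifies a ModelKit to a single blob, whereas a real one is a manifest that references several layers.

```python
import hashlib


class LocalStore:
    """Toy model: one shared blob pool, one tag index per repository."""

    def __init__(self):
        self.blobs = {}    # digest -> bytes, each blob stored once
        self.indexes = {}  # repository -> {tag: digest}

    def put(self, repo, tag, data):
        digest = "sha256:" + hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)
        self.indexes.setdefault(repo, {})[tag] = digest
        return digest

    def tag(self, src_repo, src_tag, dst_repo, dst_tag):
        # Tagging only adds a reference; no blob is copied.
        digest = self.indexes[src_repo][src_tag]
        self.indexes.setdefault(dst_repo, {})[dst_tag] = digest

    def remove(self, repo, tag):
        digest = self.indexes[repo].pop(tag)
        # Delete the blob only when no index in any repository still points at it.
        still_used = any(digest in tags.values() for tags in self.indexes.values())
        if not still_used:
            del self.blobs[digest]
```

The scan in `remove` is the kind of workaround the second option would avoid: with per-repository indexes, deciding whether a blob is still in use means looking across every index.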
Describe the problem you're trying to solve
To reduce adoption friction among data scientists we need a Python library that makes it trivial for someone to create a kitfile or build a ModelKit as part of their code.
Please design an FAQ section for the home page. This section should display the questions with collapsed answers. Let's start with 10 questions.
Describe the bug
When trying to run kit pack with the attached Kitfile I get an error:
Failed to pack model kit: error resolving /Users/bmicklea/Documents/GitHub: lstat /Users/bmicklea/Documents/GitHub/Users: no such file or directory
It looks like it's appending a second Users to the path, although that's weird because the path actually has more to it...
To Reproduce
Steps to reproduce the behavior:
kit pack ./ with the attached Kitfile (it'll need to have the .txt extension removed)
Version
Next
Describe the bug
The links to join the KitOps discord have either expired or are directing to the old discord server. This needs to be updated.
I've created a link that will not expire: https://discord.gg/YyAfWnEg
Design the interface for the demo dev command.
missing UI elements:
Currently, CLI operations that require uploading or downloading data from a remote registry work silently, which can be confusing if the operation takes a relatively long time.
The CLI should support progress bars for kit push, kit pull, and kit export (when the ModelKit isn't present locally). We should only print these progress bars when the CLI is being used interactively (i.e. don't print progress bars in CI environments, etc.).
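A common way to decide "interactive or not" is to check whether the output stream is a terminal and whether a `CI` environment variable is set (several CI systems export one). The sketch below shows that check in Python; the function name is hypothetical and the actual CLI may use different signals.

```python
import os
import sys


def should_show_progress(stream=None, env=None):
    """Return True only when output goes to a terminal outside of CI."""
    stream = sys.stderr if stream is None else stream
    env = os.environ if env is None else env
    # Assumption: any non-empty CI variable means a non-interactive run.
    if env.get("CI"):
        return False
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())
```

Redirecting output to a file or a pipe makes `isatty()` return False, so logs captured that way stay free of progress-bar control characters.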
Describe the problem your feature would solve
When using very large datasets (e.g., image and video data especially), it's painful to have to clone them to my local machine in order to package up a ModelKit.
Describe the solution you'd like
It would be nice if you could provide a URI to the dataset in the Kitfile. I understand that that would trade download time for me for wait time as the ModelKit fetched the data in order to be built...as long as it doesn't happen on my machine 😄
Describe alternatives you've considered
There isn't really an alternative beyond cloning everything locally.
WHEN CLOSED UPDATE THE NEXT STEPS DOC (NEXT-STEPS.MD) WHICH REFERENCES THIS ISSUE.
Describe the problem you're trying to solve
There's no way to remove a tag from a ModelKit if it only has one tag (you end up removing the ModelKit).
Describe the solution you'd like
A flag to remove a tag (even the last tag) would help.
Describe alternatives you've considered
Adding the new tag and then removing the old. This might be better if we think we always want a tag for every ModelKit...
Add completions for operations that work on local storage:
kit push <modelkit> -- complete with ModelKits (registry/repository:tag) stored locally
kit unpack <modelkit> -- same as above
kit remove <modelkit> -- same as above
kit tag <source> <dest> -- complete for source ModelKit; maybe for destination as well?
kit list -- complete for local repositories (i.e. where a registry is not included)
Describe the problem you're trying to solve
Bulleted and numbered lists in the docs are a little hard to read - they're not very indented and generally just kind of run together with the other text. See the use-cases.md as a good example.
Describe the solution you'd like
A little deeper indent on lists would help. Perhaps if there was more space between regular paragraphs and a little less at the top of the first item in a list that would help too. The docs need room to breathe! 😄
Describe alternatives you've considered
Not using lists, but that seems extreme.
Additional context
n/a
Describe the problem you're trying to solve
Currently, the unpack command does not fully cater to the specific needs and expectations of different target environments, such as inference engines. There's a gap in functionality where the command fails to dynamically adjust and resolve the unpacked files based on the unique structure and metadata requirements of these environments. This lack of flexibility can lead to inefficiencies and potential mismatches when integrating unpacked files into various runtime, test, development, or deployment settings.
Describe the solution you'd like
To address this issue, I propose enhancing the kit unpack command to offer a more adaptable and environment-aware unpacking mechanism. Although ModelKits are designed to be abstracted away from different environments, it's essential for the kit unpack functionality to be able to cater to the unique needs of these environments when unpacking artifacts.
A feasible approach to achieving this is through the introduction of a plugin mechanism. This system would allow for the development of environment-specific plugins that kit can leverage during the unpacking process. Each plugin would instruct the kit unpack command on how to properly structure the unpacked files to meet the requirements of a target environment.
ModelKits are abstracted from the runtime, test, development, or deployment environments.
This would allow us to customize the kit unpack process to meet the diverse requirements of different target environments efficiently, which in turn streamlines deployment and integration workflows, improving user experience across various environments.
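One possible shape for such a plugin mechanism, sketched in Python: each plugin is a function that maps an artifact's kind and its path inside the ModelKit to the path a target environment expects. The registry, the `example-server` target, and the directory names are all hypothetical; nothing here reflects an existing kit interface.

```python
from typing import Callable, Dict

# A plugin maps (artifact kind, path in the ModelKit) to a destination path.
LayoutPlugin = Callable[[str, str], str]

PLUGINS: Dict[str, LayoutPlugin] = {}


def register(target):
    """Decorator that registers a layout plugin under a target name."""
    def decorator(fn):
        PLUGINS[target] = fn
        return fn
    return decorator


@register("default")
def keep_layout(kind, path):
    # Default behaviour: unpack files exactly where the ModelKit has them.
    return path


@register("example-server")  # hypothetical target environment
def example_server_layout(kind, path):
    name = path.rsplit("/", 1)[-1]
    if kind == "model":
        return "models/" + name
    return "assets/" + kind + "/" + name


def plan_unpack(target, artifacts):
    """Return {source path: destination path} for (kind, path) pairs."""
    plugin = PLUGINS.get(target, PLUGINS["default"])
    return {path: plugin(kind, path) for kind, path in artifacts}
```

Unknown targets fall back to the default layout here; a real implementation might prefer to fail loudly instead.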
Describe the problem you're trying to solve
Run a model in a ModelKit locally for development/testing purposes
Describe the solution you'd like
Develop a test harness that runs inference on the given model from a locally run web server, exposing REST APIs and an additional chat interface for LLM development.
Describe alternatives you've considered
Modifying similar projects to use ModelKits
Issue list:
Describe the problem you're trying to solve
We need to wrap all of our buttons on KitOps in GA4 tags for tracking purposes. Most importantly the Install button
https://developers.google.com/analytics/devguides/collection/ga4/events?client_type=gtag
Describe the problem you're trying to solve
A mechanism to exclude files or folders from ModelKit during packaging
Describe the solution you'd like
Implement .kitignore to ignore files/folders to be excluded from ModelKit packaging.
Describe alternatives you've considered
Use .gitignore
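As a rough illustration of what a .kitignore filter could do, the Python sketch below supports comments, glob patterns, and trailing-slash directory patterns. It is a small subset of .gitignore semantics (no negation, no anchoring), and the function names are hypothetical.

```python
import fnmatch


def load_patterns(text):
    """Parse ignore-file text, skipping blank lines and # comments."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_ignored(path, patterns):
    """Return True if a slash-separated relative path matches any pattern."""
    parts = path.split("/")
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: ignore anything under a directory with that name.
            if pattern.rstrip("/") in parts[:-1]:
                return True
        elif fnmatch.fnmatchcase(path, pattern) or fnmatch.fnmatchcase(parts[-1], pattern):
            return True
    return False
```

Packaging would then walk the context directory and skip every path for which `is_ignored` returns True.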
Describe the problem you're trying to solve
ModelKits and the assets they contain can come from any location and be built by anyone. None of the existing model / dataset packaging mechanisms provide inherent guarantees of provenance or safety. Users want a way to know where the package they are using has come from so they can make their own decision about whether to trust it.
Describe the solution you'd like
ModelKits should be able to include attestations for the package and its contents. We could use something like the SLSA's verification summary and include it with the ModelKit as an option. This would make ModelKits the first packaging for AI/ML that provides provenance attestations.
Describe the problem you're trying to solve
Reference material that shows how to use KitOps for fine-tuning LLMs
Describe the solution you'd like
Additional context
We can take this solution as a permanent service for kitops.ml site.
Describe the problem you're trying to solve
Once I have a large-ish number of ModelKits it can become annoying to have to clean them out one-by-one.
Describe the solution you'd like
Some kind of CLI command to do bulk removes based on tags perhaps?
Create a tutorial with samples that shows how a ModelKit can be easily packaged into a container image using the kit unpack command as part of the Dockerfile.
By adding color and/or spaces and lines to the layout.
Color is limited to: https://en.wikipedia.org/wiki/ANSI_escape_code
Describe the bug
In iOS Safari, the Marquee of logos in the kitops site has a weird timing and spacing issue.
To Reproduce
Steps to reproduce the behavior:
We should simplify the quick start and make it more step-by-step so they understand the power of Kit
We can then follow with helping them build their own ModelKit
Describe the bug
I created a new link that is supposed to never expire: https://discord.gg/3eDb4yAN
Implement an init command to initialize a Kitfile in a given repository. The command should be able to introspect the folder and suggest a Kitfile. It should also be able to use artifacts from different frameworks like MLflow to initialize the information.
The init command is an interactive command. It should be able to discover artifacts in a folder, such as Jupyter notebooks, CSV, JSON, and well-known serialized models, and interactively complete the generation of the Kitfile.
Also, init should recognize and generate a Kitfile from the info generated by MLflow's save_model.
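The discovery step could start as simply as grouping files by extension, as in this Python sketch. The extension table and the `model` / `datasets` / `code` grouping are illustrative assumptions, not an official list, and the interactive confirmation and MLflow handling are not shown.

```python
import os

# Illustrative mapping only; a real init command would need a richer table.
KIND_BY_EXTENSION = {
    ".ipynb": "code",
    ".py": "code",
    ".csv": "datasets",
    ".json": "datasets",
    ".onnx": "model",
    ".safetensors": "model",
    ".gguf": "model",
    ".pkl": "model",
}


def classify(paths):
    """Group relative file paths into candidate Kitfile sections."""
    found = {"model": [], "datasets": [], "code": []}
    for path in sorted(paths):
        ext = os.path.splitext(path)[1].lower()
        kind = KIND_BY_EXTENSION.get(ext)
        if kind:
            found[kind].append(path)
    return found


def discover(root):
    """Walk a folder and classify every file found under it."""
    paths = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            paths.append(os.path.relpath(os.path.join(dirpath, name), root))
    return classify(paths)
```

The result of `discover(".")` would be the starting suggestion that the interactive prompts then let the user accept or edit.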
The ModelKit Manifest section of the ModelKit specification should refer to the OCI specification so that it is clear to the reader that the manifest is inherited from OCI.
The commands & flags should be sorted by the most common usage. The order is currently alphabetical.
Tasks:
We want to make it easy for the community to use Kit with their favourite LLMs including:
Describe the problem you're trying to solve
The current LLM prompt configuration and interaction UX / UI is ugly and somewhat confusing.
Describe the solution you'd like
We should rethink what aspects are consistently vs rarely changed and redesign the interface to make it both smoother and more joyful to use.
Describe the problem you're trying to solve
The version number that is generated for the documentation site should have a reference to the version of the CLI it is generated with.
Describe the problem you're trying to solve
Although I can build nearly any deployable artifact from the serialized model that is in the ModelKit, it's still work...
Describe the solution you'd like
Generate a Dockerfile based on my ModelKit and some basic criteria I provide.
Describe alternatives you've considered
Making Dockerfiles on my own from the unpacked model in a ModelKit.
Additional context
none
Describe the problem you're trying to solve
Create a reference material on how to use KitOps for creating RAG pipelines.
Describe the solution you'd like
Additional context
We can also make this solution deployed as a permanent service for Kitops.ml website.
GH tags to add:
Go Report Card
https://goreportcard.com/
Hits to track GH visits
https://hits.seeyoufarm.com/
Release #
License Type
KitOps logo at the top of Readme
Issue counter
*Add Good first issue tag
KitOps site Footer Popup
by adding color.
e.g.
error: red
warning: yellow
additional messages: grey
Color code should match #77
Color is limited to: https://en.wikipedia.org/wiki/ANSI_escape_code
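The mapping above can be sketched with standard ANSI SGR codes (31 for red, 33 for yellow, 90 for bright black, which most terminals render as grey). The Python below is only an illustration; the level name `info` for "additional messages" and the `enabled` switch are assumptions.

```python
RESET = "\033[0m"
COLORS = {
    "error": "\033[31m",    # red
    "warning": "\033[33m",  # yellow
    "info": "\033[90m",     # bright black, usually shown as grey
}


def colorize(level, message, enabled=True):
    """Wrap message in the ANSI color for level; pass through if disabled or unknown."""
    color = COLORS.get(level)
    if not enabled or color is None:
        return message
    return color + message + RESET
```

Passing `enabled=False` when output is not a terminal keeps escape sequences out of log files.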