h1st-ai / h1st
Power Tools for AI Engineers With Deadlines
Home Page: https://h1st.ai
License: Other
Various h1st submodules still contain __init__.py files that are empty, containing neither useful initialization code nor "public" API classes / functions for those submodules.
Such empty __init__.py files are unnecessary in Python 3.x and should be cleaned up.
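As a starting point for that cleanup, the following is a minimal sketch of how the empty files could be listed. The helper name find_empty_init_files is hypothetical and not part of h1st; it uses only the standard library and treats whitespace-only files as empty.

```python
from pathlib import Path


def find_empty_init_files(package_root):
    """Return sorted paths of __init__.py files that contain only whitespace."""
    root = Path(package_root)
    return sorted(
        path
        for path in root.rglob("__init__.py")
        # A file with nothing but whitespace carries no initialization code.
        if not path.read_text(encoding="utf-8").strip()
    )
```

Calling find_empty_init_files("h1st") from the root of a checkout would list the candidates. One caveat worth checking before deleting anything: a directory without __init__.py becomes a namespace package, and tools such as setuptools' find_packages() only pick up directories that contain __init__.py.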
Need to implement the logic to restore object properties/submodel
Originally posted by @nqbao in #19 (comment)
Consider moving h1st/h1st/tests up to h1st/tests, and h1st/h1st/h1st/* up to h1st/h1st, then removing h1st/h1st/h1st !
Is your feature request related to a problem? Please describe.
Recently I created PR #158, which was approved and merged to the main branch. Even though everything went well in the PR itself, the merge commit that landed on main triggered another GitHub Actions job to immediately build and publish to TestPyPI and PyPI. This job then failed because it could not upload a wheel file with the same name as a previously deployed version. I understand that I can create another PR to bump the package version so the package can be built and released properly. But I find this very strange:
... main should be built and pushed to PyPI immediately; that would cause many unnecessary version changes, even if we somehow tag these package versions using the exact date and time when the CI job was run.
... h1st version release on PyPI. But what about the changelogs for each version? Where and how can our users easily find the changes between each h1st version?
Describe the solution you'd like
Inspired by other popular Python packages, I think we should change our git workflow and adopt better build-and-release CD pipelines to solve this problem:
... h1st Python package on every merge to main. Instead, we should trigger this pipeline only on release creation: https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#release
... h1st version. The release creation will also trigger the build-and-release CD pipeline, so it will run only when it is actually needed.
... main merge policy: only allow squashed commits with a link to the original PR, so the git commit history on main will always be clean and readable. This also allows GitHub's automatic changelogs to work well.
Describe alternatives you've considered
None
Additional context
None
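The failure described in this request (uploading a wheel whose version already exists) could also be guarded against inside the pipeline itself. The sketch below is one possible check, not the project's actual CI code: released_versions reads the public PyPI JSON API, while should_publish and both function names are hypothetical.

```python
import json
from urllib.request import urlopen


def released_versions(package, index_url="https://pypi.org/pypi"):
    """Fetch the set of version strings already published for a package."""
    with urlopen(f"{index_url}/{package}/json") as response:
        # The "releases" mapping is keyed by version string.
        return set(json.load(response)["releases"])


def should_publish(version, released):
    """Publish only when this exact version has not been uploaded before."""
    return version not in released
```

A release job could call should_publish(current_version, released_versions("h1st")) and skip the upload step when it returns False, instead of failing on the duplicate file name.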
Dependencies such as SciKit-Learn and TensorFlow are currently difficult to install on Apple M1 and certain variants of Windows.
This will slow down the adoption of H1st.
Let's look into simplifying dependencies for broader platform compatibility.
... that elicits human input.
Is your feature request related to a problem? Please describe.
Power Tools for AI Engineers With Deadlines means quick prototyping, which may cause unexpected logical errors. There should be an embedded tool that visualizes data and models, or shows users an analysis of them.
Describe the solution you'd like
... trust inside h1st, which defines many concepts (not yet stated in the documentation) and uses shap and lime. There should be documentation explaining what it can help with.
... causal inference? This will shine in industry cases where data is limited or sparse.
Describe alternatives you've considered
Some stated above, TBD
Additional context
... h1st output, especially when incorporating human knowledge into the framework to produce models.
Is your feature request related to a problem? Please describe.
My colleagues, my customers, and I feel frustrated when trying to install h1st on our machines, as we encounter multiple kinds of errors. The cost in time and labor is high, and we have to wait a long time for fixes.
Describe the solution you'd like
A GitHub Actions workflow that runs on pull requests to check combinations of:
Describe alternatives you've considered
Additional context
Current Model base assumes predictive models only.
Is your feature request related to a problem? Please describe.
... h1st release in terms of features, bug-fix
Describe the solution you'd like
... nightly version on a weekly basis
Describe alternatives you've considered
No.
Additional context
Describe the bug
CI is failing when setting up the environment and installing Poetry. This is probably because the Poetry installation procedure has changed completely.
To Reproduce
Steps to reproduce the behavior:
From backlog list:
Is your feature request related to a problem? Please describe.
... h1st want to have examples; there should be h1st-example
Describe the solution you'd like
... h1st codebase
Describe alternatives you've considered
No
Additional context
Currently, if users want to use h1st.stackensemble, they need to write many lines of client code.
Given trained h1st.models, provide a stack ensemble that requires little client code from the user.
Is your feature request related to a problem? Please describe.
h1st is meant to be Power Tools for AI Engineers With Deadlines, as it helps incorporate human knowledge with data to build models quickly. I find it frustrating to debug the modeling classes. There should be some kind of visualization.
Describe the solution you'd like
h1st should show a picture of the real graph-based execution. An interesting example has been written in a notebook.
Describe alternatives you've considered
Not yet
Additional context
h1st
https://github.com/h1st-ai/h1st/wiki
so users can see what's coming down the pike, and contributors can also help fill in the blanks.
Describe the bug
When using 32-bit Python 3.7 on Windows 10, running "pip3 install h1st" produces the following error:
PS C:\users\VietNV\AppData\Local\Programs\Python\Python37-32\Scripts> ./pip3 install h1st
Requirement already satisfied: h1st in c:\users\vietnv\appdata\local\programs\python\python37-32\lib\site-packages\h1st-2020.8-py3.7.egg (2020.8)
Collecting NumPy<1.19,>=1.18.4
Using cached numpy-1.18.5-cp37-cp37m-win32.whl (10.8 MB)
Collecting Pandas<1.2,>=1.0.4
Using cached pandas-1.1.3-cp37-cp37m-win32.whl (7.8 MB)
Collecting PyArrow>=0.17.1
Using cached pyarrow-1.0.1.tar.gz (1.3 MB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing wheel metadata ... done
Requirement already satisfied: cloudpickle==1.3.0 in c:\users\vietnv\appdata\local\programs\python\python37-32\lib\site-packages (from h1st) (1.3.0)
ERROR: Could not find a version that satisfies the requirement Ray==0.8.6 (from h1st) (from versions: none)
ERROR: No matching distribution found for Ray==0.8.6 (from h1st)
PS C:\users\VietNV\AppData\Local\Programs\Python\Python37-32\Scripts>
To Reproduce
Steps to reproduce the behavior:
Expected behavior
If the h1st module does not work with 32-bit Python, please mention this in the user guide.
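Besides documenting the limitation, the installer or package could detect the interpreter's bitness and say so explicitly. This is a minimal sketch under the assumption that 64-bit is required, which is only inferred from the log above (no Ray==0.8.6 distribution was found for the win32 interpreter); both function names are hypothetical.

```python
import struct
import sys


def python_bitness():
    """Return 32 or 64 depending on the pointer size of the running interpreter."""
    return struct.calcsize("P") * 8


def check_supported_interpreter():
    # Assumption: only 64-bit interpreters are supported, as suggested by the
    # missing Ray wheel in the log above. Adjust if that turns out to be wrong.
    bits = python_bitness()
    if bits != 64:
        sys.exit(f"h1st needs a 64-bit Python interpreter (found {bits}-bit).")
```

Failing early with a message like this is easier to act on than "No matching distribution found".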
Desktop (please complete the following information):
If it’s your first time here and you’re looking to help, we can really use more publicly shareable examples of use cases involving Oracle, such as the ones here: https://github.com/h1st-ai/h1st/tree/main/user/example-apps/oracle.
The base classes are at https://github.com/h1st-ai/h1st/tree/main/h1st/model/oracle
I’ll add a link to API documentation once that’s available.
using the FFN embedding idea. Cc @TheVinhLuong102 @aht
The third cell of examples/Forecasting/notebooks/forecast.ipynb:
prepared_data = m.prep_data(m.load_data())
This fails because data is missing.
FileNotFoundError: [Errno 2] No such file or directory: './data/train.csv'
In examples/AutomotiveCybersecurity/notebooks/Automotive Cybersecurity - Cold Start Problem.ipynb:
DATA_LOCATION = "COMING-SOON"
df = pd.read_parquet('%s/train/attacks/20181113_Driver1_Trip1-0.parquet' % DATA_LOCATION)
This gets an OSError because there's no data.
We want to keep it simple for users.
Is your feature request related to a problem? Please describe.
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
Describe the solution you'd like
A clear and concise description of what you want to happen.
Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.
Additional context
Add any other context or screenshots about the feature request here.
Is your feature request related to a problem? Please describe.
I find it difficult to follow the instructions when trying to install h1st on different machines.
Describe the solution you'd like
Learn from PyTorch and apply the same approach to h1st; that would dramatically improve the user experience when installing h1st. First impressions are crucial.
Describe alternatives you've considered
No
Additional context
Is your feature request related to a problem? Please describe.
fsspec is unused in pyproject.toml. Even Local Storage inside h1st does not use it.
Describe the solution you'd like
... fsspec inside pyproject.toml
Describe alternatives you've considered
No
Additional context
Is your feature request related to a problem? Please describe.
Describe the solution you'd like
... pydantic with documentation
Describe alternatives you've considered
Additional context
No
Create a simple runnable "hello world" graph.
Demonstrate how to use 1. simple node & 2. decision node.
Explain data flow & control flow / execution.
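To make the request concrete, here is a framework-free illustration of the two node kinds and of the difference between data flow and control flow. It deliberately does not use the h1st graph API; every name in it is hypothetical and it only sketches the concept such a "hello world" would need to convey.

```python
def double(data):
    """Simple node: transform the incoming data and pass the result on."""
    return {"value": data["value"] * 2}


def is_large(data):
    """Decision node: inspect the data and choose which branch runs next."""
    return data["value"] > 10


def handle_large(data):
    return {**data, "label": "large"}


def handle_small(data):
    return {**data, "label": "small"}


def run_graph(data):
    # Data flow: the output of the simple node is the input of the next step.
    data = double(data)
    # Control flow: the decision node selects exactly one downstream branch.
    branch = handle_large if is_large(data) else handle_small
    return branch(data)


print(run_graph({"value": 3}))  # doubled to 6, routed to the "small" branch
print(run_graph({"value": 8}))  # doubled to 16, routed to the "large" branch
```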
Currently installing H1st with Python 3.10 encounters a Poetry-related error, even when the Python requirement range is extended to cover Python 3.10.
The error message says either ModuleNotFoundError: No module named 'distutils.util' or ModuleNotFoundError: No module named 'poetry'
Add ideas/suggestions here and we'll discuss/greenlight them. Thanks.
Describe the bug
On Windows 10 with Python 3.9, running "pip3 install h1st" shows the following error:
ERROR: Command errored out with exit status 1:
command: 'c:\python39\python.exe' -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\VietNV\AppData\Local\Temp\pip-install-sak903ey\h1st\setup.py'"'"'; file='"'"'C:\Users\VietNV\AppData\Local\Temp\pip-install-sak903ey\h1st\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' egg_info --egg-base 'C:\Users\VietNV\AppData\Local\Temp\pip-pip-egg-info-y9mm4mm8'
cwd: C:\Users\VietNV\AppData\Local\Temp\pip-install-sak903ey\h1st
Complete output (7 lines):
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\VietNV\AppData\Local\Temp\pip-install-sak903ey\h1st\setup.py", line 18, in <module>
long_description = f.read()
File "c:\python39\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 507: character maps to <undefined>
----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
It should install h1st successfully without any error.
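The traceback points at the long_description read in setup.py being decoded with the Windows default codec (cp1252). A likely fix, sketched here under the assumption that the file being read is a UTF-8 encoded README.md, is to pass the encoding explicitly; the helper name is hypothetical.

```python
from pathlib import Path


def read_long_description(path="README.md"):
    """Read the README as UTF-8 regardless of the platform's default encoding."""
    # Without encoding=..., Python falls back to the locale codec, which is
    # cp1252 on many Windows installs and cannot decode byte 0x9d.
    return Path(path).read_text(encoding="utf-8")
```

In setup.py this would replace the bare open(...).read() call, for example long_description=read_long_description().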
Desktop (please complete the following information):
Really minor thing, just letting you know.
On the GitHub wiki, there's a link in the sidebar: https://h1st.ai/Introduction.html - it's labeled as "Why H1st?". That link leads to a 404 now.
EDIT: The fixed link is: https://www.h1st.ai/tutorials/why-h1st-ai
Is your feature request related to a problem? Please describe.
... h1st uses s3fs defined in pyproject.toml and S3 storage.
... h1st needs support for: GCS, Azure Blob, Aliyun OSS, etc.?
... Power Tools for AI Engineers With Deadlines, we should decide to do this inside this project to boost data scientists' productivity.
Describe the solution you'd like
... h1st[gcp], h1st[azure], h1st[aws], h1st[all]
Describe alternatives you've considered
... h1st, but this one will need more examples?
Additional context
Is your feature request related to a problem? Please describe.
I find it frustrating to use h1st, as I need to go into the test folder to find out how to use features. This is wrong.
Describe the solution you'd like
... modeller should have at least one working example; it can be just Iris.
Describe alternatives you've considered
TBD
Additional context
... h1st users use it inside actual projects
Users need a high-level (a) motivation, (b) design principles, and (c) current support for this module, e.g., Trustworthy AI
Currently, h1st requires tensorflow = ">=2.10.0" as one of its dependencies, but we should remove tensorflow from the h1st dependency list.
tensorflow is not easy to install correctly on all architectures/platforms
TensorFlow is a machine learning framework that supports multiple operating systems (Windows, Mac, Linux) and most Python versions. For performance reasons, it also has to support many different hardware acceleration methods (CPU/GPU/TPU/NPU) and compute architectures (x86, ARM, IBM Power). To take advantage of all these platforms, it has to depend on hardware-specific frameworks and SDKs such as CUDA, cuDNN, and TensorRT (for NVIDIA GPUs), MKL-DNN and oneAPI (for Intel CPUs), ROCm (for AMD CPU/GPU/APU), and Metal and CoreML (for Apple CPU/GPU/Neural Engine), among others. These SDKs are often dynamically linked to the TensorFlow binary and require another level of dependency management of their own before TensorFlow can use them properly.
Building, maintaining, testing, and releasing all of these variations of the package is already a huge challenge for the TensorFlow team, and they have already had to defer the work to multiple third parties such as AWS, Intel, and NVIDIA. This makes it very difficult to install TensorFlow on multiple platforms; just take a quick look at the official installation instructions from TensorFlow. Each platform has its own installation command: some use conda, some use pip, and some even have their own separate Python package.
It is crucial for users to install the correct variation of TensorFlow so they can take full advantage of their hardware, and there is no way for poetry (our package manager) to accommodate all of those variations. We should therefore not force h1st users to go with a generic tensorflow>=2.10.0 requirement.
tensorflow is a heavy package that is huge in size
Looking at the PyPI pages that track all of the tensorflow wheel files, we can see that the majority of the wheel files, across all platforms and Python versions, are larger than 400 MB. If we count all the other dependencies that tensorflow requires, the total can easily go up to 500 MB. After unpacking for installation, it can take close to a whole GB of space in most cases.
tensorflow does not get used very often, even internally by Aitomatic
Most of our work uses scikit-learn and not tensorflow. For more complex models, pytorch is increasingly being adopted, both by the world and by our own engineers, instead of tensorflow. So why make tensorflow a mandatory dependency of every single project that uses h1st when it is just sitting there as dead weight, costing us time, disk space, and network transfer?
... tensorflow version?
Many legacy ML systems still use tensorflow<=1.14, so by requiring tensorflow = ">=2.10.0" in h1st, we will lose compatibility with all of those systems and their potential users. This will slow down the adoption of H1st.
tensorflow can be safely removed from the h1st codebase altogether
If we look for where tensorflow is used in the h1st codebase, we can quickly see that it is only used for saving and loading MLModels that originate from TensorFlow.
This saving and loading logic can be implemented by h1st users to add support for any type of model that they might need to save and load alongside each h1st MLModel. We already did this internally to support pytorch models as well. It is pretty easy to do ;)
We should remove tensorflow from the h1st dependency list, along with any saving and loading code for tensorflow.
In theory one can provide a conda repo with every variation of the package, similar to what pytorch did here.
Even then:
... conda can support
I think we should consider this alternative impractical and shouldn't bother trying it. Each user should decide for themselves how they want to install tensorflow and how they want to use it with h1st.
Same suggestion was made in #105
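If some TensorFlow-specific save/load code does stay around (in h1st or in user code), one common way to keep tensorflow out of the mandatory dependency list is to import it lazily. The sketch below illustrates that pattern only; optional_import and save_tensorflow_model are hypothetical helpers, not existing h1st functions.

```python
import importlib
import importlib.util


def optional_import(module_name, install_hint=None):
    """Import a module on demand, with a clear error if it is not installed."""
    if importlib.util.find_spec(module_name) is None:
        hint = install_hint or f"pip install {module_name}"
        raise ImportError(
            f"Optional dependency '{module_name}' is missing. Try: {hint}"
        )
    return importlib.import_module(module_name)


def save_tensorflow_model(model, path):
    # Hypothetical helper: tensorflow is needed only when a TF model is saved,
    # so users who never touch TF models never have to install it.
    optional_import("tensorflow")
    model.save(path)  # assumes a Keras-style model object exposing save()
```

With this shape, each user installs whichever TensorFlow build suits their hardware, and everyone else pays no install cost.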