Comments (14)
Hello @linyaoli! Thanks for bringing up this question 🙂 Hera submits tasks by packaging the body of a function as literal text in an Argo script template! So, when the create_thumbnails function is used to create a task, its content is taken and passed to an Argo script template. This means the function body is transmitted over the wire to Argo, which then writes it into its own Python script and runs that script using whatever command was passed. So, in the case of create_thumbnails, we have the body:
blob = read_image(...)
inside some remote script on Argo. Now, the problem is that the imports are not read from the file in which the script is defined "locally", so they are not replicated remotely. In this specific case, the implementation of create_thumbnails needs to be:
...
class ThumbnailTask(ArgoTask):
    def __init__(self, metadata: ThumbnailMetadata):
        # .... some more code

    def create_thumbnails(self, image_url: str, thumbnail_sizes: list[dict]):
        from app.common.image.image_utils import read_image
        blob = read_image(self.image_url)
This way the import is captured and recreated remotely in an Argo task. The imported dependency is provided, in this case, by my-repo.dkr.ecr.us-east-2.amazonaws.com/my-app:latest, which is the Docker image used!
Does this provide further clarity for your use case? Happy to write more, and perhaps point to some links on Argo's side as well!
from hera.
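To illustrate the point above about imports not travelling with the function body, here is a minimal sketch. It does not use Hera or Argo; it only mimics the idea by executing a piece of function-body text in a fresh namespace, the way a separate interpreter would. The names run_remotely and try_run are hypothetical helpers, and math stands in for the application import.

```python
import textwrap

# Only the function body is shipped as text. A module-level "import math"
# in the original file would NOT be part of this text.
BODY_WITHOUT_IMPORT = textwrap.dedent("""
    result = math.sqrt(16)
""")

# Same body, but the import lives inside it, so it is shipped too.
BODY_WITH_IMPORT = textwrap.dedent("""
    import math
    result = math.sqrt(16)
""")


def run_remotely(script: str) -> dict:
    """Execute script text in a fresh namespace (hypothetical stand-in for the remote run)."""
    namespace: dict = {}
    exec(script, namespace)
    return namespace


def try_run(script: str):
    """Return the computed result, or a short description of the NameError raised."""
    try:
        return run_remotely(script)["result"]
    except NameError as exc:
        return f"NameError: {exc}"


if __name__ == "__main__":
    print(try_run(BODY_WITHOUT_IMPORT))
    print(try_run(BODY_WITH_IMPORT))
```

The first body fails because nothing in the shipped text defines math; the second succeeds because the import is part of the body itself.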
Hi @flaviuvadan, thanks for the prompt response!
I tried what you suggested, and the script was unable to import the package app. I also tried a relative import, but that failed as well.
│ main Traceback (most recent call last):
│ main File "/argo/staging/script", line 5, in <module>
│ main from ..image.image_utils import create_thumbnail, decode_blob, read_image
│ main ImportError: attempted relative import with no known parent package
However, I double-checked the code path and it does exist.
A little more background:
The image is a FastAPI service; I only run a small part of the code inside Argo to create thumbnails. What I can confirm is that the Dockerfile is valid and the service has no problem resolving packages.
from hera.
Hi @flaviuvadan, so I finally figured out what's going on (partially). I'm using poetry as the package manager, and apparently that left Argo unable to find the packages.
After I manually installed the necessary dependencies via pip in the Dockerfile, everything worked.
from hera.
@linyaoli ah, I understand now. If the Dockerfile is using Poetry it means there's a virtual environment created, which contains the necessary dependencies. The default command used by Hera and Argo to launch the submitted script is python, which uses the system Python installation. In your scenario, there are two options:
- leave the poetry installation in, and provide the necessary command to a task, such as poetry run
- don't use command and instead install everything in system Python, which is the route you noticed as a possibility
from hera.
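One way to see which of the two situations above applies is to ask the interpreter itself. The sketch below is only a diagnostic built on the standard library; interpreter_report is a hypothetical helper name, and the idea is to put these lines in the function that gets submitted, then compare the output with and without the poetry command.

```python
import sys


def interpreter_report() -> dict:
    """Describe which Python interpreter and environment is executing this code."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "base_prefix": sys.base_prefix,
        # Inside a venv/virtualenv environment, prefix differs from base_prefix.
        "in_virtualenv": sys.prefix != sys.base_prefix,
    }


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print(f"{key}: {value}")
```

If in_virtualenv is False when the task runs, the script is being executed by the system Python and will not see packages that were installed only into a virtual environment.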
@flaviuvadan one more issue still baffles me: you mentioned that I could import my own package inside the script, but Argo is still unable to find it:
│ main Traceback (most recent call last):
│ main File "/argo/staging/script", line 11, in <module>
│ main from app.common.image.image_utils import read_image
│ main ModuleNotFoundError: No module named 'app'
So what I did was put all the code in one giant script. This works but creates loads of duplicated code.
from hera.
@linyaoli yes, that will definitely work, but has the caveat you mentioned 😞 If you run the image locally, start a Python REPL, and run from app.common.image.image_utils import read_image, does it succeed? If the problem is with the way the packages are installed, I expect it to be reproducible locally by running inside Docker.
Also, I just realized something... the example we are talking about is a class method, not a stand-alone function. Hera does not currently parse out class methods and there may be problems associated with indentation. In addition, I am not sure whether self values are accessible 🤔 self will not exist remotely at execution time.
from hera.
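A rough sketch of the stand-alone shape discussed above: everything the function needs arrives as an explicit argument instead of through self, and the imports sit inside the body. The function name create_thumbnail_names, the width/height keys, and the file-name logic are all hypothetical placeholders; standard-library imports stand in for the application import so the example can run anywhere.

```python
def create_thumbnail_names(image_url: str, thumbnail_sizes: list) -> list:
    """Build one output name per requested size (placeholder for real thumbnail work)."""
    # Imports are inside the body so they are part of the text that gets shipped.
    from pathlib import PurePosixPath
    from urllib.parse import urlparse

    # No self here: the URL is passed in explicitly.
    stem = PurePosixPath(urlparse(image_url).path).stem
    return [f"{stem}_{size['width']}x{size['height']}" for size in thumbnail_sizes]


if __name__ == "__main__":
    print(create_thumbnail_names(
        "https://example.com/images/cat.png",
        [{"width": 64, "height": 64}, {"width": 128, "height": 96}],
    ))
```

Because the function reads nothing from an instance, its body still makes sense when it is copied out of the file and run on its own.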
If I replicate the exact same process locally, the script doesn't work. However, since my project runs inside FastAPI (still in a container), I can trigger the script from one of the APIs, and then it runs like a charm.
I tried to narrow this down: the script I'm using has been changed to a standalone function, and calling read_image still fails.
from hera.
Let's reiterate the setup so we make sure we're on the same page:
- there's a Dockerfile that has a poetry command that installs the dependencies of the application (call it X)
FROM some-base:latest
RUN poetry install ...
When we launch this using docker run -it X:latest python -c "from app.common.image.image_utils import read_image" we should get no output, because the import should succeed if the dependencies are installed correctly.
- there's a script we want to call, and this script is not in a Python class
def foo(...):
    from app.common.image.image_utils import read_image
    ...
- there's a Hera task launched with this image
Task('foo', foo, func_params=[...], image="X:latest")
Is this a good summary, @linyaoli? 🤔
from hera.
@flaviuvadan, the docker command threw an error:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/app/app/common/image/image_utils.py", line 1, in <module>
    import cv2
ModuleNotFoundError: No module named 'cv2'
The read_image function looks like this:
import cv2

def read_image(path: str):
    # some code
The rest of your summary is correct. Thanks again, @flaviuvadan, really appreciate it.
from hera.
Ah, great! So, one of the next steps is installing that dependency and perhaps giving it another go! You should update this discussion if you have the chance to try that 🙂
from hera.
Do you mean to run with
docker run -it X:latest poetry run python -c "from app.common.image.image_utils import read_image"
If I run without poetry, I have tons of packages to install manually. Is this the right direction?
Running with poetry, however, produces no error.
from hera.
"Running with poetry however is without any error."
Ah, got it! So it does run in the virtual environment of Poetry. Did you pass ["poetry", "run", "python"] to the command field on the Hera Task as well?
Task('t', foo, image='X:latest', command=["poetry", "run", "python"])
I think that's the missing thing we did not discuss.
from hera.
I did pass ["poetry", "run", "python"] into the Hera Task. However, as I mentioned, the script is still unable to reference my own code; only the packages installed by poetry can be imported.
However, the command
docker run -it X:latest poetry run python -c "from app.common.image.image_utils import read_image"
did work. I wonder what the difference between the two is.
from hera.
I solved the issue in my project by adding the workdir to the PYTHONPATH env variable while building the Docker image.
from hera.
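One plausible explanation for the difference between the two runs (an assumption, not confirmed in the thread): python -c puts the current working directory on sys.path, while running a script file puts the script's own directory there instead, which for /argo/staging/script is /argo/staging and not the application directory. Adding the workdir to PYTHONPATH restores it in both cases. The sketch below shows the mechanism with a throwaway package; demo_app_pkg, is_importable and demo are hypothetical names.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located on the current sys.path."""
    importlib.invalidate_caches()
    return importlib.util.find_spec(module_name) is not None


def demo() -> tuple:
    """Check importability of a package before and after its parent dir is on sys.path."""
    with tempfile.TemporaryDirectory() as workdir:
        package_dir = Path(workdir) / "demo_app_pkg"
        package_dir.mkdir()
        (package_dir / "__init__.py").write_text("")

        before = is_importable("demo_app_pkg")
        # Entries from PYTHONPATH end up on sys.path at startup; this mimics that.
        sys.path.insert(0, workdir)
        try:
            after = is_importable("demo_app_pkg")
        finally:
            sys.path.remove(workdir)
        return before, after


if __name__ == "__main__":
    print(demo())
```

Printing sys.path from inside the submitted function is a quick way to check whether the application directory is actually present when the task runs.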