Comments (14)

flaviuvadan commented on June 5, 2024

Hello @linyaoli! Thanks for bringing up this question 🙂 Hera submits tasks by packaging the body of a function as literal text inside an Argo script template! So, when the create_thumbnails function is used to create a task, its content is taken and passed to an Argo script template. This means the function body is transmitted over the wire to Argo, which writes it into a Python script of its own and runs that script using whatever command was passed. So, in the case of create_thumbnails, we have the body:

blob = read_image(...)

inside some remote script on Argo. Now, the problem is that the imports are not read from the file in which the script is defined "locally", so they are not replicated remotely. In this specific case, the implementation of create_thumbnails needs to be:

...
class ThumbnailTask(ArgoTask):
    def __init__(self, metadata: ThumbnailMetadata):
        # .... some more code

    def create_thumbnails(self, image_url: str, thumbnail_sizes: list[dict]):
        from app.common.image.image_utils import read_image 
        blob = read_image(self.image_url)

This way the import is captured and recreated remotely in an Argo task. The imported dependency is provided, in this case, by my-repo.dkr.ecr.us-east-2.amazonaws.com/my-app:latest, which is the Docker image the task uses!
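
To see why a module-level import goes missing, here is a minimal sketch of shipping only a function body as text. It is not Hera's actual implementation; function_body and MODULE_SOURCE are hypothetical names, and the sketch assumes Python 3.8+ for end_lineno.

```python
import ast
import textwrap

# Hypothetical module: the import lives at module level, outside the function.
MODULE_SOURCE = '''
from app.common.image.image_utils import read_image

def create_thumbnails(image_url: str):
    blob = read_image(image_url)
    return blob
'''


def function_body(module_source: str, name: str) -> str:
    """Return only the body of the named top-level function, dedented."""
    tree = ast.parse(module_source)
    for node in tree.body:
        if isinstance(node, ast.FunctionDef) and node.name == name:
            lines = module_source.splitlines()
            start = node.body[0].lineno - 1  # first statement of the body
            end = node.end_lineno            # last line of the function
            return textwrap.dedent("\n".join(lines[start:end]))
    raise ValueError(f"no top-level function named {name!r}")


if __name__ == "__main__":
    # The printed text calls read_image but carries no import for it.
    print(function_body(MODULE_SOURCE, "create_thumbnails"))
```

The extracted text still calls read_image, but nothing in it imports the name, which is why the import has to move inside the function.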

Does this provide further clarity for your use case? Happy to write more, and perhaps point to some links on Argo's side as well!

from hera.

linyaoli commented on June 5, 2024

Hi @flaviuvadan, thanks for the prompt response!

I tried what you suggested, but the script was unable to import the package app. I also tried a relative import, but that failed as well.

main Traceback (most recent call last):
main   File "/argo/staging/script", line 5, in <module>
main     from ..image.image_utils import create_thumbnail, decode_blob, read_image
main ImportError: attempted relative import with no known parent package

However, I double-checked and the code path does exist.

A little more background:

The image is a FastAPI service, and I only use a small part of its code inside Argo to create thumbnails. What I can confirm is that the Dockerfile is valid and the service has no problem resolving packages.

linyaoli commented on June 5, 2024

Hi @flaviuvadan, so I finally figured out what's going on (partially). I'm using Poetry as the package manager, and apparently that left Argo unable to find the packages.

After I manually installed the necessary dependencies via pip in the Dockerfile, everything worked.

flaviuvadan commented on June 5, 2024

@linyaoli ah, I understand now. If the Dockerfile is using Poetry it means there's a virtual environment created, which contains the necessary dependencies. The default command used by Hera and Argo to launch the submitted script is python, which uses the system Python installation. In your scenario, there are two options:

  • leave the poetry installation in, and provide the necessary command to a task, such as poetry run python
  • don't use command and instead install everything in system Python, which is the route you noticed as a possibility
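
One way to check which interpreter a given command actually starts is to ask the interpreter itself. The snippet below is a minimal diagnostic sketch; describe_interpreter is a hypothetical helper name, and the idea is to run the file once with plain python and once with poetry run python inside the image and compare the output.

```python
import sys


def describe_interpreter() -> dict:
    """Collect the facts that distinguish system Python from a virtual environment."""
    return {
        "executable": sys.executable,    # path of the running interpreter
        "prefix": sys.prefix,            # environment the interpreter uses
        "base_prefix": sys.base_prefix,  # installation the environment was created from
        # Inside a virtual environment, prefix and base_prefix differ.
        "in_virtualenv": sys.prefix != sys.base_prefix,
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If in_virtualenv is False under plain python but True under poetry run python, the dependencies are most likely installed only in the Poetry environment.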

linyaoli commented on June 5, 2024

@flaviuvadan one issue still baffles me, however: you mentioned that I could import my own package inside the script, but Argo is still unable to find it:

main Traceback (most recent call last):
main   File "/argo/staging/script", line 11, in <module>
main     from app.common.image.image_utils import read_image
main ModuleNotFoundError: No module named 'app'

So what I did was put all the code in one giant script. This works but creates loads of duplicated code.

flaviuvadan commented on June 5, 2024

@linyaoli yes, that will definitely work, but has the caveat you mentioned 😞 if you run the image locally, start a Python REPL, and run from app.common.image.image_utils import read_image, does it succeed? If the problem lies in the way the packages are installed, I expect it to be reproducible locally by executing inside Docker.

Also, I just realized something... the example we are talking about is a class method, not a stand-alone function. Hera does not currently parse out class methods, and there may be problems associated with indentation. In addition, I am not sure whether self values are accessible 🤔 since self will not exist remotely at execution time.

linyaoli commented on June 5, 2024

If I replicate the exact same process locally, the script doesn't work. However, since my project runs inside FastAPI (still in a container), I can trigger the script from one of the APIs, and there it runs like a charm.

I tried to narrow this down: the script I'm using has been changed to a standalone function, and it still fails when I call read_image.

flaviuvadan commented on June 5, 2024

Let's reiterate the setup so we make sure we're on the same page:

  • there's a Dockerfile that has a poetry command that installs the dependencies of the application (call it X)
FROM some-base:latest

RUN poetry install ...

When we launch this using docker run -it X:latest python -c "from app.common.image.image_utils import read_image" we should get no output, because the import succeeds if the dependencies are installed correctly.

  • there's a script we want to call, and this script is not in a Python class
def foo(...): 
    from app.common.image.image_utils import read_image
    ...
  • there's a Hera task launched with this image
Task('foo', foo, func_params=[...], image="X:latest")

Is this a good summary, @linyaoli? 🤔

linyaoli commented on June 5, 2024

@flaviuvadan, the docker command threw an error:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/app/app/common/image/image_utils.py", line 1, in <module>
    import cv2
ModuleNotFoundError: No module named 'cv2'

The read_image looks like this:

import cv2

def read_image(path: str):
    # some code

The rest of your summary is correct. Thanks again, @flaviuvadan, really appreciate it.

flaviuvadan commented on June 5, 2024

Ah, great! So, one of the next steps is installing that dependency and perhaps giving it another go! You should update this discussion if you have the chance to try that 🙂
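
A quick way to find out which dependencies an interpreter cannot see, without triggering one ModuleNotFoundError at a time, is to probe for them up front. This is a minimal sketch; missing_modules is a hypothetical helper, and the module list in the usage line is only an example.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the module names that the current interpreter cannot locate."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ModuleNotFoundError, ValueError):
            # Raised when a parent package of a dotted name is absent,
            # or when a module is in a half-initialised state.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Example list: replace with the imports your script needs.
    print(missing_modules(["json", "cv2"]))
```

Running it under both python and poetry run python inside the image shows which environment actually holds each package.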

linyaoli commented on June 5, 2024

Do you mean to run with

docker run -it X:latest poetry run python -c "from app.common.image.image_utils import read_image"

If I run without poetry, I have tons of packages to install manually. Is this the right direction?

Running with poetry however is without any error.

flaviuvadan commented on June 5, 2024

> Running with poetry however is without any error.

Ah, got it! So it does run in the virtual environment of Poetry. Did you pass ["poetry", "run", "python"] to the command field on the Hera Task as well?

Task('t', foo, image='X:latest', command=["poetry", "run", "python"])

I think that's the missing thing we did not discuss.

linyaoli commented on June 5, 2024

I did pass ["poetry", "run", "python"] into the Hera Task. However, as I mentioned, the script is still unable to reference my own code; only the packages installed by Poetry are importable.

However the command

docker run -it X:latest poetry run python -c "from app.common.image.image_utils import read_image"

did work. I wonder what the difference between the two is.

shivkurtarkar commented on June 5, 2024

I solved this issue in my project by adding the workdir to the PYTHONPATH environment variable while building the Docker image.
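
A likely explanation for the difference between python -c and the Argo run, and for why PYTHONPATH helps, is how Python builds sys.path: with -c the first entry is the current working directory, while for a script file it is the directory containing that file. The tracebacks above show the script at /argo/staging/script, so the workdir holding app would not be searched. The sketch below reproduces this locally under that assumption; the package name thumbs_demo_pkg and the directory names are made up for illustration.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_import_experiment() -> dict:
    """Import the same package three ways and record which attempts succeed."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp) / "workdir"  # stands in for the image's working directory
        staging = Path(tmp) / "staging"  # stands in for /argo/staging
        (workdir / "thumbs_demo_pkg").mkdir(parents=True)
        staging.mkdir()
        (workdir / "thumbs_demo_pkg" / "__init__.py").write_text("VALUE = 1\n")
        script = staging / "script"
        script.write_text("import thumbs_demo_pkg\n")

        # Start from a clean environment so outside settings do not skew the result.
        base_env = {k: v for k, v in os.environ.items()
                    if k not in ("PYTHONPATH", "PYTHONSAFEPATH")}

        def succeeds(cmd, extra_env=None):
            env = {**base_env, **(extra_env or {})}
            proc = subprocess.run(cmd, cwd=workdir, env=env,
                                  capture_output=True, text=True)
            return proc.returncode == 0

        # 1. python -c: the working directory is on sys.path.
        results["dash_c"] = succeeds([sys.executable, "-c", "import thumbs_demo_pkg"])
        # 2. Script file elsewhere: only the script's own directory is added.
        results["script_file"] = succeeds([sys.executable, str(script)])
        # 3. Same script with the workdir on PYTHONPATH.
        results["script_file_with_pythonpath"] = succeeds(
            [sys.executable, str(script)], {"PYTHONPATH": str(workdir)})
    return results


if __name__ == "__main__":
    print(run_import_experiment())
```

In this sketch the -c import and the PYTHONPATH variant succeed while the plain script-file run fails, which matches the behaviour reported in the thread.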
