Comments (9)

lresende commented on August 17, 2024

There are individual node properties and pipeline "global" properties... you can find them on different tabs of the properties panel.

arundeep78 commented on August 17, 2024

@lresende thanks for the response. I believe I was not clear in my message when I mentioned pipeline parameters.

Yes, these parameters are available, but only when the notebooks are scheduled as a pipeline. That is the final stage, after someone has developed all those notebooks. How would those notebooks get those variables when they are not running as a pipeline?

In the example above, let's say I am developing the Part 1 notebook interactively. In my case a new Kubernetes pod is started for that notebook, and it does not have access to any of the other files one can see in the file explorer.

In a normal Python/Jupyter environment, I can create a simple .py file with some configs and access it from any other notebook. How can I do that in this case?
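
For illustration, the kind of file I mean (file and variable names are just examples):

# common_config.py, a plain Python module sitting next to the notebooks
TABLE_NAME = "t_db_table1"
DB_SCHEMA = "analytics"

# any notebook in the same directory can then do:
from common_config import TABLE_NAME, DB_SCHEMA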

kevin-bates commented on August 17, 2024

hi @arundeep78. You might try looking at using volume mounts, either conditional or unconditional, by adjusting the kernel-pod template. This, in combination with KERNEL_WORKING_DIR (which points into the mount location), would allow you to simulate local pipeline invocations that don't leverage EG.
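
For illustration, an unconditional mount in a kernel-pod template might look roughly like the following fragment; the volume name, claim name, and mount path are hypothetical, not EG defaults:

# hypothetical fragment of a kernel-pod template (names are examples only)
spec:
  containers:
    - name: kernel
      volumeMounts:
        - name: shared-notebooks
          mountPath: /home/jovyan/work   # point KERNEL_WORKING_DIR at this path
  volumes:
    - name: shared-notebooks
      persistentVolumeClaim:
        claimName: notebooks-pvc         # PVC backed by the same storage the JupyterLab server sees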

lresende commented on August 17, 2024

Yes, these parameters are available, but only when the notebooks are scheduled as a pipeline. That is the final stage, after someone has developed all those notebooks. How would those notebooks get those variables when they are not running as a pipeline?

We would usually specify environment variables and, in the notebook, look up the env vars with default values for local runs:

import os
os.getenv("MY_VAR", "default_local_value")

This way, you get the correct value for MY_VAR when running as a pipeline and otherwise fall back to the local value.
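
Expanded slightly, a parameters cell following this pattern might look like this (variable names are examples only):

import os

# set via pipeline node properties (env vars) when scheduled;
# the defaults apply during interactive, local development
table_name = os.getenv("TABLE_NAME", "t_dev_table")
db_schema = os.getenv("DB_SCHEMA", "dev")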

arundeep78 commented on August 17, 2024

@lresende Sorry, I probably do not understand the whole architecture completely.

I start an IBM Elyra environment and open a JupyterLab interface. This has some directory structure and notebooks.
I open a given notebook and select a kernel (in this case a Python 3 kernel). When this kernel starts through Enterprise Gateway (EG), it pulls up the kernel definition from the configuration, starts that kernel as a Kubernetes pod, and connects to it. I am referring to this diagram from Jupyter Enterprise Gateway:

[diagram: Jupyter Enterprise Gateway architecture, with kernels launched as Kubernetes pods]

If I want to use environment variables, then I would have to customize this image to get those variables in the interactive development phase, which would not make sense.

In the JupyterLab environment I may have different pipelines configured in different folders, e.g. pipeline-finance and pipeline-news. In both cases I have multiple notebooks that refer to a variable named table_name. If I set that in the image, then it will fail.

Either I am missing something, or people just do not use it this way and instead configure each notebook to run independently in its own environment.

arundeep78 commented on August 17, 2024

hi @arundeep78. You might try looking at using volume mounts, either conditional or unconditional, by adjusting the kernel-pod template. This, in combination with KERNEL_WORKING_DIR (which points into the mount location), would allow you to simulate local pipeline invocations that don't leverage EG.

@kevin-bates thanks. I will read that documentation and come back. But just to clarify, we don't have 'local pipelines'.

We develop notebooks interactively using EG, which starts the kernel from its standard image, in our case a kernel named "python_kubernetes". Once a notebook is working, we schedule it through the Elyra interface on Airflow. Since the notebooks have all variables defined inside, without dependencies on environment variables and the like, they run fine. The only challenge we have is that when functionality is split across multiple notebooks, we end up duplicating those variables in all of them.

kevin-bates commented on August 17, 2024

@arundeep78 - Thanks for the additional information. So your notebook "nodes" must require resources that are not available locally yet are available in Airflow. And combining them onto one server (as would be the case if, say, you used JupyterHub to host each elyra-server) and using that server's kernels locally would still leave insufficient resources - is that correct? If so, then, yeah, looking into the volume-mounts approach is probably your best bet. It seems like you might be able to use unconditional volume mounts and enter the necessary instructions into the kernel-pod templates directly (rather than relying on KERNEL_ values).

arundeep78 commented on August 17, 2024

@kevin-bates what do you mean by "insufficient resources" for nodes? Are you referring to CPU, memory, etc.?

Anyway, we do not lack compute power.

In the Elyra development environment we have a GitHub repo that syncs all notebooks to Elyra. In it we have multiple folders, each containing all the notebooks relevant to a single pipeline; example below.
[screenshots: repository folder structure, one folder per pipeline containing that pipeline's notebooks]

During development of, let's say, notebook1, Elyra will start a kernel that has no access to common_parameters.py, which may contain common variables needed by notebook2 and notebook3 as well.

I assume that if I schedule it as a pipeline and add common_parameters.py as a dependency for all notebooks in the pipeline, then it will be available on the node where each notebook runs, and I should be able to import it via from common_parameters import * or similar.

But during interactive development of these notebooks, the kernel that has been started as a Kubernetes pod does not have access to this common_parameters.py file. This means I cannot just develop a notebook in interactive mode and then use the same notebook in a pipeline (unless there is a way).

These parameters I cannot define in the kernel images, as their values are different for each pipeline, e.g. for pipeline1 table_name is t_db_table1, and so on.

Also, I am not sure these are really environment variables. They are more like "application variables", where the application is defined by the pipeline; an environment variable is something like a database connection, which can differ between development and production while the table-name variables stay the same.

I hope this makes the situation clearer than before, as to what I am trying to achieve, so we can find a way to get there.
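
For concreteness, such a shared file could look like this if it also adopted the env-var fallback suggested above (all names hypothetical):

import os

# common_parameters.py: per-pipeline "application variables"
# env vars win when the pipeline sets them; otherwise the
# pipeline-specific defaults apply during interactive development
table_name = os.getenv("TABLE_NAME", "t_db_table1")
db_connection = os.getenv("DB_CONN", "postgresql://localhost/dev_db")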

kevin-bates commented on August 17, 2024

My comment was essentially asking "why do you need to use EG to develop your notebooks?" If you have enough resources locally, you don't need EG, and the (local) files are available to all notebooks (nodes).
