Pentaho Data Integration (PDI) is installed in the /opt/pentaho
folder, and its version is the one specified in the Dockerfile.
The container is created with the following folder structure:
├── app
│   ├── jobs
│   ├── pentaho-extra-libs
│   └── results
│       ├── data
│       └── logs
│           └── jvm
├── opt
│   └── pentaho
│       ├── ...
│       ├── libs
│       ├── ...
...
Folder | Purpose |
---|---|
/app/jobs | Stores kjb and dependent ktr files to be executed. |
/app/pentaho-extra-libs | Stores libs that are going to be copied to /opt/pentaho/lib when running the container. |
/app/results/data | Stores the exported results of execution, for example, CSV files that are generated by your Job/Transformation. |
/app/results/logs | Stores the logs generated by pdi kitchen execution. |
/app/results/logs/jvm | Stores the JVM dump logs in case of an OutOfMemory error. |
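The folders above are meant to be backed by host directories. The following is a minimal sketch for preparing them before the first run. It assumes a host layout of inputs/jobs, inputs/pentaho-extra-libs and outputs under a base directory; the function name prepare_host_dirs and the pre-created outputs subfolders are illustrative assumptions, not something the image requires.

```shell
# Hypothetical helper: create the host-side folders that will be mounted
# into the container. The layout is an assumption for illustration.
prepare_host_dirs() {
  base="${1:-.}"   # base directory, defaults to the current folder
  mkdir -p \
    "$base/inputs/jobs" \
    "$base/inputs/pentaho-extra-libs" \
    "$base/outputs/data" \
    "$base/outputs/logs/jvm"
}
```

Calling prepare_host_dirs with no argument creates the folders under the current directory.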
- Navigate to the ./examples folder:
cd ./examples
- Create a shell script to be sourced before running the Docker image, to avoid exposing sensitive values:
cat << EOF > my-variables.sh
export APP_DB_DATABASE="<MY-APP-DB-DATABASE>"
export APP_DB_SERVER_ADDRESS="<MY-APP-DB-SERVER-ADDRESS>"
export APP_DB_SERVER_PORT="<MY-APP-DB-SERVER-PORT>"
export APP_DB_USER="<MY-APP-DB-USER>"
export APP_DB_USER_PWD="Encrypted <MY-APP-DB-USER-PWD>"
EOF
- Source the my-variables.sh script:
source ./my-variables.sh
- Run the image with the required mounted volumes and with the parameters you defined in your job and transformation files:
docker run \
-it \
-v $(pwd)/inputs/jobs:/app/jobs \
-v $(pwd)/inputs/pentaho-extra-libs:/app/pentaho-extra-libs \
-v $(pwd)/outputs:/app/results \
ricardosouzamorais/pentaho-pdi-kitchen \
"-param:APP_DB_DATABASE=$APP_DB_DATABASE" \
"-param:APP_DB_SERVER_ADDRESS=$APP_DB_SERVER_ADDRESS" \
"-param:APP_DB_SERVER_PORT=$APP_DB_SERVER_PORT" \
"-param:APP_DB_USER=$APP_DB_USER" \
"-param:APP_DB_USER_PWD=$APP_DB_USER_PWD"
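The steps above can be wrapped in two small helpers, sketched here under stated assumptions: require_vars checks that every variable from my-variables.sh is set before the container starts, and with_params appends one -param:NAME=value argument per variable to a command. Both function names are hypothetical and neither is part of the image.

```shell
# Hypothetical helpers; neither function is part of the image.

# Report every listed variable that is unset or empty on stderr and
# return non-zero if at least one is missing.
require_vars() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Run a command with one "-param:NAME=value" argument appended for each
# name in the space-separated list given as the first argument.
with_params() {
  names="$1"
  shift
  for name in $names; do
    eval "value=\${$name:-}"
    set -- "$@" "-param:$name=$value"
  done
  "$@"
}

# Usage sketch (not executed here):
#   vars="APP_DB_DATABASE APP_DB_SERVER_ADDRESS APP_DB_SERVER_PORT APP_DB_USER APP_DB_USER_PWD"
#   require_vars $vars && with_params "$vars" docker run -it ... ricardosouzamorais/pentaho-pdi-kitchen
```

Each parameter is appended as a single quoted argument, so a value containing a space, such as the "Encrypted ..." password, is passed through intact.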
JVM MIN AND MAX MEMORY LEVEL: The default values for Xms and Xmx are 128m and 512m, respectively. They can be changed by setting the APP_JVM_MIN_MEMORY and APP_JVM_MAX_MEMORY environment variables, as in the following example:
docker run \
-it \
-e APP_JVM_MIN_MEMORY="512m" \
-e APP_JVM_MAX_MEMORY="1024m" \
-v $(pwd)/inputs/jobs:/app/jobs \
...
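How the image turns these variables into JVM flags is not shown here. As a rough sketch of the documented behaviour only, the function below (hypothetical name jvm_heap_flags) builds the -Xms and -Xmx flags from the two variables and falls back to the defaults of 128m and 512m when they are unset.

```shell
# Hypothetical sketch, not the image's actual entrypoint: build the JVM
# heap flags from the environment, using the documented defaults.
jvm_heap_flags() {
  printf '%s %s\n' \
    "-Xms${APP_JVM_MIN_MEMORY:-128m}" \
    "-Xmx${APP_JVM_MAX_MEMORY:-512m}"
}
```

With both variables unset, the function prints the default pair; with the values from the example above, it prints the flags for 512m and 1024m.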
LOG LEVEL: The log level is Basic by default and can be overridden using the APP_LOG_LEVEL environment variable, as in the following example:
docker run \
-it \
-e APP_LOG_LEVEL="Detailed" \
-v $(pwd)/inputs/jobs:/app/jobs \
...
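A mistyped level is easy to miss until the job runs. The sketch below validates a value before it is passed to the container, assuming APP_LOG_LEVEL is handed to Kitchen unchanged and accepts Kitchen's logging level names (Nothing, Error, Minimal, Basic, Detailed, Debug, Rowlevel). The function name is_valid_log_level is hypothetical.

```shell
# Hypothetical helper: succeed only for a Kitchen logging level name.
# Assumes the image passes APP_LOG_LEVEL through to Kitchen as-is.
is_valid_log_level() {
  case "$1" in
    Nothing|Error|Minimal|Basic|Detailed|Debug|Rowlevel) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage sketch (not executed here):
#   is_valid_log_level "$APP_LOG_LEVEL" || echo "unknown log level" >&2
```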
Git Bash on Windows: If you are running on Git Bash for Windows, add a / before $(pwd):
docker run \
...
-v /$(pwd)/inputs/jobs:/app/jobs \
-v /$(pwd)/inputs/pentaho-extra-libs:/app/pentaho-extra-libs \
-v /$(pwd)/outputs:/app/results \
ricardosouzamorais/pentaho-pdi-kitchen \
...
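The extra slash works around the automatic POSIX-to-Windows path conversion that Git Bash applies to arguments. To keep one command for every platform, the prefix can be chosen at run time. This is a minimal sketch, assuming that uname -s reports a name starting with MINGW or MSYS under Git Bash; the function name volume_prefix is hypothetical.

```shell
# Hypothetical helper: print "/" when the system name looks like Git Bash
# (MINGW or MSYS), and nothing otherwise. The name to test can be passed
# as the first argument; by default it is taken from `uname -s`.
volume_prefix() {
  case "${1:-$(uname -s)}" in
    MINGW*|MSYS*) printf '/' ;;
    *) printf '' ;;
  esac
}

# Usage sketch (not executed here):
#   docker run -it -v "$(volume_prefix)$(pwd)/inputs/jobs:/app/jobs" ...
```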