Comments (3)
I would like to give this issue a try.
from dolphinscheduler.
My current workaround is to pass --master ... --deploy-mode cluster in the extra options. Since spark-submit uses the last value given for a repeated option, this sends the task to the local cluster. For example, look at this log, in which my own --master option overrides the --master local that DolphinScheduler adds:
[INFO] 2024-02-02 14:27:38.934 +0000 - Final Shell file is :
#!/bin/bash
BASEDIR=$(cd `dirname $0`; pwd)
cd $BASEDIR
export SPARK_HOME=/opt/spark-3.5.0-bin-hadoop3
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
${SPARK_HOME}/bin/spark-submit --master local
--class com.example.monitor.ScanMonitor --conf spark.driver.cores=1 --conf spark.driver.memory=512M
--conf spark.executor.instances=2 --conf spark.executor.cores=2
--conf spark.executor.memory=2G
--master spark://devel:7077 --deploy-mode cluster
file:/opt/apache-dolphinscheduler-3.2.0-bin/standalone-server/files/default/resources/monitor-0.1-jdk11.jar producer
...
24/02/02 14:27:54 INFO ClientEndpoint: Driver successfully submitted as driver-20240202142754-0003
2024-02-02 14:28:00.038 +0000 - ->
24/02/02 14:27:59 INFO ClientEndpoint: State of driver-20240202142754-0003 is RUNNING
24/02/02 14:27:59 INFO ClientEndpoint: Driver running on 172.16.254.204:35595 (worker-20240202141308-172.16.254.204-35595)
24/02/02 14:27:59 INFO ClientEndpoint: spark-submit not configured to wait for completion, exiting spark-submit JVM.
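To make the "last value wins" behaviour described above concrete, here is a minimal bash sketch. It does not call spark-submit; `effective_options` is a hypothetical helper that only mimics a parser in which a later occurrence of an option overwrites an earlier one, which is the assumption the workaround relies on.

```shell
#!/bin/bash
# Hypothetical helper (not part of Spark or DolphinScheduler): reports which
# --master and --deploy-mode values take effect when a later occurrence of an
# option overwrites an earlier one.
effective_options() {
  local master="" deploy_mode=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --master)
        master="$2"          # a later --master replaces an earlier one
        shift 2 || break
        ;;
      --deploy-mode)
        deploy_mode="$2"
        shift 2 || break
        ;;
      *)
        shift                # ignore every other argument
        ;;
    esac
  done
  echo "master=${master} deploy-mode=${deploy_mode}"
}

# Same option order as in the generated shell file above.
effective_options --master local --class com.example.monitor.ScanMonitor \
  --conf spark.driver.cores=1 --master spark://devel:7077 --deploy-mode cluster
```

With the argument order from the log, the sketch reports the standalone master and cluster deploy mode, not local.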
from dolphinscheduler.
Thanks @git-blame for the quick workaround. It does work via the extra options, but as mentioned, master is an important Spark parameter.
I will check with the community to see whether this is by design, based on previous discussions.
If not, I will add the parameter to the Spark task.
from dolphinscheduler.