apache / incubator-livy

Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.

Home Page: https://livy.apache.org/

License: Apache License 2.0

Java 26.45% Shell 1.66% Scala 63.16% Python 6.42% R 0.08% JavaScript 0.98% HTML 0.57% CSS 0.08% FreeMarker 0.16% Dockerfile 0.44%
Topics: livy, bigdata, spark, apachelivy

incubator-livy's Issues

livy 0.7.1 request failed

spark version: 3.2.1
livy version: release-0.7.1
Request failed:
{"msg":"requirement failed: Cannot find Livy REPL jars."}
Can you help me? Thanks.
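This error fires when the server cannot locate a REPL jar directory. In the binary distribution these jars live under directories like $LIVY_HOME/repl_2.11-jars, and the `livy.repl.jars` setting can point elsewhere. A minimal sketch of that kind of lookup (the directory naming pattern is an assumption based on the binary distribution layout):

```python
from pathlib import Path
import tempfile

def find_repl_jars(livy_home: Path):
    """Look for REPL jar directories under LIVY_HOME, i.e. any
    repl_<scala-binary-version>-jars directory (assumed layout)."""
    dirs = sorted(livy_home.glob("repl_*-jars"))
    if not dirs:
        raise RuntimeError("requirement failed: Cannot find Livy REPL jars.")
    return dirs

# Demonstrate against a throwaway layout.
with tempfile.TemporaryDirectory() as d:
    home = Path(d)
    (home / "repl_2.11-jars").mkdir()
    found = find_repl_jars(home)
```

If no such directory exists next to the server (for example when running from an unpacked source tree rather than the assembled distribution), the session request is rejected with exactly this message.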

Compatibility issues with Spark 3.2.0 and later

Livy 0.8.0 can now be compiled successfully with Scala 2.12.15, but with Spark 3.2 and 3.3, PySpark 3.3 can start Spark on YARN yet cannot run submitted jobs. This is the code:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("test").enableHiveSupport().getOrCreate()
spark.sql("show databases").show()

This is the error:
23/05/18 14:11:34 INFO BlockManagerMasterEndpoint: Registering block manager localhost:41641 with 366.3 MiB RAM, BlockManagerId(2, localhost, 41641, None)
23/05/18 14:11:34 INFO SparkEntries: Spark context finished initialization in 16437ms
23/05/18 14:11:34 INFO SparkEntries: Created Spark session.
23/05/18 14:11:41 ERROR PythonInterpreter: Process has died with 1
23/05/18 14:11:41 ERROR PythonInterpreter: Traceback (most recent call last):
File "/tmp/6067082446938324509", line 722, in
sys.exit(main())
File "/tmp/6067082446938324509", line 570, in main
exec('from pyspark.sql import HiveContext', global_dict)
File "", line 1, in
File "/home/cocdkl/soft/spark-3.3.0-bin-hadoop3/python/lib/pyspark.zip/pyspark/__init__.py", line 71
def since(version: Union[str, float]) -> Callable[[F], F]:
^
SyntaxError: invalid syntax

The code was submitted through Hue; I hope someone can help.
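The `SyntaxError` on the annotated `def since(version: Union[str, float]) -> ...` line is characteristic of PySpark 3.x being imported by a Python 2 interpreter, which does not support function annotations. A minimal sketch of that diagnosis, assuming the fix is to point the session at a Python 3 binary (`PYSPARK_PYTHON` and `spark.pyspark.python` are standard Spark settings; the exact place to set them in this deployment is an assumption):

```python
import sys

def check_pyspark_python():
    """PySpark 3.x uses type annotations at import time, so the
    interpreter that executes it must be Python 3; Python 2 fails
    with exactly the 'SyntaxError: invalid syntax' shown above."""
    if sys.version_info < (3,):
        raise RuntimeError(
            "PySpark 3.x requires Python 3; set PYSPARK_PYTHON "
            "(or spark.pyspark.python) to a python3 binary."
        )
    return sys.version_info[:2]

major_minor = check_pyspark_python()
```

If the YARN nodes default `python` to Python 2, the Livy-launched REPL dies at this import even though the Spark session itself initializes fine, which matches the log above.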

[livy 0.8.0, Scala 2.12, Spark 3.2.1] Kerberos authentication problem

livy.conf

livy.server.launch.kerberos.keytab=xxx.keytab
[email protected]

I am sure the keytab file and principal are correct, but creating an interactive session fails with:

23/11/10 11:43:22 INFO rpc.RpcServer: Connected to the port 10000
23/11/10 11:43:22 WARN common.ClientConf: Your hostname, into5, resolves to a loopback address, but we couldn't find any external IP address!
23/11/10 11:43:22 WARN common.ClientConf: Set livy.rsc.rpc.server.address if you need to bind to another address.
23/11/10 11:43:22 INFO sessions.InteractiveSessionManager: Registering new session 4
23/11/10 11:43:22 INFO sessions.InteractiveSessionManager: Registered new session 4
23/11/10 11:43:22 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mec
hanism level: Failed to find any Kerberos tgt)]
23/11/10 11:43:25 INFO utils.LineBufferedStream: SLF4J: Class path contains multiple SLF4J bindings.
23/11/10 11:43:25 INFO utils.LineBufferedStream: SLF4J: Found binding in [jar:file:/app/spark-3.2.1-bin-hadoop2.7/jars/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
23/11/10 11:43:25 INFO utils.LineBufferedStream: SLF4J: Found binding in [jar:file:/app/spark-3.2.1-bin-hadoop2.7/jars/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
23/11/10 11:43:25 INFO utils.LineBufferedStream: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
23/11/10 11:43:25 INFO utils.LineBufferedStream: SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
23/11/10 11:43:27 INFO utils.LineBufferedStream: 23/11/10 11:43:27 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
23/11/10 11:43:29 INFO utils.LineBufferedStream: 23/11/10 11:43:29 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
23/11/10 11:43:30 INFO utils.LineBufferedStream: 23/11/10 11:43:30 WARN Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSS
Exception: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
23/11/10 11:43:30 INFO utils.LineBufferedStream: Exception in thread "main" java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "into5/192.168.1.65"; destination host is: "into1":8020; 
23/11/10 11:43:30 INFO utils.LineBufferedStream:        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:776)
23/11/10 11:43:30 INFO utils.LineBufferedStream:        at org.apache.hadoop.ipc.Client.call(Client.java:1480)
23/11/10 11:43:30 INFO utils.LineBufferedStream:        at org.apache.hadoop.ipc.Client.call(Client.java:1413)
23/11/10 11:43:30 INFO utils.LineBufferedStream:        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
23/11/10 11:43:30 INFO utils.LineBufferedStream:        at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)

If I add the following to spark-defaults.conf

spark.kerberos.keytab=xxx.keytab
[email protected]

then the problem above is resolved, but the Livy server log keeps printing

23/11/10 11:43:52 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
23/11/10 11:44:22 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
23/11/10 11:44:52 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
23/11/10 11:45:22 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]

Please give me some help, thank you!

Error[Failed to launch livy session, session status is dead] on connecting to spark through livy using R

I am using sparklyr version 1.7.8
and the latest Livy version from the incubator-livy master branch.

On connecting to spark through livy using R

library(sparklyr)
sc <- spark_connect(master = "local", method = "livy", version ="3.1.1")

Command is throwing an error:

Error in livy_connection(master, config, app_name, version, hadoop_version, :
Failed to launch livy session, session status is dead

Could anyone please help me understand what the issue could be?

The SparkStreaming operator fails to execute

Kafka as the SparkStreaming input and output:

  1. Use spark.readStream.format("kafka") to read Kafka data and decode the binary payload to a string.
  2. Use df.map(_.toSeq.foldLeft("")(_ + separator + _)).writeStream.format("kafka") to write the data back to Kafka.
  3. If one output to Kafka fails, then no matter how I change the Kafka topic afterwards, the stream computation keeps failing with an ArrayIndexOutOfBoundsException: 1. If I only output to the console there is no error.
  4. If I run the same code snippet directly in spark-shell without Livy, the behavior is the same as in 3.

Build fails when using -Pscala-2.12

The build fails in the assembly module due to scala-2.11 dependencies.
I think this is because ${scala.binary.version} is used in the <modules> section and is resolved before the profiles are loaded.
When I make scala-2.12 the default in the <properties>, it works fine.

Escape backtick from spark-submit arguments

Currently, Livy does not escape backticks in user-provided spark-submit arguments. If a user passes an argument containing backticks, the shell treats them as command substitution during spark-submit, causing that argument to become blank or invalid.

Example:

--query 'select * from test_db.`test_table`' 

will become

--query 'select * from test_db.' 
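One way to guard such arguments is to shell-quote each user-supplied token before it is interpolated into the spark-submit command line; single-quoting makes backticks literal, so no command substitution occurs. A minimal sketch using Python's shlex.quote (this illustrates the quoting idea, not Livy's actual launcher code):

```python
import shlex

# Hypothetical raw argument value as the user would supply it.
raw_arg = "select * from test_db.`test_table`"

# shlex.quote single-quotes the token, so the shell passes the
# backticks through literally instead of performing command
# substitution when the spark-submit command line is executed.
quoted = shlex.quote(raw_arg)

# Round-trip: the shell would deliver the original string intact.
assert shlex.split(quoted) == [raw_arg]
```

Building the command as an argument vector (no shell at all) avoids the problem entirely; quoting is only needed when the command is handed to a shell as a single string.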

Livy parameter-passing question

With two code snippets in one session, how can the result set of a SQL query executed in the first snippet be passed as a parameter to the next snippet? How are parameters passed? Also, is Spark 3.5.1 supported?
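Within a single Livy interactive session, all statements run in the same interpreter, so a variable bound by one snippet is visible to every later snippet; no explicit parameter passing is needed. A minimal sketch simulating that shared-namespace behavior (the snippet strings and variable names are illustrative, not Livy internals):

```python
# Each statement in a Livy session executes in the same REPL, so
# state persists across statements. Simulate two statements that
# share one namespace the way a session's interpreter does.
session_namespace = {}

snippet_1 = "result = [('db1',), ('db2',)]  # e.g. rows collected from spark.sql(...)"
snippet_2 = "names = [row[0] for row in result]  # 'result' carries over from snippet 1"

exec(snippet_1, session_namespace)
exec(snippet_2, session_namespace)
```

So over the REST API, POSTing the second statement to the same /sessions/{id}/statements endpoint is enough to see variables defined by the first.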

livy: What are the plans for the later stage?

What are the plans for the later stage: will more engines be supported, and will there be better multi-tenancy, isolation, and context support?
What are the advantages and differences compared to Apache Linkis?

Can Livy work without relying on Spark?

Livy could interface with Java, Python, Shell, SQL, etc., not necessarily Spark, so it could integrate many languages and backends through JDBC, Hive, MySQL, Python, and the like.

New livy release request

There have been several updates to master recently, including adding support for later Python versions by fixing a bug that would only allow one line per cell.
Can we possibly have a new release of livy? Amongst other things, this would allow AWS EMR to pull in the latest release with all the fixes.

{"msg":"Rejected, Reason: Blacklisted configuration values in session config: spark.submit.deployMode"}

spark version: 3.5.1
livy version: 0.8.0

Requesting through the API returns the following:

org.springframework.web.client.HttpClientErrorException$BadRequest: 400 Bad Request: "{"msg":"Rejected, Reason: Blacklisted configuration values in session config: spark.submit.deployMode"}"

But I don't see any error reported in the logs. Is it incompatible with Spark 3.5.1?

BTW, I also set up conf/spark-blacklist.conf and it doesn't seem to be taking effect.
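Livy rejects session requests whose conf map contains keys listed in spark-blacklist.conf, and spark.submit.deployMode is included in the shipped template; the deploy mode is meant to be fixed server-side (the livy.spark.deploy-mode setting in livy.conf) rather than chosen per request. A minimal sketch of a session-creation payload that avoids the blacklisted key (endpoint semantics follow Livy's REST API; the concrete conf values are placeholders):

```python
import json

# POST body for Livy's /sessions endpoint. The deploy mode must
# NOT appear here: "spark.submit.deployMode" is rejected by the
# server-side blacklist, so it belongs in livy.conf
# (livy.spark.deploy-mode), not in the per-session conf map.
payload = {
    "kind": "pyspark",
    "conf": {
        "spark.executor.memory": "2g",  # placeholder value
    },
}

body = json.dumps(payload)
```

The rejection is issued by the Livy server before anything reaches Spark, which is why nothing shows up in the Spark logs; it is a validation failure, not a Spark 3.5.1 incompatibility.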

Dockerfile build fails on `livy-server` step

Hi,

I am trying to host this solution locally for Apache Spark and its language flavors with the command docker build -t livy-ci dev/docker/livy-dev-base/. After some installation steps, the error log below appears in the terminal. I am on Linux Ubuntu 20.04.

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Livy Project Parent POM ............................ SUCCESS [01:26 min]
[INFO] livy-api ........................................... SUCCESS [03:30 min]
[INFO] livy-client-common ................................. SUCCESS [  4.819 s]
[INFO] livy-test-lib ...................................... SUCCESS [  2.996 s]
[INFO] multi-scala-project-root ........................... SUCCESS [  1.042 s]
[INFO] livy-core-parent ................................... SUCCESS [  0.177 s]
[INFO] livy-core_2.11 ..................................... SUCCESS [  9.340 s]
[INFO] livy-rsc ........................................... SUCCESS [ 50.358 s]
[INFO] livy-repl-parent ................................... SUCCESS [ 25.652 s]
[INFO] livy-repl_2.11 ..................................... SUCCESS [04:05 min]
[INFO] livy-server ........................................ FAILURE [ 48.405 s]
[INFO] livy-assembly ...................................... SKIPPED
[INFO] livy-client-http ................................... SKIPPED
[INFO] livy-scala-api-parent .............................. SKIPPED
[INFO] livy-scala-api_2.11 ................................ SKIPPED
[INFO] livy-integration-test .............................. SKIPPED
[INFO] livy-coverage-report ............................... SKIPPED
[INFO] livy-examples ...................................... SKIPPED
[INFO] livy-python-api .................................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 11:26 min
[INFO] Finished at: 2023-06-22T21:17:57+00:00
[INFO] Final Memory: 108M/1364M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal on project livy-server: Could not resolve dependencies for project org.apache.livy:livy-server:jar:0.8.0-incubating-SNAPSHOT: Failed to collect dependencies at io.dropwizard.metrics:metrics-healthchecks:jar:3.1.0: Failed to read artifact descriptor for io.dropwizard.metrics:metrics-healthchecks:jar:3.1.0: Could not transfer artifact io.dropwizard.metrics:metrics-healthchecks:pom:3.1.0 from/to central (https://repo1.maven.org/maven2): Connection reset -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :livy-server
