apache / incubator-livy
Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.
Home Page: https://livy.apache.org/
License: Apache License 2.0
spark version: 3.2.1
livy version: release-0.7.1
Request failed:
{"msg":"requirement failed: Cannot find Livy REPL jars."}
Can you help me? Thanks.
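A likely explanation (an assumption, not confirmed by the report above): Livy resolves its REPL jars by Spark's Scala binary version, and Livy 0.7.x ships REPL jars only for Scala 2.11, while Spark 3.2.1 is built against Scala 2.12, so the lookup fails. If matching REPL jars do exist somewhere, livy.conf can point at them explicitly:

```properties
# livy.conf -- sketch; the path is a placeholder
livy.repl.jars = /path/to/livy/repl_2.12-jars/*
```

Otherwise, a Livy build that targets Scala 2.12 (0.8.0 and later) is needed for Spark 3.x.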
Livy 0.8.0 now compiles successfully with Scala 2.12.15, but with Spark 3.2 and 3.3+, PySpark 3.3 can start Spark on YARN but cannot submit jobs. This is the code:
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("test").enableHiveSupport().getOrCreate()
spark.sql("show databases").show()
This is the error:
23/05/18 14:11:34 INFO BlockManagerMasterEndpoint: Registering block manager localhost:41641 with 366.3 MiB RAM, BlockManagerId(2, localhost, 41641, None)
23/05/18 14:11:34 INFO SparkEntries: Spark context finished initialization in 16437ms
23/05/18 14:11:34 INFO SparkEntries: Created Spark session.
23/05/18 14:11:41 ERROR PythonInterpreter: Process has died with 1
23/05/18 14:11:41 ERROR PythonInterpreter: Traceback (most recent call last):
File "/tmp/6067082446938324509", line 722, in
sys.exit(main())
File "/tmp/6067082446938324509", line 570, in main
exec('from pyspark.sql import HiveContext', global_dict)
File "", line 1, in
File "/home/cocdkl/soft/spark-3.3.0-bin-hadoop3/python/lib/pyspark.zip/pyspark/__init__.py", line 71
def since(version: Union[str, float]) -> Callable[[F], F]:
^
SyntaxError: invalid syntax
The code was submitted through Hue. Hoping for an answer.
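The SyntaxError above points at a function annotation, which is exactly what a Python 2 interpreter reports when parsing Python-3-only syntax in PySpark 3.x sources. A minimal check sketch (assuming nothing else about the cluster):

```python
import sys

# PySpark 3.x sources use Python-3-only annotations such as
# "def since(version: Union[str, float]) -> Callable[[F], F]:".
# Python 2 fails at the annotation with "SyntaxError: invalid syntax",
# which matches the traceback above.
def pyspark3_compatible(version_info=sys.version_info):
    """True if this interpreter can parse PySpark 3.x sources (needs 3.6+)."""
    return tuple(version_info[:2]) >= (3, 6)

print(pyspark3_compatible())
```

If this returns False for the interpreter Livy launches, pointing PYSPARK_PYTHON (or spark.pyspark.python) at a Python 3 interpreter is the usual fix.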
livy.conf
livy.server.launch.kerberos.keytab=xxx.keytab
[email protected]
I am sure the keytab file and principal are correct, but I get an error when creating an interactive session:
23/11/10 11:43:22 INFO rpc.RpcServer: Connected to the port 10000
23/11/10 11:43:22 WARN common.ClientConf: Your hostname, into5, resolves to a loopback address, but we couldn't find any external IP address!
23/11/10 11:43:22 WARN common.ClientConf: Set livy.rsc.rpc.server.address if you need to bind to another address.
23/11/10 11:43:22 INFO sessions.InteractiveSessionManager: Registering new session 4
23/11/10 11:43:22 INFO sessions.InteractiveSessionManager: Registered new session 4
23/11/10 11:43:22 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
23/11/10 11:43:25 INFO utils.LineBufferedStream: SLF4J: Class path contains multiple SLF4J bindings.
23/11/10 11:43:25 INFO utils.LineBufferedStream: SLF4J: Found binding in [jar:file:/app/spark-3.2.1-bin-hadoop2.7/jars/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
23/11/10 11:43:25 INFO utils.LineBufferedStream: SLF4J: Found binding in [jar:file:/app/spark-3.2.1-bin-hadoop2.7/jars/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
23/11/10 11:43:25 INFO utils.LineBufferedStream: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
23/11/10 11:43:25 INFO utils.LineBufferedStream: SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
23/11/10 11:43:27 INFO utils.LineBufferedStream: 23/11/10 11:43:27 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
23/11/10 11:43:29 INFO utils.LineBufferedStream: 23/11/10 11:43:29 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
23/11/10 11:43:30 INFO utils.LineBufferedStream: 23/11/10 11:43:30 WARN Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
23/11/10 11:43:30 INFO utils.LineBufferedStream: Exception in thread "main" java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "into5/192.168.1.65"; destination host is: "into1":8020;
23/11/10 11:43:30 INFO utils.LineBufferedStream: at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:776)
23/11/10 11:43:30 INFO utils.LineBufferedStream: at org.apache.hadoop.ipc.Client.call(Client.java:1480)
23/11/10 11:43:30 INFO utils.LineBufferedStream: at org.apache.hadoop.ipc.Client.call(Client.java:1413)
23/11/10 11:43:30 INFO utils.LineBufferedStream: at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
23/11/10 11:43:30 INFO utils.LineBufferedStream: at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
If the following is added to spark-defaults.conf:
spark.kerberos.keytab=xxx.keytab
[email protected]
then the problem above is resolved; however, the Livy server log keeps printing:
23/11/10 11:43:52 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
23/11/10 11:44:22 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
23/11/10 11:44:52 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
23/11/10 11:45:22 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
Please give me some help, thank you!
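For reference, the two places the keytab is configured (paths and principal are placeholders; a sketch, not a confirmed diagnosis). The launch settings cover sessions Livy submits, while the Spark application itself reads the spark.kerberos.* keys, which matches the behavior described above where only the spark-defaults.conf entries stopped the launch failure:

```properties
# livy.conf -- credentials Livy uses when launching sessions
livy.server.launch.kerberos.keytab = /path/to/xxx.keytab
livy.server.launch.kerberos.principal = [email protected]

# spark-defaults.conf -- credentials for the Spark application itself
spark.kerberos.keytab = /path/to/xxx.keytab
spark.kerberos.principal = [email protected]
```

The recurring GSS warnings in the server log suggest the Livy server process itself still has no valid TGT when it talks to HDFS/YARN, which is a separate credential from the one the launched application uses.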
I am using sparklyr version 1.7.8 and the latest Livy version from the master branch of incubator-livy.
On connecting to Spark through Livy using R:
library(sparklyr)
sc <- spark_connect(master = "local", method = "livy", version ="3.1.1")
The command throws an error:
Error in livy_connection(master, config, app_name, version, hadoop_version, :
Failed to launch livy session, session status is dead
Could anyone please help me understand what the issue could be?
Does it support FlinkSQL, PyFlink, and FlinkML?
Kafka as Spark Streaming input and output:
spark.readStream.format("kafka")
Reading Kafka data, decoding the binary data to strings with df.map(_.toSeq.foldLeft("")(_ + separator + _)), and writing the stream back to Kafka throws ArrayIndexOutOfBoundsException: 1
The exception is reported on write; if I only output to the console there is no error.

Build fails on assembly module due to scala-2.11 dependencies.
I think this is due to ${scala.binary.version} being used in the modules sections, and it's populated before the profiles are loaded. When I ensure that scala-2.12 is the default in the <properties>, it works fine.
e.g.: I have a script, init.scala, with some common functions defined inside it. init.scala should run first when I start a Livy session.
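Livy has no built-in init-script hook, so one workaround sketch is to submit init.scala's contents as the very first statement of the session via Livy's statements endpoint, POST /sessions/{id}/statements (the file contents and Scala function below are hypothetical stand-ins):

```python
import json

def init_statement(script_contents):
    """Build the payload for POST /sessions/{id}/statements that runs
    the init script's code before any user code in the session."""
    return {"code": script_contents}

# Stand-in for open("init.scala").read():
init_code = 'def greet(name: String) = s"hello, $name"'
payload = init_statement(init_code)
print(json.dumps(payload))
```

Since statements in one interactive session share the interpreter, everything defined by that first statement stays visible to later ones.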
Currently, Livy does not escape backticks in user-provided spark-submit arguments. So if a customer passes an argument that contains backticks, the shell treats it as command substitution during spark-submit, causing that argument to become blank or invalid.
Example:
--query 'select * from test_db.`test_table`'
will become
--query 'select * from test_db.'
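The breakage above is classic shell command substitution: if the argument reaches the shell unquoted, the shell executes `test_table` as a command and splices in its (empty) output. A general-purpose escaping sketch (not Livy's actual code path):

```python
import shlex

query = "select * from test_db.`test_table`"

# shlex.quote wraps the argument in single quotes, inside which the
# shell performs no command substitution, so the backticks survive.
safe = shlex.quote(query)
print(safe)  # 'select * from test_db.`test_table`'
```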
How do I use two code snippets in one session, where the result dataset of a SQL execution in one snippet is passed to the next snippet? How are such parameters passed? Also, is Spark 3.5.1 supported?
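On the first question: statements submitted to one interactive session run in a shared interpreter (and share one SparkSession), so binding the SQL result to a variable, or registering it with createOrReplaceTempView, makes it visible to the next snippet. A minimal local simulation of that shared namespace (plain Python, not Livy's API):

```python
# Stand-in for the interpreter state one Livy session keeps between statements.
session_globals = {}

# "Statement 1": bind a result to a name; in a real session this would be
# something like: df = spark.sql("..."); df.createOrReplaceTempView("step1")
exec("result = [row for row in range(3)]", session_globals)

# "Statement 2": the next snippet reads the same name.
exec("count = len(result)", session_globals)
print(session_globals["count"])  # → 3
```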
What are the plans for later stages: will more engines be supported, and will there be better multi-tenancy, isolation, and context support?
What are the advantages and differences compared to Apache Linkis?
Livy can interface with Java, Python, Shell, SQL, etc., not only Spark, so it can integrate many languages through JDBC, Hive, MySQL, Python, etc.
There have been several updates to master recently, including adding support for later Python versions by fixing a bug that would only allow one line per cell.
Can we possibly have a new release of Livy? Amongst other things, this would allow AWS EMR to pull in the latest release with all the fixes.
spark version: 3.5.1
livy version: 0.8.0
Requesting through the API returns the following:
org.springframework.web.client.HttpClientErrorException$BadRequest: 400 Bad Request: "{"msg":"Rejected, Reason: Blacklisted configuration values in session config: spark.submit.deployMode"}"
But I don't see any error reported in the logs. Is it incompatible with Spark 3.5.1?
BTW, I also set up conf/spark-blacklist.conf and it doesn't seem to take effect.
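On the 400 above: the Livy server rejects session configs it considers restricted, and deploy mode is normally fixed server-side via livy.spark.deploy-mode in livy.conf rather than set per session. A sketch of a session-creation payload that simply omits the rejected key (the other values are placeholders):

```python
import json

payload = {
    "kind": "pyspark",
    # Deploy mode is decided by the server (livy.spark.deploy-mode),
    # so spark.submit.deployMode must not appear in the session conf.
    "conf": {"spark.executor.memory": "2g"},
}
print(json.dumps(payload))
```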
Exception when compiling 13 sources to /opt/incubator-livy/test-lib/target/classes
java.lang.NoSuchMethodError: scala.tools.nsc.Settings.nowarn()Lscala/tools/nsc/settings/AbsSettings$AbsSetting;
Hi,
I am trying to host this solution locally for Apache Spark and its language flavors, with the command docker build -t livy-ci dev/docker/livy-dev-base/. After some installation steps, the error log below appears in the terminal. I am on Linux Ubuntu 20.04.
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Livy Project Parent POM ............................ SUCCESS [01:26 min]
[INFO] livy-api ........................................... SUCCESS [03:30 min]
[INFO] livy-client-common ................................. SUCCESS [ 4.819 s]
[INFO] livy-test-lib ...................................... SUCCESS [ 2.996 s]
[INFO] multi-scala-project-root ........................... SUCCESS [ 1.042 s]
[INFO] livy-core-parent ................................... SUCCESS [ 0.177 s]
[INFO] livy-core_2.11 ..................................... SUCCESS [ 9.340 s]
[INFO] livy-rsc ........................................... SUCCESS [ 50.358 s]
[INFO] livy-repl-parent ................................... SUCCESS [ 25.652 s]
[INFO] livy-repl_2.11 ..................................... SUCCESS [04:05 min]
[INFO] livy-server ........................................ FAILURE [ 48.405 s]
[INFO] livy-assembly ...................................... SKIPPED
[INFO] livy-client-http ................................... SKIPPED
[INFO] livy-scala-api-parent .............................. SKIPPED
[INFO] livy-scala-api_2.11 ................................ SKIPPED
[INFO] livy-integration-test .............................. SKIPPED
[INFO] livy-coverage-report ............................... SKIPPED
[INFO] livy-examples ...................................... SKIPPED
[INFO] livy-python-api .................................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 11:26 min
[INFO] Finished at: 2023-06-22T21:17:57+00:00
[INFO] Final Memory: 108M/1364M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal on project livy-server: Could not resolve dependencies for project org.apache.livy:livy-server:jar:0.8.0-incubating-SNAPSHOT: Failed to collect dependencies at io.dropwizard.metrics:metrics-healthchecks:jar:3.1.0: Failed to read artifact descriptor for io.dropwizard.metrics:metrics-healthchecks:jar:3.1.0: Could not transfer artifact io.dropwizard.metrics:metrics-healthchecks:pom:3.1.0 from/to central (https://repo1.maven.org/maven2): Connection reset -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn <goals> -rf :livy-server