Comments (12)
Looks like it is not finding the companion object TDigest$. I've never tried it that way, but it may be a consequence of how Spark loads jar files.
Out of curiosity, does this work for you:
spark-shell --packages "org.isarnproject::isarn-sketches-spark:0.3.1-sp2.3-py3.6"
from isarn-sketches-spark.
Also: import org.isarnproject.sketches._
Thanks. I tried what you asked for:
$ spark-shell --packages "org.isarnproject:isarn-sketches-spark:0.3.1-sp2.3-py3.6"
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: org.isarnproject#isarn-sketches-spark;0.3.1-sp2.3-py3.6: not found]
What's your recommended way of loading the package?
You need two colons after org.isarnproject.
$ spark-shell --packages "org.isarnproject::isarn-sketches-spark:0.3.1-sp2.3-py3.6"
Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: Provided Maven Coordinates must be in the form 'groupId:artifactId:version'. The coordinate provided is: org.isarnproject::isarn-sketches-spark:0.3.1-sp2.3-py3.6
at scala.Predef$.require(Predef.scala:224)
at org.apache.spark.deploy.SparkSubmitUtils$$anonfun$extractMavenCoordinates$1.apply(SparkSubmit.scala:1000)
at org.apache.spark.deploy.SparkSubmitUtils$$anonfun$extractMavenCoordinates$1.apply(SparkSubmit.scala:998)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
at org.apache.spark.deploy.SparkSubmitUtils$.extractMavenCoordinates(SparkSubmit.scala:998)
at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1220)
at org.apache.spark.deploy.DependencyUtils$.resolveMavenDependencies(DependencyUtils.scala:49)
at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:350)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:170)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Oh, right: it wants the Scala version to be explicit.
spark-shell --packages "org.isarnproject:isarn-sketches-spark_2.11:0.3.1-sp2.3-py3.6"
I do think you'll also want: import org.isarnproject.sketches._
Thanks. But...
scala> import org.isarnproject.sketches._
import org.isarnproject.sketches._
scala> val udaf = tdigestUDAF[Double]
:26: error: not found: value tdigestUDAF
val udaf = tdigestUDAF[Double]
ah, I meant "in addition to the other imports you already had"
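Putting the exchange together, a minimal session might look like the sketch below. This assumes the 0.3.1-sp2.3 artifact and the import paths the project's README documents for that era; the exact UDAF API may differ between versions:

```scala
// Launched via:
//   spark-shell --packages "org.isarnproject:isarn-sketches-spark_2.11:0.3.1-sp2.3-py3.6"

// All three imports are needed: the sketches core, the UDAF bindings,
// and the Spark user-defined-type registration.
import org.isarnproject.sketches._
import org.isarnproject.sketches.udaf._
import org.apache.spark.isarnproject.sketches.udt._

// A toy DataFrame with one Double column.
val data = spark.range(1000).toDF("x").select($"x".cast("double").alias("x"))

// tdigestUDAF aggregates the column into a t-digest sketch of its distribution.
val udaf = tdigestUDAF[Double]
val agg = data.agg(udaf($"x").alias("tdigest"))
```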
Got it. Thank you!
I think originally you might just have been missing the import org.isarnproject.sketches._, although I'm not sure whether the jar file would have included all the right dependencies. For Maven-style packages, the "intended" way is to use --packages, but if you have the full set of jar files for all the dependencies, then --jars ought to work too.
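To sketch the two approaches side by side (the --jars paths below are hypothetical; you would need the actual dependency closure):

```shell
# Preferred: let Spark resolve the artifact and its transitive dependencies
# from Maven. Note the artifact id carries the Scala binary version (_2.11).
spark-shell --packages "org.isarnproject:isarn-sketches-spark_2.11:0.3.1-sp2.3-py3.6"

# Alternative: pass every jar explicitly. --jars does no dependency
# resolution, so you must supply all transitive jars yourself.
spark-shell --jars isarn-sketches-spark_2.11-0.3.1-sp2.3-py3.6.jar,isarn-sketches_2.11-0.1.1.jar
```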
I'll follow your suggestion of importing the packages. I didn't know the right package name to use in the first place; now I know. Thanks!