Comments (8)
I'm not sure there is a way to fix it. Internally Spark represents Option[Option[Int]] as a single nullable Integer, so there is no difference between the encodings of Some(None) and None. To fix it we would have to customize the encoding for such cases, but this would break compatibility with the rest of the Spark code, so the best we can do is to break compilation for the cases we don't support properly.
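To make the collision concrete, here is a minimal plain-Scala sketch that needs no Spark. The object name and the `encode`/`decode` helpers are hypothetical and only model "one nullable Integer"; they are not Spark's or frameless's actual encoder code.

```scala
object NullableEncodingSketch {
  // Hypothetical stand-in for a single nullable integer column.
  def encode(v: Option[Option[Int]]): java.lang.Integer = v match {
    case Some(Some(i)) => Int.box(i)
    case _             => null // both None and Some(None) end up here
  }

  // The decoder only sees null, so it has to pick one reading.
  // Some(None) is chosen here to match the REPL output quoted in this thread.
  def decode(raw: java.lang.Integer): Option[Option[Int]] =
    if (raw == null) Some(None) else Some(Some(raw.intValue))
}
```

With this model, `decode(encode(None))` cannot return `None`, because the information was already lost at encoding time.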
from frameless.
Spark has exactly the same behaviour:
scala> val xs: List[Option[Option[Int]]] = List(Some(None), None)
xs: List[Option[Option[Int]]] = List(Some(None), None)
scala> spark.createDataset(xs).collect()
res3: Array[Option[Option[Int]]] = Array(Some(None), Some(None))
Another example is X1[Option[X1[Option[Int]]]]. It doesn't work because in Spark a struct itself can't be nullable, only its fields, which is why in this case we can't tell whether X1#a is null or X1#a#a is null.
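A rough model of that situation, taking the premise above (struct always present, only fields nullable) as given. `InnerRow`, `OuterRow` and `encode` are hypothetical names for illustration and do not correspond to real Spark types.

```scala
object NestedStructSketch {
  final case class X1[A](a: A)

  // Hypothetical flattened form: the inner struct is always present,
  // only its field may be null.
  final case class InnerRow(a: java.lang.Integer)
  final case class OuterRow(a: InnerRow)

  def encode(v: X1[Option[X1[Option[Int]]]]): OuterRow = v.a match {
    case Some(X1(Some(i))) => OuterRow(InnerRow(Int.box(i)))
    // X1(None) and X1(Some(X1(None))) both collapse to a null field
    case _                 => OuterRow(InnerRow(null))
  }
}
```

Under this assumption `encode(X1(None))` and `encode(X1(Some(X1(None))))` produce equal rows, so a decoder has no way to tell them apart.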
Very thorough explanation! I am not sure there is anything we can do. I am not even sure there is any difference in semantics. For example, None or Some(Some(None)) have pretty much the same semantics for me.
It's actually possible to have an unboxed option where None and Some(None) can be differentiated. scala-unboxed-option uses the following trick: values are stored as themselves (thus the "unboxed"), and there is one "None value" for each level of nesting:
object None extends Option[Nothing]
object SomeNone extends Option[Option[Nothing]]
object SomeSomeNone extends Option[Option[Option[Nothing]]]
// ...
But that would be something to change in Spark not in Frameless :)
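For anyone curious, the sentinel-per-level idea can be sketched in plain Scala roughly like this. It only illustrates the trick described above for two levels of nesting; `Empty`, `encode` and `decode` are made-up names and not the scala-unboxed-option API.

```scala
object UnboxedOptionSketch {
  // One sentinel per nesting depth: Empty(0) plays the role of None,
  // Empty(1) the role of Some(None), and so on.
  final case class Empty(depth: Int)

  // Payloads are stored as themselves; only the "empty" cases use a sentinel.
  def encode(v: Option[Option[Int]]): Any = v match {
    case None          => Empty(0)
    case Some(None)    => Empty(1)
    case Some(Some(i)) => i
  }

  def decode(raw: Any): Option[Option[Int]] = raw match {
    case Empty(0) => None
    case Empty(1) => Some(None)
    case i: Int   => Some(Some(i))
    case other    => throw new IllegalArgumentException(s"unexpected value: $other")
  }
}
```

Because each depth has its own sentinel, `None` and `Some(None)` survive a round trip, which is exactly what a single null cannot give you.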
@kanterov what about things like TypedDataset[Tuple1[Vector[Option[Vector[Option[A]]]]]]? I'm actually using this ...
@tscholak hm... good question. Probably the best thing is to add it to the test suite; AFAIK it should work if A is primitive.
In my case, A is primitive, yes. I haven't had any problems with it so far, but just to be sure that it doesn't break any laws, I can add it to the tests.