chill's Introduction

Chill

Extensions for the Kryo serialization library including serializers and a set of classes to ease configuration of Kryo in systems like Hadoop, Storm, Akka, etc.

Compatibility

Serialization compatibility is NOT guaranteed between releases, and for this reason we don't recommend using Chill for long-term storage. Serialization is highly dependent on Scala version compatibility and on the underlying Kryo serializers, which take different approaches to compatibility.

Building Chill

> sbt
sbt:chill-all>  compile # to build chill
sbt:chill-all>  publishM2 # to publish chill to your local .m2 repo
sbt:chill-all>  publishLocal # publish to local ivy repo.

Chill has a set of subprojects: chill-java, chill-hadoop, chill-storm and chill-scala. Other than chill-scala, all these projects are written in Java so they are easy to use on any JVM platform.

Chill-Java

The chill-java package includes the KryoInstantiator class (factory for Kryo instances) and the IKryoRegistrar interface (adds Serializers to a given Kryo). These two are composable to build instantiators that create instances of Kryo that have the options and serializers you need. The benefit of this over a direct Kryo instance is that a Kryo instance is mutable and not serializable, which limits the safety and reusability of code that works directly with them.

To deserialize or serialize easily, look at KryoPool:

int POOL_SIZE = 10;
KryoPool kryo = KryoPool.withByteArrayOutputStream(POOL_SIZE, new KryoInstantiator());
byte[] ser = kryo.toBytesWithClass(myObj);
Object deserObj = kryo.fromBytes(ser); // deserialize the bytes produced above

The KryoPool is a thread-safe way to share Kryo instances and temporary output buffers.
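
The pooling idea itself can be sketched with the JDK alone. The block below is not how KryoPool is implemented; `ResourcePool`, `use`, and `ResourcePoolDemo` are hypothetical names, and a `StringBuilder` stands in for a Kryo instance with its output buffer. It only illustrates sharing a fixed number of non-thread-safe objects between threads.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical illustration only; this is NOT chill's KryoPool.
final class ResourcePool<T> {
    private final BlockingQueue<T> idle;

    ResourcePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    // Borrow an instance, run the action, and always hand the instance back.
    <R> R use(Function<T, R> action) throws InterruptedException {
        T resource = idle.take(); // blocks while every instance is in use
        try {
            return action.apply(resource);
        } finally {
            idle.add(resource);
        }
    }
}

final class ResourcePoolDemo {
    static String render(ResourcePool<StringBuilder> pool, int n) throws InterruptedException {
        return pool.use(sb -> {
            sb.setLength(0); // reset the reusable buffer before each use
            return sb.append("item-").append(n).toString();
        });
    }

    public static void main(String[] args) throws InterruptedException {
        ResourcePool<StringBuilder> pool = new ResourcePool<>(2, StringBuilder::new);
        System.out.println(render(pool, 1));
    }
}
```

Each caller holds an instance only for the duration of one call, so no two threads ever touch the same mutable object at once.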

Chill Config

Hadoop, Storm, and Akka all use a configuration that is basically equivalent to a Map[String, String]. The com.twitter.chill.config package makes it easy to build up KryoInstantiator instances given a Config instance, which is an abstract class acting as a thin wrapper over whatever configuration data the system, such as Hadoop, Storm or Akka, might give.

To configure a KryoInstantiator, use ConfiguredInstantiator in one of two ways: by reflection, which takes a class name and instantiates that KryoInstantiator, or by serialization, which takes an instance of KryoInstantiator and serializes it into the config for later use:

class TestInst extends KryoInstantiator { override def newKryo = sys.error("blow up") }

// A new Config:
val conf = new JavaMapConfig
// Set-up class-based reflection of our instantiator:
ConfiguredInstantiator.setReflect(conf, classOf[TestInst])
val cci = new ConfiguredInstantiator(conf)
cci.newKryo // uses TestInst
//Or serialize a particular instance into the config to use later (or another node):

ConfiguredInstantiator.setSerialized(conf, new TestInst)
val cci2 = new ConfiguredInstantiator(conf)
cci2.newKryo // uses the particular instance we passed above
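
The two strategies can be illustrated without chill at all. In the sketch below everything is an assumption made for illustration: the keys `example.factory.class` and `example.factory.serialized`, the `Greeter` class (standing in for a KryoInstantiator), and the use of Java serialization plus Base64 to fit an instance into a string-valued config. It is not a description of how ConfiguredInstantiator stores its data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; keys and class names are not chill's.
final class ConfigFactorySketch {
    static final String CLASS_KEY = "example.factory.class";
    static final String BYTES_KEY = "example.factory.serialized";

    // Stand-in for a KryoInstantiator: serializable, with a no-arg constructor.
    public static class Greeter implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String greeting;
        public Greeter() { this("hello"); }
        public Greeter(String greeting) { this.greeting = greeting; }
        public String greet() { return greeting; }
    }

    // Strategy 1: remember only the class name and instantiate it reflectively later.
    static void setReflect(Map<String, String> conf, Class<? extends Greeter> cls) {
        conf.put(CLASS_KEY, cls.getName());
        conf.remove(BYTES_KEY);
    }

    // Strategy 2: serialize one particular instance into the string-valued config.
    static void setSerialized(Map<String, String> conf, Greeter instance) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(instance);
        }
        conf.put(BYTES_KEY, Base64.getEncoder().encodeToString(bytes.toByteArray()));
    }

    // Prefer the serialized instance when present, otherwise fall back to reflection.
    static Greeter load(Map<String, String> conf) throws Exception {
        String encoded = conf.get(BYTES_KEY);
        if (encoded != null) {
            byte[] raw = Base64.getDecoder().decode(encoded);
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
                return (Greeter) in.readObject();
            }
        }
        Class<?> cls = Class.forName(conf.get(CLASS_KEY));
        return (Greeter) cls.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> conf = new HashMap<>();
        setReflect(conf, Greeter.class);
        System.out.println(load(conf).greet());
        setSerialized(conf, new Greeter("configured"));
        System.out.println(load(conf).greet());
    }
}
```

Reflection keeps the config small but can only rebuild a default instance; serializing the instance carries its exact state to another node at the cost of a larger config value.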

Chill in Scala

Scala classes often have properties that distinguish them from typical Java classes. First, Scala classes are often immutable and thus have no zero-argument constructor. Second, an object in Scala is a singleton that needs to be serialized carefully. Finally, Scala classes often have synthetic (compiler-generated) fields that need to be serialized, and by default Kryo does not serialize those.

In addition to a ScalaKryoInstantiator, which generates Kryo instances with options suitable for Scala, chill provides a number of Kryo serializers for standard Scala classes (see below).

The MeatLocker

Many existing systems use Java serialization. MeatLocker is an object that wraps a given instance using Kryo serialization internally, but the MeatLocker itself is Java serializable. The MeatLocker allows you to box Kryo-serializable objects and deserialize them lazily on the first call to get:

import com.twitter.chill.MeatLocker

val boxedItem = MeatLocker(someItem)

// boxedItem is java.io.Serializable no matter what it contains.
val box = roundTripThroughJava(boxedItem)
box.get == boxedItem.get // true!

To retrieve the boxed item without caching the deserialized value, use meatlockerInstance.copy.
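
The boxing pattern can be sketched with the JDK alone. The block below is not MeatLocker's implementation: `PointBox`, `Point`, and the hand-written 8-byte encoding are hypothetical stand-ins (the encoding plays the role that Kryo plays in chill), and `roundTripThroughJava` shows one plausible shape for the helper used in the snippet above, which the README does not define.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.ByteBuffer;

// Hypothetical illustration only; this is NOT chill's MeatLocker.
final class LazyBoxSketch {
    // Deliberately NOT java.io.Serializable.
    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // Serializable box: only the encoded bytes travel through Java serialization.
    static final class PointBox implements Serializable {
        private static final long serialVersionUID = 1L;
        private final byte[] bytes;
        private transient Point cached; // null again after Java deserialization

        PointBox(Point p) {
            this.bytes = ByteBuffer.allocate(8).putInt(p.x).putInt(p.y).array();
            this.cached = p;
        }

        // Decode on first access, then reuse the cached value (not thread-safe).
        Point get() {
            if (cached == null) {
                cached = decode();
            }
            return cached;
        }

        // Always decode a fresh value, bypassing the cache.
        Point copy() {
            return decode();
        }

        private Point decode() {
            ByteBuffer buf = ByteBuffer.wrap(bytes);
            return new Point(buf.getInt(), buf.getInt());
        }
    }

    static PointBox roundTripThroughJava(PointBox box) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(box);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (PointBox) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        PointBox box = roundTripThroughJava(new PointBox(new Point(3, 4)));
        System.out.println(box.get().x + "," + box.get().y);
    }
}
```

Marking the cached value `transient` is what keeps the non-serializable payload out of the Java stream; the box rebuilds it from the bytes on the receiving side.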

Serializers for Scala classes

These are found in the chill-scala directory in the chill jar (originally this project consisted only of Scala serializers). Chill provides support for singletons, Scala objects and the following types:

  • Scala primitives
    • scala.Enumeration values
    • scala.Symbol
    • scala.reflect.Manifest
    • scala.reflect.ClassManifest
    • scala.Function[0-22] closure cleaning (removing unused $outer references).
  • Collections and sequences
    • scala.collection.immutable.Map
    • scala.collection.immutable.List
    • scala.collection.immutable.Vector
    • scala.collection.immutable.Set
    • scala.collection.mutable.{Map, Set, Buffer, WrappedArray}
    • all 22 Scala tuples

Chill-bijection

Bijections and Injections are useful when considering serialization. If you have an Injection from T to Array[Byte] you have a serialization. Additionally, if you have a Bijection between A and B, and a serialization for B, then you have a serialization for A. See BijectionEnrichedKryo for easy interop between bijection and chill.
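
The composition rule in this paragraph can be sketched in a few lines. The `Bijection` and `Injection` interfaces below are hypothetical local definitions written for this illustration, not the com.twitter.bijection types, and `UserId` and `via` are made-up names.

```java
import java.nio.ByteBuffer;
import java.util.Optional;

// Hypothetical local interfaces for illustration; NOT the com.twitter.bijection types.
final class InjectionSketch {
    interface Bijection<A, B> {
        B apply(A a);
        A invert(B b);
    }

    interface Injection<A, B> {
        B apply(A a);
        Optional<A> invert(B b); // empty when b is not in the image of apply
    }

    static final class UserId {
        final long value;
        UserId(long value) { this.value = value; }
    }

    // A serialization for Long: an injection into byte arrays.
    static final Injection<Long, byte[]> LONG_BYTES = new Injection<Long, byte[]>() {
        public byte[] apply(Long n) {
            return ByteBuffer.allocate(Long.BYTES).putLong(n).array();
        }
        public Optional<Long> invert(byte[] bytes) {
            return bytes.length == Long.BYTES
                ? Optional.of(ByteBuffer.wrap(bytes).getLong())
                : Optional.empty();
        }
    };

    // A bijection between UserId and Long.
    static final Bijection<UserId, Long> USER_ID_LONG = new Bijection<UserId, Long>() {
        public Long apply(UserId id) { return id.value; }
        public UserId invert(Long n) { return new UserId(n); }
    };

    // Bijection A <-> B plus a serialization for B yields a serialization for A.
    static <A, B> Injection<A, byte[]> via(Bijection<A, B> bij, Injection<B, byte[]> ser) {
        return new Injection<A, byte[]>() {
            public byte[] apply(A a) { return ser.apply(bij.apply(a)); }
            public Optional<A> invert(byte[] bytes) { return ser.invert(bytes).map(bij::invert); }
        };
    }

    public static void main(String[] args) {
        Injection<UserId, byte[]> ser = via(USER_ID_LONG, LONG_BYTES);
        byte[] bytes = ser.apply(new UserId(42L));
        System.out.println(ser.invert(bytes).map(id -> id.value));
    }
}
```

The inverse returns an `Optional` because not every byte array is a valid encoding; that partiality is what separates an injection from a bijection.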

KryoInjection: easy serialization to byte Arrays

KryoInjection is an injection from Any to Array[Byte]. To serialize using it:

import com.twitter.chill.KryoInjection

val bytes:  Array[Byte]    = KryoInjection(someItem)
val tryDecode: scala.util.Try[Any] = KryoInjection.invert(bytes)

KryoInjection can be composed with Bijections and Injections from com.twitter.bijection.

Chill-Akka

To use, add a key to your config like:

    akka.actor.serializers {
      kryo = "com.twitter.chill.akka.AkkaSerializer"
    }

Then for the super-classes of all your message types, for instance, java.io.Serializable (all case classes and case objects are serializable), write:

   akka.actor.serialization-bindings {
     "java.io.Serializable" = kryo
   }

With this in place you can now disable Java serialization entirely:

akka.actor {
  # Set this to on to enable serialization-bindings defined in
  # additional-serialization-bindings. Those are by default not included
  # for backwards compatibility reasons. They are enabled by default if
  # akka.remote.artery.enabled=on.
  enable-additional-serialization-bindings = on
  
  allow-java-serialization = off
}

If you want to use the chill.config.ConfiguredInstantiator, see ConfiguredAkkaSerializer. Otherwise, subclass AkkaSerializer and override kryoInstantiator to control how the Kryo object is created.

Documentation

To learn more and find links to tutorials and information around the web, check out the Chill Wiki.

The latest ScalaDocs are hosted on Chill's Github Project Page.

Contact

Discussion occurs primarily on the Chill mailing list. Issues should be reported on the GitHub issue tracker.

Get Involved + Code of Conduct

Pull requests and bug reports are always welcome!

We use a lightweight form of project governance inspired by the one used by Apache projects. Please see Contributing and Committership for our code of conduct and our pull request review process. The TL;DR: send us a pull request, iterate on the feedback and discussion, and get a +1 from a committer to get your PR accepted.

The current list of active committers (who can +1 a pull request) can be found here: Committers

A list of contributors to the project can be found here: Contributors

Maven

Chill modules are available on Maven Central. The current groupId and version for all modules are, respectively, "com.twitter" and 0.10.0. Each Scala project is published for Scala 2.11, 2.12 and 2.13. Search search.maven.org when in doubt.

chill-scala is not published separately; to use it, reference chill. To add the dependency to your project using SBT:

"com.twitter" %% "chill" % "0.10.0"

Authors

License

Copyright 2012 Twitter, Inc.

Licensed under the Apache License, Version 2.0.

Thanks to Yourkit

YourKit supports open source projects with innovative and intelligent tools for monitoring and profiling Java and .NET applications. YourKit is the creator of YourKit Java Profiler, YourKit .NET Profiler, and YourKit YouMonitor.

chill's People

Contributors

2m, alanbato, andrewsmartin, caniszczyk, chermenin, cmoh, dependabot[bot], figpope, gdiet, ianoc, isnotinvain, johnynek, koertkuipers, lamdor, mansurashraf, mslinn, nevillelyh, ngocdaothanh, oscar-stripe, ravwojdyla, regadas, rmetzger, rxin, ryanlecompte, scala-steward, schmmd, sritchie, tsdeng, unkarjedy, zakattacktwitter

chill's Issues

ClosureCleaner is pretty broken

We need some strict unit tests, and we need to make them pass.

Right now, I'm thinking this is not actually working like we expect.

We need to decompile a bunch of examples and verify that it is doing what we expect. Currently, I don't believe that it is (maybe for spark's REPL it works, but in our use case, I don't think it is).

chill thrift

Subclasses of TBase should be serialized with Thrift itself, not field by field.

Release Chill 0.2.1 for Scala 2.10.x

The version has been released for Scala 2.9.2, but not for 2.10.x.

I'm not using Scala 2.9.x any more, but once you have released for 2.9.2, how about 2.9.3?

Deserialization of list of list or case class that contains list

I don't know if I am using chill the way it is supposed to be used, but here are some trivial failing cases that I found:

test("int list serialize/deserialize") {
  val listObj = List(
    List(List(1)),
    List(List(1))
  )

  val serializedObject = KryoBijection.serialize(listObj)
  val deserializedObj = KryoBijection.deserialize[List[List[Int]]](serializedObject)

  assert(listObj === deserializedObj)
}

Results in
List(List(List(1)), List(List(1))) did not equal List(1, List(1), List(1, List(1)))

test("int list container serialize/deserialize") {
  val listObj = List(
    ListContainer(List(1)),
    ListContainer(List(1))
  )

  val serializedObject = KryoBijection.serialize(listObj)
  val deserializedObj = KryoBijection.deserialize[List[ListContainer]](serializedObject)

  assert(listObj === deserializedObj)
}

case class ListContainer(ints: List[Int])

Results in
List(ListContainer(List(1)), ListContainer(List(1))) did not equal List(1, ListContainer(List(1)))

Add scala.runtime.BoxedUnit

I think this is serialized okay now, but we need to add tests and I think if we register this it can be free (it is a singleton, but a Java class).

In chill-hadoop check Kryo/Buffer creation.

It is costly to create a Kryo instance, and right now the code does it on every open. Cascading calls open after almost every entry in the tuple, so this could kill performance.

We do this because we fear (or have been burned by?) Kryo being mutable. What we want is an immutable Kryo instance that is used only for serialization and never changes any tokens or registers any new ones.

Issue with specialized Tuples in WrappedArray

Hi guys,

We at Spark ran into a problem using Chill to serialize WrappedArray. See the following code. It looks like the problem is with a WrappedArray of specialized tuples. If I replace the ints with strings (and thus get no specialized Tuple2), the exception goes away.

This is becoming a major problem for the Spark project because anything with a primitive WrappedArray would fail.

@mateiz, @dlyubimov

scala> val ser = new spark.KryoSerializer
ser: spark.KryoSerializer = spark.KryoSerializer@7798ce82

scala> ser.newInstance.deserialize[Array[(Int, Int)]]( ser.newInstance.serialize(Array((1, 2), (2, 3)))  )
res12: Array[(Int, Int)] = Array((1,2), (2,3))

scala> ser.newInstance.deserialize[Seq[(Int, String)]]( ser.newInstance.serialize(Array((1, "2"), (2, "3")).toSeq)  )
res20: Seq[(Int, String)] = WrappedArray((1,2), (2,3))

scala> ser.newInstance.deserialize[Seq[(Int, Int)]]( ser.newInstance.serialize(Array((1, 2), (2, 3)).toSeq)  )
java.lang.ArrayIndexOutOfBoundsException: -2
  at java.util.ArrayList.get(ArrayList.java:324)
  at com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:42)
  at com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:773)
  at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:624)
  at com.twitter.chill.WrappedArraySerializer$$anonfun1.apply(WrappedArraySerializer.scala:44)
  at com.twitter.chill.WrappedArraySerializer$$anonfun1.apply(WrappedArraySerializer.scala:43)
  at scala.collection.immutable.Range.foreach(Range.scala:81)
  at com.twitter.chill.WrappedArraySerializer.read(WrappedArraySerializer.scala:43)
  at com.twitter.chill.WrappedArraySerializer.read(WrappedArraySerializer.scala:21)
  at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
  at spark.KryoSerializerInstance.deserialize(KryoSerializer.scala:75)
  at <init>(<console>:13)
  at <init>(<console>:18)
  at <init>(<console>:20)
  at <init>(<console>:22)
  at <init>(<console>:24)
  at .<init>(<console>:28)
  at .<clinit>(<console>)
  at .<init>(<console>:11)
  at .<clinit>(<console>)
  at $export(<console>)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:629)
  at spark.repl.SparkIMain$Request$$anonfun$10.apply(SparkIMain.scala:890)
  at scala.tools.nsc.interpreter.Line$$anonfun$1.apply$mcV$sp(Line.scala:43)
  at scala.tools.nsc.io.package$$anon$2.run(package.scala:25)
  at java.lang.Thread.run(Thread.java:680)

Update Readme to explain usage

Explain how to use the library. Probably just something like: KryoSerializer.registerAll(k) on a k: Kryo which might have been generated from (new KryoSerializer { }).getKryo

NPE on deserialized fn

This code causes an NPE for me, with Scala 2.10.1 and master:

import com.twitter.chill._

object Foo {
    def Bar = 1
}

object Main extends Application {
    override def main(args: Array[String]) {
        val bytes = KryoInjection(() => {
            println(Foo.Bar)
        })
        val x = KryoInjection.invert(bytes)
        val f = x.get match {
            case f : Function0[Unit] => f()
        }
    }
}

Any idea what I am doing wrong here?

Can't upgrade to 2.9.3

I don't know if this is due to some hidden state on my machine (I nuked everything I could think of), but adding scalaVersion := "2.9.3", to the build gives me the below (and it is super frustrating).

From my irc plea for help:

[10:30am] oscarboykin: first it warns that it can't find source:
[10:30am] oscarboykin: [warn] :: org.scalacheck#scalacheck_2.9.3;1.10.0!scalacheck_2.9.3.src
[10:30am] oscarboykin: [warn] :: org.scala-tools.testing#specs_2.9.3;1.6.9!specs_2.9.3.src
[10:30am] oscarboykin: but then this becomes an error:
[10:30am] oscarboykin: [info] Done updating.
[10:30am] oscarboykin: sbt.ResolveException: download failed: org.scalacheck#scalacheck_2.9.3;1.10.0!scalacheck_2.9.3.src
[10:30am] oscarboykin: download failed: org.scala-tools.testing#specs_2.9.3;1.6.9!specs_2.9.3.src
[10:30am] oscarboykin:  at sbt.IvyActions$.sbt$IvyActions$$resolve(IvyActions.scala:214)
[10:31am] oscarboykin: meanwhile, 1) the source jars are there: http://oss.sonatype.org/content/repositories/releases/org/scala-tools/testing/specs_2.9.3/1.6.9/
[10:31am] oscarboykin: 2) I don't know why there are needed anyway
[10:32am] oscarboykin: but they are not suffixed with .src as sbt seems to be looking for:
[10:32am] oscarboykin: http://oss.sonatype.org/content/repositories/releases/org/scala-tools/testing/specs_2.9.3/1.6.9/specs_2.9.3-1.6.9-sources.src
[10:32am] oscarboykin: they are .jars.

Actual output:

[info] Loading project definition from /Users/oscarb/workspace/chill/project
[info] Set current project to chill-all (in build file:/Users/oscarb/workspace/chill/)
[info] Updating {file:/Users/oscarb/workspace/chill/}chill...
[info] Resolving org.ow2.asm#asm-tree;4.0 ...
[warn]  [FAILED     ] org.scalacheck#scalacheck_2.9.3;1.10.0!scalacheck_2.9.3.src:  (0ms)
[warn] ==== local: tried
[warn]   /Users/oscarb/.ivy2/local/org.scalacheck/scalacheck_2.9.3/1.10.0/srcs/scalacheck_2.9.3-sources.src
[warn] ==== sonatype-snapshots: tried
[warn]   http://oss.sonatype.org/content/repositories/snapshots/org/scalacheck/scalacheck_2.9.3/1.10.0/scalacheck_2.9.3-1.10.0-sources.src
[warn] ==== sonatype-releases: tried
[warn]   http://oss.sonatype.org/content/repositories/releases/org/scalacheck/scalacheck_2.9.3/1.10.0/scalacheck_2.9.3-1.10.0-sources.src
[warn] ==== public: tried
[warn]   http://repo1.maven.org/maven2/org/scalacheck/scalacheck_2.9.3/1.10.0/scalacheck_2.9.3-1.10.0-sources.src
[warn]  [FAILED     ] org.scala-tools.testing#specs_2.9.3;1.6.9!specs_2.9.3.src:  (0ms)
[warn] ==== local: tried
[warn]   /Users/oscarb/.ivy2/local/org.scala-tools.testing/specs_2.9.3/1.6.9/srcs/specs_2.9.3-sources.src
[warn] ==== sonatype-snapshots: tried
[warn]   http://oss.sonatype.org/content/repositories/snapshots/org/scala-tools/testing/specs_2.9.3/1.6.9/specs_2.9.3-1.6.9-sources.src
[warn] ==== sonatype-releases: tried
[warn]   http://oss.sonatype.org/content/repositories/releases/org/scala-tools/testing/specs_2.9.3/1.6.9/specs_2.9.3-1.6.9-sources.src
[warn] ==== public: tried
[warn]   http://repo1.maven.org/maven2/org/scala-tools/testing/specs_2.9.3/1.6.9/specs_2.9.3-1.6.9-sources.src
[warn]  ::::::::::::::::::::::::::::::::::::::::::::::
[warn]  ::              FAILED DOWNLOADS            ::
[warn]  :: ^ see resolution messages for details  ^ ::
[warn]  ::::::::::::::::::::::::::::::::::::::::::::::
[warn]  :: org.scalacheck#scalacheck_2.9.3;1.10.0!scalacheck_2.9.3.src
[warn]  :: org.scala-tools.testing#specs_2.9.3;1.6.9!specs_2.9.3.src
[warn]  ::::::::::::::::::::::::::::::::::::::::::::::
[info] Updating {file:/Users/oscarb/workspace/chill/}chill-java...
[info] Resolving com.esotericsoftware.kryo#kryo;2.21 ...
[info] Updating {file:/Users/oscarb/workspace/chill/}chill-all...
[info] Resolving org.scala-tools.testing#specs_2.9.3;1.6.9 ...
[info] Done updating.
[info] Resolving org.objenesis#objenesis;1.2 ...
[info] Updating {file:/Users/oscarb/workspace/chill/}chill-hadoop...
[info] Resolving org.scala-tools.testing#specs_2.9.3;1.6.9 ...
[info] Done updating.
[info] Resolving log4j#log4j;1.2.17 ...
[info] Done updating.
sbt.ResolveException: download failed: org.scalacheck#scalacheck_2.9.3;1.10.0!scalacheck_2.9.3.src
download failed: org.scala-tools.testing#specs_2.9.3;1.6.9!specs_2.9.3.src
    at sbt.IvyActions$.sbt$IvyActions$$resolve(IvyActions.scala:214)
    at sbt.IvyActions$$anonfun$update$1.apply(IvyActions.scala:122)
    at sbt.IvyActions$$anonfun$update$1.apply(IvyActions.scala:121)
    at sbt.IvySbt$Module$$anonfun$withModule$1.apply(Ivy.scala:117)
    at sbt.IvySbt$Module$$anonfun$withModule$1.apply(Ivy.scala:117)
    at sbt.IvySbt$$anonfun$withIvy$1.apply(Ivy.scala:105)
    at sbt.IvySbt.liftedTree1$1(Ivy.scala:52)
    at sbt.IvySbt.action$1(Ivy.scala:52)
    at sbt.IvySbt$$anon$3.call(Ivy.scala:61)
    at xsbt.boot.Locks$GlobalLock.withChannel$1(Locks.scala:98)
    at xsbt.boot.Locks$GlobalLock.withChannelRetries$1(Locks.scala:81)
    at xsbt.boot.Locks$GlobalLock$$anonfun$withFileLock$1.apply(Locks.scala:102)
    at xsbt.boot.Using$.withResource(Using.scala:11)
    at xsbt.boot.Using$.apply(Using.scala:10)
    at xsbt.boot.Locks$GlobalLock.ignoringDeadlockAvoided(Locks.scala:62)
    at xsbt.boot.Locks$GlobalLock.liftedTree1$1(Locks.scala:52)
    at xsbt.boot.Locks$GlobalLock.withLock(Locks.scala:52)
    at xsbt.boot.Locks$.apply0(Locks.scala:31)
    at xsbt.boot.Locks$.apply(Locks.scala:28)
    at sbt.IvySbt.withDefaultLogger(Ivy.scala:61)
    at sbt.IvySbt.withIvy(Ivy.scala:102)
    at sbt.IvySbt.withIvy(Ivy.scala:98)
    at sbt.IvySbt$Module.withModule(Ivy.scala:117)
    at sbt.IvyActions$.update(IvyActions.scala:121)
    at sbt.Classpaths$$anonfun$work$1$1.apply(Defaults.scala:955)
    at sbt.Classpaths$$anonfun$work$1$1.apply(Defaults.scala:953)
    at sbt.Classpaths$$anonfun$doWork$1$1$$anonfun$58.apply(Defaults.scala:976)
    at sbt.Classpaths$$anonfun$doWork$1$1$$anonfun$58.apply(Defaults.scala:974)
    at sbt.Tracked$$anonfun$lastOutput$1.apply(Tracked.scala:35)
    at sbt.Classpaths$$anonfun$doWork$1$1.apply(Defaults.scala:978)
    at sbt.Classpaths$$anonfun$doWork$1$1.apply(Defaults.scala:973)
    at sbt.Tracked$$anonfun$inputChanged$1.apply(Tracked.scala:45)
    at sbt.Classpaths$.cachedUpdate(Defaults.scala:981)
    at sbt.Classpaths$$anonfun$47.apply(Defaults.scala:858)
    at sbt.Classpaths$$anonfun$47.apply(Defaults.scala:855)
    at sbt.Scoped$$anonfun$hf10$1.apply(Structure.scala:586)
    at sbt.Scoped$$anonfun$hf10$1.apply(Structure.scala:586)
    at scala.Function1$$anonfun$compose$1.apply(Function1.scala:49)
    at sbt.Scoped$Reduced$$anonfun$combine$1$$anonfun$apply$12.apply(Structure.scala:311)
    at sbt.Scoped$Reduced$$anonfun$combine$1$$anonfun$apply$12.apply(Structure.scala:311)
    at sbt.$tilde$greater$$anonfun$$u2219$1.apply(TypeFunctions.scala:41)
    at sbt.std.Transform$$anon$5.work(System.scala:71)
    at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:232)
    at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:232)
    at sbt.ErrorHandling$.wideConvert(ErrorHandling.scala:18)
    at sbt.Execute.work(Execute.scala:238)
    at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:232)
    at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:232)
    at sbt.ConcurrentRestrictions$$anon$4$$anonfun$1.apply(ConcurrentRestrictions.scala:160)
    at sbt.CompletionService$$anon$2.call(CompletionService.scala:30)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
[error] (chill/*:update) sbt.ResolveException: download failed: org.scalacheck#scalacheck_2.9.3;1.10.0!scalacheck_2.9.3.src
[error] download failed: org.scala-tools.testing#specs_2.9.3;1.6.9!specs_2.9.3.src

Give Ryan LeCompte commit access

@caniszczyk Can you hook up @ryanlecompte? He has helped review and has contributed code. Additionally, he helped get chill merged into Spark.

Ryan: FYI our rule is: never merge code you wrote yourself. Always ping another committer. Also, only merge when Travis is green (obviously).

KryoBijection not deserializing complex List properly

This is using latest chill. See the following:

import com.twitter.chill._
import scala.collection._

val s = List(Some(mutable.HashMap(1->1, 2->2)), None, Some(mutable.HashMap(3->4)))
KryoBijection.invert(KryoBijection(s))

The above incorrectly produces:

res4: java.lang.Object = List(Some(Map(3 -> 4)), None, Some(Map(3 -> 4)))

Remove duplication of config object

There are a bunch of similar classes about building and setting up Kryo given Map[String, String] (see below).

Can we add a base class that covers most of the common cases, put it in chill-java, and leverage it in KryoRegistrationHelper.scala?

There is some serious code smell going on here. Ideally we could get storm to depend on chill-java and share this java-only code.


https://github.com/twitter/chill/blob/develop/chill-scala/src/main/scala/com/twitter/chill/KryoRegistrationHelper.scala

https://github.com/twitter/chill/blob/develop/chill-hadoop/src/main/java/com/twitter/chill/hadoop/KryoFactory.java

https://github.com/nathanmarz/storm/blob/master/storm-core/src/jvm/backtype/storm/serialization/IKryoFactory.java

https://github.com/nathanmarz/storm/blob/master/storm-core/src/jvm/backtype/storm/serialization/IKryoDecorator.java

Deal with inner case classes

Inner case classes seem to capture references to the outer instance even when those references are not used.

If we can find some cheap way to detect this, it would really help scalding and possibly spark.

Add example about serializing to bytes and deserializing from bytes

Currently the README only mentions MeatLocker.
Should I add an example like this to the README? (I will send a pull request.)

import scala.util.Try
import com.twitter.chill.KryoBijection

object SeriDeseri {
  def serialize(ref: AnyRef): Array[Byte] = KryoBijection(ref)
  def deserialize(bytes: Array[Byte]) = Try { KryoBijection.invert(bytes) }
}

Regex marshal/unmarshal failure

Weird regex error. See below:

import com.twitter.chill._
import scala.util.matching.Regex

val r = """\bhilarious""".r

r.findFirstIn("oh boy this is hilarious").isDefined
true

val chilled = KryoInjection.invert(KryoInjection(r)).get.asInstanceOf[Regex]

chilled.findFirstIn("oh boy this is hilarious").isDefined

java.lang.IndexOutOfBoundsException: No group 0
    at java.util.regex.Matcher.group(Matcher.java:470)
    at java.util.regex.Matcher.group(Matcher.java:428)
    at scala.util.matching.Regex.findFirstIn(Regex.scala:213)
    at .<init>(<console>:13)
    at .<clinit>(<console>)
    at .<init>(<console>:11)
    at .<clinit>(<console>)
    at $print(<console>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

chill protobuf

Make a serializer that just lets protobuf messages do their thing. Make it easy to register for any subclass of Message.

Wrong SerDe of mutable.Map[Int, mutable.Set[AnyRef]]

When doing a round trip over a mutable Map of mutable Sets, the result is not the same as the input. It seems that each mutable.Set is reused as the basis for the next.

This happens with Kryo 2.21 and Chill 0.2.3.

Example:

val obj = mutable.Map(4 -> mutable.Set("house1", "house2"), 1 -> mutable.Set("name3", "name4", "name1", "name2"), 0 -> mutable.Set(1, 2, 3, 4))

MeatLocker(obj).copy // => Map(4 -> Set(house1, house2), 1 -> Set(name3, house1, house2, name4, name1, name2), 0 -> Set(name3, house1, 1, house2, 2, 3, name4, name1, 4, name2))

Register actual Map/Set Instances

Registering these will prevent their names from being written into the stream, which could be a big win.

scala> Map(1->2).getClass
res124: java.lang.Class[_ <: scala.collection.immutable.Map[Int,Int]] = class scala.collection.immutable.Map$Map1

scala> Map(1->2, 2->4).getClass
res125: java.lang.Class[_ <: scala.collection.immutable.Map[Int,Int]] = class scala.collection.immutable.Map$Map2

scala> Map(1->2, 2->4, 4->5).getClass
res126: java.lang.Class[_ <: scala.collection.immutable.Map[Int,Int]] = class scala.collection.immutable.Map$Map3

scala> Map(1->2, 2->4, 4->5, 5->6).getClass
res127: java.lang.Class[_ <: scala.collection.immutable.Map[Int,Int]] = class scala.collection.immutable.Map$Map4

scala> Map(1->2, 2->4, 4->5, 5->6, 6->7).getClass
res128: java.lang.Class[_ <: scala.collection.immutable.Map[Int,Int]] = class scala.collection.immutable.HashMap$HashTrieMap

scala> Set().getClass
res132: java.lang.Class[_ <: scala.collection.immutable.Set[Nothing]] = class scala.collection.immutable.Set$EmptySet$

scala> Set(1).getClass
res133: java.lang.Class[_ <: scala.collection.immutable.Set[Int]] = class scala.collection.immutable.Set$Set1

scala> Set(1,2).getClass
res134: java.lang.Class[_ <: scala.collection.immutable.Set[Int]] = class scala.collection.immutable.Set$Set2

scala> Set(1,2,3).getClass
res135: java.lang.Class[_ <: scala.collection.immutable.Set[Int]] = class scala.collection.immutable.Set$Set3

scala> Set(1,2,3,4).getClass
res136: java.lang.Class[_ <: scala.collection.immutable.Set[Int]] = class scala.collection.immutable.Set$Set4

scala> Set(1,2,3,4,5).getClass
res137: java.lang.Class[_ <: scala.collection.immutable.Set[Int]] = class scala.collection.immutable.HashSet$HashTrieSet

chill scrooge

For scrooge generated objects, make sure to use scrooge's thrift serialization.

Improve performance by using the Unsafe-based serialization supported by the trunk version of Kryo

It could be worth testing with the upcoming trunk version of Kryo. It contains significant performance improvements, including Unsafe-based serialization and the ability to serialize directly into off-heap memory at almost native speed if required. To use these new features, you only need to use the new IO classes (like UnsafeInput and UnsafeOutput) instead of the standard Input/Output classes.

IMHO, these features could be very useful in general and for Big Data apps in particular (Hadoop, Cassandra, Storm, etc).

BTW, after I added Unsafe-based serialization to Kryo, some other well-known frameworks included similar features. The list includes Avro (I added it there myself) and Hazelcast (they reused many ideas and adapted them to their APIs). MapDB could be the next one.

Expected performance improvements are rather significant: You often see 3-4X speedups. If your classes contain arrays of native types (ints, floats, doubles) you can expect even 10x performance improvements.

Chill should support Scala Enumerations

Caused by: java.lang.NoSuchMethodError: scala.Enumeration$Val: method <init>()V not found
    at scala.Enumeration$ValConstructorAccess.newInstance(Unknown Source)
    at com.esotericsoftware.kryo.Kryo$1.newInstance(Kryo.java:978)
    at com.esotericsoftware.kryo.Kryo.newInstance(Kryo.java:1027)
    at com.esotericsoftware.kryo.serializers.FieldSerializer.create(FieldSerializer.java:228)
    at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:217)
    at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:718)
    at com.twitter.chill.Tuple3Serializer.read(TupleSerializers.scala:60)

NPE on deserialized fn referencing a singleton

Just in case it got missed in the comment of my prior issue, I've created a new issue here.

The following code causes an NPE using the develop branch (checked out a few days ago) with Scala 2.10:

object Test {
    def apply() = {
        println("Test applied")
    }

}

object Main extends Application {
    import com.twitter.chill._
    override def main(args: Array[String]) {
        val TestAsVar = Test
        val bytes = KryoInjection(() => {
                                      println("Anon. fn")
                                      Test()
                                  })
        val x = KryoInjection.invert(bytes)
        println(x)
        x.get match {
            case f : Function0[Unit] => f()
        }
    }
}

The call to Test() causes an NPE. Interestingly, if I replace "Test()" with "TestAsVar()", it works.

mutable.ArrayBuffer gets incorrectly deserialized as a Vector

scala> import com.twitter.chill._
import com.twitter.chill._

scala> import scala.collection._
import scala.collection._

scala> val a = mutable.ArrayBuffer(1,2,3,4,5)
a: scala.collection.mutable.ArrayBuffer[Int] = ArrayBuffer(1, 2, 3, 4, 5)

scala> KryoInjection(a)
res0: Array[Byte] = Array(1, 0, 115, 99, 97, 108, 97, 46, 99, 111, 108, 108, 101, 99, 116, 105, 111, 110, 46, 109, 117, 116, 97, 98, 108, 101, 46, 65, 114, 114, 97, 121, 66, 117, 102, 102, 101, -14, 1, 5, 2, 2, 2, 4, 2, 6, 2, 8, 2, 10)

scala> KryoInjection.invert(KryoInjection(a))
res1: Option[java.lang.Object] = Some(Vector(1, 2, 3, 4, 5))
