Comments (2)

sirthias commented on June 10, 2024

Ok, Philipp, let's take a step back and look at your overall requirements first.

When choosing a serialization approach for an application, the following factors are usually the most important:

  1. Interoperability
    Who is supposed to decode what you encode? Just your own application/system or other systems that are potentially outside of your control?
    Same on the reading side: Are you only decoding your own encodings or do you need to be able to deserialize bytes that other applications (e.g. clients) have encoded?

  2. Durability
    Are the serializations more or less "transient" (e.g. in a client/server protocol interaction) or does your system need to be able to deserialize bytes that have been encoded potentially years prior? In the latter case long-term evolvability / versioning becomes a crucial part of the whole approach!

  3. Efficiency
    Both spatial and temporal efficiency are often requirements, but usually not the foremost priority.

  4. Self-Describability
    IMHO this is often underrated. Protobuf bytes, for example, are completely inscrutable if you don't have the schema information for them.
    When picking a serialization approach it's important to think about how you are going to manage schema evolution in the long term if the encoded format doesn't include structure information. JSON and CBOR are easier in that regard because they describe themselves on the lowest level. That doesn't mean, though, that they contain everything your application needs to properly deserialize them!

Borer implements CBOR and JSON, both of which are properly defined, "standard" formats, which helps with (1) and (2) above.
However, by relying on reflection or full auto-deriving "magic" you give up a lot of control, which can make proper versioning quite hard!
(1) and especially (2) are going to be a challenge in any real application with such an approach.
For example, what happens if you would like to add, remove or rename a field in some class deep down in your model? By relying on reflection it's going to be difficult to maintain the full history of what a field used to be called two years earlier.
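
To make the rename scenario concrete, here is a minimal sketch of a hand-written decoder that keeps accepting an old field name. It is deliberately independent of any library: `Fields`, `Decoder`, `User` and the keys `name` / `displayName` are hypothetical names invented for illustration, not borer API.

```scala
// Hypothetical, library-independent sketch. None of these names are borer API.
object FieldRenameSketch {
  // Stand-in for an already parsed JSON/CBOR map.
  type Fields = Map[String, String]

  final case class User(displayName: String)

  trait Decoder[T] { def decode(fields: Fields): Either[String, T] }

  // Hand-written decoder: by assumption the field was once called "name"
  // and later renamed to "displayName". Both spellings are accepted.
  implicit val userDecoder: Decoder[User] = new Decoder[User] {
    def decode(fields: Fields): Either[String, User] =
      fields.get("displayName").orElse(fields.get("name")) match {
        case Some(value) => Right(User(value))
        case None        => Left("missing field 'displayName' (or legacy 'name')")
      }
  }

  def decode[T](fields: Fields)(implicit d: Decoder[T]): Either[String, T] =
    d.decode(fields)
}
```

With reflection there is no obvious place to put the `orElse` fallback, because the field names are taken from the class as it looks today.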

Regarding (3), borer is quite alright, especially on the CBOR side. But you'll probably be fine with any of the other available Scala libs for JSON as well here.
(4) isn't really an issue with borer, since both CBOR and JSON are "self-describing" in the basic sense.

Now to your Codec[Any] question:
One of borer's strengths is the quite fine control that it gives you over what exactly you'd like to be derived and what you'd like to define manually.
With a Codec[Any] borer cannot help you at all. Derivation is off the table. You are in "define-everything-manually-or-rely-on-reflection-land".
If you really need a Codec[Any] I wouldn't use borer at all, since it doesn't really add much value beyond the raw low-level layer of understanding JSON and CBOR.
And if you don't need CBOR I'd rather go for another serialization lib.
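
As a rough illustration of what "define everything manually" means for `Any`, the sketch below dispatches on the runtime type by hand. The `Encoder` trait here is a hypothetical stand-in, not borer's, and the string escaping is intentionally naive.

```scala
// Hypothetical sketch: a manual encoder for Any, not borer API.
object AnyEncoderSketch {
  trait Encoder[T] { def encode(value: T): String }

  // Every supported runtime type has to be listed by hand, and anything
  // that is not listed only fails at runtime, not at compile time.
  implicit val anyEncoder: Encoder[Any] = new Encoder[Any] {
    def encode(value: Any): String = value match {
      case null       => "null"
      case i: Int     => i.toString
      case b: Boolean => b.toString
      case s: String  => "\"" + s + "\"" // naive: no escaping
      case xs: Seq[_] => xs.map(x => encode(x)).mkString("[", ",", "]")
      case other =>
        throw new IllegalArgumentException(s"no encoding for ${other.getClass.getName}")
    }
  }
}
```

The static type `Any` carries no information the compiler could use to find or derive a codec, so the match is the only place where the encoding logic can live.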

Also, I personally wouldn't mix or stack two different serialization libs when you can also get away with a single one.

So, to sum up:
borer is heavily targeted towards idiomatic Scala, which is a typeclass-based approach to serialization.
The compiler helps where you'd like it to and you get proper compile-time errors for all missing codecs.
No reflection, no runtime errors, no "super-macros" that claim to solve everything automatically and are hard to debug and to properly rein in.
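
The typeclass pattern itself can be sketched in a few lines of plain Scala. This is not borer's actual `Encoder`, only a hypothetical stand-in that shows how the compiler resolves instances and rejects types without one.

```scala
// Hypothetical, library-independent sketch of typeclass-based encoding.
object TypeclassSketch {
  trait Encoder[T] { def encode(value: T): String }

  object Encoder {
    implicit val intEncoder: Encoder[Int] = new Encoder[Int] {
      def encode(value: Int): String = value.toString
    }
    implicit val stringEncoder: Encoder[String] = new Encoder[String] {
      def encode(value: String): String = "\"" + value + "\"" // naive: no escaping
    }
    // An Encoder[List[T]] exists whenever an Encoder[T] exists.
    implicit def listEncoder[T](implicit e: Encoder[T]): Encoder[List[T]] =
      new Encoder[List[T]] {
        def encode(value: List[T]): String =
          value.map(e.encode).mkString("[", ",", "]")
      }
  }

  def toJson[T](value: T)(implicit e: Encoder[T]): String = e.encode(value)

  final case class Foo(id: Int)

  val ok: String = toJson(List(1, 2, 3)) // "[1,2,3]"

  // toJson(Foo(1))
  // The line above would not compile: there is no implicit Encoder[Foo],
  // so the missing codec is reported at compile time.
}
```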

> And have the user supply Encoder[T] and Decoder[T]. This is more hassle for the user because they have to bring their own codec but it comes with a few advantages...

In fact, this is not a hassle at all! It's exactly what I would expect!
I would like to have control over how my type T is supposed to be (de)serialized, because it's a key piece of logic once you think about longer-term evolution outside of toy projects!
And if I really don't care I can simply say

implicit val fooCodec: Codec[Foo] = deriveCodec[Foo]

and have the compiler apply the default logic.
Then, later, I can always switch to any other logic for this specific type (which might be somewhere deep in my model hierarchy), while everything else stays what it is.
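
A small sketch of that "swap one type, leave the rest alone" idea, again in plain Scala rather than borer API. `Money`, `Order` and both encoders are hypothetical, and `defaultMoneyEncoder` merely stands in for whatever a derived codec would produce.

```scala
// Hypothetical sketch: replacing the codec of one nested type only.
object SwapCodecSketch {
  trait Encoder[T] { def encode(value: T): String }

  final case class Money(cents: Long)
  final case class Order(id: Int, total: Money)

  // Stand-in for the "default logic" a derived codec might apply.
  val defaultMoneyEncoder: Encoder[Money] = new Encoder[Money] {
    def encode(m: Money): String = s"""{"cents":${m.cents}}"""
  }

  // Custom logic introduced later: encode Money as a bare number.
  val customMoneyEncoder: Encoder[Money] = new Encoder[Money] {
    def encode(m: Money): String = m.cents.toString
  }

  // The Order encoder only asks for *some* Encoder[Money] and is untouched
  // when the Money encoding changes.
  implicit def orderEncoder(implicit money: Encoder[Money]): Encoder[Order] =
    new Encoder[Order] {
      def encode(o: Order): String =
        s"""{"id":${o.id},"total":${money.encode(o.total)}}"""
    }

  // Encodes an Order with whichever Money encoder is put into implicit scope.
  def encodeWith(moneyEncoder: Encoder[Money])(order: Order): String = {
    implicit val m: Encoder[Money] = moneyEncoder
    implicitly[Encoder[Order]].encode(order)
  }
}
```

Only the instance for `Money` differs between the two calls to `encodeWith`; the code for `Order` stays what it is.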

Embrace the Codec[T] !! :)

from borer.

phdoerfler commented on June 10, 2024

First of all, thanks a lot for your detailed reply!
About your questions:

  1. Only my own application
  2. Data does not need to survive multiple years, it's okay if the format changes from one day to another.
  3. Definitely not a priority here
  4. Very important! That's why I went for JSON

I do agree with all the points you bring up. This is really just a toy library for fun and maybe to help developers visualize what they are storing in their event sourcing journal for a short time.
Once they have seen what they wanted to see, they would be best advised to switch to a different storage backend again.

The event sourcing implementation I'm writing this for, akka persistence, requires an event journal plugin to be able to serialize AnyRef, hence the need for Codec[Any]. I know Borer wasn't made for this and, again, for anything serious I would absolutely prefer hand-written Codecs. But I was hoping I could perhaps coerce Borer into it because I otherwise really like the library.

This being said, I have run into further complications with akka persistence since I wrote this issue. I was hoping I'd find the time to deal with those and then have a more insightful reply. Alas, my spare time has recently been a bit on the short side.
