
sbt-avrohugger's Introduction

sbt-avrohugger

sbt plugin for generating Scala case classes and ADTs from Apache Avro schemas, datafiles, and protocols.

Install the plugin (compatible with sbt 1.3+)

Add the following line to the file myproject/project/plugins.sbt in your project directory:

addSbtPlugin("com.julianpeeters" % "sbt-avrohugger" % "2.8.3")

Usage

The following tasks and settings are automatically imported to your build:

Tasks:

Name                         Description
avroScalaGenerate            Compiles Avro files into Scala case classes.
avroScalaGenerateSpecific    Compiles Avro files into Scala case classes implementing SpecificRecord.

Compile

Wire the tasks into compile in your build.sbt:

e.g.: Compile / sourceGenerators += (Compile / avroScalaGenerate).taskValue

By default, the plugin looks for Avro files in src/main/avro and generates Scala files in $sourceManaged, e.g., target/scala-3.3.1/src_managed/main/compiled_avro/ (to choose different locations, please see Changing Settings).

Test

Alternatively, or in addition, wire the tasks into the Test config, with Avro files in src/test/avro:

e.g.: Test / sourceGenerators += (Test / avroScalaGenerate).taskValue

Manually

To run the tasks manually, please see Changing Settings or the sbt docs in order to ensure the compiler will be able to find the generated files.

Watch Avro Files

To enable file-watching for avro files, e.g. in ~compile, use:

e.g.: watchSources ++= ((Compile / avroSourceDirectory).value ** "*.avdl").get
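
To also watch .avsc schema files, the glob can be widened; a sketch mirroring the line above (adjust the patterns to the file types you use):

e.g.: watchSources ++= ((Compile / avroSourceDirectory).value ** ("*.avdl" || "*.avsc")).get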

Settings:

Standard Settings

Name                        Default                              Description
avroSourceDirectories       Seq("src/main/avro")                 List of paths containing *.avsc, *.avdl, and/or *.avro files.
avroScalaSource             $sourceManaged/main/compiled_avro    Path for the generated *.scala or *.java files.
avroScalaCustomTypes        Standard.defaultTypes                Customizable type mapping.
avroScalaCustomNamespace    Map.empty[String, String]            Map for reassigning namespaces.

SpecificRecord Settings

Name                                Default                              Description
avroSpecificSourceDirectories       Seq("src/main/avro")                 List of paths containing *.avsc, *.avdl, and/or *.avro files.
avroSpecificScalaSource             $sourceManaged/main/compiled_avro    Path for the generated *.scala or *.java files.
avroScalaSpecificCustomTypes        SpecificRecord.defaultTypes          Customizable type mapping.
avroScalaSpecificCustomNamespace    Map.empty[String, String]            Map for reassigning namespaces.

Changing Settings

Settings for each format's task can be extended/overridden by adding lines to your build.sbt file.

E.g., to change how classes of the SpecificRecord format are generated, use:

Compile / avroSpecificSourceDirectories += (Compile / sourceDirectory).value / "myavro"

Compile / avroSpecificScalaSource := new java.io.File("myScalaSource")

Compile / avroScalaSpecificCustomNamespace := Map("my.example"->"my.overridden.ex", "test.*" -> "wildcarded")

Compile / avroScalaSpecificCustomTypes := {
  avrohugger.format.SpecificRecord.defaultTypes.copy(
    array = avrohugger.types.ScalaVector)
}
  • record can be assigned to ScalaCaseClass and ScalaCaseClassWithSchema (with schema in a companion object)
  • array can be assigned to ScalaSeq, ScalaArray, ScalaList, and ScalaVector
  • enum can be assigned to JavaEnum, ScalaCaseObjectEnum, EnumAsScalaString, and ScalaEnumeration
  • fixed can be assigned to ScalaCaseClassWrapper and ScalaCaseClassWrapperWithSchema (with schema in a companion object)
  • union can be assigned to OptionEitherShapelessCoproduct and OptionalShapelessCoproduct
  • int, long, float, double can be assigned to ScalaInt, ScalaLong, ScalaFloat, ScalaDouble
  • date logical type can be assigned to JavaTimeLocalDate and JavaSqlDate
  • timestamp-millis logical type can be assigned to JavaTimeInstant and JavaSqlTimestamp
  • uuid logical type can be assigned to UUID
  • decimal can be assigned to e.g. ScalaBigDecimal(Some(BigDecimal.RoundingMode.HALF_EVEN)) and ScalaBigDecimalWithPrecision(None) (via Shapeless Tagged Types)
  • protocol can be assigned to ScalaADT and NoTypeGenerated (see Protocol Support)
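
Several of these options can be combined in one assignment; a sketch using types from the list above:

Compile / avroScalaSpecificCustomTypes := {
  avrohugger.format.SpecificRecord.defaultTypes.copy(
    record = avrohugger.types.ScalaCaseClassWithSchema,
    array = avrohugger.types.ScalaVector,
    enum = avrohugger.types.ScalaCaseObjectEnum,
    protocol = avrohugger.types.ScalaADT)
}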

Datatypes

Supports generating case classes with arbitrary fields of the following datatypes: see avrohugger docs - supported datatypes

Testing

Please run the unit tests in src/test/scala with ^ test, and the integration tests in src/sbt-test with ^ scripted sbt-avrohugger/* (or a single suite, e.g., scripted sbt-avrohugger/specific).

Credits

sbt-avrohugger is based on sbt-avro originally developed by Juan Manuel Caicedo, and depends on avrohugger.

Contributors

Marius Soutier
Vince Tse
Saket
Raúl Raja Martínez
Marco Stefani
Sacha Barber
sullis
Alexandre Bertails
Alexander Khotyanov
Brennan Saeta
Jerome Wacongne
Jon Morra
Paul Snively
Diego E. Alonso Blas
Andrew Gustafson
Jonas Grabber
mcenkar
Daniel Lundin
Ryan Koval
Simonas Gelazevicius
Zach Cox
Chris Albright
Fede Fernández
natefitzgerald
Angel Sanadinov

Fork away, just make sure the tests pass before you send a pull request.

Criticism is appreciated.


sbt-avrohugger's Issues

New PR for https://github.com/julianpeeters/sbt-avrohugger/pull/25

Seeing an unexpected error in the scripted tests after merging conflicting #25 and #30, so I reverted #25 and will try #30 solo first. This means #25 must be resubmitted or otherwise dealt with.

[error] source file '/Users/julianpeeters/Dropbox/avrohugger/avrohugger-core/src/sbt-test/projects/GenericCaseObjectEnumSerializationTests/project' could not be found

Bubble up error messages

Currently there's no way of knowing what the problem with your schema is; you just don't see any generated classes. So it'd be nice to see Avro exceptions in the sbt log output.

Cannot resolve AvroConfig

I add the following to my build.sbt file:

sbtavrohugger.SbtAvrohugger.scavroSettings
(stringType in avroConfig) := "String"

And I get the error: "not found: value sbtavro"

Avro error with imports in 0.10.1

Hi, I'm importing AVSC schemas in my AVDL files. This is working fine in 0.9.x, but since 0.10.x, I'm getting the following error:

(modelJVM/avro:generateSpecific) org.apache.avro.SchemaParseException: Can't redefine: com.example.ImportedType

at org.apache.avro.Schema$Parser.parse(Schema.java:932)
    at avrohugger.input.parsers.FileInputParser.getSchemaOrProtocols(FileInputParser.scala:53)
    at avrohugger.input.parsers.FileInputParser$$anonfun$3.apply(FileInputParser.scala:67)
    at avrohugger.input.parsers.FileInputParser$$anonfun$3.apply(FileInputParser.scala:66)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)

AvscFileSorter not taking nullable array into account

Hi,

I just ran into an issue after adding one new nested level in my avsc files (I used version 0.9.1).
It seems that this kind of type

"type": [
        "null",
        {
          "type": "array",
          "items": "model.IdAndLabel"
        }
      ]

does not allow model.IdAndLabel to be taken into account by the AVSCFileSorter.

If the null is removed, the files are treated in the correct order

"type":
        {
          "type": "array",
          "items": "model.IdAndLabel"
        }

I had 3 levels of nested objects, like the following. The father.avsc is processed before idAndLabel.avsc, resulting in the following error:
org.apache.avro.SchemaParseException: Undefined name: "model.IdAndLabel"

idAndLabel.avsc

{
  "type": "record",
  "name": "IdAndLabel",
  "namespace": "model",
  "fields": [
    {
      "name": "id",
      "type": {
        "type": "string"
      }
    },
    {
      "name": "label",
      "type": {
        "type": "string"
      }
    }
  ]
}

father.avsc

{
  "type": "record",
  "name": "Father",
  "namespace": "model",
  "fields": [
    {
      "name": "idsAndLabels",
      "type": [
        "null",
        {
          "type": "array",
          "items": "model.IdAndLabel"
        }
      ]
    }
  ]
}

grandFather.avsc

{
  "type": "record",
  "name": "GrandFather",
  "namespace": "model",
  "fields": [
    {
      "name": "father",
      "type": "model.Father"
    }
  ]
}

Thanks !

Feature: Support Vector for avroScalaCustomTypes

Supporting Vector specifically instead of (or at least in addition to) Seq would be very useful in order to force users to think about run-time semantics for insertion (big-O).

If you want to go whole-hog, any type for which there is an ApplicativePlus.

Any way to extend a trait from a record type's companion object ?

At the moment, when generating a record (I use SpecificRecord, but I guess it is the same for the other formats), the companion object of the case class is also (correctly) generated. So for something like

protocol TestProtocol {

  record Test {
    long id;
    string text;
  }
}

the generated code is

import scala.annotation.switch

case class Test(var id: Long, var text: String) extends org.apache.avro.specific.SpecificRecordBase {
  def this() = this(0L, "")
  def get(field$: Int): AnyRef = {
    (field$: @switch) match {
      case 0 => {
        id
      }.asInstanceOf[AnyRef]
      case 1 => {
        text
      }.asInstanceOf[AnyRef]
      case _ => new org.apache.avro.AvroRuntimeException("Bad index")
    }
  }
  def put(field$: Int, value: Any): Unit = {
    (field$: @switch) match {
      case 0 => this.id = {
        value
      }.asInstanceOf[Long]
      case 1 => this.text = {
        value.toString
      }.asInstanceOf[String]
      case _ => new org.apache.avro.AvroRuntimeException("Bad index")
    }
    ()
  }
  def getSchema: org.apache.avro.Schema = Test.SCHEMA$
}

object Test {
  val SCHEMA$ = new org.apache.avro.Schema.Parser().parse("{\"type\":\"record\",\"name\":\"Test\",\"namespace\":\"com.hbc.product.streams.topology.source\",\"fields\":[{\"name\":\"id\",\"type\":\"long\"},{\"name\":\"text\",\"type\":\"string\"}]}")
}

Is there any way I could specify a trait to be extended by the generated companion object? Something like

object Test extends TestInstances {
  final val SCHEMA$ = new org.apache.avro.Schema.Parser().parse("{\"type\":\"record\",\"name\":\"Test\",\"namespace\":\"com.hbc.product.streams.topology.source\",\"fields\":[{\"name\":\"id\",\"type\":\"long\"},{\"name\":\"text\",\"type\":\"string\"}]}")
}

The use case I have is that this way it would be possible to put type class instances in the implicit scope for the generated types avoiding orphans. Thanks.

Support usage in test config

I'd like to use sbt-avrohugger to generate classes specifically for tests, with the schemas living in src/test/avro. With the current plugin, it seems the only way to do this is to create a separate project for tests that contains the avro files in src/main/avro.

I'm not sure what benefits using a config has for this plugin, and perhaps using prefixed keys would be a better design. Instead of using a config, the keys would be prefixed, e.g., avroGenerateSpecific, and then could be used in another config, e.g., test:avroGenerateSpecific.

There's some notes on this in the sbt docs: http://www.scala-sbt.org/0.13/docs/Plugins-Best-Practices.html#Configuration+advices
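
With current plugin versions this is supported directly, as documented in the Test section of the Usage instructions above:

Test / sourceGenerators += (Test / avroScalaGenerate).taskValue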

When cross-building is enabled, the code is generated only for one version

In my SBT project, my main Scala version is set to 2.11.8, but because I am also using Spark, I set up my project for cross-building for 2.10.6 and 2.11.8.

However, +avro:generate-specific (the plus enables cross-build mode) puts the generated Scala code only in target/scala-2.10/src_managed/main/compiled_avro. I have to copy it manually to the corresponding scala-2.11 folder.

Without the copying, sbt +compile works, but when using the project in Eclipse I have to add the 2.10 folder to my project manually, otherwise it shows a lot of compiler errors.

Is it possible to tell the plugin to output the generated code to both directories?

Schema objects hard to get and reuse

Waiting on a fix in avrohugger:

  1. Move schema val into a companion object (and generate a companion object!).

  2. Have getSchema method reference the static schema in the companion object, rather than instantiate a new Schema object each time.

sbt-avrohugger conflicts with custom source generators

We have a custom source generator in our project, which dumps some build info in project sources:

    sourceGenerators in Compile += Seq(Def.task {
      val file = (sourceManaged in Compile).value / "settings.scala"
      IO.write(file, s"""package com.snowplowanalytics.${name.value}.generated
                         |object ProjectSettings {
                         |  val version = "${version.value}"
                         |  val name = "${name.value}"
                         |  val organization = "${organization.value}"
                         |  val scalaVersion = "${scalaVersion.value}"
                         |}
                         |""".stripMargin)
      Seq(file)
    }.taskValue)

It was working well for a long time, but now (after we added sbt-avrohugger) it fails compile with following error:

[info] Compiling 38 Scala sources to /vagrant/target/scala-2.11/classes...
[error] /vagrant/target/scala-2.11/src_managed/main/ProjectSettings.scala:2: ProjectSettings is already defined as object ProjectSettings
[error] object ProjectSettings {
[error]        ^
[error] one error found
[error] (compile:compileIncremental) Compilation failed
[error] Total time: 23 s, completed Aug 3, 2016 11:32:33 AM

This problem arises under very weird circumstances: on the first run it works fine, but then it starts to fail with the above error, even if I delete the target directory.
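
A possible mitigation, assuming the clash comes from both generators writing under the same $sourceManaged root: give the custom generator a dedicated subdirectory, away from the plugin's compiled_avro output. A sketch (the build_info folder name is illustrative; note that sourceGenerators takes one generator task per +=, so the Seq(...) wrapper around the task is dropped):

    Compile / sourceGenerators += Def.task {
      // write into a dedicated subfolder so the file cannot clash with compiled_avro output
      val file = (Compile / sourceManaged).value / "build_info" / "settings.scala"
      IO.write(file, s"""package com.snowplowanalytics.${name.value}.generated
                         |object ProjectSettings {
                         |  val version = "${version.value}"
                         |}
                         |""".stripMargin)
      Seq(file)
    }.taskValue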

Can't seem to find "avroScalaGenerateSpecific" in build.sbt

I have the following structure in IntelliJ/SBT setup

my folder

  • build.sbt
  • project
    • Deps.sbt
    • plugins.sbt
  • common
    • src
      • main
        • avro
          • userSchema.avsc
  • publisher
  • subscriber

The schema looks like this

{
    "namespace": "com.barber.kafka.avro",
     "type": "record",
     "name": "user",
     "fields":[
         {
            "name": "id", "type": "int"
         },
         {
            "name": "name",  "type": "string"
         }
     ]
}

plugins.sbt looks like this
addSbtPlugin("com.julianpeeters" % "sbt-avrohugger" % "2.0.0-RC10")

And here is my build.sbt file (note that it's a multi-project build)

import sbt.Keys.scalaVersion


lazy val root = (project in file(".")).
  aggregate(publisher, subscriber).
  settings(
    inThisBuild(List(
      organization := "com.barber",
      scalaVersion := "2.12.1",
      resolvers += "Sonatype OSS Snapshots" at "https://oss.sonatype.org/content/repositories/snapshots",
      resolvers += "io.confluent" at "http://packages.confluent.io/maven/"
    )),
    name := "scala_kafka_specific_avro_example"
  )

lazy val publisher = (project in file ("publisher")).
  settings(
    name := "publisher",
    /* sourceGenerators in Compile += (avroScalaGenerateSpecific in Compile).taskValue, */
    libraryDependencies ++= Seq(
      kafka,
      avro,
      avroSerializer,
      logBack)
  ).dependsOn(common)

lazy val subscriber = (project in file ("subscriber")).
  settings(
    name := "subscriber",
      /* sourceGenerators in Compile += (avroScalaGenerateSpecific in Compile).taskValue, */
    libraryDependencies ++= Seq(
      kafka,
      avro,
      avroSerializer,
      logBack)
  ).dependsOn(publisher, common)

lazy val common = (project in file ("common")).
  settings(
    name := "common",
    /* sourceGenerators in Compile += (avroScalaGenerateSpecific in Compile).taskValue, */
    libraryDependencies ++= Seq(
      kafka,
      avro,
      avroSerializer,
      logBack)
  )

And the Deps.sbt file looks like this

import sbt._

object Deps {
  lazy val kafka = "org.apache.kafka" % "kafka_2.11" % "1.1.0"
  lazy val avro =  "org.apache.avro" % "avro" % "1.8.2"
  lazy val avroSerializer = "io.confluent" % "kafka-avro-serializer" % "3.2.1"
  lazy val logBack = "ch.qos.logback" %  "logback-classic" % "1.1.7"
}

The issue I am having is that the build.sbt file doesn't seem to know about the avroScalaGenerateSpecific task.

I know the plugin is working, as I can run this SBT command

sbt common/avroScalaGenerateSpecific

And I get the correctly generated SpecificRecordBase case class implementation in the src_managed folder
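
For the generated classes to be picked up by compile automatically, the commented-out wiring needs to be enabled in the module that owns the Avro files; a sketch for the common project above:

lazy val common = (project in file("common"))
  .settings(
    name := "common",
    // hook generation into this module's compile step
    Compile / sourceGenerators += (Compile / avroScalaGenerateSpecific).taskValue,
    libraryDependencies ++= Seq(kafka, avro, avroSerializer, logBack)
  )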

Example of overriding outputDir?

Heya,
Sorry for the question in the form of a ticket; it's likely just my lack of knowledge and general confusion with sbt, but I am really struggling to override the outputDir.

I saw that sbt-avro has an example to override here:
https://github.com/cavorite/sbt-avro

eg:

seq( sbtavro.SbtAvro.avroSettings : _*)
(stringType in avroConfig) := "String"

Here, stringType is a SettingKey, so this works. However, outputDir is an sbt.Def.Setting and I can't figure out how to get this magic to kick in.

Any help would be appreciated and I can send a PR back to push this to the README.

Thanks!
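
With sbt-avrohugger's own keys, the output directory is an ordinary setting (see Standard Settings above) and can be overridden directly; for example (the my_output name is illustrative):

Compile / avroScalaSource := (Compile / sourceManaged).value / "my_output"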

Specify avsc definitions in multiple directories.

Hi, :)

Is it currently possible to specify and import avsc definitions from multiple directories? I've been poking at the sourceDirectory setting, which I believe currently has to be a single directory, and I wasn't sure if this was due to Avro (avsc file ordering) issues/limitations or a limitation of this project.

Ultimately I'm trying to see if there is a handy way of sharing some core avsc definitions across other packages to allow them to extend them.

Thanks and best,

Nina
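
In current plugin versions, avroSourceDirectories is a Seq (see Standard Settings above), so additional schema locations can be appended; a sketch with a hypothetical shared directory:

Compile / avroSourceDirectories += (Compile / sourceDirectory).value / "shared-avro"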

avroScalaCustomTypes with array adds invalid constructor to case classes from generateSpecific

I have set the following in SBT:

import sbtavrohugger._, formats.specific._
import SpecificAvroSettings._
import AvrohuggerSettings._

(avroScalaCustomTypes in avroConfig) := Map("array" -> classOf[Array[_]])

This generates a case class with a default constructor that attempts to use Nil for Array, which causes a compile error in Scala 2.10. It would be better to use Array.empty or new Array[T](0).

Not able to work with Scala 2.11 when I add this plugin

I am not able to use Scala 2.11 in my test project when I am using this plugin. Below is my build.sbt:

name := "ch19-spark-2"
version := "1.0"
scalaVersion := "2.11.8"
crossScalaVersions := Seq("2.11.8", "2.10.6")
sbtPlugin := true
val sparkVersion = "2.2.0"
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-core" % sparkVersion
)

my plugin.sbt looks like:

addSbtPlugin("com.julianpeeters" % "sbt-avrohugger" % "0.16.0")

The error that I keep getting is:
Conflicting cross-version suffixes in: org.json4s:json4s-ast, org.json4s:json4s-core

missing imports in generated classes

In updating the plugin from 0.10.1 to 0.11.0 I got a compilation error due to missing import statements.

I have a record SR in namespace xxx.model.v2 that refers to types (X, Y, Z) defined in the xxx.model namespace.
In the 0.10.1 generated code I have an import

import xxx.model.{X, Y, Z}

that is missing in 0.11.0.

(those are not the actual class names, but I hope this makes sense).
Thanks!

Regression between 2.0.0-RC19 and 2.0.0-RC20

It seems that there is a regression between 2.0.0-RC19 and 2.0.0-RC20.
I am running sbt 1.3.3.
When I update sbt-avrohugger to 2.0.0-RC20, I lose all the dependencies in Test scope.
It works fine when setting back to 2.0.0-RC19.

How to reproduce:

  • create a new project with sbt via sbt new scala/scala-seed.g8
  • add a file project/plugins.sbt with addSbtPlugin("com.julianpeeters" % "sbt-avrohugger" % "2.0.0-RC20")
  • in sbt shell run test:compile

reverting to 2.0.0-RC19 fixes it.

Correct ordering of avdl files

I have a project consisting of a number of avdl files that depend on each other. On my local Mac I can force everything to work by naming the files in alphabetical order. However, the build fails on an AWS machine. I'm trying to figure out how files are sorted there, but this seems like something that should be handled by the plugin. I see AVSCFileSorter, which looks great for avsc files, but, looking at FileWriter, it appears that avdl files are not sorted at all.

I think a sorter for avdl files would be great.

Thanks

Relative path for (sourceDirectory in avroConfig) fails

Unless I anchor the override to the base directory, e.g.:
(sourceDirectory in avroConfig) := baseDirectory.value / "relative/path/to/my/avro"
I get a
java.util.NoSuchElementException: key not found: relative/path/to/my/avro.avdl

It might be helpful to update the docs to indicate the change in behavior (0.13 seems to work fine with a relative path)

Generate Scala case classes in a custom package inside the src folder of an sbt project

I would like to generate Scala case classes inside a specific directory/package inside the src main folder of an sbt project.

The final goal is to be able to address and use the generated case classes inside the project source code.

I tried to play a bit with the avroScalaSource config, but I didn't manage to force the plugin to generate the case classes inside the directory I need.

For the sake of completeness, this is one of my tries:

avroScalaSource in Compile := new java.io.File(s"${baseDirectory.value}/scala/com/my/custom/package")

P.S: sorry for the silly question
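
A sketch of redirecting output under the regular source tree: the path can be anchored to baseDirectory without the package segments, since the generator creates package directories from the Avro namespace itself (so files for namespace com.my.custom would land under src/main/scala/com/my/custom):

Compile / avroScalaSource := baseDirectory.value / "src" / "main" / "scala"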

Error when adding plugin settings for both 'generate' and 'generate-specific'

I am not sure if it's a real issue, but importing plugin settings for both 'generate' and 'generate-specific' yields this error at compilation:

[error] .../hello/target/scala-2.11/src_managed/main/compiled_avro/com/avroserializer/MyClass.scala:4:MyClass is already defined as case class MyClass
[error] case class MyClass(var field1: Int, var field2: Int, var field3: Float) extends org.apache.avro.specific.SpecificRecordBase {

even though I only ran avro:generate-specific and only one MyClass.scala was generated. The problem was solved when I imported only one setting in the sbt file; here I needed seq( sbtavrohugger.SbtAvrohugger.avroSettings : _*). It might be good to modify the readme slightly to tell people not to add both.

dependency syntax error

build.sbt dependency should be

"com.julianpeeters" %% "avrohugger-core" % "0.11.0"

instead of

"com.julianpeeters" % "avrohugger-core" %% "0.11.0"

Optional BigDecimal doesn't compile

Hi, I got an error when attempting to specify a field as an optional decimal (following the example here). The relevant custom type config is:

avroScalaSpecificCustomTypes in Compile := {
  avrohugger.format.SpecificRecord.defaultTypes.copy(
    enum = avrohugger.types.JavaEnum,
    uuid = avrohugger.types.JavaUuid,
    decimal = avrohugger.types.ScalaBigDecimal,
    timestampMillis = avrohugger.types.JavaTimeInstant,
    // Multiple records within a protocol definition will be placed in the same file inheriting from a base sealed trait
    protocol = ScalaADT
  )
}

Avdl record:

@namespace("payments")
protocol Payment {

  record Payment {
    string id;
    timestamp_ms  createdAt;
    union {decimal(16, 2), null} amount;
  }
}

Plugin version:
addSbtPlugin("com.julianpeeters" % "sbt-avrohugger" % "2.0.0-RC15")

The error that I get when I run sbt ";clean;compile" is:

[error] D:\domain\target\scala-2.12\src_managed\main\compiled_avro\payments\Payment.scala:24:21: value decimalConversion is not a member of object payments.Payment
[error]             Payment.decimalConversion.toBytes(bigDecimal, schema, decimalType)
[error]                     ^
[error] D:\domain\scala-2.12\src_managed\main\compiled_avro\payments\Payment.scala:51:34: value decimalConversion is not a member of object payments.Payment
[error]               BigDecimal(Payment.decimalConversion.fromBytes(buffer, schema, decimalType))
[error]                                  ^
[error] two errors found
[error] (Compile / compileIncremental) Compilation failed

If I replace union {decimal(16, 2), null} with just decimal(16, 2), it works. When I try decimal = avrohugger.types.ScalaBigDecimalWithPrecision instead of decimal = avrohugger.types.ScalaBigDecimal, with or without the union, I get an error similar to:

[error] D:\domain\target\scala-2.12\src_managed\main\compiled_avro\payments\Payment.scala:6:82: not found: type @@
[error] case class Payment(var id: String, var createdAt: java.time.Instant, var amount: @@[scala.math.BigDecimal, ((shapeless.Nat._1, shapeless.Nat._6), shapeless.Nat._2)]) extends
org.apache.avro.specific.SpecificRecordBase {
[error]                                                                                  ^
[error] D:\domain\target\scala-2.12\src_managed\main\compiled_avro\payments\Payment.scala:47:22: not found: type @@
[error]       }.asInstanceOf[@@[scala.math.BigDecimal, ((shapeless.Nat._1, shapeless.Nat._6), shapeless.Nat._2)]]
[error]                      ^
[error] 8 errors found
[error] (Compile / compileIncremental) Compilation failed
[error] Total time: 8 s, completed Jan 19, 2019 6:58:57 PM

What am I doing wrong? Thanks in advance!
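
For comparison, the type-mapping list in Changing Settings above shows decimal configured with an explicit rounding mode in later plugin versions, e.g.:

decimal = avrohugger.types.ScalaBigDecimal(Some(BigDecimal.RoundingMode.HALF_EVEN))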

specific generator doesn't honour avroScalaCustomTypes

Hi. In build.sbt I specified this project:

lazy val root = (project in file("."))
  .settings(sbtavrohugger.SbtAvrohugger.specificAvroSettings: _*)
  .settings((avroScalaCustomTypes in avroConfig) := Map("array" -> classOf[Seq[_]]))
  .settings((sourceDirectory in avroConfig) := new java.io.File("src/main/resources/avro"))
  .settings((scalaSource in avroConfig) := sourceManaged.value / "avro-gen")

After the classes are generated I still have the array-to-List mapping, but I want array-to-Seq. It works fine if I use the usual avroSettings, just not the specific ones.
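
For reference, with the current settings API the same mapping is expressed via defaultTypes.copy (see Changing Settings above); a sketch for the SpecificRecord format:

Compile / avroScalaSpecificCustomTypes := {
  avrohugger.format.SpecificRecord.defaultTypes.copy(
    array = avrohugger.types.ScalaSeq)
}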

Use Avro version from avrohugger

Can we remove the explicit Avro dependency and just use the one from avrohugger? That way you don't have to update it in two places.

Cannot Generate enums other than JavaEnum

Following the readme I'm trying to change the type generated for enum but I cannot make it work. When I add this

avroScalaSpecificCustomTypes in Compile := {
  SpecificRecord.defaultTypes.copy(
    enum = ScalaCaseObjectEnum
  )
}

to my build.sbt the plugin stops generating the classes, even those that don't have enums at all (it only creates the folders in src_managed). If I change it back to use the JavaEnum

avroScalaSpecificCustomTypes in Compile := {
  SpecificRecord.defaultTypes.copy(
    enum = JavaEnum
  )
}

It works fine. I'm using 2.0.0-RC14

Move AVSCFileSorter to avrohugger

AVSCFileSorter is actually generally useful outside the sbt plugin, and I was wondering if you would consider moving it into avrohugger.

Namespace mapping causes ClassCastException upon deserialization

When I try to serialize/deserialize an Avro structure using the SpecificDatumWriter/SpecificDatumReader classes, the deserialization throws a ClassCastException:

Exception in thread "main" java.lang.ClassCastException: org.apache.avro.generic.GenericData$Record cannot be cast to foo.scala.Status

I created a project to reproduce the error. Removing the namespace mapping from build.sbt and changing import foo.scala._ to import foo._ in Main.scala resolves the problem.

I also tried to work around the problem by instantiating the reader with a custom class loader, but it doesn't seem to work (I started from the supposition that the error comes from the class loader not being able to load the correct classes because of the namespace mapping; see also AVRO-1620 and AVRO-1240 for similar errors).

new SpecificDatumReader[Status](Status.SCHEMA$, Status.SCHEMA$, new SpecificData(new ClassLoader() {
    override def loadClass(s: String, b: Boolean): Class[_] = {
      if (s == "foo.Status")
        super.loadClass("foo.scala.Status", b)
      else
        super.loadClass(s, b)
    }
  }))

Missing shapeless import

Hello,
I had an issue today with the generation of a case class with a missing shapeless import.
You can find below the avro files and the generated class
array error.tar.gz

This occurs while using an array type with more than three custom types:

{
  "namespace": "com.sigfox.das.avro",
  "type": "record",
  "name": "RawMessageWithError",
  "fields": [
    {"name": "message", "type": "string"},
    {"name": "meta", "type": {"type": "array", "items": [
      "com.sigfox.das.avro.NumberMetaData",
      "com.sigfox.das.avro.OtherMetaData",
      "com.sigfox.das.avro.StringMetaData"]}}
  ]
}

I did my test with the avroScalaGenerate task. I tried with the 1.1.0 and the 2.0.0-RC10 versions

Thanks
Pierre

Allow for IDL imports from different dirs/classpath

Let me start by complimenting you on the very awesome and handy library!

A tiny request (correct me if I am wrong): right now, it looks like the only way to import another Avro schema in IDL is to have the imported file in the same directory as the importing avdl (e.g. https://github.com/julianpeeters/avrohugger/blob/master/avrohugger-core/src/test/avro/import.avdl).

This ultimately leads to a flat hierarchy where all of my (40+) .avsc and .idl files must be placed in the same folder. However, I would like to be able to group these events/entities nicely in different sub-directories, to improve readability and, most importantly, to clearly define the boundaries of the different sub-domains (the so-called bounded contexts) in our system.

I saw that there is a similar (resolved) request in Apache Avro: https://issues.apache.org/jira/browse/AVRO-971 - not sure if this is going to be of any help.

Thanks in advance and happy holidays! 🎅

404 on maven

Trying to download the plugin using sbt, I've noticed a failure:
sbt.ResolveException: unresolved dependency: com.julianpeeters#sbt-avrohugger;0.5.0: not found at sbt.IvyActions$.sbt$IvyActions$$resolve(IvyActions.scala:294)
Looking at maven, I've noticed I'm getting 404s for both 0.5.0 and 0.5.1:

curl -v http://repo1.maven.org/maven2/com/julianpeeters/sbt-avrohugger/0.5.1/
   Trying 199.27.76.209...
 Connected to repo1.maven.org (199.27.76.209) port 80 (#0)
 GET /maven2/com/julianpeeters/sbt-avrohugger/0.5.1/ HTTP/1.1
 Host: repo1.maven.org
 User-Agent: curl/7.43.0
 Accept: */*

 HTTP/1.1 404 Not Found
 Server: nginx
 Content-Type: text/html
 Via: 1.1 varnish
 Content-Length: 564
 Accept-Ranges: bytes
 Date: Sun, 08 Nov 2015 00:35:12 GMT
 Via: 1.1 varnish
 Age: 8
 Connection: keep-alive
 X-Served-By: cache-iad2142-IAD, cache-jfk1029-JFK
 X-Cache: HIT, HIT
 X-Cache-Hits: 1, 1
 X-Timer: S1446942912.330150,VS0,VE0

Trying to use sbt-avrohugger in Spark project, and unable to find plugin for Scala 2.11

Apache Spark is not yet built for 2.12 (I know...) so I still need to use scala 2.11.X, and there is no sbt-avrohugger for scala 2.11.

My build is looking for a version here:

https://repo1.maven.org/maven2/com/julianpeeters/sbt-avrohugger_2.11_1.0/2.0.0-RC4/sbt-avrohugger-2.0.0-RC4.pom

Which is (as you know) not available... 😬

Conversely, if I try to use Scala 2.12.X, I cannot find my Spark libs.

I've also tried downgrading to Scala 2.10 and sbt 0.13, to no avail; the same outcome - no spark libs compiled for 2.10.X

Would you be willing to publish a version for Scala 2.11?

Regression for `avroScalaGenerateSpecific` since 2.0.0-RC16

Using avroScalaGenerateSpecific to generate 1 avsc file

  • with v2.0.0-RC15 - takes 1-2s
  • with v2.0.0-RC16 and later - takes 60s+; latest RC22 is also affected

Setup: Mac 10.14, sbt v1.3.5, openjdk version "1.8.0_232"

Avro

{
  "namespace": "ApiService",
  "name": "Event",
  "type": "record",
  "fields": [
    {
      "name": "event_type",
      "doc": "Event type",
      "type": {
        "name": "DPEventType",
        "type": "enum",
        "symbols": [
          "CREATE",
          "UPDATE",
          "DELETE",
          "OTHER"
        ]
      }
    },
    {
      "name": "entity",
      "type": {
        "name": "Entity",
        "type": "record",
        "fields": [
          {
            "name": "auto_id",
            "doc": "unique identifier",
            "type": "long"
          },
          {
            "name": "date_created",
            "doc": "Date of creation",
            "type": "string"
          },
          {
            "name": "date_updated",
            "doc": "Date of update",
            "type": "string"
          },
          {
            "name": "flags",
            "doc": "Bitmask of flag capabilities",
            "type": "long"
          }
        ]
      }
    }
  ]
}

Support for top level type definitions

Our company has a set of defined avro schemas which we're trying to use avrohugger on.

In these avro schemas, they have defined a top-level schema for fixed decimals:

{
    "type": "fixed",
    "size": 8,
    "namespace": "com.company.namespace",
    "name": "CustomDecimal",
    "logicalType": "decimal",
    "precision": 18,
    "scale": 6
}

This type is then referenced throughout other avro schemas like so:

{
    "namespace": "com.company.namespace.other",
    "type": "record",
    "name": "Operation",
    "fields": [
        {
            "name": "OperationType",
            "type": "com.company.namespace.other.OtherRecord"
        },
        {
            "name": "OperationAdjustment",
            "type": "com.company.namespace.CustomDecimal"
        },
        {
            "name": "OperationMode",
            "type": "com.company.namespace.other.ModeRecord"
        },
        {
            "name": "OperationValue",
            "type": "com.company.namespace.CustomDecimal"
        }
    ]
}

This allows us to define one fixed standard for decimal precision and scale across multiple records as we build our library of company known types.

Unfortunately, avrohugger does not yet have support for top-level types. I'm not sure how best to represent them, but my guess right now would be to define them as a Scala type alias, i.e., if avrohugger sees the top-level Avro schema above, it would generate something like the following:

package com.company.namespace
type CustomDecimal = BigDecimal

How to deal with two classes with the same name

Hi,

first of all: great plugin! I'm trying to use it in the context of generating case classes for objects stored in Kafka messages. I have no direct way of modifying the schema, and one of those schemas has a nested record with the same name as the root record. Like so:

{
  "type" : "record",
  "name" : "Order",
  "namespace" : "my.namespace1",
  "fields" : [ {
    "name" : "id",
    "type" : "string"
  }, {
    "name" : "timestamp",
    "type" : "long"
  }, {
    "name" : "source",
    "type" : "string"
  }, {
    "name" : "type",
    "type" : "string"
  }, {
    "name" : "payload",
    "type" : {
      "type" : "record",
      "name" : "Order",
      "namespace" : "my.namespace2",
      "fields" : [ 
         // some fields
      ]
    }
  } ]
}

When I use your plugin to generate the Scala class hierarchy, I end up with a top-level case class that looks like this:

/** MACHINE-GENERATED FROM AVRO SCHEMA. DO NOT EDIT DIRECTLY */
package my.namespace1

import my.namespace2.Order

case class Order(id: String, timestamp: Long, source: String, `type`: String, payload: Order)

As you can see, the imported my.namespace2.Order shadows the definition of the case class my.namespace1.Order itself.

Any idea how to deal with this kind of situation?

Missing support for logicalType, and a limit on the number of fields

I found two issues:

  1. If I write more than 254 fields, I hit a JVM limit, because of the constructor that treehugger generates.
  2. The plugin lacks support for logicalType, which leaves the plugin without two important types (UUID and BigDecimal).

Empty record causes a case class to be generated without parentheses (compile-time error)

Hey Julian,

I'm trying to model a Calculator using event sourcing. I want to add a Reset event that takes no class parameters (just case class Reset()), so I use an empty record.

Here is my avdl

@namespace("com.experiments.calculator")
protocol MyApplication {
    record Added {
        int value;
    }

    record Subtracted {
        int value;
    }

    record Divided {
        int value;
    }

    record Multiplied {
        int value;
    }

    record Reset {}
}

All generated case classes work, with the exception of Reset. Instead of generating case class Reset(), it generates case class Reset, which is invalid in Scala 2.11.
I'm most likely doing something wrong so is there a better way to go about this?

I'm using the sbtavrohugger.SbtAvrohugger.avroSettings setting

Also, one more question: I want these generated case classes to inherit from an Event trait. Is it possible to do so via the avdl or other settings?

Cheers and thanks for everything
Cal

Doesn't mangle/backtick reserved words

Reproduce with the following file:

protocol test {
  record Test {
    boolean `public`;
  }
}

Get the error:

java.lang.UnsupportedOperationException: `public` is a reserved keyword and cannot be used as field name

RC16 breaks `Test` config of sub-modules

https://gitter.im/julianpeeters/avrohugger?at=5d3a1688cfdf7d0312e60c92

hello! We’re starting to use sbt-avrohugger. The plugin itself works great to generate scala. However, I’m running into a confusing error.
If I start with a clean, tests-passing tree of our master branch, without the plugin, then I add sbt-avrohugger to plugins.sbt with no other changes, suddenly the tests in my submodule are unable to resolve any dependencies
that is, literally the diff is just the addSbtPlugin and nothing else, and suddenly tests are not happy
Any ideas what could be going on here? I’m using 2.0.0-RC16 if that helps
happy to open a ticket too but figured I could try here first
repro:
sbt the-submodule/clean the-submodule/test succeeds
Add addSbtPlugin line to plugins.sbt (there are multiple other plugins already there, working fine)
sbt the-submodule/clean the-submodule/test fails with no dependencies seemingly able to resolve
(i.e. even scalatest isn’t there, and we get value should is not a member of String for a ”foo” should “bar” in {)

Nick Aldwin @NJAldwin 14:20

A protocol file (.avdl) with only imports will fail to generate

I was getting the following error:

[error] .../ProtocolName.scala:4: expected class or object definition
[error] ()
[error] ^
[error] one error found
[error] (compile:compileIncremental) Compilation failed

An invalid file with the name of the protocol is generated. It contains only the namespace and the problematic parentheses:

/** MACHINE-GENERATED FROM AVRO SCHEMA. DO NOT EDIT DIRECTLY */
package com.example

()

Using bin/avro-tools-1.8.1.jar generates the protocol properly (for example, to convert to .avsc), so it's not a syntax issue.

Then I added a dummy error, to make sure the protocol contained more than just imports. After an sbt clean, the protocol generated properly:

  /** Include an dummy error, otherwise sbt-avrohugger will fail to generate the protocol */
  error DummyError {
    string message;
  }

I'm using version 0.13.0.

Additional empty class when generating Enum in SpecificRecord format

The following .avdl schema:

@namespace("namespace.avro")
protocol EnumProtocol {
    enum MyEnum {
        VALUE1,
        VALUE2
    }
}

using the following settings:

sourceGenerators in Compile += (avroScalaGenerateSpecific in Compile).taskValue

is generated correctly, but I also get an EnumProtocol.scala file in the same package with the following body:

/** MACHINE-GENERATED FROM AVRO SCHEMA. DO NOT EDIT DIRECTLY */
package namespace.avro

import scala.annotation.switch

which causes the compiler to produce some warnings:

[warn] Found names but no class, trait or object is defined in the compilation unit.
[warn] The incremental compiler cannot record the dependency information in such case.
[warn] Some errors like unused import referring to a non-existent class might not be reported.

Is this behavior known?

Missing dependency in case of specific-generator

The docs don't mention it, but to be able to use the specific generator I had to add

libraryDependencies += "org.apache.avro" % "avro" % "1.8.1"

to the build.sbt. Is this normal? If so, I'd like to submit a patch for the README.md.
