Giter Site home page Giter Site logo

Comments (6)

julianpeeters avatar julianpeeters commented on June 25, 2024

Hi Cory, great idea.

Currently AvroTypeProviderMacro uses FileParser.getSchemas(infile), which uses the org.apache.avro.Schema.Parser to parse each file.

I think the problem is that I used a new org.apache.avro.Schema.Parser for each file, instead of using (and re-using) a static one like this SO question recommends.

My time frame to explore this is likely a good amount time from now, but If you'd like to take a shot at it, I've reformatted your example schemas to fit with current tests (thanks for those!), added a test that is currently failing, and pushed it up to a new branch.

If for some reason the static Parser doesn't do the trick, I might look into the maven-avro-plugin to see how they accomplish this (sounds like it can), or start thinking about a simple way to store the schemas or generated classes.

I'm not sure I understand what you mean by "plugin", so maybe this is irrelevant but, "plugin" as in build-tool plugin like an sbt-plugin? I see macros as a build-tool agnostic replacement for plugins (eh, both have other pros and cons), and I've got some sbt-plugin projects that I wish were easier to keep consolidated with their main project. I haven't looked into the use-case at all, but my inclination would be to first look into how we could add a @AvroValidation macro? For example, @AvroTypeProviders is already a macro (I feel like you knew that though, so sorry if I'm off the mark here).

Thanks for the interest and throwing around these good ideas!

from avro-scala-macro-annotations.

cory-p-oncota avatar cory-p-oncota commented on June 25, 2024

I will start by giving a shot at using a single parser. It sounds like this could very well do the trick for this and may in fact be intended behavior for an Avro implementation.

In regards to the plugin idea, maybe it's best to move the conversation to another issue as a feature enhancement or possibly even into Avro's codebase itself.

My current primary use case is based around the fact that Avro currently only supports structural validation (to my knowledge). In my application, we are looking to replace Joi with a language agnostic solution and implementing Avro for serialization and schemata is just the first step. Like Joi or JSON-Schema, I'd love to implement business rules (constraints, dynamic requires, etc) into my models so they can be abstracted from the application layer in consuming microservices.

I see this coming about as some form of mix-in with Avro but I will take a better look at the parser to see if custom fields can actually be supplied and go from there.

Cheers on the contributions and use of macros. Very much looking forward to the ways libraries like this can change my workflow!

from avro-scala-macro-annotations.

julianpeeters avatar julianpeeters commented on June 25, 2024

I'd love to implement business rules (constraints, dynamic requires, etc) into my models so they can be abstracted from the application layer in consuming microservices

Rad.

Very much looking forward to the ways libraries like this can change my workflow

Hehe, same.

from avro-scala-macro-annotations.

cory-p-oncota avatar cory-p-oncota commented on June 25, 2024

Reaching out for thoughts since I hit a bit of a road block in implementing this. Essentially, simply implementing the parser as a property of the FileParser object worked and will successfully load the schemas, however, the change broke the existing tests on the AvroTypeProviderTestNestedSchemaFile schema because the schema has already been loaded when we are trying to generate the MetaData case class.

We reach a "chicken-or-egg" situation because the obvious solution is to check the parser's definition for the fully qualified name via schema.getTypes() prior to add the schema to the parser, however, in order to do so, we must parse it with a temporary parser in order to get the name, effectively hitting the same error as before when working the separate files (the parent record fails on using the temporary parser since order matters).

Any ideas aside from implementing a manual JSON parser to cache/check the name prior to even running through the parser?

from avro-scala-macro-annotations.

cory-p-oncota avatar cory-p-oncota commented on June 25, 2024

I've partially solved the above using a couple of methods together:

  1. FileParser now extends Parser directly in singleton fashion

  2. AvroTypeProviderMacro now caches file names which it has already processed and looks them up from FileParser when already processed in order to get the correct schema definition

  3. SchemaStore is eliminated and schemas generated via AvroRecord are added to the FileParser. This is possible because Parser already implements all features of the SchemaStore.

This commit has the changes thus far, however, I believe we're hitting a point where an opinionated approach is required. The default value tests 15-18 are failing due to the fact that they use AvroTypeProviderTestDefaultValue00 which has already been defined and thus cannot be redefined. Essentially, when using a single parser approach the opinion must be held that one cannot implement a message definition in more than one avsc file. Thoughts?

from avro-scala-macro-annotations.

julianpeeters avatar julianpeeters commented on June 25, 2024

Excellent work. I hadn't realized that my misuse of the parser had led to illegal schema definitions. Unfortunately it also reveals that this feature request should in fact be to Avro, for the same reason. IMO it's a worthwhile feature to add to Avro, and if you feel you're energies could be better spent there, here I'd be grateful to accept a pull request of what you've got so far and fix the tests to conform to Avro's opinion when I get the time.

from avro-scala-macro-annotations.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.