Giter Site home page Giter Site logo

Comments (19)

jhthorsen avatar jhthorsen commented on July 17, 2024 1

I've added support for "skip_schema_validation" in jhthorsen/mojolicious-plugin-openapi@673079d

from json-validator.

minusdavid avatar minusdavid commented on July 17, 2024

Actually, maybe this is a case where we just need to use a newer Mojolicious::Plugin::OpenAPI::Perl version, since it almost seems like Mojolicious::Plugin::OpenAPI::Perl 4.02 and JSON::Validator 4.16 might not validate the OpenAPI spec at plugin registration time?

Am I missing something?

from json-validator.

minusdavid avatar minusdavid commented on July 17, 2024

Oh I think 'my $errors = $self->validator->errors;' is doing the validation? That's obscure.

from json-validator.

jhthorsen avatar jhthorsen commented on July 17, 2024

Are you sure it's the validation that takes a long time, and not just loading the spec file and finding potential "$ref"?

What is Mojolicious::Plugin::OpenAPI::Perl ?

That's obscure.

Please provide some sort of constructive suggestion.

from json-validator.

minusdavid avatar minusdavid commented on July 17, 2024

Thanks for the quick response!

Are you sure it's the validation that takes a long time, and not just loading the spec file and finding potential "$ref"?

I am not at all sure. At this point, I'm reviewing other people's code, and I've mostly just traced the slowness to Mojolicious::Plugin::OpenAPI. They were doing an extra load_and_validate_schema outside of Mojolicious::Plugin::OpenAPI and that was killing performance. Someone else suggested the validation being the slowness, and I was following it up, but when you put it like that... the loading/resolving $ref could certainly be the time consuming bit. It is a massive spec file with a lot of references.

Comparing 1.08 and 4.16... it looks like 4.16 might have caching for resolved schemas? I don't understand where the "id" comes from or whether that just applies to URLs.

What is Mojolicious::Plugin::OpenAPI::Perl ?

That must be a typo. Just Mojolicious::Plugin::OpenAPI.

That's obscure.

Please provide some sort of constructive suggestion.

My apologies. I should've said "that seems obscure", as I wasn't sure if I was understanding correctly or not.

Usually, if I see an errors() method, it's called after a method that was supposed to do some work. For instance with DBI it would be $sth->execute($foo) or die $dbh->errstr(). Or "cgi_error()" after upload() in CGI.

When I was reading Mojolicious::Plugin::OpenAPI, I figured $self->validator->errors must be a method to collect the errors from a previous validator method, but it wasn't clear what method that could've possibly been. That's when I looked at the errors method and saw it contained the logic for the resolve and validate in it. I just found that confusing as the behaviour didn't fit with the name of the method. Not a big drama though.

from json-validator.

minusdavid avatar minusdavid commented on July 17, 2024

Looking at all the $ref references and it seems that there are about 410 of them all up.

from json-validator.

minusdavid avatar minusdavid commented on July 17, 2024

What is Mojolicious::Plugin::OpenAPI::Perl ?

In hindsight, I may have been expanding "libmojolicious-plugin-openapi-perl" and left on the last Perl by accident heh...

from json-validator.

minusdavid avatar minusdavid commented on July 17, 2024

My apologies. After more investigation, it seems that the problem is elsewhere in Mojolicious::Plugin::OpenAPI. The loading and validating is actually quite quick.

I think that it actually might be memory related. But I'll open up a new issue on Mojolicious::Plugin::OpenAPI...

from json-validator.

minusdavid avatar minusdavid commented on July 17, 2024

After hacking on the Mojolicous::Plugin::OpenAPI and JSON::Validator libraries directly, I've narrowed down the problem to JSON::Validator::_validate(). (My earlier tests at #246 (comment) were using bundle() which just did loading and resolving not validating...)

When only 1 process is run, it's reasonably fast. However, when you run many processes in parallel, it takes a much longer time to validate the schema.

As I mentioned in my opening comment, I'm using the same schema/spec across many instances, so it's a waste of resources to re-validate it 120 times at once.

That said, creating a checksum for a deep Perl hash is easier said than done.

In my use case, we're loading a static spec/schema file but then hacking the data structure returned by JSON::Validator->bundle, so we can't use a checksum from the actual file either. I suppose in that case... it would be handy if there were a way to add routes to Mojolicious::Plugin::OpenAPI beyond mangling the original schema/spec...)

from json-validator.

minusdavid avatar minusdavid commented on July 17, 2024

Ohh or maybe not.

In load_and_validate_schema(), I can use the following to get a consistent checksum between different Perl runs:

my $bytes = Mojo::JSON::encode_json($spec);
my $digest = Digest::MD5::md5_hex($bytes);

That would be perfect. Admittedly that's for an older JSON::Validator, but the concept would work the same for the new JSON::Validator.

Probably JSON::Validator::Schema->errors would have a check around here: https://metacpan.org/source/JHTHORSEN/JSON-Validator-4.16/lib/JSON/Validator/Schema.pm#L12

(Although JSON::Validator::Schema might be too generic for the check to be there. You'll know more about the implementation specific details.)

from json-validator.

minusdavid avatar minusdavid commented on July 17, 2024

I figure < 1 second to calculate a checksum vs > 30 seconds spent validating would be a big win.

from json-validator.

minusdavid avatar minusdavid commented on July 17, 2024

(Apologies for the high volume of comments and the stream-of-consciousness. Hopefully it helps to clarify the problem and proposed solution.)

from json-validator.

minusdavid avatar minusdavid commented on July 17, 2024

Without the validation, I'm able to restart 120 processes in about 18 seconds with very little load on the server.

I think it's worth having the validation, but if a checksum can be cached on a schema/spec that we know has passed validation... then I think we'd be in a good place.

(Something I might not be considering... what are we validating this schema/spec again... there might need to be more data cached than I initially suspect)

from json-validator.

jhthorsen avatar jhthorsen commented on July 17, 2024

@minusdavid: Would it fix your issue if applied this diff to M::P::OpenAPI?

diff --git a/lib/Mojolicious/Plugin/OpenAPI.pm b/lib/Mojolicious/Plugin/OpenAPI.pm
index 9060943..24a1ae9 100644
--- a/lib/Mojolicious/Plugin/OpenAPI.pm
+++ b/lib/Mojolicious/Plugin/OpenAPI.pm
@@ -21,7 +21,7 @@ sub register {
   $self->validator->allow_invalid_ref(1)                    if $config->{allow_invalid_ref};
   $self->_set_schema_version($config->{version_from_class}) if $config->{version_from_class};

-  my $errors = $self->validator->errors;
+  my $errors = $config->{skip_schema_validation} ? [] : $self->validator->errors;
   die @$errors if @$errors;

   unless ($app->defaults->{'openapi.base_paths'}) {

from json-validator.

minusdavid avatar minusdavid commented on July 17, 2024

Thanks, Jan. I think that's a useful option.

However, I like the idea of the validation cache a bit better, because it mean that you're still doing validation on the schema, but you're only doing it 1 time. If your schema changes, then you validate it again, but just the once.

That being said, with a bunch of parallel processes, there would be a bit of a race, so in practice you'd probably be validating it N times before the first one completes successfully. But that would still be a better scenario than validating all the time for every process.

from json-validator.

jhthorsen avatar jhthorsen commented on July 17, 2024

I'm not going to implement what you request, but I'll consider a PR.

What I could however consider is to add support for passing in a JSON::Validator object to M::P::OpenAPI. That way, you might be able to freeze and store that object and restore it from disk... No idea if that idea will actually work though.

from json-validator.

minusdavid avatar minusdavid commented on July 17, 2024

Fair enough. I was planning on writing a PR, so I'm glad to hear that you'll consider one.

I'm not sure I follow the logic of sharing a JSON::Validator object between different processes. That seems riskier than checking a checksum?

from json-validator.

minusdavid avatar minusdavid commented on July 17, 2024

Actually, thinking about it again, I suppose that jhthorsen/mojolicious-plugin-openapi@673079d should be sufficient.

Since I have the schema before giving it to M::P::OpenAPI, I can do my own checksum calculation and set the configuration option based on my analysis. The caching doesn't have to be done within JSON::Validator at all really.

from json-validator.

minusdavid avatar minusdavid commented on July 17, 2024

Although... I suppose that would man we'd either have to use JSON::Validator separately before M::P::OpenAPI, or set the cache after M::P::OpenAPI has registered itself without dying due to validation errors. I guess that's OK.

from json-validator.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.