
Size of generated code (k8s-openapi, open issue)

cgwalters commented on July 24, 2024
Size of generated code

from k8s-openapi.

Comments (7)

Arnavion commented on July 24, 2024

Compilation time will not change because of what's being discussed in this issue. That would be #77.


Arnavion commented on July 24, 2024

So just to be clear, your concern about the size is only because you're vendoring, right? Or is there any other reason?


What is the mechanism of deduping?

Type aliases would be complicated because it's not just a matter of discovering duplicates but also of moving the "common" type into some place unaffected by version features. E.g. v1_23::Pod can't just be an alias for v1_22::Pod because v1_22::Pod only exists when the v1_22 feature is enabled, so v1_22::Pod would have to be moved to common::Pod and then both v1_22::Pod and v1_23::Pod would be aliases of that.
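A minimal sketch of that layout, assuming a hypothetical `common` module and a placeholder one-field `Pod` (the real generated type has many more fields, and nothing here is taken from the actual crate source):

```rust
// Hypothetical layout: the shared definition lives outside every version
// module, so it exists regardless of which version feature is enabled.
pub mod common {
    #[derive(Clone, Debug, Default, PartialEq)]
    pub struct Pod {
        // Placeholder field for illustration only.
        pub name: String,
    }
}

// Each version module is only compiled when its feature is on, and simply
// aliases the shared type instead of carrying its own copy.
#[cfg(feature = "v1_22")]
pub mod v1_22 {
    pub type Pod = super::common::Pod;
}

#[cfg(feature = "v1_23")]
pub mod v1_23 {
    pub type Pod = super::common::Pod;
}
```

With this shape, `v1_22::Pod` and `v1_23::Pod` name the same type, and neither alias depends on the other version's feature.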

An easier way would be to just symlink v1_23::Pod's file to v1_22::Pod's file. Also that'll be even more space savings, because the size of the symlink is the length of the file names, which would be smaller than a file that contains a type alias. I know cargo publish supports symlinks in the source, but I don't know if it unsymlinks them when constructing the actual tarball. Do you know?


What should be the basis of deduping? Let's say the crate supports five versions 1.21 through 1.25...

  • If Pod is identical in all versions (for some definition of "identical", say ignoring doc comments as you suggested), then sure it can be deduped.

  • What if it's identical in 1.21 through 1.24 but not 1.25? Is it still deduped for 1.21 through 1.24, while 1.25 keeps its individual file? If yes, what is the smallest set of versions for which we would dedupe? The absolute minimum of two versions?

  • What if it's identical in 1.21 through 1.23 and separately it's identical in 1.24 and 1.25? Do we have two deduped files? If we have to have multiple deduped files for the same type, what is the scheme to prevent them from interfering?

I think the answers are:

  1. Any file which is identical in two versions or more can be deduped, for maximum savings.

  2. The deduped filename can incorporate the version module name of the lowest version. E.g. in the third point above we would end up with v1_21_pod.rs and v1_24_pod.rs.
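The two rules above could be sketched roughly as follows. The function name `plan_dedup` and the in-memory `(version, content)` input are assumptions for illustration, not part of the code generator, and "identical" here means plain string equality:

```rust
/// Groups versions whose generated source for one type is identical and
/// names each shared file after the lowest version in the group.
///
/// `versions` holds `(version_module, file_content)` pairs and is assumed to
/// be sorted from lowest to highest version. Content that appears in only
/// one version is left out: that version keeps its individual file.
pub fn plan_dedup<'a>(
    type_name: &str,
    versions: &[(&'a str, &str)],
) -> Vec<(String, Vec<&'a str>)> {
    // Each entry is (content, versions that share that content), in input order.
    let mut groups: Vec<(&str, Vec<&'a str>)> = Vec::new();
    for &(version, content) in versions {
        if let Some(pos) = groups.iter().position(|(c, _)| *c == content) {
            groups[pos].1.push(version);
        } else {
            groups.push((content, vec![version]));
        }
    }

    groups
        .into_iter()
        // Rule 1: dedupe anything shared by two or more versions.
        .filter(|(_, members)| members.len() >= 2)
        // Rule 2: name the shared file after the lowest version in the group.
        .map(|(_, members)| (format!("{}_{}.rs", members[0], type_name), members))
        .collect()
}
```

For the third case above (identical in 1.21 through 1.23, and separately in 1.24 and 1.25), this yields two shared files, `v1_21_pod.rs` and `v1_24_pod.rs`, which cannot collide because their lowest versions differ.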

What do you think?


cgwalters commented on July 24, 2024

> So just to be clear, your concern about the size is only because you're vendoring, right?

Yes, that's the primary concern. (Which isn't really about disk space, though that matters a little, but about auditability.)

> If Pod is identical in all versions (for some definition of "identical", say ignoring doc comments as you suggested), then sure it can be deduped.

I think we'd need to do some scripting to see just how many types are identical, but I'd just offhand guess it's at least 50%.
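As a rough sketch of what such a script might check, assuming the generated docs are `///` or `//!` line comments and that the file contents have already been read into strings (all function names here are hypothetical):

```rust
/// Drops `///` and `//!` doc-comment lines so that two files which differ
/// only in their documentation compare equal.
pub fn strip_doc_comments(src: &str) -> String {
    src.lines()
        .filter(|line| {
            let trimmed = line.trim_start();
            !(trimmed.starts_with("///") || trimmed.starts_with("//!"))
        })
        .collect::<Vec<_>>()
        .join("\n")
}

/// True when the two sources match once doc comments are ignored.
pub fn identical_ignoring_docs(a: &str, b: &str) -> bool {
    strip_doc_comments(a) == strip_doc_comments(b)
}

/// Fraction (0.0 to 1.0) of `(old, new)` source pairs that are identical
/// apart from doc comments. Returns 0.0 for an empty input.
pub fn identical_fraction(pairs: &[(&str, &str)]) -> f64 {
    if pairs.is_empty() {
        return 0.0;
    }
    let same = pairs
        .iter()
        .filter(|(a, b)| identical_ignoring_docs(a, b))
        .count();
    same as f64 / pairs.len() as f64
}
```

A real run would walk two vendored version directories with `std::fs`, pair files by relative path, and feed the contents to `identical_fraction`.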

> What should be the basis of deduping?

There is an alternative approach that would require more work in the code generator, which is basically to use #[cfg] inside a single common module. I suspect that's nontrivial.
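To illustrate the idea, here is a sketch under assumptions: the `common` module, the `PodSpec` fields, and the rule "this field exists from v1_25 on" are all made up for the example, and it assumes one version feature is enabled at a time:

```rust
pub mod common {
    #[derive(Clone, Debug, Default, PartialEq)]
    pub struct PodSpec {
        // Present in every supported version (placeholder field).
        pub node_name: Option<String>,

        // Hypothetical field that only exists from v1_25 on. Because each
        // version is its own feature, every version that has the field
        // must be listed explicitly.
        #[cfg(any(feature = "v1_25", feature = "v1_26"))]
        pub hypothetical_new_field: Option<String>,
    }
}

// Version modules shrink to re-exports of the single shared definition.
#[cfg(feature = "v1_24")]
pub mod v1_24 {
    pub use super::common::PodSpec;
}

#[cfg(feature = "v1_25")]
pub mod v1_25 {
    pub use super::common::PodSpec;
}
```

The generator would have to compute the `any(...)` list per field, which is likely where most of the extra work lies.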

> I know cargo publish supports symlinks in the source, but I don't know if it unsymlinks them when constructing the actual tarball. Do you know?

Symlinks have a few drawbacks: one is that not all search and IDE tools handle them nicely; another is Windows.

Related to all of the above: It'd probably help overall to just use the latest version of the documentation (I suspect there's 0.5% of cases where this might be misleading, but it's probably worth it to default and then add overrides)


Arnavion commented on July 24, 2024

Okay. One more question: Do you use kube or do you use the API operations that k8s-openapi provides via the api feature?

As part of adding v1.28 support there are a bunch of non-trivial changes that need to be made to the code generator. The change to support extra parameters as mentioned there is already done, but there are more changes needed on top of that, and I'm wondering if it's worth it. Basically every user I know disables the api feature and uses kube instead, so I might just drop API operations from the next release entirely. That would also decrease the size of the crate by a lot.


Arnavion commented on July 24, 2024

v0.20 has been released with the API operations-related code removed. Is it small enough now? If not, we can keep this open for working on a deduping solution.


cgwalters commented on July 24, 2024

> One more question: Do you use kube

We currently use kube, yes.

And, thanks for looking at this. It looks like v0.20 is better, but still really large:

$ du -shc vendor/k8s-openapi/src/*
4.0K    vendor/k8s-openapi/src/byte_string.rs
12K     vendor/k8s-openapi/src/deep_merge.rs
12K     vendor/k8s-openapi/src/lib.rs
8.0K    vendor/k8s-openapi/src/resource.rs
6.6M    vendor/k8s-openapi/src/v1_22
7.0M    vendor/k8s-openapi/src/v1_23
7.0M    vendor/k8s-openapi/src/v1_24
6.4M    vendor/k8s-openapi/src/v1_25
6.6M    vendor/k8s-openapi/src/v1_26
6.8M    vendor/k8s-openapi/src/v1_27
7.1M    vendor/k8s-openapi/src/v1_28
48M     total

Versus previously:

$ du -shc vendor/k8s-openapi/src/*
8.0K    vendor/k8s-openapi/src/api.rs
4.0K    vendor/k8s-openapi/src/byte_string.rs
12K     vendor/k8s-openapi/src/deep_merge.rs
24K     vendor/k8s-openapi/src/lib.rs
8.0K    vendor/k8s-openapi/src/resource.rs
9.7M    vendor/k8s-openapi/src/v1_20
9.8M    vendor/k8s-openapi/src/v1_21
8.2M    vendor/k8s-openapi/src/v1_22
8.5M    vendor/k8s-openapi/src/v1_23
8.5M    vendor/k8s-openapi/src/v1_24
7.8M    vendor/k8s-openapi/src/v1_25
8.1M    vendor/k8s-openapi/src/v1_26
61M     total
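To put a number on "better", a small sketch that compares only the versions present in both listings (v1_22 through v1_26), using the rounded sizes printed by `du` above. The function and constant names are made up for the example:

```rust
/// Percentage by which `after` is smaller than `before`.
pub fn reduction_percent(before: f64, after: f64) -> f64 {
    (before - after) / before * 100.0
}

/// Per-version sizes in megabytes as printed by `du`, for v1_22 through
/// v1_26, the versions that appear in both listings.
pub const BEFORE_MB: [f64; 5] = [8.2, 8.5, 8.5, 7.8, 8.1];
pub const AFTER_MB: [f64; 5] = [6.6, 7.0, 7.0, 6.4, 6.6];

/// Reduction across the shared versions, in percent.
pub fn shared_versions_reduction() -> f64 {
    let before: f64 = BEFORE_MB.iter().sum();
    let after: f64 = AFTER_MB.iter().sum();
    reduction_percent(before, after)
}
```

On these figures the shared versions shrink from about 41.1M to about 33.6M, roughly 18%. The 61M and 48M totals are not directly comparable because the two listings cover different sets of versions.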


fbernier commented on July 24, 2024

I'd like to add that this generated code takes a long time to compile. I have a beefy desktop computer, and k8s_openapi is the slowest crate to compile in an 800-crate project, at 5.49s.

