Comments (6)

enisoc commented on August 17, 2024

The new "bulk optional fields" syntax got me thinking about this associative list problem again.

If I understand the new syntax correctly, it seems like, inside a struct context, [exprA]: exprB means, "Unify exprB with the value of any field whose field name when unified with exprA is not _|_."

I was thinking that a generalization of this inside a list context might allow me to specify the kinds of constraints I want to apply onto associative lists (specifically thinking of those in many k8s APIs).

The proposed rules would be something like this, where exprA and exprB are both structs:

  1. When encountered inside a list context, [exprA]: exprB means, "Unify exprA & exprB with any element in the list whose value when unified with exprA is not _|_."
  2. When encountered inside a list context, exprA: exprB means, "Unify exprA & exprB with any element in the list whose value when unified with exprA is not _|_ (same as rule 1 so far). In addition, if no existing elements can be unified with exprA, append exprA & exprB as a new element."

I see this as being sort of analogous to setting an element of a map, except the map happens to be structured as an associative list. Since the elements of an associative list are structs, the "key" (exprA) is also a struct in this case. That even lets you define multiple fields, which creates a multi-column primary key (something I mentioned earlier we'd need for k8s APIs).

Some examples:

// Define a container somewhere.
containers: [
  {name: "mycontainer"}: {
    image: "us.gcr.io/my-registry/my-image"
    command: "foo"
  }
]
// Somewhere else, apply constraints on the container by name.
containers: [
  {name: "mycontainer"}: {
    // Apply a constraint on the image for a container.
    image: =~"^us\.gcr\.io/"
    // Apply a constraint (e.g. give it a consistent name) on a particular
    // port, if it exists. There could be a udp port 443 as well, but we
    // won't touch that because we use a multi-column primary key.
    ports: [
      [{protocol:"tcp",containerPort:443}]: {
        name: "https"
      }
    ]
  }
]
// Add another container to the associative list,
// while keeping existing elements (mycontainer).
containers: [
  {name: "othercontainer"}: {
    image: "otherimage"
  }
]
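
The two rules proposed above can be modeled outside CUE. The following is a minimal Python sketch, not CUE's implementation: `unify`, `matches`, `apply_rule` and the `Bottom` exception are hypothetical names, values are restricted to plain dicts and scalars, and constraints such as `=~` are not modeled.

```python
class Bottom(Exception):
    """Raised when two values conflict (a stand-in for CUE's _|_)."""


def unify(a, b):
    """Unify two plain values: dicts merge field by field, scalars must be equal."""
    if isinstance(a, dict) and isinstance(b, dict):
        out = dict(a)
        for key, value in b.items():
            out[key] = unify(out[key], value) if key in out else value
        return out
    if a == b:
        return a
    raise Bottom(f"conflicting values: {a!r} and {b!r}")


def matches(elem, key):
    """True if elem unified with key is not bottom."""
    try:
        unify(elem, key)
        return True
    except Bottom:
        return False


def apply_rule(elements, key, value, append_if_missing):
    """Rule 1 (append_if_missing=False) or rule 2 (append_if_missing=True)."""
    patch = unify(key, value)  # exprA & exprB
    result = []
    matched = False
    for elem in elements:
        if matches(elem, key):
            # May raise Bottom if the element conflicts with exprB.
            result.append(unify(elem, patch))
            matched = True
        else:
            result.append(elem)
    if append_if_missing and not matched:
        result.append(patch)
    return result


# Rule 2 defines two containers; rule 1 then constrains one of them by name.
containers = []
containers = apply_rule(
    containers, {"name": "mycontainer"},
    {"image": "us.gcr.io/my-registry/my-image"}, True)
containers = apply_rule(
    containers, {"name": "othercontainer"}, {"image": "otherimage"}, True)
containers = apply_rule(
    containers, {"name": "mycontainer"}, {"command": "foo"}, False)
```

One consequence this model makes visible: an element that lacks the key field entirely (for example `{"image": "x"}`) still unifies with `{"name": "mycontainer"}`, so under the rules as literally stated it would match as well.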

from cue.

enisoc commented on August 17, 2024

Thanks for starting to brainstorm on this! Overall, this first idea for an approach seems powerful enough at first glance that I can imagine using it to implement k8s strategic merge semantics for lists.

To give you some flavor of the "rabbit hole" of additional complexity I mentioned in the email thread, the next most interesting case after x.name is where the "primary key" (in the relational sense) consists of multiple "columns". For example, we have a list of "port specs" where the primary key is a tuple of (protocol,portnum) where protocol can be "tcp" or "udp".

It seems like your first proposed approach would work for this too if we do something like:

portList: {
    <- "\(x.protocol + ':' + tostr(x.portnum))" : x for x in $
    -> [ x for x in $ ] // same as for single-column PK
}

It's not the most obvious thing, but for most users it would be hidden inside the generated CUE library for k8s API objects and they wouldn't have to think about it.
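
To make the round trip concrete, here is a minimal Python sketch of the same idea: a list of port specs is viewed as a map keyed on `protocol:portnum` and then turned back into a list. The helper names (`port_key`, `to_map`, `to_list`) are hypothetical, and the field names follow the example above rather than any particular k8s API.

```python
def port_key(spec):
    """Build the composite primary key from both columns."""
    return f"{spec['protocol']}:{spec['portnum']}"


def to_map(port_list):
    """List -> map view, rejecting duplicate primary keys."""
    keyed = {}
    for spec in port_list:
        key = port_key(spec)
        if key in keyed:
            raise ValueError(f"duplicate primary key: {key}")
        keyed[key] = spec
    return keyed


def to_list(keyed):
    """Map view -> list; dicts keep insertion order, so list order survives."""
    return list(keyed.values())


ports = [
    {"protocol": "tcp", "portnum": 443, "name": "https"},
    {"protocol": "udp", "portnum": 443},
]
keyed = to_map(ports)
```

Because both columns go into the key, the tcp and udp entries for port 443 stay distinct, and converting back yields the original list.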

The next crazy thing down the rabbit hole is probably the idea that sometimes it's important to preserve the relative order of items in lists that get merged as if they were maps. I'm not sure what CUE currently guarantees, if anything, about the emitted order of fields.

mpvl commented on August 17, 2024

@enisoc: thanks for these insights.

Regarding guarantees of ordering: currently the ordering is based on the order of appearance of a field within the language. This is simple and generally gives nice results, but it does sometimes result in some unfortunate reordering. What I was thinking of supporting for maps is a topological sort, so that the relative ordering of the elements as they appear in each map is preserved. The main idea behind this is to have nicer output; I hadn't thought of it in terms of guarantees and what those would mean for the semantics. It may be okay to guarantee that a merge of two lists yields a certain ordering as long as no cycles are introduced, without introducing this concept in the value lattice. That seems fishy theoretically, but may be a practical conclusion.
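
As an illustration of the topological-sort idea, here is a small Python sketch that merges several key orderings while preserving each one's relative order and reports a conflict when they form a cycle. It is only a model of the approach described here, not CUE's behavior; `merge_order` and the tie-breaking rule (earliest first appearance wins) are assumptions.

```python
def merge_order(*lists):
    """Merge key sequences, preserving the relative order within each one.

    Raises ValueError if the sequences disagree on order (a cycle).
    """
    first_seen = {}   # key -> rank by first appearance, used to break ties
    successors = {}   # key -> keys that must come after it
    indegree = {}     # key -> number of distinct keys that must come before it
    for keys in lists:
        for key in keys:
            first_seen.setdefault(key, len(first_seen))
            successors.setdefault(key, set())
            indegree.setdefault(key, 0)
        for before, after in zip(keys, keys[1:]):
            if after not in successors[before]:
                successors[before].add(after)
                indegree[after] += 1

    # Kahn's algorithm, always emitting the earliest-seen ready key.
    ready = sorted((k for k, d in indegree.items() if d == 0), key=first_seen.get)
    ordered = []
    while ready:
        key = ready.pop(0)
        ordered.append(key)
        for nxt in successors[key]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
        ready.sort(key=first_seen.get)

    if len(ordered) != len(first_seen):
        raise ValueError("orderings conflict (cycle)")
    return ordered


print(merge_order(["a", "b", "c"], ["b", "d", "c"]))  # ['a', 'b', 'd', 'c']
```

Two inputs that disagree, such as `["a", "b"]` and `["b", "a"]`, raise `ValueError`, which corresponds to the "as long as no cycles are introduced" condition.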

I've tried many different approaches for the annotation of strategic merges in the meantime. The main ones are quite unsatisfactory. One notation I'm investigating that has some promise is to give additional constraints to lists. For instance:

[...string]{3}     // a list of strings of length 3
[...string]{<=10}  // a list of strings of at most length 10.

Similarly, we could introduce additional constraints in terms of a strategic merge interpretation, something like

a: [...v1.Object]::{"\(strings.ToCamel(<-kind))" "\(<-metadata.name)": _}
or
a: [...v1.Object]{[strings.ToCamel(<-kind)] [<-metadata.name]}

or whatever notation. This would tell cue that a list encountered at field a should be interpreted as a strategic merge. The <- operator would access the element, allowing one to refer to the element values from which to construct the key.

If only kubernetes objects were specified at the top-level, mixing in additional constraints would be easy. For instance

service <Name>: v1.Service

would then further restrict object kinds of type Service accordingly.

If this is not the case, and we want CUE to mimic a json object stream natively, one could perhaps write

[...v1.Object]{[strings.ToCamel(<-kind)] [<-metadata.name]: (v1&v1beta1)[<-kind]}

where additional constraints for the elements are selected from one of the respective packages. See the recently added cue get go command for how to generate CUE templates from Kubernetes code.

This needs a lot more thought. It means the same data can now be represented as either a map or a list: in raw mode one may want to represent it as a map, and for evaluated output as a list. The topological sort approach needs working out. Also, this must not break associativity, commutativity or idempotence. This means we need to introduce something in lists similar to how integer literals work, where the exact type can't be determined until all information is available for a field. This could be fine, though.
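
The algebraic requirements can be checked on a toy model. The Python sketch below assumes a single hypothetical key field `name` and a simplified `unify` over plain dicts and scalars; it is not CUE's evaluator. It compares merge results through their map view and also shows that the emitted list order depends on operand order.

```python
def unify(a, b):
    """Dicts merge field by field; scalars must be equal."""
    if isinstance(a, dict) and isinstance(b, dict):
        merged = dict(a)
        for k, v in b.items():
            merged[k] = unify(merged[k], v) if k in merged else v
        return merged
    if a != b:
        raise ValueError(f"conflicting values: {a!r} and {b!r}")
    return a


def keyed(elements):
    """Map view of an associative list, keyed on the assumed 'name' field."""
    out = {}
    for elem in elements:
        name = elem["name"]
        out[name] = unify(out[name], elem) if name in out else elem
    return out


def merge(xs, ys):
    """Merge two associative lists; elements with equal keys are unified."""
    return list(keyed(list(xs) + list(ys)).values())


a = [{"name": "x", "image": "i"}]
b = [{"name": "x", "command": "foo"}, {"name": "y"}]
c = [{"name": "y", "image": "j"}]

commutative = keyed(merge(a, b)) == keyed(merge(b, a))
associative = keyed(merge(merge(a, b), c)) == keyed(merge(a, merge(b, c)))
idempotent = keyed(merge(a, a)) == keyed(a)

# The list form is order-sensitive even though the map view is not.
p, q = [{"name": "x"}], [{"name": "y"}]
order_differs = merge(p, q) != merge(q, p)
```

In this model all three properties hold for the map view, while `merge(p, q)` and `merge(q, p)` differ as lists. That is one way to read the suggestion of keeping ordering outside the value lattice.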

So there are a lot of potential issues, but it is worth it. Strategic merges are common in Kubernetes, graph unification is not great at handling lists, and I'm sure this issue is not limited to Kubernetes.

extemporalgenome commented on August 17, 2024

Ideally users would just use the native API of whatever system they work on.

One of the implications of having a better language/tool, like CUE, to manage configuration is that you can of course reduce data duplication. You might use the same CUE data in Kubernetes, Terraform, and local JSON config outputs. In these cases, there often isn't a single native API or data format to target, or even when there is just one format, it can be overly complex and obscure the meaning of the data.

If CUE had generic data mapping/transformation capabilities, perhaps the ideal would then be to shift technology concerns (like Kubernetes) out of the core CUE code and into isolated, output-oriented packages?

jlongtine commented on August 17, 2024

cueckoo commented on August 17, 2024

This issue has been migrated to cue-lang/cue#14.

For more details about CUE's migration to a new home, please see cue-lang/cue#1078.
