Giter Site home page Giter Site logo

Comments (7)

Manishearth avatar Manishearth commented on September 15, 2024

I'm somewhat bearish on adding more map abstractions, we have yet to refurbish ZeroMap, and unsafe abstractions are costly to write and maintain.

This can be modeled with a MultiFieldsVarULE, but that results in multiple offset tables. It is better to share the offset table.

I think this is the correct solution here. This approach just costs another couple bytes or so: I don't see how a new abstraction "shares" offset tables.

Now, if we wish to fix that, I am in favor of a newer version of MultiFieldsVarULE that takes a const param that predeclares the number of fields and avoids the len allocation.


I'm not completely against this, to be clear, I'm just worried about code complexity and don't think it's worth it. This abstraction wouldn't be able to share much code with VarZeroVec because its length int will have to be at the very beginning.

There is a good case to be made here that the VarZeroVec code should be refactored to operate on [..offsets, ..data] buffers with an externally-passed length rather than the current [length, ..offsets, ..data]. This would both enable such optimizations without much extra unsafe, and enable MultiFieldsVarULE<2>.

Long run, here is what I want us to do:

  • #3128, which would make all of these pieces more modular
  • Perform the VarZeroVec refactor mentioned above. This could be done with additional parse functions, or with a VarZeroSliceUnbound type that represents the shorter buffer
  • Perform the MultiFieldsVarULE improvement mentioned above
  • Then, holistically consider what map types are worth introducing, and how to do them in ways that reduce unsafe code

This will take time and will definitely not make it in time for 2.0 or for currency stabilization.

(That said, I think it would be nice and possible to prioritize #3128 for 2.0: I'm finishing up Diplomat work and could do that next if people think that's useful)

I am in general not in favor of the pattern of a potential general abstraction coming up in the process of stabilizing a component and getting fast tracked for it: I think we need to get better at thinking of the data model when designing a component. I understand that not all needs are immediately apparent , but this is at least the second time we're proposing a new zerovec abstraction for patterns/currency at a stage I feel is somewhat too late in the process.

I am generally in favor of tweaking existing general abstractions to support a use case, and collecting these needs so that in the long term we can design general abstractions with a proper understanding of the needs.

from icu4x.

sffc avatar sffc commented on September 15, 2024

I'm fine gating this based on the MultiFieldsVarULE const size improvement (#5240).

However, as noted multiple times before (#5230), I am gravely concerned with what's happening with all our finely sliced data markers, to the point that I think we need to take a pause and have a serious look at how we can make it more efficient in terms of size and compile times. There are many reasons why I think we should not graduate any component using finely sliced data markers until we have put in the leg work to improve them. So, my position is that we should block currency/units/compact graduation on fixing this issue.

Note: I'm fairly happy with the size of neo skeleton finely sliced data markers because we use a custom Packed structure (although the compile times are still slower than desired).

from icu4x.

Manishearth avatar Manishearth commented on September 15, 2024

I'm fine gating this based on the MultiFieldsVarULE const size improvement (#5240).

works for me!

So, my position is that we should block currency/units/compact graduation on fixing this issue.

Do we have other timeline pressure on these things? I'm happy to do that if we get the time to do this properly.

from icu4x.

sffc avatar sffc commented on September 15, 2024

So, my position is that we should block currency/units/compact graduation on fixing this issue.

Do we have other timeline pressure on these things? I'm happy to do that if we get the time to do this properly.

I think this doesn't need to be in the 2.0 critical path, but graduating the components is in the 2024 OKRs.

Also, I don't think we need the whole constellation of changes in order to unblock graduation. I would be surprised if the minimal change that reduces stack size and compile times would take more than 2-3 PRs. I'm fine with (and prefer) deferring the Map API improvements to later, since ultimately they should just sit on top of the underlying data types.

from icu4x.

Manishearth avatar Manishearth commented on September 15, 2024

I think this doesn't need to be in the 2.0 critical path, but graduating the components is in the 2024 OKRs.

2024 is still cutting it close I think since we've still got a lot of 2.0 to do. But it's fine if it's just the MultiVarULE changes.

Not in favor of designing a new map type on a tight schedule. I don't think it would be good, and we'd lock ourselves in to the existing model even more.

from icu4x.

sffc avatar sffc commented on September 15, 2024
  • @sffc I'd like to add this issue to the milestone.
  • @Manishearth I do not agree with the solution proposed in the OP.
  • @sffc I want the problem in the milestone, not necesarily the specific solution. I am okay with the framework Manish proposed in a reply to the issue.
  • @younies The proposal blocks graduation of units, currencies, compact, and even duration by dependency. Why not just ship ZeroMap?
  • @Manishearth ZeroMap is not data-compatible with the proposed solutions.
  • @sffc We are acknowledging that finely sliced data markers are special and that we should put extra care into their design.

Seeking consensus on:

  • Finely sliced data markers, defined as a data marker with a large or unbounded number of data marker attributes, should be represented as compactly as possible, generally understood to mean with no more than 1 variable-length structure at the top level of the data struct, which can be defined on a case-by-case basis
  • Components should not be graduated until this condition is met
  • We should endeavor to include this constraint during the design work of a component that makes use of finely sliced data markers
  • Revisit in 2 months.

LGTM: @sffc @younies @robertbastian @Manishearth

from icu4x.

Manishearth avatar Manishearth commented on September 15, 2024

I think a really good proposal that gets maximum space wins and is compatible with future map abstractions would be:

  • We make the VarZeroSliceUnbound abstraction #5378
  • We use it to implement MultiFieldsVarULE<const> (#5240)
  • We use MultiFieldsVarULE<2> with the two fields being a ZeroSlice and VarZeroSliceUnbound and have a tiny amount of unchecked in the wrapper using the keys.len() = values.len() invariant.

from icu4x.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.