Comments (7)
I'm somewhat bearish on adding more map abstractions, we have yet to refurbish ZeroMap, and unsafe abstractions are costly to write and maintain.
This can be modeled with a MultiFieldsVarULE, but that results in multiple offset tables. It is better to share the offset table.
I think this is the correct solution here. This approach just costs another couple bytes or so: I don't see how a new abstraction "shares" offset tables.
Now, if we wish to fix that, I am in favor of a newer version of MultiFieldsVarULE that takes a const param that predeclares the number of fields and avoids the len allocation.
I'm not completely against this, to be clear, I'm just worried about code complexity and don't think it's worth it. This abstraction wouldn't be able to share much code with VarZeroVec because its length int will have to be at the very beginning.
There is a good case to be made here that the VarZeroVec code should be refactored to operate on [..offsets, ..data]
buffers with an externally-passed length rather than the current [length, ..offsets, ..data]
. This would both enable such optimizations without much extra unsafe, and enable MultiFieldsVarULE<2>
.
Long run, here is what I want us to do:
- #3128, which would make all of these pieces more modular
- Perform the VarZeroVec refactor mentioned above. This could be done with additional parse functions, or with a VarZeroSliceUnbound type that represents the shorter buffer
- Perform the MultiFieldsVarULE improvement mentioned above
- Then, holistically consider what map types are worth introducing, and how to do them in ways that reduce unsafe code
This will take time and will definitely not make it in time for 2.0 or for currency stabilization.
(That said, I think it would be nice and possible to prioritize #3128 for 2.0: I'm finishing up Diplomat work and could do that next if people think that's useful)
I am in general not in favor of the pattern of a potential general abstraction coming up in the process of stabilizing a component and getting fast tracked for it: I think we need to get better at thinking of the data model when designing a component. I understand that not all needs are immediately apparent , but this is at least the second time we're proposing a new zerovec abstraction for patterns/currency at a stage I feel is somewhat too late in the process.
I am generally in favor of tweaking existing general abstractions to support a use case, and collecting these needs so that in the long term we can design general abstractions with a proper understanding of the needs.
from icu4x.
I'm fine gating this based on the MultiFieldsVarULE const size improvement (#5240).
However, as noted multiple times before (#5230), I am gravely concerned with what's happening with all our finely sliced data markers, to the point that I think we need to take a pause and have a serious look at how we can make it more efficient in terms of size and compile times. There are many reasons why I think we should not graduate any component using finely sliced data markers until we have put in the leg work to improve them. So, my position is that we should block currency/units/compact graduation on fixing this issue.
Note: I'm fairly happy with the size of neo skeleton finely sliced data markers because we use a custom Packed structure (although the compile times are still slower than desired).
from icu4x.
I'm fine gating this based on the MultiFieldsVarULE const size improvement (#5240).
works for me!
So, my position is that we should block currency/units/compact graduation on fixing this issue.
Do we have other timeline pressure on these things? I'm happy to do that if we get the time to do this properly.
from icu4x.
So, my position is that we should block currency/units/compact graduation on fixing this issue.
Do we have other timeline pressure on these things? I'm happy to do that if we get the time to do this properly.
I think this doesn't need to be in the 2.0 critical path, but graduating the components is in the 2024 OKRs.
Also, I don't think we need the whole constellation of changes in order to unblock graduation. I would be surprised if the minimal change that reduces stack size and compile times would take more than 2-3 PRs. I'm fine with (and prefer) deferring the Map API improvements to later, since ultimately they should just sit on top of the underlying data types.
from icu4x.
I think this doesn't need to be in the 2.0 critical path, but graduating the components is in the 2024 OKRs.
2024 is still cutting it close I think since we've still got a lot of 2.0 to do. But it's fine if it's just the MultiVarULE changes.
Not in favor of designing a new map type on a tight schedule. I don't think it would be good, and we'd lock ourselves in to the existing model even more.
from icu4x.
- @sffc I'd like to add this issue to the milestone.
- @Manishearth I do not agree with the solution proposed in the OP.
- @sffc I want the problem in the milestone, not necesarily the specific solution. I am okay with the framework Manish proposed in a reply to the issue.
- @younies The proposal blocks graduation of units, currencies, compact, and even duration by dependency. Why not just ship ZeroMap?
- @Manishearth ZeroMap is not data-compatible with the proposed solutions.
- @sffc We are acknowledging that finely sliced data markers are special and that we should put extra care into their design.
Seeking consensus on:
- Finely sliced data markers, defined as a data marker with a large or unbounded number of data marker attributes, should be represented as compactly as possible, generally understood to mean with no more than 1 variable-length structure at the top level of the data struct, which can be defined on a case-by-case basis
- Components should not be graduated until this condition is met
- We should endeavor to include this constraint during the design work of a component that makes use of finely sliced data markers
- Revisit in 2 months.
LGTM: @sffc @younies @robertbastian @Manishearth
from icu4x.
I think a really good proposal that gets maximum space wins and is compatible with future map abstractions would be:
- We make the VarZeroSliceUnbound abstraction #5378
- We use it to implement
MultiFieldsVarULE<const>
(#5240) - We use
MultiFieldsVarULE<2>
with the two fields being aZeroSlice
andVarZeroSliceUnbound
and have a tiny amount of unchecked in the wrapper using thekeys.len() = values.len()
invariant.
from icu4x.
Related Issues (20)
- Migrate data structs to icu_pattern in 2.0 HOT 3
- `icu_capi` builds a lib instead of a staticlib HOT 2
- C++ API namespace HOT 1
- C/C++ header paths
- Neo date formatter: Options<R> vs R::Options and same for .format HOT 8
- Decide on names for icu_datetime and icu_calendar errors over FFI HOT 1
- Should we adopt a more consistent naming convention for traits and marker types? HOT 28
- LocaleExpander and LocaleDirectionality should use `AsRef` for their type parameter
- GitHub Pages and GCS resource efficiency HOT 11
- Create a way to map a Locale to a default Currency HOT 6
- For Currency Long Formatting: check that the .json data is completed
- Refactor: Move `MeasureUnit` to a Separate Crate HOT 3
- Choose an appropriate name for `MeasureUnit` crate HOT 2
- Consider Adding Specific Formatter for Long Formatting in for currency formatter HOT 2
- Add return type to `Yoke::with_mut` and `Yokeable::transform_mut` HOT 6
- Suggestion: Use `PluralRules` instead of `Count` HOT 4
- Missing `other` Unit Pattern for `sd` Locale in the Data for Long Currency HOT 5
- Better abstractions for splitting lengths out from VarZeroVec
- Improve handling of overlap patterns in semantic datetime
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from icu4x.