
Comments (77)

zbraniecki avatar zbraniecki commented on June 8, 2024 6

Intl.NumberFormat.prototype.formatToParts is now shipped in two engines - SpiderMonkey and V8 - behind a flag.
I would like us to request Stage 4 at the next TC39 meeting.

from ecma402.

zbraniecki avatar zbraniecki commented on June 8, 2024 5

Intl.NumberFormat.prototype.formatToParts reached Stage 4 at today's TC39 meeting. I believe once we merge the PR, we can close this issue! :)

zbraniecki avatar zbraniecki commented on June 8, 2024 2

Intl.DateTimeFormat.prototype.formatToParts has been exposed to the Web in today's Firefox - http://hg.mozilla.org/mozilla-central/rev/3f47a92541c8

zbraniecki avatar zbraniecki commented on June 8, 2024 2

Hi all,

@littledan pointed out that we failed to document the rationale for the decision to add NumberFormat.prototype.formatToParts alongside the DateTimeFormat one.

We did talk about it quite a bit during TC39 meetings, and showed examples as we were requesting the stage advancements, but never wrote down the rationale, so here's my attempt to capture the spirit of it for the record:

  1. Rich formatting for numbers using map/reduce model

At the beginning, in September 2015, we discussed several approaches to allow for internationalization-friendly rich formatting of the data formatted by the Intl formatters.

It was fairly clear to all of us (@ericf, @caridy, @stas, @srl295 and me) that we're aiming at the most simple, chainable, solution that will allow users to achieve the simple, most common, tasks easily, while allowing more complicated ones if needed (with accompanying complexity).

The feature request was based on my experience of working on Firefox OS where we encountered a number of UX/design requirements for rich formatting including:

- Dates (ex. "May <strong>05</strong> 2014")
- Time (ex. "09:49 <i>pm</i>")
- DateTimes (ex. "<strong>Monday</strong>, 09/12/2015")
- Numbers (ex. "1 000.<strong>35</strong>")
- Currencies (ex. "<strong>EUR</strong> 35")
- Percent (ex. "35.4 <strong>%</strong>")

The initial focus, due to the requirements of my project, was primarily on the DateTimeFormat API, but alongside that, @stasm wrote the spec proposal for the corresponding NumberFormat API.

On October 30th 2015, @ericf came up with an API that serves the need elegantly in our opinion allowing for any call to format function to be replaced with a chained formatToParts/map/reduce sequence.

  2. Cohesive API

Although secondary to the first reason, I believe it's worth bringing up the principle of least astonishment.
formatToParts is often requested for formatting dates and units, and @caridy made the decision to provide formatToParts to all formatters.
I believe that there is value in covering every Intl API that has a format function with a formatToParts counterpart.

Known limitation:

The formatToParts/map/reduce chain for NumberFormat comes with a particular challenge that other APIs, either existing (DateTimeFormat) or planned (ListFormat, UnitFormat, RelativeTimeFormat, DurationFormat), do not share.
In particular, it formats a full number that has components. For example, "1 000 000.00" can be formatted at two levels of granularity: into ["1", "000", "000", ".", "00"] or into ["1 000 000", ".", "00"].
Each one makes a different thing easy. The former makes it easy to style each component of the number; the latter makes it easy to format the whole integer part of the number as one token.

The difference is that the former tokenization still allows styling the whole integer, although in a slightly awkward way, while the latter does not allow styling each component.

ICU provides an API that works around this challenge by producing an overlapping list of tokens, where you can get start/end positions for both the whole integer and its components.
The challenge with it is that it cannot be used to map/reduce the list into a string.

In the end, we (@srl295, @caridy, @ericf, @stasm and me) came to the conclusion that, while such an API may be useful, it's a separate API and should not use the formatToParts name.
We see value in getting formatToParts for both of the reasons above into the spec, and if there's interest in getting an overlapping API just for NumberFormat due to its unique use cases, we should explore it separately.
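
To make the two levels of granularity concrete, here is a minimal sketch using the shipped Intl.NumberFormat.prototype.formatToParts. The helper mergeIntegerParts is hypothetical (it is not part of Intl); it only illustrates that the coarse view can be derived from the fine-grained one, while the reverse would not be possible.

```javascript
// Fine-grained view: one part per digit group and per separator.
const nf = new Intl.NumberFormat('en-US', { minimumFractionDigits: 2 });
const fineParts = nf.formatToParts(1000000);

// Hypothetical helper (not part of Intl): merge every run of consecutive
// 'integer' and 'group' parts into a single 'integer' part.
function mergeIntegerParts(parts) {
  const merged = [];
  for (const { type, value } of parts) {
    const last = merged[merged.length - 1];
    const isIntegerish = type === 'integer' || type === 'group';
    if (isIntegerish && last && last.type === 'integer') {
      // Extend the integer run that is already open.
      last.value += value;
    } else {
      merged.push({ type: isIntegerish ? 'integer' : type, value });
    }
  }
  return merged;
}

const coarseParts = mergeIntegerParts(fineParts);

console.log(fineParts.map(p => p.value));   // [ '1', ',', '000', ',', '000', '.', '00' ]
console.log(coarseParts.map(p => p.value)); // [ '1,000,000', '.', '00' ]
```

Both arrays still join back into the same string as nf.format(1000000), which is the property the map/reduce model relies on.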

I hope it helps!

stasm avatar stasm commented on June 8, 2024 2

I filed #160. I had to refresh my memory on what happened to the original PR a year ago. If the summary in #160 is off, please correct me.

caridy avatar caridy commented on June 8, 2024 2

two years, yay!

merged fe7b0a4

zbraniecki avatar zbraniecki commented on June 8, 2024 1

Thanks for feedback @jungshik !

With the above tokenization, it's hard to do that. I'm not sure how to address this issue in a generic/locale-independent way.

My take on that is that it's unfortunately not possible to optimize the API for those cases. It would be on the userland library that will use the API to add finetuning like that.

Similar situation is with NumberFormat -

PLN 1,000,000.20
0:currency:'PLN'
1:integer:'1'
2:group:', '
3:integer:'000',
4:group:', '
5:integer:'000'
6:decimal:'.'
7:fraction:'20'

One could argue that you may want to format the whole number, while we return groups of integers, but I believe that this API should stay generic and let the mapping handle special cases, like:

let nf = new Intl.NumberFormat('en-US', {
  style: 'currency',
  currency: 'PLN'
});
let parts = nf.formatToParts(1000000.20);

let insideInteger = false;

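let fmtstr = parts.map(part => {
  switch(part.type) {
    case 'integer':
      // Open the <b> tag on the first integer group only.
      if (!insideInteger) {
        insideInteger = true;
        return `<b>${part.value}`;
      }
      // Later integer groups intentionally fall through to 'group'.
    case 'group':
      return part.value;
    default:
      // The first part after the integer closes the <b> tag.
      // (A number with no part after the integer would leave it unclosed.)
      if (insideInteger) {
        insideInteger = false;
        return `</b>${part.value}`;
      }
      return part.value;
  }
}).join('');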

A similar thing could be done for your case with weekdays:

let f = new global.IntlPolyfill.DateTimeFormat('ko', {
  year: 'numeric',
  month: 'numeric',
  day: 'numeric',
  weekday: 'short',
  hour: 'numeric',
  minute: 'numeric',
  second: 'numeric'
});
let parts = f.formatToParts(Date.now());

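let fmtstr = parts.map((part, i) => {
  switch (part.type) {
    case 'literal':
      // Open the bold run at a "(" that directly precedes the weekday...
      if (parts[i+1] && parts[i+1].type === 'weekday') {
        return part.value.replace('(', '<b>(');
      }
      // ...and close it at the ")" that directly follows the weekday.
      if (parts[i-1] && parts[i-1].type === 'weekday') {
        return part.value.replace(')', ')</b>');
      }
      // Any other literal passes through unchanged (no fall-through).
      return part.value;
    case 'weekday':
      // The preceding literal already opened <b>, so emit the bare value.
      if (parts[i-1] && parts[i-1].value.includes('(')) {
        return part.value;
      }
      return `<b>${part.value}</b>`;
    default:
      return part.value;
  }
}).join('');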

zbraniecki avatar zbraniecki commented on June 8, 2024 1

We're waiting for a second vendor to step in now. @jungshik, @littledan - do you know if there's a chance to get this into V8, even in an experimental form, so that Google would be comfortable recommending it for Stage 4?

The last call to get this into 4th edition is probably at January's TC39 meeting.

zbraniecki avatar zbraniecki commented on June 8, 2024 1

With https://hg.mozilla.org/integration/autoland/rev/353baf72f789 landed, SpiderMonkey now exposes both without any flag.

caridy avatar caridy commented on June 8, 2024

I'm not sure about this at first glance, but it seems that we have some similarities between what you describe here and what a tagged template string does (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/template_strings#Tagged_template_strings), which ultimately give you more control over the formatting of the final output. Maybe we should think in those terms instead of introducing yet another template form.

Aside from that, having the custom format in the constructor prevents you from reusing the formatter in multiple places if the custom format is different.

zbraniecki avatar zbraniecki commented on June 8, 2024

I agree. I didn't spend too much time thinking about what the API should look like, and I'd be happy to reuse template strings as long as that allows for token formatting in datetime string construction.

You're also correct about format in constructor, but I would say that it may be a good tradeoff. Formatters should be consistent and if a customer needs to produce two different strings according to different formattings, he may need two different formatters.
But if it's easier to add a format template string as an argument for formatter.format, I'm totally fine with that.

srl295 avatar srl295 commented on June 8, 2024

ICU (and Java) use a FieldPosition construct. perhaps something like this:

var formatter = Intl.DateTimeFormat(navigator.languages, {
  hour: 'numeric',
  minute: 'numeric'
});
var fieldPosition = {};
var string = formatter.format(new Date(), fieldPosition);
// string = "12:34"
// fieldPosition = { hour: [ 0,1], minute: [4,5] };

Not sure what the right idiom is, but that's conceptually the data you might want. Then you can do whatever with the result.
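
For comparison with the parts-based API discussed elsewhere in this thread, the same kind of position data can be derived from an array of {type, value} parts. This is only a sketch: partsToPositions is a hypothetical helper, the sample parts are hard-coded for illustration, and the ranges are end-exclusive UTF-16 offsets (unlike the inclusive ranges in the comment above).

```javascript
// Hypothetical helper: derive FieldPosition-like index ranges from parts.
// Ranges are [begin, end) in UTF-16 code units of the joined string.
function partsToPositions(parts) {
  const positions = {};
  let offset = 0;
  for (const { type, value } of parts) {
    if (type !== 'literal') {
      // Note: a repeated type would overwrite the earlier entry.
      positions[type] = [offset, offset + value.length];
    }
    offset += value.length;
  }
  return positions;
}

// Hard-coded sample corresponding to the string "12:34".
const sampleParts = [
  { type: 'hour', value: '12' },
  { type: 'literal', value: ':' },
  { type: 'minute', value: '34' },
];

console.log(partsToPositions(sampleParts)); // { hour: [ 0, 2 ], minute: [ 3, 5 ] }
```

Going the other way (from positions back to parts) requires slicing the string, which is where index bookkeeping becomes fragile once markup is inserted.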

zbraniecki avatar zbraniecki commented on June 8, 2024

Awesome!
The API feels a bit awkward for JS, but it does solve the problem.

@caridy - do you have any preference on what the API should look like? I definitely like the formatting to be on formatter.format, but I'm not sure if it's better to have the second argument alter the returned string or stay close to the ICU/Java implementation and provide additional information that allows the user to modify the string later.

caridy avatar caridy commented on June 8, 2024

let me ask around tomorrow, and see if someone has a good idea about the API. I'm ok with a second argument, but I'm not ok mutating that argument as proposed above.

zbraniecki avatar zbraniecki commented on June 8, 2024

Ok, so would you prefer the approach proposed by @srl295 - we return the start and end indexes of the tokens?

zbraniecki avatar zbraniecki commented on June 8, 2024

@srl295 , @caridy : I wrote a patch proposal for the spec to add fieldPosition to FormatDateTime function.
https://github.com/tc39/ecma402/compare/master...zbraniecki:dtformattokens?expand=1

Can you tell me how it looks and what the next step is?

I need to get something in our platform soon and would like to make it in line with the committee thinking about this feature.

zbraniecki avatar zbraniecki commented on June 8, 2024

And here is the patch for Intl.js polyfill - https://github.com/andyearnshaw/Intl.js/compare/master...zbraniecki:dtformattokens?expand=1

caridy avatar caridy commented on June 8, 2024

First things first: we need to agree on the API. I know I dropped the ball on this one since it wasn't at the top of my list, but I can certainly spend some time on this API this week. Please add a comment here with the exact change (minimum amount of text) so we can discuss it; reading the spec diff :) to understand the proposal is not optimal.

zbraniecki avatar zbraniecki commented on June 8, 2024

Sure!

The proposal is to add an optional, second argument for Intl.DateTimeFormat.format(date, (bool) fieldPosition). If added, the return value is an object with two fields - string and fieldPosition where fieldPosition is an object matching ICU FieldPosition type [0].

Example:

var f = Intl.DateTimeFormat('ar', {
  hour: 'numeric',
  minute: 'numeric',
  hour12: true
});

var res = f.format(date, true);
// { string: '٤:١٩ م',
//  fieldPosition:
//   { hour: { beginIndex: 0, endIndex: 1 },
//     minute: { beginIndex: 2, endIndex: 4 },
//     dayperiod: { beginIndex: 5, endIndex: 6 } } }

I considered using a separate function, but felt like it's easier to extend what FormatDateTime returns.
Let me know what you think.

[0] http://icu-project.org/apiref/icu4c/classicu_1_1DateFormat.html#a620a647dcf9ea97d7383ee1efaf182d1

zbraniecki avatar zbraniecki commented on June 8, 2024

Oh, and the reason I went for beginIndex and endIndex is because that's what ICU returns. I'm totally ok converting it to an array like in @srl295 proposal.

caridy avatar caridy commented on June 8, 2024

@zbraniecki few notes:

  • fieldPosition structure seems fine.
  • string is probably no good, maybe value, or result, or just the toString() version of the object?
  • boolean arguments are very very ambiguous, and we have been trying to escape that trap for all new APIs.
  • maybe a separate method instead of reusing format() because that one is focused on producing a string representation of the date value.

/cc @ericf @rwaldron can you guys chime in?

zbraniecki avatar zbraniecki commented on June 8, 2024

Yeah, as I'm playing with it, I am changing my mind, how about:

var f = Intl.DateTimeFormat('ar', {
  hour: 'numeric',
  minute: 'numeric',
  hour12: true
});

f.format(date); //  ٤:١٩ م

var res = f.formatWithPosition(date);
// { value: '٤:١٩ م',
//  fieldPosition:
//   { hour: { beginIndex: 0, endIndex: 1 },
//     minute: { beginIndex: 2, endIndex: 4 },
//     dayperiod: { beginIndex: 5, endIndex: 6 } } }

Alternatively we could do format and pattern methods, but that would require user to launch the same code twice - once to format, and second time to get the pattern (I don't think retrieving a pattern without retrieving a formatted string is going to be useful), so formatWithPosition sounds more applicable.

zbraniecki avatar zbraniecki commented on June 8, 2024

Here's Intl.js polyfill branch with the latest proposal: https://github.com/zbraniecki/Intl.js/tree/dtformattokens

zbraniecki avatar zbraniecki commented on June 8, 2024

For C++, I'm using udat_formatForFields [0] and it returns more field types than we recognize in Intl[1].

My suggestion is to map it to field names this way:

{
  0: 'era',
  1: 'year',
  2: 'month',
  3: 'day',
  4: 'hour',
  5: 'hour',
  6: 'minute',
  7: 'second',
  9: 'weekday',
 14: 'dayperiod',
 15: 'hour',
 16: 'hour',
 17: 'timeZoneName',
}

and ignore the rest for now.
In the future we may want to add things like timeSeparator and millisecond.

[0] http://www.icu-project.org/apiref/icu4c/udat_8h.html#a4bc9d9661c115dcb337803bc89730b3a
[1] http://bugs.icu-project.org/trac/browser/icu/trunk/source/i18n/unicode/udat.h#L488

zbraniecki avatar zbraniecki commented on June 8, 2024

@caridy - what do you think about formatWithPosition?

zbraniecki avatar zbraniecki commented on June 8, 2024

Here's the proposal for ecma402 spec patch to add this: https://github.com/zbraniecki/ecma402/tree/dtwithposition

Opinions? @caridy , @rwaldron , @srl295 , @domenic, @rxaviers ?

It's kind of top priority for me because without that our platform has very hacky solution to style tokens in time strings and we do this in multiple places. The result is that in some locales our UI is broken and I'd like to get this feature landed for the next release.

ericf avatar ericf commented on June 8, 2024

@zbraniecki, @caridy and I spent some time discussing this today. The first thing is we both agree that the ability to do "rich-text" formatting is an important feature. I had the same initial thought as @caridy about how this feels similar to a template literal tag function… but after going further and prototyping how a library like react-intl would use this we landed on something slightly different from your proposal.

Instead of having a fully formatted string with a collection of positions, we ended up wanting an array of formatted part descriptors.

let dateFormat = new Intl.DateTimeFormat('en');
let date = Date.now();

let formattedDate = dateFormat.formatToParts(date)
    .map(({value}) => value)
    .reduce((string, part) => string + part);

let fancyFormattedDate = dateFormat.formatToParts(date)
    .map(({type, value}) => {
        switch (type) {
            case 'month': return `<b>${value}</b>`;
            default     : return value;
        }
    })
    .reduce((string, part) => string + part);

console.log(dateFormat.format(date)); // "10/30/2015"
console.log(formattedDate);           // "10/30/2015"
console.log(fancyFormattedDate);      // "<b>10</b>/30/2015"

The bikeshed hasn't been painted here yet, but we think having an array of formatted part descriptors like this makes it easier to process each part and put them back together in a final string than trying to use substrings on positions. The parts can be processed left-to-right without worrying about throwing off the position indexes of the following parts when modifying the current part; e.g., wrapping it with HTML.

For the example above, the array of part descriptors might look something like this:

[
    {
        type: 'month',
        value: '10',
    },
    {
        type: 'separator',
        value: '/'
    },
    {
        type: 'day',
        value: '30',
    },
    {
        type: 'separator',
        value: '/'
    },
    {
        type: 'year',
        value: '2015',
    }
]

Note how the values are always strings. type is probably not the correct label, but this is mainly to illustrate this idea compared with the positions idea.

caridy avatar caridy commented on June 8, 2024

To add to that, we think formatToParts() (we can bikeshed on that method name) will be a method of DateTimeFormat, NumberFormat, RelativeTimeFormat, etc. In general, if you can format something, you should also be able to format the parts. Internally, format() and formatToParts() rely on the same abstract operation, but format() puts the parts together using the algorithm shown above for formattedDate.
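
The relationship between the two methods can be checked directly in engines that ship formatToParts. A minimal sketch; the locale, date, time zone and currency are arbitrary choices for illustration, and joinParts is a hypothetical helper.

```javascript
// Fixed inputs so the comparison does not depend on the current time or zone.
const date = new Date(Date.UTC(2015, 9, 30, 12, 0, 0));
const dtf = new Intl.DateTimeFormat('en-US', { timeZone: 'UTC' });
const nf = new Intl.NumberFormat('en-US', { style: 'currency', currency: 'EUR' });

// Hypothetical helper: concatenate the value of every part.
const joinParts = parts => parts.map(({ value }) => value).join('');

console.log(joinParts(dtf.formatToParts(date)) === dtf.format(date)); // true
console.log(joinParts(nf.formatToParts(35)) === nf.format(35));       // true
```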

zbraniecki avatar zbraniecki commented on June 8, 2024

Awesome! Thanks a lot for this brainstorming. I love the API and I'll be happy to update my proposal to it.

One question about naming of type

That's unfortunately a little bit more complex to implement and I'm wondering how you'd like to approach.
We generally have two sources of datetime formatting patterns:

  • JS pattern string (like: "hmsv": "h:mm:ss a v" or "GyMMMMEd": "E, d MMMM y G")
  • ICU's udat_formatForFields ( [0] )

Both approaches make it easy to index positions of date time components, but not much for the separators.

While ICU returns a "timeSeparator" field [1], everything else is fair game. It's hard to know if the separator between day and month will be /, ., , or something else.

Because of that, would you just want to use a generic term separator for everything that is not a token?
It's a bit of a pity, because if someone would like to style the time separator in 12<span>:</span>36 they'd have to find the hour and minute tokens and the separator between them (which may also not be present?).
But I don't really see a way to name all separators and their combinations (minute/dayperiod separator? weekday/year separator?)

[0] http://www.icu-project.org/apiref/icu4c/udat_8h.html#a4bc9d9661c115dcb337803bc89730b3a
[1] http://www.icu-project.org/apiref/icu4c/udat_8h.html#adb09b47d4576513229f83f2e8f507fc2a948cc45ccae55dcf5c094e3c33a2c3d7

caridy avatar caridy commented on June 8, 2024

I'm not sure I follow your question, but I can tell you that separator is probably not a good type, maybe just text, or token, or something is better to signal that it is part of the formatted value that doesn't correspond to any of the units used to produce the output.

ericf avatar ericf commented on June 8, 2024

@caridy I think the concept of separator seems correct. There might be more types like this as well which cover the concept of "text".

@zbraniecki Are you thinking we'd need something like this?

{
    type: 'field',
    field: 'month',
    value: '10'
}

zbraniecki avatar zbraniecki commented on June 8, 2024

No, what I meant is that I'm not sure if "Wednesday, 03/2015 12:32 am" should be reported as:

[
  { type: 'weekday', value: 'Wednesday' },
  { type: 'separator', value: ', ' },
  { type: 'month', value: '03' },
  { type: 'separator', value: '/' },
  { type: 'year', value: '2015' },
  { type: 'separator', value: ' ' },
  { type: 'hour', value: '12' },
  { type: 'separator', value: ':' },
  { type: 'minute', value: '32' },
  { type: 'separator', value: ' ' },
  { type: 'dayperiod', value: 'am' },
]

or should we try to name separators like timeSeparator, dateSeparator (ICU has a special token named timeSeparator). But I guess it doesn't make much sense since there may be all combinations of separators.

So the question @caridy asked stays - type separator or type text or sth else?

zbraniecki avatar zbraniecki commented on June 8, 2024

I got Intl.js polyfill to support the proposed formatToParts. Can you look at it and provide feedback? I'll write a patch to spec next.

ericf avatar ericf commented on June 8, 2024

@zbraniecki I'm wondering how LRM (\u200E) and RLM (\u200F) characters should be handled when we format to parts. Do you have any thoughts on this and whether the marker chars should be "sticky" to the field types, or separator types?

I'm not an expert how these bidi markers actually work and what happens when we process the parts by wrapping them with a span. e.g.:

new Date(0).toLocaleString("ar",{month:"numeric",day:"numeric",year:"numeric"}).replace(/\u200F/g, '[RLM]');

// "٣١[RLM]/١٢[RLM]/١٩٦٩"

If we wanted to bold the month and create an HTML string, would the marker be part of the <b> element's children or the / separator?

This is related to: #28

zbraniecki avatar zbraniecki commented on June 8, 2024

@ericf - good question. My initial take is that it's part of the separator, because everything that is not a datetime component is.

I consulted with our Arabic community leader and ran an experiment, and it looks good:

var parts = [
  {"type":"day","value":"٣١"},
  {"type":"separator","value":"\u200F/"},
  {"type":"month","value":"١٢"},
  {"type":"separator","value":"\u200F/"},
  {"type":"year","value":"١٩٦٩"}
];


var formattedStr = parts.map(({type, value}) => {
        switch (type) {
            case 'year': return `<b>${value}</b>`;
            case 'day': return `<b>${value}</b>`;
            case 'month': return `<b>${value}</b>`;
            case 'separator': return `<small style="color: red">${value}</small>`;
            default     : return value;
        }
    })
    .reduce((string, part) => string + part);

var annotatedStr = formattedStr.replace(/\u200F/g, '[RLM]');

elem.innerHTML = formattedStr;
elem2.textContent = annotatedStr;

Does it match your expectations?
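
To see where an actual implementation places the marks, the parts can be inspected directly. This is a sketch only: which marks appear, and in which parts, depends on the locale data available to the engine, so no particular output is assumed. The set of characters checked (LRM, RLM, ALM) is an assumption made for illustration.

```javascript
// Characters treated as bidi marks here: LRM, RLM and ALM (an assumption;
// which ones actually appear depends on the engine's locale data).
const BIDI_MARKS = /[\u200E\u200F\u061C]/;

const dtf = new Intl.DateTimeFormat('ar', {
  year: 'numeric',
  month: 'numeric',
  day: 'numeric',
  timeZone: 'UTC'
});
const parts = dtf.formatToParts(new Date(0));

// Collect the distinct part types whose value contains a bidi mark.
const typesWithMarks = [...new Set(
  parts
    .filter(({ value }) => BIDI_MARKS.test(value))
    .map(({ type }) => type)
)];

console.log(parts);
console.log(typesWithMarks);
```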

zbraniecki avatar zbraniecki commented on June 8, 2024

Here's a jsbin with this example in action: https://jsbin.com/lefujuqewo/edit?html,js,output

caridy avatar caridy commented on June 8, 2024

I don't think we will have issues with the marks, at the end of the day most inline tags will be ignored by screen readers, and direction marks will still be considered, isn't it?

zbraniecki avatar zbraniecki commented on June 8, 2024

That's my interpretation as well, but I wasn't sure about it so wanted to verify.

ericf avatar ericf commented on June 8, 2024

@zbraniecki looks good and it makes sense to me as well that the LRMs and RLMs are part of the separators.

zbraniecki avatar zbraniecki commented on June 8, 2024

We made some progress with the planning of this feature. I have a polyfill working and confirmation from @caridy that we want to try to get it in rev 3.

The challenge I'm facing now is with the design of the Number.formatToParts counterpart. The data for that in CLDR is not stored in {token}{sep}{token2} format like DateTime and that means that we have to choose the level of granularity and figure out how to extract it from the pattern.

Patterns for numbers look like this: http://www.unicode.org/cldr/charts/28/summary/pl.html#4058

ICU also doesn't provide any counterpart to udat_formatForFields which means that implementing this using ICU will be a challenge.

In a perfect world, I imagine the tokens to be:

{
  'number':
  'fraction':
  'currency':
  'percentsign':
  'negativesign':
  'fractionseparator':
  'thousandseparator':
  'separator':
}

but I honestly doubt that we can create anything like that out of the data that we have.

We can aim for something much simpler (maybe: "-23.45%" -> "{negativesign}{number}{fractionseparator}{fraction}{percentsign}"?) or we can not add it for NumberFormat now.

Based on what I see, formatting numbers is much less important for UX than formatting dates so we may want to wait with it and I'm afraid that it'll be hard to make the returned model forward-compatible if we'll ever want to add more detail.

Opinions?

caridy avatar caridy commented on June 8, 2024

I see a couple of string token replacements in the format number algo, specifically:

1. Replace each digit in n with the value of digits[digit].
1. If n contains the character ".", then replace it with an ILND String representing the decimal separator.
1. Replace the substring "{number}" within result with n.
1. Replace the substring "{currency}" within result with cd.

that should be enough to highlight the pieces that they might want to style, like the currency symbol, and the number.
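
As a sketch of what styling just the currency symbol looks like with the shipped Intl.NumberFormat.prototype.formatToParts (the locale, currency and amount are arbitrary choices for illustration):

```javascript
const nf = new Intl.NumberFormat('en-US', { style: 'currency', currency: 'USD' });

// Wrap only the currency part; pass every other part through unchanged.
const html = nf.formatToParts(1234.5)
  .map(({ type, value }) => (type === 'currency' ? `<b>${value}</b>` : value))
  .join('');

console.log(html); // "<b>$</b>1,234.50"
```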

rxaviers avatar rxaviers commented on June 8, 2024

I'm late here, but wanted to say I loved the API.

Great work!

The only thing I would consider changing is the separator name. I would call it literal or something else that is more generic to better represent some parts of the more verbose formatted dates. For example, things like at, or o'clock, or de (in Portuguese or Spanish):

  • yyyy.MM.dd G 'at' HH:mm:ss zzz => 1996.07.10 AD at 15:08:56 PDT
  • hh 'o''clock' a, zzzz => 12 o'clock PM, Pacific Daylight Time
  • d 'de' MMMM => 5 de janeiro

Just emphasizing, I assume you are naming separator everything that isn't a date field.
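
A small sketch of how such text surfaces in engines that ship formatToParts, where everything that is not a date field is reported with type literal. The exact literal text depends on the locale data available (with full data, the Portuguese pattern is expected to contribute the text around "de"), so no specific output is assumed here.

```javascript
const dtf = new Intl.DateTimeFormat('pt-BR', {
  day: 'numeric',
  month: 'long',
  timeZone: 'UTC'
});
const parts = dtf.formatToParts(new Date(Date.UTC(2016, 0, 5)));

// Split the parts into literal text and real date fields.
const literals = parts.filter(p => p.type === 'literal').map(p => p.value);
const fields = parts.filter(p => p.type !== 'literal').map(p => p.type);

console.log(literals); // the text between the fields, whatever the locale uses
console.log(fields);   // the date field types, in locale order
```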

rxaviers avatar rxaviers commented on June 8, 2024

About number parts, I would consider preferring CLDR (UTS#35) names for the tokens:

caridy avatar caridy commented on June 8, 2024

+1 on "literal"

zbraniecki avatar zbraniecki commented on June 8, 2024

yup, works for me. I may need some help to transfer UTS#35 into tokens for NumberFormat.formatToParts

rxaviers avatar rxaviers commented on June 8, 2024

I'll be happy to help time permitting.

caridy avatar caridy commented on June 8, 2024

formatToParts.pdf
Official Proposal - Nov 2015

zbraniecki avatar zbraniecki commented on June 8, 2024

Polyfill: https://github.com/zbraniecki/proposal-intl-formatToParts
Patch against Intl.js: https://github.com/zbraniecki/Intl.js/tree/dtformattoparts

@caridy - does it look good?

zbraniecki avatar zbraniecki commented on June 8, 2024

PR against Intl.js: andyearnshaw/Intl.js#142
PR against ecma402 spec: #64

Gecko patch: https://hg.mozilla.org/mozilla-central/rev/15bd594b5982

@ericf, @rxaviers - can you take a look, especially at the spec?

@caridy - I tested perf impact on format and unfortunately it doesn't seem to give us any win, but the perf difference is within 4% (node 5.4) so I think it's ok.

zbraniecki avatar zbraniecki commented on June 8, 2024

Heads up, this has been advanced to Stage 2 as of today. DateTimeFormat.formatToParts is basically ready for review, and @stasm is finishing NumberFormat.formatToParts. We have reviewers assigned.

littledan avatar littledan commented on June 8, 2024

Is this completely ready for review? I only see spec text for DateTimeFormat.formatToParts at #64 , no NumberFormat.

caridy avatar caridy commented on June 8, 2024

I will do a second pass today @littledan, ideally I will have time today to help Stas with NumberFormat as well.

stasm avatar stasm commented on June 8, 2024

I was out last week. I'll work on the NumberFormat spec this week and will sync with @caridy around Thursday.

stasm avatar stasm commented on June 8, 2024

@caridy in #30 (comment):

+1 on "literal"

Both #64 and #79 now use separator. Would you like to change this to literal?

caridy avatar caridy commented on June 8, 2024

I'm ok either way, you guys decide it :)

stasm avatar stasm commented on June 8, 2024

My vote would go to literal as more versatile and neutral in meaning. For instance, denominations from #37 could be described as type literal (i.e. { type: 'literal', value: 'thousand' }) but likely not separator.

rxaviers avatar rxaviers commented on June 8, 2024

+1 to literal 😄

zbraniecki avatar zbraniecki commented on June 8, 2024

The spec patch has been rebased on top of the master and with @stasm changes applied.

I also created a PR for Intl.js to switch separator to literal - andyearnshaw/Intl.js#153

zbraniecki avatar zbraniecki commented on June 8, 2024

The spec has been reviewed and TC39 advanced it to Stage 3.

zbraniecki avatar zbraniecki commented on June 8, 2024

We should update the Intl.js implementation to match the spec.

I already updated the dayPeriod and literal token names, but in the spec we changed function names and made the formatToParts non-bound.
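
To illustrate what "non-bound" means in practice, here is a sketch against engines that ship the final API: format is an accessor that returns a function bound to the formatter, while formatToParts is an ordinary prototype method that needs its receiver. The date and locale are arbitrary.

```javascript
const dtf = new Intl.DateTimeFormat('en-US', { timeZone: 'UTC' });
const date = new Date(Date.UTC(2016, 7, 24));

// `format` is a getter returning a function bound to `dtf`,
// so it keeps working after being detached.
const { format } = dtf;
console.log(typeof format(date)); // "string"

// `formatToParts` is a plain method; a detached call has no valid receiver.
const { formatToParts } = dtf;
let threw = false;
try {
  formatToParts(date);
} catch (err) {
  threw = err instanceof TypeError;
}
console.log(threw); // true

// Binding explicitly restores the receiver.
const boundFormatToParts = dtf.formatToParts.bind(dtf);
console.log(Array.isArray(boundFormatToParts(date))); // true
```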

zbraniecki avatar zbraniecki commented on June 8, 2024

Intl.js has been updated to use dayPeriod and literal - so only making formatToParts not-bound remains.

I also landed an update to SpiderMonkey implementation to fully match the spec - https://bugzilla.mozilla.org/show_bug.cgi?id=1260858 - so we're ready to unprefix DateTimeFormat.prototype.formatToParts once the spec lands.

SebastianZ avatar SebastianZ commented on June 8, 2024

I'm obviously very late on this, but I wonder whether @srl295's proposed syntax, with a second parameter holding the parts, or altering the return value of format() via a second parameter, was actually considered.

So, the modified example of @srl295 may look like this:

var formatter = Intl.DateTimeFormat(navigator.languages, {
  hour: 'numeric',
  minute: 'numeric'
});
var parts = {};
var string = formatter.format(new Date(), parts);
// string = "12:34"
// parts = [{type: 'hour', value: '12'}, {type: 'separator', value: ':'}, {type: 'minute', value: '34'}]

Or, if modifying parameters is considered bad by more people than @caridy, the second parameter could control the type of return value. As boolean parameters should be avoided, how about a string:

formatter.format(new Date(), "parts");

or constants:

formatter.format(new Date(), Intl.PARTS_FORMAT)

Sebastian

zbraniecki avatar zbraniecki commented on June 8, 2024

@SebastianZ why do you think what you're suggesting is better than formatToParts function name?

caridy avatar caridy commented on June 8, 2024

@SebastianZ in general, if the function forks the logic based on some arguments, then in principle it should be split into two functions. "parts" or Intl.PARTS_FORMAT is a no-go for me due to that same principle.

SebastianZ avatar SebastianZ commented on June 8, 2024

@SebastianZ why do you think what you're suggesting is better than formatToParts function name?

It's not better but it's an alternative. The advantage is that the existing function is reused instead of introducing a new function, which does similar things.

@SebastianZ in general, if the function forks the logic based on some arguments, then in principle it should be split into two functions. "parts" or Intl.PARTS_FORMAT is a no-go for me due to that same principle.

Sounds reasonable, but they also share parts of the logic and there are many counter examples throughout the APIs like String.prototype.replace(), XPathEvaluator.evaluate() or XMLHttpRequest.open().

Also, can you please expand on why you're not ok with mutating the second argument as suggested by @srl295? (It would work like PHP's matches argument in preg_match().)

Again, I don't have a strong opinion about either of the syntaxes. I just want to make sure that all alternatives are considered and that the reasons for choosing one of them (or not choosing the others) are clarified.

Sebastian

from ecma402.

caridy avatar caridy commented on June 8, 2024

@SebastianZ yeah, we heavily discussed the alternatives. We looked at the prior art (like the PHP implementation) and the precedent in ES, and we have no precedent for mutations; in fact, the committee will not pass an API with that form in principle (you can probably search through the TC39 notes for similar proposals in the past). So we ended up with two main options: reusing format, or introducing an alternative API that can share the guts by creating an abstract operation, which is ultimately what we decided to go with.
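
The "share the guts" idea can be sketched roughly as follows. This is an illustration only: partitionTime is a hypothetical stand-in for the abstract operation, and it handles just a fixed hour:minute pattern in UTC rather than any real locale data.

```javascript
// Hypothetical sketch: two public functions sharing one internal routine.
// partitionTime is an illustrative stand-in for a spec abstract operation;
// it handles only a fixed "hour:minute" pattern in UTC.
function partitionTime(date) {
  const pad = n => String(n).padStart(2, '0');
  return [
    { type: 'hour', value: pad(date.getUTCHours()) },
    { type: 'literal', value: ':' },
    { type: 'minute', value: pad(date.getUTCMinutes()) }
  ];
}

// String-returning entry point: joins the values of the shared parts.
function format(date) {
  return partitionTime(date).map(part => part.value).join('');
}

// Parts-returning entry point: exposes the same parts, with no mutated
// argument and no mode flag.
function formatToParts(date) {
  return partitionTime(date);
}

const sample = new Date(Date.UTC(2016, 7, 24, 12, 34));
console.log(format(sample)); // "12:34"
console.log(formatToParts(sample));
```

Each function keeps a single return type, and the string form stays consistent with the parts form because both derive from the same routine.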

from ecma402.

jungshik avatar jungshik commented on June 8, 2024

A bit of observation after making a draft CL to implement formatToParts for DateTimeFormat in v8.

I realized that it can be tricky to style different 'logical' parts in some locales. For instance, I got the following in Korean with options {year:"numeric", month:"numeric", day:"numeric", weekday: "short", hour:"numeric", minute:"numeric", second:"numeric"}. (Firefox implementation would give the same result afaict).

2016. 8. 24. (수) 오후 3:41:49
0:year:'2016'
1:literal:'. '
2:month:'8'
3:literal:'. '
4:day:'24'
5:literal:'. ('
6:weekday:'수'
7:literal:') '
8:dayperiod:'오후'
9:literal:' '
10:hour:'3'
11:literal:':'
12:minute:'41'
13:literal:':'
14:second:'49'

Some UI folks would consider it better to style '(수)' ( '수' (short weekday name) enclosed by parentheses) as a unit rather than styling only '수' (weekday name alone).

With the above tokenization, it's hard to do that. I'm not sure how to address this issue in a generic/locale-independent way.

from ecma402.

jungshik avatar jungshik commented on June 8, 2024

Of course, one can do a locale-dependent (and even-exact-format-dependent) tweaking like you showed.

With the above tokenization, it's hard to do that. I'm not sure how to address this issue in a generic/locale-independent way.

I should have emphasized generic/locale-independent way in my previous comment. Having to have a locale-dependent tweaking (if (lang==foo) do this else if (lang==bar) do that .....) is unfortunate, but sometimes that's the way it is, I'm afraid.

To avoid any confusion (sorry that I was not clear), I'm not saying that we should change this API (partly because I don't know how, let alone whether it'd be possible without changing data such as CLDR).

from ecma402.

zbraniecki avatar zbraniecki commented on June 8, 2024

I believe that the snippets I provided are the closest thing we can do to locale-independent behavior. They may have exceptions (which may need if (lang in [])), but if you want to capture more than the bare parts, you may use such snippets.

My understanding is that the API provides the lowest common denominator and allows for higher level APIs to be built on top.
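
As one illustration of building on top of the parts (this is not the snippets referred to above), here is a minimal sketch that pulls surrounding parentheses into the weekday part. The parts array is copied from the Korean example earlier in the thread; absorbBrackets is a hypothetical helper that only handles '(' and ')', so other bracket styles or locales would need their own handling.

```javascript
// Parts copied from the Korean example earlier in the thread.
const parts = [
  { type: 'year', value: '2016' },
  { type: 'literal', value: '. ' },
  { type: 'month', value: '8' },
  { type: 'literal', value: '. ' },
  { type: 'day', value: '24' },
  { type: 'literal', value: '. (' },
  { type: 'weekday', value: '수' },
  { type: 'literal', value: ') ' },
  { type: 'dayperiod', value: '오후' },
  { type: 'literal', value: ' ' },
  { type: 'hour', value: '3' },
  { type: 'literal', value: ':' },
  { type: 'minute', value: '41' },
  { type: 'literal', value: ':' },
  { type: 'second', value: '49' }
];

// Hypothetical helper: move an opening bracket at the end of the preceding
// literal and a closing bracket at the start of the following literal into
// the part of the given type. Only '(' and ')' are handled here.
function absorbBrackets(inputParts, type) {
  const out = inputParts.map(part => ({ ...part })); // copy, do not mutate input
  for (let i = 1; i < out.length - 1; i++) {
    const prev = out[i - 1];
    const next = out[i + 1];
    if (out[i].type !== type) continue;
    if (prev.type !== 'literal' || next.type !== 'literal') continue;
    if (prev.value.endsWith('(') && next.value.startsWith(')')) {
      prev.value = prev.value.slice(0, -1);
      out[i].value = '(' + out[i].value + ')';
      next.value = next.value.slice(1);
    }
  }
  return out.filter(part => part.value !== ''); // drop emptied literals
}

const grouped = absorbBrackets(parts, 'weekday');
const html = grouped
  .map(part => (part.type === 'weekday' ? `<b>${part.value}</b>` : part.value))
  .join('');
// html: "2016. 8. 24. <b>(수)</b> 오후 3:41:49"
```

The helper checks characters rather than the language, so it stays locale-independent as long as the locale uses ASCII parentheses.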

from ecma402.

zbraniecki avatar zbraniecki commented on June 8, 2024

Intl.NumberFormat.prototype.formatToParts has landed in SpiderMonkey today -
https://hg.mozilla.org/integration/mozilla-inbound/rev/f849271896d3

from ecma402.

littledan avatar littledan commented on June 8, 2024

@zbraniecki Looks like the ICU part has its upstreaming still in progress. It'll be easier for us to implement in V8 after that's done. Is there any particular reason why it has to get to Stage 4 at this meeting? Could we wait for a V8 implementation until the ICU side work is more mature?

from ecma402.

zbraniecki avatar zbraniecki commented on June 8, 2024

Yeah, certainly.

I do think about ES releases as milestones, mostly because that's how the market treats them. While I'm not worried about V8 shipping features quickly, I believe that if we miss the train for ES2017 with formatToParts and PluralRules, it significantly increases the chance that it'll add a whole year until all modern JS engines will ship it and that we won't be able to rely on it until at least Q3 2018.

My hopes were for those two features to be in a good shape for stage 4 in time for ES2017. If you believe that it's not worth it and if you don't share my worry about market adoption, I trust your experience on the matter.

from ecma402.

littledan avatar littledan commented on June 8, 2024

I think you've been doing a good job building towards getting browsers to support this by making sure there's an implementation in Mozilla and getting the ICU part upstreamed properly. I haven't seen a lot of evidence that browsers are waiting for the annual cutoff to implement features, on the other hand. I am not sure how long it will take for people to be able to depend on the feature--IE11 will probably never support it, but that browser may be in use for some time.

from ecma402.

jungshik avatar jungshik commented on June 8, 2024

Note that ICU is NOT a blocker for v8 because v8 does not plan to use ICU's C API. The ICU change made by Mozilla is just to port an existing C++ API to the C API. What I'm not yet 100% sure about is the practical usefulness of NumberFormat.formatToParts (and I haven't fully digested the proposed API yet).

from ecma402.

rxaviers avatar rxaviers commented on June 8, 2024

I took the liberty to update this issue description and include the proposal link.

from ecma402.

caridy avatar caridy commented on June 8, 2024

@zbraniecki can you add the agenda item for this? If you don't plan to be in the next meeting (Boston) I can do the update.

Also, we need to work on the spec text to get the PR ready before the next meeting. Many things have changed since the revert, and we need to work things out.

from ecma402.

zbraniecki avatar zbraniecki commented on June 8, 2024

@stasm would you be interested in updating your PR to the current master?

from ecma402.
