Giter Site home page Giter Site logo

Comments (13)

ibdafna avatar ibdafna commented on June 22, 2024 1

Okay, understood 😊 I think I have some ideas about how to fix this. Shouldn't be anything difficult. Will report back!

from lumino.

ibdafna avatar ibdafna commented on June 22, 2024 1

Digging deeper, I think this may have to do with our rendering optimizations. When resizing column/row headers or scrolling, we trigger a full on-screen buffer paint - this is extremely slow. We should be writing to the off-screen buffer and copying any changes from that buffer to the on-screen one.

lumini_perf_2

from lumino.

fcollonval avatar fcollonval commented on June 22, 2024

friendly ping to @martinRenou and @ibdafna

from lumino.

martinRenou avatar martinRenou commented on June 22, 2024

Thanks for the ping.

@krassowski do you need text wrapping (line return for long text in cells) in your case? I see there is one measureText call that we could probably get rid of if you don't need text wrapping.

from lumino.

martinRenou avatar martinRenou commented on June 22, 2024

In the case of text wrapping enabled, we can perhaps come up with a better algorithm that uses less measureText calls.
We could also have some tricks to work around measureText slowness in the case of mono fonts.

from lumino.

ibdafna avatar ibdafna commented on June 22, 2024

Thanks for reporting! I'd argue this is an extreme case when each string has a length of 10240 characters - not justifying the performance, but just putting it out there. One simple fix would be to put a cap on the maximum string length we apply wrapping to and just disable the functionality if it's longer than the max.

EDIT: I don't think there's text wrapping enabled in this case - just eliding (which also measures the text).

We measure the text length in all cases - eliding or wrapping. textWrapping is optional, but eliding isn't. We should probably add a boolean for eliding, and if both wrapping and eliding are set to false then no text measurements calls need to be made. That's a useful but not sufficient approach. We can also apply some caching for {string:length}, although if n * m strings are all unique, we would end up with n * m entries which is not good for memory footprint.

let textWidth = gc.measureText(text).width;

from lumino.

krassowski avatar krassowski commented on June 22, 2024

I see there is one measureText call that we could probably get rid of if you don't need text wrapping.

Indeed, this one. No maybe not? Sorry I don't see it anymore!

EDIT: I don't think there's text wrapping enabled in this case - just eliding (which also measures the text).

I agree, this looks like an issue in eliding, but when I skimmed in the code I saw that wrapping is implemented with naive linear search but could use binary search to reduce the cost to log(n):

if (wordsInColumn.length === 0) {
let curLineTextWidth = gc.measureText(textInCurrentLine).width;
while (curLineTextWidth > boxWidth && textInCurrentLine !== '') {
// Iterating from the end of the string until we find a
// substring (0,i) which has a width less than the box width.
for (let i = textInCurrentLine.length; i > 0; i--) {
const curSubString = textInCurrentLine.substring(0, i);
const curSubStringWidth = gc.measureText(curSubString).width;
if (curSubStringWidth < boxWidth || curSubString.length === 1) {
// Found a substring which has a width less than the current
// box width. Rendering that substring on the current line
// and setting the remainder of the parent string as the next
// string to iterate on for the next line.
const nextLineText = textInCurrentLine.substring(
i,
textInCurrentLine.length
);
textInCurrentLine = nextLineText;
curLineTextWidth = gc.measureText(textInCurrentLine).width;
gc.fillText(curSubString, textX, curY);
curY += textHeight;
// No need to continue iterating after we identified
// an index to break the string on.
break;
}
}
}
}
// Multiple words in column header. Fitting maximum
// number of words possible per line (box width).
else {
while (wordsInColumn.length !== 0) {
// Processing the next word in the queue.
const curWord = wordsInColumn.shift();
// Joining that word with the existing text for
// the current line.
const incrementedText = [textInCurrentLine, curWord].join(' ');
const incrementedTextWidth = gc.measureText(incrementedText).width;
if (incrementedTextWidth > boxWidth) {
// If the newly combined text has a width larger than
// the box width, we render the line before the current
// word was added. We set the current word as the next
// line.
gc.fillText(textInCurrentLine, textX, curY);
curY += textHeight;
textInCurrentLine = curWord!;
} else {
// The combined text hasd a width less than the box width. We
// set the the current line text to be the new combined text.
textInCurrentLine = incrementedText;
}
}
}

from lumino.

ibdafna avatar ibdafna commented on June 22, 2024

We should look to improve the wrapping algorithm for sure, but I guess my point here is that we should eliminate or significantly speed up the measureText call which happens regardless or wrapping/eliding

from lumino.

krassowski avatar krassowski commented on June 22, 2024

Not sure how much we can speed up measureText as it is pretty close to the metal.

Eliding a sentence with 1024 characters to a 1-character size column with current binary search takes 10 measureText calls. We could reduce it to 1 call (on the happy path) with:

// If dealing with a very large string
if (textWidth > 4 * boxWidth) {
   const approximateLengthToKeep = Math.ceil(text.length * boxWidth/textWidth);
   // it is probably wise to add a small margin (increasing the number of calls on the happy path to two)
   // in order to reduce the chance of going into worst case scenario of the old algorithm.
   const margin = 1;
   const candidateText = text.substring(0, approximateLengthToKeep + margin) + elide;
   const candidateWidth = gc.measureText().width;
   // Keep the result if we did not overshoot because of unfavourable string composition;
   // the old algorithm will fine-tune it to remove the few remaining extra characters
   if (newWidth >= boxWidth) {
      text = candidateText;
      textWidth = newWidth;
   } else {
      // no-op
      // we would end up here for strings like: `iiiiimmmmmm` if we are unlucky.
      // Of course we could try to continue with a new best guess, but we should bail
      // after a few trials to avoid particularly nasty strings (worst case scenario is
      // alternating between overshoot and undershoot).
   }
}
// keep old algorithm below as it was
while (textWidth > boxWidth && text.length > 1) { 
     if (text.length > 4 && textWidth >= 2 * boxWidth) { 
       // If text width is substantially bigger, take half the string 
       text = text.substring(0, text.length / 2 + 1) + elide; 
     } else { 
       // Otherwise incrementally remove the last character 
       text = text.substring(0, text.length - 2) + elide; 
     } 
     textWidth = gc.measureText(text).width; 
   } 
 }

The bigger issue is a wide column with the text, say equivalent to 200 characters (this is what I saw locally). Then if you start with 399 characters you would call measureText 199 times. This looks like a realistic scenario for many users: you add a column with one paragraph but want to display only the size of about one sentence.

from lumino.

krassowski avatar krassowski commented on June 22, 2024

The bigger issue is a wide column with the text, say equivalent to 200 characters (this is what I saw locally). Then if you start with 399 characters you would call measureText 199 times. This looks like a realistic scenario for many users: you add a column with one paragraph but want to display only the size of about one sentence.

Example in details:

import {  JupyterFrontEnd, JupyterFrontEndPlugin } from '@jupyterlab/application';
import { MainAreaWidget } from '@jupyterlab/apputils';
import { DataGrid, DataModel, BasicKeyHandler, BasicMouseHandler, BasicSelectionModel } from '@lumino/datagrid';

class LargeDataModel extends DataModel {
  rowCount(region: DataModel.RowRegion): number {
    return region === 'body' ? 100 : 1;
  }

  columnCount(region: DataModel.ColumnRegion): number {
    return region === 'body' ? 1 : 1;
  }

  data(region: DataModel.CellRegion, row: number, column: number): any {
    if (region !== 'body') {
      return `${row}, ${column}`;
    }
    return 'x'.repeat(250);
  }
}

const extension: JupyterFrontEndPlugin<void> = {
  id: 'datagrid',
  autoStart: true,
  activate: (
    app: JupyterFrontEnd
  ) => {
    const model = new LargeDataModel();
    const content = new DataGrid();
  
    content.dataModel = model;
    // (handlers are not required for reproduction)
    content.keyHandler = new BasicKeyHandler();
    content.mouseHandler = new BasicMouseHandler();
    content.selectionModel = new BasicSelectionModel({
      dataModel: model
    });
    content.resizeColumn('body', 0, 800);
    const widget = new MainAreaWidget({content});
    widget.id = 'datagrid-example';
    widget.title.label = 'Slow Datagrid Example';
    app.shell.add(widget, 'main');
  },
};

export default extension;

Screenshot from 2022-10-27 20-38-37

Loading freezes the UI for 1.5 seconds on Chrome.

from lumino.

ibdafna avatar ibdafna commented on June 22, 2024

measureText can be sped up via a shim which does some caching or other internal accounting to reduce the CPU time. Just to iterate again, the issue raised above is separate from wrapText, which is another topic. The performance bottleneck you are experiencing above is not caused by the wrapText algorithm. We should look at fixing this issue first, before moving to wrapText. wrapText is disabled by default for now.

from lumino.

krassowski avatar krassowski commented on June 22, 2024

Yes I agree that the issue I saw is not in wrapText (I never said it was).

As for caching, it might help a bit with scrolling (at cost of doubling the RAM if not much worse) but as soon as user resizes the column we are back at square one.

Just to spell it out, I was referring to eliding in last few comments and eliding is enabled by default and inefficient in the worst case scenario (as described above).

from lumino.

ibdafna avatar ibdafna commented on June 22, 2024

@krassowski @fcollonval I've done more profiling to see what the performance is like without eliding or wrapping, and unfortunately this is still quite laggy even with zero calls to measureText. The paint call seems to take a long time. I am wondering why we're calling paint so much - we should be able to blit the content.

lumino_perf

image

This is for resize, but it's even worse for scrolling.

from lumino.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.