
Comments (21)

nealstewart commented on August 17, 2024

@mbostock Any more recent thoughts in this area? I need some line smoothing for a time series of continuous data in my application, and would love to contribute something.

from d3-shape.

mbostock commented on August 17, 2024

I think it might be a good idea to do smoothing as a pre-processing step in data space rather than doing it internally as a curve:

  • Data space seems like a more natural fit when specifying kernel bandwidth.
  • You might want to compute a smooth fit once before interactively redrawing (e.g., pan & zoom).
  • You might want to access points in the computed smooth fit for display (e.g., hover tooltip).

I’m imagining something that has x and y accessors and other configuration (e.g., kernel and bandwidth) and then computes the smooth curve, generating an array of [[x1, y1], [x2, y2], …] points in data space. Those can then be passed to a line generator, likely with x and y accessors that apply scales to transform to screen space.

(I’m slightly tempted to have it return [[x1, x2, x3, …], [y1, y2, y3, …]] because that’s a more efficient representation, but it would make implementing the accessors a little bit more awkward, so it feels like premature optimization.)
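As a rough illustration of the shape such an API could take, here is a minimal sketch. Everything in it is hypothetical: `kernelSmoother`, its `x`, `y`, `bandwidth` and `kernel` methods, and the choice to evaluate the fit at the input x positions are assumptions made for the example, not anything that exists in D3.

```javascript
// Hypothetical data-space smoother with configurable accessors.
// None of these names exist in D3; this only illustrates the idea.
function kernelSmoother() {
  let x = d => d[0];
  let y = d => d[1];
  let bandwidth = 1; // in data units, not pixels
  // Epanechnikov kernel: nonzero only for |u| < 1.
  let kernel = u => (Math.abs(u) < 1 ? 0.75 * (1 - u * u) : 0);

  // Naive O(n^2) kernel-weighted average, evaluated at each input x.
  function smooth(data) {
    const xs = data.map(x);
    const ys = data.map(y);
    return xs.map(x0 => {
      let sumW = 0;
      let sumWY = 0;
      for (let i = 0; i < xs.length; ++i) {
        const w = kernel((xs[i] - x0) / bandwidth);
        sumW += w;
        sumWY += w * ys[i];
      }
      return [x0, sumW > 0 ? sumWY / sumW : NaN];
    });
  }

  // Getter/setter methods in the usual D3 style.
  smooth.x = function (fn) { return arguments.length ? ((x = fn), smooth) : x; };
  smooth.y = function (fn) { return arguments.length ? ((y = fn), smooth) : y; };
  smooth.bandwidth = function (h) { return arguments.length ? ((bandwidth = +h), smooth) : bandwidth; };
  smooth.kernel = function (fn) { return arguments.length ? ((kernel = fn), smooth) : kernel; };
  return smooth;
}

// Usage: the result is an array of [x, y] points in data space.
const smoothSeries = kernelSmoother().x(d => d.t).y(d => d.v).bandwidth(2);
const smoothedPoints = smoothSeries([{ t: 0, v: 1 }, { t: 1, v: 3 }, { t: 2, v: 2 }]);
```

The returned points could then go to a line generator whose x and y accessors apply the scales, which keeps the smoothing itself independent of screen space.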


mbostock commented on August 17, 2024

Here’s a simple moving average implementation:

https://observablehq.com/@d3/moving-average
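For readers who cannot open the notebook, the general technique can be sketched as below. This is not the notebook's code: `movingAverage` and its `windowSize` parameter are names chosen for this example, and it computes a trailing average using a running sum.

```javascript
// Trailing moving average over a fixed number of samples.
// Positions without a full window are left as NaN.
function movingAverage(values, windowSize) {
  const n = values.length;
  const means = new Array(n).fill(NaN);
  let sum = 0;
  for (let i = 0; i < n; ++i) {
    sum += values[i];
    // Drop the value that just left the window.
    if (i >= windowSize) sum -= values[i - windowSize];
    if (i >= windowSize - 1) means[i] = sum / windowSize;
  }
  return means;
}

const averaged = movingAverage([1, 2, 3, 4, 5], 3); // [NaN, NaN, 2, 3, 4]
```

The running sum keeps the cost linear in the number of values, regardless of the window size.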


mbostock commented on August 17, 2024

A kernel regression is another possibility, as in this kernel density estimation example. That implementation is O(n^2), but I suspect you could make it O(n lg n), or even O(n) if you assume that data is already sorted on x, and then implement a single-pass, fixed-window incremental algorithm.
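One way the sorted-input idea might look is sketched below, assuming points are [x, y] pairs already sorted by x and a kernel with compact support (Epanechnikov here). The function name `windowedKernelSmooth` is made up for the example. It is a sliding-window variant rather than a fully incremental one: the window bounds only ever move forward, but the weights inside the window are recomputed for each point, so the cost is roughly n times the typical number of points per window.

```javascript
// Kernel regression over points sorted by x, using a sliding window.
// Only points with |x - x0| < bandwidth contribute to each estimate.
function windowedKernelSmooth(points, bandwidth) {
  const n = points.length;
  const out = new Array(n);
  let lo = 0; // first index inside the window
  let hi = 0; // one past the last index inside the window
  for (let i = 0; i < n; ++i) {
    const x0 = points[i][0];
    // Both bounds only advance because the input is sorted.
    while (points[lo][0] <= x0 - bandwidth) ++lo;
    while (hi < n && points[hi][0] < x0 + bandwidth) ++hi;
    let sumW = 0;
    let sumWY = 0;
    for (let j = lo; j < hi; ++j) {
      const u = (points[j][0] - x0) / bandwidth;
      // Epanechnikov weight without the 0.75 factor; a constant
      // factor cancels in the ratio below.
      const w = 1 - u * u;
      sumW += w;
      sumWY += w * points[j][1];
    }
    out[i] = [x0, sumWY / sumW];
  }
  return out;
}
```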


mbostock commented on August 17, 2024

LOESS is another option.


mbostock commented on August 17, 2024

And here’s a simple test case:

[Image: screen shot 2015-12-01 at 1 17 59 pm]

With a smooth curve in R:

[Image: smooth curve in R]


curran commented on August 17, 2024

Yes! When I first saw this visualization I thought of trying to implement a gaussian KDE for use with D3.


mbostock commented on August 17, 2024

I couldn’t resist taking a crack at it. Here’s the slow O(n^2) implementation of kernel smoothing:

[Image: screen shot 2015-12-01 at 3 34 12 pm]

Things that would be nice:

  • Make it faster than O(n^2) by tracking a fixed window of input points (in x ± bandwidth).
  • Support kernels besides Epanechnikov.
  • Remove the scale factor 0.75 from the default Epanechnikov kernel, since it has no effect.
  • Maybe pick a default bandwidth of 20 (fast)? Or Silverman’s rule of thumb (slow; global)?
  • Maybe pick a default precision of 10 (distance in pixels between computed spline control points)?
  • Remove dependency on d3_scale.


mbostock commented on August 17, 2024

I’m going back to working on API documentation if anyone else wants to make a better implementation…


curran commented on August 17, 2024

Doing a little research, here's a list of common kernel functions:

  • Uniform
  • Triangular
  • Epanechnikov
  • Quartic
  • Triweight
  • Tricube
  • Gaussian
  • Cosine
  • Logistic
  • Silverman

Also, here's a formula for bandwidth estimation using "Silverman's rule of thumb".
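To make a few of these concrete, here is a small sketch of some of the listed kernels in their standard normalized forms, plus the commonly quoted form of Silverman's rule of thumb, h = 0.9 · min(σ, IQR / 1.34) · n^(-1/5). The names `kernels`, `quantileSorted` and `silvermanBandwidth` are made up for this example, and the rule is usually stated for Gaussian kernel density estimation, so treat it only as a starting point for a smoothing bandwidth.

```javascript
// A few common kernels, each a function of u = (x - x0) / bandwidth.
const kernels = {
  uniform: u => (Math.abs(u) <= 1 ? 0.5 : 0),
  triangular: u => (Math.abs(u) <= 1 ? 1 - Math.abs(u) : 0),
  epanechnikov: u => (Math.abs(u) <= 1 ? 0.75 * (1 - u * u) : 0),
  quartic: u => (Math.abs(u) <= 1 ? (15 / 16) * (1 - u * u) ** 2 : 0),
  gaussian: u => Math.exp(-0.5 * u * u) / Math.sqrt(2 * Math.PI)
};

// Quantile of an ascending-sorted array, with linear interpolation.
function quantileSorted(sorted, p) {
  const h = (sorted.length - 1) * p;
  const i = Math.floor(h);
  const a = sorted[i];
  const b = sorted[Math.min(i + 1, sorted.length - 1)];
  return a + (b - a) * (h - i);
}

// Silverman's rule of thumb. This is a global computation: it needs
// the standard deviation and interquartile range of all values.
function silvermanBandwidth(values) {
  const n = values.length;
  const sorted = Float64Array.from(values).sort();
  const mean = sorted.reduce((s, v) => s + v, 0) / n;
  const variance = sorted.reduce((s, v) => s + (v - mean) ** 2, 0) / (n - 1);
  const sd = Math.sqrt(variance);
  const iqr = quantileSorted(sorted, 0.75) - quantileSorted(sorted, 0.25);
  return 0.9 * Math.min(sd, iqr / 1.34) * Math.pow(n, -0.2);
}
```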


mbostock commented on August 17, 2024

See also science.js, which implements bandwidth estimators and various kernels.


e-n-f commented on August 17, 2024

My fit above, by the way, is the sum of two gaussian PDFs, plus the sum of the same gaussians for x-24 and x+24, all raised to a power, for no good reason other than that it looked about right. I'd love to know a better way than eyeballing it and then applying general curve fitting to guess the components of things that look like the sum of gaussians.


mbostock commented on August 17, 2024

See also ggplot2’s stat_smooth, which uses LOESS and GAM (generalized additive models). And ggplot2 has a geom_smooth which can smooth using a ribbon, showing an upper and lower bound on the smoothed model. That might be a nice curve type for an area shape.


curran commented on August 17, 2024

Here's the Loess + D3 example from science.js by @jasondavies. I made a bl.ock to study it:

[Image: loess]

It looks like the API expects loess(xValuesArray, yValuesArray) and returns an array of smoothed Y values.


mbostock commented on August 17, 2024

Since the kernel operates in pixel space, I bet it would be reasonable to just have stupid defaults for the bandwidth (±20px) and precision (control points every 10px). Though certainly it would be nice if Silverman’s rule of thumb were easy to implement if you wanted a better default value, albeit at the cost of a global computation.


curran commented on August 17, 2024

Here's an example that adds a LOESS curve to your tweet example using science.js:

[Image: loess]

I made this to try to understand the science.js LOESS implementation. Here are a few things I learned:

  • It was necessary to sort the data by time. I thought it would be sorted by time already (based on the order from the CSV file), but somehow it doesn't work without first sorting by time.
  • The loess function accepts arrays of values in data space, and surprisingly it works with dates.
  • The loess function returns an array of smoothed Y values that is the same length as the input arrays. I had hoped the length of the returned array would be configurable, but in this implementation it does not appear to be.
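Given those observations, a thin wrapper can do the sorting and pair the smoothed Y values back up with their X values. The sketch below is only an illustration: `smoothToPoints` is a made-up helper, and `smoother` stands for any function with the (xValues, yValues) → smoothedY shape described above. It does not call science.js itself.

```javascript
// Sort by x, run a smoother of the form (xs, ys) => smoothedYs,
// and return [x, smoothedY] points in data space.
function smoothToPoints(data, x, y, smoother) {
  // Copy before sorting so the caller's array is not reordered.
  // Subtraction also works when x returns Date objects.
  const sorted = data.slice().sort((a, b) => x(a) - x(b));
  const xs = sorted.map(d => x(d));
  const ys = sorted.map(d => y(d));
  const smoothed = smoother(xs, ys);
  return xs.map((xi, i) => [xi, smoothed[i]]);
}

// Usage with a stand-in smoother that returns the input unchanged.
const identitySmoother = (xs, ys) => ys;
const zipped = smoothToPoints(
  [{ t: 3, v: 30 }, { t: 1, v: 10 }, { t: 2, v: 20 }],
  d => d.t,
  d => d.v,
  identitySmoother
); // [[1, 10], [2, 20], [3, 30]]
```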


tomgp commented on August 17, 2024

+1 on the idea that this should be a data space operation.


mbostock commented on August 17, 2024

Probably should create a d3-smooth module for smoothing functions.


muonsw commented on August 17, 2024

Hope you don't mind me commenting but I found my way to this after looking at the issues on science.js.

I'm sure things have moved on since December, but I implemented a kernel smoothing function for use with D3 a while back. It cuts off the calculation for each point once the calculated weight goes below a defined threshold (with relation to the weight at the point that is being smoothed).

  • It looks like it scales linearly but there is a trade-off in accuracy. So for the twitter data in question it’s about an order of magnitude quicker (depending on the bandwidth) but gives a line that can be up to 0.03% different.
  • It assumes that data is sorted on the x-coordinate and hence that the calculation can be cut off.
  • It assumes that the kernel function will always decrease beyond the cut-off point.
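The general idea of such a cut-off can be sketched as follows. This is not the code from the gist: `cutoffKernelSmooth`, the Gaussian kernel, and the default `threshold` value are all assumptions made for the example. It relies on the same two conditions listed above, namely that the points are sorted by x and that the kernel keeps decreasing with distance.

```javascript
// Kernel smoothing over points sorted by x, scanning outward from
// each point and stopping once the weight falls below `threshold`
// relative to the weight at the point itself.
function cutoffKernelSmooth(points, bandwidth, threshold = 1e-4) {
  // Unnormalized Gaussian, so the weight at the center is exactly 1
  // and `threshold` is directly a relative cut-off.
  const gaussian = u => Math.exp(-0.5 * u * u);
  const n = points.length;
  return points.map(([x0, y0], i) => {
    let sumW = 1;
    let sumWY = y0;
    // Scan left until the weight becomes negligible.
    for (let j = i - 1; j >= 0; --j) {
      const w = gaussian((points[j][0] - x0) / bandwidth);
      if (w < threshold) break;
      sumW += w;
      sumWY += w * points[j][1];
    }
    // Scan right until the weight becomes negligible.
    for (let j = i + 1; j < n; ++j) {
      const w = gaussian((points[j][0] - x0) / bandwidth);
      if (w < threshold) break;
      sumW += w;
      sumWY += w * points[j][1];
    }
    return [x0, sumWY / sumW];
  });
}
```

A smaller threshold trades speed for accuracy, since more neighbours are visited before each scan stops.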

I'm fairly new to JavaScript so I'm sure that it’s not the best implementation. Anyway, I've created a gist if it’s of any interest (or use).

gist


Fil commented on August 17, 2024

d3-regression implements loess
https://github.com/HarryStevens/d3-regression


mbostock commented on August 17, 2024

This feels more appropriate to do in data space rather than geometrically, so I don’t think d3-shape is the right place for this. Still interested in seeing progress here, though.
