tidymetrics's Issues

cross_by_periods behavior

Hi - thanks for creating this great package. I have a question about the behavior of cross_by_periods. In the sample data below, the max date is 3/21/2020.

library(dplyr)
library(tidymetrics)

# Ten consecutive days, 2020-03-12 through 2020-03-21
df <- tibble::tibble(
  date = structure(c(18333, 18334, 18335, 18336, 18337, 18338, 18339, 
    18340, 18341, 18342), class = "Date"),
  count = c(8, 23, 38, 64, 97, 118, 156, 229, 314, 426)
)
df %>% 
  cross_by_periods("day", windows = 7) %>% 
  summarise(roll = mean(count))

Two questions:

  1. The function seems to add days depending on the window you specify (i.e. the max date in the data is 3/21, but the function adds days until 3/27 for the rolling_7d calculation). Is this the desired behavior?

  2. The function starts a 7-day rolling window from the first day (instead of the 7th). Is it possible to adjust this?

Thanks for your time.
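
For comparison, a trailing 7-day mean that only reports days with a full window behind them can be computed by hand. This is a minimal base R sketch on the same sample data, not a description of how cross_by_periods works internally; the names `window`, `roll`, and `full` are illustrative.

```r
# Same sample data as above: ten consecutive days ending 2020-03-21
df <- data.frame(
  date = as.Date("2020-03-12") + 0:9,
  count = c(8, 23, 38, 64, 97, 118, 156, 229, 314, 426)
)

window <- 7

# stats::filter with sides = 1 averages the current value and the 6 before it;
# the first (window - 1) rows have no full window and come back as NA
df$roll <- as.numeric(stats::filter(df$count, rep(1 / window, window), sides = 1))

# Keep only days with a complete 7-day window; no dates past the data are added
full <- df[!is.na(df$roll), ]
print(full)
```

With this approach the first reported day is the 7th day of data, and the last reported day is the max date in the data.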

add postgres unit tests

Hey @ramnathv, following up from pairing--I opened a PR with the code I was using to test tidymetrics against postgres. Let me know if there are any adjustments that would be useful!

Right now it's running well against a local db (spun up using the included docker-compose.yml file), but would need a couple tweaks to get up on travis.

I set it up to work with a subset of flights data, and copied one of the existing tests to work against postgres.

A couple of things to note:

  • I set the port to be 5433 locally (since my system-wide postgres uses the default), and the default 5432 on Travis.
  • I think there is a slight issue with the calculation of calendar dates (copied from datacamp's data-pipeline-views), causing the test to fail...

The datacamp code sets the date here to "2012-12-31", but the test expects "2013-01-01".


Misprint in function description

In the description of the function cross_by_dimensions, the word "All" is misspelled with an extra letter "l":

replaces the value of the column with the word "Alll"

Make create_metrics less opinionated

I was using tidymetrics in a screencast and I noticed how much more opinionated it is than it needs to be. This makes it difficult to get someone up and running.

I'd be very happy to implement this myself but wanted to run the approach by you @ramnathv

Current interface

Right now, create_metrics() requires the following in the YAML header:

  • name, which it then splits into three parts (because the first is generally metrics_), and the second and third turn into category and subcategory
  • owner
  • metrics, with title and description for each metric
  • dimensions, with title and description for each dimension
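
The name-splitting step described in the first bullet can be pictured roughly as follows. This is only a sketch of the idea in base R: `split_metric_name` is a hypothetical helper, not the code create_metrics() actually uses, and it assumes the name has exactly three underscore-separated parts.

```r
# Hypothetical helper: split a YAML `name` such as "metrics_stock_prices"
# into its leading prefix, category, and subcategory
split_metric_name <- function(name) {
  parts <- strsplit(name, "_", fixed = TRUE)[[1]]
  list(prefix = parts[1], category = parts[2], subcategory = parts[3])
}

parsed <- split_metric_name("metrics_stock_prices")
print(parsed)
```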

Proposal

I'm proposing a new interface. First, all the metadata is optional, so that if you run create_metrics on a table with a date column you'll get something right away.

  • category (optional)
  • subcategory (even more optional)
  • owner (optional)
  • metrics (optional): If this doesn't have anything in it, the titles will be the metric IDs, and the descriptions could be blank.
  • dimensions (optional): If this doesn't have anything in it, the titles will be the dimension IDs, and the descriptions could be blank.

(For backward compatibility we could maybe allow name that gets split up into category/subcategory, but I'm not even sure about that).

How we'd handle this in shinymetrics is an open question. If a description is NA, it could show no description at all, or could say something like "To fill in a description, add description: to the metric's metadata" or whatever.

Note that this would make the metric full IDs less strict; they wouldn't always be category_subcategory_prefix_metric, they might just be category_prefix_metric or just prefix_metric. But I think it's worth it to have people get up and running with a metric really quickly.
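
To make the looser ID scheme concrete, here is a minimal sketch of how a full ID could be assembled when category and subcategory are optional. `build_metric_id` and its arguments are hypothetical and only illustrate the proposal; they are not part of tidymetrics.

```r
# Hypothetical helper: join whichever ID parts are present with underscores.
# NULL parts are dropped by c(), so missing metadata simply shortens the ID.
build_metric_id <- function(metric, prefix = NULL, category = NULL, subcategory = NULL) {
  parts <- c(category, subcategory, prefix, metric)
  paste(parts, collapse = "_")
}

id_full <- build_metric_id("volume", prefix = "nb", category = "stock", subcategory = "prices")
id_category_only <- build_metric_id("volume", prefix = "nb", category = "stock")
id_minimal <- build_metric_id("volume", prefix = "nb")

print(c(id_full, id_category_only, id_minimal))
```

The three calls correspond to the category_subcategory_prefix_metric, category_prefix_metric, and prefix_metric forms mentioned above.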

check for missing dimensions

When making a metric with create_metrics(), it should check whether the documentation on dimensions is missing (similar to how it checks for missing metric documentation).
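
A rough sketch of what such a check could look like, assuming the dimension column names and the parsed `dimensions` metadata are both available as plain R objects. `check_missing_dimensions` is a hypothetical name and this is not the existing metric check inside create_metrics().

```r
# Hypothetical check: warn about dimension columns that have no entry
# in the `dimensions` section of the metadata
check_missing_dimensions <- function(dimension_cols, documented) {
  missing <- setdiff(dimension_cols, names(documented))
  if (length(missing) > 0) {
    warning(
      "Missing documentation for dimension(s): ",
      paste(missing, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(missing)
}

# Example metadata documenting only `symbol`
documented <- list(symbol = list(title = "Stock", description = "Stock symbol"))

# suppressWarnings() is used here only so the example can capture the result quietly
undocumented <- suppressWarnings(
  check_missing_dimensions(c("symbol", "exchange"), documented)
)
print(undocumented)
```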

When there's only one dimension, the All tab doesn't appear

Is this intentional?

Reproducible example. YAML header:

---
name: metrics_stock_prices
owner: drob
metrics:
  usd_close:
    title: Closing Price
    description: Close price, in USD, at the end of this time period.
  nb_volume:
    title: Volume
    description: Number of shares traded
dimensions:
  symbol:
    title: Stock
    description: Stock symbol
---

Code:

library(dplyr)
library(tidymetrics)
library(shinymetrics)
library(tidyquant)
stocks <- tq_get(c("AAPL", "GOOG"))

stocks_summarized <- stocks %>%
  cross_by_dimensions(symbol) %>%
  cross_by_periods(c("day", "week")) %>%
  summarize(nb_volume = sum(volume),
            usd_close = last(close))

m <- create_metrics(stocks_summarized)

preview_metric(m$stock_prices_nb_volume)

Result: [screenshot not included]

Cross by dimensions should be able to calculate 1-depth rather than all combinations

This makes the size of the intermediate table with k dimensions linear in k rather than 2^k.

library(dplyr)

# Full cross: each dimension is crossed with "All", so the table doubles per dimension
cross1 <- bind_rows(mutate(mtcars, wt = "All"), mtcars %>% mutate(wt = as.character(wt)))
result_full <- bind_rows(mutate(cross1, mpg = "All"), cross1 %>% mutate(mpg = as.character(mpg)))

# Depth 1: copies are appended instead of crossed, so the table grows linearly
cross1 <- bind_rows(mutate(mtcars, wt = "All"), mtcars %>% mutate(wt = as.character(wt))) %>% mutate(mpg = as.character(mpg))
cross2 <- bind_rows(cross1, mtcars %>% mutate(mpg = as.character(mpg)) %>% mutate(wt = as.character(wt)))

# Ideal interface something like
cross_by_dimension(mtcars, depth = NULL)
cross_by_dimension(mtcars, depth = 1)
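
One way to read "1-depth" is that at most one dimension keeps its actual values while every other dimension is set to "All". Under that assumption (which the issue does not spell out), a base R sketch could look like the following. `cross_depth_one` is a hypothetical function, not the proposed tidymetrics interface.

```r
# Hypothetical sketch: build (k + 1) copies of the data for k dimensions,
# instead of the 2^k copies produced by a full cross
cross_depth_one <- function(df, dims) {
  # Dimensions become character so "All" can sit alongside real values
  for (d in dims) df[[d]] <- as.character(df[[d]])

  # One copy with every dimension collapsed to "All"
  all_copy <- df
  for (d in dims) all_copy[[d]] <- "All"

  # One copy per dimension: keep that dimension, collapse the others
  single <- lapply(dims, function(keep) {
    out <- df
    for (d in setdiff(dims, keep)) out[[d]] <- "All"
    out
  })

  do.call(rbind, c(list(all_copy), single))
}

dims <- c("wt", "mpg", "cyl")
res <- cross_depth_one(mtcars, dims)

# 3 dimensions: (3 + 1) * 32 rows here, versus 2^3 * 32 for the full cross
print(nrow(res))
```

The result would still need a group_by and summarise over the dimension columns, as with the output of cross_by_dimensions.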
