pillar

pillar provides tools for styling columns of data, artfully using colour and unicode characters to guide the eye.

Due to limitations of GitHub's Markdown display, formatting cannot be shown in the README. The same content is available on https://pillar.r-lib.org/ with proper formatting.

Installation

# pillar is installed if you install the tidyverse package:
install.packages("tidyverse")

# Alternatively, install just pillar:
install.packages("pillar")

Usage

pillar is a developer-facing package that is not designed for end-users. It powers the print() and format() methods for tibbles. It also defines generics and helpers that are useful for package authors who create custom vector classes (see https://github.com/krlmlr/awesome-vctrs#readme for examples) or custom table classes (like dbplyr or sf).

library(pillar)

x <- 123456789 * (10 ^ c(-3, -5, NA, -8, -10))
pillar(x)
#> <pillar>
#>       <dbl>
#> 123457.    
#>   1235.    
#>     NA     
#>      1.23  
#>      0.0123

tbl_format_setup(tibble::tibble(x))
#> <pillar_tbl_format_setup>
#> <tbl_format_header(setup)>
#> # A data frame: 5 × 1
#> <tbl_format_body(setup)>
#>             x
#>         <dbl>
#> 1 123457.    
#> 2   1235.    
#> 3     NA     
#> 4      1.23  
#> 5      0.0123
#> <tbl_format_footer(setup)>

Custom vector classes

The primary user of this package is tibble, which lets pillar do all the formatting work. Packages that implement a data type to be used in a tibble column can customize the display by implementing a pillar_shaft() method.

library(pillar)

percent <- vctrs::new_vctr(9:11 * 0.01, class = "percent")

pillar_shaft.percent <- function(x, ...) {
  fmt <- format(vctrs::vec_data(x) * 100)
  new_pillar_shaft_simple(paste0(fmt, " ", style_subtle("%")), align = "right")
}

pillar(percent)
#> <pillar>
#> <percent>
#>       9 %
#>      10 %
#>      11 %

See vignette("pillar", package = "vctrs") for details.

Custom table classes

pillar provides various extension points for customizing how a tibble-like class is printed.

tbl <- vctrs::new_data_frame(list(a = 1:3), class = c("my_tbl", "tbl"))

tbl_sum.my_tbl <- function(x, ...) {
  c("Hello" = "world!")
}

tbl
#> # Hello: world!
#>       a
#>   <int>
#> 1     1
#> 2     2
#> 3     3

See vignette("extending", package = "pillar") for a walkthrough of the options.


Code of Conduct

Please note that the pillar project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

pillar's People

Contributors

batpigandme, davidchall, gavinsimpson, github-actions[bot], hadley, indrajeetpatil, jimhester, krlmlr, lionel-, michaelchirico, rkahne, romainfrancois, wibeasley

pillar's Issues

Flexible widths

colformat should return an object that can be rendered at multiple widths, and should include some metadata about possible widths, as well as the optimal width.
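
A minimal base-R sketch of what such an object could look like. This is an illustration only, not pillar's implementation: `new_flex_format()` and `render_at()` are hypothetical names, and the truncation marker `~` is an arbitrary choice.

```r
# Hypothetical constructor: keeps the text plus metadata about possible widths.
new_flex_format <- function(x) {
  text <- as.character(x)
  structure(
    list(
      text = text,
      min_width = min(3L, max(nchar(text))),  # narrowest width we agree to render
      opt_width = max(nchar(text))            # width at which nothing is truncated
    ),
    class = "flex_format"
  )
}

# Hypothetical renderer: truncate entries that do not fit, mark them with "~",
# then pad every entry on the right to exactly `width` characters.
render_at <- function(fmt, width) {
  stopifnot(width >= fmt$min_width)
  too_long <- nchar(fmt$text) > width
  out <- fmt$text
  out[too_long] <- paste0(substr(out[too_long], 1, width - 1), "~")
  formatC(out, width = width, flag = "-")
}

fmt <- new_flex_format(c("apple", "kiwi", "pomegranate"))
fmt$opt_width
render_at(fmt, 6)
```

The same object can then be rendered again at `fmt$opt_width` or at any width down to `fmt$min_width` without recomputing the text.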

Can pillar_shaft.POSIXt() respect the digits.secs option?

I think it would be nice if pillar_shaft.POSIXt() printed POSIXct objects the same way base R does, respecting getOption("digits.secs"). This is mainly useful for printing fractional seconds, which the current implementation never shows.

The following seems to work well enough for me:

pillar_shaft.POSIXt <- function(x, ...) {

  # Number of fractional digits to print. The option may be unset (NULL),
  # so fall back to 0 explicitly.
  fractional       <- getOption("digits.secs", default = 0L)
  # The width needs to be adjusted. If we are printing fractional seconds, add
  # the number of fractional digits plus 1 for the decimal point; otherwise add nothing.
  fractional_width <- if (fractional > 0) fractional + 1 else 0

  date <- format(x, format = "%Y-%m-%d")

  # Use the "%OS" format to print. When we don't use any fractional seconds, I believe
  # "%OS0" is equivalent to "%S"
  time <- format(x, format = paste0("%H:%M:%OS", fractional))

  datetime <- paste0(date, " ", style_subtle(time))
  datetime[is.na(x)] <- NA

  # Add to the width
  new_pillar_shaft_simple(datetime, width = 19 + fractional_width, align = "left")
}

Using this gives:

from <- as.POSIXct("14:03:55", format="%H:%M:%OS",tz="UTC")
to   <- as.POSIXct("14:04:00", format="%H:%M:%OS", tz="UTC")

ex <- tibble::tibble(datetime = seq(from, to, by = 0.01))

ex
#> # A tibble: 501 x 1
#>    datetime           
#>    <dttm>             
#>  1 2018-01-04 14:03:55
#>  2 2018-01-04 14:03:55
#>  3 2018-01-04 14:03:55
#>  4 2018-01-04 14:03:55
#>  5 2018-01-04 14:03:55
#>  6 2018-01-04 14:03:55
#>  7 2018-01-04 14:03:55
#>  8 2018-01-04 14:03:55
#>  9 2018-01-04 14:03:55
#> 10 2018-01-04 14:03:55
#> # ... with 491 more rows

options(digits.secs = 4)

ex
#> # A tibble: 501 x 1
#>    datetime                
#>    <dttm>                  
#>  1 2018-01-04 14:03:55.0000
#>  2 2018-01-04 14:03:55.0099
#>  3 2018-01-04 14:03:55.0199
#>  4 2018-01-04 14:03:55.0299
#>  5 2018-01-04 14:03:55.0399
#>  6 2018-01-04 14:03:55.0499
#>  7 2018-01-04 14:03:55.0599
#>  8 2018-01-04 14:03:55.0699
#>  9 2018-01-04 14:03:55.0799
#> 10 2018-01-04 14:03:55.0899
#> # ... with 491 more rows

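The results from using options(digits.secs = 4) are a bit strange (the fractional seconds appear to be truncated rather than rounded, e.g. .0099 where .0100 would be expected), but I think this has been confirmed as the intended output by R core.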

Fix formatting on Windows

Especially on R-devel. Known-output tests differ. Ideally, we'd be able to recreate the same output on Windows and Linux with LC_CTYPE=C or LC_CTYPE=latin1.

Idea on datetimes/timezones

Following this idea from tidyverse/tibble#173 - guessing that this is the place now.

The idea in the original issue is to do something like:

#> # A tibble: 1 × 3
#>                 time1               time2               time3
#>                <dttm>           <dttm-02>           <dttm+11>
#> 1 2015-06-01 01:00:00 2015-06-01 01:00:00 2015-06-01 01:00:00

The non-DST offset is shown in the column header.

Given that you are now using color and font-weight as signifiers, could this be applied to datetimes?

The column header could be something like <dttm+11+12> then the +11, +12, and values could be colored/weighted accordingly.

Show significant but constant digits in a different color

Example: lat-lon of a local area, years, years and months in dates varying by days, strings with a common prefix, ...

The check should simply look at the decimal/textual representation:

  1. If the sign differs between values, abort
  2. If the leftmost digit differs anywhere, abort
  3. Otherwise, highlight that digit and proceed with the next digit

How should the significant-but-constant digits be formatted?

Because this heuristic seems to be valid not only for numbers, but also dates, hms, strings, ..., maybe we need a generic helper.
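
One way to sketch such a generic helper in base R is to work on the formatted strings and count how many leading characters all values share. `common_prefix_length()` is a hypothetical name, and how the shared part is then styled is left open, as in the question above.

```r
# Hypothetical helper: number of leading characters shared by all
# non-missing elements of a character vector.
common_prefix_length <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) return(0L)
  chars <- strsplit(x, "", fixed = TRUE)
  n <- min(lengths(chars))
  len <- 0L
  for (i in seq_len(n)) {
    ith <- vapply(chars, `[`, character(1), i)
    if (length(unique(ith)) > 1) break  # first position that differs: stop
    len <- len + 1L
  }
  len
}

lat <- format(c(47.3769, 47.3801, 47.3722), nsmall = 4)
k <- common_prefix_length(lat)
substr(lat[1], 1, k)  # the constant part that could be shown in a different colour
```

Because a leading minus sign is just another character, vectors with mixed signs stop at the first position, which matches step 1 of the list. The same function applies unchanged to formatted dates or to strings with a common prefix.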

Reference: tidyverse/tibble#305 (comment)

How to use with a data frame?

I'm looking forward to using this to present tables via data frames. But I can't find a data frame method in the package so far, and this doesn't work:

colformat.data.frame <- function(x, ...){
  x[] <- lapply(x, colformat)
  x
}

xx <- colformat(antibiotic[, 2:4])
str(xx)

So I'm curious to know how we can format columns in a dataframe with these methods. Thanks!

control over < > for print.tbl_df method

The standard way to print units of measure is between square brackets, as in [km/h]. When I do this in a tibble header, I get <[km/h]>, which looks odd (example below) and takes unnecessary space. Is there a way to get rid of the < and >? Should I raise this issue with tibble?

suppressPackageStartupMessages(library(units))
mt = mtcars
mt$mpg = set_units(mt$mpg, km/h)
library(tibble)
(m <- as.tibble(mt))
# # A tibble: 32 x 11
#         mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#  * <[km/h]> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#  1     21.0  6.00   160 110    3.90  2.62  16.5  0     1.00  4.00  4.00
#  2     21.0  6.00   160 110    3.90  2.88  17.0  0     1.00  4.00  4.00
#  3     22.8  4.00   108  93.0  3.85  2.32  18.6  1.00  1.00  4.00  1.00
#  4     21.4  6.00   258 110    3.08  3.22  19.4  1.00  0     3.00  1.00
#  5     18.7  8.00   360 175    3.15  3.44  17.0  0     0     3.00  2.00
#  6     18.1  6.00   225 105    2.76  3.46  20.2  1.00  0     3.00  1.00
#  7     14.3  8.00   360 245    3.21  3.57  15.8  0     0     3.00  4.00
#  8     24.4  4.00   147  62.0  3.69  3.19  20.0  1.00  0     4.00  2.00
#  9     22.8  4.00   141  95.0  3.92  3.15  22.9  1.00  0     4.00  2.00
# 10     19.2  6.00   168 123    3.92  3.44  18.3  1.00  0     4.00  4.00
# # ... with 22 more rows

Consider helpers API

And need to think about sparkline and sparkbar (which are useful helpers for lists of numeric vectors)

Should they have a common prefix?

I think this should be the focus of the second release of pillar, and for now we probably just want to un-export spark_line() and spark_bar().

Possible update of reference to earliest usage of technique

Hi

This looks like a great package. I noticed the reference to sparklines being first used in 2009. Edward Tufte actually wrote about this technique in The Visual Display of Quantitative Information back in 1983. A good discussion of the history of his usage and prior art can be found here: https://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=000AIr

Perhaps this link can be included in the readme? It certainly is interesting!

Best
Leon

Last observation carried forward

It would be handy to provide a helper class that only produces a value when it changes, i.e.

x <- c("a", "a", "a", "b", "b")
format(locf(x))
#> a . . b .
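
A minimal sketch of how this could behave, written in base R as an S3 class. `locf()` is the name used in the request; the constructor, the `placeholder` argument and the handling of missing values are assumptions made for illustration.

```r
# Hypothetical constructor: wraps a vector in a "locf" class.
locf <- function(x) structure(list(values = x), class = "locf")

# Hypothetical format method: repeated values are replaced by a placeholder.
format.locf <- function(x, ..., placeholder = ".") {
  values <- as.character(x$values)
  n <- length(values)
  if (n < 2) return(values)
  # TRUE where an element equals the one directly before it
  repeated <- c(FALSE, values[-1] == values[-n])
  repeated[is.na(repeated)] <- FALSE  # keep NA visible instead of hiding it
  values[repeated] <- placeholder
  values
}

x <- c("a", "a", "a", "b", "b")
format(locf(x))
```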

Handle more than one column

so that printing the body part of a tibble becomes entirely the responsibility of this package.

New constructor: multicolformat() or colformats().

@hadley: Do you have a strong opinion against this?

Likert helper

likert(c(1, 3, 5), max = 5)
#> +----
#> --+--
#> ----+
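
The output above could be produced by something like the following base-R sketch. The function body is an assumption that simply reproduces the example; it is not an existing pillar helper.

```r
# Hypothetical helper: one string per response, with "+" at the chosen
# level and "-" at every other level of the scale.
likert <- function(x, max) {
  stopifnot(all(x >= 1 & x <= max, na.rm = TRUE))
  vapply(x, function(level) {
    if (is.na(level)) return(NA_character_)
    marks <- rep("-", max)
    marks[level] <- "+"
    paste(marks, collapse = "")
  }, character(1))
}

likert(c(1, 3, 5), max = 5)
```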

No colours in Linux or Windows

I'm not seeing any colour in the output on Linux or Windows. Is there something else I need to install? What is a console that supports colour? That would be helpful to know for using this package.

Here's my RStudio console in Windows:

[Screenshot: col_win]

Here's my RStudio console in Linux:

[Screenshot: col_linux]

0 displayed as NA

print(tibble::tibble(x = c(0, 1e-30)), width = 20)
#> # A tibble: 2 x 1
#>                   x
#>               <dbl>
#> 1      NANANANA    
#> 2          1.00e⁻³⁰

Control significant figures with an option

For instance, I see that

Digits after the first three are dimmed to emphasise the important components.

But you'll have to explain to me how the digits 201 are the important components of the year column in the example 😛 . Being able to change the default behavior somehow to give me 4 digits would be nice in instances like my class where students end up reading in a lot of examples where year is a column. I want them to like tibbles and not get frustrated by this unintuitive behavior.

Strange formatting of AsIs list columns

Maybe due to as.character.AsIs() ?

library(magrittr)
list(a = 1:3, b = list(1, 1:2, 1:3)) %>% pillar::colonnade()
#>       a b        
#>   <int> <list>   
#> 1     1 <dbl [1]>
#> 2     2 <int [2]>
#> 3     3 <int [3]>
list(a = 1:3, b = I(list(1, 1:2, 1:3))) %>% pillar::colonnade()
#>       a b         
#>   <int> <S3: AsIs>
#> 1     1 1         
#> 2     2 1:2       
#> 3     3 1:3

@hadley: How do we deal with this?

Related: tidyverse/tibble#304

NA styling

I'm not in love with the current treatment. Did we try yellow or red foreground colour?

Don't print decimals if no value in vector has decimals.

When importing data from other software packages into R (e.g. from Stata, SAS or SPSS, using haven), vectors are of type double, even if they hold only integers.

Would you mind checking whether a vector has "floating point" values or actually holds "integer-doubles", and then omitting the decimals in the latter case? (A vector has true decimals if something like is.numeric(x) && !all(x %% 1 == 0, na.rm = TRUE) holds.)

Current output:

library(tibble)
tibble(a = c(1, 2, 3), b = c(1L, 2L, 3L))
#> # A tibble: 3 x 2
#>       a     b
#>   <dbl> <int>
#> 1  1.00     1
#> 2  2.00     2
#> 3  3.00     3

Since all values in a are "integers", the desired output would look like column b. The problem is that this is only a guess: the column may be a genuine double, or it may have been intended as an integer. But I can imagine (new) R users being confused when they see their values as "integers" in the SPSS data sheet and as doubles in the R console.
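
The check described above can be sketched in base R as follows. Both function names are hypothetical, and the tolerance is an assumption; this only illustrates the proposed heuristic and says nothing about how pillar decides.

```r
# Hypothetical predicate: TRUE if a double vector holds only whole numbers
# (missing values are ignored).
is_whole_double <- function(x, tol = sqrt(.Machine$double.eps)) {
  is.double(x) && all(abs(x - round(x)) < tol, na.rm = TRUE)
}

# Hypothetical formatter: drop the decimals only when no value has any.
format_compact <- function(x) {
  if (is_whole_double(x)) format(x, nsmall = 0) else format(x, nsmall = 2)
}

is_whole_double(c(1, 2, 3))
is_whole_double(c(1, 2.5, NA))
format_compact(c(1, 2, 3))
```

Comparing against `round(x)` with a tolerance instead of testing `x %% 1 == 0` is a design choice here: it is more forgiving of values that pick up tiny floating point errors during import.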

setting string max length?

hello everyone, thanks for your great work!

Just wondering if there are any plans to implement a maximum string length? I think it's pretty useful. For instance, in Pandas one could simply do:

In [43]: df = pd.DataFrame(np.array([['foo', 'bar', 'bim', 'uncomfortably long string'],
   ....:                             ['horse', 'cow', 'banana', 'apple']]))
   ....: 

In [44]: pd.set_option('max_colwidth',40)

In [45]: df
Out[45]: 
       0    1       2                          3
0    foo  bar     bim  uncomfortably long string
1  horse  cow  banana                      apple

In [46]: pd.set_option('max_colwidth', 6)

In [47]: df
Out[47]: 
       0    1      2      3
0    foo  bar    bim  un...
1  horse  cow  ba...  apple

In [48]: pd.reset_option('max_colwidth')

which is really helpful when one prints a tibble that contains both text and numeric values.
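
For comparison, a rough base-R equivalent of that truncation might look like this. `truncate_strings()` and its `max_width` argument are hypothetical and only loosely mirror the pandas behaviour shown above; they are not an existing pillar or tibble option.

```r
# Hypothetical helper: strings longer than `max_width` characters are cut
# so that the result, including the trailing "...", is `max_width` long.
truncate_strings <- function(x, max_width) {
  stopifnot(max_width > 3)
  too_long <- !is.na(x) & nchar(x) > max_width
  x[too_long] <- paste0(substr(x[too_long], 1, max_width - 3), "...")
  x
}

truncate_strings(c("foo", "banana", "uncomfortably long string"), max_width = 5)
```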
