tidyverse / lubridate

Make working with dates in R just that little bit easier

Home Page: https://lubridate.tidyverse.org

License: GNU General Public License v3.0

Emacs Lisp 0.05% R 68.72% TeX 8.46% C 3.53% Shell 0.05% CSS 19.12% Makefile 0.07%
r date-time date

lubridate's Introduction

tidyverse

Overview

The tidyverse is a set of packages that work in harmony because they share common data representations and API design. The tidyverse package is designed to make it easy to install and load core packages from the tidyverse in a single command.

If you’d like to learn how to use the tidyverse effectively, the best place to start is R for Data Science (2e).

Installation

# Install from CRAN
install.packages("tidyverse")
# Install the development version from GitHub
# install.packages("pak")
pak::pak("tidyverse/tidyverse")

If you’re compiling from source, you can run pak::pkg_system_requirements("tidyverse") to see the complete set of system packages needed on your machine.

Usage

library(tidyverse) will load the core tidyverse packages and print a condensed summary of conflicts with other packages you have loaded:

library(tidyverse)
#> ── Attaching core tidyverse packages ─────────────────── tidyverse 2.0.0.9000 ──
#> ✔ dplyr     1.1.3     ✔ readr     2.1.4
#> ✔ forcats   1.0.0     ✔ stringr   1.5.0
#> ✔ ggplot2   3.4.4     ✔ tibble    3.2.1
#> ✔ lubridate 1.9.3     ✔ tidyr     1.3.0
#> ✔ purrr     1.0.2     
#> ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag()    masks stats::lag()
#> ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

You can see conflicts created later with tidyverse_conflicts():

library(MASS)
#> 
#> Attaching package: 'MASS'
#> The following object is masked from 'package:dplyr':
#> 
#>     select
tidyverse_conflicts()
#> ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag()    masks stats::lag()
#> ✖ MASS::select()  masks dplyr::select()
#> ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

And you can check that all tidyverse packages are up-to-date with tidyverse_update():

tidyverse_update()
#> The following packages are out of date:
#>  * broom (0.4.0 -> 0.4.1)
#>  * DBI   (0.4.1 -> 0.5)
#>  * Rcpp  (0.12.6 -> 0.12.7)
#>  
#> Start a clean R session then run:
#> install.packages(c("broom", "DBI", "Rcpp"))

Packages

As well as the core tidyverse, installing this package also installs a selection of other packages that you’re likely to use frequently, but probably not in every analysis. This includes packages for:

  • Working with specific types of vectors:

    • hms, for times.
  • Importing other types of data:

    • feather, for sharing with Python and other languages.
    • haven, for SPSS, SAS and Stata files.
    • httr, for web APIs.
    • jsonlite, for JSON.
    • readxl, for .xls and .xlsx files.
    • rvest, for web scraping.
    • xml2, for XML.
  • Modelling:

    • modelr, for modelling within a pipeline.
    • broom, for turning models into tidy data.

Code of Conduct

Please note that the tidyverse project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

lubridate's People

Contributors

alberthkcheng, batpigandme, billdenney, brunj7, cderv, davisvaughan, dougmitarotonda, dpseidel, garrettgman, gegznav, hadley, imanuelcostigan, jasonelaw, jimhester, jmobrien, joethorley, jonboiser, krlmlr, larmarange, lorenzwalthert, michaelchirico, mmaechler, qulogic, stragu, sushmitavgopalan16, tomcardoso, trevorld, vspinu, wibeasley, zeehio

lubridate's Issues

Support partial dates

How can lubridate support partial dates? (e.g. year + month, or year + day). Should setting (e.g.) yday to NA create a partial date? What sort of data structure is needed to support these dates?

how to work with date ranges greater than 16 years?

lubridate calculates the difference between dates in seconds:

a <- now() # "2009-07-27 11:01:12 CDT"
z <- now() # "2009-07-27 11:01:19 CDT"
z - a # 6.10436415672302 seconds

This difference is automatically stored as a duration. Since durations cannot handle intervals greater than 5e+08 seconds, we cannot subtract dates more than ~16 years apart.

a <- a + 250000001 # "2017-06-28 23:27:53 CDT"
z <- z - 250000011 # "2001-08-24 22:34:28 CDT"

z - a # Error in new_duration(second = as.numeric(x, units = "secs")) : seconds overflow: see 'duration' documentation

This limits the date combinations that we can subtract, but also hampers pretty.date() for date ranges greater than 16 years.
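As a rough check of the "~16 years" figure, the arithmetic can be sketched in base R. The 5e+08 limit and the two timestamps come from the issue above; the average year length of 365.25 days and the UTC time zone are simplifying assumptions.

```r
# Limit quoted in the issue, in seconds.
max_seconds <- 5e8

# Assumption: an average year of 365.25 days.
seconds_per_year <- 365.25 * 86400
limit_years <- max_seconds / seconds_per_year  # a little under 16 years

# The two timestamps from the issue (UTC assumed here for reproducibility).
a <- as.POSIXct("2017-06-28 23:27:53", tz = "UTC")
z <- as.POSIXct("2001-08-24 22:34:28", tz = "UTC")

# Base R can still express the gap; it just exceeds the duration limit.
gap <- as.numeric(difftime(a, z, units = "secs"))
gap > max_seconds
```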

BC dates

Reminder to self: consider BC dates and times (Before Christ / Before Common Era).

  • Garrett

TO READ: should_advise(...)

We should add a message to ymd() that appears depending on where ymd() is being used. The message reveals what format is being used.

ymd(..., advice = should_advise(...))

TO READ: what is now?

Should now always be the time the package is loaded (as it's currently written)? It will never really be "now" when it's called:
now <- Sys.time()

Or should it be a function that returns the "now" of whenever it's called?
now <- function() Sys.time()
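The difference between the two definitions can be seen in a short sketch. The names now_fixed and now_fun are illustrative only, chosen so that neither masks the other.

```r
# A value: captured once, when this line runs (e.g. at package load).
now_fixed <- Sys.time()

# A function: evaluated again on every call.
now_fun <- function() Sys.time()

Sys.sleep(0.1)

# The stored value has gone stale; the function has not.
elapsed <- as.numeric(difftime(now_fun(), now_fixed, units = "secs"))
elapsed > 0
```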

make unit names consistent with base R?

With base::difftime() R uses "secs", "mins", "hours", "days", and "weeks" for unit inputs.

With update.Date and other functions we're using "second", "minute", "hour", "*day", "week", "month", and "year".

Should we keep our unit names consistent with base R?
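One possible middle ground, sketched here as an assumption rather than a decision, is to keep the singular names and translate them when calling base R. The lookup table and the helper to_base_unit() are hypothetical; only difftime() and its units argument are base R.

```r
# Hypothetical mapping from singular unit names to difftime() unit names.
base_units <- c(second = "secs", minute = "mins", hour = "hours",
                day = "days", week = "weeks")

to_base_unit <- function(unit) {
  stopifnot(unit %in% names(base_units))
  unname(base_units[unit])
}

a <- as.POSIXct("2009-07-27 11:01:12", tz = "UTC")
z <- as.POSIXct("2009-07-28 11:01:12", tz = "UTC")

# One day apart, reported in hours.
hours_apart <- as.numeric(difftime(z, a, units = to_base_unit("hour")))
```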

lubridate must first load plyr

The guess_formats() code uses "mlply" from the plyr package.
Is it possible to make lubridate also load plyr when it installs?
Or could we include the code for mlply in the lubridate package?

Create new duration format

Instead of storing durations as data.frames, store them as atomic numbers where the 9 rightmost digits are seconds and the remainder are months.

Blocks of 126230400 seconds can be converted to four years, since this is the number of seconds in three common years and one leap year (1461 days). This will prevent the number of seconds from spilling into the months column. (An exception will have to be made for four-year blocks that contain a century year such as 1900 or 2100, which is divisible by four but is not a leap year. I'm still thinking about this.)

This change will require:

  • a new print_duration function
  • new as.duration(), is.duration(), new_duration() and related functions
  • new Ops functions (+, -, /, *)
  • the new ops functions seem to require new accessor functions for the durations (for example, second(duration))
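The packed format proposed in this issue can be sketched as follows. This is only an illustration of the idea: pack_duration(), unpack_duration() and SECONDS_BASE are hypothetical names, and the sketch assumes non-negative seconds below 1e9.

```r
# The 9 rightmost decimal digits hold seconds; the rest holds months.
SECONDS_BASE <- 1e9

pack_duration <- function(months, seconds) {
  stopifnot(seconds >= 0, seconds < SECONDS_BASE)
  months * SECONDS_BASE + seconds
}

unpack_duration <- function(x) {
  list(months = x %/% SECONDS_BASE, seconds = x %% SECONDS_BASE)
}

# Seconds in three common years plus one leap year.
four_years <- (3 * 365 + 366) * 86400

packed <- pack_duration(months = 14, seconds = 3600)
parts <- unpack_duration(packed)
```

Because the value is a plain double, whole numbers stay exact only up to 2^53, which still leaves room for several million months.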

Operation on vectors

Add code to +, -, *, / so that if one operand is shorter than the other (n1 < n2), our functions recycle it to the longer length, as a data frame would, and then perform the operation.
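A minimal sketch of that expansion step, using plain numeric vectors in place of durations. The helper recycle_add() is hypothetical; rep_len() is base R.

```r
# Expand both operands to the longer length, then operate element-wise.
recycle_add <- function(e1, e2) {
  n <- max(length(e1), length(e2))
  rep_len(e1, n) + rep_len(e2, n)
}

result <- recycle_add(c(1, 2), c(10, 20, 30, 40))
result  # 11 22 31 42
```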

get rid of my computer's stupid recurring error

This error message crashes R whenever I set options(error = recover), which is interfering with debugging

Whenever I type an open parenthesis immediately following a word, the GUI console immediately displays

Error in try(gsub("\s+", " ", paste(capture.output(print(args(options))), :
unused argument(s) (silent = TRUE)

without even waiting for me to execute the command.

Ensure that all examples work.

Once you have run roxygenise, you can use the following function to run all of the examples in the documentation. It's a bit faster than using R CMD check because it just runs the examples:

library(plyr)
library(tools)

run_examples <- function(path) {
  files <- dir(path, full.names = TRUE)

  parsed <- llply(files, parse_Rd)
  names(parsed) <- basename(files)

  extract_examples <- function(rd) {
    tags <- tools:::RdTags(rd)
    paste(unlist(rd[tags == "\\examples"]), collapse = "")
  }

  examples <- llply(parsed, extract_examples)
  m_ply(cbind(file = names(examples), example = examples), 
    function(file, example) {
      cat("Checking ", file, "...\n", sep = "")
      eval(parse(text = example))
    }
  )  
}

Find update.Date

It looks like it's disappeared in the latest version, so please go back in time and figure out what happened to it!

find a list of roxygen tags

...to use while documenting.

So far I can emulate this for anything

#' TITLE
#' Sub Title
#' (blank line)
#' Paragraph explaining stuff
#' (blank line)
#' @param
#' @author
#' @keywords
#' @examples
#' example1()
#' example2()
#example <- function()

Still to learn:
how do I combine documentation for functions?
add a see also tag?
put more than one author?
add a usage tag?
how do I link to other documentation pages?

Daylight Savings time is FUBAR

...just in general.

But it's also bad for lubridate. When DST changes, days are not exactly 24 hours long (they may be 23 or 25 hours).

date <- as.POSIXlt("2009-03-07 00:00:00") #"2009-03-07 00:00:00 CST"

If you work with financial markets, you may want
date + days(2) = "2009-03-09 00:00:00 CDT"

If you work in physical sciences, you may want
date + days(2) = "2009-03-09 01:00:00 CDT"

I've written code for both ways. But we can't ask people what they want each time they use the + function. Is there a way to make an option the user can set like:

options(days = "exact")
options(days = "relative")

?
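Both behaviours can be reproduced in base R, which may help when comparing them. This sketch assumes the America/Chicago time zone is available on the system; the "exact" and "relative" names simply mirror the option values suggested above.

```r
# Midnight the day before the 2009 spring-forward change in US Central time.
start <- as.POSIXct("2009-03-07 00:00:00", tz = "America/Chicago")

# "exact": add 2 * 24 hours of elapsed time.
exact <- start + 2 * 86400

# "relative": keep the same clock time two calendar days later.
relative <- seq(start, by = "2 DSTdays", length.out = 2)[2]

format(exact, "%Y-%m-%d %H:%M:%S")     # "2009-03-09 01:00:00"
format(relative, "%Y-%m-%d %H:%M:%S")  # "2009-03-09 00:00:00"
```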

fix year<-, and month<-

year<-(x, y) and month<-(x,y) both fail if x is a POSIXlt object with an unspecified timezone.

Recreate RNews table

Recreate the table in RNews 4/1 with lubridate. Everything should appear less complicated.

dmy, mdy, etc.

repopulate parse.r with the helper functions dmy() mdy() myd() etc. etc.

'month<-'(e1, e2) can not handle partial months

...because it uses ISOdate(), which returns NA when the month value is not an integer. This means we won't be able to add partial months using add_duration_to_date.

I wrote an alternate version of "month<-" that uses the sequence approach, and put it in the code after the original version.

This code works, but users will have to scratch their head twice instead of once to understand the 'month<-' function (because it will have the sequence script).

What do you suggest?

make as.period.interval

as.period.interval(interval, periods)

  1. "Bites out" as many periods as possible of the largest unit listed in periods.
  2. Repeats for the second largest unit, using the remaining interval, until all listed units have been handled.
  3. Appends the remaining number of seconds to the period.
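The "bite out" procedure can be sketched for fixed-length units. This is a simplification under stated assumptions: the interval is given as a number of seconds, months and years are left out because their length varies, and bite_out() and unit_seconds are hypothetical names.

```r
# Assumption: only units with a fixed length in seconds are supported.
unit_seconds <- c(week = 604800, day = 86400, hour = 3600, minute = 60)

bite_out <- function(total_seconds, units) {
  # Work from the largest requested unit down to the smallest.
  sizes <- sort(unit_seconds[units], decreasing = TRUE)
  out <- numeric(0)
  remaining <- total_seconds
  for (u in names(sizes)) {
    out[u] <- remaining %/% sizes[[u]]   # whole units that fit
    remaining <- remaining %% sizes[[u]] # what is left over
  }
  # Whatever is left is appended as seconds.
  c(out, second = remaining)
}

res <- bite_out(100000, c("hour", "day"))
res  # day = 1, hour = 3, second = 2800
```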

Override difftime

lubridate should override difftime to ensure that it always returns difference in seconds.
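Short of overriding difftime itself, a thin wrapper shows the intended behaviour. The name diff_seconds() is hypothetical; the units argument of difftime() is base R.

```r
# Always report the difference in seconds, whatever the size of the gap.
diff_seconds <- function(later, earlier) {
  as.numeric(difftime(later, earlier, units = "secs"))
}

a <- as.POSIXct("2009-07-27 11:01:12", tz = "UTC")
z <- as.POSIXct("2009-07-28 11:01:12", tz = "UTC")

auto_units <- units(z - a)   # base R picks the unit itself: "days"
secs <- diff_seconds(z, a)   # 86400
```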

make print.duration handle vectors and data frames

currently:

a
1 month and 1 second

c(a,a)
$months
[1] 1

$seconds
[1] 1

$months
[1] 1

$seconds
[1] 1

list(a,a)
[[1]]
1 month and 1 second

[[2]]
1 month and 1 second

data.frame( one = a)
one.months one.seconds
1 1 1

any suggestions?

TO READ: Make guess-format() differentiate between %y and %Y

Should write guess_format() on ymd() to choose between %y and %Y. Add "%" to the end of each string and then look for format strings that also end in "%", i.e., "__ __ __ %". Otherwise both will work and %y will just throw out the last extra two numbers.
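The sentinel trick can be tried in base R. One deliberate substitution: this sketch appends "|" rather than "%", to avoid relying on how a literal percent sign is parsed; the idea is otherwise the same.

```r
x <- "20090727"

# Without a sentinel, %y silently ignores the trailing digits.
loose <- as.Date(x, format = "%y%m%d")  # parsed as 2020-09-07

# With a sentinel, the format must consume the whole string.
marked <- paste0(x, "|")
two_digit  <- as.Date(marked, format = "%y%m%d|")  # NA: leftover digits
four_digit <- as.Date(marked, format = "%Y%m%d|")  # 2009-07-27
```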

Pretty date breaks

Need method like pretty() from base R that given a date range, provides a set of pretty breaks that nicely span the range.
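For comparison, current versions of base R already provide a pretty() method for Date objects, which gives a feel for the desired output. The date range below is only an example.

```r
# A range of roughly 16 years.
rng <- as.Date(c("2001-08-24", "2017-06-28"))

# Ask for about five breaks; the method picks round calendar dates.
breaks <- pretty(rng, n = 5)
breaks
```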

TO READ: sort out subtraction weirdness

for example:
y - d = 1 year, -1 weeks and 6 days

but it would be better to keep everything positive
(i.e, y - d = 11 months and ? days
= ?days (364 or 365))
On second thought, is the current method preferred?
(what about y - d = 1 year and -1 days)

Use @alias not @method

You only need to use @method if you're documenting a particular method. Otherwise just use @alias

Fix links

Links should be written like \code{\link{a}}, \code{\link{b}}

identify and handle four year periods with no leap years

The durations format automatically prevents seconds data from spilling into months data by moving four-year blocks of seconds to the months side. This works because 3 years + 1 leap year is (normally) a fixed number of seconds.

Sometimes, however, a four-year block does not have a leap year (for example, a block containing 1900, which is divisible by four but was not a leap year). In this case the duration will be 1 day too long.

I need to create a way to test for this and fix it.
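A test along these lines could start from the Gregorian leap-year rule. The helper names is_leap_year() and block_seconds() are hypothetical.

```r
# Gregorian rule: every 4th year, except centuries not divisible by 400.
is_leap_year <- function(year) {
  (year %% 4 == 0 & year %% 100 != 0) | year %% 400 == 0
}

# Number of seconds in the four calendar years starting at start_year.
block_seconds <- function(start_year) {
  years <- start_year:(start_year + 3)
  sum(ifelse(is_leap_year(years), 366, 365)) * 86400
}

normal_block <- block_seconds(2005)  # includes the leap year 2008
short_block  <- block_seconds(1897)  # 1900 is not a leap year
normal_block - short_block           # 86400, i.e. one day
```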

Make durations interact with difftime and numeric

"+.duration" <- "+.difftime" <- function() does not work because difftimes are added according to Ops.difftime()
"+.duration" <- "Ops.difftime()" doesn't work because Ops.difftime() needs a value for ".Generic"

What about unrecognized POSIXct dates?

A user may have a list of dates in POSIXct format, but without the class attribute, i.e. they'll just look like a list of numbers.

He or she could easily change them to POSIXct format using

as.POSIXct(x, origin = "1970-01-01")

Or whatever the origin date is (should they originate from another point).

We haven't written any commands that would handle these dates. Do we want to add to our existing commands? Or do we expect the user to get his or her POSIXct dates in shape with the above code before using lubridate?
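A small round trip illustrates the situation, assuming the numbers are seconds since 1970-01-01 in UTC.

```r
x <- as.POSIXct("2009-07-27 11:01:12", tz = "UTC")

# Dropping the class leaves a bare count of seconds.
stripped <- as.numeric(x)

# The user-side fix described above restores the date-time.
restored <- as.POSIXct(stripped, origin = "1970-01-01", tz = "UTC")
restored == x
```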

  • Garrett

print.duration output too long?

What do you think of the print.duration output? Here's the current output:

Duration: 0 years, 0 months, 0 weeks, 0 days, 0 hours, 0 minutes and 1 seconds

Should I make it exclude the zero entries and discriminate between 1 and more than one? (day vs. days)? Would it be a bad idea to get rid of "Duration:" ?

1 second
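The shorter output could be produced along these lines. This is a sketch on a plain named vector, not on the actual duration class, and format_units() is a hypothetical name.

```r
format_units <- function(x) {
  x <- x[x != 0]                          # drop zero entries
  if (length(x) == 0) return("0 seconds")
  # Use the singular name when the count is exactly 1.
  labels <- ifelse(x == 1, sub("s$", "", names(x)), names(x))
  parts <- paste(x, labels)
  n <- length(parts)
  if (n == 1) return(parts)
  # Join with commas, and "and" before the last part.
  paste(paste(parts[-n], collapse = ", "), "and", parts[n])
}

short <- format_units(c(years = 0, months = 1, weeks = 0, days = 0,
                        hours = 0, minutes = 0, seconds = 1))
short  # "1 month and 1 second"
```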

how to handle data.frames of duration objects?

Since a duration is also a data.frame, putting it into data structures that do not retain its class attributes converts it back into a data frame without warning.

  1. Putting durations into a vector with c() creates a list where each element is alternately a $months or $seconds value. We can prevent this, or insert a warning, by writing a c.duration() method, but this will not be called if the first element in the vector is not a duration.
  2. Putting durations into a list with list() creates a list where each element is a duration.
  3. Putting durations into a data.frame with data.frame() drops the duration class and cbinds the remaining 1 x 2 data frames.

Which of these would we like users to use with durations, and what do we want the output to look like?
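The dispatch limitation of c() mentioned in the first point can be demonstrated with a toy class. Everything here is hypothetical: "toy" stands in for the duration class and stores only seconds.

```r
# A toy stand-in for a duration.
toy <- function(seconds) structure(list(seconds = seconds), class = "toy")

# A c() method that keeps the class when combining.
c.toy <- function(...) {
  args <- list(...)
  secs <- unlist(lapply(args, function(a) {
    if (inherits(a, "toy")) a$seconds else a
  }))
  toy(secs)
}

x <- toy(1)
first  <- c(x, 2)  # dispatches on x, so c.toy() is used
second <- c(2, x)  # dispatches on 2, so the default c() is used

inherits(first, "toy")   # TRUE
inherits(second, "toy")  # FALSE
```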

expanding/contracting for weekends

Did we still want to make a function that would allow a user to expand or contract a period for weekends? We mentioned it at the beginning of the project. But I'm not sure what that would specifically look like. Perhaps:

today + business_days(days(3))
today + business_hours(hours(50))
tomorrow + market_days(days(2))

?
maybe business_days() could modify the class of its argument and then we could write a +/- method for that new class
something for version 2?
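As a starting point for discussion, skipping weekends can be sketched in base R. The function add_business_days() is hypothetical and deliberately ignores holidays and business hours.

```r
add_business_days <- function(date, n) {
  date <- as.Date(date)
  while (n > 0) {
    date <- date + 1
    wday <- as.POSIXlt(date)$wday  # 0 = Sunday, 6 = Saturday
    if (wday != 0 && wday != 6) n <- n - 1
  }
  date
}

# 2009-07-24 is a Friday; three business days later is Wednesday.
due <- add_business_days("2009-07-24", 3)
due  # "2009-07-29"
```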

Partial periods

Months and years are periods of non-fixed duration. Does lubridate add partial periods or require users to specify a partial period as a duration in days?
ex.
2009-06-14 + months(2.5)
or
2009-06-14 + months(2) + days(15)
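The ambiguity can be made concrete in base R. The sketch assumes that "half a month" would mean half of the month being stepped into, which is only one possible reading.

```r
start <- as.Date("2009-06-14")

# Month-by-month steps: 06-14, 07-14, 08-14, 09-14.
steps <- seq(start, by = "month", length.out = 4)
whole <- steps[3]  # after two whole months

# Explicit form: two months plus 15 days.
explicit <- whole + 15  # "2009-08-29"

# Fractional form: half of the third month, which has 31 days here.
third_month_days <- as.numeric(steps[4] - steps[3])
fractional_days <- 0.5 * third_month_days  # 15.5, not a whole number of days
```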

clean up update.Date()

Decide how to handle different orders of months and years. Remove commenting.

Add a note in ?update.Date explaining that it is just a setting function (not an adding function). Include examples demonstrating this.

create documentation

Create code to make the help files using roxygenize() and save them in a folder called man/.
