The mortaar from isaakiel

Grouping variable with NAs

If the grouping variable contains NAs, the resulting error message is not really helpful. Because table() ignores NAs, it took me a very long time to find the cause.

The problem arises in input_functions.R from line 140 onwards:

   # Create a dataframe (restab) filled with zeros,
   # with the column count of the grouping columns (+2)
   # and the row count of the maximum age (+1).
   remat=matrix(data=0,ncol=length(unique(asd$Group))+2,nrow=(max(asd$ende,na.rm=TRUE)+1))
   restab=as.data.frame(remat)
   # Set the dataframes column names to age, the groups names and "All".
   names(restab)=c("Age",as.character(unique(asd$Group)),"All")
   # Set the age values from 0 to the maximum age +1.

To illustrate, consider the following code:

test<-c(rep(letters[1:3],5),NA)
table(test)
length(unique(test))
as.character(unique(test))

The simplest solution will be to throw an error, e.g. in line 136:

if (any(is.na(asd$Group))){stop("NA in grouping variable not allowed.")}

Seems very reasonable!
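
The mismatch described above can be reproduced in a few lines. This is a minimal sketch, not the code in input_functions.R; the helper name check_group is hypothetical and only illustrates the proposed early check.

```r
# Example vector from the issue: three groups plus one missing value.
test <- c(rep(letters[1:3], 5), NA)

# table() drops NA by default, so only three groups are counted ...
n_table <- length(table(test))

# ... while unique() keeps NA as a fourth "group", hence the mismatch.
n_unique <- length(unique(test))

# useNA = "ifany" makes table() report the missing values as well.
n_table_na <- length(table(test, useNA = "ifany"))

# Hypothetical helper: fail early with a clear message.
check_group <- function(group) {
  if (anyNA(group)) {
    stop("NA in grouping variable not allowed.")
  }
  invisible(group)
}
```

Calling check_group(asd$Group) before the result table is built would replace the unhelpful downstream error with the explicit message.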

Plot function - age category offset

I just realized that there is a problem with the plot function because the age categories are offset by one. This is due to the snippet cumsum(x$a). My first attempt cumsum(x$a) - x$a did not work. Any other ideas?

Example data

We need at least two more example datasets - so far just one (schleswig_ma) is properly prepared and referenced.

siedl.txt seems to be a promising candidate.

PreRelease ToDo: Output function

clean up the "ToDo's" in the code

rework examples and use the provided exemplary data

PreRelease ToDo: analytical_functions

I do not understand the new warning messages. I have just generated a life table from the Magdalenenberg data set, which only contains the age range and the number of deceased. I still got the warning "In one of your data.frames are more than the two necessary columns a, Dx. Note that these additional columns will be dropped in the output."

PreRelease: Submitting issues

I have finally submitted mortAAR to CRAN! However, I got it back with comments (see below). Actually, I am not sure what CRAN is expecting from us (especially concerning the "small executables").

Thanks, please elaborate the provided functionality in the description field.

Can you provide a reference in the 'Description' field of your DESCRIPTION file? If so, please write the reference in the form
authors (year) doi:...
authors (year) arXiv:...
authors (year) http:... (if doi/arXiv not available)
authors (year) https:... (if doi/arXiv not available)
authors (year, ISBN:...)
with no space after 'doi:', 'arXiv:', 'http:', 'https:' and angle brackets for auto-linking.

Please add more small executable examples in your Rd-files.
Something like
\examples{
examples for users:
executable in < 5 sec
for checks
\dontshow{
examples for checks:
executable in < 5 sec together with the examples above
not shown to users
}
\donttest{
further examples for users (not used for checks)
}
}
would be desirable.

Please do not comment out examples.

Please fix and resubmit.

PreRelease ToDo: Lifetable function

the default of the function shall be the "modern" approach, i.e. an acv for all ages younger than 5. [attention: this might/will affect the test]

multvec shall be printed to the output as well; called "nax" in Keyfitz [provide appropriate reference]
style the literature in documentation [and perhaps put some flesh on the bones on the roots of the method]
document the brand new Lx equations

Proposal for release names

For the release names, how about learning from the successful (SpaceX) and naming the releases after the ships from the Culture novels?

https://en.wikipedia.org/wiki/List_of_spacecraft_in_the_Culture_series

Some examples:

Of Course I Still Love You
Kiss My Ass
Irregular Apocalypse
The Hand of God 137
Ultimate Ship The Second
The Precise Nature Of The Catastrophe
Just Read the Instructions

PreRelease ToDo: Tests

praise your fetish. achieve 100% coverage ...

PreRelease ToDo: Authors

if you participated, put your name in the DESCRIPTION file

PreRelease ToDo: siedl data

implement, appropriately name and document the siedl.txt dataset

helpful link: http://r-pkgs.had.co.nz/data.html

Source of the data: Kokkotidis/Richter 1991, p. 228 Dok. 1.

File .csl not found in resource path

In the latest CRAN tests, an error surfaced in the processing of the references in the vignettes via pandoc. It has to be fixed by 2022/05/06:

Check: re-building of vignette outputs 
Result: ERROR 
    Error(s) in re-building vignettes:
    --- re-building 'mortAAR_vignette-1.Rmd' using rmarkdown
    File .csl not found in resource path
    Error: processing vignette 'mortAAR_vignette-1.Rmd' failed with diagnostics:
    pandoc document conversion failed with error 99
    --- failed re-building 'mortAAR_vignette-1.Rmd'
Error: Vignette re-building failed.
    Execution halted 
Flavor: [r-release-windows-x86_64]

A Google search did not bring up obvious solutions. The error probably implies that pandoc cannot access the csl file linked in the vignettes. The error only occurs on the Windows test system, but on my Mac, too, the command RCurl::url.exists('https://www.zotero.org/styles/offa') used in the vignettes returns FALSE, which of course leaves the inline expression "r library(RCurl); ifelse(url.exists('https://www.zotero.org/styles/offa'), 'https://www.zotero.org/styles/offa', '')" empty. The corresponding httr command, httr::GET('https://www.zotero.org/styles/offa'), strangely returns "Error in curl::curl_fetch_memory(url, handle = handle) : SSL certificate problem: certificate has expired". However, a test of the address yielded a valid certificate. Increasing the connecttimeout setting of RCurl did not help, either.
I am puzzled. One possible solution seems to be bundling a copy of the offa csl file with the package, but this would somehow defy the whole idea of csl styles. Any other ideas?
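
One way to combine both ideas might be to prefer the remote style and fall back to a bundled copy only when the URL cannot be reached. The following is a rough sketch under assumptions: choose_csl is a hypothetical helper, the probe function is passed in by the caller (for example RCurl::url.exists), and the local path would have to point to a csl file actually shipped with the vignettes.

```r
# Hypothetical helper: return the remote CSL URL if the probe succeeds,
# otherwise the path of a locally bundled copy.
# 'reachable' is any function taking a URL and returning TRUE/FALSE;
# errors raised by the probe (e.g. SSL problems) count as "not reachable".
choose_csl <- function(remote, local, reachable) {
  ok <- tryCatch(isTRUE(reachable(remote)), error = function(e) FALSE)
  if (ok) remote else local
}

# Illustration with stand-in probes instead of a real network call.
always_down <- function(u) FALSE
ssl_error   <- function(u) stop("SSL certificate problem")

csl_a <- choose_csl("https://www.zotero.org/styles/offa", "offa.csl", always_down)
csl_b <- choose_csl("https://www.zotero.org/styles/offa", "offa.csl", ssl_error)
```

With this pattern the csl field never ends up empty, which appears to be what triggers the "File .csl not found" message.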

Bug - representativity table

Hi, I probably found a bug in lt.representativity.mortaar_life_table (representativity.R file).

Line 74: There should be mortality$q15_5 instead of mortality$q10_5.
Line 77: There should be mortality$q0_5 / mortality$q15_5 instead of mortality$q0_5 / mortality$q10_5

Travis-build fails

@nevrome: Could you please test the latest master? Locally, everything tests fine, but the travis-build fails. The log says:

-- FAILURE (test_life_table.R:8:3): life.table produces the right output with sp
Snapshot of `life.table(nitra_prep, option_spline = 10)` has changed:
`old$x` is an S3 object of class <factor>
`new$x` is a character vector ('0--4', '5--9', '10--14', '15--19', '20--24', ...)

Run `snapshot_accept('life_table')` if this is a deliberate change

══ testthat results  ═══════════════════════════════════════════════════════════
FAILURE (test_life_table.R:8:3): life.table produces the right output with spline-function

[ FAIL 1 | WARN 0 | SKIP 0 | PASS 118 ]
Error: Test failures

I suppose it has to do with using the serialization option instead of 'json2'; with 'json2', however, the test fails, probably because of rounding issues.
Any ideas?

Install error

install_github('ISAAKiel/mortAAR') throws the following error:

Downloading GitHub repo ISAAKiel/mortAAR@master
from URL https://api.github.com/repos/ISAAKiel/mortAAR/zipball/master
Installing mortAAR
'/usr/lib/R/bin/R' --no-site-file --no-environ --no-save --no-restore --quiet CMD INSTALL  \
  '/tmp/RtmpF2gZuF/devtoolsd083f8a55e8/ISAAKiel-mortAAR-43265aa' --library='/home/dirk/R/x86_64-pc-linux-gnu-library/3.3'  \
  --install-tests 

* installing *source* package ‘mortAAR’ ...
** R
Error in parse(outFile) : 
  /tmp/RtmpF2gZuF/devtoolsd083f8a55e8/ISAAKiel-mortAAR-43265aa/R/input_functions.R:63:2: Unerwartete(s) 'else'
62: }
63: }else
     ^
ERROR: unable to collate and parse R files for package ‘mortAAR’
* removing ‘/home/dirk/R/x86_64-pc-linux-gnu-library/3.3/mortAAR’
Fehler: Command failed (1)

I think the line in question should be like:

  }
  if(length(methode)==1){
      meth=rep(methode,ceiling(max(asd$ende)/methode))
  }

As local setwd() is set within input_functions.R and Siedlungsbestattungen_ueberblick.txt is missing from the repo, I'm unable to install_local() and test the patch.
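
For reference, one common way to get an "unexpected 'else'" from the R parser is an else that starts after the if statement is already syntactically complete. The sketch below only demonstrates that general rule with parse(); whether this or a surplus closing brace is the actual cause in line 63 cannot be determined from the log alone.

```r
# 'else' on its own line: at top level the 'if' statement is already
# complete after the closing brace, so the parser rejects the 'else'.
broken <- "if (TRUE) {\n  1\n}\nelse {\n  2\n}"
res_broken <- try(parse(text = broken), silent = TRUE)

# 'else' on the same line as the closing brace parses fine.
fixed <- "if (TRUE) {\n  1\n} else {\n  2\n}"
res_fixed <- parse(text = fixed)
```

Running parse() on the patched file before committing would catch this class of error without needing a full install.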

Last checks of mortAAR 1.1 Schirndorf before release

mortAAR seems to be ready to be put back on CRAN, from which it had been removed due to some unstable dependencies. The new version is 1.1 and includes many new functions.
It would be great if some of you could take the time to test it and have a look at the new vignettes ("Life table corrections" and "Reproduction"). @nevrome suggested putting the vignettes together via bookdown, which seems to be a reasonable idea for the next release.

PostRelease: Quickcheck tests

This package will be on CRAN soon. We should use this for a much more reliable test infrastructure.

Add population size estimate

Very simple function with D, e0 (with options for correction), t (to be added by user) and option for correction factor.
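
A minimal sketch of what such a function could look like, assuming the common estimate P = D * e0 / t (number of deceased times life expectancy at birth, divided by the period of use of the cemetery). The function name pop_size and the additive form of the correction k are assumptions for illustration, not a settled design.

```r
# Hypothetical sketch of a population size estimate.
# D:  number of deceased
# e0: life expectancy at birth (possibly already corrected)
# t:  period of use of the cemetery in years, supplied by the user
# k:  optional additive correction (assumption; defaults to none)
pop_size <- function(D, e0, t, k = 0) {
  stopifnot(D >= 0, e0 > 0, t > 0)
  D * e0 / t + k
}

example_pop <- pop_size(D = 100, e0 = 25, t = 50)
```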

PreRelease ToDo: Input function

implement the option/"method-switcher" to state whether the age intervals are inclusive or exclusive, i.e. 20-39 and 40-60 vs. 20-40 and 40-60
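
The difference between the two conventions comes down to one year per interval. A small sketch, assuming the switch takes the values "included" and "excluded"; the helper name interval_length is hypothetical.

```r
# Hypothetical helper: length of an age interval in years.
# "included": the upper bound belongs to the interval (20-39 covers 20 years).
# "excluded": the upper bound is the start of the next one (20-40 covers 20 years).
interval_length <- function(from, to, agerange = c("included", "excluded")) {
  agerange <- match.arg(agerange)
  if (agerange == "included") to - from + 1 else to - from
}

len_incl <- interval_length(20, 39, "included")
len_excl <- interval_length(20, 40, "excluded")
```

Both calls describe the same 20-year span, which is what the switch has to guarantee regardless of how the source data were recorded.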

PreRelease ToDo: Fix group name in tabular output

I know it is distressful but at least on my system the problem with the group-name not included in the tabular output of the life table function still persists (e.g., ":female" instead of "sex:female"). Is this just me (or more specifically, my Mac)?

PreRelease ToDo: pitfalls

enhance documentation with typical pitfalls of archaeological data to avoid rubbish in: e.g. age intervals, apply acv etc.

function1() could be replaced by a call to tidyr::separate()

I think tidyr::separate() does the same thing as mortAAR::function1() -- and it's a lot more flexible and foolproof.

test <- read.table("data-raw/siedl.txt",header=T,sep="\t")

library(mortAAR)
function1(test, "Jahrefeld")

library(tidyr)
tidyr::separate(test, Jahrefeld, into = c("from", "to"), remove = FALSE)

I think we could mention this usecase in a vignette or an example, but we don't need to cover it with an own function in mortAAR.

Check if Bayesian Approach can be added

See this paper for a Bayesian Approach for Life table data with R-Code in the Supplemental Data:

https://doi.org/10.1002/ajpa.24211

PostRelease ToDo: Fancy Code

in general but in particular for the input (e.g. prep.life.table) function

aim: clean code, i.e. as few loops as possible; since dplyr and tidyverse things are already dependencies of the package their data-crunching tools could/should be used [R-Code 2.0 ...]

PostRelease: Uncomment summary-function

Comment from CRAN:

For your next version:
Please uncomment the example in summe.Rd:
#summentest=summe(c(1,2,NA,4),c(5,6,7,8)

PostRelease: fix default value of agerange in prep.life.table

At the moment, Usage states that "included" is the default value for agerange, while description says "excluded".

Plot option "color" does not work

For the new branch "Plot_option_line_kind" I have tried to implement an option to print the lines in the plots in color. Simply replacing line 115 in master

my_plot <- ggplot2::ggplot(my_x, ggplot2::aes_string(x="a",y=variable_name,lty="dataset"))

with

my_plot <- ggplot2::ggplot(my_x, ggplot2::aes_string(x="a",y=variable_name,color="dataset"))

does not work as expected; it only throws an error: Fehler: measure variables not found in data: color.
The way I tried to parametrize it in the new branch does not work, however.

PostRelease: as.mortaar_life_table and as.mortaar_life_table_list

I think we forgot the as. functions....

Release ToDos: CRAN Policy

PreRelease ToDo: check example datasets

the three already implemented datasets and their documentation need to be checked

PostRelease ToDo: further functions

quality criteria after Bocquet-Appel
Net Reproduction Rate

Pack Vignettes together via bookdown

To make vignettes more visible, they should be linked in a booklet:

https://bookdown.org/yihui/bookdown/

q5_0-Corrected life table

@nevrome, @MartinHinz, @chrinne: I have written some code implementing changes to the life table according to Bocquet-Appel and Masset to account for missing children. Originally, I had envisioned a new function for this, but now I doubt the usefulness of this approach. By now, I would favour an additional option in the function life.table, e. g. q5_0_correction = TRUE/FALSE, with FALSE as the default.
What do you think?

lapply(life_table, lt.function) falls back on default values

Unfortunately, the strategy to cater at the same time for mortaar_life_tables as well as mortaar_life_table_lists does not seem to work properly. In the case of mortaar_life_table_lists, the code snippet lapply(x, lt.function) falls back on default values, any custom values are ignored. Therefore, the task is to pipe custom values through the function lt.function.mortaar_life_table_list.
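
The usual way to pipe custom values through is to forward `...` in the list method. The sketch below uses a hypothetical generic lt.example and a made-up argument factor; it shows the pattern only, not the actual mortAAR functions.

```r
# Hypothetical generic with one method per class.
lt.example <- function(x, ...) UseMethod("lt.example")

# Method for a single life table: 'factor' stands in for any custom option.
lt.example.mortaar_life_table <- function(x, factor = 1, ...) {
  sum(x$Dx) * factor
}

# Method for a list of life tables: forward '...' so that custom
# values reach the single-table method instead of its defaults.
lt.example.mortaar_life_table_list <- function(x, ...) {
  lapply(x, lt.example, ...)
}

# Toy objects with the two classes.
tab <- structure(data.frame(Dx = c(1, 2, 3)),
                 class = c("mortaar_life_table", "data.frame"))
lst <- structure(list(a = tab, b = tab),
                 class = c("mortaar_life_table_list", "list"))

res_default <- lt.example(lst)
res_custom  <- lt.example(lst, factor = 2)
```

Without the `...` in the lapply() call, res_custom would silently equal res_default, which matches the behaviour described above.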

PreRelease ToDo: Fancy Language

Improve and harmonize language in documentation and vignettes.

PreRelease ToDo: Readme.md

make people curious [or at least help them to understand what mortAAR is about]..., i.e. describe the package, its ease of use, provided exemplary figures, ...

dependency tidyverse

mortAAR should not depend on the tidyverse! We should select the really necessary packages.

PreRelease ToDo: Vignette

The vignette should tell a story and present the strengths and possibilities of the package.

Nils suggested a comparison of different Iron Age settlements.

@ archaeologists: the non-archaeologists could participate after you give us some content/contextual information.

helpful link: http://r-pkgs.had.co.nz/vignettes.html

Adding further vignettes

The added functions for reproduction indices, maternal mortality, corrected life tables etc. need vignettes.

Prerelease: winbuilder error message

During the winbuilder check (https://win-builder.r-project.org/D7z18KlyQafp/) in "install.out" the following error message is generated:

During startup - Warning messages: 1: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, : there is no package called 'mixtrak.R.dev' 2: package 'mixtrak.R.dev' in options("defaultPackages") was not found

Where does 'mixtrak.R.dev' stem from? Any ideas?

All other checks (local mac, travis linux, winbuilder) are passed. This is the only one remaining before release!

Improvements in subsetting for plotting

At the moment, it is cumbersome to subset data for plotting (i. e., only a selection of the grouping variable). This is important because sometimes the plot is too crowded because of too many categories.

Proliferation of functions in the documentation

The help file starts to get inflated. It seems that the "rdname"-tag is ignored so that functions appear in the index list which are supposed to be subsumed under other entries. Any easy solutions?

PreRelease Discussion: output of life.table

life.table currently takes either a single data.frame or a list of data.frames as input. Each of these data.frames needs the columns a and Dx. life.table then calculates stuff and returns a mortaar_life_table_list of mortaar_life_tables (more or less a list of data.frames) with all the input variables and the newly calculated variables.

This method is a bit inconvenient and error-prone. Inconvenient first of all because the user always gets a mortaar_life_table_list, even if they just want to calculate one life table. I suggest changing the output of life.table to a mortaar_life_table when there is only one input data.frame.

Error-prone because a mortaar_life_table has a potentially inconsistent set of variables. Imagine for example that the input data.frame looks something like this:

 test <- data.frame(a = rep(5, 12), Dx = abs(round(rnorm(12)*50, 0)), fluff = 1:12, dx = 12:1)
 life.table(test)

Here there are not only a and Dx, but also the completely unrelated variables fluff and dx. The resulting mortaar_life_table will contain all the usual life table variables and also the variable fluff. The initial dx on the other hand gets overwritten, because it happens to have the same name as a variable that's calculated in life.table.

I see some possible solutions for this inconsistency. Either we keep all the additional input variables and give our new variables rarer names (maybe by adding a "." in front of the names -> ".dx"/".ex"/... - this solution is applied for example by broom::augment) or we drop the additional input variables beyond a and Dx.

The first approach allows more convenient workflows: The user can keep everything in one table. The second approach may be a bit more stable, because a mortaar_life_table always looks exactly the same.

Another idea: We keep all the additional input variables and warn the user in case of a name collision. I like this idea.
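
The warning variant could be sketched roughly as follows. The helper name warn_on_collision and the vector of computed column names are assumptions for illustration; the real list would come from whatever life.table actually calculates.

```r
# Hypothetical helper: warn if input columns would be overwritten
# by variables calculated in life.table.
# 'computed' is an assumed list of calculated column names.
warn_on_collision <- function(input,
                              computed = c("dx", "qx", "lx", "Lx", "Tx", "ex")) {
  clash <- intersect(names(input), computed)
  if (length(clash) > 0) {
    warning("Input columns will be overwritten: ",
            paste(clash, collapse = ", "))
  }
  invisible(clash)
}

# Input with an unrelated column (fluff) and a colliding one (dx).
test <- data.frame(a = rep(5, 12), Dx = 1:12, fluff = 1:12, dx = 12:1)
clash <- suppressWarnings(warn_on_collision(test))
```

Here fluff would be kept untouched, while the user is told that dx is about to be replaced.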
