mortaar's Issues

Grouping variable with NAs

If the grouping variable contains NAs, the resulting error message is not really helpful. Because table() ignores NAs, it took me a very long time to find the cause.

The problem arises in input_functions.R from line 140 onward:

   # Create a dataframe (restab) filled with zeros,
   # with the column count of the grouping columns (+2)
   # and the row count of the maximum age (+1).
   remat=matrix(data=0,ncol=length(unique(asd$Group))+2,nrow=(max(asd$ende,na.rm=T)+1))
   restab=as.data.frame(remat)
   # Set the dataframes column names to age, the groups names and "All".
   names(restab)=c("Age",as.character(unique(asd$Group)),"All")
   # Set the age values from 0 to the maximum age +1.

To illustrate, here is some code:

test<-c(rep(letters[1:3],5),NA)
table(test)
length(unique(test))
as.character(unique(test))

The simplest solution is probably to throw an error, e.g. in line 136:

if (any(is.na(asd$Group))){stop("NA in grouping variable not allowed.")}
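
A minimal sketch of the mismatch and of the proposed guard, in case it helps the discussion. The helper name check_group_na is hypothetical; asd and its Group column follow the names used above.

```r
# Hypothetical helper wrapping the proposed check; not part of mortAAR.
check_group_na <- function(asd) {
  if (anyNA(asd$Group)) {
    stop("NA in grouping variable not allowed.")
  }
  invisible(asd)
}

test <- c(rep(letters[1:3], 5), NA)
length(table(test))    # 3: table() drops NA by default
length(unique(test))   # 4: unique() keeps NA as a separate value

# The guard stops early with a clear message instead of failing later.
res <- tryCatch(check_group_na(data.frame(Group = test)),
                error = function(e) conditionMessage(e))
```

The column count of restab is derived from unique(), so a single NA adds a column that table() never reports; that is presumably why the failure is hard to trace.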

Seems very reasonable!

Plot function - age category offset

I just realized that there is a problem with the plot function because the age categories are offset by one. This is due to the snippet cumsum(x$a). My first attempt cumsum(x$a) - x$a did not work. Any other ideas?

Example data

We need at least two more example datasets - so far just one (schleswig_ma) is properly prepared and referenced.

siedl.txt seems to be a promising candidate.

PreRelease ToDo: analytical_functions

I do not understand the new warning messages. I have just generated a life table from the Magdalenenberg data set, which contains only the age range and the number of deceased. I still got the warning "In one of your data.frames are more than the two necessary columns a, Dx. Note that these additional columns will be dropped in the output."

PreRelease: Submitting issues

I have finally submitted mortAAR to CRAN! However, I got it back with comments (see below). Actually, I am not sure what CRAN is expecting from us (especially concerning the "small executables").

Thanks, please elaborate the provided functionality in the description field.

Can you provide a reference in the 'Description' field of your DESCRIPTION file? If so, please write the reference in the form
authors (year) doi:...
authors (year) arXiv:...
authors (year) http:... (if doi/arXiv not available)
authors (year) https:... (if doi/arXiv not available)
authors (year, ISBN:...)
with no space after 'doi:', 'arXiv:', 'http:', 'https:' and angle brackets for auto-linking.

Please add more small executable examples in your Rd-files.
Something like
\examples{
examples for users:
executable in < 5 sec
for checks
\dontshow{
examples for checks:
executable in < 5 sec together with the examples above
not shown to users
}
\donttest{
further examples for users (not used for checks)
}
}
would be desirable.

Please do not comment out examples.

Please fix and resubmit.
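
For what it is worth, the template from the CRAN comment could be written as a roxygen block roughly like the one below. This is only a sketch: the toy data frame is made up for illustration, and it assumes that life.table() accepts a data.frame with the columns a and Dx and that a plot method for its result exists.

```r
#' @examples
#' # Shown to users and run during checks (should finish within seconds).
#' # 'toy' is invented illustration data, not one of the package datasets.
#' toy <- data.frame(a = rep(5, 12), Dx = c(10, 8, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1))
#' life.table(toy)
#'
#' \dontshow{
#' # Run during checks, but hidden from the rendered help page.
#' stopifnot(is.list(life.table(toy)))
#' }
#'
#' \donttest{
#' # Further examples for users; per the CRAN comment, not used for checks.
#' plot(life.table(toy))
#' }
```

The block would sit directly above the function it documents, and none of the example lines are commented out inside the Rd output, which seems to be what the last CRAN remark asks for.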

PreRelease ToDo: Lifetable function

the default of the function shall be the "modern" approach, i.e. an acv for all ages younger than 5. [attention: this might/will affect the test]

  • multvec shall be printed to the output as well; called "nax" in Keyfitz [provide appropriate reference]
  • style the literature in documentation [and perhaps put some flesh on the bones on the roots of the method]
  • document the brand new Lx equations

File .csl not found in resource path

In the latest CRAN tests, an error surfaced in the processing of the references in the vignettes via pandoc. It has to be fixed by 2022/05/06:

Check: re-building of vignette outputs 
Result: ERROR 
    Error(s) in re-building vignettes:
    --- re-building 'mortAAR_vignette-1.Rmd' using rmarkdown
    File .csl not found in resource path
    Error: processing vignette 'mortAAR_vignette-1.Rmd' failed with diagnostics:
    pandoc document conversion failed with error 99
    --- failed re-building 'mortAAR_vignette-1.Rmd'
Error: Vignette re-building failed.
    Execution halted 
Flavor: [r-release-windows-x86_64]

A Google search did not bring up any obvious solutions. The error probably means that pandoc cannot access the CSL file linked in the vignettes. The error only occurs on the Windows test system, but on my Mac, too, the command RCurl::url.exists('https://www.zotero.org/styles/offa') used in the vignettes returns FALSE, which of course leaves the text string "r library(RCurl); ifelse(url.exists('https://www.zotero.org/styles/offa'), 'https://www.zotero.org/styles/offa', '')" empty. The corresponding httr command, httr::GET('https://www.zotero.org/styles/offa'), strangely returns Error in curl::curl_fetch_memory(url, handle = handle) : SSL certificate problem: certificate has expired. However, a check of the address showed a valid certificate. Increasing the connecttimeout setting of RCurl did not help either.
I am puzzled. One possible solution would be to bundle a copy of the offa CSL file with the package, but this would somewhat defeat the whole idea of CSL styles. Any other ideas?
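
One possible compromise, sketched below with base R only: use the remote style when it can be reached and fall back to a bundled copy otherwise. The function names resolve_csl and url_is_reachable and the local file name offa.csl are hypothetical, and whether pandoc itself can then fetch the remote file on the CRAN machines remains an open assumption.

```r
# Returns TRUE if the URL can be opened, FALSE otherwise (base R only).
url_is_reachable <- function(u) {
  tryCatch({
    con <- suppressWarnings(url(u, open = "rb"))
    close(con)
    TRUE
  }, error = function(e) FALSE)
}

# Hypothetical helper: prefer the remote style, else use the local fallback.
# 'is_reachable' can be swapped out, e.g. for RCurl::url.exists.
resolve_csl <- function(remote, fallback, is_reachable = url_is_reachable) {
  if (isTRUE(is_reachable(remote))) remote else fallback
}

# Possible use in the vignette YAML header (assumes a bundled offa.csl):
# csl: "`r resolve_csl('https://www.zotero.org/styles/offa', 'offa.csl')`"
```

With a fallback in place the csl field is never left empty, which appears to be what triggers "File .csl not found in resource path".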

Bug - representativity table

Hi, I probably found a bug in lt.representativity.mortaar_life_table (representativity.R file).

Line 74: There should be mortality$q15_5 instead of mortality$q10_5.
Line 77: There should be mortality$q0_5 / mortality$q15_5 instead of mortality$q0_5 / mortality$q10_5

Travis-build fails

@nevrome: Could you please test the latest master? Locally, everything tests fine, but the travis-build fails. The log says:

-- FAILURE (test_life_table.R:8:3): life.table produces the right output with sp
Snapshot of `life.table(nitra_prep, option_spline = 10)` has changed:
`old$x` is an S3 object of class <factor>
`new$x` is a character vector ('0--4', '5--9', '10--14', '15--19', '20--24', ...)

Run `snapshot_accept('life_table')` if this is a deliberate change

== testthat results ===========================================================
FAILURE (test_life_table.R:8:3): life.table produces the right output with spline-function

[ FAIL 1 | WARN 0 | SKIP 0 | PASS 118 ]
Error: Test failures

I suppose it has to do with using the serialization option for the snapshot instead of 'json2'; with 'json2', however, the test fails, probably because of rounding issues.
Any ideas?
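
The snapshot difference in the log is between a factor and a character vector that carry the same labels. A small base R sketch of why such a comparison fails, and of one possible way to make the column type explicit before comparing. Whether this matches the actual cause on Travis (for instance, differing R versions or defaults between the two setups) is an assumption.

```r
x_chr <- c("0--4", "5--9", "10--14")
x_fct <- factor(x_chr, levels = x_chr)

identical(x_fct, x_chr)                # FALSE: same labels, different types
identical(as.character(x_fct), x_chr)  # TRUE once the type is made explicit
```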

Install error

install_github('ISAAKiel/mortAAR') throws the following error:

Downloading GitHub repo ISAAKiel/mortAAR@master
from URL https://api.github.com/repos/ISAAKiel/mortAAR/zipball/master
Installing mortAAR
'/usr/lib/R/bin/R' --no-site-file --no-environ --no-save --no-restore --quiet CMD INSTALL  \
  '/tmp/RtmpF2gZuF/devtoolsd083f8a55e8/ISAAKiel-mortAAR-43265aa' --library='/home/dirk/R/x86_64-pc-linux-gnu-library/3.3'  \
  --install-tests 

* installing *source* package ‘mortAAR’ ...
** R
Error in parse(outFile) : 
  /tmp/RtmpF2gZuF/devtoolsd083f8a55e8/ISAAKiel-mortAAR-43265aa/R/input_functions.R:63:2: unexpected 'else'
62: }
63: }else
     ^
ERROR: unable to collate and parse R files for package ‘mortAAR’
* removing ‘/home/dirk/R/x86_64-pc-linux-gnu-library/3.3/mortAAR’
Error: Command failed (1)

I think the line in question should be like:

  }
  if(length(methode)==1){
      meth=rep(methode,ceiling(max(asd$ende)/methode))
  }

Since setwd() is called locally within input_functions.R and Siedlungsbestattungen_ueberblick.txt is missing from the repo, I am unable to run install_local() and test the patch.

Last checks of mortAAR 1.1 Schirndorf before release

mortAAR seems to be ready to go back on CRAN, from which it had been removed because of some unstable dependencies. The new version is 1.1 and includes many new functions.
It would be great if some of you could test it and have a look at the new vignettes ("Life table corrections" and "Reproduction"). @nevrome suggested putting the vignettes together via bookdown, which seems to be a reasonable idea for the next release.

Add population size estimate

Very simple function with D, e0 (with options for correction), t (to be added by user) and option for correction factor.
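
A rough sketch of what such a function might look like. It assumes the commonly used stationary-population estimate P = k * D * e0 / t, with D the number of deceased, e0 the life expectancy at birth, t the duration of use of the cemetery in years, and k a multiplicative correction factor. The function name, the argument names, and the choice of a multiplicative k are assumptions, not the package's API.

```r
# Hypothetical sketch; formula and argument names are assumptions.
population_size <- function(D, e0, t, k = 1) {
  stopifnot(D >= 0, e0 > 0, t > 0, k > 0)
  k * D * e0 / t
}

# 200 burials, life expectancy of 25 years, 100 years of use:
population_size(D = 200, e0 = 25, t = 100)  # 50
```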

PreRelease ToDo: Input function

implement the option/"method-switcher" to state whether the age intervals are inclusive or exclusive, i.e. 20-39 and 40-60 vs. 20-40 and 40-60
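
To make the difference concrete, here is a small sketch of how the interval length would be derived under each convention. The function name interval_years and the argument inclusive are hypothetical.

```r
# Hypothetical helper: number of years covered by an age range.
# inclusive = TRUE  : the upper bound belongs to the interval (20-39)
# inclusive = FALSE : the upper bound is the start of the next one (20-40)
interval_years <- function(from, to, inclusive = TRUE) {
  if (inclusive) to - from + 1 else to - from
}

interval_years(20, 39, inclusive = TRUE)   # 20
interval_years(20, 40, inclusive = FALSE)  # 20
```

Reading an exclusive range as inclusive would add one year to every interval, so the switch has to be applied before individuals are distributed over the age classes.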

PreRelease ToDo: Fix group name in tabular output

I know it is frustrating, but at least on my system the problem of the group name missing from the tabular output of the life table function still persists (e.g., ":female" instead of "sex:female"). Is this just me (or, more specifically, my Mac)?

PreRelease ToDo: pitfalls

enhance documentation with typical pitfalls of archaeological data to avoid rubbish in: e.g. age intervals, apply acv etc.

function1() could be replaced by a call to tidyr::separate()

I think tidyr::separate() does the same thing as mortAAR::function1() -- and it's a lot more flexible and foolproof.

test <- read.table("data-raw/siedl.txt", header = TRUE, sep = "\t")

library(mortAAR)
function1(test, "Jahrefeld")

library(tidyr)
tidyr::separate(test, Jahrefeld, into = c("from", "to"), remove = FALSE)

I think we could mention this use case in a vignette or an example, but we don't need to cover it with a dedicated function in mortAAR.

PostRelease ToDo: Fancy Code

in general but in particular for the input (e.g. prep.life.table) function

aim: clean code, i.e. as few loops as possible; since dplyr and other tidyverse packages are already dependencies of the package, their data-crunching tools could/should be used [R-Code 2.0 ...]

Plot option "color" does not work

For the new branch "Plot_option_line_kind" I have tried to implement an option to print the lines in the plots in color. Simply replacing line 115 in master

my_plot <- ggplot2::ggplot(my_x, ggplot2::aes_string(x="a",y=variable_name,lty="dataset"))

with

my_plot <- ggplot2::ggplot(my_x, ggplot2::aes_string(x="a",y=variable_name,color="dataset"))

works as expected, except that it throws an error: Error: measure variables not found in data: color.
The way I tried to parametrize it in the new branch does not work, however.

Release ToDos: CRAN Policy

  • The ownership of copyright and intellectual property rights of all components of the package must be clear and unambiguous (including from the authors specification in the DESCRIPTION file). Where code is copied (or derived) from the work of others (including from R itself), care must be taken that any copyright/license statements are preserved and authorship is not misrepresented.
    Preferably, an ‘Authors@R’ would be used with ‘ctb’ roles for the authors of such code. Alternatively, the ‘Author’ field should list these authors as contributors.
    Where copyrights are held by an entity other than the package authors, this should preferably be indicated via ‘cph’ roles in the ‘Authors@R’ field, or using a ‘Copyright’ field (if necessary referring to an inst/COPYRIGHTS file).
    Trademarks must be respected.
  • The package’s DESCRIPTION file must show both the name and email address of a single designated maintainer (a person, not a mailing list). That contact address must be kept up to date, and be usable for information mailed by the CRAN team without any form of filtering, confirmation …
    The maintainer warrants that (s)he is acting on behalf of all credited authors and has their agreement to use their material in the way it is included in the package (or if this is not possible, warrants that it is used in accordance with the license granted by the original author).
    Additional DESCRIPTION fields could be used for providing email addresses for contacting the package authors/developers (e.g., ‘Contact’), or a URL for submitting bug reports (e.g., ‘BugReports’).
    Citations in the ‘Description’ field of the DESCRIPTION file should be in author-year style, followed by a DOI or ISBN for published materials, or a URL otherwise. Preferably, the year and identifier would be enclosed, respectively, in parentheses and angle brackets.
  • Source packages may not contain any form of binary executable code.
  • Source packages under an Open Source license must provide source or something which can easily be converted back to source (e.g., .rda files) for all components of the package (including for example PDF documentation, configure files produced by autoconf). For Java .class and .jar files, the sources should be in a top-level java directory in the source package (or that directory should explain how they can be obtained).
    Such packages are not permitted to require (e.g., by specifying in ‘Depends’, ‘Imports’ or ‘LinkingTo’ fields) directly or indirectly a package or external software which restricts users or usage.
    The package’s license must give the right for CRAN to distribute the package in perpetuity. Any change to a package’s license must be highlighted when an update is submitted (for there have been instances of an undocumented license change removing even the right of CRAN to distribute the package).
    Packages with licenses not listed at https://svn.r-project.org/R/trunk/share/licenses/license.db will generally not be accepted.
  • Package authors should make all reasonable efforts to provide cross-platform portable code. Packages will not normally be accepted that do not run on at least two of the major R platforms. Cases for Windows-only packages will be considered, but CRAN may not be the most appropriate place to host them.
  • Packages should be named in a way that does not conflict (irrespective of case) with any current or past CRAN package (the Archive area can be consulted), nor any current Bioconductor package. Package maintainers give the right to use that package name to CRAN when they submit, so the CRAN team may orphan a package and allow another maintainer to take it over.
    When a new maintainer wishes to take over a package, this should be accompanied by the written agreement of the previous maintainer (unless the package has been formally orphaned).
  • Packages on which a CRAN package depends should be available from a mainstream repository: if any mentioned in ‘Suggests’ or ‘Enhances’ fields are not from such a repository, where to obtain them at a repository should be specified in an ‘Additional_repositories’ field of the DESCRIPTION file (as a comma-separated list of repository URLs) or for other means of access, described in the ‘Description’ field.
    A package listed in ‘Suggests’ or ‘Enhances’ should be used conditionally in examples or tests if it cannot straightforwardly be installed on the major R platforms.
  • CRAN versions of packages should work with the current CRAN and Bioconductor releases of dependent packages and not anticipate nor recommend development versions of such packages on other repositories.
  • Packages will not normally be removed from CRAN: however, they may be archived, including at the maintainer’s request.
    Packages for which R CMD check gives an ‘ERROR’ when a new R x.y.0 version is released will be archived (or in exceptional circumstances updated by the CRAN team) unless the maintainer has set a firm deadline for an upcoming update (and keeps to it).
    Maintainers will be asked to update packages which show any warnings or significant notes, especially at around the time of a new x.y.0 release. Packages which are not updated are liable to be archived.
  • Packages should be of the minimum necessary size. Reasonable compression should be used for data (not just .rda files) and PDF documentation: CRAN will if necessary pass the latter through qpdf.
    As a general rule, neither data nor documentation should exceed 5MB (which covers several books). A CRAN package is not an appropriate way to distribute course notes, and authors will be asked to trim their documentation to a maximum of 5MB.
    Where a large amount of data is required (even after compression), consideration should be given to a separate data-only package which can be updated only rarely (since older versions of packages are archived in perpetuity).
    Similar considerations apply to other forms of “data”, e.g., .jar files.
  • Checking the package should take as little CPU time as possible, as the CRAN check farm is a very limited resource and there are thousands of packages. Long-running tests and vignette code can be made optional for checking, but do ensure that the checks that are left do exercise all the features of the package.
    If running a package uses multiple threads/cores it must never use more than two simultaneously: the check farm is a shared resource and will typically be running many checks simultaneously.
    Examples should run for no more than a few seconds each: they are intended to exemplify to the would-be user how to use the functions in the package.
  • The code and examples provided in a package should never do anything which might be regarded as malicious or anti-social. The following are illustrative examples from past experience.
    • Compiled code should never terminate the R process within which it is running. Thus C/C++ calls to assert/abort/exit, Fortran calls to STOP and so on must be avoided. Nor may R code call q().
    • A package must not tamper with the code already loaded into R: any attempt to change code in the standard and recommended packages which ship with R is prohibited. Altering the namespace of another package should only be done with the agreement of the maintainer of that package.
    • Packages should not write in the users’ home filespace, nor anywhere else on the file system apart from the R session’s temporary directory (or during installation in the location pointed to by TMPDIR: and such usage should be cleaned up). Installing into the system’s R installation (e.g., scripts to its bin directory) is not allowed.
      Limited exceptions may be allowed in interactive sessions if the package obtains confirmation from the user.
    • Packages should not modify the global environment (user’s workspace).
    • Packages should not start external software (such as PDF viewers or browsers) during examples or tests unless that specific instance of the software is explicitly closed afterwards.
    • Packages should not send information about the R session to the maintainer’s or third-party sites without obtaining confirmation from the user.
    • Packages must not disable the stack-checking mechanism in the R process into which they are loaded.
    • CRAN packages should use only the public API. Hence they should not use entry points not declared as API in installed headers nor .Internal() nor .Call() etc calls to base packages. Also, ::: should not be used to access undocumented/internal functions in base packages. Such usages can cause packages to break at any time, even in patched versions of R.
  • Changes to CRAN packages causing significant disruption to other packages must be agreed with the CRAN maintainers well in advance of any publicity. Introduction of packages providing back-compatibility versions of already available packages is not allowed.
  • Downloads of additional software or data as part of package installation or startup should only use secure download mechanisms (e.g., ‘https’ or ‘ftps’).

q5_0-Corrected life table

@nevrome, @MartinHinz, @chrinne: I have written some code that adjusts the life table according to Bocquet-Appel and Masset to account for missing children. Originally, I had envisioned a new function for this, but I now doubt the usefulness of this approach. I would instead favour an additional option in the function life.table, e.g. q5_0_correction = TRUE/FALSE, with FALSE as the default.
What do you think?

lapply(life_table, lt.function) falls back on default values

Unfortunately, the strategy to cater at the same time for mortaar_life_tables as well as mortaar_life_table_lists does not seem to work properly. In the case of mortaar_life_table_lists, the code snippet lapply(x, lt.function) falls back on default values, any custom values are ignored. Therefore, the task is to pipe custom values through the function lt.function.mortaar_life_table_list.
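
A minimal sketch of how custom values could be passed through, using S3 dispatch and the dots argument. lt.function is the placeholder name from above; the argument k and the toy computation are made up for illustration.

```r
lt.function <- function(x, ...) UseMethod("lt.function")

# Method for a single life table; 'k' stands in for any custom option.
lt.function.mortaar_life_table <- function(x, k = 1, ...) {
  sum(x$Dx) * k
}

# Method for a list of life tables: forward '...' so that custom values
# reach every element instead of falling back on the defaults.
lt.function.mortaar_life_table_list <- function(x, ...) {
  lapply(x, lt.function, ...)
}

one <- structure(data.frame(a = 5, Dx = 10),
                 class = c("mortaar_life_table", "data.frame"))
many <- structure(list(g1 = one, g2 = one),
                  class = c("mortaar_life_table_list", "list"))

res <- lt.function(many, k = 2)  # k = 2 is applied to g1 and g2
```

Without the dots in the lapply() call, each element would be processed with k = 1, which is the behaviour described in this issue.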

PreRelease ToDo: Readme.md

make people curious [or at least help them to understand what mortAAR is about]..., i.e. describe the package and its ease of use, provide example figures, ...

dependency tidyverse

mortAAR should not depend on the tidyverse! We should select the really necessary packages.

PreRelease ToDo: Vignette

The vignette should tell a story and present the strengths and possibilities of the package.

Nils suggested a comparison of different Iron Age settlements.

@ archaeologists: the non-archaeologists could participate after you give us some content/contextual information.

helpful link: http://r-pkgs.had.co.nz/vignettes.html

Adding further vignettes

The added functions for reproduction indices, maternal mortality, corrected life tables etc. need vignettes.

Prerelease: winbuilder error message

During the winbuilder check (https://win-builder.r-project.org/D7z18KlyQafp/) in "install.out" the following error message is generated:

During startup - Warning messages: 1: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, : there is no package called 'mixtrak.R.dev' 2: package 'mixtrak.R.dev' in options("defaultPackages") was not found

Where does 'mixtrak.R.dev' stem from? Any ideas?

All other checks (local mac, travis linux, winbuilder) are passed. This is the only one remaining before release!

Improvements in subsetting for plotting

At the moment, it is cumbersome to subset data for plotting (i. e., only a selection of the grouping variable). This is important because sometimes the plot is too crowded because of too many categories.
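
One conceivable shape for such a helper, sketched in base R. It assumes that a mortaar_life_table_list is essentially a named list with a class attribute; the function name subset_life_tables is hypothetical, and any further attributes the real objects carry are not handled here.

```r
# Hypothetical helper: keep only the named groups and restore the class,
# so that the result can still be dispatched to the list methods.
subset_life_tables <- function(x, groups) {
  missing_groups <- setdiff(groups, names(x))
  if (length(missing_groups) > 0) {
    stop("Unknown group(s): ", paste(missing_groups, collapse = ", "))
  }
  structure(unclass(x)[groups], class = class(x))
}

tables <- structure(
  list(a = data.frame(a = 5, Dx = 1),
       b = data.frame(a = 5, Dx = 2),
       c = data.frame(a = 5, Dx = 3)),
  class = c("mortaar_life_table_list", "list")
)

few <- subset_life_tables(tables, c("a", "c"))
names(few)  # "a" "c"
```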

Proliferation of functions in the documentation

The help file starts to get inflated. It seems that the "rdname"-tag is ignored so that functions appear in the index list which are supposed to be subsumed under other entries. Any easy solutions?

PreRelease Discussion: output of life.table

life.table currently takes either a single data.frame or a list of data.frames as input. Each of these data.frames needs the columns a and Dx. life.table then calculates stuff and returns a mortaar_life_table_list of mortaar_life_tables (more or less a list of data.frames) with all the input variables and the newly calculated variables.

This method is a bit inconvenient and error-prone. It is inconvenient first of all because the user always gets a mortaar_life_table_list, even if they just want to calculate one life table. I suggest changing the output of life.table to a mortaar_life_table when there is only one input data.frame.

Error-prone because a mortaar_life_table has a potentially inconsistent set of variables. Imagine for example that the input data.frame looks something like this:

 test <- data.frame(a = rep(5, 12), Dx = abs(round(rnorm(12)*50, 0)), fluff = 1:12, dx = 12:1)
 life.table(test)

Here there are not only a and Dx, but also the completely unrelated variables fluff and dx. The resulting mortaar_life_table will contain all the usual life table variables and also the variable fluff. The initial dx on the other hand gets overwritten, because it happens to have the same name as a variable that's calculated in life.table.

I see some possible solutions for this inconsistency. Either we keep all the additional input variables and give our new variables more rare names (maybe by adding a "." in front of the names -> ".dx"/".ex"/... - this solution is applied for example by broom::augment) or we scrape away the additional input variables beyond a and Dx.

The first approach allows more convenient workflows: The user can keep everything in one table. The second approach may be a bit more stable, because a mortaar_life_table always looks exactly the same.

Another idea: We keep all the additional input variables and warn the user in case of a name collision. I like this idea.
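
The warn-on-collision idea could look roughly like this. The helper name is hypothetical, and the vector of computed names only lists dx and ex as mentioned above; the real set of calculated variables would have to be filled in.

```r
# Hypothetical helper: warn if input columns clash with calculated ones.
# 'computed' is an illustrative subset, not the full list of variables.
warn_on_name_collision <- function(input, computed = c("dx", "ex")) {
  clash <- intersect(names(input), computed)
  if (length(clash) > 0) {
    warning("Input column(s) will be overwritten: ",
            paste(clash, collapse = ", "))
  }
  invisible(clash)
}

test <- data.frame(a = rep(5, 12), Dx = 12:1, fluff = 1:12, dx = 12:1)
clash <- suppressWarnings(warn_on_name_collision(test))
clash  # "dx"
```

The comparison is case-sensitive, so the required input column Dx does not trigger the warning, while the unrelated dx does.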
