jeffwong / imputation

R package for data imputation. Fills missing values in a numeric matrix

R 100.00%

imputation's Issues

Could you please add the References for lmImpute() "locally weighted least squares regression against the row number"?

This package is no longer on CRAN

Is there a version of this package that works on the latest version of R?

"Package ‘imputation’ was removed from the CRAN repository.

Formerly available versions can be obtained from the archive.

Archived on 2014-01-14 for policy violation (using all the processors on a large system)."

http://cran.r-project.org/web/packages/imputation/index.html

implement time series imputation

Error: "new columns would leave holes"

Hi Jeff,

I am a statistics PhD student at CMU. This package looks great; however, I cannot get it to run on my data because of the following error.

Error in `[<-.data.frame`(`*tmp*`, remove.indices, value = NA) :
  new columns would leave holes after existing columns

It seems as though the NAs are being replaced with a character string and this is creating problems?

switch cv error metric to RMSE

Error: kNNImpute() fails if only 1 row contains missing

Hi,

In the special case where only one row contains missing values, kNNImpute() returns:
"Error in apply(x.missing, 1, function(i) { : dim(X) must have a positive length"

As far as I can see, the sub-function impute.prelim() is responsible. It is called in the line
prelim = impute.prelim(x).

If two or more rows contain missing values, impute.prelim()$x.missing is a matrix.
If only one row contains missing values, it is a named vector, which the later apply() call cannot handle:
t(apply(x.missing, 1, function(i) {

My suggestion is to test whether adding drop = FALSE to the subsetting at the end of impute.prelim() solves the problem:
if (byrow) x.missing = cbind(1:nrow(x), x)[missing.rows.indices, , drop = FALSE] else x.missing = rbind(1:ncol(x), x)[, missing.cols.indices, drop = FALSE]

Thanks for reading :)
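The dimension-dropping behaviour described in this issue can be reproduced in base R without the package. The following is a minimal sketch on a toy matrix. The variable names are illustrative and the code is not taken from impute.prelim() itself.

```r
# Toy matrix in which exactly one row contains a missing value.
x <- matrix(as.numeric(1:12), nrow = 4, ncol = 3)
x[2, 3] <- NA

# Rows with at least one NA (here: only row 2).
missing_rows <- which(apply(x, 1, function(r) any(is.na(r))))

# Prepend the row index as a first column, mirroring the snippet above.
indexed <- cbind(seq_len(nrow(x)), x)

# Default subsetting drops the single row to a plain vector, so dim() is
# NULL and a later apply() call fails.
dropped <- indexed[missing_rows, ]

# With drop = FALSE the result stays a 1 x 4 matrix that apply() accepts.
kept <- indexed[missing_rows, , drop = FALSE]

print(dim(dropped))  # NULL
print(dim(kept))     # 1 4
```

Because `kept` still has two dimensions, a call such as `apply(kept, 1, ...)` runs the same way whether one row or many rows contain missing values.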

implement cross imputation

If a missing value is located at row i, column j, use all information in both row i and column j to impute it.
In a sense, use both horizontal and vertical neighbors.
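One possible reading of this idea, offered only as a sketch, is an additive estimate that combines the row mean and the column mean and subtracts the overall mean. The function name `cross_impute_sketch` is hypothetical, and the issue does not specify which estimator the package would actually use.

```r
# Hypothetical helper, not part of the imputation package.
cross_impute_sketch <- function(x) {
  row_means  <- rowMeans(x, na.rm = TRUE)
  col_means  <- colMeans(x, na.rm = TRUE)
  grand_mean <- mean(x, na.rm = TRUE)

  # Positions of missing cells as (row, col) pairs.
  missing <- which(is.na(x), arr.ind = TRUE)

  # Additive estimate: row level + column level - overall level.
  x[missing] <- row_means[missing[, "row"]] +
    col_means[missing[, "col"]] - grand_mean
  x
}

x <- matrix(c(1, 2, 3,
              4, NA, 6,
              7, 8, 9), nrow = 3, byrow = TRUE)
filled <- cross_impute_sketch(x)
print(filled)
```

In this example the missing cell sits in a row with mean 5 and a column with mean 5, and the overall mean is also 5, so the estimate is 5. A row or column that is entirely missing yields NaN under this sketch and would need separate handling.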

reimplement kNN using Rcpp

pdist can't handle NAs in data

cross validation error for rows with 0s

cross validation division by 0 error

Incorrect implementation of SVTImpute

needs to iterate SVT

Should imputation methods standardize the input matrix first?

for n large pdist creates too much overhead

When using kNNImpute, pdist is called once for each row that has missing data. Many separate pdist calls add considerable overhead because each call transfers data to C. One larger dist call is likely to be faster than many pdist calls.
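The single-call approach suggested here could look roughly like the sketch below, which uses base R's stats::dist() instead of the pdist package. The data, the value of `k`, and the variable names are assumptions for illustration. This is not the package's kNNImpute code.

```r
set.seed(1)
x <- matrix(rnorm(40), nrow = 8, ncol = 5)
x[2, 1] <- NA
x[5, 4] <- NA

# One call computes all pairwise row distances. stats::dist() excludes NA
# entries pairwise and scales the sum up for the columns it could use.
d <- as.matrix(dist(x))
diag(d) <- Inf  # a row must not count as its own neighbour

k <- 3
missing_rows <- which(apply(x, 1, function(r) any(is.na(r))))

# For each incomplete row, look up its k nearest rows in the shared matrix.
neighbours <- lapply(missing_rows, function(i) order(d[i, ])[seq_len(k)])
names(neighbours) <- missing_rows
print(neighbours)
```

The trade-off is memory: the full distance matrix needs space proportional to the square of the number of rows, so for very large n a blocked computation may be preferable.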

implement kmeans imputation

kNN should use weighted mean, not simple mean
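As a small illustration of the difference, the sketch below compares a simple mean with an inverse-distance weighted mean over three neighbours. The numbers are made up, and inverse-distance weighting is only one assumed choice of weights. The issue does not say which weighting scheme is intended.

```r
# Hypothetical values of one column for k = 3 neighbours, with each
# neighbour's distance to the row being imputed.
neighbour_values <- c(10, 20, 40)
neighbour_dist   <- c(1, 2, 4)

# Inverse-distance weights; eps guards against division by zero when a
# neighbour is at distance 0.
eps <- 1e-8
w <- 1 / (neighbour_dist + eps)

simple_estimate   <- mean(neighbour_values)
weighted_estimate <- weighted.mean(neighbour_values, w)

print(c(simple = simple_estimate, weighted = weighted_estimate))
```

The weighted estimate is pulled toward the closest neighbour's value, whereas the simple mean treats all three neighbours equally.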

tsImpute should accept a vector as the metric variable

R dimension dropping leads to bugs in imputation

Hi,

Here is a simple example:

x = matrix(rnorm(100),10,10)
x[1,1] = NA
meanImpute(x)

error:
Error in apply(x.missing, 2, function(j) { :
dim(X) must have a positive length

traceback()
7: stop("dim(X) must have a positive length")
6: apply(x.missing, 2, function(j) {
bad.indices = which(is.na(j))
j[bad.indices] = mean(j[-1], na.rm = T)
j[-1]
})
5: meanImpute(x) at preprocess.R#40
4: eval(expr, envir, enclos)
3: eval(ei, envir)
2: withVisible(eval(ei, envir))
1: source("../comparison_study/preprocess.R")

Fix:

impute.prelim does not tell R to keep dimensions when subsetting:

if (byrow)
x.missing = cbind(1:nrow(x), x)[missing.rows.indices, ]
else x.missing = rbind(1:ncol(x), x)[, missing.cols.indices]

Add drop = FALSE to both subsetting calls.
Also, I don't really see the need for adding a 1:n row or column, but I didn't look into whether it is useful for you.

Bernd
