jrosen48 / prcr

R package for person-centered analysis

Home Page: https://jrosen48.github.io/prcr/

License: Other

R 100.00%

prcr's Introduction

prcr

prcr is an R package for person-centered analysis. Person-centered analyses focus on clusters, or profiles, of observations, and their change over time or differences across factors. See Bergman and El-Khouri (1999) for a description of the analytic approach. See Corpus and Wormington (2014) for an example of person-centered analysis in psychology and education.

Installation

You can install the development version of prcr (v. 0.2.0) from GitHub with:

# install.packages("devtools")
devtools::install_github("jrosen48/prcr")

This version takes a "data-first" approach, which differs from the object-oriented approach used in the version on CRAN. Because of this, there is currently a significant gap between the user interface of the CRAN version, available through install.packages("prcr"), and that of the in-development version, available through GitHub. This should be addressed in the next CRAN update.

You can install prcr from CRAN (v. 0.1.5) with:

install.packages("prcr")

Example

This is a basic example using the built-in dataset pisaUSA15:

library(prcr)

df <- pisaUSA15
m3 <- create_profiles_cluster(df, broad_interest, enjoyment, instrumental_mot, self_efficacy, n_profiles = 3)
#> Prepared data: Removed 354 incomplete cases
#> Hierarchical clustering carried out on: 5358 cases
#> K-means algorithm converged: 5 iterations
#> Clustered data: Using a 3 cluster solution
#> Calculated statistics: R-squared = 0.424
plot_profiles(m3, to_center = TRUE)
#> Warning: attributes are not identical across measure variables;
#> they will be dropped
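
The messages printed above describe a two-step procedure: incomplete cases are removed, a hierarchical clustering is run, a k-means step follows, and an R-squared value is reported. The following is a minimal base R sketch of that general idea, not prcr's actual implementation. The iris data, the "complete" linkage, the use of hierarchical-cluster centroids as k-means starting points, and the definition of R-squared as between-cluster over total sum of squares are all assumptions made for illustration.

```r
# Illustrative two-step clustering in base R (an assumption-laden sketch,
# not the code used inside prcr).

# Hypothetical input: the four numeric columns of the built-in iris data.
x <- scale(iris[, 1:4])                      # center and scale each variable
x <- x[complete.cases(x), , drop = FALSE]    # drop incomplete cases

n_profiles <- 3

# Step 1: hierarchical clustering on Euclidean distances.
hc <- hclust(dist(x), method = "complete")
start <- cutree(hc, k = n_profiles)

# Centroids of the hierarchical clusters (one row per cluster).
centers <- apply(x, 2, function(col) tapply(col, start, mean))

# Step 2: k-means, started from the hierarchical centroids.
km <- kmeans(x, centers = centers)

# Proportion of variance explained by the cluster solution (assumed definition).
r_squared <- km$betweenss / km$totss
```

Here km$cluster holds one assignment per retained case, and r_squared lies between 0 and 1.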

Other functions include those for comparing r-squared values and performing cross-validation. These are documented in both the manual and the vignette for the CRAN release; their counterparts in the in-development version will be documented prior to the next CRAN release.

Vignettes

See examples of use of prcr in the vignettes.

Code of Conduct

Please note that this project is released with a Contributor Code of Conduct available here

Latent Profile Analysis (LPA)?

This package is being developed along with its sister project, tidyLPA, which makes it easy to carry out Latent Profile Analysis by providing an interface to the MCLUST package. More information about tidyLPA is available here.

prcr's Issues

add function to compare profile solutions with varying parameters

cran check

checking Rd \usage sections ... WARNING
Undocumented arguments in documentation object 'calculate_stats'
  ‘to_standardize’ ‘plot_uncentered_data’

Undocumented arguments in documentation object 'cross_validate'
  ‘variable_names’

Undocumented arguments in documentation object 'explore_factors'
  ‘statistics’
Documented arguments not in \usage in documentation object 'explore_factors':
  ‘cluster_assignments’ ‘cases_to_keep’

Functions with \usage entries need to have the appropriate \alias
entries, and all their arguments documented.
The \usage entries must correspond to syntactically valid R code.
See chapter ‘Writing R documentation files’ in the ‘Writing R
Extensions’ manual.

checking top-level files ... NOTE
Non-standard files/directories found at top level:
  ‘gender-race.png’ ‘ip.png’ ‘test.csv’ ‘y.csv’

checking R code for possible problems ... NOTE
calculate_stats: no visible global function definition for ‘cutree’
centering_function : scale_this: no visible global function definition
  for ‘sd’
centering_function: no visible global function definition for ‘%>%’
centering_function: no visible global function definition for
  ‘group_by’
centering_function: no visible global function definition for
  ‘mutate_each’
centering_function: no visible global function definition for ‘funs’
... 152 lines ...
  complete.cases contains cutree desc dist dummy_coded_data funs
  group_by group_by_ hclust kmeans manova matches model.matrix
  mutate_each n number_of_clusters prepared_data
  proportion_of_variance_explained quantile rnorm sd select summarize
  summarize_each summary.aov ungroup
Consider adding
  importFrom("stats", "IQR", "TukeyHSD", "aov", "as.formula",
             "complete.cases", "cutree", "dist", "hclust", "kmeans",
             "manova", "model.matrix", "quantile", "rnorm", "sd",
             "summary.aov")
to your NAMESPACE file.

create tests for outliers

create_profiles_cluster "value" not returning a list

I ran create_profiles_cluster and, instead of returning "A list containing the prepared data, the output from the hierarchical and k-means cluster analysis, the r-squared value, raw clustered data, processed clustered data of cluster centroids, and a ggplot object", it returned something like a data frame with the original variables and cluster in it.
The code I used was something like:
pc2step <- create_profiles_cluster(df=df, x1, x2, x3, x4, x5, x6, x7, x8, x9, n_profiles=3, to_center=T, to_scale=T, distance_metric="euclidean", linkage="complete")

use anova and manova to calculate r-squared

gather_()

gather_() was deprecated in tidyr 1.2.0. Please use gather() instead.

the_order argument to calculate_stats() is not working regularly

write new functions for uv and mv outlier detection

Currently, these are the only functions taken from other sources (mv is from the chemometrics package).

Error with "create_profiles_cluster"

Hello everyone,

When I run the function

profiles.1 <- create_profiles_cluster(dataset, dataset$ZSCARED_TOT, dataset$ZCSI_TOT, dataset$ZCD_TOT, n_profiles = 4, to_center = F, to_scale = F, distance_metric = "squared_euclidean", linkage = "ward.D")

the following message appears:

Prepared data: Removed 0 incomplete cases
Error in stats::hclust(distance_matrix, method = linkage) :
NA/NaN/Inf in foreign function call (arg 10)

Could you please help me?

Thank you in advance
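
The report does not establish the cause of the NA/NaN/Inf error from stats::hclust. One possibility worth ruling out is that the data contain infinite values: these are not counted as incomplete cases, yet they produce non-finite distances. The sketch below uses a made-up data frame (dat, with columns a and b, both hypothetical) to show how to check for this in base R. Note also that the README example passes bare column names (for example broad_interest) rather than dataset$column expressions; whether that difference matters here is an assumption that has not been verified.

```r
# Hypothetical data: column `a` contains an infinite value.
dat <- data.frame(
  a = c(0.1, -1.2, 0.7, Inf),
  b = c(1.0,  0.3, -0.5, 0.2)
)

# Inf is not NA, so complete.cases() flags no rows as incomplete.
n_incomplete <- sum(!complete.cases(dat))

# Count non-finite entries (NA, NaN, Inf, -Inf) per column.
nonfinite_counts <- vapply(dat, function(col) sum(!is.finite(col)), integer(1))

# Keep only rows in which every value is finite, then cluster.
clean <- dat[rowSums(!is.finite(as.matrix(dat))) == 0, , drop = FALSE]
d <- dist(clean)
hc <- hclust(d, method = "ward.D")
```

If nonfinite_counts is zero for every column, the problem lies elsewhere, for instance in how the variables are passed to the function.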

add Latent Profile Analysis as option for create_profiles() function

How to create a cluster variable in the database

Dear all,
I hope you are well and safe.
I would like to add the cluster assignments as a variable in my database. I created clusters from 2 variables with 4 profiles, and I need the cluster variable in the database to run a multinomial regression. I'm a beginner, so sorry if I'm asking something odd.

database = datafinal
Variables = Distracaoevitada and BETNegativas
Profiles = 4

My code was:
m2 <- create_profiles_cluster(datafinal, Distracaoevitada, BETNegativas, n_profiles = 4)
plot_profiles(m2, to_center = TRUE)
summary (m2) #until here is fine
datafinal['m2']=m2

This last line puts something in the database, but R is in fact just copying the first variable (Distracaoevitada). I don't know what I'm doing wrong or how to fix it.
Thank you very much for any help.
Amalia
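
A likely reason for the behaviour described above is that m2 holds several columns, so assigning it to the single column datafinal['m2'] keeps only the first one. The general fix is to assign just the vector of cluster labels, aligned to the rows that were actually clustered. The sketch below illustrates this in base R with simulated data and stats::kmeans standing in for the clustering step; the simulated values, the kmeans stand-in, and the column name cluster are assumptions, and the name of the cluster column in prcr's own output is not confirmed here.

```r
set.seed(42)

# Hypothetical stand-in for datafinal, with one incomplete case.
datafinal <- data.frame(
  Distracaoevitada = c(rnorm(10, 0), rnorm(10, 4), rnorm(10, 0), rnorm(10, 4)),
  BETNegativas     = c(rnorm(10, 0), rnorm(10, 0), rnorm(10, 4), rnorm(10, 4))
)
datafinal$BETNegativas[5] <- NA

vars <- c("Distracaoevitada", "BETNegativas")

# Rows that can be clustered (no missing values in the clustering variables).
keep <- complete.cases(datafinal[, vars])

# Stand-in for the clustering step (kmeans instead of prcr).
fit <- kmeans(datafinal[keep, vars], centers = 4)

# Write the labels back to the matching rows; dropped rows stay NA.
datafinal$cluster <- NA_integer_
datafinal$cluster[keep] <- fit$cluster
datafinal$cluster <- factor(datafinal$cluster)
```

The resulting datafinal$cluster is a factor that can be used as the outcome in a multinomial regression, with NA for cases that were excluded from clustering.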

> cv <- cross_validate(prepared_data1, created_profiles1, variable_names = paste0("Cluster ", 1:7), k = 10)

a_assign_star  1  2  3  4  5  6
            1  3  8  2 10  0  0
            2  0  0  0  1  0  1
            3  0  6  1  1  5  4
            4  0 13  5  0  3  0
            5 21  1  0  1  0  0
            6  0  2  0 11  0 23
Success: the objective function is -72 

a_assign_star  1  2  3  4  5  6
            1  9  0  0  0 21  1
            2  0  0  0  1  0 16
            3  0 23  2  1  0  4
            4  1  0  0  2  0  2
            5  1  1  7  4  0  7
            6  8  0  0  7  0  4
Success: the objective function is -77 

a_assign_star  1  2  3  4  5  6
            1  0  0  0  0  2 19
            2  0  0  2 11  0  0
            3 21  0  1  2  1  0
            4  3  7 13  1  3  0
            5  0  0  5  2  6  3
            6  0  0  9  4  7  0
Success: the objective function is -73 

a_assign_star  1  2  3  4  5  6
            2  0  2  1 13  8 28
            3 25 12  9 20  4  0
Error: no feasible solution found
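
Each table above cross-tabulates two sets of cluster assignments, and the "objective function" lines are consistent with solving an assignment problem that matches labels one-to-one so that agreement is maximised. The base R sketch below reproduces that idea by brute force for the first table; it is an illustration, not the code behind cross_validate(). The helper names permutations and match_labels are hypothetical. A one-to-one matching requires a square table, which may be why the last table, with 2 rows and 6 columns, ends in "no feasible solution found"; that explanation is an assumption.

```r
# Agreement table copied from the first fold above
# (rows: a_assign_star labels; columns: the other set of labels).
agreement <- matrix(
  c( 3,  8, 2, 10, 0,  0,
     0,  0, 0,  1, 0,  1,
     0,  6, 1,  1, 5,  4,
     0, 13, 5,  0, 3,  0,
    21,  1, 0,  1, 0,  0,
     0,  2, 0, 11, 0, 23),
  nrow = 6, byrow = TRUE
)

# All permutations of a vector (fine for small n; n! grows quickly).
permutations <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in permutations(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
  }
  out
}

# Find the row-to-column matching with the largest total agreement.
match_labels <- function(tab) {
  if (nrow(tab) != ncol(tab)) stop("one-to-one matching needs a square table")
  n <- nrow(tab)
  perms <- permutations(seq_len(n))
  totals <- vapply(perms, function(p) sum(tab[cbind(seq_len(n), p)]), numeric(1))
  best <- which.max(totals)
  list(mapping = perms[[best]], agreement = totals[best])
}

res <- match_labels(agreement)
res$agreement  # 72, the magnitude of the reported objective (-72)
```

Calling match_labels() on a non-square table, such as the 2 x 6 one in the last fold, stops with an error instead of returning a matching.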