mjfii / r-nameparser-lib

An R library allowing parsing of surname, first name, and gender based on US census data.

License: GNU Affero General Public License v3.0

R 100.00%
algorithm census-data determination gender library parse r

r-nameparser-lib's Issues

object 'census.names' not found

I have been getting the following error:

x <- 'livingston III,  Mr. MICHAEL JOHN9'
parse.names(x)
Error in checkForRemoteErrors(val) : 
  one node produced an error: object 'census.names' not found

It looks like the census.names data frame was not available on the cluster workers, so the call failed inside parLapply().

After reading the explanation in the SimDesign vignette, I fixed it by changing parse.names() to load census.names inside the function and export it to the workers with parallel::clusterExport( cl=cl, 'census.names' ). Sharing this in case others run into the same issue.

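parse.names <- function (x, ...) 
{
    # Load the lookup table so it can be exported to the workers.
    data("census.names")
    # Leave one core free, but always start at least one worker.
    no_cores <- max(1, parallel::detectCores() - 1, na.rm = TRUE)
    cl <- parallel::makeCluster(no_cores)
    # Copy census.names to every worker so parse.name() can find it.
    parallel::clusterExport( cl=cl, 'census.names' )
    input_names <- data.frame(name = x)
    # Parse each name on the workers; each result is a "|"-delimited string.
    input_names$parsed_name <- parallel::parLapply(cl, input_names$name, 
        parse.name)
    # Split the delimited results into one column per field.
    input_names <- data.frame(input_names$name, do.call("rbind", 
        strsplit(as.character(input_names$parsed_name), "|", 
            fixed = TRUE)), stringsAsFactors = FALSE)
    colnames(input_names) <- c("name", "salutation", "first_name", 
        "middle_name", "last_name", "suffix", "gender", "gender_confidence")
    parallel::stopCluster(cl)
    return(input_names)
}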
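
The failure and the fix can be reproduced without the package. The sketch below is only an illustration: `lookup` and `find_gender` are hypothetical stand-ins for census.names and parse.name(), and a two-worker cluster is assumed. The worker function reads a data frame that exists only in the main session, so the first parLapply() call fails until the object is exported.

```r
library(parallel)

# Hypothetical stand-in for the package's census.names data frame.
lookup <- data.frame(name = c("MICHAEL", "MARY"),
                     gender = c("M", "F"),
                     stringsAsFactors = FALSE)

# Worker function that reads `lookup` as a free variable.
find_gender <- function(nm) lookup$gender[match(toupper(nm), lookup$name)]
# Pin the function to the global environment so that it resolves `lookup`
# on each worker the same way no matter where this script is run from.
environment(find_gender) <- globalenv()

cl <- makeCluster(2)

# Without an export, the workers have no `lookup` object, so this call
# fails and the error message is captured instead of stopping the script.
before <- tryCatch(
  parLapply(cl, c("michael", "mary"), find_gender),
  error = function(e) conditionMessage(e)
)

# Copy `lookup` into each worker's global environment, then retry.
clusterExport(cl, "lookup", envir = environment())
after <- unlist(parLapply(cl, c("michael", "mary"), find_gender))

stopCluster(cl)
```

Here `before` holds the captured error message mentioning the missing object, and `after` holds the looked-up values once the export has been done.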

passing custom prefixes and suffixes to parse.names

To pass custom prefixes and suffixes through parse.names() to parse.name(), the ellipsis (...) needs to be forwarded in the parLapply() call:

input_names$parsed_name <- parallel::parLapply(cl, input_names$name, 
        parse.name, ...)
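
As a minimal sketch of the same forwarding pattern outside the package: `strip_suffix`, `strip_all`, and the `suffixes` argument below are hypothetical and are not part of the library's API. Extra arguments given to the wrapper only reach the per-item function if `...` is passed on to parLapply().

```r
library(parallel)

# Hypothetical per-item function; `suffixes` is an illustrative argument.
strip_suffix <- function(x, suffixes = c("JR", "SR")) {
  pattern <- paste0("\\s+(", paste(suffixes, collapse = "|"), ")$")
  sub(pattern, "", toupper(x))
}

# Hypothetical wrapper: forwards `...` so callers can override `suffixes`.
strip_all <- function(x, ...) {
  cl <- makeCluster(2)
  on.exit(stopCluster(cl))  # release the workers even if an error occurs
  unlist(parLapply(cl, x, strip_suffix, ...))
}

default_result <- strip_all(c("Smith Jr", "Jones III"))
custom_result  <- strip_all(c("Smith Jr", "Jones III"), suffixes = "III")
```

With the defaults only "Jr" is stripped; passing `suffixes = "III"` through the wrapper strips "III" instead, which shows that the argument reached the per-item function.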
