joshfraser / php-name-parser
PHP library to split names into their respective components (first, last, etc)
array_search returns 0 (which is falsy) if the match is the first word in the list, so is_compound should use in_array instead.
Before:
protected function is_compound($word) {
return array_search(mb_strtolower($word), $this->dict['compound']);
}
After:
protected function is_compound($word) {
return in_array(mb_strtolower($word), $this->dict['compound']);
}
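A small demonstration of why the change matters. The compound list here is hypothetical and stands in for $this->dict['compound']; the library's real dictionary may differ.

```php
<?php
// Hypothetical compound list, for illustration only.
$compound = ['van', 'der', 'von', 'de'];

// array_search returns the matching key, which is 0 for the first entry.
$key = array_search('van', $compound);
$truthy = (bool) $key; // false, although 'van' is present

// in_array returns a boolean, so the position of the match does not matter.
$found = in_array('van', $compound, true);

// If the key itself is needed, compare against false explicitly.
$foundStrict = array_search('van', $compound) !== false;
```

Either in_array or the explicit !== false comparison avoids treating a match at index 0 as "not found".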
Can you make a composer package with PSR-4 autoloading?
Currently, any entry in the suffix table that contains regex special characters is interpreted as a pattern instead of being matched exactly.
For example, the suffix entry "R.N." will match a first name of "Ron" and mangle the name.
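One possible fix is to escape each suffix with preg_quote before building the pattern. The sketch below is not the library's code: suffix_pattern is a hypothetical helper, and "Ronald" is used as the test string because an unanchored /R.N./ needs four characters to match.

```php
<?php
// Hypothetical helper: build a case-insensitive pattern that matches
// the suffix literally. Not part of the library.
function suffix_pattern(string $suffix): string
{
    return '/' . preg_quote($suffix, '/') . '/i';
}

// Unescaped, each "." matches any character, so "Rona" satisfies R.N.
$unescaped = preg_match('/R.N./i', 'Ronald');

// Escaped, the dots are literal and the first name no longer matches.
$escaped = preg_match(suffix_pattern('R.N.'), 'Ronald');

// A real suffix still matches.
$literal = preg_match(suffix_pattern('R.N.'), 'Jane Doe R.N.');
```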
Hi there,
While trying to parse a name with a Unicode character (e.g. M. Test-ł), I saw that the result is altered (M. Test-▒).
After checking the code, I found that the alteration comes from this line.
As the strtolower function does not handle multi-byte characters, this explains the alteration. Replacing strtolower with mb_strtolower solves the case.
So I'm wondering whether there is any interest in replacing the simple string functions with their multi-byte versions to make the parser suitable for international names?
Thanks.
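A minimal sketch of the multi-byte variants, assuming the mbstring extension is loaded and the input is UTF-8. It only shows the functions themselves, not where the parser would call them.

```php
<?php
// Assumes mbstring is available and the source file is saved as UTF-8.
$name = 'M. TEST-Ł';

// mb_strtolower lowercases whole code points, including "Ł".
$lower = mb_strtolower($name, 'UTF-8');

// Byte-oriented functions see "ł" as two bytes; mb_* functions see one character.
$bytes = strlen('ł');
$chars = mb_strlen('ł', 'UTF-8');
```

The same byte-versus-character difference applies to substr, ucfirst and similar functions, which is why any of them can cut a character in half.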
I was excited to find this open-source library of yours. Thanks for making it.
I'm trying to replace my own custom name parser with yours because I bet you've put more thought into yours.
However, these 4 test cases of mine failed when trying to use your parser:
Right?
I like your package, and I want to continue getting updates as you build. But I can't seem to get it via composer. Any suggestions?
The rules seem to be focussed on US-style names. Is there any support or possibility for optional (pluggable/configurable) non-US name support?
As an example, my native tongue (Dutch) has the untranslatable concept of "Tussenvoegsel" (https://en.wikipedia.org/wiki/Tussenvoegsel), which my own name happens to use. My full name is "Martijn van der Lee", my first name being "Martijn" and my last name being "Lee", with "van der" being the third ("Tussenvoegsel") part. Having my last name as "van der Lee" (or worse, "Van Der Lee") would be wrong and would cause sorting to be incorrect when used in the Netherlands.
I understand many languages (at least Irish, French, German) have similar rules for names, and it would be nice if there were a single name parser which could be configured for multiple cultures/languages or possibly even auto-detect them.
Can you add this package to https://packagist.org so we can install it via Composer?
Hi Josh, I see you have a nice name parser. One of the fields it outputs is "lastname"; would it also be possible to add two additional fields, "surname prefix" and "surname", in essence intelligently splitting "lastname" into 2 parts? Some applications require the separation, and it is difficult to work out how to split a last name into those 2 parts. Do you have a solution?
For example:
Ed Hicks-van den Something results in last name den Something rather than Hicks-van den Something.
Niche, but it came up for me! I'll try to commit a fix.
Test case shows that any last name starting with a matched common professional suffix is incorrectly classified, in its entirety, as a suffix.
Expected
OLD MACDONALD
'fname' => 'Old'
'lname' => 'Macdonald'
Got
Notes
In this case the leading MA of MACDONALD matches the suffix entry 'MA'. The same applies to other cases, and it seems the regex is not respecting the word boundary \b metacharacter.
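One way to sidestep the boundary problem is to compare whole tokens instead of using a regex. This swaps in a different technique from the one the library uses; SUFFIXES and is_suffix_token are hypothetical names for illustration.

```php
<?php
// Hypothetical suffix list and helper; not part of the library.
const SUFFIXES = ['ma', 'md', 'phd', 'rn'];

function is_suffix_token(string $word): bool
{
    // Normalise "Ph.D." to "phd", then require the whole token to match.
    $key = mb_strtolower(str_replace('.', '', $word), 'UTF-8');
    return in_array($key, SUFFIXES, true);
}

$exact = is_suffix_token('MA');         // the whole token is a suffix
$partial = is_suffix_token('MACDONALD'); // only starts with a suffix
$dotted = is_suffix_token('Ph.D.');     // dots are ignored before comparing
```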
While trying to extend the base FullNameParser class, I realized that the static entry point has to be written as follows:
/**
 * Parse Static entry point.
 *
 * @param string $name the full name you wish to parse
 * @return array returns associative array of name parts
 */
public static function parse($name) {
    $parser = new static();
    return $parser->parse_name($name);
}
It must use new static() rather than new self(); with new self(), the function is useless when extending. Thanks in advance!
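The difference comes from late static binding. The classes below are hypothetical and only mirror the self/static distinction raised in this issue.

```php
<?php
// Hypothetical classes, for illustration only.
class BaseParser
{
    public static function viaSelf()
    {
        return new self();   // always the class where the method is defined
    }

    public static function viaStatic()
    {
        return new static(); // the class the method was called on
    }
}

class DutchParser extends BaseParser
{
}

$fromSelf = get_class(DutchParser::viaSelf());
$fromStatic = get_class(DutchParser::viaStatic());
```

With new self(), any parse_name override in the subclass is never reached through the static entry point, because the object created is always the base class.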
Great work! Thanks. Your package looks like exactly what I need. Is this repo still maintained, and does it work with PHP 8.0 and/or 8.1?
Are there any plans to add support for parsing of middle names, for example in a string like:
"Jonathan Randolf Jefferson"?
:)
Given a name like the following:
Jonathan Smith, MD
The comma on the last name is retained and saved to "lname". This is contrary to the documented example in examples.php.
I believe the expected behavior is that the comma is removed from the input except in the case of the comma being within the suffix, as in this example:
Jonathan Smith IV, PhD
Both names are borrowed from examples.php but they don't evaluate as shown.
I'm not sure if this is an issue or not, it could be interpreted either way. Opening up a discussion.
For example: "J. Edgar Hoover" or "M. Night Shyamalan" are currently parsed as:
Array
(
[salutation] =>
[fname] => Edgar
[initials] => J.
[lname] => Hoover
[lname_base] => Hoover
[lname_compound] =>
[suffix] =>
)
If this name is re-assembled in another system it would be assumed to be "Edgar J. Hoover" which would be incorrect.
An alternative would be to make fname "J. Edgar" in this situation, with no initials.
I pulled a random sample of 1000 people from a large database and parsed their names; this script was 96.8% accurate. If this one issue were fixed, 13 additional splits would work, upping the accuracy to 98.1%.
Any plans to implement Dutch notation for surnames?
For instance:
Peter de Vries would be listed under V, not under d.
So I would very much like a name parser which lists the surname like:
surname: Vries
surname prefix: de
Other examples:
Stijn van der Brekel
surname: Brekel
Surname prefix: van der
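A rough sketch of how such a split could work on an already-extracted last name. The prefix list and split_surname are hypothetical; a real implementation would need a larger, locale-specific list and would have to decide how to treat hyphenated names.

```php
<?php
// Hypothetical prefix list; deliberately short.
const SURNAME_PREFIXES = ['van', 'der', 'den', 'de', 'ter', 'ten'];

function split_surname(string $lastName): array
{
    $words = preg_split('/\s+/', trim($lastName));
    $prefix = [];
    // Peel off leading prefix words, always leaving at least one word as the surname.
    while (count($words) > 1
        && in_array(mb_strtolower($words[0], 'UTF-8'), SURNAME_PREFIXES, true)) {
        $prefix[] = array_shift($words);
    }
    return [
        'surname_prefix' => implode(' ', $prefix),
        'surname' => implode(' ', $words),
    ];
}

$vries = split_surname('de Vries');
$brekel = split_surname('van der Brekel');
$plain = split_surname('Hoover');
```

Sorting on the surname key would then list Peter de Vries under V while the prefix stays available for display.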
Isn't LastName <comma><space> FirstName a very common way of writing out a person's full name?
print_r(FullNameParser::parse('LASTNAME, FIRSTNAME'));
Array
(
[salutation] =>
[fname] => Lastname
[initials] =>
[lname] => Firstname
[lname_base] => Firstname
[lname_compound] =>
[suffix] =>
)
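Until the parser handles this itself, the input could be normalised before parsing. The helper below is a hypothetical sketch, not library code: it swaps the two halves only when there is exactly one comma and the part after it is not a known suffix. The suffix list is an assumption and far from complete.

```php
<?php
// Hypothetical suffix list and helper; not part of the library.
const KNOWN_SUFFIXES = ['md', 'phd', 'jr', 'sr', 'ii', 'iii', 'iv'];

function reorder_comma_name(string $name): string
{
    $parts = array_map('trim', explode(',', $name));
    if (count($parts) !== 2 || $parts[1] === '') {
        return $name; // no comma, several commas, or nothing after it: leave untouched
    }
    $tail = strtolower(str_replace('.', '', $parts[1]));
    if (in_array($tail, KNOWN_SUFFIXES, true)) {
        return $name; // "Jonathan Smith, MD" keeps its suffix in place
    }
    return $parts[1] . ' ' . $parts[0];
}

$swapped = reorder_comma_name('LASTNAME, FIRSTNAME');
$suffixed = reorder_comma_name('Jonathan Smith, MD');
$dotted = reorder_comma_name('Fname Lname, Ph.D.');
```

The result would then be passed to FullNameParser::parse as usual.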
Try this:
$parser = new FullNameParser();
$split_name = $parser->parse_name('Fname Lname, Ph.D.');
var_dump($split_name);
/*
array(7) {
["salutation"]=>
string(0) ""
["fname"]=>
string(11) "Fname Lname"
["initials"]=>
string(0) ""
["lname"]=>
string(5) "Ph.D."
["lname_base"]=>
string(5) "Ph.D."
["lname_compound"]=>
string(0) ""
["suffix"]=>
string(0) ""
}
*/
Hi,
Great job! Have you considered modifying the main function parse_name to accept two parameters like $firstname and $lastname? I have my lists of first names and last names separate, but when I feed them in directly the last name is added to the first name :)
Could this be a feature request?
Hi. Maybe it's time to build a release? We want to use Composer with your project.
Hello,
If I use it for parsing names with education degrees, it doesn't work.
For example:
"Ing. Jan Novák Csc."