Gender and Names

The service analyzes the local part of an address for gender and name information.

Syntax /svc/2.0/address/gender/<address>
Example /svc/2.0/address/gender/herbert@bar.com
Parameter An ASCII email address as the last part of the URL
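
As an illustration, the request URL can be assembled as in the following sketch. The host name `api.example.com` and the helper `build_gender_url` are assumptions made for this example; only the path `/svc/2.0/address/gender/<address>` comes from the syntax above.

```python
from urllib.parse import quote

# Hypothetical host; the documentation only specifies the path below it.
BASE_URL = "https://api.example.com"


def build_gender_url(address: str, base_url: str = BASE_URL) -> str:
    """Return the request URL for the gender service (illustrative helper)."""
    # Keep "@" readable; other reserved characters (including "/") are
    # percent-encoded so that the address stays a single path segment.
    return f"{base_url}/svc/2.0/address/gender/{quote(address, safe='@')}"


print(build_gender_url("herbert@bar.com"))
```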

Result

<genderStatus> 
<avgage>57.1</avgage> 
<detail>3</detail> 
<firstname>Herbert</firstname> 
<nameday>2013-03-16</nameday> 
<result>1</result> 
</genderStatus>

or

{"avgage": 57.1, "detail": 3, "firstname": "Herbert", "nameday": "2013-03-16", "result": 1}
  • result: the evaluation result
    • 0: ambiguous or no gender information found, see detail codes 5-9
    • 1: gender information found, see detail codes 1-4
  • detail: detailed explanation of the result
    • 1: gender = feminine
    • 2: gender = mostly feminine
    • 3: gender = masculine
    • 4: gender = mostly masculine
    • 5: gender = unisex
    • 6: name not recognized
    • 7: input error, most likely because the name was shorter than 3 characters
    • 8: invalid address
    • 9: technical error
  • firstname: if the result is 1, this field contains the recognized first name, capitalized
  • nameday: if a valid first name was found and a name day is defined for it, the date of its next occurrence is returned. Next occurrence means: if the name day is today or still in the future, a date in the current year is returned; otherwise a date in the following year. If the full date is not important, the month and day of the name day can easily be extracted from the date.
  • avgage: the average age of holders of the given first name, or 0.0 if there is not enough data for a prediction. Many first names follow historic or fashion trends, so they can be attributed to a certain age group. For rare or perennially popular names such predictions can't be made, and 0.0 is returned.
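
A client might interpret these fields along the following lines. This is a minimal sketch, not part of the service: the function `interpret` and the table `DETAIL_MEANINGS` are illustrative names, the sample payload reuses the field values from the XML example above, and the date is assumed to be in ISO `YYYY-MM-DD` form as shown there.

```python
import json
from datetime import date

# Detail codes as listed above.
DETAIL_MEANINGS = {
    1: "feminine",
    2: "mostly feminine",
    3: "masculine",
    4: "mostly masculine",
    5: "unisex",
    6: "name not recognized",
    7: "input error",
    8: "invalid address",
    9: "technical error",
}


def interpret(payload: str) -> dict:
    """Turn a JSON response into a small summary dict (illustrative only)."""
    data = json.loads(payload)
    found = data.get("result") == 1

    # Extract month and day from the name day, if one was returned.
    month_day = None
    if data.get("nameday"):
        d = date.fromisoformat(data["nameday"])  # assumes YYYY-MM-DD
        month_day = (d.month, d.day)

    # 0.0 means "no prediction possible", so report it as None.
    avgage = data.get("avgage", 0.0)

    return {
        "gender_found": found,
        "detail": DETAIL_MEANINGS.get(data.get("detail"), "unknown detail code"),
        "firstname": data.get("firstname") if found else None,
        "nameday_month_day": month_day,
        "avgage": avgage if avgage else None,
    }


sample = ('{"avgage": 57.1, "detail": 3, "firstname": "Herbert", '
          '"nameday": "2013-03-16", "result": 1}')
print(interpret(sample))
```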

Functionality

There are four stages in the analysis process:

  1. the component tries to extract a first name from the local part of the address
  2. if a name was found it is processed by the gender analysis library
  3. if a name was found then the first name is compared with entries in the gender_name_day table to retrieve the month and day of the corresponding name day; a full date is computed
  4. if a name was found then the first name is compared to entries in the gender_avg_age table to retrieve the average age for the name
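
The computation of the full date in stage 3 can be sketched as follows. The function name `next_name_day` is an assumption for illustration; the logic follows the rule described for the nameday field (current year if the name day is today or later, otherwise the following year). A name day on 29 February is not handled in this sketch.

```python
from datetime import date


def next_name_day(month: int, day: int, today: date) -> date:
    """Return the next occurrence of a name day given as month and day."""
    candidate = date(today.year, month, day)
    if candidate >= today:
        # Today or still ahead: use the current year.
        return candidate
    # Already passed this year: use the following year.
    return date(today.year + 1, month, day)


print(next_name_day(3, 16, date(2013, 1, 10)))   # 2013-03-16
print(next_name_day(3, 16, date(2013, 3, 17)))   # 2014-03-16
```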

The rules for the extraction of a first name are:

  1. extract the local part; if the address is malformed, return detail code 8
  2. if the length of the local part is less than 3, return detail code 7
  3. compare the local part with predefined patterns
    • pattern firstname.surname; this pattern accepts any punctuation character as the separator, not only .
    • pattern firstname99, a name followed by a number
    • if none of the previous patterns matched, the whole local part is used as the first name
  4. if the length of the extracted first name is < 3, return detail code 7
  5. else call the gender analysis library with the extracted first name
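
The extraction rules above can be sketched roughly as follows. This is an approximation under stated assumptions, not the service's actual implementation: the function name `extract_first_name`, the simple `@`-based well-formedness check, and the use of `string.punctuation` as the separator set are all choices made for this example. The call to the gender analysis library (step 5) is left to the caller.

```python
import re
import string
from typing import Optional, Tuple

MIN_LENGTH = 3

# Any single ASCII punctuation character may separate first name and surname.
PUNCT_SPLIT = re.compile("[" + re.escape(string.punctuation) + "]")
# A name followed by a number, e.g. "herbert99".
NAME_NUMBER = re.compile(r"(\D+)\d+")


def extract_first_name(address: str) -> Tuple[Optional[str], Optional[int]]:
    """Return (first_name, None) on success or (None, detail_code) on failure."""
    # Step 1: extract the local part (simplified validity check, an assumption).
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        return None, 8

    # Step 2: the local part must have at least 3 characters.
    if len(local) < MIN_LENGTH:
        return None, 7

    # Step 3: compare the local part with the predefined patterns.
    parts = PUNCT_SPLIT.split(local, maxsplit=1)
    if len(parts) == 2:
        first = parts[0]                 # firstname.surname
    else:
        match = NAME_NUMBER.fullmatch(local)
        first = match.group(1) if match else local  # firstname99 or whole part

    # Step 4: the extracted first name must have at least 3 characters.
    if len(first) < MIN_LENGTH:
        return None, 7

    # Step 5: the caller would now pass `first` to the gender analysis library.
    return first, None


for addr in ("herbert.mueller@bar.com", "herbert99@bar.com", "jo.smith@bar.com"):
    print(addr, extract_first_name(addr))
```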