These web services help to verify and enrich address data originating from subscription or registration forms on websites.
Service Overview
This section documents the public part of the brandREACH | email check API that is available to customers. The public API provides only GET requests: customers can request information about resources, but they cannot create or modify anything.
Please note that the public API is designed to almost always give the caller a useful answer. Customers are shielded from technical problems and should never receive a 500 Server Error response. Errors and problems are written to the server logs, which should therefore be monitored constantly.
Since the users of the public API cannot modify resources in the system, there is no user data to protect. The only user data used by validmail is the user account. This account is a very slim entity that provides just enough information for authentication: account name, password and user group, nothing else.
The only user related data items that validmail generates are business events that contain data about each call of the public API. The data generated is primarily used for billing, but can also be used for troubleshooting.
Table of Contents:
- Address Syntax Check
- Gender and Names
- Education Domains
- Disposable addresses
- Company Domains
- Challenge-Response risks
- Botrisks
- Common Syntax Warnings
- Role Analysis
Address Syntax Check
The address syntax check works similarly to the quality check, but focuses purely on the syntax of an email address. The input address can include encoded Unicode characters.
To call the syntax check use this syntax:
Syntax | /svc/2.0/address/syntax/<e-mail address> |
Example | /svc/2.0/address/syntax/foo@bar.com |
Parameter | An email address as last part of the URL |
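As a minimal sketch, a client could assemble the request URL as shown here. The host name `https://api.example.com` and the helper name `build_check_url` are assumptions made for illustration, and authentication is not shown.

```python
from urllib.parse import quote


def build_check_url(base_url: str, check: str, address: str) -> str:
    """Build a GET URL of the form <base>/svc/2.0/address/<check>/<address>."""
    # Percent-encode the address so that Unicode and reserved characters
    # stay inside a single path segment; "@" is left readable.
    encoded = quote(address, safe="@")
    return f"{base_url.rstrip('/')}/svc/2.0/address/{check}/{encoded}"


# Hypothetical host, used only for this example.
print(build_check_url("https://api.example.com", "syntax", "foo@bar.com"))
```

The same helper would cover the other address checks in this document by changing the `check` argument.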
Result
As XML:
<syntaxStatus>
<decoded>foo@xn--blmchen-o2a.de</decoded>
<extSyntax>1</extSyntax>
<result>2</result>
</syntaxStatus>
As JSON:
{
"decoded": "foo@xn--blmchen-o2a.de",
"extSyntax":1,
"result":2
}
result
Tests the syntax of the address against the e-mail addressing standards. Possible result values are:
- 0: invalid syntax
- 1: valid syntax
- 2: probably valid syntax, Unicode problems were solved, see decoded
The test is stricter than the standards because it requires a valid domain name in the address. Localhost addresses and other exotic cases will not be accepted, because it is unlikely that these are e-mail addresses valid for business.
If the syntax result is 0 (invalid) or 2 (probably valid), the structure also contains syntax warnings explaining the problems, see below.
decoded
If the syntax test ended with a result of 2, this field will contain the decoded ASCII address. A syntax test result of 2 means that the address contained Unicode characters (e.g., umlauts, Arabic or Chinese characters), which are invalid in an e-mail address. These characters were successfully converted and the resulting valid ASCII address was stored in decoded. Further tests should always use this decoded address.
The decoding during the syntax test is done in two stages:
- the local part of the address is checked for German umlauts. If found, they are converted to their usual ASCII counterparts (ü - ue, ä - ae, ö - oe, ß - ss)
- the domain part of the address is transformed to Punycode, according to the standard for international domains
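The two decoding stages can be approximated with the Python standard library. This is a sketch, not the service's implementation: the function name `decode_address` is hypothetical, only the four lower-case replacements listed above are handled, and Python's built-in `idna` codec (IDNA 2003) stands in for whatever Punycode routine the service uses.

```python
# Replacements for the local part, exactly as listed above.
UMLAUTS = {"ü": "ue", "ä": "ae", "ö": "oe", "ß": "ss"}


def decode_address(address: str) -> str:
    """Return an ASCII form of the address using the two stages above."""
    local, _, domain = address.rpartition("@")
    # Stage 1: replace German umlauts in the local part.
    local = "".join(UMLAUTS.get(ch, ch) for ch in local)
    # Stage 2: convert a non-ASCII domain name to Punycode.
    if not domain.isascii():
        domain = domain.encode("idna").decode("ascii")
    return f"{local}@{domain}"


print(decode_address("foo@blümchen.de"))
```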
extSyntax
Many e-mail providers have their own rules for valid e-mail addresses of their domains. These typically include the minimal length of an address, which punctuation characters are allowed etc. The extended syntax check verifies addresses against these rules. Possible results are:
- 0: invalid syntax for this domain
- 1: valid syntax for this domain
If the extended syntax check fails, the result structure will include syntax warnings explaining the problem, see below.
syntaxWarnings
If one of the syntax checks fails, the result structure will include one or more syntaxWarnings elements:
<syntaxStatus>
...
<syntaxWarnings>synm002</syntaxWarnings>
<syntaxWarnings>....
</syntaxStatus>
Each element contains a message code, which can be used to identify the problem. See the page syntax warnings for codes and explanations.
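Because syntaxWarnings can occur several times, a client has to collect it as a list. The following sketch parses an XML result with the standard library; the function name and the sample document are illustrative assumptions, and missing fields are reported as None.

```python
import xml.etree.ElementTree as ET


def _int_or_none(text):
    """Convert an element text to int, keeping None for missing elements."""
    return int(text) if text is not None else None


def parse_syntax_status(xml_text: str) -> dict:
    """Turn a <syntaxStatus> document into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "result": _int_or_none(root.findtext("result")),
        "extSyntax": _int_or_none(root.findtext("extSyntax")),
        "decoded": root.findtext("decoded"),
        # Repeated elements are gathered in document order.
        "syntaxWarnings": [el.text for el in root.findall("syntaxWarnings")],
    }


# Illustrative sample, not a recorded server response.
SAMPLE = """<syntaxStatus>
<extSyntax>1</extSyntax>
<result>0</result>
<syntaxWarnings>synm002</syntaxWarnings>
<syntaxWarnings>synm017</syntaxWarnings>
</syntaxStatus>"""

print(parse_syntax_status(SAMPLE))
```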
domainScores
The domainScores element consists of a list of similar sounding domain names ordered by a calculated score:
<syntaxStatus>
...
<domainScores>
<domainScore>
<domain>teleos-web.de</domain><score>1.0</score>
</domainScore>
</domainScores>
</syntaxStatus>
The higher the score, the higher the probability that this domain was intended. The system calculates the score by searching for similar domain names that are popular with e-mail marketing users.
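A client that wants to offer a correction to the user could read the suggestions as follows. This is a sketch under assumptions: the function name is hypothetical, and the second domain in the sample is invented for the example.

```python
import xml.etree.ElementTree as ET


def suggested_domains(xml_text: str) -> list:
    """Return (domain, score) pairs, highest score first."""
    root = ET.fromstring(xml_text)
    pairs = [
        (entry.findtext("domain"), float(entry.findtext("score")))
        for entry in root.iter("domainScore")
    ]
    # Sort explicitly instead of relying on the order in the response.
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Illustrative sample; example.org is not taken from the service.
SAMPLE = """<syntaxStatus>
<domainScores>
<domainScore><domain>example.org</domain><score>0.4</score></domainScore>
<domainScore><domain>teleos-web.de</domain><score>1.0</score></domainScore>
</domainScores>
</syntaxStatus>"""

print(suggested_domains(SAMPLE))
```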
Functionality
This section describes the execution of the address syntax check in detail. The check consists of the following steps:
- checking the mailbox (local part) of the address for Unicode characters. Unicode is not allowed in the local part, so this routine only checks for typical, language-specific typos. Currently this includes only German umlauts. If found, they are converted to their usual ASCII counterparts: ü - ue, ä - ae, ö - oe, ß - ss. If at least one of these conversions happens, the syntax warning synm018 is added to the result, and the decoded ASCII value of the mailbox is stored in decoded.
- checking the domain name of the address for Unicode characters. If Unicode is found, the domain is probably an internationalized domain name (IDN), and the domain name is transformed according to the Punycode standard (RFC 3492). In this case the syntax warning synm017 is added to the result, and the decoded ASCII value of the domain name is stored in decoded.
- checking the syntax according to the standards. As mentioned above, exotic cases, such as localhost addresses or addresses with comments, are rejected. The check expects real addresses usable for e-mail transfer across domains. If the syntax check fails, the test assumes an input error and the result includes a list of similar-sounding, popular domain names, taken from the domain_response table, in domainScores.
- checking the syntax against the rulebase of extended syntax criteria.
Address Quality Check
The address quality check uses a sequence of predefined test procedures to verify that an e-mail address is formally valid and actually exists. There are two versions: a fast version and a more precise version. The interfaces of both versions differ only marginally, so we document them together and mention the differences where applicable.
The input address can include encoded Unicode characters.
To call the fast address quality check use this syntax:
Syntax | /svc/2.0/address/quality/<e-mail address> |
Example | /svc/2.0/address/quality/foo@bar.com |
Parameter | An e-mail address as last part of the URL |
To call the enhanced address quality check use this one:
Syntax | /svc/2.0/address/quality-n/<e-mail address> |
Example | /svc/2.0/address/quality-n/foo@bar.com |
Parameter | An e-mail address as last part of the URL |
The main difference between the two versions is the way they handle temporary errors. Temporary errors are used by mailservers to signal either Greylisting anti-spam measures or real temporary problems, e.g. a high server load.
The normal fast address quality check treats temporary errors as problems that make it impossible to verify the existence of an e-mail address. It will immediately return upon encountering such a problem.
The enhanced address quality check is more thorough and starts a background check for addresses with temporary errors. The background checks repeat the tests according to predefined schedules, to see whether the temporary problem caused by Greylisting or other issues goes away. The results of these background checks can be queried by simply repeating the enhanced address quality check. See the documentation of the address field for more.
Result
As XML:
<qualityStatus>
<address>0</address>
<bounceRisk>0</bounceRisk>
<checked>1</checked>
<decoded>foo@xn--blmchen-o2a.de</decoded>
<domain>1</domain>
<extSyntax>1</extSyntax>
<mailserver>0</mailserver>
<mailserverDiagnosis>0</mailserverDiagnosis>
<probability>0</probability>
<syntax>2</syntax>
</qualityStatus>
As JSON:
{
"address":0,
"bounceRisk":0,
"checked":1,
"decoded":"foo@xn--blmchen-o2a.de",
"domain":1,
"extSyntax":1,
"mailserver":0,
"mailserverDiagnosis":0,
"probability":0,
"syntax":2
}
The result contains the results of a sequence of different tests that depend on each other. As soon as one of these fails (result = 0), the subsequent tests are not executed and also report a 0 result.
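The early-termination rule can be used on the client side to find the stage that stopped the sequence. The sketch below assumes that only syntax, extSyntax, domain and mailserver terminate the sequence, in that order; this selection and the function name are assumptions derived from the descriptions in this section, not a confirmed list.

```python
# Assumed order of the terminating stages (see the field descriptions).
STAGES = ("syntax", "extSyntax", "domain", "mailserver")


def first_failed_stage(status: dict):
    """Return the name of the first stage that reported 0, or None."""
    for stage in STAGES:
        if status.get(stage) == 0:
            return stage
    return None


# Values from the example result above.
example = {"syntax": 2, "extSyntax": 1, "domain": 1, "mailserver": 0, "address": 0}
print(first_failed_stage(example))
```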
We will discuss the results in the order of execution of the respective tests:
syntax
Tests the syntax of the address against the e-mail addressing standards. Possible result values are:
- 0: invalid syntax
- 1: valid syntax
- 2: probably valid syntax, Unicode problems were solved, see decoded
The test is stricter than the standards because it requires a valid domain name in the address. Localhost addresses and other exotic cases will not be accepted, because it is unlikely that these are e-mail addresses valid for business.
If the syntax result is 0 (invalid) or 2 (probably valid), the structure also contains syntax warnings explaining the problems, see below.
decoded
If the syntax test ended with a result of 2, this field will contain the decoded ASCII address. A syntax test result of 2 means that the address contained Unicode characters (e.g., umlauts, Arabic or Chinese characters), which are invalid in an e-mail address. These characters were successfully converted and the resulting valid ASCII address was stored in decoded. Further tests should always use this decoded address.
The decoding during the syntax test is done in two stages:
- the local part of the address is checked for German umlauts. If found, they are converted to their usual ASCII counterparts (ü - ue, ä - ae, ö - oe, ß - ss)
- the domain part of the address is transformed to Punycode, according to the standard for international domains
extSyntax
Many e-mail providers have their own rules for valid e-mail addresses of their domains. These typically include the minimal length of an address, which punctuation characters are allowed etc. The extended syntax check verifies addresses against these rules. Possible results are:
- 0: invalid syntax for this domain
- 1: valid syntax for this domain
If the extended syntax check fails, the result structure will include syntax warnings explaining the problem, see below.
syntaxWarnings
If one of the syntax checks fails, the result structure will include one or more syntaxWarnings elements:
<qualityStatus>
...
<syntaxWarnings>synm002</syntaxWarnings>
<syntaxWarnings>....
</qualityStatus>
Each element contains a message code, which can be used to identify the problem. See the page syntax warnings for codes and explanations.
domain
The domain test checks the validity of the domain name in the e-mail address. The system checks this by looking at the DNS record for the domain name. Possible result values are:
- 0: domain name does not exist
- 1: domain name exists
If the domain name is invalid, the system assumes a typo and tries to find similar domain names, which are offered to the user for selection. These domain names will be returned in the domainScores element, see domainScores.
domainScores
The domainScores element consists of a list of similar sounding domain names ordered by a calculated score:
<qualityStatus>
...
<domainScores>
<domainScore>
<domain>teleos-web.de</domain><score>1.0</score>
</domainScore>
</domainScores>
</qualityStatus>
The higher the score, the higher the probability that this domain was intended. The system calculates the score by searching for similar domain names that are popular with e-mail marketing users.
mailserver
If the domain exists, the system also checks whether the domain has a mailserver defined, again by checking the DNS record. Possible result values are:
- 0: no mailserver found
- 1: mailserver found
mailserverDiagnosis
Not all mailservers tell the truth about the existence of an e-mail address, mostly due to anti-spam measures. The mailserverDiagnosis element describes the response behaviour of the domain's mailservers to SMTP requests:
- 0: unknown
- 1: server tells the truth
- 2: server always answers that the address exists
- 3: server always answers that the address does not exist
- 4: SMTP requests ended with errors (e.g., network errors, timeout, server errors)
bounceRisk
The aggregated risk that an e-mail to this domain/address will be rejected (bounce):
- 0: there is a high bounce risk
- 1: there is a normal bounce risk
probability
This test calculates the probability of the domain name. Domain name input in e-mail address forms often results in typos, and this check tries to find such errors. Possible result values are:
- 0: the domain name has a low probability
- 1: the domain name has a normal or high probability
If the test ends with a low probability result, then the system found more popular domains with similar names. These are returned in a domainScores element.
Unlike the previous tests, a negative result here will not cause a termination. Since the test relies on probabilities, the assessment of the results is up to the user.
address
The address is verified by SMTP requests to one of the domain's mailservers. All address quality check versions support the following results:
- 0: address does not exist
- 1: address does exist
- 2: address not verifiable
The enhanced address quality check provides one more result:
- -1: encountered temporary error, background check initiated
As mentioned above the enhanced quality check tries to work around temporary errors and will repeat the SMTP test periodically, to see if the error will go away. To query the result of the background checks just repeat the API call until a result other than -1 is returned for the address field.
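Repeating the call can be wrapped in a small polling loop. The sketch below is transport-agnostic: `fetch` is a hypothetical callable that performs the quality-n request and returns the parsed result, and the interval and attempt limit are arbitrary example values, not values recommended by the service.

```python
import time
from typing import Callable


def wait_for_address_result(
    fetch: Callable[[], dict],
    interval_s: float = 60.0,
    max_attempts: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Repeat the enhanced check until the address field is no longer -1."""
    status = fetch()
    for _ in range(max_attempts - 1):
        if status.get("address") != -1:
            break
        sleep(interval_s)  # give the background check time to finish
        status = fetch()
    # May still contain -1 if the attempt limit was reached.
    return status
```

Passing `sleep` as a parameter keeps the loop easy to test without real waiting.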
checked
This field provides further information about the address check. Possible values are:
- 0: the result in address was taken from the SMTP cache
- 1: the result in address is the result of a real SMTP check
Functionality
This section describes the execution of the address quality check in detail. The check consists of the following steps:
- checking the mailbox (local part) of the address for Unicode characters. Unicode is not allowed in the local part, so this routine only checks for typical, language-specific typos. Currently this includes only German umlauts. If found, they are converted to their usual ASCII counterparts: ü - ue, ä - ae, ö - oe, ß - ss. If at least one of these conversions happens, the syntax warning synm018 is added to the result, and the decoded ASCII value of the mailbox is stored in decoded.
- checking the domain name of the address for Unicode characters. If Unicode is found, the domain is probably an internationalized domain name (IDN), and the domain name is transformed according to the Punycode standard (RFC 3492). In this case the syntax warning synm017 is added to the result, and the decoded ASCII value of the domain name is stored in decoded.
- if decoded contains a value, this address is used for all subsequent tests, otherwise the original address.
- checking the syntax according to the standards. As mentioned above, exotic cases, such as localhost addresses or addresses with comments, are rejected. The check expects real addresses usable for e-mail transfer across domains. If the syntax check fails, the test assumes an input error and the result includes a list of similar-sounding, popular domain names.
- checking the syntax against the rulebase of extended syntax criteria. These are provider-specific syntax rules that can change over time.
- checking the domain name. The component looks for a DNS (A) record for the domain. By default the component tries 2 times, with a timeout of 2 seconds each, before giving up. In case of failure the mentioned list of similar domain names is returned.
- looking for a mailserver. Using the DNS information from the previous step, the component looks for MX entries for the domain. By default the component tries 2 times, with a timeout of 2 seconds each, before giving up. In case of failure the mentioned list of similar domain names is returned.
- calculating the bounce risk from historic data, which contains an aggregation of previous e-mail transfers for various domains. If more than 80% of these transfers bounced, the bounce risk is considered high; in all other cases it is normal.
- calculating the domain name probability by checking whether there are domains with similar names that have higher address counts. If there are such domains, an input error is assumed, and the list of similar domains is included in the result.
- checking the SMTP cache for the domain's status. The SMTP cache tracks all SMTP test results and tries to find out whether an SMTP test for an address is appropriate. Some mailservers are set up to always answer positively or negatively when asked about a mail address. Others might have technical errors or simply be too slow. In all these cases it would not be useful to start an SMTP check, so the address check is skipped. The SMTP cache also contains lists of exceptions for domains and mailservers that provide fixed answers.
- checking the SMTP cache for an already existing Greylisting result. This step only occurs in the enhanced quality check. Results of Greylisting background checks are temporarily stored because it could be computationally expensive to repeat the test. By default these results are stored for 24 hours. If one of the stored addresses is checked again during the storage time, the SMTP results from the cache are used. Although these results are taken from the cache, the checked flag is set to 1 (really checked), because the cache content is the result of a recent SMTP check.
- checking the address by contacting the mailserver. If the SMTP cache does not object, an SMTP conversation with one of the mailservers for the domain is initiated.
- diagnosing the response behaviour of the domain's mailservers. If the address check was skipped due to the SMTP cache, the value from the cache is returned, otherwise the result of the SMTP check is used.
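The bounce risk step reduces to a threshold comparison. A minimal sketch, assuming the historic data is available as a bounce count and a total transfer count per domain (both parameter names are hypothetical, as is the treatment of domains without history):

```python
def bounce_risk(bounced: int, total: int) -> int:
    """Return 0 (high risk) if more than 80% of transfers bounced, else 1."""
    if total <= 0:
        # Assumption: a domain without history counts as normal risk.
        return 1
    return 0 if bounced / total > 0.80 else 1


print(bounce_risk(81, 100), bounce_risk(80, 100))
```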
Address spam trap check
The check tests whether the address provided is known as a spam trap:
Syntax | /svc/2.0/address/spamtrap/<address> |
Example | /svc/2.0/address/spamtrap/foo@bar.com |
Parameter | An ASCII email address as last part of the URL |
Result
<spamtrapStatus>
<infoId>foo@bar.com</infoId>
<result>1</result>
<trapType>1</trapType>
</spamtrapStatus>
or
{"infoId": "foo@bar.com", "result": 1, "trapType": 1}
The result document contains:
- result: the evaluation result
- 0: address is not a known spam trap
- 1: address is a known spam trap
- infoId: if the result is 1, this field contains the ID for looking up more information about a spam trap resource.
- trapType: the kind of spam trap found
- 0: not a known spam trap
- 1: the email address (the mailbox) is a known spam trap
- 2: the domain of the email address is a known spam trap
Functionality
The service tests the address against the contents of the configured data source. If there is a match it returns a positive result and the resource ID of the spam trap that matched.
Information about a spam trap resource
The spam trap info API call returns information about spam trap resources.
Syntax | /svc/2.0/info/spamtrap/<id> |
Example | /svc/2.0/info/spamtrap/foo@bar.com |
Parameter | An ID (string) as the last part of the URL |
Result
<spamtrapInfo>
<id>foo@bar.com</id>
<trapType>1</trapType>
<owner>Spamtrap ... GmbH</owner>
<remarks>Sending to spam traps of this kind leads to the following reactions ...</remarks>
<url>spam110trap.de/x1</url>
</spamtrapInfo>
or
{
"id":"foo@bar.com",
"trapType":1,
"owner":"Spamtrap ... GmbH",
"remarks":"Sending to spam traps of this kind leads to the following reactions ...",
"url":"spam110trap.de/x1"
}
The structure of the result document is:
- id: resource ID
- trapType: scope/kind of spam trap
- 1: single email address
- 2: a domain
- owner: the name of the maintaining entity, if known
- remarks: details about the resource
- url: a URL with more information about the provider and reactions
Functionality
If no data is available for an ID the server will signal this with HTTP response code 204 (No Content) and return no result (null).
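Clients of the info calls therefore need to treat 204 as a regular outcome and not as an error. The sketch below shows one way to do that for a JSON response; the function name and the decision to raise on any other status code are assumptions for illustration.

```python
import json
from typing import Optional


def parse_info_response(status_code: int, body: str) -> Optional[dict]:
    """Map an info response to a dict, or None when no data is available."""
    if status_code == 204:
        return None  # No Content: nothing is known about this ID
    if status_code != 200:
        raise ValueError(f"unexpected HTTP status {status_code}")
    return json.loads(body)


print(parse_info_response(204, ""))
```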
Address Robinson List check
The Robinson List check tests whether the address provided is registered in Robinson Lists.
Syntax | /svc/2.0/address/robinsonlist/<address> |
Example | /svc/2.0/address/robinsonlist/foo@bar.com |
Parameter | An ASCII email address as last part of the URL |
Result
<robinsonlistStatus>
<infoIds>
<infoId>ecg-liste</infoId>
</infoIds>
<result>1</result>
</robinsonlistStatus>
or
{"infoIds": ["ecg-liste"], "result": 1}
The result document contains:
- result: the evaluation result
- 0: address is not part of any of the configured Robinson lists
- 1: address is part of at least one of the configured Robinson lists
- infoIds: if the result is > 0, this contains IDs for looking up more information about a Robinson resource.
Functionality
The service encrypts the email address and tests the result against the contents of the configured data sources. If there is a match it returns a positive result and the resource IDs of the lists that matched.
Each execution of the check will be documented as a business event (database table business_event) with type 104.
Information about a Robinson resource
The Robinson List info API call returns information about Robinson List resources.
Syntax | /svc/2.0/info/robinsonlist/<id> |
Example | /svc/2.0/info/robinsonlist/ecg-liste |
Parameter | An ID (string) as the last part of the URL |
Result
<robinsonlistInfo>
<country>A</country>
<id>ecg-liste</id>
<name>ECG-Liste</name>
<owner>RTR GmbH</owner>
<remarks></remarks>
<url>http://www.rtr.at/de/tk/E_Commerce_Gesetz</url>
</robinsonlistInfo>
or
{
"country":"A",
"name":"ECG-Liste",
"id":"ecg-liste",
"owner":"RTR GmbH",
"remarks":"",
"url":"http://www.rtr.at/de/tk/E_Commerce_Gesetz"
}
The structure of the result document is:
- country: country of the list provider
- name: Robinson List name
- id: resource ID
- owner: the name of the maintaining entity, if known
- remarks: details about the resource
- url: a URL with more information about the provider and the list purpose
Functionality
If no data is available for an ID the server will signal this with HTTP response code 204 (No Content) and return no result (null).
Public Service Domains
The service tests whether an email address belongs to a domain related to a public service organization.
Syntax | /svc/2.0/address/publicservicedomain/<address> |
Example | /svc/2.0/address/publicservicedomain/foo@bar.com |
Parameter | An ASCII email address as last part of the URL |
Result
<publicServiceDomainStatus>
<infoIds>
<infoId>5000</infoId>
</infoIds>
<result>1</result>
</publicServiceDomainStatus>
or
{"infoIds": [5000], "result": 1}
- result: the evaluation result
- 0: not a public service domain
- 50: probably a public service domain
- 100: a public service domain
- infoIds: if result > 0, this element contains IDs for looking up more information about a public service resource.
Information about a public service resource
The public service info API call returns information about public service organization resources.
Syntax | /svc/2.0/info/publicservicedomain/<id> |
Example | /svc/2.0/info/publicservicedomain/5335 |
Parameter | An ID (number) as the last part of the URL |
Result
<publicServiceDomainInfo>
<addressCity>München, Landeshauptstadt</addressCity>
<addressState>Bayern</addressState>
<addressZipCode>80331</addressZipCode>
<countryCode>de</countryCode>
<description>Kreisfreie Stadt</description>
<id>5335</id>
<orgName>München, Landeshauptstadt</orgName>
</publicServiceDomainInfo>
or
{
"addressStreet":null,
"addressCity":"München, Landeshauptstadt",
"addressZipCode":"80331",
"addressState":"Bayern",
"countryCode":"de",
"description":"Kreisfreie Stadt",
"id":5335,
"orgName":"München, Landeshauptstadt"
}
The result document contains address data and descriptions for the public service resource.
Functionality
If no data is available for an ID the server will signal this with HTTP response code 204 (No Content) and return no result (null).
No Advertising Household Check
The purpose of the No Advertising address check is to test whether an email address belongs to the list of registered no-advertising households that should not be bothered with advertising information.
Syntax | /svc/2.0/address/no-advertising/<address> |
Example | /svc/2.0/address/no-advertising/foo@bar.com |
Parameter | An ASCII email address as last part of the URL |
Result
The service answers with an XML or JSON document providing the result code:
XML:
<no-advertising>
<result>1</result>
</no-advertising>
JSON:
{"result":1}
Possible result codes:
- 0 - not found
- 1 - found, the address belongs to a registered no advertising household.
Functionality
The service checks its input parameters against a list of no-advertising households that is provided by Nayoki.
Mailserver Diagnosis
The mailserver diagnosis API call returns information about the behaviour of mailservers, when asked about the existence of email addresses.
Syntax | /svc/2.0/info/mailserverdiagnostics/<domain> |
Example | /svc/2.0/info/mailserverdiagnostics/bar.com |
Parameter | An ASCII domain name as last part of the URL |
Result
<mailserverDiagnosisInfo>
<result>2</result>
</mailserverDiagnosisInfo>
or
{"result": 2}
If successful (response code 200) the service will return a document containing the diagnosis for the mailservers of the domain requested. If no data is available for a domain the server will signal this with HTTP response code 204 (No Content) and return no result (null).
The document contains:
- result: the diagnosis for the mailservers of a domain when asked about the existence of an e-mail address
- 1: the mailservers answer truthfully
- 2: the mailservers always answer that the address exists (catchall)
- 3: the mailservers always answer that the address does not exist (catchall)
- 4: the mailservers answer with errors or are not available
- 6: the mailservers answer truthfully, but use Greylisting
Functionality
The service looks up the domain name in the SMTP Cache and returns the data currently stored there, or null (response code 204) if the domain name is currently not included in the SMTP Cache. The result is a snapshot of the current state of the SMTP cache, which is dynamic, so subsequent requests might return different results. An exception to this rule are the domain exceptions, a configurable list of domain names with fixed diagnostics values. These domains take precedence, so the service will always return the configured diagnostics values.
If the SMTP cache is inactive the service will always return null (response code 204).
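On the client side the diagnosis codes can be given names, which makes it easier to decide whether an SMTP-based address result deserves trust. The enum member names and the helper below are hypothetical; only the numeric codes come from the list above.

```python
from enum import IntEnum


class MailserverDiagnosis(IntEnum):
    """Diagnosis codes returned in the result field."""
    TRUTHFUL = 1
    ALWAYS_EXISTS = 2
    ALWAYS_MISSING = 3
    ERRORS = 4
    TRUTHFUL_GREYLISTING = 6


def smtp_answer_is_reliable(code: int) -> bool:
    """True if the mailservers are reported to answer truthfully."""
    return code in (
        MailserverDiagnosis.TRUTHFUL,
        MailserverDiagnosis.TRUTHFUL_GREYLISTING,
    )


print(smtp_answer_is_reliable(2))
```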
Languages
The service finds out the primary languages for an address.
Syntax | /svc/2.0/address/language/<address> |
Example | /svc/2.0/address/language/foo@bar.com |
Parameter | An ASCII email address as last part of the URL |
Result
<languageStatus>
<result>1</result>
<languages>
<language>en</language>
</languages>
</languageStatus>
or
{"result": 1, "languages": ["en"]}
- result: the evaluation result
- 0: there is no information about the language orientation of this address available
- 1: the domain of this address has a national focus
- 2: the domain of this address has an international focus
- languages: 1-3 codes (ISO 639-1/2) for primary languages
IWT Domains (ISP, Webmailer and Telecom Domains)
The IWT check tests whether an address belongs to a domain of an ISP, a Webmail or a Telecom provider.
Syntax | /svc/2.0/address/iwt/<address> |
Example | /svc/2.0/address/iwt/foo@bar.com |
Parameter | An ASCII email address as last part of the URL |
Result
<iwtStatus>
<infoId>bar.com</infoId>
<result>1</result>
</iwtStatus>
or
{"infoId": "bar.com", "result": 1}
The result details are:
- result: the evaluation result
- 0: the domain does not belong to a known IWT provider
- 1: the domain belongs to a known IWT provider
- infoId: if the result == 1, this contains the ID for looking up more information about an IWT resource
Functionality
If the domain of the address matches a known IWT provider, a positive result is returned, which includes the ID for info lookup.
Information about an IWT resource
The IWT info API call returns information about various IWT resources, describing their market orientation and languages used.
Syntax | /svc/2.0/info/iwt/<ID> |
Example | /svc/2.0/info/iwt/bar.com |
Parameter | An info ID (domain name) as last part of the URL |
Result
<iwtInfo>
<id>bar.com</id>
<owner>Webmailer Corp.</owner>
<country>USA</country>
<orientation>1</orientation>
<languages>
<language>en</language>
</languages>
</iwtInfo>
or
{
"id": "bar.com",
"owner": "Webmailer Corp.",
"country": "USA",
"orientation":1,
"languages":["en"]
}
The result document contains:
- id: the ID of the resource, the domain name
- owner: the name of the owning entity, provider
- country: country name of the provider HQ
- orientation: market orientation
- 0: unknown
- 1: national
- 2: international
- languages: if the orientation is national (1) this field can contain 1-3 language codes (ISO 639-1/2) describing the languages used in that national market
Functionality
If successful (response code 200) the service will return a document with information about the requested IWT resource. If no data is available for an ID the server will signal this with HTTP response code 204 (No Content) and return no result (null).
Gender and Names
The service analyzes the local part of an address for gender and name information.
Syntax | /svc/2.0/address/gender/<address> |
Example | /svc/2.0/address/gender/herbert@bar.com |
Parameter | An ASCII email address as last part of the URL |
Result
<genderStatus>
<avgage>57.1</avgage>
<detail>3</detail>
<firstname>Herbert</firstname>
<nameday>2013-03-16</nameday>
<result>1</result>
</genderStatus>
or
{"avgage": 57.1, "detail": 3, "firstname": "Herbert", "nameday": "2013-03-16", "result": 1}
- result: the evaluation result
- 0: ambiguous or no gender information found, see detail codes 5-9
- 1: gender information found, see detail codes 1-4
- detail: detailed explanation of the result
- 1: gender = feminine
- 2: gender = mostly feminine
- 3: gender = masculine
- 4: gender = mostly masculine
- 5: gender = unisex
- 6: name not recognized
- 7: input error, most likely because the name was shorter than 3 characters
- 8: invalid address
- 9: technical error
- firstname: if the result is 1 this field contains the recognized first name, capitalized
- nameday: if a valid first name was found, and there is a name day defined for it, then the date of its next occurrence is returned. Next occurrence means: if the name day is today or still in the future, a date in the current year is returned, otherwise a date in the following year. If the full date is not important, the month and day of the name day can easily be extracted from the date.
- avgage: this field contains the average age for holders of a given first name, or 0.0 if there is not sufficient data for a prediction. Many first names follow historic or fashion trends, so they can be attributed to a certain age group. For rare or perennially popular names such predictions cannot be made, and 0.0 is returned.
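The rule for the next occurrence of a name day can be written in a few lines. This is a sketch with a hypothetical function name; it takes the month and day of the name day as input and does not handle a name day on 29 February in non-leap years.

```python
from datetime import date


def next_name_day(month: int, day: int, today: date) -> date:
    """Return the next occurrence of a name day, counting today as upcoming."""
    candidate = date(today.year, month, day)
    if candidate < today:
        # Already passed this year, so use the following year.
        candidate = date(today.year + 1, month, day)
    return candidate


print(next_name_day(3, 16, date(2013, 1, 10)))
```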
Functionality
There are four stages in the analysis process:
- the component tries to extract a first name from the local part of the address
- if a name was found it is processed by the gender analysis library
- if a name was found then the first name is compared with entries in the gender_name_day table to retrieve the month and day of the corresponding name day; a full date is computed
- if a name was found then the first name is compared to entries in the gender_avg_age table to retrieve the average age for the name
The rules for the extraction of a first name are:
- extract the local part, if the address is malformed return detail code 8
- if the length of the local part is less than 3 return detail code 7
- compare the local part with predefined patterns
- pattern firstname.surname; any punctuation character is accepted as the separator, not only the dot
- pattern firstname99, a name followed by a number
- if none of the previous patterns matched the whole local part is used as the first name
- if the length of the extracted first name is < 3, return detail code 7
- else call the gender analysis library with the extracted first name
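The extraction rules above can be sketched as follows. This is a simplified illustration, not the service's implementation: the function name, the two regular expressions and the minimal address validity check are assumptions, and the call to the gender analysis library is left out.

```python
import re
from typing import Optional, Tuple

DETAIL_INPUT_ERROR = 7      # name shorter than 3 characters
DETAIL_INVALID_ADDRESS = 8  # malformed address

# firstname<punctuation>surname, e.g. "herbert.meier" or "herbert_meier"
FIRST_SEP_SURNAME = re.compile(r"^([^\W\d_]+)[\W_]+.+$")
# firstname followed by a number, e.g. "herbert99"
FIRST_NUMBER = re.compile(r"^([^\W\d_]+)\d+$")


def extract_first_name(address: str) -> Tuple[Optional[str], Optional[int]]:
    """Return (first_name, None) on success or (None, detail_code) on failure."""
    local, at, domain = address.rpartition("@")
    if not at or not local or not domain:  # simplified validity check
        return None, DETAIL_INVALID_ADDRESS
    if len(local) < 3:
        return None, DETAIL_INPUT_ERROR
    for pattern in (FIRST_SEP_SURNAME, FIRST_NUMBER):
        match = pattern.match(local)
        if match:
            candidate = match.group(1)
            break
    else:
        # no pattern matched: the whole local part is the candidate
        candidate = local
    if len(candidate) < 3:
        return None, DETAIL_INPUT_ERROR
    return candidate, None
```

In this sketch an address such as h.meier@bar.com yields detail code 7, because the extracted first name "h" is shorter than three characters even though the local part is not.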
Education Domains
Education domain check
The service tests whether an email address belongs to a domain related to an educational institution.
Syntax | /svc/2.0/address/educationdomain/<address> |
Example | /svc/2.0/address/educationdomain/foo@bar.com |
Parameter | An ASCII email address as last part of the URL |
Result
<educationDomainStatus>
<infoId>bar.com</infoId>
<result>1</result>
</educationDomainStatus>
or
{"infoId": "bar.com", "result": 1}
- result: the evaluation result
- 0: no education domain found
- 1: education domain found
- infoId: if result == 1, this contains the ID (domain name) for looking up more information about an education domain resource.
Functionality
If a match is found a positive result (1) is returned. The ID of the matching entry, the domain name, is also returned for further lookup.
Information about an education domain resource
The education info API call returns information about education resources.
Syntax | /svc/2.0/info/educationdomain/<id> |
Example | /svc/2.0/info/educationdomain/bar.com |
Parameter | An ID (domain name) as the last part of the URL |
Result
<educationDomainInfo>
<addressAreaCode>6421</addressAreaCode>
<addressCity>Marburg</addressCity>
<addressFax>967-411</addressFax>
<addressHomePage>www.bar.com</addressHomePage>
<addressPhone>967-431</addressPhone>
<addressPostbox></addressPostbox>
<addressStreet>Dürerstraße 2011</addressStreet>
<addressZipCode></addressZipCode>
<countryCode>de</countryCode>
<domains>
<domain>bar.com</domain>
</domains>
<foundationYear>1909</foundationYear>
<orgName>Evangelische Hochschule Bar</orgName>
<orgSubtype>Theologische Hochschule</orgSubtype>
<orgType>Fachhochschulen und Hochschulen ohne Promotionsrecht</orgType>
<rightDoctorate>false</rightDoctorate>
<rightHabilitation>false</rightHabilitation>
<shortName>Marburg EH Bar</shortName>
<sponsorship>privat, staatlich anerkannt</sponsorship>
<state>Hessen</state>
<students>69</students>
</educationDomainInfo>
or
{
"addressAreaCode":"6421",
"addressCity":"Marburg",
"addressFax":"967-411",
"addressHomePage":"www.bar.com",
"addressPhone":"967-431",
"addressPostbox":"",
"addressStreet":"Dürerstraße 2011",
"addressZipCode":"",
"countryCode":"de",
"domains": ["bar.com"],
"foundationYear":1909,
"orgName":"Evangelische Hochschule Bar",
"orgType":"Fachhochschulen und Hochschulen ohne Promotionsrecht",
"orgSubtype":"Theologische Hochschule",
"rightDoctorate":false,
"rightHabilitation":false,
"shortName":"Marburg EH Bar",
"sponsorship":"privat, staatlich anerkannt",
"state":"Hessen",
"students":69
}
The result document can contain multiple domains that act as IDs for that institution. The document further provides address and other self-explanatory data about the educational organization.
Functionality
If no data is available for an ID the server will signal this with HTTP response code 204 (No Content) and return no result (null).
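A client has to treat the 204 response as "no data" rather than as an error. The following sketch shows one way to do that. The host name in BASE_URL is a placeholder, the Accept header used to request JSON is an assumption (this section does not state how the response format is selected), and authentication is omitted.

```python
import json
from typing import Optional
from urllib.parse import quote
from urllib.request import Request, urlopen

BASE_URL = "https://api.example.com"  # hypothetical host, replace with the real one


def info_url(resource: str, info_id: str) -> str:
    """Build the info URL for a resource type such as 'educationdomain'."""
    return f"{BASE_URL}/svc/2.0/info/{resource}/{quote(info_id, safe='')}"


def parse_info_response(status: int, body: bytes) -> Optional[dict]:
    """Return the decoded JSON document, or None for 204 (No Content)."""
    if status == 204 or not body:
        return None
    return json.loads(body)


def fetch_info(resource: str, info_id: str) -> Optional[dict]:
    """Perform the GET request; the Accept header is an assumption."""
    request = Request(info_url(resource, info_id),
                      headers={"Accept": "application/json"})
    with urlopen(request) as response:
        return parse_info_response(response.status, response.read())
```

The same helper applies to the other info calls, which use the same 204 convention.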
Disposable addresses
Disposable address check
The service tests if a mail address belongs to a domain providing temporary, disposable email addresses.
Syntax | /svc/2.0/address/disposable/<address> |
Example | /svc/2.0/address/disposable/foo@bar.com |
Parameter | An ASCII email address as last part of the URL |
Result
<disposableStatus>
<infoId>bar.com</infoId>
<result>1</result>
</disposableStatus>
or
{"infoId": "bar.com", "result": 1}
- result: the evaluation result
- 0: not a disposable email address
- 1: is a disposable email address
- infoId: if result == 1, this contains the ID (domain name) for looking up more information about a disposable resource
Functionality
If a match is found a positive result (1) is returned. The ID of the matching entry, the domain name, is also returned for further lookup.
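The check and info calls are meant to be chained: a positive check result carries the ID for the follow-up lookup. A minimal sketch of that step, with a hypothetical helper name, could look like this:

```python
from typing import Optional


def info_path(resource: str, status: dict) -> Optional[str]:
    """Return the path of the follow-up info call, or None if there is no match.

    `status` is the decoded JSON result of a check call,
    e.g. {"infoId": "bar.com", "result": 1}.
    """
    if status.get("result") == 1 and status.get("infoId"):
        return f"/svc/2.0/info/{resource}/{status['infoId']}"
    return None
```

For the example result above, info_path("disposable", ...) gives the path of the disposable info call for bar.com.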
Information about a disposable resource
The disposable info API call returns information about disposable resources.
Syntax | /svc/2.0/info/disposable/<id> |
Example | /svc/2.0/info/disposable/bar.com |
Parameter | An ID (domain name) as the last part of the URL |
Result
<disposableInfo>
<country>USA</country>
<id>bar.com</id>
<owner>Dosposabil</owner>
<remarks>No sign-in necessary</remarks>
<validity>3T</validity>
</disposableInfo>
or
{
"country":"USA",
"id":"bar.com",
"owner":"Dosposabil",
"remarks":"No sign-in necessary",
"validity":"3T"
}
The structure of the result document is:
- id: the ID of the resource, which is also the domain name representing the provider
- owner: the name of the organization providing the disposable addresses
- country: country name for the provider organization/domain
- remarks: further details about the resource
- validity: lifetime of the disposable addresses provided by this domain
Functionality
If no data is available for an ID the server will signal this with HTTP response code 204 (No Content) and return no result (null).
Company Domains
The service tests whether an email address belongs to a domain representing a company.
Syntax | /svc/2.0/address/companydomain/<address> |
Example | /svc/2.0/address/companydomain/foo@bar.com |
Parameter | An ASCII email address as last part of the URL |
Result
<companyDomainStatus>
<infoId>bar.com</infoId>
<result>1</result>
</companyDomainStatus>
or
{"infoId": "bar.com", "result": 1}
- result: the evaluation result
- 0: no company domain found
- 1: company domain found
- infoId: if result == 1, this contains the ID (domain name) for looking up more information about a company domain resource.
Functionality
If a match is found a positive result (1) is returned. The ID of the matching entry, the domain name, is also returned for further lookup.
Information about a company domain resource
The company info API call returns information about company resources.
Syntax | /svc/2.0/info/companydomain/<id> |
Example | /svc/2.0/info/companydomain/bar.com |
Parameter | An ID (domain name) as the last part of the URL |
Company domain resources contain language-specific descriptions, namely those of the economic classifications. To get these descriptions in a specific language, add an HTTP Accept-language header to the request. The service then returns the descriptions in the requested language if available, otherwise in the default language defined for each schema.
Result
<companyDomainInfo>
<id>bar.com</id>
<orgName>Bar.Com GmbH</orgName>
<addressStreet>Barstrasse 11</addressStreet>
<addressCity>Neumarkt</addressCity>
<addressZipCode>92318</addressZipCode>
<addressLatitude>11.000001</addressLatitude>
<addressLongitude>49.111110</addressLongitude>
<countryCode>de</countryCode>
<classifications>
<classification>
<id>34.10</id>
<description>Herstellung von Kraftwagen und Kraftwagenteilen</description>
<lang>de</lang>
<type>wz2003</type>
</classification>
</classifications>
</companyDomainInfo>
or
{
"id":"bar.com",
"orgName":"Bar.Com GmbH",
"addressStreet":"Barstrasse 11",
"addressCity":"Neumarkt",
"addressZipCode":"92318",
"countryCode": "de",
"addressLatitude": 11.000001,
"addressLongitude": 49.111110,
"classifications": [
{"id":"34","text":"Herstellung von Kraftwagen und Kraftwagenteilen", "type": "wz2003", "lang": "de"}
]
}
The structure of the result document is:
- id: the ID of the resource, which is also the domain name representing the company
- orgName: the name of the company
- addressStreet/City/ZipCode: postal address data for the company
- countryCode: ISO 3166-1 alpha-2 country code for the address
- addressLatitude/Longitude: geographic coordinates for the company address
- classifications: zero or more economic classification entries describing the area of economic activity for this company
- id: ID of the classification entry according to the schema used
- type: the classification schema used in this entry, possible values:
- wz2003: older German classification, see Klassifikation der Wirtschaftszweige, Ausgabe 2003
- wz2008: recent German classification, see Klassifikation der Wirtschaftszweige, Ausgabe 2008
- nace2: EU classification, see NACE Rev. 2 - Statistical classification of economic activities
- isic4: UN classification, see International Standard Industrial Classification of All Economic Activities, Rev.4
- lang: language code for the description; can be influenced by the Accept-language header, see above
- description: descriptive text for the classification entry, available in various languages
Functionality
If no data is available for an ID the server will signal this with HTTP response code 204 (No Content) and return no result (null).
The descriptive texts are available in different languages. The service will try to find the best match for the language specified in the Accept-language header, or fall back to a default language if it cannot.
The available language editions are:
- wz2003: default = de, available = de
- wz2008: default = en, available = de, en
- nace2: default = en, available = bg, cs, da, de, el, en, es, et, fi, fr, hu, it, lt, lv, mt, nl, no, pl, pt, ro, ru, sk, sl, sv, tr
- isic4: default = en, available = en, es, fr
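The language fallback can be illustrated with the editions listed above. This is a sketch under assumptions: how the service actually matches the header is not documented here, so the q-value ordering and the reduction of tags like fr-CH to fr are illustrative choices, and the function names are hypothetical.

```python
from typing import List

DEFAULT_LANG = {"wz2003": "de", "wz2008": "en", "nace2": "en", "isic4": "en"}
AVAILABLE = {
    "wz2003": {"de"},
    "wz2008": {"de", "en"},
    "nace2": set("bg cs da de el en es et fi fr hu it lt lv mt nl no pl pt "
                 "ro ru sk sl sv tr".split()),
    "isic4": {"en", "es", "fr"},
}


def parse_accept_language(header: str) -> List[str]:
    """Return primary language tags ordered by descending q-value."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        # keep only the primary subtag, e.g. "fr" from "fr-ch"
        weighted.append((-quality, position, tag.split("-")[0]))
    return [tag for _, _, tag in sorted(weighted)]


def pick_language(schema: str, header: str) -> str:
    """Pick the best available language for a schema, else its default."""
    for lang in parse_accept_language(header):
        if lang in AVAILABLE[schema]:
            return lang
    return DEFAULT_LANG[schema]
```

With this logic a request for English wz2003 descriptions still receives German text, since de is the only edition of that schema.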
Challenge-Response risks
The service tests if a mail address belongs to a domain that uses Challenge-Response as an anti-spam mechanism.
Syntax | /svc/2.0/address/crrisk/<address> |
Example | /svc/2.0/address/crrisk/foo@bar.com |
Parameter | An ASCII email address as last part of the URL |
Result
<crRiskStatus>
<result>1</result>
</crRiskStatus>
or
{"result": 1}
- result: the evaluation result
- 0: there is no CR risk for this address/domain
- 1: there is a CR risk for this address/domain
Address botrisk check
The botrisk check evaluates the risk that the address in question represents an automated bot, not a real person.
Syntax | /svc/2.0/address/botrisk/<address> |
Example | /svc/2.0/address/botrisk/foo@bar.com |
Parameter | An ASCII email address as last part of the URL |
Result
<botriskStatus>
<infoIds>
<infoId>2</infoId>
<infoId>3</infoId>
</infoIds>
<result>1</result>
</botriskStatus>
or
{"infoIds": [2, 3], "result": 1}
The service looks for typical naming patterns in different parts of the address and returns the sum. The result document contains:
- result: the evaluation result
- 0: normal risk, nothing found
- 10: slightly increased risk, a part of the address is suspicious
- 20: medium risk, some parts of the address are suspicious
- 30: high risk, the complete address is suspicious
- infoIds: if the result is > 0, this contains IDs for looking up more information about a botrisk resource.
Functionality
The service tests the requested address and parts of it against different types of entries stored in a database:
- the complete address against address entries
- the domain only against domain entries
- the local part only against local part entries
- the complete address against regex entries
If a match is found a risk result (>0) is returned. In case of multiple matches the risk increases. The IDs of the matching entries are also returned for further lookup.
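The matching described above can be sketched as follows. The entry type numbers follow the botRiskType values of the botrisk info call (1 = domain, 2 = local part, 3 = complete address, 4 = regular expression). How the service combines several matches is not specified here; adding 10 per match and capping at 30 is an assumption made for this sketch, as are the class and function names.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class BotriskEntry:
    id: int
    type: int   # 1 = domain, 2 = local part, 3 = complete address, 4 = regex
    value: str


def check_botrisk(address: str, entries: List[BotriskEntry]) -> Tuple[int, List[int]]:
    """Return (risk result, IDs of matching entries) for an address."""
    address = address.lower()
    local, _, domain = address.rpartition("@")
    matched = []
    for entry in entries:
        if entry.type == 1:
            hit = domain == entry.value
        elif entry.type == 2:
            hit = local == entry.value
        elif entry.type == 3:
            hit = address == entry.value
        elif entry.type == 4:
            hit = re.search(entry.value, address) is not None
        else:
            hit = False
        if hit:
            matched.append(entry.id)
    # assumption: 10 points per match, capped at the documented maximum of 30
    return min(30, 10 * len(matched)), matched
```

The returned IDs correspond to the infoIds field and can be used for the botrisk info lookup.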
Botrisks
The botrisk info API call returns information about botrisk resources. Bots can be identified by their naming patterns, which are documented here.
Syntax | /svc/2.0/info/botrisk/<id> |
Example | /svc/2.0/info/botrisk/1 |
Parameter | An ID (number) as the last part of the URL |
Result
<botRiskInfo>
<botRiskType>1</botRiskType>
<id>1</id>
<owner>A botrisk provider name</owner>
<remarks>Details about the bot, its maintainer</remarks>
<url>http://bot-maintainer.com</url>
</botRiskInfo>
or
{
"botRiskType": 1,
"id": 1,
"owner": "A botrisk provider name",
"remarks": "Details about the bot, its maintainer",
"url": "http://bot-maintainer.com"
}
The structure of the result document is:
- botRiskType: the part of the address or pattern that led to the risk estimation
- 1: domain name
- 2: local part
- 3: complete mail address
- 4: regular expression
- id: resource ID
- owner: the name of the maintaining entity, if known
- remarks: details about the resource
- url: a URL with more information about the bot or its maintainer
Functionality
If no data is available for an ID the server will signal this with HTTP response code 204 (No Content) and return no result (null).
Address blacklist check
Sending commercial mails to certain addresses, administrative or other, might result in the sender being blacklisted. The blacklist API call evaluates this risk for an email address.
Syntax | /svc/2.0/address/blacklist/<address> |
Example | /svc/2.0/address/blacklist/foo@bar.com |
Parameter | An ASCII email address as last part of the URL |
Result
<blacklistStatus>
<infoId>bar.com</infoId>
<listType>2</listType>
<result>1</result>
</blacklistStatus>
or
{"infoId": "bar.com", "listType": 2, "result": 1}
The result details are:
- result: the evaluation result for the blacklist risk
- 0: there is no known risk
- 1: there is a risk of getting blacklisted
- 2: error, bad address
- listType: details for the evaluation, the causes for the blacklisting
- 0: no problem, default value when result == 0
- 1: the domain name belongs to a blacklist provider
- 2: the local part of the address will probably cause blacklisting; examples are administrative addresses like *abuse@...* or *spam@...*
- infoId: the guilty part of the address, which is also the ID for looking up more information about a blacklist resource
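The relationship between result, listType and infoId can be illustrated with a small sketch. The two lookup sets are hypothetical sample data, and both the order in which domain and local part are tested and the fields returned in the error case are assumptions.

```python
BLACKLIST_DOMAINS = {"blacklist-provider.example"}  # hypothetical sample data
RISKY_LOCAL_PARTS = {"abuse", "spam"}               # examples named in the text


def blacklist_status(address: str) -> dict:
    """Return a status document shaped like the blacklist check result."""
    local, at, domain = address.lower().rpartition("@")
    if not at or not local or not domain:
        return {"listType": 0, "result": 2}  # error, bad address
    if domain in BLACKLIST_DOMAINS:
        return {"infoId": domain, "listType": 1, "result": 1}
    if local in RISKY_LOCAL_PARTS:
        return {"infoId": local, "listType": 2, "result": 1}
    return {"listType": 0, "result": 0}
```

In this sketch the infoId is the domain name for listType 1 and the local part for listType 2, matching the IDs accepted by the blacklist info call.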
Information about a blacklist resource
The blacklist info API call returns information about various blacklist resources, according to the resource type.
Syntax | /svc/2.0/info/blacklist/<ID> |
Example | /svc/2.0/info/blacklist/abuse |
Parameter | An info ID (local part or domain name) as last part of the URL |
Result
<blacklistInfo>
<id>abuse</id>
<listType>2</listType>
<owner/>
<remarks>Administrative address for problem reports with a domain</remarks>
<url/>
</blacklistInfo>
or
{
"id": "abuse",
"listType": 2,
"owner": "",
"remarks": "Administrative address for problem reports with a domain",
"url": ""
}
The result document contains:
- id: the ID of the resource
- listType: the type of the blacklist resource
- 1: the domain name belongs to a blacklist provider
- 2: the local part of the address will probably cause blacklisting; examples are administrative addresses like *abuse@...* or *spam@...*
- owner: if the resource represents a blacklist provider this contains the name of the owning entity
- remarks: further details about the blacklist resource
- url: a URL for further information about the blacklist provider, e.g. contact information
Functionality
If successful (response code 200), the service returns a document with information about the requested blacklist resource. If no data is available for an ID, the server signals this with HTTP response code 204 (No Content) and returns no result (null).
Common Syntax Warnings
Syntax warnings are used by the Address Quality and Address Syntax checks to communicate syntax problems found while checking an e-mail address. The check results then contain message codes that identify the problem.
The initial set of message codes is documented here. While the messages of the normal syntax check (synm codes) are fixed, the extended syntax check (extm codes) uses an editable rulebase. Please check this rulebase for additions and changes.
Syntax warning codes for the normal syntax check:
Code | Explanation |
---|---|
synm001 | No @ found |
synm002 | No local part (mailbox name) found |
synm003 | No domain name found |
synm004 | Local part contains non-ASCII characters |
synm005 | Domain name contains non-ASCII characters |
synm006 | Invalid address format |
synm007 | Invalid mailbox name |
synm008 | Invalid domain name |
synm009 | Invalid top level domain (TLD) |
synm010 | Invalid IP address format |
synm011 | More than one @ found |
synm012 | The top level domain (TLD) can only contain letters and must have a minimum length of two. |
synm013 | The local part cannot be longer than 64 characters |
synm014 | The domain name cannot be longer than 254 characters |
synm015 | The e-mail address cannot be longer than 254 characters |
synm016 | Invalid domain name (IRI) according to RFC 3490 |
synm017 | The domain name contained Unicode characters and was decoded |
synm018 | The local part contained Unicode characters and was decoded |
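To show how a few of these codes relate to concrete conditions, here is a minimal sketch covering only the structural and length checks. It is not the service's checker: the function name is hypothetical, and the order and combination in which codes are reported are assumptions.

```python
from typing import List


def basic_syntax_warnings(address: str) -> List[str]:
    """Return synm codes for a handful of structural problems."""
    at_count = address.count("@")
    if at_count == 0:
        return ["synm001"]           # No @ found
    warnings = []
    if at_count > 1:
        warnings.append("synm011")   # More than one @ found
    local, _, domain = address.rpartition("@")
    if not local:
        warnings.append("synm002")   # No local part found
    if not domain:
        warnings.append("synm003")   # No domain name found
    if not local.isascii():
        warnings.append("synm004")   # Local part contains non-ASCII characters
    if not domain.isascii():
        warnings.append("synm005")   # Domain name contains non-ASCII characters
    if len(local) > 64:
        warnings.append("synm013")   # Local part too long
    if len(domain) > 254:
        warnings.append("synm014")   # Domain name too long
    if len(address) > 254:
        warnings.append("synm015")   # Address too long
    return warnings
```

An empty list means that none of the conditions covered by this sketch applied; the remaining codes require real format validation and are not modelled here.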
Initial set of syntax warning codes for the extended syntax check:
Code | Explanation |
---|---|
extm001 | The mailbox length must be between 3-32 characters |
extm002 | Only letters (a-z) and digits (0-9) are allowed |
extm003 | The first character must be a letter |
extm004 | Only letters, digits and punctuation characters dot, hyphen and underscore are allowed |
extm005 | Multiple occurrences of punctuation characters dot, hyphen and underscore are not allowed |
extm006 | The punctuation characters dot, hyphen and underscore are not allowed at the beginning or end |
extm007 | Dots are not allowed at the end |
extm008 | The mailbox length must be between 3-50 characters |
extm009 | The mailbox length must be between 5-40 characters |
extm010 | The mailbox length must be between 5-30 characters |
extm011 | The mailbox length must be between 4-32 characters |
extm012 | Only letters, digits and punctuation characters dot and underscore are allowed |
extm013 | Only one dot is allowed |
extm014 | The mailbox length must be between 2-50 characters |
extm015 | The mailbox length must be between 3-40 characters |
extm016 | Multiple occurrences of dots are not allowed |
extm017 | The mailbox length must be between 2-30 characters |
extm018 | Only letters, digits and punctuation characters dot, hyphen, underscore, plus, minus, slash and ampersand are allowed |
extm019 | The mailbox length must be between 6-30 characters |
extm020 | Only letters, digits and the punctuation character dot are allowed |
extm021 | The mailbox length must be between 3-20 characters |
extm022 | The mailbox length must be between 1-64 characters |
extm023 | The mailbox length must be between 6-20 characters |
extm024 | The mailbox length must be larger than 2 |
extm025 | The mailbox length must be between 2-31 characters |
extm026 | The first character must be either a letter or a digit |
extm027 | Only letters and digits are allowed at the beginning or end |
extm028 | Multiple occurrences of underscores are not allowed |
extm029 | Dots are not allowed at the beginning or end |
extm030 | The first character must be either a letter or a digit |
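Because the extended check is driven by an editable rulebase, it can be thought of as a table that maps codes to predicates on the mailbox name. The sketch below models five of the codes above. It is an illustration, not the actual rulebase: the regular expressions are assumptions, and "multiple occurrences" is read here as consecutive punctuation characters.

```python
import re
from typing import Callable, Dict, List

# Each predicate returns True when the mailbox name satisfies the rule.
RULES: Dict[str, Callable[[str], bool]] = {
    "extm001": lambda m: 3 <= len(m) <= 32,
    "extm003": lambda m: re.match(r"[A-Za-z]", m) is not None,
    "extm004": lambda m: re.fullmatch(r"[A-Za-z0-9._-]*", m) is not None,
    "extm005": lambda m: re.search(r"[._-]{2,}", m) is None,
    "extm006": lambda m: re.search(r"^[._-]|[._-]$", m) is None,
}


def extended_warnings(mailbox: str, active_codes: List[str]) -> List[str]:
    """Return the codes of all active rules that the mailbox name violates."""
    return [code for code in active_codes if not RULES[code](mailbox)]
```

Passing the active codes as a parameter reflects that different rule sets may apply in different situations; which codes are active for a given address is defined by the rulebase, not by this sketch.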
Role Analysis
The service tests whether an email address is a functional one, that is, whether it belongs to a role such as info, admin or office rather than to a person.
Syntax | /svc/2.0/address/role/<address> |
Example | /svc/2.0/address/role/info@bar.com |
Parameter | An ASCII email address as last part of the URL |
Result
<roleStatus>
<result>1</result>
</roleStatus>
or
{"result": 1}
- result: the evaluation result
- 0: address does not contain a functional role
- 1: address contains a functional role
Functionality
If a match is found a positive result (1) is returned.
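As a rough illustration, the role check can be pictured as a lookup of the local part in a list of role names. The list below contains only the three examples named above, and exact matching of the whole local part is an assumption; the service's actual list and matching rules are not documented in this section.

```python
ROLE_NAMES = frozenset({"info", "admin", "office"})  # examples from the text only


def role_result(address: str) -> int:
    """Return 1 if the local part is a known role name, otherwise 0."""
    local = address.rpartition("@")[0].lower()
    return 1 if local in ROLE_NAMES else 0
```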