Fetch OCR Data

An introduction to fetching OCR data for an Incode onboarding session

What is OCR data?

Optical Character Recognition (OCR) data for an onboarding session contains all the information extracted from the ID document during that session. This includes the printed text as well as any barcode or machine-readable zone (MRZ) data read and decoded from the back of the ID.

📘

Different documents, different data

Data in ID documents can vary widely, so this endpoint's response is dynamic: it contains only the information relevant to the detected ID. If you consume these responses from a statically typed language (like Java or C#), be aware that strict JSON deserialization can fail when a property you expect is missing.
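
As an illustration of reading a dynamic response defensively, the following minimal Python sketch walks nested properties and falls back to a default when one is missing. The helper name `get_nested` and the sample values are hypothetical; only the field names come from the samples in this article.

```python
from typing import Any, Optional


def get_nested(data: dict, *keys: str, default: Optional[Any] = None) -> Any:
    """Walk nested dicts, returning `default` as soon as a key is missing."""
    current: Any = data
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Made-up response fragment; real responses vary by document type.
response = {"name": {"fullName": "JANE DOE"}, "typeOfId": "DriversLicense"}

full_name = get_nested(response, "name", "fullName")
# "addressFields" is absent here, so the default is returned instead of raising.
postal_code = get_nested(response, "addressFields", "postalCode", default="")
```

The same idea applies in Java or C#: map fields to nullable or optional types and configure the parser to tolerate absent properties.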

How can I fetch OCR data?

To fetch OCR data for a given onboarding session, pass the session's unique interview ID, also called the Session ID, to the fetch OCR data API endpoint. You can also test this endpoint at the preceding link. Passing the interview ID explicitly is recommended; if you omit it, Incode attempts to extract it from the session token.
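
The call described above could be prepared along the following lines. This is only a sketch under stated assumptions: the base URL, the endpoint path, the `id` query parameter name, and the `X-Session-Token` header name are all hypothetical placeholders, not values confirmed by this article. Take the real ones from the endpoint's API reference.

```python
import urllib.parse
import urllib.request
from typing import Optional

# Assumed header name for the session token; check the API reference.
TOKEN_HEADER = "X-Session-Token"


def build_ocr_request(
    base_url: str,
    endpoint_path: str,
    session_token: str,
    interview_id: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a GET request for a session's OCR data."""
    url = base_url.rstrip("/") + "/" + endpoint_path.lstrip("/")
    if interview_id is not None:
        # Assumed query parameter name; omit it to rely on the session token.
        url += "?" + urllib.parse.urlencode({"id": interview_id})
    request = urllib.request.Request(url, method="GET")
    request.add_header(TOKEN_HEADER, session_token)
    return request


# Placeholder values for illustration only.
request = build_ocr_request(
    "https://api.example.invalid", "/ocr-data", "my-session-token", "abc123"
)
```

Sending the request (for example with `urllib.request.urlopen`) and decoding the JSON body is left out so that the sketch does not depend on network access.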

Common use cases

To extract information from the ID document attached to a session

The following sample shows some of the most commonly used OCR fields from the ID document. It is not a complete list, since the fields vary from document to document. A longer JSON response example with additional fields is at the end of this article.

{
    "name": {
        "fullName": "",
        "firstName": "",
        "paternalLastName": "",
        "maternalLastName": "", // Optional
        "givenName": "",
        "middleName": "", // Optional
        "nameSuffix": "", // Optional
        "machineReadableFullName": "", // Optional, full name from Barcode or MRZ
        "givenNameMrz": "", // Optional
        "lastNameMrz": "" // Optional
    },
    "address": "", // Optional, address as read from ID
    "addressFields": {
        "street": "", // Optional
        "colony": "", // Optional, not applicable for all countries
        "postalCode": "", // Optional
        "city": "", // Optional
        "state": "" // Optional
    },
    "typeOfId": "", // ID classification, e.g. Drivers License, Voter Identification
    "issuedAt": 0, // Issue date, expressed as an epoch timestamp in milliseconds
    "expireAt": 0, // Expiration date, expressed as an epoch timestamp in milliseconds
    "issuingCountry": "", // Optional
    "documentNumber": "", // Optional
    "fullAddress": true, // Optional, true if the address from the ID is full (has three lines)
    "cic": "", // Mexican INE only
    "ocr": "" // Mexican INE only
}
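
Because `issuedAt` and `expireAt` are epoch timestamps in milliseconds, they need to be divided by 1000 before most date libraries can read them. The sketch below shows one way to convert them and check expiry in Python; the helper names and the sample timestamps are illustrative, not part of the API.

```python
from datetime import datetime, timezone
from typing import Optional


def epoch_ms_to_datetime(value: Optional[int]) -> Optional[datetime]:
    """Convert an epoch timestamp in milliseconds to an aware UTC datetime."""
    if value is None:
        return None
    return datetime.fromtimestamp(value / 1000, tz=timezone.utc)


def is_expired(ocr: dict, now: datetime) -> Optional[bool]:
    """Return True/False if `expireAt` is present, or None when it is missing."""
    expires = epoch_ms_to_datetime(ocr.get("expireAt"))
    if expires is None:
        return None
    return expires < now


# Made-up sample values: 2020-01-01 and 2030-01-01 (UTC), in milliseconds.
sample = {"issuedAt": 1577836800000, "expireAt": 1893456000000}
issued = epoch_ms_to_datetime(sample["issuedAt"])
```

Returning `None` for a missing `expireAt` keeps "unknown" distinct from "not expired", which matters because not every document carries an expiration date.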

To extract information from the POA document attached to a session

When you need to extract the address or other information from a proof of address (POA) document added to a session, you can access these fields in the JSON response:

{
    "documentType": "", // Classifier of the provided POA document
    "poaName": "", // The name that appears in the provided POA document
    "addressStatementEmissionDate": "", // Issue (emission) date of the statement, expressed as an epoch timestamp in milliseconds
    "addressFromStatement": "", // Full address read from statement
    "addressFieldsFromStatement": {
        "street": "",
        "colony": "",
        "postalCode": "",
        "city": "",
        "state": ""
    }
}

These fields should be enough to answer questions like:

  • What's the address in the POA document?
  • What name appears in the POA document?
  • When was the POA document issued?
  • What kind of document was used as POA?
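
The questions above map directly onto the POA fields. As a rough sketch, the hypothetical helper below collects the answers from a parsed response, leaving `None` wherever a field is absent; the sample values are invented for illustration.

```python
def summarize_poa(ocr: dict) -> dict:
    """Collect the POA answers from a parsed OCR response, tolerating gaps."""
    fields = ocr.get("addressFieldsFromStatement") or {}
    return {
        "address": ocr.get("addressFromStatement"),
        "name": ocr.get("poaName"),
        "issued": ocr.get("addressStatementEmissionDate"),
        "document_type": ocr.get("documentType"),
        "postal_code": fields.get("postalCode"),
    }


# Invented sample data; real values depend on the captured document.
poa_response = {
    "documentType": "UtilityBill",
    "poaName": "JANE DOE",
    "addressFromStatement": "1 Main St, Springfield",
    "addressFieldsFromStatement": {"street": "1 Main St", "city": "Springfield"},
}
summary = summarize_poa(poa_response)
```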

Other OCR data extraction endpoints

The preceding samples come from the standard ocr data endpoint. Incode offers three alternative endpoints for OCR data extraction, although these are less likely to be needed:

  1. OCR data v2 wraps the response from the ocr-data endpoint under an ocrData field. It returns the same data as the standard OCR data endpoint.
  2. Second ID's OCR data is required if your flow is configured to attach two ID documents to the same session. It returns the OCR data from the second ID.
  3. Batch fetch OCR data can be used when you want to fetch OCR data for multiple onboarding sessions.
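
Since OCR data v2 returns the same data nested under an `ocrData` field, client code can normalize both shapes with a small helper. The function below is a hypothetical sketch of that idea, not part of any Incode SDK.

```python
def unwrap_ocr(response: dict) -> dict:
    """Return the OCR payload whether or not it is wrapped under `ocrData`."""
    inner = response.get("ocrData")
    return inner if isinstance(inner, dict) else response


# Invented sample payloads showing the two shapes.
v1_response = {"typeOfId": "Passport"}
v2_response = {"ocrData": {"typeOfId": "Passport"}}
```

With this in place, downstream parsing code can be written once against the unwrapped shape.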

📘

Scores and OCR Data

OCR data appears in the Incode Dashboard along with some scoring. This scoring relates solely to the level of confidence in the data extracted from the captured image of the ID. The OCR confidence level does not directly affect or alter the score of a session.

Sample JSON File