Define your custom output format for supported documents using a schema YAML file. The document parser generates output that matches the structure specified in this schema file. The table below lists the supported data types, their descriptions, and example use cases:
Datatype | Description | Use Case Example |
---|---|---|
Object | Single occurrence fields. Outputs in JSON key-value pairs. | Unique invoice number in an invoice. |
Array | Multiple occurrence fields. Outputs as an array of JSON key-value pairs. | List of products in an invoice. |
String | Textual information at the field level. | Seller's or buyer's name in an invoice. |
Float | Numeric information with decimal values at the field level. | Unit price of a product in an invoice. |
Integer | Numeric information with whole numbers at the field level. | Quantity of a product bought in an invoice. |
Date | Date information at the field level. Supports various formats for year, month, and day. | Invoice date or payment date in an invoice. |
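The `format` strings on date fields in the sample schemas (for example `"%Y-%m-%d"` and `"%B %Y"`) look like strftime-style directives. The following is a minimal sketch of what those two patterns produce, assuming the parser follows the same conventions as Python's `datetime`; the source does not confirm which formatting library the parser uses.

```python
from datetime import date

# A sample date used only for illustration.
d = date(2024, 3, 5)

# "%Y-%m-%d": four-digit year, zero-padded month, zero-padded day.
iso_style = d.strftime("%Y-%m-%d")

# "%B %Y": full month name followed by the four-digit year.
# The month name depends on the active locale (English by default).
month_year = d.strftime("%B %Y")

print(iso_style)   # 2024-03-05
print(month_year)  # March 2024
```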
Listed below are the sample schemas our parsers use by default. Refer to these examples when creating your own schema file.
Nested objects and arrays are not supported in the output schema.
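One way to check this constraint before submitting a schema is sketched below. It assumes the `schema:` section of the YAML file has already been loaded into a Python dict; `find_nesting_errors` and `SCALAR_TYPES` are hypothetical names, not part of the parser. It also assumes, based only on the sample schemas, that every top-level field is an `object` or an `array` of objects whose properties are scalar.

```python
# Scalar types from the data type table above.
SCALAR_TYPES = {"string", "float", "integer", "date"}


def find_nesting_errors(schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the schema is flat."""
    errors = []
    for field, spec in schema.items():
        kind = spec.get("type")
        if kind == "object":
            props = spec.get("properties", [])
        elif kind == "array":
            # In the sample schemas, array items are objects with scalar properties.
            props = spec.get("items", {}).get("properties", [])
        else:
            errors.append(f"{field}: expected object or array, got {kind!r}")
            continue
        for prop in props:
            if prop.get("type") not in SCALAR_TYPES:
                errors.append(
                    f"{field}.{prop.get('name')}: unsupported type {prop.get('type')!r}"
                )
    return errors


flat = {
    "invoice_information": {
        "type": "object",
        "properties": [{"name": "invoice_number", "type": "string"}],
    }
}
nested = {
    "invoice_information": {
        "type": "object",
        "properties": [{"name": "supplier", "type": "object"}],
    }
}

print(find_nesting_errors(flat))
print(find_nesting_errors(nested))
```

A check like this also flags a property whose `type` is not one of the scalar types, such as a format string placed in the `type` key by mistake.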
Invoice
document_type: invoice
schema:
  invoice_items:
    type: array
    items:
      type: object
      properties:
        - name: item_name
          type: string
        - name: item_description
          type: string
        - name: item_quantity
          type: float
        - name: unit_price
          type: float
        - name: total_price
          type: float
  invoice_information:
    type: object
    properties:
      - name: invoice_number
        type: string
      - name: supplier_name
        type: string
      - name: supplier_address
        type: string
      - name: receiver_name
        type: string
      - name: receiver_address
        type: string
      - name: supplier_email
        type: string
      - name: nett_value
        type: float
      - name: tax_amount
        type: float
      - name: gross_value
        type: float
      - name: balance_due
        type: float
      - name: delivery_charge_amount
        type: float
      - name: other_charge_amount
        type: float
      - name: invoice_currency
        type: string
      - name: invoice_delivery_date
        type: date
        format: "%Y-%m-%d"
      - name: payment_due_date
        type: date
        format: "%Y-%m-%d"
      - name: invoice_date
        type: date
        format: "%Y-%m-%d"
      - name: purchase_order_number
        type: string
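To picture how the invoice schema maps to output, the sketch below builds an empty skeleton from a trimmed copy of it: an `object` field becomes one set of key-value pairs, and an `array` field becomes a list of such sets, following the data type table. The helper name `empty_output` is hypothetical, and the exact JSON layout the parser emits is an assumption, since the source does not show a sample response.

```python
import json


def empty_output(schema: dict) -> dict:
    """Build a placeholder result with the shape implied by the schema."""
    out = {}
    for field, spec in schema.items():
        if spec["type"] == "array":
            # Array: a list of key-value sets, one per occurrence (one shown here).
            names = [p["name"] for p in spec["items"]["properties"]]
            out[field] = [dict.fromkeys(names)]
        else:
            # Object: a single set of key-value pairs.
            names = [p["name"] for p in spec["properties"]]
            out[field] = dict.fromkeys(names)
    return out


# Trimmed copy of the invoice schema above, written as a Python dict.
invoice_schema = {
    "invoice_items": {
        "type": "array",
        "items": {
            "type": "object",
            "properties": [
                {"name": "item_name", "type": "string"},
                {"name": "unit_price", "type": "float"},
            ],
        },
    },
    "invoice_information": {
        "type": "object",
        "properties": [
            {"name": "invoice_number", "type": "string"},
            {"name": "invoice_date", "type": "date", "format": "%Y-%m-%d"},
        ],
    },
}

result = empty_output(invoice_schema)
print(json.dumps(result, indent=2))
```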
Bank Statement
document_type: bank_statement
schema:
  accounts:
    type: object
    properties:
      - name: account_holder_name
        type: string
      - name: account_number
        type: string
      - name: bank_address
        type: string
      - name: account_holder_address
        type: string
      - name: statement_start_period
        type: date
        format: "%B %Y"
      - name: statement_end_period
        type: date
        format: "%B %Y"
  transactions:
    type: array
    items:
      type: object
      properties:
        - name: date
          type: date
          format: "%Y-%m-%d"
        - name: description
          type: string
        - name: debit amount
          type: float
        - name: credit amount
          type: float
        - name: type (debit/credit)
          type: string
        - name: balance
          type: float
Passport
document_type: passport
schema:
  passport:
    type: object
    properties:
      - name: passport_number
        type: string
      - name: passport_holder_name
        type: string
      - name: passport_holder_address
        type: string
      - name: passport_holder_gender(male/female)
        type: string
      - name: passport_holder_date_of_birth
        type: date
        format: '%Y-%m-%d'
      - name: passport_issuing_country
        type: string
      - name: passport_issuance_date
        type: date
        format: '%Y-%m-%d'
      - name: passport_expiration_date
        type: date
        format: '%Y-%m-%d'
Payslip
document_type: payslip
schema:
  payslip:
    type: object
    properties:
      - name: employer_name
        type: string
      - name: employee_name
        type: string
      - name: gross_salary
        type: float
      - name: net_salary
        type: float
      - name: allowances
        type: float
      - name: deductions
        type: float
      - name: deduction_description
        type: string
      - name: year_to_date_salary
        type: float
      - name: payslip_month
        type: date
        format: "%B %Y"
      - name: credit_bank_name
        type: string
      - name: credit_bank_account_number
        type: string
      - name: hr_email_information
        type: string
      - name: country_of_employment
        type: string
      - name: currency_of_payslip
        type: string
      - name: language_of_payslip
        type: string
Certificate of Employment
document_type: employment_certificate
schema:
  employment_information:
    type: object
    properties:
      - name: employer_name
        type: string
      - name: employer_address
        type: string
      - name: employee_name
        type: string
      - name: employee_id
        type: string
      - name: employment_start_date
        type: string
      - name: employment_start_date_formatted
        type: date
        format: "%Y-%m-%d"
      - name: employment_end_date
        type: string
      - name: employee_designation
        type: string
      - name: hr_name
        type: string
      - name: hr_contact
        type: string
      - name: compensation
        type: float
      - name: compensation_frequency (weekly, biweekly, monthly etc)
        type: string
      - name: total_allowances_amount
        type: float
      - name: allowances_description
        type: string
      - name: total_bonuses_amount
        type: float
      - name: work_responsibilities
        type: string
      - name: certificate_requested_by
        type: string
      - name: certificate_issued_for
        type: string
      - name: country_of_employment
        type: string
      - name: currency_of_compensation
        type: string
      - name: document_issue_date_original
        type: string
      - name: document_issue_date_formatted
        type: date
        format: "%Y-%m-%d"
ITR Information
document_type: itr
schema:
  itr:
    type: object
    properties:
      - name: tax_identification_number(TIN)
        type: string
      - name: name
        type: string
      - name: employer_name
        type: string
      - name: employer_address
        type: string
      - name: signatory
        type: string
      - name: signatory_designation
        type: string
      - name: gross_compensation
        type: float
      - name: total_taxable_income
        type: float
Utility Bills
document_type: utility_bills
schema:
  utility_bill:
    type: object
    properties:
      - name: statement_date
        type: date
        format: "%Y-%m-%d"
      - name: name
        type: string
      - name: coverage
        type: string
      - name: due_date
        type: date
        format: "%Y-%m-%d"
      - name: previous_balance
        type: float
      - name: previous_payment
        type: float
      - name: current_balance
        type: float
      - name: please_pay_identifier
        type: string
Credit Card Statements
document_type: credit_card_statement
schema:
  accounts:
    type: object
    properties:
      - name: account_holder_name
        type: string
      - name: account_number
        type: string
      - name: bank_address
        type: string
      - name: account_holder_address
        type: string
      - name: statement_start_period
        type: date
        format: "%B %Y"
      - name: statement_end_period
        type: date
        format: "%B %Y"
  transactions:
    type: array
    items:
      type: object
      properties:
        - name: date
          type: date
          format: "%Y-%m-%d"
        - name: description
          type: string
        - name: debit amount
          type: float
        - name: credit amount
          type: float
        - name: type (debit/credit)
          type: string
        - name: balance
          type: float
Loan Statement
document_type: loan_statement
schema:
  accounts:
    type: object
    properties:
      - name: account_holder_name
        type: string
      - name: account_number
        type: string
      - name: bank_address
        type: string
      - name: account_holder_address
        type: string
      - name: bank_name
        type: string
      - name: loan_amount
        type: float
      - name: outstanding_loan_amount
        type: float
      - name: monthly_installments
        type: float
      - name: tenure_paid(number of months)
        type: integer
      - name: tenure_pending(number of months)
        type: integer
      - name: Please pay by date
        type: date
        format: '%Y-%m-%d'
  transactions:
    type: array
    items:
      type: object
      properties:
        - name: date
          type: date
          format: "%Y-%m-%d"
        - name: description
          type: string
        - name: amount
          type: float
        - name: type (debit/credit)
          type: string
        - name: balance
          type: float