Text Extraction - Case Study

The Client

A Canadian process outsourcing and data services company.

Industry

Data Services

The Challenge

Xilligence was asked to build a text recognition product that could extract both text and structured data from physical document records.

The requirement was made more challenging by the large amount of data held in forms and tables. That data was expected to be organized after digitization, so the product had to insert it into a database and retrieve it for later use in applications. Key elements and contextual relationships had to be maintained throughout.

The Solution

Xilligence built a process to scan and digitize the records. From there, we created a process that used image recognition and machine learning (ML) technologies to extract the data and populate databases with it. On top of that, we used Natural Language Processing (NLP) to keep the contextual and data relationship information intact. Subsequent manual checks confirmed that the data was digitized correctly.
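
To illustrate how form data can keep its key-to-value relationships after extraction, here is a minimal Python sketch. It is not the project's actual code. It assumes the extraction step returns blocks shaped like the output of Amazon Textract's AnalyzeDocument operation with the FORMS feature enabled, and it pairs each key with its value. The function names and the sample blocks are hypothetical and hand-written for illustration; no AWS call is made.

```python
from typing import Dict, List


def _text_of(block: dict, by_id: Dict[str, dict]) -> str:
    """Join the WORD children of a block into a single string."""
    words: List[str] = []
    for rel in block.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = by_id[child_id]
            if child["BlockType"] == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def extract_form_fields(blocks: List[dict]) -> Dict[str, str]:
    """Map each form key to its value, following KEY -> VALUE links."""
    by_id = {b["Id"]: b for b in blocks}
    fields: Dict[str, str] = {}
    for block in blocks:
        is_key = (
            block["BlockType"] == "KEY_VALUE_SET"
            and "KEY" in block.get("EntityTypes", [])
        )
        if not is_key:
            continue
        key_text = _text_of(block, by_id)
        value_text = ""
        for rel in block.get("Relationships", []):
            if rel["Type"] == "VALUE":
                value_text = " ".join(
                    _text_of(by_id[value_id], by_id) for value_id in rel["Ids"]
                )
        fields[key_text] = value_text
    return fields


# Hypothetical, hand-written blocks standing in for a real extraction result.
SAMPLE_BLOCKS = [
    {"Id": "k1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["KEY"],
     "Relationships": [{"Type": "VALUE", "Ids": ["v1"]},
                       {"Type": "CHILD", "Ids": ["w1", "w2"]}]},
    {"Id": "v1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["VALUE"],
     "Relationships": [{"Type": "CHILD", "Ids": ["w3"]}]},
    {"Id": "w1", "BlockType": "WORD", "Text": "Invoice"},
    {"Id": "w2", "BlockType": "WORD", "Text": "Number:"},
    {"Id": "w3", "BlockType": "WORD", "Text": "10482"},
]

print(extract_form_fields(SAMPLE_BLOCKS))  # {'Invoice Number:': '10482'}
```

Keeping the key and value linked by block ID, rather than by position on the page, is what lets the relationship survive once the text leaves the scanned image.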

Key Features
  • Table and Form Data Maintained
  • Contextual Data Extraction
  • Manual Review

Technologies Used
  • AWS Cloud Services
  • AI/ML
    • Amazon Comprehend
    • Amazon Textract
  • Database
    • MySQL
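
As a rough illustration of the database step, the Python sketch below stores extracted form fields in a relational table and reads them back by document. It is an assumption-laden example, not the delivered schema: the table name, column names, and helper functions are hypothetical, and Python's built-in sqlite3 module stands in for MySQL so the example runs without a server.

```python
import sqlite3
from typing import Dict


def store_fields(conn: sqlite3.Connection, document_id: str,
                 fields: Dict[str, str]) -> None:
    """Insert one row per extracted field, keyed by document and field name."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS form_fields ("
        "document_id TEXT NOT NULL, "
        "field_name TEXT NOT NULL, "
        "field_value TEXT, "
        "PRIMARY KEY (document_id, field_name))"
    )
    # INSERT OR REPLACE is SQLite syntax; MySQL would use REPLACE INTO or
    # INSERT ... ON DUPLICATE KEY UPDATE for the same effect.
    conn.executemany(
        "INSERT OR REPLACE INTO form_fields VALUES (?, ?, ?)",
        [(document_id, name, value) for name, value in fields.items()],
    )
    conn.commit()


def load_fields(conn: sqlite3.Connection, document_id: str) -> Dict[str, str]:
    """Read back every field stored for one document."""
    rows = conn.execute(
        "SELECT field_name, field_value FROM form_fields "
        "WHERE document_id = ? ORDER BY field_name",
        (document_id,),
    ).fetchall()
    return dict(rows)


# Hypothetical sample data; an in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
store_fields(conn, "doc-001", {"Invoice Number:": "10482", "Customer:": "Acme Ltd"})
print(load_fields(conn, "doc-001"))
```

Storing the document ID alongside each field name keeps every value tied to the record it came from, which is the relationship downstream applications would rely on.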

Xilligence Services Used

Mobile + Apps

Cloud + Web

QA + Support

Design + UI / UX
