Uploading resumes to Konfuzio

Upload a picture, a PDF file, or any other supported file of a resume to our API or web interface. Usually, this is done from a mobile app, email, FTP, or web application.
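An upload to the API could look roughly like the following sketch. It only builds a multipart POST request with the Python standard library and does not send it. The URL, the form field names ("project", "data_file"), and the token header are assumptions for illustration, not confirmed values; please check the Konfuzio API reference for the real ones.

```python
import mimetypes
import uuid
from urllib.request import Request

# Hypothetical endpoint, used only for illustration.
EXAMPLE_UPLOAD_URL = "https://konfuzio.example/api/documents/"


def build_upload_request(filename, content, project_id, token, url=EXAMPLE_UPLOAD_URL):
    """Build (but do not send) a multipart/form-data POST for one resume file."""
    boundary = uuid.uuid4().hex
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"

    # Form field "project" (assumed name): the project the resume belongs to.
    project_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="project"\r\n\r\n'
        f"{project_id}\r\n"
    ).encode("utf-8")

    # Form field "data_file" (assumed name): the resume file itself.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="data_file"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")

    closing = f"--{boundary}--\r\n".encode("utf-8")
    body = project_part + file_header + content + b"\r\n" + closing

    headers = {
        "Authorization": f"Token {token}",  # assumed auth scheme
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return Request(url, data=body, headers=headers, method="POST")


request = build_upload_request("cv.pdf", b"%PDF-1.4 demo", project_id=1, token="abc")
```

Passing the request to urllib.request.urlopen would send it; the same body can be produced from a mobile app or a web application with any HTTP client.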

The resume is converted to an archivable PDF/A, even if it is an Excel, Word, image, or any other non-PDF file. The resulting PDF/A of the resume is searchable.

Image to text using OCR

As soon as a resume is uploaded, each page is converted to text using OCR. This step extracts all the text from the resume, but the text is not yet structured.

Getting JSON output from the API

Konfuzio takes the text produced by the OCR step and converts it into structured JSON using machine learning. The JSON is then returned as output from the API. At this point, all important data fields of the resume, such as experience or skill entries and addresses, have been extracted.

From here, the extracted data can be loaded into your database, or the resume can be stored temporarily as a searchable PDF.
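As a minimal sketch of the database step, the example below parses a JSON response and writes the extracted fields into an in-memory SQLite table. The response shape (keys "document_id", "annotations", "label", "value", "confidence"), the table name, and the confidence threshold are hypothetical and chosen only for illustration; the actual API output may be structured differently.

```python
import json
import sqlite3

# Hypothetical response shape, for illustration only.
SAMPLE_RESPONSE = """
{
  "document_id": 42,
  "annotations": [
    {"label": "Email", "value": "jane@example.com", "confidence": 0.98},
    {"label": "Skill", "value": "Python", "confidence": 0.91},
    {"label": "Skill", "value": "SQL", "confidence": 0.42}
  ]
}
"""


def store_annotations(conn, response_text, min_confidence=0.5):
    """Insert annotations at or above min_confidence; return how many were stored."""
    payload = json.loads(response_text)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS resume_fields "
        "(document_id INTEGER, label TEXT, value TEXT, confidence REAL)"
    )
    rows = [
        (payload["document_id"], a["label"], a["value"], a["confidence"])
        for a in payload["annotations"]
        if a["confidence"] >= min_confidence
    ]
    conn.executemany("INSERT INTO resume_fields VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
stored = store_annotations(conn, SAMPLE_RESPONSE)
skills = [
    row[0]
    for row in conn.execute("SELECT value FROM resume_fields WHERE label = 'Skill'")
]
```

Filtering on a confidence value is a design choice of this sketch, not a documented behavior: low-confidence entries could instead be kept and flagged for manual review.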

Extract resume details

Below is our suggestion for setting up your model to extract data fields. The fields can be tailored to your needs.

  • Create a new Category named “Resume”.

  • To extract the Education and Work Experience, create one Label Set named “Position” and enable the “Multiple Annotations sets” tick box.

  • Create Institution, Title, Start, End, Timespan, and Description as Labels and add those Labels to the “Position” Label Set.

Annotate the resumes you uploaded. For guidance on training the AI, see the picture below.

Per position, you could use the following setup:

  • Institution

  • Title

  • Start

  • End

  • Timespan

  • Description
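On the consuming side, the per-position setup above can be mirrored with a small data structure. The sketch below assumes the extraction results arrive as (set index, Label name, value) triples, where the set index identifies which position an entry belongs to. This input format is an assumption made for the example, not the actual output format.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Position:
    """One education or work experience entry, mirroring the Labels above."""
    institution: Optional[str] = None
    title: Optional[str] = None
    start: Optional[str] = None
    end: Optional[str] = None
    timespan: Optional[str] = None
    description: Optional[str] = None


FIELD_NAMES = {f.name for f in fields(Position)}


def group_positions(annotations):
    """Group (set_index, label, value) triples into one Position per set index."""
    positions = {}
    for set_index, label, value in annotations:
        position = positions.setdefault(set_index, Position())
        name = label.lower()
        if name in FIELD_NAMES:  # ignore Labels that are not part of a position
            setattr(position, name, value)
    return [positions[i] for i in sorted(positions)]


# Hypothetical extraction results for a resume with two positions.
extracted = [
    (0, "Institution", "Example University"),
    (0, "Title", "B.Sc. Computer Science"),
    (0, "Start", "2015"),
    (0, "End", "2018"),
    (1, "Institution", "Example GmbH"),
    (1, "Title", "Data Engineer"),
    (1, "Start", "2018"),
]
positions = group_positions(extracted)
```

Every field is optional because a resume may omit details, for example the end date of a current job.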

[Image: Smart Data Capturing of a CV to extract data with AI]

Additional fields can be extracted as needed.

An Extraction AI will allow you to:

  • extract the Country of origin

  • detect the language in the resume

  • extract the Addresses of the candidate

  • extract the location of each position held by the candidate

  • extract Email addresses

  • detect Phone numbers in resumes

  • separate the Work experience and Education

  • identify Skills

  • find references to websites, social media, or GitHub

  • detect duplicates

On top of that, a searchable PDF/A is available for download.
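To illustrate one item from the list above, duplicate detection, here is a deliberately simple stand-in that compares hashes of the normalized resume text. It is not how Konfuzio detects duplicates; it only shows the general idea, and the file names and texts are made up.

```python
import hashlib
import re


def text_fingerprint(text):
    """Hash the text after lowercasing and collapsing all whitespace."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicates(resumes):
    """Map each duplicate resume name to the first name seen with the same text."""
    seen = {}
    duplicates = {}
    for name, text in resumes.items():
        fingerprint = text_fingerprint(text)
        if fingerprint in seen:
            duplicates[name] = seen[fingerprint]
        else:
            seen[fingerprint] = name
    return duplicates


# Made-up OCR texts of three uploaded resumes.
resumes = {
    "cv_a.pdf": "Jane Doe\nData Engineer",
    "cv_b.pdf": "jane doe   data engineer",
    "cv_c.pdf": "John Roe\nAnalyst",
}
duplicates = find_duplicates(resumes)
```

Exact hashing only catches resumes whose text is identical after normalization; resumes that differ by a few OCR errors would need a fuzzy comparison instead.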

If you have multiple Categories in your Project, make sure to train a Categorization AI so that resumes are classified automatically.