Categorization AI#

Users can train categorization AIs for each project. Each document is assigned to exactly one category of the project.

Users can activate one categorization AI per project to assign a document to a category automatically.

Categorization AI Detail#

Project#

The project used to train the categorization AI.

Status#

The current status of the AI training. Possible values:

  • “Queuing for training…”: The categorization AI is waiting in the queue for its training to be started.

  • “Data loading in progress…”: The training process has started and the Konfuzio server loads the training data into memory.

  • “AI training in progress…”: The training data is loaded into memory and the actual training takes place.

  • “AI evaluation in progress…”: The categorization AI is trained and is now being evaluated.

  • “Training finished.”: The categorization AI is evaluated and can be used.

If the categorization AI could not be trained, it will have the status “Contact support”.
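The status values above can be modeled as a small enumeration, for example when a script needs to decide whether an AI is ready to use. This is a minimal sketch: the status strings are taken from the list above, while the names `TrainingStatus`, `is_usable`, and `is_in_progress` are hypothetical and not part of any Konfuzio API.

```python
from enum import Enum


class TrainingStatus(Enum):
    # The string values mirror the statuses listed above.
    # The member names are illustrative assumptions.
    QUEUING = "Queuing for training…"
    LOADING = "Data loading in progress…"
    TRAINING = "AI training in progress…"
    EVALUATING = "AI evaluation in progress…"
    FINISHED = "Training finished."
    FAILED = "Contact support"


def is_usable(status: TrainingStatus) -> bool:
    """Only an AI that has been trained and evaluated can be used."""
    return status is TrainingStatus.FINISHED


def is_in_progress(status: TrainingStatus) -> bool:
    """True while the AI is queued, loading data, training, or being evaluated."""
    return status not in (TrainingStatus.FINISHED, TrainingStatus.FAILED)
```

A caller could look up a status by its display text, for example `TrainingStatus("Training finished.")`, and pass the result to `is_usable`.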

Description#

A description that documents the reason for the training.

Version#

The version number, incremented with each training.

AI Parameter#

The AI parameters as saved when the training started.

Created At#

Date and time when training was started.

Loading time (in seconds):#

Displays the average, minimum and maximum loading time across all runs of this AI.

Runtime (in seconds):#

Displays the average, minimum and maximum runtime across all runs of this AI. Adding the loading time and the runtime gives the overall time an AI run consumed for a document.
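The relationship between the two timing fields can be illustrated with a short calculation. This is a sketch with made-up sample values; the function names `summarize` and `overall_times` are hypothetical, and the sketch assumes that each run reports one loading time and one runtime in seconds.

```python
from statistics import mean


def summarize(seconds):
    """Return the average, minimum and maximum of a list of durations."""
    return {"avg": mean(seconds), "min": min(seconds), "max": max(seconds)}


def overall_times(loading, runtime):
    """Overall time per run is the loading time plus the runtime of that run."""
    return [load + run for load, run in zip(loading, runtime)]


# Illustrative values only, one entry per run.
loading_times = [1.5, 2.0, 2.5]
runtimes = [0.5, 1.0, 1.5]

loading_summary = summarize(loading_times)
overall_summary = summarize(overall_times(loading_times, runtimes))
```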

Training Log#

The log file of the training task. This is useful for debugging.

Evaluation Log#

The log file of the evaluation run. This is useful for debugging.

Train categorization AI#

The training process is fully automated, so the only setup step is to add a short description. This description helps you relate the intention behind any change in the project to the quality of the categorization AI.

To train a categorization AI, the training set must contain documents from at least two different categories.

To improve the quality of a categorization AI, use only documents that each relate to a single category. If one file contains several categories, split it before you upload it.
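The two-category requirement can be checked before starting a training. The following is a minimal sketch, assuming the category of each training document is available as a plain label, with `None` for documents that have no category yet. The function name `can_train_categorization` is hypothetical.

```python
def can_train_categorization(training_categories):
    """Check that the training set covers at least two different categories.

    training_categories: one category label per training document,
    or None for a document without a category (assumed representation).
    """
    distinct = {label for label in training_categories if label is not None}
    return len(distinct) >= 2
```

For example, `can_train_categorization(["invoice", "receipt", "invoice"])` returns `True`, while a training set containing only invoices returns `False`.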

Retrain categorization AI#

If you have uploaded new documents to your project, you can train a new version of your categorization AI.

  1. Assign the new documents the Status: Training documents.

  2. Train the categorization AI as described above.

  3. Because the documents with Status: Test documents stay the same while the number of documents with Status: Training documents grows, the AI quality should improve.

  4. Read more about how to Improve Extraction AI to improve your categorization AI even further.

Splitting a PDF containing a batch of scanned documents#

We provide the functionality to split stacked scans on request.

Categorization AI actions#

Evaluate categorization AIs#

If you change which documents are assigned to the test dataset, you can re-evaluate older categorization AIs. This makes it possible to compare different categorization AIs on the current test dataset.
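Comparing several AI versions on the same test dataset can be sketched as follows. The example uses plain accuracy (the share of correctly categorized test documents) as the metric; this is an assumption made for illustration, and the metric reported by the server may differ. All names and sample values are hypothetical.

```python
def accuracy(predicted, expected):
    """Share of test documents whose predicted category matches the expected one."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        return 0.0
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Expected categories of the current test documents (illustrative values).
test_labels = ["invoice", "receipt", "invoice", "contract"]

# Predictions of two AI versions on the same test documents (illustrative values).
predictions_by_version = {
    1: ["invoice", "invoice", "invoice", "contract"],
    2: ["invoice", "receipt", "invoice", "contract"],
}

scores = {
    version: accuracy(predicted, test_labels)
    for version, predicted in predictions_by_version.items()
}
best_version = max(scores, key=scores.get)
```

Because every version is scored against the same test documents, the resulting scores are directly comparable.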
