Extraction AI#

For each Category, users can train Extraction AIs.

Users can activate one Extraction AI per Category, which will then be used to create Automated Annotations.

Extraction AI details#

Label Set#

The Label Set in the Project that was used for training.

Status#

The possible statuses of an AI training:

  • “Queuing for training…”: The Extraction AI is waiting in the queue for its training to start.

  • “Data loading in progress…”: The training process has started and the Konfuzio server loads the training data into memory.

  • “AI training in progress…”: The training data is loaded into memory and the actual training takes place.

  • “AI evaluation in progress…”: The Extraction AI is trained and the evaluation of the trained model is conducted.

  • “Training finished.”: The Extraction AI is evaluated and can be used.

If the Extraction AI could not be trained, it will have the status “Contact support”.

Description#

A description documenting the reason for the training.

Version#

The version number, incremented with each training.

AI Parameter#

The Extraction AI Parameters as saved when the training was started.

Created At#

Date and time when training was started.

Loading time (in seconds)#

Displays the average, minimum and maximum loading time across all runs of this AI.

Runtime (in seconds)#

Displays the average, minimum and maximum runtime across all runs of this AI. Adding the loading time and the runtime gives the overall time an AI run on a Document has consumed.

Training Log#

The log file of the training task. This is useful for debugging purposes.

Evaluation Log#

The log file of the evaluation run. This is useful for debugging purposes.

Evaluation#

AI quality evaluation on Category, Label Set and Label level. The evaluation is divided into separate sections for Training and Test dataset Documents.

Figure: evaluation_table.png

Our evaluation process compares the extraction results of a predicted Document with the Annotations of the corresponding ground truth Document. We use the following criteria for counting True Positives (TP), False Positives (FP), and False Negatives (FN).
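From the resulting counts, the usual quality metrics can be derived. A minimal sketch, assuming the standard definitions of precision, recall and F1 score (the exact aggregation used in the evaluation table may differ):

```python
def precision(tp: int, fp: int) -> float:
    """Share of predicted Annotations that are correct."""
    return tp / (tp + fp) if tp + fp else 0.0

def recall(tp: int, fn: int) -> float:
    """Share of ground truth Annotations that were found."""
    return tp / (tp + fn) if tp + fn else 0.0

def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r) if p + r else 0.0
```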

Non-Strict Evaluation#

By default, with the Non-Strict Evaluation, we count a TP when there is any overlap between the predicted Annotation and the ground truth Annotation; partial matches are thus considered TPs.

Example: The predicted Annotation “12.2027” for the (6) Datum zu K Label partially overlaps with the ground truth Annotation “08.12.2027”, therefore it counts as a TP in the Non-Strict Evaluation.

Figure: Ground Truth (gt_weak.png) and Prediction (predicted_weak.png).
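Conceptually, the Non-Strict match boils down to a character-offset overlap check. A minimal sketch, assuming Annotations are compared by their start/end offsets in the Document text (the offsets below are made up for illustration):

```python
def non_strict_match(pred: tuple[int, int], gt: tuple[int, int]) -> bool:
    """Non-Strict Evaluation: any overlap between the predicted span and
    the ground truth span counts; end offsets are treated as exclusive."""
    pred_start, pred_end = pred
    gt_start, gt_end = gt
    return pred_start < gt_end and gt_start < pred_end

# "12.2027" inside "08.12.2027": hypothetical offsets (103, 110) vs. (100, 110)
# overlap, so the partial match counts as a TP.
print(non_strict_match((103, 110), (100, 110)))  # True
```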

Additionally, for Labels whose Multiple option is set to False, we count only the first TP for each AnnotationSet. If no TP is present, we count only the first FP; if there is no FP either, we count only the first FN. This logic is sketched in code after the example below.

Example: Assume that the Multiple option for the Label Bezeichnung is set to False.

  1. The predicted Annotation “Überstd.grundverg.” for the Bezeichnung Label partially overlaps with the ground truth Annotation “Überstd.grundverg.+ FLA (25%)”, therefore it counts as a TP in the Non-Strict Evaluation.

  2. The other partially matching Annotation “(25%)” would count as an additional TP for this Label, but it is not counted because the Multiple option is set to False and there is already a TP.

  3. The extraction “St” would count as a FP, but it is not counted because the Multiple option is set to False and there is already a TP, which means that the AI was able to locate the information in the Document.

  4. The missing predicted Annotation “2,70” for the Label Menge counts as a FN, because there is no corresponding TP for this Label.

Figure: Ground Truth (gt_multiple.png) and Prediction (predicted_multiple.png).
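One way to read the counting rule above for a Multiple = False Label is as a cascade per AnnotationSet: keep the first TP if there is one, otherwise the first FP, otherwise the first FN. A minimal sketch of that reading (the helper and its input format are hypothetical, not Konfuzio's actual implementation):

```python
from collections import Counter

def count_non_multiple(outcomes: list[str]) -> Counter:
    """Count TP/FP/FN for a Multiple=False Label within one AnnotationSet.

    `outcomes` lists the raw match results in reading order,
    e.g. ["TP", "TP", "FP"]. At most one outcome is kept.
    """
    counts = Counter()
    for kind in ("TP", "FP", "FN"):  # cascade: TP beats FP beats FN
        if kind in outcomes:
            counts[kind] = 1         # only the first occurrence counts
            break
    return counts

# Bezeichnung example above: one partial TP, one extra TP, one FP -> one TP.
print(count_non_multiple(["TP", "TP", "FP"]))  # Counter({'TP': 1})
# Menge example above: only a missed Annotation -> one FN.
print(count_non_multiple(["FN"]))              # Counter({'FN': 1})
```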

Strict Evaluation#

To enable Strict Evaluation, refer to the documentation at Extraction AI Parameters - Strict Evaluation.

When Strict Evaluation is activated, we count a TP only if the predicted Annotation exactly matches the ground truth Annotation. Partial matches are not considered TPs.

Example: The predicted Annotation “12.2027” for the (6) Datum zu K Label does not exactly match the ground truth Annotation “08.12.2027”. This counts as both a FP and a FN in the Strict Evaluation. It counts as a FP because it predicted an Annotation that didn’t match anything in the ground truth. And it counts as a FN because one of the Annotations in the ground truth has no match in the predicted Document.

Figure: Ground Truth (gt_weak.png) and Prediction (predicted_weak.png).

Example: The predicted Annotation “3.462,82” for the Auszahlungsbetrag Label exactly matches the ground truth Annotation, therefore it counts as a TP in the Strict Evaluation.

Figure: Ground Truth (gt_strict.png) and Prediction (predicted_strict.png).
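In the same offset-based picture as above, Strict Evaluation simply replaces the overlap check with an exact span comparison; a minimal sketch (offsets again made up for illustration):

```python
def strict_match(pred: tuple[int, int], gt: tuple[int, int]) -> bool:
    """Strict Evaluation: the predicted span must equal the ground truth span."""
    return pred == gt

# "12.2027" vs. "08.12.2027": the partial overlap is not enough, so the pair
# yields one FP (unmatched prediction) and one FN (unmatched ground truth).
print(strict_match((103, 110), (100, 110)))  # False
# "3.462,82" predicted at exactly the ground truth position -> one TP.
print(strict_match((200, 208), (200, 208)))  # True
```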

Train Extraction AI#

The training process is 100 % automated, so the only setup users need to do is to select the Category for which an Extraction AI should be trained and to add a short description. The short description helps to relate the intention behind any change in the Project to the quality of the Extraction AI.

Visit the tutorial Improve Extraction AI to improve the quality of an Extraction AI.

Retrain Extraction AI#

If you have uploaded new Documents to your Project, you can train a new version of your Extraction AI.

  1. Add those Documents to the Status: Training documents.

  2. Train the Extraction AI as described above.

  3. As you use the same Documents with Status: Test documents but have increased the number of Documents with Status: Training documents, the AI quality should improve.

  4. See Improve Extraction AI to learn how to improve your Extraction AI even further.

Extraction AI actions#

Figure: extraction_ai_actions.png

Evaluate Extraction AIs#

If you change the Documents assigned to the Test dataset status, you can re-evaluate older Extraction AI models. This is helpful for comparing different Extraction AIs on the current Test dataset.

Get evaluation as CSV file#

Download the most granular evaluation file. Have a look at Improve Extraction AI to see how to use this CSV.
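For a first look at the downloaded file, you can load it with pandas; a minimal sketch (the file name is a placeholder, and the column layout should be checked against the actual CSV header):

```python
import pandas as pd

# Placeholder file name; use the path of your downloaded evaluation CSV.
df = pd.read_csv("extraction_ai_evaluation.csv")

# Inspect the available columns and the first rows before aggregating.
print(df.columns.tolist())
print(df.head())
```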

Activate Extraction AI for available Categories#

A handy option to update the Extraction AI for all related Categories, as multiple Categories can use one Extraction AI, even across Projects.

Quality Assurance#

Quality Assurance is an Extraction AI setting which can be enabled while training an Extraction AI. By default this setting is disabled; when enabled, the Documents contained within the Training and Test dataset are evaluated for possible issues which may negatively impact the Extraction AI’s trained model. Quality Assurance imposes strict restrictions on the handled data and attempts to fix any issues automatically. This setting is in place to ensure data consistency.

However, not all datasets can be repaired automatically. If training with Quality Assurance fails, please open a support ticket and our team will come back to you with an estimate for improving the data quality manually.

Figure: quality_assurance_checkbox.png