AI Feasibility Check#
Use this page to find out whether your use case requires IT resources or can be implemented without coding.
No IT resources required#
Any use case listed here can be implemented using the web interface, without any IT knowledge.
Classification of documents is a well-known and easy-to-train use case. One category refers to one Extraction AI per language.
Text referencing the information you want to extract#
In the image, you see how to extract the fax number as an example, where the text “Fax:” acts as the referencing text. In contrast to many competitors, you don’t need to add any rules. Konfuzio will learn the pattern automatically.
Keyword-based search by example#
One simple use case is to train Konfuzio to detect keywords. Most probably you will need only one document for training. Make sure to annotate all keywords in that document.
Data defined by position as often seen in forms#
In the image, you see how to extract the prefix in an air waybill as an example.
Many elements to extract but with referencing texts#
In the image, you see how to extract all elements of a consignee in an air waybill as an example. The structure and sequence of the Labels, together with the text strings that are not annotated, provide enough context to the Extraction AI.
Table Extraction where the values per row provide the context#
In many cases, table-like structures provide the context through the values in a row. Just as humans understand the context without column headers, so does the AI. In the image, you see how to extract the line items in a receipt as an example.
To do so, Konfuzio generalizes the content of the Annotation and the context before and after it. In the following image, you see an example of the typical real estate notation for a flat in a building in Germany. The size of a flat is described proportionally using “x/y”, where x is the number of units of the flat and y the number of units of the total building. Konfuzio does not learn those numbers by heart but generalizes the representation of the value as “x/y”.
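To make the idea of generalizing a value to “x/y” more tangible, here is a minimal Python sketch. It is not Konfuzio's implementation; the function names `generalize` and `find_shares` and the `N` placeholder are assumptions made purely for illustration. The sketch replaces every run of digits with a placeholder, so that different shares collapse to the same abstract shape.

```python
import re
from typing import List, Tuple


def generalize(text: str) -> str:
    """Replace every run of digits with the placeholder 'N' (illustrative only)."""
    return re.sub(r"\d+", "N", text)


# Hypothetical pattern for a share written as "x/y" with whole numbers.
SHARE_PATTERN = re.compile(r"\b(\d+)\s*/\s*(\d+)\b")


def find_shares(text: str) -> List[Tuple[int, int]]:
    """Return every (x, y) pair found in the text."""
    return [(int(x), int(y)) for x, y in SHARE_PATTERN.findall(text)]


# Two different shares have the same generalized shape.
print(generalize("125/1000"))
print(generalize("37/10000"))
print(find_shares("Miteigentumsanteil von 125/1000 an dem Grundstück"))
```

The point of the sketch is only that the shape of the value, not the concrete numbers, carries the signal. A trained Extraction AI additionally takes the surrounding text into account.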
Table Extraction where the first cell in the row can act as a keyword#
Some tables provide the context through a phrase at the start of each row, which is enough to extract the values in the columns. In the image, you see how to extract values from a balance sheet in an annual report as an example.
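The role of the first cell as a keyword can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Konfuzio works internally: the rows in `ROWS` are made-up sample data, and `values_for` is a hypothetical helper that looks up a row by the phrase in its first cell.

```python
from typing import List, Optional

# Made-up rows, as they might appear in a balance sheet table.
ROWS = [
    ["Total assets", "1,200", "1,100"],
    ["Total liabilities", "700", "650"],
    ["Equity", "500", "450"],
]


def values_for(rows: List[List[str]], keyword: str) -> Optional[List[str]]:
    """Return the remaining cells of the first row whose first cell contains the keyword."""
    for row in rows:
        if not row:
            continue  # skip empty rows
        first_cell, *values = row
        if keyword.lower() in first_cell.lower():
            return values
    return None


print(values_for(ROWS, "equity"))
```

In the web interface you do not write such a lookup yourself. The sketch only shows why a stable phrase per row gives enough context to locate the values.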
Paragraph Detection via Keyword#
Konfuzio can help you convert paragraphs to a CSV file or even make them available via the API. This can be done without coding if the paragraphs contain some keywords.
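As a rough picture of what such a CSV export could contain, the following Python sketch writes paragraphs that have already been matched to a keyword into CSV text. The column names `keyword` and `paragraph`, the function `paragraphs_to_csv` and the sample data are assumptions for illustration, not Konfuzio's actual export format.

```python
import csv
import io
from typing import Iterable, Tuple


def paragraphs_to_csv(labelled_paragraphs: Iterable[Tuple[str, str]]) -> str:
    """Serialize (keyword, paragraph) pairs as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["keyword", "paragraph"])
    writer.writerows(labelled_paragraphs)
    return buffer.getvalue()


# Made-up sample data: each paragraph is paired with the keyword that identified it.
sample = [
    ("Termination", "Either party may terminate this agreement."),
    ("Liability", "The supplier is liable, within limits."),
]
print(paragraphs_to_csv(sample))
```

The `csv` module quotes the second paragraph automatically because it contains a comma, which keeps each paragraph in a single column.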
At the moment, the following use cases require customization. However, we are working hard to add the functionality to the web interface.
These examples refer to use cases focused on page segmentation. Page segmentation refers to the functionality to detect tables, titles, paragraphs, images and lists on a document page without any training. Many users favor this use case as it feels easy to use. However, you will need to put in additional effort to analyze the data. In contrast to cases which require no coding, these use cases require connecting the data to its business context.
Let’s have a look at the first use case: sentiment detection of paragraphs or sentences. In the image, you see annotated paragraphs which contain a positive sentiment. Depending on your company structure, it will take some agreement on the business side to define what a positive sentiment is.
Keyword and NER detection#
An easier-to-implement use case is to come up with a list of phrases or keywords. The image shows how to detect paragraphs which contain keywords about climate change. The list of keywords can be expanded by using a Named Entity Recognition model to detect synonyms or similar concepts automatically.
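A minimal Python sketch of the keyword-list approach could look like the following. The keyword list, the sample paragraphs and the function `match_keywords` are assumptions for illustration. The expansion of the list by a Named Entity Recognition model is not shown; here the list is simply maintained by hand.

```python
import re
from typing import Dict, List

# Hand-maintained keyword list (illustrative); an NER model could extend it.
CLIMATE_KEYWORDS = ["climate change", "global warming", "CO2 emissions"]


def match_keywords(paragraphs: List[str], keywords: List[str]) -> Dict[int, List[str]]:
    """Map the index of each matching paragraph to the keywords found in it."""
    patterns = {
        keyword: re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
        for keyword in keywords
    }
    hits: Dict[int, List[str]] = {}
    for index, paragraph in enumerate(paragraphs):
        found = [k for k, pattern in patterns.items() if pattern.search(paragraph)]
        if found:
            hits[index] = found
    return hits


paragraphs = [
    "We aim to reduce CO2 emissions to limit global warming.",
    "Revenue grew by 5% compared to the previous year.",
]
print(match_keywords(paragraphs, CLIMATE_KEYWORDS))
```

Matching on word boundaries and ignoring case avoids the most obvious false positives and misses, but deciding which phrases belong on the list still has to be agreed on the business side.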
Summarization or paraphrasing of documents#
Paraphrasing or summarizing paragraphs makes it much easier to share information. You can read more about our approach here or read the transcript of Infineon which we automatically summarized here.
Visually detect and extract tables#
Using a pre-trained Computer Vision Model, Konfuzio allows parsing tables which can then be processed further. Have a look at our Developer Guide.
Extract images and charts from documents#
Using a pre-trained Computer Vision Model, Konfuzio allows parsing images, figures or charts which can then be processed further. Have a look at our Developer Guide. If you want to convert charts to structured data, we highly recommend the open-source tool [WebPlotDigitizer](https://github.com/ankitrohatgi/WebPlotDigitizer), which is also available as a hosted version.