A comprehensive resource to classify, extract and validate documents without coding.
Welcome to Konfuzio!#
Welcome to the comprehensive tutorial guide for getting started with Konfuzio. Learn how to capture data from any document, in any format and in any language in the world. Automate repetitive tasks, even reading data from the most complex documents, and make your team more productive!
Whether you are opening a car rental service, sharing benefits with your loyal customers, opening more restaurants, checking data for the employees you hired, or managing your accounting, Konfuzio will help you easily capture data from any document you can think of. You can focus on what is important to you. You decide how you work and build the perfect setup for your needs.
Unlike a traditional data extraction system, Konfuzio gives you the power of an AI-enabled, template-agnostic solution. It is highly flexible: it connects with your information, captures only what you need, and structures it as you need it, all in one place.
Who is this documentation for?#
Technical and non-technical people, brand new Konfuzio users and anyone who just wants to refresh their Konfuzio knowledge.
First sign up on app.konfuzio.com to access the Konfuzio platform.
All you need to do is:
Enter your email address
Choose a password
Verify the link you receive in your email
When you first open Konfuzio, you’ll see a navigation panel on the left-hand side. It contains all your key account information, and its entries are described below. If you don’t see the panel, expand the left navigation.
Annotations: Entity Annotation teaches AI models how to identify named entities and key phrases within a text. Annotators read the Document thoroughly, locate the target entities, highlight them in the SmartView and assign them from a predetermined list of Labels and Label Sets.
Categories: One type of Document. One Category can have one specialized Extraction AI.
Categorization AIs: AI to categorize different types of Documents.
Documents: All Documents processed in the selected Project.
Extraction AIs: Models to extract information from a Document.
Labels: A field you want to extract from a Document.
Label Sets: A group of fields.
Members: Team members that have access to your models in the Project.
Projects: Review the Projects you have access to or create new Projects. One Project separates users and data.
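If it helps to picture how these terms relate, the following Python sketch models them as plain data classes. It is only an illustration of the descriptions above: the class names, field names and example values are hypothetical and do not mirror the Konfuzio API or SDK.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Label:
    """A field you want to extract, e.g. an invoice number."""
    name: str


@dataclass
class LabelSet:
    """A group of fields (Labels)."""
    name: str
    labels: List[Label] = field(default_factory=list)


@dataclass
class Category:
    """One type of Document, with at most one specialized Extraction AI."""
    name: str
    extraction_ai: Optional[str] = None  # hypothetical: name of the assigned model


@dataclass
class Document:
    """A file processed in a Project; its Category may be unknown at first."""
    filename: str
    category: Optional[Category] = None


@dataclass
class Project:
    """Separates users and data: everything listed here belongs to one Project."""
    name: str
    members: List[str] = field(default_factory=list)
    categories: List[Category] = field(default_factory=list)
    label_sets: List[LabelSet] = field(default_factory=list)
    documents: List[Document] = field(default_factory=list)


# Example values are made up for illustration.
invoice = Category(name="Invoice", extraction_ai="invoice-extractor-v1")
totals = LabelSet(name="Totals", labels=[Label("Net Amount"), Label("Gross Amount")])
project = Project(
    name="Accounting",
    members=["alice@example.com"],
    categories=[invoice],
    label_sets=[totals],
    documents=[Document("invoice_001.pdf", category=invoice)],
)
```

Where Label Sets are attached (here, directly to the Project) is an assumption made to keep the sketch small.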
Try your first Extraction model#
When you first open up Konfuzio, you’ll be asked to create your first Project. You can get access to different AI models that you can use right away. Otherwise, you can train your own models.
Use a pre-made Konfuzio Model: You can quickly use a pre-made Konfuzio model for these document types. Simply contact our support to get access. Each model is already trained on hundreds of pages and works well out of the box! What’s more, we can quickly enable these models for any additional language.
Build your own Extraction AI model: Please watch our free tutorial on how to do it.
We require a minimum of 10 images or Documents to train a custom model.
If you want to implement a new use case, you might want to run our feasibility check before you start working on it; see our article on the AI feasibility check.
Refer to the in-depth article here to train the best-performing model for your use case.
How much training data is enough?#
We recommend starting with 50 files and adding more depending on the accuracy you see. The AI model gets significantly better at reading Documents as you show it more data. For a complicated Document type, you might need 500 or more files.
Improving AI model accuracy#
Not satisfied with the model results or accuracy? You can improve the model accuracy quickly and simply by showing the model more diverse data.
The model performance improves significantly as you add more files, for example going from 50 to 100 to 500. To add them, follow Steps 1-3 under Build your own OCR Model. This process is called retraining.
If you are still unsatisfied with the model’s performance, reach out to our support so that our product specialist can help you.
Testing your first model using the UI#
Your model is trained, and you can see its accuracy in the “Model Metrics” section. You can also see the AI machinery behind your model: the different Experiments, also known as AI Architectures, that contribute to the best model accuracy for your Document.
Quick steps to see some AI magic on some unseen documents:
Make sure your Extraction AI is active.
Upload a new Document via the web interface or API (more on this later).
Open the SmartView to review the results.
We summarize the procedure in the following video:
You can see the extracted information as Bounding Boxes on the original image on the left-hand side. On the right-hand side, you can see the extracted information as text under List View, and the response as returned by the API in the JSON format.
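As a rough idea of what working with the JSON view can look like, the Python sketch below reads a response and keeps only the fields the model is reasonably sure about. The response shape, the field names (`annotations`, `label`, `text`, `confidence`, `bbox`) and the sample values are assumptions made up for this example; check the JSON shown in your own SmartView for the actual structure.

```python
import json

# Hypothetical response body, for illustration only; real field names may differ.
SAMPLE_RESPONSE = """
{
  "annotations": [
    {"label": "Invoice Number", "text": "INV-001", "confidence": 0.97,
     "bbox": {"page": 1, "x0": 72.0, "y0": 90.5, "x1": 140.0, "y1": 102.0}},
    {"label": "Gross Amount", "text": "119.00", "confidence": 0.42,
     "bbox": {"page": 1, "x0": 400.0, "y0": 610.0, "x1": 452.0, "y1": 622.0}}
  ]
}
"""


def confident_fields(response_text, min_confidence=0.5):
    """Return {label: text} for annotations at or above min_confidence."""
    data = json.loads(response_text)
    return {
        item["label"]: item["text"]
        for item in data["annotations"]
        if item["confidence"] >= min_confidence
    }


print(confident_fields(SAMPLE_RESPONSE))  # {'Invoice Number': 'INV-001'}
```

Filtering by a confidence threshold is one simple way to decide which values to accept automatically and which to send to a human for review.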
Integrate Konfuzio APIs#
You can integrate the API into your codebase in no more than 15 minutes. We will keep this documentation short, as this guide is written for users without an IT background. You will find further information on dev.konfuzio.com.
Review the API Guide.
Choose your preferred Authentication method.
Post new Documents to the Project endpoint.
Either process the results synchronously, pull them, or wait for a webhook.
Find out more about our API: https://app.konfuzio.com/v3/swagger/
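For readers who do write code, the upload step might look roughly like the Python sketch below, which uses only the standard library. It builds the request without sending it. The base URL path, the `/documents/` endpoint, the form field names (`project`, `sync`, `data_file`) and the `Token` authorization scheme are all assumptions for illustration; confirm them against the Swagger page and your chosen Authentication method before use.

```python
import uuid
from urllib.request import Request

# Assumption: base path of the API; verify it in the API Guide.
API_BASE = "https://app.konfuzio.com/api/v3"


def build_upload_request(token, project_id, filename, file_bytes, sync=False):
    """Build (but do not send) a multipart POST that uploads one Document."""
    boundary = uuid.uuid4().hex
    # Assumed form fields: target Project and whether to wait for the result.
    fields = {"project": str(project_id), "sync": "true" if sync else "false"}

    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    # Assumed name of the file field: "data_file".
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="data_file"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    return Request(
        f"{API_BASE}/documents/",
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Token {token}",  # assumed token scheme
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Usage: pass the returned request to urllib.request.urlopen() to send it.
request = build_upload_request("YOUR_TOKEN", 123, "invoice.pdf", b"%PDF-1.4 example")
```

With `sync=False` the results would be fetched later or delivered by webhook; with `sync=True` the call would wait for processing to finish, assuming the endpoint supports such a flag.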
Exporting data & Supported integrations#
We support several pre-built integrations ready right out of the box.
Export as CSV
SAP: Besides custom implementation, we advise clients to have a look at the solution provided by Ersasoft.
Incoming invoice processing in Datev and Microsoft Dynamics: We cooperate with the workflow tool named iflow, where Konfuzio is integrated out of the box.
E-Mail: After registration, you have the option to use secure email forwarding as documented here.
RPA Systems such as UiPath or Power Automate
Didn’t find what you were looking for? We also provide custom integrations. Please create a support ticket, and we’ll get back to you.
Our AI models support 70+ languages worldwide, including English, Spanish, German, French, and Icelandic, Eastern European languages such as Lithuanian, Hungarian, and Romanian, and Asian languages such as Korean, Japanese, Malay, and Indonesian. For more detailed information, refer to the Supported OCR languages list.
We can give you access to the customer portal, where you can change or cancel your contract at any time. In addition, you can download invoices, update your billing information, and view the number of Pages processed and the costs incurred.
Visit our Developer Resources if you want to expand Konfuzio using code.