Superuser-Projects#

This describes additional Project settings which are only available to Superusers.

Project:#

Name:#

The name of the project.

Users:#

The Users of this Project.

Decimal Separator:#

See here

Document Processing:#

Processing type:#

Decide how text embeddings in PDFs are used when doing OCR.

OCR Method:#

Select the OCR method which is used for new Documents.

Auto Delete Documents after Days:#

See here.

Automatically rotate documents:#

See here.

Priority OCR and extraction:#

Decide if Documents of this Project are processed with priority.

Include email body in document:#

Decide whether the email body is included in a Document when the Document is created via the email automation.

Create labels and label sets:#

If this is set, labels and label sets that are predicted by extraction AIs are created if they don't exist yet. In addition, label set settings might be changed in order to store the extraction AI results (i.e. labels are added to label sets and “multiple ann. sets” will become “True”).

Save HOCR:#

Decide if HOCR results are saved.

Enable option to bulk edit documents:#

If this is set, the Document List offers an option to edit multiple documents at once.

Enable Labeling:#

Decide if SmartView and TextView links should be shown on the document overview.

Enable translated strings:#

Enables translated string functionality. This incurs a performance penalty when creating annotations, so it should be left disabled if possible.

Notify Assignee:#

See here.

Default Assignee:#

See here.

Page limit:#

The maximum number of pages per document.

Domain whitelist:#

See here

AI:#

Categorization AI:#

The AI used for document categorization.

Categorization AI parameters:#

See here.

Categorization AI version:#

The version of the last trained categorization AI.

Retraining webhook:#

A webhook URL to trigger the training of an AI. If this is empty, the default training pipeline is used for model training.

Airtable:#

Airtable API URL:#

The URL of a shared board view of an Airtable. This URL starts with “https://airtable.com/embed/”.

Airtable Board URL:#

The URL of a shared board view of an Airtable. This URL starts with “https://airtable.com/embed/”.

Airtable URL:#

The URL of an Airtable, as it is displayed in the web browser. This URL starts with “https://airtable.com/”.
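The URL prefixes described above can be checked before the values are saved. The following is a minimal sketch, not part of the Konfuzio Server: the dictionary keys `airtable_board_url` and `airtable_url` and the example URLs are hypothetical names chosen for illustration, and only the two prefixes are taken from the descriptions above.

```python
# Expected prefixes, taken from the field descriptions above.
# The dictionary keys are hypothetical names used only in this sketch.
EXPECTED_PREFIXES = {
    "airtable_board_url": "https://airtable.com/embed/",
    "airtable_url": "https://airtable.com/",
}


def check_airtable_urls(settings: dict) -> list:
    """Return a list of problems; an empty list means every prefix check passed."""
    problems = []
    for field, prefix in EXPECTED_PREFIXES.items():
        value = settings.get(field, "")
        if not value.startswith(prefix):
            problems.append(f"{field} should start with {prefix}")
    return problems


# Placeholder values, not real Airtable resources.
example = {
    "airtable_board_url": "https://airtable.com/embed/shrEXAMPLE",
    "airtable_url": "https://airtable.com/tblEXAMPLE",
}
print(check_airtable_urls(example))
```

A common mistake this catches is pasting the browser URL into the Board URL field, which requires the embed form.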

Airtable Token:#

The Airtable API token, which can be generated at https://airtable.com/account

Microsoft Graph:#

Planner ID:#

The Planner ID (for the Graph API integration).

Microsoft Tenant settings:#

The Microsoft tenant (for the Graph API integration).

Other:#

Additional csv export options:#

Coming soon.

Additional page split options:#

This allows you to automatically split newly uploaded documents and to define a filename template for the documents that result from the split.

Let's assume we have a Project with a Category called “ID-Cards”, which has the ID 11. In addition, the Labels Issue_Date and Country are part of the Label Set “ID-Cards” (which is also the Category). With the following config, a stack scan of 10 pages is split into 10 documents, and each resulting document is named after the extracted values for Country and Issue_Date. Setting "per_page": "None" tells the Konfuzio Server to split after every page.

{"per_page": "None", "file_names": {"11": "{{Country}}_{{Issue_Date}}.pdf"}}

Note: The documents that are created by the split do not get their own document ID. They are only accessible via API using the file_urls field.
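To make the filename template more concrete, here is a minimal sketch that builds the split configuration as valid JSON and shows how a `{{Label}}` template could expand for one page. The helper `render_file_name` and the extracted values are hypothetical and serve only as an illustration. How the Konfuzio Server actually renders the template is not specified here.

```python
import json
import re

# Split configuration from the example above, written as a Python dict.
# Category ID "11" and the labels Country / Issue_Date come from the text.
split_config = {
    "per_page": "None",
    "file_names": {"11": "{{Country}}_{{Issue_Date}}.pdf"},
}

# Serialise to a JSON string; json.dumps never emits trailing commas.
config_json = json.dumps(split_config)

# Matches placeholders such as {{Country}}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_file_name(template: str, values: dict) -> str:
    """Replace each {{Label}} placeholder with its value (illustration only).

    Labels without a value are replaced by an empty string; this is an
    assumption of the sketch, not documented server behaviour.
    """
    return PLACEHOLDER.sub(lambda m: str(values.get(m.group(1), "")), template)


# Hypothetical extraction result for one page of the stack scan.
extracted = {"Country": "Germany", "Issue_Date": "2020-01-31"}
file_name = render_file_name(split_config["file_names"]["11"], extracted)

print(config_json)
print(file_name)
```

Generating the configuration with `json.dumps` rather than typing it by hand avoids syntax slips such as a trailing comma.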

Storage Name:#

This allows you to set up a dedicated Blob Storage Account for a specific Project. Enter the name of the Storage Account here and add the credentials in the environment variable AZURE_CUSTOMER_STORAGE (for Azure Blob Storage) or CUSTOMER_STORAGE (for S3 Blob Storage).

Statistic:#

Coming soon.