Cropilot: AI-powered scan processing

Limited digitization capacity slows down the preservation of cultural heritage

When digitizing books and other documents, scan cropping is one of the most repetitive and time-consuming steps. In many cultural heritage institutions, cropping is still performed manually or with outdated tools.

At the same time, cropping itself rarely requires expert judgment, yet it takes up a significant part of a librarian’s working day. The result is limited digitization capacity, inconsistent output quality, and an unnecessary burden on qualified staff who could be focusing on more specialized work.

Qualified librarians should not have to do work a machine can handle

Automated cropping saves time and reduces frustration. AI takes over the manual, repetitive steps as a helpful and, above all, fast assistant that never gets tired. Our solution makes it possible to:

Adjust cropping parameters to the needs of a specific collection or institution
Minimize the need for manual inspection through full automation, including page rotation and left/right page detection
Increase digitization capacity without increasing staff or costs
Help ensure stable and consistent output quality
Continuously improve the results based on feedback

Key features

Operating modes

Automatic, semi-automatic, and training modes.

Compatibility with digitization workflows

Direct integration into the NDK workflow and connection to the ProArc system. Cropilot can also be used as a standalone solution.

Automatic crop area detection

Detection of the page crop area, automatic page rotation, and recognition of left and right pages.

Cropping types

Full support for both inner and outer document cropping.

AI vision models

The system uses fine-tuned vision models to fully automate scan cropping.

Administration and management

Easy management of access rights and documents.

Automation

Elimination of repetitive tasks by letting artificial intelligence take over the technical routine.

Training mode: custom cropping for specific collections

Each institution works with different types of documents. For standard materials such as books and newspapers, our model can handle cropping reliably in automatic mode, with no additional training required. However, not every digitized item is that straightforward. The process can be complicated by many factors, from the poor physical condition of the source documents to atypical formats. In such cases, we can fine-tune the AI model so that it is tailored to the specific digitization workflow and needs of a particular institution.

The learning process can be compared to training a new professional.

A newcomer first processes an initial set of pages independently. An experienced colleague then checks their work, corrects inaccuracies, and explains where mistakes occurred. Thanks to this feedback, the newcomer improves quickly and is able to process the next batches independently, without repeating the same mistakes. Our model works on the same principle.

This cycle can be repeated until the required cropping quality and parameters are achieved. The model continuously improves based on feedback and, once fine-tuned, can process even the most complex document types in fully automatic mode.

In simple terms: an expert reviews the first few hundred pages; the model can then process the following batches independently, with minimal intervention and a low error rate. This allows experienced staff to focus on more specialized work.
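
The review cycle described above can be illustrated with a small sketch. It is not Cropilot's actual implementation: the `Box` type, the overlap measure (intersection over union), and both thresholds are assumptions chosen only to show how expert corrections could be turned into a per-batch error rate that decides when a collection is ready for automatic mode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical crop rectangle in pixel coordinates."""
    left: float
    top: float
    right: float
    bottom: float

    def area(self) -> float:
        # Degenerate (inverted) rectangles count as empty.
        return max(0.0, self.right - self.left) * max(0.0, self.bottom - self.top)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two crop rectangles (0.0 to 1.0)."""
    inter = Box(
        max(a.left, b.left),
        max(a.top, b.top),
        min(a.right, b.right),
        min(a.bottom, b.bottom),
    ).area()
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


def batch_error_rate(predicted, corrected, min_iou=0.95):
    """Share of pages where the expert's crop differs noticeably from the model's.

    min_iou is an assumed tolerance, not a documented Cropilot parameter.
    """
    if len(predicted) != len(corrected):
        raise ValueError("each predicted crop needs one reviewed crop")
    if not predicted:
        return 0.0
    misses = sum(1 for p, c in zip(predicted, corrected) if iou(p, c) < min_iou)
    return misses / len(predicted)


def ready_for_automatic_mode(error_rates, max_error_rate=0.02):
    """True once the most recently reviewed batch is within the assumed error budget."""
    return bool(error_rates) and error_rates[-1] <= max_error_rate
```

In this sketch, each reviewed batch adds one error rate to a history list. Fine-tuning and review would repeat until `ready_for_automatic_mode` returns true, and each institution would pick its own thresholds.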

AI as both a helper and a tool

A librarian’s work is never finished. To preserve as many documents as possible for future generations, the digitization process must not be held back by technical routine. Leave cropping to the machines and use your professional expertise where it is truly irreplaceable: on the content itself and its preservation.
