
Cropilot is a system for fully automated scan cropping powered by advanced AI vision models. It removes one of the biggest bottlenecks in digitization workflows: repetitive manual page cropping.
When digitizing books and other documents, scan cropping is one of the most repetitive and time-consuming steps. In many cultural heritage institutions, cropping is still performed manually or with outdated tools.

At the same time, the cropping process itself usually does not require expert decision-making — yet it takes up a significant part of a librarian’s working day. The result is limited digitization capacity, inconsistent output quality, and an unnecessary burden on qualified staff who could be focusing on more specialized work.
Automated cropping saves time and reduces frustration. Manual, repetitive steps are taken over by AI — a helpful and, above all, fast assistant that never gets tired. Our solution makes it possible to:

Automatic, semi-automatic, and training mode.
Direct integration into the NDK workflow and connection to the ProArc system. Cropilot can also be used as a standalone solution.
Detection of the page crop area, automatic page rotation, and recognition of left and right pages.
Full support for both inner and outer document cropping.
The system uses fine-tuned vision models to fully automate scan cropping.
Easy management of access rights and documents.
Elimination of repetitive tasks by letting artificial intelligence take over the technical routine.
Each institution works with different types of documents. For standard materials such as books and newspapers, our model can handle cropping reliably in automatic mode, with no additional training required. However, not every digitized item is that straightforward. The process can be complicated by many factors, from the poor physical condition of the source documents to atypical formats. In such cases, we can fine-tune the AI model so that it is tailored to the specific digitization workflow and needs of a particular institution.

The learning process can be compared to training a new professional.
A newcomer first processes an initial set of pages independently. An experienced colleague then checks their work, corrects inaccuracies, and explains where mistakes occurred. Thanks to this feedback, the newcomer improves quickly and is able to process the next batches independently, without repeating the same mistakes. Our model works on the same principle.
.png)
This cycle can be repeated until the required cropping quality and parameters are achieved. The model continuously improves based on feedback and, once fine-tuned, can process even the most complex document types in fully automatic mode.
In simple terms: the first hundreds of pages are reviewed by an expert; the following batches can then be processed by the model independently, with minimal intervention and a low error rate. This allows experienced staff to focus on more specialized work.
A librarian’s work is never finished. To preserve as many documents as possible for future generations, the digitization process must not be held back by technical routine. Leave cropping to the machines and use your professional expertise where it is truly irreplaceable: on the content itself and its preservation.



