AI-Powered OCR Data Quality Validation System

The Data Validation Tool is an AI system currently in development by Fisheries and Oceans Canada that improves the accuracy of digitized documents. When handwritten or other non-machine-readable documents are converted to digital text using optical character recognition (OCR), errors often occur. This system uses language models to detect these errors and propose corrections, so that the extracted data is more accurate and consistent.

The tool is designed to help Government of Canada (GC) employees who work with Area Stream Inspection Logs from Pacific Regions. By automatically validating and standardizing data, it reduces the time staff spend manually checking for transcription errors and helps ensure data quality before information enters databases. The result is more reliable information for analysis and better integration across government systems.

This system does not involve personal information and is being developed using open-source technologies. Validation runs automatically after OCR processing, making data review more efficient and consistent across different documents and data entry operations.
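
To illustrate what a validation step after OCR might look like, here is a minimal sketch in Python. It is not the tool's implementation, which this entry does not describe and which relies on language models rather than fixed rules. The field names (`fish_count`, `redd_count`, `observer_count`), the `Finding` structure, and the character substitution table are all hypothetical, chosen only to show the detect-and-propose pattern on a field expected to hold a whole number.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative (not exhaustive) letter-for-digit confusions seen in OCR output.
OCR_DIGIT_FIXES = str.maketrans(
    {"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"}
)


@dataclass
class Finding:
    field: str               # hypothetical field name in the extracted record
    original: str            # value exactly as produced by OCR
    proposed: Optional[str]  # suggested correction, or None for manual review
    reason: str


def is_whole_number(text: str) -> bool:
    """True only for non-empty strings made of ASCII digits."""
    return text.isascii() and text.isdigit()


def validate_count(field: str, value: str) -> Optional[Finding]:
    """Check one field that is expected to hold a non-negative integer."""
    text = value.strip()
    if is_whole_number(text):
        return None  # nothing to report
    candidate = text.translate(OCR_DIGIT_FIXES)
    if is_whole_number(candidate):
        return Finding(field, value, candidate, "letter read in place of a digit")
    return Finding(field, value, None, "not a number; needs manual review")


def validate_record(record: dict[str, str], count_fields: list[str]) -> list[Finding]:
    """Collect findings for every count field present in the record."""
    findings = []
    for name in count_fields:
        if name in record:
            finding = validate_count(name, record[name])
            if finding is not None:
                findings.append(finding)
    return findings


# Hypothetical record as it might come out of OCR.
record = {
    "stream_name": "Bear Creek",
    "fish_count": "1O5",      # capital letter O instead of zero
    "redd_count": "12",
    "observer_count": "n/a",
}
findings = validate_record(record, ["fish_count", "redd_count", "observer_count"])
for f in findings:
    print(f.field, repr(f.original), "->", f.proposed, f"({f.reason})")
```

In this sketch the validator only proposes corrections and never overwrites the OCR output, so a reviewer can accept or reject each suggestion before the record is stored.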

Government of Canada – AI Register