Making StatCan Data More Discoverable in AI
Statistics Canada is developing an experimental system that makes its statistical data and publications easier to discover through AI tools built on Large Language Models (LLMs), using a process called vectorization. Vectorization converts StatCan's published datasets and web content into numerical representations that AI systems can compare and retrieve, improving how citizens and researchers find relevant government statistics through AI-powered search and analysis tools.
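To make the idea of vectorization concrete, the following is a minimal sketch in Python, not a description of the system StatCan is building. It assumes a simple word-count vector and cosine similarity as the retrieval method; real systems typically use learned embedding models instead, and which model StatCan uses is not stated here. The document identifiers, sample texts, and function names (vectorize, cosine_similarity, search) are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stand-ins for published content; these are not real StatCan records.
DOCUMENTS = {
    "doc-labour": "Monthly labour force survey estimates of employment and unemployment",
    "doc-prices": "Consumer price index measuring inflation in goods and services",
    "doc-population": "Quarterly population estimates by province and territory",
}


def vectorize(text):
    """Turn text into a sparse word-count vector (a toy stand-in for a learned embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Return the cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Vectorize every document once, ahead of time, so queries only need a comparison step.
INDEX = {doc_id: vectorize(text) for doc_id, text in DOCUMENTS.items()}


def search(query, top_k=1):
    """Return the ids of the top_k documents whose vectors are closest to the query vector."""
    query_vector = vectorize(query)
    ranked = sorted(
        INDEX,
        key=lambda doc_id: cosine_similarity(query_vector, INDEX[doc_id]),
        reverse=True,
    )
    return ranked[:top_k]


if __name__ == "__main__":
    print(search("unemployment and employment estimates"))
    print(search("inflation in consumer prices"))
```

The sketch separates the two steps the paragraph above describes: content is converted to vectors once, and each question is then converted the same way and matched against the stored vectors. Word counts only match exact words, which is why production systems generally prefer embeddings that also capture related wording.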
This system is designed to serve both Statistics Canada employees and the general public by enhancing access to official statistical information through modern AI interfaces. The system does not process personal information about individuals. It focuses entirely on making published government statistics more accessible through emerging AI technologies.
The project is currently in development and includes experiments with the Model Context Protocol (MCP), an open protocol for connecting AI applications to external data sources and tools, to further improve how statistical data can be integrated with AI systems. Statistics Canada is being transparent about this development work and about how it may change the way citizens discover and use official statistics.