Document Smart Search
The Department of Fisheries and Oceans (DFO) collaborated with the UBC Cloud Innovation Centre (CIC) to develop an AI-powered prototype to make the process of accessing and exploring official documents more efficient. DFO provides a wide range of scientific and policy resources, but navigating this content to find relevant insights can be a challenge. Users can now search efficiently through conversations with an AI-assistant and filter options; supporting them in finding and utilizing the relevant information more effectively.
Government agencies manage vast document libraries containing critical research and policy information. Finding the right document among thousands can take hours of manual searching. The Department of Fisheries and Oceans (DFO) faced exactly this challenge with their extensive collection of scientific publications and policy documents. DFO identified an opportunity to improve this experience, and approached the UBC Cloud Innovation Centre to co-create a prototype that can help users discover and engage with, making it easier to search, filter, and utilize information.
Approach
The UBC CIC built a web application prototype to help users explore and navigate large collections of official documents, such as DFO’s extensive library of policy and research publications. The prototype organizes content by topic and mandate, generates summaries, and assigns relevance scores in order to help users find relevant materials.
Powered by AWS, the prototype can analyze and rank documents by how closely they align with a users’ search or query. The system assigns relevance scores that reflect the semantic similarity to the user’s search.
Users can interact with these materials in two ways:
- By browsing and filtering documents using topics, mandates, and publication years.
- Through a conversational AI assistant that highlights key documents related to the user’s search, helping them identify the most relevant DFO materials.
For example, if used with DFO materials, a user interested in the impact of climate change on Pacific salmon can either search directly through the Document Search feature or ask a question using the AI-Assistant. In both cases, the prototype provides relevance scores and AI-generated summaries to help the user identify which documents most closely align with their query. Whether browsing through filtered search results or receiving recommended sources from the assistant, these features help users focus on the most useful materials.
With this solution, users can locate the most relevant documents on a topic in just one search; reducing what once took hours or days of manual searching to only seconds. This solution can scale to any organization looking to make extensive document libraries easier to discover and explore.
Click here to go directly to the project GitHub repository.
Screenshots of UI
The following screenshots highlight key parts of the user interface, showcasing both the public and administrator views.
Public View
The public-facing interface is designed to help users search for documents, explore topics, and engage with the AI-assistant.
AI-Assistant
Document Search
Analytics
The Analytics Function allows users to explore trends across DFO’s document library by visualizing document counts over time, which helps identify emerging topics and shifting priorities.
Administrator View
The administrator interface provides tools to monitor user engagement, review feedback, and adjust how the assistant responds to different types of users, such as providing detailed, more technical answers for researchers and easier to digest summaries for the general public.
Architecture Diagram
This diagram shows the Document Smart Search architecture. The solution ingests documents through Amazon S3 and AWS Glue, combines structured data Amazon RDS with vector embeddings Amazon Bedrock and Amazon OpenSearch for hybrid search, and delivers AI responses through AWS Lambda-orchestrated RAG workflows with built-in AWS security.

https://github.com/UBC-CIC/DFO-Smart-Search/blob/main/docs/architectureDeepDive.md
Technical Details
The UBC CIC implemented a serverless architecture on AWS to handle unpredictable query volumes while keeping operational overhead minimal. Amazon Bedrock provides the foundation model capabilities, while Amazon OpenSearch enables hybrid search combining keyword matching with semantic similarity.
User authentication is managed by AWS Cognito, which controls sign-up, sign-in, and role-based access. To kick-start the data ingestion process administrators can upload documents to Amazon S3 and trigger an AWS Glue pipeline that scrapes, cleans, and organizes HTML content by topic and mandate.
Structured metadata is stored in Amazon RDS, while text embeddings are generated using Amazon Bedrock and indexed in Amazon OpenSearch. This enables powerful hybrid search, combining traditional keyword and semantic similarity.
The system is secured with AWS Shield for DDoS protection and AWS WAF for application-layer threats. All interactions flow through Amazon API Gateway, which routes requests to AWS Lambda functions. For admin tasks (like managing users or prompts), Lambda accesses DynamoDB for chat memory and RDS for structured data.
When a user asks a question, a Lambda function launches a Retrieval-Augmented Generation (RAG) process: pulling conversation history from DynamoDB, facts from RDS, and relevant documents from OpenSearch. It then sends this context to a Bedrock-hosted LLM, which generates a tailored, source-grounded response.
Link to solution on GitHub: https://github.com/UBC-CIC/DFO-Smart-Search
Video
Acknowledgements
Student team: Developers: Daniel Long, Tien Nguyen, Nikhil Sinclair, and Zayan Sheikh. Project Assistance by Amy Cao and Harleen Chahal.
About the University of British Columbia Cloud Innovation Centre (UBC CIC)
The UBC CIC is a public-private collaboration between UBC and Amazon Web Services (AWS). A CIC identifies digital transformation challenges, the problems or opportunities that matter to the community, and provides subject matter expertise and CIC leadership.
Using Amazon’s innovation methodology, dedicated UBC and AWS CIC staff work with students, staff and faculty, as well as community, government or not-for-profit organizations to define challenges, to engage with subject matter experts, to identify a solution, and to build a Proof of Concept (PoC). Through co-op and work-integrated learning, students also have an opportunity to learn new skills which they will later be able to apply in the workforce.
