Using the Cloud to Accelerate Machine Learning Models

Published: November 1, 2023

Last Updated: 6 months ago.

A researcher from the Department of Electrical and Computer Engineering contacted the UBC Cloud Innovation Centre (CIC) to support a group of students in developing a prototype solution for storing, accessing, and processing echocardiogram (Echo) videos in the AWS cloud. This approach provides security and ease of access, and the scalability of AWS allows the infrastructure to expand. The solution references the COVID application, demonstrating reusability between projects at the CIC. Through the mentorship of the CIC solution architects, the students learned about AWS resources during the implementation process.

Approach

A team of researchers at the University of British Columbia (UBC) is working with a dataset of 500,000 Echo ultrasound records spanning 30 years, collected using cart-based ultrasound devices at the Vancouver General Hospital (VGH) Echo Core Lab. An ML model is being developed to analyze the data stream for predictions and outcomes. Researchers aim to reference this system and implement a comparable prototype architecture.

The CIC collaborated with the group of Engineering students on ingesting the Echo data into the cloud to leverage scalable GPUs. The architecture allows a machine learning (ML) model to be deployed in containers on AWS. This cost-effective solution can be autoscaled to process large amounts of data.
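As a rough illustration of what autoscaling against a backlog can look like, the sketch below sizes a GPU fleet from the number of videos waiting to be processed. It is not taken from the project: the function name, the `videos_per_instance` target, and the instance limits are all assumptions chosen for the example.

```python
import math


def desired_instances(queue_depth: int, videos_per_instance: int,
                      min_instances: int = 0, max_instances: int = 10) -> int:
    """Return how many instances keep the backlog per instance at or below the target.

    All parameters are hypothetical tuning knobs, not values from the project.
    """
    if videos_per_instance <= 0:
        raise ValueError("videos_per_instance must be positive")
    # Round up so a partial share of work still gets an instance.
    needed = math.ceil(max(queue_depth, 0) / videos_per_instance)
    # Clamp to the allowed fleet size to cap cost.
    return max(min_instances, min(needed, max_instances))
```

With a minimum of zero, an empty queue scales the fleet down to nothing, which is one way a setup like this can stay cost-effective between batches.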

Supporting Artifacts

Architecture Diagram

The backend diagram of the Echo Data Flow project, with all the steps numbered.

Technical Details

Store in the Cloud

  1. Echo video data are inputted into the system.
  2. Model outputs are all stored in the S3 bucket.
  3. AWS Lambda verifies the file types of the stored items.
  4. Verified files are pushed to a queue using SQS to keep track of the new data.
  5. GPU instances are scaled automatically to provide the compute capacity required to finish the queue. The process is cost-effective because the instance image containing the ML model is already created (see Image Builder).
  6. Results are stored back into the S3 bucket: the model takes the queued data as input, and the processed output is written to the bucket.
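Steps 3 and 4 above can be sketched as a small Lambda function that checks each new object's file type and queues the ones that pass. This is a minimal sketch under stated assumptions, not the project's code: the accepted extensions, the `QUEUE_URL` environment variable, and the message layout are hypothetical, and the optional `send` argument exists only so the logic can be exercised without AWS access.

```python
import json
import os
from urllib.parse import unquote_plus

# Assumption: illustrative extensions; the project's accepted types are not stated.
ALLOWED_EXTENSIONS = {".dcm", ".avi", ".mp4"}


def is_allowed(key: str) -> bool:
    """Return True when the object key ends with an accepted extension."""
    return os.path.splitext(key)[1].lower() in ALLOWED_EXTENSIONS


def extract_objects(event: dict) -> list[tuple[str, str]]:
    """Pull (bucket, key) pairs out of an S3 event notification."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # S3 URL-encodes object keys in event notifications.
        pairs.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return pairs


def handler(event, context, send=None):
    """Lambda entry point: enqueue one SQS message per verified file."""
    if send is None:
        import boto3  # imported lazily so the helpers can be tested offline

        sqs = boto3.client("sqs")
        queue_url = os.environ["QUEUE_URL"]  # hypothetical environment variable

        def send(body: str) -> None:
            sqs.send_message(QueueUrl=queue_url, MessageBody=body)

    queued = 0
    for bucket, key in extract_objects(event):
        if is_allowed(key):
            send(json.dumps({"bucket": bucket, "key": key}))
            queued += 1
    return {"queued": queued}
```

Checking only the extension is the simplest possible form of verification; a stricter check could inspect the file contents, but the write-up does not say which approach the project takes.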

Image Builder

The classifier Docker image is stored in Amazon ECR, which manages Docker images to ensure they are secure and reliable. Models are containerized and stored in ECR private repositories. A custom AMI containing the deployed model is generated to improve efficiency.
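To show how a containerized model might consume the queue, here is a hedged sketch of a worker step that could run inside such a container. It is an assumption about the general pattern, not the project's implementation: `process_batch`, the `classify`, `store`, and `delete` callables, the message body layout, and the `results/` key prefix are all hypothetical.

```python
import json
from typing import Callable, Iterable


def process_batch(messages: Iterable[dict],
                  classify: Callable[[str, str], dict],
                  store: Callable[[str, str, dict], None],
                  delete: Callable[[str], None]) -> int:
    """Run the model on each queued file, store the result, then delete the message."""
    done = 0
    for message in messages:
        # SQS messages carry the payload in "Body" and a delete token in "ReceiptHandle".
        body = json.loads(message["Body"])
        result = classify(body["bucket"], body["key"])
        # Hypothetical naming scheme for result objects.
        store(body["bucket"], f"results/{body['key']}.json", result)
        # Delete only after the result is stored, so a failure leaves
        # the message on the queue to be retried.
        delete(message["ReceiptHandle"])
        done += 1
    return done
```

In a real deployment the messages would typically come from the boto3 SQS client's `receive_message` call, and `delete` would wrap `delete_message`; passing them in as callables keeps the model code independent of AWS and easy to test.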

Link to solution on GitHub: https://github.com/UBC-CIC/echo-data-flow

Acknowledgements

Photo by Chaikom on Shutterstock

About the University of British Columbia Cloud Innovation Centre (UBC CIC)

The UBC CIC is a public-private collaboration between UBC and Amazon. A CIC identifies digital transformation challenges, the problems or opportunities that matter to the community, and provides subject matter expertise and CIC leadership.

Using Amazon’s innovation methodology, dedicated UBC and Amazon CIC staff work with students, staff and faculty, as well as community, government or not-for-profit organizations to define challenges, to engage with subject matter experts, to identify a solution, and to build a Proof of Concept (PoC). Through co-op and work-integrated learning, students also have an opportunity to learn new skills which they will later be able to apply in the workforce.