Milestone 2 (the wheel): MLOps Infrastructure & Advanced Training Workflows - Building Atomic Containers, Versioned Data Pipelines, and Scalable Computing Solutions

In this milestone, the focus is on developing a robust and scalable MLOps infrastructure. Teams will build atomic containers for the various project components, design version-controlled data pipelines using tools such as DVC, Label Studio, and Dask, and integrate distributed computing solutions alongside cloud storage.

Key dates:

  • Due date: Sep 26th

Template Repository

Submission Instructions:

  • Please see Ed

Objectives:

  • Build Atomic Containers for Components: Package each project component as a standalone container that can run independently. This includes containers for the individual applications and services involved in the project.
  • Construct Data Pipelines with Versioning: Design and implement a robust data pipeline that extracts, transforms, and versions data using tools such as DVC. This will enable efficient data handling and version control within the project.
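
To make the versioning objective concrete, here is a minimal sketch of content-addressed data versioning, the general idea behind tools such as DVC. It is not DVC's implementation or API: the function names (`snapshot`, `restore`), the `.version.json` pointer format, and the cache layout are all assumptions made for illustration, using only the Python standard library. In a real project the small pointer file would be committed to Git while the data itself lives in the cache or in cloud storage.

```python
import hashlib
import json
import shutil
import tempfile
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(data_file: Path, cache_dir: Path) -> Path:
    """Store data_file in a content-addressed cache and write a pointer file.

    The pointer file (hypothetical format) records the hash, so a specific
    version of the data can be restored later from the cache.
    """
    digest = file_digest(data_file)
    cached = cache_dir / digest[:2] / digest[2:]
    cached.parent.mkdir(parents=True, exist_ok=True)
    if not cached.exists():  # identical content is stored only once
        shutil.copy2(data_file, cached)
    pointer = data_file.with_name(data_file.name + ".version.json")
    meta = {
        "path": data_file.name,
        "sha256": digest,
        "size": data_file.stat().st_size,
    }
    pointer.write_text(json.dumps(meta, indent=2))
    return pointer


def restore(pointer: Path, cache_dir: Path) -> Path:
    """Recreate the data file that a pointer file refers to."""
    meta = json.loads(pointer.read_text())
    digest = meta["sha256"]
    target = pointer.with_name(meta["path"])
    shutil.copy2(cache_dir / digest[:2] / digest[2:], target)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        cache = root / "cache"
        data = root / "labels.csv"  # hypothetical dataset file

        data.write_text("id,label\n1,cat\n")
        pointer = snapshot(data, cache)
        v1_pointer_text = pointer.read_text()  # stands in for a Git commit

        data.write_text("id,label\n1,cat\n2,dog\n")
        snapshot(data, cache)

        # "Check out" the first version by restoring its pointer file.
        pointer.write_text(v1_pointer_text)
        restore(pointer, cache)
        print(data.read_text())
```

Because the cache key is derived from the file contents, unchanged data is never stored twice, and switching dataset versions only requires swapping the small pointer file.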

Deliverables:

  • Containerized Components: Fully functional atomic containers for all individual components, aligned and ready for integration within the project architecture.
  • Virtual Environment Setup: Documented and implemented virtual machines and environments tailored to support the containerized components.