Milestone 3: Midterm Presentation: Optimization, Monitoring, and Deployment of Scalable Data Solutions

Note: This milestone serves as the midterm presentation for AC215.

This milestone focuses on optimizing, monitoring, and deploying a complex data-driven project. It emphasizes efficiency and scalability, using tools such as TensorFlow Lite for model optimization and Kubeflow for machine learning workflows. A performance monitoring system tracks reliability after release, and the “Mega Pipeline” application is the culminating hands-on exercise that brings all of these elements together in one cohesive, manageable structure.

This milestone builds on the foundational work of the earlier milestones, driving the project toward completion and readiness for real-world use. It gives students exposure to industry-standard practices and hands-on experience with current tools and methodologies.

Finally, the milestone serves as an opportunity to practice presenting information to a technical audience in an engaging and concise manner.

Key dates:

  • Due date: Oct 31st

Objectives:

  • Distillation/Quantization/Compression, TF Lite: Implement model optimization methods such as distillation, quantization, and compression, using TensorFlow Lite. This will enable deployment in resource-constrained environments.

  • Model Performance Monitoring, Data Drift Awareness: Develop a comprehensive monitoring solution to continuously track model performance, detect data drift, and ensure awareness of other post-release factors that may impact the system.

  • Kubeflow and Cloud Functions Integration: Utilize Kubeflow for machine learning workflows and integrate cloud functions to automate and scale various processes within the project, aligning with cloud-native practices.

  • Build a Hands-on Mega Pipeline App: Design and create a scalable and comprehensive pipeline application that encapsulates all the elements of the project. This “Mega Pipeline” will serve as the unified interface for managing and executing various project components.

  • Presenting a Technical Project: Create a presentation that concisely covers what has been accomplished up to this point, and what the plan is for next steps. Here are some useful questions to think about when creating the presentation: Who is the audience (technical or non-technical), and what information can you expect them to know (and not know) going into the presentation? What’s the story that you are trying to tell? How are you planning to tell that story? What do you want the audience to take away from the presentation?
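
To make the quantization objective above more concrete, here is a minimal sketch of the affine (scale and zero-point) arithmetic that underlies 8-bit post-training quantization. It is an illustration in plain Python only: it does not call the TensorFlow Lite API, and the function names `quantize_int8` and `dequantize` are hypothetical helpers chosen for this example.

```python
from typing import List, Tuple

QMIN, QMAX = -128, 127  # representable range of a signed 8-bit integer


def quantize_int8(weights: List[float]) -> Tuple[List[int], float, int]:
    """Map floats to int8 using q = round(w / scale) + zero_point."""
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(weights), 0.0)
    hi = max(max(weights), 0.0)
    if hi == lo:  # every weight is zero; any scale works
        return [0] * len(weights), 1.0, 0
    scale = (hi - lo) / (QMAX - QMIN)
    zero_point = int(round(QMIN - lo / scale))
    quantized = [
        # Clamp to guard against floating-point edge cases at the range ends.
        max(QMIN, min(QMAX, int(round(w / scale)) + zero_point))
        for w in weights
    ]
    return quantized, scale, zero_point


def dequantize(quantized: List[int], scale: float, zero_point: int) -> List[float]:
    """Recover approximate floats using w ~= (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in quantized]
```

Calling `quantize_int8` on a list of weights and passing the result to `dequantize` shows the trade-off directly: each value is stored in one byte instead of four, at the cost of a rounding error of at most about half of `scale` per weight. Benchmarks for this milestone would measure how that error affects the accuracy, size, and latency of the actual model.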

Deliverables:

  • Optimized Models: Models that have been distilled, quantized, or compressed using TensorFlow Lite, complete with performance benchmarks and analysis.

  • Performance Monitoring Solution: A well-documented monitoring system that provides insights into model performance, data drift, and other relevant factors. Includes alerting and reporting mechanisms.

  • Kubeflow & Cloud Functions Implementation: Documentation and code showcasing the successful integration of Kubeflow for machine learning orchestration and cloud functions for process automation.

  • Mega Pipeline Application: A fully functional, CLI-based mega pipeline application that serves as the control center for the project, enabling the seamless execution of various components from data ingestion to model deployment. Should include detailed documentation and user guides.

  • Presentation: A 4-5 minute presentation that includes a brief overview of the project to help orient the audience, a walk-through of the work that has been completed, and a brief outline of next steps. The walk-through does not need to be a live demo (it can be a set of well-made, visually pleasing slides), but it should showcase the various components that have been built. Be prepared for 3 additional minutes of questions at the end.
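
As one possible shape for the CLI-based Mega Pipeline application listed above, the sketch below uses Python's standard `argparse` module to expose pipeline steps as subcommands. Everything project-specific here is an assumption made for illustration: the program name `mega-pipeline`, the step names `ingest`, `train`, and `deploy`, and their placeholder bodies are hypothetical and would be replaced by the project's own components.

```python
import argparse
from typing import Callable, Dict, List, Optional


# Hypothetical placeholder steps; a real project would call its own
# ingestion, training, and deployment code here.
def ingest() -> str:
    return "ingest: done"


def train() -> str:
    return "train: done"


def deploy() -> str:
    return "deploy: done"


# Dict insertion order doubles as the default pipeline order.
STEPS: Dict[str, Callable[[], str]] = {
    "ingest": ingest,
    "train": train,
    "deploy": deploy,
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="mega-pipeline", description="Run project pipeline steps."
    )
    sub = parser.add_subparsers(dest="command", required=True)
    run = sub.add_parser("run", help="run the named steps in the given order")
    run.add_argument("steps", nargs="+", choices=list(STEPS))
    sub.add_parser("all", help="run every step in pipeline order")
    return parser


def main(argv: Optional[List[str]] = None) -> List[str]:
    """Parse arguments (sys.argv when argv is None) and run the chosen steps."""
    args = build_parser().parse_args(argv)
    names = list(STEPS) if args.command == "all" else args.steps
    results = []
    for name in names:
        result = STEPS[name]()
        print(result)
        results.append(result)
    return results
```

For example, `main(["run", "ingest", "train"])` runs two steps in order, and `main(["all"])` runs the full pipeline. Keeping the steps in a single registry means that adding a component, such as a monitoring or optimization step, only requires one new function and one new dictionary entry.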