AI System Design & MLOps: From Raw Data to AWS Kubernetes

Enterprise Healthcare ML Project — SQL Analytics, XGBoost, FastAPI, MLflow, DVC, Docker, EKS & Governance
4.8/5 (53 ratings)
281 students
Created by Rahul Sahay

What you'll learn

  • Build an end-to-end AI system from raw data to cloud deployment using real-world architecture
  • Design ML pipelines with SQL, feature engineering, and leakage-safe model training
  • Use MLflow and DVC for experiment tracking, data versioning, and reproducible pipelines
  • Develop production-ready APIs using FastAPI with validation, logging, and model loading
  • Implement drift detection using PSI and trigger automated retraining pipelines
  • Containerize applications using Docker and deploy scalable services on AWS ECR and EKS
  • Connect data, ML, MLOps, APIs, monitoring, and cloud into one cohesive system
  • Think like an architect and design production-first AI systems, not just models
This course includes:
14 total hours on-demand video
0 articles
14 downloadable resources
122 lessons
Full lifetime access
Access on mobile and TV
Certificate of completion

Requirements

  • Basic understanding of Python programming
  • Familiarity with machine learning concepts (classification, features, evaluation metrics)
  • Basic knowledge of SQL is helpful but not mandatory
  • No prior MLOps or cloud experience required (covered step by step)
  • A system capable of running Python, Docker, and basic data processing workloads

Description

AI System Design & MLOps: From Raw Data to AWS Kubernetes (End-to-End Project)

Stop Learning Machine Learning in Isolation

Most machine learning courses focus on building models in isolation. You train a model, evaluate accuracy, and consider the job done.

But in real-world systems, that is only a small part of the problem.

Organizations do not need models. They need systems that can:

  • ingest and process real-world data

  • generate reliable predictions

  • serve those predictions through APIs

  • monitor performance over time

  • adapt when data changes

This course is designed to bridge that gap.

The Story Behind This Capstone

Imagine a large hospital network handling thousands of patients every day.

Patients arrive with different conditions. Some cases are routine, while others escalate into high-risk situations requiring immediate attention. At the same time, every visit generates billing records, which are later submitted to insurance providers. Some claims are approved quickly, while others are delayed or rejected, leading to revenue loss and operational inefficiencies.

Now consider the questions hospital leadership is asking:

  • Can we identify high-risk patient visits early so that resources can be allocated proactively?

  • Can we predict which claims are likely to be rejected before they are submitted?

  • Can we continuously monitor the system and adapt when patient patterns or insurance behaviors change?

These are not just modeling questions. They require a complete, well-designed system.

In this course, you will build that system from the ground up.

What You Will Build

You will design and implement a complete healthcare AI platform that includes:

1. Data Layer

You will start with raw datasets such as patients, visits, and billing records. Instead of working directly on CSV files, you will create a structured analytics layer using SQL, ensuring that data can be queried, validated, and joined properly.

You will then perform exploratory data analysis and build meaningful features such as visit frequency, average length of stay, and provider rejection rates.
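
As a rough illustration of this layer, the sketch below loads a few made-up rows into an in-memory SQLite database and derives the three features named above with plain SQL. The table names, column names, and sample values are assumptions for illustration only; the course's actual schema and database engine may differ.

```python
import sqlite3

# Hypothetical schema and sample rows; not the course's real dataset.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE visits (
    visit_id INTEGER PRIMARY KEY,
    patient_id INTEGER NOT NULL,
    length_of_stay_days REAL NOT NULL
);
CREATE TABLE billing (
    claim_id INTEGER PRIMARY KEY,
    visit_id INTEGER NOT NULL REFERENCES visits(visit_id),
    provider TEXT NOT NULL,
    status TEXT NOT NULL
);
INSERT INTO visits VALUES (1, 100, 2.0), (2, 100, 4.0), (3, 200, 1.0);
INSERT INTO billing VALUES
    (10, 1, 'AcmeHealth', 'paid'),
    (11, 2, 'AcmeHealth', 'rejected'),
    (12, 3, 'BetaCare', 'paid');
""")

# Visit frequency and average length of stay per patient.
patient_features = conn.execute("""
    SELECT patient_id,
           COUNT(*) AS visit_count,
           AVG(length_of_stay_days) AS avg_length_of_stay
    FROM visits
    GROUP BY patient_id
    ORDER BY patient_id
""").fetchall()

# Share of rejected claims per insurance provider.
provider_rejection = conn.execute("""
    SELECT provider,
           AVG(CASE WHEN status = 'rejected' THEN 1.0 ELSE 0.0 END) AS rejection_rate
    FROM billing
    GROUP BY provider
    ORDER BY provider
""").fetchall()

print(patient_features)
print(provider_rejection)
```

In a leakage-safe setup, an aggregate such as the provider rejection rate would be computed only from claims that precede the row being scored; the sketch skips that filter for brevity.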

2. Machine Learning Layer

You will build two real-world models:

  • A visit risk classifier that predicts whether a patient visit is low, medium, or high risk

  • A claim outcome predictor that determines whether a claim will be paid, pending, or rejected

You will implement multiple algorithms, including Logistic Regression, Random Forest, and XGBoost, and evaluate them using proper metrics such as precision, recall, and F1 score.
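
To make these metrics concrete, here is a minimal pure-Python sketch that computes precision, recall, and F1 per class for a three-class risk label. The function name and the sample labels are hypothetical; in practice a library routine such as scikit-learn's classification report would usually do this work.

```python
def per_class_metrics(y_true, y_pred, labels):
    """Return precision, recall, and F1 for each label (one-vs-rest)."""
    metrics = {}
    pairs = list(zip(y_true, y_pred))
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Guard against empty denominators when a class is never predicted or never present.
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Made-up labels for six visits.
y_true = ["low", "low", "medium", "high", "high", "high"]
y_pred = ["low", "medium", "medium", "high", "high", "low"]

report = per_class_metrics(y_true, y_pred, ["low", "medium", "high"])
for label, scores in report.items():
    print(label, scores)
```

Per-class scores matter here because a missed high-risk visit and a missed low-risk visit do not cost the same, and overall accuracy hides that difference.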

More importantly, you will understand how data quality impacts model performance and how fixing labels can dramatically improve outcomes.

3. MLOps Layer

This is where the system becomes production-ready.

You will integrate:

  • MLflow for experiment tracking and model versioning

  • DVC for data versioning and reproducible pipelines

You will define clear artifacts such as trained models, feature schemas, and prediction logs, ensuring that every step in the pipeline is traceable and repeatable.

4. Serving Layer

You will expose your models through a FastAPI-based service with well-defined endpoints for prediction.

You will enforce input validation using Pydantic and build a browser-based interface using Gradio for demonstration purposes.

You will also implement monitoring mechanisms such as PSI-based drift detection to identify when the system starts behaving differently due to changes in incoming data.
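
For readers unfamiliar with PSI (Population Stability Index), the sketch below shows one common way to compute it for a single numeric feature: bin the training-time values and the incoming values with the same edges, then sum (actual share minus expected share) times the log of their ratio. The function name, bin edges, and sample values are assumptions for illustration, not the course's implementation.

```python
import math


def psi(expected, actual, bin_edges, eps=1e-6):
    """Population Stability Index between a baseline sample and a new sample."""
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")

    def proportions(values):
        # Bin i holds values with exactly i edges at or below them.
        counts = [0] * (len(bin_edges) + 1)
        for v in values:
            idx = sum(1 for edge in bin_edges if v >= edge)
            counts[idx] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    e = proportions(expected)
    a = proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical length-of-stay values at training time and in production.
baseline = [1, 2, 3, 4, 5, 6, 7, 8]
shifted = [5, 6, 7, 8, 9, 10, 11, 12]
edges = [3, 6]

print(psi(baseline, baseline, edges))  # identical samples give 0.0
print(psi(baseline, shifted, edges))   # a clear shift gives a large positive value
```

A frequently quoted rule of thumb treats values below 0.1 as stable and values above 0.25 as significant drift, but suitable thresholds depend on the feature and should be validated per system.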

5. Cloud Deployment Layer

You will containerize your application using Docker and push images to AWS Elastic Container Registry.

You will then deploy the system on AWS EKS using Kubernetes, which supports horizontal scaling, high availability, and rolling updates without downtime.

A complete CI/CD pipeline using GitHub Actions will automate build, test, and deployment steps.

6. Continuous Retraining Loop

The system does not stop after deployment.

You will implement a feedback loop where:

  • predictions are logged

  • drift is detected

  • retraining is triggered using DVC pipelines

This keeps the models aligned with current data as new records flow in.
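
The loop above can be sketched as a small decision function: given per-feature drift scores, rerun the DVC pipeline with `dvc repro` when any score crosses a threshold. The function name, the 0.25 threshold, and the feature names are assumptions for illustration; the course may wire this trigger differently (for example, from a scheduled job or a CI workflow).

```python
import subprocess

PSI_THRESHOLD = 0.25  # assumed cut-off; tune per feature in a real system


def maybe_retrain(psi_by_feature, threshold=PSI_THRESHOLD, runner=subprocess.run):
    """Rerun the DVC pipeline if any feature's PSI exceeds the threshold.

    Returns the sorted list of drifted feature names (empty if none).
    `runner` is injectable so the trigger can be tested without DVC installed.
    """
    drifted = sorted(
        name for name, value in psi_by_feature.items() if value > threshold
    )
    if drifted:
        # `dvc repro` re-executes pipeline stages whose dependencies changed.
        runner(["dvc", "repro"], check=True)
    return drifted


# Dry run with a stand-in runner that only records the command.
issued = []
drifted = maybe_retrain(
    {"avg_length_of_stay": 0.31, "visit_count": 0.04},  # hypothetical scores
    runner=lambda cmd, check=False: issued.append(cmd),
)
print(drifted, issued)
```

Keeping the trigger separate from the drift calculation makes it straightforward to add safeguards later, such as requiring drift on several consecutive checks before retraining.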

How This Course Connects the Dots

One of the biggest challenges in learning AI and machine learning is fragmentation. You learn SQL in one place, modeling in another, APIs somewhere else, and cloud deployment separately.

This course connects all of these pieces into a single, coherent system.

You will see how:

  • raw data flows into structured analytics

  • features feed into models

  • models are tracked and versioned

  • predictions are served via APIs

  • systems are deployed to the cloud

  • monitoring drives retraining

By the end, you will not just understand individual tools. You will understand how they work together.

Who This Course Is For

This course is ideal for:

  • software engineers who want to transition into AI/ML systems

  • machine learning practitioners who want to learn production deployment

  • backend developers interested in building AI-powered APIs

  • architects who want to understand end-to-end AI system design

What You Will Walk Away With

By the end of this course, you will have:

  • built a complete end-to-end AI system

  • deployed it on AWS using modern cloud practices

  • implemented monitoring and retraining mechanisms

  • developed a strong understanding of production-first architecture

More importantly, you will develop the ability to think beyond models and design systems that deliver real business value.

Final Note

This is not a course about isolated concepts. It is about building something that resembles real-world systems.

If your goal is to move from learning machine learning to applying it in production, this course is designed for you.

Production-first architecture is not an advanced topic. It is the standard.

Who this course is for:

  • Software engineers who want to transition into AI/ML systems and production architecture
  • Machine learning practitioners who want to learn deployment, MLOps, and real-world pipelines
  • Backend developers interested in building AI-powered APIs and scalable services
  • Architects and senior developers who want to understand end-to-end AI system design
  • Anyone who is tired of isolated tutorials and wants to see how everything connects in production