3.1 Project Overview

Overview of Project ☁️

Scenario:

Cloudhour, a fast-growing SaaS startup, is preparing to launch a new subscription-based product.

The marketing team wants to automatically determine if a new customer is likely to subscribe to the premium plan, based on their demographic and past interaction data, such as age, job, account balance, and contact history.

They need a real-time prediction API that their website can call instantly whenever a user signs up.

Our solution:

We’ll build, train, and deploy a machine learning model that predicts whether a new customer will subscribe to the premium plan.

Using Amazon SageMaker, we’ll handle the entire ML workflow, from importing data and training a model to deploying it as a live endpoint that responds to API calls in real time.

About the Project

In this hands-on lab, you’ll learn to:

Import and preprocess data using Amazon S3 and SageMaker Notebooks
Train a binary classification model using XGBoost or scikit-learn inside SageMaker
Deploy the trained model to a Real-Time Endpoint for live inference
Test the endpoint using Python to simulate API calls

By the end of this project, you’ll have a fully functional ML-powered API that predicts customer subscriptions on demand.

This project helps you understand how ML models move from experimentation to production, forming the foundation for future MLOps automation.

Steps To Be Performed 👩‍💻

We’ll complete the following steps in sequence:

Set up the SageMaker environment and permissions
Import and upload the dataset to Amazon S3
Explore and prepare the data for model training
Train the classification model using XGBoost
Deploy the trained model to a Real-Time Endpoint
Test live predictions using JSON inputs
Clean up resources and review best practices

Each of these steps will be explained in detail in the following pages.
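
As a preview of the testing step, here is a minimal sketch of how a Python client might send one customer record to the endpoint. It assumes the endpoint accepts a single CSV row (text/csv) and returns one probability as plain text. The feature names and their order in FEATURE_ORDER are hypothetical placeholders and must match the columns used in training. The client object is passed in as an argument, so the helpers can be tried out without AWS access.

```python
# Hypothetical feature order; the real order must match the training columns.
FEATURE_ORDER = ["age", "balance", "previous_contacts"]


def to_csv_row(record, feature_order=FEATURE_ORDER):
    """Serialize one JSON-style record (a dict) into a single CSV line."""
    return ",".join(str(record[name]) for name in feature_order)


def predict_probability(runtime_client, endpoint_name, record):
    """Send one record to the endpoint and return the predicted probability.

    Assumes the endpoint replies with a single number as plain text.
    """
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=to_csv_row(record),
    )
    # The response body is a stream; read and decode it before parsing.
    return float(response["Body"].read().decode("utf-8").strip())
```

With AWS credentials configured, the client could be created with boto3.client("sagemaker-runtime") and passed in as runtime_client, together with whatever name you give your endpoint.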

Services Used 🛠

Amazon SageMaker → Used to build, train, and host the machine learning model
Amazon S3 → Stores datasets and model artifacts for training and deployment
AWS IAM → Provides secure role-based permissions for SageMaker and S3 access
Amazon CloudWatch (optional) → Monitors logs, latency, and performance metrics for endpoints

Estimated Time & Cost ⚙️

Estimated Time: 1.5 – 2.5 hours
Estimated Cost: ~$1 – $2 (assuming ml.t2.medium or ml.m5.large instances are used only briefly)
Note: Always delete endpoints after testing to avoid additional hourly costs.
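
The cleanup in the note above can be sketched with a small helper. This is one possible approach rather than the lab’s exact procedure. The resource names are hypothetical arguments, and sm_client is assumed to be a SageMaker client such as the one returned by boto3.client("sagemaker"). Deleting the endpoint is what stops the hourly charge; removing the endpoint configuration and model is included as housekeeping.

```python
def delete_inference_resources(sm_client, endpoint_name, config_name, model_name):
    """Delete the endpoint, then its endpoint configuration and model.

    All three names are placeholders for whatever you chose at deploy time.
    Returns the names in the order they were deleted.
    """
    # Delete the endpoint first, since it is the resource billed per hour.
    sm_client.delete_endpoint(EndpointName=endpoint_name)
    # The configuration and model are removed afterwards to leave nothing behind.
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    sm_client.delete_model(ModelName=model_name)
    return [endpoint_name, config_name, model_name]
```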

➡️ Architectural Diagram

This is the architecture you’ll build in this project:

➡️ Final Result

The final chart visualizes the predicted subscription probability for each customer in the test set.

Each point shows the model’s predicted probability for one customer, and the red line at the 0.5 threshold separates “Subscribe” predictions from “Not Subscribe” predictions.

This helps you understand how the model makes decisions and how confident it is across different samples.
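
The thresholding described above can be illustrated with a short sketch. It assumes the model outputs one probability per customer and that a value exactly at 0.5 counts as “Subscribe”; the lab’s chart may treat that boundary differently. The function names are illustrative only.

```python
THRESHOLD = 0.5  # the decision threshold drawn as the red line in the chart


def label_predictions(probabilities, threshold=THRESHOLD):
    """Map each predicted probability to a class label."""
    return [
        "Subscribe" if p >= threshold else "Not Subscribe"
        for p in probabilities
    ]


def distance_from_threshold(probability, threshold=THRESHOLD):
    """How far a prediction sits from the threshold; larger means more confident."""
    return abs(probability - threshold)
```

For example, label_predictions([0.1, 0.5, 0.92]) gives "Not Subscribe", "Subscribe", "Subscribe", and the 0.92 prediction sits farther from the threshold than the 0.5 one, so the model is more confident about it.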
