3.1 Project Overview
Overview of Project ☁️
This project involves building and deploying a Cybersecurity Threat Detection System using Amazon SageMaker. The system identifies anomalous network activity that may indicate cyberattacks, such as DDoS attacks, unauthorized access, or phishing attempts. The machine learning pipeline automates data ingestion, preprocessing, model training, deployment, and inference.
Key components include:
- Data Ingestion & Preprocessing: Raw network traffic logs are collected, transformed, and feature-engineered to create a structured dataset.
- Model Training & Evaluation: An XGBoost model is trained to classify network activity as normal or malicious.
- Deployment & Inference: The trained model is deployed as an endpoint to detect real-time security threats.
- Pipeline Automation: An end-to-end SageMaker Pipeline automates data transformation, model training, and deployment.
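To make the "transformed and feature-engineered" step above concrete, here is a minimal sketch in plain Python. The log schema (`LogRecord`), the feature names, and the sample records are assumptions made for illustration only; the actual log format and feature set used in the lessons may differ.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class LogRecord:
    # Hypothetical schema: the real traffic-log format is not shown in this overview.
    src_ip: str
    dst_port: int
    bytes_sent: int
    failed_login: bool


def engineer_features(records):
    """Aggregate raw log records into one feature row per source IP."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec.src_ip].append(rec)

    features = {}
    for ip, recs in grouped.items():
        features[ip] = {
            "request_count": len(recs),
            "total_bytes": sum(r.bytes_sent for r in recs),
            # Many distinct ports from one source can hint at port scanning.
            "distinct_ports": len({r.dst_port for r in recs}),
            "failed_login_ratio": sum(r.failed_login for r in recs) / len(recs),
        }
    return features


# Made-up sample data, for illustration only.
SAMPLE_LOGS = [
    LogRecord("10.0.0.5", 443, 1200, False),
    LogRecord("10.0.0.5", 443, 800, False),
    LogRecord("203.0.113.9", 22, 60, True),
    LogRecord("203.0.113.9", 23, 60, True),
    LogRecord("203.0.113.9", 3389, 60, True),
]

if __name__ == "__main__":
    for ip, row in engineer_features(SAMPLE_LOGS).items():
        print(ip, row)
```

Rows like these, one per source, are the kind of structured dataset a classifier such as XGBoost could then be trained on once each row carries a normal/malicious label.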
Steps to be performed 👩‍💻
We'll go through the following steps in the next few lessons.
1. Preprocess Data and Feature Engineering
2. Training and Testing a Model using XGBoost
3. Deploy and Serve the Model
4. Automating with SageMaker Pipelines
Services Used 🛠
- Amazon SageMaker: Trains, deploys, and serves the machine learning model. [Machine Learning]
- Amazon S3: Stores raw network traffic logs, preprocessed data, and model artifacts. [Storage]
- AWS Lambda: Automates data preprocessing tasks and feature extraction. [Compute]
- Amazon CloudWatch: Monitors model performance and logs security threats. [Monitoring]
- AWS IAM: Manages permissions and security policies for accessing AWS services. [Security]
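As one possible way S3 and Lambda could fit together, the sketch below shows a Lambda handler that reads the bucket and key of newly uploaded log files from an S3 event notification. It assumes the function is triggered by S3 uploads, which this overview does not state; the bucket name, object key, and the helper `extract_s3_objects` are hypothetical, and the actual preprocessing (for example, reading the object with boto3) is left out.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return (bucket, key) pairs found in an S3 event notification payload."""
    objects = []
    for record in event.get("Records", []):
        s3_info = record.get("s3", {})
        bucket = s3_info.get("bucket", {}).get("name")
        key = s3_info.get("object", {}).get("key")
        if bucket and key:
            # Object keys arrive URL-encoded in S3 event notifications.
            objects.append((bucket, unquote_plus(key)))
    return objects


def lambda_handler(event, context):
    """Lambda entry point; real preprocessing of each object would go here."""
    objects = extract_s3_objects(event)
    return {
        "statusCode": 200,
        "objects": [f"s3://{bucket}/{key}" for bucket, key in objects],
    }


# Trimmed-down sample event with made-up names, for local testing only.
SAMPLE_EVENT = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "example-raw-logs"},
                "object": {"key": "logs/2024+01/traffic.csv"},
            }
        }
    ]
}

if __name__ == "__main__":
    print(lambda_handler(SAMPLE_EVENT, None))
```

Keeping the event parsing in its own function makes it easy to test locally without AWS credentials.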
Estimated Time & Cost ⚙️
- Time: about 2 to 3 hours
- Cost: approximately $1 to $2
➡️ Diagram
This is the architectural diagram for the project:
➡️ Final Result
The finished project is a trained XGBoost model deployed as a SageMaker endpoint that classifies incoming network activity as normal or malicious in real time. An end-to-end SageMaker Pipeline automates the data transformation, model training, and deployment steps.