Module 6 : TP Final - Infrastructure Production-Ready
Objectifs du Module
Ce TP final vous permet de mettre en pratique l'ensemble des compétences acquises durant la formation en déployant une infrastructure complète et production-ready sur AWS.
Durée estimée : 4 heures
Contexte
Vous êtes Cloud Engineer chez TechStartup Inc., une entreprise qui lance une nouvelle application web. Votre mission est de concevoir et déployer l'infrastructure AWS complète en respectant les best practices de sécurité, haute disponibilité et optimisation des coûts.
Architecture Cible
graph TB
subgraph "Internet"
USERS((👥 Users))
CF["☁️ CloudFront"]
end
subgraph "AWS Region: eu-west-1"
subgraph "Public Subnets"
ALB["⚖️ Application<br/>Load Balancer"]
NAT_A["📡 NAT GW A"]
NAT_B["📡 NAT GW B"]
end
subgraph "Private Subnets - EKS"
EKS["☸️ EKS Cluster"]
subgraph "Pods"
WEB["🌐 Web (x3)"]
API["⚙️ API (x3)"]
WORKER["👷 Worker (x2)"]
end
end
subgraph "Database Subnets"
RDS["🗄️ Aurora PostgreSQL<br/>Multi-AZ"]
REDIS["📮 ElastiCache<br/>Redis Cluster"]
end
subgraph "Storage"
S3_ASSETS["📦 S3 Assets"]
S3_LOGS["📦 S3 Logs"]
end
subgraph "Security & Monitoring"
SM["🔐 Secrets Manager"]
CW["📊 CloudWatch"]
WAF["🛡️ WAF"]
end
end
USERS --> CF
CF --> WAF
WAF --> ALB
ALB --> WEB
WEB --> API
API --> RDS
API --> REDIS
API --> S3_ASSETS
WORKER --> RDS
EKS --> CW
API --> SM
style EKS fill:#326ce5,color:#fff
style RDS fill:#3b48cc,color:#fff
style ALB fill:#8c4fff,color:#fff
Spécifications Techniques
Infrastructure Réseau
| Composant | Spécification |
|---|---|
| VPC CIDR | 10.0.0.0/16 |
| Public Subnets | 10.0.1.0/24, 10.0.2.0/24, 10.0.3.0/24 |
| Private Subnets | 10.0.10.0/24, 10.0.11.0/24, 10.0.12.0/24 |
| Database Subnets | 10.0.20.0/24, 10.0.21.0/24, 10.0.22.0/24 |
| Availability Zones | eu-west-1a, eu-west-1b, eu-west-1c |
Compute (EKS)
| Composant | Spécification |
|---|---|
| Version Kubernetes | 1.28 |
| Node Type | t3.large (production), t3.medium (staging) |
| Min/Max Nodes | 3/10 |
| Node Groups | 2 (on-demand + spot) |
Base de Données
| Composant | Spécification |
|---|---|
| Engine | Aurora PostgreSQL 15.4 |
| Instance Class | db.r6g.large |
| Réplication | Multi-AZ (1 writer, 2 readers) |
| Backup | 7 jours, PITR activé |
Cache
| Composant | Spécification |
|---|---|
| Engine | Redis 7.0 |
| Node Type | cache.r6g.large |
| Réplication | 2 replicas |
| Cluster Mode | Enabled |
Tâches à Réaliser
Partie 1 : Infrastructure Réseau (45 min)
Tâche 1.1 : VPC et Subnets
Créer le VPC complet avec :
- VPC avec DNS hostnames activé
- 9 subnets (3 publics, 3 privés, 3 database)
- Internet Gateway
- 2 NAT Gateways (HA)
- Route Tables appropriées
- VPC Flow Logs vers CloudWatch
Solution 1.1
#!/bin/bash
# create-vpc.sh
REGION="eu-west-1"
VPC_CIDR="10.0.0.0/16"
# VPC
VPC_ID=$(aws ec2 create-vpc \
--cidr-block $VPC_CIDR \
--tag-specifications 'ResourceType=vpc,Tags=[{Key=Name,Value=techstartup-vpc}]' \
--query 'Vpc.VpcId' --output text)
aws ec2 modify-vpc-attribute --vpc-id $VPC_ID --enable-dns-hostnames '{"Value":true}'
# Internet Gateway
IGW_ID=$(aws ec2 create-internet-gateway \
--tag-specifications 'ResourceType=internet-gateway,Tags=[{Key=Name,Value=techstartup-igw}]' \
--query 'InternetGateway.InternetGatewayId' --output text)
aws ec2 attach-internet-gateway --internet-gateway-id $IGW_ID --vpc-id $VPC_ID
# Subnets
declare -A SUBNETS
AZS=("a" "b" "c")
# Public Subnets
for i in 0 1 2; do
SUBNETS["pub_${AZS[$i]}"]=$(aws ec2 create-subnet \
--vpc-id $VPC_ID \
--cidr-block "10.0.$((i+1)).0/24" \
--availability-zone "${REGION}${AZS[$i]}" \
--tag-specifications "ResourceType=subnet,Tags=[{Key=Name,Value=pub-${AZS[$i]}},{Key=kubernetes.io/role/elb,Value=1}]" \
--query 'Subnet.SubnetId' --output text)
aws ec2 modify-subnet-attribute --subnet-id ${SUBNETS["pub_${AZS[$i]}"]} --map-public-ip-on-launch
done
# Private Subnets
for i in 0 1 2; do
SUBNETS["priv_${AZS[$i]}"]=$(aws ec2 create-subnet \
--vpc-id $VPC_ID \
--cidr-block "10.0.$((i+10)).0/24" \
--availability-zone "${REGION}${AZS[$i]}" \
--tag-specifications "ResourceType=subnet,Tags=[{Key=Name,Value=priv-${AZS[$i]}},{Key=kubernetes.io/role/internal-elb,Value=1}]" \
--query 'Subnet.SubnetId' --output text)
done
# Database Subnets
for i in 0 1 2; do
SUBNETS["db_${AZS[$i]}"]=$(aws ec2 create-subnet \
--vpc-id $VPC_ID \
--cidr-block "10.0.$((i+20)).0/24" \
--availability-zone "${REGION}${AZS[$i]}" \
--tag-specifications "ResourceType=subnet,Tags=[{Key=Name,Value=db-${AZS[$i]}}]" \
--query 'Subnet.SubnetId' --output text)
done
# NAT Gateways (2 pour HA)
for az in a b; do
EIP_ID=$(aws ec2 allocate-address --domain vpc --query 'AllocationId' --output text)
NAT_GW=$(aws ec2 create-nat-gateway \
--subnet-id ${SUBNETS["pub_$az"]} \
--allocation-id $EIP_ID \
--tag-specifications "ResourceType=natgateway,Tags=[{Key=Name,Value=nat-$az}]" \
--query 'NatGateway.NatGatewayId' --output text)
eval "NAT_$az=$NAT_GW"
done
echo "Waiting for NAT Gateways..."
aws ec2 wait nat-gateway-available --nat-gateway-ids $NAT_a $NAT_b
# Route Tables
# Public RT
PUB_RT=$(aws ec2 create-route-table --vpc-id $VPC_ID \
--tag-specifications 'ResourceType=route-table,Tags=[{Key=Name,Value=pub-rt}]' \
--query 'RouteTable.RouteTableId' --output text)
aws ec2 create-route --route-table-id $PUB_RT --destination-cidr-block 0.0.0.0/0 --gateway-id $IGW_ID
for az in a b c; do
aws ec2 associate-route-table --subnet-id ${SUBNETS["pub_$az"]} --route-table-id $PUB_RT
done
# Private RTs (one per AZ for NAT Gateway)
for az in a b; do
PRIV_RT=$(aws ec2 create-route-table --vpc-id $VPC_ID \
--tag-specifications "ResourceType=route-table,Tags=[{Key=Name,Value=priv-rt-$az}]" \
--query 'RouteTable.RouteTableId' --output text)
eval "NAT_GW=\$NAT_$az"
aws ec2 create-route --route-table-id $PRIV_RT --destination-cidr-block 0.0.0.0/0 --nat-gateway-id $NAT_GW
aws ec2 associate-route-table --subnet-id ${SUBNETS["priv_$az"]} --route-table-id $PRIV_RT
aws ec2 associate-route-table --subnet-id ${SUBNETS["db_$az"]} --route-table-id $PRIV_RT
done
# AZ-C uses NAT from AZ-B
aws ec2 associate-route-table --subnet-id ${SUBNETS["priv_c"]} --route-table-id $PRIV_RT
aws ec2 associate-route-table --subnet-id ${SUBNETS["db_c"]} --route-table-id $PRIV_RT
# VPC Flow Logs
aws logs create-log-group --log-group-name /vpc/techstartup-flow-logs
aws ec2 create-flow-logs \
--resource-type VPC \
--resource-ids $VPC_ID \
--traffic-type ALL \
--log-destination-type cloud-watch-logs \
--log-group-name /vpc/techstartup-flow-logs \
--deliver-logs-permission-arn arn:aws:iam::$(aws sts get-caller-identity --query Account --output text):role/VPCFlowLogsRole
echo "=== VPC Created ==="
echo "VPC: $VPC_ID"
Partie 2 : Base de Données (30 min)
Tâche 2.1 : Aurora PostgreSQL
Déployer un cluster Aurora PostgreSQL avec :
- 1 writer + 2 readers
- Multi-AZ
- Encryption at rest (KMS)
- Backup 7 jours
- Performance Insights activé
- Accès via Private IP uniquement
Solution 2.1
# Subnet Group
aws rds create-db-subnet-group \
--db-subnet-group-name techstartup-db-subnets \
--db-subnet-group-description "Database subnets for TechStartup" \
--subnet-ids ${SUBNETS[db_a]} ${SUBNETS[db_b]} ${SUBNETS[db_c]}
# Security Group
DB_SG=$(aws ec2 create-security-group \
--group-name aurora-sg \
--description "Aurora PostgreSQL" \
--vpc-id $VPC_ID \
--query 'GroupId' --output text)
aws ec2 authorize-security-group-ingress \
--group-id $DB_SG \
--protocol tcp \
--port 5432 \
--source-group $EKS_SG # From EKS nodes
# Aurora Cluster
aws rds create-db-cluster \
--db-cluster-identifier techstartup-aurora \
--engine aurora-postgresql \
--engine-version 15.4 \
--master-username dbadmin \
--master-user-password "$(aws secretsmanager get-random-password --password-length 32 --query 'RandomPassword' --output text)" \
--db-subnet-group-name techstartup-db-subnets \
--vpc-security-group-ids $DB_SG \
--storage-encrypted \
--kms-key-id alias/aurora-key \
--backup-retention-period 7 \
--enable-cloudwatch-logs-exports '["postgresql"]' \
--deletion-protection
# Writer Instance
aws rds create-db-instance \
--db-instance-identifier techstartup-aurora-writer \
--db-cluster-identifier techstartup-aurora \
--db-instance-class db.r6g.large \
--engine aurora-postgresql \
--enable-performance-insights \
--performance-insights-retention-period 7
# Reader Instances
for i in 1 2; do
aws rds create-db-instance \
--db-instance-identifier techstartup-aurora-reader-$i \
--db-cluster-identifier techstartup-aurora \
--db-instance-class db.r6g.large \
--engine aurora-postgresql \
--enable-performance-insights
done
# Store credentials in Secrets Manager
aws secretsmanager create-secret \
--name techstartup/aurora/credentials \
--secret-string '{"username":"dbadmin","password":"...",...}'
Partie 3 : EKS Cluster (45 min)
Tâche 3.1 : Cluster EKS
Créer un cluster EKS avec :
- 2 node groups (on-demand + spot)
- IRSA configuré
- AWS Load Balancer Controller
- EBS CSI Driver
- Cluster Autoscaler
Solution 3.1
# Créer le cluster avec eksctl
cat > cluster.yaml << 'EOF'
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
name: techstartup-cluster
region: eu-west-1
version: "1.28"
vpc:
id: "vpc-xxx"
subnets:
private:
eu-west-1a: { id: subnet-priv-a }
eu-west-1b: { id: subnet-priv-b }
eu-west-1c: { id: subnet-priv-c }
iam:
withOIDC: true
managedNodeGroups:
- name: on-demand
instanceType: t3.large
minSize: 2
maxSize: 6
desiredCapacity: 3
volumeSize: 50
privateNetworking: true
labels:
workload: on-demand
iam:
withAddonPolicies:
albIngress: true
ebs: true
cloudWatch: true
- name: spot
instanceTypes: ["t3.large", "t3.xlarge", "m5.large"]
minSize: 0
maxSize: 10
desiredCapacity: 2
spot: true
privateNetworking: true
labels:
workload: spot
taints:
- key: spot
value: "true"
effect: PreferNoSchedule
addons:
- name: vpc-cni
- name: coredns
- name: kube-proxy
- name: aws-ebs-csi-driver
cloudWatch:
clusterLogging:
enableTypes: ["api", "audit", "authenticator", "controllerManager", "scheduler"]
EOF
eksctl create cluster -f cluster.yaml
# Install AWS Load Balancer Controller
helm repo add eks https://aws.github.io/eks-charts
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
-n kube-system \
--set clusterName=techstartup-cluster
# Install Cluster Autoscaler
kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml
Partie 4 : Application Deployment (45 min)
Tâche 4.1 : Déployer l'Application
Déployer les composants :
- Web frontend (3 replicas)
- API backend (3 replicas)
- Worker (2 replicas)
- Ingress avec ALB
- HPA sur web et API
Solution 4.1
# Complete deployment file
---
apiVersion: v1
kind: Namespace
metadata:
name: techstartup
---
# ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
name: app-config
namespace: techstartup
data:
DATABASE_HOST: "techstartup-aurora.cluster-xxx.eu-west-1.rds.amazonaws.com"
REDIS_HOST: "techstartup-redis.xxx.cache.amazonaws.com"
---
# External Secrets (from Secrets Manager)
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
name: db-credentials
namespace: techstartup
spec:
refreshInterval: 1h
secretStoreRef:
name: aws-secrets-manager
kind: ClusterSecretStore
target:
name: db-credentials
data:
- secretKey: password
remoteRef:
key: techstartup/aurora/credentials
property: password
---
# Web Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
name: web
namespace: techstartup
spec:
replicas: 3
selector:
matchLabels:
app: web
template:
metadata:
labels:
app: web
spec:
serviceAccountName: techstartup-sa
containers:
- name: web
image: 123456789012.dkr.ecr.eu-west-1.amazonaws.com/techstartup-web:latest
ports:
- containerPort: 80
envFrom:
- configMapRef:
name: app-config
resources:
requests:
cpu: 100m
memory: 256Mi
limits:
cpu: 500m
memory: 512Mi
livenessProbe:
httpGet:
path: /health
port: 80
initialDelaySeconds: 10
readinessProbe:
httpGet:
path: /ready
port: 80
---
# API Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
name: api
namespace: techstartup
spec:
replicas: 3
selector:
matchLabels:
app: api
template:
metadata:
labels:
app: api
spec:
serviceAccountName: techstartup-sa
containers:
- name: api
image: 123456789012.dkr.ecr.eu-west-1.amazonaws.com/techstartup-api:latest
ports:
- containerPort: 8080
envFrom:
- configMapRef:
name: app-config
env:
- name: DB_PASSWORD
valueFrom:
secretKeyRef:
name: db-credentials
key: password
resources:
requests:
cpu: 200m
memory: 512Mi
limits:
cpu: 1000m
memory: 1Gi
---
# Worker Deployment (on spot nodes)
apiVersion: apps/v1
kind: Deployment
metadata:
name: worker
namespace: techstartup
spec:
replicas: 2
selector:
matchLabels:
app: worker
template:
metadata:
labels:
app: worker
spec:
tolerations:
- key: spot
operator: Equal
value: "true"
effect: PreferNoSchedule
nodeSelector:
workload: spot
containers:
- name: worker
image: 123456789012.dkr.ecr.eu-west-1.amazonaws.com/techstartup-worker:latest
---
# Services
apiVersion: v1
kind: Service
metadata:
name: web
namespace: techstartup
spec:
selector:
app: web
ports:
- port: 80
---
apiVersion: v1
kind: Service
metadata:
name: api
namespace: techstartup
spec:
selector:
app: api
ports:
- port: 8080
---
# Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: techstartup-ingress
namespace: techstartup
annotations:
kubernetes.io/ingress.class: alb
alb.ingress.kubernetes.io/scheme: internet-facing
alb.ingress.kubernetes.io/target-type: ip
alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}]'
alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:eu-west-1:xxx:certificate/xxx
alb.ingress.kubernetes.io/wafv2-acl-arn: arn:aws:wafv2:eu-west-1:xxx:regional/webacl/xxx
spec:
rules:
- host: app.techstartup.com
http:
paths:
- path: /api
pathType: Prefix
backend:
service:
name: api
port:
number: 8080
- path: /
pathType: Prefix
backend:
service:
name: web
port:
number: 80
---
# HPA
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: web-hpa
namespace: techstartup
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: web
minReplicas: 3
maxReplicas: 20
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
Partie 5 : Monitoring & Alerting (30 min)
Tâche 5.1 : CloudWatch Dashboard et Alertes
Configurer :
- Dashboard CloudWatch pour l'application
- Alertes sur CPU, mémoire, erreurs 5xx
- SNS topic pour notifications
Solution 5.1
# SNS Topic
TOPIC_ARN=$(aws sns create-topic --name techstartup-alerts --query 'TopicArn' --output text)
aws sns subscribe --topic-arn $TOPIC_ARN --protocol email --notification-endpoint ops@techstartup.com
# CloudWatch Alarms
# High CPU on EKS
aws cloudwatch put-metric-alarm \
--alarm-name "EKS-High-CPU" \
--metric-name CPUUtilization \
--namespace AWS/EKS \
--statistic Average \
--period 300 \
--threshold 80 \
--comparison-operator GreaterThanThreshold \
--evaluation-periods 2 \
--alarm-actions $TOPIC_ARN
# Aurora connections
aws cloudwatch put-metric-alarm \
--alarm-name "Aurora-High-Connections" \
--metric-name DatabaseConnections \
--namespace AWS/RDS \
--dimensions Name=DBClusterIdentifier,Value=techstartup-aurora \
--statistic Average \
--period 300 \
--threshold 100 \
--comparison-operator GreaterThanThreshold \
--evaluation-periods 2 \
--alarm-actions $TOPIC_ARN
# ALB 5xx errors
aws cloudwatch put-metric-alarm \
--alarm-name "ALB-5xx-Errors" \
--metric-name HTTPCode_Target_5XX_Count \
--namespace AWS/ApplicationELB \
--dimensions Name=LoadBalancer,Value=app/techstartup-alb/xxx \
--statistic Sum \
--period 60 \
--threshold 10 \
--comparison-operator GreaterThanThreshold \
--evaluation-periods 1 \
--alarm-actions $TOPIC_ARN
Livrables Attendus
- Scripts Infrastructure as Code (Bash/CloudFormation/Terraform)
- Manifests Kubernetes (YAML)
- Diagramme d'architecture (Mermaid ou draw.io)
- Documentation des choix techniques (README.md)
- Estimation des coûts mensuels
Estimation des Coûts
| Service | Configuration | Coût estimé/mois |
|---|---|---|
| EKS Control Plane | 1 cluster | $73 |
| EC2 (EKS Nodes) | 3x t3.large on-demand | $180 |
| EC2 (Spot) | 2x t3.large avg | $40 |
| Aurora PostgreSQL | db.r6g.large (writer + 2 readers) | $450 |
| ElastiCache Redis | cache.r6g.large (3 nodes) | $270 |
| ALB | 1 ALB | $25 |
| NAT Gateway | 2 NAT GW | $90 |
| S3 | 100GB Standard | $2 |
| CloudWatch | Logs + Metrics | $30 |
| TOTAL | ~$1,160/mois |
Optimisation
- Utiliser Savings Plans pour EC2/Fargate (-30%)
- Reserved Instances pour Aurora (-40%)
- Revoir le sizing après 1 mois de production
Critères d'Évaluation
| Critère | Points |
|---|---|
| Infrastructure réseau complète et sécurisée | 20 |
| Base de données HA avec backups | 15 |
| Cluster EKS fonctionnel | 20 |
| Application déployée et accessible | 20 |
| Monitoring et alerting | 15 |
| Documentation et qualité du code | 10 |
| TOTAL | 100 |
Navigation
| Précédent | Suivant |
|---|---|
| ← Module 5 : EKS | Module 7 : CI/CD → |
Navigation
| ← Module 5 : EKS & Containers | Module 7 : CI/CD avec CodePipeline & ... → |