583 lines
14 KiB
Markdown
583 lines
14 KiB
Markdown
|
|
# Cloud Cost Optimization Reference
|
||
|
|
|
||
|
|
Comprehensive guide for cloud cost optimization including reserved instances, spot/preemptible, right-sizing, and FinOps practices.
|
||
|
|
|
||
|
|
## FinOps Framework
|
||
|
|
|
||
|
|
### FinOps Principles
|
||
|
|
|
||
|
|
1. **Teams need to collaborate** - Finance, engineering, and business work together
|
||
|
|
2. **Everyone takes ownership** - Decentralized cost responsibility
|
||
|
|
3. **A centralized team drives FinOps** - Center of excellence for best practices
|
||
|
|
4. **Reports should be accessible and timely** - Real-time visibility
|
||
|
|
5. **Decisions are driven by business value** - Cost per business outcome
|
||
|
|
6. **Take advantage of variable cost model** - Scale up and down as needed
|
||
|
|
|
||
|
|
### FinOps Lifecycle
|
||
|
|
|
||
|
|
```
|
||
|
|
Inform
|
||
|
|
|
|
||
|
|
+---------+
|
||
|
|
| |
|
||
|
|
v v
|
||
|
|
Optimize --> Operate
|
||
|
|
^ |
|
||
|
|
| |
|
||
|
|
+---------+
|
||
|
|
```
|
||
|
|
|
||
|
|
**Inform Phase**
|
||
|
|
- Visibility into cloud spend
|
||
|
|
- Allocation and showback
|
||
|
|
- Benchmarking and forecasting
|
||
|
|
|
||
|
|
**Optimize Phase**
|
||
|
|
- Rate optimization (RIs, savings plans)
|
||
|
|
- Usage optimization (right-sizing)
|
||
|
|
- Architectural optimization
|
||
|
|
|
||
|
|
**Operate Phase**
|
||
|
|
- Continuous improvement
|
||
|
|
- Automation and governance
|
||
|
|
- Anomaly detection
|
||
|
|
|
||
|
|
## Compute Cost Optimization
|
||
|
|
|
||
|
|
### Reserved Instances / Savings Plans
|
||
|
|
|
||
|
|
**AWS Savings Plans**
|
||
|
|
|
||
|
|
| Type | Flexibility | Savings |
|
||
|
|
|------|-------------|---------|
|
||
|
|
| Compute Savings Plans | Any EC2, Fargate, Lambda | Up to 66% |
|
||
|
|
| EC2 Instance Savings Plans | Specific instance family, region | Up to 72% |
|
||
|
|
| Reserved Instances | Specific instance type, AZ | Up to 72% |
|
||
|
|
|
||
|
|
**Commitment Strategy**
|
||
|
|
```
|
||
|
|
Baseline (always-on): 1-year or 3-year Savings Plans
|
||
|
|
Variable (predictable): Scheduled Reserved Instances
|
||
|
|
Spiky (unpredictable): On-Demand + Spot
|
||
|
|
```
|
||
|
|
|
||
|
|
**Azure Reservations**
|
||
|
|
```
|
||
|
|
# Azure CLI - Purchase reservation
|
||
|
|
az reservations reservation-order purchase \
|
||
|
|
--sku Standard_D2s_v3 \
|
||
|
|
--term P1Y \
|
||
|
|
--billing-scope /subscriptions/{subscription-id} \
|
||
|
|
--quantity 10 \
|
||
|
|
--applied-scope-type Shared
|
||
|
|
```
|
||
|
|
|
||
|
|
**GCP Committed Use Discounts**
|
||
|
|
- Resource-based: Specific vCPUs and memory
|
||
|
|
- Spend-based: Dollar commitment for flexibility
|
||
|
|
- 1-year (37% discount) or 3-year (55% discount)
|
||
|
|
|
||
|
|
### Spot/Preemptible Instances
|
||
|
|
|
||
|
|
**When to Use Spot**
|
||
|
|
- Batch processing and analytics
|
||
|
|
- CI/CD build agents
|
||
|
|
- Stateless web servers (with auto-scaling)
|
||
|
|
- Machine learning training
|
||
|
|
- Development and testing environments
|
||
|
|
|
||
|
|
**AWS Spot Best Practices**
|
||
|
|
```yaml
|
||
|
|
# EC2 Auto Scaling with Spot
|
||
|
|
MixedInstancesPolicy:
|
||
|
|
InstancesDistribution:
|
||
|
|
OnDemandBaseCapacity: 2
|
||
|
|
OnDemandPercentageAboveBaseCapacity: 20
|
||
|
|
SpotAllocationStrategy: capacity-optimized
|
||
|
|
LaunchTemplate:
|
||
|
|
Overrides:
|
||
|
|
- InstanceType: m5.large
|
||
|
|
- InstanceType: m5a.large
|
||
|
|
- InstanceType: m4.large
|
||
|
|
- InstanceType: r5.large
|
||
|
|
```
|
||
|
|
|
||
|
|
**Spot Interruption Handling**
|
||
|
|
```python
|
||
|
|
# Check for spot termination notice (AWS)
|
||
|
|
import requests
|
||
|
|
|
||
|
|
def check_spot_termination():
|
||
|
|
try:
|
||
|
|
response = requests.get(
|
||
|
|
"http://169.254.169.254/latest/meta-data/spot/termination-time",
|
||
|
|
timeout=2
|
||
|
|
)
|
||
|
|
if response.status_code == 200:
|
||
|
|
# 2-minute warning - gracefully shutdown
|
||
|
|
graceful_shutdown()
|
||
|
|
except requests.exceptions.RequestException:
|
||
|
|
pass # Not being terminated
|
||
|
|
```
|
||
|
|
|
||
|
|
**GCP Preemptible/Spot VMs**
|
||
|
|
```hcl
|
||
|
|
# Terraform - GCP Spot VM
|
||
|
|
resource "google_compute_instance" "spot" {
|
||
|
|
name = "spot-instance"
|
||
|
|
machine_type = "n2-standard-4"
|
||
|
|
|
||
|
|
scheduling {
|
||
|
|
preemptible = true
|
||
|
|
automatic_restart = false
|
||
|
|
provisioning_model = "SPOT"
|
||
|
|
instance_termination_action = "STOP"
|
||
|
|
}
|
||
|
|
}
|
||
|
|
```
|
||
|
|
|
||
|
|
### Right-Sizing
|
||
|
|
|
||
|
|
**Analysis Process**
|
||
|
|
1. Collect metrics (CPU, memory, network, disk I/O)
|
||
|
|
2. Identify idle or underutilized resources
|
||
|
|
3. Recommend appropriate instance size
|
||
|
|
4. Implement changes during maintenance windows
|
||
|
|
5. Monitor and iterate
|
||
|
|
|
||
|
|
**AWS Compute Optimizer**
|
||
|
|
```bash
|
||
|
|
# Enable Compute Optimizer
|
||
|
|
aws compute-optimizer update-enrollment-status \
|
||
|
|
--status Active \
|
||
|
|
--include-member-accounts
|
||
|
|
|
||
|
|
# Get recommendations
|
||
|
|
aws compute-optimizer get-ec2-instance-recommendations \
|
||
|
|
--filters name=Finding,values=OVER_PROVISIONED
|
||
|
|
```
|
||
|
|
|
||
|
|
**Right-Sizing Thresholds**
|
||
|
|
| Metric | Underutilized | Optimal | Overutilized |
|
||
|
|
|--------|---------------|---------|--------------|
|
||
|
|
| CPU | <20% avg | 40-60% avg | >80% avg |
|
||
|
|
| Memory | <30% avg | 50-70% avg | >85% avg |
|
||
|
|
| Network | <10% capacity | Variable | >80% capacity |
|
||
|
|
|
||
|
|
**Azure Advisor Recommendations**
|
||
|
|
```bash
|
||
|
|
# Get cost recommendations
|
||
|
|
az advisor recommendation list \
|
||
|
|
--category Cost \
|
||
|
|
--query "[?impact=='High']"
|
||
|
|
```
|
||
|
|
|
||
|
|
## Storage Cost Optimization
|
||
|
|
|
||
|
|
### Object Storage Tiering
|
||
|
|
|
||
|
|
**AWS S3 Storage Classes**
|
||
|
|
```
|
||
|
|
S3 Standard
|
||
|
|
|
|
||
|
|
| (30 days)
|
||
|
|
v
|
||
|
|
S3 Standard-IA
|
||
|
|
|
|
||
|
|
| (90 days)
|
||
|
|
v
|
||
|
|
S3 Glacier Instant Retrieval
|
||
|
|
|
|
||
|
|
| (180 days)
|
||
|
|
v
|
||
|
|
S3 Glacier Deep Archive
|
||
|
|
```
|
||
|
|
|
||
|
|
**Lifecycle Policy Example**
|
||
|
|
```json
|
||
|
|
{
|
||
|
|
"Rules": [
|
||
|
|
{
|
||
|
|
"ID": "OptimizeCosts",
|
||
|
|
"Status": "Enabled",
|
||
|
|
"Filter": { "Prefix": "logs/" },
|
||
|
|
"Transitions": [
|
||
|
|
{ "Days": 30, "StorageClass": "STANDARD_IA" },
|
||
|
|
{ "Days": 90, "StorageClass": "GLACIER" },
|
||
|
|
{ "Days": 365, "StorageClass": "DEEP_ARCHIVE" }
|
||
|
|
],
|
||
|
|
"Expiration": { "Days": 730 }
|
||
|
|
}
|
||
|
|
]
|
||
|
|
}
|
||
|
|
```
|
||
|
|
|
||
|
|
**S3 Intelligent-Tiering**
|
||
|
|
- Automatic tiering based on access patterns
|
||
|
|
- No retrieval fees
|
||
|
|
- Small monitoring fee per object
|
||
|
|
- Best for unpredictable access patterns
|
||
|
|
|
||
|
|
### Block Storage Optimization
|
||
|
|
|
||
|
|
**EBS Volume Selection**
|
||
|
|
| Type | Use Case | $/GB/month |
|
||
|
|
|------|----------|------------|
|
||
|
|
| gp3 | General purpose | $0.08 |
|
||
|
|
| gp2 | Legacy (migrate to gp3) | $0.10 |
|
||
|
|
| io2 | High IOPS databases | $0.125+ |
|
||
|
|
| st1 | Throughput (big data) | $0.045 |
|
||
|
|
| sc1 | Cold archives | $0.015 |
|
||
|
|
|
||
|
|
**gp3 Migration (20% savings)**
|
||
|
|
```bash
|
||
|
|
# Modify EBS volume from gp2 to gp3
|
||
|
|
aws ec2 modify-volume \
|
||
|
|
--volume-id vol-12345678 \
|
||
|
|
--volume-type gp3 \
|
||
|
|
--iops 3000 \
|
||
|
|
--throughput 125
|
||
|
|
```
|
||
|
|
|
||
|
|
### Database Storage
|
||
|
|
|
||
|
|
**Aurora Storage Optimization**
|
||
|
|
- Pay only for storage used (auto-scaling)
|
||
|
|
- No pre-provisioning required
|
||
|
|
- 10GB increments up to 128TB
|
||
|
|
|
||
|
|
**DynamoDB Capacity Modes**
|
||
|
|
| Mode | Best For | Pricing |
|
||
|
|
|------|----------|---------|
|
||
|
|
| On-Demand | Unpredictable traffic | Pay per request |
|
||
|
|
| Provisioned | Steady traffic | Pay per capacity unit |
|
||
|
|
| Provisioned + Auto Scaling | Variable but predictable | Lower cost than on-demand |
|
||
|
|
|
||
|
|
## Network Cost Optimization
|
||
|
|
|
||
|
|
### Data Transfer Costs
|
||
|
|
|
||
|
|
**AWS Data Transfer Pricing**
|
||
|
|
```
|
||
|
|
Inbound: Free
|
||
|
|
Same AZ: Free
|
||
|
|
Cross-AZ: $0.01/GB each direction
|
||
|
|
Same Region (via public IP): $0.01/GB
|
||
|
|
Cross-Region: $0.02/GB
|
||
|
|
Internet Egress: $0.09/GB (first 10TB)
|
||
|
|
```
|
||
|
|
|
||
|
|
**Optimization Strategies**
|
||
|
|
1. Keep traffic within same AZ when possible
|
||
|
|
2. Use VPC endpoints for AWS services
|
||
|
|
3. Use CloudFront for cacheable content
|
||
|
|
4. Compress data before transfer
|
||
|
|
5. Use regional rather than global services
|
||
|
|
|
||
|
|
**VPC Endpoints (Avoid NAT Gateway)**
|
||
|
|
```hcl
|
||
|
|
# Gateway endpoint (free for S3, DynamoDB)
|
||
|
|
resource "aws_vpc_endpoint" "s3" {
|
||
|
|
vpc_id = aws_vpc.main.id
|
||
|
|
service_name = "com.amazonaws.us-east-1.s3"
|
||
|
|
}
|
||
|
|
|
||
|
|
# Interface endpoint (cheaper than NAT for specific services)
|
||
|
|
resource "aws_vpc_endpoint" "ecr" {
|
||
|
|
vpc_id = aws_vpc.main.id
|
||
|
|
service_name = "com.amazonaws.us-east-1.ecr.api"
|
||
|
|
vpc_endpoint_type = "Interface"
|
||
|
|
}
|
||
|
|
```
|
||
|
|
|
||
|
|
### CDN Optimization
|
||
|
|
|
||
|
|
**CloudFront Cost Savings**
|
||
|
|
- Lower data transfer rates than direct from origin
|
||
|
|
- Cache hit ratio optimization (target >90%)
|
||
|
|
- Use Origin Shield to reduce origin load
|
||
|
|
- Compress objects (Gzip/Brotli)
|
||
|
|
|
||
|
|
```yaml
|
||
|
|
# CloudFront cache optimization
|
||
|
|
CacheBehaviors:
|
||
|
|
- PathPattern: "/static/*"
|
||
|
|
CachePolicyId: 658327ea-f89d-4fab-a63d-7e88639e58f6 # CachingOptimized
|
||
|
|
Compress: true
|
||
|
|
TTL:
|
||
|
|
DefaultTTL: 86400
|
||
|
|
MaxTTL: 31536000
|
||
|
|
```
|
||
|
|
|
||
|
|
## Serverless Cost Optimization
|
||
|
|
|
||
|
|
### Lambda Optimization
|
||
|
|
|
||
|
|
**Memory/CPU Tuning**
|
||
|
|
```python
|
||
|
|
# Use AWS Lambda Power Tuning
|
||
|
|
# Finds optimal memory for cost vs performance
|
||
|
|
|
||
|
|
# Results example:
|
||
|
|
# 128MB: $0.000021 per invocation, 3200ms duration
|
||
|
|
# 256MB: $0.000025 per invocation, 1600ms duration
|
||
|
|
# 512MB: $0.000031 per invocation, 800ms duration
|
||
|
|
# 1024MB: $0.000042 per invocation, 450ms duration
|
||
|
|
# Optimal: 512MB (best cost-performance balance)
|
||
|
|
```
|
||
|
|
|
||
|
|
**Cost Reduction Strategies**
|
||
|
|
1. Right-size memory allocation
|
||
|
|
2. Minimize cold starts (provisioned concurrency for critical paths)
|
||
|
|
3. Use ARM64 (Graviton2) - 20% cheaper
|
||
|
|
4. Optimize package size for faster cold starts
|
||
|
|
5. Use Lambda Layers for shared dependencies
|
||
|
|
|
||
|
|
**Graviton2 Migration**
|
||
|
|
```yaml
|
||
|
|
# SAM template with ARM64
|
||
|
|
Resources:
|
||
|
|
MyFunction:
|
||
|
|
Type: AWS::Serverless::Function
|
||
|
|
Properties:
|
||
|
|
Runtime: python3.11
|
||
|
|
Architectures:
|
||
|
|
- arm64 # 20% cost savings
|
||
|
|
```
|
||
|
|
|
||
|
|
### Container Optimization
|
||
|
|
|
||
|
|
**Fargate Pricing Optimization**
|
||
|
|
```
|
||
|
|
# Fargate Spot: Up to 70% discount
|
||
|
|
# Use for fault-tolerant workloads
|
||
|
|
|
||
|
|
ECS Service:
|
||
|
|
CapacityProviderStrategy:
|
||
|
|
- CapacityProvider: FARGATE_SPOT
|
||
|
|
Weight: 4
|
||
|
|
- CapacityProvider: FARGATE
|
||
|
|
Weight: 1
|
||
|
|
Base: 2 # Minimum on-demand tasks
|
||
|
|
```
|
||
|
|
|
||
|
|
**Right-Size Container Resources**
|
||
|
|
```yaml
|
||
|
|
# Analyze actual usage with Container Insights
|
||
|
|
resources:
|
||
|
|
requests:
|
||
|
|
memory: "256Mi" # Based on p95 usage + 20% buffer
|
||
|
|
cpu: "100m" # Based on p95 usage + 20% buffer
|
||
|
|
limits:
|
||
|
|
memory: "512Mi" # 2x requests for burst
|
||
|
|
cpu: "500m"
|
||
|
|
```
|
||
|
|
|
||
|
|
## Cost Allocation and Tagging
|
||
|
|
|
||
|
|
### Tagging Strategy
|
||
|
|
|
||
|
|
**Required Tags**
|
||
|
|
```yaml
|
||
|
|
# Terraform - enforce tags
|
||
|
|
variable "required_tags" {
|
||
|
|
default = {
|
||
|
|
environment = "prod"
|
||
|
|
cost-center = "engineering"
|
||
|
|
owner = "platform-team"
|
||
|
|
project = "api-gateway"
|
||
|
|
managed-by = "terraform"
|
||
|
|
}
|
||
|
|
}
|
||
|
|
|
||
|
|
resource "aws_instance" "example" {
|
||
|
|
ami = data.aws_ami.latest.id
|
||
|
|
instance_type = "t3.medium"
|
||
|
|
tags = var.required_tags
|
||
|
|
}
|
||
|
|
```
|
||
|
|
|
||
|
|
**Tag Enforcement**
|
||
|
|
```json
|
||
|
|
// AWS SCP - Deny untagged resources
|
||
|
|
{
|
||
|
|
"Version": "2012-10-17",
|
||
|
|
"Statement": [
|
||
|
|
{
|
||
|
|
"Sid": "DenyUntaggedEC2",
|
||
|
|
"Effect": "Deny",
|
||
|
|
"Action": "ec2:RunInstances",
|
||
|
|
"Resource": "arn:aws:ec2:*:*:instance/*",
|
||
|
|
"Condition": {
|
||
|
|
"Null": {
|
||
|
|
"aws:RequestTag/cost-center": "true"
|
||
|
|
}
|
||
|
|
}
|
||
|
|
}
|
||
|
|
]
|
||
|
|
}
|
||
|
|
```
|
||
|
|
|
||
|
|
### Cost Allocation Reports
|
||
|
|
|
||
|
|
**AWS Cost and Usage Report**
|
||
|
|
```bash
|
||
|
|
# Enable detailed billing reports
|
||
|
|
aws cur put-report-definition \
|
||
|
|
--report-definition '{
|
||
|
|
"ReportName": "detailed-cost-report",
|
||
|
|
"TimeUnit": "HOURLY",
|
||
|
|
"Format": "Parquet",
|
||
|
|
"Compression": "Parquet",
|
||
|
|
"S3Bucket": "my-billing-bucket",
|
||
|
|
"S3Region": "us-east-1",
|
||
|
|
"AdditionalArtifacts": ["ATHENA"]
|
||
|
|
}'
|
||
|
|
```
|
||
|
|
|
||
|
|
**Athena Queries for Analysis**
|
||
|
|
```sql
|
||
|
|
-- Cost by service and tag
|
||
|
|
SELECT
|
||
|
|
line_item_product_code as service,
|
||
|
|
resource_tags_user_cost_center as cost_center,
|
||
|
|
SUM(line_item_unblended_cost) as cost
|
||
|
|
FROM cost_report
|
||
|
|
WHERE month = '2024-01'
|
||
|
|
GROUP BY 1, 2
|
||
|
|
ORDER BY 3 DESC;
|
||
|
|
|
||
|
|
-- Unused Reserved Instances
|
||
|
|
SELECT
|
||
|
|
reservation_reservation_a_r_n,
|
||
|
|
reservation_unused_quantity,
|
||
|
|
reservation_unused_normalized_unit_quantity
|
||
|
|
FROM cost_report
|
||
|
|
WHERE reservation_unused_quantity > 0;
|
||
|
|
```
|
||
|
|
|
||
|
|
## Automation and Governance
|
||
|
|
|
||
|
|
### Automated Cost Controls
|
||
|
|
|
||
|
|
**AWS Budgets with Actions**
|
||
|
|
```yaml
|
||
|
|
# CloudFormation - Budget with auto-stop
|
||
|
|
Resources:
|
||
|
|
MonthlyCostBudget:
|
||
|
|
Type: AWS::Budgets::Budget
|
||
|
|
Properties:
|
||
|
|
Budget:
|
||
|
|
BudgetName: monthly-cost-limit
|
||
|
|
BudgetLimit:
|
||
|
|
Amount: 10000
|
||
|
|
Unit: USD
|
||
|
|
TimeUnit: MONTHLY
|
||
|
|
BudgetType: COST
|
||
|
|
NotificationsWithSubscribers:
|
||
|
|
- Notification:
|
||
|
|
NotificationType: ACTUAL
|
||
|
|
ComparisonOperator: GREATER_THAN
|
||
|
|
Threshold: 80
|
||
|
|
Subscribers:
|
||
|
|
- SubscriptionType: EMAIL
|
||
|
|
Address: finance@company.com
|
||
|
|
```
|
||
|
|
|
||
|
|
**Scheduled Scaling (Dev/Test)**
|
||
|
|
```yaml
|
||
|
|
# Stop non-prod resources nights/weekends
|
||
|
|
Resources:
|
||
|
|
ScaleDownSchedule:
|
||
|
|
Type: AWS::AutoScaling::ScheduledAction
|
||
|
|
Properties:
|
||
|
|
AutoScalingGroupName: !Ref DevASG
|
||
|
|
DesiredCapacity: 0
|
||
|
|
Recurrence: "0 20 * * MON-FRI" # 8 PM weekdays
|
||
|
|
|
||
|
|
ScaleUpSchedule:
|
||
|
|
Type: AWS::AutoScaling::ScheduledAction
|
||
|
|
Properties:
|
||
|
|
AutoScalingGroupName: !Ref DevASG
|
||
|
|
DesiredCapacity: 3
|
||
|
|
Recurrence: "0 8 * * MON-FRI" # 8 AM weekdays
|
||
|
|
```
|
||
|
|
|
||
|
|
### Cost Anomaly Detection
|
||
|
|
|
||
|
|
**AWS Cost Anomaly Detection**
|
||
|
|
```bash
|
||
|
|
# Create anomaly monitor
|
||
|
|
aws ce create-anomaly-monitor \
|
||
|
|
--anomaly-monitor '{
|
||
|
|
"MonitorName": "ServiceMonitor",
|
||
|
|
"MonitorType": "DIMENSIONAL",
|
||
|
|
"MonitorDimension": "SERVICE"
|
||
|
|
}'
|
||
|
|
|
||
|
|
# Create anomaly subscription
|
||
|
|
aws ce create-anomaly-subscription \
|
||
|
|
--anomaly-subscription '{
|
||
|
|
"SubscriptionName": "CostAlerts",
|
||
|
|
"MonitorArnList": ["arn:aws:ce::123456789:anomalymonitor/abc123"],
|
||
|
|
"Subscribers": [{"Type": "EMAIL", "Address": "alerts@company.com"}],
|
||
|
|
"Threshold": 100
|
||
|
|
}'
|
||
|
|
```
|
||
|
|
|
||
|
|
## Cost Metrics and KPIs
|
||
|
|
|
||
|
|
### Key Metrics
|
||
|
|
|
||
|
|
| Metric | Formula | Target |
|
||
|
|
|--------|---------|--------|
|
||
|
|
| Unit Cost | Total Cost / Business Metric | Decreasing |
|
||
|
|
| Coverage | Reserved Hours / Total Hours | >70% |
|
||
|
|
| Utilization | Used Reserved Hours / Purchased | >80% |
|
||
|
|
| Waste | Idle Resource Cost / Total Cost | <10% |
|
||
|
|
| Forecast Accuracy | Actual / Forecasted | 90-110% |
|
||
|
|
|
||
|
|
### Dashboard Example
|
||
|
|
|
||
|
|
```sql
|
||
|
|
-- Cost efficiency dashboard metrics
|
||
|
|
WITH metrics AS (
|
||
|
|
SELECT
|
||
|
|
date_trunc('month', usage_date) as month,
|
||
|
|
SUM(cost) as total_cost,
|
||
|
|
SUM(CASE WHEN reservation_arn IS NOT NULL THEN cost END) as reserved_cost,
|
||
|
|
COUNT(DISTINCT user_id) as active_users
|
||
|
|
FROM cloud_costs
|
||
|
|
GROUP BY 1
|
||
|
|
)
|
||
|
|
SELECT
|
||
|
|
month,
|
||
|
|
total_cost,
|
||
|
|
reserved_cost / total_cost as reservation_coverage,
|
||
|
|
total_cost / active_users as cost_per_user
|
||
|
|
FROM metrics;
|
||
|
|
```
|
||
|
|
|
||
|
|
## Quick Wins Checklist
|
||
|
|
|
||
|
|
**Immediate Savings (This Week)**
|
||
|
|
- [ ] Delete unused EBS volumes and snapshots
|
||
|
|
- [ ] Terminate stopped EC2 instances not needed
|
||
|
|
- [ ] Remove unused Elastic IPs
|
||
|
|
- [ ] Delete unused load balancers
|
||
|
|
- [ ] Review and delete old AMIs
|
||
|
|
|
||
|
|
**Short-Term (This Month)**
|
||
|
|
- [ ] Right-size underutilized instances
|
||
|
|
- [ ] Migrate gp2 volumes to gp3
|
||
|
|
- [ ] Implement S3 lifecycle policies
|
||
|
|
- [ ] Enable S3 Intelligent-Tiering
|
||
|
|
- [ ] Schedule dev/test environments
|
||
|
|
|
||
|
|
**Medium-Term (This Quarter)**
|
||
|
|
- [ ] Purchase Savings Plans for baseline
|
||
|
|
- [ ] Implement Spot for fault-tolerant workloads
|
||
|
|
- [ ] Set up cost allocation tags
|
||
|
|
- [ ] Enable Cost Anomaly Detection
|
||
|
|
- [ ] Establish FinOps practices
|