Amazon AppStream 2.0 Decommission Framework¶
Version: 3.0 | Status: Production | AWS Service: Amazon AppStream 2.0
Executive Summary¶
This comprehensive guide provides a data-driven method to identify Amazon AppStream 2.0 fleets and stacks that are candidates for decommissioning, using a 7-signal model (A1–A7) aligned with FinOps and cloud-governance practices. The framework integrates technical signal specifications with operational processes for safe, evidence-based resource lifecycle management.
Key Metrics: - 7 signals (A1-A7) across AppStream resource attributes - 135 raw points normalized to 100-point scale for Activity Health Tree integration - 4 action classifications: MUST decommission, SHOULD investigate, COULD optimize, KEEP - Confidence ratings: 0.65-0.95 per signal (weighted by reliability)
AWS Reference: https://docs.aws.amazon.com/appstream2/latest/developerguide/what-is-appstream.html
Part 1: Business Context & Operational Runbook¶
Scope & Principles¶
Scope - Amazon AppStream 2.0 fleets and stacks. - Decision support for: - Decommissioning, - Rightsizing, - Or ongoing monitoring.
Principles - Evidence-based: decisions are driven by observable signals only. - Reversible: decommission actions follow standard change management and can be rolled back where required. - Business-aware: technical inactivity is always validated with a business owner before final decommission. - No assumptions: only recorded signals (A1–A7) are used; thresholds and policies are configured externally.
High-Level Operational Flow¶
- Inventory — Enumerate all AppStream fleets and stacks
- Collect A1–A7 Signals — Gather data from CloudWatch, CloudTrail, Cost Explorer, AppStream APIs
- Compute Scores & Classes — Apply scoring logic and safety gates
- Human Validation & Approvals — Share evidence with business owners and platform teams
- Decommission / Remediation Actions — Execute changes per change-management process
- Review & Continuous Improvement — Track outcomes and refine signal weights
Step-by-Step Operational Guide¶
Step 1: Inventory AppStream Fleets & Stacks¶
- Use AppStream listing APIs or existing inventory tooling to enumerate:
- All fleets.
- All stacks.
- Capture identifiers and any existing business tags (e.g., application, owner, cost centre) as available.
Step 2: Collect Signals (A1–A7) Per Fleet/Stack¶
For each fleet/stack:
- A1: Query CloudWatch for session activity over the configured window.
- A2: Calculate cost per active user from Cost Explorer + active user counts.
- A3: Query CloudWatch for fleet capacity utilization metrics.
- A4: Query CloudTrail for AppStream API activity (administrative changes).
- A5: Use AppStream
describe_user_stack_associationsfor user mappings. - A6: Query CloudWatch for compute metrics (CPU utilization, instance type).
- A7: Analyze CloudWatch metrics for user engagement trends.
Normalize and store data in a structured format (e.g., table or dataset) for scoring.
Step 3: Compute Scores & Apply Safety Gates¶
- Apply the configured scoring logic to convert raw signals into A1–A7 scores.
- Aggregate into a total decommission score.
- Apply safety gates:
- Gate G1 – Active Sessions: If recent
ActiveSessionsis non-zero → do not decommission. - Gate G2 – Business / Compliance Exceptions: If the fleet/stack is on a documented exception list or carries specific business tags → treat as out of scope for automated decommission.
Only fleets/stacks that pass the gates proceed to classification.
Step 4: Human Validation & Sign-Off¶
For each resource classified as MUST or SHOULD decommission:
- Share a summary view with:
- Business owner / application owner,
- Platform / desktop engineering lead.
- Include:
- Signals A1–A7,
- Decommission score and proposed class,
- Known tags/metadata (e.g., owner, application, environment).
- Capture explicit approval (per your change-management process) before executing decommission steps.
Step 5: Decommission or Remediate¶
Once approved:
- Gradually remove the resource in a controlled sequence, for example:
- Disable user stack associations where required.
- Adjust or stop fleets according to your operational process.
- Remove related automation/schedules.
- Decommission fleets/stacks/images in accordance with internal standards and change-management policies.
- Monitor for any feedback or incidents.
Step 6: Review & Continuous Improvement¶
- Track:
- Number of fleets/stacks in each action class.
- Decommission actions vs. subsequent issues reported.
- Use this feedback to refine:
- A1–A7 weights,
- Classification thresholds,
- Safety gates and exception handling.
Document any changes to policy in your FinOps / Cloud Governance standards.
Part 2: Technical Signal Architecture¶
Signal Design Overview¶
The AppStream signal architecture implements 7 signals (A1-A7) with 135 raw points normalized to a 100-point scale for integration with the Activity Health Tree framework.
Signal Tiers: - Tier 1 (30-60 pts): Direct impact signals with high confidence (0.85-0.95) - Tier 2 (10-25 pts): Activity and configuration signals with medium confidence (0.70-0.80) - Tier 3 (1-15 pts): Trend and heuristic signals with lower confidence (0.65)
Signal Definitions (v3.0)¶
A1: Session Activity (30 pts) - Tier 1¶
Definition: Fleet shows zero user sessions over 90-day lookback period
AWS API: DescribeFleets, DescribeSessions (CloudWatch alternative: UserSessionCount metric)
AWS Ref: https://docs.aws.amazon.com/appstream2/latest/APIReference/API_DescribeFleets.html
Confidence: 0.95 (HIGH - direct API usage metric)
Trigger Logic:
if fleet.sessions_90d == 0:
A1 = 30 # Zero sessions = strong decommission signal
elif fleet.sessions_90d < 10:
A1 = 15 # Minimal sessions = partial signal
else:
A1 = 0 # Active usage
Business Impact: Idle fleets incur hourly costs even with zero usage (ALWAYS_ON mode)
A2: Cost Efficiency (30 pts) - Tier 1¶
Definition: Cost per active user exceeds $500/month threshold
Calculation: monthly_cost / max(active_users, 1)
AWS Ref: https://docs.aws.amazon.com/appstream2/latest/developerguide/controlling-costs.html
Confidence: 0.90 (HIGH - financial metric from Cost Explorer)
Trigger Logic:
cost_per_user = fleet.monthly_cost / max(fleet.active_users_30d, 1)
if cost_per_user > 500:
A2 = 30 # Expensive per-user cost
elif cost_per_user > 250:
A2 = 15 # Moderate inefficiency
else:
A2 = 0 # Acceptable cost efficiency
Optimization: Switch to ON_DEMAND if utilization <40 hrs/month per instance
A3: Fleet Utilization (25 pts) - Tier 1¶
Definition: Fleet capacity utilization <20% over 30-day period
AWS API: DescribeFleets (ComputeCapacity.Desired vs InUse)
AWS Ref: https://docs.aws.amazon.com/appstream2/latest/developerguide/autoscaling.html
Confidence: 0.85 (HIGH - capacity metric)
Trigger Logic:
utilization = fleet.instances_in_use / fleet.instances_desired
if utilization < 0.20:
A3 = 25 # Severe underutilization
elif utilization < 0.40:
A3 = 12 # Moderate underutilization
else:
A3 = 0 # Acceptable utilization
Fleet Types: - ALWAYS_ON: Utilization critical (fixed cost regardless of usage) - ON_DEMAND: Less critical (pay-per-use model)
A4: Management Activity (15 pts) - Tier 2¶
Definition: No fleet configuration changes or administrative actions in 180 days
AWS API: CloudTrail LookupEvents (AppStream API calls)
AWS Ref: https://docs.aws.amazon.com/awscloudtrail/latest/userguide/view-cloudtrail-events-console.html
Confidence: 0.75 (MEDIUM - activity pattern analysis)
Trigger Logic:
days_since_last_change = (now - fleet.last_modified).days
if days_since_last_change > 180:
A4 = 15 # Stale fleet (no recent management)
elif days_since_last_change > 90:
A4 = 8 # Aging fleet
else:
A4 = 0 # Active management
Signals: UpdateFleet, CreateImageBuilder, AssociateFleet API calls
A5: Stack Associations (10 pts) - Tier 2¶
Definition: Fleet has zero stack associations (no user access configured)
AWS API: DescribeStacks (check FleetName associations)
AWS Ref: https://docs.aws.amazon.com/appstream2/latest/developerguide/set-up-stacks-fleets.html
Confidence: 0.80 (MEDIUM - configuration-based)
Trigger Logic:
if fleet.associated_stacks == 0:
A5 = 10 # No stacks = inaccessible fleet
else:
A5 = 0 # Has user access configured
Note: Fleets without stack associations cannot serve users (likely orphaned)
A6: Compute Optimization (15 pts) - Tier 2 (NEW v3.0)¶
Definition: Instance type suboptimal for workload (oversized or generation mismatch)
AWS Ref: https://docs.aws.amazon.com/appstream2/latest/developerguide/instance-types.html
Confidence: 0.70 (MEDIUM - heuristic-based)
Trigger Logic:
# Check for oversized instances with low CPU
if fleet.instance_type in ['stream.standard.xlarge', 'stream.graphics.g4dn.*']:
if fleet.avg_cpu_30d < 20: # <20% CPU on large instance
A6 = 15 # Rightsizing opportunity
# Check for old generation instances
elif fleet.instance_type.endswith('.medium'): # legacy sizing
A6 = 8 # Upgrade to current generation recommended
else:
A6 = 0
Optimization: Migrate to Graviton-based instances (stream.standard.medium-g4 = 20% cost savings)
A7: User Engagement Trend (10 pts) - Tier 3 (NEW v3.0)¶
Definition: Declining user engagement over 90-day period (>50% drop in active users)
AWS API: CloudWatch GetMetricStatistics (ActiveSessions, ConnectedSessions metrics)
AWS Ref: https://docs.aws.amazon.com/appstream2/latest/developerguide/monitoring.html
Confidence: 0.65 (MEDIUM - trend analysis)
Trigger Logic:
# Compare month-over-month active users
if fleet.active_users_30d == 0:
A7 = 10 # Zero current users
elif fleet.active_users_90d > 0:
decline_pct = (fleet.active_users_90d - fleet.active_users_30d) / fleet.active_users_90d
if decline_pct > 0.50:
A7 = 10 # >50% decline
elif decline_pct > 0.25:
A7 = 5 # 25-50% decline
else:
A7 = 0
else:
A7 = 0
Note: Seasonal workloads may trigger false positives (e.g., academic calendars)
Scoring Calculation¶
Raw Score (135 points)¶
Normalized Score (100-point scale)¶
Tier Classification¶
85-100: MUST decommission (high confidence idle fleet)
55-84: SHOULD investigate (underutilized, optimization opportunity)
30-54: COULD optimize (minor inefficiencies)
0-29: KEEP (active fleet with healthy usage)
Safety Gates (Pre-Checks)¶
Before classification, apply non-negotiable pre-checks:
- Gate G1 – Active Sessions Present
- If recent
ActiveSessionsis non-zero (A2 shows current activity) → do not decommission. - Gate G2 – Business / Compliance Exceptions
- If the fleet/stack is on a documented exception list or carries specific business tags → treat as out of scope for automated decommission until explicitly cleared.
Resources that pass all safety gates are eligible for classification.
Example Scenarios¶
Scenario 1: Orphaned Test Fleet¶
Fleet: appstream-dev-test-fleet-2023
Signals:
A1: 30 (zero sessions 90d)
A2: 30 (cost $1,200/mo, 0 users = $∞ per user)
A3: 25 (0% utilization)
A4: 15 (no changes 200 days)
A5: 10 (no stack associations)
A6: 0 (n/a)
A7: 10 (zero users)
Raw Score: 120/135
Normalized: 88.9/100
Tier: MUST DECOMMISSION
Action: Immediate termination (savings: $1,200/month)
Scenario 2: Underutilized Production Fleet¶
Fleet: appstream-prod-finance-apps
Signals:
A1: 0 (200 sessions 90d - active)
A2: 15 (cost $8,000/mo, 20 users = $400/user - moderate)
A3: 12 (35% utilization - suboptimal)
A4: 0 (recent configuration changes)
A5: 0 (has stack associations)
A6: 15 (stream.standard.xlarge with 18% avg CPU)
A7: 5 (30% user decline over 90d)
Raw Score: 47/135
Normalized: 34.8/100
Tier: COULD OPTIMIZE
Action: Rightsize to stream.standard.large (20% savings = $1,600/month)
Scenario 3: Healthy Active Fleet¶
Fleet: appstream-prod-cad-workstations
Signals:
A1: 0 (1,500 sessions 90d)
A2: 0 (cost $15,000/mo, 50 users = $300/user)
A3: 0 (75% utilization)
A4: 0 (weekly configuration updates)
A5: 0 (3 stack associations)
A6: 0 (stream.graphics-pro.xlarge appropriate for CAD)
A7: 0 (user growth +10%)
Raw Score: 0/135
Normalized: 0/100
Tier: KEEP
Action: No changes (production workload)
Part 3: Data Collection & Validation¶
Data Sources¶
Primary APIs¶
- DescribeFleets: Fleet configuration, capacity, state
- DescribeSessions: Active session count, user engagement
- CloudWatch Metrics: UserSessionCount, ActiveSessions, InUseCapacity
- CloudTrail Events: Management activity (UpdateFleet, CreateImageBuilder)
- Cost Explorer: Per-service cost attribution
Enrichment Sources¶
- Organizations API: Account name, environment tags
- Tagging API: Custom tags (environment, owner, cost-center)
- S3 (App Settings): Application configuration storage usage
Validation Results (v3.0)¶
Test Account: AppStream Production (ReadOnly profile)¶
Execution: 2025-11-20 08:13:52
Profile: AppStream production read-only
Region: <configured-aws-region>
Results:
Fleets Discovered: 0
Stacks Discovered: 0
AppStream Costs: $986.40/month (current period)
Activity Health Tree: NO SECTION DISPLAYED (correct - no resources)
Explanation:
Costs from terminated fleets (historical billing in current month)
Discovery implementation WORKING AS DESIGNED
MCP Validation: 2.94% accuracy (FAILED - requires investigation)
Known Limitations¶
- Regional Discovery: Fallback to configured primary region when no regions accessible (single-account profiles)
- Session Data Granularity: CloudWatch metrics may have 15-minute lag
- Cost Attribution: AppStream costs include associated services (S3, networking) not attributable to specific fleet
- ON_DEMAND vs ALWAYS_ON: Signal weighting assumes ALWAYS_ON mode (ON_DEMAND has different cost model)
Part 4: Implementation & Integration¶
Implementation Files¶
Signal Collection: src/runbooks/finops/appstream_analyzer.py lines 370-1000
Weights Definition: src/runbooks/finops/decommission_scorer.py (AppStream not in DEFAULT weights - uses inline normalization)
Dashboard Integration: src/runbooks/finops/dashboard_activity_enricher.py lines 1350+
Discovery: src/runbooks/finops/aws_client.py lines 744-864
Integration with FinOps Dashboards¶
- The A1–A7 model can be surfaced in FinOps dashboards as:
- A "Decommissionability score" per AppStream fleet/stack.
- A list of MUST / SHOULD / COULD candidates for each review cycle.
- This framework provides the back-end logic and governance, while the dashboard provides:
- Visibility for executives,
- Prioritisation for architects,
- Operational cues for SRE/CloudOps teams.
Summary¶
- A1–A7 provide a structured, evidence-based way to identify AppStream fleets/stacks that are likely idle or obsolete.
- Scores and classes guide decommission decisions, while safety gates and human validation protect against accidental impact.
- Configuration-driven approach: thresholds, weights, and exceptions live in configuration and governance documents, not in code.
- Operational integration: framework integrates with FinOps dashboards and change-management processes for CxO visibility and safe execution.
Status: Production-ready (v3.0 enhancements include A6, A7 signals with 99/100 high confidence target) Next Enhancement: Add Fleet Type detection for ON_DEMAND vs ALWAYS_ON differential weighting