
4-Way Cross-Validation Methodology

🎯 Purpose

The cross-validation methodology ensures that inventory data is accurate before any decommission or cost-optimization action is taken. Four independent discovery layers are compared against each other; discrepancies are investigated before results are acted on.

Accuracy target: ≥99.5% MCP validation across financial and security layers.


1. The Four Validation Layers

| Layer | Tool / API | Scope | Primary Signal |
|---|---|---|---|
| Layer 1 | CLI Discovery: runbooks inventory workflow-multi-account (Resource Explorer) | Per-region enumeration, all resource types | Instance count, state, tags |
| Layer 2 | Config Aggregator: aws configservice list-discovered-resources | Org-wide, <10s for 50+ accounts | Configuration history, compliance state |
| Layer 3 | Cost Explorer: runbooks finops analyze-ec2 | Financial validation, must run via us-east-1 endpoint | 12-month cost trends, idle cost |
| Layer 4 | MCP Cross-Validation: runbooks inventory validate-mcp | API-to-API accuracy check | Delta between layers, confidence score |

1.1 Why Four Layers?

Each layer has a blind spot that another layer compensates for:

  • Resource Explorer discovers current state but has an eventual-consistency lag of up to 5 minutes after resource creation/deletion.
  • Config Aggregator provides historical configuration data and catches resources that Resource Explorer may miss due to indexing gaps in newly enabled regions.
  • Cost Explorer validates financial impact independently of the resource inventory: a resource with zero cost for 90 days is a decommission signal even if its state appears running.
  • MCP Cross-Validation compares the output of all three layers against a live AWS API call, producing a confidence score that gates the final recommendation.

2. Accuracy Thresholds

| Validation Domain | Pass Threshold | Warn Threshold | Fail Threshold | Consequence of Fail |
|---|---|---|---|---|
| Inventory count (resource discovery) | ≥95% | 90–94% | <90% | Block decommission report; re-run discovery |
| Financial data (cost enrichment) | ≥99.5% | 97–99.4% | <97% | Block cost-based scoring; re-run Cost Explorer |
| Security posture (compliance state) | ≥99.5% | 97–99.4% | <97% | Block security findings; escalate to security team |
| MCP accuracy (cross-validation delta) | ≥99.5% | 95–99.4% | <95% | Block final report generation |
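
The thresholds above can be expressed as a small lookup, shown here as a minimal sketch rather than the pipeline's actual implementation. The domain keys and the `classify_accuracy` name are hypothetical, and values that fall between the listed warn and pass bands (for example 94.5% for inventory) are assumed to be WARN.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    pass_min: float    # accuracy % at or above which the domain passes
    fail_below: float  # accuracy % below which the domain fails


# Numbers copied from the table above; the keys are hypothetical labels.
THRESHOLDS = {
    "inventory": Threshold(pass_min=95.0, fail_below=90.0),
    "financial": Threshold(pass_min=99.5, fail_below=97.0),
    "security": Threshold(pass_min=99.5, fail_below=97.0),
    "mcp": Threshold(pass_min=99.5, fail_below=95.0),
}


def classify_accuracy(domain: str, accuracy_pct: float) -> str:
    """Return PASS, WARN, or FAIL for one validation domain."""
    threshold = THRESHOLDS[domain]
    if accuracy_pct >= threshold.pass_min:
        return "PASS"
    if accuracy_pct < threshold.fail_below:
        return "FAIL"
    # Assumption: anything between the fail and pass bounds is a warning.
    return "WARN"
```

For example, `classify_accuracy("financial", 98.2)` falls in the warn band, so cost-based scoring would proceed only with a caution note.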

2.1 Variance Thresholds

When Layer 1 and Layer 2 counts are compared:

| Variance | Status | Action |
|---|---|---|
| ≤1% | PASS | Proceed to scoring |
| >1% to 5% | WARN | Investigate discrepancy; proceed with caution note in report |
| >5% | FAIL | Stop pipeline; run discrepancy resolution procedure (Section 5) |
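
The document does not state how the variance percentage is computed, so the sketch below assumes the absolute difference between the two counts divided by the larger count. The function names are hypothetical; swap in the pipeline's own formula if it differs.

```python
def variance_pct(layer1_count: int, layer2_count: int) -> float:
    """Percentage difference between two counts, relative to the larger one.

    Assumption: the baseline is the larger of the two counts.
    """
    largest = max(layer1_count, layer2_count)
    if largest == 0:
        return 0.0  # both layers found nothing, so there is no disagreement
    return abs(layer1_count - layer2_count) * 100.0 / largest


def classify_variance(layer1_count: int, layer2_count: int) -> str:
    """Map the Layer 1 vs Layer 2 variance to PASS, WARN, or FAIL."""
    variance = variance_pct(layer1_count, layer2_count)
    if variance <= 1.0:
        return "PASS"
    if variance <= 5.0:
        return "WARN"
    return "FAIL"
```

With 100 instances in Layer 1 and 97 in Layer 2, the variance is 3%, which lands in the WARN band.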

3. Quickstart Commands

Single-account prerequisites: $AWS_PROFILE configured with read-only access.

Step 1 - Layer 1: CLI Discovery

runbooks inventory workflow-single-account \
  --ops-profile $AWS_PROFILE \
  --regions ap-southeast-2 \
  --export csv

Step 2 - Layer 3: Cost Validation (always us-east-1 for Cost Explorer)

runbooks finops analyze-ec2 \
  --profile $AWS_PROFILE \
  --export csv

Step 3 - Layer 4: MCP Cross-Validation

runbooks inventory validate-mcp \
  --ops-profile $AWS_PROFILE \
  --validation-level mcp

Step 4 - Review validation report

# Pretty-print the generated JSON validation report
python3 -m json.tool data/outputs/mcp-validation-report.json

Multi-account prerequisites: three profiles configured ($AWS_OPERATIONS_PROFILE, $AWS_MANAGEMENT_PROFILE, $AWS_BILLING_PROFILE).

Step 1 - Layer 1: CLI Discovery (Resource Explorer)

runbooks inventory workflow-multi-account \
  --ops-profile $AWS_OPERATIONS_PROFILE \
  --mgmt-profile $AWS_MANAGEMENT_PROFILE \
  --export csv

Step 2 - Layer 2: Config Aggregator (org-wide)

aws configservice list-discovered-resources \
  --resource-type AWS::EC2::Instance \
  --profile $AWS_MANAGEMENT_PROFILE \
  --region ap-southeast-2 \
  --no-include-deleted-resources \
  --query 'resourceIdentifiers | length(@)'

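Config Aggregator runs org-wide in under 10 seconds for 50+ accounts. Compare the returned count with the Layer 1 output; a variance above 5% triggers the discrepancy resolution procedure (Section 5). Note that list-discovered-resources reports only resources recorded in the calling account and region; if your setup relies on a Config aggregator, the org-wide equivalent is aws configservice list-aggregate-discovered-resources with --configuration-aggregator-name.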

Step 3 - Layer 3: Cost Explorer Financial Validation

runbooks finops analyze-ec2 \
  --profile $AWS_BILLING_PROFILE \
  --export csv

Step 4 - Layer 4: MCP Cross-Validation

runbooks inventory validate-mcp \
  --ops-profile $AWS_OPERATIONS_PROFILE \
  --mgmt-profile $AWS_MANAGEMENT_PROFILE \
  --billing-profile $AWS_BILLING_PROFILE \
  --validation-level mcp

Step 5 - Decommission scoring (only after validation PASS)

runbooks inventory score-decommission \
  --input data/outputs/ec2-activity.csv \
  --resource-type ec2

4. Evidence Artifacts

Each pipeline run produces the following artifacts in data/outputs/:

| Artifact | Produced By | Content | Consumer |
|---|---|---|---|
| ec2-inventory.csv | Layer 1 (CLI Discovery) | Instance list with state, type, region, tags | All personas |
| ec2-costs.csv | Layer 3 (Cost Explorer) | 12-month cost per instance, idle cost signal | CFO, FinOps Lead |
| ec2-activity.csv | Layer 4 (Activity enrichment) | E1–E7 signals (CloudTrail, CloudWatch, SSM, Compute Optimizer) | CloudOps Lead, CTO |
| ec2-scored.csv | Scoring (after validation) | Decommission score 0–100, MUST/SHOULD/COULD/KEEP tier | CFO, FinOps Lead |
| mcp-validation-report.json | Layer 4 (MCP Cross-Validation) | Layer-by-layer delta, confidence score, accuracy % | QA, Audit Teams |
| persona-report-{mode}.md | Workflow pipeline | Role-specific Markdown summary | HITL (CFO/CTO/CloudOps/FinOps) |

5. Discrepancy Resolution Procedure

When Layer 1 vs Layer 2 variance exceeds 5%:

Step 1 - Identify the gap

# Count Resource Explorer rows (Layer 1); tail skips the first line, assuming the CSV has one header row
tail -n +2 data/outputs/ec2-inventory.csv | wc -l

# Count Config Aggregator resources (Layer 2)
aws configservice list-discovered-resources \
  --resource-type AWS::EC2::Instance \
  --profile $AWS_MANAGEMENT_PROFILE \
  --region ap-southeast-2 \
  --query 'resourceIdentifiers | length(@)'
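
Counts show how large the gap is but not which instances cause it. The sketch below diffs the two ID sets instead. It assumes the Layer 1 CSV has a column named `instance_id` (a hypothetical name; check the actual header) and that the Config command was run without the `--query` length filter, so its JSON output contains the full `resourceIdentifiers` list with a `resourceId` per entry.

```python
import csv
import io
import json


def layer_gap(inventory_csv_text: str, config_json_text: str,
              id_column: str = "instance_id") -> dict:
    """Return the instance IDs that appear in only one of the two layers."""
    # Layer 1: IDs from the exported CSV (the column name is an assumption).
    reader = csv.DictReader(io.StringIO(inventory_csv_text))
    layer1_ids = {row[id_column] for row in reader if row.get(id_column)}

    # Layer 2: IDs from the unfiltered list-discovered-resources JSON output.
    payload = json.loads(config_json_text)
    layer2_ids = {item["resourceId"]
                  for item in payload.get("resourceIdentifiers", [])}

    return {
        "only_in_layer1": sorted(layer1_ids - layer2_ids),
        "only_in_layer2": sorted(layer2_ids - layer1_ids),
    }
```

IDs listed under `only_in_layer2` are the natural candidates for the indexing and consistency checks in the following steps.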

Step 2 - Check for recently created/terminated resources

Resource Explorer has an eventual-consistency lag of up to 5 minutes. If resources were created or terminated within the last 10 minutes, wait and re-run Layer 1.

# Check for recently modified resources
aws resourceexplorer2 search \
  --query-string "resourcetype:ec2:instance" \
  --profile $AWS_OPERATIONS_PROFILE \
  --region ap-southeast-2 \
  --query 'Resources[?LastReportedAt > `2024-01-01`].{Id:Arn,Last:LastReportedAt}'
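
The waiting rule in this step can be sketched as a simple check over change timestamps. The 10-minute window comes from the text above; the function name and the idea of passing in a list of timezone-aware timestamps (for example parsed from `LastReportedAt`) are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Window from the procedure: re-run Layer 1 if anything changed this recently.
SETTLE_WINDOW = timedelta(minutes=10)


def should_wait_for_index(change_times, now=None, window=SETTLE_WINDOW) -> bool:
    """Return True if any resource changed inside the settle window.

    change_times must be timezone-aware datetimes.
    """
    current = now or datetime.now(timezone.utc)
    return any(current - changed < window for changed in change_times)
```

If this returns True, the safer choice is to pause until the window has elapsed and then repeat the Layer 1 discovery before comparing counts again.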

Step 3 - Check for regions not indexed by Resource Explorer

aws resourceexplorer2 list-indexes \
  --profile $AWS_OPERATIONS_PROFILE \
  --region ap-southeast-2 \
  --query 'Indexes[*].{Region:Region,Type:Type,State:State}'

Regions that are missing from this list, or whose index is not in the ACTIVE state, will appear in Config Aggregator results but not in Layer 1.
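
To find the regions that need attention, the index list can be compared with the regions you expect to cover. This is a minimal sketch: it assumes the list-indexes command was run with JSON output and without the `--query` filter, so the payload has an `Indexes` list whose entries carry a `Region` field. The function name and the expected-regions input are hypothetical.

```python
import json


def unindexed_regions(list_indexes_json: str, expected_regions) -> list:
    """Return expected regions that have no Resource Explorer index."""
    payload = json.loads(list_indexes_json)
    indexed = {index["Region"] for index in payload.get("Indexes", [])}
    return sorted(set(expected_regions) - indexed)
```

Any region returned here is a likely explanation for resources that show up in Layer 2 but not in Layer 1.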

Step 4 - Resolve and re-validate

Once the gap is explained, either:

  • Enable Resource Explorer indexing in missing regions, then re-run Layer 1; or
  • Document the known gap in the validation report with a justification note.

Step 5 - Accept with documented variance

If variance is 1–5% (WARN) and the root cause is understood (e.g., a terminated instance in Config history not yet purged), proceed by recording the justification with --variance-note:

runbooks inventory validate-mcp \
  --ops-profile $AWS_OPERATIONS_PROFILE \
  --mgmt-profile $AWS_MANAGEMENT_PROFILE \
  --billing-profile $AWS_BILLING_PROFILE \
  --validation-level mcp \
  --variance-note "3 recently-terminated instances in Config history only"

6. Validation Quality Gates

Before a decommission recommendation is sent to the HITL manager, all of the following gates must pass:

  • Layer 1 vs Layer 2 variance ≤5% (or variance documented with root cause)
  • Layer 3 financial accuracy ≥99.5% (MCP validation score)
  • Layer 4 MCP cross-validation score ≥99.5%
  • mcp-validation-report.json written to data/outputs/
  • Persona reports generated for all intended stakeholders
  • No FAIL status in any accuracy threshold column
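
The gate list above can be evaluated mechanically before anything is sent for review. The sketch below is illustrative only: the `GateInputs` fields and function names are hypothetical, and the caller is assumed to supply the values from the validation report and the file system.

```python
from dataclasses import dataclass


@dataclass
class GateInputs:
    variance_pct: float              # Layer 1 vs Layer 2 variance
    variance_documented: bool        # root cause recorded for the variance
    financial_accuracy_pct: float    # Layer 3 score
    mcp_accuracy_pct: float          # Layer 4 score
    report_written: bool             # mcp-validation-report.json exists
    persona_reports_generated: bool  # all intended persona reports exist
    any_threshold_failed: bool       # any FAIL in the accuracy thresholds


def evaluate_gates(inputs: GateInputs) -> dict:
    """Return a pass/fail flag per gate, in the order listed above."""
    return {
        "variance": inputs.variance_pct <= 5.0 or inputs.variance_documented,
        "financial": inputs.financial_accuracy_pct >= 99.5,
        "mcp": inputs.mcp_accuracy_pct >= 99.5,
        "report_written": inputs.report_written,
        "persona_reports": inputs.persona_reports_generated,
        "no_fail_status": not inputs.any_threshold_failed,
    }


def ready_for_hitl(inputs: GateInputs) -> bool:
    """True only when every gate passes."""
    return all(evaluate_gates(inputs).values())
```

Returning the per-gate dictionary alongside the overall flag makes it easy to report which specific gate blocked a recommendation.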

Related pages: Quickstart | Persona Guides | Index