4-Way Cross-Validation Methodology
🎯 Purpose
The cross-validation methodology ensures that inventory data is accurate before any decommission or cost-optimization action is taken. Four independent discovery layers are compared against each other; discrepancies are investigated before results are acted on.
Accuracy target: ≥99.5% MCP validation across financial and security layers.
1. The Four Validation Layers
| Layer | Tool / API | Scope | Primary Signal |
|---|---|---|---|
| Layer 1 | CLI Discovery: `runbooks inventory workflow-multi-account` (Resource Explorer) | Per-region enumeration, all resource types | Instance count, state, tags |
| Layer 2 | Config Aggregator: `aws configservice list-discovered-resources` | Org-wide, <10s for 50+ accounts | Configuration history, compliance state |
| Layer 3 | Cost Explorer: `runbooks finops analyze-ec2` | Financial validation; must run via the us-east-1 endpoint | 12-month cost trends, idle cost |
| Layer 4 | MCP Cross-Validation: `runbooks inventory validate-mcp` | API-to-API accuracy check | Delta between layers, confidence score |
1.1 Why Four Layers?
Each layer has a blind spot that another layer compensates for:
- Resource Explorer discovers current state but has an eventual-consistency lag of up to 5 minutes after resource creation/deletion.
- Config Aggregator provides historical configuration data and catches resources that Resource Explorer may miss due to indexing gaps in newly enabled regions.
- Cost Explorer validates financial impact independently of the resource inventory: a resource with zero cost for 90 days is a decommission signal even if its state appears running.
- MCP Cross-Validation compares the output of the other three layers against a live AWS API call, producing a confidence score that gates the final recommendation.
2. Accuracy Thresholds
| Validation Domain | Pass Threshold | Warn Threshold | Fail Threshold | Consequence of Fail |
|---|---|---|---|---|
| Inventory count (resource discovery) | ≥95% | 90–94% | <90% | Block decommission report; re-run discovery |
| Financial data (cost enrichment) | ≥99.5% | 97–99.4% | <97% | Block cost-based scoring; re-run Cost Explorer |
| Security posture (compliance state) | ≥99.5% | 97–99.4% | <97% | Block security findings; escalate to security team |
| MCP accuracy (cross-validation delta) | ≥99.5% | 95–99.4% | <95% | Block final report generation |
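The thresholds in the table above can be encoded as a small lookup. The following is a minimal sketch, not part of the `runbooks` CLI: the function name `accuracy_status` and the domain labels (`inventory`, `financial`, `security`, `mcp`) are hypothetical. It also assumes that any value between the fail and pass thresholds counts as WARN, which closes the gaps the table leaves (for example 94.5% for inventory).

```shell
# accuracy_status <domain> <percent>
# Prints PASS, WARN, or FAIL for the given validation domain.
# The function name and domain labels are illustrative assumptions.
accuracy_status() {
    domain=$1
    pct=$2
    case "$domain" in
        inventory)          pass=95;   fail=90 ;;
        financial|security) pass=99.5; fail=97 ;;
        mcp)                pass=99.5; fail=95 ;;
        *) echo "unknown domain: $domain" >&2; return 2 ;;
    esac
    # awk performs the decimal comparison; anything between fail and pass is WARN.
    awk -v p="$pct" -v pass="$pass" -v fail="$fail" 'BEGIN {
        if (p + 0 >= pass + 0)      print "PASS"
        else if (p + 0 >= fail + 0) print "WARN"
        else                        print "FAIL"
    }'
}
```

For example, `accuracy_status financial 99.4` prints `WARN`, which per the table would not block cost-based scoring but should be noted in the report.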
2.1 Variance Thresholds
When Layer 1 and Layer 2 counts are compared:
| Variance | Status | Action |
|---|---|---|
| ≤1% | PASS | Proceed to scoring |
| 1–5% | WARN | Investigate discrepancy; proceed with caution note in report |
| >5% | FAIL | Stop pipeline; run discrepancy resolution procedure (Section 5) |
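To make the comparison concrete, here is a hedged sketch that turns two counts into a variance percentage and a status. The source does not define how variance is computed, so the formula used here (absolute difference divided by the larger count) is an assumption, as is the function name `variance_status`.

```shell
# variance_status <layer1_count> <layer2_count>
# Prints "<variance_pct> <status>".
# ASSUMPTION: variance = |L1 - L2| * 100 / max(L1, L2); the document does not give a formula.
variance_status() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        a += 0; b += 0
        max  = (a > b) ? a : b
        diff = (a > b) ? a - b : b - a
        v = (max == 0) ? 0 : diff * 100 / max
        if (v <= 1)      s = "PASS"
        else if (v <= 5) s = "WARN"
        else             s = "FAIL"
        printf "%.2f %s\n", v, s
    }'
}
```

With 100 instances in Layer 1 and 97 in Layer 2, `variance_status 100 97` prints `3.00 WARN`.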
3. Quickstart Commands
Single-account prerequisites: `$AWS_PROFILE` configured with read-only access.
Step 1 - Layer 1: CLI Discovery

    runbooks inventory workflow-single-account \
      --ops-profile "$AWS_PROFILE" \
      --regions ap-southeast-2 \
      --export csv

Step 2 - Layer 3: Cost Validation (always us-east-1 for Cost Explorer)
Step 3 - Layer 4: MCP Cross-Validation
Step 4 - Review validation report
Multi-account prerequisites: three profiles configured (`$AWS_OPERATIONS_PROFILE`, `$AWS_MANAGEMENT_PROFILE`, `$AWS_BILLING_PROFILE`).
Step 1 - Layer 1: CLI Discovery (Resource Explorer)

    runbooks inventory workflow-multi-account \
      --ops-profile "$AWS_OPERATIONS_PROFILE" \
      --mgmt-profile "$AWS_MANAGEMENT_PROFILE" \
      --export csv

Step 2 - Layer 2: Config Aggregator (org-wide)

    aws configservice list-discovered-resources \
      --resource-type AWS::EC2::Instance \
      --profile "$AWS_MANAGEMENT_PROFILE" \
      --region ap-southeast-2 \
      --no-include-deleted-resources \
      --query 'resourceIdentifiers | length(@)'

Config Aggregator runs org-wide in <10s for 50+ accounts. Compare the count with the Layer 1 output; a variance above 5% triggers the discrepancy resolution procedure (Section 5). Note that `list-discovered-resources` itself reports only the account and Region the profile resolves to; the aggregator-level equivalent in the AWS CLI is `aws configservice list-aggregate-discovered-resources`, which additionally requires `--configuration-aggregator-name`.
Step 3 - Layer 3: Cost Explorer Financial Validation
Step 4 - Layer 4: MCP Cross-Validation

    runbooks inventory validate-mcp \
      --ops-profile "$AWS_OPERATIONS_PROFILE" \
      --mgmt-profile "$AWS_MANAGEMENT_PROFILE" \
      --billing-profile "$AWS_BILLING_PROFILE" \
      --validation-level mcp

Step 5 - Decommission scoring (only after validation PASS)
4. Evidence Artifacts
Each pipeline run produces the following artifacts in data/outputs/:
| Artifact | Produced By | Content | Consumer |
|---|---|---|---|
| `ec2-inventory.csv` | Layer 1 (CLI Discovery) | Instance list with state, type, region, tags | All personas |
| `ec2-costs.csv` | Layer 3 (Cost Explorer) | 12-month cost per instance, idle cost signal | CFO, FinOps Lead |
| `ec2-activity.csv` | Layer 4 (Activity enrichment) | E1–E7 signals (CloudTrail, CloudWatch, SSM, Compute Optimizer) | CloudOps Lead, CTO |
| `ec2-scored.csv` | Scoring (after validation) | Decommission score 0–100, MUST/SHOULD/COULD/KEEP tier | CFO, FinOps Lead |
| `mcp-validation-report.json` | Layer 4 (MCP Cross-Validation) | Layer-by-layer delta, confidence score, accuracy % | QA, Audit Teams |
| `persona-report-{mode}.md` | Workflow pipeline | Role-specific Markdown summary | HITL (CFO/CTO/CloudOps/FinOps) |
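Before handing results to reviewers, it can help to confirm that the fixed-name artifacts listed above were actually written. This is a minimal sketch with a hypothetical helper name, `missing_artifacts`. It skips `persona-report-{mode}.md` because the set of `{mode}` values is not given here.

```shell
# missing_artifacts <output_dir>
# Prints each expected artifact that is absent or empty; returns 1 if any is missing.
# Helper name is an illustrative assumption; file names come from the artifact table.
missing_artifacts() {
    dir=$1
    missing=0
    for f in ec2-inventory.csv ec2-costs.csv ec2-activity.csv \
             ec2-scored.csv mcp-validation-report.json; do
        # -s is true only when the file exists and is non-empty.
        if [ ! -s "$dir/$f" ]; then
            echo "$f"
            missing=1
        fi
    done
    return "$missing"
}
```

A typical call would be `missing_artifacts data/outputs`, which prints nothing and returns 0 when every listed file is present.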
5. Discrepancy Resolution Procedure
When Layer 1 vs Layer 2 variance exceeds 5%:
Step 1 - Identify the gap

    # Layer 1 (Resource Explorer) count; tail skips the header row, assuming the CSV export has one
    tail -n +2 data/outputs/ec2-inventory.csv | wc -l
    # Layer 2 (Config Aggregator) count
    aws configservice list-discovered-resources \
      --resource-type AWS::EC2::Instance \
      --profile "$AWS_MANAGEMENT_PROFILE" \
      --region ap-southeast-2 \
      --query 'resourceIdentifiers | length(@)'
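Counts show that a gap exists but not which instances cause it. As a sketch under stated assumptions, the helper below (hypothetical name `layer_gap`) takes two files with one instance ID per line and reports the IDs seen by only one layer. How the IDs are extracted is also an assumption: the position of the ID column in `ec2-inventory.csv` is not documented here.

```shell
# layer_gap <layer1_ids_file> <layer2_ids_file>
# Prints "only-layer1 <id>" and "only-layer2 <id>" for IDs present in just one file.
# Helper name and input format (one ID per line) are illustrative assumptions.
layer_gap() {
    l1=$(mktemp)
    l2=$(mktemp)
    # comm requires sorted input; LC_ALL=C keeps sort and comm ordering consistent.
    LC_ALL=C sort -u "$1" > "$l1"
    LC_ALL=C sort -u "$2" > "$l2"
    LC_ALL=C comm -23 "$l1" "$l2" | sed 's/^/only-layer1 /'
    LC_ALL=C comm -13 "$l1" "$l2" | sed 's/^/only-layer2 /'
    rm -f "$l1" "$l2"
}

# Possible ways to build the inputs (not run here):
#   tail -n +2 data/outputs/ec2-inventory.csv | cut -d, -f1 > layer1.txt   # assumes ID is column 1
#   aws configservice list-discovered-resources --resource-type AWS::EC2::Instance \
#     --query 'resourceIdentifiers[].resourceId' --output text | tr '\t' '\n' > layer2.txt
```

IDs that appear only under `only-layer2` are the natural starting point for Steps 2 and 3.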
Step 2 - Check for recently created/terminated resources
Resource Explorer has an eventual-consistency lag of up to 5 minutes. If resources were created or terminated within the last 10 minutes, wait and re-run Layer 1.

    # Check for recently modified resources (replace the date with your own cutoff)
    aws resourceexplorer2 search \
      --query-string "resourcetype:ec2:instance" \
      --profile "$AWS_OPERATIONS_PROFILE" \
      --region ap-southeast-2 \
      --query 'Resources[?LastReportedAt > `2024-01-01`].{Id:Arn,Last:LastReportedAt}'
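The date in the filter above is fixed, whereas the check is about the last 10 minutes. One way to derive a rolling cutoff is sketched below. The helper name `cutoff_utc` is hypothetical, and the approach relies on the same string comparison of timestamps that the original filter uses, so treat it as an assumption to verify against your CLI's timestamp format.

```shell
# cutoff_utc [minutes]
# Prints the UTC timestamp <minutes> ago (default 10) in ISO 8601 form.
# Helper name is an illustrative assumption.
cutoff_utc() {
    minutes=${1:-10}
    # Try GNU date syntax first, then fall back to BSD/macOS date syntax.
    date -u -d "$minutes minutes ago" +%Y-%m-%dT%H:%M:%SZ 2>/dev/null ||
        date -u -v-"${minutes}"M +%Y-%m-%dT%H:%M:%SZ
}

# Possible use with the search command (not run here):
#   cutoff=$(cutoff_utc 10)
#   aws resourceexplorer2 search \
#     --query-string "resourcetype:ec2:instance" \
#     --query "Resources[?LastReportedAt > '$cutoff'].{Id:Arn,Last:LastReportedAt}"
```

If the search returns entries newer than the cutoff, waiting and re-running Layer 1 is usually cheaper than investigating further.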
Step 3 - Check for regions not indexed by Resource Explorer

    aws resourceexplorer2 list-indexes \
      --profile "$AWS_OPERATIONS_PROFILE" \
      --region ap-southeast-2 \
      --query 'Indexes[*].{Region:Region,Type:Type,State:State}'

Regions that are missing from this output, or whose index is not ACTIVE, are not covered by Layer 1; their resources appear in Config Aggregator only.
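To turn the index listing into an explicit list of uncovered regions, it can be compared against the regions you expect to operate in. The sketch below assumes two plain-text files with one region name per line. The helper name `unindexed_regions` and the idea of maintaining an expected-regions file are illustrative assumptions, not features of the tooling.

```shell
# unindexed_regions <expected_regions_file> <indexed_regions_file>
# Prints every region in the first file that is absent from the second.
# Helper name and input files are illustrative assumptions.
unindexed_regions() {
    # -v invert match, -x whole-line match, -F fixed strings, -f read patterns from file.
    # "|| true" keeps the exit status 0 when no region is missing.
    grep -vxFf "$2" "$1" || true
}

# Possible way to build the indexed list (not run here):
#   aws resourceexplorer2 list-indexes --query 'Indexes[].Region' --output text \
#     | tr '\t' '\n' > indexed.txt
```

Any region printed by `unindexed_regions expected.txt indexed.txt` is a candidate explanation for resources that Config reports and Layer 1 does not.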
Step 4 - Resolve and re-validate
Once the gap is explained, either:
- Enable Resource Explorer indexing in missing regions, then re-run Layer 1; or
- Document the known gap in the validation report with a justification note.
Step 5 - Accept with documented variance
If variance is 1–5% (WARN) and the root cause is understood (e.g., a terminated instance in Config history not yet purged), proceed by adding a `--variance-note`:

    runbooks inventory validate-mcp \
      --ops-profile "$AWS_OPERATIONS_PROFILE" \
      --mgmt-profile "$AWS_MANAGEMENT_PROFILE" \
      --billing-profile "$AWS_BILLING_PROFILE" \
      --validation-level mcp \
      --variance-note "3 recently-terminated instances in Config history only"
6. Validation Quality Gates
Before a decommission recommendation is sent to the HITL manager, all of the following gates must pass:
- Layer 1 vs Layer 2 variance ≤5% (or variance documented with root cause)
- Layer 3 financial accuracy ≥99.5% (MCP validation score)
- Layer 4 MCP cross-validation score ≥99.5%
- `mcp-validation-report.json` written to `data/outputs/`
- Persona reports generated for all intended stakeholders
- No `FAIL` status in any accuracy threshold column
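The numeric gates and the report check can be combined into one pre-flight test. The following is a rough sketch rather than the pipeline's actual gate logic. The function name `gates_pass` and its argument order are hypothetical, the scores are passed in by hand because the schema of `mcp-validation-report.json` is not described here, and the persona-report and per-threshold FAIL checks are left out.

```shell
# gates_pass <variance_pct> <financial_pct> <mcp_pct> <report_path> [variance_documented]
# Prints one line per failed gate; returns 0 only if all checked gates pass.
# Pass "yes" as the fifth argument when an over-5% variance has a documented root cause.
# Function name and argument order are illustrative assumptions.
gates_pass() {
    rc=0
    awk -v v="$1" -v f="$2" -v m="$3" -v doc="${5:-no}" 'BEGIN {
        bad = 0
        if (v + 0 > 5 && doc != "yes") { print "FAIL variance";  bad = 1 }
        if (f + 0 < 99.5)              { print "FAIL financial"; bad = 1 }
        if (m + 0 < 99.5)              { print "FAIL mcp";       bad = 1 }
        exit bad
    }' || rc=1
    # The validation report must exist and be non-empty.
    if [ ! -s "$4" ]; then
        echo "FAIL report missing: $4"
        rc=1
    fi
    return "$rc"
}
```

For instance, `gates_pass 0.8 99.7 99.9 data/outputs/mcp-validation-report.json` returns 0 when the report file exists, while a financial score of 99.0 prints `FAIL financial` and returns 1.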