Cloud Foundations Functional Area Runbooks Catalog
For CxO and engineering leads: copy one of the two tabs below for each functional area to either execute a runbooks command directly or invoke an autonomous /adlc agent to coordinate the gap-close action.
1. Business Continuity: Org-Wide Backup Coverage
The Challenge: The majority of accounts lack backup coverage. AWS Backup policy is enabled at the root OU but not propagated. This is the #1 board risk.
The Value: One org backup policy write → 100% account coverage. A single policy, not a multi-account project.
User Story & 5W1H:
- Why: Unrecoverable accounts represent existential risk. A ransomware event or region outage on an unprotected account can be catastrophic.
- What if missing: At the next incident, the CSO cannot confirm that recoverable backups exist. Board escalation. Regulatory finding (SOCI Act / AESCSF / NZISM).
- Business value: One-time policy write enables automatic vault creation in all new and migrated accounts forever.
- Purpose: Move from ad-hoc (a minority of accounts) to managed (every account).
- Critical thinking: The policy is already enabled at root; the gap is propagation to child OUs. Cloud Architect must verify the OU structure, then apply the SCP + Backup policy at the correct scope.
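As a rough illustration of what "one org backup policy write" could look like, the sketch below prints a backup policy document in the AWS Organizations backup-policy syntax. The plan name, rule name, vault name, role name, tag key, region, and schedule are all hypothetical placeholders, not values from this catalog. As far as the policy syntax goes, the vault and IAM role are referenced by name, so this sketch assumes they already exist (or are provisioned separately) in every target account.

```shell
# Sketch only (assumes bash). Prints a hypothetical Organizations backup policy.
# All names (OrgDailyBackup, org-backup-vault, OrgBackupRole, tag "backup=daily")
# and the region are illustrative assumptions.
render_backup_policy() {
  # Quoted heredoc: $account stays literal so AWS can substitute the account ID.
  cat <<'EOF'
{
  "plans": {
    "OrgDailyBackup": {
      "regions": { "@@assign": ["ap-southeast-2"] },
      "rules": {
        "Daily35DayRetention": {
          "schedule_expression": { "@@assign": "cron(0 5 ? * * *)" },
          "start_backup_window_minutes": { "@@assign": "60" },
          "target_backup_vault_name": { "@@assign": "org-backup-vault" },
          "lifecycle": { "delete_after_days": { "@@assign": "35" } }
        }
      },
      "selections": {
        "tags": {
          "BackupTaggedResources": {
            "iam_role_arn": { "@@assign": "arn:aws:iam::$account:role/OrgBackupRole" },
            "tag_key": { "@@assign": "backup" },
            "tag_value": { "@@assign": ["daily"] }
          }
        }
      }
    }
  }
}
EOF
}

render_backup_policy

# To apply (not run here; needs management-account credentials):
#   render_backup_policy > backup-policy.json
#   aws organizations create-policy --type BACKUP_POLICY \
#     --name OrgDailyBackup --description "Org-wide daily backup" \
#     --content file://backup-policy.json
#   aws organizations attach-policy --policy-id <policy-id> --target-id <root-or-ou-id>
```

Attaching at the root or at a parent OU is what lets child OUs and their accounts inherit the plan, which is the propagation gap described above.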
2. Operations: Patch Manager Baseline
The Challenge: Patch compliance across account rule instances is critically below target. This is the highest operational-risk amplifier.
The Value: Patch Manager baseline in Account Factory → 80%+ compliance gate in 12 weeks.
User Story & 5W1H:
- Why: Critical compliance gaps mean unpatched instances are the fastest attack vector. A zero-day on one unpatched instance can pivot across the organization.
- What if missing: At the next security incident, the CTO cannot prove a patch baseline existed. Breach attribution becomes harder. Regulatory finding.
- Business value: Continuous patch signals reduce incident-response friction and lower MTTR when a real CVE lands.
- Purpose: Move from manual (minimal rule-instance compliance) to managed (baseline automated, exceptions handled case by case).
- Critical thinking: Patch Manager is not like backup; it requires the Systems Manager agent and a maintenance window. Cloud Architect must align with change-control windows; SRE must own the baseline definition.
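The 80% compliance gate mentioned above is simple arithmetic, and a small helper makes the pass/fail rule explicit. This is a minimal sketch: the function name `patch_gate` and the sample counts are hypothetical, and the real counts are assumed to come from Systems Manager compliance data.

```shell
# Sketch only (assumes bash). patch_gate COMPLIANT TOTAL [THRESHOLD_PERCENT]
# Prints PASS/FAIL with the integer compliance percentage (rounded down).
patch_gate() {
  local compliant=$1 total=$2 threshold=${3:-80}
  if [ "$total" -le 0 ]; then
    echo "FAIL 0%"      # no instances reported: treat as failing the gate
    return 1
  fi
  local pct=$(( compliant * 100 / total ))
  if [ "$pct" -ge "$threshold" ]; then
    echo "PASS ${pct}%"
  else
    echo "FAIL ${pct}%"
    return 1
  fi
}

# Hypothetical sample counts, not measurements from this estate.
patch_gate 412 500 || true   # prints: PASS 82%
patch_gate 150 500 || true   # prints: FAIL 30%
```

The non-zero return on failure lets the same helper act as a gate in a pipeline step.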
3. Finance: Tag-Enforcement for Cost Attribution
The Challenge: The majority of monthly spend is unattributed. The CFO cannot charge costs back to business units.
The Value: Tag-enforcement SCP + account-vending tag policy → permanent showback clarity + 5–8% optimization opportunity.
# Check tag coverage across the organization
runbooks inventory tag-coverage --profile $AWS_OPERATIONS_PROFILE
# Show cost breakdown by service (with tag context)
runbooks finops dashboard --all --profile $AWS_BILLING_PROFILE
# Validate cost-allocation tags for completeness
runbooks finops check-config-compliance --profile $AWS_MANAGEMENT_PROFILE
User Story & 5W1H:
- Why: Unattributed spend prevents chargebacks to teams. Business units see the bill but not their own consumption. FinOps and business alignment fail.
- What if missing: Finance cannot answer "which team spent $40K this month?" Every quarter the CFO absorbs the cost without business owner visibility.
- Business value: Tag-enforced chargeback prevents waste (teams control their spend when they see it), enables FinOps showback, and unblocks the 5–8% EC2-Other and RDS optimization that was hiding.
- Purpose: Move from optional (25 tags in use, uneven) to mandatory (tag-enforcement SCP, 100% coverage).
- Critical thinking: Tag policy is not a one-time write; it requires naming standard + SCP + remediation for untagged resources. Rollout must be phased (warn → block new → remediate old) to avoid breaking deployments.
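To make the "block new" phase of the rollout concrete, here is a minimal sketch of a tag-enforcement SCP that denies launching EC2 instances when a required tag is absent from the request. The tag key `CostCenter`, the function name, and the choice of `ec2:RunInstances` as the only covered action are assumptions for illustration. A real rollout would need the naming standard first and would likely cover more resource types.

```shell
# Sketch only (assumes bash). render_tag_scp [TAG_KEY]
# Prints an SCP that denies ec2:RunInstances when TAG_KEY is missing from the request.
# TAG_KEY should be alphanumeric here because it is also embedded in the Sid.
render_tag_scp() {
  local tag_key=${1:-CostCenter}   # hypothetical default tag key
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyRunInstancesWithout${tag_key}",
      "Effect": "Deny",
      "Action": "ec2:RunInstances",
      "Resource": "arn:aws:ec2:*:*:instance/*",
      "Condition": {
        "Null": { "aws:RequestTag/${tag_key}": "true" }
      }
    }
  ]
}
EOF
}

render_tag_scp CostCenter

# To apply (not run here; needs management-account credentials):
#   render_tag_scp CostCenter > tag-scp.json
#   aws organizations create-policy --type SERVICE_CONTROL_POLICY \
#     --name RequireCostCenterTag --description "Deny untagged EC2 launches" \
#     --content file://tag-scp.json
```

Attaching this to a single pilot OU before wider OUs is one way to respect the phased rollout and avoid breaking existing deployments.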
4. Governance: OU-Level SCP Inheritance
The Challenge: Config shows partial compliance (not enforced). Most SCPs are attached per account (so drift must be managed on every change); only a minority are attached at OU level.
The Value: OU-level inheritance + conformance packs → continuous-assured compliance.
User Story & 5W1H:
- Why: Per-account SCP attachment means every account-create or team-onboard must re-apply SCPs. This is manual toil and a drift vector. Compliance becomes an audit finding ("this account lacks the region-lock SCP") instead of a design fact.
- What if missing: Config rule-compliance remains critically low, drift accumulates, and every audit cycle requires remediation. Regulatory finding on continuous controls.
- Business value: Move governance from reactive (audit → remediate) to continuous (policy → drift-detected → auto-remediate OR alert).
- Purpose: Move from partial (per-account SCPs, critically low compliance) to managed (OU-level + conformance packs, fully compliant).
- Critical thinking: Control Tower provides this as a starting point; LZA or native OU SCPs achieve it. Rollout order: (1) audit current per-account SCPs, (2) author OU-level equivalents, (3) test in dev OU, (4) deploy to prod OUs with a safe rollback window.
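The region-lock SCP mentioned above is a good candidate for an OU-level equivalent (step 2 of the rollout order). The sketch below prints one possible version. The approved regions and the list of exempted global services are assumptions for illustration and would need review against the services actually in use.

```shell
# Sketch only (assumes bash). render_region_lock_scp REGION [REGION...]
# Prints an SCP that denies requests outside the listed regions,
# exempting an illustrative (not exhaustive) set of global services.
render_region_lock_scp() {
  local regions="" r
  for r in "$@"; do
    # Build a JSON array body: "a", "b", ...
    regions="${regions:+$regions, }\"$r\""
  done
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyOutsideApprovedRegions",
      "Effect": "Deny",
      "NotAction": ["iam:*", "organizations:*", "route53:*", "cloudfront:*", "support:*", "sts:*"],
      "Resource": "*",
      "Condition": {
        "StringNotEquals": { "aws:RequestedRegion": [${regions}] }
      }
    }
  ]
}
EOF
}

# Hypothetical approved regions.
render_region_lock_scp ap-southeast-2 ap-southeast-4

# To attach at OU level instead of per account (not run here):
#   aws organizations attach-policy --policy-id <policy-id> --target-id <dev-ou-id>
```

Attaching to the dev OU first matches step 3 of the rollout order. Detaching the policy from the OU is the rollback path if a workload breaks.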
5. Security: GuardDuty 100% Enrollment
The Challenge: Nearly all accounts have threat detection enabled. A minority of accounts lack a GuardDuty detector. SecurityHub delegated-admin is misconfigured.
The Value: Control Tower enrollment closes the coverage gap to 100% automatically. Proof the foundation is sound.
# List all GuardDuty detectors and identify gaps
runbooks inventory list-guardduty-detectors --profile $AWS_MANAGEMENT_PROFILE
# Assess security posture across all frameworks
runbooks security assess --profile $AWS_MANAGEMENT_PROFILE
# Deploy GuardDuty to gap accounts
runbooks security deploy-guardduty --profile $AWS_MANAGEMENT_PROFILE
User Story & 5W1H:
- Why: Coverage gaps mean blind spots on those accounts. A threat detected in most accounts but silent on a minority is a coverage hole the CSO cannot explain.
- What if missing: At the next audit, the CSO cannot confirm blanket GuardDuty coverage. Regulatory finding on monitoring.
- Business value: 100% GuardDuty coverage + centralized SecurityHub means threat signals are comprehensive and actionable from a single pane.
- Purpose: Close the coverage gap to achieve 100% enrollment. Prove the security foundation is managed and continuous, not ad-hoc.
- Critical thinking: The coverage gap may be intentional (dev/sandbox without GuardDuty), or it may be oversight. CSO must audit the reason first, then deploy. SecurityHub delegated-admin misconfiguration must be fixed concurrently (ops account should be admin, not member).
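Auditing the gap before deploying comes down to a set difference: accounts in the organization minus accounts with a detector. The sketch below shows that comparison on sample data. The helper name `coverage_gap` and the account IDs are hypothetical, and the commented AWS CLI calls are only one possible way to produce the two input lists.

```shell
# Sketch only (assumes bash). coverage_gap ALL_ACCOUNTS_FILE ENROLLED_ACCOUNTS_FILE
# Prints account IDs present in the first file but absent from the second.
coverage_gap() {
  local all=$1 enrolled=$2
  # -v invert, -x whole-line match, -F fixed strings, -f patterns from file.
  # "|| true" keeps the exit status zero when there is no gap.
  grep -vxFf "$enrolled" "$all" || true
}

# Hypothetical sample data (one account ID per line).
workdir=$(mktemp -d)
printf '%s\n' 111111111111 222222222222 333333333333 > "$workdir/all.txt"
printf '%s\n' 111111111111 333333333333 > "$workdir/enrolled.txt"

coverage_gap "$workdir/all.txt" "$workdir/enrolled.txt"   # prints: 222222222222

# Possible real inputs (not run here; GuardDuty is regional, so repeat per region):
#   aws organizations list-accounts --query 'Accounts[].Id' --output text | tr '\t' '\n'
#   aws guardduty list-members --detector-id <detector-id> \
#     --query 'Members[].AccountId' --output text | tr '\t' '\n'
```

The resulting list is the input to the audit step: each account on it is either an intentional exclusion to document or an oversight to enroll.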
6. Infrastructure: LZA Network Baselines
The Challenge: A large hand-rolled network estate exists with no drift detection or auto-inheritance for new accounts.
The Value: LZA network-config defines VPC/TGW as versioned, drift-detected baseline templates.
# Analyze network topology and resource relationships
runbooks vpc analyze --profile $AWS_OPERATIONS_PROFILE
# Discover VPC and Transit Gateway configuration
runbooks vpc topology --profile $AWS_OPERATIONS_PROFILE
# Check for CloudFormation drift in network stacks
runbooks inventory find-cfn-drift --profile $AWS_OPERATIONS_PROFILE
# Validate overall Landing Zone configuration
runbooks inventory check-landingzone --profile $AWS_MANAGEMENT_PROFILE
User Story & 5W1H:
- Why: A large hand-rolled network estate means many different ways to misconfigure subnets, routing, NAT, etc. Onboarding a new account requires manual VPC design (toil). Drift is invisible until a misconfigured subnet causes an incident.
- What if missing: New accounts get hand-configured VPCs (slow, error-prone). Drift accumulates (a NAT removed on one account, a security group rule changed on another). When audited, remediation is manual.
- Business value: LZA or Terraform modules allow VPC/TGW to be versioned infrastructure. New account = template instantiation (seconds, no mistakes). Drift detection runs continuously.
- Purpose: Move from hand-rolled (a large estate with no drift detection) to templated (LZA-managed baseline, drift-detected).
- Critical thinking: This is the longest engagement of the six. VP-Infra must (1) decide LZA vs. Terraform, (2) design the template, (3) test on dev account, (4) migrate existing VPCs (or run in parallel), (5) sunset hand-rolled accounts. Expect 8–12 weeks.
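Once network stacks are templated, drift detection reduces to checking each stack's drift status. The sketch below filters a list of stack names and CloudFormation drift statuses down to the drifted ones. The helper name and stack names are hypothetical, and it assumes the network baseline is deployed as CloudFormation stacks (as with LZA). A Terraform-based baseline would need a different check.

```shell
# Sketch only (assumes bash). drifted_stacks reads "STACK_NAME STATUS" lines on stdin
# and prints the names whose status is DRIFTED
# (other CloudFormation statuses: IN_SYNC, NOT_CHECKED, UNKNOWN).
drifted_stacks() {
  awk '$2 == "DRIFTED" { print $1 }'
}

# Hypothetical sample data.
printf '%s\n' \
  'network-vpc-prod IN_SYNC' \
  'network-tgw-shared DRIFTED' \
  'network-vpc-dev NOT_CHECKED' | drifted_stacks   # prints: network-tgw-shared

# Possible real input (not run here). Drift status is only current after a
# detection run, e.g. aws cloudformation detect-stack-drift --stack-name <name>:
#   aws cloudformation describe-stacks \
#     --query 'Stacks[].[StackName,DriftInformation.StackDriftStatus]' --output text
```

Stacks reported as NOT_CHECKED are worth treating as unknown rather than healthy, since no detection run has covered them yet.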
Next Steps
- Identify the functional area your role owns (BC=CSO/CEO, Ops=CTO, Finance=CFO, Governance=CDO, Security=CSO/CTO, Infrastructure=VP-Infra).
- Choose your execution path:
  - Runbooks CLI: direct commands to assess and remediate today (ops teams).
  - ADLC Prompt: paste into /adlc to coordinate autonomous execution via agents (distributed teams, HITL review gates).
- Engage your cloud architect to review the gap-close approach, estimate timeline, and manage dependencies with other areas.
For detailed command options, see Runbooks CLI Catalog.