
Cloud Foundations Functional Area Runbooks Catalog

For CxO and engineering leads: for each functional area, copy one of the two tabs below to either run a runbooks command directly or invoke an autonomous /adlc agent to coordinate the gap-close action.


1. Business Continuity: Org-Wide Backup Coverage

The Challenge: The majority of accounts lack backup coverage. AWS Backup policy is enabled at the root OU but not propagated. This is the #1 board risk.

The Value: One org backup policy write → 100% account coverage. A single policy, not a multi-account project.

Runbooks CLI:

# Assess current backup posture across the organization
runbooks inventory collect --all --profile $AWS_OPERATIONS_PROFILE

# Check backup vault status by account
runbooks cfat assess --profile $AWS_MANAGEMENT_PROFILE

ADLC Prompt:

/adlc bc-org-backup-policy with Business Continuity owner to enforce org-wide AWS Backup policy at OU level: as_a "CSO" i_want "every account to have at least one backup vault" so_that "RTO/RPO targets are measurable and cloud-resilient"

User Story & 5W1H:
- Why: Unrecoverable accounts represent existential risk. A ransomware event or region outage on an unprotected account can be catastrophic.
- What if missing: At the next incident, the CSO cannot confirm that recoverable backups exist. Board escalation. Regulatory finding (SOCI Act / AESCSF / NZISM).
- Business value: One-time policy write enables automatic vault creation in all new and migrated accounts forever.
- Purpose: Move from ad-hoc (a minority of accounts) to managed (every account).
- Critical thinking: The policy is already enabled at root; the gap is propagation to child OUs. The Cloud Architect must verify the OU structure, then apply the SCP and Backup policy at the correct scope.
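As a rough illustration of what "one org backup policy write" could look like, the sketch below builds a backup policy document in the shape AWS Organizations uses for backup policies (`plans`, `rules`, `selections`, `@@assign`). The plan, vault, role, tag, and region values are placeholders rather than values from this catalog, and the vault and role are assumed to already exist in each member account. Check the structure against the current AWS reference before attaching it to an OU.

```python
import json


def build_backup_policy(plan_name, vault_name, role_name, regions,
                        schedule="cron(0 5 ? * * *)", delete_after_days=35):
    """Return a backup policy document as a plain dict.

    Every argument is a caller-supplied placeholder; nothing here is
    read from a real organization.
    """
    return {
        "plans": {
            plan_name: {
                "regions": {"@@assign": list(regions)},
                "rules": {
                    "daily": {
                        "schedule_expression": {"@@assign": schedule},
                        "target_backup_vault_name": {"@@assign": vault_name},
                        "lifecycle": {
                            "delete_after_days": {"@@assign": str(delete_after_days)},
                        },
                    }
                },
                "selections": {
                    "tags": {
                        # Hypothetical selection: resources tagged backup=true
                        "by_backup_tag": {
                            "iam_role_arn": {
                                "@@assign": f"arn:aws:iam::$account:role/{role_name}"
                            },
                            "tag_key": {"@@assign": "backup"},
                            "tag_value": {"@@assign": ["true"]},
                        }
                    }
                },
            }
        }
    }


policy = build_backup_policy(
    plan_name="org-baseline",            # placeholder
    vault_name="org-baseline-vault",     # placeholder
    role_name="org-backup-role",         # placeholder
    regions=["ap-southeast-2"],          # placeholder
)
print(json.dumps(policy, indent=2))
```

Attaching a document like this at the OU the Cloud Architect has verified is what makes child accounts inherit it; attaching per account would recreate the propagation gap described above.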


2. Operations: Patch Manager Baseline

The Challenge: Patch compliance across account-rule instances is critically below target. This is the highest operational-risk amplifier.

The Value: Patch Manager baseline in Account Factory → 80%+ compliance gate in 12 weeks.

Runbooks CLI:

# Check patch compliance across all accounts
runbooks inventory enrich-ec2 --profile $AWS_OPERATIONS_PROFILE

# Get SSM agent health per instance
runbooks inventory ssm-status --instance-id <instance-id> --profile $AWS_OPERATIONS_PROFILE

ADLC Prompt:

/adlc ops-patch-baseline with CTO to deploy Patch Manager baseline at OU scope: as_a "CTO" i_want "all EC2 instances to be evaluated for patches monthly" so_that "a zero-day on any account does not spread uncontained"

User Story & 5W1H:
- Why: Critical compliance gaps mean unpatched instances are the fastest attack vector. A zero-day on one unpatched instance can pivot across the organization.
- What if missing: At the next security incident, the CTO cannot prove a patch baseline existed. Breach attribution becomes harder. Regulatory finding.
- Business value: Continuous patch signals reduce incident-response friction and lower MTTR when a real CVE lands.
- Purpose: Move from manual (minimal rule-instance compliance) to managed (baseline automated, exceptions handled case by case).
- Critical thinking: Unlike the backup policy, Patch Manager is not a single policy write; it requires the Systems Manager agent on each instance and a maintenance window. The Cloud Architect must align with change-control windows; SRE must own the baseline definition.
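The 80% compliance gate can be expressed as a small check. The sketch below assumes a hypothetical list of instance records with `ssm_managed` and `patch_state` fields; this is an illustrative shape, not the output format of the runbooks CLI. Instances without a Systems Manager agent cannot be evaluated, so the sketch counts them against the gate and lists them separately.

```python
def compliance_summary(instances, gate=0.80):
    """Summarize patch compliance for a list of instance records.

    Each record is a dict with hypothetical keys:
      id          - instance identifier
      ssm_managed - True if the Systems Manager agent reports in
      patch_state - "COMPLIANT" or anything else
    """
    total = len(instances)
    unmanaged = [i["id"] for i in instances if not i.get("ssm_managed")]
    compliant = sum(
        1 for i in instances
        if i.get("ssm_managed") and i.get("patch_state") == "COMPLIANT"
    )
    ratio = compliant / total if total else 0.0
    return {
        "total": total,
        "compliant": compliant,
        "unmanaged": unmanaged,      # fix agent coverage before patching
        "ratio": ratio,
        "passes_gate": ratio >= gate,
    }


fleet = [
    {"id": "i-aaa", "ssm_managed": True, "patch_state": "COMPLIANT"},
    {"id": "i-bbb", "ssm_managed": True, "patch_state": "NON_COMPLIANT"},
    {"id": "i-ccc", "ssm_managed": False, "patch_state": None},
    {"id": "i-ddd", "ssm_managed": True, "patch_state": "COMPLIANT"},
]
print(compliance_summary(fleet))
```

Separating unmanaged instances from non-compliant ones reflects the dependency noted above: agent coverage has to be closed before the baseline can report anything meaningful.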


3. Finance: Tag-Enforcement for Cost Attribution

The Challenge: The majority of monthly spend is unattributed. The CFO cannot charge costs back to business units.

The Value: Tag-enforcement SCP + account-vending tag policy → permanent showback clarity + 5–8% optimization opportunity.

Runbooks CLI:

# Check tag coverage across the organization
runbooks inventory tag-coverage --profile $AWS_OPERATIONS_PROFILE

# Show cost breakdown by service (with tag context)
runbooks finops dashboard --all --profile $AWS_BILLING_PROFILE

# Validate cost-allocation tags for completeness
runbooks finops check-config-compliance --profile $AWS_MANAGEMENT_PROFILE

ADLC Prompt:

/adlc finance-tag-enforcement with CFO to author org-wide cost-allocation tag policy: as_a "CFO" i_want "every resource to carry project/cost-center/environment tags" so_that "I can see which business unit owns which dollar"

User Story & 5W1H:
- Why: Unattributed spend prevents chargebacks to teams. Business units see the bill but not their own consumption. FinOps and business alignment fail.
- What if missing: Finance cannot answer "which team spent $40K this month?" Every quarter the CFO absorbs the cost without business-owner visibility.
- Business value: Tag-enforced chargeback prevents waste (teams control their spend when they see it), enables FinOps showback, and unblocks the previously hidden 5–8% EC2-Other and RDS optimization.
- Purpose: Move from optional (25 tags in use, uneven) to mandatory (tag-enforcement SCP, 100% coverage).
- Critical thinking: Tag policy is not a one-time write; it requires a naming standard, an SCP, and remediation for untagged resources. Rollout must be phased (warn → block new → remediate old) to avoid breaking deployments.
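The phased rollout (warn → block new → remediate old) can be sketched as a decision function. The tag keys come from the user story above; the phase names, the `is_new` flag, and the returned action strings are illustrative assumptions, not part of any AWS or runbooks API.

```python
REQUIRED_TAGS = ("project", "cost-center", "environment")  # from the user story
PHASES = ("warn", "block_new", "remediate_old")            # hypothetical names


def missing_tags(tags, required=REQUIRED_TAGS):
    """Return the required keys that are absent or empty on a resource."""
    return [key for key in required if not tags.get(key)]


def decide(tags, is_new, phase):
    """Return 'allow', 'warn', 'deny', or 'remediate' for one resource.

    tags   - dict of the resource's tags
    is_new - True for a create request, False for an existing resource
    phase  - current rollout phase, one of PHASES
    """
    if phase not in PHASES:
        raise ValueError(f"unknown phase: {phase}")
    if not missing_tags(tags):
        return "allow"
    if phase == "warn":
        return "warn"
    if is_new:
        # From block_new onward, untagged create requests are rejected.
        return "deny"
    # Existing resources are only touched in the final phase.
    return "remediate" if phase == "remediate_old" else "warn"


untagged = {"project": "checkout"}
for phase in PHASES:
    print(phase, decide(untagged, is_new=True, phase=phase),
          decide(untagged, is_new=False, phase=phase))
```

Keeping existing resources at "warn" until the last phase is what avoids breaking running deployments while new ones are already held to the standard.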


4. Governance: OU-Level SCP Inheritance

The Challenge: Config shows partial compliance (not enforced). Most SCPs are attached per account, which means managing drift on every change; only a minority are attached at OU level.

The Value: OU-level inheritance + conformance packs → continuous-assured compliance.

Runbooks CLI:

# Assess current Config compliance and SCP effectiveness
runbooks cfat assess --profile $AWS_MANAGEMENT_PROFILE

# Check SCP coverage and OU attachment strategy
runbooks inventory collect --all --profile $AWS_MANAGEMENT_PROFILE

ADLC Prompt:

/adlc governance-ou-scp-inheritance with CDO to consolidate SCPs at OU boundary: as_a "CDO" i_want "all guardrails to flow from OUs (not per-account)" so_that "drift on every change is eliminated and governance becomes continuous"

User Story & 5W1H:
- Why: Per-account SCP attachment means every account creation or team onboarding requires re-applying SCPs. This is manual toil and a drift vector. Compliance becomes an audit finding ("this account lacks the region-lock SCP") instead of a design fact.
- What if missing: Config rule-compliance remains critically low, drift accumulates, and every audit cycle requires remediation. Regulatory finding on continuous controls.
- Business value: Move governance from reactive (audit → remediate) to continuous (policy → drift-detected → auto-remediate OR alert).
- Purpose: Move from partial (per-account SCPs, critically low compliance) to managed (OU-level + conformance packs, fully compliant).
- Critical thinking: Control Tower provides this as a starting point; LZA or native OU SCPs achieve it. Rollout order: (1) audit current per-account SCPs, (2) author OU-level equivalents, (3) test in dev OU, (4) deploy to prod OUs with a safe rollback window.
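Steps (1) and (2) of the rollout order can be approximated with a small audit over attachment data. The sketch below assumes two hypothetical inputs: a mapping of OU name to account IDs, and a mapping of account ID to the SCP names attached directly to it. It is not tied to any particular export format. An SCP attached to every account in an OU is a candidate to move to the OU; one attached to only some accounts is flagged as likely drift.

```python
def audit_scp_attachments(ou_accounts, account_scps):
    """Classify directly attached SCPs per OU.

    ou_accounts  - {ou_name: [account_id, ...]}
    account_scps - {account_id: iterable of SCP names attached directly}

    Returns {ou_name: {"consolidate": [...], "partial": [...]}} where
    "consolidate" lists SCPs present on every account in the OU and
    "partial" lists SCPs present on some accounts but not all.
    """
    report = {}
    for ou, accounts in ou_accounts.items():
        per_account = [set(account_scps.get(a, ())) for a in accounts]
        if not per_account:
            continue  # empty OU, nothing to audit
        common = set.intersection(*per_account)
        seen = set.union(*per_account)
        report[ou] = {
            "consolidate": sorted(common),
            "partial": sorted(seen - common),
        }
    return report


# Illustrative data only.
ou_accounts = {"prod": ["a1", "a2"], "dev": ["a3"]}
account_scps = {
    "a1": {"region-lock", "deny-root"},
    "a2": {"region-lock"},
    "a3": {"deny-root"},
}
print(audit_scp_attachments(ou_accounts, account_scps))
```

The "partial" list is the audit finding described above made explicit: it names the accounts-worth of guardrails that exist on one neighbor but not another.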


5. Security: GuardDuty 100% Enrollment

The Challenge: Nearly all accounts have threat detection enabled, but a minority still lack a GuardDuty detector. SecurityHub delegated-admin is misconfigured.

The Value: Control Tower enrollment closes the coverage gap to 100% automatically. Proof the foundation is sound.

Runbooks CLI:

# List all GuardDuty detectors and identify gaps
runbooks inventory list-guardduty-detectors --profile $AWS_MANAGEMENT_PROFILE

# Assess security posture across all frameworks
runbooks security assess --profile $AWS_MANAGEMENT_PROFILE

# Deploy GuardDuty to gap accounts
runbooks security deploy-guardduty --profile $AWS_MANAGEMENT_PROFILE

ADLC Prompt:

/adlc security-guardduty-100 with CSO to close the GuardDuty coverage gap and formalise SecurityHub delegated-admin: as_a "CSO" i_want "every account to have an active GuardDuty detector AND SecurityHub to report to a single delegated admin" so_that "threat signals are comprehensive and not siloed"

User Story & 5W1H:
- Why: Coverage gaps mean blind spots on those accounts. A threat detected in most accounts but silent on a minority is a coverage hole the CSO cannot explain.
- What if missing: At the next audit, the CSO cannot confirm blanket GuardDuty coverage. Regulatory finding on monitoring.
- Business value: 100% GuardDuty coverage + centralized SecurityHub means threat signals are comprehensive and actionable from a single pane.
- Purpose: Close the coverage gap to achieve 100% enrollment. Prove the security foundation is managed and continuous, not ad-hoc.
- Critical thinking: The coverage gap may be intentional (dev/sandbox without GuardDuty), or it may be oversight. CSO must audit the reason first, then deploy. SecurityHub delegated-admin misconfiguration must be fixed concurrently (ops account should be admin, not member).
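The "audit the reason first, then deploy" step amounts to a set difference with an exemption list. The sketch below assumes three hypothetical inputs (all organization account IDs, the account IDs that already have a detector, and account IDs with a documented exemption) and splits the gap into accounts to deploy to and accounts whose exemption the CSO should re-confirm.

```python
def guardduty_gaps(org_accounts, detector_accounts, exempt=()):
    """Split accounts without a detector into deploy and review lists.

    org_accounts      - iterable of all account IDs in the organization
    detector_accounts - iterable of account IDs with an active detector
    exempt            - iterable of account IDs with a documented exemption
    """
    everyone = set(org_accounts)
    covered = everyone & set(detector_accounts)
    missing = everyone - covered
    exempt = set(exempt)
    coverage = len(covered) / len(everyone) if everyone else 0.0
    return {
        "deploy": sorted(missing - exempt),            # oversight: close the gap
        "review_exemption": sorted(missing & exempt),  # intentional: re-confirm
        "coverage": coverage,
    }


# Illustrative account IDs only.
report = guardduty_gaps(
    org_accounts=["111111111111", "222222222222", "333333333333", "444444444444"],
    detector_accounts=["111111111111", "222222222222"],
    exempt=["444444444444"],
)
print(report)
```

Reporting exempt accounts separately keeps an intentional sandbox gap visible, so a 100% enrollment claim is never reached by silently dropping accounts from the denominator.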


6. Infrastructure: LZA Network Baselines

The Challenge: A large hand-rolled network estate exists with no drift detection or auto-inheritance for new accounts.

The Value: The LZA network-config templates VPCs and TGWs as versioned, drift-detected baselines.

Runbooks CLI:

# Analyze network topology and resource relationships
runbooks vpc analyze --profile $AWS_OPERATIONS_PROFILE

# Discover VPC and Transit Gateway configuration
runbooks vpc topology --profile $AWS_OPERATIONS_PROFILE

# Check for CloudFormation drift in network stacks
runbooks inventory find-cfn-drift --profile $AWS_OPERATIONS_PROFILE

# Validate overall Landing Zone configuration
runbooks inventory check-landingzone --profile $AWS_MANAGEMENT_PROFILE

ADLC Prompt:

/adlc infra-lza-network-baseline with VP-Infra to adopt LZA for VPC/TGW templating and drift detection: as_a "VP-Infra" i_want "every new account to inherit a versioned VPC/TGW template (not re-define it)" so_that "network design is consistent and drift is detected in minutes, not quarters"

User Story & 5W1H:
- Why: A large hand-rolled network estate means many different ways to misconfigure subnets, routing, NAT, etc. Onboarding a new account requires manual VPC design (toil). Drift is invisible until a misconfigured subnet causes an incident.
- What if missing: New accounts get hand-configured VPCs (slow, error-prone). Drift accumulates (a NAT removed on one account, a security group rule changed on another). When audited, remediation is manual.
- Business value: LZA or Terraform modules allow VPC/TGW to be versioned infrastructure. New account = template instantiation (seconds, no mistakes). Drift detection runs continuously.
- Purpose: Move from hand-rolled (a large estate with no drift detection) to templated (LZA-managed baseline, drift-detected).
- Critical thinking: This is the longest engagement of the six. VP-Infra must (1) decide LZA vs. Terraform, (2) design the template, (3) test on dev account, (4) migrate existing VPCs (or run in parallel), (5) sunset hand-rolled accounts. Expect 8–12 weeks.
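To make "drift-detected baseline" concrete, the sketch below compares a versioned baseline against an observed configuration and lists every field that differs. This is a generic dictionary diff for illustration only; it is not how LZA, Terraform, or CloudFormation implement drift detection, and the field names are hypothetical.

```python
def detect_drift(baseline, actual, path=""):
    """Return a list of (field_path, expected, found) differences.

    Both arguments are nested dicts; nested dicts are compared
    recursively, everything else by equality. A key present on only
    one side shows up with None on the other.
    """
    drift = []
    for key in sorted(set(baseline) | set(actual)):
        here = f"{path}.{key}" if path else key
        expected, found = baseline.get(key), actual.get(key)
        if isinstance(expected, dict) and isinstance(found, dict):
            drift.extend(detect_drift(expected, found, here))
        elif expected != found:
            drift.append((here, expected, found))
    return drift


# Hypothetical baseline and observed state for one account's VPC.
baseline = {
    "cidr": "10.0.0.0/16",
    "nat_gateways": 2,
    "tgw": {"attached": True, "route_table": "shared"},
}
actual = {
    "cidr": "10.0.0.0/16",
    "nat_gateways": 0,  # a NAT was removed by hand
    "tgw": {"attached": True, "route_table": "custom"},
}
for field, expected, found in detect_drift(baseline, actual):
    print(f"{field}: expected {expected!r}, found {found!r}")
```

The value of a versioned template is that `baseline` exists at all: with hand-rolled VPCs there is nothing to diff against, which is why drift stays invisible until an incident.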


Next Steps

  1. Identify the functional area your role owns (BC=CSO/CEO, Ops=CTO, Finance=CFO, Governance=CDO, Security=CSO/CTO, Infrastructure=VP-Infra).
  2. Choose your execution path:
     - Runbooks CLI → direct commands to assess and remediate today (ops teams).
     - ADLC Prompt → paste into /adlc to coordinate autonomous execution via agents (distributed teams, HITL review gates).
  3. Engage your cloud architect to review the gap-close approach, estimate timeline, and manage dependencies with other areas.

For detailed command options, see Runbooks CLI Catalog.