Risk Radar

The Risk Radar continuously analyses your infrastructure topology to detect architectural risks that are invisible when looking at individual resources in isolation. Unlike security checks that evaluate resource configurations, the Risk Radar examines how resources relate to each other, how they are distributed, and whether your architecture follows resilience and efficiency best practices.

After every scan, the Risk Radar evaluates your dependency graph and produces a prioritised list of risks, each with a severity level, affected resources, and recommended fix actions.

The Nine Risk Types

Guardian Pro's Risk Radar detects nine distinct categories of architectural risk. Each represents a different class of problem that can affect your infrastructure's reliability, efficiency, or operational maturity.

1. Single Point of Failure

Severity: Critical or High

A single point of failure (SPOF) exists when a resource has multiple dependents but lacks redundancy. If this resource fails, all dependent services are affected with no failover path.

What the Risk Radar looks for:

Resources with a high number of dependent services and no redundant counterpart
Critical infrastructure components (databases, load balancers, NAT gateways) deployed without high-availability configuration
Services that form a bottleneck in your dependency chain

Example: A single NAT Gateway serving multiple private subnets across an application tier. If the NAT Gateway fails, all outbound internet traffic from those subnets stops.
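The check described above can be sketched in a few lines of Python. This is an illustration only, not Guardian Pro's implementation: the `depends_on` mapping, the `redundant_groups` list, and the `min_dependents` threshold are all hypothetical names chosen for the example.

```python
from collections import defaultdict

def find_spofs(depends_on, redundant_groups, min_dependents=2):
    """Return resources with several direct dependents and no redundant peer.

    depends_on: dict mapping each resource to the resources it depends on.
    redundant_groups: list of sets; members of one set can stand in for each other.
    """
    # Invert the graph so we can count who depends on each resource.
    dependents = defaultdict(set)
    for resource, targets in depends_on.items():
        for target in targets:
            dependents[target].add(resource)

    has_peer = {r for group in redundant_groups if len(group) > 1 for r in group}
    return sorted(
        r for r, deps in dependents.items()
        if len(deps) >= min_dependents and r not in has_peer
    )

# Hypothetical topology: three private subnets route through one NAT gateway,
# while two applications sit behind a redundant pair of load balancers.
depends_on = {
    "subnet-a": ["nat-1"],
    "subnet-b": ["nat-1"],
    "subnet-c": ["nat-1"],
    "app-1": ["alb-1", "alb-2"],
    "app-2": ["alb-1", "alb-2"],
}
redundant_groups = [{"alb-1", "alb-2"}]
print(find_spofs(depends_on, redundant_groups))  # ['nat-1']
```

The NAT gateway is reported because three subnets depend on it and nothing can take its place; the load balancers are not, because each has a peer.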

Recommended actions:

Deploy redundant instances across multiple availability zones
Enable multi-AZ configurations for databases and critical services
Add load balancing and auto-scaling to compute tiers

Warning: Single points of failure are a common cause of service outages in cloud environments. Treat SPOF risks as a top priority, especially for production workloads.

2. Availability Zone Concentration

Severity: High

AZ concentration occurs when a disproportionate share of your resources is deployed in a single availability zone. This creates a geographic single point of failure -- if that AZ experiences an outage, the majority of your infrastructure is affected.

What the Risk Radar looks for:

Imbalanced resource distribution across availability zones
Critical workloads running entirely within a single AZ
Subnets with resources concentrated in one AZ
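As a rough illustration of the distribution check, the sketch below computes the share of resources in the busiest AZ and flags it when the share exceeds a threshold. The 60% threshold, the function name, and the `resource_azs` mapping are assumptions made for this example; the actual criteria used by the Risk Radar are not documented here.

```python
from collections import Counter

def az_concentration(resource_azs, threshold=0.6):
    """Return (az, share) for the busiest AZ if its share exceeds `threshold`, else None."""
    if not resource_azs:
        return None
    counts = Counter(resource_azs.values())
    az, count = counts.most_common(1)[0]
    share = count / len(resource_azs)
    return (az, share) if share > threshold else None

# Hypothetical inventory: four of five resources live in eu-west-1a.
resource_azs = {
    "web-1": "eu-west-1a", "web-2": "eu-west-1a", "web-3": "eu-west-1a",
    "db-1": "eu-west-1a", "cache-1": "eu-west-1b",
}
print(az_concentration(resource_azs))  # ('eu-west-1a', 0.8)
```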

Recommended actions:

Distribute resources more evenly across availability zones
Use auto-scaling groups with multi-AZ configurations
Deploy databases with multi-AZ standby replicas

3. High Blast Radius

Severity: Critical or High

A high blast radius risk is flagged when a single resource's failure would cascade to affect a large percentage of your total infrastructure. This is different from a simple SPOF -- it specifically measures the breadth of impact.

What the Risk Radar looks for:

Resources whose failure would cascade through the dependency graph to affect a significant portion of your infrastructure
Shared services (VPCs, security groups, IAM roles) that many resources depend on
Tightly coupled architectures where failures propagate widely
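One way to picture the cascade is a breadth-first walk over the reversed dependency graph, counting everything that transitively depends on the failed resource. The sketch below assumes a simple `depends_on` mapping (a hypothetical structure, not a Guardian Pro format) and reports the affected share of the remaining infrastructure.

```python
from collections import defaultdict, deque

def blast_radius(depends_on, failed):
    """Return the set of resources transitively affected if `failed` goes down."""
    # Invert the graph: for each resource, who depends on it?
    dependents = defaultdict(set)
    for resource, targets in depends_on.items():
        for target in targets:
            dependents[target].add(resource)

    affected, queue = set(), deque([failed])
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in affected and dependent != failed:
                affected.add(dependent)
                queue.append(dependent)
    return affected

# Hypothetical topology: a shared VPC underpins most of the estate.
depends_on = {
    "vpc-1": [],
    "subnet-a": ["vpc-1"],
    "subnet-b": ["vpc-1"],
    "app-1": ["subnet-a"],
    "db-1": ["subnet-b"],
    "bucket-1": [],
}
affected = blast_radius(depends_on, "vpc-1")
share = len(affected) / (len(depends_on) - 1)  # exclude the failed resource itself
print(sorted(affected), f"{share:.0%}")  # ['app-1', 'db-1', 'subnet-a', 'subnet-b'] 80%
```

Unlike the SPOF check, which looks at direct dependents and redundancy, this measure follows the whole chain, so a resource with only two direct dependents can still have a large blast radius.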

Recommended actions:

Introduce isolation boundaries (separate VPCs, independent service tiers)
Implement circuit breakers and graceful degradation patterns
Reduce coupling between service tiers

Tip: Use the Failure Simulator to see the exact cascade path for any high-blast-radius resource. This helps you understand the specific failure chain and design targeted mitigations.

4. Missing Redundancy

Severity: High or Medium

Missing redundancy is detected when resources that should have high-availability configurations are deployed without them. This differs from SPOF in that it focuses on individual resource configuration rather than dependency relationships.

What the Risk Radar looks for:

Compute instances running without auto-scaling or load balancing
Databases deployed in single-AZ mode without read replicas
Services without health checks or failover configurations
Workloads without backup or disaster recovery configurations
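Because this risk type looks at each resource's own configuration, it can be pictured as a set of per-resource checks. The sketch below is hypothetical: the record schema (`multi_az`, `read_replicas`, `auto_scaling`, `load_balanced`, `health_check`, `backup`) is invented for the example and does not reflect Guardian Pro's data model.

```python
def redundancy_gaps(resource):
    """List redundancy gaps for one resource record (hypothetical schema)."""
    gaps = []
    if resource["type"] == "database":
        if not resource.get("multi_az") and resource.get("read_replicas", 0) == 0:
            gaps.append("single-AZ database without read replicas")
    elif resource["type"] == "compute":
        if not resource.get("auto_scaling") or not resource.get("load_balanced"):
            gaps.append("compute without auto-scaling or load balancing")
    # These two checks apply to every resource type in this sketch.
    if not resource.get("health_check"):
        gaps.append("no health check or failover")
    if not resource.get("backup"):
        gaps.append("no backup or disaster recovery")
    return gaps

db = {"type": "database", "multi_az": False, "read_replicas": 0,
      "health_check": True, "backup": True}
print(redundancy_gaps(db))  # ['single-AZ database without read replicas']
```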

Recommended actions:

Enable multi-AZ deployment for databases
Place compute behind load balancers with auto-scaling
Configure health checks and automated failover
Implement backup and recovery policies

5. Consolidation Opportunity

Severity: Low or Medium

Consolidation opportunities arise when your infrastructure contains redundant, orphaned, or duplicate resources that could be consolidated to reduce complexity and cost.

What the Risk Radar looks for:

Orphaned resources not attached to any active workload (unused Elastic IPs, detached EBS volumes, idle load balancers)
Duplicate security groups with overlapping rules
Resources that serve the same purpose and could be merged
Unused or underutilised infrastructure components
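To illustrate the duplicate security group check, the sketch below groups security groups whose rule sets are identical. This is a deliberate simplification: the text above refers to overlapping rules, which is a broader condition, and the rule tuples and group IDs here are hypothetical.

```python
from collections import defaultdict

def duplicate_security_groups(groups):
    """Group security group IDs whose rule sets are identical."""
    by_rules = defaultdict(list)
    for group_id, rules in groups.items():
        # A frozenset ignores rule order, so reordered copies still match.
        by_rules[frozenset(rules)].append(group_id)
    return [sorted(ids) for ids in by_rules.values() if len(ids) > 1]

# Hypothetical rules as (protocol, port, source CIDR) tuples.
groups = {
    "sg-web-old": [("tcp", 443, "0.0.0.0/0")],
    "sg-web-new": [("tcp", 443, "0.0.0.0/0")],
    "sg-db": [("tcp", 5432, "10.0.0.0/16")],
}
print(duplicate_security_groups(groups))  # [['sg-web-new', 'sg-web-old']]
```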

Recommended actions:

Remove orphaned and unused resources
Consolidate duplicate security groups
Merge redundant infrastructure where possible

Info: Consolidation opportunities often align with cost optimisation recommendations in the Cost Intelligence module. Addressing them simplifies your architecture and reduces your AWS bill.

6. Network Topology Gap

Severity: High or Medium

Network topology gaps are detected when your network architecture is missing essential components or contains overly permissive configurations that create security or reliability risks.

What the Risk Radar looks for:

Private subnets without NAT gateway access where outbound connectivity is expected
Security groups with overly broad ingress rules
Missing network segmentation between application tiers
VPCs without flow logging enabled
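As one narrow example of an overly broad ingress rule, the sketch below flags rules whose source range covers every address (`0.0.0.0/0` or `::/0`). It uses Python's standard `ipaddress` module; the rule dictionaries are a hypothetical shape, and what the Risk Radar counts as overly broad may be wider than this.

```python
import ipaddress

def world_open_ingress(rules):
    """Return rules whose source range covers every address (0.0.0.0/0 or ::/0)."""
    return [r for r in rules if ipaddress.ip_network(r["cidr"]).prefixlen == 0]

# Hypothetical ingress rules for one security group.
rules = [
    {"port": 22, "cidr": "0.0.0.0/0"},
    {"port": 443, "cidr": "::/0"},
    {"port": 5432, "cidr": "10.0.0.0/16"},
]
print([r["port"] for r in world_open_ingress(rules)])  # [22, 443]
```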

Recommended actions:

Deploy NAT gateways for private subnets that need outbound access
Tighten security group rules to follow least-privilege principles
Implement proper network segmentation between tiers
Enable VPC flow logs for visibility and compliance

7. Containerisation Candidate

Severity: Low

This risk type identifies workloads that may benefit from modernisation to container-based architectures. It is an informational finding designed to highlight modernisation opportunities.

What the Risk Radar looks for:

Multiple EC2 instances running similar workloads without container orchestration
Workloads that would benefit from the density, portability, and scaling advantages of containers
Environments that have not adopted modern compute patterns

Recommended actions:

Evaluate workloads for container compatibility
Consider migrating to ECS or EKS for improved density and scaling
Use the Infrastructure Wizard to generate container-ready infrastructure templates

8. Underutilised Resource

Severity: Medium or Low

Underutilised resources are flagged when a resource is over-provisioned relative to its actual usage. Each finding is enriched with additional context from the dependency graph.

What the Risk Radar looks for:

Compute instances with consistently low CPU and memory utilisation
Databases with minimal connection counts and low query volume
Storage volumes with low I/O activity
Resources that are provisioned for peak load but consistently idle
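A utilisation check of this kind could look roughly like the sketch below, which treats a metric as consistently low when at least 95% of its samples fall below a limit. The limits (10% CPU, 20% memory), the 95% fraction, and the function names are assumptions for illustration, not documented Guardian Pro thresholds.

```python
def consistently_low(samples, limit, min_fraction=0.95):
    """True when at least `min_fraction` of the samples fall below `limit`."""
    if not samples:
        return False  # no data, so no verdict
    below = sum(1 for value in samples if value < limit)
    return below / len(samples) >= min_fraction

def is_underutilised(cpu_percent, memory_percent, cpu_limit=10.0, memory_limit=20.0):
    """True when both CPU and memory utilisation are consistently low."""
    return (consistently_low(cpu_percent, cpu_limit)
            and consistently_low(memory_percent, memory_limit))

idle = is_underutilised([3, 4, 2, 5, 3], [12, 15, 11, 14, 13])
bursty = is_underutilised([3, 4, 85, 5, 3], [12, 15, 11, 14, 13])
print(idle, bursty)  # True False
```

Counting samples rather than averaging them keeps a bursty instance, which genuinely needs its peak capacity, from being flagged because of a low mean.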

Recommended actions:

Rightsize instances to match actual workload requirements
Consider switching to burstable instance types for variable workloads
Migrate to newer-generation storage types for better cost efficiency
Review whether the resource is still needed

9. Template Update Needed

Severity: Medium

This risk is specific to infrastructure managed by CloudFormation. It is flagged when a resource's live configuration has drifted from its CloudFormation template -- typically because a change was applied directly to the resource (through the console, the CLI, or Guardian Pro's automated remediation) rather than through the template.

What the Risk Radar looks for:

Resources that belong to a CloudFormation stack but whose current configuration no longer matches the template
Manual remediations applied to CloudFormation-managed resources
Template drift that could cause unexpected behaviour during the next stack update
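Conceptually, drift is a property-by-property difference between what the template declares and what is live. The sketch below compares two plain dictionaries; the property names and values are hypothetical, and the function is not a Guardian Pro or CloudFormation API.

```python
def config_drift(template, live):
    """Return {property: (template_value, live_value)} for every property that differs."""
    keys = template.keys() | live.keys()
    return {
        key: (template.get(key), live.get(key))
        for key in sorted(keys)
        if template.get(key) != live.get(key)
    }

# Hypothetical case: encryption was enabled on the live resource, but the
# template still declares it as disabled.
template = {"InstanceType": "t3.micro", "Encrypted": False}
live = {"InstanceType": "t3.micro", "Encrypted": True}
print(config_drift(template, live))  # {'Encrypted': (False, True)}
```

In this example, redeploying the unchanged template would set `Encrypted` back to `False`, which is why the template should be updated first.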

Recommended actions:

Update the CloudFormation template to reflect the current desired state
Use the IaC Governance page to view corrected templates
Redeploy the updated template to bring the stack back into sync

Warning: If a CloudFormation template is not updated after a manual change, the next stack update could revert the change and reintroduce the original issue. Always update your templates to reflect the intended state.

Risk Severity Levels

Each detected risk is assigned a severity level based on its potential impact:

Severity	Meaning
Critical	Could cause widespread service disruption. Address immediately.
High	Significant resilience or security gap. Address in the near term.
Medium	A best-practice deviation. Plan for remediation.
Low	An optimisation opportunity or informational finding.

Fix Actions

Every risk includes a recommended fix action. Guardian Pro supports three types of fix actions:

Action Type	Description
Auto-remediate	Guardian Pro can apply the fix automatically. Review the preview, confirm, and the change is applied.
Guided steps	Step-by-step instructions with the specific AWS actions needed to resolve the risk.
View guide	A detailed explanation and guidance for addressing the risk, typically for complex changes that require planning.

Risk Lifecycle

Risks are detected during each scan and follow a lifecycle:

Detected -- The risk is identified during analysis and appears in the Risk Radar.
Active -- The risk remains present across subsequent scans.
Resolved -- The underlying condition is no longer detected. The risk is marked as resolved and moves to history.

Risks are automatically re-evaluated on every scan. If you remediate the underlying issue, the risk will be marked as resolved in the next scan cycle.
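The lifecycle can be modelled as a small state update applied after each scan. The sketch below is an assumption-laden illustration: the function name and data shapes are hypothetical, and treating a previously resolved risk that reappears as newly Detected is a choice made for this example rather than documented behaviour.

```python
def update_lifecycle(previous, detected_now):
    """Advance risk statuses given the set of risk IDs found in the latest scan.

    previous: dict of risk ID -> "Detected", "Active", or "Resolved".
    detected_now: set of risk IDs present in the latest scan.
    """
    current = {}
    for risk_id in detected_now:
        was = previous.get(risk_id)
        # Present in the prior scan and still present: Active. Otherwise: Detected.
        current[risk_id] = "Active" if was in ("Detected", "Active") else "Detected"
    for risk_id in previous:
        if risk_id not in detected_now:
            current[risk_id] = "Resolved"  # condition no longer detected
    return current

previous = {"r1": "Detected", "r2": "Active", "r3": "Active"}
latest = update_lifecycle(previous, {"r1", "r3", "r4"})
print(latest["r1"], latest["r2"], latest["r4"])  # Active Resolved Detected
```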

Viewing and Filtering Risks

The Risk Radar page allows you to filter and sort risks by:

Severity -- Focus on critical and high risks first
Risk type -- View all risks of a specific category
Service -- Filter by AWS service
Status -- View active risks, resolved risks, or both
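The same filters can be expressed over a list of risk records, for example when post-processing exported data. The sketch below assumes a hypothetical record shape with `severity`, `type`, `service`, and `status` keys, and sorts results from Critical to Low.

```python
SEVERITY_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}

def filter_risks(risks, severities=None, risk_type=None, service=None, status=None):
    """Filter risks by any combination of criteria, then sort by severity."""
    def keep(risk):
        return (
            (severities is None or risk["severity"] in severities)
            and (risk_type is None or risk["type"] == risk_type)
            and (service is None or risk["service"] == service)
            and (status is None or risk["status"] == status)
        )
    return sorted(filter(keep, risks), key=lambda r: SEVERITY_ORDER[r["severity"]])

# Hypothetical risk records.
risks = [
    {"id": "r1", "severity": "Low", "type": "Containerisation Candidate",
     "service": "EC2", "status": "active"},
    {"id": "r2", "severity": "Critical", "type": "Single Point of Failure",
     "service": "VPC", "status": "active"},
    {"id": "r3", "severity": "High", "type": "Availability Zone Concentration",
     "service": "EC2", "status": "resolved"},
    {"id": "r4", "severity": "High", "type": "Missing Redundancy",
     "service": "RDS", "status": "active"},
]
urgent = filter_risks(risks, severities={"Critical", "High"}, status="active")
print([r["id"] for r in urgent])  # ['r2', 'r4']
```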

Clicking on any risk opens a detail view showing the affected resources, the root cause analysis, and the recommended fix action.

Next Steps

Failure Simulator -- Simulate the impact of a resource failure to understand blast radius.
Architecture Map -- Visualise the topology that the Risk Radar analyses.
Action Centre -- View architectural risks alongside security findings and cost recommendations.
IaC Governance -- Address Template Update Needed risks with corrected templates.
