The Annual Pen Test is Dying. Continuous Offensive Security is now required for the cloud.

For decades, annual penetration tests have been the cornerstone of enterprise security assurance. Schedule the engagement in Q3, receive the report in Q4, remediate over the winter, and repeat. This is probably very familiar. It is also dangerously out of step with the way modern cloud environments actually operate.

The March 2026 Gartner® research report, The Future of Pen Testing Is Continuous Offensive Security Testing, states:

“Modern environments now change faster than traditional penetration testing (pen testing) can validate. Rapid cloud deployment, identity changes, API evolution, and AI‑enabled threats are producing exposure windows that periodic, point‑in‑time tests can no longer cover. As a result, cybersecurity leaders must shift to continuous offensive security testing (COST), a trigger‑driven, intelligence-led model that activates validation when material risk changes, not when the calendar dictates.”

The problem is structural. Cloud environments change continuously. Cloud DevOps is in constant motion: new workloads are deployed, IAM permissions are modified, APIs are exposed, and architectural configurations drift from their intended state. An annual penetration test captures a single moment in time. Between that moment and the next scheduled test, the environment changes hundreds or thousands of times. Each change is a potential new exposure. Each unvalidated exposure is a window an adversary can exploit or, worse, weaponize.

The traditional model was designed for a world where infrastructure changed slowly and attackers operated on human timescales. Neither of those conditions holds today. AI-driven adversaries move faster than any quarterly review cycle, and cloud environments evolve faster than any scoping document can track. The result is a growing gap between the security posture organizations believe they have and the one they actually have.

Continuous Offensive Security Testing

Gartner states:

“COST is an operating model for offensive security that validates an organization’s security defenses through trigger-driven, adversary-based testing as environments and threats change.”

In our view, the key phrase is trigger-driven. COST does not wait for the calendar. It activates when something material changes: a new deployment, a configuration modification, a privilege escalation, a zero-day announcement, or a threat intelligence spike. Validation is initiated in response to risk, not in response to a schedule.
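To make the trigger-driven idea concrete, here is a minimal Python sketch of validation that starts from change events instead of a calendar. Everything in it is an assumption made for illustration: the event kinds are simply the examples named above, and `ChangeEvent`, `should_validate`, `plan_validations`, and the resource names are hypothetical, not part of any Gartner or vendor API.

```python
from dataclasses import dataclass

# Hypothetical event kinds, taken from the examples of material change in the text.
MATERIAL_CHANGES = {
    "deployment",
    "config_change",
    "privilege_escalation",
    "zero_day",
    "threat_intel_spike",
}


@dataclass(frozen=True)
class ChangeEvent:
    kind: str      # what changed
    resource: str  # hypothetical identifier of the affected asset


def should_validate(event: ChangeEvent) -> bool:
    """Return True when the event is a material change that warrants validation."""
    return event.kind in MATERIAL_CHANGES


def plan_validations(events):
    """Return the resources to re-test, in arrival order, without duplicates."""
    seen = set()
    queue = []
    for event in events:
        if should_validate(event) and event.resource not in seen:
            seen.add(event.resource)
            queue.append(event.resource)
    return queue


# Illustrative event stream; the names are made up.
events = [
    ChangeEvent("deployment", "payments-api"),
    ChangeEvent("tag_update", "billing-db"),       # not material, ignored
    ChangeEvent("config_change", "payments-api"),  # same resource, queued once
    ChangeEvent("zero_day", "edge-proxy"),
]
validation_queue = plan_validations(events)
print(validation_queue)  # ['payments-api', 'edge-proxy']
```

The point of the sketch is only the control flow: nothing is tested because a date arrived, and nothing material goes untested because a date has not.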

COST unifies what have historically been separate, siloed activities, such as penetration testing, red teaming, bug bounty, and control validation, into a single, continuously operating capability. It blends automation, AI, and human adversarial reasoning to ensure that validation is fast, relevant, and aligned to the actual conditions attackers would exploit.

Gartner predicts: By 2028, over 60% of enterprise pen test programs will operate as continuous validation executed within DevSecOps pipelines and governed by CTEM, replacing annual assessments as the primary proof of resilience.

Why Cloud Needs Continuous Testing

The drivers behind COST are not abstract. They are the defining characteristics of modern cloud infrastructure.

The attack surface of a typical enterprise cloud environment is not static; it is a living system. New application deployments introduce new entry points. Identity and privilege configurations change with every onboarding, offboarding, and role modification. Multi-cloud architectural complexity creates blind spots that no single tool can fully illuminate. And AI-enabled attacker methodologies evolve on a weekly or even hourly basis, constantly probing for the gaps that emerge between validation cycles.

In Skyhawk’s opinion, in the cloud the conventional approach, which consists of periodic testing, reliance on unverified assumptions about control efficacy, and delayed confirmation of exploitable pathways, does not just underperform. It actively misleads. Security teams make resource allocation decisions based on findings that may be months out of date. Executives receive assurance reports that reflect a posture that no longer exists. This is exactly where adversaries find their opportunity: the exploitable, weaponizable gap between the last test and the current state of the environment.

COST addresses this by treating validation as an operational function rather than a compliance exercise. Findings are not delivered in a report at the end of an engagement; they are fed directly into remediation workflows, CTEM mobilization processes, and SecOps response pipelines in real time. Gartner states:

“Cybersecurity leaders must shift from asking “When was this last tested?” to prioritizing “What has changed, and has it been validated?” This requires rethinking offensive security not as a series of discrete engagements, but as a continuous operating capability tightly aligned to risk signals, threat intelligence, and business change.”

Skyhawk’s View: How Skyhawk Security’s Platform Delivers Continuous Offensive Testing for the Cloud

Skyhawk Security was built to answer exactly the question COST demands: given the current state of your cloud environment, what is actually exploitable and/or weaponizable right now?

The platform delivers continuous offensive security testing through two foundational capabilities that work in concert. The first is the Digital Twin, an AI-constructed, continuously updated simulation of the customer’s cloud environment that captures the full logical structure: IAM permission hierarchies, network topology, workload configurations, security control posture, and inter-service relationships. The Digital Twin is built from read-only API connections to AWS, Azure, and Google Cloud, requiring no agents and creating no production impact. It is not a one-for-one copy of your cloud, which would be prohibitively expensive. It is a digital representation that enables Skyhawk to produce accurate results, as if the testing were run in production. Critically, it updates in real time as the environment changes: every new deployment, every permission modification, and every configuration drift is reflected immediately.

The second capability is the AI Red Team. It is a set of purpose-built deep learning agents, eight years in development and trained on thousands of adversarial TTPs, that continuously execute intelligent offensive simulations against the Digital Twin. These agents do not pattern-match against known vulnerability signatures. They reason adversarially: given the current state of this environment, what attack chains are viable? Which misconfigurations can be combined with which overprivileged identities to reach which crown jewel assets? They look at what can be dynamically manipulated to gain access to your valuable cloud assets.
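As a rough illustration of what chaining a misconfiguration with an overprivileged identity means, the Python sketch below models a tiny environment as a graph and searches it for a route to a crown jewel asset. This is not Skyhawk's method: a plain breadth-first search stands in for the adversarial reasoning described above, and the graph, node names, and `find_attack_path` function are all hypothetical.

```python
from collections import deque

# Hypothetical environment graph. Each edge is a step an attacker could take,
# labeled with the weakness that makes the step possible.
EDGES = {
    "internet": [("web-vm", "security group open to 0.0.0.0/0")],
    "web-vm": [("app-role", "instance profile attached to the VM")],
    "app-role": [
        ("logs-bucket", "role can read logs"),
        ("admin-role", "role is allowed to assume the admin role"),
    ],
    "admin-role": [("customer-db", "admin role has full database access")],
    "logs-bucket": [],
    "customer-db": [],
}


def find_attack_path(edges, start, target):
    """Breadth-first search: shortest chain of (node, reason) steps, or None."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        for neighbor, reason in edges.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [(neighbor, reason)]))
    return None


path = find_attack_path(EDGES, "internet", "customer-db")
for step, reason in path:
    print(f"-> {step}  ({reason})")
```

No single edge here is a breach on its own. The open security group, the attached instance profile, and the role-assumption permission only matter in combination, which is why removing one of them (for example, the permission to assume the admin role) makes the search return `None`.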

In our opinion, this is the COST model in practice. Every material change to the environment, whether a new workload, a modified IAM policy, or a newly disclosed CVE, triggers a new simulation cycle. The AI Red Team re-evaluates the attack surface, identifies new or changed attack paths, and delivers findings directly into the customer’s security workflow. The exposure window between a change and its validation is measured in minutes, not months.

Gartner states:

“The future of pen testing is continuous, business-risk-driven, and guided by threat intelligence. Embedding continuous validation into operational workflows ensures that organizations proactively prioritize and address the real-world attacks that target the most critical and impactful assets.”

In our opinion, Skyhawk operationalizes this vision at cloud scale, without the scheduling overhead, scoping delays, or production risk of traditional offensive security engagements.

The output of every Skyhawk simulation cycle is Adversarial Exposure Validation: not a list of vulnerabilities ranked by CVSS score, but a confirmed, evidence-based answer to the question that matters: which of these findings can an adversary actually weaponize against your environment today to cause a breach? This is the finding that drives remediation priority. This is the signal that cuts through alert noise. And this is the assurance that annual pen testing, by its very nature, cannot provide.
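The difference between ranking by CVSS score and ranking by validated exploitability can be shown with a small Python sketch. It assumes each finding carries a flag recording whether a simulated attack path actually used it to reach a critical asset. The `Finding` class, the flag name, the sample findings, and their scores are invented for illustration and do not come from any real assessment or product output.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    name: str
    cvss: float
    # Hypothetical flag: True if a simulated attack path used this finding
    # to reach a critical asset.
    reaches_critical_asset: bool


def prioritize(findings):
    """Validated findings first; CVSS only breaks ties within each group."""
    return sorted(findings, key=lambda f: (not f.reaches_critical_asset, -f.cvss))


# Made-up sample data.
findings = [
    Finding("outdated library on isolated test VM", 9.8, False),
    Finding("overprivileged role on internet-facing VM", 6.5, True),
    Finding("public bucket holding no sensitive data", 5.3, False),
]

by_cvss = sorted(findings, key=lambda f: -f.cvss)
by_validation = prioritize(findings)

print("Top by CVSS:      ", by_cvss[0].name)
print("Top by validation:", by_validation[0].name)
```

Under these assumptions the 9.8-scored finding tops the CVSS list even though nothing reaches it, while the validated ordering puts the lower-scored but reachable role first.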

This is also exactly why Skyhawk is needed in the age of Mythos. Check out this blog and webinar to learn more!

See it for yourself! Sign up for our free 30-day trial.

Want to go deeper? Read our differentiators whitepaper for the full picture, then book a meeting with our team to discuss your specific environment.

Gartner subscribers can read the full report at https://www.gartner.com/document-reader/document/7558445

Gartner, The Future of Pen Testing is Continuous Offensive Security Testing by Dhivya Polle, Carlos De Sola Caraballo, Mitchell Schneider, published March 6, 2026.

GARTNER is a trademark of Gartner, Inc. and its affiliates.