Attack surface management: Six steps for success in OT/ICS

Courtesy: Brett Sayles

Attack surface management insights

  • Attack surface management is the act of identifying, prioritizing and minimizing threat vectors in a business environment.
  • ASM has six elements: discovery, assessment, context, prioritization, remediation and maintenance.

Over the past two to three years, enterprises have realized the critical importance of attack surface management (ASM) to identify, prioritize and minimize the potential threat vectors in their environment. Besides the general growth in attacker activity, the largest driver of this need is that organizations’ attack surfaces have expanded so much in the past five years or so. Those “surfaces” are often unmapped and unknown, much like the “unknown” parts of the world prior to the Western explorers’ “discoveries.”

Cloud and software-as-a-service (SaaS) were the initial obvious causes of attack surface expansion — and were what drove the initial push to manage these unknown dominions.

However, as operational technology (OT) systems such as industrial control systems (ICS), building controls and transportation controls become more connected in the drive for greater efficiency and effectiveness of production, OT and internet of things (IoT) have become the “new frontier” of the attack surface management challenge. In fact, according to a 2021 survey of CISOs and senior cybersecurity leaders, the No. 1 challenge of current ASM initiatives is the identification and management of OT/IoT and other unknown systems.

Most attack surface management tools and approaches do not understand the technical complexities and operational requirements of these OT systems. But there is a way to effectively and efficiently conduct ASM in OT.

What is attack surface management?

Attack surface management is the continuous discovery, collection, assessment, classification, prioritization, remediation and monitoring of IT/OT/IoT assets. This may sound like traditional asset inventory or vulnerability management. However, ASM takes an “attacker” view of the challenge. This approach adds significant value to more traditional vulnerability management approaches because it helps to prioritize the risks that attackers are most likely to exploit. When done correctly, it allows an organization to prioritize its most critical exposures, based not just on a CVE score but on the true potential for a critical event caused by an attacker.

OT attack surface management includes six key elements:

  1. Discovery: The ability to see all “corners of the world” of your attack surface. This includes the discovery of unknown assets, unknown connectivity (both actual flows and potential flows due to misconfigured network devices), software, configurations, users, etc.
  2. Assessment: Identifying the risk of an asset based on a 360-degree risk assessment that includes all elements of the discovery — users and account access, network access, software and hardware vulnerabilities, missing patches, insecure configurations, etc.
  3. Context: This adds an overlay of criticality, usage, owners, etc. to create a risk profile of the asset as it relates to an attacker’s perspective.
  4. Prioritization: This is where ASM truly differs from vulnerability or inventory management. ASM prioritizes risks based on the attacker’s perspective using the above information. The eventual result is a risk score that takes into account the various elements to prioritize actions.
  5. Remediation: The consistent hardening of security directed by the prioritization in the prior step. This includes comprehensive actions such as network protection, patching, hardening, etc.
  6. Maintenance: Perhaps the hardest part of the entire process is the ongoing updating and regular reviewing of threat vectors to identify new risks and continually update current risks based on the remediation actions taken and new vulnerabilities identified.
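
To make the idea of a combined risk score more concrete, here is a minimal Python sketch of how assessment and context could feed prioritization. It is an illustration only: the `AssetRisk` fields, the 0-to-1 scales and the multiplicative formula are assumptions made for this example, not a description of any particular ASM product or scoring standard.

```python
from dataclasses import dataclass


@dataclass
class AssetRisk:
    """Hypothetical per-asset inputs, each normalized to the range 0-1."""
    name: str
    exposure: float      # how reachable the asset is for an attacker
    weakness: float      # vulnerabilities, missing patches, insecure configs
    criticality: float   # operational impact if compromised (the "context")


def risk_score(asset: AssetRisk) -> float:
    """Likelihood (exposure x weakness) times impact, scaled to 0-100."""
    likelihood = asset.exposure * asset.weakness
    return round(100 * likelihood * asset.criticality, 1)


def prioritize(assets):
    """Return assets ordered from highest to lowest risk score."""
    return sorted(assets, key=risk_score, reverse=True)


# Illustrative values only; real inputs would come from discovery data.
assets = [
    AssetRisk("historian", exposure=0.9, weakness=0.6, criticality=0.5),
    AssetRisk("safety-plc", exposure=0.2, weakness=0.8, criticality=1.0),
    AssetRisk("hmi-01", exposure=0.7, weakness=0.7, criticality=0.8),
]

for asset in prioritize(assets):
    print(asset.name, risk_score(asset))
```

In this toy data set the HMI ranks first even though the safety PLC is more critical and the historian is more exposed, which is the kind of trade-off a single-dimension score such as CVE severity would not surface.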

OT attack surface management challenge

This six-step process sounds easy, but many organizations run into difficulties at various steps along the journey. This is particularly the case for industrial organizations that have significant OT footprints. Many organizations have discovered that traditional vulnerability management or threat detection, especially in the OT world, creates resource burdens that are just not feasible.

In OT, the first challenge is just getting an accurate “map” of the attack surface. The traditional approach of manual inventories or passive monitoring from network SPAN ports just does not provide an accurate “map of the world,” so to speak. It misses assets, incorrectly identifies vulnerabilities and leaves an organization with no ability to immediately take remediating actions. Further, much OT threat detection creates huge volumes of alerts with little specific attack surface insight to prioritize those alerts.

Remediation is challenging because many systems are old and cannot be updated, which calls for a more holistic approach to remediation. Finally, most organizations do not have a true “enterprise” view of their OT surface. The information is often stuck at the plant level, which makes resourcing and prioritization very challenging.

Recent attacks focused on IT that crossed over into OT systems are an example of this lack of true visibility. Ransomware is now the No. 1 concern of OT security practitioners, according to the 2021 SANS survey — it wasn’t in the top five two years ago.

Courtesy: Verve Industrial

That threat vector — coming through the information technology (IT) side and bridging into OT — is part of a company’s attack surface that is often not seen completely. Once through that connection, the attack surface within the OT environment usually has many “dark spots” where light doesn’t shine.

Mandiant’s research shows that 99% of all attacks start with and leverage the IT-type infrastructure that sits between IT and OT. These connections are often not well understood. In our own research, Verve finds that misconfigured firewalls, dual-NIC machines bridging networks, and individual programmable logic controllers (PLCs) and other devices connected directly to the corporate network are present in almost every plant we assess. Most organizations simply lack a view of that surface.
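
As a rough illustration of how two of these findings could be checked against an asset inventory, the Python sketch below flags machines with interfaces in both IT and OT address ranges, and controllers that sit on the corporate network. The subnet ranges, asset names and inventory layout are hypothetical; a real check would use the site's own address plan and discovered interface data.

```python
import ipaddress

# Hypothetical address plan; real ranges come from the site's network design.
IT_NETS = [ipaddress.ip_network("10.10.0.0/16")]
OT_NETS = [ipaddress.ip_network("192.168.50.0/24")]


def zones_for(addresses):
    """Return the set of zones ("IT", "OT") that a list of IPs falls into."""
    zones = set()
    for addr in addresses:
        ip = ipaddress.ip_address(addr)
        if any(ip in net for net in IT_NETS):
            zones.add("IT")
        if any(ip in net for net in OT_NETS):
            zones.add("OT")
    return zones


def find_bridging_assets(inventory):
    """Assets with interfaces in both IT and OT ranges (e.g. dual-NIC hosts)."""
    return sorted(
        name for name, info in inventory.items()
        if zones_for(info["addresses"]) == {"IT", "OT"}
    )


def find_controllers_on_it(inventory):
    """PLCs that have an address inside the corporate (IT) ranges."""
    return sorted(
        name for name, info in inventory.items()
        if info["type"] == "plc" and "IT" in zones_for(info["addresses"])
    )


# Illustrative inventory; the structure is an assumption for this example.
inventory = {
    "eng-workstation": {"type": "workstation",
                        "addresses": ["10.10.4.21", "192.168.50.12"]},
    "plc-line3": {"type": "plc", "addresses": ["10.10.7.80"]},
    "hmi-01": {"type": "hmi", "addresses": ["192.168.50.30"]},
}

print("bridging:", find_bridging_assets(inventory))
print("controllers on IT network:", find_controllers_on_it(inventory))
```

A check like this only works if the inventory records every interface on each asset, which is one reason passive network observation alone tends to miss these bridges.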

Succeeding in OT ASM

Attack surface management is possible in OT, but it requires a fundamentally different approach than most organizations or ASM providers take today. There are six key steps to getting this right:

1) Successful discovery: Capturing endpoint and potential network connections accurately and comprehensively.

On the IT side of the house, if the team came to the chief information security officer (CISO) and said the only way to discover the attack surface is to gather manual inventories or observe network traffic, the team wouldn’t last long. IT uses scanning, agents and discovery tools, as well as manual and network approaches, to capture a full picture of its attack surface. But in OT, because of the sensitivity of these devices, security leaders have been left with less-than-effective options.

There is an OT-safe and more effective alternative, however. Control systems engineers interact with their systems every day to program, back up and tune them. These same techniques can be used to discover the full view of the attack surface. Using an endpoint-focused approach to asset inventory rather than a network approach allows not only for a much broader view into all of the corners of the map, but also for a deeper view of each asset. This endpoint approach discovers not only that a device exists, but also all of its users, accounts, software, patch status, firmware versions, configuration status, possible (not just actual) network paths, anti-malware status, etc.
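
One way to picture the difference in depth is as a record per asset, with a simple check for which attributes are still unknown. The Python sketch below is a hypothetical data model built from the attributes listed above; the class name, field names and sample values are assumptions for illustration, not the schema of any real inventory tool.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class EndpointRecord:
    """Hypothetical per-asset record; field names are illustrative."""
    hostname: str
    firmware_version: Optional[str] = None
    patch_status: Optional[str] = None
    antimalware_status: Optional[str] = None
    users: list = field(default_factory=list)
    software: list = field(default_factory=list)
    possible_network_paths: list = field(default_factory=list)


def coverage_gaps(record):
    """Names of attributes that discovery has not populated yet."""
    return [
        f.name for f in fields(record)
        if getattr(record, f.name) in (None, [])
    ]


# A network-only view typically knows little more than that the device exists.
network_only = EndpointRecord(hostname="plc-line3")

# An endpoint-focused view fills in the rest (values here are made up).
endpoint_view = EndpointRecord(
    hostname="plc-line3",
    firmware_version="example-fw-1.0",
    patch_status="missing 3 updates",
    antimalware_status="not applicable",
    users=["operator", "maintenance"],
    software=["controller runtime"],
    possible_network_paths=["hmi-01", "eng-workstation"],
)

print(coverage_gaps(network_only))
print(coverage_gaps(endpoint_view))
```

Tracking the gaps explicitly gives a simple measure of how complete the "map" is for each asset before any risk assessment is attempted.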

2) 360-degree assessment: A platform that allows for a comprehensive risk view.

An effective assessment must include a comprehensive view of the risk to each asset. That should take into account all of those findings from the complete discovery described above. A 360-degree view allows the organization to make appropriate trade-offs in risk priority.

Courtesy: Verve Industrial

3) Adding important context: Asset criticality and use are key to future prioritization.

As the above chart shows, a full 360-degree view needs to include important context about an asset, described as “asset criticality/impact” in the chart. To conduct the next step of prioritization effectively, the attack surface needs to include robust context about the criticality of the asset, its use, its network connections to other devices, etc. In some cases, organizations will have some of this data available from other efforts, such as disaster recovery analysis. But in others, this context needs to be derived from the discovered data, such as the asset’s connections and installed software.

4) Effective risk prioritization: Don’t get overwhelmed by all of the potential risks you see.

One of the biggest challenges in OT security is the number of risks found in many of these environments. In most cases, OT systems aren’t patched regularly, older devices run out-of-date firmware, the anti-malware status may not be regularly updated, etc. In most assessments, our platform identifies thousands of critical vulnerabilities. There’s no way an organization can get to all of this immediately.

Key to attack surface management in OT is deciding which risks to remediate first. Start with those most likely to be used by an attacker and to have significant impact on the environment. This should balance the exploitability of a vulnerability with the extent to which an attack could spread to critical assets across the environment. The only way to do this effectively is to bring all of the data into a single database across original equipment manufacturer (OEM) systems, endpoint and network risks, operational impact, etc. An effective OT ASM platform needs to enable all of that.
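
The balance between exploitability and potential spread can be sketched as a graph problem: treat possible network paths as edges, then weight each asset's exploitability by the criticality of everything an attacker could reach from it. The Python example below is one simple way to express that idea. The graph, the criticality weights, the exploitability values and the scoring formula are all assumptions for illustration.

```python
from collections import deque


def reachable(graph, start):
    """Breadth-first search: every asset reachable from `start` via possible paths."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return seen


def spread_weighted_priority(asset, exploitability, graph, criticality):
    """Exploitability times the criticality of the asset plus all it can reach."""
    blast = sum(criticality.get(n, 0) for n in reachable(graph, asset))
    return exploitability * (criticality.get(asset, 0) + blast)


# Hypothetical possible-path graph: edges point from a foothold to next hops.
graph = {
    "historian": ["hmi-01"],
    "hmi-01": ["plc-a", "plc-b"],
    "kiosk": [],
}
criticality = {"historian": 2, "hmi-01": 3, "plc-a": 5, "plc-b": 5, "kiosk": 1}
exploitability = {"historian": 0.6, "hmi-01": 0.4, "kiosk": 0.9}

ranking = sorted(
    exploitability,
    key=lambda a: spread_weighted_priority(a, exploitability[a], graph, criticality),
    reverse=True,
)
print(ranking)
```

In this toy example the kiosk is the easiest asset to exploit but ranks last, because nothing critical can be reached from it, while the historian ranks first because it offers a path to both PLCs.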

We would also argue that in OT, a key to this prioritization is to pair the database with human analysts who can bring in insights from other entities and threat data. This “man and machine” approach offers the strongest basis for prioritization.

5) Rapid and safe remediation management includes MANAGEMENT!

The discovery of a threat is irrelevant if you can’t respond rapidly and safely in OT. In IT, organizations focus on weekly patching, automated resetting of configurations, network access control to refuse connections from unknown assets, etc. In OT, however, many of these solutions can cause an operational impact on the processes you are trying to protect. As a result, many OT solutions to date have focused on detection only: they flag anomalous patterns of network behavior that you may want to address.

An effective OT ASM platform needs to enable both the prioritization of risks and the ability to immediately pivot to remediation in a way that’s efficient and safe for the processes. This requires an OT-safe MANAGEMENT platform that allows you to patch, harden configurations, remove unapproved software, remove or limit access for certain accounts or users, create network segmentation, etc. The most efficient way to do this is to integrate it into the ASM platform rather than rely on separate tools or manual efforts to conduct each of these different remediation actions.

The best way to think about this is what we call “Think Global: Act Local.” This architecture enables centralized analysis and prioritization of remediation actions but also ensures that when actions are actually executed, they are controlled by those closest to the process such as DCS engineers. This balances the need for efficiency and OT safety.
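
A minimal Python sketch of that division of responsibility might look like the following: a central queue proposes and tracks remediation actions across sites, but an action cannot run until someone local has approved it. The class and function names are hypothetical and the workflow is deliberately simplified. It illustrates the pattern, not any vendor's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RemediationAction:
    """A proposed change to one asset at one site (illustrative fields)."""
    asset: str
    site: str
    description: str
    approved_by: Optional[str] = None
    executed: bool = False


class CentralQueue:
    """'Think global': one place to propose and track actions for all sites."""

    def __init__(self):
        self.actions = []

    def propose(self, action):
        self.actions.append(action)

    def pending_for(self, site):
        return [a for a in self.actions if a.site == site and not a.executed]


def approve(action, engineer):
    """'Act local': someone close to the process signs off on the change."""
    action.approved_by = engineer


def execute(action):
    """Refuse to run any action that lacks local approval."""
    if action.approved_by is None:
        raise PermissionError("local approval required before execution")
    action.executed = True


queue = CentralQueue()
patch = RemediationAction("hmi-01", "plant-a", "apply vendor-approved patch")
queue.propose(patch)

try:
    execute(patch)  # blocked: nobody at the plant has approved it yet
except PermissionError as err:
    print("blocked:", err)

approve(patch, "dcs-engineer-plant-a")
execute(patch)
print(queue.pending_for("plant-a"))
```

The design choice here is that the central team never holds the ability to execute on its own, so efficiency gains from centralized analysis do not come at the expense of process safety.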

Courtesy: Verve Industrial

6) Efficient maintenance: Reduce labor requirements by more than 70%.

The No. 1 challenge of OT security is resource constraints. Industrial operations personnel are already overwhelmed, and were even before the Great Resignation of the past couple of years. This stress on resources is amplified when the need for security knowledge is layered on top. This is seen in survey results such as the KPMG-CSAI survey of control systems security personnel shown below.

Courtesy: Verve Industrial

The “Think Global: Act Local” approach above also allows an organization to radically reduce the costs of maintaining the attack surface. We find that many organizations are relying on local site personnel to manage their OT security. This is just not a feasible approach, for either consistency or labor efficiency.



