To effectively combat pervasive cyberthreats like ransomware, organizations need to centralize their detection and response efforts. Oftentimes, they will turn to an endpoint detection and response (EDR) tool or managed detection and response (MDR) to achieve this. EDR/MDR solutions combine real-time continuous monitoring and collection of endpoint data with automated response capabilities, such as quarantining an infected device or blocking specific IP addresses. So what does this look like in practice?
A user is tricked into doing something that compromises their machine: clicking a malicious link, opening a weaponized email attachment masquerading as an invoice that needs to be paid, or plugging in that handy new USB storage device handed out by the nice fellow at the trade show, who said it could be used to encrypt the user’s crypto wallet.
Now the user’s machine is compromised, and the EDR/MDR tool has detected that low-level changes have been made to the device, or that the device is now communicating with a known-malicious domain listed in the organization’s latest threat intel report. The EDR/MDR tool takes action by isolating the compromised machine from the network and revoking the user’s access to the domain or VPN. This is great. The organization has stopped the spread of ransomware, or stopped an attacker from gaining access to its network, at the cost of keeping an employee or group of employees from getting that end-of-quarter report done. This happens every day in information technology (IT) security, and it’s a well-accepted trade-off.
Why don’t traditional EDR/MDR tools work in OT systems?
However, for critical infrastructure organizations relying heavily on operational technology (OT) to power their business, EDR and MDR are generally not a good fit because the technology is too intrusive for sensitive industrial control system endpoints. The level of control that an EDR tool has over an endpoint can potentially impact processes that are too critical to shut off in an OT system, such as a safety mechanism or critical power generation device.
Let’s say another scenario plays out, but this time in an OT environment. A company has good controls in place: activities such as checking email and using removable media on OT systems are well controlled and managed. It’s now Friday afternoon, and personnel from the original equipment manufacturer (OEM) are onsite to fix a bug: a memory leak that causes the safety system to restart occasionally.
The updates are going well, but the vendor drops the approved removable media, and it falls through a drainage grate on the manufacturing floor, never to be seen again. To continue working, the vendor grabs a backup removable drive that he used at his last customer site, which runs the same system, because he knows the software update is on there. He plugs it in, uses it and accidentally infects the safety system with command-and-control software. The EDR/MDR solution isolates the safety system from the network, which means it can no longer monitor for unsafe operating conditions, creating a safety hazard for the organization. This is a prime example of an IT-based response that is unfit for OT.
Alternatives to EDR/MDR for OT
A promising alternative to EDR/MDR solutions for OT environments is to use an OT-specific asset management tool to monitor for suspicious activity on an endpoint and to feed that data into a centralized machine learning engine within a security information and event management (SIEM) platform, such as Splunk. Security teams can then use common data models for behavioral detection, while customizing response workflows for OT systems in their security orchestration, automation and response (SOAR) tool instead of relying on more intrusive EDR/MDR responses.
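To make the data flow concrete, here is a minimal sketch of how an alert from an OT asset management tool might be enriched with asset context before it is forwarded to a SIEM. Everything in it is an assumption for illustration: the inventory, the field names (`asset_id`, `function`, `criticality`) and the helper functions are hypothetical, not the schema or API of any particular product.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class AssetContext:
    """Context an OT asset management tool could hold about an endpoint."""
    asset_id: str
    function: str      # what the asset does in the process
    criticality: str   # hypothetical scale: "low", "medium", "high", "safety"


# Hypothetical inventory; a real deployment would query the asset tool.
ASSET_INVENTORY = {
    "sis-01": AssetContext("sis-01", "safety system", "safety"),
    "hmi-07": AssetContext("hmi-07", "operator workstation", "medium"),
}


def enrich_alert(alert: dict) -> dict:
    """Return a copy of the alert with asset context attached."""
    enriched = dict(alert)
    context = ASSET_INVENTORY.get(alert["asset_id"])
    if context is None:
        # Unknown assets are flagged rather than silently dropped.
        enriched["asset_context"] = None
        enriched["criticality"] = "unknown"
    else:
        enriched["asset_context"] = asdict(context)
        enriched["criticality"] = context.criticality
    return enriched


def to_siem_event(alert: dict) -> str:
    """Serialize the enriched alert as JSON for ingestion by a SIEM."""
    return json.dumps(enrich_alert(alert), sort_keys=True)


if __name__ == "__main__":
    raw = {"asset_id": "sis-01", "signal": "unauthorized_change"}
    print(to_siem_event(raw))
```

The point of the sketch is that criticality and function travel with the alert, so downstream detection and response logic can treat a safety system differently from an ordinary workstation.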
So, let’s take the same OT scenario again. This time, instead of using an EDR/MDR solution, the company leverages data from the OT asset management system and combines it with the power of a SIEM that also has SOAR technology. The OT vendor uses the tainted removable media and infects the safety system with command-and-control software.
The OT asset management system detects the change and shares the alert with the SIEM, along with deep context about that endpoint, such as its criticality and function. The SIEM then triggers an OT response workflow in the SOAR platform, alerting a security operations center (SOC) analyst, as well as the asset owner in OT, to the compromise. The SOC analyst looks at the network data and can see the malware attempting to phone home to its command-and-control (C&C) server, but also that the firewall is blocking the traffic. The analyst then digs into the asset information to ensure no other unauthorized changes have occurred. In parallel, the OT asset owner safely transitions to a backup safety system and works with the vendor to restore the primary system to a known good state.
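The difference between the two response styles can be sketched as a small decision function. This is an illustrative sketch only, assuming hypothetical action names, a two-value `environment` field and a made-up priority rule; it is not the playbook format of any specific SOAR product.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Action(Enum):
    ISOLATE_HOST = "isolate_host"
    REVOKE_USER_ACCESS = "revoke_user_access"
    NOTIFY_SOC_ANALYST = "notify_soc_analyst"
    NOTIFY_OT_ASSET_OWNER = "notify_ot_asset_owner"


@dataclass(frozen=True)
class ResponsePlan:
    actions: List[Action]
    priority: str  # hypothetical values: "normal" or "urgent"


def plan_response(environment: str, criticality: str) -> ResponsePlan:
    """Choose response actions for a suspected compromise.

    IT endpoints get the automated containment described earlier;
    OT endpoints are never isolated automatically, and people are
    alerted instead.
    """
    if environment == "it":
        return ResponsePlan(
            [Action.ISOLATE_HOST,
             Action.REVOKE_USER_ACCESS,
             Action.NOTIFY_SOC_ANALYST],
            "normal",
        )
    if environment == "ot":
        # Assumption: safety-critical assets raise the alert priority.
        priority = "urgent" if criticality == "safety" else "normal"
        return ResponsePlan(
            [Action.NOTIFY_SOC_ANALYST, Action.NOTIFY_OT_ASSET_OWNER],
            priority,
        )
    raise ValueError(f"unknown environment: {environment!r}")


if __name__ == "__main__":
    plan = plan_response("ot", "safety")
    print([a.value for a in plan.actions], plan.priority)
```

Keeping isolation out of the OT branch entirely, rather than gating it on a threshold, mirrors the scenario above: the decision to take a safety system offline stays with the asset owner, who can first move to a backup.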