Using Machine Learning to Ward Off Zero-Day Attacks on the Internet of Things

How can we possibly defend against the wave of unprecedented new cyberattacks that are almost certainly going to be targeted against the Internet of Things (IoT)?

Zero-day vulnerabilities will continue to intensify across the IoT. A zero-day attack exploits vulnerabilities of which the providers of the affected infrastructure may not be aware or, at the very least, for which they and cybersecurity vendors may have no prebuilt defenses. These vulnerabilities, which almost certainly abound in the IoT, make users sitting ducks for whatever attacks can exploit them until patches and other defenses become available.

Why is the IoT a potential rat’s nest of zero-day vulnerabilities? As a fog-computing infrastructure in which the number of endpoints continues to expand, the IoT presents a phenomenally huge attack surface. The term refers to the sum total of exploitable entry points for malware, intrusions, and advanced persistent threats. These vulnerabilities derive from the inherently complex, dynamic, distributed, heterogeneous, and innovative environment that the IoT represents. In addition, the IoT increasingly pins its intelligence on a deepening stack of cognitive-computing, deep learning, and machine learning algorithms, all of which, as I discussed in this recent post, present a large attack surface in their own right.

As the IoT becomes a mass consumer phenomenon, zero-day attacks might go well beyond hacks into your computers, online accounts, and private data. They might even endanger your physical security. As you add IoT sensor/actuator “smarts” to your home, car, office, and physical person, your entire existence may become acutely vulnerable to tracking, stalking, and attack in ways that you cannot anticipate.

To mitigate the threat from zero-day and other cyberattacks, the IoT needs multi-layered security safeguards in each of the following domains:

  • Endpoint security: Safeguards must be built into the things themselves. Key enablers would be standardized IoT interfaces that incorporate robust security and attack-protection safeguards in the development of IoT products; leverage widely vetted open security standards; and embed modular security-aware hardware and software designs. Ideally, the endpoints should have enough embedded analytic intelligence to ward off attacks unassisted by gateways and cloud-based protection mechanisms. This latter capability is especially important in scenarios where endpoints have limited or nonexistent connectivity to infrastructure-resident safeguards. Facilitating all of this should be a fog-based, big-data repository of the identities of things, profiles, configurations, privileges, histories, and other metadata that will prove essential for endpoint security.
  • Interaction security: Safeguards must be built into the middleware and protocols that govern how IoT endpoints and other nodes interact with users, local and remote applications, cloud and other infrastructures, and each other. These safeguards must at the very least leverage identity, authentication, access control, message encryption, transport-level security, key exchange, digital signatures, de-identification, privacy, intrusion detection, alerting, auditing, monitoring, malware prevention, DDoS prevention, and other infrastructure services in IoT environments. They should rely on analytic and trend-analysis tools to identify threats across the entire IoT cloud under your purview. They should also facilitate monitoring and logging of IoT security-relevant events. Furthermore, they should embed and continuously update analytics algorithms that detect various security issues, predict and preempt attacks, and automatically alert, escalate, and log all priority issues. In addition, they should escalate exceptional, unprecedented, and undiagnosed IoT issues to human security analysts for further investigation. Supporting all of this should be a big-data security incident and event management repository, which will be the foundation for logging IoT events for predictive, real-time, and historical analysis.
  • IoT ecosystem security: Security vulnerabilities may be introduced anywhere in the constellation of solution providers, service businesses, certification authorities, and others who build, deploy, test, manage, and vouch for the endpoints and infrastructures. Safeguards must at the very least assemble the compliance, legal, contractual, trust, reputation, governance, operational, and risk management frameworks to handle the interlocking responsibilities of all these parties to ensure end-to-end IoT security. They should enable us to inspect, certify, vet, monitor, and audit the suppliers of IoT components and life-cycle services for conformance to generally accepted IoT security practices. Furthermore, they should implement strong authentication, permission management, content encryption, tamper proofing, and other technical safeguards to prevent unauthorized parties in the value chain from gaining access to sensitive data. Supporting all this should be a big-data ecosystem registry for tracking all value-chain parties who come into contact with security-sensitive things throughout their life cycles.
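The alert-escalate-log behavior described under interaction security can be pictured as a small triage routine. What follows is only an illustrative sketch in Python: the `SecurityEvent` fields, the thresholds, and the action names are all hypothetical and are not drawn from any particular product or standard. It assumes some upstream detection model has already assigned each event a threat score.

```python
import logging
from dataclasses import dataclass

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("iot-security")  # hypothetical logger name


@dataclass
class SecurityEvent:
    """Hypothetical event record; field names are illustrative only."""
    device_id: str
    threat_score: float          # 0.0 (benign) to 1.0 (malicious), from a detection model
    matches_known_pattern: bool  # True if the behavior has already been diagnosed


def triage(event, alert_threshold=0.8, review_threshold=0.5):
    """Log every event, then choose an action (thresholds are assumptions)."""
    log.info("event device=%s score=%.2f known=%s",
             event.device_id, event.threat_score, event.matches_known_pattern)
    # High-scoring, already-diagnosed threats are handled automatically.
    if event.threat_score >= alert_threshold and event.matches_known_pattern:
        return "block_and_alert"
    # Suspicious but undiagnosed behavior goes to a human security analyst.
    if event.threat_score >= review_threshold:
        return "escalate_to_analyst"
    # Everything else is kept only in the event log for later analysis.
    return "log_only"


if __name__ == "__main__":
    print(triage(SecurityEvent("camera-01", 0.95, True)))
    print(triage(SecurityEvent("lock-07", 0.90, False)))
    print(triage(SecurityEvent("bulb-03", 0.10, False)))
```

The point of the second branch is that an unprecedented pattern, however alarming its score, is routed to a person rather than to an automated response built for known threats.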

Implicit in this security architecture is the need for cognitive-computing algorithms to sift through massive, never-ending streams of IoT data in order to detect, predict, and prevent zero-day and other attacks before they can do damage. In that regard, I came across a recent Network World article that discusses how machine learning algorithms can mitigate these risks through pervasive deployment across distributed cloud infrastructures. Specifically, it focuses on zero-day vulnerabilities from a new generation of advanced persistent threats, which it refers to as “the most sophisticated mutations of viruses and malware.”

What I found most interesting is the article’s discussion of the advantages of machine learning (vis-à-vis established approaches such as anti-malware signatures and heuristics) as a tool for identifying dynamic, unprecedented cyberattack patterns exhibited in zero-day events.

At heart, the approach uses supervised learning in which the behavior of any file type, known or unknown, is classified algorithmically on the fly as malicious or benign, based on historical patterns extracted from training data. These patterns, encoded in predictive machine-learning algorithms, describe the features, variables, and other parameters typically associated with cyberattacks. The predictive patterns, which would be used to sniff out and neutralize unprecedented new attacks, may not correspond to any specific prior-attack signature, hence their utility in identifying zero-day threats. They also may not correspond to specific heuristic rules that were distilled from past attacks, hence their value against new attacks that exhibit dynamic, evolving features.
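To make the supervised approach concrete, here is a minimal sketch in Python. It is not the method described in the Network World article, whose model and features are not specified; it simply trains a small logistic-regression classifier on labeled behavior vectors and then scores a sample it has never seen. The three features and every training value are hypothetical, chosen only for illustration.

```python
import math

# Hypothetical behavioral features per file, each scaled to [0, 1]:
# (outbound connection rate, share of writes to system paths, payload entropy)
# Label 1 = malicious, 0 = benign. All values are made up for illustration.
TRAINING_DATA = [
    ((0.90, 0.80, 0.95), 1),
    ((0.80, 0.90, 0.85), 1),
    ((0.70, 0.60, 0.90), 1),
    ((0.95, 0.70, 0.80), 1),
    ((0.10, 0.00, 0.30), 0),
    ((0.20, 0.10, 0.40), 0),
    ((0.05, 0.20, 0.20), 0),
    ((0.30, 0.10, 0.35), 0),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(data, epochs=2000, learning_rate=0.5):
    """Fit logistic-regression weights with plain stochastic gradient descent."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in data:
            predicted = sigmoid(bias + sum(w * x for w, x in zip(weights, features)))
            error = predicted - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


def classify(model, features, threshold=0.5):
    """Return a (label, probability) pair for one behavior vector."""
    weights, bias = model
    probability = sigmoid(bias + sum(w * x for w, x in zip(weights, features)))
    label = "malicious" if probability >= threshold else "benign"
    return label, probability


model = train(TRAINING_DATA)

if __name__ == "__main__":
    # Neither sample appears in the training data: the model generalizes from
    # learned patterns instead of matching a stored signature.
    print(classify(model, (0.85, 0.75, 0.90)))
    print(classify(model, (0.15, 0.05, 0.30)))
```

Because the decision comes from learned weights rather than from a lookup of prior attacks, a sample with no matching signature can still be flagged if its behavior resembles what the model learned to associate with attacks.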

If these predictive machine learning algorithms can’t be embedded into IoT endpoints, they wouldn’t be of much use in intermittently connected scenarios where the edges need to protect themselves autonomously in real time from zero-day attacks. That’s why I took great interest in the discussion of how the algorithms might be compressed so that they can be embedded in increasingly resource-constrained endpoints. That dovetails with my recent discussions of the need for standard frameworks for embedded deployment of cognitive analytics and other containerized microservices to IoT endpoints.
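The article's specific compression technique is not described here, so the following Python sketch shows just one common option as an assumption: post-training quantization, which stores each model weight as a signed 8-bit integer plus a single shared scale factor. The weights, bias, and sample below are hypothetical.

```python
from array import array


def quantize(weights):
    """Map float weights to signed 8-bit integers plus one shared float scale."""
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale
    scale = max_abs / 127.0
    quantized = array("b", (round(w / scale) for w in weights))
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights on the endpoint."""
    return [q * scale for q in quantized]


def score(weights, bias, features):
    """Linear decision score; a positive value means 'treat as malicious'."""
    return bias + sum(w * x for w, x in zip(weights, features))


# Hypothetical weights for a small linear detection model.
WEIGHTS = [4.2, 3.1, 5.7, -0.8]
BIAS = -6.0

quantized, scale = quantize(WEIGHTS)
restored = dequantize(quantized, scale)

if __name__ == "__main__":
    sample = (0.9, 0.8, 0.95, 0.1)  # hypothetical behavior vector
    print("bytes as 64-bit floats:", len(array("d", WEIGHTS).tobytes()))
    print("bytes as 8-bit ints:   ", len(quantized.tobytes()))
    print("full-precision score:  ", score(WEIGHTS, BIAS, sample))
    print("quantized-model score: ", score(restored, BIAS, sample))
```

Each weight shrinks from eight bytes to one, at the cost of a rounding error of at most half the scale factor per weight. Whether that loss of precision is acceptable for a real detection model would have to be checked against its accuracy on held-out data.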

As IoT data, application, and infrastructure functionality gets distributed out to and embedded in the edges, security protections of this sort need to be integrated as well. Without endpoint-embedded cognitive security, the IoT attack surface will grow astronomically, as will the corresponding vulnerabilities.

This article was originally published on James Kobielus’ LinkedIn. To learn more about James, connect with him on LinkedIn, follow him on Twitter, or see his posts on the IBM Big Data & Analytics Hub blog.
