Machines Can Learn the Inherent Complexity of Data

The following is from ai-one’s white paper on Machine Learning for Cyber Security at Network Speed & Scale.  Click here to download a copy from SlideShare (registration required).

Machines can learn like humans by understanding the inherent complexity of patterns and associations in data.

The goal of this paper is to inspire new ideas and invite collaboration to innovate new ways to protect large-scale cyber assets. Our central questions are:

  1. How will real-time, deep pattern recognition change cyber warfare?
  2. How will machine learning of byte-patterns impact the evolution of cyber attacks?
  3. How can machine learning systems protect large-scale networks?
  4. Can machine learning reduce the human capital and expenditures required to defend large scale networks?

The cyber defenses of the US military, government and critical civilian infrastructure are inadequate. The US Department of Homeland Security's "Cyberstorm III" drill in September 2010 demonstrated that private industry and government resources are unable to protect critical infrastructure from destruction by a well-orchestrated cyber attack.[1] "American cyber defense has fallen far behind the technological capabilities of our adversaries [such]… that the number of cyber attacks is now so large and their sophistication so great that many organizations are having trouble determining which new threats and vulnerabilities pose the greatest risk."[2]

This paper outlines a framework to improve US cyber defenses within months, at minimal cost and with virtually no technological risk.

“America’s prosperity in the 21st century will depend on cyber security.”  
President Barack Obama, May 29, 2009

A new form of machine learning discovered by ai-one inc. has the potential to transform cyber warfare. This technology was made commercially available in June 2011. It is in use by Swiss law enforcement, a major European mobile network and under evaluation by more than 40 organizations worldwide.[3]

Large scale government and corporate networks are irresistible targets for cyber attacks – from hackers, hostile government agencies and malicious NGOs. These networks are fantastically complex. Each user, application, data source, sensor and control mechanism adds value, yet each of these components also increases the threat surface for cyber attacks. Defending a network by simplifying it is not an option: taking functionality away from a network would be self-defeating. Moreover, the best networks use a blend of custom, commercial and open-source technologies – each presenting a new opportunity for attack. Thus, cyber security depends on understanding complexity – not simplifying it.

“All war presupposes human weakness and seeks to exploit it.”
Carl von Clausewitz in Vom Kriege (1832)

Current technologies built on computer programming – such as anti-malware software, firewalls and network appliances (such as IDPS) – are unable to detect the most catastrophic forms of zero-day attacks: incremental delivery of viruses, application hijacking, impersonation, insider conspiracies and cloaked DDoS.[4]

[Figure: Representation of a heterarchy]

Why? Computer programming is reductionist and prone to cognitive biases. First, programmers and analysts simplify threat profiles by categorizing them so they can be processed mathematically and logically using structured data. For example, they look for known viruses and their potential variations using fuzzy matching techniques. Simplifying the complexity of suspicious byte-patterns into mathematical models gives attackers ample opportunity to "hide in the noise." Second, programmers and analysts are human: they make mistakes. Moreover, they tend to repeat mistakes – so an attacker who finds one security hole can search for patterns that lead to others.
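To make this concrete, the sketch below shows fuzzy signature matching in Python. It is our illustration, not taken from the paper: the signature bytes, the window scan and the 0.8 threshold are all hypothetical. The structural weakness is visible regardless: mutate enough bytes and the similarity score falls below any fixed threshold.

```python
from difflib import SequenceMatcher

# Hypothetical virus signature (illustrative bytes only)
KNOWN_SIGNATURE = bytes.fromhex("deadbeef4141414190909090")

def fuzzy_match(payload: bytes, signature: bytes, threshold: float = 0.8) -> bool:
    """Flag a payload if any window of it resembles a known signature."""
    window = len(signature)
    best = 0.0
    for i in range(max(1, len(payload) - window + 1)):
        best = max(best, SequenceMatcher(None, payload[i:i + window], signature).ratio())
    return best >= threshold

# A variant that mutates enough bytes "hides in the noise" below the threshold:
mutated = bytes.fromhex("deadbeef52525252a1a1a1a1")
print(fuzzy_match(KNOWN_SIGNATURE, KNOWN_SIGNATURE))  # True: exact match
print(fuzzy_match(mutated, KNOWN_SIGNATURE))          # False: mutated past the threshold
```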

Cyber attackers know these weaknesses and exploit them by hiding within the noise of network complexity and discovering patterns of weaknesses. Deception and exploitation of predictable defensive patterns are the pillars of successful offensive cyber attacks.

Thus, current defenses are destined to fail against the next generation of zero-day cyber attacks (such as incremental viral insertion, MHOTCO and genetic algorithm intrusions).[5]

“All warfare is based on deception.”
The Art of War by Sun Tzu, 600 BC

New artificial intelligence technology that learns through detecting data heterarchies enables unprecedented levels of cyber security and countermeasures. Knowing the structure of data is the key to understanding its meaning. Machine learning using heterarchical pattern recognition reveals the relationships and associations between all bytes across an entire system (or network) – including overlaps, multiplicities, mixed ascendancies, and divergent-but-coexistent patterns. This approach is similar to how humans learn: We associate stimuli with patterns. For example, a child learns that the sound “dog” refers to the 65-pound, four-legged creature with soft fuzzy white hair. A computer would need to be programmed with a series of commands to know that dog refers to a specific creature – and is thus unable to recognize similarities that are not part of the predetermined definition of “dog” – such as a black 5-pound miniature poodle.
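The dog example can be made concrete with a toy sketch (ours, for illustration only; the features and the similarity measure are invented for the example): a programmed rule recognizes only the exact definition it was given, while an associative learner generalizes from stored stimulus patterns.

```python
# Hand-coded rule: the rigid, predetermined definition of "dog".
def programmed_is_dog(weight_lb: int, sound: str, hair: str) -> bool:
    return weight_lb == 65 and sound == "woof" and hair == "white"

# Associative memory: the label follows the most similar remembered stimulus.
memory = [((65, "woof", "white"), "dog"), ((9, "meow", "tabby"), "cat")]

def associative_label(stimulus: tuple) -> str:
    def similarity(known, seen):
        matches = sum(1 for k, s in zip(known, seen) if k == s)
        return matches - abs(known[0] - seen[0]) / 100  # penalize weight gap
    return max(memory, key=lambda m: similarity(m[0], stimulus))[1]

poodle = (5, "woof", "black")          # the 5-pound black miniature poodle
print(programmed_is_dog(*poodle))      # False: outside the coded definition
print(associative_label(poodle))       # "dog": closest stored association wins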

In June 2011, ai-one released a new machine learning application programming interface (API) that is a radical departure from traditional forms of artificial intelligence. The technology is a neural network that detects heterarchical byte-patterns and creates a dynamic descriptive associative network – called a lightweight ontology. This technology determines the meaning of data by evaluating the relationships between each byte, cluster of bytes, words, documents, and so on. Unlike other forms of artificial intelligence, ai-one's approach (a simplified sketch of the core idea follows the list below):

  • Detects how each byte relates to another – including multiple paths, asynchronous relationships and multiple high-order co-occurrences.
  • Automatically generates an associative network (lightweight ontology) revealing all patterns and relationships – detecting anomalies within any portion of the data set.
  • Enables machine learning without human intervention.
  • Is unbiased – it does not rely upon external ontologies or standards.
  • Learns associations upon data ingestion – so it is much faster than techniques that require recalculation, such as COStf-idf (a vector space model approach).[6], [7]
  • Stores each byte pattern only once – this non-redundant design compresses the data while increasing pattern recognition speed.
  • Spawns cells autonomically – the underlying neural cell structure generates new cells as needed when they are stimulated by sensors during data input.
  • Can theoretically share neural cells across other instances of the network.[8]
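To make the list above concrete, here is a deliberately simplified Python sketch of a byte-level associative network. This is our interpretation for illustration, not ai-one's implementation: it tracks only pairwise byte transitions, whereas the bullets describe multi-path, high-order co-occurrences. It does, however, show ingestion-time learning, non-redundant storage (each transition is kept once, as a count) and on-demand growth.

```python
from collections import defaultdict

class AssociativeNetwork:
    """Toy byte-level associative network: one counter per observed byte
    transition, grown only when stimulated by input data."""

    def __init__(self):
        self.edges = defaultdict(int)   # (byte_a, byte_b) -> co-occurrence count
        self.totals = defaultdict(int)  # byte_a -> total outgoing transitions

    def learn(self, data: bytes) -> None:
        """Associations form at ingestion time; no global recalculation."""
        for a, b in zip(data, data[1:]):
            self.edges[(a, b)] += 1     # each pattern stored once, as a count
            self.totals[a] += 1

    def anomaly_score(self, data: bytes) -> float:
        """Fraction of byte transitions never seen during learning."""
        pairs = list(zip(data, data[1:]))
        if not pairs:
            return 0.0
        unseen = sum(1 for p in pairs if self.edges.get(p, 0) == 0)
        return unseen / len(pairs)

net = AssociativeNetwork()
net.learn(b"GET /index.html HTTP/1.1")                  # hypothetical baseline traffic
print(net.anomaly_score(b"GET /index.html HTTP/1.1"))   # 0.0: familiar transitions
print(net.anomaly_score(b"\x90\x90\x90\xcc\xcc\xcc"))   # 1.0: never-seen byte pairs
```

Because associations accumulate incrementally, scoring new data requires no retraining pass over the corpus – the property the ingestion-time bullet above emphasizes.
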
“Understanding ai-one requires an open mind – one that ignores what has been and embraces what is possible.”
Allan Terry, PhD, Former DARPA AI Scientist (Prime Contractor)

This technology has the potential to enable cyber security systems to detect, evaluate and counter threats by assessing anomalies within packets, byte-patterns, data traffic and user behaviors across the entire network. When placed into a matrix chipset, this technology can theoretically evaluate every byte across the entire network in real time with exabytes (10¹⁸ bytes) of capacity, using a combination of sliding windows, high performance computing (HPC) and hardware accelerators.
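How a sliding window might be staged in software is sketched below. This is our assumption of such a pipeline, not the chipset design: the window size, stride and threshold are hypothetical, and the score parameter could be any anomaly scorer, such as the anomaly_score method from the sketch above.

```python
from typing import Callable, Iterator, Tuple

WINDOW = 64   # hypothetical window size in bytes
STRIDE = 16   # hypothetical stride between windows

def scan_stream(stream: bytes,
                score: Callable[[bytes], float],
                threshold: float = 0.5) -> Iterator[Tuple[int, float]]:
    """Yield (offset, anomaly_score) for suspicious windows of a byte stream."""
    for offset in range(0, max(1, len(stream) - WINDOW + 1), STRIDE):
        window = stream[offset:offset + WINDOW]
        s = score(window)
        if s >= threshold:
            yield offset, s  # hand off to an analyst or an IDPS rule engine

# Usage with the AssociativeNetwork sketch above:
#   hits = list(scan_stream(packet_bytes, net.anomaly_score))
```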

As such, we will present how this technology has the potential to revolutionize cyber security by supporting each pillar of the "Five Pillars" framework defined by the US military for cyberwarfare:[9], [10]

Each cyberwarfare pillar is listed below with the potential roles for machine learning:

The cyber domain is similar to other elements of the battlespace.

  • Transparency to command & control of emerging threats
  • Unbiased detection & analysis of threats by detecting anomalies
  • Empower human analysts with actionable intelligence

Proactive defenses

  • Constant real-time monitoring of every packet across network
  • Near instant recognition of anomalies within packet payload or communication frames

Protection of critical infrastructure

  • Enhance intrusion detection and protection systems (IDPS) with real-time libraries & heuristic approximations of potential threats

Collective defense

  • Early detection & instant response across entire network
  • Enable counter-counter-measures, trapping, etc.

Maintain advantage of technological change

  • Early adoption of technology with accelerating rates of return (first-mover advantage).

 

The next generation of cyber security attacks will be deadly in their subtlety: they can remain undetected until it is too late to prevent catastrophic loss of data, loss of connectivity and/or malicious manipulation of sensitive information. Such attacks can collapse key infrastructure systems such as power grids, communications networks, financial systems and national security assets.

The advantages of machine learning as a first line of defense against zero-day attacks include:

  • Force multiplication – enabling fewer human analysts to identify, thwart and counter far greater numbers of attacks than programmatic approaches.
  • Evolutionary advantage – enabling cyber defenses to preempt threat adaptations by detecting any change within byte patterns.
  • Battlespace awareness – providing network security analysts with situational awareness by identifying and classifying byte pattern mutations.
  • Proactive defenses – constantly monitoring the entire threat surface to detect patterns of vulnerability before they can be exploited by the enemy.
[Figure: Reuters cyberattack snapshot]


[1] US GAO report, "CYBERSECURITY: Continued Attention Needed to Protect Our Nation's Critical Infrastructure." Statement of Gregory C. Wilshusen, Director, Information Security Issues, July 26, 2011.

[2] The Lipman Report, “Threats to the Information Highway: CyberWarfare, Cyber Terrorism and Cyber Crime.” October 15, 2010, p.1.

[3] Bundeskriminalamt (German equivalent to the US FBI) built a shoe print recognition system that is in use at three major Swiss CSI labs. ai-one is restricted from advertising or using the name of customers as part of licensing and non-disclosure agreements.  

[4] Zero-day attacks refer to threats to networks that exploit vulnerabilities that are unknown to administrators and/or cyber security applications and appliances. Zero-day exploits include detection of security holes that are used or shared by attackers before the network detects the vulnerability.

[5] See Appendix for “Worst Case Scenario” that describes possible MHOTCO attack.

[6] COStf-idf is an approach to determine the relevance of a term in any given corpus.

[7] For a more extensive comparison see: Reimer, U., Maier, E., Streit, S., Diggelmann, T., Hoffleisch, M., "Learning a Lightweight Ontology for Semantic Retrieval in Patient-Centered Information Systems," International Journal of Knowledge Management, 7(3), 11–26 (July–September 2011).

[8] ai-one internal research project scheduled for mid-2012.

[10] For the purposes of this paper, the requirements of large multi-national corporations (such as Goldman Sachs, Google, Exxon, etc.) are substantially similar to those of government agencies (such as DoD, DHS, NSA, etc.).
