Posts Tagged ‘cyber security’

The Current State of Cyber Security is Fundamentally Flawed

Tuesday, October 18th, 2011

The following is from ai-one’s white paper on Machine Learning for Cyber Security at Network Speed & Scale. A copy can be downloaded from SlideShare (registration required).

A Call to Action

Our research indicates that cyber security is far worse than is commonly reported in news outlets. We estimate there is an extreme shortage of human capital with the skills necessary to thwart attacks from rapidly evolving, highly adaptive adversaries.[1], [2] Research for this paper includes publicly available sources of information found on the Internet and interviews with experts in network security, software security and artificial intelligence. In particular, we speculate on how machine learning might impact the security of large-scale (enterprise) networks from both offensive and defensive perspectives. Specifically, we seek to find ways that machine learning might create and thwart zero-day attacks in networks deploying the most current security technologies, such as neural network enabled intrusion detection and protection systems (IDPS), heuristic and fuzzy matching anti-malware software systems, distributed firewalls, and packet encryption technologies. Furthermore, we evaluate ways that adaptive adversaries might bypass application level security measures such as:

  • address space layout randomization (ASLR)
  • heap hardening
  • data execution prevention (DEP)

We conclude that machine learning provides first-mover advantages to both attackers and defenders. However, we find that the nature of machine learning’s ability to understand complexity provides the greater advantage to network defenses when deployed as part of a multi-layer defensive framework.

[Figure: Twitter DDoS]

As networks grow in value they become disproportionately more exposed to cyber attacks. Metcalfe’s Law states that the value of any network is proportional to the square of the number of connected users.[3] From a practical standpoint, the use of a network is proportional to its functionality: The more it can do, the more people will use it. From a cyber security standpoint, each additional function (or application) running on a network increases the threat surface. Vulnerabilities grow super-linearly because attacks can happen at both the application surface (through an API) and in the connections between applications (through malicious packets).[4]
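The scaling argument can be made concrete with a short worked example. It uses only the two proportionalities the paper gives in notes [3] and [4], with n the number of connected users and p the number of APIs; the choice of doubling both quantities is ours, for illustration.

```latex
\begin{align*}
V &\propto n^{2}, \qquad T \propto n^{2}p^{2} \\[4pt]
\frac{V(2n)}{V(n)} &= \frac{(2n)^{2}}{n^{2}} = 4 \\[4pt]
\frac{T(2n,\,2p)}{T(n,\,p)} &= \frac{(2n)^{2}(2p)^{2}}{n^{2}p^{2}} = 16
\end{align*}
```

Under these relations, doubling both the users and the APIs quadruples the value of the network but multiplies its vulnerability by sixteen.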


Coordinated cyber attacks using more than one method are the most effective means to find zero-day vulnerabilities. The December 2009 attack on Google reportedly relied upon exploiting previously discovered pigeonholes to extract information while human analysts were concurrently distracted by what appeared to be an unrelated attack.

Sources & Types of Cyber Attacks

Threats:

  • Employees, contractors, etc.
  • Hostile nations, terrorist organizations, criminals, etc.

Attack Types:

  • Malicious code (viruses, Trojans, etc.)
  • Incremental payloads (MHOTCO, API hijacking, etc.)
  • Brute Force (DDoS, hash collisions, etc.)
  • Impersonation (ID hack, etc.)
  • Camouflage (cloaking, masking, etc.)
  • Conspiracy (distributed leaks, espionage, etc.)


Cyber attacks are usually derivatives of previously successful tactics.[5] Attackers know that software programmers are human – they make mistakes. Moreover, they tend to repeat the same mistakes – making it relatively easy to exploit vulnerabilities once they are detected.[6] Thus, if a hacker finds that a particular part of a network has been breached with a particular byte-pattern (such as a birthday attack) they will often create numerous variations of this pattern to be used in the future to secure an entry into the network (such as a pigeonhole).

Let’s evaluate a few of these types of attacks to compare and contrast computer programming and machine learning approaches to exploiting and defending cyber vulnerabilities.

Exploiting API Weaknesses (Application Hijacking)

Detecting flaws in application program interfaces (APIs) is a rapidly evolving form of cyber attack where vulnerabilities in the underlying application are exploited. For example, an attacker may use video files to embed code that will cause a video player to erase files. This approach often involves incrementally inserting malicious code, frame-by-frame, to corrupt the file buffer and/or hijack the application. This incremental approach depends upon finding flaws within the code base. This is easily done if the attacker has access to the application outside the network – such as a commercial or open-source copy of the software.

Programming Measures and Counter-Measures to API Exploits

Traditional approaches to thwart derivative attacks on an API are relatively straightforward but human resource intensive: First, the attack is analyzed to identify markers (such as identifiers within packet payload). Next, the markers are categorized, classified and recorded – usually into a master library (e.g., McAfee Global Threat Intelligence). Finally, anti-malware software (such as McAfee) and IDPS network appliances (such as ForeScout CounterACT) scan packets to detect threats from known sources (malware, IPs, DNS, etc.). Threats that are close derivatives of known threats are easily thwarted using lookup tables, algorithms and heuristics while concurrently detecting and isolating anomalous network behavior for further human review.
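The scanning step described above can be sketched in a few lines. This is a minimal illustration, not how any named product works: the `Marker` type, the two library entries and the sample packets are all hypothetical, and a real threat library would hold far more entries than a simple substring search could serve.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Marker:
    name: str        # label recorded in the threat library
    pattern: bytes   # byte sequence that identifies the threat


# Hypothetical master library (illustrative entries only).
THREAT_LIBRARY = [
    Marker("example-dropper", b"\xde\xad\xbe\xef"),
    Marker("example-shell", b"/bin/sh -c"),
]


def scan_payload(payload: bytes, library=THREAT_LIBRARY) -> list[str]:
    """Return the names of all known markers found in a packet payload."""
    return [m.name for m in library if m.pattern in payload]


packets = [
    b"GET /index.html HTTP/1.1",
    b"\x00\x01\xde\xad\xbe\xef\x02",
    b"run: /bin/sh -c 'rm -rf /tmp/x'",
]
hits = [scan_payload(p) for p in packets]
```

The scanner only reports what is already in the library, which is the limitation the next section discusses.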

Problems with the Computer Programming Approach

“Should we fear hackers? Intent is at the heart of this question.”
Kevin Mitnick, Hacker, after his release from Federal prison in 2000.

There are many problems with defenses that know only what they are programmed to know. First, it is almost impossible for a person to predict and program a computer to handle every possible attack. Even if you could, it is practically impossible to scale human resources to meet the demands of addressing each potential threat as network complexities grow exponentially. A single adaptive adversary can keep many security analysts very busy. Next, cyber threats are far easier to produce than they are to detect – it takes 10 times more effort to isolate and develop counter measures to a virus than it does to create it.[7] Finally, the sheer scale of external intelligence and human resources far outstrips the defensive resources available within the firewall. For example, the US Army’s estimated 21,000 security analysts must counter the collective learning capacity and computational resources of all hackers seeking to disrupt ARCYBER – potentially facing a 100:1 disadvantage worldwide.[8]

Moreover, new approaches to malware involve incremental loading of fragments of malware into a network where they are later assembled and executed by a native application. Often the malicious code fragments are placed over many disparate channels and inputs thereby disguising themselves as noise or erroneous packets.[9]
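A small sketch shows why fragment-by-fragment scanning misses this kind of delivery. The signature and the fragments are hypothetical, and for simplicity the fragments arrive in order on a single channel, whereas the paper describes fragments spread over many channels.

```python
SIGNATURE = b"MALICIOUS-PAYLOAD"  # hypothetical marker from a threat library


def scan(data: bytes) -> bool:
    """Report whether the known signature appears in the data."""
    return SIGNATURE in data


# The payload is split so that no single fragment contains the signature.
fragments = [b"noise..MALIC", b"IOUS-PAY", b"LOAD..noise"]

per_fragment = [scan(f) for f in fragments]  # each fragment scanned alone
reassembled = scan(b"".join(fragments))      # scanned after reassembly
```

Each fragment passes inspection on its own, and the signature becomes visible only once the pieces are joined, which is the point at which a native application would execute them.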

Machine Learning Measures and Counter-Measures to API Exploits

Machine learning is an ideal technology for both attacking and defending against API source code vulnerabilities. Knowing that programmers tend to repeat mistakes, an attacker can find similarities across the code base to identify vulnerabilities. A sophisticated attacker might use genetic algorithms and/or statistical techniques (such as naïve Bayes) to find new vulnerabilities that are similar to others that have been found previously. Machine learning provides defenders with an advantage over attackers because it can detect these flaws before an attack occurs. This enables the defender to entrap, deceive or use other counter-measures against the attacker.
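The idea of searching a code base for repeats of a known mistake can be illustrated with a very simple similarity measure. This sketch uses Jaccard overlap of source tokens as a stand-in for the genetic algorithms and naïve Bayes techniques mentioned above; the file names and code snippets are invented for the example.

```python
import re


def tokens(code: str) -> set[str]:
    """Split source text into a set of identifier and number tokens."""
    return set(re.findall(r"[A-Za-z_]\w*|\d+", code))


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: shared tokens divided by all distinct tokens."""
    return len(a & b) / len(a | b) if a | b else 0.0


# A previously discovered flaw: unchecked copy into a fixed-size buffer.
known_flaw = "char buf[64]; strcpy(buf, user_input);"

# Hypothetical code base to search.
codebase = {
    "login.c": "char name[64]; strcpy(name, user_name);",
    "math.c": "int total = add(left, right); return total;",
}

# Rank files by similarity to the known flaw, most similar first.
ranked = sorted(
    ((jaccard(tokens(known_flaw), tokens(src)), path)
     for path, src in codebase.items()),
    reverse=True,
)
```

Here `login.c` ranks first because it repeats the same pattern under different variable names. The same ranking serves an attacker looking for the next hole and a defender looking to close it first.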

Machine learning provides a first-mover advantage to both defender and attacker – but the advantage is far stronger for the defender because it can detect any anomaly within the byte-pattern of the network – even after malicious code has bypassed cyber defenses, as in a sleeper attack.[10] The attacker would therefore need to camouflage byte-patterns in addition to finding and exploiting vulnerabilities, adding tremendous complexity to the tactics needed to bypass defenses. Since machine learning becomes more intelligent with use, the defender’s systems will harden with each attack – becoming exponentially more secure over time.
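One simple way to picture byte-pattern anomaly detection is a baseline of byte pairs learned from observed traffic. This is a toy sketch under our own assumptions (a two-byte window, a score equal to the fraction of unseen pairs, invented sample traffic); it is not the method described in the white paper.

```python
from collections import Counter


def bigrams(data: bytes):
    """Yield every overlapping two-byte window in the data."""
    return (data[i:i + 2] for i in range(len(data) - 1))


class ByteBaseline:
    """Learns which byte pairs occur in normal traffic."""

    def __init__(self):
        self.seen = Counter()

    def learn(self, payload: bytes) -> None:
        """Add a payload's byte pairs to the baseline."""
        self.seen.update(bigrams(payload))

    def anomaly_score(self, payload: bytes) -> float:
        """Fraction of the payload's byte pairs never seen while learning."""
        grams = list(bigrams(payload))
        if not grams:
            return 0.0
        unseen = sum(1 for g in grams if g not in self.seen)
        return unseen / len(grams)


baseline = ByteBaseline()
for sample in (b"GET /index.html", b"GET /about.html", b"GET /index.html?x=1"):
    baseline.learn(sample)

normal = baseline.anomaly_score(b"GET /index.html")          # familiar traffic
odd = baseline.anomaly_score(b"\x90\x90\x90\x90\xcc\xcc")    # unfamiliar bytes
```

The familiar request scores 0.0 and the unfamiliar byte run scores 1.0. Because the baseline grows with every payload it learns, an attacker must imitate the learned patterns as well as find a vulnerability.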

Exploiting Impersonations

Counterfeiting network authentication to gain illicit access to network assets is one of the oldest tricks in the hacker’s book. This can be done as easily as leaving a thumb drive infected with malware in a parking lot for a curious insider to insert into a network computer. It can also involve sophisticated social engineering to crack passwords, find use patterns and points of entry for a hacker to impersonate a legitimate user.[11]

Programming Measures and Counter-Measures to Impersonations

Traditional approaches to impersonation attacks depend upon user authentication and controlling access to network assets using predetermined permissions. Once an attacker is inside the network with a false identity, he can run freely so long as he does not trigger any alarms by violating his permissions. This defense is entirely programmatic: it assumes that an attacker who gets past the firewall will behave differently than a legitimate user. That assumption offers little protection, since the attacker can stay within his permissions while using his presence to learn about network assets and attack them in different ways. For example, the attacker can identify APIs and network appliances and determine other security protocols to identify further vulnerabilities that might be compromised with an external attack.

Problems with the Computer Programming Approach to Prevent Impersonations

Rules-based permissions are only as good as the rules’ ability to model human behavior. Attackers familiar with these rules and the standard practices of network security can easily stay within acceptable boundaries of use.

Machine Learning Measures and Counter-Measures to Impersonation

In the case of insider threats, machine learning provides the defender more advantages than the attacker. Although attackers can use machine learning of byte-patterns to “hack” an identity, they are limited to behaving exactly as that identity would – to the extent that they must know how that person has behaved in the past and how the system will perceive their every movement. The defender’s advantage is that machine learning creates an “entology” – an ontology of the entity – for every authenticated user. This is a heterarchical representation of all past behavior at the byte- or packet-level. This enables network security to evaluate use patterns to find anomalies that would be difficult (if not impossible) to predict using a set of computer programming commands. Machine learning does not depend on rules – rather just observation to find associations and patterns. This can be done at every point within the network – routers, network appliances, APIs, database access points, etc.
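The contrast between permission rules and learned behavior can be sketched with a per-user history of observed events. This is a deliberately crude stand-in for the “entology” described above: the class, the event format and the sample activity are hypothetical, and the profile works at the level of actions and resources where the paper speaks of bytes and packets.

```python
from collections import Counter, defaultdict


class UserProfiles:
    """Keeps a count of observed (action, resource) events per user."""

    def __init__(self):
        self.history = defaultdict(Counter)

    def observe(self, user: str, action: str, resource: str) -> None:
        """Record one event in the user's behavioral history."""
        self.history[user][(action, resource)] += 1

    def is_anomalous(self, user: str, action: str, resource: str) -> bool:
        """Flag events this user has never been observed to perform."""
        return self.history[user][(action, resource)] == 0


profiles = UserProfiles()
for _ in range(20):
    profiles.observe("alice", "read", "/reports/q3.pdf")
profiles.observe("alice", "write", "/reports/draft.doc")

routine = profiles.is_anomalous("alice", "read", "/reports/q3.pdf")
unusual = profiles.is_anomalous("alice", "read", "/etc/shadow")
```

The second request is flagged because it has no precedent in the user’s history, whether or not a permission rule would allow it. No rule about that resource had to be written in advance.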

[1] The shortage in cyber warriors in the US Government is widely reported. For example, see

[2] Threats to the Information Highway: Cyber Warfare, Cyber Terrorism and Cyber Crime

[3] $V \propto n^{2}$, where value (V) is proportional to the square of the number of connected users of a network (n).

[4] Threat vulnerability is a corollary to Metcalfe’s Law whereby each additional network connection provides an additional point of security exposure: $T \propto n^{2}p^{2}$, where vulnerability (T) is proportional to the square of the number of connected users of a network (n) times the square of the number of APIs (p).

[5] Interview with former anonymous hacker.

[6] Yamaguchi, Fabian. “Automated Extraction of API Usage Patterns from Source Code for Vulnerability Identification” Diploma Thesis TU Berlin, January 2011.

[7] Estimate based on evaluation of virus source codes available at  Also see: Stepan, Adrian. “Defeating Polymorphism: Beyond Emulation” Microsoft Corporation, 2005.

[9] Examples of this technique were discussed at the BlackHat Security Conference in early August 2011.

[10] For a discussion on sleeper attacks see: Borg, Scott. “Securing the Supply Chain for Electronic Equipment: A Strategy and Framework.” The Internet Security Alliance report to the White House. (available on and also The US Cyber Consequences Unit (

[11] Interview with former forensic network security agent at major investment bank.

Machines Can Learn the Inherent Complexity of Data

Monday, October 17th, 2011

The following is from ai-one’s white paper on Machine Learning for Cyber Security at Network Speed & Scale. A copy can be downloaded from SlideShare (registration required).

Machines can learn like humans by understanding the inherent complexity of patterns and associations in data.

The goal of this paper is to inspire new ideas and invite collaboration to innovate new ways to protect large-scale cyber assets. Our central questions are:

  1. How will real-time, deep pattern recognition change cyber warfare?
  2. How will machine learning of byte-patterns impact the evolution of cyber attacks?
  3. How can machine learning systems protect large-scale networks?
  4. Can machine learning reduce the human capital and expenditures required to defend large scale networks?

The cyber security defenses of the US military, government and critical civilian infrastructure are inadequate. The US Department of Homeland Security’s “Cyberstorm III” drill in September 2010 demonstrated that private industry and government resources are unable to protect critical infrastructure from destruction from a well-orchestrated cyber attack.[1] “American cyber defense has fallen far behind the technological capabilities of our adversaries [such]…that the number of cyber attacks is now so large and their sophistication so great that many organizations are having trouble determining which new threats and vulnerabilities pose the greatest risk.”[2]

This paper outlines a framework to improve US cyber defenses in a matter of months at very minimal cost with virtually no technological risk.

“America’s prosperity in the 21st century will depend on cyber security.”  
President Barack Obama, May 29, 2009

A new form of machine learning discovered by ai-one inc. has the potential to transform cyber warfare. This technology was made commercially available in June 2011. It is in use by Swiss law enforcement, a major European mobile network and under evaluation by more than 40 organizations worldwide.[3]

Large scale government and corporate networks are irresistible targets for cyber attacks – from hackers, hostile government agencies and malicious NGOs. These networks are fantastically complex. Each user, application, data source, sensor and control mechanism adds value. Yet each of these components increases the threat surface for cyber attacks. Defending a network by simplifying network complexity is not an option. Taking functionality away from a network would be self-defeating. Moreover, the best networks use a blend of custom, commercial and open-source technologies – each presenting a new opportunity for attack. Thus, cyber security depends on understanding complexity – not simplifying it.

“All war presupposes human weakness and seeks to exploit it.”
Carl von Clausewitz in Vom Kriege (1832)

Current technologies using computer programming – such as anti-malware software, firewalls and network appliances (such as IDPS) – are unable to detect the most catastrophic forms of zero-day attacks: incremental delivery of viruses, application hijacking, impersonation, insider conspiracies and cloaked DDoS.[4]

[Figure: Representation of Heterarchy]

Why? Computer programming is reductionist and prone to cognitive biases. First, programmers and analysts simplify threat profiles by categorizing them so they can be processed mathematically and logically using structured data. For example, they look for viruses and potential variations using fuzzy matching techniques. Simplifying the complexity of suspicious byte-patterns into mathematical models provides ample opportunities for attackers to “hide in the noise.” Secondly, programmers and analysts are human. They make mistakes. Moreover, they tend to repeat mistakes – so if you find one security hole, you can search for patterns that will lead you to others.

Cyber attackers know these weaknesses and exploit them by hiding within the noise of network complexity and discovering patterns of weaknesses. Deception and exploitation of predictable defensive patterns are the pillars of successful offensive cyber attacks.

Thus, current defenses are destined to fail against the next generation of zero-day cyber attacks (such as incremental viral insertion, MHOTCO and genetic algorithm intrusions).[5]

“All warfare is based on deception.”
The Art of War by Sun Tzu, 600 BC

New artificial intelligence technology that learns through detecting data heterarchies enables unprecedented levels of cyber security and countermeasures. Knowing the structure of data is the key to understanding its meaning. Machine learning using heterarchical pattern recognition reveals the relationships and associations between all bytes across an entire system (or network) – including overlaps, multiplicities, mixed ascendancies, and divergent-but-coexistent patterns. This approach is similar to how humans learn: We associate stimuli with patterns. For example, a child learns that the sound “dog” refers to the 65-pound, four-legged creature with soft fuzzy white hair. A computer would need to be programmed with a series of commands to know that dog refers to a specific creature – and is thus unable to recognize similarities that are not part of the predetermined definition of “dog” – such as a black 5-pound miniature poodle.

In June 2011, ai-one released a new machine learning application programming interface (API) that is a radical departure from traditional forms of artificial intelligence. The technology is a neural network that detects heterarchical byte-patterns and creates a dynamic descriptive associative network – called a lightweight ontology. This technology determines the meaning of data by evaluating the relationships between each byte, cluster of bytes, words, documents, and so on. Unlike other forms of artificial intelligence, ai-one’s approach:

  • Detects how each byte relates to another – including multiple paths, asynchronous relationships and multiple high-order co-occurrences.
  • Automatically generates an associative network (lightweight ontology) revealing all patterns and relationships – detecting anomalies within any portion of the data set.
  • Enables machine learning without human intervention.
  • Unbiased. Does not rely upon external ontologies or standards.
  • Learns associations upon data ingestion – so it is much faster than techniques that require recalculations, such as COStf-idf (a vector space model approach). [6], [7]
  • Non-redundant. Each byte pattern is stored only once. This has the effect of compressing data while increasing pattern recognition speed.
  • Spawning cells. The underlying cell structure in the neural network is autonomic; generating cells as they are needed as they are stimulated by sensors (during data input).
  • Neural cells can be theoretically shared across other instances of the network.[8]

“Understanding ai-one requires an open mind – one that ignores what has been and embraces what is possible.”
Allan Terry, PhD, Former DARPA AI Scientist (Prime Contractor)
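The recalculation cost mentioned in the list above can be seen in a small vector space example. We read “COStf-idf” as cosine similarity over tf-idf vectors, which is our assumption; the sketch shows only that comparison technique, not ai-one’s approach, and the tiny corpus is invented.

```python
import math
from collections import Counter


def tfidf_vectors(docs: list[list[str]]) -> list[dict[str, float]]:
    """Build one tf-idf vector per document; idf depends on the whole corpus."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


corpus = [
    ["port", "scan", "detected"],
    ["port", "scan", "blocked"],
    ["user", "login", "failed"],
]
before = tfidf_vectors(corpus)
after = tfidf_vectors(corpus + [["port", "opened"]])  # one new document arrives
```

Adding a single document changes the inverse document frequency of “port”, so the weight of that term in every existing vector changes and the vectors must be rebuilt. That corpus-wide dependency is the recalculation the list refers to.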

This technology has the potential to enable cyber security systems to detect, evaluate and counter threats by assessing anomalies within packets, byte-patterns, data traffic and user behaviors across the entire network. When placed into a matrix chipset, this technology can theoretically evaluate every byte across the entire network in real time with exabytes ($10^{18}$ bytes) of capacity using a combination of sliding windows, high performance computing (HPC) and hardware accelerators.

As such, we will present how this technology has the potential to revolutionize cyber security by supporting each pillar of the “Five Pillars” framework defined by the US Military for cyberwarfare:[9], [10]

Cyberwarfare Pillars and Potential Roles for Machine Learning

Cyber domain is similar to other elements in battlespace.

  • Transparency to command & control of emerging threats
  • Unbiased detection & analysis of threats by detecting anomalies
  • Empower human analysts with actionable intelligence

Proactive defenses

  • Constant real-time monitoring of every packet across network
  • Near instant recognition of anomalies within packet payload or communication frames

Protection of critical infrastructure

  • Enhance intrusion detection and protection systems (IDPS) with real-time libraries & heuristic approximations of potential threats

Collective defense

  • Early detection & instant response across entire network
  • Enable counter-counter-measures, trapping, etc.

Maintain advantage of technological change

  • Early adoption of technology with accelerating rate of returns (first-mover advantage).


The next generation of cyber security attacks will be deadly in their subtlety: They can remain undetected until it is too late to prevent catastrophic loss of data or connectivity, or malicious manipulation of sensitive information. Such attacks can collapse key infrastructure systems such as power grids, communications networks, financial systems and national security assets.

The advantages of machine learning as a first line of defense against zero-day attacks include:

  • Force multiplication – enabling fewer human analysts to identify, thwart and counter far greater numbers of attacks than programmatic approaches.
  • Evolutionary advantage – enabling cyber defenses to preempt threat adaptations by detecting any change within byte patterns.
  • Battlespace awareness – providing network security analysts with situational awareness by identifying and classifying byte pattern mutations.
  • Proactive defenses – Constant monitoring of the entire threat surface to detect any patterns of vulnerability before they can be exploited by the enemy.

[Figure: Reuters Cyberattack Snapshot]

[1] US GAO report, “CYBERSECURITY: Continued Attention Needed to Protect Our Nation’s Critical Infrastructure.” Statement of Gregory C. Wilshusen, Director, Information Security Issues, July 26, 2011.

[2] The Lipman Report, “Threats to the Information Highway: CyberWarfare, Cyber Terrorism and Cyber Crime.” October 15, 2010, p.1.

[3] Bundeskriminalamt (German equivalent to the US FBI) built a shoe print recognition system that is in use at three major Swiss CSI labs. ai-one is restricted from advertising or using the name of customers as part of licensing and non-disclosure agreements.  

[4] Zero-day attacks refer to threats to networks that exploit vulnerabilities that are unknown to administrators and/or cyber security applications and appliances. Zero-day exploits include detection of security holes that are used or shared by attackers before the network detects the vulnerability.

[5] See Appendix for “Worst Case Scenario” that describes possible MHOTCO attack.

[6] COStf-idf is an approach to determine the relevance of a term in any given corpus.

[7] For a more extensive comparison see: Reimer, U., Maier, E., Streit, S., Diggelmann, T., Hoffleisch, M., Learning a Lightweight Ontology for Semantic Retrieval in Patient-Centered Information Systems. In International Journal of Knowledge Management, 7(3), 11-26, (July-September 2011)

[8] ai-one internal research project scheduled for mid-2012.

[10] For purposes of this paper, the requirements of large multi-national corporations (such as Goldman-Sachs, Google, Exxon, etc.) are substantially similar to those of government agencies (such as DoD, DHS, NSA, etc.).