Archive for the ‘machine learning sdk’ Category

ai-one’s Biologically Inspired Neural Network

Sunday, February 1st, 2015

ai-one’s Learning Algorithm: Biologically Inspired Neural Network
– Introduction to HSDS vs ANN in Text Applications

Unlike traditional neural nets, ai-one’s neural network, the HoloSemantic Data Space neural network (invented by Manfred Hoffleisch), or “HSDS” for short, is a massively connected, asymmetrical graph that is stimulated by binary spikes. HSDS do not have any neural structures pre-defined by the user. Their building blocks resemble biological neural networks: a neuron has dendrites, on which the synapses from other neurons are placed, and an axon which ends in synapses at other neurons.

The connections between the neurons emerge in an unsupervised manner while the learning input is translated into the neural graph structure. The resulting graph can be queried by means of specific stimulations of neurons. In traditional neural systems it is necessary to set up the appropriate network structure at the beginning according to what is to be learned. Moreover, the supervised learning employed by neural nets such as the perceptron requires that a teacher be present who answers specific questions. Even neural nets that employ unsupervised learning (like those of Hopfield and Kohonen) require a neighborhood function adapted to the learning issue. In contrast, HSDS require neither a teacher nor a predefined structure or neighborhood function (note that although a teacher is not required, in most applications programmatic teaching is used to ensure the HSDS has learned the content needed to meet performance requirements). In the following we characterize HSDS according to their most prominent features.

Exploitation of context

In ai-one applications like BrainDocs, HSDS is used for the learning of associative networks and feature extraction. The learning input consists of documents from the application domains, which are broken down into segments rather than entered whole: all sentences may be submitted as is or segmented into sub-sentences according to grammatical markers. Through experimentation, we have found that a segment should ideally consist of 7 to 8 words. This is in line with findings from cognitive psychology. Breaking down text documents into sub-sentences is the closest possible approximation to the ideal segment size. The contexts given by the sub-sentence segments help the system learn. The transitivity of term co-occurrences from the various input contexts (i.e. segments) is a crucial contribution to creating appropriate associations. This can be compared with the higher-order co-occurrences explored in the context of latent semantic indexing.
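
The segmentation and co-occurrence idea above can be sketched in a few lines. This is only an illustration, not ai-one's implementation: the marker list, the function names `segment` and `cooccurrences`, and the sample sentence are all assumptions made for the example.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical grammatical markers: punctuation plus a few conjunctions.
MARKERS = re.compile(r"[.;:,!?]|\b(?:and|but|or|because|which|that)\b", re.IGNORECASE)

def segment(text):
    """Split text into lowercase sub-sentence segments at the markers above."""
    parts = (p.strip() for p in MARKERS.split(text))
    return [p.lower().split() for p in parts if p]

def cooccurrences(segments):
    """Count unordered word pairs that appear in the same segment."""
    counts = Counter()
    for words in segments:
        for a, b in combinations(sorted(set(words)), 2):
            counts[(a, b)] += 1
    return counts

text = "The tenant pays rent monthly, and the landlord repairs the roof."
segs = segment(text)
pairs = cooccurrences(segs)
```

Here "rent" and "tenant" share a segment and are counted as a pair, while "landlord" and "tenant" are not. Higher-order associations would come from chaining pairs that share a word.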

Continuously evolving structure
The neural structure of an HSDS is dynamic and changes constantly in line with neural operations. In the neural context, change means that new neurons are produced or destroyed and connections reinforced or inhibited. Connections that are not used in the processing of input into the net for some time will get gradually weaker. This effect can also be applied to querying, which then results in the weakening of connections that are rarely traversed for answering a query.

Asymmetric connections
The connections between the neurons need not be equally strong on both sides and it is not necessary that a connection should exist between all the neurons (cp. Hopfield’s correlation matrix).
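
The two properties described above, connections that weaken when unused and connections that differ in strength by direction, can be illustrated with a toy model. This is a minimal sketch under stated assumptions: the class `Connections`, the decay factor and the pruning cutoff are hypothetical and say nothing about how HSDS computes its weights.

```python
class Connections:
    """Toy directed, weighted connections with reinforcement and decay."""

    def __init__(self, decay=0.5, prune_below=0.1):
        self.weights = {}              # (source, target) -> strength
        self.decay = decay             # assumed multiplicative decay per input
        self.prune_below = prune_below # assumed cutoff for destroying a connection

    def process(self, used_edges):
        used = set(used_edges)
        # Weaken every connection that this input did not use.
        for edge in list(self.weights):
            if edge not in used:
                self.weights[edge] *= self.decay
                if self.weights[edge] < self.prune_below:
                    del self.weights[edge]
        # Reinforce (or create) the connections that were used.
        for edge in used:
            self.weights[edge] = self.weights.get(edge, 0.0) + 1.0

net = Connections()
net.process([("rent", "tenant"), ("tenant", "rent")])
net.process([("rent", "tenant")])
```

After the second input, the connection from "rent" to "tenant" is stronger than the one from "tenant" to "rent", so the graph is asymmetric. Repeated inputs that never use a connection eventually remove it.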

Spiking neurons
The HSDS is stimulated by spikes, i.e. binary signals which either fire or do not. Thresholds do not play a role in HSDS. The stimulus directed at a neuron is coded by the sequence of spikes that arrive at the dendrite.

Massive connectivity
Whenever a new input document is processed, new (groups of) neurons are created which in turn stimulate the network by sending out a spike. Some of the neurons reached by the stimulus react and develop new connections, whereas others, which are less strongly connected, do not. The latter nevertheless contribute to the overall connectivity because they make it possible to reach neurons which could not otherwise be reached. Given the high degree of connectivity, a spike can pass through a neuron several times since it can be reached via several paths. The frequency and the chronological sequence in which this happens determine the information that is read from the net.

General purpose
There is no need to define a topology before starting the learning process because the neural structure of the HSDS develops on its own. This is why it is possible to retrieve a wide range of information by means of different stimulation patterns. For example, direct associations or association chains between words can be found, the words most strongly associated with a particular word can be identified, etc.
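
The two query types mentioned above (strongest associations and association chains) can be sketched on an ordinary weighted graph. The graph contents and the helper names `strongest` and `chain` are assumptions for illustration. HSDS answers such queries by spike stimulation, which this sketch replaces with a plain sort and a breadth-first search.

```python
from collections import deque

# Hypothetical association graph: word -> {associated word: strength}.
graph = {
    "contract": {"consent": 3, "party": 2},
    "consent": {"written": 4, "contract": 1},
    "written": {"notice": 2},
    "party": {},
    "notice": {},
}

def strongest(word, n=2):
    """Return the n words most strongly associated with `word`."""
    links = graph.get(word, {})
    return sorted(links, key=links.get, reverse=True)[:n]

def chain(start, goal):
    """Breadth-first search for a shortest association chain, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], {}):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None
```

Because the edges are directed, a chain can exist from "contract" to "notice" without one existing in the opposite direction.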

AI, AGI, ASI, Deep Learning, Intelligent Machines.. Should you worry?

Saturday, January 17th, 2015

If the real life Tony Stark and technology golden boy, Elon Musk, is worried that AI is an existential threat to humanity, are we doomed? Can mere mortals do anything about this when the issue is cloaked in dozens of buzzwords and the primary voices on the subject are evangelists with 180 IQs from Singularity University? Fortunately, you can get smart and challenge them without a degree in AI from MIT.

There are good books on the subject. I like James Barrat’s Our Final Invention and while alarmist, it is thorough and provides a guide to a number of resources from both sides of the argument. One of those was the Machine Intelligence Research Institute (MIRI) founded by Eliezer Yudkowsky. This book was recommended on the MIRI website and is a good primer on the subject.

Smarter Than Us – The Rise of Machine Intelligence by Stuart Armstrong can also be downloaded at iTunes.

“It will sharpen your focus to see AI from a different view. The book does not provide a manual for Friendly AI, but it shows the problems and it points to the 3 critical things needed. We are evaluating the best way for ai-one to participate in the years ahead.” Walt Diggelmann, CEO ai-one.

In Chapter 11 Armstrong recommends we take an active role in the future development and deployment of AI, AGI and ASI. The developments are coming; the challenge is to make sure AI plays a positive role for everyone. A short summary:

“That’s Where You Come In . . .

There are three things needed—three little things that will make an AI future bright and full of meaning and joy, rather than dark, dismal, and empty. They are research, funds, and awareness.

Research is the most obvious.
A tremendous amount of good research has been accomplished by a very small number of people over the course of the last few years—but so much more remains to be done. And every step we take toward safe AI highlights just how long the road will be and how much more we need to know, to analyze, to test, and to implement.

Moreover, it’s a race. Plans for safe AI must be developed before the first dangerous AI is created.
The software industry is worth many billions of dollars, and much effort (and government/defense money) is being devoted to new AI technologies. Plans to slow down this rate of development seem unrealistic. So we have to race toward the distant destination of safe AI and get there fast, outrunning the progress of the computer industry.

Funds are the magical ingredient that will make all of this needed research—in applied philosophy, ethics, AI itself, and implementing all these results—a reality.
Consider donating to the Machine Intelligence Research Institute (MIRI), the Future of Humanity Institute (FHI), or the Center for the Study of Existential Risk (CSER). These organizations are focused on the right research problems. Additional researchers are ready for hire. Projects are sitting on the drawing board. All they lack is the necessary funding. How long can we afford to postpone these research efforts before time runs out?”

About Stuart: “After a misspent youth doing mathematical and medical research, Stuart Armstrong was blown away by the idea that people would actually pay him to work on the most important problems facing humanity. He hasn’t looked back since, and has been focusing mainly on existential risk, anthropic probability, AI, decision theory, moral uncertainty, and long-term space exploration. He also walks the dog a lot, and was recently involved in the coproduction of the strange intelligent agent that is a human baby.”

Since ai-one is a part of this industry and one of the many companies moving the field forward, there will be many more posts on the different issues confronting AI. We will try to keep you updated and hope you’ll join the conversation on Google+, Facebook, Twitter or LinkedIn. AI is already pervasive and developments toward AGI can be a force for tremendous good. Do we think you should worry? Yes, we think it’s better to lose some sleep now so we don’t lose more than that later.

Tom

(originally posted on www.analyst-toolbox.com)

ai-one and the Machine Intelligence Landscape

Monday, January 12th, 2015

In the sensationally titled Forbes post, Tech 2015: Deep Learning And Machine Intelligence Will Eat The World, author Anthony Wing Kosner surveys the impact of deep learning technology in 2015. This is nothing new for those in the field of AI. His post reflects the recent increase in coverage artificial intelligence (AI) technologies and companies are getting in business and mainstream media. For ai-one, a core technology vendor in AI for over ten years, it’s a welcome change in perspective and attitude.

We are pleased to see ai-one correctly positioned as a core technology vendor in the Machine Intelligence Landscape chart featured in the article. The chart, created by Shivon Zilis, investor at BloombergBETA, is well done and should be incorporated into the research of anyone seriously tracking this space.

Especially significant is Zilis’ focus on “companies that will change the world of work” since these are companies applying AI technologies to innovation and productivity challenges across the public and private sectors. The resulting solutions will provide real value through the combination of domain expertise (experts and data) and innovative application development.

This investment thesis is supported by the work of Erik Brynjolfsson and Andrew McAfee in their book “The Second Machine Age”, a thorough discussion of value creation (and disruption) by the forces of innovation that is digital, exponential and combinatorial. The impact of these technologies will change the economics of every industry over years if not decades to come. Progress and returns will be uneven in their impact on industry, regional and demographic sectors. While deep learning is early in Gartner’s Hype Cycle, it is clear that the market value of machine learning companies and data science talent is climbing fast.

The need for data scientists is growing, but the business impact of AI may be limited in the near future by the lack of traditional developers who can apply these technologies. Jeff Hawkins of Numenta has spoken out on this issue and we agree. It is a fundamentally different way to create an application for “ordinary humans” and until the “killer app” Hawkins speaks about is created, it will be hard to attract enough developers to invest time learning new AI tools. As the chart shows, there are many technologies competing for their time. Developers can’t build applications with buzzwords and one-size-fits-all APIs or collections of open source algorithms. Technology vendors have a lot of work to do in this respect.

Returning to Kosner’s post, what exactly is deep learning and how is it different from machine learning/artificial intelligence? According to Wikipedia,

Deep learning is a class of machine learning training algorithms that use many layers of nonlinear processing units for feature extraction and transformation. The algorithms may be supervised or unsupervised and applications include pattern recognition and statistical classification.

  • are based on the (unsupervised) learning of multiple levels of features or representations of the data. Higher level features are derived from lower level features to form a hierarchical representation.
  • are part of the broader machine learning field of learning representations of data.
  • learn multiple levels of representations that correspond to different levels of abstraction; the levels form a hierarchy of concepts.
  • form a new field with the goal of moving toward artificial intelligence. The different levels of representation help make sense of data such as images, sounds and texts.

These definitions have in common (1) multiple layers of nonlinear processing units and (2) the supervised or unsupervised learning of feature representations in each layer, with the layers forming a hierarchy from low-level to high-level features.

While in the 4th bullet this is termed a new field moving toward artificial intelligence, it is generally considered to be part of the larger field of AI already. Deep learning and machine intelligence are not the same as human intelligence. Artificial intelligence in the definition above and in the popular press usually refers to Artificial General Intelligence (AGI). AGI and the next evolution, Artificial Super Intelligence (ASI), are the forms of AI that Stephen Hawking and Elon Musk are worried about.

This is powerful stuff, no question, but as an investor, user or application developer in 2015, look for the right combination of technology, data, domain expertise, and application talent applied to a compelling (valuable) problem in order to create a disruptive innovation (value). This is where the money is over the next five years and this is our focus at ai-one.

Tom

Big Data Solutions: Intelligent Agents Find Meaning of Text

Friday, January 18th, 2013

 

What if your computer could find ideas in documents? Building on the idea of fingerprinting documents, ai-one helped develop ai-BrainDocs – a tool to mine large sets of documents to find ideas using intelligent agents. This solves a big problem for knowledge workers: How to find ideas in documents that are missed by traditional keyword search tools (such as Google, Lucene, Solr, FAST, etc.).

Customers Struggle with Unstructured Text

Almost every organization struggles to find value in “big data” – especially ideas buried within unstructured text. Often a very limited set of vocabulary can be used to express very different ideas. Lawyers are particularly talented at this: They can use 100 unique words to express thousands of ideas by simply changing the ordering and frequencies of the words.

Lawyers are not the only ones that need to find ideas inside documents. Other use cases include finding and classifying complaints, identifying concepts within social media feeds such as Twitter or Facebook and mining PubMed to find related research articles. Recently, we have had several healthcare companies contact us to mine electronic health records (EHR) data to find information that is buried within doctors’ notes so they can predict adverse reactions, find co-morbidity risks and detect fraud.

The common denominator for all these use cases is simple: How to find “what matters most” in documents? They need a way to find these ideas fast enough to keep pace with the growth in documents. Given that information is growing at almost 20% per year – this means that a very big problem now will be enormous next year.

Problems with Current Approaches

We’ve heard numerous stories from customers who were frustrated at the cost, complexity and expertise required to implement solutions to enable machines to read and understand the meaning of free-form text. Often these solutions use latent semantic indexing (LSI) and latent Dirichlet allocation (LDA). In one case, a customer spent more than two years trying to combine LSI with a Microsoft FAST Enterprise search appliance running on SharePoint. It failed because they were searching a high volume of legal documents with very low variability. They were searching legal contracts to find paragraphs that included a very specific legal concept that could be expressed with many different combinations of words. Keyword search failed because the legal concept was expressed in commonly used words. LSI and LDA failed because the systems required a very large training set, often involving hundreds of documents. Even after reducing the specificity requirements, LSI and LDA still failed because they could not find the legal ideas at the paragraph level.

Inspiration

We found inspiration in the complaints we heard from customers: What if we could build an “intelligent agent” that could read documents like a person? We thought of the agent as an entry-level staff person who could be taught with a few examples then highlight paragraphs that were similar to (but not exactly like) the teaching examples.

Solution: Building Intelligent Agents

For several months, we have been developing prototypes of intelligent agents to mine unstructured text to find meaning. We built a Java application that combines ai-one’s machine learning API with natural language processing (OpenNLP) and NoSQL databases (MongoDB). Our approach generates an “ai-Fingerprint” that is a representational model of a document using keywords and association words. The “ai-Fingerprint” is similar to a graph G[V,E] where G is the knowledge representation, V (vertices) are keywords, and E (edges) are associations. This can also be thought of as a topic model.
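
To make the G[V,E] description concrete, here is a toy stand-in written in Python rather than the Java used in the project. It does not call ai-one’s API. The frequency cutoff, the stopword list and the function name `fingerprint` are assumptions chosen only so that the example yields keywords (V) and association edges (E).

```python
import re
from collections import Counter, defaultdict

# Assumed stopword list, kept tiny for the example.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "must", "be"}

def fingerprint(text, min_count=2):
    """Return (V, E): keywords and keyword -> association-word edges."""
    sentences = [re.findall(r"[a-z]+", s.lower()) for s in re.split(r"[.!?]", text)]
    sentences = [[w for w in s if w not in STOPWORDS] for s in sentences if s]
    counts = Counter(w for s in sentences for w in s)
    # Assumption: a word repeated at least min_count times is a keyword.
    vertices = {w for w, c in counts.items() if c >= min_count}
    edges = defaultdict(set)
    for s in sentences:
        for kw in vertices.intersection(s):
            # Associations are the other words sharing a sentence with the keyword.
            edges[kw].update(w for w in s if w != kw)
    return vertices, dict(edges)

V, E = fingerprint("Consent must be written. Written consent binds the tenant.")
```

In this sample, "consent" and "written" become keywords, and each carries the other plus "binds" and "tenant" as association words.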

The ai-Fingerprint can be generated for almost any size text – from sentences to entire libraries of documents. As you might expect, the “intelligence” (or richness) of the ai-Fingerprint is proportional to the size of text it represents. Very sparse text (such as a tweet) has very little meaning. Large texts, such as legal documents, are very rich. This approach to topic modelling is precise, even without training or using external ontologies.

[NOTE: We are experimenting with using ontologies (such as OWL and RDF) as a way to enrich ai-Fingerprints with more intelligence. We are eager to find customers who want to build prototypes using this approach.]

The Secret Sauce

The magic is that ai-one’s API automatically detects keywords and associations – so it learns faster, with fewer documents and provides a more precise solution than mainstream machine learning methods using latent semantic analysis. Moreover, using ai-one’s approach makes it relatively easy for almost any developer to build intelligent agents.

How to Build Intelligent Agents?

To build an intelligent agent, we first had to consider how a human reads and understands a document.

The Human Perspective

Humans are very good at detecting ideas – regardless of the words used to express them. As mentioned above, lawyers can express dozens of completely different legal concepts with a vocabulary of just a few hundred words. Humans can recognize the subtle differences between two paragraphs by how a lawyer uses words – both in meaning (semantics) and structure (syntax). Part of the cleverness of a lawyer is finding ways to combine as few words as possible to express a very precise idea to accomplish a specific legal or business objective. In legal documents, each new idea is almost always expressed in a paragraph. So two paragraphs might have the exact same words but express completely different ideas.

To find these ideas, a person (or computer) must detect the patterns of word use – similar to finding a pattern in a signal. For example, as a child I knew I was in trouble when my mother called me by my first and last name – the combination of these words created a “signal” that was different than when she just used my first name. Similarly, a legal concept has a different meaning if two words occur together, such as “written consent”, than if it only uses the word “consent.”
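
The “written consent” versus “consent” distinction can be shown with a small counting sketch. The function name `signals` and the sample text are assumptions for illustration. Real systems would look at far richer patterns than one adjacent word pair.

```python
import re

def signals(text, phrase="written consent", word="consent"):
    """Count how often a word occurs alone versus as part of a two-word phrase."""
    tokens = re.findall(r"[a-z]+", text.lower())
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return {"word": tokens.count(word), "phrase": bigrams.count(phrase)}

result = signals("Written consent is required. Consent may be withdrawn.")
```

The word appears twice but the two-word signal only once, which is the kind of difference a keyword search cannot see.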

The (Conventional) Machine Learning Perspective

It’s almost impossible to program a computer to find such “faint signals” within a large number of documents. To do so would require a computer to be programmed to find all possible combinations of words for a given idea to search and match.

Machine learning technologies enable computers to identify features within the data to detect patterns. The computer “learns” by recognizing the combinations of features as patterns.

[There are many forms of machine learning – so I will keep focused only on those related to our text analytics problem.]

Natural Language Processing

One of the most important forms of machine learning for text analytics is natural language processing (NLP). NLP tools are very good at codifying the rules of language for computers to detect linguistic features – such as parts of speech, named entities, etc.

However (at the time of this writing), most NLP systems can’t detect patterns unless they are explicitly programmed or trained to do so. Linguistic patterns are very domain specific. The language used in medicine is different than what is used in law, etc. Thus, NLP is not easily generalized. NLP only works in specific situations where there is predictable syntax, semantics and context. IBM Watson can play Jeopardy! but has had tremendous problems finding commercial applications in marketing or medical records processing. Very few organizations have the budget or expertise to train NLP systems. They are left to either buy an off-the-shelf solution (such as StoredIQ ) or hire a team of PhDs to modify one of the open-source NLP tools. Good luck.

Latent Analysis Techniques

Tools such as latent semantic analysis (LSA), latent semantic indexing (LSI) and latent Dirichlet allocation (LDA) are all capable of detecting patterns within language. However, they require tremendous expertise to implement and often require large numbers of training documents. LSA and LSI are computationally expensive because they must recalculate the relationships between features each time they are given something new to learn. Thus, learning the meaning of the 1,001st document requires a calculation across the 1,000 previously learned documents. LSA uses a matrix factorization technique called singular value decomposition (SVD) to isolate keywords. Unlike LSA, ai-one’s technology also detects the association words that give a keyword context.

Similar to our ai-Fingerprint approach, LDA uses a graphical model for topic discovery. However, it takes tremendous skill to develop applications using LDA. Even when implemented, it requires the user to make informed guesses about the nature of the text. Unlike LDA, ai-one’s technology can be learned in a few hours. It requires no supervision or human interaction. It simply detects the inherent semantic value of text – regardless of language.

Our First Intelligent Agent Prototype: ai-BrainDocs

It took our team about a month to build the initial version of ai-BrainDocs. Our team used ai-one’s keyword and association commands to generate a graph for each document. This graph goes into MongoDB as a JSON object that represents the knowledge (content) of each document.

Next we created an easy way to build intelligent agents. We simply provide the API with examples of concepts we want to find. This training set can be very short. For one type of legal contract, it only took 4 examples of text for the intelligent agent to achieve 90% accuracy in finding similar concepts.
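
The per-document graph stored as a JSON object, as described above, might look something like the following sketch. The field names `keywords` and `associations`, the helper `to_document` and the sample data are assumptions. Only `_id` is MongoDB’s own convention for a document key, and no database driver is used here.

```python
import json

def to_document(doc_id, vertices, edges):
    """Build a JSON-serializable dict for one document's graph."""
    return {
        "_id": doc_id,
        "keywords": sorted(vertices),
        # Sets are converted to sorted lists because JSON has no set type.
        "associations": {kw: sorted(words) for kw, words in sorted(edges.items())},
    }

doc = to_document(
    "contract-001",
    {"consent", "written"},
    {"consent": {"written", "tenant"}, "written": {"consent"}},
)
payload = json.dumps(doc, sort_keys=True)
```

A dict of this shape could be handed to a document database as is. The serialized `payload` is shown only to confirm that the structure is valid JSON.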

Unlike solutions that use LSI, LDA and other technologies, the intelligent agents in ai-BrainDocs find ideas at the paragraph level. This is a huge advantage when looking at large documents – such as medical research or SEC filings.

Next we built an interface that allows the end-user to control the intelligent agents by setting thresholds for sensitivity and determining how many paragraphs to scan at a time.
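
The two controls described above, a sensitivity threshold and the number of paragraphs scanned at a time, can be sketched with a deliberately simple agent. This is not how ai-BrainDocs scores text: the `Agent` class, the Jaccard word-overlap measure and the sample paragraphs are all assumptions standing in for the real fingerprint comparison.

```python
import re

def words(text):
    """Lowercase word set of a text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a, b):
    """Overlap of two word sets, from 0.0 (disjoint) to 1.0 (identical)."""
    return len(a & b) / len(a | b) if a | b else 0.0

class Agent:
    def __init__(self, examples, threshold=0.5):
        self.examples = [words(e) for e in examples]  # teaching examples
        self.threshold = threshold                    # sensitivity setting

    def score(self, paragraph):
        p = words(paragraph)
        return max(jaccard(p, e) for e in self.examples)

    def scan(self, paragraphs, batch_size=2):
        """Yield (index, score) for matching paragraphs, one batch at a time."""
        for start in range(0, len(paragraphs), batch_size):
            for offset, para in enumerate(paragraphs[start:start + batch_size]):
                s = self.score(para)
                if s >= self.threshold:
                    yield start + offset, s

agent = Agent(["tenant must give written consent"], threshold=0.5)
paragraphs = [
    "The tenant must give written consent",
    "Rent is due monthly",
    "Written consent of the tenant",
]
hits = [i for i, _ in agent.scan(paragraphs)]
```

Lowering the threshold makes the agent more sensitive: the third paragraph, which shares only three words with the example, is ignored at 0.5 but matched at 0.4.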

Our first customers are now testing ai-BrainDocs – and so far they love it. We expect to learn a lot as more people use the tool for different purposes. We are looking forward to developing ways for intelligent agents to interact – just like people – by comparing what they find within documents. We are finding that it is best for each agent to specialize in a specific subject. So finding ways for agents to compare their results using Boolean operators enables them to find similarities and differences between documents.
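
Comparing the results of specialized agents with Boolean operators maps naturally onto set operations. The agent names and paragraph numbers below are made up for the example.

```python
# Hypothetical results: paragraph indices flagged by two specialized agents.
indemnity_hits = {2, 5, 8, 11}
consent_hits = {5, 7, 11, 14}

both = indemnity_hits & consent_hits            # AND: flagged by both agents
either = indemnity_hits | consent_hits          # OR: flagged by at least one
only_indemnity = indemnity_hits - consent_hits  # AND NOT: indemnity without consent
exactly_one = indemnity_hits ^ consent_hits     # XOR: where the agents disagree
```

The AND result points to paragraphs where two concepts meet, while the XOR result highlights where documents or agents differ.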

One thing is clear: Intelligent agents are ideal for mining unstructured text to find small ideas hidden in big data.

We look forward to reporting more on our work with ai-BrainDocs soon.

Posted by: Olin Hyde

Building Intelligent Agents: Google Now versus Apple SIRI?

Friday, December 14th, 2012

It has been a long time since our last blog post. Why? We’ve been busy learning how to build better intelligent agents.

Today, Kurt and I were discussing ways to improve feature detection algorithms for use in a prototype application called ai-BrainDocs. This is a system that detects concepts within legal documents. This is a hard problem because legal concepts (or ideas) use the same words. That is, there are no distinguishing features in the text.

ai-one’s technology is able to solve this problem by understanding how the same word (keyword) can mean different things by its context (as defined by association words). Together, keywords and associations create an array that we call an ai-Fingerprint. This can be thought of as a graph that can be represented as G[V,E]. ai-Fingerprints are easy to build using our Topic-Mapper API.

We pondered how the intelligent agents for Android developed by Google (called Google Now) and Apple iOS (called SIRI) might perform on a simple test. We picked a use case where the words were sparse but unique: looking for the status of a departing flight on American Airlines. Both Google Now and Apple SIRI have tremendous advantages over ai-one because they: 1) have a lot more money to spend on R&D, 2) use expensive voice recognition technologies, and 3) store all queries made by every user so they can apply statistical machine learning to refine results from natural language processing (NLP).

Unlike Apple and Google, ai-one’s approach is not statistical. We use a new form of artificial neural network (ANN) that detects features and relationships without any training or human intervention. This enables us to do something that Google and Apple can’t: Autonomic learning. This is a huge advantage for situations where you need to develop machine learning applications to find information where you can’t define what you are seeking. This is common in so-called “Big Data” problems. It is also much cheaper, faster and more accurate than using the statistical machine learning tools that Apple and Google are pushing.

 

Posted by: Olin Hyde

Gartner Names ai-one Cool Vendor 2012 for Content Analytics

Tuesday, May 15th, 2012

Gartner Cool Vendor in Content Analytics, 2012

 

*GARTNER named ai-one in Cool Vendors in Content Analytics, 2012. The report reviews five vendors from around the world that offer potentially disruptive innovations for analyzing data to find actionable insights. Unlike traditional business intelligence solutions, these vendors provide technologies that can understand multiple types of information — including both structured and unstructured data.

The core value of ai-one’s technology is to make it easy for programmers to build intelligence into any application. Our APIs provide a way to mimic the way people detect patterns. “This is why we call it biologically inspired intelligence,” says founder and CEO Mr. Walter Diggelmann, “because it works just like the human brain.”

Answering the Most Important Questions

 These companies have received tremendous publicity. Both are funded by traditional Silicon Valley venture capital firms. No surprise that they strive to provide comprehensive machine learning solutions rather than a tool for the general programming public.

“We do something completely different! We provide a general purpose tool that you can combine with other technologies to solve a specific problem. We do not try to do everything. Rather we just do one thing: We find the answer to the question you didn’t know to ask,” says Diggelmann.

The advantage of ai-one’s approach to developers is that using the API is easy. The tool finds the inherent meaning of any data by detecting patterns. For example, feed it text and it will find every keyword and determine the association words that give each keyword context. Together, keywords and associations provide a complete and accurate summary of a document. The API gives precise results almost instantly and does not require any specialized training to use. Moreover, it is autonomic — as it works without any human intervention.

ai-one follows a technology licensing model — much like Qualcomm. The company makes money when licensees embed the API into commercial applications. ai-one works closely with its OEM partners to ensure that their products are successful.

ai-one’s technology enables programmers to build hybrid analytics solutions that integrate content from almost any digital source, in any language, regardless of its structure (or lack of structure). This capability has the potential to transform the way we think about business intelligence. “90% of the world’s data is unstructured,” says Diggelmann, “but 100% of the major business intelligence systems can’t read or understand it.  We provide a tool to bridge the gap.”

*Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

 

Big Data Just Got Smaller: New Approach to Find Information

Tuesday, November 15th, 2011

Press Release

For Immediate Release

ai-Fingerprint

ai-Fingerprint shows a graphical representation of the knowledge within a news article

San Diego, CA – Artificial intelligence vendor ai-one will unveil a new approach to graphically represent knowledge at the SuperData conference in San Diego on Wednesday November 16, 2011. The discovery, named ai-Fingerprint, is a significant breakthrough because it allows computers to understand the meaning of language much like a person. Unlike other technologies, ai-Fingerprint compresses knowledge in a way that can work on any kind of device, in any language and shows how clusters of information relate to each other. This enables almost any developer to use off-the-shelf and open-source tools to build systems like Apple’s SIRI and IBM Watson.

Ondrej Florian, ai-one’s VP of Core Technology invented ai-Fingerprints as a way to find information by comparing the differences, similarities and intersections of information on multiple websites. The approach is dynamic so that the ai-Fingerprint transforms as the source information changes. For example, the shape for a Twitter feed adapts with the conversation. This enables someone to see new information evolve and immediately understand its significance.

“The big idea is that we use artificial intelligence to identify clusters and show how each cluster relates to another,” said Florian. “Our approach enables computers to compare ai-Fingerprints across many documents to find hidden patterns and interesting relationships.”

The ai-Fingerprint is the collection of all the keywords and their associations identified by ai-one’s Topic-Mapper tool. Each keyword and its associations is a coordinate – much like what you would find on a map. The combination of these keywords and associations forms a graph that encapsulates the entire meaning of the document.
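
Treating each keyword and association pair as a coordinate, the differences, similarities and intersections between two fingerprints reduce to comparing sets of pairs. The sample fingerprints and the helper `edge_set` are hypothetical. The overlap ratio is one simple similarity measure, not necessarily the one Topic-Mapper uses.

```python
def edge_set(fingerprint):
    """Flatten keyword -> associations into a set of (keyword, association) pairs."""
    return {(kw, a) for kw, assocs in fingerprint.items() for a in assocs}

# Hypothetical fingerprints of two websites.
site_a = {"drug": {"trial", "dosage"}, "trial": {"placebo"}}
site_b = {"drug": {"trial", "approval"}, "patent": {"expiry"}}

a, b = edge_set(site_a), edge_set(site_b)
shared = a & b                             # intersection: knowledge both sites contain
only_a = a - b                             # difference: unique to the first site
only_b = b - a                             # difference: unique to the second site
similarity = len(shared) / len(a | b)      # share of all pairs that are common
```

Recomputing these sets whenever a source changes would give the evolving picture the press release describes for a Twitter feed.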

The real-world applications are impressive. “It solves a lot of so-called Big Data problems because the system learns by itself,” said Olin Hyde who worked with Florian on the project. “ai-Fingerprints work with existing computer languages and standards. So it only took us about a week to create a generic tool, called BrainBrowser, to find relationships in complex texts – such as summarizing news articles, searching for a job, or identifying new uses for a drug.”

To build BrainBrowser, the team fed ai-Fingerprint results from Topic-Mapper into a natural language processing tool, OpenNLP, so that the computer could understand the rules of grammar then tag parts of speech, chunk phrases and classify words into categories (also called named-entity recognition). The ai-Fingerprint is continuously updated by Topic-Mapper so that the computer can understand how information changes over time – as it does in a human conversation.

Next, the team built a little tool in Java that converted the output into a continuous data feed using an open-standard format called XGMML. This format shares the knowledge of a document as a network of words, sentences and relationships.
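
The team’s converter was written in Java and is not shown in the release. As a rough Python sketch of the same idea, the following writes a keyword and association graph as XGMML using only the standard library. The function name `to_xgmml`, the numeric node ids and the edge labels are assumptions. The element and attribute names (`graph`, `node`, `edge`, `id`, `label`, `source`, `target`) follow the XGMML format.

```python
import xml.etree.ElementTree as ET

XGMML_NS = "http://www.cs.rpi.edu/XGMML"

def to_xgmml(label, fingerprint):
    """Serialize keyword -> associations as an XGMML graph string."""
    ET.register_namespace("", XGMML_NS)
    graph = ET.Element(f"{{{XGMML_NS}}}graph", {"label": label, "directed": "1"})
    # Give every keyword and association word a stable numeric node id.
    all_words = sorted(set(fingerprint) | {a for assocs in fingerprint.values() for a in assocs})
    ids = {word: str(i) for i, word in enumerate(all_words, start=1)}
    for word in all_words:
        ET.SubElement(graph, f"{{{XGMML_NS}}}node", {"id": ids[word], "label": word})
    # One edge per (keyword, association) pair.
    for kw in sorted(fingerprint):
        for assoc in sorted(fingerprint[kw]):
            ET.SubElement(
                graph,
                f"{{{XGMML_NS}}}edge",
                {"source": ids[kw], "target": ids[assoc], "label": f"{kw} ({assoc})"},
            )
    return ET.tostring(graph, encoding="unicode")

xml_text = to_xgmml("ai-Fingerprint sketch", {"consent": {"written", "tenant"}})
```

The resulting string can be saved to a file with an .xgmml extension and opened in a graph viewer that supports the format.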

Finally, they visualized the result with an open-source bioinformatics tool, called Cytoscape, to show the differences, similarities and identify anomalous information among documents. The result is a graphic representation of knowledge that can show clusters, extract summaries and compare many documents at the same time.

The approach is easy for others to replicate with other technologies. “We used Topic-Mapper with Java, OpenNLP and Cytoscape,” said Florian, “But you could easily do this with Python, MATLAB and NLTK. Heck, you could throw a voice recognition tool on it, like Dragon or Nuance, and you can build an intelligent agent just like SIRI.”

ai-Fingerprint works in any language because Topic-Mapper looks only at byte-patterns. “The approach can give false positives if you don’t teach it the rules of language” warned Florian, “but it is very accurate once it learns the grammar from an outside source of information – such as a natural language processing system or an external database.”

ai-one’s engineering team sees ai-Fingerprints as a way to make it easier, faster and less expensive for their partners to develop intelligent systems. The team is now testing it for applications in advertising, financial analysis, medical research and search engine optimization (SEO).

“Our mission is to make powerful AI available to all developers. This is a big step in that direction,” said ai-one’s chief operating officer Tom Marsh. “We are eager to find academic and consulting partners who can build upon what we started.”

“BrainBrowser is just a minimally viable product (MVP) to prove the concept,” added Hyde. “The sky is the limit for those that want to build commercial applications. Just take the MVP code and customize to your needs.”

A demo of the system can be seen on www.ai-one.com and the semsys YouTube channel.  ai-one intends to provide the source code for ai-Fingerprint as part of its Topic-Mapper software development kit.

How to Build a Killer Application: AI and the Lean Startup

Tuesday, October 18th, 2011

 

How to Build a Killer Application: Artificial Intelligence and the Lean Startup

Quick pitch to the San Diego Tech Founders: Lean Startup Group