We are pleased to publish ISC’s submission under the DIUx program. The new “Defense Innovation Unit Experimental (DIUx) serves as a bridge between those in the U.S. military executing on some of our nation’s toughest security challenges and companies operating at the cutting edge of technology.” Powered by ai-one’s Nathan ICE artificial intelligence core for language, ISC’s Pytheas AI will provide ISC with the technology to assist researchers and help our governments keep us safe. Some proprietary sections have been deleted in the version below.
ISC White Paper for DIUx Technology Area of Interest: Knowledge Management
By Jeremy Toor, ISC Consulting Group
ISC/ai-one is developing a prototype using Pytheas artificial intelligence (Pytheas AI) that will provide automated, intelligent information management, or knowledge management (KM), of multiple data sources. Pytheas AI fingerprints the flow of data from almost any source, including chat, email, message traffic, and other data. Within a publish/subscribe architecture, this AI core then supports the user in building knowledge from the fingerprinted data through queries and intuitive alerts that recognize the differences between contextual situations and their importance.
The abundant quantity of data that is available to users, analysts, and commanders today can make it challenging to build a concise and accurate picture from which dynamic assessments can be made. Both Command and Control (C2) and intelligence systems are largely data-centric. Users who are required to make strategic and tactical decisions will benefit from a task-centric user experience that is able to manage information as it is created and presented, and distil many sources of data into a manageable data flow. This user experience, facilitated by Pytheas AI, will deliver a KM Engine that can accelerate the decision-making process.
Through Pytheas AI the user will be presented with data that has gone through automated processes to be categorized, tagged, and ranked according to its value in the current context of operations. Pytheas AI will give the user flexibility to tailor their focus area and pull information from a wide breadth of sources as they build situational awareness and confidence to take action.
ISC/ai-one proposes a three-phase project. Phase 1 will include one week for initial installation, configuration and user training. Phase 2 will include a five-month period to support data ingestion, intelligent agent training and dashboard customization. Phase 3 will include a three-week evaluation and close-out.
Pytheas AI is built with an artificial intelligence core to collect, organize and analyze language to uncover key links and patterns within large volumes of unstructured text. The application empowers analysts to find the relationships necessary to discover, manage, process and exploit data. Key features and attributes of Pytheas include:
- Discovery of Concepts through the use of Intelligent Agents
- Agent collections can be built from existing plans, roadmaps and strategy documents
- DoD analysts can use common KM collections or build and share concept agents
- Agents provide classification for query and tagging of documents
- Application core is language independent
- Fast and lightweight, running on PC-class machines or VMs
Pytheas AI is built upon ai-one’s BrainDocs software application (with the NathanICE API core), a commercially ready and viable technology that has been applied to several use cases similar to the requirements of the DIUx technology area of interest, knowledge management. Our prototype for KM is ready for demonstration using sample data.
Pytheas uses ai-one’s proprietary NathanICE API to discern, in the same way as the brain, patterns in the words and associations that are central to the meaning of all or a portion of a text document. Nathan extracts these keywords and associations, filtering out the noise to create a proprietary fingerprint array of the concept that can be used in many ways.
Pytheas uses the fingerprint of a trained concept to find (rank) similar concepts within a corpus of information (documents, websites, databases) and returns paragraph-level results sorted by “similarity”. These results support a variety of workflows in enterprise compliance, classification, search and knowledge management. Agent similarity scores are exported to Excel or your database to support analytics and BI tools. This can be done by the analyst for small ad hoc studies. Agents can also be used to code years of legacy data without additional training.
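The ranking and export workflow described above can be illustrated with a minimal sketch. The NathanICE fingerprinting is proprietary and is not reproduced here; a plain term-frequency cosine similarity is swapped in purely as a stand-in, and every name in the example (`rank_paragraphs`, `to_csv`, the sample memos and concept text) is hypothetical. The CSV output is one format that Excel or a database loader can read.

```python
import csv
import io
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_paragraphs(concept_text, documents):
    """Score every paragraph against the concept, most similar first.

    Each row keeps the document id and paragraph index so a result
    can be traced back to its source.
    """
    concept = Counter(tokenize(concept_text))  # stand-in "fingerprint"
    rows = []
    for doc_id, text in documents.items():
        paragraphs = [p for p in text.split("\n\n") if p.strip()]
        for index, paragraph in enumerate(paragraphs):
            score = cosine(concept, Counter(tokenize(paragraph)))
            rows.append(
                {"doc_id": doc_id, "paragraph": index, "similarity": round(score, 4)}
            )
    return sorted(rows, key=lambda r: r["similarity"], reverse=True)


def to_csv(rows):
    """Serialize ranked rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["doc_id", "paragraph", "similarity"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample corpus: paragraphs are separated by blank lines.
documents = {
    "memo_a": (
        "The convoy route was changed after the storm.\n\n"
        "Supply trucks need fuel and spare parts before the convoy departs."
    ),
    "memo_b": "The training schedule for new analysts starts on Monday.",
}

ranked = rank_paragraphs("convoy supply fuel logistics", documents)
print(to_csv(ranked))
```

In this toy corpus the second paragraph of `memo_a` ranks first because it shares the most terms with the concept text, while `memo_b` scores zero.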
Users employ agents in Pytheas AI to organize text based on contextual ideas and metadata dimensions, improving accuracy and consistency and saving substantial amounts of time in this tedious process.
The Basic Elements of Pytheas
Documents – Pytheas is capable of analyzing any form of unstructured text. In fact, our technology works best with semantically rich content written in your business vernacular, without external taxonomies or ontologies. Working at the paragraph level, it has been used on everything from text messages to database fields to long documents, always with full traceability to source.
Conceptual Fingerprints – This is the “secret sauce” of our discovery capabilities. Pytheas uses the Nathan API keywords and associations to create semantic “fingerprints” of concepts. Because one concept can be written in multiple ways, our algorithm does not rely on word counts, natural language processing (NLP) or latent semantic analysis (LSA) when identifying and fingerprinting concepts.
Intelligent Agents – Pytheas agents examine and compare the conceptual fingerprints to find traces of concepts buried within your data. Our premise is that the analyst is the expert and needs to be able to train their own army of software agents to “read” documents and deliver the relevant paragraph. When agents are used as a collection, their combined scores set the context for a user’s query.
Paragraph Level Concept Discovery – Pytheas provides the ability to categorize and display concept results at the paragraph level. Users do not need to hunt through documents trying to find a concept that a search engine claims to be present. Our system will return the paragraph(s) that closely match a concept, and will sort and group the concepts by similarity to one another. Paragraphs can be evaluated and traced back to their source document for reporting and distribution.
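As a rough illustration of how scores from a collection of agents could describe the context of a paragraph, consider the sketch below. It is not the Pytheas implementation: the `Agent` class, the `context_profile` helper, the training phrases, and the use of Jaccard word overlap as the similarity measure are all assumptions made for this example.

```python
import re


def tokens(text):
    """Return the set of lowercase alphabetic words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Jaccard overlap between two word sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class Agent:
    """Hypothetical concept agent trained from a short example text."""

    def __init__(self, name, training_text):
        self.name = name
        # Stand-in for a conceptual fingerprint: just the set of words.
        self.fingerprint = tokens(training_text)

    def score(self, paragraph):
        return jaccard(self.fingerprint, tokens(paragraph))


def context_profile(agents, paragraph):
    """Score one paragraph with every agent in the collection."""
    return {agent.name: round(agent.score(paragraph), 3) for agent in agents}


collection = [
    Agent("logistics", "fuel supply convoy transport"),
    Agent("cyber", "network intrusion malware firewall"),
]

profile = context_profile(collection, "Malware was found on the network firewall")
best = max(profile, key=profile.get)
print(profile, best)
```

The resulting dictionary of per-agent scores is one simple way to represent context: a query could then be filtered or weighted by the agents that score highest.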
Figure 1. Topic Mapper Entity and Sentiment in SEC Filings
Ease of Integration – The Pytheas application can be used with conventional desktop tools for ad hoc projects. For workflow automation, a RESTful API provides developers an easy method to process documents and export results to SQL or other DBs for reporting and visualizations.
Optional Entity Extraction and Sentiment (Figure 1 above) – Complementing paragraph-level concept detection is the ability to extract entities and/or score for sentiment so this information can be added to visualizations and follow-on workflows. Clients can use their own technology for this purpose or add custom analytics to further refine the insight for social network analysis, tagging existing file headers or streamlining the flow of information to the analyst.
The immediate benefit to DoD is increased productivity, consistent analysis and more effective information management. The long-term benefit is an ability to make quicker, more informed decisions.
Operational users of this prototype include any person who has to search through data. This includes anyone using SharePoint and other common organizational databases. Analysts who must sift through massive amounts of data in order to discover relevant information will save countless hours through the employment of our prototype. In a similar use case at NASA, our customer was able to complete a typical six-week project in one week.
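The export step of such an automated workflow might look like the following sketch. The API itself is not documented in this paper, so no API calls are shown: the result rows are hard-coded, hypothetical stand-ins for what a document-processing call might return, and the `agent_scores` table and its columns are assumptions. An in-memory SQLite database is used so the example is self-contained.

```python
import sqlite3

# Hypothetical result rows: (source document, paragraph index, agent, similarity).
results = [
    ("report_17.txt", 4, "logistics", 0.82),
    ("report_17.txt", 9, "cyber", 0.35),
    ("report_22.txt", 1, "logistics", 0.64),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE agent_scores (
        source_doc TEXT    NOT NULL,
        paragraph  INTEGER NOT NULL,
        agent      TEXT    NOT NULL,
        similarity REAL    NOT NULL
    )
    """
)
# Parameterized inserts keep document names and agent labels safely quoted.
conn.executemany("INSERT INTO agent_scores VALUES (?, ?, ?, ?)", results)
conn.commit()

# A reporting query of the kind a BI tool might run: hits and best score per agent.
summary = conn.execute(
    """
    SELECT agent, COUNT(*) AS hits, MAX(similarity) AS best
    FROM agent_scores
    GROUP BY agent
    ORDER BY agent
    """
).fetchall()

for agent, hits, best in summary:
    print(f"{agent}: {hits} paragraph(s), best similarity {best:.2f}")
```

Keeping the source document and paragraph index in every row preserves traceability from a report or visualization back to the original text.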
Company and Relevant Use Case
Led by ISC, personnel from ISC Consulting and ai-one inc. will execute the project.
ISC Consulting Group is a Service-Disabled Veteran-Owned Small Business (SDVOSB). We are headquartered in Sierra Vista, Arizona, with operational offices at Ft. Huachuca, AZ; Orlando, FL; Ft. Gordon, GA; and Northern Virginia. ISC provides a full spectrum of services, products and solutions supporting the DoD Intelligence Community and key commercial clients with advanced capabilities in Instructional Solutions, Cyber Security, Command and Control planning and operations, Intelligence operations, Information Technology, and Data Analytics through Artificial Intelligence products and services.
ai-one inc. is the developer of a proprietary core technology that emulates the complex pattern recognition functions of the human brain and can detect the key features and contextual meaning of text, time-series and visual data. This technology will enable DIUx to score and analyze any piece of textual content and discover information by concept, bringing the dimension of AI understanding to knowledge management. This technology automatically generates a lightweight ontology that easily detects all relationships among data elements, addressing the immediate process and schedule problems facing DIUx knowledge management.
ISC has served several clients with Pytheas technology, including NASA Marshall Space Flight Center (MSFC). Currently, Pytheas is being used by MSFC’s Advanced Concepts Office (ACO) under a Cooperative Agreement to assist in technology roadmap development and separately by the Office of Strategic Analysis and Communication (OSAC) to manage and report on their portfolio of project investments (similar to SBIR grants). For example, the roadmap project is described below:
Overview of the NASA Advanced Concepts TAPP Pilot Project
The Advanced Concepts Office (ACO) at MSFC, NASA is developing and refining methods and processes for performing Information Based Decisions for Strategic Technology Investments. This system is currently referred to as TAPP, the Technology Alignment & Prioritization Process. This process supports the evaluation of the technologies for investment by NASA and MSFC to ensure alignment with NASA mission plans, technology area priorities and strategic knowledge gaps.
TAPP creates an interactive system for exploring the almost mind-boggling complexity of planning for multiple missions using over 400 technologies (many still in basic research) and hundreds of interrelated elements/sub-elements over 30-year planning horizons.
Pytheas provides NASA the capability to have data mining agents parse and score unstructured content against the nearly 400 technologies identified in the 15 Technology Roadmaps. This ability to score proposals with agents allows ACO to perform statistical analysis within the Information Based Decision framework for Strategic Investments.
The immediate benefit to ACO is increased productivity and consistent analysis. The long-term benefit is an ability to perform quicker, more informed technology assessments, feasibility analyses, and concept studies that align with NASA’s evolving strategic goals and multiple mission objectives.
Given a six-month prototype build period, ISC/ai-one will demonstrate to DIUx that its Pytheas AI application will enable the organization to save critical time and human capital in the implementation and operation of knowledge management systems. Pytheas will empower the IC to rapidly and effectively sort through vast volumes of text data in order to gain knowledge and equip decision makers with the right information to achieve stated organizational analytical research outcomes.