Report on data mining at DHS
The DHS inspector general released a report last week entitled “Survey of DHS Data Mining Activities,” which identifies and describes 12 systems and capabilities at DHS, some operational and others under development. Many of the 12 had received little to no public scrutiny before this report, most notably the Intelligence and Information Fusion (I2F) program under the DHS Office of Intelligence and Analysis, which was previously alluded to in the IT spending analysis for the FY 2007 budget request and at a conference in May but had not really been discussed until now. The report notes:
The purpose of the I2F is to make operational an integrated intelligence and information capability for DHS. This capability will enable intelligence analysts to understand relationships that would otherwise not be readily apparent. I2F is in early development and is primarily dependent on the analyst manually processing, compiling, and analyzing data. The next version of the system will be a set of tools and technologies integrated to support the intelligence analyst.
I2F provides intelligence analysts with tools that aid in the discovery and tracking of terrorism threats to the United States population and infrastructure. I2F is principally made up of commercial off-the-shelf software, but also integrates government off-the-shelf programs. These programs are used for entity extraction, search capabilities, and link analysis.
The report also discusses the ADVISE program, which has been the subject of occasional worried speculation over the past year and a half. The report adds details to previous official accounts of ADVISE, describing how it uses semantic graph techniques to “connect information extracted from text and images, databases, and simulation and modeling tools to provide a watch-and-warning system for analysts.”
Overall, an informative report. For more info, see these stories last week from Washington Technology and the Washington Times.
Also from the DHS report:
Science and Technology (S&T) is developing an advanced analytics capability called Analysis, Dissemination, Visualization, Insight and Semantic Enhancement (ADVISE), as described in Table 6. ADVISE is an advanced information technology that can integrate information and facts from many different types of data. Since ADVISE is a “technology framework,” it can be tailored and deployed for specific purposes and areas of interest. For example, it is being developed to incorporate chemical, biological, radiological, nuclear, and explosive threat and effects data. It is intended to ingest data from a variety of sources, ranging from highly structured content, such as database records, to unstructured content, such as message traffic. Still in development, ADVISE will connect information extracted from text and images, databases, and simulation and modeling tools to provide a watch-and-warning system for analysts.
ADVISE employs semantic graphs to determine relationships and patterns among data and multiple visualization techniques to display the resulting information. The Department seeks to predict threat and vulnerabilities, such as through the detection of relationships between seemingly disjointed entities. Semantic graphs organize data entities regarding threats and vulnerabilities and link their relationships. Thus, hidden relationships in the data are uncovered by examining the structure and properties of the semantic graph. For example, a simple semantic graph can link people, workplaces, and towns as well as indicate a relationship with various friends. Studying the links can assist in understanding the relationships between entities, and help identify threats and vulnerabilities. S&T expects ADVISE’s ability to apply the capabilities of semantic data fusion, link analysis, and unstructured text analysis will be a powerful capability that will allow analysts to find the expected and discover the unexpected.
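To make the report's "simple semantic graph" example concrete, here is a minimal sketch of the general idea: entities such as people, workplaces, and towns become nodes, labeled relationships become links, and a breadth-first search surfaces a chain connecting two entities that share no direct link. This is not ADVISE's actual design, which the report does not describe at this level. The class name `SemanticGraph`, the relation labels, and all of the sample names are hypothetical, chosen only for illustration.

```python
from collections import defaultdict, deque


class SemanticGraph:
    """Toy semantic graph: nodes are entities, links carry a relation label.

    Hypothetical illustration only; not based on any real DHS system.
    """

    def __init__(self):
        # entity -> list of (relation label, neighboring entity)
        self._adj = defaultdict(list)

    def add_link(self, subject, relation, obj):
        # Links are stored in both directions so the search can walk them
        # either way; the relation label is kept as an annotation.
        self._adj[subject].append((relation, obj))
        self._adj[obj].append((relation, subject))

    def find_path(self, start, goal):
        """Return the shortest chain of (entity, relation, entity) steps
        from start to goal, or None if the two are not connected."""
        if start == goal:
            return []
        visited = {start}
        queue = deque([(start, [])])
        while queue:
            node, path = queue.popleft()
            for relation, neighbor in self._adj.get(node, []):
                if neighbor in visited:
                    continue
                step = (node, relation, neighbor)
                if neighbor == goal:
                    return path + [step]
                visited.add(neighbor)
                queue.append((neighbor, path + [step]))
        return None


# Made-up sample data: people, a workplace, and a town.
graph = SemanticGraph()
graph.add_link("Alice", "works_at", "Acme Corp")
graph.add_link("Bob", "works_at", "Acme Corp")
graph.add_link("Bob", "friend_of", "Carol")
graph.add_link("Carol", "lives_in", "Springfield")
graph.add_link("Dave", "lives_in", "Springfield")

# Alice and Dave have no direct link; the search finds the indirect chain.
path = graph.find_path("Alice", "Dave")
for source, relation, target in path:
    print(f"{source} --{relation}--> {target}")
```

Even in this toy version, the relationship between the first and last person only becomes visible by following links through a shared workplace, a friendship, and a shared town, which is the kind of "hidden relationship" the report describes. It also shows why such tools raise questions: every intermediate link is only as reliable as the data behind it.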