Is Outsourcing the Answer?
Acquiring
and exploiting open source information is a challenge
partially solved through partnerships and perseverance
By Edward F. Dandar Jr.
Commercial
vendors, universities, and military reservists have the
background and experience to accomplish several actions.
They can perform continuous data monitoring and
acquisition tasks in a responsive manner which supports
intelligence community trans-national, military
operations other-than-war, and major regional conflict
issues. Industry and academic centers have information
specialists with expertise on various regions of the
world and subject areas. These non-government analysts
can acquire and pre-process open source information which
will help satisfy many civil, political, law enforcement,
economic and military community information requirements.
Using these analysts
through an outsourcing program may be a partial solution
to acquiring and exploiting open source information. It
is one of several avenues being studied this past year by
an intelligence community working group charged with
producing an information technology assessment. The
author studied various commercial businesses, drawing
information from several sources. Additionally, several
information gathering discussions were held with defense
contractors. A review of open source contract efforts by
several small businesses was also completed. The
intelligence community continues to study
ways to take advantage of these non-government
open-source acquisition and exploitation assets.
U.S. responsiveness to
natural and man-made disasters relies heavily on a
variety of open sources. Humanitarian relief
organizations provide valuable information. Open source
information from previous or existing intelligence
community external research contracts also helps to obtain
a realistic "picture" of the crisis and to take
appropriate action. This "picture" includes
information on a country's or region's national
cultures (religions, customs), personalities, and basic
infrastructures (food, water, medical, communications,
transportation, critical supplies, power generation, and
distribution systems).
Incorporating open source information vendors into the
flow of intelligence will satisfy two objectives for
meeting the needs of policy makers and commanders.
First, in the near
term, they can assist in fulfilling short-suspense
contingency requirements through accessing, filtering,
and maintaining on-call source data. Secondly, they can
provide strategic-level open source research to alert the
intelligence community about activities that indicate an
abnormal or potentially alarming situation.
To meet open source
information research requirements, one must first
understand its availability and utility. Many commercial
vendors, academics, and reservists maintain their current
knowledge of the global information environment by
holding membership in professional organizations
dedicated to information research. They network and
attend professional, international symposiums,
conventions and trade shows. These activities help them
stay abreast of new avenues of open source information
and pursue commercial business interests. In addition,
these information specialists have created both domestic
and international networks (academic and professional)
which can be leveraged.
Industry and university
centers maintain contemporary technical libraries with
reference books, specialized publications and journals
from around the world. They rigorously evaluate
information sources to minimize bias and unsubstantiated
facts which may have been reported or published. The open
source information providers also acquire information not
readily available through data services. They also
solicit information from non-electronic sources such as
embassies, trade missions, foreign libraries and
organizations.
Because focused data
acquisition is a fundamental part of their business, the
open source information providers are experienced at
acquiring gray literature (publicly available information
which is not distributed through normal publishing
channels). Examples of gray literature are academic
writings, conference proceedings and trade show
literature, video and still imagery reports, marketing
research studies, international tender documents, and
industry-sponsored research. Knowing what information is
available and obtaining it requires a staff experienced
in nontraditional research methods with a broad base of
commercial contacts.
Foreign Language Hurdles
Foreign
language open source documents can be translated by the
open source information exploitation vendors'
language support centers where available. Translators who
cover several languages and dialects normally staff the
centers. Contractors, academics, and reservists represent
a large pool of both subject matter expertise and foreign
language capabilities which can be tapped quickly to meet
current intelligence community needs.
The need to process text
from multiple languages is increasingly important to
intelligence analysis. Historically, foreign language
processing needed human translators and was constrained
by languages and domains with high mission priority.
Increased access to foreign language sources, especially
on-line open source literature, has created requirements
for a range of tools to handle multiple languages. The
overall goal is to provide a multilingual text analysis
capability for foreign language information.
Tools must be developed to
facilitate analysts' handling of foreign language
text in multilingual environments, especially when
analysts may not be language experts. These tools may
range from automatic language classification capabilities
to identify the source material's language, to
tailorable information extraction and summarization tools
for abstracting foreign language documents. The range can
extend to presentation tools for handling specialized
character sets as well. Machine translation capabilities
are key to supporting a broad user population with
wide-ranging language skills and domain expertise.
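The automatic language classification step mentioned above can be illustrated with a minimal sketch. It is not drawn from any system named in this article; it simply counts common function words, and the tiny word lists and the name classify_language are assumptions made for illustration. A fielded tool would rely on far larger models.

```python
import re

# Hypothetical, deliberately tiny stopword lists (illustration only).
STOPWORDS = {
    "english": {"the", "and", "of", "to", "in", "is", "that", "for"},
    "french": {"le", "la", "les", "et", "de", "des", "est", "pour"},
    "german": {"der", "die", "das", "und", "ist", "nicht", "für", "mit"},
    "spanish": {"el", "los", "las", "y", "de", "es", "para", "que"},
}


def classify_language(text):
    """Return the language whose stopwords occur most often, or None."""
    # Split into lowercase alphabetic words (Unicode letters included).
    words = re.findall(r"[^\W\d_]+", text.lower())
    scores = {
        lang: sum(1 for word in words if word in stops)
        for lang, stops in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    # No stopword matched at all: report that the language is unknown.
    return best if scores[best] > 0 else None
```

An analyst-facing tool could use such a result to route a document to the matching translation system, or to flag text it cannot identify for a linguist.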
A number of components in
the intelligence community and DoD are working on machine
translation research. DoD personnel are completing the
majority of the basic work while various intelligence
community organizations are performing additional work.
Machine translation software,
which automatically translates text
for specific language pairs (e.g.,
Chinese-English), is hosted and maintained on the
intelligence community's Open Source Information System and the
Intelink-TS network. It will be available shortly on the
Intelink-S network. This machine translation capability
has been a major success story for the U.S. Air
Force's National Air Intelligence Center at
Wright-Patterson Air Force Base, Ohio. Other DoD agencies
are developing machine translators for
"low-density" languages.
Officials at the National
Air Intelligence Center have been engaged in machine
translation for over 40 years. They began with the
world-famous Systran Russian to English machine
translation system, which was developed during the Cold
War and continues to support today's intelligence
community translation needs. There are 11 Systran machine
translation systems in use throughout the U.S.
Government. They are: Russian to English, French to
English, German to English, Spanish to English, Italian
to English, Portuguese to English, Japanese to English,
Serbo-Croatian to English, Chinese to English, Korean to
English and English to Korean. The last three systems are
in very early development stages. Officials will begin
developing Ukrainian and Cantonese this year and host
operational prototypes within two years.
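For readers who want to track these systems in software, the list above can be held in a small lookup table. This is only a sketch: the pairs come from this article, while the status labels ("in use", "early development") and the function names are assumptions introduced for illustration.

```python
# Language pairs as listed in the article, keyed by (source, target).
# Status labels are illustrative shorthand, not official designations.
SYSTRAN_PAIRS = {
    ("Russian", "English"): "in use",
    ("French", "English"): "in use",
    ("German", "English"): "in use",
    ("Spanish", "English"): "in use",
    ("Italian", "English"): "in use",
    ("Portuguese", "English"): "in use",
    ("Japanese", "English"): "in use",
    ("Serbo-Croatian", "English"): "in use",
    ("Chinese", "English"): "early development",
    ("Korean", "English"): "early development",
    ("English", "Korean"): "early development",
}


def pair_status(source, target):
    """Return the status of a language pair, or None if it is not listed."""
    return SYSTRAN_PAIRS.get((source, target))


def sources_for(target, status=None):
    """List source languages that translate into target, optionally by status."""
    return sorted(
        src
        for (src, tgt), state in SYSTRAN_PAIRS.items()
        if tgt == target and (status is None or state == status)
    )
```

Keying on the ordered pair matters because direction counts: English to Korean is listed, but English to Russian is not.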
The Systran machine
translation systems no longer require mainframe
computers. The software is available for UNIX and
DOS/Windows. The National Air Intelligence Center owns
unlimited rights for free use by U.S. Government
agencies. Soon, U.S. Government organizations with
appropriate computer systems can download certain Windows
versions of Systran from the Open Source Information
System and Intelink networks. Systems which can be
downloaded include Russian, French, German, Spanish,
Italian and Portuguese.
Shrink-wrapped versions of
Systran software are available from Dale Bostad at the
National Air Intelligence Center. Direct questions
regarding machine translation capabilities and software
access to NAIC/DXLT, Dale Bostad
([email protected]) or call (513) 257-6538 or DSN
787-6538 or FAX: 656-1669. The request for software is
sent to Systran Software Inc., logged on a government
database and immediately sent to the requester. All
languages noted above are available.
Intelligence community
analysts can exploit foreign electronic open sources in
the above available languages. Analysts can exploit the
source by pasting an Internet hypertext markup language
page into the Open Source Information System machine
translation system or other machine translation equipped
U.S. system. The entire web page will be translated into
English and returned for quick evaluation of its content.
Contact Bostad at the address above for the latest
procedures for accomplishing electronic machine
translation on the fly.
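The page-translation workflow described above can be sketched in a few lines. This is not the Open Source Information System interface, whose details are not given here. The sketch strips markup from a hypertext markup language page with the Python standard library and hands each piece of visible text to a translate function that the caller must supply; the names VisibleTextExtractor, extract_text and translate_page are assumptions.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collect text nodes from a page, skipping script and style content."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_text(html):
    """Return the visible text of an HTML page, one chunk per line."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def translate_page(html, translate):
    """Apply a caller-supplied translate(text) function to each text chunk."""
    return [translate(line) for line in extract_text(html).split("\n") if line]
```

Passing the translator in as a parameter keeps the sketch independent of any particular machine translation product; any function that maps a string to a string will do for a trial run.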
The National Air
Intelligence Center is working with the Federal Intelligent
Document Understanding Laboratory and other intelligence
community members to develop
optical character reader technology which integrates
Systran and other machine translation systems. A
government-sponsored Eastern Computers Incorporated
Chinese optical character reader package is now
integrated with Systran Chinese. The Cuneiform optical
character reader package, which includes seven Germanic
and Romance languages, Russian Cyrillic, Serbian Cyrillic
and Croatian Roman languages, has been integrated with
Systran. E-Typist, a commercial off-the-shelf Japanese
optical character reader package, also has been integrated
with Systran Japanese. Direct inquiries on the status of
machine translation or optical character reader
capabilities to Bostad at the address given previously.
Another intelligence
community organization is developing Arabic and Farsi
optical character readers. These machine translation and
optical character reader capabilities are major steps
forward in dealing with the global information
environment which is expected to become more regionally
and linguistically focused.
Editor's Note: Part
III of this series of articles will examine a possible
open source information strategy for dealing with the
flood of open source material.
Ed
Dandar is a civilian employee of the Army Intelligence
and Security Command and is assigned to the Deputy
Assistant Secretary of Defense for Intelligence and
Security, Intelligence Systems Support Office. Comments
can be provided to him at: [email protected]