Fast and reliable decisions depend on the availability of precise data and expert knowledge. To address these aspects for resource planning, we provide an intuitive situation analysis in mixed reality with a seamless transition between conventional location planning and virtual inspection of the operational environment, which, in addition to a better understanding of the terrain, allows for location-independent cooperation of several users. The basic idea of the presented concept comprises intuitive deployment preparation in mixed reality in a realistic 3D environment combined with an efficient 2D situation overview using, e.g., a large display, tablet or smartphone. The aim is to enable transparent collaboration between different system environments regardless of the location of the respective users, leading to an improved understanding of the terrain and increased situational awareness for fast and demand-oriented location planning. The solution is based on three building blocks. First, a user-oriented display of high-resolution 3D geodata for a better understanding of the terrain on the basis of data standards, including the performance optimizations necessary for use in mixed reality environments. Second, the combination of two-dimensional and three-dimensional displays of the operational picture, with the information synchronized between the different platforms, allowing a free choice of means and thus demand-oriented deployment planning. And third, the support of content-related cooperation of remote users for fast decision making via wired or mobile networks.
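To make the synchronization idea of the second building block concrete, the following minimal sketch (in Python, with invented names; the abstract does not describe the system's actual interfaces) shows how an operational-picture update could be serialized once and applied identically by a 2D overview client and a 3D mixed-reality client:

```python
# Minimal sketch of synchronizing an operational-picture annotation between a
# 2D overview client and a 3D mixed-reality client. All names are hypothetical
# illustrations of the concept, not the interfaces of the system described above.
import json
from dataclasses import dataclass, asdict

@dataclass
class SituationUpdate:
    object_id: str        # shared identifier across all clients
    kind: str             # e.g. "unit", "incident", "planned_location"
    lon: float            # WGS84 coordinates so 2D and 3D views agree
    lat: float
    alt_m: float = 0.0    # altitude only matters for the 3D terrain view
    author: str = ""      # who placed/edited the object (remote collaboration)

def encode_update(update: SituationUpdate) -> bytes:
    """Serialize an update for broadcast to all connected clients (2D and 3D)."""
    return json.dumps(asdict(update)).encode("utf-8")

def apply_update(picture: dict, payload: bytes) -> None:
    """Apply a received update to the local copy of the operational picture."""
    update = json.loads(payload.decode("utf-8"))
    picture[update["object_id"]] = update  # last-writer-wins for simplicity

# Example: a user in the mixed-reality view marks a candidate location; the same
# object then appears on the 2D large display or tablet after the next update.
local_picture = {}
msg = encode_update(SituationUpdate("loc-17", "planned_location", 8.4037, 49.0069, 115.0, "analyst_a"))
apply_update(local_picture, msg)
```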
In recent years, professionally used workstations have become increasingly complex and multi-monitor systems are more and more common. Novel interaction techniques like gesture recognition have been developed but are used mostly for entertainment and gaming purposes. These human-computer interfaces are not yet widely used in professional environments where they could greatly improve the user experience. To approach this problem, we combined existing tools in our image-interpretation-workstation of the future, a multi-monitor workplace comprising four screens. Each screen is dedicated to a specific task in the image interpretation process: a geo-information system to geo-reference the images and provide a spatial reference for the user, an interactive recognition support tool, an annotation tool and a reporting tool. To further support the complex task of image interpretation, self-developed interaction systems for head-pose estimation and hand tracking were used in addition to more common technologies like touchscreens, face identification and speech recognition. A set of experiments was conducted to evaluate the usability of the different interaction systems. Two typical extensive image interpretation tasks were devised and approved by military personnel. They were then tested with a current setup of an image interpretation workstation using only keyboard and mouse against our image-interpretation-workstation of the future. To get a more detailed look at the usefulness of the interaction techniques in a multi-monitor setup, the hand tracking, head-pose estimation and face recognition were further evaluated using tests inspired by everyday tasks. The results of the evaluation and the discussion are presented in this paper.
Real-time motion video analysis is a challenging and exhausting task for the human observer, particularly in safety and
security critical domains. Hence, customized video analysis systems providing functions for the analysis of subtasks like
motion detection or target tracking are welcome. While such automated algorithms relieve the human operators from
performing basic subtasks, they impose additional interaction duties on them. Prior work shows that, e.g., for interaction
with target tracking algorithms, a gaze-enhanced user interface is beneficial.
In this contribution, we present an investigation on interaction with an independent motion detection (IDM) algorithm.
Besides identifying an appropriate interaction technique for the user interface – again, we compare gaze-based and
traditional mouse-based interaction – we focus on the benefit an IDM algorithm might provide for a UAS video analyst.
In a pilot study, we exposed ten subjects to the task of moving target detection in UAS video data twice, once performing
with automatic support, once performing without it. We compare the two conditions considering performance in terms of
effectiveness (correct target selections). Additionally, we report perceived workload (measured using the NASA-TLX
questionnaire) and user satisfaction (measured using the ISO 9241-411 questionnaire).
The results show that a combination of gaze input and automated IDM algorithm provides valuable support for the
human observer, increasing the number of correct target selections by up to 62% and reducing workload at the same time.
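For illustration, the following sketch shows the kind of support an independent-motion-detection step can provide: moving-object candidates are proposed automatically and the operator then only has to confirm them, e.g., by gaze + key press. It uses OpenCV background subtraction on an assumed stabilized video as a simple stand-in and is not the IDM algorithm evaluated in the study.

```python
# Toy stand-in for an independent-motion-detection step: background subtraction
# on an (assumed stabilized) video proposes moving-object candidates that the
# operator then only confirms. Requires OpenCV 4.x and NumPy.
import cv2
import numpy as np

def moving_object_candidates(video_path: str, min_area: int = 50):
    """Yield (frame, bounding_boxes) pairs for detected independent motion."""
    capture = cv2.VideoCapture(video_path)
    subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=False)
    kernel = np.ones((3, 3), np.uint8)
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        mask = subtractor.apply(frame)
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)  # suppress speckle noise
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        boxes = [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) >= min_area]
        yield frame, boxes
    capture.release()
```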
KEYWORDS: 3D imaging standards, Data modeling, Geographic information systems, 3D acquisition, 3D modeling, Data acquisition, Systems modeling, Standards development, Analytical research, 3D image processing, Visualization
In the area of working with spatial data, in addition to the classic, two-dimensional geometrical data (maps, aerial
images, etc.), the need for three-dimensional spatial data (city models, digital elevation models, etc.) is increasing.
Due to this increased demand, the acquisition, storage and provision of 3D-enabled spatial data in Geographic Information
Systems (GIS) is becoming more and more important. Existing proprietary solutions quickly reach their limits during data
exchange and data delivery to other systems, generating a large workload that is very costly. However, it is
noticeable that these expenses and costs can generally be significantly reduced by using standards. The aim of this research
is therefore to develop a concept in the field of three-dimensional spatial data that relies on existing standards whenever
possible. In this research, military image analysts are the primary user group of the system.
To achieve the objective of the widest possible use of standards in spatial 3D data, existing standards, proprietary
interfaces and standards under discussion have been analyzed. Since the GIS of the Fraunhofer IOSB used here
already uses and supports OGC (Open Geospatial Consortium) and NATO STANAG (NATO Standardization
Agreement) standards for the most part, special attention was paid to these standards for possible use.
The most promising standard is the OGC standard 3DPS (3D Portrayal Service) with its variants W3DS (Web 3D
Service) and WVS (Web View Service). A demo system was created that uses a standardized workflow from data
acquisition through storage to provision and demonstrates the benefit of our approach.
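As an illustration of the standards-based access path, a 3D Portrayal Service scene request could be assembled as follows; the endpoint, layer name, version and format values are placeholders for illustration, not those of the demo system:

```python
# Illustrative sketch of requesting 3D content from an OGC 3D Portrayal Service
# endpoint. Parameter names follow the GetScene operation as commonly described;
# the service URL, layer and format are placeholders, not the demo system's.
from urllib.parse import urlencode

def build_getscene_url(base_url: str, layer: str, bbox: tuple, crs: str = "EPSG:4326") -> str:
    params = {
        "SERVICE": "3DPS",
        "VERSION": "1.0",
        "REQUEST": "GetScene",
        "LAYERS": layer,
        "CRS": crs,
        "BOUNDINGBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "FORMAT": "model/gltf+json",                     # a delivery format the server offers
    }
    return f"{base_url}?{urlencode(params)}"

# e.g. fetch a city-model tile for streaming into the mixed-reality client
url = build_getscene_url("https://example.org/3dps", "city_model", (8.39, 49.00, 8.42, 49.02))
```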
Motion video analysis is a challenging task, particularly if real-time analysis is required. It is therefore an important issue how to provide suitable assistance for the human operator. Given that the use of customized video analysis systems is more and more established, one supporting measure is to provide system functions which perform subtasks of the analysis. Recent progress in the development of automated image exploitation algorithms allows, e.g., real-time moving target tracking. Another supporting measure is to provide a user interface which strives to reduce the perceptual, cognitive and motor load of the human operator, for example by incorporating the operator's visual focus of attention. A gaze-enhanced user interface is able to help here. This work extends prior work on automated target recognition, segmentation and tracking algorithms as well as on the benefits of a gaze-enhanced user interface for interaction with moving targets. We also propose a prototypical system design aiming to combine the qualities of the human observer's perception and the automated algorithms in order to improve the overall performance of a real-time video analysis system. In this contribution, we address two novel issues in analyzing gaze-based interaction with target tracking algorithms. The first issue extends the gaze-based triggering of a target tracking process, investigating, e.g., how best to relaunch tracking in the case of track loss. The second issue addresses the initialization of tracking algorithms without motion segmentation, where the operator has to provide the system with the object's image region in order to start the tracking algorithm.
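A minimal sketch of the first issue, assuming an off-the-shelf OpenCV tracker (opencv-contrib) as a stand-in for the tracking algorithms discussed: the gaze point seeds the tracker on key press, and the same mechanism is one possible way to relaunch tracking after track loss.

```python
# Sketch of gaze-based (re)initialization of a tracking subsystem, using an
# off-the-shelf OpenCV CSRT tracker as a stand-in. On a key press the current
# gaze point defines the target region; on track loss the tracker is re-seeded.
import cv2

def roi_around(gaze_xy, size=48):
    """Fixed-size region of interest centered on the gaze point: (x, y, w, h)."""
    x, y = gaze_xy
    return (int(x - size / 2), int(y - size / 2), size, size)

class GazeInitializedTracker:
    def __init__(self):
        self.tracker = None

    def start(self, frame, gaze_xy):
        """Called on 'gaze + key press': start (or restart) tracking at the gaze point."""
        self.tracker = cv2.TrackerCSRT_create()
        self.tracker.init(frame, roi_around(gaze_xy))

    def update(self, frame, gaze_xy):
        """Advance the tracker; relaunch from the current gaze point on track loss."""
        if self.tracker is None:
            return None
        ok, box = self.tracker.update(frame)
        if not ok:                      # track loss: one possible relaunch strategy
            self.start(frame, gaze_xy)  # re-seed with where the operator is looking
            return None
        return box
```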
In recent years, many new interaction technologies have been developed that enhance the usability of computer systems and allow for novel types of interaction. The areas of application for these technologies have mostly been gaming and entertainment. However, in professional environments there are especially demanding tasks that would greatly benefit from improved human-machine interfaces as well as an overall improved user experience. We therefore envisioned and built an image-interpretation-workstation of the future, a multi-monitor workplace comprising four screens. Each screen is dedicated to a complex software product such as a geo-information system to provide geographic context, an image annotation tool, software to generate standardized reports and a tool to aid in the identification of objects. Using self-developed systems for hand tracking, pointing gestures and head-pose estimation in addition to touchscreens, face identification and speech recognition systems, we created a novel approach to this complex task. For example, head-pose information is used to save the position of the mouse cursor on the currently focused screen and to restore it as soon as the same screen is focused again, while hand gestures allow for intuitive manipulation of 3D objects in mid-air. While the primary focus is on the task of image interpretation, all of the technologies involved provide generic ways of efficiently interacting with a multi-screen setup and could be utilized in other fields as well. In preliminary experiments, we received promising feedback from users in the military and started to tailor the functionality to their needs.
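The cursor save/restore behaviour driven by head pose can be illustrated in a few lines of Python; the head-pose estimator and the cursor API are abstracted away, and all names are hypothetical:

```python
# Minimal sketch of the cursor save/restore behaviour described above: when the
# head pose indicates a screen change, the cursor position on the previously
# focused screen is stored and the position last used on the newly focused
# screen is restored. Pose estimation and the cursor API are assumed elsewhere.
class CursorMemory:
    def __init__(self, num_screens: int):
        self.saved = {i: None for i in range(num_screens)}  # last position per screen
        self.focused = 0

    def on_screen_focus(self, new_screen: int, current_cursor_pos, move_cursor) -> None:
        """Call when head-pose estimation reports a newly focused screen."""
        if new_screen == self.focused:
            return
        self.saved[self.focused] = current_cursor_pos   # remember where we left off
        self.focused = new_screen
        if self.saved[new_screen] is not None:
            move_cursor(self.saved[new_screen])         # jump back to the stored position

# usage: memory.on_screen_focus(screen_id_from_head_pose, get_cursor(), set_cursor)
```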
When a spatiotemporal event happens, multi-source intelligence data is gathered to understand the problem, and strategies for solving the problem are investigated. The main difficulty lies in handling spatial and temporal intelligence data. The map can be the bridge to visualize the data and to obtain the most comprehensible model for all stakeholders. For the analysis of geodata-based intelligence data, software was developed as a working environment that combines geodata with optimized ergonomics, essentially facilitating the interaction with the common operational picture (COP). The composition of the COP is based on geodata services, which are normalized by international standards of the Open Geospatial Consortium (OGC). The basic geodata are combined with intelligence data from images (IMINT) and humans (HUMINT), stored in a NATO Coalition Shared Data Server (CSD). These intelligence data can be combined with further information sources, e.g., live sensors. As a result, a COP is generated and an interaction suitable for the specific workspace is added. This allows users to work interactively with the COP, e.g., searching with an on-board CSD client for suitable intelligence data and integrating them into the COP. Furthermore, users can enrich the scenario with findings from the data of interactive live sensors and add data from other sources. This allows intelligence services to contribute effectively to the process by which military operations and disaster management are organized.
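Conceptually, the COP composition can be pictured as stacking layers from the different sources; the following sketch uses invented helper structures for illustration and does not reflect the actual CSD or service interfaces:

```python
# Conceptual sketch of composing a common operational picture (COP) from
# standardized geodata services, CSD query results and live sensor tracks.
# All structures and names are hypothetical placeholders for the concept.
from dataclasses import dataclass, field

@dataclass
class CopLayer:
    name: str
    source: str              # e.g. a WMS endpoint or an internal feature set
    features: list = field(default_factory=list)

def compose_cop(base_map_url: str, csd_results: list, live_tracks: list) -> list:
    """Stack base geodata, IMINT/HUMINT reports and live sensor tracks into layers."""
    layers = [CopLayer("base map", base_map_url)]
    layers.append(CopLayer("IMINT/HUMINT reports", "csd", features=csd_results))
    if live_tracks:
        layers.append(CopLayer("live sensors", "stream", features=live_tracks))
    return layers
```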
This paper introduces an interactive recognition assistance system for imaging reconnaissance. This system supports
aerial image analysts on missions during two main tasks: Object recognition and infrastructure analysis. Object
recognition concentrates on the classification of one single object. Infrastructure analysis deals with the description of
the components of an infrastructure and the recognition of the infrastructure type (e.g. military airfield). Based on satellite
or aerial images, aerial image analysts are able to extract single object features and thereby recognize different object
types. It is one of the most challenging tasks in imaging reconnaissance. Currently, no high-potential ATR
(automatic target recognition) applications are available; as a consequence, the human observer cannot be replaced entirely.
State-of-the-art ATR applications cannot replicate human perception and interpretation in equal measure. Why is this still
such a critical issue? First, cluttered and noisy images make it difficult to automatically extract, classify and identify
object types. Second, due to the changed warfare and the rise of asymmetric threats it is nearly impossible to create an
underlying data set containing all features, objects or infrastructure types. Many other reasons like environmental
parameters or aspect angles further complicate the application of ATR. Due to the lack of suitable ATR
procedures, the human factor is still important and so far irreplaceable. In order to use the potential benefits of the human
perception and computational methods in a synergistic way, both are unified in an interactive assistance system.
RecceMan® (Reconnaissance Manual) offers two different modes for aerial image analysts on missions: the object
recognition mode and the infrastructure analysis mode. The aim of the object recognition mode is to recognize a certain
object type based on the object features that originated from the image signatures. The infrastructure analysis mode
pursues the goal of analyzing the function of the infrastructure. The image analyst visually extracts certain target object
signatures, assigns them to corresponding object features and is finally able to recognize the object type. The system
offers the possibility to assign the image signatures to features given by sample images. The underlying data set
contains a wide range of object features and object types for different domains such as ships or land vehicles. Each domain
has its own feature tree developed by aerial image analyst experts. By selecting the corresponding features, the possible
solution set of objects is automatically reduced and matches only the objects that contain the selected features.
Moreover, we give an outlook on current research in the field of ground target analysis, in which we deal with partially
automated methods to extract image signatures and assign them to the corresponding features. This research includes
methods for automatically determining the orientation of an object and geometric features such as the width and length of the
object. This step makes it possible to automatically reduce the set of possible object types offered to the image analyst by the interactive
recognition assistance system.
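The core filtering idea of the object recognition mode can be illustrated in a few lines: each selected feature narrows the candidate set to the objects exhibiting all selected features. The tiny reference set below is invented for illustration; the real data set covers many domains with expert-built feature trees.

```python
# Sketch of the feature-based filtering behind the object recognition mode:
# selecting features reduces the solution set to objects containing them all.
# The reference entries below are invented examples, not RecceMan(R) data.
REFERENCE_SET = {
    "frigate":          {"hull:slender", "superstructure:continuous", "armament:gun_forward"},
    "cargo ship":       {"hull:boxy", "deck:container_rows"},
    "main battle tank": {"tracks", "turret:centered", "gun:long_barrel"},
}

def candidate_objects(selected_features: set) -> list:
    """Return all object types whose feature set contains every selected feature."""
    return [name for name, features in REFERENCE_SET.items()
            if selected_features <= features]

# Selecting features step by step shrinks the solution set:
print(candidate_objects({"tracks"}))          # ['main battle tank']
print(candidate_objects({"hull:slender"}))    # ['frigate']
```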
Motion video analysis is a challenging task, especially in real-time applications. In most safety and security critical applications, a human observer is an obligatory part of the overall analysis system. Over the last years, substantial progress has been made in the development of automated image exploitation algorithms. Hence, we investigate how the benefits of automated video analysis can be suitably integrated into current video exploitation systems. In this paper, a system design is introduced which strives to combine the qualities of the human observer's perception and the automated algorithms, thus aiming to improve the overall performance of a real-time video analysis system. The system design builds on prior work where we showed the benefits for the human observer of a user interface which utilizes the human visual focus of attention, revealed by the eye gaze direction, for interaction with the image exploitation system; eye tracker-based interaction allows much faster, more convenient, and equally precise moving target acquisition in video images than traditional computer mouse selection. The system design also builds on prior work we did on automated target detection, segmentation, and tracking algorithms. Besides the system design, a first pilot study is presented in which we investigated how the participants (all non-experts in video analysis) performed in initializing an object tracking subsystem by selecting a target for tracking. Preliminary results show that the gaze + key press technique is an effective, efficient, and easy-to-use interaction technique when performing selection operations on moving targets in videos in order to initialize an object tracking function.
“Although we know that it is not a familiar object, after a while we can say what it resembles”. The core task of an aerial
image analyst is to recognize different object types based on certain clearly classified characteristics from aerial or satellite
images. An interactive recognition assistance system compares selected features with a fixed set of reference objects (core
data set). Therefore it is mainly designed to evaluate durable single objects such as a specific type of ship or vehicle. Aerial
image analysts on missions have observed a change in warfare over time. The task is no longer merely to classify and thereby
recognize a single durable object; instead, analysts have to classify highly variable objects for which the reference set
no longer matches. To address this new scope, we introduce a concept for a further development of the
interactive assistance system so that it can also handle short-lived, not clearly classifiable and highly variable objects such as
dhows. Dhows are a type of ship often used during pirate attacks off the coast of West Africa. Often
these ships are built or extended by the pirates themselves and follow no particular pattern, unlike the standard construction
of a merchant ship. In this work we distinguish between short-lived and durable objects. The interactive adaptable assistance
system is intended to assist image analysts with the classification of objects which are new and not yet listed in the reference
set of objects. Human interaction and perception are important factors in realizing this task and achieving the
goal of recognition. We therefore had to model the possibility of classifying short-lived objects with appropriate procedures,
taking into consideration all aspects of short-lived objects. In this paper we outline suitable measures and
possibilities to categorize short-lived objects via simple basic shapes, as well as a temporary data storage concept for short-lived
objects. The interactive adaptable approach offers the possibility to insert the data (objects) into the system directly
and on-site. To mitigate the manipulation risk, the entry of data (objects) into the main reference (core data) set is reserved
for a central authorized unit.
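The following sketch illustrates the two ideas in combination: a coarse basic-shape categorization and a temporary store kept separate from the core reference set, with promotion reserved for the central authorized unit. Thresholds, category names and function names are invented for illustration.

```python
# Illustrative sketch: categorize short-lived objects by simple basic shapes and
# keep them in a temporary store separate from the core reference data set.
def basic_shape_category(length_m: float, width_m: float) -> str:
    """Very coarse shape category from the length/width ratio (invented thresholds)."""
    ratio = length_m / max(width_m, 1e-6)
    if ratio > 5:
        return "elongated"   # e.g. a slender hull
    if ratio > 2:
        return "oblong"
    return "compact"

temporary_store = []   # filled directly and on-site by analysts
core_data_set = []     # changed only by the central authorized unit

def register_short_lived_object(label: str, length_m: float, width_m: float) -> None:
    temporary_store.append({
        "label": label,
        "shape": basic_shape_category(length_m, width_m),
    })

def promote_to_core(entry: dict, authorized: bool) -> None:
    """Only the central authorized unit may move entries into the core data set."""
    if authorized:
        core_data_set.append(entry)
```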
A frequently occurring interaction task in UAS video exploitation is the marking or selection of objects of interest in the
video. If an object of interest is visually detected by the image analyst, its selection/marking for further exploitation,
documentation and communication with the team is a necessary task. Today object selection is usually performed by
mouse interaction. As due to sensor motion all objects in the video move, object selection can be rather challenging,
especially if strong and fast ego-motions are present, e.g., with small airborne sensor platforms. In addition,
objects of interest are sometimes visible too briefly to be selected by the analyst using mouse interaction. To address this
issue we propose an eye tracker as input device for object selection. As the eye tracker continuously provides the gaze
position of the analyst on the monitor, it is intuitive to use the gaze position for pointing at an object. The selection is
then actuated by pressing a button. We integrated this gaze-based “gaze + key press” object selection into Fraunhofer
IOSB's exploitation station ABUL using a Tobii X60 eye tracker and a standard keyboard for the button press.
Representing the object selections in a spatial relational database, ABUL enables the image analyst to efficiently query
the video data in a post processing step for selected objects of interest with respect to their geographical and other
properties. An experimental evaluation is presented, comparing gaze-based interaction with mouse interaction in the
context of object selection in UAS videos.
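The selection logic behind gaze + key press can be sketched as follows; whether the ABUL integration snaps to the nearest tracked object or simply records the gaze point is not stated above, so the nearest-object matching below is an assumption for illustration:

```python
# Sketch of "gaze + key press" selection: on key press, the most recent gaze
# position is matched against the currently known moving objects and the
# nearest one within a tolerance is selected. Data structures are simplified.
import math

def select_object_at_gaze(gaze_xy, objects, max_dist_px: float = 60.0):
    """Return the id of the object closest to the gaze point, or None.

    `objects` is a list of (object_id, (x, y)) tuples in monitor/pixel
    coordinates for the current video frame.
    """
    gx, gy = gaze_xy
    best_id, best_dist = None, max_dist_px
    for object_id, (x, y) in objects:
        dist = math.hypot(x - gx, y - gy)
        if dist <= best_dist:
            best_id, best_dist = object_id, dist
    return best_id

# on_key_press:
#   selected = select_object_at_gaze(eye_tracker.latest_gaze(), current_detections)
#   if selected is not None:
#       store_selection(selected)   # e.g. persist in the spatial relational database
```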
KEYWORDS: Visualization, Databases, Raster graphics, Geographic information systems, Sensors, Data storage, Neodymium, Computer security, Data storage servers, Video
Modern crisis management requires users with different roles and computer environments to deal with a high
volume of diverse data from different sources. For this purpose, Fraunhofer IOSB has developed a geographic
information system (GIS) which supports the user depending on the available data and the task at hand.
The system provides merging and visualization of spatial data from various civilian and military sources. It supports the
most common spatial data standards (OGC, STANAG) as well as some proprietary interfaces, regardless of whether these are
file-based or database-based.
To define the visualization rules, generic Styled Layer Descriptors (SLDs) are used, which are an Open Geospatial
Consortium (OGC) standard. SLDs allow specifying which data are shown, when and how. The defined SLDs consider
the users' roles and task requirements. In addition, different displays can be used, and the visualization adapts
to the individual resolution of the display, avoiding an information density that is too high or too low.
Our system also enables users with different roles to work together simultaneously using the same database. Every user
is provided with appropriate and coherent spatial data depending on his or her current task. The spatial data refined in this way are
served via the OGC services Web Map Service (WMS: server-side rendered raster maps) or Web Map Tile
Service (WMTS: pre-rendered and cached raster maps).
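For illustration, a client could request the same layer with a role-specific, SLD-backed style and a size matched to its display using a standard WMS GetMap call; the service URL, layer and style names below are placeholders:

```python
# Illustrative WMS GetMap request referencing a named, SLD-backed style and a
# size matched to the client display. URLs, layer and style names are invented.
from urllib.parse import urlencode

def getmap_url(wms_base: str, layer: str, style: str, bbox: tuple,
               width: int, height: int, crs: str = "EPSG:4326") -> str:
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": style,                 # role-/task-specific style defined via SLD
        "CRS": crs,
        "BBOX": ",".join(map(str, bbox)),
        "WIDTH": width,                  # matched to the display resolution
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return f"{wms_base}?{urlencode(params)}"

# e.g. the same layer rendered for a tablet and for a large display
tablet_map = getmap_url("https://example.org/wms", "cop", "analyst_role", (8.3, 49.0, 8.5, 49.1), 1024, 768)
wall_map   = getmap_url("https://example.org/wms", "cop", "analyst_role", (8.3, 49.0, 8.5, 49.1), 3840, 2160)
```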
In this contribution, we propose the use of eye tracking technology to support video analysts. To reduce workload, we
implemented two new interaction techniques as a substitute for mouse pointing: gaze-based selection of a video of
interest from a set of video streams, and gaze-based selection of moving targets in videos. First results show that the
multi-modal interaction technique gaze + key press allows fast-moving objects to be selected more effectively.
Moreover, we discuss further application possibilities like gaze behavior analysis to measure the analyst's fatigue, or
analysis of the gaze behavior of expert analysts to instruct novices.
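The first technique essentially amounts to hit-testing the gaze point against the screen regions of the displayed streams, as in this small sketch (layout values are invented):

```python
# Sketch of gaze-based selection of a video of interest: the stream whose screen
# region contains the current gaze point becomes the active one.
def stream_under_gaze(gaze_xy, viewports):
    """`viewports` maps stream ids to (x, y, w, h) screen rectangles."""
    gx, gy = gaze_xy
    for stream_id, (x, y, w, h) in viewports.items():
        if x <= gx < x + w and y <= gy < y + h:
            return stream_id
    return None

# e.g. four streams tiled on one monitor:
layout = {"uav1": (0, 0, 960, 540), "uav2": (960, 0, 960, 540),
          "uav3": (0, 540, 960, 540), "uav4": (960, 540, 960, 540)}
active = stream_under_gaze((1200, 300), layout)   # -> "uav2"
```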
The officer-in-charge deploying the security personnel to protect a large infrastructure faces a complex decision
problem if multiple threats have to be handled simultaneously: Limited surveillance resources have to be optimally
allocated to the many affected sectors in order to provide the safest threat state for the infrastructure as a whole over
time. This contribution presents an interactive resource management system providing decision support for optimally
deploying surveillance resources. For this purpose, the user interface displays a risk map of the infrastructure's current
threat situation together with a recommendation of the currently optimal resource allocation. The resource
allocation recommendation is obtained by solving a CMDP model of the infrastructure's global threat situation. An
evaluation of the CMDP-based decision support shows that displaying both the resource allocation recommendation and the risk
map enables participants to handle threat scenarios in a more cost-efficient way, while also causing less workload and
achieving higher acceptance among participants.
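The CMDP model itself is not reproduced here. As a rough, deliberately simplified illustration of the underlying allocation problem, the following toy sketch greedily assigns a limited number of surveillance resources to the sectors with the highest current risk; it is explicitly not the CMDP-based method evaluated in the paper.

```python
# Toy stand-in for a resource allocation recommendation: greedily assign a
# limited number of surveillance resources to the sectors with the highest risk.
# This is NOT the CMDP model referred to above, only an illustration of the
# decision problem (risk values per sector -> recommended allocation).
def recommend_allocation(sector_risk: dict, num_resources: int) -> dict:
    """Return how many resources to place in each sector (greedy, one per pick)."""
    allocation = {sector: 0 for sector in sector_risk}
    remaining_risk = dict(sector_risk)
    for _ in range(num_resources):
        worst = max(remaining_risk, key=remaining_risk.get)
        allocation[worst] += 1
        remaining_risk[worst] *= 0.5   # assume each resource halves a sector's residual risk
    return allocation

# e.g. 4 patrols over 3 sectors with risk scores from the current threat picture
print(recommend_allocation({"gate": 0.8, "perimeter": 0.5, "depot": 0.3}, 4))
```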
System concepts for network-enabled image-based ISR (intelligence, surveillance, reconnaissance) are the major mission
of Fraunhofer IITB's applied research in the area of defence and security solutions. For TechDemo08, part of the
NATO CNAD POW Defence Against Terrorism, Fraunhofer IITB advanced a new multi-display concept to handle the
sheer amount and high complexity of ISR data acquired by networked, distributed surveillance systems, with the
objective of supporting the generation of a common situation picture. The amount and complexity of ISR data demand an
innovative man-machine interface concept for humans to deal with it. The IITB's concept is the Digital Map & Situation
Surface. This concept offers the user a coherent multi-display environment combining a horizontal surface for the
situation overview from the bird's eye view, an attached vertical display for collateral information, and so-called fovea-tablets
as personalized magic lenses in order to obtain highly resolved, role-specific information about a focused area of
interest and to interact with it. In the context of TechDemo08 the Digital Map & Situation Surface served as
workspace for team-based situation visualization and analysis. Multiple sea- and landside surveillance components were
connected to the system.
In many application domains the analysis of aerial or satellite images plays an important role. The use of stereoscopic
display technologies can enhance the image analyst's ability to detect or to identify certain objects of interest, which
results in a higher performance. Changing image acquisition from analog to digital techniques entailed a change of
stereoscopic visualisation techniques. Recently, different kinds of digital stereoscopic display techniques have appeared
on the market at affordable prices. At Fraunhofer IITB usability tests were carried out to find out (1) with which kind
of these commercially available stereoscopic display techniques image analysts achieve the best performance and (2)
which of these techniques achieve a high acceptance. First, image analysts were interviewed to define typical image
analysis tasks which were expected to be solved with a higher performance using stereoscopic display techniques. Next,
observer experiments were carried out whereby image analysts had to solve defined tasks with different visualization
techniques. Based on the experimental results (performance parameters and qualitative subjective evaluations of the
display techniques used), two of the examined stereoscopic display technologies were found to be very good and appropriate.
The analysis of complex infrastructure from aerial imagery, for instance a detailed analysis of an airfield, requires the
interpreter, besides being familiar with the sensor's imaging characteristics, to have a detailed understanding of the
infrastructure domain. The required domain knowledge includes knowledge about the processes and functions involved
in the operation of the infrastructure, the potential objects used to provide those functions and their spatial and functional
interrelations. Since it is not possible yet to provide reliable automatic object recognition (AOR) for the analysis of such
complex scenes, we developed systems to support a human interpreter with either interactive approaches, able to assist
the interpreter with previously acquired expert knowledge about the domain in question, or AOR methods, capable of
detecting, recognizing or analyzing certain classes of objects for certain sensors. We believe that, to achieve an optimal result
at the end of an interpretation process in terms of efficiency and effectiveness, it is essential to integrate both interactive and
automatic approaches to image interpretation. In this paper we present an approach inspired by the advancing semantic
web technology to represent domain knowledge, the capabilities of available AOR modules and the image parameters in
an explicit way. This enables us to seamlessly extend an interactive image interpretation environment with AOR
modules in a way that we can automatically select suitable AOR methods for the current subtask, focus them on an
appropriate area of interest and reintegrate their results into the environment.
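The module selection step can be pictured as matching explicit capability descriptors against the current subtask and image parameters; the descriptor fields below are a simplified, invented stand-in for the semantic representation used in the approach.

```python
# Sketch of capability matching: AOR modules advertise which object classes and
# sensor types they handle, and the environment selects the modules whose
# declared capabilities cover the current subtask and image parameters.
from dataclasses import dataclass

@dataclass
class AorCapability:
    name: str
    object_classes: set      # e.g. {"aircraft", "vehicle"}
    sensor_types: set        # e.g. {"EO", "SAR"}
    min_gsd_m: float         # finest ground sample distance the module supports
    max_gsd_m: float         # coarsest ground sample distance it still works on

def suitable_modules(modules, object_class: str, sensor_type: str, gsd_m: float):
    """Return all modules whose capabilities match the current subtask and image."""
    return [m for m in modules
            if object_class in m.object_classes
            and sensor_type in m.sensor_types
            and m.min_gsd_m <= gsd_m <= m.max_gsd_m]

modules = [
    AorCapability("aircraft_detector", {"aircraft"}, {"EO"}, 0.1, 1.0),
    AorCapability("vehicle_detector",  {"vehicle"},  {"EO", "SAR"}, 0.05, 0.5),
]
print(suitable_modules(modules, "aircraft", "EO", 0.3))   # -> [aircraft_detector]
```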
A digital situation table which allows a team of experts to cooperatively analyze a situation has been developed. It is
based on a horizontal work table providing a general overview of the situation. Tablet PCs, referenced precisely to the
scene image using a digital image processing algorithm, display a detailed view of a local area of the image. In this way a
see-through effect providing high local resolution at the position of the tablet PC is established. Additional information
not fitting the bird's eye view of the work table can be displayed on a vertical screen. All output devices can be
controlled using the tablet PCs, and each team member has his or her own device. An interaction paradigm has been developed
allowing each team member to interact with a high degree of freedom and ensuring cooperative teamwork.
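The see-through effect can be sketched as cropping the registered high-resolution image region under the tablet; the registration itself (the image-processing step mentioned above) is assumed to be available, and all parameters are illustrative.

```python
# Sketch of the see-through ("magic lens") effect: given the tablet's registered
# position on the overview image (center point and scale), crop the matching
# region from the high-resolution source image for display on the tablet.
def lens_crop(high_res_image, center_xy, tablet_px=(1024, 768), scale=1.0):
    """Return the high-resolution crop corresponding to the tablet's footprint.

    `high_res_image` is an array-like indexed as [y, x]; `center_xy` is the
    registered tablet center in image coordinates; `scale` maps tablet pixels
    to image pixels.
    """
    cx, cy = center_xy
    w = int(tablet_px[0] * scale)
    h = int(tablet_px[1] * scale)
    x0 = max(int(cx - w / 2), 0)
    y0 = max(int(cy - h / 2), 0)
    return high_res_image[y0:y0 + h, x0:x0 + w]
```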
Designing SAR sensors is an extremely complex process. It is thereby very important to keep in mind the goal for which the SAR sensor is to be built. For military purposes, the detection and recognition of vehicles is essential. To give recommendations for the design and use of SAR sensors, we carried out interpreter experiments. To assess the interpreter performance we measured performance parameters such as detection rate, false alarm rate, etc. The following topics were of interest: How do the SAR sensor parameters bandwidth and incidence angle influence the interpreter performance? Can the length, width and orientation of vehicles be measured in SAR images? Which information (size, signature, ...) will be used by the interpreters for vehicle recognition? Using our SaLVe evaluation testbed we prepared a large number of images from the experimental SAR system DOSAR (EADS Dornier) and defined several military interpretation tasks for the trials. In a four-week experiment, 30 German military photo interpreters had to detect and classify tanks and trucks in X-band images with different resolutions. To accustom the interpreters to SAR image interpretation, they carried out a computer-based SAR tutorial. To complete the investigations, the interpreters also performed a subjective assessment of image quality.
In the field of remote sensing, efficient image analysis necessitates defined image requirements for the tasks which have to be solved. Image interpretability scales, such as the National Imagery Interpretability Rating Scale (NIIRS), define different levels of image quality or interpretability by the types of tasks an analyst can perform with imagery of a given rating level. These scales result from subjective observer interrogations and exist for the sensor types visible, infrared, SAR (synthetic aperture radar) and multispectral. Another approach is pursued at the IITB. Within objective observer experiments, image interpreters have to carry out defined tasks on images with defined sensor parameters. The experiment results are performance parameters which give information about the solvability of the tasks for the examined image parameters. Two commercial satellite sensors (optical and SAR) with a resolution of 25 meters were examined. The goal of the investigation was to find out which tasks can be solved with the optical and/or the SAR image. The inspected tasks were deduced from tasks used in the NIIRS. It was also examined whether the accuracy of targeting decisions can be enhanced by providing the image interpreters with both images.