This PDF file contains the front matter associated with SPIE Proceedings Volume 8297, including the Title Page, Copyright information, Table of Contents, and the Conference Committee listing.
Biomedical journal articles contain a variety of image types that can be broadly classified into two categories: regular
images, and graphical images. Graphical images can be further classified into four classes: diagrams, statistical figures,
flow charts, and tables. Automatic figure type identification is an important step toward improved multimodal (text +
image) information retrieval and clinical decision support applications. This paper describes a feature-based learning
approach to automatically identify these four graphical figure types. We apply an Evolutionary Algorithm (EA), Binary Particle Swarm Optimization (BPSO), and a hybrid of the two (EABPSO) to select an optimal subset of extracted image features, which are then classified using a Support Vector Machine (SVM) classifier. Evaluation performed
on 1038 figure images extracted from ten BioMedCentral® journals with the features selected by EABPSO yielded
classification accuracy as high as 87.5%.
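A rough illustrative sketch in Python of the wrapper idea described above: a binary particle swarm selects a feature subset whose fitness is the cross-validated accuracy of an SVM. The synthetic data, swarm size and hyper-parameters are assumptions for illustration, not the authors' configuration.

import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def fitness(mask, X, y):
    # Cross-validated SVM accuracy on the selected feature subset.
    if mask.sum() == 0:
        return 0.0
    return cross_val_score(SVC(kernel="rbf"), X[:, mask.astype(bool)], y, cv=3).mean()

def bpso_select(X, y, n_particles=10, n_iter=15, w=0.7, c1=1.5, c2=1.5):
    n_feat = X.shape[1]
    pos = rng.integers(0, 2, size=(n_particles, n_feat))      # binary positions
    vel = rng.normal(0.0, 1.0, size=(n_particles, n_feat))    # real-valued velocities
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p, X, y) for p in pos])
    gbest = pbest[pbest_fit.argmax()].copy()
    for _ in range(n_iter):
        r1, r2 = rng.random((2, n_particles, n_feat))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        prob = 1.0 / (1.0 + np.exp(-vel))                     # sigmoid transfer function
        pos = (rng.random((n_particles, n_feat)) < prob).astype(int)
        fit = np.array([fitness(p, X, y) for p in pos])
        better = fit > pbest_fit
        pbest[better], pbest_fit[better] = pos[better], fit[better]
        gbest = pbest[pbest_fit.argmax()].copy()
    return gbest.astype(bool)

# Synthetic data standing in for the extracted figure features (four figure classes).
X = rng.normal(size=(200, 40))
y = rng.integers(0, 4, size=200)
selected = bpso_select(X, y)
print("selected features:", selected.sum(), "cv accuracy:", fitness(selected, X, y))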
This paper describes an automated system to label zones containing Investigator Names (IN) in biomedical articles, a key
item in a MEDLINE® citation. The correct identification of these zones is necessary for the subsequent extraction of IN
from these zones. A hierarchical classification model is proposed using two Support Vector Machine (SVM) classifiers.
The first classifier is used to identify the IN zone with the highest confidence, and the other classifier identifies the remaining IN zones. Eight sets of word lists are collected to train and test the classifiers, each set containing collections of words ranging from 100 to 1,200. Experiments based on a test set of 105 journal articles show a Precision of 0.88, a Recall of 0.97, an F-Measure of 0.92, and an Accuracy of 0.99.
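The two-stage idea can be sketched as follows: a first SVM ranks the zones of each article by decision score to pick the single most confident IN zone, and a second SVM labels the remaining zones. The data, features and parameters here are placeholders, not the authors' pipeline.

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 20))              # word-list features of text zones
y = (rng.random(300) < 0.2).astype(int)     # 1 = IN zone, 0 = other zone
article_id = rng.integers(0, 30, size=300)  # which article each zone belongs to

svm_top = SVC().fit(X, y)                   # stage 1: used only for confidence ranking
svm_rest = SVC().fit(X, y)                  # stage 2: labels the remaining zones

pred = np.zeros_like(y)
for a in np.unique(article_id):
    idx = np.where(article_id == a)[0]
    scores = svm_top.decision_function(X[idx])
    best = idx[scores.argmax()]             # most confident IN zone of this article
    pred[best] = 1
    rest = idx[idx != best]
    if rest.size:
        pred[rest] = svm_rest.predict(X[rest])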
This paper presents a novel, general-purpose, multi-application color segmentation system that provides optimal chromatic and achromatic layers and filters out hue and illumination distortions with minimal information loss. A text extraction method based on the resulting segmentation is proposed to illustrate the usefulness of the method. The system is validated by evaluating the line segmentation performance of a well-known commercial OCR engine on the processed images.
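A minimal sketch of the general idea of separating chromatic and achromatic layers, assuming a saturation/value test in HSV space; the thresholds and the random image standing in for a scanned page are illustrative and not taken from the paper.

import numpy as np
from skimage import color

rng = np.random.default_rng(0)
img = rng.random((64, 64, 3))                      # stand-in for a scanned colour page
hsv = color.rgb2hsv(img)
sat, val = hsv[..., 1], hsv[..., 2]
achromatic_mask = (sat < 0.15) | (val < 0.1) | (val > 0.95)   # assumed thresholds

chromatic = np.where(achromatic_mask[..., None], 1.0, img)    # keeps coloured pixels
achromatic = np.where(achromatic_mask[..., None], img, 1.0)   # keeps gray/black ink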
Document layout analysis is of fundamental importance for document image understanding and information retrieval.
It requires the identification of blocks extracted from a document image via feature extraction and block classification. In this paper, we focus on the classification of the extracted blocks into five classes: text (machine printed), handwriting, graphics, images, and noise. We propose a new set of features for efficient classification of these blocks. We present a comparative evaluation of three ensemble-based classification algorithms (boosting,
bagging, and combined model trees) in addition to other known learning algorithms. Experimental results
are demonstrated for a set of 36503 zones extracted from 416 document images which were randomly selected
from the tobacco legacy document collection. The results obtained verify the robustness and effectiveness of
the proposed set of features in comparison to the commonly used Ocropus recognition features. When used in
conjunction with the Ocropus feature set, we further improve the performance of the block classification system
to obtain a classification accuracy of 99.21%.
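The comparative evaluation of ensemble classifiers can be reproduced in outline with scikit-learn, as in the sketch below; the synthetic feature matrix merely stands in for the proposed block features.

import numpy as np
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 25))       # block feature vectors
y = rng.integers(0, 5, size=500)     # text, handwriting, graphics, image, noise

models = {
    "single tree": DecisionTreeClassifier(),
    "bagging": BaggingClassifier(DecisionTreeClassifier(), n_estimators=50),
    "boosting": AdaBoostClassifier(n_estimators=50),
}
for name, model in models.items():
    print(name, cross_val_score(model, X, y, cv=5).mean())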
Recognizing old documents is highly desirable since the demand for quickly searching millions of archived documents
has recently increased. Using Hidden Markov Models (HMMs) has been proven to be a good solution to tackle the
main problems of recognizing typewritten Arabic characters. Although these attempts achieved remarkable success for omnifont OCR under very favorable conditions, they did not achieve the same performance in practical conditions, i.e., on noisy documents. In this paper we present an omnifont, large-vocabulary Arabic OCR system using a Pseudo Two-Dimensional Hidden Markov Model (P2DHMM), which is a generalization of the HMM. The P2DHMM offers a more efficient way to model Arabic characters; such a model offers both minimal dependence on font size/style (omnifont) and a high level of robustness against noise. The evaluation results of this system are very promising compared to a baseline HMM system and the best OCRs available on the market (Sakhr and NovoDynamics). The recognition accuracy of the P2DHMM classifier is measured against the classic HMM classifier: the average word accuracy rates for the P2DHMM and HMM classifiers are 79% and 66%, respectively. The overall system accuracy is measured against the Sakhr and NovoDynamics OCR systems: the average word accuracy rates for P2DHMM, NovoDynamics, and Sakhr are 74%, 71%, and 61%, respectively.
We present in this paper an HMM-based recognizer for the recognition of unconstrained Arabic handwritten words.
The recognizer is a context-dependent HMM which considers variable topology and contextual information for a better modeling of writing units.
We propose an algorithm to adapt the topology of each HMM to the character to be modeled.
For modeling the contextual units, a state-tying process based on decision tree clustering is introduced which significantly reduces the number of parameters.
Decision trees are built according to a set of expert-based questions on how characters are written.
Questions are divided into global questions yielding larger clusters and precise questions yielding smaller ones.
We apply this modeling to the recognition of Arabic handwritten words.
Experiments conducted on the OpenHaRT2010 database show that variable-length topology and contextual information significantly improve the recognition rate.
Offline Chinese handwritten character string recognition is one of the most important research fields in pattern
recognition. Due to the free writing style, large variability in character shapes and different geometric characteristics,
Chinese handwritten character string recognition is a challenging problem to deal with. However, among current methods, the over-segmentation and merging method, which integrates geometric, character recognition, and contextual information, shows promising results. It is found experimentally that a large proportion of the errors are segmentation errors and mainly occur around non-Chinese characters. In a Chinese character string, there are not only wide characters, namely Chinese characters, but also narrow characters such as digits and letters of the alphabet. The segmentation errors are mainly caused by a uniform geometric model being imposed on all segmented candidate characters. To solve this problem, post-processing is employed to improve the recognition accuracy of narrow characters. On one hand, multiple geometric models are established for wide and narrow characters, respectively; under these models, narrow characters are less prone to being merged. On the other hand, the top-ranked recognition results of candidate paths are integrated to boost the final recognition of narrow characters. The post-processing method is investigated on two datasets containing a total of 1405 handwritten address strings. The wide-character recognition accuracy improves slightly, and the narrow-character recognition accuracy increases by 10.41% and 10.03% on the two datasets, respectively. This indicates that the post-processing method is effective in improving the recognition accuracy of narrow characters.
The paper presents complexity reduction of an on-line handwritten Japanese text recognition system by selecting an
optimal off-line recognizer in combination with an on-line recognizer, geometric context evaluation and linguistic
context evaluation. The result is that a surprisingly small off-line recognizer, which alone is weak, produces nearly the
best recognition rate in combination with other evaluation factors in remarkably small space and time complexity.
Generally speaking, lower dimensionality with fewer principal components produces a smaller set of prototypes, which reduces memory and time costs. However, it also degrades the recognition rate, so a compromise is needed. In an evaluation function combining the above-mentioned multiple factors, a configuration of only 50 dimensions with as few as 5 principal components for the off-line recognizer maintains nearly the best text recognition accuracy, 97.87% (compared with the best accuracy of 97.92%), while reducing the total memory cost from 99.4 MB to 32 MB and the average per-character recognition time in text recognition from 0.1621 ms to 0.1191 ms, compared with the traditional off-line recognizer using 160 dimensions and 50 principal components.
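The dimensionality/prototype trade-off discussed above can be illustrated with a small scikit-learn sketch; the nearest-centroid classifier, random data and sizes are stand-ins for the paper's off-line recognizer, not its actual configuration.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestCentroid
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
X = rng.normal(size=(1000, 160))     # 160-dimensional character features
y = rng.integers(0, 10, size=1000)   # character classes

for n_components in (50, 5):         # larger vs. heavily reduced configuration
    Xr = PCA(n_components=n_components).fit_transform(X)
    acc = cross_val_score(NearestCentroid(), Xr, y, cv=3).mean()
    print(n_components, "components:", Xr.nbytes, "bytes of reduced features,", round(acc, 3))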
Earlier work has shown how to recognize handwritten characters by representing coordinate functions or integral
invariants as truncated orthogonal series. The series basis functions are orthogonal polynomials defined by a
Legendre-Sobolev inner product. It has been shown that the free parameter in the inner product, the 'jet scale',
has an impact on recognition both using coordinate functions and integral invariants.
This paper develops methods of improving series-based recognition. For isolated classification, the first
consideration is to identify optimal values for the jet scale in different settings. For coordinate functions, we find
the optimum to be in a small interval with the precise value not strongly correlated to the geometric complexity
of the character. For integral invariants, used in orientation-independent recognition, we find the optimal value
of the jet scale for each invariant. Furthermore, we examine the optimal degree for the truncated series. For
in-context classification, we develop a rotation-invariant algorithm that takes advantage of sequences of samples
that are subject to similar distortion. The algorithm yields significant improvement over orientation-independent
isolated recognition and can be extended to shear and, more generally, affine transformations.
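A small numerical sketch of representing coordinate functions as a truncated orthogonal series. Plain Legendre polynomials are used here for simplicity, so the Legendre-Sobolev inner product and its jet-scale parameter are not reproduced; the pen trace is synthetic.

import numpy as np
from numpy.polynomial import legendre

t = np.linspace(-1.0, 1.0, 200)               # arc length rescaled to [-1, 1]
x = np.cos(3 * t) + 0.1 * np.sin(9 * t)       # hypothetical x(t) coordinate function
y = np.sin(3 * t)                             # hypothetical y(t) coordinate function

degree = 10                                   # truncation order of the series
cx = legendre.legfit(t, x, degree)            # series coefficients for x(t)
cy = legendre.legfit(t, y, degree)            # series coefficients for y(t)

# The fixed-size coefficient vector (cx, cy) is the descriptor used for
# classification; the reconstruction error shows how faithful the truncation is.
err = np.max(np.abs(legendre.legval(t, cx) - x))
print("max reconstruction error:", err)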
It is a commonly used evaluation strategy to run competing algorithms on a test dataset and state which performs
better on average over the whole set. We call this generic evaluation. Although it is important, we believe this
type of evaluation is incomplete.
In this paper, we propose a methodology for algorithm comparison, which we call specific evaluation. This
approach attempts to identify subsets of the data where one algorithm is better than the other. This not only provides better knowledge of each algorithm's strengths and weaknesses but also constitutes a simple way to develop a combination policy that enjoys the best of both. We shall be applying specific evaluation to an
experiment that aims at grouping pre-obtained table cells into columns; we demonstrate how it identifies a
subset of data for which the on-average least good but faster algorithm is equivalent or better, and how it then
manages to create a policy for combining the two competing table column delimitation algorithms.
With the tremendous popularity of the PDF format, recognizing mathematical formulas in PDF documents becomes a new and important problem in the document analysis field. In this paper, we present a method of embedded mathematical
formula identification in PDF documents, based on Support Vector Machine (SVM). The method first segments text
lines into words, and then classifies each word into two classes, namely formula or ordinary text. Various features of
embedded formulas, including geometric layout, character and context content, are utilized to build a robust and
adaptable SVM classifier. Embedded formulas are then extracted through merging the words labeled as formulas.
Experimental results show good performance of the proposed method. Furthermore, the method has been successfully
incorporated into a commercial software package for large-scale e-Book production.
In the chemical literature, much information is given in the form of diagrams depicting molecules. In order to access this information, diagrams have to be recognised and translated into a processable format. We present an approach that models
the principal recognition steps for molecule diagrams in a strictly rule based system, providing rules to identify the main
components - atoms and bonds - as well as to resolve possible ambiguities. The result of the process is a translation into
a graph representation that can be used for further processing. We show the effectiveness of our approach by describing
its embedding into a full recognition system and present an experimental evaluation that demonstrates how our current
implementation outperforms the leading open source system currently available.
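The resulting graph representation can be illustrated with a hypothetical example using networkx: recognised atoms become nodes and bonds become labelled edges. The molecule and attribute names are assumptions, not the output format of the authors' system.

import networkx as nx

mol = nx.Graph()
# Hypothetical result of the atom/bond recognition rules for ethanol (C-C-O).
mol.add_node(1, element="C")
mol.add_node(2, element="C")
mol.add_node(3, element="O")
mol.add_edge(1, 2, order=1)   # single bond
mol.add_edge(2, 3, order=1)   # single bond

# The graph can then be matched or exported for further chemical processing.
print(nx.to_dict_of_dicts(mol))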
To model a handwritten graphical language, spatial relations describe how the strokes are positioned in the two-dimensional space. Most existing handwriting recognition systems make use of some predefined spatial relations. However, for a complex graphical language, it is hard to express all the spatial relations manually. Another possibility is to use a clustering technique to discover the spatial relations. In this paper, we discuss how to create a relational graph between strokes (nodes) labeled with graphemes in a graphical language. We then vectorize the spatial relations (edges) for clustering and quantization. As the targeted application, we extract the repetitive sub-graphs (graphical symbols) composed of graphemes and learned spatial relations. On two handwriting databases, a simple mathematical expression database and a complex flowchart database, the unsupervised spatial relations outperform the predefined spatial relations. In addition, we visualize the frequent patterns on two text lines containing Chinese characters.
Archiving official written documents such as invoices, reminders, and account statements in both business and private contexts is becoming increasingly important. Creating appropriate index entries for document archives, such as the sender's name, creation date, or document number, is tedious manual work. We present a novel approach to handle automatic
indexing of documents based on generic positional extraction of index terms. For this purpose we apply the
knowledge of document templates stored in a common full text search index to find index positions that were
successfully extracted in the past.
We introduce a new system for layout-based (LATEX) indexing and retrieval of mathematical expressions using
substitution trees. Substitution trees can efficiently store and find expressions based on the similarity of their
symbols, symbol layout, sub-expressions and size. We describe our novel implementation and some of our
modifications to the substitution tree indexing and retrieval algorithms. We provide an experiment testing our
system against the TF-IDF keyword-based system of Zanibbi and Yuan and demonstrate that, in many cases, the
quality of search results returned by both systems is comparable (overall means, substitution tree vs. keyword-based: 100% vs. 89% for top 1; 48% vs. 51% for top 5; 22% vs. 28% for top 20). Overall, we present a promising
first attempt at layout-based substitution tree indexing and retrieval for mathematical expressions and believe
that this method will prove beneficial to the field of mathematical information retrieval.
We propose a novel strategy for the optimal combination of human and machine decisions in a cost-sensitive
environment. The proposed algorithm should be especially beneficial to financial institutions where off-line
signatures, each associated with a specific transaction value, require authentication. When presented with a
collection of genuine and fraudulent training signatures, produced by so-called guinea pig writers, the proficiency
of a workforce of human employees and a score-generating machine can be estimated and represented in receiver
operating characteristic (ROC) space. Using a set of Boolean fusion functions, the majority vote decision of the
human workforce is combined with each threshold-specific machine-generated decision. The performance of the
candidate ensembles is estimated and represented in ROC space, after which only the optimal ensembles and
associated decision trees are retained. When presented with a questioned signature linked to an arbitrary writer,
the system first uses the ROC-based cost gradient associated with the transaction value to select the ensemble
that minimises the expected cost, and then uses the corresponding decision tree to authenticate the signature in
question. We show that, when utilising the entire human workforce, the incorporation of a machine streamlines
the authentication process and decreases the expected cost for all operating conditions.
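The cost-sensitive selection of an operating point can be sketched as follows; the ROC points, forgery prior and cost model are illustrative assumptions rather than the paper's figures.

import numpy as np

# Candidate ensembles represented by their ROC coordinates.
fpr = np.array([0.01, 0.05, 0.10, 0.20])   # false acceptance rate for forgeries
tpr = np.array([0.60, 0.80, 0.90, 0.95])   # true acceptance rate for genuine signatures

def expected_cost(fpr, tpr, p_forgery, cost_fa, cost_fr):
    # False accepts cost cost_fa; falsely rejected genuine signatures cost cost_fr.
    return p_forgery * fpr * cost_fa + (1 - p_forgery) * (1 - tpr) * cost_fr

transaction_value = 10_000.0               # drives the cost of a false accept
costs = expected_cost(fpr, tpr, p_forgery=0.01,
                      cost_fa=transaction_value, cost_fr=50.0)
best = costs.argmin()
print("selected operating point:", fpr[best], tpr[best], "expected cost:", costs[best])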
During the last few years many document recognition methods have been developed to determine whether a
handwriting specimen can be attributed to a known writer. However, in practice, the work-flow of the document examiner continues to be manually intensive. Before a systematic or computational approach can be developed, an
articulation of the steps involved in handwriting comparison is needed. We describe the work flow of handwritten
questioned document examination, as described in a standards manual, and the steps where existing automation
tools can be used. A well-known ransom note case is considered as an example, in which one encounters testing for multiple writers of the same document, determining whether the writing is disguised, dealing with known writing that is formal while the questioned writing is informal, etc. The findings for this ransom note case obtained using the tools are given, and observations are made toward developing a more fully automated approach to handwriting examination.
Document analysis and recognition systems often fail to produce results with a sufficient quality level when processing
old and damaged documents sets, and require manual corrections to improve results. This paper presents
how, using the iterative analysis of document pages we recently proposed, we can implement a spontaneous
interaction model, suitable for mass document processing. It enables human operators to detect and correct
errors made by the automatic system, and reintegrates the corrections they made into subsequent analysis steps
of the iterative analysis process. Thus, a page analyzer can reprocess erroneous parts and those which depend
on them, avoiding the necessity to manually fix during post-processing all the consequences of errors made by
the automatic system. After presenting the global system architecture and a prototype implementation of our
proposal, we show that the document model can be easily enriched to enable the spontaneous interaction model we propose. We show how to use it in a practical example to correct under-segmentation issues during the localization of numbers in documents from the 18th century. Evaluations conducted on this example case show, on 50 pages containing 1637 numbers to localize, that the proposed interaction model can reduce human workload (29.8% fewer elements to provide) for a given target quality level compared to manual post-processing.
The essential layout attributes of a visual table can be defined by the location of four critical grid cells. Although these
critical cells can often be located by automated analysis, some means of human interaction is necessary for correcting
residual errors. VeriClick is a macro-enabled spreadsheet interface that provides ground-truthing, confirmation,
correction, and verification functions for CSV tables. All user actions are logged. Experimental results of seven subjects
on one hundred tables suggest that VeriClick can provide a ten- to twenty-fold speedup over performing the same
functions with standard spreadsheet editing commands.
In spite of a hundredfold decrease in the cost of relevant technologies, the role of document image processing systems is
gradually declining due to the transition to an on-line world. Nevertheless, in some high-volume applications, document
image processing software still saves millions of dollars by accelerating workflow, and similarly large savings could be
realized by more effective automation of the multitude of low-volume personal document conversions. While potential
cost savings, based on estimates of costs and values, are a driving force for new developments, quantifying such savings
is difficult. The most important trend is that the cost of computing resources for DIA is becoming insignificant compared
to the associated labor costs. An econometric treatment of document processing complements traditional performance
evaluation, which focuses on assessing the correctness of the results produced by document conversion software.
Researchers should look beyond the error rate for advancing both production and personal document conversion.
For this research, calligraphic style is considered to be the visual attributes of images of calligraphic characters sampled randomly from a "work" created by a single artist. It is independent of page layout or textual content. An experimental design is developed to investigate to what extent the source of a single pair, or of a few pairs, of character images can be assigned either to the same work or to two different works. The experiments are conducted on the 13,571 segmented and labeled 600-dpi character images of the CADAL database. The classifier is not trained on the works tested, only on other works. Even when only a few samples of same-class pairs are available, the difference vector of a few simple features extracted from each image of a pair yields over 80% classification accuracy for the same-work vs. different-work dichotomy. When many pairs of different classes are available, the accuracy, using the same features, is almost the same.
These style-verification experiments are part of our larger goal of style identification and forgery detection.
State-of-the-art techniques for writer identification have been centered primarily on enhancing the performance
of the system for writer identification. Machine learning algorithms have been used extensively to improve the accuracy of such systems, assuming a sufficient amount of data is available for training. Little attention has been paid to the prospect of harnessing the information contained in large amounts of un-annotated data. This paper focuses on a co-training-based framework that can be used for iterative labeling of the unlabeled data set, exploiting the independence between multiple views (features) of the data. This paradigm relaxes the assumption that sufficient data are available and tries to generate labeled data from the unlabeled data set while improving the accuracy of the system. However, the performance of a co-training-based framework depends on the effectiveness of the algorithm used to select the data points to be added to the labeled set. We propose an Oracle-based approach for data selection that learns the patterns in the score distribution of classes for labeled data points and then predicts the labels (writers) of unlabeled data points. This selection method statistically learns the class distribution and predicts the most probable class, unlike traditional selection algorithms based on heuristic approaches. We conducted experiments on the publicly available IAM dataset and illustrate the efficacy of the proposed approach.
Handwriting styles are constantly changing over time. We approach the novel problem of estimating the approximate
age of Historical Handwritten Documents using Handwriting styles. This system will have many
applications in handwritten document processing engines where specialized processing techniques can be applied
based on the estimated age of the document. We propose to learn a distribution over styles across centuries
using Topic Models and to apply a classifier over the learned weights in order to estimate the approximate age of
the documents. We present a comparison of different distance metrics such as Euclidean Distance and Hellinger
Distance within this application.
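The distance comparison mentioned above can be made concrete with a short sketch computing Euclidean and Hellinger distances between two topic-weight vectors; the distributions are invented for illustration.

import numpy as np

def hellinger(p, q):
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))

doc = np.array([0.55, 0.25, 0.15, 0.05])      # topic weights of a queried document
period = np.array([0.50, 0.30, 0.10, 0.10])   # centroid of a dated reference period

print("euclidean:", np.linalg.norm(doc - period))
print("hellinger:", hellinger(doc, period))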
Forensic individualization is the task of associating observed evidence with a specific source. The likelihood ratio
(LR) is a quantitative measure that expresses the degree of uncertainty in individualization, where the numerator
represents the likelihood that the evidence corresponds to the known and the denominator the likelihood that
it does not correspond to the known. Since the number of parameters needed to compute the LR is exponential
with the number of feature measurements, a commonly used simplification is the use of likelihoods based on
distance (or similarity) given the two alternative hypotheses. This paper proposes an intermediate method
which decomposes the LR as the product of two factors, one based on distance and the other on rarity. It was
evaluated using a data set of handwriting samples, by determining whether two writing samples were written
by the same or by different writers. The accuracy of the distance-and-rarity method, as measured by error rates, is significantly better than that of the distance-only method.
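A hedged sketch of a distance-and-rarity likelihood ratio in this spirit: the distance factor compares how probable the observed distance is under the same-writer and different-writer hypotheses, and the rarity factor up-weights uncommon feature values. The kernel density estimates and synthetic data are assumptions, not the paper's model.

import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(4)
d_same = np.abs(rng.normal(0.0, 0.5, 2000))   # distances between same-writer pairs
d_diff = np.abs(rng.normal(2.0, 0.7, 2000))   # distances between different-writer pairs
population = rng.normal(0.0, 1.0, 2000)       # feature values across many writers

p_same, p_diff, p_pop = (gaussian_kde(d_same), gaussian_kde(d_diff),
                         gaussian_kde(population))

def likelihood_ratio(distance, known_feature):
    distance_factor = p_same(distance)[0] / p_diff(distance)[0]
    rarity_factor = 1.0 / p_pop(known_feature)[0]   # rarer features weigh more
    return distance_factor * rarity_factor

print(likelihood_ratio(distance=0.3, known_feature=2.5))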
This paper presents a system for the recognition of unconstrained handwritten mails. The main part of this
system is an HMM recognizer which uses trigraphs to model contextual information. This recognition system
does not require any segmentation into words or characters and directly works at line level. To take into account
linguistic information and enhance performance, a language model is introduced. This language model is based
on bigrams and built from training document transcriptions only. Different experiments with various vocabulary
sizes and language models have been conducted. Word Error Rate and Perplexity values are compared to show the value of specific language models fitted to the handwritten mail recognition task.
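A bigram language model of the kind described, with add-one smoothing and a perplexity computation, can be sketched in a few lines; the toy training transcriptions are invented.

import math
from collections import Counter

train = ["<s> je vous prie de bien vouloir </s>",
         "<s> je vous remercie </s>"]
tokens = [s.split() for s in train]
vocab = {w for sent in tokens for w in sent}
unigrams = Counter(w for sent in tokens for w in sent)
bigrams = Counter((a, b) for sent in tokens for a, b in zip(sent, sent[1:]))

def bigram_prob(a, b):
    # Add-one (Laplace) smoothing over the vocabulary.
    return (bigrams[(a, b)] + 1) / (unigrams[a] + len(vocab))

def perplexity(sentence):
    words = sentence.split()
    logp = sum(math.log(bigram_prob(a, b)) for a, b in zip(words, words[1:]))
    return math.exp(-logp / (len(words) - 1))

print(perplexity("<s> je vous remercie de bien vouloir </s>"))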
This paper presents a linear-based restoration method for bleed-through degraded document images and uses a
Bayesian approach for bleed-through reduction. A variation of iterated conditional modes (ICM) optimisation
is used whereby samples are drawn for the clean image estimates, whilst the remaining variables are estimated
via the mode of their conditional probabilities. The proposed method is tested on various samples of scanned manuscript images with different degrees of degradation, and the results are visually compared with a recent user-assisted restoration method.
Forensic analysis of questioned documents can sometimes be extremely data intensive. A forensic expert might need to analyze a heap of document fragments, and in such cases, to ensure reliability, he or she should focus only on relevant evidence hidden in those document fragments. Relevant document retrieval requires finding similar document fragments. One way to obtain such similar documents is to use the document fragments' physical characteristics, such as color and texture. In this article we propose an automatic scheme to retrieve similar document fragments based on the visual appearance of the document paper and its texture. Multispectral color characteristics using biologically inspired color differentiation techniques are implemented here. This is done by projecting document color characteristics into the Lab color space. Gabor filter-based texture analysis is used to identify document texture. It is expected that document fragments from the same source will have similar color and texture. For clustering similar document fragments of our test dataset we use a Self-Organizing Map (SOM)
of dimension 5×5, where the document color and texture information are used as features. We obtained an
encouraging accuracy of 97.17% from 1063 test images.
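A rough sketch, with assumed parameters, of such a pipeline: Lab colour statistics and Gabor filter responses as features, clustered with a 5x5 self-organizing map. MiniSom is a third-party package standing in for whatever SOM implementation the authors used, and random arrays stand in for scanned fragments.

import numpy as np
from skimage import color
from skimage.filters import gabor
from minisom import MiniSom

def fragment_features(img):
    lab = color.rgb2lab(img)
    gray = color.rgb2gray(img)
    feats = [lab[..., c].mean() for c in range(3)]            # colour appearance
    for theta in (0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4):
        real, _ = gabor(gray, frequency=0.2, theta=theta)     # texture response
        feats.append(np.abs(real).mean())
    return np.array(feats)

rng = np.random.default_rng(7)
fragments = [rng.random((64, 64, 3)) for _ in range(20)]      # stand-ins for scans
X = np.array([fragment_features(f) for f in fragments])

som = MiniSom(5, 5, X.shape[1], sigma=1.0, learning_rate=0.5, random_seed=0)
som.train_random(X, 500)
clusters = [som.winner(x) for x in X]                         # SOM grid cell per fragment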
Video structuring and indexing are two crucial processes for multi-media document understanding and information
retrieval. This paper presents a novel approach to automatically structuring and indexing lecture videos for an educational video system. By structuring and indexing video content, we can support both topic indexing
and semantic querying of multimedia documents. In this paper, our goal is to extract indices of topics and link
them with their associated video and audio segments. Two main techniques used in our proposed approach are
video image analysis and video text analysis. Using this approach, we obtain an accuracy of over 90.0% on our test collection.
We present in this paper a feature selection and weighting method for medieval handwriting images that relies on codebooks of shapes of small strokes of characters (graphemes obtained from the decomposition of manuscripts). These codebooks are important for simplifying the automation of analysis, manuscript transcription, and the recognition of styles or writers. Our approach provides precise feature weighting by genetic algorithms and a high-performance methodology for categorizing grapheme shapes into codebooks by using graph coloring; these codebooks are applied in turn to CBIR (Content-Based Image Retrieval) in a mixed handwriting database containing pages from different writers, historical periods, and qualities. We show how the coupling of these two mechanisms, feature weighting and grapheme classification, can offer a better separation of the forms to be categorized by exploiting their grapho-morphological, density, and dominant-orientation particularities.
Online handwritten data, produced with Tablet PCs or digital pens, consist of a sequence of points (x, y). As the amount of data available in this form increases, algorithms for the retrieval of online data are needed. Word spotting is a common approach used for the retrieval of handwriting. However, from an information retrieval (IR) perspective, word spotting is a primitive keyword-based matching and retrieval strategy. We propose a
framework for handwriting retrieval where an arbitrary word spotting method is used, and then a manifold
ranking algorithm is applied on the initial retrieval scores. Experimental results on a database of more than
2,000 handwritten newswires show that our method can improve the performance of a state-of-the-art word spotting system by more than 10%.
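The manifold ranking step can be sketched with the classical iteration F = alpha * S @ F + (1 - alpha) * y0 over a normalised Gaussian-kernel affinity graph; the descriptors, initial scores and parameters below are assumptions.

import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(50, 16))          # descriptors of handwritten documents
y0 = rng.random(50)                    # initial word-spotting scores

# Normalised affinity matrix S = D^{-1/2} W D^{-1/2} with a Gaussian kernel.
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
W = np.exp(-d2 / 2.0)
np.fill_diagonal(W, 0.0)
Dinv = 1.0 / np.sqrt(W.sum(axis=1))
S = Dinv[:, None] * W * Dinv[None, :]

alpha, F = 0.9, y0.copy()
for _ in range(100):                   # propagate scores over the manifold
    F = alpha * S @ F + (1 - alpha) * y0

ranking = np.argsort(-F)               # refined retrieval order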
This paper describes the system for the recognition of French handwriting submitted by A2iA to the competition organized at ICDAR2011 using the Rimes database.
This system is composed of several recognizers based on three different recognition technologies, combined using a novel combination method.
A framework for multi-word recognition based on weighted finite-state transducers is presented, using an explicit word segmentation, a combination of isolated word recognizers, and a language model.
The system was tested both for isolated word recognition and for multi-word line recognition and submitted to the RIMES-ICDAR2011 competition.
This system outperformed all previously proposed systems on these tasks.
Straight line segment detection in digital documents has been studied extensively for the past few decades. One of the challenges is to detect line segments without prior information about the document images and to render good results without much parameter calibration. In this paper, we introduce a novel algorithm that is simple but effective in detecting straight line segments in scanned documents. Our Connected Component Decomposition (CCD) approach first decomposes the connected components based on the gradient direction of the edge contours, then uses Chebyshev's inequality to statistically distinguish lines from characters, followed by a simple post-processing step to examine the straightness of the remaining segments. This CCD approach is simple to follow and fast in its implementation, and its high accuracy and usability are demonstrated empirically on a practical data set with large variety.
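One way Chebyshev's inequality can help separate line-like from character-like components is sketched below; the straightness statistic and threshold are illustrative assumptions, not necessarily the authors' exact test.

import numpy as np

def straightness(xs, ys):
    # Standard deviation of perpendicular residuals after fitting a straight line.
    slope, intercept = np.polyfit(xs, ys, 1)
    return float((ys - (slope * xs + intercept)).std())

def is_outlier(value, mean, std, k=4.0):
    # Chebyshev: P(|X - mean| >= k*std) <= 1/k^2, so a segment deviating this far
    # from the line population is unlikely to be a genuine straight line.
    return abs(value - mean) >= k * std

rng = np.random.default_rng(6)
# Hypothetical straightness values measured on decomposed segments known to be lines.
line_stats = rng.normal(0.4, 0.1, size=500)
mu, sigma = line_stats.mean(), line_stats.std()

xs = np.linspace(0, 100, 200)
curved = straightness(xs, 10 * np.sin(xs / 5.0))                 # character-like stroke
ruled = straightness(xs, 0.2 * xs + rng.normal(0, 0.3, 200))     # straight rule line
print("curved stroke rejected as line:", is_outlier(curved, mu, sigma))
print("ruled line rejected as line:", is_outlier(ruled, mu, sigma))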
Document images accompanied by OCR output text and ground truth transcriptions are useful for developing and evaluating
document recognition and processing methods, especially for historical document images. Additionally, research into
improving the performance of such methods often requires further annotation of training and test data (e.g., topical document
labels). However, transcribing and labeling historical documents is expensive. As a result, existing real-world document
image datasets with such accompanying resources are rare and often relatively small. We introduce synthetic document
image datasets of varying levels of noise that have been created from standard (English) text corpora using an existing
document degradation model applied in a novel way. Included in the datasets is the OCR output from real OCR engines
including the commercial ABBYY FineReader and the open-source Tesseract engines. These synthetic datasets are designed
to exhibit some of the characteristics of an example real-world document image dataset, the Eisenhower Communiqués. The
new datasets also benefit from additional metadata that exist due to the nature of their collection and prior labeling efforts.
We demonstrate the usefulness of the synthetic datasets by training an existing multi-engine OCR correction method on the
synthetic data and then applying the model to reduce word error rates on the historical document dataset. The synthetic
datasets will be made available for use by other researchers.