E.M. Pekalska | TU Delft Repository

Pattern Recognition

Introduction and Terminology

Book (2016) - Robert P.W. Duin, Elzbieta Pekalska

This ebook gives the starting student an introduction to the field of pattern recognition. It may serve as a reference to others by giving intuitive descriptions of the terminology. The book is the first in a series of ebooks on topics and examples in the field. Our goal is an informal explanation of the concepts. For thorough mathematical descriptions we refer to the textbooks and lectures. In ten chapters the topics of pattern recognition are summarized and its terminology is introduced. In the glossary about 200 terms are described. All glossary terms are linked, forward and backward, by hypertext. In the glossary chapter external links are provided to internet pages, papers, tutorials, Wikipedia entries, examples, etcetera. Internal links are in dark blue in order to preserve the readability. External links are in blue. This ebook is offered by the authors of a website on pattern recognition tools, http://37steps.com/. Here more information, software, data and examples can be found. The book itself does not assume the use of specific software. The code for generating the examples, however, is written in Matlab using PRTools. It can be inspected by clicking on the figures or example links. ...

On euclidean correction for non-euclidean dissimilarities

Conference paper (2008) - RPW Duin, EM Pekalska, A Harol, WJ Lee, H Bunke

Group-induced vector spaces

Conference paper (2007) - M Bicego, EM Pekalska, RPW Duin

The science of pattern recognition. Achievements and perspectives.

Book chapter (2007) - RPW Duin, EM Pekalska

Non-Euclidean or non-metric measures can be informative

Conference paper (2006) - EM Pekalska, A Harol, RPW Duin, B Spillmann, H Bunke

Statistical learning algorithms often rely on the Euclidean distance. In practice, non-Euclidean or non-metric dissimilarity measures may arise when contours, spectra or shapes are compared by edit distances or as a consequence of robust object matching [1,2]. It is an open issue whether such measures are advantageous for statistical learning or whether they should be constrained to obey the metric axioms. The k-nearest neighbor (NN) rule is widely applied to general dissimilarity data as the most natural approach. Alternative methods exist that embed such data into suitable representation spaces in which statistical classifiers are constructed [3]. In this paper, we investigate the relation between non-Euclidean aspects of dissimilarity data and the classification performance of the direct NN rule and some classifiers trained in representation spaces. This is evaluated on a parameterized family of edit distances, in which parameter values control the strength of non-Euclidean behavior. Our finding is that the discriminative power of this measure increases with increasing non-Euclidean and non-metric aspects until a certain optimum is reached. The conclusion is that statistical classifiers perform well and the optimal values of the parameters characterize a non-Euclidean and somewhat non-metric measure ...
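As a rough illustration of two notions in the abstract above, the following sketch lists triangle-inequality violations in a dissimilarity matrix (one way a measure can fail to be a metric) and applies the direct 1-NN rule to a row of dissimilarities. It is not the procedure used in the paper; the function names `triangle_violations` and `nn_label` and the toy matrix are assumptions made for this example. Note that satisfying the triangle inequality makes a measure metric but does not by itself make it Euclidean.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def triangle_violations(D: Sequence[Sequence[float]],
                        tol: float = 1e-12) -> List[Tuple[int, int, int]]:
    """Return all ordered triples (i, j, k) with D[i][j] > D[i][k] + D[k][j]."""
    n = len(D)
    return [(i, j, k)
            for i, j, k in permutations(range(n), 3)
            if D[i][j] > D[i][k] + D[k][j] + tol]


def nn_label(d_to_train: Sequence[float], labels: Sequence[str]) -> str:
    """Direct 1-NN rule: label of the training object with the smallest dissimilarity."""
    nearest = min(range(len(labels)), key=lambda idx: d_to_train[idx])
    return labels[nearest]


# Hypothetical symmetric dissimilarity matrix for three objects.
# Going 0 -> 1 -> 2 costs 1 + 1 = 2, yet the direct dissimilarity D[0][2] is 5.
D = [[0.0, 1.0, 5.0],
     [1.0, 0.0, 1.0],
     [5.0, 1.0, 0.0]]

violations = triangle_violations(D)
predicted = nn_label([2.0, 0.5, 3.0], ["a", "b", "a"])
```

The 1-NN rule needs only the dissimilarities themselves, which is why it remains applicable when the measure is non-metric.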

Transforming strings to vector spaces using prototype selection

Conference paper (2006) - B Spillmann, M Neuhaus, H Bunke, EM Pekalska, RPW Duin

A common way of expressing string similarity in structural pattern recognition is the edit distance. It allows one to apply the kNN rule in order to classify a set of strings. However, compared to the wide range of elaborated classifiers known from statistical pattern recognition, this is only a very basic method. In the present paper we propose a method for transforming strings into n-dimensional real vector spaces based on prototype selection. This allows us to subsequently classify the transformed strings with more sophisticated classifiers, such as support vector machine and other kernel based methods. In a number of experiments, we show that the recognition rate can be significantly improved by means of this procedure. ...
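The general idea of mapping a string to a vector of edit distances to a set of prototypes can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: it uses the plain Levenshtein distance with unit costs, and the prototypes are chosen by hand rather than by any prototype selection strategy. The names `edit_distance` and `embed` are hypothetical.

```python
from typing import List, Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit costs, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # substitute or match
        prev = curr
    return prev[-1]


def embed(s: str, prototypes: Sequence[str]) -> List[float]:
    """Represent s as the vector of its edit distances to each prototype."""
    return [float(edit_distance(s, p)) for p in prototypes]


# Hypothetical prototypes, picked by hand for illustration only.
prototypes = ["kitten", "flaw"]
vectors = [embed(s, prototypes) for s in ["sitting", "lawn", "kitten"]]
```

Each string becomes a point in an n-dimensional real space, where n is the number of prototypes, so any vector-based classifier can then be trained on `vectors`.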

Structural inference of sensor-based measurements

Conference paper (2006) - RPW Duin, EM Pekalska

Statistical inference of sensor-based measurements is intensively studied in pattern recognition. It is usually based on feature representations of the objects to be recognized. Such representations, however, neglect the object structure. Structural pattern recognition, on the contrary, focusses on encoding the object structure. As general procedures are still weakly developed, such object descriptions are often application dependent. This hampers the usage of a general learning approach. This paper aims to summarize the problems and possibilities of general structural inference approaches for the family of sensor-based measurements: images, spectra and time signals, assuming a continuity between measurement samples. In particular it will be discussed when probabilistic assumptions are needed, leading to a statistically-based inference of the structure, and when a pure, non-probabilistic structural inference scheme may be possible. ...

Augmented embedding of dissimilarity data into (pseudo-)Euclidean spaces

Conference paper (2006) - A Harol, EM Pekalska, S Verzakov, RPW Duin

Pairwise proximities describe the properties of objects in terms of their similarities. By using different distance-based functions one may encode different characteristics of a given problem. However, to use the framework of statistical pattern recognition some vector representation should be constructed. One of the simplest ways to do that is to define an isometric embedding to some vector space. In this work, we will focus on a linear embedding into a (pseudo-)Euclidean space. This is usually well defined for training data. Some inadequacy, however, appears when projecting new or test objects due to the resulting projection errors. In this paper we propose an augmented embedding algorithm that enlarges the dimensionality of the space such that the resulting projection error vanishes. Our preliminary results show that it may lead to a better classification accuracy, especially for data with high intrinsic dimensionality. ...

Outlier detection using ball descriptions with adjustable metric

Conference paper (2006) - DMJ Tax, P Juszczak, EM Pekalska, RPW Duin

Sometimes novel or outlier data has to be detected. The outliers may indicate some interesting rare event, or they should be disregarded because they cannot be reliably processed further. In the ideal case that the objects are represented by very good features, the genuine data forms a compact cluster and a good outlier measure is the distance to the cluster center. This paper proposes three new formulations to find a good cluster center together with an optimized ℓp-distance measure. Experiments show that for some real world datasets very good classification results are obtained and that, more specifically, the ℓ1-distance is particularly suited for datasets containing discrete feature values. ...
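The basic outlier measure mentioned above, the distance from an object to a cluster center under a p-distance, can be sketched as below. This does not reproduce any of the three formulations proposed in the paper: the center is simply taken as the coordinate-wise median, and both p and the acceptance radius are fixed by hand rather than optimized. All names (`p_distance`, `fit_center`, `is_outlier`) and the toy data are assumptions for illustration.

```python
from statistics import median
from typing import List, Sequence


def p_distance(x: Sequence[float], c: Sequence[float], p: float) -> float:
    """Minkowski p-distance between x and c (p = 1 city-block, p = 2 Euclidean)."""
    return sum(abs(xi - ci) ** p for xi, ci in zip(x, c)) ** (1.0 / p)


def fit_center(data: Sequence[Sequence[float]]) -> List[float]:
    """Assumed center estimate: the coordinate-wise median of the training data."""
    return [median(column) for column in zip(*data)]


def is_outlier(x: Sequence[float], center: Sequence[float],
               p: float, radius: float) -> bool:
    """Flag x as an outlier when its p-distance to the center exceeds the radius."""
    return p_distance(x, center, p) > radius


# Hypothetical training data forming one compact cluster.
train = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
center = fit_center(train)

far_flag = is_outlier([10.0, 10.0], center, p=1, radius=3.0)
near_flag = is_outlier([1.0, 2.0], center, p=1, radius=3.0)
```

In practice the radius would be set from the training data, for example so that a chosen fraction of the genuine objects is accepted.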

Object representation, sample size, and data set complexity

Book chapter (2006) - RPW Duin, EM Pekalska

Pairwise selection of features and prototypes

Conference paper (2005) - EM Pekalska, A Harol, RPW Duin

Open issues in pattern recognition

Conference paper (2005) - RPW Duin, EM Pekalska

Combining dissimilarity-based one-class classifiers

Conference paper (2004) - EM Pekalska, M Skurichina, RPW Duin

On not making dissimilarities euclidean

Conference paper (2004) - EM Pekalska, RPW Duin, S Gunter, H Bunke

One-Class LP Classifiers for Dissimilarity Representations

Conference paper (2003) - EM Pekalska, DMJ Tax, RPW Duin

On combining one-class classifiers for image database retrieval

Conference paper (2002) - C Lai, DMJ Tax, RPW Duin, EM Pekalska, P Paclik

In image retrieval systems, images can be represented by single feature vectors or by clouds of points. A cloud of points offers a more flexible description but suffers from class overlap. We propose a novel approach for describing clouds of points based on support vector data description (SVDD). We show that combining SVDD-based classifiers improves the retrieval precision. We investigate the performance of the proposed retrieval technique on a database of 368 texture images and compare it to other methods. ...

Spatial representation of dissimilarity data via lower-complexity linear and nonlinear mappings

Conference paper (2002) - EM Pekalska, RPW Duin

Dissimilarity representations are of interest when it is hard to define well-discriminating features for the raw measurements. For an exploration of such data, the techniques of multidimensional scaling (MDS) can be used. Given a symmetric dissimilarity matrix, they find a lower-dimensional configuration such that the distances are preserved. Here, Sammon nonlinear mapping is considered. In general, this iterative method must be recomputed when new examples are introduced, but its complexity is quadratic in the number of objects in each iteration step. A simple modification to the nonlinear MDS, allowing for a significant reduction in complexity, is therefore considered, as well as a linear projection of the dissimilarity data. Now, generalization to new data can be achieved, which makes it suitable for solving classification problems. The linear and nonlinear mappings are then used in the setting of data visualization and classification. Our experiments show that the nonlinear mapping can be preferable for data inspection, while for discrimination purposes, a linear mapping can be recommended. Moreover, for the spatial lower-dimensional representation, a more global, linear classifier can be built, which outperforms the local nearest neighbor rule, traditionally applied to dissimilarities. ...

A discussion on the classifier projection space for classifier combining

Conference paper (2002) - EM Pekalska, RPW Duin, M Skurichina

In classifier combining, one tries to fuse the information that is given by a set of base classifiers. In such a process, one of the difficulties is how to deal with the variability between classifiers. Although various measures and many combining rules have been suggested in the past, the problem of constructing optimal combiners is still heavily studied. In this paper, we discuss and illustrate the possibilities of classifier embedding in order to analyse the variability of base classifiers, as well as their combining rules. Thereby, a space is constructed in which classifiers can be represented as points. Such a space of a low dimensionality is a Classifier Projection Space (CPS). In the first instance, it is used to design a visual tool that gives more insight into the differences of various combining techniques. This is illustrated by some examples. In the end, we discuss how the CPS may also be used as a basis for constructing new combining rules. ...

On Combining Dissimilarity Representations.

Conference paper (2001) - EM Pekalska, RPW Duin

Combining Fisher Linear Discriminants for Dissimilarity Representations.

Conference paper (2000) - EM Pekalska, M Skurichina, RPW Duin