E.M. Pekalska
20 records found
Pattern Recognition
Introduction and Terminology
This ebook gives the starting student an introduction into the field of pattern recognition. It may serve as reference to others by giving intuitive descriptions of the terminology. The book is the first in a series of ebooks on topics and examples in the field. Our goal is an informal explanation of the concepts. For thorough mathematical descriptions we refer to the textbooks and lectures. In ten chapters the topics of pattern recognition are summarized and its terminology is introduced. In the glossary about 200 terms are described. All glossary terms are linked, forward and backward by hypertext. In the glossary chapter external links are provided to internet pages, papers, tutorials, Wikipedia entries, examples, etcetera. Internal links are in dark blue in order to preserve the readability. External links are in blue. This ebook is offered by the authors of a website on pattern recognition tools, http://37steps.com/. Here more information, software, data and examples can be found. The book itself does not assume the use of specific software. The code for generating the examples, however, is written in Matlab using PRTools. It can be inspected by clicking on the figures or example links.
Statistical learning algorithms often rely on the Euclidean distance. In practice, non-Euclidean or non-metric dissimilarity measures may arise when contours, spectra or shapes are compared by edit distances or as a consequence of robust object matching [1,2]. It is an open issue whether such measures are advantageous for statistical learning or whether they should be constrained to obey the metric axioms.
The k-nearest neighbor (NN) rule is widely applied to general dissimilarity data as the most natural approach. Alternative methods exist that embed such data into suitable representation spaces in which statistical classifiers are constructed [3]. In this paper, we investigate the relation between non-Euclidean aspects of dissimilarity data and the classification performance of the direct NN rule and some classifiers trained in representation spaces. This is evaluated on a parameterized family of edit distances, in which parameter values control the strength of non-Euclidean behavior. Our finding is that the discriminative power of this measure increases with increasing non-Euclidean and non-metric aspects until a certain optimum is reached. The conclusion is that statistical classifiers perform well and the optimal values of the parameters characterize a non-Euclidean and somewhat non-metric measure.
Statistical inference of sensor-based measurements is intensively studied in pattern recognition. It is usually based on feature representations of the objects to be recognized. Such representations, however, neglect the object structure. Structural pattern recognition, on the contrary, focusses on encoding the object structure. As general procedures are still weakly developed, such object descriptions are often application dependent. This hampers the usage of a general learning approach.
This paper aims to summarize the problems and possibilities of general structural inference approaches for the family of sensor-based measurements: images, spectra and time signals, assuming a continuity between measurement samples. In particular it will be discussed when probabilistic assumptions are needed, leading to a statistically-based inference of the structure, and when a pure, non-probabilistic structural inference scheme may be possible.
A common way of expressing string similarity in structural pattern recognition is the edit distance. It allows one to apply the kNN rule in order to classify a set of strings. However, compared to the wide range of elaborated classifiers known from statistical pattern recognition, this is only a very basic method. In the present paper we propose a method for transforming strings into n-dimensional real vector spaces based on prototype selection. This allows us to subsequently classify the transformed strings with more sophisticated classifiers, such as support vector machine and other kernel based methods. In a number of experiments, we show that the recognition rate can be significantly improved by means of this procedure.
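The idea of turning strings into real vectors through prototypes can be illustrated with a small sketch. It is not the authors' implementation: it assumes a unit-cost Levenshtein distance, and the prototype list is hypothetical and hand-picked, whereas the abstract refers to a prototype selection step that is not reproduced here. Each string is mapped to the vector of its edit distances to the n prototypes.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit costs (an assumption of this sketch)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def embed(strings, prototypes):
    """Map each string to its vector of edit distances to the prototypes."""
    return [[edit_distance(s, p) for p in prototypes] for s in strings]


# Hypothetical prototypes; a real system would select them from training data.
prototypes = ["kitten", "flaw"]
vectors = embed(["sitting", "lawn", "kitten"], prototypes)
print(vectors)
```

The resulting rows are ordinary feature vectors, so any vector-space classifier could be trained on them in place of the kNN rule on raw distances.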
Pairwise proximities describe the properties of objects in terms of their similarities. By using different distance-based functions one may encode different characteristics of a given problem. However, to use the framework of statistical pattern recognition some vector representation should be constructed. One of the simplest ways to do that is to define an isometric embedding to some vector space. In this work, we will focus on a linear embedding into a (pseudo-)Euclidean space.
This is usually well defined for training data. Some inadequacy, however, appears when projecting new or test objects due to the resulting projection errors. In this paper we propose an augmented embedding algorithm that enlarges the dimensionality of the space such that the resulting projection error vanishes. Our preliminary results show that it may lead to a better classification accuracy, especially for data with high intrinsic dimensionality.
Sometimes novel or outlier data has to be detected. The outliers may indicate some interesting rare event, or they should be disregarded because they cannot be reliably processed further. In the ideal case that the objects are represented by very good features, the genuine data forms a compact cluster and a good outlier measure is the distance to the cluster center. This paper proposes three new formulations to find a good cluster center together with an optimized p-distance measure. Experiments show that for some real world datasets very good classification results are obtained and that, more specifically, the 1-distance is particularly suited for datasets containing discrete feature values.
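A distance-to-center outlier score of this kind can be sketched as follows. The sketch rests on assumptions that are not in the abstract: the p-distance is read as the Minkowski distance of order p, the center is simply the coordinate-wise mean, and p is fixed by hand. The paper's three formulations for optimizing the center and p are not reproduced.

```python
def p_distance(x, center, p):
    """Minkowski distance of order p between x and the cluster center."""
    return sum(abs(xi - ci) ** p for xi, ci in zip(x, center)) ** (1.0 / p)


def mean_center(points):
    """Coordinate-wise mean; a stand-in for an optimized cluster center."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


# Toy "genuine" data forming a compact cluster (illustrative values only).
genuine = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
center = mean_center(genuine)

candidate = [3.5, 4.5]
score_p1 = p_distance(candidate, center, p=1)  # 1-distance
score_p2 = p_distance(candidate, center, p=2)  # Euclidean distance
print(center, score_p1, score_p2)
```

An object would be flagged as an outlier when its score exceeds a threshold chosen on the genuine data; that threshold is left out here because the abstract does not specify one.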
In classifier combining, one tries to fuse the information that is given by a set of base classifiers. In such a process, one of the difficulties is how to deal with the variability between classifiers. Although various measures and many combining rules have been suggested in the past, the problem of constructing optimal combiners is still heavily studied.
In this paper, we discuss and illustrate the possibilities of classifier embedding in order to analyse the variability of base classifiers, as well as their combining rules. Thereby, a space is constructed in which classifiers can be represented as points. Such a space of a low dimensionality is a Classifier Projection Space (CPS). In the first instance, it is used to design a visual tool that gives more insight into the differences of various combining techniques. This is illustrated by some examples. In the end, we discuss how the CPS may also be used as a basis for constructing new combining rules.
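Representing classifiers as points requires some dissimilarity between them. The sketch below computes one possible choice, assumed here for illustration and not taken from the abstract: the fraction of objects on which two classifiers assign different labels. The classifier names and label lists are hypothetical. A matrix like this could then be passed to a low-dimensional embedding method to obtain a two-dimensional map of the classifiers.

```python
def disagreement(labels_a, labels_b):
    """Fraction of objects on which two classifiers assign different labels."""
    differing = sum(a != b for a, b in zip(labels_a, labels_b))
    return differing / len(labels_a)


# Hypothetical label outputs of three base classifiers on four objects.
outputs = {
    "clf_a": [0, 0, 1, 1],
    "clf_b": [0, 1, 1, 1],
    "clf_c": [1, 1, 0, 0],
}

names = sorted(outputs)
# Symmetric dissimilarity matrix between classifiers, zero on the diagonal.
D = [[disagreement(outputs[r], outputs[c]) for c in names] for r in names]
for name, row in zip(names, D):
    print(name, row)
```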
Dissimilarity representations are of interest when it is hard to define well-discriminating features for the raw measurements. For an exploration of such data, the techniques of multidimensional scaling (MDS) can be used. Given a symmetric dissimilarity matrix, they find a lower-dimensional configuration such that the distances are preserved. Here, Sammon nonlinear mapping is considered. In general, this iterative method must be recomputed when new examples are introduced, but its complexity is quadratic in the number of objects in each iteration step.
A simple modification to the nonlinear MDS, allowing for a significant reduction in complexity, is therefore considered, as well as a linear projection of the dissimilarity data. Now, generalization to new data can be achieved, which makes it suitable for solving classification problems. The linear and nonlinear mappings are then used in the setting of data visualization and classification. Our experiments show that the nonlinear mapping can be preferable for data inspection, while for discrimination purposes, a linear mapping can be recommended. Moreover, for the spatial lower-dimensional representation, a more global, linear classifier can be built, which outperforms the local nearest neighbor rule, traditionally applied to dissimilarities.
In image retrieval systems, images can be represented by single feature vectors or by clouds of points. A cloud of points offers a more flexible description but suffers from class overlap. We propose a novel approach for describing clouds of points based on support vector data description (SVDD). We show that combining SVDD-based classifiers improves the retrieval precision. We investigate the performance of the proposed retrieval technique on a database of 368 texture images and compare it to other methods.