The Similarity Between Dissimilarities

Abstract

When characterizing teams of people, molecules, or general graphs, it is difficult to encode all information in a single feature vector. For such objects, dissimilarity matrices that capture the interaction or similarity between the sub-elements (people, atoms, nodes) can be used instead. This paper compares several representations of dissimilarity matrices that encode the cluster characteristics, latent dimensionality, or outliers of these matrices. Both the simple eigenvalue spectrum and the histogram of distances turn out to be quite effective, reaching high classification performance in multiple instance learning (MIL) problems. Finally, an analysis of teams of people is given, illustrating the potential use of dissimilarity matrix characterization for business consultancy.
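
To make the two simplest representations concrete, the following is a minimal sketch of how an eigenvalue spectrum and a histogram of distances could be computed from a dissimilarity matrix. It is an illustration under stated assumptions, not the paper's exact procedure: the function names `eigenvalue_spectrum` and `distance_histogram`, the parameters `k`, `bins`, and `max_dist`, the ordering by magnitude, the zero-padding, and the normalization are all choices made here for the example.

```python
import numpy as np


def eigenvalue_spectrum(D, k=5):
    """Return the k largest-magnitude eigenvalues of D, zero-padded to length k.

    Assumption: D is a square dissimilarity matrix; it is symmetrized first
    so that the eigenvalues are real.
    """
    D = np.asarray(D, dtype=float)
    S = (D + D.T) / 2.0                   # enforce symmetry
    eig = np.linalg.eigvalsh(S)           # real eigenvalues, ascending order
    eig = eig[np.argsort(-np.abs(eig))]   # reorder by decreasing magnitude
    out = np.zeros(k)
    m = min(k, eig.size)
    out[:m] = eig[:m]                     # pad with zeros if D is smaller than k
    return out


def distance_histogram(D, bins=4, max_dist=1.0):
    """Return a normalized histogram of the off-diagonal distances in D.

    Assumption: distances lie in [0, max_dist]; values outside that range
    are ignored by np.histogram.
    """
    D = np.asarray(D, dtype=float)
    upper = np.triu_indices(D.shape[0], k=1)   # each pair counted once
    counts, _ = np.histogram(D[upper], bins=bins, range=(0.0, max_dist))
    total = counts.sum()
    return counts / total if total > 0 else counts.astype(float)


# Two hypothetical objects with different numbers of sub-elements.
D_small = np.array([[0.0, 0.2, 0.9],
                    [0.2, 0.0, 0.8],
                    [0.9, 0.8, 0.0]])
D_large = np.array([[0.0, 0.1, 0.7, 0.6],
                    [0.1, 0.0, 0.8, 0.7],
                    [0.7, 0.8, 0.0, 0.2],
                    [0.6, 0.7, 0.2, 0.0]])

for D in (D_small, D_large):
    features = np.concatenate([eigenvalue_spectrum(D), distance_histogram(D)])
    print(features.shape)
```

Both functions return a vector whose length does not depend on the size of the matrix, so objects with different numbers of sub-elements can be fed to the same standard classifier.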