Active Semi-Supervised Learning For Diffusions on Graphs

Master Thesis (2019)

Author(s)

B. Das (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

G.J.T. Leus – Mentor (TU Delft - Signal Processing Systems)

E. Isufi – Mentor (TU Delft - Multimedia Computing)

D.M.J. Tax – Graduation committee member (TU Delft - Pattern Recognition and Bioinformatics)

Faculty

Electrical Engineering, Mathematics and Computer Science

Keywords

Active learning, Compressed sensing, Sparse sensing, Semi-supervised learning, Diffusion on graphs

To reference this document use:

https://resolver.tudelft.nl/uuid:ae389541-9316-47dd-8fbc-b96c92da3c3b

Publication Year

2019

Language

English

Graduation Date

27-11-2019

Awarding Institution

Delft University of Technology

Programme

Electrical Engineering | Circuits and Systems

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

In statistical learning over large datasets, labeling all points is expensive and time-consuming. Semi-supervised classification allows learning with very few labels. Selecting which few points to label therefore becomes crucial, as performance relies heavily on the labeled points. The motivation behind active learning is to build an optimal training set with the classifier in mind. Random or heuristic-driven selection either ignores the classification process or is trivially defined. We are interested in the graph structure formed by the data, as seen in citation, social and biological networks. Accordingly, active semi-supervised learning on graphs chooses the nodes to label so as to enhance classification performance. We propose a new methodology to perform active learning for diffusion-based semi-supervised classifiers. In particular, we focus on a classifier that diffuses probability distributions over the graph through random walks. We pose the active learning problem in two ways: (i) as a linear inverse problem with a sparse starting distribution over the nodes; (ii) as a model output selection problem. For the former, we use sparsity-regularized inverse problems to select nodes. For the latter, we use tools from Compressed Sensing and Sparse Sensing to select the nodes with the relevant model output. We show that all the relevant nodes can be selected in a single shot, which avoids reliance on multiple training phases. Results on simulated as well as real datasets show that the proposed methods outperform random labeling, indicating their relevance for active semi-supervised learning on graphs.
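To make the setting concrete, the following is a minimal sketch of a generic diffusion-based semi-supervised classifier of the kind the abstract refers to. It is not taken from the thesis: the personalized-PageRank-style update, the function name `diffusion_classify`, the parameters `alpha` and `steps`, and the toy graph are all assumptions chosen for illustration. Each class starts from a sparse probability distribution supported on its labeled nodes, the distribution is diffused by a random walk with restart, and every node is assigned the class with the largest diffused probability.

```python
import numpy as np

def diffusion_classify(A, labeled, alpha=0.85, steps=50):
    """Classify all nodes by diffusing per-class seed distributions.

    A       : symmetric (n x n) adjacency matrix, no isolated nodes (assumed)
    labeled : dict mapping node index -> class id (the few labeled nodes)
    alpha   : probability of continuing the walk (1 - alpha is the restart)
    steps   : number of diffusion steps
    """
    n = A.shape[0]
    deg = A.sum(axis=1)
    P = A / deg[:, None]                      # row-stochastic random-walk matrix
    classes = sorted(set(labeled.values()))
    scores = np.zeros((n, len(classes)))
    for j, c in enumerate(classes):
        seeds = [v for v, y in labeled.items() if y == c]
        p0 = np.zeros(n)
        p0[seeds] = 1.0 / len(seeds)          # sparse starting distribution
        p = p0.copy()
        for _ in range(steps):
            # one random-walk step with restart to the labeled seeds
            p = (1 - alpha) * p0 + alpha * (P.T @ p)
        scores[:, j] = p
    return np.array(classes)[scores.argmax(axis=1)]

# Hypothetical toy graph: two triangles {0,1,2} and {3,4,5} joined by edge 2-3.
A = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1.0

# Only two nodes are labeled, one per cluster.
pred = diffusion_classify(A, {0: 0, 5: 1})
print(pred)
```

In this sketch the predicted labels depend entirely on which nodes are passed in `labeled`, which is the choice active learning aims to make well rather than at random.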

Files

Thesis_Report.pdf

(pdf | 2.77 Mb)

License info not available