Nessy

A Neuro-Symbolic System for Label Noise Reduction

Journal Article (2022)

Author(s)

Alisa Smirnova (University of Fribourg)

Jie Yang (TU Delft - Web Information Systems)

Dingqi Yang (University of Macau)

Philippe Cudré-Mauroux (University of Fribourg)

Research Group

Web Information Systems

DOI related publication

https://doi.org/10.1109/TKDE.2022.3199570

Keywords

Deep learning; Data mining; Feature extraction; Noise reduction; Noise measurement; Data models; Probabilistic logic; Deep probabilistic model; Distant supervision; Neuro-symbolic systems; Relation extraction; Training data

To reference this document use:

https://resolver.tudelft.nl/uuid:49600693-899c-426a-9ce3-46da8b80f08b

Publication Year

2022

Language

English

Issue number

8

Volume number

35

Pages (from-to)

8300-8311

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Noisy labels represent one of the key issues in supervised machine learning. Existing work for label noise reduction mainly takes a probabilistic approach that infers true labels from data distributions in low-level feature spaces. Such an approach is not only limited by its capability to learn high-quality data representations, but also by the low predictive power of data distributions in inferring true classes. To address those problems, we introduce Nessy, a neuro-symbolic system that integrates deep probabilistic modeling and symbolic knowledge for label noise reduction. Our deep probabilistic model infers the true classes of data instances with noisy labels by exploiting data distributions in an underlying latent feature representation space. For data instances where inference is not reliable enough, Nessy extracts symbolic rules and ranks them according to several utility metrics. Top-ranking rules are injected into the deep probabilistic model via expectation regularization, i.e., via a posterior regularization term constraining the class distribution in the objective function. In a real deployment over multiple relation extraction tasks, we demonstrate that Nessy is able to significantly improve the state of the art, by 7% accuracy and 10.7% AUC on average.
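The abstract describes expectation regularization only in words. As a rough illustration, the sketch below shows the generic form of such a term: for the instances covered by one rule, the model's average predicted class distribution is pulled toward a target distribution through a KL-divergence penalty added to the base loss. This is a minimal sketch of the general technique, not the objective used in the paper; the function names, the `weight` parameter, and the toy logits are all assumptions made for illustration.

```python
import math

def softmax(logits):
    """Convert one instance's logits into a class probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mean_class_distribution(logits_batch):
    """Average the predicted class distributions over a batch of instances."""
    probs = [softmax(logits) for logits in logits_batch]
    n, k = len(probs), len(probs[0])
    return [sum(p[c] for p in probs) / n for c in range(k)]

def expectation_regularizer(logits_batch, target_dist, eps=1e-12):
    """KL(target || mean prediction) over the instances covered by one rule.

    Hypothetical helper: target_dist stands for the class distribution that
    a symbolic rule is assumed to imply for the instances it matches.
    """
    avg = mean_class_distribution(logits_batch)
    return sum(
        t * math.log((t + eps) / (q + eps))
        for t, q in zip(target_dist, avg)
        if t > 0  # terms with zero target mass contribute nothing
    )

def regularized_loss(base_loss, logits_batch, target_dist, weight=1.0):
    """Base objective plus the weighted expectation-regularization term."""
    return base_loss + weight * expectation_regularizer(logits_batch, target_dist)

# Toy usage: a rule assumed to imply class 1 with probability 0.9, applied
# to two instances that the model currently assigns mostly to class 0.
covered_logits = [[2.0, 0.0], [2.0, 0.0]]
penalty = expectation_regularizer(covered_logits, [0.1, 0.9])
```

The penalty is zero when the average prediction already matches the target distribution and grows as the two diverge, which is how a rule can constrain the class distribution without assigning a hard label to any single instance.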

Files

Nessy_A_Neuro_Symbolic_System_... (pdf)

(pdf | 2.83 Mb)

Embargo expired on 24-07-2023

License info not available