A Human-Machine Approach to Preserve Privacy in Image Analysis Crowdsourcing Tasks

Master Thesis (2019)
Author(s)

Sharad Shriram (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Alessandro Bozzon – Mentor (TU Delft - Web Information Systems)

A. Mauri – Mentor (TU Delft - Web Information Systems)

G.J. Houben – Graduation committee member (TU Delft - Web Information Systems)

Mauricio Aniche – Graduation committee member (TU Delft - Software Engineering)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2019 Sharad Shriram
Publication Year
2019
Language
English
Graduation Date
19-08-2019
Awarding Institution
Delft University of Technology
Programme
Computer Science
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Modern web information systems use machine learning models to provide personalized user services and experiences. However, machine learning models require annotated data for training, and this data is created through crowdsourcing tasks. The content used in annotation crowdsourcing tasks, such as medical records and images, might contain private information that can directly or indirectly identify an individual. Name, age, ethnicity, gender, and contact details are examples of private information that directly identifies an individual. Indirect private information relates to an individual's cultural, economic, and social factors. For instance, visual cues such as religious objects or symbols can point to an individual's religious beliefs. In this thesis, we study how to minimize the amount of private information extracted from images using a hybrid algorithm that combines machine learning models and crowdsourcing. We also demonstrate that the proposed hybrid algorithm reduces both the amount of private information exposed from an image and the cost of using the crowd to detect private information in the image.

Files

Sharad_shriram_thesis.pdf
(pdf | 5.82 Mb)
License info not available