L. Du
To build applications based on appearance-based gaze estimation, developers can choose among three paradigms. One paradigm is to train gaze estimation models themselves, which allows developers to customize models to meet various application requirements. Another option is to adopt pre-trained gaze estimation models, which avoids the resource-intensive process of model training. The third paradigm is to call gaze estimation services running in the cloud, which are well-suited for developers who wish to reduce the resource consumption of model deployment. In this case, the full-face images of users are sent to the service provider, which returns estimated gaze directions.
Although these paradigms offer developers flexible options for building applications, each comes with distinct challenges that hinder widespread adoption. Training an accurate gaze estimation model requires large-scale gaze datasets and complex neural networks. The former are scarce and difficult to collect, while the latter demand substantial computational resources. Adopting pre-trained models removes the resource burden of model training, but exposes gaze estimation systems to backdoor attacks, in which an adversary injects a backdoor into the pre-trained model and manipulates its output with a visual trigger after deployment. This compromises the security of many gaze-based applications, e.g., by causing a driving assistant system to fail to track the driver’s attention. Lastly, calling gaze estimation services raises severe privacy concerns, because these services often operate as black boxes, leaving users unaware of how their face images, which contain sensitive attributes, are processed or utilized.
Taking these paradigms together, we observe that they either require substantial resources for model training or raise trustworthiness concerns due to the involvement of third parties. This motivates the main research question of this dissertation: “How can we make gaze estimation systems both resource-efficient and trustworthy?” This dissertation answers this question by addressing the challenges associated with each paradigm.
To reduce the resource burden of self-trained models, we present a resource-efficient framework that includes frequency-domain gaze estimation and gaze-aware contrastive learning. Frequency-domain gaze estimation exploits the feature extraction capability and the spectral compaction property of the discrete cosine transform to substantially reduce the computational cost of gaze estimation models. Meanwhile, gaze-aware contrastive learning enables learning gaze representations in an unsupervised manner to overcome the data labeling hurdle. We show that the proposed framework can achieve gaze estimation performance comparable to that of existing approaches that rely on a large-scale, well-labeled dataset, while speeding up inference by up to 1.67 times.
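As a rough illustration of the spectral compaction property mentioned above, the following Python sketch computes a 2D discrete cosine transform (DCT-II) of a smooth synthetic image and measures how much of its energy falls into a small block of low-frequency coefficients. This is not the dissertation's model: the helper names (`dct_matrix`, `dct2`, `low_frequency_energy_ratio`), the 32x32 Gaussian test image, and the 8x8 cut-off are all assumptions chosen only for illustration.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Return the n x n orthonormal DCT-II matrix."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] = np.sqrt(1.0 / n)     # rescale the DC row so the matrix is orthonormal
    return mat


def dct2(image: np.ndarray) -> np.ndarray:
    """2D DCT-II of a grayscale image, applied along rows and columns."""
    rows, cols = image.shape
    return dct_matrix(rows) @ image @ dct_matrix(cols).T


def low_frequency_energy_ratio(image: np.ndarray, keep: int) -> float:
    """Fraction of total energy held by the top-left keep x keep coefficients."""
    coeffs = dct2(image)
    total = np.sum(coeffs ** 2)
    low = np.sum(coeffs[:keep, :keep] ** 2)
    return float(low / total)


# Hypothetical smooth 32x32 test image (a Gaussian blob), standing in for a
# face crop; real images are less smooth, so their ratio will typically be lower.
y, x = np.mgrid[0:32, 0:32]
smooth = np.exp(-((x - 16) ** 2 + (y - 16) ** 2) / (2 * 8.0 ** 2))

# Share of the image energy captured by 64 of the 1024 coefficients.
ratio = low_frequency_energy_ratio(smooth, keep=8)
print(f"energy in 8x8 low-frequency block: {ratio:.4f}")
```

Because most of the energy of a smooth image concentrates in a few low-frequency coefficients, a model that consumes only that block sees a much smaller input, which is one plausible way a frequency-domain pipeline can cut computation. How the dissertation actually selects and uses coefficients is not stated in this abstract.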
For pre-trained gaze estimation models, we explore solutions to defend against backdoor attacks. We identify the key characteristics that distinguish backdoored gaze estimation models from benign ones, and based on them we propose a novel approach to reverse-engineer the backdoor trigger that leads to the identified characteristics. Given a pre-trained model, we use the reverse-engineered trigger to determine whether it is backdoored. If the model is identified as compromised, we further use the reverse-engineered trigger to mitigate its backdoor behavior. We show that the proposed method can defend against various backdoor attacks.
To address privacy concerns in gaze estimation services, we develop a privacy preserver that converts privacy-sensitive full-face images into obfuscated images. The obfuscated versions are then shared with the service provider for gaze estimation. The privacy preserver is designed to generate obfuscated images that exhibit the same facial appearance for different users, which protects user privacy, while preserving the gaze features of the raw images so that they remain effective for accurate gaze estimation. Our experiments show that obfuscated images can effectively protect user privacy while yielding gaze estimation performance comparable to that of the original images.
Overall, this dissertation contributes to the development of resource-efficient and trustworthy gaze estimation systems. We enhance the resource efficiency of using self-trained models, which typically demand substantial resources, while improving the trustworthiness of the other two paradigms, where the resource burden is offloaded to external parties through the use of pre-trained models or vendor-provided services.
Through the Eyes of Emotion
A Multi-faceted Eye Tracking Dataset for Emotion Recognition in Virtual Reality
PrivateGaze
Preserving User Privacy in Black-box Mobile Gaze Tracking Services
Eye gaze contains rich information about human attention and cognitive processes. This makes the underlying technology, known as gaze tracking, a critical enabler for many ubiquitous applications and has triggered the development of easy-to-use gaze estimation services. Indeed, by utilizing the ubiquitous cameras on tablets and smartphones, users can readily access many gaze estimation services. To use these services, users must provide their full-face images to the gaze estimator, which is often a black box. This poses significant privacy threats to users, especially when a malicious service provider gathers a large collection of face images to classify sensitive user attributes. In this work, we present PrivateGaze, the first approach that can effectively preserve users’ privacy in black-box gaze tracking services without compromising gaze estimation performance. Specifically, we propose a novel framework to train a privacy preserver that converts full-face images into obfuscated counterparts, which are effective for gaze estimation while containing no privacy information. Evaluation on four datasets shows that the obfuscated images can protect users’ private information, such as identity and gender, against unauthorized attribute classification. Meanwhile, when used directly as inputs by the black-box gaze estimator, the obfuscated images lead to tracking performance comparable to that of conventional, unprotected full-face images.