ORCID
https://orcid.org/0009-0008-1436-1639
Date of Award
Fall 2024
Language
English
Embargo Period
11-26-2024
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
College/School/Department
Department of Computer Science
Program
Computer Science
First Advisor
Pradeep Atrey
Committee Members
Vivek Singh, Ming-Ching Chang, Chinwe Ekenna
Keywords
Computer Vision, Multimedia, AI, ML, Bias, Gaze Uniformity
Subject Categories
Artificial Intelligence and Robotics | Graphics and Human Computer Interfaces | Other Computer Sciences
Abstract
More than 5 billion photos are captured every day, with smartphones generating over 94% of these images. Despite advances in technology, capturing aesthetically pleasing group photos remains challenging, especially when it comes to aligning the direction of everyone’s gaze. Current methods focus on facial features and often fail to ensure consistent gaze direction. The iPhone's Live mode, which captures a 1.5-second video snippet along with still images, adds a further difficulty: selecting the best key photo is subjective, and publicly available data is lacking, especially during the pandemic.
To address these issues, this thesis outlines three primary goals. The first is to detect and improve the aesthetic quality of instant and live group images by enhancing gaze uniformity. The second is to investigate and reduce biases in the gaze uniformity algorithm to ensure fairness across different demographic groups. The third is to create a diverse Live Photos dataset, containing images captured with various cameras and in different settings, to support future research efforts.
To accomplish these goals, the thesis makes several key contributions. It identifies gaze uniformity as a crucial aspect of group photo aesthetics and introduces a novel method for assessing gaze uniformity, an approach not previously explored in the literature. Additionally, it highlights that Apple’s proprietary algorithm overlooks gaze uniformity when selecting representative frames for Live Photos. A method for determining a Gaze-Aware Representative Group Image (GARGI) is proposed, along with a user-friendly iOS application that assesses gaze uniformity and categorizes photos as GOOD, BAD, or OK, thus enhancing group photo quality in both instant and live modes. Furthermore, the thesis conducts an audit of gaze uniformity detection algorithms to evaluate fairness concerning gender and presents a multi-stage framework to address identified biases. Finally, it compiles a unique dataset of Live Photos, called LivePics-24, to fill a significant gap in available resources by including diverse groups and settings.
Through these contributions, the thesis aims to improve the user experience in smartphone photography while also addressing the societal implications of algorithmic biases.
License
This work is licensed under the University at Albany Standard Author Agreement.
Recommended Citation
Kulkarni, Omkar, "BIAS-AWARE GAZE UNIFORMITY ASSESSMENT IN GROUP IMAGES" (2024). Electronic Theses & Dissertations (2024 - present). 70.
https://scholarsarchive.library.albany.edu/etd/70
Included in
Artificial Intelligence and Robotics Commons, Graphics and Human Computer Interfaces Commons, Other Computer Sciences Commons