Electronic Theses and Dissertations



Document Type


Degree Name

Master of Science


Computer Science

Committee Chair

Deepak Venugopal

Committee Member

Dr. Nirman Kumar

Committee Member

Dr. Xiaofei Zhang


Despite Generative AI’s rapid growth (e.g., ChatGPT, GPT-4, DALL-E 2), data generated by these models may carry an inherent bias. This bias can propagate to downstream tasks, e.g., classification and data augmentation, that consume data from generative AI models. This thesis empirically evaluates model bias in deep generative models such as the Variational Autoencoder and PixelCNN++. Further, we resample generated data using importance sampling to reduce bias in the generated images, based on a recently proposed method for bias reduction using probabilistic classifiers. The approach is developed in the context of image generation, and we demonstrate that importance sampling can produce higher-quality samples with lower bias. Next, we improve downstream classification by developing a semi-supervised learning pipeline in which importance-sampled data serve as unlabeled examples for a classifier. Specifically, we use a loss function called the semantic loss, which was proposed to impose constraints on unlabeled data and thereby improve classification performance with limited labeled examples. By using importance-sampled images, we effectively place constraints on the data instances that are most informative for the classifier, so the classifier learns a better decision boundary from fewer labeled examples.
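The two key ingredients of the pipeline described above can be sketched in a few lines. The sketch below is illustrative only and assumes details not stated in the abstract: it uses the standard classifier-based density-ratio weight p(real|x) / (1 - p(real|x)) for importance resampling, and the "exactly-one-class" form of the semantic loss from the semi-supervised learning literature; the function names and the synthetic classifier scores are hypothetical.

```python
import numpy as np

def importance_resample(samples, p_real, size, rng):
    """Resample generated data using classifier-based importance weights.

    p_real holds a probabilistic classifier's estimate P(real | x) for each
    generated sample; the ratio p / (1 - p) approximates p_data / p_model,
    so resampling with these weights shifts the batch toward the data
    distribution and reduces generator bias.
    """
    w = p_real / (1.0 - p_real)       # density-ratio importance weights
    w = w / w.sum()                   # normalize into a distribution
    idx = rng.choice(len(samples), size=size, replace=True, p=w)
    return samples[idx]

def semantic_loss(p):
    """Semantic loss for the exactly-one-class constraint.

    p is an (n, k) array of predicted class probabilities for unlabeled
    examples. The loss is -log of the probability that exactly one class
    is "on": sum_i p_i * prod_{j != i} (1 - p_j). Confident (near one-hot)
    predictions satisfy the constraint and incur a lower loss.
    """
    total = np.prod(1.0 - p, axis=1, keepdims=True)  # prod over all j
    wmc = np.sum(p * total / (1.0 - p), axis=1)      # divide out the i-th term
    return float(-np.log(wmc).mean())

# Toy demo: 1000 fake "generated" samples with synthetic classifier scores.
rng = np.random.default_rng(0)
samples = rng.normal(size=(1000, 2))
p_real = rng.uniform(0.05, 0.95, size=1000)
resampled = importance_resample(samples, p_real, size=500, rng=rng)
```

In the full pipeline, `resampled` would stand in for the unlabeled pool, and `semantic_loss` would be added to the supervised loss on the few labeled examples; because the resampled instances better reflect the data distribution, the constraint is applied where it is most informative.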


Data is provided by the student.

Library Comment

Dissertation or thesis originally submitted to ProQuest


Open Access