Electronic Theses and Dissertations
Date
2024
Document Type
Dissertation
Degree Name
Doctor of Philosophy
Department
Computer Science
Committee Chair
Xiaofei Zhang
Committee Member
Vasile Rus
Committee Member
Xiaolei Huang
Committee Member
Deepak Venugopal
Abstract
A common limitation of optimizing a single objective is that the search can be misled and become trapped in a local optimum. Quality-Diversity (QD) algorithms overcome this limitation by seeking a population of high-quality and diverse solutions to a problem. Most conventional QD approaches, such as MAP-Elites, rely on a behavior archive in which solutions are categorized into predefined niches. While promising, these approaches require assumptions about the set of behaviors and about the metrics that define the distance between behaviors, and they often constrain the learned behaviors to a fixed set of bins. In this work, we begin by proposing an alternative to archive-based QD algorithms called Diverse Quality Species (DQS), which breaks solutions down into independently evolving species and employs unsupervised skill discovery to learn diverse, high-performing solutions without the need for an archive or predefined ranges of behaviors. We achieve this using gradient-based mutations that jointly maximize a mutual information objective and reward. We evaluate DQS on several simulated robot environments and demonstrate that it can learn a diverse set of solutions from varying species. Furthermore, we propose a novel unsupervised skill discovery algorithm called SSD, which trains a speciated population of skill-conditioned policies. SSD maximizes the mutual information between states and skills, and between states and species-given skills. To achieve this, we employ a contrastive learning framework to minimize the conditional entropy between states and skill-species pairs, facilitating the learning of controllable latent behaviors. Moreover, we utilize a particle-based entropy estimator to maximize state entropy, thereby promoting state space exploration. Lastly, we combine DQS with SSD and show how the combination uncovers a range of innovative latent behaviors, surpassing the performance of prior methods.
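The abstract does not say which particle-based entropy estimator is used. As a minimal sketch, assuming a k-nearest-neighbor estimator of the kind common in exploration methods, the quantity being maximized can be approximated (up to additive and multiplicative constants) by the average log distance from each visited state to its k-th nearest neighbor. The function name `knn_entropy_proxy`, the parameter `k`, and the small constant added before the logarithm are illustrative assumptions, not details taken from the dissertation.

```python
import numpy as np


def knn_entropy_proxy(states, k=3):
    """Average log distance to the k-th nearest neighbor.

    Hypothetical helper: a proxy for state entropy up to constants,
    not the estimator used in the dissertation.
    """
    states = np.asarray(states, dtype=float)
    n = states.shape[0]
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of states")
    # Pairwise Euclidean distances between all states, shape (n, n).
    diffs = states[:, None, :] - states[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # After sorting each row, column 0 is the zero self-distance,
    # so column k holds the distance to the k-th nearest neighbor.
    kth = np.sort(dists, axis=1)[:, k]
    # Small constant (an assumption) avoids log(0) for duplicate states.
    return float(np.mean(np.log(kth + 1e-8)))


rng = np.random.default_rng(0)
clustered = rng.normal(scale=0.1, size=(200, 2))
spread = clustered * 10.0  # same layout, ten times wider
print(knn_entropy_proxy(spread) > knn_entropy_proxy(clustered))
```

Under this sketch, states that cover more of the space score higher, which is the sense in which maximizing such an estimate encourages exploration.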
Library Comment
Dissertation or thesis originally submitted to ProQuest.
Notes
Open Access
Recommended Citation
Wickman, Ryan, "Archive-free Quality-Diversity Optimization through Diverse Quality Species" (2024). Electronic Theses and Dissertations. 3602.
https://digitalcommons.memphis.edu/etd/3602
Comments
Data is provided by the student.