Electronic Theses and Dissertations

Author

Pulin Agrawal

Date

2019

Document Type

Dissertation

Degree Name

Doctor of Philosophy

Department

Computer Science

Committee Chair

Vasile Rus

Committee Member

Bernie Daigle

Committee Member

Stan Franklin

Committee Member

Deepak Venugopal

Abstract

In this dissertation I explore the properties and uses of sparse representations. Sparse representations use high-dimensional binary vectors to represent information. Several of their properties make them useful for pattern recognition in highly noisy and complex environments. Sparse representations have a very high capacity: a typical sparse representation has a capacity of 10^84 distinct vectors, which is more than the number of atoms in the universe. They are also highly robust to noise and can tolerate up to 50% noise. A particularly powerful property is that the similarity between two things can be measured by directly comparing their representations. These properties give sparse representations applications in a variety of fields, such as Artificial Intelligence and Molecular Biology, that need to encode information that is complex and noisy in nature. In this dissertation, I show how sparse representations can be used to represent complex environments for an agent based on the Learning Intelligent Decision Agent (LIDA) model. Sparse representations allowed us to achieve a two-fold goal: producing information-rich representations of things in the environment, and proposing a method of generating grounded representations for the LIDA model. They also allowed us to ground the representations used by LIDA in the sensory apparatus of the agent while still allowing perfect-fidelity communication between the sensory memory of LIDA and the rest of the model. I also show how sparse representations are useful in Molecular Biology for discovering data-driven patterns in heterogeneous and noisy gene expression data. We used a sparse auto-encoder to learn sparse representations of transcriptomics experiments taken from a large publicly available dataset. These representations were then used to identify biological patterns in the form of gene sets. The representation provided a unique signature for each set of samples originating from the same experimental condition. Applications of our method include the identification of previously undiscovered gene sets as well as supervised classification of samples from different biological classes. Overall, our results show that sparse representations are useful in a variety of fields that involve finding patterns in complex and noisy environments.
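
The capacity, noise-robustness, and similarity claims in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the vector dimensions, so the values below (2048 bits with 40 active) are an assumption chosen because they give a capacity on the order of 10^84. The helper names and the noise model (moving a fraction of the active bits to random inactive positions) are likewise hypothetical and are not taken from the dissertation.

```python
import math
import random

# Assumed dimensions; the abstract does not give them.
N_BITS = 2048    # total number of bits in the vector
N_ACTIVE = 40    # number of bits set to 1


def capacity(n, w):
    """Number of distinct binary vectors of length n with exactly w active bits."""
    return math.comb(n, w)


def random_sparse_vector(rng, n=N_BITS, w=N_ACTIVE):
    """A sparse binary vector, stored as the set of its active bit indices."""
    return frozenset(rng.sample(range(n), w))


def overlap(a, b):
    """Similarity measure: the number of active bits two vectors share."""
    return len(a & b)


def add_noise(vector, fraction, rng, n=N_BITS):
    """Hypothetical noise model: move a fraction of active bits elsewhere."""
    k = int(len(vector) * fraction)
    dropped = set(rng.sample(sorted(vector), k))
    inactive = [i for i in range(n) if i not in vector]
    added = set(rng.sample(inactive, k))
    return frozenset((vector - dropped) | added)


if __name__ == "__main__":
    rng = random.Random(0)
    original = random_sparse_vector(rng)
    noisy = add_noise(original, 0.5, rng)
    unrelated = random_sparse_vector(rng)

    print("capacity has", len(str(capacity(N_BITS, N_ACTIVE))), "decimal digits")
    print("overlap with 50%-noisy copy:", overlap(original, noisy))
    print("overlap with unrelated vector:", overlap(original, unrelated))
```

Under these assumptions, a copy with half of its active bits moved still shares half of them with the original, while two unrelated random vectors are expected to share fewer than one active bit on average (40 × 40 / 2048), which is why a simple overlap threshold can separate the two cases.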

Comments

Data is provided by the student.

Library Comment

Dissertation or thesis originally submitted to ProQuest

Notes

embargoed
