Electronic Theses and Dissertations

THE PERCEPTION-ACTION LOOP IN ATTENTION-BASED PREDICTIVE AGENTS: APPLICATION TO MULTIMODAL DATA GENERATION AND RECOGNITION

Murchana Baruah

Date

2022

Document Type

Dissertation

Degree Name

Doctor of Philosophy

Department

Electrical & Computer Engineering

Committee Chair

Bonny Banerjee

Committee Member

Madhusudhanan Balasubramanian

Committee Member

Deepak Venugopal

Committee Member

Aaron L Robinson

Abstract

With the proliferation of soft and hard sensors, data in multiple sensor modalities has become commonplace. In this dissertation, we propose a general-purpose agent model that operates using a closed perception-action loop. The agent actively and sequentially samples its environment, driven by sensory prediction error. It learns where and what to sample by minimizing this prediction error, without any reinforcement. This end-to-end model is evaluated on three applications: (1) generation and recognition of handwritten numerals and alphabets from images and videos, (2) generation and recognition of human-human interactions from videos, and (3) recognition of emotions from speech via generation. For each application, the model yields state-of-the-art accuracy on benchmark datasets while remaining efficient in both sample size and model size. To validate our model against human performance, we collect mouse-click attention tracking (mcAT) data from 382 participants trying to recognize handwritten numerals and alphabets (upper and lowercase) from images via sequential sampling. Images from benchmark datasets are presented as stimuli. The collected data consists of a sequence of sample (click) locations, the predicted class label(s) at each sampling, and the duration of each sampling. We show that, on average, participants observe only 12.8% of an image for recognition. When exposed to the same stimuli and experimental conditions as the participants, our agent model performs handwritten numeral/alphabet recognition more efficiently than both the participants and a highly cited attention-based reinforcement model.

Comments

Data is provided by the student.

Library Comment

Dissertation or thesis originally submitted to ProQuest.

Notes

Embargoed until 4/12/2024

Recommended Citation

Baruah, Murchana, "THE PERCEPTION-ACTION LOOP IN ATTENTION-BASED PREDICTIVE AGENTS: APPLICATION TO MULTIMODAL DATA GENERATION AND RECOGNITION" (2022). Electronic Theses and Dissertations. 3394.
https://digitalcommons.memphis.edu/etd/3394
