Electronic Theses and Dissertations

Identifier

6741

Date

2021

Document Type

Thesis

Degree Name

Master of Science

Major

Computer Science

Committee Chair

Lan Wang

Committee Member

Eddie Jacobs

Committee Member

Weizi Li

Abstract

An accurate model of building interiors with detailed annotations is critical to protecting first responders and building occupants during emergencies. Both groups can use these 3D building models to navigate indoor environments or to evacuate the building safely. Light Detection and Ranging (LiDAR) is a commonly used remote sensing method that uses laser light to create a 3D map. However, it provides a low-resolution point cloud, which makes it difficult for first responders to identify objects of interest directly in the point cloud. More specifically, small safety objects do not have a clear presence in the point cloud, and some safety objects are distinguishable only by color. In this project, we locate these objects and create detailed annotations by applying instance segmentation to RGB images of buildings instead of segmenting the 3D point clouds directly. Object detection and segmentation have been researched extensively, but the segmentation of public safety objects in indoor scenes has not been widely studied. This task can be challenging because of insufficient natural light, irregular ambient light, and the lack of a relevant training dataset. Our research creates a labeling system for the environments inside and adjacent to buildings. First, we collected 360-degree panoramic videos and sampled them into equirectangular projection image frames. Next, we created a manually labeled equirectangular image dataset of indoor public safety objects and used it to train machine learning models. Finally, we used the trained models to locate and classify those objects in the equirectangular video frames. Our results show that the deep neural network Mask R-CNN with classification architectures such as Inception-ResNet-V2 and ResNet-101 performs well in labeling public safety objects in our image dataset, especially large objects.
We report the results on the test dataset and compare them with other similar models. Our results are encouraging but not conclusive. We experimented with different projections and found that the equirectangular projection works better than the others. We trained our model with and without transfer learning and found that transfer learning from the MS COCO dataset works well. We also observed that, with transfer learning, adding just a few hundred labeled images from a building to the training dataset significantly improves a model’s performance. We also tried hard-negative mining when training the model and observed a considerable performance improvement.
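
Because the objects are located in equirectangular frames, a detection's position in the image corresponds to a viewing direction around the 360-degree camera. The sketch below illustrates that standard equirectangular relationship only; it is not taken from the thesis. The function names, the axis convention (longitude 0 at the image center, y pointing up), and the box format (x_min, y_min, x_max, y_max) are assumptions made for illustration.

```python
import math


def pixel_to_angles(u, v, width, height):
    """Map continuous image coordinates (u, v) to (longitude, latitude) in degrees.

    Assumed convention: u runs 0..width from longitude -180 to +180,
    and v runs 0..height from latitude +90 (top) to -90 (bottom).
    """
    lon = (u / width) * 360.0 - 180.0
    lat = 90.0 - (v / height) * 180.0
    return lon, lat


def angles_to_unit_vector(lon_deg, lat_deg):
    """Convert (longitude, latitude) in degrees to a unit direction vector.

    Assumed axes: z points toward the image center, x to the right, y up.
    """
    lon = math.radians(lon_deg)
    lat = math.radians(lat_deg)
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return x, y, z


def box_center_direction(box, width, height):
    """Return the viewing direction of the center of a bounding box.

    `box` is a hypothetical (x_min, y_min, x_max, y_max) tuple in pixels.
    """
    x_min, y_min, x_max, y_max = box
    u = (x_min + x_max) / 2.0
    v = (y_min + y_max) / 2.0
    lon, lat = pixel_to_angles(u, v, width, height)
    return angles_to_unit_vector(lon, lat)
```

For example, `box_center_direction((900, 450, 1100, 550), 2000, 1000)` gives the direction of a box centered in a 2000x1000 frame, which under the assumed convention points straight ahead along the z axis.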

Comments

Data is provided by the student.

Library Comment

Dissertation or thesis originally submitted to the local University of Memphis Electronic Theses & Dissertations (ETD) Repository.
