Electronic Theses and Dissertations
Identifier
6741
Date
2021
Document Type
Thesis
Degree Name
Master of Science
Major
Computer Science
Committee Chair
Lan Wang
Committee Member
Eddie Jacobs
Committee Member
Weizi Li
Abstract
An accurate model of building interiors with detailed annotations is critical to protecting first responders and building occupants during emergencies. Both groups can use such 3D building models to navigate indoor environments or to evacuate the building safely. Light Detection and Ranging (LiDAR) is a commonly used remote sensing method that uses laser light to create a 3D map. However, it produces a low-resolution point cloud, which makes it difficult for first responders to identify objects of interest directly in the point cloud. More specifically, small safety objects do not have a clear presence in the point cloud, and some safety objects are distinguishable only by color. In this project, we apply instance segmentation to RGB images of buildings, instead of segmenting the 3D point clouds directly, to locate these objects and create detailed annotations. Object detection and segmentation have been researched extensively, but the segmentation of public safety objects in indoor scenes has not been widely studied. This task can be challenging due to insufficient natural light, irregular ambient light, and the lack of a relevant training dataset. Our research creates a labeling system for the environments inside and adjacent to buildings. First, we collected 360-degree panoramic videos and sampled them into equirectangular-projection image frames. Next, we created a manually labeled equirectangular image dataset of indoor public safety objects and used it to train machine learning models. Finally, we used the trained models to locate and classify those objects in the equirectangular video frames. Our results show that the deep neural network Mask R-CNN with a classification backbone such as Inception-ResNet-V2 or ResNet-101 performs well in labeling public safety objects in our image dataset, especially for large objects.
We report results on the test dataset and compare them with those of other similar models. Our results are encouraging but not conclusive. We experimented with different projections and found that the equirectangular projection works better than the others we tried. We trained our model with and without transfer learning and found that transfer learning from the MS COCO dataset works well. We also observed that, with transfer learning, adding just a few hundred labeled images from a building to the training dataset significantly improves a model’s performance. We also tried hard-negative mining during training and observed a considerable performance improvement.
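For readers unfamiliar with the equirectangular projection mentioned in the abstract, the following is a minimal sketch of how a pixel in such a frame corresponds to a viewing direction on the sphere. It is not taken from the thesis. The function name `pixel_to_direction` and the axis convention (longitude zero at the image center, y pointing up, z pointing forward) are assumptions chosen for illustration only.

```python
import math


def pixel_to_direction(u, v, width, height):
    """Map pixel (u, v) of an equirectangular image to a viewing direction.

    Assumed convention (illustrative, not from the thesis):
    - u grows to the right, v grows downward, pixel centers sit at +0.5
    - longitude 0 is the horizontal center of the image
    - y is up, z is forward, x is to the right
    """
    # Horizontal position maps linearly to longitude in [-pi, pi).
    lon = ((u + 0.5) / width - 0.5) * 2.0 * math.pi
    # Vertical position maps linearly to latitude, +pi/2 at the top row.
    lat = (0.5 - (v + 0.5) / height) * math.pi
    # Convert the spherical angles to a unit vector.
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return lon, lat, (x, y, z)
```

Because both angles are linear in the pixel coordinates, the centroid of a segmented object in a frame can be converted to a direction from the camera in this way, which is one plausible step toward placing an annotation in a 3D map.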
Library Comment
Dissertation or thesis originally submitted to the local University of Memphis Electronic Theses & Dissertations (ETD) Repository.
Recommended Citation
Hossain, Mazharul, "Instance Segmentation of Public Safety Objects in RGB Image from Indoor Scene to Build Rich Interior Hazard Maps" (2021). Electronic Theses and Dissertations. 2200.
https://digitalcommons.memphis.edu/etd/2200
Comments
Data is provided by the student.