Paper title: "FloodNet: A High Resolution Aerial Imagery Dataset for Post Flood Scene Understanding"
Citation: Maryam Rahnemoonfar, Tashnim Chowdhury, Argho Sarkar, Debvrat Varshney, Masoud Yari and Robin Murphy. "FloodNet: A High Resolution Aerial Imagery Dataset for Post Flood Scene Understanding". arXiv preprint, arXiv:2012.02951, 2020.
Visual scene understanding is the core task in making any crucial decision in any computer vision system. Although popular computer vision datasets like Cityscapes, MS-COCO, PASCAL provide good benchmarks for several tasks (e.g. image classification, segmentation, object detection), these datasets are hardly suitable for post disaster damage assessments. On the other hand, existing natural disaster datasets include mainly satellite imagery which have low spatial resolution and a high revisit period. Therefore, they do not have a scope to provide quick and efficient damage assessment tasks. Unmanned Aerial Vehicle (UAV) can effortlessly access difficult places during any disaster and collect high resolution imagery that is required for aforementioned tasks of computer vision. To address these issues we present a high resolution UAV imagery, FloodNet, captured after the hurricane Harvey. This dataset demonstrates the post flooded damages of the affected areas. The images are labeled pixel-wise for semantic segmentation task and questions are produced for the task of visual question answering. FloodNet poses several challenges including detection of flooded roads and buildings and distinguishing between natural water and flooded water. With the advancement of deep learning algorithms, we can analyze the impact of any disaster which can make a precise understanding of the affected areas. In this paper, we compare and contrast the performances of baseline methods for image classification, semantic segmentation, and visual question answering on our dataset.
First, we introduce FloodNet, a high resolution UAV imagery dataset for post disaster damage assessment.
Second, we compare the performance of several classification, semantic segmentation, and visual question answering methods on our dataset.
To the best of our knowledge, this is the first VQA work focused on UAV imagery for any disaster damage assessment.
1. Related Work
2. The FloodNet Dataset
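The images were captured with a DJI Mavic Pro drone after Hurricane Harvey, which made landfall near Texas and Louisiana in August 2017. Data collection took place between August 30 and September 4, 2017, in Fort Bend County, Texas. The dataset is unique in two respects. The first is fidelity: the UAV imagery was taken by emergency responders during the response itself, right after the disaster was reported, so it reflects real conditions on the ground. The second is that it is the only known UAV imagery dataset for disasters. Other aerial and remote sensing disaster datasets exist, but their platforms fly above 400 feet, whereas the UAV flights in this work stayed below 200 feet.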
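2.1 Annotation Tasks
This part covers the labels for image classification and for semantic segmentation. There are 3,200 images in total, annotated with nine classes: building-flooded, building-non-flooded, road-flooded, road-non-flooded, water, tree, vehicle, pool, and grass. A building counts as flooded when at least one of its sides touches floodwater. A separate water class is defined to distinguish natural water bodies (such as lakes and rivers) from floodwater. An image is labeled "flooded" if floodwater covers 30% of its area. The number of images and instances for each class is shown in the table below:
The annotation workload is substantial: labeling one image takes about an hour on average. To ensure annotation quality, we carried out a two-level review. The images were annotated on the V7 Darwin platform; 70% of the data is used for training and the remaining 30% for validation and testing.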
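The image-level "flooded" rule and the 70/30 split described in this subsection can be sketched in a few lines of Python. This is only an illustration under several assumptions that are not from the paper: the numeric class IDs, the choice of which classes count as flood-covered area, treating exactly 30% as flooded, and dividing the held-out 30% evenly between validation and test.

```python
import random
from typing import Iterable, List, Sequence, Set, Tuple

# Hypothetical class IDs: the notes give the nine class names but not their
# numeric encoding, so this mapping is an assumption for illustration only.
CLASS_IDS = {
    "building-flooded": 1,
    "building-non-flooded": 2,
    "road-flooded": 3,
    "road-non-flooded": 4,
    "water": 5,
    "tree": 6,
    "vehicle": 7,
    "pool": 8,
    "grass": 9,
}

# Assumption: pixels of these classes are counted as flood-covered area.
FLOODED_IDS: Set[int] = {CLASS_IDS["building-flooded"], CLASS_IDS["road-flooded"]}


def flooded_fraction(mask: Sequence[Sequence[int]],
                     flooded_ids: Set[int] = FLOODED_IDS) -> float:
    """Return the share of pixels in a 2D label mask that belong to flooded classes."""
    total = 0
    flooded = 0
    for row in mask:
        for pixel in row:
            total += 1
            if pixel in flooded_ids:
                flooded += 1
    return flooded / total if total else 0.0


def image_label(mask: Sequence[Sequence[int]], threshold: float = 0.30) -> str:
    """Label an image "flooded" once the flooded share reaches the threshold."""
    return "flooded" if flooded_fraction(mask) >= threshold else "non-flooded"


def split_dataset(items: Iterable[str], train_frac: float = 0.70,
                  seed: int = 0) -> Tuple[List[str], List[str], List[str]]:
    """Shuffle deterministically, then split into train / validation / test.

    The even validation/test division of the held-out part is an assumption.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    held_out = shuffled[n_train:]
    n_val = len(held_out) // 2
    return shuffled[:n_train], held_out[:n_val], held_out[n_val:]
```

For example, `image_label(mask)` can be applied to each segmentation mask to derive a classification label, and `split_dataset(image_names)` gives a reproducible split because the shuffle uses a fixed seed.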
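2.2 VQA Task
Questions fall into three types: Simple Counting, Complex Counting, and Condition Recognition. Every question is phrased in one of only three ways, starting with how, what, or is, and the maximum question length is 11. A Simple Counting question just asks how many instances of an object appear in the image. A Complex Counting question asks for the number of objects that have a particular attribute. Condition Recognition questions come in three subtypes: questions about the road condition ("What is the condition of the road"), questions about the image as a whole ("What is the overall condition of the entire image"), and yes/no questions. The statistics for all questions are shown in the figure below:
3. Experiments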
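To make the question taxonomy concrete, here is a minimal Python sketch that checks a question against the stated constraints and guesses its type. It is a heuristic for illustration, not the authors' procedure. It assumes that the length limit of 11 is counted in words, that counting questions start with "how many", and that an attribute is signaled by the words "flooded" or "non-flooded".

```python
from typing import List, Optional

ALLOWED_FIRST_WORDS = {"how", "what", "is"}
MAX_QUESTION_WORDS = 11  # assumption: the limit of 11 is measured in words
ATTRIBUTE_WORDS = {"flooded", "non-flooded"}  # assumption: attribute cues


def tokenize(question: str) -> List[str]:
    """Lowercase the question, drop a trailing question mark, split on spaces."""
    return question.lower().rstrip("?").split()


def is_valid_question(question: str) -> bool:
    """Check the first-word and maximum-length constraints."""
    words = tokenize(question)
    return (bool(words)
            and words[0] in ALLOWED_FIRST_WORDS
            and len(words) <= MAX_QUESTION_WORDS)


def question_type(question: str) -> Optional[str]:
    """Guess the question type; return None if the constraints are violated."""
    if not is_valid_question(question):
        return None
    words = tokenize(question)
    if words[:2] == ["how", "many"]:
        # Counting restricted to objects with an attribute is "complex".
        if ATTRIBUTE_WORDS & set(words):
            return "Complex Counting"
        return "Simple Counting"
    # Remaining "what" and "is" questions ask about a condition.
    return "Condition Recognition"
```

Under these assumptions, `question_type("What is the condition of the road")` yields "Condition Recognition", while a hypothetical question such as "How many flooded buildings are in this image?" is classed as "Complex Counting".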