Deep Learning Course 4, Week 3: Detection Algorithms Quiz Notes

Detection Algorithms

You are building a 3-class object classification and localization algorithm. The classes are: pedestrian (c=1), car (c=2), motorcycle (c=3). What should y be for the image below? Remember that “?” means “don’t care”, which means that the neural network loss function won’t care what the neural network gives for that component of the output. Recall y = [p_c, b_x, b_y, b_h, b_w, c_1, c_2, c_3].

y=[1,?,?,?,?,1,?,?]
y=[1,0.66,0.5,0.75,0.16,1,0,0]
y=[1,0.66,0.5,0.16,0.75,1,0,0]
y=[1,0.66,0.5,0.75,0.16,0,0,0]
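The "don't care" rule can be illustrated with a small loss function. This is a minimal sketch, assuming the squared-error formulation and the component order [p_c, b_x, b_y, b_h, b_w, c_1, c_2, c_3]; the function name `localization_loss` and the sample vectors are hypothetical and are not taken from the quiz image.

```python
def localization_loss(y_true, y_pred):
    """Squared-error loss over [p_c, b_x, b_y, b_h, b_w, c_1, c_2, c_3].

    When no object is present (p_c == 0) only the p_c term is counted,
    so the "don't care" entries (stored here as None) never enter the loss.
    """
    if y_true[0] == 1:
        # Object present: every component contributes.
        return sum((p - t) ** 2 for p, t in zip(y_pred, y_true))
    # No object: only the objectness term matters.
    return (y_pred[0] - y_true[0]) ** 2


# Hypothetical label with no object: everything after p_c is "don't care".
y_no_object = [0, None, None, None, None, None, None, None]
y_hat = [0.1, 0.4, 0.6, 0.3, 0.2, 0.5, 0.3, 0.2]
loss_no_object = localization_loss(y_no_object, y_hat)

# Hypothetical label with an object: a perfect prediction gives zero loss.
y_object = [1, 0.5, 0.5, 0.2, 0.2, 0, 1, 0]
loss_perfect = localization_loss(y_object, list(y_object))
```

Whatever the network outputs for the box and class entries in the no-object case, `loss_no_object` depends only on the first component.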

You are working on a factory automation task. Your system will see a can of soft drink coming down a conveyor belt, and you want it to take a picture and decide whether (i) there is a soft-drink can in the image, and if so (ii) its bounding box. Since the soft-drink can is round, the bounding box is always square, and the can always appears the same size in the image. There is at most one soft-drink can in each image. Here are some typical images in your training set:

To solve this task it is necessary to divide the task into two: 1. Construct a system to detect if a can is present or not. 2. Construct a system that calculates the bounding box of the can when present. Which one of the following do you agree with the most?

We can approach the task as an image classification with a localization problem.
An end-to-end solution is always superior to a two-step system.
The two-step system is always a better option compared to an end-to-end solution.
We can’t solve the task as an image classification with a localization problem since all the bounding boxes have the same dimensions.

If you build a neural network that inputs a picture of a person’s face and outputs N landmarks on the face (assume the input image always contains exactly one face), how many output units will the network have?

2N
N
N^2
2
3N

When training one of the object detection systems described in the lectures, you need a training set that contains many pictures of the object(s) you wish to detect. However, bounding boxes do not need to be provided in the training set, since the algorithm can learn to detect the objects by itself.

False
True

Explanation: You need bounding boxes in the training set. Your loss function should try to match the predictions for the bounding boxes to the true bounding boxes from the training set.

What is the IoU between the red box and the blue box in the following figure? Assume that all the squares have the same measurements.

Explanation: IoU = intersection area / union area = (2 * 2) / (4 * 4 + 4 * 4 - 2 * 2) = 4 / 28 = 1 / 7
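The same arithmetic can be checked in code. This is a minimal sketch, assuming boxes are given as (x1, y1, x2, y2) corner coordinates; the two boxes below are a hypothetical layout chosen only to match the numbers in the explanation (two 4 x 4 boxes sharing a 2 x 2 region), since the original figure is not available.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero if the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


red_box = (0, 0, 4, 4)    # hypothetical 4 x 4 box
blue_box = (2, 2, 6, 6)   # hypothetical 4 x 4 box overlapping in a 2 x 2 region
result = iou(red_box, blue_box)  # 4 / 28
```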

Suppose you run non-max suppression on the predicted boxes below. The parameters you use for non-max suppression are that boxes with probability ≤ 0.4 are discarded, and the IoU threshold for deciding if two boxes overlap is 0.5. How many boxes will remain after non-max suppression?

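The procedure described in the question can be sketched as follows. This is a hedged illustration, not the course's reference implementation: it assumes a single class, boxes as (x1, y1, x2, y2), and that a box is suppressed when its IoU with an already kept box is at least the threshold. The boxes and scores are hypothetical, because the quiz figure is not available.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, score_threshold=0.4, iou_threshold=0.5):
    """Greedy single-class NMS; returns kept (score, box) pairs, best first."""
    # Step 1: discard boxes whose probability is <= score_threshold.
    candidates = [(s, b) for s, b in zip(scores, boxes) if s > score_threshold]
    # Step 2: visit the survivors from highest to lowest probability.
    candidates.sort(key=lambda sb: sb[0], reverse=True)
    kept = []
    for score, box in candidates:
        # Keep a box only if it does not overlap any box already kept.
        if all(iou(box, k) < iou_threshold for _, k in kept):
            kept.append((score, box))
    return kept


# Hypothetical predictions: two heavily overlapping boxes, one separate box,
# and one low-probability box.
boxes = [(0, 0, 4, 4), (0.5, 0, 4.5, 4), (10, 10, 14, 14), (20, 20, 24, 24)]
scores = [0.9, 0.6, 0.7, 0.3]
kept = non_max_suppression(boxes, scores)
```

In this made-up example the 0.3 box is discarded by the probability threshold and the 0.6 box is suppressed by the 0.9 box it overlaps, so two boxes remain. With several classes, the course runs this procedure once per class.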
If we use anchor boxes in YOLO we no longer need the coordinates of the bounding box since they are given by the cell position of the grid and the anchor box selection. True/False?

False
True

Explanation: The grid and anchor boxes improve the algorithm's ability to localize and detect objects, for example two different objects that overlap, but the bounding box coordinates are still part of the output.
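One way to see that the coordinates are still needed is to count the output units. The sketch below assumes the encoding used in the course, where each anchor box in each grid cell carries p_c, the four box coordinates, and one score per class; the helper name `yolo_output_shape` is hypothetical.

```python
def yolo_output_shape(grid_size, num_anchors, num_classes):
    """Shape of a YOLO-style output volume for a square grid."""
    # Each anchor box stores p_c, b_x, b_y, b_h, b_w plus the class scores,
    # so the bounding box coordinates are present for every anchor.
    per_anchor = 5 + num_classes
    return (grid_size, grid_size, num_anchors * per_anchor)


# A 3 x 3 grid with 2 anchor boxes and 3 classes.
shape = yolo_output_shape(grid_size=3, num_anchors=2, num_classes=3)
```

For a 3 x 3 grid, 2 anchors, and 3 classes this gives a 3 x 3 x 16 volume, 8 numbers per anchor.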

Semantic segmentation can only be applied to classify pixels of images in a binary way as 1 or 0, according to whether they belong to a certain class or not. True/False?

False
True

Explanation: The same ideas used for multi-class classification can be applied to semantic segmentation.
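As a small illustration of the multi-class case, each pixel can be assigned the class with the highest score, exactly as in ordinary multi-class classification. This is a sketch under assumed inputs: `scores[r][c]` is a hypothetical list of per-class scores for pixel (r, c), and the function name `segment` is made up for this example.

```python
def segment(scores):
    """Return a label map: the index of the highest-scoring class per pixel."""
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in scores
    ]


# Hypothetical 2 x 2 image with scores for 3 classes at every pixel.
scores = [
    [[0.8, 0.1, 0.1], [0.2, 0.3, 0.5]],
    [[0.1, 0.7, 0.2], [0.3, 0.4, 0.3]],
]
label_map = segment(scores)
```

The resulting label map contains values from 0 to 2 rather than only 0 and 1.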

Using the concept of Transpose Convolution, fill in the values of X, Y and Z below. (padding = 1, stride = 2)

Input: 2x2
Filter: 3x3
Result: 6x6

X = 3, Y = 0, Z = 4
X = 10, Y = 0, Z = 6
X = 4, Y = 3, Z = 2
X = 10, Y = 0, Z = 0
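The mechanics of a transpose convolution can be sketched in a few lines. This is an illustration under stated assumptions, not a reconstruction of the quiz figure: the input and filter values are hypothetical, and the function uses the common convention where each input value scales a copy of the filter, the copies are placed `stride` cells apart and summed where they overlap, and `padding` cells are then cropped from every side, giving an output of size (n - 1) * stride + f - 2 * padding. The 6x6 grid in the question is drawn differently, so the sizes here need not match it.

```python
def transpose_conv2d(x, w, stride=2, padding=1):
    """Transpose convolution of a square input x with a square filter w."""
    n, f = len(x), len(w)
    size = (n - 1) * stride + f          # canvas size before cropping
    canvas = [[0] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            # Paste x[i][j] * w with its top-left corner at (i*stride, j*stride);
            # overlapping contributions are summed.
            for a in range(f):
                for b in range(f):
                    canvas[i * stride + a][j * stride + b] += x[i][j] * w[a][b]
    end = size - padding
    # Crop the padding border from every side.
    return [row[padding:end] for row in canvas[padding:end]]


# Hypothetical 2 x 2 input and 3 x 3 filter.
x = [[2, 1],
     [3, 2]]
w = [[1, 2, 1],
     [2, 0, 1],
     [0, 2, 1]]
out = transpose_conv2d(x, w, stride=2, padding=1)
```

With these values the 5 x 5 canvas is cropped to a 3 x 3 output whose middle row is 10, 7, 6; the 7 in the center is the sum of contributions from all four filter copies.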

When using the U-Net architecture with an input h × w × c, where c denotes the number of channels, the output will always have the shape h × w × c. True/False?

False
True
