Data Augmentation

Author: Not bald programmer
Data augmentation is a simple technique for reducing overfitting. Deep learning needs large amounts of data, so when only a limited dataset is available, we can use augmentation to generate additional training examples from the data we already have.

For example, given a single photo, we can create new variants of it (flipped, rotated, shifted, and so on) using the Keras image generator, ImageDataGenerator. The model then sees more varied examples during training, which helps reduce overfitting.

Data augmentation artificially enlarges the training set by creating modified copies of existing data.
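
To illustrate the idea of "modified copies", here is a minimal NumPy sketch. It is not taken from the original article: the helper name `augment_image`, the toy random images, and the choice of transformations (flips, a 90-degree rotation, a brightness change) are all assumptions made for illustration.

```python
import numpy as np

def augment_image(image: np.ndarray) -> list:
    """Return modified copies of one H x W x C image (hypothetical helper)."""
    return [
        np.fliplr(image),                # horizontal flip
        np.flipud(image),                # vertical flip
        np.rot90(image, k=1),            # 90-degree counterclockwise rotation
        np.clip(image * 1.2, 0.0, 1.0),  # brighter copy, kept in [0, 1]
    ]

# Toy "dataset": 8 random 32x32 RGB images with values in [0, 1).
rng = np.random.default_rng(seed=0)
originals = rng.random((8, 32, 32, 3))

# Each original contributes 4 modified copies.
augmented = [copy for img in originals for copy in augment_image(img)]
training_set = np.concatenate([originals, np.stack(augmented)], axis=0)
print(training_set.shape)  # (40, 32, 32, 3)
```

The 8 original images become a training set of 40. In a real task the labels would be repeated accordingly, and only transformations that preserve the label should be used.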

Augmented vs. synthetic data

  • Augmented data is derived from the original data with some modifications.
  • Synthetic data is generated artificially without using the original dataset. It is typically produced by deep neural networks (DNNs), such as generative adversarial networks (GANs).

When to use data augmentation?

  • To prevent model overfitting.
  • When the initial training set is too small.
  • To improve model accuracy.
  • To reduce the operational costs of labeling and cleaning raw datasets.

Limitations of data augmentation

  • Biases in the original dataset persist in the augmented data.
  • Quality assurance for augmented data is costly.
  • Research and development (R&D) is needed to build augmentation systems for advanced applications. For example, generating high-resolution images using GANs can be challenging.

Applications of data augmentation

  • Healthcare
  • Self-driving cars
  • Natural language processing
  • Automatic speech recognition

TensorFlow and Keras provide convenient image augmentation features.

To perform the augmentation, just add Keras augmentation layers to the model, call tf.image functions, or use ImageDataGenerator.
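
As a rough sketch of two of these options, the snippet below applies tf.image functions and Keras preprocessing layers to a dummy batch. The random input tensor and the parameter values (`max_delta=0.2`, rotation factor `0.1`) are assumptions chosen for illustration, and the layers shown require a reasonably recent TensorFlow 2 release.

```python
import tensorflow as tf

# A dummy batch of 4 RGB images (64x64) with values in [0, 1).
images = tf.random.uniform((4, 64, 64, 3), seed=0)

# Option 1: tf.image functions, applied directly to tensors.
flipped = tf.image.flip_left_right(images)
brightened = tf.image.random_brightness(images, max_delta=0.2)
brightened = tf.clip_by_value(brightened, 0.0, 1.0)  # keep values in [0, 1]

# Option 2: Keras preprocessing layers, usable as the first layers of a model.
augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),  # up to +/- 10% of a full turn
])
# The random transformations are only active in training mode.
augmented = augmentation(images, training=True)

print(flipped.shape, brightened.shape, augmented.shape)
```

Placing the layers inside the model means the augmentation runs during training and is skipped automatically at inference time.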

Data augmentation is most commonly used in machine learning models for text or image classification, because collecting new data in these domains can be difficult.

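ImageDataGenerator's .flow_from_directory(directory) method returns a generator of augmented batches read from an image directory. These generators can be used with Keras model methods that accept data generators as input, such as fit_generator, evaluate_generator, and predict_generator. In recent TensorFlow 2 releases these methods are deprecated in favor of fit, evaluate, and predict, which accept generators directly.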
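
To make the notion of a data generator concrete, here is a plain-Python stand-in. It is not ImageDataGenerator itself: the function name `augmented_batches`, the toy arrays, and the random horizontal flip are assumptions for illustration. It yields (images, labels) batches indefinitely, which is the general form of input that generator-accepting Keras methods consume.

```python
import numpy as np

def augmented_batches(images, labels, batch_size=4, seed=0):
    """Yield (images, labels) batches forever, randomly flipping images."""
    rng = np.random.default_rng(seed)
    n = len(images)
    while True:
        idx = rng.choice(n, size=batch_size, replace=False)
        batch = images[idx].copy()
        flip = rng.random(batch_size) < 0.5       # flip each image with probability 0.5
        batch[flip] = batch[flip][:, :, ::-1, :]  # reverse the width axis
        yield batch, labels[idx]

# Toy data: 10 grayscale 8x8 images with binary labels.
data_rng = np.random.default_rng(1)
x = data_rng.random((10, 8, 8, 1))
y = data_rng.integers(0, 2, size=10)

gen = augmented_batches(x, y)
batch_x, batch_y = next(gen)
print(batch_x.shape, batch_y.shape)  # (4, 8, 8, 1) (4,)
```

Because this generator never terminates, a training call that consumes it would need an explicit number of steps per epoch.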