Getting to Know the Data

Learn about the image datasets that we'll use to train the model.

Let’s first look at the data we’re working with, both directly and indirectly. There are two datasets we’ll rely on: the ILSVRC ImageNet dataset and the MS-COCO dataset.

We won’t engage the first dataset (ImageNet) directly, but it’s essential for caption learning. It contains images and their respective class labels (for example, cat, dog, and car). We’ll use a vision transformer that has already been pretrained on this dataset, so we don’t have to download the data and train a model from scratch. The second dataset, MS-COCO, contains images and their respective captions. We’ll learn from it directly by mapping each image to a fixed-size feature vector with the vision transformer and then mapping that vector to the corresponding caption with a text-based transformer.
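To make the MS-COCO side of this concrete, here is a minimal sketch of how caption annotations can be paired with the image files they describe. It assumes the standard COCO captions JSON layout; the file paths and the build_caption_pairs helper are illustrative assumptions, not part of the course code.

```python
import json
from pathlib import Path

def build_caption_pairs(annotation_file, image_dir):
    """Pair each MS-COCO caption with the path of the image it describes.

    Assumes the standard COCO captions JSON layout: an "images" list with
    {"id", "file_name"} entries and an "annotations" list with
    {"image_id", "caption"} entries.
    """
    with open(annotation_file) as f:
        data = json.load(f)

    # Map image IDs to their file paths.
    id_to_path = {
        img["id"]: Path(image_dir) / img["file_name"] for img in data["images"]
    }

    # Each image typically has several captions; keep one pair per caption.
    pairs = [
        (id_to_path[ann["image_id"]], ann["caption"])
        for ann in data["annotations"]
        if ann["image_id"] in id_to_path
    ]
    return pairs

# Example usage (paths are placeholders):
# pairs = build_caption_pairs("annotations/captions_train2017.json", "train2017")
# print(pairs[0])
```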

ILSVRC ImageNet dataset

ImageNet is a large image dataset that contains around one million images, each paired with a label from one of 1,000 categories. The dataset covers a broad range of everyday objects and includes most of the objects found in the images we want to generate captions for. The figure below shows some of the classes available in the ImageNet dataset:

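Because a pretrained encoder is all we need from ImageNet, we never download the dataset ourselves; we only load a model that was trained on it. The sketch below uses a Keras application pretrained on ImageNet (InceptionResNetV2) as a stand-in feature extractor; the course pipeline uses a vision transformer instead, and the input size and pooling choices here are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf

# Load an ImageNet-pretrained encoder with the classification head removed,
# so it outputs a fixed-size feature vector instead of 1,000 class scores.
# (InceptionResNetV2 is a stand-in; the course uses a vision transformer,
# but the idea of reusing ImageNet-pretrained weights is the same.)
encoder = tf.keras.applications.InceptionResNetV2(
    include_top=False, weights="imagenet", pooling="avg"
)

def image_to_feature_vector(image_path):
    """Map a single image to a fixed-size feature vector."""
    img = tf.keras.utils.load_img(image_path, target_size=(299, 299))
    x = tf.keras.utils.img_to_array(img)[np.newaxis, ...]
    x = tf.keras.applications.inception_resnet_v2.preprocess_input(x)
    return encoder.predict(x, verbose=0)[0]  # shape: (1536,)

# Example usage (path is a placeholder):
# features = image_to_feature_vector("train2017/000000000009.jpg")
# print(features.shape)  # (1536,)
```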