Two questions about deep learning (transfer learning)

Description

The details are in the attachment.



Problem 1: ConvNets Transfer Learning (5 points)
Goal: Build convolutional neural networks (ConvNets) based on ResNet101, InceptionV3, and
EfficientNetB1, using transfer learning to classify images of fruits and vegetables into their respective
classes. Analyze the performance of the constructed ConvNets of varying sizes in terms of classification
accuracy as well as the computational effort required for inference.
You are encouraged to use TensorFlow + Keras or PyTorch + Lightning libraries, but any other deep
learning library (in Python, Julia or Matlab) would be acceptable for this homework assignment.
Data: The Fruits-360 dataset contains 100×100 images for 131 different varieties of fruits and vegetables.
Processing the images should be consistent with settings described in Homework 2.
Background: The learned features in pre-trained models can be repurposed or transferred to another
ConvNet designed for a specific task. Several complex pre-trained ConvNet models are made available
by popular deep learning libraries. Pre-trained networks of varying sizes, EfficientNetB1, InceptionV3, and ResNet101, can be used to classify images of fruits and vegetables. Weights obtained from pre-training on ImageNet should be used as the initial weights for these models.
Although it is known that large-scale deep learning models can perform certain tasks effectively, there exists a trade-off between model size, performance, and computational resource utilization. It is important to analyze the model size against the requirements of the specific machine learning task, as large-scale models consume a significant amount of computational resources, and in turn energy, during both training and inference.
Architecture: Because the above-mentioned ConvNets were previously trained to classify different numbers of image classes, the constructed models need to be customized so that they can classify 131 classes of fruits and vegetables. To achieve this, an additional Dense layer with 131 units and a Softmax activation function can be added at the end of the pre-trained models. As the pre-trained ConvNets have sophisticated architectures with branching and skip connections, sequential-style modeling cannot be used here. Instead, the modified ConvNets for classifying fruits/vegetables can be built following an alternate modeling scheme termed the functional model (a minimal sketch follows the list below). In this assignment, you will need to:
1. Implement ResNet101, InceptionV3, and EfficientNetB1 models with a custom fully-connected layer (number_of_classes=131).
2. Incorporate pre-trained weights obtained from ImageNet for all three ConvNets. Proceed to
train the constructed models while freezing the original weights of the convolutional blocks.
3. Freeze the weights of the original layers within the pre-trained models, and subsequently
fine-tune the remaining layers. After retraining the modified models, elaborate on the roles
that transfer learning and frozen layers play within transfer learning tasks.
4. Measure the computational time required for the prediction step over the testing dataset
performed with all three trained ConvNets.
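As a rough illustration of the functional-model scheme, the following minimal TensorFlow/Keras sketch builds the ResNet101 variant; InceptionV3 and EfficientNetB1 follow the same pattern. The 100×100 input shape and the GlobalAveragePooling2D layer are assumptions here, and the preprocessing applied to your data should match the Homework 2 settings (each Keras application also provides its own preprocess_input).

import tensorflow as tf

# Load the convolutional base with ImageNet weights, dropping the original
# 1000-class ImageNet classifier head.
base = tf.keras.applications.ResNet101(weights="imagenet", include_top=False,
                                       input_shape=(100, 100, 3))
base.trainable = False  # freeze the pre-trained convolutional blocks

# Functional API: pooled base features feed a new 131-unit Softmax head.
inputs = tf.keras.Input(shape=(100, 100, 3))
x = base(inputs, training=False)  # keep batch-norm statistics fixed
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(131, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)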
Training: All models should be trained with the ADAM optimizer to minimize the categorical cross-entropy. The new ConvNet models should be trained for 50 epochs using the data generators created for
the fruits-360 dataset. Please note that some pre-trained models are large and might consume substantial computational resources at each step. It is recommended that training be performed on Google Colab / Google Cloud using a GPU or a TPU.
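For concreteness, the corresponding compile/fit calls might look as follows; train_gen and val_gen are assumed names for the fruits-360 data generators from Homework 2.

model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
history = model.fit(train_gen, validation_data=val_gen, epochs=50)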
Deliverables:
1. Please report the accuracy for both the training and testing datasets obtained after training the
modified ConvNets. Also, plot the loss curves for both training and validation datasets.
Discuss the roles that transfer learning and frozen layers play within transfer learning tasks.
2. Report the computational time required for prediction over the testing dataset with the trained ConvNet models, along with their model sizes. Please make sure to measure ONLY the time required for the prediction step; you may use Python's built-in time functions for this (a timing sketch follows this list). Discuss the trade-offs between computational resource utilization, accuracy levels, and model sizes for the classification task.
3. Please make sure to submit your working code files along with the final results and the plots.
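A minimal timing sketch for deliverable 2, assuming test_gen is the testing data generator; time.perf_counter is one of Python's built-in timers and is well suited to measuring wall-clock intervals.

import time

start = time.perf_counter()            # time ONLY the prediction step
predictions = model.predict(test_gen)
elapsed = time.perf_counter() - start
print(f"Prediction over the test set took {elapsed:.2f} s")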
Problem 2: Sentiment Analysis using Transformer-based Transfer Learning (5 points)
Transformer architectures, like GPT, have set new benchmarks in NLP tasks. In this problem, you will leverage the power of transfer learning using the "A Lite BERT" (ALBERT) model for sentiment analysis, similar to Problem 2 from the last homework assignment.
Data: The IMDB dataset is used, which is available via the datasets library. Unlike the processed
version provided by Keras, this task requires the use of the raw dataset which can be accessed
via:
from datasets import load_dataset
raw_datasets = load_dataset("imdb")
Use 1000 samples for validation and the rest for training.
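One possible way to carve out the 1000 validation samples with the datasets library is shown below; the seed is an arbitrary choice for reproducibility.

split = raw_datasets["train"].train_test_split(test_size=1000, seed=42)
train_ds = split["train"]  # remaining 24,000 reviews for training
val_ds = split["test"]     # 1,000 reviews for validation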
Processing: The dataset contains raw text reviews. To ensure consistent input to the model, your
task is to tokenize these reviews. The tokenization can be achieved using AutoTokenizer from
the transformers library. Please note that further installation would be required if
AlbertTokenizer is used instead.
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("albert-base-v2")
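A sketch of the tokenization step, applied to the split from above with datasets' map(); truncation and padding keep all inputs at the model's maximum length.

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length")

train_ds = train_ds.map(tokenize, batched=True)
val_ds = val_ds.map(tokenize, batched=True)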
Architecture: You'll be using the albert-base-v2 model, a lightweight BERT-style model well suited to transfer learning tasks. Your main objective is to fine-tune this pre-trained model on the IMDB dataset.
Training: The fine-tuning process requires careful selection of parameters. Given the complexity and capabilities of the ALBERT model:
Batch Size: Start with a smaller batch size (16) to ensure that the model fits into GPU memory.
Batch size has a significant impact on the model’s performance and convergence rate.
Learning Rate: Consider a starting LR of 1e-5 with a weight decay of 0.01 with Adam as the
optimizer. Transformers are sensitive to the learning rate.
Epochs: Given that you're leveraging transfer learning, 10 epochs might suffice, as the model has already learned useful representations from vast amounts of data.
This training process would take hours on a CPU. Please use Google Colab for this problem and ensure you are using the T4 GPU runtime to train the model in about 20 min/epoch. On other GPU machines, you can adjust the batch size accordingly.
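Putting these parameters together, a minimal fine-tuning sketch using the transformers Trainer might look as follows; exact argument names can vary slightly across transformers versions, and the compute_metrics helper is an assumption added so that validation accuracy is logged each epoch.

import numpy as np
from transformers import (AutoModelForSequenceClassification, Trainer,
                          TrainingArguments)

model = AutoModelForSequenceClassification.from_pretrained("albert-base-v2",
                                                           num_labels=2)

def compute_metrics(eval_pred):
    # Fraction of validation reviews classified correctly.
    logits, labels = eval_pred
    return {"accuracy": (np.argmax(logits, axis=-1) == labels).mean()}

args = TrainingArguments(
    output_dir="albert-imdb",
    per_device_train_batch_size=16,  # batch size 16, as suggested above
    learning_rate=1e-5,              # starting learning rate
    weight_decay=0.01,
    num_train_epochs=10,
    evaluation_strategy="epoch",     # evaluate after every epoch
    logging_strategy="epoch",        # log training loss once per epoch
)
trainer = Trainer(model=model, args=args, train_dataset=train_ds,
                  eval_dataset=val_ds, compute_metrics=compute_metrics)
trainer.train()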
Evaluation: After training, evaluate the model’s performance on the test data to get an
understanding of its generalization capability.
Visualization: Plot the loss for both the training and validation sets, and the accuracy for the
validation set across epochs to analyze the model’s performance.
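One way to produce these plots is to read the Trainer's log history, as sketched below; the log keys shown ("loss", "eval_loss", "eval_accuracy") are assumptions that depend on the transformers version and on the compute_metrics helper above.

import matplotlib.pyplot as plt

logs = trainer.state.log_history
train_loss = [e["loss"] for e in logs if "loss" in e and "eval_loss" not in e]
val_loss = [e["eval_loss"] for e in logs if "eval_loss" in e]
val_acc = [e["eval_accuracy"] for e in logs if "eval_accuracy" in e]

plt.plot(train_loss, label="training loss")
plt.plot(val_loss, label="validation loss")
plt.plot(val_acc, label="validation accuracy")
plt.xlabel("epoch")
plt.legend()
plt.show()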
Deliverables:
Model Accuracy and Loss Curves: Report the model validation accuracy and both training and
validation loss over epochs with the plots detailed above.
Analysis of Model Performance: Discuss the results obtained from the model. This should
include:
Examination of whether the model overfits or underfits the training data.
Compare this model’s performance to the one you obtained in HW2 while also considering the
different data splits.
Analysis of loss and accuracy for potential insights.
Suggest improvements to the model and training process.
Code and Resources: Provide working code, accompanied by results and plots.
Bonus (+1) Prediction: Predict the label and provide the class probability of the statement “This
movie was really amazing!”
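A minimal sketch for the bonus prediction, assuming the fine-tuned model and tokenizer from above; for IMDB, label 0 is negative and label 1 is positive.

import torch

text = "This movie was really amazing!"
inputs = tokenizer(text, return_tensors="pt").to(model.device)
with torch.no_grad():  # single forward pass, no gradients needed
    probs = torch.softmax(model(**inputs).logits, dim=-1)[0]
label = int(probs.argmax())
print(f"Predicted label: {label}, probability: {probs[label].item():.3f}")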
