Exercise 3.1: Sentiment Classification with Increasing Neurons
Objective: Build a simple text classification model to classify movie reviews as positive or negative. Start with a small number of neurons and progressively increase it to observe the effects on performance.
Data: Use a subset of the IMDB movie review dataset.
Steps:
Start with an Embedding layer followed by a Dense layer with 16 neurons.
Train the model and record the accuracy.
Gradually increase the number of neurons in the Dense layer (e.g., from 16 to 64 and then 128) and observe how accuracy and training time are affected.
Plot the accuracy and loss for each model, including the validation curves.
Complete the code given below.
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences
import matplotlib.pyplot as plt
# Load and preprocess the data
vocab_size = 10000 # Only consider the top 10k words
max_length = 256 # Pad/truncate all reviews to be 256 words
# Load IMDB data
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=vocab_size)
# Pad sequences to the same length
x_train = pad_sequences(x_train, maxlen=max_length)
x_test = pad_sequences(x_test, maxlen=max_length)
def build_model(dense_neurons):
    model = tf.keras.Sequential([
        # input_length is omitted: it is deprecated in Keras 3, and the
        # padded sequence length is inferred from the input data
        layers.Embedding(input_dim=vocab_size, output_dim=64),
        layers.GlobalAveragePooling1D(),
        layers.Dense(dense_neurons, activation='relu'),
        layers.Dense(1, activation='sigmoid')
    ])
    model.compile(optimizer='adam',
                  loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model
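Before training, it can help to count by hand how many trainable parameters each Dense width adds. The sketch below assumes the architecture of build_model above (a 64-dimensional embedding over a 10,000-word vocabulary, average pooling, one hidden Dense layer, one sigmoid unit). The helper name count_params is hypothetical and not part of the exercise code.

```python
# Hand-count of trainable parameters for the build_model architecture above.
VOCAB_SIZE = 10000   # matches vocab_size in the exercise
EMBED_DIM = 64       # matches output_dim of the Embedding layer

def count_params(dense_neurons, vocab_size=VOCAB_SIZE, embed_dim=EMBED_DIM):
    embedding = vocab_size * embed_dim                   # one vector per word, no bias
    hidden = embed_dim * dense_neurons + dense_neurons   # weights + biases
    output = dense_neurons * 1 + 1                       # single sigmoid unit
    return embedding + hidden + output

for n in (16, 64, 128):
    print(n, count_params(n))
```

The embedding table dominates the total, so going from 16 to 128 neurons adds only about 1% more parameters. Calling model.summary() on each model is one way to check these totals.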
neuron_counts = [16, 64, 128] # Different sizes for the Dense layer
results = {}
for neurons in neuron_counts:
    print(f"\nTraining model with {neurons} neurons in the Dense layer")
    model = build_model(dense_neurons=neurons)
    # Train the model
    history = model.fit(
        x_train, y_train,
        epochs=5,
        batch_size=512,
        validation_split=0.2,
        verbose=2
    )
    # Evaluate the model
    loss, accuracy = model.evaluate(x_test, y_test, verbose=2)
    results[neurons] = {'accuracy': accuracy, 'loss': loss}
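The steps also ask for training time, which the loop above does not yet record. One possible approach is a small timing wrapper, sketched here. The name timed_call is hypothetical, and the sum call is only a stand-in workload so the snippet runs on its own.

```python
import time

def timed_call(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Stand-in workload; in the exercise this would be model.fit with the
# same arguments used in the training loop.
total, seconds = timed_call(sum, range(1_000_000))
print(f"result={total}, took {seconds:.4f} s")
```

In the training loop this could be used as `history, seconds = timed_call(model.fit, x_train, y_train, epochs=5, batch_size=512, validation_split=0.2, verbose=2)`. Storing `seconds` and `history.history` in `results[neurons]` would then give the plotting step both the timing and the per-epoch curves.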
Exercise 3.2: Fine-grained Image Classification with Increasing Neurons
Objective: Build a basic image classification model to classify images of flowers from oxford_flowers102 (the Oxford 102 Flower Dataset). Start with a small number of neurons and progressively increase it to observe the effects on performance.
Data: Use the Oxford 102 Flower Dataset, which contains images of 102 flower categories.
Steps:
Begin with Conv2D and MaxPooling2D layers for feature extraction from images.
Add a Flatten layer to convert 2D feature maps into a 1D vector.
Add a Dense layer with a small number of neurons (e.g., 32), followed by an output layer.
Train the model and record the accuracy.
Gradually increase the number of neurons in the Dense layer (e.g., from 32 to 128 and then 256) to observe changes in accuracy and training time.
Complete the code given below.
import tensorflow as tf
import tensorflow_datasets as tfds
from tensorflow.keras import layers, models
import matplotlib.pyplot as plt
# Load the Oxford 102 Flower Dataset
dataset, info = tfds.load('oxford_flowers102', with_info=True, as_supervised=True)
# Split the dataset into training and testing
train_dataset = dataset['train']
test_dataset = dataset['test']
# Set image parameters
image_size = (150, 150) # Resize images to this size
batch_size = 32
# Data preprocessing function to resize images and normalize pixel values
def preprocess_image(image, label):
    image = tf.image.resize(image, image_size)
    image = image / 255.0 # Normalize pixel values to [0, 1]
    return image, label
# Apply preprocessing to the train and test datasets
train_dataset = train_dataset.map(preprocess_image).batch(batch_size).prefetch(tf.data.AUTOTUNE)
test_dataset = test_dataset.map(preprocess_image).batch(batch_size).prefetch(tf.data.AUTOTUNE)
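When completing the model, it may be useful to estimate how large the Flatten output is, because that determines how expensive each extra Dense neuron becomes. The sketch below is plain arithmetic under assumed choices that the exercise does not prescribe: two Conv2D (3x3, 'valid' padding, stride 1) plus MaxPooling2D (2x2) blocks, with 64 filters in the last convolution. The helper names are hypothetical.

```python
def conv_pool_output(size, kernel=3, pool=2):
    """Spatial size after a 'valid' Conv2D (stride 1) followed by MaxPooling2D."""
    after_conv = size - kernel + 1
    return after_conv // pool

def dense_params(inputs, units):
    return inputs * units + units   # weights + biases

side = 150                       # image_size from the exercise
for _ in range(2):               # assumed: two Conv2D + MaxPooling2D blocks
    side = conv_pool_output(side)
filters = 64                     # assumed filter count of the last Conv2D
flat = side * side * filters     # length of the vector produced by Flatten

print("flattened features:", flat)
for units in (32, 128, 256):
    print(units, dense_params(flat, units))
```

Under these assumptions the Dense layer holds most of the model's weights, so widening it from 32 to 256 neurons multiplies the parameter count roughly eightfold. Note also that the output layer needs 102 units, one per flower category.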
Exercise 3.3: Investigating the Influence of Batch Sizes on Model Performance
Objective: This exercise will demonstrate how different batch sizes affect the performance of a neural network model, including its training speed, loss, accuracy, and generalization capability.
Dataset: Fashion MNIST, a dataset of grayscale images of 10 different types of clothing, with 60,000 training images and 10,000 test images.
Steps:
1. Load the Dataset
Use tensorflow.keras.datasets.fashion_mnist to load the dataset.
Preprocess the data by normalizing the pixel values to the range [0, 1].
2. Define the Model Architecture
Build a simple Convolutional Neural Network (CNN) or Fully Connected Neural Network (FCNN) model.
The model should include:
An input layer (to handle the 28x28 images).
One or more hidden layers (e.g., Dense, Conv2D).
An output layer with 10 units (one for each clothing category).
Use softmax activation for the output layer since it’s a multi-class classification problem.
3. Vary the Batch Size
Experiment with different batch sizes (e.g., 16, 32, 64, 128, 256).
For each batch size:
Train the model for a fixed number of epochs (e.g., 10 epochs).
Record training loss, validation loss, and accuracy.
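The batch size directly sets how many weight updates happen in one epoch, which is worth keeping in mind when comparing runs trained for the same number of epochs. A minimal sketch, assuming all 60,000 training images are used (holding some out for validation would lower the counts); the function name is hypothetical:

```python
import math

TRAIN_IMAGES = 60000  # Fashion MNIST training set size

def steps_per_epoch(num_samples, batch_size):
    # The last, possibly smaller, batch still counts as one step.
    return math.ceil(num_samples / batch_size)

for batch_size in (16, 32, 64, 128, 256):
    print(batch_size, steps_per_epoch(TRAIN_IMAGES, batch_size))
```

With a fixed epoch budget, the model trained with batch size 16 therefore receives about 16 times as many gradient updates as the one trained with batch size 256.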
4. Train the Model
Train the model for each batch size and measure the following:
Training time: How long it takes to complete one epoch.
Training and validation loss: Monitor how the loss changes during training.
Accuracy: Track how well the model performs on the training and validation data.
5. Analyze the Results
Compare the following:
Training time: Larger batch sizes may lead to faster training, but they could also lead to diminishing returns in terms of model performance.
Loss and accuracy: Observe how the batch size affects the convergence of the loss function and the accuracy on both training and validation datasets.
Overfitting: Check if smaller batch sizes lead to better generalization (lower validation loss) or if larger batch sizes cause overfitting.
6. Plot the Results
Plot graphs comparing training loss, validation loss, and accuracy for different batch sizes.
Plot the training time for different batch sizes.
import tensorflow as tf
import matplotlib.pyplot as plt
from tensorflow.keras.datasets import fashion_mnist
# Load and preprocess the dataset
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
train_images, test_images = train_images / 255.0, test_images / 255.0
# Reshape the images to (28, 28, 1) to match the input shape expected by CNNs
train_images = train_images.reshape(-1, 28, 28, 1)
test_images = test_images.reshape(-1, 28, 28, 1)
Exercise 3.4: Emotion Classification Using the CREMA-D Dataset
Objective: Build a model to classify emotions based on audio clips of human speech. This exercise focuses on identifying emotions such as anger, happiness, sadness, and neutral tones, using basic audio preprocessing and a convolutional neural network.
Dataset: CREMA-D (Crowd-sourced Emotional Multimodal Actors Dataset) contains audio clips of actors expressing six emotions: anger, disgust, fear, happiness, neutral, and sadness. This exercise loads it from its source repository rather than through tensorflow_datasets; it's small enough to preprocess and load efficiently in TensorFlow.
Steps:
1. Data Loading and Preprocessing
Load the Dataset:
Download the CREMA-D dataset from its official source.
Organize the audio files and corresponding emotion labels.
Audio Processing:
Convert audio waveforms into spectrograms or mel-spectrograms for each audio clip.
Normalize the spectrograms and pad/truncate them to a consistent length, e.g., 2.5 seconds (the default used in the code below).
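The normalization step can take several forms. One simple option is per-clip min-max scaling of the decibel spectrogram to [0, 1], sketched below. The function name and the eps guard are assumptions made for illustration, and dataset-wide statistics would be an equally reasonable choice.

```python
import numpy as np

def min_max_normalize(spectrogram_db, eps=1e-8):
    """Rescale a dB spectrogram to roughly [0, 1]; eps guards against a constant input."""
    low, high = spectrogram_db.min(), spectrogram_db.max()
    return (spectrogram_db - low) / (high - low + eps)

# Toy array standing in for a mel-spectrogram in decibels (not real data).
toy = np.array([[-80.0, -40.0], [-20.0, 0.0]])
scaled = min_max_normalize(toy)
print(scaled.min(), scaled.max())
```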
2. Build a Simple Emotion Classification Model
Convolutional Layers:
Start with a Conv2D layer to learn spatial patterns in the spectrogram.
Add additional Conv2D layers and MaxPooling2D layers to capture higher-level features.
Flatten and Dense Layers:
Flatten the final output and pass it through one or two Dense layers for classification.
Use a softmax activation in the final Dense layer with six output units, one for each emotion class.
Compile and Train:
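Use categorical crossentropy as the loss function (with one-hot labels, or sparse categorical crossentropy if the labels stay as integer class indices) and an optimizer like Adam.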
Train the model on the training set, using a validation set to tune performance.
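Categorical crossentropy expects one-hot targets, whereas the label mapping in this exercise produces integer class indices. A minimal NumPy sketch of the conversion follows. The helper name one_hot is hypothetical, and tf.keras.utils.to_categorical should be an equivalent ready-made option.

```python
import numpy as np

NUM_CLASSES = 6  # anger, disgust, fear, happiness, neutral, sadness

def one_hot(labels, num_classes=NUM_CLASSES):
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype="float32")[np.asarray(labels)]

encoded = one_hot([0, 3, 5])
print(encoded)
```

Keeping the integer labels and compiling with loss='sparse_categorical_crossentropy' avoids the conversion altogether.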
3. Model Evaluation
Accuracy:
Evaluate the model’s accuracy on the test set.
Confusion Matrix:
Generate a confusion matrix to analyze which emotions are well-classified and which are commonly misclassified.
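A confusion matrix can be built directly from true and predicted class indices. The sketch below uses toy labels for illustration only, and the function is a hand-rolled stand-in; sklearn.metrics.confusion_matrix provides the same result.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    # np.add.at accumulates correctly even when index pairs repeat.
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

# Toy labels for illustration only (not model output).
y_true = [0, 0, 1, 2, 2, 2]
y_pred = [0, 1, 1, 2, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(cm)
```

In the exercise, y_pred would typically come from np.argmax(model.predict(X_test), axis=1) and num_classes would be 6. Large off-diagonal entries point to pairs of emotions the model confuses.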
!git clone https://github.com/CheyneyComputerScience/CREMA-D.git
import os
import tensorflow as tf
import numpy as np
import librosa
from sklearn.model_selection import train_test_split
from tensorflow.keras import layers, models
# Define path to CREMA-D dataset (update path accordingly)
DATA_PATH = '/content/CREMA-D/AudioWAV'
LABELS = {'ANG': 0, 'DIS': 1, 'FEA': 2, 'HAP': 3, 'NEU': 4, 'SAD': 5}
def preprocess_audio(file_path, max_length=2.5, sr=16000):
    audio, _ = librosa.load(file_path, sr=sr)
    # Calculate the desired length in samples
    target_length = int(sr * max_length)
    # Adjust the audio to the desired length
    audio = librosa.util.fix_length(audio, size=target_length)
    # Convert audio to a mel-spectrogram
    spectrogram = librosa.feature.melspectrogram(y=audio, sr=sr)
    spectrogram_db = librosa.power_to_db(spectrogram, ref=np.max)
    return spectrogram_db
# Load and preprocess data
def load_data(data_path, labels):
    data, targets = [], []
    for file_name in os.listdir(data_path):
        if file_name.endswith('.wav'):
            file_path = os.path.join(data_path, file_name)
            label_str = file_name.split('_')[2] # e.g., "ANG", "DIS"
            if label_str in labels:
                spectrogram = preprocess_audio(file_path)
                data.append(spectrogram)
                targets.append(labels[label_str])
    return np.array(data), np.array(targets)
# Load data
X, y = load_data(DATA_PATH, LABELS)
X = X[..., np.newaxis] # Add channel dimension for Conv2D
# Split data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
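To choose the model's input shape, the expected size of each spectrogram can be estimated from the preprocessing parameters. This sketch assumes librosa's default mel-spectrogram settings (hop_length=512, n_mels=128, centered frames), which the code above does not override. The helper name is hypothetical, and printing X.shape remains the authoritative check.

```python
def mel_spectrogram_shape(duration_s=2.5, sr=16000, hop_length=512, n_mels=128):
    """Expected (n_mels, frames) for a centered STFT, assumed to be librosa's default."""
    num_samples = int(sr * duration_s)
    frames = 1 + num_samples // hop_length
    return n_mels, frames

print(mel_spectrogram_shape())
```

Under these assumptions each clip becomes a 128 x 79 array, so after the channel dimension is added the first Conv2D layer would take inputs of shape (128, 79, 1).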