Understanding how errors are handled during training
Select a learning mode and click "Train Network"
Stochastic Gradient Descent (SGD):
• Process Sample 1 → Calculate error → Update weights immediately
• Process Sample 2 → Calculate error → Update weights immediately
• Process Sample 3 → Calculate error → Update weights immediately
• Process Sample 4 → Calculate error → Update weights immediately
Result: 4 weight updates per epoch. Each sample immediately influences the network.
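
Below is a minimal per-sample SGD sketch of this loop. The one-weight linear model, the data points, and the learning rate are illustrative assumptions, not values taken from the tool above.

    # Per-sample SGD: update the weight immediately after every sample (illustrative setup).
    samples = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]  # hypothetical (x, target) pairs
    w = 0.0    # single weight; prediction is y_pred = w * x
    lr = 0.05  # learning rate (assumed)

    for epoch in range(3):
        for x, y in samples:
            y_pred = w * x
            error = y_pred - y
            grad = 2 * error * x   # gradient of the squared error w.r.t. w
            w -= lr * grad         # update weights immediately -> 4 updates per epoch
        print(f"epoch {epoch}: w = {w:.3f}")

Because the weight moves after every sample, sample 2 is already evaluated with a weight that sample 1 just changed.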
Batch Gradient Descent:
• Process Sample 1 → Calculate error → Store gradients
• Process Sample 2 → Calculate error → Accumulate gradients
• Process Sample 3 → Calculate error → Accumulate gradients
• Process Sample 4 → Calculate error → Accumulate gradients
• Average all gradients → Update weights ONCE
Result: 1 weight update per epoch. All samples influence the update equally.
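
For comparison, here is the same illustrative setup run in batch mode: gradients are accumulated over all samples, averaged, and applied in a single update per epoch.

    # Batch gradient descent: accumulate gradients, average, then update ONCE per epoch.
    samples = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]  # hypothetical (x, target) pairs
    w = 0.0
    lr = 0.05

    for epoch in range(3):
        grad_sum = 0.0
        for x, y in samples:
            y_pred = w * x             # every sample sees the same weight this epoch
            error = y_pred - y
            grad_sum += 2 * error * x  # store/accumulate, do not update yet
        w -= lr * (grad_sum / len(samples))  # average gradient -> 1 update per epoch
        print(f"epoch {epoch}: w = {w:.3f}")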
Key Difference: In SGD, the weights change after every sample, so later samples in an epoch see weights already adjusted by earlier ones. In Batch GD, every sample in the epoch sees the same weights, and the single update is based on the average gradient.
Average Loss Calculation: For both methods, we report the average loss across all samples in the epoch:
Avg Loss = (Loss₁ + Loss₂ + Loss₃ + Loss₄) / 4
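
As a quick worked example with hypothetical per-sample losses:

    # Average loss reported for one epoch; the four loss values are made up for illustration.
    losses = [0.50, 0.30, 0.20, 0.40]     # Loss1..Loss4
    avg_loss = sum(losses) / len(losses)  # (0.50 + 0.30 + 0.20 + 0.40) / 4
    print(avg_loss)                       # 0.35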