🧠 Gradient Descent Methods Visualization
Compare SGD, BGD, and Mini-batch GD with 4 samples and 2 batches
📊 SGD (Stochastic)
📈 BGD (Batch)
📉 Mini-batch GD
🔄 Train 1 Epoch
↻ Reset
⚡ Speed: Normal
👁 Weights: Hidden
📦 Training Samples (4 total, 2 batches of 2 samples each)
Sample 1
Input: [0.2, 0.8, 0.5, 0.3]
Target: [0.9, 0.1]
Sample 2
Input: [0.7, 0.3, 0.6, 0.4]
Target: [0.2, 0.8]
Sample 3
Input: [0.4, 0.6, 0.2, 0.7]
Target: [0.8, 0.2]
Sample 4
Input: [0.9, 0.1, 0.8, 0.2]
Target: [0.3, 0.7]
Batch 1: Samples 1-2
Batch 2: Samples 3-4
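A minimal sketch of how this training set and its two-batch split could be represented in code; the names `SAMPLES`, `TARGETS`, and `make_batches` are illustrative assumptions, not part of the visualization itself:

```python
# Illustrative only: the four samples and targets listed above,
# grouped into two batches of two, as the panel describes.
SAMPLES = [
    [0.2, 0.8, 0.5, 0.3],
    [0.7, 0.3, 0.6, 0.4],
    [0.4, 0.6, 0.2, 0.7],
    [0.9, 0.1, 0.8, 0.2],
]
TARGETS = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.8, 0.2],
    [0.3, 0.7],
]

def make_batches(samples, targets, batch_size=2):
    """Split the dataset into consecutive mini-batches (assumed helper)."""
    return [
        (samples[i:i + batch_size], targets[i:i + batch_size])
        for i in range(0, len(samples), batch_size)
    ]

batches = make_batches(SAMPLES, TARGETS)  # Batch 1: samples 1-2, Batch 2: samples 3-4
```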
Current Mode: SGD
Epoch: 0
Weight Updates/Epoch: 4
Current Sample/Batch: -
Loss: --
Progress: Ready
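How the "Weight Updates/Epoch" figure follows from the selected mode can be sketched as simple arithmetic; this hypothetical helper is not the tool's own code. SGD updates once per sample, BGD once per epoch, and mini-batch GD once per batch:

```python
def updates_per_epoch(num_samples, mode, batch_size=2):
    """Number of weight updates in one epoch for each descent mode (sketch)."""
    if mode == "SGD":          # one update after every sample
        return num_samples
    if mode == "BGD":          # one update after the full dataset
        return 1
    if mode == "Mini-batch":   # one update after each batch
        return num_samples // batch_size

print(updates_per_epoch(4, "SGD"))         # 4, as shown in the panel above
print(updates_per_epoch(4, "BGD"))         # 1
print(updates_per_epoch(4, "Mini-batch"))  # 2
```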
📚 Stochastic Gradient Descent (SGD)
Weight Updates: 4 per epoch (one after each sample)
Process: For each sample → Forward pass → Backward pass → Update weights immediately (see the sketch below)
Pros: Fast updates, can escape local minima, good for large datasets
Cons: Noisy gradient estimates, unstable convergence
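A minimal sketch of this per-sample schedule, assuming a single linear layer with a mean-squared-error loss purely for illustration; the visualization's actual network and loss function are not specified here:

```python
import numpy as np

def train_one_epoch_sgd(W, samples, targets, lr=0.1):
    """One SGD epoch: forward pass, backward pass, and an immediate update per sample."""
    for x, t in zip(samples, targets):
        x, t = np.asarray(x, float), np.asarray(t, float)
        y = W @ x                          # forward pass (assumed linear model)
        grad = np.outer(2.0 * (y - t), x)  # gradient of the MSE loss w.r.t. W
        W = W - lr * grad                  # update weights immediately
    return W                               # 4 samples -> 4 weight updates per epoch

# Usage (with SAMPLES/TARGETS as in the earlier sketch):
# W = train_one_epoch_sgd(np.zeros((2, 4)), SAMPLES, TARGETS)
```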
Active Sample
In Current Batch
Strong Activation
Gradient Flow
Weight Update