🧠 Gradient Descent Methods Visualization
Compare SGD, BGD, and Mini-batch GD with 4 samples and 2 batches
📊 SGD (Stochastic)
📈 BGD (Batch)
📉 Mini-batch GD
🔄 Train 1 Epoch
↻ Reset
⚡ Speed: Normal
👁 Weights: Hidden
📦 Training Samples (4 total, 2 batches of 2 samples each)
Sample 1
Input: [0.2, 0.8, 0.5, 0.3]
Target: [0.9, 0.1]
Sample 2
Input: [0.7, 0.3, 0.6, 0.4]
Target: [0.2, 0.8]
Sample 3
Input: [0.4, 0.6, 0.2, 0.7]
Target: [0.8, 0.2]
Sample 4
Input: [0.9, 0.1, 0.8, 0.2]
Target: [0.3, 0.7]
Batch 1: Samples 1-2
Batch 2: Samples 3-4
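A minimal sketch of how this training set and its two-batch split could be represented in code; the names `SAMPLES`, `TARGETS`, and `make_batches` are illustrative assumptions, not part of the visualization itself:

```python
# Illustrative only: the four samples and targets listed above,
# grouped into two batches of two, as the panel describes.
SAMPLES = [
    [0.2, 0.8, 0.5, 0.3],
    [0.7, 0.3, 0.6, 0.4],
    [0.4, 0.6, 0.2, 0.7],
    [0.9, 0.1, 0.8, 0.2],
]
TARGETS = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.8, 0.2],
    [0.3, 0.7],
]

def make_batches(samples, targets, batch_size=2):
    """Split the dataset into consecutive mini-batches (assumed helper)."""
    return [
        (samples[i:i + batch_size], targets[i:i + batch_size])
        for i in range(0, len(samples), batch_size)
    ]

batches = make_batches(SAMPLES, TARGETS)  # Batch 1: samples 1-2, Batch 2: samples 3-4
```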
Current Mode: SGD
Epoch: 0
Weight Updates/Epoch: 4
Current Sample/Batch: -
Loss: --
Progress: Ready
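How the "Weight Updates/Epoch" figure follows from the selected mode can be sketched as simple arithmetic; this hypothetical helper is not the tool's own code. SGD updates once per sample, BGD once per epoch, and mini-batch GD once per batch:

```python
def updates_per_epoch(num_samples, mode, batch_size=2):
    """Number of weight updates in one epoch for each descent mode (sketch)."""
    if mode == "SGD":          # one update after every sample
        return num_samples
    if mode == "BGD":          # one update after the full dataset
        return 1
    if mode == "Mini-batch":   # one update after each batch
        return num_samples // batch_size

print(updates_per_epoch(4, "SGD"))         # 4, as shown in the panel above
print(updates_per_epoch(4, "BGD"))         # 1
print(updates_per_epoch(4, "Mini-batch"))  # 2
```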
📚 Stochastic Gradient Descent (SGD)
Weight Updates: 4 per epoch (one after each sample)
Process: For each sample → Forward pass → Backward pass → Update weights immediately (see the sketch below)
Pros: Fast updates, can escape local minima, good for large datasets
Cons: Noisy gradient estimates, unstable convergence
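A minimal sketch of this per-sample schedule, assuming a single linear layer with a mean-squared-error loss purely for illustration; the visualization's actual network and loss function are not specified here:

```python
import numpy as np

def train_one_epoch_sgd(W, samples, targets, lr=0.1):
    """One SGD epoch: forward pass, backward pass, and an immediate update per sample."""
    for x, t in zip(samples, targets):
        x, t = np.asarray(x, float), np.asarray(t, float)
        y = W @ x                          # forward pass (assumed linear model)
        grad = np.outer(2.0 * (y - t), x)  # gradient of the MSE loss w.r.t. W
        W = W - lr * grad                  # update weights immediately
    return W                               # 4 samples -> 4 weight updates per epoch

# Usage (with SAMPLES/TARGETS as in the earlier sketch):
# W = train_one_epoch_sgd(np.zeros((2, 4)), SAMPLES, TARGETS)
```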
Active Sample
In Current Batch
Strong Activation
Gradient Flow
Weight Update