Data Mining and Machine Learning

John Samuel
CPE Lyon

Year: 2025-2026
Email: john.samuel@cpe.fr

Creative Commons License

8.1. Neural Network Fundamentals

Biological Neurons

Biological neuron [1]
  1. https://en.wikipedia.org/wiki/File:Neuron3.png

8.1. Neural Network Fundamentals

Introduction

Artificial neural networks

8.1. Neural Network Fundamentals

Layers

Neurons are organized into layers. There are generally three types of layers in a neural network: an input layer, one or more hidden layers, and an output layer.

8.1. Neural Network Fundamentals

Training

The overall goal of training is to adjust the weights of the network so that it can generalize to new data, producing accurate results for examples it has not seen during training.

8.1. Neural Network Fundamentals

Components of Artificial Neural Networks

Propagation Function: The propagation function computes a neuron's input, typically as the weighted sum of the outputs of the neurons connected to it, plus a bias term.

Activation Function: After computing the neuron's input, it is passed through an activation function. This function introduces nonlinearity into the model, allowing the neural network to capture complex relationships and learn nonlinear patterns. Commonly used activation functions include the sigmoid, the hyperbolic tangent (tanh), and the rectified linear unit (ReLU).
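These functions can be sketched in a few lines of NumPy (an illustrative sketch, not tied to any particular framework):

```python
import numpy as np

def sigmoid(z):
    # Squashes any real input into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    # Squashes any real input into (-1, 1)
    return np.tanh(z)

def relu(z):
    # Zero for negative inputs, identity for positive inputs
    return np.maximum(0.0, z)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z))  # values strictly between 0 and 1
print(relu(z))     # [0. 0. 2.]
```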

8.1. Neural Network Fundamentals

Perceptron

The perceptron is a supervised learning algorithm used for binary classification. It is designed to solve problems where the objective is to determine whether a given input belongs to a particular class or not.

8.1. Neural Network Fundamentals

Perceptron: Formal Definition

8.1. Neural Network Fundamentals

Perceptron: Steps

  1. Initialize the weights and thresholds
  2. For each example \((x_j, d_j)\) in the training set:
    • Compute the current output: \[y_j(t) = f[w(t) \cdot x_j]\] \[= f[w_0(t)x_{j,0} + w_1(t)x_{j,1} + w_2(t)x_{j,2} + \dotsb + w_n(t)x_{j,n}]\]
    • Update the weights: \[w_i(t + 1) = w_i(t) + r\,(d_j - y_j(t))\,x_{j,i}\]
    where \(r\) is the learning rate.

8.1. Neural Network Fundamentals

Perceptron: Steps

  3. Repeat step 2 until the iteration error \[\frac{1}{s} \sum_j |d_j - y_j(t)|\] is less than the user-specified threshold \(\gamma\), or a predetermined number of iterations has been performed, where \(s\) is the size of the sample set.
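The steps above can be implemented directly; here is a minimal NumPy sketch, using the Heaviside step as \(f\) and learning the logical AND function (the dataset and learning rate are illustrative choices):

```python
import numpy as np

def train_perceptron(X, d, r=0.1, gamma=0.0, max_iter=100):
    # Step 1: initialize the weights (the threshold is folded into w_0
    # via a constant input x_{j,0} = 1)
    s, n = X.shape
    w = np.zeros(n + 1)
    Xb = np.hstack([np.ones((s, 1)), X])
    for _ in range(max_iter):
        error = 0.0
        # Step 2: for each example (x_j, d_j) in the training set
        for x_j, d_j in zip(Xb, d):
            y_j = 1.0 if w @ x_j > 0 else 0.0  # y_j(t) = f[w(t) . x_j]
            w += r * (d_j - y_j) * x_j         # w_i(t+1) = w_i(t) + r (d_j - y_j) x_{j,i}
            error += abs(d_j - y_j)
        # Step 3: stop once the iteration error (1/s) sum |d_j - y_j| <= gamma
        if error / s <= gamma:
            break
    return w

# Learn logical AND, which is linearly separable
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([0.0, 0.0, 0.0, 1.0])
w = train_perceptron(X, d)
print([1.0 if w @ np.r_[1.0, x] > 0 else 0.0 for x in X])
```

Since AND is linearly separable, the perceptron convergence theorem guarantees the loop terminates with zero error.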

8.2.1. Training: Optimization

8.2.2. Training Stability

8.2.2. Training Stability: Checklist

8.1. Neural Network Fundamentals

Multiclass perceptron

8.2. Deep Learning

A deep neural network (DNN), also known as a deeply hierarchical neural network, is a type of artificial neural network that includes multiple processing layers, generally more than two. These networks are called "deep" because of their stacked layer architecture, which enables the creation of complex hierarchical representations of data.

Layered architecture: Deep neural networks are composed of multiple layers, generally divided into three main types: an input layer, one or more hidden layers, and an output layer.

8.2. Deep Learning

Training deep neural networks may require large volumes of data and computing power.

8.2. Deep Learning

Example: TensorFlow

# Step 3: Add a dense output layer with softmax activation function
# The layer has 2 neurons for a binary classification task, and softmax is used
# to obtain class probabilities
model.add(Dense(units=2, activation='softmax'))

# Step 4: Compile the model
# Using stochastic gradient descent (SGD) as optimizer with a learning rate of 0.01
# The loss function is 'categorical_crossentropy', which matches the softmax
# output of this classification task
# Model performance will be measured in terms of 'accuracy'
sgd = SGD(learning_rate=0.01)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])

8.2. Deep Learning

Convolutional Neural Networks

Deep Learning
Source: https://en.wikipedia.org/wiki/File:Deep_Learning.jpg

8.2. Deep Learning

Convolutional Neural Networks

Convolutional neural networks (CNNs) are a class of neural network architectures designed primarily for image analysis. They have been particularly effective in tasks such as image classification, object detection, and image segmentation.

8.2. Deep Learning

Convolutional Neural Networks: architecture

Typical CNN architecture

8.2. Deep Learning

Convolutional Neural Networks: architecture

In summary, CNNs follow a hierarchical architecture, where convolutional layers learn local features, and these features are then combined in subsequent layers to form more complex representations. The nonlinearity introduced by the ReLU activation function is crucial to allow the model to learn nonlinear relationships in the data.

8.2. Deep Learning

Kernel (image processing)

A kernel in the context of image processing, also called a filter or mask, is a small matrix that is applied to an image using a convolution operation. The purpose of applying these kernels is to perform various filtering operations on the image, such as edge detection, detail enhancement, highlighting certain features, etc.
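As an illustration, the convolution operation can be written directly in NumPy; the edge-detection kernel below is a classic example (this is a minimal 'valid'-mode sketch without padding, assuming a grayscale image stored as a 2D array):

```python
import numpy as np

def convolve2d(image, kernel):
    # 'Valid' convolution: slide the kernel over every position where it
    # fits entirely inside the image
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Element-wise product of the kernel and the image patch, summed
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A classic 3x3 edge-detection (Laplacian-like) kernel
edge_kernel = np.array([[-1, -1, -1],
                        [-1,  8, -1],
                        [-1, -1, -1]])

# A uniform image has no edges, so the response is zero everywhere
flat = np.ones((5, 5))
print(convolve2d(flat, edge_kernel))  # 3x3 array of zeros
```

The kernel's entries sum to zero, so flat regions produce no response while intensity discontinuities are amplified.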

8.3. Scientific Data Modalities

Scientific datasets come in fundamentally different structural forms. The choice of architecture depends on the data modality.

Three main modalities in scientific contexts:

  1. Time series / signals: measurements ordered in time or frequency (e.g., spectra, waveforms, sensor readings)
  2. Images and spatial detectors: 2D or 3D grids (e.g., telescope images, microscopy, particle detector maps)
  3. Tabular data: structured rows and features (e.g., catalog data, simulation parameters, measurement tables)

Note: This section is designated as asynchronous reading material.

8.3. Time Series and Signals

8.3. Images and Spatial Detectors

Convolutional Neural Network (CNN): processes 2D grids using learned convolution filters. Key operations include convolution (learned filters slide over the input), nonlinear activation (e.g., ReLU), and pooling (downsampling that builds spatial invariance).
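Alongside convolution, pooling downsamples feature maps; a minimal 2x2 max-pooling sketch in NumPy (the feature-map values are illustrative):

```python
import numpy as np

def max_pool2x2(x):
    # 2x2 max pooling with stride 2: group the array into 2x2 blocks
    # and keep the maximum of each block
    h, w = x.shape
    return x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fm = np.array([[1, 2, 0, 1],
               [3, 0, 1, 0],
               [0, 0, 2, 4],
               [1, 1, 0, 0]])
print(max_pool2x2(fm).tolist())  # [[3, 1], [1, 4]]
```

Each output cell keeps only the strongest activation in its block, which halves the spatial resolution while preserving the most salient responses.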

8.3. Tabular Data

8.4. Reinforcement Learning

Reinforcement Learning

  • Reinforcement Learning (RL) is a branch of machine learning inspired by theories of animal psychology.
  • Autonomous agent: RL involves an autonomous agent interacting with an environment.
  • Decision making: The agent makes decisions based on its current state.
  • Rewards and penalties: The environment provides the agent with feedback in the form of rewards, which can be positive or negative (penalties).
  • Objective: The objective is to maximize the sum of cumulative rewards over time.
Reinforcement learning diagram
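A classic algorithm illustrating these ideas is tabular Q-learning; below is a minimal sketch on a toy corridor environment (the environment, reward values, and hyperparameters are illustrative assumptions, not from the slides):

```python
import random

# Toy 1-D corridor: states 0..4, the agent starts at state 0 and receives
# reward +1 only upon reaching the terminal state 4.
N_STATES, ACTIONS = 5, [0, 1]  # action 0 = move left, 1 = move right
alpha, gamma_, eps = 0.5, 0.9, 0.1
Q = [[0.0, 0.0] for _ in range(N_STATES)]

random.seed(0)
for episode in range(200):
    s = 0
    while s != N_STATES - 1:
        # Epsilon-greedy action selection: mostly exploit, sometimes explore
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda a: Q[s][a])
        s2 = max(0, s - 1) if a == 0 else s + 1
        r_ = 1.0 if s2 == N_STATES - 1 else 0.0
        # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a')
        Q[s][a] += alpha * (r_ + gamma_ * max(Q[s2]) - Q[s][a])
        s = s2

# Greedy policy per non-terminal state after training
print([max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES - 1)])
```

The agent learns purely from the reward signal: the value of the final step propagates backward through the table, discounted by gamma, until the greedy policy moves right everywhere.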

8.5. Ethics, Licenses and Privacy

Licenses, Ethics and Privacy

Open Definition logo; "Privacy" written in tiles

8.5. Ethics, Licenses and Privacy

Examples: Creative Commons (CC) licenses: CC0, Public Domain Mark, CC BY, CC BY-SA, CC BY-ND, CC BY-NC, CC BY-NC-SA, CC BY-NC-ND

8.5. Ethics, Licenses and Privacy

Creative Commons license spectrum
Examples: Creative Commons (CC)

8.5. Ethics, Licenses and Privacy

Wikimedia logo family complete 2013
Open data

8.5. Ethics, Licenses and Privacy

LOD Cloud (2014)
Linked Open Data (LOD)

8.5. Ethics, Licenses and Privacy

Internet Archive logo and wordmark
Archived data

8.6. Uncertainty and Calibration

8.6. Uncertainty Estimation

Types of uncertainty:

  • Aleatoric uncertainty: noise inherent in the data itself (e.g., measurement noise); it cannot be reduced by collecting more data
  • Epistemic uncertainty: uncertainty due to the model's limited knowledge; it can be reduced with more data or a better model
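One common way to estimate epistemic uncertainty, shown here purely as an illustration, is the disagreement of an ensemble trained on bootstrap resamples (the polynomial-fit "models" and the synthetic data are illustrative assumptions):

```python
import numpy as np

# Synthetic noisy observations of a sine curve
rng = np.random.default_rng(42)
x = np.linspace(0, 1, 20)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, size=x.shape)

# Train an ensemble of simple models on bootstrap resamples of the data
x_test = np.linspace(0, 1, 50)
preds = []
for _ in range(10):
    idx = rng.integers(0, len(x), size=len(x))
    coeffs = np.polyfit(x[idx], y[idx], deg=3)
    preds.append(np.polyval(coeffs, x_test))
preds = np.array(preds)

mean = preds.mean(axis=0)  # ensemble prediction
std = preds.std(axis=0)    # spread across members = epistemic uncertainty estimate
print(std.max())
```

Where the ensemble members agree, the predictive spread is small; where the data constrain the fit poorly, the members disagree and the spread grows.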

8.6. Robustness

Distribution shift: training and test data come from different distributions. Common in physics experiments (simulation ≠ real data).

Types:

  • Covariate shift: the input distribution changes, but the input-output relationship stays the same
  • Label shift: the class proportions differ between training and test data
  • Concept drift: the input-output relationship itself changes over time

Strategies:

  • Domain adaptation and reweighting of training examples
  • Data augmentation to broaden the training distribution
  • Evaluation on held-out data that mimics the expected shift

8.6. Robustness: Practical Strategies

References

Online Resources

References

Colors

Images