Practical 4#
Goals#
Work on builtin Python functions#
map()
filter()
reduce()
Multiprocessing#
multiprocessing.cpu_count
multiprocessing.Pool
map
Exercise 1 [★]#
In this exercise, we will take a look at the Python builtin function called filter() which can be used to select items from a collection matching a particular condition.
# Initialization
num = [i for i in range(1, 20)]
print(num)
We will make use of the available documentation for the different functions. For this purpose, we will use a question mark (?) after the name of the function or a class as shown below.
?filter
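The function filter(function, iterable) takes two parameters: a function and an iterable. The function is applied to each element of the iterable, and filter() keeps only the elements for which it returns a true value.
In the first example, we use None as the first parameter. In this case, filter() uses the identity function as the test: it keeps the elements that are themselves truthy and drops falsy ones such as 0, None, or an empty string. Since every number in our list is non-zero, the output here is the same as the input.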
# Use of filter function with None as the first parameter
num = [i for i in range(1, 20)]
filtered = list(filter(None, num))
print(filtered)
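Because every value in num is non-zero, the example above cannot show what None actually removes. A minimal sketch, assuming a hypothetical list named mixed that is not part of the practical's data:

```python
# Hypothetical input mixing truthy and falsy values (for illustration only)
mixed = [0, 1, "", "a", None, [], [0], False, True]

# With None as the first argument, filter() keeps only the truthy items
kept = list(filter(None, mixed))
print(kept)  # [1, 'a', [0], True]
```

Note that [0] is kept: a non-empty list is truthy even if its only element is 0.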
In the next example, we keep only the even numbers from the input list. Note that we have written a function even() which returns True if the input number is even, and False otherwise.
filter() returns the items of the list for which even() returns True.
def even(item):
    if item % 2 == 0:
        return True
    return False
num = [i for i in range(1, 20)]
filtered = list(filter(even, num))
print(filtered)
In the following example, we have a new function odd() which returns True when the input number is odd.
We use this new function as input to the filter() function.
def odd(item):
    if item % 2 == 0:
        return False
    return True
num = [i for i in range(1, 20)]
filtered = list(filter(odd, num))
print(filtered)
Question: Write a program using filter() that takes a list of strings and keeps only the palindromes.
Filtering with Nested Structures#
The filter() function can also be applied to complex data structures like lists of dictionaries or tuples.
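As a small illustration of the tuple case, here is a sketch that filters a list of (x, y) pairs. The data and the helper name in_first_quadrant are hypothetical and chosen only for this example:

```python
# Hypothetical list of (x, y) points, used only for illustration
points = [(1, 2), (-3, 4), (5, -6), (7, 8)]

def in_first_quadrant(point):
    # Each item is a tuple; unpack it to test both coordinates
    x, y = point
    return x > 0 and y > 0

selected = list(filter(in_first_quadrant, points))
print(selected)  # [(1, 2), (7, 8)]
```

The same pattern applies to dictionaries: the test function reads one or more keys of each item and returns True or False.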
You are given a list of dictionaries representing employees, each with name, age, and department keys:
employees = [
{"name": "Alice", "age": 28, "department": "HR"},
{"name": "Bob", "age": 35, "department": "Engineering"},
{"name": "Charlie", "age": 22, "department": "Marketing"},
{"name": "David", "age": 45, "department": "Engineering"},
{"name": "Pierre", "age": 29, "department": "HR"}
]
Questions
Filter by Department: Write a program that uses filter() to create a list of employees who work in the “Engineering” department.
Filter by Age Range: Write a program that uses filter() to find employees whose age is between 25 and 40 (inclusive).
Filter by Name Length: Write a program that uses filter() to find employees whose name has more than 3 characters.
Advanced String Filtering#
You are given a list of sentences. Your task is to filter sentences based on various conditions.
sentences = [
"Bienvenue dans le monde de la programmation!",
"Le débogage fait partie du jeu.",
"La pratique rend parfait, alors continue à coder.",
"Les algorithmes d'apprentissage automatique sont de plus en plus puissants.",
"La visualisation de données est un moyen de communiquer des informations complexes.",
"Les données non structurées sont un défi pour les Data Scientists."
]
Questions
Filter by Length: Write a program that uses filter() to select sentences that have fewer than 5 words.
Filter by Keywords: Write a program that uses filter() to select sentences that contain the word “coder”.
Filter by Palindromic Words: Write a program that uses filter() to select sentences that contain at least one palindrome.
Exercise 2 [★]#
What if we want to apply the same function to every element of a list?
For example, assume that we have a function that returns the square of a number, and we want to apply it to all the numbers in a list. We could write a loop to achieve this, but we are going to write a shorter program instead.
\(f(x) = x ^ 2\)
\(g([a,b,...]) = [f(a), f(b), ...]\)
\(g([a,b,...]) = [a^2, b^2, ...]\)
Python provides another builtin function called map(function, iterable, ...).
?map
def square(item):
    return item * item
num = [i for i in range(1, 20)]
squared = list(map(square, num))
# Print the result of map(), not the earlier filtered list
print(squared)
But what if our function takes multiple inputs?
The following example shows this case. The function product() takes two numbers as input and returns their product.
def product(item1, item2):
    return item1 * item2
num1 = [i for i in range(1, 20)]
print(num1)
num2 = [i for i in range(10, 20)]
print(num2)
product_value = list(map(product, num1, num2))
# Print the result of map(), not the earlier filtered list
print(product_value)
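Note that the two input lists above do not have the same length. When map() receives several iterables, it stops as soon as the shortest one is exhausted. The following sketch reuses the same inputs to make this visible:

```python
def product(item1, item2):
    return item1 * item2

num1 = list(range(1, 20))   # 19 values: 1 to 19
num2 = list(range(10, 20))  # 10 values: 10 to 19

product_value = list(map(product, num1, num2))

# Only 10 products are computed; the last 9 values of num1 are ignored
print(len(num1), len(num2), len(product_value))  # 19 10 10
print(product_value)
```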
Finally, we look at another function called reduce() that applies a function of two arguments cumulatively on the members of the list, from left to right.
\(f(x) = x ^ 2\)
\(g([a,b,...]) = [f(a), f(b), ...]\)
\(g([a,b,...]) = [a^2, b^2, ...]\)
\(h(g([a,b,c,...])) = (((a^2 + b^2) + c^2) + ...)\)
from functools import reduce
?reduce
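The composition \(h(g(\cdot))\) written above can be sketched directly with map() and reduce(). The helper name add and the short input list are assumptions made for this illustration:

```python
from functools import reduce

def square(x):
    return x * x

def add(x, y):
    return x + y

num = [1, 2, 3, 4]
squares = list(map(square, num))  # g: [1, 4, 9, 16]
total = reduce(add, squares)      # h: ((1 + 4) + 9) + 16
print(total)  # 30
```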
In the following example, we calculate the sum of members of a list.
We pass the function sum_num() as the first argument to the reduce() function. sum_num() takes two numbers as input and returns their sum.
from functools import reduce
import random
def sum_num(item1, item2):
    return item1 + item2
num = [i for i in range(1, 20)]
print(num)
sum_value = reduce(sum_num, num)
print(sum_value)
In the next example, we make use of the same function sum_num(), but on real numbers.
num = [random.uniform(0, i) for i in range(1, 20)]
print(num)
sum_value = reduce(sum_num, num)
print(sum_value)
In the next example, we use another function product().
from functools import reduce
def product(item1, item2):
    return item1 * item2
num = [i for i in range(1, 20)]
print(num)
# Use a distinct name so that the function sum_num() is not overwritten
product_value = reduce(product, num)
print(product_value)
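reduce() also accepts an optional third argument, an initial value that is used as the starting point of the accumulation. Without it, reducing an empty list raises a TypeError. A short sketch, with hypothetical variable names:

```python
from functools import reduce

def sum_num(item1, item2):
    return item1 + item2

# The accumulation starts from 10 instead of the first element
with_initial = reduce(sum_num, [1, 2, 3], 10)
print(with_initial)  # 16

# With an initial value, an empty list simply returns that value
empty_total = reduce(sum_num, [], 0)
print(empty_total)  # 0

# Without an initial value, an empty list is an error
try:
    reduce(sum_num, [])
except TypeError as err:
    print("TypeError:", err)
```

An initial value is also handy when the accumulated result has a different shape from the items, for example when starting a matrix sum from a zero matrix.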
Question: Write a program that takes a list of matrices of size 2x2 and computes the sum of all matrices.
Matrix Operations with map() and reduce()#
You are given a list of 2x2 matrices (lists of lists). You need to apply various operations using map() and reduce().
from functools import reduce
matrices = [
[[1, 2], [3, 4]],
[[0, 1], [2, 3]],
[[-1, -2], [-3, -4]],
[[5, 6], [7, 8]]
]
Questions
Sum of All Matrices: Write a program that uses reduce() to compute the sum of all matrices.
Element-wise Multiplication: Write a program that uses map() to compute the element-wise multiplication of two matrices.
Matrix Filtering: Write a program that uses filter() to select only matrices where all elements are positive.
Data Transformation and Aggregation#
You have a list of dictionaries representing products, with keys name, price, and quantity.
from functools import reduce
products = [
{"name": "Laptop", "price": 1200, "quantity": 3},
{"name": "Smartphone", "price": 800, "quantity": 5},
{"name": "Tablet", "price": 300, "quantity": 10},
{"name": "Smartwatch", "price": 200, "quantity": 15}
]
Questions
Total Inventory Value: Write a program that uses map() and reduce() to calculate the total value of all products in stock.
Price Filtering: Write a program that uses filter() to find products priced above a certain threshold (e.g., 500).
Discount Application: Write a program that uses map() to apply a 10% discount to all products and returns the updated list.
Exercise 3 [★★]#
In the following examples, we use lambda expressions and pass them as arguments to the functions filter(), map(), and reduce().
In the following example, the lambda expression lambda x: x % 2 == 0 takes x as input and returns True when x is even and False otherwise. This is equivalent to the approach we saw above with the function even().
num = [i for i in range(1, 20)]
filtered = list(filter(lambda x: x % 2 == 0, num))
print(filtered)
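To check that the lambda expression and the named function behave identically, the two can be compared side by side. In this sketch, the name is_even is hypothetical and is bound to the lambda only to make the comparison readable:

```python
def even(item):
    return item % 2 == 0

# Binding a lambda to a name is done here only for comparison
is_even = lambda x: x % 2 == 0

num = list(range(1, 20))
with_def = list(filter(even, num))
with_lambda = list(filter(is_even, num))
print(with_def == with_lambda)  # True
```

In practice, a lambda is usually written inline at the call site, as in the examples of this exercise, and a def is preferred when the function needs a name.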
In the following example, we take the example with the function odd() and replace it by a lambda expression.
num = [i for i in range(1, 20)]
filtered = list(filter(lambda x: x % 2 != 0, num))
print(filtered)
In the following example, we take the example with the function square() and replace it by a lambda expression.
num = [i for i in range(1, 20)]
squared = list(map(lambda x: x * x, num))
print(squared)
What if we want to pass two arguments, as in the example with product() above?
num1 = [i for i in range(1, 20)]
print(num1)
num2 = [i for i in range(10, 20)]
print(num2)
product = list(map(lambda x, y: x * y, num1, num2))
print(product)
In the following examples, we use lambda expressions with the reduce() function.
from functools import reduce
import random
num = [i for i in range(1, 20)]
print(num)
sum_value = reduce(lambda x, y: x + y, num)
print(sum_value)
As in the example with sum_num(), we test the lambda expression on real numbers.
from functools import reduce
import random
num = [random.uniform(0, i) for i in range(1, 20)]
print(num)
sum_value = reduce(lambda x, y: x + y, num)
print(sum_value)
Now we replace the product() with a lambda expression.
from functools import reduce
import random
num = [i for i in range(1, 20)]
print(num)
product_value = reduce(lambda x, y: x * y, num)
print(product_value)
Question: Write a program using map(), reduce() and lambda expressions to count the total length of all strings in a list.
Text Analysis with Lambda Expressions#
You are given a list of sentences. Each sentence is a string containing multiple words.
Use map(), filter(), and reduce() with lambda expressions to analyze the text.
sentences = [
"Bienvenue dans le monde de la programmation!",
"Le débogage fait partie du jeu.",
"La pratique rend parfait, alors continue à coder.",
"Les algorithmes d'apprentissage automatique sont de plus en plus puissants.",
"La visualisation de données est un moyen de communiquer des informations complexes.",
"Les données non structurées sont un défi pour les Data Scientists."
]
Questions
Word Count: Write a program that uses map() and reduce() to count the total number of words in all sentences.
Longest Sentence: Write a program that uses reduce() to find the longest sentence by word count.
Filtering Short Sentences: Write a program that uses filter() to keep only sentences with more than 6 words.
Financial Data Processing with Lambda Expressions#
You have a list of transactions represented as dictionaries.
Use map(), filter(), and reduce() with lambda expressions to process the data.
from functools import reduce
transactions = [
    {"date": "2025-03-10", "type": "income", "amount": 1200},
    {"date": "2025-03-11", "type": "expense", "amount": 400},
    {"date": "2025-03-12", "type": "income", "amount": 1500},
    {"date": "2025-03-13", "type": "expense", "amount": 800},
    {"date": "2025-03-14", "type": "income", "amount": 2000},
    {"date": "2025-03-15", "type": "expense", "amount": 500},
    {"date": "2025-03-16", "type": "income", "amount": 1800},
]
Questions
Net Balance Calculation: Write a program that uses reduce() to calculate the net balance (sum of all income minus sum of all expenses).
Filter Transactions: Write a program that uses filter() to retrieve only income transactions above a threshold (e.g., 1500).
Transaction Amounts: Write a program that uses map() to extract only the amounts from the transactions and returns them as a list.
Exercise 4 [★★★]#
Next, we want to use multiprocessing to compute the values in parallel. For this purpose, we will use the multiprocessing package.
First we find the number of processors in our machine.
import multiprocessing as mp
?mp.cpu_count
import multiprocessing as mp
print(mp.cpu_count())
Next, we will create a pool of processes for the calculation and we make use of the Pool() method.
import multiprocessing as mp
?mp.Pool
In the following example, we create a pool with the number of processes equal to the number of processors in our machine.
Take a look at how we transform our previous map-reduce example in the multiprocessing context.
from functools import reduce
import multiprocessing as mp
cpu_count = mp.cpu_count()
def squared(x):
    return x * x
num = [i for i in range(1, 20)]
with mp.Pool(processes=cpu_count) as pool:
    list_squared = pool.map(squared, num)
print(list_squared)
product_value = reduce(lambda x, y: x * y, list_squared)
print(product_value)
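When this code is saved as a script rather than run in a notebook, the pool should be created behind an `if __name__ == "__main__":` guard. On platforms that start workers with the spawn method (the default on Windows and macOS), each worker re-imports the main module, and unguarded pool creation would be executed again in every worker. A minimal sketch, where sum_of_squares is a hypothetical helper introduced only for this illustration:

```python
from functools import reduce
import multiprocessing as mp

def squared(x):
    # Worker function: must be defined at module level so it can be pickled
    return x * x

def sum_of_squares(numbers, processes=2):
    # Hypothetical helper: the map step runs in parallel, the reduce step in the parent
    with mp.Pool(processes=processes) as pool:
        squares = pool.map(squared, numbers)
    return reduce(lambda x, y: x + y, squares)

if __name__ == "__main__":
    # Only the main process reaches this point; workers just import the functions
    print(sum_of_squares(range(1, 20)))
```

Keeping the worker function at module level and the pool inside a function makes the same file usable both as a script and as an importable module.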
In the following example, we want to download a number of pages in parallel. We pass download_page() as input to the pool.map() function. The goal of the function is to download Wikidata pages. Check the output of the following code.
Change the number of processes and test the output.
import multiprocessing as mp
import requests
def download_page(item):
    r = requests.get(
        "https://www.wikidata.org/wiki/Special:EntityData/" + item + ".json"
    )
    # On success, save the JSON text returned by the server
    if r.status_code == 200:
        # The with statement closes the file automatically
        with open(item + ".json", "w", encoding="utf-8") as w:
            w.write(r.text)
    return r.status_code
process_count = 2
pages = ["Q1", "Q2", "Q3", "Q4", "Q5", "Q6"]
with mp.Pool(processes=process_count) as pool:
    status = pool.map(download_page, pages)
print(status)
Now, we want to analyse the downloaded pages. In the following example, we count the number of URLs containing “wikipedia.org”.
import os
def analyse_file(filename):
    # Count the comma-separated tokens that mention wikipedia.org
    with open(filename, "r", encoding="utf-8") as f:
        data = f.read()
    tokens = data.split(",")
    urls = list(filter(lambda token: "wikipedia.org" in token, tokens))
    return len(urls)
files = os.listdir(".")
# Keep only the files whose name ends with .json
json_files = list(filter(lambda f: f.endswith(".json"), files))
with mp.Pool(processes=cpu_count) as pool:
    counts = pool.map(analyse_file, json_files)
print(counts)
# The initial value 0 avoids an error if no JSON file was found
total_count = reduce(lambda x, y: x + y, counts, 0)
print(total_count)
Question: Write a program that queries Wikidata to obtain 100 image URLs of cities and downloads the images to your machine using multiprocessing and map(). The program must then analyse every downloaded image and find two predominant colours of each image, again using multiprocessing and map().
Parallel Data Processing with Multiprocessing#
Question
Write a program that takes a list of URLs pointing to text files hosted online.
Download all the text files concurrently using multiprocessing.Pool() and map().
Once downloaded, perform the following tasks in parallel:
Count the total number of words in each file.
Count the frequency of each word across all files (case-insensitive).
Aggregate the results and display the 10 most common words and their frequencies.
Note: Use multiprocessing.Pool() and map() for processing.
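The aggregation step can be sketched independently of the download. The example below assumes that each file has already been read into a string; the list texts and the helper count_words are hypothetical, and the builtin map() is used so that the sketch runs anywhere (pool.map() takes the same function and input):

```python
from collections import Counter
from functools import reduce

def count_words(text):
    # Case-insensitive word frequencies for one document
    return Counter(text.lower().split())

# Hypothetical file contents, standing in for the downloaded text files
texts = ["the cat sat", "The dog sat down", "a cat and a dog"]

# Map step: one Counter per document
per_file = list(map(count_words, texts))

# Reduce step: adding two Counters sums the counts of each word
total = reduce(lambda a, b: a + b, per_file, Counter())
print(total.most_common(3))
```

Splitting on whitespace is a simplification: punctuation attached to a word is counted as part of it.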
Image Analysis with Multiprocessing#
Question
Write a program that takes a list of image file paths and processes them concurrently to perform the following tasks:
Resize all images to a fixed resolution (e.g., 128x128).
Convert the images to grayscale and compute their average intensity.
Generate thumbnails for all images and save them in a specified directory.
Aggregate the results and display:
The image with the highest average intensity.
The image with the lowest average intensity.
Note: Use multiprocessing.Pool() and map() for processing.