Massive Data Processing
John Samuel
CPE Lyon
Year: 2020-2021
Email: john(dot)samuel(at)cpe(dot)fr

$ tail /var/log/apache2/access.log
127.0.0.1 - - [14/Nov/2018:14:46:49 +0100] "GET / HTTP/1.1" 200 3477 "-"
"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:63.0) Gecko/20100101 Firefox/63.0"
127.0.0.1 - - [14/Nov/2018:14:46:49 +0100] "GET /icons/ubuntu-logo.png HTTP/1.1" 304 180 "http://localhost/"
"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:63.0) Gecko/20100101 Firefox/63.0"
127.0.0.1 - - [14/Nov/2018:14:46:49 +0100] "GET /favicon.ico HTTP/1.1" 404 294 "-"
"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:63.0) Gecko/20100101 Firefox/63.0"
$ tail /var/log/apache2/error.log
[Wed Nov 14 09:53:39.563044 2018] [mpm_prefork:notice] [pid 849]
AH00163: Apache/2.4.29 (Ubuntu) configured -- resuming normal operations
[Wed Nov 14 09:53:39.563066 2018] [core:notice] [pid 849]
AH00094: Command line: '/usr/sbin/apache2'
[Wed Nov 14 11:35:35.060638 2018] [mpm_prefork:notice] [pid 849]
AH00169: caught SIGTERM, shutting down
$ cat /etc/apache2/apache2.conf
LogFormat "%v:%p %h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" vhost_combined
LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %O" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent
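The `combined` format above can be turned into structured records with a regular expression. The following is a minimal sketch, not a full parser: the pattern `COMBINED` and the helper `parse_combined` are names chosen here for illustration, and the pattern assumes that quoted fields contain no escaped double quotes.

```python
import re

# Pattern mirroring the "combined" LogFormat:
# %h %l %u %t "%r" %>s %O "%{Referer}i" "%{User-Agent}i"
# Assumption: quoted fields never contain an escaped double quote.
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_combined(line):
    """Return a dict of fields for one combined-format line, or None if it does not match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # Status code and response size are numeric in this format.
    entry["status"] = int(entry["status"])
    entry["size"] = int(entry["size"])
    return entry


# One entry from the access log shown earlier, joined onto a single line.
line = (
    '127.0.0.1 - - [14/Nov/2018:14:46:49 +0100] "GET /favicon.ico HTTP/1.1" 404 294 "-" '
    '"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:63.0) Gecko/20100101 Firefox/63.0"'
)
entry = parse_combined(line)
print(entry["host"], entry["status"], entry["request"])
```

Named groups keep the mapping between each `LogFormat` directive and the extracted field visible, which makes the pattern easier to adapt to the `common` or `vhost_combined` formats.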
[
  {
    "languageLabel": "ENIAC coding system",
    "year": "1943"
  },
  {
    "languageLabel": "ENIAC Short Code",
    "year": "1946"
  },
  {
    "languageLabel": "Von Neumann and Goldstine graphing system",
    "year": "1946"
  }
]
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <element>
    <languageLabel>ENIAC coding system</languageLabel>
    <year>1943</year>
  </element>
  <element>
    <languageLabel>ENIAC Short Code</languageLabel>
    <year>1946</year>
  </element>
  <element>
    <languageLabel>Von Neumann and Goldstine graphing system</languageLabel>
    <year>1946</year>
  </element>
</root>
languageLabel,year
ENIAC coding system,1943
ENIAC Short Code,1946
Von Neumann and Goldstine graphing system,1946
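As a small illustration of how the CSV form can be read back into records, here is a sketch using Python's standard `csv` module. The names `CSV_TEXT` and `read_languages` are hypothetical, and converting `year` to an integer is an assumption about how the data would be used.

```python
import csv
import io

# The CSV sample, with one record per line and a header row.
CSV_TEXT = """languageLabel,year
ENIAC coding system,1943
ENIAC Short Code,1946
Von Neumann and Goldstine graphing system,1946
"""


def read_languages(text):
    """Parse CSV text into a list of dicts, converting year to int."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {"languageLabel": row["languageLabel"], "year": int(row["year"])}
        for row in reader
    ]


languages = read_languages(CSV_TEXT)
print(languages[0])
```

Unlike JSON or XML, CSV carries no type information, so any conversion such as `int(row["year"])` has to be decided by the reader of the file.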
A distributed computing system cannot guarantee all three of the following constraints at the same time (that is, synchronously): consistency, availability, and partition tolerance.
| num | languageLabel | year |
|---|---|---|
| 1 | ENIAC coding system | 1943 |
| 2 | ENIAC Short Code | 1946 |
| 3 | Von Neumann and Goldstine graphing system | 1946 |
ENIAC coding system:1; ENIAC Short Code:2; Von Neumann and Goldstine graphing system:3
1943:1; 1946:2; 1946:3
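The regrouping from the row-oriented table to the column-oriented layout can be sketched as follows. This is only an illustration of the idea: the helper `to_columns` is a hypothetical name, and real column stores add compression and indexing that are not shown here.

```python
# Row-oriented view: one record per row, as in the table above.
rows = [
    {"num": 1, "languageLabel": "ENIAC coding system", "year": 1943},
    {"num": 2, "languageLabel": "ENIAC Short Code", "year": 1946},
    {"num": 3, "languageLabel": "Von Neumann and Goldstine graphing system", "year": 1946},
]


def to_columns(rows, key="num"):
    """Regroup row records into one list of (value, row key) pairs per attribute."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            if name == key:
                continue  # the key is stored alongside each value instead
            columns.setdefault(name, []).append((value, row[key]))
    return columns


columns = to_columns(rows)
print(columns["year"])  # → [(1943, 1), (1946, 2), (1946, 3)]
```

With this layout, a query that only touches `year` reads a single list and never scans the labels.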
{
  "languageLabel": "ENIAC coding system",
  "year": "1943"
}
{
  "languageLabel": "ENIAC Short Code",
  "year": "1946"
}
{
  "languageLabel": "Von Neumann and Goldstine graphing system",
  "year": "1946"
}
| identifier | languageLabel,year |
|---|---|
| p1 | ENIAC coding system,1943 |
| p2 | ENIAC Short Code,1946 |
| p3 | Von Neumann and Goldstine graphing system,1946 |
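The key-value table above can be mimicked with a plain dictionary, which may help show that the store treats each value as an opaque string. This is a sketch under that assumption; `store` and `get_language` are hypothetical names, and decoding the value is left to the application.

```python
# Each key maps to an opaque value; the store itself knows nothing about its structure.
store = {
    "p1": "ENIAC coding system,1943",
    "p2": "ENIAC Short Code,1946",
    "p3": "Von Neumann and Goldstine graphing system,1946",
}


def get_language(store, key):
    """Look up a key and decode its value into (label, year), or None if the key is absent."""
    value = store.get(key)
    if value is None:
        return None
    # Split on the last comma so that labels containing commas would still decode.
    label, year = value.rsplit(",", 1)
    return label, int(year)


print(get_language(store, "p2"))  # → ('ENIAC Short Code', 1946)
```

Lookups by key are direct, but a question such as "which languages date from 1946?" requires scanning and decoding every value.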
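import requests

# Query the GitHub REST API for a user's public profile
url = "https://api.github.com/users/johnsamuelwrites"
response = requests.get(url, timeout=10)
response.raise_for_status()  # raise an exception on HTTP 4xx/5xx responses
print(response.json())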
<xs:schema attributeFormDefault="unqualified"
    elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="root" type="rootType"/>
  <xs:complexType name="elementType">
    <xs:sequence>
      <xs:element type="xs:string" name="languageLabel"/>
      <xs:element type="xs:short" name="year"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="rootType">
    <xs:sequence>
      <xs:element type="elementType" name="element" maxOccurs="unbounded" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
| num | languageLabel | year |
|---|---|---|
| 1 | ENIAC coding system | 1943 |
\({num}\rightarrow{languageLabel}\)
\({languageLabel}\rightarrow{year}\)
\({num}\rightarrow{year}\)
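The three functional dependencies above can be checked against sample data. The sketch below uses the three-row version of the table shown earlier; the helper `holds` is a hypothetical name, and it only tests whether a dependency is satisfied by the given rows, which cannot prove that it holds for every possible instance of the relation.

```python
rows = [
    {"num": 1, "languageLabel": "ENIAC coding system", "year": 1943},
    {"num": 2, "languageLabel": "ENIAC Short Code", "year": 1946},
    {"num": 3, "languageLabel": "Von Neumann and Goldstine graphing system", "year": 1946},
]


def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs is satisfied by rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Two rows that agree on lhs must also agree on rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True


print(holds(rows, ["num"], ["languageLabel"]))   # → True
print(holds(rows, ["languageLabel"], ["year"]))  # → True
print(holds(rows, ["num"], ["year"]))            # → True
print(holds(rows, ["year"], ["languageLabel"]))  # → False
```

The third dependency follows from the first two by transitivity, while the last check fails because 1946 is associated with two different labels.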