Big Data

Reference Material

Advanced course exploring distributed data processing frameworks, scalable storage systems, and modern analytics platforms for handling massive datasets across multiple domains.
[Figure: big data pipeline diagram. Data sources (API, IoT, DB, logs); ingestion (Kafka, Flume, NiFi); processing (Spark, Hadoop, Flink, Storm); storage (HDFS, NoSQL); annotated with volume, velocity, variety.]

Course Materials

5th Year ETI

Big Data Architecture

Comprehensive exploration of distributed data processing frameworks including Hadoop, Spark, and modern stream processing systems. Focus on scalable architectures and ETL pipelines for the ETI specialization.

5th Year IRC

Big Data Systems

Advanced study of distributed storage systems, NoSQL databases, and real-time analytics platforms. Covers the MapReduce paradigm, data lakes, and cloud-based big data solutions for the IRC program.


Assessment Resources

Exam Questions

First Session Questions

Comprehensive examination questions covering distributed computing fundamentals, MapReduce concepts, and data processing frameworks.

Exam Questions

Second Session Questions

Advanced assessment focusing on practical implementation of big data systems, performance optimization, and real-world architectural decisions.


Key Technologies

Framework

Hadoop Ecosystem

HDFS, MapReduce, YARN, Hive, and HBase for distributed storage and batch processing at scale.
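
As a rough illustration of the MapReduce model named above, the sketch below runs a word count in plain Python on a single machine. It is not Hadoop code: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and only mirror the map, shuffle, and reduce stages that a real cluster would distribute across nodes.

```python
from collections import defaultdict


def map_phase(line):
    # Emit a (word, 1) pair for every whitespace-separated word in one line.
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    # Group all values by key, standing in for the framework's shuffle step.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Sum the partial counts collected for a single word.
    return key, sum(values)


def word_count(lines):
    # Run map over every line, shuffle the pairs, then reduce each group.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data big clusters", "data lakes"]))
```

On a real cluster the map and reduce functions would run in parallel on separate data partitions; here everything runs sequentially, so the sketch shows only the data flow.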

In-Memory Processing

Apache Spark

Unified analytics engine supporting batch, streaming, machine learning, and graph processing workloads.

Storage

NoSQL Databases

MongoDB, Cassandra, and Redis for flexible schema design and horizontal scalability.

Real-time

Stream Processing

Kafka, Flink, and Storm for real-time data ingestion, processing, and analytics pipelines.
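
One core idea behind stream processing is grouping events into fixed time windows. The sketch below counts events per key in tumbling (non-overlapping) windows using plain Python. It does not use the Kafka, Flink, or Storm APIs; the function name tumbling_window_counts and the (timestamp, key) event layout are assumptions made for illustration.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds):
    # events: iterable of (timestamp_in_seconds, key) tuples (assumed layout).
    # Each event falls into exactly one window, identified by its start time.
    counts = Counter()
    for timestamp, key in events:
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


if __name__ == "__main__":
    sample = [(0, "a"), (3, "a"), (12, "b"), (14, "a")]
    print(tumbling_window_counts(sample, 10))
```

A production stream processor would also have to handle late or out-of-order events and unbounded input, which this batch-style sketch ignores.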