Reference Material
Comprehensive exploration of distributed data processing frameworks including Hadoop, Spark, and modern stream processing systems. Focus on scalable architectures and ETL pipelines for the ETI specialization.
Advanced study of distributed storage systems, NoSQL databases, and real-time analytics platforms. Covers MapReduce paradigms, data lakes, and cloud-based big data solutions for the IRC program.
Comprehensive examination questions covering distributed computing fundamentals, MapReduce concepts, and data processing frameworks.
Advanced assessment focusing on practical implementation of big data systems, performance optimization, and real-world architectural decisions.
HDFS, MapReduce, YARN, Hive, and HBase for distributed storage and batch processing at scale.
Apache Spark, a unified analytics engine supporting batch, streaming, machine learning, and graph processing workloads.
MongoDB, Cassandra, and Redis for flexible schema design and horizontal scalability.
Kafka, Flink, and Storm for real-time data ingestion, processing, and analytics pipelines.