Management and Analysis of Physics Datasets
Period: Second semester
Course unit contents:
Part 1) Data Management
Introduction to data structures
Storage Models
Reliability
Authentication, Authorization
Local and Distributed File systems
Databases
Part 2) Data processing
Introduction to parallel processing
Distributed Computing Systems
Containerization
Hadoop as a paradigm for big data processing
Data processing with Spark
Data processing with Dask
Kafka as a distributed streaming platform
Planned learning activities and teaching methods:
Classroom lectures for the introductory topics.
Hands-on sessions with live-coding examples run by the lecturers.
Exercises and examples to be done in the IT lab.
Last modified: Wednesday, 8 June 2022, 10:41