This module equips learners with rigorous, applied competence in the Big Data technologies and cloud-native architectures that underpin scalable AI systems. The pedagogy is deliberately focused: learners develop deep, assessed proficiency in a curated set of industry-standard open-source tools (PostgreSQL, MongoDB, PySpark, Apache Kafka, Apache Airflow, and Docker) rather than superficial exposure to a broad toolset. Technologies that are conceptually important but time-intensive to learn hands-on (Apache Flink, Databricks, managed cloud ML services) are covered through lecture, case study, and instructor demonstration. Together, the two strands give graduates both the applied skills to function in data engineering roles and the architectural literacy to engage with enterprise-scale systems.