Big Data Technologies

Entry requirements: Basic programming skills; knowledge of web technologies, SQL, and DBMS; experience working with Big Data.

Credits: 4

Course: Core

Language of the course: English

Lecturer

Denis Nasonov

Objectives

  • Identification of the main causes behind the emergence of Big Data
  • Definition and identification of Big Data
  • Brief introduction to Big Data processing technologies
  • Introduction to the MapReduce paradigm
  • Introduction to Apache Hadoop technology and its basic infrastructure
  • Introduction to Apache Spark technology

Contents

Big Data technologies play a leading role in the software solutions of today's large companies. Efficient data processing and analysis are not only a foundation for successful business development but can also provide a decisive competitive edge. For this reason, the course focuses on building skills in handling and analyzing Big Data. It covers a brief history of Big Data, along with its definitions and how to identify it. Students will learn the basics of working with the HDFS distributed file system and Apache Hadoop, the basic operation of MapReduce, and Apache Spark and Spark Streaming. By the end of the course, students will have working knowledge of the main Big Data technologies, such as Apache Hadoop and Apache Spark.

Format

Labs and practical sessions

Assessment

Attendance is mandatory. Students should complete all assignments. The final grade is based on each student's performance throughout the course.