We focus on Big Data technologies: their merits, the need for them, and their deployment in the field. Our courses illustrate the scope of Big Data through the well-known Vs (Volume, Variety, Velocity, and Veracity) and the foundational technologies that handle them. We also explain the elements of a Big Data infrastructure, including distributed systems, databases, data stores, data mining, security and more. Concrete examples of Big Data deployments are drawn from a variety of sectors, including finance, healthcare, manufacturing, energy, public infrastructure and smart cities.
Our courses focus on Hadoop as a foundational infrastructure for Big Data. We provide insights and practical training on HDFS (the Hadoop Distributed File System), which is at the core of the Hadoop ecosystem. We also cover the full range of platforms and tools that make up the ecosystem, including MapReduce, Hive, Kafka, Spark, Storm, Oozie, Mahout and HBase. For each of these tools we provide an introductory presentation along with practical examples and use cases.
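To give a flavour of the MapReduce model mentioned above, here is a minimal, self-contained Python sketch of the classic word-count job. This is illustrative only: a real Hadoop job distributes the map and reduce phases across a cluster (via the Java API or Hadoop Streaming), and the sample documents below are invented for the example.

```python
from collections import defaultdict
from itertools import chain

def map_phase(document):
    # Like a Hadoop mapper: emit a (key, value) pair per word, here (word, 1)
    return [(word.lower(), 1) for word in document.split()]

def reduce_phase(pairs):
    # Like a Hadoop reducer: group pairs by key and sum the values
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

# Two toy "input splits"; in Hadoop these would be blocks stored on HDFS
docs = ["Big Data needs big tools", "Data tools scale"]
pairs = chain.from_iterable(map_phase(d) for d in docs)
word_counts = reduce_phase(pairs)
# word_counts -> {'big': 2, 'data': 2, 'needs': 1, 'tools': 2, 'scale': 1}
```

The key idea the sketch captures is that the mapper and reducer are independent, stateless functions, which is what lets Hadoop parallelise them across many machines.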
We provide insights on data streaming technologies and streaming databases, which are designed to handle data streams with very high ingestion rates, such as sensor data and social media feeds. Our courses illustrate the differences between streaming databases and conventional transactional databases, including how streaming engines differ from common query engines. We demonstrate data streaming through practical examples on popular stream processing platforms, including Apache Spark, Apache Storm and Apache Flink. The training includes hands-on examples of handling streaming data in near-real-time scenarios.
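A core operation in the stream processing engines named above is windowed aggregation: grouping an unbounded event stream into fixed time windows and aggregating per window. The following is a minimal pure-Python sketch of a tumbling window, not engine-specific code; the sensor events and field names are invented for illustration.

```python
from collections import defaultdict

def tumbling_window_counts(events, window_seconds):
    # events: iterable of (timestamp_seconds, sensor_id) pairs,
    # e.g. readings arriving from a sensor feed
    # Assign each event to a fixed-size (tumbling) window by its timestamp
    # and count events per (window, sensor)
    windows = defaultdict(int)
    for ts, sensor in events:
        window_start = (ts // window_seconds) * window_seconds
        windows[(window_start, sensor)] += 1
    return dict(windows)

# Toy stream: four events over 12 seconds from two sensors
events = [(0, "s1"), (3, "s1"), (7, "s2"), (12, "s1")]
counts = tumbling_window_counts(events, 10)
# counts -> {(0, 's1'): 2, (0, 's2'): 1, (10, 's1'): 1}
```

Engines such as Spark Structured Streaming and Flink provide this grouping declaratively (plus sliding windows, watermarks for late data, and fault tolerance), but the underlying bucketing logic is the same as in this sketch.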