Lecturing in the fields of Big Data, data engineering, data science, and the development and architecture of data-intensive applications.
"We must look for the opportunity in every difficulty, instead of being paralyzed at the thought of the difficulty in every opportunity."
- Walter E. Cole
This lecture gives you a brief introduction to so-called 'Big Data'. We will quickly refresh the basics of databases, data models and data processing that you have learned so far and compare them with the distributed world of Big Data. After that, we will take a deep dive into the foundations of distributed data storage and data processing, together with the associated concepts and challenges: reliability, scalability, replication, partitioning, and batch and stream processing. Later on, we will look at the most commonly used software and frameworks (mostly the Hadoop and Spark ecosystems). At the end, once you know the basic concepts and are able to set up and work with distributed environments and huge data sets, there will be a short introduction to data science.
Materials: Script | Slides | Exercises | Solutions | Docker Files | DockerHub | Git