You all know what RDD stands for, right? You have the mental model of a distributed collection. But have you ever considered writing your own RDD? In this talk we will do just that. We will start by explaining the essence of how RDDs are implemented internally, followed by a semi-live demo (*) in which we will implement a few RDDs from scratch.
After this talk you will not only be able to write your own RDD, but you will also have a deeper understanding of how Apache Spark works under the hood.
I guarantee fun during the talk and profit during your next job interview.
(*) by 'semi-live' the author means not actually coding live, because that almost never works, but slowly pulling small commits from a repo :)
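To give a taste of the internals the talk covers: at its core, every RDD answers two questions, "what are my partitions?" and "how do I compute one partition?", and transformations just build new RDDs that reference their parent. Here is a toy model of that contract in plain Python. It is a sketch, not the real Spark API; all class and method names below are illustrative.

```python
# Toy model of the core RDD contract (illustrative, NOT the real Spark API):
# an RDD defines get_partitions() and compute(partition), and
# transformations like map() build new RDDs that wrap their parent.

class ToyRDD:
    """Minimal stand-in for an RDD base class."""
    def get_partitions(self):
        raise NotImplementedError

    def compute(self, partition):
        raise NotImplementedError

    def map(self, f):
        # Lazy: just builds a derived RDD, no work happens yet.
        return MappedToyRDD(self, f)

    def collect(self):
        # In real Spark this schedules tasks across a cluster;
        # here we simply iterate over partitions locally.
        return [x for p in self.get_partitions() for x in self.compute(p)]


class RangeToyRDD(ToyRDD):
    """A 'source' RDD: splits a range of numbers into partitions."""
    def __init__(self, start, end, num_partitions):
        self.start, self.end, self.num_partitions = start, end, num_partitions

    def get_partitions(self):
        return list(range(self.num_partitions))

    def compute(self, partition):
        # Each partition yields its own slice of the range.
        step = (self.end - self.start) // self.num_partitions
        lo = self.start + partition * step
        hi = self.end if partition == self.num_partitions - 1 else lo + step
        return iter(range(lo, hi))


class MappedToyRDD(ToyRDD):
    """A 'derived' RDD: applies f lazily over its parent's partitions."""
    def __init__(self, parent, f):
        self.parent, self.f = parent, f

    def get_partitions(self):
        return self.parent.get_partitions()

    def compute(self, partition):
        return (self.f(x) for x in self.parent.compute(partition))


rdd = RangeToyRDD(0, 10, 3).map(lambda x: x * x)
print(rdd.collect())  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

The real Scala counterparts are `getPartitions` and `compute(split, context)`; the demo in the talk implements those for real against Spark.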