Lessons learned building a Spark distribution
How to supercharge your DevOps for big data.
Apache Spark is a next-generation tool that makes ground-breaking improvements over MapReduce on Hadoop. We’ll touch on how Spark lets Scala developers be more productive and use their resources more efficiently. It is less well known that the authors of Mesos introduced Spark in their paper as a framework "to validate their hypothesis".
We built a distribution of Spark at Typesafe, exploiting the synergy with Mesos. We will show how the combination is ready for multi-tenant, heterogeneous cluster environments. We'll see how the two fit into a larger picture, alongside YARN or containers. And we'll have a look at the rapid prototyping and enterprise use cases that drove some of our choices. Finally, we'll also mention the bumps in the road, and the improvements we would like to see to make the Spark and Mesos combination even more versatile and powerful.
François Garillot joined Typesafe in 2012 after an early stint in research, where he spoke frequently at international conferences. He is now working in Typesafe's Spark team, leveraging his Scala knowledge to improve Spark's support for scalable machine learning and data science applications.
Based in Lausanne, he speaks at Swiss conferences and Scala user groups in Lyon and Paris. He recently spoke at Strata Hadoop Barcelona on how to make your next big data hackathon successful.