PySpark in Practice

PyData London 2016

In this talk we will share our best practices for using PySpark in numerous customer-facing data science engagements. Topics covered in this talk are:

At Pivotal Labs we have many data science engagements on big data. Typical problems range from real-time sensor data collected by telecom operators to GPS data produced by vehicle tracking systems. One widespread framework for solving those inherently difficult problems is Apache Spark. In this talk, we want to share our best practices with

