Developing and training deep learning models on distributed infrastructure at scale is challenging. In this session we will show how to overcome these challenges and successfully build and train a distributed deep neural network with TensorFlow.
First, we will introduce deep learning on distributed infrastructure and cover key concepts such as experiment parallelism, model parallelism, and data parallelism. We will then discuss the limitations and challenges of each approach. Finally, we will demonstrate hands-on how to build and train distributed deep neural networks using TensorFlow's gRPC-based architecture (parameter and worker servers) on Clusterone.
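As a taste of the data-parallelism idea covered in the talk, here is a minimal toy sketch (plain NumPy, not Clusterone or TensorFlow code): each "worker" computes the gradient of the loss on its own data shard, and a "parameter server" averages those gradients and applies a single update. For a loss defined as a mean over examples, averaging equal-sized shard gradients recovers the full-batch gradient exactly.

```python
import numpy as np

def grad(w, X, y):
    # Gradient of the mean squared error 0.5 * mean((Xw - y)^2) w.r.t. w
    return X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(0)
X, y = rng.normal(size=(8, 3)), rng.normal(size=8)
w = np.zeros(3)

# Split the batch across two hypothetical workers (equal-sized shards)
shards = [(X[:4], y[:4]), (X[4:], y[4:])]
worker_grads = [grad(w, Xs, ys) for Xs, ys in shards]

# "Parameter server" step: average the worker gradients, update the weights
avg_grad = np.mean(worker_grads, axis=0)
w = w - 0.1 * avg_grad

# Sanity check: the averaged shard gradients equal the full-batch gradient
assert np.allclose(avg_grad, grad(np.zeros(3), X, y))
```

In a real TensorFlow cluster the same division of labor holds, except the shards live on separate machines and the gradient exchange happens over gRPC.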
Babak Rasolzadeh https://www.linkedin.com/in/babakrasolzadeh/ received his PhD in Computer Vision and Robotics in 2010, after which he started his first company, OculusAI, which he sold in 2013 to Meltwater, the world's largest media monitoring company. Since then, Babak has held various Director roles in Data Science and Machine Learning in industry, as well as being an advisor and investor in a handful of startups.
6:30–7 pm Meet and greet
7–8 pm Presentation and discussion
8–8:30 pm Social and catch-up