Come join us for our Boston NLP Meetup!
Our special guest is Alex Wiltschko from Twitter Cortex.
*Pizza and beer will be provided :)
Automatic differentiation, the algorithm behind all deep nets (and all gradient-based learning, too!)
A painful and error-prone step of working with gradient-based models (deep neural networks being one kind) is actually deriving the gradient updates. Deep learning frameworks like Torch, TensorFlow and Theano have made this a great deal easier for a limited set of models: these frameworks save the user from doing any significant calculus by instead forcing the framework developers to do all of it. However, if a user wants to experiment with a new model type, or change some small detail the developers hadn’t planned for, they are back to deriving gradients by hand. Fortunately, an idea more than 30 years old, called “automatic differentiation”, and a one-year-old machine-learning-oriented implementation of it, called “autograd”, can bring true and lasting peace to the hearts of model builders. With autograd, building and training even extremely exotic neural networks becomes as easy as describing the architecture. It’s fast, it’s easy, and you should probably be using it.
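To give a flavor of the idea before the talk, here is a minimal sketch of reverse-mode automatic differentiation in plain Python. It is not the autograd library's API or implementation: the names `Var`, `tanh`, `grad` and `neuron` are hypothetical, and the sketch handles only scalar addition, multiplication and tanh. Each operation records its inputs and local derivatives, and `grad` then walks that record backwards, applying the chain rule.

```python
import math


class Var:
    """A scalar that remembers how it was computed (hypothetical helper)."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (parent Var, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    __rmul__ = __mul__


def tanh(x):
    """tanh on a Var; the local derivative is 1 - tanh(x)^2."""
    t = math.tanh(x.value)
    return Var(t, ((x, 1.0 - t * t),))


def grad(f):
    """Return a function that computes df/d(arg) for every scalar argument of f."""
    def df(*args):
        inputs = [Var(a) for a in args]
        out = f(*inputs)

        # Order the recorded graph so every node comes after its parents.
        order, seen = [], set()

        def visit(v):
            if id(v) in seen:
                return
            seen.add(id(v))
            for parent, _ in v.parents:
                visit(parent)
            order.append(v)

        visit(out)

        # Backward pass: push gradients from the output to the inputs.
        out.grad = 1.0
        for v in reversed(order):
            for parent, local in v.parents:
                parent.grad += local * v.grad
        return [v.grad for v in inputs]
    return df


def neuron(w, x, b):
    # The "model" is just ordinary code; no gradient is derived by hand.
    return tanh(w * x + b)


grads = grad(neuron)(0.5, 2.0, -1.0)
print(grads)  # [2.0, 0.5, 1.0]
```

Changing `neuron` to some other combination of these operations requires no new calculus: `grad` works from whatever computation was recorded, which is the property the talk is about.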
Alex Wiltschko is a neuroscientist-turned-engineer working on the next generation of machine learning tools in Twitter Cortex, the social media company’s machine learning group.