Incremental learning tools such as Vowpal Wabbit (VW) and Sofia-ML can learn from massive datasets by streaming the data through a fixed-size memory window. As a result, they can fit a useful model to datasets that are much larger than the available system memory. Progressive validation is at the heart of this learning approach: the model must predict each example's label before seeing the true label, and the running average of those prediction losses yields a surprisingly reliable estimate of generalization error.
We'll do a quick walkthrough of VW and Sofia-ML, with some live demonstrations of model training and testing.
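To make the idea of progressive validation concrete, here is a minimal sketch in plain Python. It is not how VW or Sofia-ML implement it; the synthetic data stream, the function names (stream_examples, progressive_validation), the learning rate, and the choice of online logistic regression are all assumptions made for illustration. The point is only the ordering: each example is scored before its label is used for an update, and only one example is held in memory at a time.

```python
import math
import random


def stream_examples(n, seed=0):
    """Yield (features, label) pairs one at a time, simulating a data stream."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        # Hypothetical ground truth for this demo: label is 1 when x0 + x1 > 0.
        y = 1 if x[0] + x[1] > 0 else 0
        yield x, y


def progressive_validation(examples, n_features, lr=0.5):
    """Train online logistic regression and track the error of
    predictions made before each label is used for learning."""
    weights = [0.0] * n_features
    bias = 0.0
    mistakes = 0
    seen = 0
    for x, y in examples:
        # Step 1: predict BEFORE the true label influences the model.
        z = bias + sum(w * xi for w, xi in zip(weights, x))
        p = 1.0 / (1.0 + math.exp(-z))
        mistakes += int((p >= 0.5) != (y == 1))
        seen += 1
        # Step 2: only now use the label for a single SGD update.
        gradient = p - y
        weights = [w - lr * gradient * xi for w, xi in zip(weights, x)]
        bias -= lr * gradient
    return weights, bias, mistakes / seen


weights, bias, pv_error = progressive_validation(stream_examples(5000), n_features=2)
print(f"progressive validation error: {pv_error:.3f}")
```

Because every prediction is made on an example the model has not yet learned from, the accumulated error behaves like a held-out estimate on a single pass over the stream, without setting aside a separate test set.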
Speaker: Arshak Navruzyan, VP Product @ Argyle Data
My objective is to make distributed systems and machine learning accessible to any organization or individual that wants to transform the world through data. I am currently VP of Product Management at Argyle Data, focused on petabyte-scale risk management applications using machine learning and Hadoop. Previously, I held senior engineering and product management roles at Alpine Data Labs, Endeca, and Oracle. I am a contributor to the Apache Accumulo project and the organizer of the San Francisco Machine Learning Meetup group.