This event is co-organized with the Budapest-Users-of-R-Network: https://www.meetup.com/Budapest-Users-of-R-Network/events/261491336
With all the hype about deep learning and “AI”, it is not well publicized that for the structured/tabular data widely encountered in business applications, it is actually another machine learning algorithm, the gradient boosting machine (GBM), that most often achieves the highest accuracy in supervised learning/prediction tasks. In this talk we’ll review some of the main GBM implementations, such as xgboost, h2o, lightgbm, catboost, and Spark MLlib (all of them available from R/Python), and we’ll discuss some of their main features and characteristics (such as training speed, memory footprint, scalability to multiple CPU cores and in a distributed setting, prediction speed, etc.). If you have seen an earlier version of my talk with the same title (for example at eRum or Crunch, or the video recording from several other conferences/meetups in the USA), this talk will have plenty of updates, some as recent as a few weeks ago, that will make it worth hearing again (for example more details on the GPU implementations, new results on catboost, and exciting updates on Spark MLlib).
Speaker: Szilard Pafka, PhD, Chief Scientist at Epoch (USA)
Szilard studied Physics in the 90s and obtained a PhD by using statistical methods to analyze the risk of financial portfolios. He worked in finance, then more than a decade ago moved to become the Chief Scientist of a tech company in Santa Monica, California, doing everything data (analysis, modeling, data visualization, machine learning, data infrastructure, etc.). He is the founder/organizer of several meetups in the Los Angeles area (R, data science, etc.) and of the data science community website datascience.la. He is the author of a well-known machine learning benchmark on GitHub (1000+ stars) and a frequent speaker at conferences (keynote/invited at KDD, R-finance, Crunch, eRum and contributed at useR!, PAW, EARL, H2O World, Data Science Pop-up, Dataworks Summit, etc.), and he has developed and taught graduate data science and machine learning courses as a visiting professor at two universities (UCLA in California and CEU in Europe).