This will be the seventh PyData Warsaw regular meetup at Centrum Szkoleniowe Adgar Ochota near Warszawa Zachodnia station (English: Warsaw West).
Room: "Event Room"
Doors open at 18:00, talks start at 18:30, and at about 21:00 we move to a pub. We are ready to host 150 people in the room, so there should be plenty of people to discuss data science questions with!
Please remember to un-RSVP if you realize you can't make it. It will help our crew a lot.
And make sure you follow @pydatawarsaw for any updates and early announcements.
Michał Jaślan (Onwelo) - HawqData
As you know, analyzing data is a really demanding task. Business domain knowledge, analytic techniques and algorithms, and tools are all areas that a successful analyst needs to cover. To make things a little simpler, I would like to share some experiences with the HAWQ analytical database.
Bartosz Biskupski & Wojciech Walczak (Samsung) - How much meaning can you pack into a real-valued vector? Semantic similarity measuring using recursive auto-encoders
The presentation will start with a brief overview of AI research and development at Samsung R&D in Poland. We will then describe a solution, developed in one of our projects, that has won the Semantic Textual Similarity (STS) task within the SemEval 2016 research competition. The goal of this competition was to measure semantic similarity between two given sentences on a scale from 0 to 5. At the same time the solution should replicate human language understanding. The presented model is a novel hybrid of recursive auto-encoders (a deep learning technique) and a WordNet award-penalty system, enriched with a number of other similarity models and features used as input for Linear Support Vector Regression.
Michał Tadeusiak & Jan Milczek (CodiLime) - Machine Learning for the Safety in Coal Mines
An overview of the AAIA'16 Data Mining Challenge, its evaluation metric and its data; which tricks and ideas worked best; and more on feature extraction, model selection and blending.
Maciej Bryński (Innovation In IT) - Apache Airflow
Apache Airflow is a tool for managing data processing workflows. The presentation will describe the tool's strongest points and its limitations. Maciej Bryński will also show that Apache Airflow can replace Cron, Jenkins and Oozie.
18:00 - 18:30 - Doors open
18:30 - 18:35 - Introduction
18:35 - 19:05 - First Talk
19:10 - 19:40 - Second Talk
19:40 - 19:50 - Break
19:50 - 20:20 - Third Talk
20:25 - 20:55 - Fourth Talk
20:55 - 21:00 - "Speed Dating"
21:00 - 24:00 - After party with our partner, free drinks! (at Świeżo Malowane)
PS1: The default language is English, but there may be some exceptions to the rule.
PS2: The presentations are mainly, but not exclusively, Python-focused. We expect to host a number of guests working with R, Scala and other languages.
direct contact: [masked], [masked]
See you!