We are thrilled to announce our third event, and it's just as interesting as the first two! This time we were able to convince Javier Ramirez to leave the British Isles and come tell us about data lakes and share lots of practical tips.
Do not forget to follow us at @MadridDataEng for updates on this event and info on future events.
*** From raw data to business insights: what you need to build a modern data lake ***
So you have a lot of data… Congratulations. What are you going to do with it? Analyzing data from multiple sources at scale requires some kind of central repository, which we call a data lake. How are you going to get your data into it? Which format will you choose to store everything? How are you going to do validations and data preparation? What about streaming data? Have I mentioned evolving schemas? And where can you find information about the different datasets? Can your data scientists easily discover and play with the data? Can your business people create reports via drag and drop? Can your ops people monitor what’s going on? Will it scale when you have twice as much data?
In this talk I will address the common pitfalls of building a data lake, and I will present the tools available in the open source ecosystem to deal with them. I will also show you how AWS can help you manage your data lake more efficiently.
*** About Javier - @supercoco9 ***
I work as a Technical Evangelist at AWS to help developers make the most of the cloud, so they can focus on solving interesting problems and rely on AWS for performance, scalability, elasticity, and security.
I love data storage, big and small. I have extensive experience with different SQL, NoSQL, graph, in-memory, and Big Data solutions. I like distributed, scalable, always-on systems.
Before working at AWS I spent 20 years developing software professionally and sharing what I learnt with the community. I've spoken at events in more than 15 countries, mentored dozens of startups, taught for 6 years at universities, and trained hundreds of professionals on cloud and data engineering.
Having co-started 3 companies, I bring a startup mentality even when working with large corporations.
*** The talk will take place in the Geoblink offices, with a ~45-minute presentation in English plus questions, followed by networking over pizza and beer. ***