Building a Serverless Data Lake on AWS

Feb 5, 2019 · New York, United States of America

The Data Lake is a data-centered architecture featuring a repository capable of storing vast quantities of data in various formats. Data from enterprise systems, databases, web server logs, social media, and third-party sources is ingested into the Data Lake in a secure and governed manner. The data is then cleansed, conformed, integrated, and modeled into “refined and for purpose zones” for exploratory and analytical consumption. Business and technical metadata, including lineage, is captured in the data catalog for search and discovery. Security policies, including entitlements, are also applied.

Data can flow into the Data Lake through either batch processing or real-time processing of streaming data. Because the data is no longer constrained by initial schema decisions, the enterprise can exploit it more freely. Built on top of this repository is a set of capabilities that allow IT to provide Data and Analytics as a Service (DaaS) in a supply-demand model: IT takes the role of the data provider (supplier), while business users (data scientists, business analysts) are the consumers.
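
To illustrate the point about data not being tied to initial schema decisions, here is a minimal schema-on-read sketch in plain Python. It is not taken from the session; the record contents, the RAW_EVENTS name, and the read_with_schema helper are all hypothetical and stand in for raw files in a lake and a query engine that applies a schema at read time.

```python
import json

# Hypothetical raw-zone records, stored exactly as they arrived.
# Note that the records do not share a fixed set of fields.
RAW_EVENTS = [
    '{"user": "a1", "amount": "19.99", "channel": "web"}',
    '{"user": "b2", "amount": 5, "coupon": "SPRING"}',
    '{"user": "c3"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time.

    `schema` maps field name -> cast function. Fields missing from a
    record become None; fields not named in the schema are ignored.
    """
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        row = {}
        for field, cast in schema.items():
            value = record.get(field)
            row[field] = cast(value) if value is not None else None
        rows.append(row)
    return rows


# Two consumers read the same raw data through different schemas.
finance_view = read_with_schema(RAW_EVENTS, {"user": str, "amount": float})
marketing_view = read_with_schema(
    RAW_EVENTS, {"user": str, "channel": str, "coupon": str}
)

if __name__ == "__main__":
    for row in finance_view:
        print(row)
    for row in marketing_view:
        print(row)
```

The raw records are never rewritten; each consumer decides which fields it needs and how to type them when it reads, which is the freedom the paragraph above describes.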

AWS provides an extensive set of tools and services for implementing serverless data lake architectures.

In this session, Akshay Goel and Alberto Artasanchez from Knowledgent will share stories from the trenches: the challenges they have faced implementing petabyte-scale data lakes and how they have overcome them. We will also learn about S3, DynamoDB, Kinesis, AWS Glue, Athena, Lambda, EMR, QuickSight, SageMaker, and other AWS services applicable to Data Lakes.

Event organizers
  • AWS New York | Official Meetup

    Welcome to the official Amazon Web Services (AWS) New York Meetup. Founded and organized by AWS employees, we'll explore all aspects of working with AWS. Learn about new services and features, hear from AWS customers who are using our services in new and exciting ways, learn how to partner with AWS and enjoy the company of others who are eager to share experiences.
