Webinar: "Sharing Large Amounts of Data with Open Source Delta Sharing" Online

Oct 26, 2021 · Stockholm, Sweden

To access this webinar, please register here:
https://attendee.gotowebinar.com/register/1795336879817942285

Topic: Sharing Large Amounts of Data with Open Source Delta Sharing

Speaker: Dr. Frank Munz, Developer Advocate at Databricks
https://www.linkedin.com/in/frankmunz/

Bio:
Dr. Frank Munz has authored three computer science books, built up technical evangelism for Amazon Web Services in Germany, Austria, and Switzerland, and once worked as a data scientist with a group that won a Nobel Prize for linking HPV to cancer.

Frank realized his dream of speaking at top-notch conferences on every continent (except Antarctica, because it is too cold there), such as re:Invent, Devoxx, KubeCon, and JavaOne. He holds a PhD in Computer Science from TU Munich.

Abstract:
In this session, the speaker will dive deep into Delta Sharing, a Linux Foundation open source solution for sharing massive amounts of data in a cheap, secure, and scalable way.

Delta Sharing reliably accesses data at the bandwidth of modern cloud object stores, such as S3, ADLS, or GCS. The data provider runs a sharing server and decides what data to share. To get you started, a hosted reference sharing service, an open-sourced pre-packaged server, and a Docker image are available for sharing data from your lakehouse.

Under the hood, Delta Sharing uses an open REST protocol, enabling secure data sharing across products and companies for the first time.
Any client supporting pandas, Apache Spark™, or Python can connect to the sharing server. Clients always read the latest version of the data, and they can provide filters on partitioned data to read only a subset of it.
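To make the REST protocol and the partition filters more concrete, here is a minimal sketch in Python of how a client might build a table query. It is an illustration under stated assumptions, not the connector's actual implementation: the endpoint, token, share, schema, and table names are hypothetical placeholders, and the URL layout and the `predicateHints` field reflect our reading of the open Delta Sharing protocol, so check them against the published specification. The request is only constructed, never sent.

```python
import json
import urllib.request
from urllib.parse import quote

# Hypothetical profile. A real one is a small JSON file issued by the data provider.
profile = {
    "shareCredentialsVersion": 1,
    "endpoint": "https://sharing.example.com/delta-sharing",
    "bearerToken": "<token>",
}


def build_query_request(profile, share, schema, table, predicate_hints=None):
    """Build (but do not send) a query request for one shared table."""
    url = "{}/shares/{}/schemas/{}/tables/{}/query".format(
        profile["endpoint"].rstrip("/"),
        quote(share, safe=""),
        quote(schema, safe=""),
        quote(table, safe=""),
    )
    body = {}
    if predicate_hints:
        # Filters on partition columns, expressed as SQL-like strings.
        body["predicateHints"] = list(predicate_hints)
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + profile["bearerToken"],
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical share, schema, and table names, with a filter on a partition column.
request = build_query_request(
    profile, "demo_share", "default", "trips", ["date >= '2021-10-01'"]
)
print(request.get_method(), request.full_url)
```

In practice most clients would not build requests by hand. The open source `delta-sharing` Python connector wraps this exchange, for example with `delta_sharing.load_as_pandas("<profile-file>#<share>.<schema>.<table>")`, which returns the table as a pandas DataFrame.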

This talk is built around a number of hands-on demos. We start with a multi-cloud example using Google Colab. Then the speaker will share some raw data from sampled DNA using Delta Sharing, and we will build a client in pandas. The client will then check for genetic traits such as eye color, coffee metabolism rate, and special nutritional requirements. All data access is read-only, so there will be no harm to the presenter. To conclude, we will compare running your own self-hosted Delta Sharing server with sharing data from a managed cloud service using SQL.

[November] Get your Pass to ODSC West 2021 with an additional discount - https://bit.ly/3fGU0sS or Virtual pass - https://bit.ly/2SXM2E4

[18th November] Free Virtual Ai+ Professionals Expo - https://hubs.li/H0Y8St80

ODSC Links:
• Get free access to more talks/trainings like this at AI+ Training platform:
https://aiplus.training/
• Facebook: https://www.facebook.com/OPENDATASCI
• Twitter: https://twitter.com/odsc & @odsc
• LinkedIn: https://www.linkedin.com/company/open-data-science
• Slack Channel: https://bit.ly/35pfPZo
• ODSC West Kickstart Bootcamp Nov 15th - 18th - https://odsc.com/california/bootcamp/
• West Conference November 16th - 18th: https://odsc.com/california/
• Code of conduct: https://odsc.com/code-of-conduct/

Event organizers
  • Milano Data Science #ODSC

    #ODSC brings together the data science community with the goal of helping its members learn and network around the latest topics and trends in Data Science and Artificial Intelligence. We are looking for volunteers and contributors in Milano to help grow the community. ODSC will host speakers, food, and a venue if necessary. Many of our top volunteers get an all expenses paid trip to ODSC data science conferences in either London, San Francisco, Boston or other events. If you are interested in helping ou
