SciDB as a Data Backend for R

Feb 12, 2013 · New York, United States of America

We have a short turnaround time this month as we welcome Bryan Lewis to discuss SciDB.

About the talk:

SciDB is an open-source database that organizes data in n-dimensional arrays.

Interesting SciDB features include parallel processing, distributed storage, ACID transactions, efficient sparse array storage, and native linear algebra operations.

The "scidb" package for R provides two general ways to interact with SciDB from R:
1. By running database queries from R and transferring the results using data.frame iterators.
2. Through a sparse n-dimensional array object class for R, inspired by the bigmemory package. The arrays mimic standard R arrays, but operations on them are performed by the SciDB engine. Data are materialized to R only when requested.

We illustrate the use of SciDB with R through a few examples, including computing a truncated singular value decomposition of a large matrix and bi-clustering large arrays with the biclust package.
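For readers unfamiliar with the term, a truncated singular value decomposition keeps only the k largest singular values of a matrix and their singular vectors. The sketch below is a toy illustration in plain Python, not the method used in the talk: it swaps in simple power iteration with deflation, and it does not use SciDB, the "scidb" package, or irlba. The function names (`matvec`, `rmatvec`, `dot`, `truncated_svd`) and the parameters `iters` and `seed` are assumptions made for this example.

```python
import math
import random


def matvec(A, x):
    """Return A @ x for a dense matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def rmatvec(A, y):
    """Return A^T @ y without forming the transpose."""
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(len(A[0]))]


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def truncated_svd(A, k, iters=500, seed=0):
    """Approximate the k largest singular triplets (sigma, u, v) of A.

    Hypothetical helper: power iteration on A^T A with deflation.
    """
    m, n = len(A), len(A[0])
    if k > min(m, n):
        raise ValueError("k must not exceed min(rows, cols)")
    rng = random.Random(seed)
    triplets = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for _ in range(iters):
            # Deflation: remove components along right singular vectors found so far.
            for _, _, vj in triplets:
                c = dot(v, vj)
                v = [a - c * b for a, b in zip(v, vj)]
            # One power step on A^T A, then renormalize.
            w = rmatvec(A, matvec(A, v))
            nw = math.sqrt(dot(w, w))
            if nw == 0.0:
                break  # the remaining singular values are zero
            v = [x / nw for x in w]
        nv = math.sqrt(dot(v, v))
        if nv > 0.0:
            v = [x / nv for x in v]
        Av = matvec(A, v)
        sigma = math.sqrt(dot(Av, Av))
        u = [x / sigma for x in Av] if sigma > 0.0 else Av
        triplets.append((sigma, u, v))
    return triplets


# Small demonstration matrix whose singular values are 3, 2 and 1 by construction.
A = [[3.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
for sigma, u, v in truncated_svd(A, k=2):
    print(round(sigma, 6))
```

With k=2 the two printed values should be close to 3 and 2, the largest singular values of the demonstration matrix. Power iteration converges slowly when neighbouring singular values are close together, which is one reason large problems are usually handled with Lanczos-type methods instead.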

About Bryan:

Bryan Lewis has worked with R for many years and is the author of several R packages, including irlba, rredis, doRedis, websockets, and bigalgebra. He is the chief data scientist at Paradigm4 in Waltham, MA, and has a Ph.D. in applied mathematics.

 

Pizza starts at 6:15. Bryan will go on at 7, and then we'll head to the bar.

Event organizers
  • New York Open Statistical Programming Meetup

    Meet with other users of the open-source programming language R. Previously, this meetup focused only on the R language, but it now covers all open-source data analysis tools, including but not limited to Python, Julia, C++, and Stan. Learn and share tricks and techniques with other users. Beginners and advanced users are all welcome.
