Learn about Your Location (Using ALL Your Data)
Presented by Taryn Price and Courtney Shindeldecker from CCRi in Charlottesville.
Unsupervised learning is not just for building robots! What can it tell you about your location? About a new location you're interested in? CCRi is working on new technologies that map your data, in its many forms, into a single “embedding space” over which we can reason, search, and learn.
Often, multiple disparate data sources that vary widely in form are available for a single geographic region or modeling task. Consider the data available for the City of Chicago: crime events stored as traditional flat files, transportation lines available as shapefiles, satellite imagery, geotagged tweets, and other publicly available data. Unfortunately, traditional machine learning algorithms require that all input data share the same format. We must therefore combine these disparate data sources into a single, “fused” dataset, which can then be fed into an unsupervised machine learning model. The result is a set of high-dimensional vector representations, called embeddings, that draw on all of the data sources to describe the geographic region and what it contains.
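To make the fusion step concrete, here is a minimal sketch in Python. It is not CCRi's method: it assumes every source can be reduced to point locations, counts points per cell of a regular grid to build the fused dataset, and uses PCA as a simple stand-in for the unsupervised model. The bounding box, grid size, source names, and all function names are hypothetical, and the points are synthetic.

```python
import numpy as np

def cell_index(lon, lat, bounds, n_side):
    """Map point coordinates to flat grid-cell indices over a bounding box."""
    min_lon, min_lat, max_lon, max_lat = bounds
    col = ((lon - min_lon) / (max_lon - min_lon) * n_side).astype(int)
    row = ((lat - min_lat) / (max_lat - min_lat) * n_side).astype(int)
    col = np.clip(col, 0, n_side - 1)
    row = np.clip(row, 0, n_side - 1)
    return row * n_side + col

def count_per_cell(lon, lat, bounds, n_side):
    """Count how many points from one source fall in each grid cell."""
    idx = cell_index(np.asarray(lon, float), np.asarray(lat, float), bounds, n_side)
    return np.bincount(idx, minlength=n_side * n_side).astype(float)

def fuse_sources(sources, bounds, n_side):
    """Stack per-source cell counts into one (n_cells, n_sources) matrix."""
    columns = [count_per_cell(lon, lat, bounds, n_side) for lon, lat in sources]
    return np.column_stack(columns)

def pca_embed(features, n_dims):
    """Standardize features, then project onto the top principal components."""
    centered = features - features.mean(axis=0)
    std = centered.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unscaled
    scaled = centered / std
    _, _, vt = np.linalg.svd(scaled, full_matrices=False)
    return scaled @ vt[:n_dims].T

# Hypothetical bounding box (min_lon, min_lat, max_lon, max_lat) and grid size.
bounds = (-87.95, 41.64, -87.52, 42.03)
n_side = 4

rng = np.random.default_rng(0)

def synthetic_points(n):
    """Generate n random (lon, lat) points inside the bounding box."""
    lon = rng.uniform(bounds[0], bounds[2], n)
    lat = rng.uniform(bounds[1], bounds[3], n)
    return lon, lat

# Three synthetic "sources" standing in for crime events, tweets, transit stops.
sources = [synthetic_points(500), synthetic_points(800), synthetic_points(120)]

fused = fuse_sources(sources, bounds, n_side)   # one row per grid cell
embeddings = pca_embed(fused, n_dims=2)         # one embedding per grid cell
print(fused.shape, embeddings.shape)
```

Each row of `embeddings` describes one grid cell using information from all three sources at once. Real inputs such as shapefiles or satellite imagery would need their own per-cell feature extraction before they could be stacked this way.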
Once we represent a region as a set of embeddings, we can apply conventional supervised learning techniques to uncover both spatial and non-spatial relationships in a geographic setting. We can take it one step further by exploring this learned embedding space in an unsupervised fashion to uncover hidden structure in the data. These tools can help answer questions that permeate many fields: What parts of my city are similar in ways I didn't know about? Are vehicles deviating from their normal travel patterns? Based on past successes (or failures), in what city should I roll out my new product?
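One simple way to explore an embedding space is a nearest-neighbor search, which speaks to the question "what parts of my city are similar?". The sketch below assumes each region already has an embedding vector and ranks the other regions by cosine similarity. The toy vectors and the function name `most_similar` are hypothetical, and cosine similarity is only one reasonable choice of measure, not necessarily the one the speakers use.

```python
import numpy as np

def most_similar(embeddings, query, k=3):
    """Return indices and scores of the k rows most cosine-similar to row `query`."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.where(norms == 0, 1.0, norms)  # unit-length rows
    sims = unit @ unit[query]        # cosine similarity to the query row
    sims[query] = -np.inf            # exclude the query region itself
    order = np.argsort(-sims)[:k]    # highest similarity first
    return order, sims[order]

# Toy embeddings: one row per region (values are made up for illustration).
region_embeddings = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

neighbors, scores = most_similar(region_embeddings, query=0, k=2)
print(neighbors, scores)
```

The same ranking can be read in reverse: a region or track whose embedding has no close neighbors is a candidate anomaly.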
Taryn Price brings an extensive background in machine learning, modeling, simulation, and statistical analysis. She has data science experience working on projects in many different industries including healthcare, energy, and national security.
Taryn’s current work at CCRi focuses on uncovering relationships between entities by combining many data sources into a single model-space.
Courtney Shindeldecker is a Data Scientist at CCRi, where she focuses on the development of modeling and analytic techniques for the Office of Naval Research. Her work has included optimization, machine learning, and Markov Decision Processes. Before coming to CCRi she worked as an Operations Research Analyst at Northrop Grumman in the domains of modeling and analysis. She received her MS in Mathematical Sciences from Clemson University and her BS in Mathematics and Spanish from Grove City College.