Beyond Word2Vec: Recent Developments in Document Embedding

Oct 24, 2017 · San Francisco, United States of America

Main Talk: Beyond Word2Vec: Recent Developments in Document Embedding - Andrew Blevins (Metis)

Abstract: It is easy to be amazed by the seemingly magical power of word2vec. But in real business use cases, we rarely need to understand single words. So how do we apply the power of word2vec to phrases, sentences, paragraphs or entire documents? We will compare various techniques for generating useful representations of documents of indeterminate length and look at ways of comparing those methods.

We will start with bag-of-words approaches and TF-IDF. From there we will look at dimensionality reduction techniques like LSA or NMF. After that, we will look at word2vec and sense2vec and various ways to aggregate those word vectors, including summing, weighting, clustering, Chinese restaurant processes, Gensim Doc2Vec and developing parse tree representations. Finally, we will look at RNN methods such as LSTMs using Keras. Along the way, we will look at ways to evaluate each of these methods and discuss strengths and weaknesses.
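
As a rough illustration of the "summing" and "weighting" aggregation mentioned above, here is a minimal sketch that averages word vectors into a document vector, with optional IDF weighting. It is not taken from the talk: the toy 3-dimensional vectors, the example documents, and the helper names (idf_weights, embed, cosine) are all hypothetical, and the smoothed IDF formula is just one common choice. Real word vectors would come from a trained word2vec model.

```python
import math
from collections import Counter

# Hypothetical toy word vectors (assumption); a real setup would load
# vectors from a trained word2vec model instead.
TOY_VECTORS = {
    "debt": [0.9, 0.1, 0.0],
    "loan": [0.8, 0.2, 0.1],
    "payment": [0.7, 0.3, 0.0],
    "snow": [0.0, 0.9, 0.2],
    "ski": [0.1, 0.8, 0.3],
    "the": [0.3, 0.3, 0.3],
}


def idf_weights(docs):
    """Smoothed inverse document frequency for every word in the corpus."""
    n = len(docs)
    doc_freq = Counter(word for doc in docs for word in set(doc))
    return {w: math.log((1 + n) / (1 + c)) + 1.0 for w, c in doc_freq.items()}


def embed(doc, vectors, idf=None):
    """Weighted average of the word vectors in doc (plain mean if idf is None)."""
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for word in doc:
        if word not in vectors:
            continue  # skip out-of-vocabulary words
        weight = idf.get(word, 1.0) if idf else 1.0
        for i, x in enumerate(vectors[word]):
            total[i] += weight * x
        weight_sum += weight
    if weight_sum == 0.0:
        return total  # all-zero vector when no word is known
    return [t / weight_sum for t in total]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical tokenized documents (assumption).
docs = [
    ["the", "debt", "payment"],
    ["the", "loan", "payment"],
    ["the", "snow", "ski"],
]
idf = idf_weights(docs)
doc_vecs = [embed(doc, TOY_VECTORS, idf) for doc in docs]
sim_finance = cosine(doc_vecs[0], doc_vecs[1])
sim_mixed = cosine(doc_vecs[0], doc_vecs[2])
```

With these toy inputs, the two finance-themed documents end up closer to each other than to the skiing one. The IDF weighting shrinks the contribution of words like "the" that appear in every document, which is one simple way to keep frequent words from dominating an averaged representation.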

Bio: Andrew comes to Metis from LinkedIn, where he worked as a data scientist on projects ranging from executive dashboarding and education to inferring profiles and skills standardization. He is passionate about helping people make rational decisions and building cool data products. Prior to that he worked on fraud modelling at IMVU (the lean startup) and studied applied physics at Cornell. Andrew grew up on a sheep farm in North Idaho. He loves snowboarding, traveling, scotch and reading about all kinds of nerdy topics.

Lightning Talk: Machine Learning at TrueAccord - Nadav Samet (TrueAccord)

Abstract: TrueAccord reinvents debt collection and empowers consumers to regain financial health. Using machine learning and behavioral analytics, we replace the majority of human-to-human interactions with human-to-machine interactions and make a significant impact on millions of consumers in one of the most regulated industries in the US.

Bio: Nadav has over 20 years of coding experience, with more than 7 years as a solutions engineer for various startups. He began his career at the elite technological unit of the IDF’s Intelligence Corps where he specialized in data and network analysis.

Tentative Schedule:

6:00pm-6:45pm -- pre-reception

6:45pm-7:00pm -- lightning talk

7:00pm-8:00pm -- main talk

8:00pm-8:30pm -- post-reception


Thanks to TrueAccord for hosting, food, and drinks!

Thanks to Intel for supporting video recording.


NOTE: For TrueAccord compliance purposes, attendees will have to sign a release indicating they only had access to the public speaking area of the office (i.e., this is not an NDA).

Intel ML Contest:

Intel has also created a mini contest for all participants. If you have an ML project and want to showcase it, share it, or collaborate on it with others, submit it to DevMesh. Everyone who submits a project to DevMesh will get remote access to Machine Learning Servers. On top of that, the best projects will be selected and each winner will receive a $50 gift card.

Instructions to join DevMesh:

Create a new account at

Join your dedicated group - Artificial Intelligence West Coast

To submit a project, click on “add a project”
*when submitting your project, make sure to select “Artificial Intelligence West Coast” as your group.

To receive invitations for Intel webinars, news and tools for Machine Learning and Deep Learning, register on this link

Event organizers
  • SF Bayarea Machine Learning

    A group to discuss all things Machine Learning! Topics may include (but are not limited to): learning theory, computer vision, natural language processing, data mining, computer audition, etc. Everyone is welcome! Videos of past meetups are available on Hakka Labs and on Vimeo.
