Notes from the Workshop on Sentiment Analysis in Text with SparkNLP

Oct 20, 2021 · Mexico City, Mexico

Workshop notes: Sentiment Analysis in Text with Python, NLP with SpaCy and SparkNLP

The workshop runs for 2 days, 2 hours per day, starting at 8 pm and ending at 10 pm.

It takes place on WEDNESDAY, October 20 and THURSDAY, October 21.

COST: 1000 pesos + VAT (IVA)

Payment via PayPal

PayPal.me/saxsa2000

Those who prefer an interbank transfer will be sent the BBVA CLABE.

The cost is per 4-hour module, spread over the two workshop days.

A virtual machine is provided with all systems installed: CentOS 8, Anaconda, Jupyter, Python, Pandas, Spark, PySpark, Koalas, SpaCy and SparkNLP.

SpaCy tool

https://spacy.io/

SparkNLP tool

https://nlp.johnsnowlabs.com/

The central topic is a text-based sentiment analysis system and the Natural Language Processing techniques behind it. We work in Python, starting with SpaCy and then, when the data volume is large, scaling the solution to SparkNLP.

NLP applications are emerging rapidly these days, and knowledge of NLP tools becomes more important every day.

A year ago we held a series of seminars on this topic using an excellent tool: SpaCy with Python.

Today we want to share the URLs with everyone, and we are offering a workshop in which ALL the notebooks are explained and handed over, running in the virtual machine.

Sincerely,

Dr. Gabriel Guerrero

saxsa2000 (at) gmail.com

NLP notes, main ideas

A good NLP library should be able to correctly transform the free text into structured features and let you train your own NLP models that are easily fed into the downstream machine learning (ML) or deep learning (DL) pipeline with no hassle.

As a general-purpose, in-memory distributed data processing engine, Apache Spark has gained a lot of attention from industry. It already has its own ML library (Spark ML) and a few other modules for certain NLP tasks, but it does not cover all the NLP tasks needed for a full-fledged solution.

Spark NLP is an open-source natural language processing library, built on top of Apache Spark and Spark ML.

Spark NLP’s annotators use rule-based algorithms and machine learning, and some of them run TensorFlow under the hood to power specific deep learning implementations.

The library covers many common NLP tasks, including tokenization, stemming, lemmatization, part of speech tagging, sentiment analysis, spell checking, named entity recognition, and more.

As a native extension of the Spark ML API, the library offers the capability to train, customize and save models so they can run on a cluster or on other machines, or be saved for later use.

Using TensorFlow under the hood for deep learning enables Spark NLP to make the most of modern computing platforms, from NVIDIA’s DGX-1 to Intel’s Cascade Lake processors.

Spark NLP introduces NLP annotators

A Transformer is an algorithm which can transform one DataFrame into another DataFrame. E.g., an ML model is a Transformer that transforms a DataFrame with features into a DataFrame with predictions.

An Estimator in Spark ML is an algorithm which can be fit on a DataFrame to produce a Transformer. E.g., a learning algorithm is an Estimator which trains on a DataFrame and produces a model.
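
The Estimator/Transformer contract described above can be sketched in plain Python. This is only an analogy under stated assumptions, not Spark ML code: a list of dictionaries stands in for a DataFrame, and the class names MeanThresholdEstimator and MeanThresholdModel are hypothetical, invented for this illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

Row = Dict[str, float]
Table = List[Row]  # assumption: a list of dicts stands in for a Spark DataFrame


@dataclass
class MeanThresholdModel:
    """Transformer role: maps a table with 'score' to a table with 'prediction'."""
    threshold: float

    def transform(self, table: Table) -> Table:
        # Build a new table; the input table is left unchanged.
        return [
            {**row, "prediction": 1.0 if row["score"] > self.threshold else 0.0}
            for row in table
        ]


class MeanThresholdEstimator:
    """Estimator role: fit() learns from a table and returns a Transformer."""

    def fit(self, table: Table) -> MeanThresholdModel:
        mean = sum(row["score"] for row in table) / len(table)
        return MeanThresholdModel(threshold=mean)


# Training data: the learned threshold is the mean score (about 0.5 here).
train = [{"score": 0.1}, {"score": 0.4}, {"score": 0.9}, {"score": 0.6}]
model = MeanThresholdEstimator().fit(train)

# The fitted model is a Transformer: it adds a 'prediction' column.
predictions = model.transform([{"score": 0.2}, {"score": 0.8}])
print(predictions)
```

The point of the split is that fit() is the only step that looks at training data; once it returns, the resulting model can be applied to any number of new tables with transform().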

In Spark NLP, every annotator is either an Estimator or a Transformer.

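In Spark NLP, there are two types of annotators: AnnotatorApproach, which extends Spark ML's Estimator and must be trained with fit(), and AnnotatorModel, which extends Transformer and converts one DataFrame into another with transform().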

Another important point is that each annotator accepts columns of certain types and outputs new columns of another type (we call this the AnnotatorType).

In Spark NLP, we have the following types: document, token, chunk, pos, word_embeddings, date, entity, sentiment, named_entity, dependency, labeled_dependency.

A column can only be fed into an annotator if it already has one of these types; otherwise, you first need one of the Spark NLP transformers, such as DocumentAssembler, to create a typed column from the raw data.

Trained annotators are called AnnotatorModel.

The goal is to transform one DataFrame into another through the specified model (trained annotator).
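
The relationship between an untrained annotator, a trained annotator, and typed columns can be illustrated with a toy in plain Python. This is a sketch by analogy only and does not use the Spark NLP API: Column, ToySentimentApproach and ToySentimentModel are hypothetical names, a dictionary of columns stands in for a DataFrame, and the "training" is a trivial word-list heuristic chosen just to keep the example short.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Column:
    """Toy column: an annotator type plus one list of strings per row."""
    annotator_type: str
    values: List[List[str]]


Frame = Dict[str, Column]  # assumption: toy stand-in for a Spark DataFrame


@dataclass
class ToySentimentModel:
    """Plays the role of an AnnotatorModel: already trained, only transforms."""
    positive_words: Set[str]
    input_col: str = "token"
    input_type: str = "token"
    output_col: str = "sentiment"
    output_type: str = "sentiment"

    def transform(self, frame: Frame) -> Frame:
        column = frame[self.input_col]
        # Reject columns whose annotator type is not the one this model accepts.
        if column.annotator_type != self.input_type:
            raise TypeError(
                f"expected '{self.input_type}', got '{column.annotator_type}'"
            )
        labels = [
            ["positive" if self.positive_words & set(tokens) else "negative"]
            for tokens in column.values
        ]
        # The output is a new frame with one extra column of a different type.
        return {**frame, self.output_col: Column(self.output_type, labels)}


class ToySentimentApproach:
    """Plays the role of an AnnotatorApproach: fit() returns a trained model."""

    def fit(self, tokens: Column, labels: List[str]) -> ToySentimentModel:
        positive: Set[str] = set()
        negative: Set[str] = set()
        for row_tokens, label in zip(tokens.values, labels):
            (positive if label == "positive" else negative).update(row_tokens)
        # Keep only words seen exclusively in positive rows.
        return ToySentimentModel(positive_words=positive - negative)


train_tokens = Column("token", [["great", "movie"], ["boring", "movie"]])
model = ToySentimentApproach().fit(train_tokens, ["positive", "negative"])

frame = {"token": Column("token", [["great", "plot"], ["boring", "plot"]])}
result = model.transform(frame)
print(result["sentiment"].values)
```

In this toy, passing a column typed "document" instead of "token" raises a TypeError, which mirrors why untyped raw text has to be converted into a typed column before an annotator can consume it.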
