Spark NLP is a state-of-the-art Natural Language Processing library built on top of Apache Spark. It provides simple, performant, and accurate NLP annotations for machine learning pipelines that scale easily in a distributed environment. Spark NLP comes with more than 21,000 pretrained pipelines and models in more than 200 languages. It supports tasks such as Tokenization, Word Segmentation, Part-of-Speech Tagging, Word and Sentence Embeddings, Named Entity Recognition, Dependency Parsing, Spell Checking, Text Classification, Sentiment Analysis, Token Classification, Machine Translation (180+ languages), Summarization, Question Answering, Table Question Answering, Text Generation, Image Classification, Image to Text (captioning), Automatic Speech Recognition, Zero-Shot Learning, and many more NLP tasks.

Uploaded Mon Mar 31 02:34:55 2025
md5 checksum 950e38827e3d2333a0287f83c71d90a8
arch x86_64
build py38h06a4308_0
depends jupyter, openjdk >=8, pyspark >=3.3.1, python >=3.8,<3.9.0a0
license Apache-2.0
license_family Apache
md5 950e38827e3d2333a0287f83c71d90a8
name spark-nlp
platform linux
sha1 50a2fe466f998b538a2ae65fc3fe13057f482e63
sha256 b1350e56d6cc067a24c11cced4883d8932760d51e6887b660e5570d916ad3def
size 392105
subdir linux-64
timestamp 1696614447867
version 5.1.2
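
The size and sha256 fields above can be used to check a downloaded copy of this package file. The following is a minimal sketch using only the Python standard library; the function names `file_digest` and `matches_listing` are hypothetical helpers introduced for illustration, and the path passed to them is whatever location the file was saved to (not specified in this listing).

```python
import hashlib

# Values copied from the metadata listing above.
EXPECTED_SIZE = 392105
EXPECTED_SHA256 = "b1350e56d6cc067a24c11cced4883d8932760d51e6887b660e5570d916ad3def"


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return (hex digest, size in bytes) of a file, reading it in chunks."""
    hasher = hashlib.new(algorithm)
    size = 0
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return hasher.hexdigest(), size


def matches_listing(path):
    """Hypothetical helper: True if the file matches the listed size and sha256."""
    digest, size = file_digest(path)
    return size == EXPECTED_SIZE and digest == EXPECTED_SHA256
```

Calling `matches_listing` with the path of the downloaded file returns `True` only when both the byte count and the SHA-256 digest agree with the listing. SHA-256 is used here rather than the listed md5 or sha1 values because it is the strongest of the three digests provided.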