Spark NLP is a state-of-the-art Natural Language Processing library built on top of Apache Spark. It provides simple, performant, and accurate NLP annotations for machine learning pipelines that scale easily in a distributed environment. Spark NLP comes with more than 21,000 pretrained pipelines and models in over 200 languages. It supports tasks such as Tokenization, Word Segmentation, Part-of-Speech Tagging, Word and Sentence Embeddings, Named Entity Recognition, Dependency Parsing, Spell Checking, Text Classification, Sentiment Analysis, Token Classification, Machine Translation (180+ languages), Summarization, Question Answering, Table Question Answering, Text Generation, Image Classification, Image to Text (captioning), Automatic Speech Recognition, Zero-Shot Learning, and many more.
Uploaded | Mon Mar 31 02:34:54 2025 |
md5 checksum | 047b7da957d9ff408b3177bf984373b2 |
arch | x86_64 |
build | py311h06a4308_0 |
depends | jupyter, openjdk >=8, pyspark >=3.3.1, python >=3.11,<3.12.0a0 |
license | Apache-2.0 |
license_family | Apache |
md5 | 047b7da957d9ff408b3177bf984373b2 |
name | spark-nlp |
platform | linux |
sha1 | 465d3c85fb9b3bbe6089bb622d53b2e140eef5ad |
sha256 | 8d0a8f0f81c3a7b79b2ec69f65e59ed8cb3bcf67e65621f98e42f1e8656dbfdc |
size | 474405 |
subdir | linux-64 |
timestamp | 1696614363099 |
version | 5.1.2 |
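
The size and checksum fields above can be used to check a downloaded copy of the package file before installing it. The following is a minimal sketch, not part of the package itself: the helper names (`file_digests`, `matches_metadata`, `build_time`) and the local file path are hypothetical, and the `timestamp` field is assumed to be milliseconds since the Unix epoch.

```python
import hashlib
from datetime import datetime, timedelta, timezone

# Values copied from the metadata table above.
EXPECTED = {
    "md5": "047b7da957d9ff408b3177bf984373b2",
    "sha1": "465d3c85fb9b3bbe6089bb622d53b2e140eef5ad",
    "sha256": "8d0a8f0f81c3a7b79b2ec69f65e59ed8cb3bcf67e65621f98e42f1e8656dbfdc",
}
EXPECTED_SIZE = 474405  # bytes, from the "size" field


def file_digests(path, algorithms=("md5", "sha1", "sha256"), chunk_size=1 << 20):
    """Return (size_in_bytes, {algorithm: hex digest}) for the file at `path`.

    The file is read once, in chunks, and every hasher is fed each chunk.
    """
    hashers = {name: hashlib.new(name) for name in algorithms}
    size = 0
    with open(path, "rb") as fh:
        while chunk := fh.read(chunk_size):
            size += len(chunk)
            for hasher in hashers.values():
                hasher.update(chunk)
    return size, {name: hasher.hexdigest() for name, hasher in hashers.items()}


def matches_metadata(path, expected=EXPECTED, expected_size=EXPECTED_SIZE):
    """True only if the file size and every listed digest match the metadata."""
    size, digests = file_digests(path, tuple(expected))
    return size == expected_size and digests == expected


def build_time(timestamp_ms):
    """Convert a millisecond epoch timestamp (assumed unit) to a UTC datetime."""
    seconds, millis = divmod(timestamp_ms, 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=millis)
```

For example, `matches_metadata("path/to/downloaded-package-file")` would return `True` only for a byte-identical copy of the listed build, and `build_time(1696614363099)` converts the `timestamp` field to a UTC datetime under the millisecond assumption. Of the three digests, sha256 is the one to rely on for integrity checks; md5 and sha1 are included only because the table lists them.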