sentencepiece-python

Anaconda Verified

Unsupervised text tokenizer for Neural Network-based text generation.

Installation

To install this package, run one of the following:

Conda
$ conda install main::sentencepiece-python

Description

SentencePiece is an unsupervised text tokenizer and detokenizer, mainly for neural network-based text generation systems in which the vocabulary size is fixed before the neural model is trained. It implements subword units (e.g., byte-pair encoding (BPE) [Sennrich et al.] and the unigram language model [Kudo]) and extends them with direct training from raw sentences. This makes it possible to build a purely end-to-end system that does not depend on language-specific pre- or postprocessing.

About

Information Last Updated

Dec 11, 2025 at 14:14

License

Apache-2.0

Total Downloads

2.9K

Platforms

Linux 64 Version: 0.2.1
Linux aarch64 Version: 0.2.1
Linux s390x Version: 0.2.0
macOS 64 Version: 0.2.0
macOS arm64 Version: 0.2.1
Win 64 Version: 0.2.1