An unsupervised text tokenizer and detokenizer, mainly for neural network-based text generation systems.
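
To make "unsupervised" and "detokenizer" concrete, here is a minimal, self-contained sketch. It is not this project's implementation or API. It assumes a simple byte-pair-style merge procedure as a stand-in: the vocabulary is learned from raw text alone, with no labels or pre-tokenization, and spaces are kept as a visible marker so that detokenization is a lossless join. All names (`train_merges`, `apply_merge`, `tokenize`, `detokenize`, `SPACE`) are hypothetical and exist only for this illustration.

```python
from collections import Counter

# Visible stand-in for a space (an assumption of this sketch).
# Input text must not already contain this character, or the round trip breaks.
SPACE = "\u2581"


def apply_merge(seq, pair):
    """Fuse every non-overlapping occurrence of `pair` in `seq`, left to right."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(seq[i] + seq[i + 1])
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def train_merges(corpus, num_merges):
    """Learn merge rules from raw lines of text; no labels are needed."""
    seqs = [list(line.replace(" ", SPACE)) for line in corpus]
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for seq in seqs:
            pairs.update(zip(seq, seq[1:]))  # count adjacent symbol pairs
        if not pairs:
            break  # every line has collapsed to a single symbol
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges.append(best)
        seqs = [apply_merge(seq, best) for seq in seqs]
    return merges


def tokenize(text, merges):
    """Split into characters, then replay the learned merges in order."""
    seq = list(text.replace(" ", SPACE))
    for pair in merges:
        seq = apply_merge(seq, pair)
    return seq


def detokenize(tokens):
    """Invert tokenize: concatenate the pieces and restore spaces."""
    return "".join(tokens).replace(SPACE, " ")


if __name__ == "__main__":
    corpus = ["low lower lowest", "new newer newest"]
    merges = train_merges(corpus, num_merges=10)
    pieces = tokenize("lower newest low", merges)
    print(pieces)
    print(detokenize(pieces))
```

Because the space marker travels inside the tokens, `detokenize(tokenize(text, merges))` returns the original `text` for any input that does not already contain the marker character. A text generation system can therefore emit pieces and recover plain text without language-specific rules.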