split natural language text in chunks at reasonable language boundaries

commit ba95b3ee65c8d4cc74e6bdd23ed922c62e29a5cf
parent 77a5d55608fbdfaf472c2c11e7da108bc949deb3
Author: Antoine Amarilli <>
Date:   Mon, 31 Oct 2011 11:36:57 +0100


 README | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README b/README
@@ -90,8 +90,8 @@ acceptable.
 nlsplit is not Unicode-aware. It will not perform splits according to
 extended characters, and could theoretically split an extended
 character. However, as long as you are using ASCII whitespace regularly
-enough, these splits should be favoured and that bad situation should
-not happen.
+enough, these splits should not be favoured and that bad situation
+should not happen.
 
 nlsplit keeps whitespace at the beginning or at the end of chunks to
 avoid losing any information. Depending on your application, you might
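
The README passage touched by this diff describes byte-oriented splitting that prefers ASCII whitespace and keeps that whitespace attached to a chunk. The following is a minimal Python sketch of that general idea, not nlsplit's actual code: the names split_chunks and max_len are hypothetical, and the choice to keep whitespace at the end of a chunk is an assumption made for illustration.

```python
def split_chunks(data: bytes, max_len: int) -> list[bytes]:
    """Split data into chunks of at most max_len bytes.

    Hypothetical illustration only: prefers to cut just after an ASCII
    whitespace byte, and falls back to a hard cut when none is available.
    """
    if max_len < 1:
        raise ValueError("max_len must be at least 1")
    whitespace = b" \t\n\r"
    chunks = []
    start = 0
    while len(data) - start > max_len:
        window_end = start + max_len
        cut = -1
        # Scan the window backwards for the last ASCII whitespace byte.
        for i in range(window_end - 1, start - 1, -1):
            if data[i] in whitespace:
                cut = i + 1  # whitespace stays at the end of this chunk
                break
        if cut == -1:
            # No whitespace in the window: hard cut, which is not
            # Unicode-aware and may land inside a multi-byte character.
            cut = window_end
        chunks.append(data[start:cut])
        start = cut
    if start < len(data):
        chunks.append(data[start:])
    return chunks


if __name__ == "__main__":
    text = "café au lait".encode("utf-8")
    print(split_chunks(text, 6))
    # No whitespace at all: the hard cut splits the two-byte "é".
    print(split_chunks("éééé".encode("utf-8"), 3))
```

With regular ASCII whitespace in the input, every cut in this sketch falls after a space and the chunks concatenate back to the original bytes. Without any whitespace, the first chunk of the second example ends in the middle of a UTF-8 sequence, which is the bad situation the README warns about.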