chars: speed up the counting of string length for the plain ASCII case

In UTF-8, a byte whose most significant bit is zero is a complete
single-byte character, so for such a byte we can skip the call to mblen().
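
As a rough illustration of this fast path, here is a standalone C sketch
that counts characters the same way. The name count_chars() is hypothetical
and MB_CUR_MAX stands in for nano's MAXCHARLEN; this is not the code from
the commit. The multibyte branch only decodes UTF-8 when the program has
selected a UTF-8 locale with setlocale().

```c
#include <stddef.h>   /* size_t */
#include <stdlib.h>   /* mblen(), MB_CUR_MAX */

/* Hypothetical helper: count at most maxlen characters in s.
 * Bytes with the high bit clear are counted directly, without
 * calling mblen(). */
size_t count_chars(const char *s, size_t maxlen)
{
    size_t n = 0;

    /* Reset mblen()'s internal conversion state before scanning. */
    mblen(NULL, 0);

    while (*s != '\0' && maxlen > 0) {
        /* High bit set: possibly the start of a multibyte sequence.
         * On two's-complement machines this is the same test as
         * (signed char)*s < 0. */
        if ((unsigned char)*s & 0x80) {
            int length = mblen(s, MB_CUR_MAX);

            /* Treat an invalid sequence as one single-byte character. */
            s += (length < 0 ? 1 : length);
        } else
            s++;    /* Plain ASCII: exactly one byte. */

        maxlen--;
        n++;
    }

    return n;
}
```

For a pure ASCII string the loop never reaches mblen(), so
count_chars("hello", 100) returns 5 and count_chars("hello", 3) returns 3
regardless of the locale.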

For files consisting of pure ASCII bytes (between 0x00 and 0x7F), this
change reduces the counting time of mbstrlen() by ninety-six percent.

This partially addresses https://savannah.gnu.org/bugs/?50406.
master
Benno Schulenberg 2018-06-03 18:27:15 +02:00
parent 430d3bad7a
commit cc2b19c8fd
1 changed file with 6 additions and 2 deletions

@@ -540,9 +540,13 @@ size_t mbstrlen(const char *s)
 		size_t n = 0;

 		while (*s != '\0' && maxlen > 0) {
-			int length = mblen(s, MAXCHARLEN);
+			if ((signed char)*s < 0) {
+				int length = mblen(s, MAXCHARLEN);

-			s += (length < 0 ? 1 : length);
+				s += (length < 0 ? 1 : length);
+			} else
+				s++;
+
 			maxlen--;
 			n++;
 		}