9868 Commits (3638fb9da48188bca49c409d3ac3a4be9f9a300d)

Author SHA1 Message Date
Benno Schulenberg 544351f3be syntaxes: replace [[:space:]] with [[:blank:]] to exclude carriage return
In many places a carriage return is not valid whitespace and should
thus not be colored as such.  In some of these places a vertical tab
or form feed may be valid whitespace, but it would be ugly or even
wrong to color them because they are not part of the subsequent
comment or keyword.

This fixes https://savannah.gnu.org/bugs/?60456.
2021-04-27 11:18:41 +02:00
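The difference between the two character classes can be checked with
the POSIX regex functions.  This is a minimal sketch, not code from
nano: the helper name matches_class() is an assumption made for the
illustration, and the results assume the default "C" locale.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: return true when `text` contains a match for the
 * POSIX extended regex `pattern` (for example "[[:blank:]]"). */
bool matches_class(const char *pattern, const char *text)
{
    regex_t compiled;
    bool matched;

    /* A pattern that fails to compile is treated as "no match". */
    if (regcomp(&compiled, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;

    matched = (regexec(&compiled, text, 0, NULL, 0) == 0);
    regfree(&compiled);

    return matched;
}
```

With this helper, "[[:space:]]" accepts a carriage return, vertical tab,
and form feed, while "[[:blank:]]" accepts only a space or a tab.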
Hussam al-Homsi 96ebaf8ab4 syntax: c: make the highlighting of '#include <...>' more compliant
Changes:
  1. There may be zero spaces between 'include' and '<...>'.
  2. Blanks and '=' may occur inside '<...>' but '>' may not.
  3. There must be at least one character inside '<...>'.

References:
  Change 1:
    C:   www.open-std.org/jtc1/sc22/wg14/www/docs/n2310.pdf#subsection.6.10.2
    C++: www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/n4659.pdf#section.19.2

  Changes 2 and 3:
    C:   www.open-std.org/jtc1/sc22/wg14/www/docs/n2310.pdf#subsection.6.4.7
    C++: www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/n4659.pdf#section.5.8

Signed-off-by: Hussam al-Homsi <sawuare@gmail.com>
2021-04-26 12:14:44 +02:00
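The three changes listed above can be expressed as one POSIX extended
regex.  The pattern below is an illustrative guess built only from that
list; it is not copied from nano's c.nanorc, and the function name
is_angle_include() is hypothetical.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative pattern (an assumption, not nano's actual rule):
 *   - zero or more blanks between 'include' and '<'   (change 1)
 *   - anything except '>' inside the brackets         (change 2)
 *   - at least one character inside the brackets      (change 3) */
static const char *include_pattern =
        "#[[:blank:]]*include[[:blank:]]*<[^>]+>";

/* Return true when `line` contains an '#include <...>' of that shape. */
bool is_angle_include(const char *line)
{
    regex_t compiled;
    bool matched;

    if (regcomp(&compiled, include_pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;

    matched = (regexec(&compiled, line, 0, NULL, 0) == 0);
    regfree(&compiled);

    return matched;
}
```

Under these assumptions "#include<stdio.h>" and "#include <a b=c.h>" are
accepted, while "#include <>" is not.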
Benno Schulenberg 6b7c661fb7 syntax: po: improve the coloring of format specifiers
This now handles most of the things listed in 'man 3 printf'.
2021-04-26 11:29:05 +02:00
Benno Schulenberg 6823831c06 build: drop the check for two functions that we don't use any more
Since commits b0209374 and 1c010d8e from a month ago, nano does not use
mblen() and mbtowc() any more, so there is no need to check for their
presence.

Instead, add a check for iswalpha(), which we do use.
2021-04-25 11:55:04 +02:00
Benno Schulenberg ec82530125 gnulib: update to its current upstream state 2021-04-25 10:48:56 +02:00
Benno Schulenberg a45e1f89c0 oops: that doesn't work -- you can't break out of two for loops at once
This effectively reverts the previous commit.
2021-04-24 13:56:36 +02:00
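As a generic illustration of the constraint named in the subject line
(this is not the reverted nano code): in C, 'break' leaves only the
innermost loop, so an outer loop needs its own exit condition.  The
function find_cell() and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Search a rows-by-cols grid (stored row by row) for `wanted`.
 * On success, store the position of the first hit and return true. */
bool find_cell(const int *grid, size_t rows, size_t cols, int wanted,
               size_t *row, size_t *col)
{
    bool found = false;

    assert(grid != NULL && row != NULL && col != NULL);

    /* The flag in the outer condition is what ends the outer loop. */
    for (size_t r = 0; r < rows && !found; r++) {
        for (size_t c = 0; c < cols; c++) {
            if (grid[r * cols + c] == wanted) {
                *row = r;
                *col = c;
                found = true;
                break;    /* This leaves the inner loop only. */
            }
        }
    }

    return found;
}
```

Returning from a helper function, or using a goto to a label after the
outer loop, are the other common ways to leave both loops at once.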
Benno Schulenberg c1cd813dcb tweaks: elide a function that is now basically just two lines 2021-04-24 11:48:04 +02:00
Benno Schulenberg c96e62e33a startup: save the compiled file-matching regexes, to avoid recompiling
This reduces startup time by seven percent (when using the standard set
of syntaxes) when opening just one file that doesn't match any syntax,
and more than ten percent when opening multiple files.  It takes some
extra memory, but... not wasting CPU cycles is more important.

This addresses https://savannah.gnu.org/bugs/?56433.
2021-04-24 10:54:04 +02:00
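The idea of keeping a compiled regex around instead of recompiling it
for every file can be sketched as follows.  This is a simplified
illustration under stated assumptions: the struct cached_regex, the
function cached_match(), and the counter compile_count are all
hypothetical names, not nano's data structures.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record: a file-matching pattern plus its compiled form. */
typedef struct {
    const char *pattern;    /* The regex source text. */
    regex_t *compiled;      /* NULL until first use; kept afterward. */
} cached_regex;

/* Counts successful regcomp() calls, only to make the saving visible. */
int compile_count = 0;

/* Match `filename` against the entry, compiling the regex at most once. */
bool cached_match(cached_regex *entry, const char *filename)
{
    if (entry->compiled == NULL) {
        regex_t *fresh = malloc(sizeof(regex_t));

        if (fresh == NULL)
            return false;

        if (regcomp(fresh, entry->pattern, REG_EXTENDED | REG_NOSUB) != 0) {
            free(fresh);
            return false;
        }

        compile_count++;
        entry->compiled = fresh;    /* This costs memory but saves CPU. */
    }

    return (regexec(entry->compiled, filename, 0, NULL, 0) == 0);
}
```

Checking several file names against the same entry then calls regcomp()
once, which is the trade-off the message describes: some extra memory in
exchange for fewer CPU cycles.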
Benno Schulenberg 6283557d2f memory: prevent a use-after-free when the user respects a lock file
This fixes https://savannah.gnu.org/bugs/?60447.

Bug existed since commit 2f718e11 from a month ago.
2021-04-23 12:20:45 +02:00
Benno Schulenberg af90f03ac5 tweaks: condense three comments, drop another, and rewrap a line 2021-04-23 12:04:19 +02:00
Benno Schulenberg 588022ab8c editing: prevent the pointer for the top row from becoming dangling
When undoing several actions, it is possible for the line at the top
of the screen to be removed, leaving 'edittop' pointing to a structure
that has been freed.  Soon after, 'edittop' is referenced to determine
whether the cursor is offscreen...  Prevent this invalid reference by
stepping 'edittop' one line back in that special case.  This changes
the normal centering behavior of Undo when the cursor goes offscreen,
but... so be it.

When a single node is deleted, it is always possible to step one line
back, because a buffer always contains at least one line (even though
it may be empty), so if the current line could be deleted, there must
be one before it (when at the top of the screen).

This fixes https://savannah.gnu.org/bugs/?60436.

Bug existed since version 2.3.3, commit 60815461,
since undoing does not always center the cursor.
2021-04-23 09:35:12 +02:00
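The fix can be pictured with a small doubly linked list.  This is a
sketch under assumptions, not nano's code: the type 'line', the function
remove_line(), and the 'top' parameter (standing in for 'edittop') are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node of a doubly linked list of lines. */
typedef struct line {
    struct line *prev;
    struct line *next;
} line;

/* Unlink and free `doomed`.  When `*top` points at the doomed node,
 * step it one line back first, so it never refers to freed memory. */
void remove_line(line *doomed, line **top)
{
    assert(doomed != NULL && top != NULL);

    if (*top == doomed)
        *top = (doomed->prev != NULL) ? doomed->prev : doomed->next;

    if (doomed->prev != NULL)
        doomed->prev->next = doomed->next;
    if (doomed->next != NULL)
        doomed->next->prev = doomed->prev;

    free(doomed);
}
```

The fallback to the next line when there is no previous one is an
addition for the sketch; the message argues that in nano's case a
previous line always exists.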
Benno Schulenberg cf0820549b tweaks: avoid calling extra_chunks_in() when not softwrapping
The function is somewhat costly; better avoid it whenever possible.
2021-04-22 12:19:09 +02:00
Benno Schulenberg f54bc6c7d6 indicator: adjust the size to the number of visible lines, not chunks
Since two commits ago, the position of the indicator shows the position
of the viewport relative to the full buffer in terms of actual lines,
not of visual chunks (to avoid excessive computation).  But the size of
the indicator stayed constant, as if it always covered as many lines as
the edit window has rows.  But the latter will not be the case when
softwrapping occurs.  Therefore, when softwrapping, compute how many
actual lines are visible in the viewport, and adjust the size of the
indicator accordingly.
2021-04-22 11:52:57 +02:00
Benno Schulenberg 2cdff6c32c tweaks: adjust two comments, and reshuffle two fragments
Also rename two variables, to be more fitting.
2021-04-21 16:52:35 +02:00
Benno Schulenberg 49d8b99e4f softwrap: avoid time-consuming computations, to burden large files less
Whenever softwrap was toggled on or line numbers were toggled on/off or
the window was resized, the extra rows per line needed to be recomputed
for ALL the lines.  For large files with many long lines this was too
costly.

(This change causes the indicator to have an incorrect size when there
are many softwrapped chunks onscreen, but that will be addressed later.)

This fixes https://savannah.gnu.org/bugs/?60429.

Problem existed since version 5.0, since the indicator was introduced.
2021-04-21 16:40:20 +02:00
Benno Schulenberg bb81932422 chars: work around the wrong private-use-character widths on OpenBSD
This fixes https://savannah.gnu.org/bugs/?60393.
2021-04-20 11:13:08 +02:00
Benno Schulenberg 5efb6836a8 options: retire the obsolete 'smooth', 'morespace', and 'nopauses' 2021-04-15 11:43:39 +02:00
Benno Schulenberg 48fa14acc0 tweaks: simplify two fragments of code
This makes the handling of plain ASCII a tiny bit slower, but it
affects only the users of --constantshow without --minibar, so...

All other uses of mbstrlen() and collect_char() are not in speed-
critical code paths.
2021-04-13 11:19:32 +02:00
Benno Schulenberg eb7181b35e tweaks: adjust and improve one comment, and frob another 2021-04-12 15:14:05 +02:00
Benno Schulenberg 8db42023bb files: when Mac format has been detected, stay with it
This fixes https://savannah.gnu.org/bugs/?60382.

Bug existed since commit 09b919a6 from three weeks ago.
2021-04-12 14:50:04 +02:00
Benno Schulenberg 018a8e12ca build: fix compilation for --enable-tiny plus --enable-multibuffer
This will not show the error messages for other buffers when using
a tiny build, but... one cannot have everything.
2021-04-10 12:01:34 +02:00
Benno Schulenberg b4a5aedc6c tweaks: remove a misplaced (and nested) #ifdef
It was accidentally introduced two weeks ago by commit 1c010d8e.
2021-04-09 16:55:07 +02:00
Benno Schulenberg d6ed174d09 tweaks: morph a function into what it is actually used for
Since the previous commit, mbwidth() is used only to determine whether
a character is either double width or zero width.  There is no need to
return the actual width of the character; a simple yes or no is enough.

Transforming mbwidth() into is_doublewidth() also allows streamlining
it and is_zerowidth() a bit, so that they become slightly faster.
2021-04-09 16:38:23 +02:00
Benno Schulenberg 78f92e044a tweaks: avoid parsing a multibyte character twice
The number of bytes in the character was determined twice: first in
mbwidth() and then in char_length().  Do it just once, in mbtowide().

Also, avoid calling is_cntrl_char(), because it does unneeded checks
when we already know that the high bit is set.

This duplicates some code, but advance_over() is called a lot, so it
is important that it is as fast as possible.

This shouldn't slow down plain ASCII, as the extra checks (use_utf8
and *string < 0xA0) are done only for non-ASCII (apart from DEL).
2021-04-09 11:32:15 +02:00
Benno Schulenberg f11931a0dd tweaks: rename a variable, for contrast with another
The 'start_index' was an index in the given text, while 'index' is an
index in the displayable string.  Having both of them using 'index' in
their name was somewhat confusing.
2021-04-08 12:19:34 +02:00
Benno Schulenberg 31a6931be9 tweaks: elide a call of strlen() for every row
For a normal file (without overlong lines) the strlen() wasn't much
of a problem.  But when there are very long lines, it wasted time
counting stuff that wouldn't be displayed on the current row anyway,
and reserved *far* too much memory for the displayable string.

Problem existed since commit cf0eed6c from five years ago that traded
a continuous comparison (of the used space with the reserved space)
for a one-time big reservation up front involving a strlen().
In retrospect that was not a good trade-off when softwrapping.

The extra check (charwidth == 0) is incurred only by characters that
have their high bit set, so the average file (with only ASCII) is not
affected by this -- it just loses an unneeded call of strlen().
2021-04-08 12:15:12 +02:00
Benno Schulenberg debb288115 tweaks: reduce the maximum character length from six bytes to four
In UTF-8 valid multibyte characters are at most four bytes long,
and now that we no longer make use of mblen() and mbtowc() from
the underlying system, we won't get five- or six-byte sequences
mistakenly reported as valid (by glibc).  So it is always enough
to reserve space for just four bytes per character.
2021-04-07 17:21:25 +02:00
Benno Schulenberg c75a3839da tweaks: elide a small function that is used just once 2021-04-07 17:08:05 +02:00
Benno Schulenberg b6a32fbd5f tweaks: elide an unneeded resetting NULL call to wctomb()
Calling wctomb() with NULL as the first parameter returns zero in a
UTF-8 locale, meaning that there is no state, so there is no point
in resetting it either.
2021-04-07 16:11:40 +02:00
Benno Schulenberg 09e4c86606 tweaks: improve a couple of comments 2021-04-07 12:28:48 +02:00
Benno Schulenberg 20eb422829 tweaks: avoid converting a file name for more than will fit on screen 2021-04-07 12:12:06 +02:00
Benno Schulenberg 90c6b572d0 display: avoid determining twice from and until where to draw each row
The two calls of draw_row() are each immediately preceded by a call to
display_string(), which has already determined from which x position
and until which x position in the relevant line the current row will
be drawn -- doing this again in draw_row() is a waste of time.  Even
though it is ugly, pass the two data points from one function to the
other via global variables.

For normal files (without overlong lines), this saves on average some
fifty calls of advance_over() per row.  When softwrapping a file with
overlong lines, the savings for each softwrapped chunk are much higher.
2021-04-07 11:27:10 +02:00
Benno Schulenberg 712b574fb7 tweaks: rename a variable, away from an abbreviation 2021-04-06 16:27:46 +02:00
Benno Schulenberg fd023d6dcf tweaks: put the most likely condition first, for a quicker return
Also condense a comment.
2021-04-06 11:32:25 +02:00
Benno Schulenberg 9c16be32d7 tweaks: reshuffle two conditions, to have the most unlikely one first
This also better fits the preceding comment.
2021-04-05 16:10:44 +02:00
Benno Schulenberg c3cdb099da tweaks: reshuffle a comment, and put the main extension first
And add some air to the compact file.
2021-04-05 16:06:54 +02:00
Mike Frysinger 682a088fb8 syntax: tcl: support Expect scripts too 2021-04-05 16:02:44 +02:00
Benno Schulenberg 8c25bd0e94 tweaks: elide two more instances of useless character copying
Just point at the relevant characters directly
instead of first copying them out.
2021-04-02 16:39:10 +02:00
Benno Schulenberg 20e122ef41 startup: show the helpful message only when ^G has not been rebound
(Well, it now checks that ^G is still the first shortcut that is bound
to 'do_help', but that is good enough: if the user did any rebinding,
they probably do not need any reminder about how to invoke 'Help'.)

This fixes https://savannah.gnu.org/bugs/?60315.
Reported-by: Robert Goulding <goulding.2@nd.edu>
2021-04-01 10:09:28 +02:00
Benno Schulenberg 0dcac9188f tweaks: simplify two fragments of code, eliding useless character copying 2021-03-29 20:06:05 +02:00
Benno Schulenberg 1c010d8ec9 chars: implement mbtowc() ourselves, for more efficiency
This saves a function call, and the passing and checking of the
MAXCHARLEN parameter, and the checking whether wc is maybe NULL
(which for nano is never the case), and who knows what other
overheads mbtowc() has, and our workaround for glibc.

Code was written after looking at gnulib/lib/mbrtowc-impl-utf8.h.
2021-03-29 12:36:10 +02:00
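A hand-written UTF-8 decoder in the spirit of this change might look
like the sketch below.  It is not nano's function: the name
decode_utf8(), its signature, and the choice to report every ASCII byte
(including NUL) as length 1 are assumptions for the illustration.  The
input must be NUL-terminated.

```c
#include <assert.h>
#include <stddef.h>

/* Decode the UTF-8 sequence at the start of `text` into `*code`.
 * Return the number of bytes used, or -1 for an invalid sequence
 * (in which case `*code` is left untouched). */
int decode_utf8(const char *text, unsigned long *code)
{
    const unsigned char *b = (const unsigned char *)text;
    unsigned long value;

    assert(text != NULL && code != NULL);

    /* Plain ASCII is a single byte. */
    if (b[0] < 0x80) {
        *code = b[0];
        return 1;
    }

    /* 0x80..0xBF are continuation bytes; 0xC0 and 0xC1 are overlong. */
    if (b[0] < 0xC2 || (b[1] ^ 0x80) > 0x3F)
        return -1;

    if (b[0] < 0xE0) {
        *code = ((unsigned long)(b[0] & 0x1F) << 6) | (b[1] & 0x3F);
        return 2;
    }

    if ((b[2] ^ 0x80) > 0x3F)
        return -1;

    if (b[0] < 0xF0) {
        value = ((unsigned long)(b[0] & 0x0F) << 12) |
                ((unsigned long)(b[1] & 0x3F) << 6) | (b[2] & 0x3F);
        /* Reject overlong forms and the surrogates U+D800..U+DFFF. */
        if (value < 0x800 || (value >= 0xD800 && value <= 0xDFFF))
            return -1;
        *code = value;
        return 3;
    }

    if (b[0] > 0xF4 || (b[3] ^ 0x80) > 0x3F)
        return -1;

    value = ((unsigned long)(b[0] & 0x07) << 18) |
            ((unsigned long)(b[1] & 0x3F) << 12) |
            ((unsigned long)(b[2] & 0x3F) << 6) | (b[3] & 0x3F);
    /* Reject overlong forms and anything beyond U+10FFFF. */
    if (value < 0x10000 || value > 0x10FFFF)
        return -1;
    *code = value;
    return 4;
}
```

Each continuation byte is checked before the next one is read, so a
truncated sequence stops at the terminating NUL instead of reading past
the end of the string.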
Benno Schulenberg b020937475 chars: implement mblen() ourselves, for efficiency
Most implementations of mblen() do a call to mbtowc(), which is
a waste of time when all we want to know is the number of bytes
(and when we already know that we're using UTF-8 and that the
first byte is at least 0xC2).

(This also avoids burdening correct implementations with the
workaround that was needed only for glibc.)

Code was written after looking at gnulib/lib/mbrtowc-impl-utf8.h.
2021-03-27 14:38:28 +01:00
Benno Schulenberg e3f46b066a build: fix compilation when configured with --disable-multibuffer 2021-03-26 12:21:44 +01:00
Benno Schulenberg df7fe1280d tweaks: drop unneeded braces and adjust indentation after previous change 2021-03-26 12:17:44 +01:00
Benno Schulenberg 929770191e chars: work around a UTF-8 bug in glibc, to display invalid codes right
The mblen() and mbtowc() functions will happily return 4 or 5 or 6
for byte sequences that start with 0xF4 0x90 or higher.  But those
sequences encode U+110000 or higher, which are not valid Unicode
code points.  The libc of FreeBSD and OpenBSD and Alpine correctly
return -1 for such sequences.  Make nano behave correctly also when
linked against glibc, so that invalid sequences are always presented
as a series of invalid bytes and never as a single invalid code.

This fixes https://savannah.gnu.org/bugs/?60262.

Bug existed since before version 2.0.0.
2021-03-26 11:07:05 +01:00
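The 0xF4 0x90 boundary follows from the bit layout of a four-byte
sequence.  The sketch below combines the payload bits without any range
check, to show which value the bytes would denote; the function names
raw_four_byte_value() and beyond_unicode() are hypothetical and do not
come from nano.

```c
#include <stdbool.h>

/* Combine the payload bits of a four-byte UTF-8 pattern (3 + 6 + 6 + 6
 * bits), deliberately without validating the result. */
unsigned long raw_four_byte_value(const unsigned char *s)
{
    return ((unsigned long)(s[0] & 0x07) << 18) |
           ((unsigned long)(s[1] & 0x3F) << 12) |
           ((unsigned long)(s[2] & 0x3F) << 6) |
            (unsigned long)(s[3] & 0x3F);
}

/* True when the first two bytes can only denote U+110000 or higher. */
bool beyond_unicode(unsigned char first, unsigned char second)
{
    return (first > 0xF4 || (first == 0xF4 && second >= 0x90));
}
```

For 0xF4 0x90 0x80 0x80 the payload is (4 << 18) + (0x10 << 12), which
is 0x110000, one past the last valid code point; 0xF4 0x8F 0xBF 0xBF
gives exactly 0x10FFFF.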
Benno Schulenberg 66d9d6c6d2 tweaks: elide the pointless is_valid_unicode() function
The call of this function in make_mbchar() does not add anything,
because wctomb() already returns -1 for codes U+D800 to U+DFFF,
and parse_verbatim_kbinput() already rejects anything that starts
with U+11.... or higher, so make_mbchar() is never called for codes
beyond U+10FFFF.

And the call in display_string() just needs to check for wc <= 0x10FFFF
because mbtowc() already returns -1 for codes U+D800 to U+DFFF.
2021-03-25 11:24:41 +01:00
Benno Schulenberg de816840cb input: accept Unicode codes for non-characters as valid, since they are
That is, accept U+FDD0 to U+FDEF, and accept U+xxFFFE and U+xxFFFF
for xx from 00 to 10 hex, being the 66 reserved "non-characters".

It may not be wise of the user to input these "things" (by typing
their code after M-V), but the codes are valid Unicode code points
and should not be rejected.

See https://www.unicode.org/faq/private_use.html#nonchar8 et al.

This fixes https://savannah.gnu.org/bugs/?60263.

Bug existed since before version 2.0.0.
2021-03-24 17:11:05 +01:00
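The count of 66 can be verified from the two ranges given above.  This
sketch only enumerates the set; it is not nano's input code, and the
names is_noncharacter() and count_noncharacters() are hypothetical.

```c
#include <stdbool.h>

/* True for the reserved "non-characters": U+FDD0..U+FDEF, plus the last
 * two code points (xxFFFE and xxFFFF) of each of the 17 planes. */
bool is_noncharacter(unsigned long code)
{
    if (code > 0x10FFFF)
        return false;

    if (code >= 0xFDD0 && code <= 0xFDEF)
        return true;

    /* The low sixteen bits are FFFE or FFFF. */
    return ((code & 0xFFFE) == 0xFFFE);
}

/* Count the non-characters in the whole code space. */
int count_noncharacters(void)
{
    int count = 0;

    for (unsigned long code = 0; code <= 0x10FFFF; code++)
        if (is_noncharacter(code))
            count++;

    return count;
}
```

The first range contributes 32 codes and the 17 planes contribute two
each, giving 32 + 34 = 66.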
Benno Schulenberg 74fcc3be79 tweaks: normalize the indentation after an earlier change
(Should have been done yesterday, right after commit 2f718e11.)
2021-03-24 12:29:50 +01:00
Benno Schulenberg b6909d3737 build: fix compilation when configured with --enable-tiny
Commit 0c1bf429 from last week added two calls to digits()
for --constantshow.
2021-03-24 12:21:50 +01:00
Benno Schulenberg 823d79b36c tweaks: shorten a comment and trim an #ifdef 2021-03-24 12:16:10 +01:00