297 Commits

Author SHA1 Message Date
Benno Schulenberg
cd705a7c4c tweaks: elide a counter and a comparison
For clarity and a tiny bit more speed.  Also rename some variables.
2016-12-19 09:44:30 +01:00
Benno Schulenberg
eafae5d417 screen: show an embedded newline in filenames as ^J instead of ^@
The byte 0x0A means 0x00 *only* when it is found in nano's internal
representation of a file's data, not when it occurs in a file name.

This fixes the second part of https://savannah.gnu.org/bugs/?49867.
2016-12-18 11:13:50 +01:00
Benno Schulenberg
0562d27b9c tweaks: delete a bunch of unneeded asserts
Nano would crash straight afterward if any of these asserts would fail,
so they don't add anything.  A few others are simply superfluous.
2016-12-15 21:15:32 +01:00
Benno Schulenberg
c5f49167ea tweaks: write two pieces of conditionalized code like all others
Also trim or improve a few comments.
2016-12-15 19:48:09 +01:00
Benno Schulenberg
9765c2faa0 tweaks: elide a function that is called just once 2016-12-15 19:28:43 +01:00
Benno Schulenberg
85ebe971e2 chars: optimize for the most common case
That is: elide a second test from the most travelled path: a valid
character.  This adds a second call of mblen() when parse_mbchar()
is called on a terminating zero, but that should never happen.
2016-12-15 17:44:18 +01:00
Benno Schulenberg
fc101a6ded tweaks: rename a variable to be shorter and clearer 2016-12-15 15:50:07 +01:00
Benno Schulenberg
08cd197bf1 general: include word-jumping and block-jumping into the tiny version
And also case-sensitive searches, backward searches, and searching again.
2016-09-13 09:27:04 +02:00
Benno Schulenberg
514cd9a099 update the license text to the preferred version
Mentioning "GNU nano" instead of "This program" and referring to the
website instead of to a postal address.
2016-08-29 21:27:16 +02:00
Benno Schulenberg
406e5242a3 update the copyright notices 2016-08-29 21:27:05 +02:00
Benno Schulenberg
86a64b1bb5 tweaks: reduce two comparisons to a single one 2016-08-07 13:00:21 +02:00
Benno Schulenberg
c8bc05b10e chars: make searching case-insensitively some ten percent faster
It is quicker to do a handful of superfluous compares at the end of
each line than it is to compute and keep track of and compare the
remaining line length the whole time.

The typical line is some sixty characters long, the typical search
string ten characters -- with a shorter search string the speedup is
even higher: some fifteen percent.  Only when the string is longer
than half the average line length does searching become slower with
this new method.

All this for a UTF-8 locale.  For a C locale it makes no difference.
2016-08-07 11:02:41 +02:00
Benno Schulenberg
370406bb41 tweaks: don't optimize for a special case -- it is far too seldom 2016-08-06 11:11:56 +02:00
Benno Schulenberg
85844ee6ef chars: remove superfluous afterchecks
Now that mbstrncasecmp() does the right thing, there is no need any
more to verify that only a valid multibyte sequence was matched.

(See https://savannah.gnu.org/bugs/?45579 for a test case.)

Also, this will make it possible to search for invalid sequences.

(Currently it isn't possible to enter a search string with invalid
characters, but... a user might edit the search history file.  And
if pasting at the prompt is implemented, it will be trivial to enter
invalid sequences if you have a file that contains them.)
2016-08-06 11:10:39 +02:00
Benno Schulenberg
e38e2c634b chars: don't persist when only one of the compared sequences is invalid
Persisting might lead to count 'n' reaching zero, which would mean that
the needle has matched, which is wrong when one of the strings contains
an invalid or incomplete multibyte sequence.
2016-08-06 10:34:38 +02:00
Benno Schulenberg
d80109dd5e chars: properly compare strings of different lengths
That is: don't run towlower() on the two differing bytes when having
reached the end of one of the strings.

This fixes https://savannah.gnu.org/bugs/?48700.

In the bargain, don't do the conversion to lowercase twice.

Furthermore, persist when encountering invalid byte sequences --
until finding bytes that differ.
2016-08-05 16:07:55 +02:00
Benno Schulenberg
b305911cba chars: straighten out the flow of a loop, so it is easier to follow 2016-08-04 13:40:55 +02:00
Benno Schulenberg
d60f95137e chars: remove a special case that never occurs
The needle is never part of the hay -- it is always a separate string.

(And even if needle and haystack were identical, the routine works fine,
the case does not need special treatment.)
2016-08-04 13:40:19 +02:00
Benno Schulenberg
20058a1b63 spelling: don't consider digits as word parts, because GNU spell doesn't
This fixes https://savannah.gnu.org/bugs/?48660.
2016-08-03 12:43:57 +02:00
Benno Schulenberg
90a90365a8 tweaks: rename three constants, for clarity, and hardcode two others 2016-08-01 12:56:05 +02:00
Benno Schulenberg
41ad376b70 chars: plug a gushing memory leak 2016-07-22 15:30:09 +02:00
Benno Schulenberg
bf091be778 chars: don't try to see a character in an empty line
This fixes https://savannah.gnu.org/bugs/?48578.
2016-07-21 09:46:47 +02:00
Benno Schulenberg
6f12992cea new feature: add the option --wordchars, to set extra word characters
This allows the user to specify which other characters, besides the
default alphanumeric ones, should be considered as part of a word, so
that word operations like Ctrl+Left and Ctrl+Right will pass them by.

Using this option overrides the option --wordbounds.

This fulfills https://savannah.gnu.org/bugs/?47283.
2016-07-13 20:49:30 +02:00
Benno Schulenberg
e33a0b6dbe screen: avoid converting each character twice from multibyte to wide 2016-07-12 19:41:13 +02:00
Benno Schulenberg
0894587305 screen: elide another intermediate buffer for every visible character 2016-07-12 19:30:50 +02:00
Benno Schulenberg
b6efea266e chars: invalid sequences are not blank, nor text, nor punctuation
So, slightly speed up the functions that check for those.
2016-06-30 14:34:34 +02:00
Benno Schulenberg
8686cb3d3d chars: measure invalid sequences and unassigned codepoints more quickly
Invalid multibyte sequences get depicted with the Replacement Character,
and unassigned codepoints are shown as if they were a space.  Both have
a width of one.
2016-06-30 14:33:25 +02:00
Benno Schulenberg
af53c56ec8 chars: speed up the determination whether something is a control character
Use knowledge of UTF-8 instead of converting to wide characters first.
2016-06-29 20:56:50 +02:00
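The UTF-8 shortcut described in af53c56ec8 can be sketched roughly as follows. This is an illustration only, not nano's actual code: the name `is_cntrl_utf8_sketch` is hypothetical, and the sketch assumes the input is a NUL-terminated, valid UTF-8 string. It relies on the fact that the C0 controls and DEL are single bytes, while the C1 controls (U+0080..U+009F) are always encoded as 0xC2 followed by a byte in 0x80..0x9F.

```c
#include <stdbool.h>

/* Hypothetical helper, not nano's real function: report whether the
 * UTF-8 sequence starting at 's' encodes a control character, without
 * converting it to a wide character first.  Assumes 's' points into a
 * NUL-terminated string of valid UTF-8. */
bool is_cntrl_utf8_sketch(const char *s)
{
    const unsigned char *u = (const unsigned char *)s;

    /* C0 controls (U+0000..U+001F) and DEL (U+007F) are single bytes. */
    if (u[0] < 0x20 || u[0] == 0x7F)
        return true;

    /* C1 controls (U+0080..U+009F) are encoded as 0xC2 0x80..0x9F. */
    return (u[0] == 0xC2 && u[1] >= 0x80 && u[1] <= 0x9F);
}
```

Two byte comparisons replace a call to a multibyte-to-wide conversion routine, which is presumably where the speedup mentioned in the commit comes from.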
Benno Schulenberg
019d7b34ca chars: delete a now-unused function 2016-06-29 20:56:50 +02:00
Benno Schulenberg
622995fb12 chars: the representation of a control character is always two bytes
Any control character is represented by a ^ plus an ASCII character.
2016-06-29 20:56:50 +02:00
Benno Schulenberg
03586c60da chars: represent the high-bit controls more intelligibly
Instead of showing the upper control codes like this:

   ^À ^Á ^Â ^Ã ^Ä ^Å ^Æ ^Ç ^È ^É ^Ê ^Ë ^Ì ^Í ^Î ^Ï
   ^Ð ^Ñ ^Ò ^Ó ^Ô ^Õ ^Ö ^× ^Ø ^Ù ^Ú ^Û ^Ü ^Ý ^Þ ^ß

show them like this:

   ^` ^a ^b ^c ^d ^e ^f ^g ^h ^i ^j ^k ^l ^m ^n ^o
   ^p ^q ^r ^s ^t ^u ^v ^w ^x ^y ^z ^{ ^| ^} ^~ ^=

The lower control codes continue to be shown like this:

   ^@ ^A ^B ^C ^D ^E ^F ^G ^H ^I ^J ^K ^L ^M ^N ^O
   ^P ^Q ^R ^S ^T ^U ^V ^W ^X ^Y ^Z ^[ ^\ ^] ^^ ^_

The representation of DEL (0x7F) continues as ^?.

Further, use knowledge of UTF-8 to avoid a roundtrip through
wide characters.
2016-06-29 20:56:50 +02:00
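The mapping shown in the tables of 03586c60da can be written down as a small function. This is a minimal sketch derived from those tables, not nano's actual code: the name `control_rep_sketch` is hypothetical, and it assumes that the "upper control codes" are the codes 0x80..0x9F (consistent with the old display of ^À for the first of them). Returning 0 for codes that are not control codes is a choice made for this sketch.

```c
/* Hypothetical helper: return the character that is shown after the
 * caret for the control code 'code', or 0 when 'code' is not a
 * control code (an assumption of this sketch). */
char control_rep_sketch(unsigned char code)
{
    if (code < 0x20)
        return (char)(code + 0x40);   /* 0x00..0x1F -> ^@ .. ^_ */
    if (code == 0x7F)
        return '?';                   /* DEL -> ^? */
    if (code == 0x9F)
        return '=';                   /* last upper control -> ^= */
    if (code >= 0x80 && code < 0x9F)
        return (char)(code - 0x20);   /* 0x80..0x9E -> ^` .. ^~ */
    return 0;                         /* not a control code */
}
```

The code 0x9F needs its own case because subtracting 0x20 would give 0x7F, which is DEL itself and not a printable character.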
Benno Schulenberg
6fda7a7057 chars: speed up two reverse-searching routines a bit
By removing from their main loops a condition that occurs just once.
2016-06-27 19:22:28 +02:00
Benno Schulenberg
1e2833e07b tweaks: elide two unneeded variables 2016-06-27 19:22:20 +02:00
Benno Schulenberg
56f067a284 chars: ensure that files are sorted also when strncasecmp() is strange
When running in a non-UTF locale, and when strncasecmp() suffers from
the same defect as strncmp(), make sure not to pass a length with the
high bit set.
2016-06-01 21:59:25 +02:00
Benno Schulenberg
05e2a6d259 chars: a control character can never be an invalid multibyte sequence
The function is_cntrl_mbchar() has always been called successfully before
calling control_mbrep(), so the passed character *is* a valid sequence.
2016-06-01 13:08:40 +02:00
Benno Schulenberg
b42887fe14 tweaks: adjust a couple of comments 2016-06-01 13:04:00 +02:00
Benno Schulenberg
4172268bd2 chars: the representation of control characters is always two columns wide 2016-06-01 13:03:43 +02:00
Benno Schulenberg
a9f79a6130 tweaks: reindent and rewrap a few lines, and shorten a comment 2016-06-01 13:03:26 +02:00
Benno Schulenberg
17cf833b9c tweaks: normalize some whitespace 2016-05-30 09:09:36 +02:00
Benno Schulenberg
a5b3f00d78 chars: make comparing multibyte strings twice as fast
Instead of parsing every multibyte character twice, first with
parse_mbchar() and then with mbtowc(), just let mbtowc() do all
the work.  This makes searching for a fixed string twice as fast.

This also gets rid of four variables and lots of memory allocations.
(And, more importantly: it stops nano messing up the internal state
of the multibyte-to-wide character conversion, and thus would make
the calls to mbtowc_reset() superfluous.)
2016-05-27 10:37:43 +02:00
Benno Schulenberg
1bffa17c01 tweaks: rename two more variables 2016-05-27 10:36:54 +02:00
Benno Schulenberg
a151167416 tweaks: rename some variables for contrast 2016-05-27 10:36:12 +02:00
Benno Schulenberg
3b21659661 tweaks: elide four #ifdefs, improve one comment and unwrap some others 2016-05-24 17:55:24 +02:00
Benno Schulenberg
d92eb4fee3 all: eradicate SVN's $Id$ tags 2016-04-05 14:59:12 +02:00
Benno Schulenberg
f9d6aa9ba3 Speeding up Unicode validation.
(The measurable effect (during long searches, for example) is zero, though.)


git-svn-id: svn://svn.savannah.gnu.org/nano/trunk/nano@5773 35c25a1d-7b9e-4130-9fde-d3aeb78583b8
2016-03-29 14:46:53 +00:00
Benno Schulenberg
2163d961a1 Deleting two dead prototypes, adjusting two comments for correctness,
and two other minute tweaks.


git-svn-id: svn://svn.savannah.gnu.org/nano/trunk/nano@5649 35c25a1d-7b9e-4130-9fde-d3aeb78583b8
2016-02-16 10:09:26 +00:00
Benno Schulenberg
9205c28865 Reverting my own patch that claimed that UTF8 is a stateless encoding.
Apparently there is /some/ state somewhere after all.  Don't have time
now to figure out where exactly.


git-svn-id: svn://svn.savannah.gnu.org/nano/trunk/nano@5369 35c25a1d-7b9e-4130-9fde-d3aeb78583b8
2015-09-04 19:34:55 +00:00
Benno Schulenberg
58a0ddebac Not bothering to reset any state, because UTF-8 is a stateless encoding.
git-svn-id: svn://svn.savannah.gnu.org/nano/trunk/nano@5354 35c25a1d-7b9e-4130-9fde-d3aeb78583b8
2015-08-12 19:27:13 +00:00
Benno Schulenberg
b967368d41 Finding only valid UTF-8 byte sequences when searching.
git-svn-id: svn://svn.savannah.gnu.org/nano/trunk/nano@5316 35c25a1d-7b9e-4130-9fde-d3aeb78583b8
2015-07-23 19:18:25 +00:00
Benno Schulenberg
76e7aaf514 Starting to look for a multibyte character not at the start of the string,
but only as far back as such a character can possibly be.
Speedup suggested by Mark Majeres.


git-svn-id: svn://svn.savannah.gnu.org/nano/trunk/nano@5147 35c25a1d-7b9e-4130-9fde-d3aeb78583b8
2015-03-22 11:20:02 +00:00