How to sort the lines in a text file, by the length of each line, in Notepad++

How can I sort a text file by line length in Notepad++? Is there a plugin available for this task?
If there is no plugin, which tutorials should I read first in order to write the plugin myself?

Best Answer

This answer is inspired by a YouTube video. Updated to maintain original sort order, if that is important.

Notepad++'s TextFX plugin has a tool that sorts selected lines alphabetically. This tool can be hijacked to sort by line length by padding each line on the left with spaces, so that all the lines end up the same length.

"The Zoo" comes alphabetically before "Their House" because the space is treated as a character and comes before "i". __X (pretending the underscores are really spaces) will similarly come alphabetically before _XX. The idea in this answer is to add spaces and line numbers so that __________092dog will be sorted above _003alligator.
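To see why the padding trick works, here is a minimal Python sketch of the same idea. It is only an illustration, not part of the Notepad++ procedure: the two strings come from the paragraph above, and Python's plain string sort stands in for the editor's alphabetical sort.

```python
# Two numbered lines from the explanation above.
items = ["003alligator", "092dog"]

# Right-align both to a common width by adding leading spaces.
width = max(len(s) for s in items) + 1
padded = [s.rjust(width) for s in items]

# A plain text sort now puts the shorter line first, because a space
# compares lower than any digit or letter.
result = sorted(padded)
print(result)
```

The shorter line carries more leading spaces, so it wins the comparison at the first position where the other line already has a real character.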

I'll use the following as example data:

Lorem
ipsum
dolor
sit
amet
consectetur
adipisicing

Step 1. Add line numbers.

(Note added by barlop: we will not be sorting by these line numbers; we are sorting by the length of the lines. The line numbers are there to record the natural order, so that when two or more lines are of equal length, those lines can be sorted according to that natural order.)

Assuming your text file only has the data in it, place the text cursor (the vertical line) into the very first position of the file. Then in the Edit menu select Column Editor... (Alt+C). Choose "Number to Insert" and start with 1, increase by 1, and include leading zeros. Note that this will retain the original ordering when sorting from shortest string to longest string. Reverse all lines first if you want to sort longest to shortest.

1Lorem
2ipsum
3dolor
4sit
5amet
6consectetur
7adipisicing

Step 2. Pad all lines with leading spaces.

Place the text cursor (the vertical line) into the very first position of the file. Then in the Edit menu select Column Editor... (Alt+C). Insert enough spaces so that the shortest line of data will be padded out to the length of the longest line of data. If your shortest line has 4 characters, and your longest 44, then make sure you insert at least 40 spaces.

__________1Lorem
__________2ipsum
__________3dolor
__________4sit
__________5amet
__________6consectetur
__________7adipisicing

Step 3. Trim lines to a uniform length.

Use the following Regular Expression Find/Replace (Ctrl+H) to match the right-hand characters equalling or exceeding the length of your longest data line.

^.*(.{50})$

Replace all with $1. That will trim everything except the right-most 50 characters of every line. If your longest line (including its line number) is longer or shorter than 50 characters, adjust the {50} in the Regular Expression. The example output below uses a width of 13.

(Note added by barlop- the idea here is the shortest lines have the most spaces at the beginning)

_______1Lorem
_______2ipsum
_______3dolor
_________4sit
________5amet
_6consectetur
_7adipisicing
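For anyone who wants to check steps 1 to 3 outside the editor, here is a rough Python sketch that reproduces them on the example data. The variable names and the choice of 10 padding spaces are mine; the regular expression is the same one as above, with the width set to 13 instead of 50.

```python
import re

lines = ["Lorem", "ipsum", "dolor", "sit", "amet", "consectetur", "adipisicing"]

# Step 1: prefix each line with its 1-based line number
# (one digit is enough for seven lines).
numbered = [f"{i}{text}" for i, text in enumerate(lines, start=1)]

# Step 2: prepend the same block of spaces to every line.
padded = [" " * 10 + s for s in numbered]

# Step 3: keep only the right-most WIDTH characters of each line.
WIDTH = max(len(s) for s in numbered) + 1  # 13 for this data
pattern = rf"^.*(.{{{WIDTH}}})$"           # becomes ^.*(.{13})$
trimmed = [re.sub(pattern, r"\1", s) for s in padded]

for s in trimmed:
    print(s.replace(" ", "_"))  # show spaces as underscores, as in the answer
```

The greedy `.*` swallows as much of the padding as it can while still leaving exactly 13 characters for the captured group, which is why the shortest lines keep the most spaces.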

Step 4. Sort the lines.

Select all of the text (Ctrl+A). Via the TextFX menu, go to TextFX > TextFX Tools > Sort lines case sensitive (at column). Your data should now be in length order, from shortest to longest. If you want it in order from longest to shortest, uncheck the TextFX > TextFX Tools > + Sort ascending option before sorting. Note that a descending sort reverses the line numbers as well, so lines of equal length come out in the reverse of their original order.

_________4sit
________5amet
_______1Lorem
_______2ipsum
_______3dolor
_6consectetur
_7adipisicing
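The sort itself is nothing more than an ordinary text sort on the padded lines. A small Python sketch, assuming Python's string comparison behaves like the case-sensitive TextFX sort for this data (`rjust` is used here as a shortcut for the pad-and-trim steps):

```python
numbered = ["1Lorem", "2ipsum", "3dolor", "4sit", "5amet",
            "6consectetur", "7adipisicing"]

# Same result as padding with spaces and trimming to 13 characters.
trimmed = [s.rjust(13) for s in numbered]

# Ascending: shortest first, ties kept in line-number order.
ascending = sorted(trimmed)

# Descending: longest first, and the tie order is reversed too.
descending = sorted(trimmed, reverse=True)

print([s.strip() for s in ascending])
print([s.strip() for s in descending])
```

In the descending result, 3dolor comes before 2ipsum and 1Lorem, which is the reversal of line numbers mentioned in this step.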

Step 5. Remove leading spaces.

Use another Regular Expression Find/Replace (Ctrl+H) to match the leading spaces.

^ *\d{4}

That's a space between the caret and asterisk. Replace all with nothing. That will remove all leading spaces and the inserted line numbers, if you had 4-digit line numbers. Replace the {4} with the number of digits in your line numbers (the example data here has 1-digit numbers, so it would be {1}).

sit
amet
Lorem
ipsum
dolor
consectetur
adipisicing
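The cleanup regular expression can be tried out in Python as well. This is only a sketch on the example data; `DIGITS` is a name I made up for the number of digits in the line numbers, which is 1 here.

```python
import re

sorted_lines = [
    "         4sit",
    "        5amet",
    "       1Lorem",
    "       2ipsum",
    "       3dolor",
    " 6consectetur",
    " 7adipisicing",
]

DIGITS = 1  # number of digits in the inserted line numbers
pattern = rf"^ *\d{{{DIGITS}}}"  # becomes ^ *\d{1}

# Strip the leading spaces and the line number from every line.
cleaned = [re.sub(pattern, "", s) for s in sorted_lines]
print(cleaned)
```

Using an exact digit count matters: a looser pattern such as `\d+` would also eat digits that belong to the start of a data line.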

MACRO

I recorded the above steps using Notepad++'s macro feature, and it doesn't work. I'm not sure which step fails, and I haven't diagnosed why. You could probably use AutoHotkey to automate this if you do it repeatedly.
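If a script outside Notepad++ is acceptable, the whole job is much shorter. The following is a Python sketch, not an AutoHotkey solution and not something I recorded as a macro; the function name `sort_by_length` is my own. It relies on Python's sort being stable, so lines of equal length keep their original order without any line numbers.

```python
def sort_by_length(text: str, longest_first: bool = False) -> str:
    """Return the lines of `text` ordered by length.

    Lines of equal length stay in their original relative order,
    in both directions, because Python's sort is stable.
    """
    lines = text.splitlines()
    lines.sort(key=len, reverse=longest_first)
    return "\n".join(lines)


sample = "Lorem\nipsum\ndolor\nsit\namet\nconsectetur\nadipisicing"
print(sort_by_length(sample))
```

Unlike the descending TextFX sort, `longest_first=True` here does not reverse the order of equal-length lines. To use it on a file, read the file's contents into a string, pass it through the function, and write the result back.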
