Here’s a pretty hard challenge, or at least I found it challenging. Given this Lorem Ipsum:
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.
list the unique, capitalized words in alphabetical order:
Accusam Aliquyam Amet At Clita Consetetur Diam Dolor Dolore Dolores Duo Ea Eirmod Elitr Eos Erat Est Et Gubergren Invidunt Ipsum Justo Kasd Labore Lorem Magna No Nonumy Rebum Sadipscing Sanctus Sea Sed Sit Stet Takimata Tempor Ut Vero Voluptua
The winning Vim solutions use 23 keystrokes but the best I can do is 32. My solution:
| 【Ctrl+Meta+%】 | Invoke query-replace-regexp |
| 【space Return】 | Replace spaces |
| 【Ctrl+q Ctrl+j Return】 | With newlines |
| 【!】 | In the entire buffer |
| 【Meta+<】 | Back to top-of-buffer |
| 【Ctrl+Meta+%】 | Invoke query-replace-regexp |
| [.,] 【Return】 | Replace periods and commas |
| 【Return】 | With nothing |
| 【!】 | In the entire buffer |
| 【Return】 | Down one line |
| 【Ctrl+Return】 | Invoke rectangle mode |
| 【Meta+<】 | Extend rectangle to top-of-buffer |
| 【Meta+u】 | Upcase rectangle |
| 【Ctrl+x h】 | Mark buffer |
| 【Ctrl+u Meta+|】 | Pipe to shell and replace |
| sort -u 【Return】 | Sort and delete duplicates |
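For anyone who wants to check their result outside Emacs, the steps in the table can be mirrored roughly like this in Python. This is only a sketch of the same transformation, not part of the golf solution: the function name `unique_capitalized` is made up for illustration, and `sorted(set(...))` stands in for the external `sort -u` (whose ordering can depend on locale).

```python
import re

TEXT = (
    "Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam "
    "nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, "
    "sed diam voluptua. At vero eos et accusam et justo duo dolores et ea "
    "rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem "
    "ipsum dolor sit amet."
)


def unique_capitalized(text):
    """Hypothetical helper mirroring the keystroke table step by step."""
    # Step 1: replace each space with a newline (first query-replace-regexp).
    one_per_line = text.replace(" ", "\n")
    # Step 2: delete periods and commas (second query-replace-regexp).
    one_per_line = re.sub(r"[.,]", "", one_per_line)
    # Step 3: upcase only the first character of each line (the rectangle upcase).
    lines = [w[:1].upper() + w[1:] for w in one_per_line.split("\n")]
    # Step 4: sort and drop duplicates, standing in for `sort -u`.
    return sorted(set(lines))


print(" ".join(unique_capitalized(TEXT)))
```

The printed line should match the expected word list given in the challenge.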
Most of the work involved the two query-replace-regexp calls but I couldn’t figure out a way to do it in one without causing other problems. The call to the external sort may be a technical violation but I’m claiming that we’re grandfathered in by Tim on that.
I think the Vimers did so well because Vim has an internal sort that’s easy to invoke. I’m sure many of you can beat my pathetic 32 so leave a comment with your better solutions.
The result of the two replacements (16 keystrokes) can also be achieved by the following single one (9 keystrokes; it also adds a newline at the end, thus -1 keystroke in total):
C-M-% \W+ RET
C-q C-j RET
!
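To see why the single `\W+` replacement covers both earlier passes, here is a small Python comparison on a made-up sample string. It is a sketch under one assumption: Python's `\W` and the Emacs `\W` (which depends on the buffer's syntax table) agree on plain ASCII text like this.

```python
import re

# Hypothetical sample, shaped like the challenge text.
SAMPLE = "At vero eos et accusam, et justo. Stet clita."

# Two passes: spaces become newlines, then periods and commas are deleted.
two_pass = re.sub(r"[.,]", "", SAMPLE.replace(" ", "\n"))

# One pass: every run of non-word characters becomes a single newline.
one_pass = re.sub(r"\W+", "\n", SAMPLE)

print(repr(two_pass))
print(repr(one_pass))
```

The only difference is that the one-pass version turns the final period into a trailing newline, which is the extra newline mentioned above.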
Nice. And disgustingly obvious in retrospect.
C-q C-j RET
can be replaced by
C-RET RET
for shaving off another key.
There is also no need to mark the entire buffer before piping to shell, at least not if you’re using transient mark. With no active mark, it will replace the entire buffer.
This works as long as there is a mark set, which may not be the case in a (relatively) new buffer. For some reason I’m always getting bitten by that. But yes, it will almost always work and save a couple of keystrokes. It should probably be the default option, and if it fails just jump to one end of the buffer and redo the pipe action.
I’m not sure what’s different, but when I open a fresh buffer, type C-u M-| sort, then it sorts the entire buffer. No worries about the mark.
When I open an empty buffer and type Ctrl+u Meta+| I get the error message: “The mark is not set now, so there is no region”. I really like your solution and wish I could figure out why it doesn’t (always) work for me.
I have cua-selection-mode enabled so Ctrl+Return invokes rectangle mode. The use case for having it insert a Return is not common enough (in my workflow) to make it worthwhile remapping it. I wish Ctrl+j would do it the way it does in multiple cursors.
Use fill column to have a word per line. You do C-f then C-x f RET to set it, then M-q to refill all on the same line.
M-x so-l RET to sort lines for 5 less.
Without using “sort -u”, you can do as follows.
First, as slava said, begin with:
C-M-%
\W+ RET
C-q C-j RET
!
Then, “C-x C-x” to exchange point and mark and select region, followed
by “M-x capitalize-region” and “M-x sort-lines”.
Finally, to remove duplicate lines, “M-x replace-regexp”;
From (newline is C-q C-j) :
\(^.*\)\1\1*
To “\1”
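The idea behind this last replacement is that, once the lines are sorted, duplicates sit next to each other, so a backreference can collapse each run into one copy. A rough Python equivalent on a made-up sample follows; the `(?m)^` anchor is my own addition for safety and is not part of the Emacs regexp.

```python
import re

# Hypothetical sorted input: one word per line, every line newline-terminated.
sorted_lines = "At\nEt\nEt\nEt\nLorem\nLorem\nSed\n"

# (.*\n) captures one whole line including its newline;
# \1+ then consumes every immediately following copy of that line.
deduped = re.sub(r"(?m)^(.*\n)\1+", r"\1", sorted_lines)

print(deduped, end="")
```

This only works when the input is already sorted and the last line ends with a newline; otherwise a trailing duplicate would be missed.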
That is very close to my solution, even though it seems like the second part (after the first replacement) doesn’t improve the original solution (by the number of keystrokes).
The difference from my solution is that I used M-h instead of C-x C-x (or C-x h) to select a region and M-9 M-9 M-c to capitalize words. And in the final regexp replacement I had a slightly simplified expression: \(.* C-q C-j \)\1+
I’d completely forgotten that sort even had a -u argument. I’ve always just used C-u M-| sort | uniq. Not a big advantage other than for Golf purposes, but handy to know nevertheless :)
Well, of course, the pipe is more faithful to the Unix Way™ but that war has been over for some time.