Tuesday, September 25, 2018

Text Replace Tricks and Tips

TextPad: Add Line Breaks at ], characters

  1. Press Ctrl+H, or choose Search -> Replace from the top menu
  2. Under the Search Mode group, select Regular expression
  3. In the Find what text field, type ],\s*
  4. In the Replace with text field, type ],\n
  5. Click Replace All
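
For readers who want to script the same replacement outside the editor, here is a minimal sketch using Python's re module. The function name and the sample string are illustrative assumptions, not part of the original tip; the pattern is the same one typed into the Find what field, with the "]" escaped for clarity.

```python
import re


def break_after_bracket_comma(text: str) -> str:
    """Insert a line break after every "]," and drop any whitespace that followed it."""
    # "\]" matches a literal "]"; "\s*" swallows spaces or old line breaks after the comma.
    return re.sub(r"\],\s*", "],\n", text)


sample = "[1, 2], [3, 4],[5, 6]"  # hypothetical input
print(break_after_bracket_comma(sample))
```

Each "]," ends its own line afterwards, whether or not it was originally followed by a space.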


TextPad: How to Add extra Line breaks at end of each line

You can quickly do this by using a feature called "Regular Expressions" to find and add empty lines.

  1. Open up the Find/Replace (Search menu > Replace)
  2. In the "Find what" field, type the following: (^.+$)\n(^.+$)
  3. In the "Replace with" field, type the following: \1\n\n\2
  4. Tick the "Regular expression" checkbox
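  5. Click the Replace All button repeatedly (at least twice, possibly three times) until you get the message "Cannot find the Regular Expression". Each pass only separates non-overlapping pairs of lines, so a single pass leaves some lines unseparated.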
  6. Untick the "Regular expression" checkbox
  7. Close the Replace dialog
  8. Confirm the file is formatted as you are expecting
  9. Save the file.
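
The need for repeated passes can be seen in a small Python sketch that applies the same find and replace expressions in a loop until nothing matches. The function name is hypothetical, and Python's re module is only assumed to behave like the editor's engine for this simple pattern.

```python
import re

# Same expressions as in steps 2 and 3; MULTILINE makes ^ and $ match at line boundaries.
PAIR = re.compile(r"(^.+$)\n(^.+$)", re.MULTILINE)


def double_space(text: str) -> tuple[str, int]:
    """Return the double-spaced text and the number of passes that changed something."""
    passes = 0
    while True:
        text, replaced = PAIR.subn(r"\1\n\n\2", text)
        if replaced == 0:  # the editor would report "Cannot find the Regular Expression"
            return text, passes
        passes += 1


result, passes = double_space("a\nb\nc\nd")
print(repr(result), passes)
```

On the four-line sample the first pass separates the pairs (a, b) and (c, d), and a second pass is needed to separate b from c.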

TextPad: Replace Line Break with a comma

Here is an example document with random words:

[Screenshot: document with words on multiple lines]

Start the replace operation by pressing F8.
Find what: type \n
Replace with: type , (or whatever character you need)

Check the "Regular expression" box before you search and replace.

All the lines will be merged onto a single line. It looks simple, but for a user facing a file of several hundred lines, learning this trick is a big relief.

[Screenshot: final result after the replace operation]
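
The same merge can be sketched in Python. This is an illustration under assumptions: the function name and sample words are made up, the text is assumed to use "\n" line endings, and the trailing line break is stripped first so the result does not end with a stray separator.

```python
def join_lines(text: str, separator: str = ",") -> str:
    """Replace every line break with the separator, merging the text onto one line."""
    # "\n" is a literal here, so a plain string replace is enough; no regex is needed.
    return text.rstrip("\n").replace("\n", separator)


words = "apple\nbanana\ncherry\n"  # hypothetical input
print(join_lines(words))
```

Passing a different separator, such as "; ", gives the "whatever character you need" variant.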

TextPad: How to Remove All Lines Except the Ones Containing a Pattern

As in NotePad,[1] you can remove all lines except the ones containing a pattern in TextPad.[2,3] It takes three steps:

  1. Bookmark all lines containing the pattern
  2. Invert the bookmarks
  3. Delete all bookmarked lines

Bookmark All Lines Containing a Pattern


For this demonstration, we want to remove all lines except the ones containing the following pattern:
  • "concurrent-mark-end"
To bookmark all lines containing a pattern:
  1. Open the "Search" menu and select "Find..."
  2. Enter the pattern in "Find what:" and select "Regular expression"
  3. Click the "Mark All" button

Inverse Bookmark


To invert the bookmarks:
  • Open the "Search" menu and select "Inverse Bookmarks"
You should now see that some lines are marked and others are not.

Delete All Bookmarked Lines


To delete all bookmarked lines:
  1. Open the "Edit" menu and select "Delete > Bookmarked Lines"
Only the lines containing the pattern should remain.
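
Taken together, the three steps amount to a line filter. A minimal Python sketch of that filter follows; the function name and the sample log lines are hypothetical and serve only to exercise the "concurrent-mark-end" pattern from the demonstration.

```python
import re


def keep_matching_lines(text: str, pattern: str) -> str:
    """Drop every line that does not contain a match for the pattern."""
    regex = re.compile(pattern)
    kept = [line for line in text.splitlines() if regex.search(line)]
    return "\n".join(kept)


# Hypothetical sample input, not taken from a real log.
log = "\n".join([
    "12.5: [GC concurrent-mark-start]",
    "13.1: [GC concurrent-mark-end, 0.6 sec]",
    "14.0: [GC pause (young)]",
    "15.2: [GC concurrent-mark-end, 0.4 sec]",
])
print(keep_matching_lines(log, "concurrent-mark-end"))
```

Filtering directly skips the bookmark and invert steps, which exist in the editor only because it deletes marked lines rather than unmarked ones.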


RegEx Summary


Summary of regular expressions:
Default Operator:  POSIX Operator:  Description:
.                  .                Any single character. Example: h.t matches hat, hit, hot and hut.
[ ]                [ ]              Any one of the characters in the brackets, or any of a range of characters separated by a hyphen (-), or a character class operator (see below). Examples: h[aeiou][a-z] matches hat, hip, hit, hop, and hut; [A-Za-z] matches any single letter; x[0-9] matches x0, x1, …, x9.
[^]                [^]              Any characters except for those after the caret "^". Example: h[^u]t matches hat, hit, and hot, but not hut.
^                  ^                The start of a line (column 1).
$                  $                The end of a line (not the line break characters). Use this for restricting matches to characters at the end of a line. Example: end$ only matches "end" when it's the last word on a line, and ^end only matches "end" when it's the first word on a line.
\<                 \<               The start of a word.
\>                 \>               The end of a word.
\t                 \t               The tab character.
\f                 \f               The page break (form feed) character.
\n                 \n               A new line character, for matching expressions that span line boundaries. This cannot be followed by operators '*', '+' or {}. Do not use this for constraining matches to the end of a line. It's much more efficient to use "$".
\xdd               \xdd             "dd" is the two-digit hexadecimal code for any character.
\( \)              ( )              Groups a tagged expression to use in replacement expressions. An RE can have up to 9 tagged expressions, numbered according to their order in the RE. The corresponding replacement expression is \x, for x in the range 1-9. Example: If \([a-z]+\) \([a-z]+\) matches "way wrong", \2 \1 would replace it with "wrong way".
*                  *                Matches zero or more of the preceding characters or expressions. Example: ho*p matches hp, hop and hoop.
?                  ?                Matches zero or one of the preceding characters or expressions. Example: ho?p matches hp, and hop, but not hoop.
+                  +                Matches one or more of the preceding characters or expressions. Example: ho+p matches hop, and hoop, but not hp.
\{count\}          {count}          Matches the specified number of the preceding characters or expressions. Example: ho\{2\}p matches hoop, but not hop.
\{min,\}           {min,}           Matches at least the specified number of the preceding characters or expressions. Example: ho\{1,\}p matches hop and hoop, but not hp.
\{min,max\}        {min,max}        Matches between min and max of the preceding characters or expressions. Example: ho\{1,2\}p matches hop and hoop, but not hp or hooop.
\|                 |                Matches either the expression to its left or its right. Example: hop\|hoop matches hop, or hoop.
\                  \                "Escapes" the special meaning of the above expressions, so that they can be matched as literal characters. Hence, to match a literal "\", you must use "\\". Example: \< matches the start of a word, but \\< matches "\<".
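
A few of the table's examples can be checked with a short Python sketch. Python's re module is used here only as a convenient stand-in: it writes groups, intervals and alternation unescaped, like the POSIX column, and it does not support the \< and \> word anchors, so those are left out. The variable names are illustrative.

```python
import re

# Tagged expressions: swap two lowercase words, as in the "way wrong" example.
swapped = re.sub(r"([a-z]+) ([a-z]+)", r"\2 \1", "way wrong")

# Interval operator {min,max}: one or two "o" characters between "h" and "p".
interval_hits = [w for w in ["hp", "hop", "hoop", "hooop"] if re.fullmatch(r"ho{1,2}p", w)]

# Negated class: any middle character except "u".
class_hits = [w for w in ["hat", "hit", "hot", "hut"] if re.fullmatch(r"h[^u]t", w)]

print(swapped)
print(interval_hits)
print(class_hits)
```

re.fullmatch is used so that each word must match the whole expression, which mirrors how the table's examples are phrased.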
Character Class Operators "[: ... :]":
These can be used in class expressions as an alternative way of representing classes of characters. For example, [a-z0-9] is equivalent to [[:lower:][:digit:]]. (Note the extra pairs of brackets.) The defined classes are:
Expression: Description:
[:alpha:] Any letter.
[:lower:] Any lower case letter.
[:upper:] Any upper case letter.
[:alnum:] Any digit or letter.
[:digit:] Any digit.
[:xdigit:] Any hexadecimal digit (0-9, a-f or A-F).
[:blank:] Space or tab.
[:space:] Space, tab, vertical tab, return, line feed, form feed.
[:cntrl:] Control characters (Delete and ASCII codes less than space).
[:print:] Printable characters, including space.
[:graph:] Printable characters, excluding space.
[:punct:] Anything that is not a control or alphanumeric character.
[:word:] Letters, hyphens and apostrophes.
[:token:] Any of the characters defined on the Syntax page for the document class, or in the syntax definition file if syntax highlighting is enabled for the document class.
Example:
HTML tags are in matched pairs of <…>, such as <FONT SIZE=+1>. To match any tag that begins and ends on the same line, use the regular expression:
<[^>]*>
This matches a "<", followed by zero or more characters, excluding ">", followed by a ">". Note that "*" finds the longest matching sequence on a line, so the regular expression:
<.*>
would be incorrect, because it would not stop at the first ">", if there was more than one on the line.
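
The difference between the two expressions is easy to see in a small Python sketch. The sample line is made up for illustration, and Python's re module is assumed to treat "*" as greedy in the same way.

```python
import re

line = "<B>bold</B> and <I>italic</I>"  # hypothetical line with several tags

# Excluding ">" inside the tag stops each match at the first closing bracket.
per_tag = re.findall(r"<[^>]*>", line)

# "." also matches ">", so the greedy "*" runs on to the last ">" on the line.
greedy = re.findall(r"<.*>", line)

print(per_tag)
print(greedy)
```

The first expression finds four separate tags, while the second returns the whole line as a single match.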
For more information and examples, see regular expressions, and replacement expressions in the Reference section.
