A Practical Guide to Regular Expressions – Learn RegEx with Real Life Examples
abc         Letters
123         Digits
\.          Period
[abc]       Only a, b, or c
[^abc]      Not a, b, nor c
[a-z]       Characters a to z
[0-9]       Numbers 0 to 9
{m}         m Repetitions
{m,n}       m to n Repetitions
*           Zero or more repetitions
+           One or more repetitions
?           Optional character
.           Any Character
^…$         Starts and ends
(…)         Capture Group
(a(bc))     Capture Sub-group
(.*)        Capture all
(abc|def)   Matches abc or def
\s          Any Whitespace
[ ]         Whitespace
\S          Any Non-whitespace character
\d          Any Digit
\D          Any Non-digit character
\w          Any Alphanumeric character [a-zA-Z_0-9]
\W          Any Non-alphanumeric character
Non-capturing group:
(?:expression)
Look-around assertions:
(?<=expression)   positive look-behind assertion
(?=expression)    positive look-ahead assertion
Capture Groups
Capturing Groups and Backreferences
What is a non-capturing group in regular expressions?
The only difference between capturing groups and non-capturing groups is that the former capture the matched character sequences for possible later re-use with a numbered backreference, while the latter do not.
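To make the difference concrete, here is a minimal sketch using Python's re module. The date string and the doubled-word sentence are made-up sample inputs, not taken from the linked pages.

```python
import re

text = "2024-05-17"  # hypothetical sample input

# Capturing groups: every (...) is numbered and can be read back later.
capturing = re.match(r"(\d{4})-(\d{2})-(\d{2})", text)
groups_capturing = capturing.groups()

# Non-capturing groups: (?:...) still groups, but stores nothing.
non_capturing = re.match(r"(?:\d{4})-(\d{2})-(?:\d{2})", text)
groups_non_capturing = non_capturing.groups()

# A numbered backreference (\1) re-uses whatever group 1 captured.
doubled = re.search(r"\b(\w+) \1\b", "this is is a test")
doubled_word = doubled.group(1)

print(groups_capturing)      # ('2024', '05', '17')
print(groups_non_capturing)  # ('05',)
print(doubled_word)          # is
```

Only the month survives in the second match, because the year and day sit in non-capturing groups.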
RegexOne – with Exercises
Lesson 11: Match groups / Capture group
Exercise
List of single characters
File:
hat
tut
cat
blue
red
zat
bet
tat
met
mat
RegEx:
[htm]at => matches "hat", "tat", "mat"
RegEx:
[^htm]at => matches ".at" except "hat", "tat", "mat"
Result: "cat", "zat"
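The same exercise can be replayed in a few lines of Python. This is only a sketch: it treats every word of the file as a separate string and uses re.fullmatch so that the whole word has to fit the pattern.

```python
import re

words = ["hat", "tut", "cat", "blue", "red", "zat", "bet", "tat", "met", "mat"]

# [htm] accepts exactly one of the listed characters.
listed = [w for w in words if re.fullmatch(r"[htm]at", w)]

# [^htm] accepts exactly one character that is NOT in the list.
negated = [w for w in words if re.fullmatch(r"[^htm]at", w)]

print(listed)   # ['hat', 'tat', 'mat']
print(negated)  # ['cat', 'zat']
```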
Whitespace at the beginning or end of a line
^[ \t]+|[ \t]+$
Explanation:
^       Start of line
[ \t]   List of single characters: Space and Tab
+       One or more occurrences
|       OR
$       End of line
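A short Python sketch of this pattern follows. The sample text is invented, and re.MULTILINE is needed so that ^ and $ apply to every line instead of only the start and end of the whole string.

```python
import re

# Hypothetical input with spaces and tabs around the content.
text = "  \tleading\nmiddle  \n\t both \t\n"

# Delete runs of spaces/tabs at the start or at the end of each line.
trimmed = re.sub(r"^[ \t]+|[ \t]+$", "", text, flags=re.MULTILINE)

print(repr(trimmed))  # 'leading\nmiddle\nboth\n'
```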
Remove newlines
File:
aaa
bbb
ccc
RegEx: '(?<=^.+)\n'
Result: aaabbbccc
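The look-behind above has variable length (.+), which not every engine accepts. Python's re module, for instance, only allows fixed-width look-behinds. The sketch below therefore swaps in (?<=.)\n, which should behave the same way here: a newline is removed only if the character before it is not itself a newline, so empty lines keep their line break.

```python
import re

text = "aaa\nbbb\nccc\n"

# Fixed-width stand-in for '(?<=^.+)\n': the line must contain at least one character.
joined = re.sub(r"(?<=.)\n", "", text)

# An empty line is left alone because "." does not match a newline.
with_blank = re.sub(r"(?<=.)\n", "", "aaa\n\nbbb")

print(joined)            # aaabbbccc
print(repr(with_blank))  # 'aaa\nbbb'
```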
Cut out only define name
#define DEFAULT_CLOCK_BLA MAC_ADD_HEAD_1,0xff,0xfe,MAC_ADD_HEAD_2,SR_NR
Remove #define:
DEFAULT_CLOCK_BLA MAC_ADD_HEAD_1,0xff,0xfe,MAC_ADD_HEAD_2,SR_NR
Search: '(?<=(^[\w]+))[\s]*[\w\,]*(?=($))' Replace: ''
DEFAULT_CLOCK_BLA
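In Python the look-behind on ^[\w]+ is rejected because its width is not fixed, so this sketch captures the name in group 1 and writes it back with \1 instead. The pattern used to strip "#define" is an assumption; the text above does not give one for that step.

```python
import re

line = "#define DEFAULT_CLOCK_BLA MAC_ADD_HEAD_1,0xff,0xfe,MAC_ADD_HEAD_2,SR_NR"

# Assumed pattern for the "Remove #define" step.
without_define = re.sub(r"^#define\s+", "", line)

# Keep the first word (group 1), drop the whitespace and the value list after it.
name_only = re.sub(r"^(\w+)\s*[\w,]*$", r"\1", without_define)

print(name_only)  # DEFAULT_CLOCK_BLA
```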
Append parameter list
default_clock_bla
Search: '(?<=(^[\w]+))\s*(?=($))' Replace: '(void);'
default_clock_bla(void);
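A possible Python equivalent, again with a capture group in place of the variable-length look-behind. Python writes the backreference as \1 where the replace field above would use $1.

```python
import re

name = "default_clock_bla"

# Match the whole identifier plus optional trailing whitespace, then append "(void);".
prototype = re.sub(r"^(\w+)\s*$", r"\1(void);", name)

print(prototype)  # default_clock_bla(void);
```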
Trim to only have one space
bool    default_clock_bla(void);
Search: '(?<=(^[\w]+))\s+' Replace: ' '
bool default_clock_bla(void);
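Sketched in Python, assuming the input line has several spaces between the return type and the function name (the exact number is made up for this example):

```python
import re

line = "bool    default_clock_bla(void);"  # assumed input with extra spaces

# Group 1 is the return type; the whitespace run after it becomes a single space.
single_spaced = re.sub(r"^(\w+)\s+", r"\1 ", line)

print(single_spaced)  # bool default_clock_bla(void);
```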
Separate result type from function name
bool default_clock_bla(void);
Search: '(?<=(^[\w]+))\s' Replace: '\n'
bool
default_clock_bla(void);
Add curly brackets
bool default_clock_bla(void);
Search: ';(?=($))' Replace: '\n{\n \n}\n'
bool default_clock_bla(void)
{
 
}
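The last two steps (splitting off the result type and adding the curly brackets) could look like this in Python. Both substitutions start from the same one-line prototype so that each can be tried on its own. The look-ahead on $ is dropped because a plain $ does the same job here.

```python
import re

prototype = "bool default_clock_bla(void);"

# Break the line after the leading return type; \1 puts the type back.
split_type = re.sub(r"^(\w+)\s", r"\1\n", prototype)

# Swap the trailing semicolon for an empty function body.
with_body = re.sub(r";$", "\n{\n \n}\n", prototype)

print(split_type)
print(with_body)
```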
Add to every line
DEVICE_I2CSLAVE=1
TARGET_LIKE_MBED
DEVICE_PORTOUT=1
DEVICE_PORTINOUT=1
TARGET_RTOS_M4_M7
Search: '(?<=.*)^' Replace: '<name>'
Search: '=1$' Replace: '</name><value>1</value>'
<name>DEVICE_I2CSLAVE</name><value>1</value>
<name>TARGET_LIKE_MBED
<name>DEVICE_PORTOUT</name><value>1</value>
<name>DEVICE_PORTINOUT</name><value>1</value>
<name>TARGET_RTOS_M4_M7
Add suffix if line-end is NOT “>”
<name>DEVICE_I2CSLAVE</name><value>1</value>
<name>TARGET_LIKE_MBED
<name>DEVICE_PORTOUT</name><value>1</value>
<name>DEVICE_PORTINOUT</name><value>1</value>
<name>TARGET_RTOS_M4_M7
Search: '(?<=[^>]$)' Replace: '</name><value/>'
<name>DEVICE_I2CSLAVE</name><value>1</value>
<name>TARGET_LIKE_MBED</name><value/>
<name>DEVICE_PORTOUT</name><value>1</value>
<name>DEVICE_PORTINOUT</name><value>1</value>
<name>TARGET_RTOS_M4_M7</name><value/>
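Here is a hedged Python sketch of the prefix and suffix steps run back to back. Two things differ from the patterns above and are my own choices: the prefix step uses a bare ^ (the look-behind on .* adds nothing and Python would reject it), and the suffix step uses (?<=[^>\n])$ so that the character class can never match a line break.

```python
import re

lines = """DEVICE_I2CSLAVE=1
TARGET_LIKE_MBED
DEVICE_PORTOUT=1
DEVICE_PORTINOUT=1
TARGET_RTOS_M4_M7"""

# Prefix every line, then turn a trailing "=1" into a value element.
tagged = re.sub(r"^", "<name>", lines, flags=re.MULTILINE)
tagged = re.sub(r"=1$", "</name><value>1</value>", tagged, flags=re.MULTILINE)

# Close the lines that do not already end in ">".
closed = re.sub(r"(?<=[^>\n])$", "</name><value/>", tagged, flags=re.MULTILINE)

print(closed)
```

The empty-match substitutions (^ and $) rely on the behaviour of Python 3.7 or later.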
Embrace lines
<name>DEVICE_I2CSLAVE</name><value>1</value>
<name>TARGET_LIKE_MBED</name><value/>
<name>DEVICE_PORTOUT</name><value>1</value>
<name>DEVICE_PORTINOUT</name><value>1</value>
<name>TARGET_RTOS_M4_M7</name><value/>
Search: '^(?=<name>.*)' Replace: '<macro>\n'
Search: '(?<=.*</value>|<value/>)' Replace: '\n</macro>'
<macro>
<name>DEVICE_I2CSLAVE</name><value>1</value>
</macro>
<macro>
<name>TARGET_LIKE_MBED</name><value/>
</macro>
<macro>
<name>DEVICE_PORTOUT</name><value>1</value>
</macro>
<macro>
<name>DEVICE_PORTINOUT</name><value>1</value>
</macro>
<macro>
<name>TARGET_RTOS_M4_M7</name><value/>
</macro>
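Wrapping each line could be sketched in Python as follows. The second pattern is split into two fixed-width look-behinds, since Python does not accept the .* inside a look-behind. Only two of the lines are used to keep the example short.

```python
import re

lines = """<name>DEVICE_I2CSLAVE</name><value>1</value>
<name>TARGET_LIKE_MBED</name><value/>"""

# Open a <macro> element in front of every line that starts with <name>.
wrapped = re.sub(r"^(?=<name>)", "<macro>\n", lines, flags=re.MULTILINE)

# Close it right after </value> or <value/>.
wrapped = re.sub(r"(?<=</value>)|(?<=<value/>)", "\n</macro>", wrapped)

print(wrapped)
```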
Table to SQL
Adding Quotation marks around string with regex
Joy F 11 51.3 50.5
Jane F 12 59.8 84.5
Jim M 12 57.3 83.0
Alice F 13 56.5 84.0
Jeff M 13 62.5 84.0
Bob M 14 64.2 90.0
Philip M 16 72.0 150.0
Search: '[ ]' Replace: ', '
Search: '([^\s,.0-9]+)' Replace: ''$1''
'Joy', 'F', 11, 51.3, 50.5
'Jane', 'F', 12, 59.8, 84.5
'Jim', 'M', 12, 57.3, 83.0
'Alice', 'F', 13, 56.5, 84.0
'Jeff', 'M', 13, 62.5, 84.0
'Bob', 'M', 14, 64.2, 90.0
'Philip', 'M', 16, 72.0, 150.0
Search: '(?<=^)' Replace: ') ,(\n '
) ,(
 'Joy', 'F', 11, 51.3, 50.5
) ,(
 'Jane', 'F', 12, 59.8, 84.5
) ,(
 'Jim', 'M', 12, 57.3, 83.0
) ,(
 'Alice', 'F', 13, 56.5, 84.0
) ,(
 'Jeff', 'M', 13, 62.5, 84.0
) ,(
 'Bob', 'M', 14, 64.2, 90.0
) ,(
 'Philip', 'M', 16, 72.0, 150.0
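The three substitutions can be chained in Python roughly like this. It is a sketch on the first two rows only; Python writes the backreference as \1, and a bare ^ with re.MULTILINE stands in for (?<=^).

```python
import re

rows = """Joy F 11 51.3 50.5
Jane F 12 59.8 84.5"""

# 1. Separate the columns with ", ".
comma_separated = re.sub(r"[ ]", ", ", rows)

# 2. Quote every run of characters that is not whitespace, comma, dot or digit.
quoted = re.sub(r"([^\s,.0-9]+)", r"'\1'", comma_separated)

# 3. Put ") ,(" and a line break in front of every row.
as_values = re.sub(r"^", ") ,(\n ", quoted, flags=re.MULTILINE)

print(as_values)
```

As in the result above, the output still starts with a stray ") ,(" and lacks the closing ")", so it needs a manual touch before it can go into an INSERT statement.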
Strip CC, only use source file
One-Liner
Original: CC -m64 -instances=global -c -fast -g0 -I/opt/include -o build/testGlobals.o ../app/testGlobals.cpp
Search: ^.* (.*)$
Replace: $1
Result: ../app/testGlobals.cpp
Multi-Liner
mkdir -p build/x86_Release
CC -c -fast -g0 -I/opt/include -o build/x86_Release/dynamoGlobals.o dynamoGlobals.cpp
CC: Warning: -xchip=native detection failed, falling back to -xchip=generic
mkdir -p build/x86_Release /_ext/710760585
CC -c -fast -g0 -I/opt/include -o build/x86_Release/_ext/710760585/TValueList.o ../dynamo_base/TValueList.cpp
CC: Warning: -xchip=native detection failed, falling back to -xchip=generic
mkdir -p build/x86_Release /_ext/710760585
CC -c -fast -g0 -I/opt/include -o build/x86_Release/_ext/710760585/UProductConstructor.o ../dynamo_base/UProductConstructor.cpp
CC: Warning: -xchip=native detection failed, falling back to -xchip=generic
mkdir -p build/x86_Release /_ext/710760585
Search: '^(?!CC ).*$\n?' Replace: ''
Search: '^.* (.*)$' Replace: '$1'
dynamoGlobals.cpp
../dynamo_base/TValueList.cpp
../dynamo_base/UProductConstructor.cpp
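Both patterns work unchanged in Python's re module once re.MULTILINE is set; only the replacement is written \1 instead of $1. The sketch below runs them on a shortened copy of the build log.

```python
import re

log = """mkdir -p build/x86_Release
CC -c -fast -g0 -I/opt/include -o build/x86_Release/dynamoGlobals.o dynamoGlobals.cpp
CC: Warning: -xchip=native detection failed, falling back to -xchip=generic
mkdir -p build/x86_Release /_ext/710760585
CC -c -fast -g0 -I/opt/include -o build/x86_Release/_ext/710760585/TValueList.o ../dynamo_base/TValueList.cpp
CC: Warning: -xchip=native detection failed, falling back to -xchip=generic
"""

# Drop every line that does not start with "CC " (the warnings start with "CC:").
only_cc = re.sub(r"^(?!CC ).*$\n?", "", log, flags=re.MULTILINE)

# Greedy ".* " runs up to the last space, so group 1 is the final argument.
sources = re.sub(r"^.* (.*)$", r"\1", only_cc, flags=re.MULTILINE)

print(sources)
```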