using_20regular_20expressions

Differences

This shows you the differences between two versions of the page.

using_20regular_20expressions [2018/03/31 13:19]
127.0.0.1 external edit
using_20regular_20expressions [2018/04/17 19:21] (current)
tbest3112 Added syntax highlighting
Line 6: Line 6:
 | [abc]\\ | matches "a", "b" or "c"\\ |
 | [a-z]\\ | matches any lowercase letter\\ |
-| [^b]at\\ | matches "cat", "fat", "hat" etc. but not "bat"\\ |
+<nowiki>[^b]</nowiki>at\\ | matches "cat", "fat", "hat" etc. but not "bat"\\ |
 \\  For more information on the syntax of regular expressions see this [[http://en.wikipedia.org/wiki/Regular_expression|Wikipedia article]].\\ \\  You can make use of regular expressions in your BBC BASIC program by means of the **gnu_regex** DLL which can be downloaded from [[http://people.delphiforums.com/gjc/gnu_regex.html|here]][[/Using%20regular%20expressions#footnote|[1]]]. To start with you must load the DLL in the usual way:\\ \\ 
+<code bb4w>
         SYS "LoadLibrary", "gnu_regex.dll" TO gnu_regex%
         IF gnu_regex% = 0 ERROR 100, "Cannot load gnu_regex.dll"
         SYS "GetProcAddress", gnu_regex%, "regcomp" TO regcomp%
         SYS "GetProcAddress", gnu_regex%, "regexec" TO regexec%
+</code>
 For this to work **gnu_regex.dll** needs to be in the current directory, the Windows directory (often C:\WINDOWS), the Windows system directory (often C:\WINDOWS\SYSTEM32) or one of the directories specified in the PATH environment variable. Alternatively you can copy the file to your BBC BASIC for Windows library folder and load it explicitly from there:\\ \\ 
+<code bb4w>
         SYS "LoadLibrary", @lib$+"gnu_regex.dll" TO gnu_regex%
+</code>
 The code below illustrates a very simple example of setting up a pattern and inputting strings from the user which are tested against this pattern:\\ \\ 
 +<code bb4w>
         DIM buffer% 255
  
Line 26: Line 31:
           IF result% PRINT "Not matched" ELSE PRINT "Matched"
         UNTIL FALSE
+</code>
 You should ensure that **buffer%** points to a memory buffer large enough to contain the //compiled// regular expression (although it's not clear how you are supposed to ascertain this!). As always, make sure you execute the **DIM** statement only once, or use **DIM LOCAL**, to avoid a memory leak and an eventual **No room** error.\\ \\  In this example the pattern matches the characters "a", "b", "c", "x", "y" or "z" anywhere in the string. The program as listed provides no information on //where// in the string the match occurred. You can discover that information by amending the program as follows:\\ \\ 
 +<code bb4w>
         DIM offsets{start%, finish%}
         REPEAT
Line 33: Line 40:
           IF result% PRINT "Not matched" ELSE PRINT "Matched at ";offsets.start%
         UNTIL FALSE
+</code>
 Here **offsets.start%** is set to the offset from the beginning of the string of the first match.\\ \\  You can specify that the matching is //case insensitive// by changing the final parameter of **regcomp** from 0 to 2 as follows:\\ \\ 
+<code bb4w>
         _REG_ICASE = 2
         SYS regcomp%, buffer%, pattern$, _REG_ICASE TO result%
+</code>
 You can also specify the use of **extended regular expressions** by setting the final parameter to 1:\\ \\ 
+<code bb4w>
         _REG_EXTENDED = 1
         SYS regcomp%, buffer%, pattern$, _REG_EXTENDED TO result%
+</code>
 In this mode additional //metacharacters// are recognised, for example the vertical bar (|) signifies alternatives:\\ \\ 
  
-| abc|def\\ | matches "abc" or "def"\\ |
+<nowiki>abc|def</nowiki>\\ | matches "abc" or "def"\\ |
 \\
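 The two option values shown above (1 and 2) are distinct bits, so it should be possible to request a case-insensitive extended regular expression by combining them with **OR**. The following is only a sketch: it assumes that **gnu_regex.dll** follows the usual POSIX convention of treating the final parameter of **regcomp** as a bit mask, that **buffer%** and **pattern$** have been set up as in the earlier example, and the error number 101 is an arbitrary choice:\\ \\ 
<code bb4w>
         _REG_EXTENDED = 1
         _REG_ICASE = 2
         REM Assumption: the flags are independent bits, as in POSIX regcomp
         SYS regcomp%, buffer%, pattern$, _REG_EXTENDED OR _REG_ICASE TO result%
         REM regcomp returns a non-zero value if the pattern could not be compiled
         IF result% ERROR 101, "Cannot compile regular expression"
</code>
 Under those assumptions a pattern such as <nowiki>abc|def</nowiki> would be expected to match "ABC" or "Def" as well as the lowercase forms.\\ \\ 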
 ----
 [1] When last checked, the file **gnu_regex.exe** was corrupted (missing the last byte). To repair it you can use this simple BBC BASIC program:\\ \\ 
+<code bb4w>
         F% = OPENUP("gnu_regex.exe")
         PTR#F% = EXT#F%
         BPUT #F%,0
         CLOSE #F%
+</code>
using_20regular_20expressions.txt · Last modified: 2018/04/17 19:21 by tbest3112