Argument | Description |
pattern | String. A regular expression search pattern. |
data | String. The string in which to perform the search. |
start | Long. The position within the data at which to start the search. Use 1 to start at the beginning. |
flags | Integer. Flags that control how the search is performed. Add the individual flag values together. Add 1 to make the search match case. Add 2 to make the search ignore CR/LFs within the data. |
localstart | Integer. The index of a local variable in which to have the start of the found data returned. Use 0 to not have the start returned. |
locallength | Integer. The index of a local variable in which to have the length of the found data returned. Use 0 to not have the length returned. |
. | Matches any character. |
\< | This matches the start of a word, where a word is defined in the traditional sense, that is, letters or numbers. Spaces, punctuation, CR/LF, etc. are not part of a word, and thus create a break. |
\> | This matches the end of a word. See also word definition above. |
\x | This allows you to use a character x that would otherwise have a special meaning. For example, \[ would be interpreted as [ and not as the start of a character set. |
[...] | This indicates a set of characters, for example, [abc] means any of the characters a, b or c. You can also use ranges, for example [a-z] for any lower case character. |
[^...] | The complement of the characters in the set. For example, [^A-Za-z] means any character except an alphabetic character. |
^ | This matches the start of a line (unless used inside a set, see above). |
$ | This matches the end of a line. |
* | This matches 0 or more times. For example, Sa*m matches Sm, Sam, Saam, Saaam and so on. |
+ | This matches 1 or more times. For example, Sa+m matches Sam, Saam, Saaam and so on. |
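The operators in the table above can be tried out in any regex engine. The following is a minimal sketch that uses Python's `re` module as a stand-in for the PowerHome engine, so small differences are possible. Python has no `\<` or `\>` tokens, so the `WORD_START` and `WORD_END` helpers below are assumed approximations, and `all_matches` is a hypothetical helper, not a PowerHome function.

```python
import re

# Python's re module stands in for the PowerHome regex engine here.
# Python has no \< or \> tokens; these lookarounds approximate them.
WORD_START = r"\b(?=\w)"   # boundary followed by a word character
WORD_END = r"\b(?<=\w)"    # boundary preceded by a word character

def all_matches(pattern, text):
    """Return every non-overlapping match of pattern in text."""
    return [m.group(0) for m in re.finditer(pattern, text, re.MULTILINE)]

star = all_matches(r"Sa*m", "Sm Sam Saam")         # * allows zero "a"s
plus = all_matches(r"Sa+m", "Sm Sam Saam")         # + needs at least one "a"
char_set = all_matches(r"[abc]", "a1b2c3")         # any of a, b or c
complement = all_matches(r"[^A-Za-z]", "a1b2c3")   # anything except a letter
escaped = all_matches(r"\[", "x[0]")               # a literal [ character
line_starts = all_matches(r"^.", "one\ntwo")       # first character of each line
line_ends = all_matches(r".$", "one\ntwo")         # last character of each line
word_starts = all_matches(WORD_START + ".", "hi, you")
word_ends = all_matches("." + WORD_END, "hi, you")

print(star, plus, word_starts, word_ends)
```

Here `star` includes "Sm" while `plus` does not, which is the whole difference between `*` and `+`.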
NOTE: If a string contains any quote characters (") then the string must be delimited with the single quote character ('). For example... 'he said, "no"'
Also, this function will not perform a regular expression search that spans multiple lines. If the data to search contains carriage returns or line feeds, the entire match for the regular expression must exist within a single line. If your regular expression must span lines, add 2 to the flags to have CRs and LFs temporarily converted: CR is converted to ASCII 128 and LF is converted to ASCII 129. If you convert CR/LF, you can include them in your search with the PowerHome escape characters ~128 and ~129 respectively.
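The start, flags and CR/LF behaviour described above can be illustrated with a small emulation. This is only a sketch in Python, not PowerHome's implementation. The function name `regex_search_sketch`, the returned `(start, length)` tuple (standing in for the localstart and locallength variables), the `(0, 0)` result for no match, and the case-insensitive default when 1 is not added are all assumptions made for illustration.

```python
import re

MATCH_CASE = 1   # flag value 1: make the search match case
SPAN_LINES = 2   # flag value 2: let a match run across CR/LF

def regex_search_sketch(pattern, data, start=1, flags=0):
    """Return (start, length) of the first match, 1-based, or (0, 0)."""
    if flags & SPAN_LINES:
        # Mirror the documented conversion: CR -> ASCII 128, LF -> ASCII 129.
        data = data.replace("\r", chr(128)).replace("\n", chr(129))
    re_flags = re.MULTILINE
    if not flags & MATCH_CASE:
        # Assumption: without flag 1 the search ignores case.
        re_flags |= re.IGNORECASE
    match = re.compile(pattern, re_flags).search(data, start - 1)
    if match is None:
        return (0, 0)
    return (match.start() + 1, match.end() - match.start())

text = "one\r\ntwo"
same_line_only = regex_search_sketch("one.+two", text)          # no match
across_lines = regex_search_sketch("one.+two", text, 1, 2)      # spans CR/LF
# chr(128) and chr(129) play the role of the ~128 and ~129 escapes.
explicit = regex_search_sketch("one" + chr(128) + chr(129) + "two", text, 1, 2)
print(same_line_only, across_lines, explicit)
```

Without flag 2 the pattern cannot cross the line break, so the first call finds nothing. With flag 2 the converted characters are ordinary data that `.` can match.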
NOTE: Where a search pattern string might contain a double quote (") character, you must use the single quote (') character to delimit the string. For example:
ph_regex ('he said, "no" ', "[LOCAL1]", 1, 0,1,2 )
You may also perform multiple searches by separating your search patterns with the PowerHome escape character ~255. This is most useful when CR/LF is NOT replaced and you are trying to match a particular piece of data. When using multiple searches, only the last matching search data is returned. An example multiple search would be: "degrees$~255[0-9]+ humidity". This first searches for the first occurrence where the word "degrees" appears at the end of a line. The function then does a regex search using "[0-9]+ humidity", starting from the end of the last regex search (the start of the line following the one on which "degrees" was found).
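The chained search described above can be sketched as a loop in which each sub-pattern starts where the previous match ended. The Python below is an illustration under assumptions, not PowerHome's code: `chained_search` is a hypothetical helper, `chr(255)` stands in for the ~255 escape, and the sample `report` text is invented for the demonstration.

```python
import re

SEPARATOR = chr(255)  # stands in for the PowerHome escape sequence ~255

def chained_search(patterns, data):
    """Run each sub-pattern in turn, each starting where the last match ended.

    Returns the text of the final match, or None if any step fails.
    """
    position = 0
    match = None
    for sub_pattern in patterns.split(SEPARATOR):
        match = re.compile(sub_pattern, re.MULTILINE).search(data, position)
        if match is None:
            return None
        position = match.end()
    return match.group(0)

# Invented sample data: the first line also contains a humidity figure.
report = "12 humidity yesterday\nOutside 23 degrees\n45 humidity\n"

direct = re.search("[0-9]+ humidity", report).group(0)
chained = chained_search("degrees$" + SEPARATOR + "[0-9]+ humidity", report)
print(direct, "|", chained)
```

A single search finds the earlier "12 humidity", while the chained search skips ahead to the reading that follows the "degrees" line.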
This function is often used with other PH string functions to trim a larger string, or to locate a string position within another string. See also pos(), posw(), ph_pos(), left(), mid(), right().
See also the .FAQs-String Tips-Hints Help file.
*** Simple Example ***
Assume you have multiple water leak sensors installed and trigger from each individually, but want to process them all with a single common Macro routine that puts the battery status (GOOD/BAD) in a series of Globals named "BATCHK_SINK", "BATCHK_WASHING", "BATCHK_TOILET".
The Leak Sensor battery periodic heartbeat Trigger will pass the triggering device's ID (e.g., "WATER LEAK-SINK") to the Macro. If you name all the device IDs in a similar fashion, such as . . .
WATER LEAK-SINK
WATER LEAK-WASHING
WATER LEAK-TOILET
Then you can strip off the unique device name (following the dash) and write the status to the appropriate Global var, as follows.
The following string operations would find the unique device name then append it to the base Global string to form the unique Global variable name.
Macro line 100 searches the device ID string data passed in TEMP10 by the Trigger and looks for the dash character ("-"), starting at position 1 in the string. Since this is a simple search, no special flag settings are needed, so "0" is used. The last two parameters store the position of the "-" in LOCAL4 and the length of the found data ("-SINK") in LOCAL5. These values are useful for further string operations; they are not actually needed here, but are shown for generality.
Macro line 110 trims the dash off the found string, leaving only "SINK".
Macro line 120 appends "SINK" to the common Global name of "BATCHK_", forming "BATCHK_SINK".
Finally in line 130, the value "OK" is written to this Global variable.
Line 140 prints out the various parameter values FYI.
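The string steps walked through above can be mirrored in a few lines of Python. This is a sketch of the logic only, not the PowerHome macro itself. `battery_global_name` and `globals_table` are hypothetical names, with the dictionary standing in for PowerHome Globals.

```python
GLOBAL_PREFIX = "BATCHK_"

def battery_global_name(device_id):
    """Build the Global name for a device ID such as 'WATER LEAK-SINK'."""
    dash_position = device_id.find("-")       # like the search in line 100
    if dash_position == -1:
        raise ValueError(f"no dash in device ID: {device_id!r}")
    found = device_id[dash_position:]         # the found data, "-SINK"
    suffix = found[1:]                        # line 110: trim the dash
    return GLOBAL_PREFIX + suffix             # line 120: append to the base name

globals_table = {}                            # stand-in for PowerHome Globals
for device in ("WATER LEAK-SINK", "WATER LEAK-WASHING", "WATER LEAK-TOILET"):
    globals_table[battery_global_name(device)] = "OK"   # line 130

print(globals_table)                          # line 140: show the values
```

Because every device ID follows the same naming pattern, one routine serves all sensors.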
*** Complex Example ***
The following examples assume that the following string (with CR/LF line enders) is stored in [LOCAL1]...
ROMId,Name, Value,Avg,
"3F000001CD92C728","Refrig",39.20,37.46,
"3F6000017C8BD128","Outside",23.90,19.81,
"3F000001CDB2BA27","House",70.65,70.13,
"3F000001CD9E6D28","Freezer",1.96,1.11,
This captures the initial portion of the first ROM ID
ph_regex ("3F[0-9]+","[LOCAL1]", 1, 0,2,3 ) --> returns "3F000001"
Because of "greediness" the following will search from "3F" to the beginning of the last word it can find on the line (not the first word).
ph_regex ("3F.+\<","[LOCAL1]", 1, 0,2,3 ) --> returns "3F000001CD92C728","Refrig",39.20,37."
Note that if 2 had been added to the flags (ignore line endings), the search would have captured everything from the first "3F" all the way to the "...6D28","Freezer",1.96,1." characters at the end.
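The three results above can be reproduced outside PowerHome as a check on the reasoning. The sketch below uses Python's `re` module as a stand-in, so engine details may differ. `WORD_START` is an assumed approximation of `\<`, and the `LOCAL1` string is rebuilt from the sample data with CR/LF line endings.

```python
import re

WORD_START = r"\b(?=\w)"   # approximates PowerHome's \< in Python

# Sample data rebuilt from the listing above, with CR/LF line endings.
LOCAL1 = "\r\n".join([
    'ROMId,Name, Value,Avg,',
    '"3F000001CD92C728","Refrig",39.20,37.46,',
    '"3F6000017C8BD128","Outside",23.90,19.81,',
    '"3F000001CDB2BA27","House",70.65,70.13,',
    '"3F000001CD9E6D28","Freezer",1.96,1.11,',
]) + "\r\n"

# 1. Digits only: the match stops at the first letter after "3F000001".
first = re.search(r"3F[0-9]+", LOCAL1).group(0)

# 2. Greedy .+ runs to the end of the line, then backs up to the start
#    of the last word on that line.
greedy = re.search(r"3F.+" + WORD_START, LOCAL1).group(0)

# 3. Emulate adding 2 to the flags: CR -> ASCII 128, LF -> ASCII 129,
#    so the greedy match can run through every line.
converted = LOCAL1.replace("\r", chr(128)).replace("\n", chr(129))
spanning = re.search(r"3F.+" + WORD_START, converted).group(0)

print(first)
print(greedy)
print(spanning[-20:])
```

The second and third searches use the same pattern. Only the treatment of line endings changes how far the greedy match extends.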