 PowerHome Messageboard : PowerHome Programming
Topic: Get the text between two tags
MrGibbage
Super User
Joined: October 23 2006
Location: United States
Posts: 513
Posted: April 17 2010 at 09:01

For you regex experts...
How can I get the text between two tags in PH?

For instance:

<tagname>My Text</tagname>

I want to write a function/formula that I can pass "tagname" to and have it return "My Text". In PH terms, of course, "My Text" can be stored in a local variable.
BeachBum
Super User
Joined: April 11 2007
Location: United States
Posts: 1880
Posted: April 17 2010 at 10:04

If I understand your question correctly: I use POSW and MID to get the text between two tags. In the formulas below, (3,21) holds the data to search, (3,26) receives the position of the start tag (tag1), and (3,27) receives the position of the end tag (tag2). Of course, it's a little messier than searching between two tags with a single command.

ph_setvar_a(3,26, posw( ph_getvar_s(3,21), ".NOW." ))

ph_setvar_a(3,27, posw(ph_getvar_s(3,21),"$$",ph_getvar_n(3,26)))

mid(ph_getvar_s(3,21), ph_getvar_n(3,26), ph_getvar_n(3,27) - ph_getvar_n(3,26) )


__________________
Pete - X10 Oldie
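
For readers who want to see the position-and-slice idea outside PowerHome, here is a minimal Python sketch of the same approach. It is only an illustration: `text_between` is a hypothetical helper, not a PowerHome function, and the `.NOW.` and `$$` markers are simply the ones used in the formulas above. One detail worth checking in the PowerHome version: as written, the mid() call appears to start at the position of the start marker itself, so the result would seem to include that marker. The sketch skips past it by adding the marker's length.

```python
def text_between(data, start_tag, end_tag):
    """Return the text between start_tag and end_tag, or None if either is missing."""
    start = data.find(start_tag)       # position of the start marker (like posw)
    if start == -1:
        return None
    start += len(start_tag)            # skip past the marker itself
    end = data.find(end_tag, start)    # search for the end marker after the start
    if end == -1:
        return None
    return data[start:end]             # like mid(data, start, end - start)


print(text_between("junk .NOW.72 degrees$$ more junk", ".NOW.", "$$"))  # 72 degrees
```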
dhoward
Admin Group
Joined: June 29 2001
Location: United States
Posts: 4447
Posted: April 17 2010 at 14:45

Pete's method will indeed work, but if you want to do it the regex way, this is the formula to use. It assumes that the text to search is in [LOCAL1] and the tag to search for is in [LOCAL2]:

ph_regexsnap("<" + ph_getvar_s(1,2) + ">\(.+\)</" + ph_getvar_s(1,2) + ">",ph_getvar_s(1,1),1,0)

Below is the same formula with sample text plugged in where the LOCAL vars were:

ph_regexsnap("<tagname>\(.+\)</tagname>","<tagname>My Text</tagname>",1,0)

Hope this helps,

Dave.
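
As a rough cross-check of the regex idea in another language, here is a minimal Python sketch. It is not PowerHome code: `snap_tag` is a hypothetical helper, and Python's `re` module writes the capture group as plain `(.+)` where the formula above writes `\(.+\)`. The `re.escape` call is an addition of this sketch, there only in case a tag name contains regex metacharacters.

```python
import re


def snap_tag(text, tag):
    """Return the text between <tag> and </tag>, or None if there is no match."""
    # Build the same pattern shape as the formula: <tag>(.+)</tag>
    pattern = "<" + re.escape(tag) + ">(.+)</" + re.escape(tag) + ">"
    match = re.search(pattern, text)
    return match.group(1) if match else None


print(snap_tag("<tagname>My Text</tagname>", "tagname"))  # My Text
```

In Python, `.+` is greedy, so if the same tag appears twice on one line the capture runs through to the last closing tag; `.+?` would stop at the first. Whether ph_regexsnap behaves the same way is not confirmed in this thread.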
MrGibbage
Super User
Posted: April 18 2010 at 16:17

I went with Dave's method. Works perfectly. I knew there'd be a simple way.
syonker
Senior Member
Joined: March 06 2009
Location: United States
Posts: 212
Posted: August 03 2014 at 07:41

Gang,

I am noticing some very unfamiliar escape codes around the forum. Wondering if anyone had a reference table or similar that lays out all of the escape codes and how they are used?

Thanks in advance for any help.

-S

__________________
"I will consider myself having succeeded when my house becomes sentient and attempts to kill me."

><(((º>`·.¸¸.·´¯`·.¸><(((º>¸.

·´¯`·.¸. , . ><(((º>`·.¸¸.·´¯`·.¸><(((º>
syonker
Senior Member
Posted: August 03 2014 at 08:26

While I'm on this line of programming questions, I wanted to ask:

Does "~255" (as used in the example below) mean "Skip until whatever is past here is reached"?

ph_regexsnap('<td>Wind Gust</td>~255<span class="nowrap"><b>\(.+\)</b> mph</span> ',"[LOCAL1]",1,1)


-S

Edited by syonker - August 03 2014 at 08:27


TonyNo
Moderator Group
Joined: December 05 2001
Location: United States
Posts: 2889
Posted: August 04 2014 at 07:06

Yes. From the Help file...

The ~255 approach allows you to perform multiple searches within your search pattern (even across multiple lines). Only the last search, of multiple, can be used for Returning "snapped" data. NOTE: the ~255 character is NOT a part of the regular expression syntax and is strictly a PH special creation to separate regex searches.
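
One way to picture the separator behavior described in that Help text is the Python sketch below. It is a guess at the semantics based only on the quoted passage, not PowerHome's actual implementation: `chained_snap` and `SEPARATOR` are hypothetical names, and the rule that each sub-search starts where the previous match ended is an assumption of this sketch.

```python
import re

SEPARATOR = chr(255)  # stand-in for the ~255 character


def chained_snap(pattern, text):
    """Run each sub-pattern in turn, each starting where the previous match ended.

    Returns group 1 of the last sub-pattern (or its whole match if it has no
    group), or None if any step fails to match.
    """
    position = 0
    match = None
    for part in pattern.split(SEPARATOR):
        # '.' does not cross newlines here; lines are crossed between parts.
        match = re.compile(part).search(text, position)
        if match is None:
            return None
        position = match.end()
    return match.group(1) if match.re.groups else match.group(0)


# Made-up sample page; the markup only mimics the pattern asked about above.
page = '<td>Wind Gust</td>\n<td>\n<span class="nowrap"><b>12.0</b> mph</span>\n</td>'
pattern = "<td>Wind Gust</td>" + SEPARATOR + '<span class="nowrap"><b>(.+)</b> mph</span>'
print(chained_snap(pattern, page))  # 12.0
```

The first sub-pattern only positions the search; the captured value comes from the last one, which matches the Help file's note that only the last search can return snapped data.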
syonker
Senior Member
Posted: August 06 2014 at 07:36

Hi Tony,

Thanks for the clarity - for whatever reason my bleary-eyed paradigm prevented me from finding that clearly in the help. So the "~" is actually the escape character used to access a direct ASCII reference? In this case character 255? Hence a "~10~13" is equivalent to "Carriage Return/Line Feed" and "~0" would be (not that you'd ever use it) a null character?

-S

dhoward
Admin Group
Posted: August 08 2014 at 18:16

S,

Just to follow up on the above: the ~ character is the string escape character in PowerBuilder (the language that PowerHome is written in). If you wanted CR/LF, you could simply use ~r~n. ~b is backspace, ~t is tab, and ~~ is just the tilde. If the ~ character precedes a number, PB expects a 3-digit ANSI character value in decimal format. So ~10~13 would really be ~010~013. If you prefer hex, you can use ~hXX, where XX is two hex digits. You even have the option of octal, which would be ~oXXX, where XXX is three octal digits.

So the ~255 above (or ~hFF) is exactly that...an ANSI decimal 255 character.

Hope this helps,

Dave.
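
To make the escape rules in the post above concrete, here is a small Python sketch that expands the tilde forms it lists (~r, ~n, ~b, ~t, ~~, ~DDD, ~hXX, ~oXXX). It is an illustration only, not PowerBuilder's parser: `decode_tilde` is a hypothetical helper, and passing unknown letters through unchanged is an assumption of this sketch. As a side note, ASCII 13 is carriage return and 10 is line feed, so CR/LF in decimal form would be ~013~010.

```python
import re

# Single-character escapes named in the post above.
NAMED = {"r": "\r", "n": "\n", "b": "\b", "t": "\t", "~": "~"}

# ~hXX (hex), ~oXXX (octal), ~DDD (decimal), or a tilde plus one other character.
ESCAPE = re.compile(r"~(?:h([0-9A-Fa-f]{2})|o([0-7]{3})|([0-9]{3})|(.))", re.DOTALL)


def decode_tilde(s):
    """Expand tilde escapes in s into the characters they stand for."""
    def expand(m):
        hex_digits, octal_digits, decimal_digits, letter = m.groups()
        if hex_digits is not None:
            return chr(int(hex_digits, 16))
        if octal_digits is not None:
            return chr(int(octal_digits, 8))
        if decimal_digits is not None:
            return chr(int(decimal_digits))
        # Unknown letters pass through unchanged (an assumption of this sketch).
        return NAMED.get(letter, letter)
    return ESCAPE.sub(expand, s)


print(decode_tilde("~255") == decode_tilde("~hFF") == decode_tilde("~o377"))  # True
print(repr(decode_tilde("~013~010")))  # '\r\n'
```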