
Replace with regular expressions "except if..."
Yes, I see it doesn't work right. Unfortunately, that's about where my
knowledge of regexps comes to an end. :-( I am sure you should be able
to do this with regexps, but I think a single one probably wouldn't be
enough. The problem is when you have nested tags, e.g.:
<a href="http://something"><b>http://something</b></a>
  - this doesn't need to be expanded
<a href="http://something">something</a><b>http://hello</b>
                                           ^^^^^^^^^^^^
  - this does need to be expanded
For these cases, you need to build up a knowledge of the structure of
the HTML before and after the URL you match, which would become hugely
complicated (impossible?) with one regexp. I would suggest finding an
HTML parser in Tcl. The Tcl-XML parser by Steve Ball (<URL:
http://sourceforge.net/projects/tclxml>) could also provide a starting
point. You would have to parse looking for URLs as well as tags, and
then decide whether the URL came after an <a href...> but before a
</a>. Good luck!
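
For what it's worth, here is a rough sketch of that tag-tracking idea
without a full parser. Treat it as a starting point only. The proc name
linkify and the URL pattern are made up for illustration, and it assumes
Tcl 8.4 or later (for eq and for regsub returning its result). It treats
anything between < and > as a tag, so it will be confused by comments,
scripts, or a > inside an attribute value, and it will swallow trailing
punctuation into the URL.

```tcl
# Hypothetical helper: wrap bare URLs in <a href="..."> unless they
# already sit inside an <a> element or inside a tag's attributes.
proc linkify {html} {
    set out ""
    set depth 0   ;# how many <a> elements are currently open

    # Split the input into tags, runs of text, and stray "<" characters.
    foreach token [regexp -all -inline {<[^>]*>|[^<]+|<} $html] {
        if {[string index $token 0] eq "<"} {
            # A tag (or stray "<"): copy it through, tracking <a> and </a>.
            if {[regexp -nocase {^<a[\s>]} $token]} {
                incr depth
            } elseif {[regexp -nocase {^</a\s*>} $token]} {
                if {$depth > 0} {incr depth -1}
            }
            append out $token
        } elseif {$depth == 0} {
            # Text outside any anchor: expand anything shaped like scheme://...
            append out [regsub -all {[a-z]+://[^\s<>"]+} $token \
                            {<a href="&">&</a>}]
        } else {
            # Text inside an anchor: leave it alone.
            append out $token
        }
    }
    return $out
}

# The first example comes back unchanged; in the second, only
# http://hello gets wrapped.
puts [linkify {<a href="http://something"><b>http://something</b></a>}]
puts [linkify {<a href="http://something">something</a><b>http://hello</b>}]
```

Because the text is only rewritten between tags, a URL inside an
attribute such as <img src="..."> is never touched, which is the part
that is so awkward to express in a single regexp.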
Quote:
> Thanks Neil,
> I tried this but it matches too much and too little. I tried putting a
> URL after the last string (Other stuff) and it chose to match \2 from
> the "t" in stuff and forward :-).
> The expression I showed initially matches any string beginning
> with ...:// and returns it in \1 so that I can use it in {<a
> href="\1">\1</a>}. I just don't want it to match addresses that are
> already inside <a href...> tags. I have tried figuring out how to
> utilize lookahead and negate it (DeMorgan-wise) but it is really not
> very easy for me to construct. I don't know any formalized methods of
> designing these regular expressions and my brain is simply too small to
> keep it in the air without one. I truly admire those who do.
> Best regards, Thomas Nielsen