regex ? 

When the title contains a ' or a ", this appears to crash.

re.compile(r"<title>(.*)</title>", re.I)

Any ideas?

David



Tue, 22 Jun 2004 02:41:46 GMT  


Quote:
> When title has an ' or " this appears to crash.

> re.compile(r"<title>(.*)</title>", re.I)

> Any ideas?

> David

Works for me:

import re

# Sample text whose title contains apostrophes.
sampleData = ("Spam, spam, <GREEN>ham</GREEN>. "
              "<TITLE>Here's lookin' at you, kid.</TITLE> "
              "Say no more.")

# re.I makes the pattern case-insensitive, so <TITLE> matches as well.
pat = re.compile(r"<title>(.*)</title>", re.I)

match = pat.search(sampleData)

if match:
    print(match.groups())
else:
    print("No matches")

Output:

("Here's lookin' at you, kid.",)



Tue, 22 Jun 2004 04:23:52 GMT  


Quote:
> pat = re.compile(r"<title>(.*)</title>", re.I)

Or, just in case somebody puts two sets of titles in one document, for
whatever reason...

pat = re.compile(r"<title>([^</title>]+)</title>", re.I)
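To make the two-titles concern concrete, here is a small sketch of how the greedy pattern quoted above behaves. The sample string and the names two_titles, greedy and greedy_match are made up for illustration.

```python
import re

# Hypothetical input with two title elements on one line.
two_titles = "<title>First</title> filler <title>Second</title>"

greedy = re.compile(r"<title>(.*)</title>", re.I)

# Greedy .* extends to the LAST </title>, so the group swallows
# the first closing tag and everything up to the second one.
greedy_match = greedy.search(two_titles)
print(greedy_match.group(1))  # First</title> filler <title>Second
```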



Tue, 29 Jun 2004 03:09:46 GMT  

Quote:


> > pat = re.compile(r"<title>(.*)</title>", re.I)

> Or just in case somebody puts two set of titles in one document, for
> whatever reason...

> pat = re.compile(r"<title>([^</title>]+)</title>", re.I)

No!!!  The part [^</title>] is a character class: it excludes each of the
CHARACTERS less-than, slash, t, i, l, e, and greater-than from being
accepted as part of a title.  This is VERY unlikely indeed to be what one
would want.
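A minimal sketch of the difference, with made-up sample strings. The non-greedy (.*?) pattern is only one possible alternative, not something proposed in this thread, and it assumes each title sits on a single line, since . does not match a newline without re.S.

```python
import re

sample = "<TITLE>Here's lookin' at you, kid.</TITLE> <title>Second</title>"

char_class = re.compile(r"<title>([^</title>]+)</title>", re.I)
non_greedy = re.compile(r"<title>(.*?)</title>", re.I)

# The negated class rejects any title containing t, i, l, e, <, / or >,
# so neither title in the sample can match.
print(char_class.search(sample))  # None

# It only "works" for titles that happen to avoid those characters.
print(char_class.search("<title>Spam</title>").group(1))  # Spam

# A non-greedy group stops at the first closing tag instead.
titles = non_greedy.findall(sample)
print(titles)  # ["Here's lookin' at you, kid.", 'Second']
```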

Alex



Tue, 29 Jun 2004 22:47:30 GMT  
 