What's wrong with this regexp? 

I've got a text file I'm trying to parse, which looks like this:

"field 1","field 2",123.456,"field 4"

And I'm trying to parse it with the following expression:

($field1,$field2,$field3,$field4) = split(/(0-9|\"|^\")\,(0-9|\-|\")/)

But it's not working. What am I doing wrong?

If I use a text file which has "s around all fields, including the numeric
ones, then this string works (but I need to handle the above case):

($field1,$field2,$field3,$field4) = split(/\"\,\"/);

******************************************************************************
* CADENCE Magazine & AutoCAD Tech Journal | CompuServe: 71621,3173           *
* <URL: http://www.*-*-*.com/ ;     <URL: http://www.*-*-*.com/ ~psheerin/> *
******************************************************************************



Fri, 18 Jul 1997 02:32:56 GMT  

Quote:
> I've got a text file I'm trying to parse, which looks like this:

> "field 1","field 2",123.456,"field 4"

> And I'm trying to parse it with the following expression:

> ($field1,$field2,$field3,$field4) = split(/(0-9|\"|^\")\,(0-9|\-|\")/)

> But it's not working. What am I doing wrong?

It seems to me that the problem is the (0-9|...) part of the regexp: outside a
character class, 0-9 matches the literal three characters "0-9" (a zero, a minus
sign, a nine). You intended to match the range of decimal digits, which is
written '[0-9]'.
Even with that fixed, a digit matched by the expression would be dropped from the
returned list, because the split regexp has to describe how the fields are
separated, and the matched characters are left out of every element returned in
the list. (Capturing parentheses in a split pattern also add the captured text to
the list as extra elements.)
In addition, if the pattern matches the quote at the beginning of the string,
you'll get an empty first element.
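
To make the points above concrete, here is a small sketch using the sample line from the question. The variable names are only for illustration, and the patterns are deliberately simpler than the one in the original post.

```perl
use strict;
use warnings;

my $line = '"field 1","field 2",123.456,"field 4"';

# Outside a character class, 0-9 is three literal characters, not a digit range.
my $literal_hit = ('0-9' =~ /^(?:0-9)$/) ? 1 : 0;   # matches the text "0-9"
my $digit_hit   = ('5'   =~ /^(?:0-9)$/) ? 1 : 0;   # does not match a digit

# The separator text is discarded: splitting on a bare comma keeps the quotes.
my @plain = split /,/, $line;

# Capturing parentheses put each captured separator into the list as well.
my @captured = split /(,)/, $line;

# A separator matching at the very start of the string yields an empty first element.
my @leading = split /"/, '"a"b';

print "literal: $literal_hit, digit: $digit_hit\n";
print scalar(@plain), " fields, ", scalar(@captured), " with captures\n";
print "first element is empty\n" if $leading[0] eq '';
```

The last case is why the suggestion below strips the outer quotes before splitting.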

I suggest using the following split:

# Optionally: check that there are no missing quotes (I use a variable for clarity here)
$a = '"[^,]*"|\d+\.?\d*';
/^(($a),)*($a)$/ || die "Illegal string: $_\n";

# get rid of the first and last quotes
s/^"|"$//g;

# split the remaining string
($field1,$field2,$field3,$field4) = split( /","?|"?,"/ );

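This allows numerical fields to be first or last in the string. It assumes you have
no redundant spaces between the quotes and commas, and that two numerical fields
never sit next to each other (a separator must include at least one quote).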
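
Put together, the steps above might look like the following sketch. The helper name split_quoted_line is hypothetical, and the field pattern assumes quoted fields contain no commas; treat it as an illustration under those assumptions, not a general CSV parser.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the fields of one line, or dies on malformed input.
sub split_quoted_line {
    my ($line) = @_;
    chomp $line;

    # One field is either a quoted string without commas or a plain number.
    my $field = '"[^,]*"|\d+\.?\d*';

    # Check the whole line: zero or more "field," groups, then a final field.
    $line =~ /^(?:(?:$field),)*(?:$field)$/
        or die "Illegal string: $line\n";

    # Get rid of the first and last quotes, if present.
    $line =~ s/^"|"$//g;

    # Split on a comma that has a quote on at least one side.
    return split /","?|"?,"/, $line;
}

my @fields = split_quoted_line(qq{"field 1","field 2",123.456,"field 4"\n});
print join('|', @fields), "\n";   # prints field 1|field 2|123.456|field 4
```

Anchoring the check with ^ and $ matters here: without the anchors the pattern can succeed on any line that merely contains one valid field.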




Thu, 24 Jul 1997 18:28:04 GMT  
 