backslash as field separator 

I am having difficulty using the following awk script:

BEGIN { FS = "[]]*,[[]*| [[]+|\\" }
{
        for ( i = 1; i <= NF; i++)
                print i, ": ", $i
        print $0
}

with the following file:

    MoveJ [[x,y,z],[q1,q2,q3,q4],[cf1,cf2,cf6,cfx],[eax1,eax2,eax3,eax4,eax5,eax6]],v1000,z10,tool20_5cm\WObj:=Wobj4;
    MoveL [[x,y,z],[q1,q2,q3,q4],[cf1,cf2,cf6,cfx],[eax1,eax2,eax3,eax4,eax5,eax6]],v1000,z10,tool56cm\WObj:=Wobj4;

The last field does not separate at the backslash, although if I change to
FS="\\" it does work (but it only separates into two pieces). Could anyone
share some advice on cleaning up the FS expression?

Thanks,

-Jim



Tue, 30 Jul 2002 03:00:00 GMT  

Quote:

>I am having difficulty using the following awk script:

>BEGIN { FS = "[]]*,[[]*| [[]+|\\" }
>{    for ( i = 1; i <= NF; i++)
>        print i, ": ", $i
>     print $0 }

> with the following file:

>    MoveJ [[x,y,z],[q1,q2,q3,q4],[cf1,cf2,cf6,cfx],[eax1,eax2,eax3,eax4,eax5,eax6]],v1000,z10,tool20_5cm\WObj:=Wobj4;
>    MoveL [[x,y,z],[q1,q2,q3,q4],[cf1,cf2,cf6,cfx],[eax1,eax2,eax3,eax4,eax5,eax6]],v1000,z10,tool56cm\WObj:=Wobj4;

>The last field does not separate at the backslash,
>although if I change to FS="\\" it does work (but it only
>separates into two pieces). Could anyone share some advice
>on cleaning up the FS expression?

Try FS = "[]]*,[[]*| [[]+|\\\\".

You've been stung by backslash semantics in double-quoted
strings that are used as regular expressions. You need four
backslashes inside double quotes to match a single literal
backslash. Awk first translates each pair of backslashes
into one literal backslash when it compiles the script.
When the resulting string is then used as a regular
expression, the regexp engine needs two backslashes to
match one literal backslash, so the string must still hold
a pair after awk compiles it, which means the script source
has to start with two pairs (four).
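
As a quick check of the four-backslash form, here is a minimal sketch run from a POSIX shell. The shortened input line and the shell variable names (line, out) are made up for illustration; only the FS value comes from the suggestion above.

```shell
# Shortened, hypothetical version of the sample line.
# Single quotes keep the backslash literal in the shell.
line='MoveJ [[x,y,z],[q1,q2]],v1000,z10,tool20_5cm\WObj:=Wobj4;'

# The awk program is single-quoted too, so awk sees all four backslashes.
out=$(printf '%s\n' "$line" | awk '
BEGIN { FS = "[]]*,[[]*| [[]+|\\\\" }
{ print NF; print $(NF-1); print $NF }')

printf '%s\n' "$out"
```

On this input the line should split into 10 fields, with tool20_5cm and WObj:=Wobj4; as the last two, i.e. the backslash now acts as a separator.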




Tue, 30 Jul 2002 03:00:00 GMT  

Quote:

> I am having difficulty using the following awk script:

> BEGIN { FS = "[]]*,[[]*| [[]+|\\" }
> {       for ( i = 1; i <= NF; i++)
>         print i, ": ", $i
>           print $0 }

> with the following file:

>     MoveJ [[x,y,z],[q1,q2,q3,q4],[cf1,cf2,cf6,cfx],[eax1,eax2,eax3,eax4,eax5,eax6]],v1000,z10,tool20_5cm\WObj:=Wobj4;
>     MoveL [[x,y,z],[q1,q2,q3,q4],[cf1,cf2,cf6,cfx],[eax1,eax2,eax3,eax4,eax5,eax6]],v1000,z10,tool56cm\WObj:=Wobj4;

> The last field does not separate at the backslash, although if I change to
> FS="\\" it does work (but it only separates into two pieces). Could anyone
> share some advice on cleaning up the FS expression?

> Thanks,

> -Jim

FS = "[]]*,[[]*| [[]+|\\\\" works for me, but I'm having a hard time
explaining this behaviour. Apparently the
\\ in "...|\\" is not interpreted as the \\ in "\\".

confusing...

Eiso

--
Eiso AB
Dept. of Biochemistry
University of Groningen
The Netherlands



Tue, 30 Jul 2002 03:00:00 GMT  

..
Quote:
>FS = "[]]*,[[]*| [[]+|\\\\" works for me, but I'm having a
>hard time explaining this behaviour. Apparently the \\ in
>"...|\\" is not interpreted as the \\ in "\\".

..

Aren't single-character FS values treated as literal
characters, and only multi-character FS values treated as
regexps? If so, FS="\\" would be a literal backslash because
FS has length 1 after awk compiles the string "\\",
but FS=";|\\" would be a poorly formed regexp: awk would
compile the string to  ;|\  and apparently handle the
trailing backslash in the second alternative differently
from a literal backslash.
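
The single-character case can be checked directly. A minimal sketch, again from a POSIX shell; the input line and the shell variable name (single) are illustrative only. The multi-character form ";|\\" is left out on purpose, since what awk does with a regexp that ends in a lone backslash can differ between implementations.

```shell
# FS = "\\" is a one-character string (a single backslash) once awk has read it,
# so it is used as a literal separator rather than as a regexp.
single=$(printf '%s\n' 'tool20_5cm\WObj:=Wobj4;' | awk '
BEGIN { FS = "\\" }
{ print length(FS); print NF; print $1; print $2 }')

printf '%s\n' "$single"
```

This should report a length of 1 for FS and two fields, which matches the "only separates into two pieces" observation in the original post.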




Tue, 30 Jul 2002 03:00:00 GMT  

% > I am having difficulty using the following awk script:
% >
% > BEGIN { FS = "[]]*,[[]*| [[]+|\\" }

[...]

% > The last field does not separate at the backslash, although if I change to
% > FS="\\" it does work (but it only separates into two pieces). Could anyone

[...]

% FS = "[]]*,[[]*| [[]+|\\\\" works for me, but I'm having a hard time
% explaining this behaviour. Apparently the

There are two things at work here: the string reader and the regular
expression parser, and backslashes are special to both of them. When you
put \\ into a string, the string reader stores it as a single \. In the case
 FS = "\\"
this becomes a single-character field separator, which isn't treated as
an RE, so it splits OK. In the other cases, the lone \ is passed down to the
RE parser, which expects it to escape the next character; since nothing
follows it, the parser ignores it.
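
The two layers can be looked at separately. In this sketch (the names layers, parts and n are illustrative), length() shows what the string reader leaves behind, and a /regexp/ literal, which does not go through the string reader, needs only two backslashes to match one.

```shell
layers=$(awk 'BEGIN {
    # String reader: each pair of backslashes in the source becomes one.
    print length("\\")      # one character is left
    print length("\\\\")    # two characters are left for the regexp parser
    # A regexp literal is read by the regexp parser only,
    # so two backslashes here already match one literal backslash.
    n = split("tool56cm\\WObj:=Wobj4;", parts, /\\/)
    print n, parts[1], parts[2]
}')

printf '%s\n' "$layers"
```

The first two lines of output should be 1 and 2, and the split should yield two pieces, tool56cm and WObj:=Wobj4;.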

--

Patrick TJ McPhee
East York  Canada



Wed, 31 Jul 2002 03:00:00 GMT  
 