How to do backslash expansion? 

So in a perl script I'm writing, I get to a point where I want to apply
backslash-escape interpretation to a string that came in as external data ---
e.g. I've just read "foo\n" (literally), and after tearing off the leading and
trailing double-quotes I want to replace the two bytes "\n" with a newline.

I tried looking for the answer in the pods, I really did. I failed. The
closest I found was in perlfaq4, how to unescape, which tells you how to yank
backslashes but ``won't expand "\n" or "\t" or any other special escapes.''

Text::ParseWords doesn't seem to offer this either. Couldn't find anything in
CPAN/modules/by-module/Text/.

Eventually I gave up looking and coded:

        s/\\t/\t/g;
        s/\\n/\n/g;
        s/\\r/\r/g;
        s/\\f/\f/g;
        s/\\b/\b/g;
        s/\\a/\a/g;
        s/\\e/\e/g;
        s/\\(0\d+)/chr(oct($1))/ge;
        s/\\x([\dA-Fa-f]+)/chr(hex($1))/ge;
        s/\\c(.)/chr(ord("\U$1")-64)/ge;

but that left me with that awful feeling that I was writing a heck of a lot of
noise to inefficiently code something at the perl level that perl implemented
internally with some presumably much more efficient C. Is there any way to
bring perl's builtin backslash-string-escape-interp logic to bear on some data
that's not presented to the compiler? That last clause means no, I'm not
interested in answers that use eval() to route my data through the perl
compiler....

-Bennett



Tue, 13 Nov 2001 03:00:00 GMT  
Well, there is a way to do it, without using the eval command.  Of course,
it's just as slow, since it still does an eval--just not an obvious one.
But, on the plus side, there's no reason to strip the quotes or the
return...

#!/usr/local/bin/perl -Tw
$test = '"Test for \ttab\n"'."\n";
$test =~ s/^(.*)$/$1/see;
print $test;
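
A minimal sketch of why routing external data through s///ee deserves
caution: the second /e compiles whatever the data contains. The input string
below is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical external input: it looks like a quoted string, but the
# @{[ ... ]} construct makes Perl run code while interpolating it.
my $input = '"sum: @{[ 6 * 7 ]}\n"';

# Same substitution as above, applied to a copy of the input.
(my $expanded = $input) =~ s/^(.*)$/$1/see;

print $expanded;    # the arithmetic inside the data has been executed
```

Anything that reaches this substitution from outside the program can run
code in the same way, so it is only reasonable for trusted input.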

Joe Lundgren
Software Engineer, Micron Technology
PH:  368-4388  PG: 391-5376

Quote:
-----Original Message-----

Sent: Friday, May 28, 1999 1:32 PM

Subject: How to do backslash expansion?

So in a perl script I'm writing, I get to a point where I want to apply
backslash-escape interpretation to a string that came in as external data ---
e.g. I've just read "foo\n" (literally), and after tearing off the leading and
trailing double-quotes I want to replace the two bytes "\n" with a newline.



Tue, 13 Nov 2001 03:00:00 GMT  
[Posted and a courtesy copy mailed.]



Quote:
> So in a perl script I'm writing, I get to a point where I want to apply
> backslash-escape interpretation to a string that came in as external data ---
> e.g. I've just read "foo\n" (literally), and after tearing off the leading and
> trailing double-quotes I want to replace the two bytes "\n" with a newline.
...    
>    s/\\t/\t/g;
>    s/\\n/\n/g;
>    s/\\r/\r/g;
>    s/\\f/\f/g;
>    s/\\b/\b/g;
>    s/\\a/\a/g;
>    s/\\e/\e/g;
>    s/\\(0\d+)/chr(oct($1))/ge;
>    s/\\x([\dA-Fa-f]+)/chr(hex($1))/ge;
>    s/\\c(.)/chr(ord("\U$1")-64)/ge;

> but that left me with that awful feeling that I was writing a heck of a lot of
> noise to inefficiently code something at the perl level that perl implemented
> internally with some presumably much more efficient C. Is there any way to
> bring perl's builtin backslash-string-escape-interp logic to bear on some data
> that's not presented to the compiler? That last clause means no, I'm not
> interested in answers that use eval() to route my data through the perl
> compiler....

       s/\\([tnrfbae])/qq{"\\$1"}/gee;

for the first seven lines.

Of course, this uses eval() to route your data through the perl
compiler...

Here is a much cleaner way to write the last one, which works for non-
alphabetic characters also:

       s/\\c(.)/$1 & "\x1F"/ge;
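
A small sketch of what the string form of & does here, assuming ASCII and
plain string operands (no "use feature 'bitwise'"). The sample characters
are arbitrary.

```perl
use strict;
use warnings;

# With two string operands, & works on the characters' byte values.
# Masking with "\x1F" keeps the low five bits, which maps a letter
# (either case) to its control character.
my @results = map { ord($_ & "\x1F") } ('A', 'a', 'Z', '[', '@');

print join(' ', @results), "\n";

# One case where the mask and a Perl literal disagree (ASCII platforms):
my $masked_qmark = ord('?' & "\x1F");   # low five bits of 0x3F
my $perl_qmark   = ord("\c?");          # Perl defines \c? as DEL
print "$masked_qmark $perl_qmark\n";
```

So the mask is a compact approximation of "\cX" that agrees with Perl for
letters and for @ [ \ ] ^ _, but not for "\c?".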

--
(Just Another Larry) Rosler
Hewlett-Packard Company
http://www.hpl.hp.com/personal/Larry_Rosler/



Wed, 14 Nov 2001 03:00:00 GMT  

Quote:

> Eventually I gave up looking and coded:

>    s/\\t/\t/g;
>    s/\\n/\n/g;
>    s/\\r/\r/g;
>    s/\\f/\f/g;
>    s/\\b/\b/g;
>    s/\\a/\a/g;
>    s/\\e/\e/g;
>    s/\\(0\d+)/chr(oct($1))/ge;
>    s/\\x([\dA-Fa-f]+)/chr(hex($1))/ge;
>    s/\\c(.)/chr(ord("\U$1")-64)/ge;

> but that left me with that awful feeling that I was writing a heck of a lot of
> noise to inefficiently code something at the perl level that perl implemented
> internally with some presumably much more efficient C. Is there any way to
> bring perl's builtin backslash-string-escape-interp logic to bear on some data
> that's not presented to the compiler? That last clause means no, I'm not
> interested in answers that use eval() to route my data through the perl
> compiler....

I can sell you a nice substitution --

        s/(.*)/$1/ee;

-- but in truth it's just an eval() in disguise.

The code you've written above is actually flawed.  If the input string
were to contain octal codes for, say, a backslash, an 'x' and an 'A',
then the eighth substitution would write '\\xA' into the string and the
ninth would replace it with a newline.  This is probably not what you
want -- at least, it's not what the Perl compiler does.
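
The flaw can be reproduced with just the eighth and ninth substitutions
from the original post. This is a sketch; the test string is made up, and
uses the leading-0 octal form that the original code expects.

```perl
use strict;
use warnings;

# Octal escapes for a backslash (\0134), an 'x' (\0170) and an 'A' (\0101).
my $data = '\0134\0170\0101';

# Eighth substitution: octal escapes.
(my $pass = $data) =~ s/\\(0\d+)/chr(oct($1))/ge;
my $after_octal = $pass;    # now the three characters: backslash, x, A

# Ninth substitution: hex escapes. It rescans text the eighth one produced.
$pass =~ s/\\x([\dA-Fa-f]+)/chr(hex($1))/ge;

printf "after octal pass: %s\n", $after_octal;
printf "after hex pass is a lone newline: %s\n", $pass eq "\n" ? 'yes' : 'no';
```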

The cure is to scan the string just once, doing everything in that one
scan.  Here's an improved bit of Perl that does this.  It also hoists
some of the work into a bit of initialisation code that prepares a hash
called %xlate.

        #!perl -w
        use strict;

        # Initialisation

        my %xlate;
        for ('a' .. 'z')
        {       $xlate{$_} = eval "qq/\\$_/";
                $xlate{"c$_"} = $xlate{"c\U$_"} = chr(ord($_) - 96);
        }

        # Read lines and translate them

        while (<>)
        {       s/\\ (x [\d A-F a-f]+ | 0 \d+ | c? [A-Z a-z])
                 /      (substr($1, 0, 1) eq 'x')?
                                chr hex substr $1, 1:
                        (substr($1, 0, 1) eq '0')?
                                chr oct substr $1, 1:
                                $xlate{$1}
                 /exg;
                print;
        }

Now let's see if someone comes up with a better answer.

Markus

--
Delete the 'delete this bit' bit of my address to reply



Wed, 14 Nov 2001 03:00:00 GMT  
        >snip<
:        s/\\([tnrfbae])/qq{"\\$1"}/gee;
:
: for the first seven lines. Of course, this uses eval() to route your data
: through the perl compiler...

        Yep.

: Here is a much cleaner way to write the last one, which works for non-
: alphabetic characters also:
:        s/\\c(.)/$1 & "\x1F"/ge;

        $ perl -e '$_= q(Mary\nhad\ta\\little\lamb!\a\a\a); s/\\c(.)/$1 & "\x1F"/ge; print'
        Mary\nhad\ta\little\lamb!\a\a\a

        Am I using it wrong?

        Even still, the eval is probably going to be quite slow:

        Benchmark: timing 10000 iterations of Eval, Table...
              Eval: 12 wallclock secs (11.25 usr +  0.41 sys = 11.66 CPU)
             Table:  1 wallclock secs ( 0.95 usr +  0.00 sys =  0.95 CPU)

        A simple lookup table would probably work better overall:

        $ cat foo.pl
        #!/usr/local/bin/perl -w

        use strict;
        use Benchmark;
        my $string = q(Mary\thad a\nlittle\\lamb\n);
        my %escape = (
            t   => "\t",
            n   => "\n",
            r   => "\r",
            f   => "\f",
            b   => "\b",
            a   => "\a",
            e   => "\e",
            "\\"=> "\\",
        );

        timethese shift || 1000, {
            Eval => sub {
                local $_ = $string;
                s/\\([tnrfbae])/qq{"\\$1"}/gee;
            },
            Table => sub {
                local $_ = $string;
                s/\\([tnrfbae\\])/$escape{$1}/g;
            }
        };
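
For a quick look at what the table version produces, here is a sketch that
wraps the same regex in a helper. The name expand_simple and the sample
string are made up for illustration.

```perl
use strict;
use warnings;

my %escape = (
    t => "\t", n => "\n", r => "\r", f => "\f",
    b => "\b", a => "\a", e => "\e", '\\' => '\\',
);

# Hypothetical helper; the regex is the "Table" one from the benchmark.
sub expand_simple {
    my ($text) = @_;
    $text =~ s/\\([tnrfbae\\])/$escape{$1}/g;
    return $text;
}

# In single quotes, \\\\ is two literal backslashes; \t, \n and \q are
# each a literal backslash followed by a letter.
my $out = expand_simple('Mary\thad a\nlittle\\\\lamb\q');
print $out, "\n";
```

Having the backslash itself in the table means a doubled backslash is
consumed as a pair and becomes one backslash. Sequences that are not in the
table, such as \q here, pass through unchanged.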

--

                                        Pizza......for the body.
                                        Sushi......for the soul.
                                             -- User Friendly



Wed, 14 Nov 2001 03:00:00 GMT  

Quote:
> Here's an improved bit of Perl that does this.

But the code I wrote inherited three bugs from Bennett's original
solution.  (Serves me right for posting code so late at night.)  First,
it assumed that all octal character escapes begin with '\\0', whereas in
fact they don't.  Second, it attempted to include '8' and '9' as octal
digits.  Third, it failed to take into account that an octal escape
can't be more than three digits long or a hex escape more than two.  The
following one-liner --

        perl -e "print qq/1\x0A2\0123/"

-- prints this, on my computer:

        1
        2
        3

Knowing this, we can construct a lookup hash for every escape sequence
that we choose to handle, providing the greatest possible speed.  Here's
the new program:

        #!perl -w
        use strict;

        my %xlate;

        # Set up translations.

        for (0 .. 255)          # portability assumption?
        {       # Do "\1", "\01", "\001",
                # "\xA", "\x0A", "\xAA", "\xaa", etc.

                my $o   = sprintf '%o',   $_;
                my $oo  = sprintf '%02o', $_;
                my $ooo = sprintf '%03o', $_;
                my $x   = sprintf '%x',   $_;
                my $xx  = sprintf '%02x', $_;
                my $X   = uc $x;
                my $Xx  = ucfirst $xx;
                my $XX  = uc $xx;
                my $xX  = lcfirst $XX;

                       "x$x", "x$xx", "x$X", "x$Xx", "x$XX", "x$xX"}
                                        = (chr $_) x 9;

                # Do \a and \ca:

                $_ = chr $_;
                $xlate{$_}    = eval "qq/\\$_/";
                $xlate{"c$_"} = eval "qq/\\c$_/";
        }
        $xlate{'c\\'} = chr 0x1c;

        # Translate the input.
        # Note the order of the RE: we want \x0A and \012 to be found
        # in preference to a simple \x and \0.

        while (<>)
        {       s/\\ (x [\dA-Fa-f]{1,2} | [0-7]{1,3} | c? .)
                 /$xlate{$1}/xg;
                print;
        }

This code doesn't attempt to handle \u \l \U \L \Q \E, and it silently
eats them.  It does, however, correctly reduce '\\w' to 'w'.  By using
eval() to set up \a, \b, etc, it automatically handles new escapes built
into future versions of Perl.  Finally, it's not clear how a single
backslash at the end of a line should be treated, so the code leaves it
alone.
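
For comparison, here is a sketch of the same single-scan idea that computes
hex, octal and control characters on the fly instead of precomputing a
hash, and uses no eval at all. The helper name expand_backslashes is made
up, the control-character rule assumes ASCII, and unlike the program above
it turns \u, \L and friends into the bare letter rather than eating them.

```perl
use strict;
use warnings;

# Single-letter escapes, spelled out by hand rather than built with eval.
my %simple = (
    t => "\t", n => "\n", r => "\r", f => "\f",
    b => "\b", a => "\a", e => "\e",
);

# Hypothetical helper: expand escapes in one left-to-right scan, so text
# produced by one escape is never rescanned as the start of another.
sub expand_backslashes {
    my ($text) = @_;
    $text =~ s{
        \\ (?: x ([0-9A-Fa-f]{1,2})   # group 1: hex, at most two digits
             | ([0-7]{1,3})           # group 2: octal, at most three digits
             | c (.)                  # group 3: control character
             | (.)                    # group 4: any other single character
           )
    }{
          defined($1) ? chr(hex($1))
        : defined($2) ? chr(oct($2))
        : defined($3) ? chr(ord(uc($3)) ^ 64)
        : (exists $simple{$4} ? $simple{$4} : $4)
    }gesx;
    return $text;
}

print expand_backslashes('tab:\t hex:\x41 octal:\101 other:\w'), "\n";
```

A lone backslash at the very end of the string matches none of the
alternatives and is left alone, as in the program above.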

Markus

--
Delete the 'delete this bit' bit of my address to reply



Thu, 15 Nov 2001 03:00:00 GMT  


Quote:

>    >snip<
> :        s/\\([tnrfbae])/qq{"\\$1"}/gee;
> :
> : for the first seven lines. Of course, this uses eval() to route your data
> : through the perl compiler...

>    Yep.

> : Here is a much cleaner way to write the last one, which works for non-
> : alphabetic characters also:
> :        s/\\c(.)/$1 & "\x1F"/ge;

>    $ perl -e '$_= q(Mary\nhad\ta\\little\lamb!\a\a\a); s/\\c(.)/$1 & "\x1F"/ge; print'
>    Mary\nhad\ta\little\lamb!\a\a\a

>    Am I using it wrong?

You are using the one-liner that specifically evaluates the "\\cX"
escape sequence on a string that doesn't have any of them.  Perhaps you
meant to use the "\\X" converter that you quoted first.

Quote:
>    Even still, the eval is probably going to be quite slow:

<SNIP> of very convincing benchmark.  's///ee' is harmful to your
performance.

--
(Just Another Larry) Rosler
Hewlett-Packard Company
http://www.hpl.hp.com/personal/Larry_Rosler/



Thu, 15 Nov 2001 03:00:00 GMT  

Quote:

>    while (<>)
>    {       s/\\ (x [\d A-F a-f]+ | 0 \d+ | c? [A-Z a-z])

Ahem.  Clearly untested.  You (and other posters) seem unaware that
/x doesn't ignore spaces in character classes.
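
The point about /x is easy to check with a sketch like the following; the
two patterns and the test string are made up for illustration.

```perl
use strict;
use warnings;

my $spaced  = qr/\A [\d A-F a-f]+ \z/x;   # spaces inside [...] stay literal
my $compact = qr/\A [\dA-Fa-f]+ \z/x;     # only hex digits are allowed

# 'f f' is not a hex number, yet the spaced class accepts it.
my $spaced_matches  = ('f f' =~ $spaced)  ? 1 : 0;
my $compact_matches = ('f f' =~ $compact) ? 1 : 0;

print "spaced class matches 'f f':  $spaced_matches\n";
print "compact class matches 'f f': $compact_matches\n";
```

Perl 5.26 and later also offer /xx, which does ignore spaces and tabs
inside bracketed character classes.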

Quote:
>             /      (substr($1, 0, 1) eq 'x')?
>                            chr hex substr $1, 1:
>                    (substr($1, 0, 1) eq '0')?
>                            chr oct substr $1, 1:
>                            $xlate{$1}
>             /exg;
>            print;
>    }

Mike Guy


Sat, 17 Nov 2001 03:00:00 GMT  
How come no-one mentioned sprintf?

perldoc *says*
            Returns a string formatted by the usual `printf()'
            conventions of the C library function `sprintf()'.

and goes on to talk about conversions.

'man printf' talks about backslash escapes (OK, not all perl flavoured
ones but to get perl, use eval :) )

so of course

  #!perl
  $a='Here\nwe\nare\n';
  print $a;
  $b=sprintf $a;
  print $b;

will work, won't it?

Hint: no.

Don't know why, though.

The docs don't specifically say 'no escapes performed' - maybe they assume a
hardcoded double-quoted format string?
Maybe the docs should say 'NB: doesn't do any escapes'.

David

--
David Greaves                        Enabling
Technical Director, Telekinesys SW     Productivity    W

http://www.telekinesys.co.uk/              the       W E B



Mon, 19 Nov 2001 03:00:00 GMT  

  DG> How come no-one mentioned sprintf?

  DG>   $a='Here\nwe\nare\n';
  DG>   $b=sprintf $a;
  DG>   print $b;

  DG> will work won't it?
  DG> hint 'no'.

  DG> Don't know why though.

s?printf only converts % formats. the \n and friends are handled by the
double quotish expansion. since your string was single quoted it wasn't
backslash expanded.
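
a short sketch of the distinction, with made-up strings: the escape is
turned into a newline when a double-quoted literal is compiled, and sprintf
itself only fills in the % fields.

```perl
use strict;
use warnings;

my $single = 'Here\n';    # backslash and n are kept as two characters
my $double = "Here\n";    # the literal already contains a real newline

# sprintf fills in %s and %03d but leaves the backslash-n untouched.
my $formatted = sprintf '%s|%03d\n', 'x', 7;

print length($single), ' ', length($double), "\n";
print $formatted, "\n";
```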

uri

--
Uri Guttman  -----------------  SYStems ARCHitecture and Software Engineering

Have Perl, Will Travel  -----------------------------  http://www.sysarch.com
The Best Search Engine on the Net -------------  http://www.northernlight.com



Mon, 19 Nov 2001 03:00:00 GMT  

Quote:

> perldoc *says*
>             Returns a string formatted by the usual `printf()'
>             conventions of the C library function `sprintf()'.
> 'man printf' talks about backslash escapes (OK, not all perl flavoured
> ones but to get perl, use eval :) )

I believe you are talking about printf(1), the shell command printf, not
printf(3)/sprintf(3), the C library functions printf() and sprintf().
The manpage for sprintf(3), at least on my system, does not talk about
backslash escapes.

The behavior of the shell command is irrelevant.

--
 _ / '  _      /       - aka -

    /                                http://www.tiac.net/users/chipmunk/
        "It's funny 'cause it's true ... and vice versa."


