
Help with Regular Expression
:
: I am having trouble with a regular expression. What I want to do is
: to substitute certain characters in postscript hex strings. A
: postscript hex string consists of pairs of hex digits between angle
: brackets like this
: <a5414243a5>
:
: Now I want all characters with hex rep of 'a5' to be replaced by 'b7'.
: So I wrote the following fragment in a perl script.
:
: s/<(([0-9a-z][0-9a-z])*)a5(([0-9a-z][0-9a-z])*)>/<$1b7$3>/g;
:
: I know that's hard to read; sorry.
:
: Anyway, my problem is that the first pattern to match pairs
: '([0-9a-z][0-9a-z])*' of hex digits wants to match the longest
: sequence of pairs so it will change
: <a5414243a5>
: into
: <a5414243b7>
:
: That is, it gets only the last a5 within the angle brackets,
: instead of the first.
:
: Is there a way to do what I want?
Presuming that you meant that you want the first character changed along
with the last character, and not instead of, you just do it repeatedly:
1 while s/<(([0-9a-f][0-9a-f])*)a5(([0-9a-f][0-9a-f])*)>/<$1b7$3>/g;
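As a rough illustration of why the loop is needed, here is a minimal sketch that wraps the same substitution in a helper. The name replace_all_a5 and the use of my and non-capturing groups are assumptions for the example, not part of the original script. Each pass changes only the last remaining aligned a5 in a hex string, so the loop keeps going until a pass finds nothing to change.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every pair-aligned 'a5' inside <...> with 'b7'.
sub replace_all_a5 {
    my ($s) = @_;
    # Each pass fixes the last remaining aligned 'a5' in every hex string;
    # the loop stops once a pass makes no substitution.
    1 while $s =~ s/<((?:[0-9a-f][0-9a-f])*)a5((?:[0-9a-f][0-9a-f])*)>/<${1}b7$2>/g;
    return $s;
}

print replace_all_a5('<a5414243a5>'), "\n";   # prints <b7414243b7>
print replace_all_a5('<0a50>'), "\n";         # prints <0a50>: the 'a5' straddles two pairs
```

The second call shows what the alignment assertions buy you: the digits a5 appear in the string, but not as a single byte, so nothing is changed.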
Note that you've asserted the two-byte alignment both before and after
the a5. If you really do want to change the first one and not the last
one, you can do it by dropping the front assertion like this, presuming
right angle bracket is not used in any other way.
s/a5(([0-9a-f][0-9a-f])*)>/b7$1>/g;
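A small sketch of that first-only variant, assuming the hex strings always hold an even number of digits (so that counting pairs back from the closing bracket gives the same alignment as counting from the opening one). The helper name replace_first_a5 is made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: change only the first pair-aligned 'a5' before each '>'.
sub replace_first_a5 {
    my ($s) = @_;
    # The regex engine takes the leftmost 'a5' that is followed by whole
    # pairs up to '>', so alignment is checked from the closing bracket.
    $s =~ s/a5((?:[0-9a-f][0-9a-f])*)>/b7$1>/g;
    return $s;
}

print replace_first_a5('<a5414243a5>'), "\n";   # prints <b7414243a5>
print replace_first_a5('<0a50a5>'), "\n";       # prints <0a50b7>
```

In the second call the leftmost a5 straddles the bytes 0a and 50, so the match fails there and moves on to the real a5 byte at the end.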
If that's not a valid assumption, there are safer ways involving split:
$new = '';
$chunks[0] =~ s/a5(([0-9a-f][0-9a-f])*)>/b7$1>/g;
}
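A minimal sketch of one split-based approach, not necessarily the one intended above. It assumes hex strings contain only lowercase hex digits between the angle brackets; the helper name fix_first_a5 and the loop structure are illustrative assumptions. The idea is to cut the text into hex-string chunks and everything else, and to run the substitution only on the hex-string chunks, so a stray right angle bracket elsewhere cannot trigger a change.

```perl
use strict;
use warnings;

# Hypothetical helper: change the first aligned 'a5' only inside <hex> chunks.
sub fix_first_a5 {
    my ($text) = @_;
    my $new = '';
    # The capturing parentheses make split return the <hex> chunks as well
    # as the text between them.
    foreach my $chunk (split /(<[0-9a-f]*>)/, $text) {
        if ($chunk =~ /\A<[0-9a-f]*>\z/) {
            # Same substitution as before, now confined to one hex string.
            $chunk =~ s/a5((?:[0-9a-f][0-9a-f])*)>/b7$1>/;
        }
        $new .= $chunk;
    }
    return $new;
}

print fix_first_a5('x > y <a5414243a5> a5> z'), "\n";
# prints x > y <b7414243a5> a5> z
```

The trailing "a5>" in the sample input is outside any hex string, so it is copied through untouched, whereas the bare s///g would have rewritten it.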
Alternately, you can use the original algorithm on a reversed string (not
forgetting to reverse a5 and b7, and to swap the angle brackets, too):
$tmp = reverse $_;
$tmp =~ s/>(([0-9a-f][0-9a-f])*)5a(([0-9a-f][0-9a-f])*)</>${1}7b$3</g;
$_ = reverse $tmp;
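A short sketch of the reversed-string trick, wrapped in a hypothetical helper named fix_first_a5_reversed. Two details matter in this sketch: reversing the string also swaps the angle brackets, so the pattern has to start with > and end with <, and the first capture is written as ${1} so that Perl does not read the following digit as part of the variable name.

```perl
use strict;
use warnings;

# Hypothetical helper: change the first aligned 'a5' by working on the
# reversed string, where it becomes the last aligned '5a'.
sub fix_first_a5_reversed {
    my ($text) = @_;
    my $tmp = reverse $text;   # scalar context: reverses the characters
    # Greedy matching now picks the last '5a', which is the first 'a5'
    # of the original string.
    $tmp =~ s/>((?:[0-9a-f][0-9a-f])*)5a((?:[0-9a-f][0-9a-f])*)</>${1}7b$2</g;
    return scalar reverse $tmp;
}

print fix_first_a5_reversed('<a5414243a5>'), "\n";   # prints <b7414243a5>
```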
To change them all, Randal will probably suggest something like this:
s/<([0-9a-f]+)>/'<' . &tr_a5_to_b7($1) . '>'/eg;
sub tr_a5_to_b7 {
    local($t) = pack('H*', $_[0]);
    $t =~ tr/\xa5/\xb7/;
    unpack('H*', $t);
}
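For a quick check of the pack/unpack route, here is a sketch that restates the subroutine with lexical variables and adds a wrapper, replace_all_via_pack, which is a made-up name for the example. Because the hex digits are turned into real bytes before tr runs, a5 digits that straddle two bytes are never touched.

```perl
use strict;
use warnings;

sub tr_a5_to_b7 {
    my ($hex) = @_;
    my $bytes = pack('H*', $hex);    # hex digits -> raw bytes
    $bytes =~ tr/\xa5/\xb7/;         # swap the byte value itself
    return unpack('H*', $bytes);     # raw bytes -> lowercase hex digits
}

# Hypothetical wrapper around the substitution shown above.
sub replace_all_via_pack {
    my ($s) = @_;
    $s =~ s/<([0-9a-f]+)>/'<' . tr_a5_to_b7($1) . '>'/eg;
    return $s;
}

print replace_all_via_pack('<a5414243a5> <0a50>'), "\n";
# prints <b7414243b7> <0a50>
```

One caveat, as far as I can tell: pack('H*', ...) pads an odd number of hex digits with a zero nibble, so an odd-length string would come back one digit longer.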
Larry