Binary pattern matching 

Hi all,
A quick question.  I have set up a Perl-based proxy server (don't
laugh) to filter rogue HTML and rogue JavaScript.
With Perl's pattern matching that is not difficult to do.  But what
about Java (that is, not JavaScript)?  Java is transmitted as binary and can
contain rogue code too.  Is there any way to make Perl detect it?
Here is why I ask.  I once set up the server so it would print the
traffic to my shell screen (in raw form) as it was being transmitted from
the server to the client.  When a binary such as a JPG was being
transmitted, understandably a whole bunch of garbled characters would
fly across my screen.  I wonder if there is any way to make Perl do
pattern matching on that garble?  Or is there an easier way?
~Prime


Wed, 24 Jan 2001 03:00:00 GMT  
Well, you can do pattern matching on binary data.  As I understand
it, Perl treats the binary data as a string of bytes for pattern matching
purposes.  When I cat a JPG, I get a bunch of garbage, some of which looks
like normal ASCII characters, and I can successfully match these characters:

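my $pat = 'some ascii-looking characters';
open(my $fh, '<', 'file.jpg') or die "Cannot open file.jpg: $!";
binmode($fh);                  # read raw bytes, no newline translation
# Records still end at "\n" bytes, so a pattern spanning one is not found.
while (<$fh>) {
    print "match: $_" if m/\Q$pat\E/;   # \Q...\E matches $pat literally
}
close($fh);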

If you are looking for non-standard characters, like chr(21), you can't
just paste that into your $pat string.  You have to do something like:

$pat = chr(21).chr(4).chr(3); # etc.

But this does work.
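
One caveat with reading a binary file record by record: the data gets split wherever a "\n" byte happens to occur, so a byte sequence that straddles such a byte is never seen in one piece.  A minimal sketch of one way around that, assuming the file is small enough to hold in memory, is to read it all at once and search the buffer.  The names find_bytes and slurp_binary are made up for illustration; only index, open, and $/ are standard Perl.

```perl
use strict;
use warnings;

# Hypothetical helper: return every byte offset at which $needle occurs in $data.
sub find_bytes {
    my ($data, $needle) = @_;
    return () unless defined $needle && length $needle;   # nothing to search for
    my @offsets;
    my $pos = 0;
    while (($pos = index($data, $needle, $pos)) >= 0) {
        push @offsets, $pos;
        $pos++;    # step forward one byte so overlapping hits are found too
    }
    return @offsets;
}

# Hypothetical helper: read a whole file as raw bytes.
sub slurp_binary {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";
    local $/;                  # undefined record separator: read everything at once
    my $data = <$fh>;
    close($fh);
    return $data;
}
```

Usage would look something like my @hits = find_bytes(slurp_binary('file.jpg'), chr(21).chr(4).chr(3)); where each element of @hits is a byte offset into the file.  index is used instead of a regex here because the pattern is a fixed string, so no escaping of regex metacharacters is needed.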

As to knowing which strings of binary Java code are dangerous and which
are safe... I have no clue.
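
Telling whether a chunk of binary is a Java class at all is the easier part: a compiled class file starts with the four bytes CA FE BA BE.  The sketch below only checks for that signature at the start of a buffer.  It says nothing about whether the code is rogue, and the name looks_like_java_class is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the buffer starts with the Java class file
# signature 0xCAFEBABE.  This identifies the file type only, not its safety.
sub looks_like_java_class {
    my ($bytes) = @_;
    return 0 unless defined($bytes) && length($bytes) >= 4;
    return substr($bytes, 0, 4) eq "\xCA\xFE\xBA\xBE" ? 1 : 0;
}
```

In a proxy, this would be applied to the first bytes of a response body, after the HTTP headers have been stripped off.  Applets packed into an archive would not start with this signature, so this is a partial check at best.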

Regards,
Reuben Logsdon



Wed, 24 Jan 2001 03:00:00 GMT  

Quote:

> If you are looking for non-standard characters, like chr(21), you can't
> just paste that into your $pat string.  You have to do something like:

> $pat = chr(21).chr(4).chr(3); # etc.

Ick!  You don't have to do it like that.

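$pat = "\x15\x04\x03";  # hexadecimal (two digits each, so a following digit is not absorbed)

or

$pat = "\025\004\003";  # octal (three digits each, for the same reason)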

or

$pat = "\cU\cD\cC";   # control characters
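
A quick way to convince yourself that these spellings are interchangeable is to compare them against the chr() version.  This is only a sketch; the sub names notations_agree and has_marker are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if all four spellings produce the same 3-byte string.
sub notations_agree {
    my $from_chr = chr(21) . chr(4) . chr(3);
    my $from_hex = "\x15\x04\x03";
    my $from_oct = "\025\004\003";
    my $from_ctl = "\cU\cD\cC";
    return ($from_chr eq $from_hex
         && $from_chr eq $from_oct
         && $from_chr eq $from_ctl) ? 1 : 0;
}

# The same escapes also work directly inside a regex, without a $pat variable.
sub has_marker {
    my ($data) = @_;
    return $data =~ /\x15\x04\x03/ ? 1 : 0;
}
```

Writing the escapes straight into the regex avoids interpolating a variable, which also sidesteps any worry about the pattern containing regex metacharacters.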

--


    /                                  http://www.ziplink.net/~rjk/
        "It's funny 'cause it's true ... and vice versa."



Thu, 25 Jan 2001 03:00:00 GMT  
 