Perl speed vs. Python speed 
 Perl speed vs. Python speed

I just wrote a tiny python program to do some text processing.
When I ran it I was very surprised how long it took to finish.
My first guess is that Perl would be a lot faster, maybe even a
factor 10, but I did not do a direct comparison.

Could someone with more experience comment on Perl speed vs.
Python speed?

Thanks!
        Ralf




Sun, 30 Jun 2002 03:00:00 GMT  
 Perl speed vs. Python speed

Quote:

>I just wrote a tiny Python program to do some text processing.
>When I ran it I was very surprised how long it took to finish.
>My first guess is that Perl would be a lot faster, maybe even a
>factor 10, but I did not do a direct comparison.

>Could someone with more experience comment on Perl speed vs.
>Python speed?

Naively written Python is often not as fast as Perl for processing
large quantities of text.  On the other hand, you'll still be able
to read and understand the Python 6 months from now, and by using some
common optimizations it can be sped up considerably while still
remaining readable.

Things to look out for:

Reading a single line at a time in Python is not as fast as it could
be.  Using readlines with or without a buffer size argument to limit
memory usage can make a huge difference.
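
A minimal sketch of the batched-read idea, written for current Python 3
rather than the interpreter discussed in this thread.  The function name,
the sample data and the 64 KB hint are made up for illustration.  Plain
iteration over a file is already buffered in current Python versions, so
measure before assuming this helps.

```python
import io

def count_matching_lines(stream, needle, hint=64 * 1024):
    """Count lines containing needle, reading batches of roughly hint bytes."""
    total = 0
    while True:
        # readlines(hint) returns a list of complete lines whose combined
        # size is about `hint`; an empty list means end of file.
        lines = stream.readlines(hint)
        if not lines:
            break
        for line in lines:
            if needle in line:
                total += 1
    return total

# Hypothetical in-memory "file" so the sketch runs on its own.
sample = io.StringIO("perl\npython\npython and perl\n" * 1000)
print(count_matching_lines(sample, "python"))
```

The hint bounds memory use per batch while still avoiding one method call
per line.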

Function calls and name look up outside the local scope are relatively
expensive.  Using the localization trick can make an improvement of
several percent per localized variable, as can moving function code
in line.

The "new" Python re module needs to be optimized and possibly pushed
into C code; the older regex module is about twice as fast.

Precomputing values like bound regex search or match functions can
also help quite a bit.
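
To make the localization trick and the precomputed bound method concrete,
here is a small sketch in current Python 3 using the re module.  The names
WORD, count_words_plain and count_words_localized are invented for this
example, and how much the second version gains (if anything) depends on the
interpreter version, so treat it as something to time, not a guarantee.

```python
import re

# Compile once, outside the loop.
WORD = re.compile(r"[A-Za-z]+")

def count_words_plain(lines):
    count = 0
    for line in lines:
        # Each pass looks up the global WORD, its findall attribute,
        # and the builtin len.
        count += len(WORD.findall(line))
    return count

def count_words_localized(lines, findall=WORD.findall, len=len):
    # findall and len are bound to local names once, as default
    # arguments, so the loop body only touches local variables.
    count = 0
    for line in lines:
        count += len(findall(line))
    return count

lines = ["Perl and Python", "42 regex"]
print(count_words_plain(lines), count_words_localized(lines))
```

Both functions return the same count; only the name lookups inside the loop
differ.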

Python can spend a lot of time allocating and freeing memory if you
get sloppy with your string manipulations.
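
One common form of this, sketched in current Python 3 with made-up function
names: building a large string by repeated concatenation can create a new
string object on every pass, while collecting the pieces in a list and
joining once allocates the result a single time.  Some interpreter versions
optimize simple concatenation, so the difference is worth timing on your
own data.

```python
def build_report_concat(fields):
    # May allocate a fresh, longer string on every iteration.
    out = ""
    for field in fields:
        out = out + field + "\n"
    return out

def build_report_join(fields):
    # Collect the pieces first, then build the result in one step.
    parts = []
    append = parts.append  # localized bound method
    for field in fields:
        append(field)
        append("\n")
    return "".join(parts)

fields = ["alpha", "beta", "gamma"]
print(build_report_join(fields) == build_report_concat(fields))
```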

There are other tricks but this is just supposed to be a quick answer.
Using the profiler is often very instructive.  Posting some code might
also get you more examples.
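
For anyone who has not used the profiler, a minimal sketch with the cProfile
and pstats modules from the current standard library (the profiler available
when this was written had a different module name).  slow_filter and
profile_call are hypothetical helpers, not anything from the original
program.

```python
import cProfile
import io
import pstats
import re

def slow_filter(lines):
    # Goes through the module-level re.search on every line.
    return [line for line in lines if re.search(r"Python|Perl", line)]

def profile_call(func, *args):
    """Run func under the profiler and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Show the five entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()

lines = ["Perl is fast", "nothing here", "Python is readable"] * 1000
matches, report = profile_call(slow_filter, lines)
print(len(matches))
print(report)
```

The report lists call counts and cumulative times per function, which is
usually enough to see whether the time goes into I/O, regular expressions
or string handling.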

Bottom line, Perl does {*filter*} {*filter*}uous things with i/o and regular
expressions to make them very fast.  Python does less magic, which
means that it leaves more room for you to do things like provide
pseudo file objects with special behaviour (like the StringIO module),
or replace the regular expression module (regex vs. re).



Sun, 30 Jun 2002 03:00:00 GMT  
 Perl speed vs. Python speed
Perl is fine-tuned for text processing. So try to compare with something else.
IMHO, run-time performance should not be the main reason to choose Python.

Cheers,
_______________________________________________________________________

Quote:
> Date: Wed, 12 Jan 2000 10:02:23 GMT


> Subject: Perl speed vs. Python speed

> I just wrote a tiny Python program to do some text processing.
> When I ran it I was very surprised how long it took to finish.
> My first guess is that Perl would be a lot faster, maybe even a
> factor 10, but I did not do a direct comparison.

> Could someone with more experience comment on Perl speed vs.
> Python speed?

> Thanks!
>         Ralf




Sun, 30 Jun 2002 03:00:00 GMT  
 Perl speed vs. Python speed

Quote:

> The "new" Python re module needs to be optimized and possibly pushed
> into C code, the older regex module is about twice as fast.

if everything goes according to plan, 're' (and possibly also
'regex') will be replaced by a new unicode-aware engine in
1.6.  I've attached some (somewhat outdated) benchmarks.

note that 'regex' is faster than 're' only if you apply simple
regular expressions many times.  if you can reorganize the
code to use a single regular expression on a larger string,
're' beats the hell out of 'regex'.  ...and if you make things
complicated enough, 'regex' stops working... (but as usual,
some people prefer to get the wrong answer quickly ;-)

</F>

          0     5    50   250  1000  5000 25000
----- ----- ----- ----- ----- ----- ----- -----
search for Python|Perl in Perl ->
sre8  0.007 0.008 0.010 0.010 0.020 0.073 0.349
sre16 0.007 0.007 0.008 0.010 0.020 0.075 0.353
re    0.097 0.097 0.101 0.103 0.118 0.175 0.480
regex 0.007 0.007 0.009 0.020 0.059 0.271 1.320

search for (Python|Perl) in Perl ->
sre8  0.007 0.007 0.007 0.010 0.020 0.074 0.344
sre16 0.007 0.007 0.008 0.010 0.020 0.074 0.347
re    0.110 0.104 0.111 0.115 0.125 0.184 0.559
regex 0.006 0.006 0.009 0.019 0.057 0.285 1.432

search for Python in Python ->
sre8  0.007 0.007 0.007 0.011 0.021 0.072 0.387
sre16 0.007 0.007 0.008 0.010 0.022 0.082 0.365
re    0.107 0.097 0.105 0.102 0.118 0.175 0.511
regex 0.009 0.008 0.010 0.018 0.036 0.139 0.708

search for .*Python in Python ->
sre8  0.008 0.007 0.008 0.011 0.021 0.079 0.379
sre16 0.008 0.008 0.008 0.011 0.022 0.075 0.402
re    0.102 0.108 0.119 0.183 0.400 1.545 7.284
regex 0.013 0.019 0.072 0.318 1.231 8.035 45.366

search for .*Python.* in Python ->
sre8  0.008 0.008 0.008 0.011 0.021 0.080 0.383
sre16 0.008 0.008 0.008 0.011 0.021 0.079 0.395
re    0.103 0.108 0.119 0.184 0.418 1.685 8.378
regex 0.013 0.020 0.073 0.326 1.264 9.961 46.511

search for .*(Python) in Python ->
sre8  0.007 0.008 0.008 0.011 0.021 0.077 0.378
sre16 0.007 0.008 0.008 0.011 0.021 0.077 0.444
re    0.108 0.107 0.134 0.240 0.637 2.765 13.395
regex 0.026 0.112 3.820 87.322 ...

search for .*P.*y.*t.*h.*o.*n.* in Python ->
sre8  0.010 0.010 0.014 0.031 0.093 0.419 2.212
sre16 0.010 0.011 0.014 0.030 0.093 0.419 2.292
re    0.112 0.121 0.195 0.521 1.747 8.298 40.877
regex 0.026 0.048 0.248 1.148 4.550 24.720 ...

(searching for the given patterns in strings padded
with blanks on both sides.  sre8 is the new python
1.6 engine on 8-bit strings, sre16 is the same engine
using unicode)
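
A rough sketch of how a measurement of this shape could be reproduced with
the current re module.  It assumes the column headings above are the number
of blanks added on each side, which the note does not state outright; the
repeat count and the function name time_search are also invented here.  It
is not the script that produced the numbers above.

```python
import re
import timeit

def time_search(pattern, target, padding, repeat=100):
    """Time `repeat` searches for pattern in target padded with blanks."""
    text = " " * padding + target + " " * padding
    search = re.compile(pattern).search  # compile once, bind once
    assert search(text) is not None      # sanity check: the pattern matches
    return timeit.timeit(lambda: search(text), number=repeat)

# Assumed padding sizes, taken from the first columns of the table.
for padding in (0, 5, 50, 250, 1000):
    print(padding, time_search(r".*Python", "Python", padding))
```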

regular
1: belonging to a religious order



Sun, 30 Jun 2002 03:00:00 GMT  
 Perl speed vs. Python speed

Quote:


>> The "new" Python re module needs to be optimized and possibly pushed
>> into C code, the older regex module is about twice as fast.

>if everything goes according to plan, 're' (and possibly also
>'regex') will be replaced by a new unicode-aware engine in
>1.6.  I've attached some (somewhat outdated) benchmarks.

Cool.  How much rewriting is it going to require in existing scripts?

Quote:
>note that 'regex' is faster than 're' only if you apply simple
>regular expressions many times.

Which, I think you'll have to admit, is probably the most common usage.

Quote:
>if you can reorganize the
>code to use a single regular expression on a larger string,
>'re' beats the hell out of 'regex'.  ...and if you make things
>complicated enough, 'regex' stops working... (but as usual,
>some people prefer to get the wrong answer quickly ;-)

Big complicated regular expressions are a recipe for insanity.  Ask
Tim, and he'll recommend using small simple regular expressions 9
times out of 10.  At least that's how he's responded every time I've
asked about problems with some excessively clever regular expression
here.  Heck, I think you've even told me the same thing on at least
one occasion. ;-)


Sun, 30 Jun 2002 03:00:00 GMT  
 Perl speed vs. Python speed

Quote:


> >if you can reorganize the
> >code to use a single regular expression on a larger string,
> >'re' beats the hell out of 'regex'.  ...and if you make things
> >complicated enough, 'regex' stops working... (but as usual,
> >some people prefer to get the wrong answer quickly ;-)

> Big complicated regular expressions are a recipe for insanity.  Ask
> Tim, and he'll recommend using small simple regular expressions 9
> times out of 10.  At least that's how he's responded every time I've
> asked about problems with some excessively clever regular expression
> here.  Heck, I think you've even told me the same thing on at least
> one occasion. ;-)

Fredrik is emphasizing size of the string, not complexity of the
expression. While re is also better at backtracking and the
hairy stuff, the main correlation is with the size of the string
being searched - regex is better at short inputs and falls off
quickly thereafter.

Also, a long pattern is not necessarily a complex pattern. A
humongous set of alternatives is no problem. The problem is
when people expect to be able to do more than one thing at a
time with a pattern.

Such as match an entire Python triple quoted string.

Which Tim does.

bad-Tim-<spank>-<spank>-ly y'rs

- Gordon



Sun, 30 Jun 2002 03:00:00 GMT  
 Perl speed vs. Python speed
[Tom Culliton, to /F]

Quote:
> ...
> Ask Tim and he'll recommend using small simple regular
> expressions 9 times out of 10.  At least that's how he's
> responded every time I've asked about problems with some
> excessively clever regular expression here.

Well, there *is* a bit of bias in the stimulus/response here:  by the time
someone writes a regexp so hairy that they feel compelled to post it to the
world in despair of ever getting it to work, the advice to simplify isn't
really hard to come up with <wink>.

[Gordon McMillan]

Quote:
> Fredrik is emphasizing size of the string, not complexity
> of the expression.

This is putting words in /F's mouth -- although I suspect they were the
words he intended to put there himself.  The whole "speed problem" with
regex vs re is that the latter has higher fixed overhead per call, but is
(often) much faster once it gets going.
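
The fixed-overhead point can be tried out with a sketch like the following,
written against the current re module.  The two patterns, the function
names and the sample text are made up for illustration.  Both functions
count the lines that contain a match: one pays the call overhead once per
line, the other makes a single call over the whole string.  No particular
ratio is claimed; the timings depend on interpreter version and input.

```python
import re
import timeit

# Hypothetical patterns for illustration.
PER_LINE = re.compile(r"Python|Perl")
# MULTILINE lets ^ and $ anchor at each line, so one findall call
# yields one match per line that contains either word.
WHOLE_TEXT = re.compile(r"^.*?(?:Python|Perl).*$", re.MULTILINE)

def count_per_line(text):
    # One search call per line.
    search = PER_LINE.search
    return sum(1 for line in text.split("\n") if search(line))

def count_whole_text(text):
    # One call for the entire string.
    return len(WHOLE_TEXT.findall(text))

text = "\n".join(["I like Perl", "nothing here", "Python and Perl"] * 1000)
assert count_per_line(text) == count_whole_text(text)
print(count_per_line(text))
for func in (count_per_line, count_whole_text):
    print(func.__name__, timeit.timeit(lambda: func(text), number=20))
```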

Quote:
> [and here Gordon rudely repeats what I just wrote <wink>]
> Also, a long pattern is not necessarily a complex pattern. A
> humongous set of alternatives is no problem. The problem is
> when people expect to be able to do more than one thing at a
> time with a pattern.

> Such as match an entire Python triple quoted string.

> Which Tim does.

Hmm.  I'm inclined to call a single string one thing -- unless you pay
exaggerated attention to the three quotes <wink>.  For sure, I don't
hesitate to write extremely involved regexps for tokenization, but those are
always of the form

    alt1 | alt2 | alt3 | ... | altn

where the alt_i are mutually exclusive and so can be developed & reasoned
about in isolation (your "humongous set of alternatives", albeit that n is
rarely more than a dozen).  And I move heaven, earth and three acres of hell
to keep any possibility of backtracking out of each alt_i!  The regexp is
thus doubly defanged, and can't cause conceptual, speed or memory problems
regardless of string size or whether it matches or fails.

The whole trick to outsmarting an NFA engine lies in making sure none of the
NFA "let's guess what they meant by this ambiguous pattern" machinery ever
gets triggered -- once it does, death eventually follows.
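
A small sketch of the alt1 | alt2 | ... | altn shape in current Python 3.
The token classes here are invented for the example and are not taken from
any real tokenizer.  Each alternative starts with a different kind of
character, so at most one of them can match at a given position, and none
of them contains a construct that invites backtracking.

```python
import re

# Hypothetical token classes; each begins with a distinct character set.
TOKEN = re.compile(r"""
    (?P<number>  \d+ )
  | (?P<name>    [A-Za-z_]\w* )
  | (?P<string>  "[^"\n]*" )
  | (?P<op>      [-+*/=()] )
  | (?P<space>   [ \t]+ )
""", re.VERBOSE)

def tokenize(text):
    """Return a list of (kind, value) pairs, skipping blanks."""
    pos = 0
    tokens = []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "space":
            # lastgroup names the alternative that matched.
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

print(tokenize('x = 42 + "hi"'))
```

Because the alternatives are mutually exclusive, each one can be written
and tested on its own before being added to the pattern.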

well-really-your-user's-but-it's-easy-to-get-more-
    of-them<wink>-ly y'rs  - tim



Mon, 01 Jul 2002 03:00:00 GMT  
 Perl speed vs. Python speed

Quote:

> if everything goes according to plan, 're' (and possibly also
> 'regex') will be replaced by a new unicode-aware engine in
> 1.6.  I've attached some (somewhat outdated) benchmarks.

Thanks for the details.  I'm moderately desperate to get integrated
Unicode support; when are we hoping to see the first 1.6 beta hit the
streets?

ht
--
  Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
     2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440

                     URL: http://www.ltg.ed.ac.uk/~ht/



Mon, 01 Jul 2002 03:00:00 GMT  
 Perl speed vs. Python speed
(answering a bunch of questions in one go)

Quote:

>if everything goes according to plan, 're' (and possibly also
>'regex') will be replaced by a new unicode-aware engine in
>1.6.  I've attached some (somewhat outdated) benchmarks.

Tom Culliton:

Quote:
> Cool.  How much rewriting is it going to require in existing scripts?

hopefully, the new code will be good enough to fully
replace the existing (pc)re and regex engines.  so as
long as you're not (ab)using some strange feature that
isn't covered by python's test suite, old code should
work just fine.

...

Henry S. Thompson:

Quote:
> Thanks for the details, I'm moderately desperate to get integrated
> unicode support, when are we hoping to see the first 1.6 beta hit the
> streets?

I'm speculating here, but I wouldn't be surprised if
the unicode support will appear in the CVS version
soon after the conference.

as for the official 1.6 release, only GvR knows...

...

Neil Hodgson:

Quote:
>    Any hope of a UCS-4 version?

>    Combining Python's preeminence for Mayan calendrical work with support
> for hieroglyphics will allow us to conquer the whole of the ancient world.

python 1.6's unicode support will use 16 bits on
all platforms.

(if it's good enough for Java/Windows, etc...)

extending this for 32-bit characters shouldn't be that
hard, but I guess we have to wait for Python 2.0...
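
To illustrate what the 16-bit versus 32-bit distinction means for the
hieroglyphics case, a sketch in current Python 3 (whose string
implementation differs from the one described here).  The helper code_units
is made up for this example.  A character outside the 16-bit range needs
two 16-bit units (a surrogate pair) in UTF-16 but a single unit in a 32-bit
encoding.

```python
def code_units(text, encoding, unit_bytes):
    """Number of fixed-size units needed to encode text."""
    return len(text.encode(encoding)) // unit_bytes

hieroglyph = "\U00013000"  # EGYPTIAN HIEROGLYPH A001, beyond 16 bits

print(code_units("A", "utf-16-le", 2))         # 16-bit units for "A"
print(code_units(hieroglyph, "utf-16-le", 2))  # 16-bit units for the glyph
print(code_units(hieroglyph, "utf-32-le", 4))  # 32-bit units for the glyph
```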

</F>



Tue, 02 Jul 2002 03:00:00 GMT  
 Perl speed vs. Python speed

* Fredrik Lundh
|
| (searching for the given patterns in strings padded with blanks on
| both sides.  sre8 is the new python 1.6 engine on 8-bit strings,
| sre16 is the same engine using unicode)

Which Unicode encoding will be used in the 16bit version? UCS-2 or
UTF-16?

--Lars M.



Tue, 02 Jul 2002 03:00:00 GMT  
 Perl speed vs. Python speed

Quote:
> (searching for the given patterns in strings padded
> with blanks on both sides.  sre8 is the new python
> 1.6 engine on 8-bit strings, sre16 is the same engine
> using unicode)

   Any hope of a UCS-4 version?

   Combining Python's preeminence for Mayan calendrical work with support
for hieroglyphics will allow us to conquer the whole of the ancient world.

   Neil


