Unix dictionary lookups in perl 

I'm writing a program that needs to check words against the unix
dictionary file.  It will need to check thousands of words in a single
execution.  Looking through CPAN, I don't see any native perl modules
(although please correct me if I've missed any).  So, what is the best
approach?

1 - Use system calls to the unix 'look' command.  Look is very fast, but
the overhead of thousands of system calls would doubtless be high.
2 - Make calls to a C library, if there is one for this purpose.
3 - Write some perl code to access the 'words' file directly.

Thanks,
Anthony

--

Vancouver, BC, Canada
http://www.*-*-*.com/ ~ajdelore/



Sat, 15 May 2004 01:39:35 GMT  

Quote:

> I'm writing a program that needs to check words against the unix
> dictionary file.  It will need to check thousands of words in a single
> execution.  Looking through CPAN, I don't see any native perl modules
> (although please correct me if I've missed any).  So, what is the best
> approach?

Have you considered reading the file into a hash?  100_000 8-letter
words stored in a hash occupy about 10M of RAM on my linux box.

  % wc -l /usr/share/dict/words
  45424
  % time perl -wlne 'END{print scalar keys %dict; system "ps u $$"}
                     $dict{$_} = undef' /usr/share/dict/words
  45424
  USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
  joe      17047 63.0  5.3  6632 5080 pts/0    S    19:55   0:01 perl

  real 0m1.312s
  user 0m1.170s
  sys  0m0.100s
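
For a longer-running program, the same idea might look something like the
sketch below. The sub names load_dict and in_dict and the path
/usr/share/dict/words are illustrative assumptions (the word list lives in
different places on different systems), and the lookup is an exact,
case-sensitive match.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Load a word list into a hash. The keys are the words; the values are unused.
# (load_dict is a hypothetical helper name, not part of any module.)
sub load_dict {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my %dict;
    while (my $word = <$fh>) {
        chomp $word;
        $dict{$word} = undef;
    }
    close $fh;
    return \%dict;
}

# Return 1 if $word is in the dictionary exactly as spelled, 0 otherwise.
sub in_dict {
    my ($dict, $word) = @_;
    return exists $dict->{$word} ? 1 : 0;
}

# Usage sketch: check every word given on the command line.
# The path below is an assumption; adjust it for your system.
if (@ARGV) {
    my $dict = load_dict('/usr/share/dict/words');
    for my $word (@ARGV) {
        print "$word: ", (in_dict($dict, $word) ? "found" : "not found"), "\n";
    }
}
```

The load cost is paid once, after which each of the thousands of checks is
a single hash lookup.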

--
Joe Schaefer      "Peace cannot be kept by force. It can only be achieved by
                                       understanding."
                                               --Albert Einstein



Sat, 15 May 2004 01:58:36 GMT  


Quote:
> I'm writing a program that needs to check words against the unix
> dictionary file.  It will need to check thousands of words in a single
> execution.  Looking through CPAN, I don't see any native perl modules
> (although please correct me if I've missed any).  So, what is the best
> approach?

> 1 - Use system calls to the unix 'look' command.  Look is very fast, but
> the overhead of thousands of system calls would doubtless be high.
> 2 - Make calls to a C library, if there is one for this purpose.
> 3 - Write some perl code to access the 'words' file directly.

How about comm(1)?

comm selects or rejects lines common to two (sorted) files. It's handy
for things like spell checkers.
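
If you would rather stay inside perl, the merge that comm performs can be
sketched roughly as follows. This mimics what "comm -23 words dict" prints
(lines found only in the first file). The sub name only_in_first is made up
for illustration, and both lists are assumed to be already sorted with
perl's own string comparison.

```perl
use strict;
use warnings;

# Return the entries of @$words that do not appear in @$dict.
# Both array refs must already be sorted in perl's "cmp" order.
# (only_in_first is a hypothetical name used for this sketch.)
sub only_in_first {
    my ($words, $dict) = @_;
    my @missing;
    my ($i, $j) = (0, 0);
    while ($i < @$words && $j < @$dict) {
        my $order = $words->[$i] cmp $dict->[$j];
        if ($order < 0) {
            # The word sorts before the current dictionary entry,
            # so it cannot appear later in the dictionary.
            push @missing, $words->[$i++];
        }
        elsif ($order > 0) {
            # The dictionary entry is behind the word: advance it.
            $j++;
        }
        else {
            # Match: advance both lists.
            $i++;
            $j++;
        }
    }
    # Anything left over sorts after the last dictionary entry.
    push @missing, @{$words}[$i .. $#$words];
    return @missing;
}
```

Something like "print "$_\n" for only_in_first([sort @candidates], [sort
@dictionary]);" would then list the unknown words in one pass over both
lists.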

Dennis



Sun, 16 May 2004 22:10:08 GMT  

Quote:
> I'm writing a program that needs to check words against the unix
> dictionary file.  It will need to check thousands of words in a single
> execution.  Looking through CPAN, I don't see any native perl modules
> (although please correct me if I've missed any).  So, what is the best
> approach?

If you have access to "Mastering Algorithms with Perl" (O'Reilly), it
details several binary search algorithms that might work for you.

Tip: keep the file handle open, but try not to read the whole dictionary
into memory.
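
A rough sketch of that approach, as I would write it (this is not the
book's code): binary-search the byte offsets of an open filehandle, then
scan forward a line or two. The sub name file_has_word is made up, and the
file is assumed to be sorted in plain byte order, for example with
"LC_ALL=C sort". A words file sorted case-insensitively or by locale would
need a matching comparison.

```perl
use strict;
use warnings;

# Binary search for $word in an open, seekable filehandle whose lines are
# sorted in perl's byte-wise string order. Returns 1 if some line equals
# $word, 0 otherwise. (file_has_word is a hypothetical name.)
sub file_has_word {
    my ($fh, $word) = @_;
    my $lo = 0;
    my $hi = -s $fh;    # file size in bytes

    # Narrow [$lo, $hi] until the offsets are adjacent. Invariant: when
    # $lo > 0, the first full line after offset $lo sorts before $word.
    while ($hi - $lo > 1) {
        my $mid = int(($lo + $hi) / 2);
        seek $fh, $mid, 0 or die "seek failed: $!";
        my $partial = <$fh>;    # skip the (probably partial) line at $mid
        my $line    = <$fh>;    # first full line after $mid, undef at EOF
        if (defined $line) {
            chomp $line;
            if ($line lt $word) {
                $lo = $mid;
                next;
            }
        }
        $hi = $mid;
    }

    # Short linear scan from $lo to the first line that is ge $word.
    seek $fh, $lo, 0 or die "seek failed: $!";
    my $partial = <$fh> if $lo > 0;
    while (defined(my $line = <$fh>)) {
        chomp $line;
        next if $line lt $word;
        return $line eq $word ? 1 : 0;
    }
    return 0;
}
```

Open the words file once, then call file_has_word($fh, $word) for each
word. Every lookup costs a few dozen seeks and short reads, and memory use
stays flat regardless of the dictionary size.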

HTH
Tim Hammerquist
--
American society is a sort of flat, fresh-water pond which absorbs
silently, without reaction, anything which is thrown into it.
    -- Henry Brooks Adams



Fri, 21 May 2004 12:13:40 GMT  
 