is Lisp used in text parsing and processing tasks? 

Hi!

Is Lisp being used in programs that do text processing of various
kinds, such as parsing and manipulating an HTML file, parsing C code
(and reindenting it), etc.? Would Lisp be a good choice if one is
to write such software?

Best wishes,
Tuomas



Sun, 01 May 2005 04:53:27 GMT  

Quote:

>Is Lisp being used in programs that do text processing of various
>kinds, such as parsing and manipulating an HTML file, parsing C code
>(and reindenting it), etc.? Would Lisp be a good choice if one is
>to write such software?

The indenting code in Emacs is all written in Lisp.  There's also a web
browser integrated into Emacs, so it does HTML parsing in Lisp.

--

Genuity, Woburn, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.



Sun, 01 May 2005 04:56:01 GMT  

Quote:

> Is Lisp being used in programs that do text processing of various
> kinds, such as parsing and manipulating an HTML file, parsing C code
> (and reindenting it), etc.? Would Lisp be a good choice if one is
> to write such software?

As far as parsing the various formats out there, I don't think that CL
is necessarily stronger or weaker than other languages.  It sometimes
helps to have built-in regular expressions, which CL doesn't have, but
which some vendors and implementations support.  Even so, the Lisp
regexp packages don't tend to be as fast as those in C, and probably not
even as fast as they are in Perl and Python.

In general, I'd say that CL doesn't have the most powerful string
processing, though its list processing functions are very powerful.
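As a rough illustration of what the standard string functions do cover,
here is a minimal sketch that splits a line on a delimiter using only
POSITION and SUBSEQ. The name SPLIT-ON is made up for this example; it
is not a standard function or part of any package mentioned here.

```lisp
;; SPLIT-ON is a hypothetical helper name, not a standard CL function.
(defun split-on (delimiter text)
  "Return the substrings of TEXT separated by the character DELIMITER."
  (loop with start = 0
        ;; POSITION returns NIL once no further delimiter is found.
        for end = (position delimiter text :start start)
        ;; SUBSEQ with END = NIL runs to the end of the string.
        collect (subseq text start end)
        while end
        do (setf start (1+ end))))
```

Empty fields are kept, so two adjacent delimiters yield an empty string
in the result.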

Stream operations have also been improved in ACL (based on Simple
Streams), which might indirectly affect overall parsing performance,
though I don't know how much of an overall effect this will have on
parsing in general.

Zebu is an example of a parser generator implemented in CL.  Of course,
if what you're parsing is S-expressions, then I doubt there's anything
better than CL, though this is seldom the case, unfortunately.

Other parser-related links can be found at:

http://www-2.cs.cmu.edu/afs/cs/project/ai-repository/ai/areas/nlp/par...

dave



Sun, 01 May 2005 08:19:20 GMT  
Quote:


> > Is Lisp being used in programs that do text processing of various
> > kinds, such as parsing and manipulating an HTML file, parsing C code
> > (and reindenting it), etc.? Would Lisp be a good choice if one is
> > to write such software?

<snip>

Quote:
> Zebu is an example of a parser generator implemented in CL.  Of course,
> if what you're parsing is S-expressions, then I doubt there's anything
> better than CL, though this is seldom the case, unfortunately.

While pure s-expressions aren't that common, many file formats are still
readable with CL READ and relatives with a little massaging
(filter out/replace some characters like #:,'`).
I once wrote a parser in CL for IFC (uses ISO 10303-21 format), which
clearly outperformed the previous implementation based on lex/yacc.

--



Sun, 01 May 2005 09:06:23 GMT  

Quote:


> > Is Lisp being used in programs that do text processing of various
> > kinds, such as parsing and manipulating an HTML file, parsing C code
> > (and reindenting it), etc.? Would Lisp be a good choice if one is
> > to write such software?

> As far as parsing the various formats out there, I don't think that CL
> is necessarily stronger or weaker than other languages.  It sometimes
> helps to have built-in regular expressions, which CL doesn't have, but
> which some vendors and implementations support.  Even so, the Lisp
> regexp packages don't tend to be as fast as those in C, and probably not
> even as fast as they are in Perl and Python.

The author of the REGEX package claims 5-20x speed improvement over the
GNU C regular expression library.  I haven't verified his claim
personally but I've made available Debian packages (and the ASDF file
which can be used w/o Debian) for CL-REGEX, CL-AWK (CLAWK), and
CL-LEXER.  See <http://www.mapcar.org/~mrd/debs/unstable/>, and the
author's homepage at <http://www.geocities.com/mparker762/clawk.html>.

--

; OpenPGP public key: C24B6010 on keyring.debian.org
; Signed or encrypted mail welcome.
; "There is no dark side of the moon really; matter of fact, it's all dark."



Sun, 01 May 2005 15:38:32 GMT  
Quote:

> The author of the REGEX package claims 5-20x speed improvement over the
> GNU C regular expression library.  I haven't verified his claim
> personally but I've made available Debian packages (and the ASDF file
> which can be used w/o Debian) for CL-REGEX, CL-AWK (CLAWK), and
> CL-LEXER.  See <http://www.mapcar.org/~mrd/debs/unstable/>, and the
> author's homepage at <http://www.geocities.com/mparker762/clawk.html>.

The most recent distributions of CLAWK include a speedtest.c program,
so anyone who wants to can test the speed advantage of REGEX for themselves.
I ran these tests some time ago and found that REGEX is indeed much
faster than the Linux regex library.

Best
AHz



Sun, 01 May 2005 18:29:30 GMT  

Quote:

> Is Lisp being used in programs that do text processing of various
> kinds, such as parsing and manipulating an HTML file, parsing C code
> (and reindenting it), etc.? Would Lisp be a good choice if one is
> to write such software?

We use ANSI Common Lisp for this very purpose (among other things)
here.  We have several web robots that can parse various formats,
scoring and rating the things they download based on certain criteria,
and reporting their findings.  To say we have several is actually a
lie; we have one robot engine and various frontends to the engine that
allow us to use basically the same robot code for sometimes very
different purposes.

Whether Lisp is a good choice for the job at hand will depend on your
requirements.  If you are writing something that will perform quick
and dirty parsing, Lisp isn't going to offer you significant
advantages over some other languages that are well-suited for text
processing (e.g., Perl).  Most Lisp implementations (including the
free CLISP) have a way to use regular expressions, so it isn't as
though Perl and friends will offer significant advantages over Common
Lisp, either.  Issues like availability of other libraries that you
could use, portability, maintenance, operational considerations,
speed, etc., will need to be considered.

In our case, Lisp was a very good choice.  The ability to patch a
running program without having to stop it is quite nice, and Lisp also
makes it very easy to save state so that the robot can simply pause
for a while during a system reboot...it just loads its memory image
and picks up right where it left off.  We have also been able to
introduce new people to the code and have them understand, very
quickly, what is happening, and how they can perform their work on the
code.

(And on that point, let me add that I think that the concern about
difficulty in finding competent Lisp programmers is bunk.  I don't
think that finding competent Lispers is inherently more difficult than
finding competent programmers for any other language.  Competent
programmers require some looking around, period, irrespective of the
intended target language.  The difference is that there aren't a lot
of incompetent "programmers" out there claiming to be able to write
Common Lisp code.  End of rant.)

--
Matt Curtin  Interhack Corp  +1 614 545 HACK  http://web.interhack.com/
Author,  /Developing Trust: Online Privacy and Security/ (Apress, 2001)
Programming should be fun.  Programs should be beautiful. --Paul Graham



Sun, 01 May 2005 20:29:45 GMT  

        ...

Quote:
> While pure s-expressions aren't that common, many file formats are still
> readable with CL READ and relatives with a little massaging
> (filter out/replace some characters like #:,'`).

Exactly. One standard trick on many  ASCII based formats (e.g. rows of numbers)
is to just do

        (read-from-string (format nil "(~A)" (remove-bad-characters line)))
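
A slightly fuller sketch of the same trick, in case it helps. It uses
READ-FROM-STRING, since READ itself expects a stream rather than a
string. REMOVE-BAD-CHARACTERS is not a standard function; the version
below is only one guess at what it might do, and the set of characters
it blanks out is an assumption. READ-ROW is likewise a made-up name.

```lisp
;; Hypothetical helper: the character set below is an assumption,
;; chosen to cover the usual reader-sensitive characters.
(defun remove-bad-characters (line)
  "Return a copy of LINE with reader-sensitive characters replaced by spaces."
  (substitute-if #\Space
                 (lambda (ch) (find ch "#:,'`|;\"()\\"))
                 line))

;; Hypothetical wrapper around the one-liner above.
(defun read-row (line)
  "Read the whitespace-separated tokens of LINE as one Lisp list."
  ;; Disable #. evaluation as a precaution when reading foreign data.
  (let ((*read-eval* nil))
    (values (read-from-string
             (format nil "(~A)" (remove-bad-characters line))))))
```

Something like (read-row "1, 2, 3.5") should then give a list of three
numbers. Tokens that are not numbers come back as symbols interned in
the current package, which may or may not be what you want.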

Cheers

--
Marco Antoniotti ========================================================
NYU Courant Bioinformatics Group        tel. +1 - 212 - 998 3488
715 Broadway 10th Floor                 fax  +1 - 212 - 995 4122
New York, NY 10003, USA                 http://bioinformatics.cat.nyu.edu
                    "Hello New York! We'll do what we can!"
                           Bill Murray in `Ghostbusters'.



Sun, 01 May 2005 22:55:25 GMT  


Quote:
>(And on that point, let me add that I think that the concern about
>difficulty in finding competent Lisp programmers is bunk.  I don't
>think that finding competent Lispers is inherently more difficult than
>finding competent programmers for any other language.  Competent
>programmers require some looking around, period, irrespective of the
>intended target language.  The difference is that there aren't a lot
>of incompetent "programmers" out there claiming to be able to write
>Common Lisp code.  End of rant.)

IMHO, very competent programmers are usually able to adapt to just about
any programming language or environment.  They know general programming and
CS concepts, and have used a few different ones so they know the various
paradigms and idioms.

A programmer who can't learn new languages easily, while not necessarily
incompetent, is at best mediocre.

--

Genuity, Woburn, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.



Sun, 01 May 2005 23:37:18 GMT  

Quote:

> IMHO, very competent programmers are usually able to adapt to just about
> any programming language or environment.  They know general programming and
> CS concepts, and have used a few different ones so they know the various
> paradigms and idioms.

        I would agree as far as programming languages are concerned,
but _not_ about environments. In particular, I think it should be
allowed to be "incompatible" with Microsoft Visual Studio without
necessarily being incompetent.

--

Senior Software Engineer             Web:   http://www.fast.no/
Fast Search & Transfer ASA           Phone: +47 23 01 11 60
P.O. Box 1677 Vika                   Fax:   +47 35 54 87 99
NO-0120 Oslo, NORWAY                 Mob:   +47 48 01 11 60

Try FAST Search: http://alltheweb.com/



Mon, 02 May 2005 00:37:41 GMT  

Quote:

> A programmer who can't learn new languages easily, while not necessarily
> incompetent, is at best mediocre.

Absolutely. I watched three people (me included) learn CL after I
adopted it for my little team. No problemo. Mind you, eight years later
I am still learning it, but I was productive almost immediately.

When our CEO worries about finding CLers, I tell him that if someone
cannot learn a new language in a week he does not want them anyway.
Especially since, as a start-up, we cannot afford dozens of developers
(which is probably /never/ a good idea).

He doesn't buy it, but his first business was and is tech recruiting and
he says the search he did for Lisp people produced by far the most
talented crop of applicants he's seen in twenty years of tech recruiting.

So he is happy.

--

  kenny tilton
  clinisys, inc
  ---------------------------------------------------------------
"Well, I've wrestled with reality for thirty-five years, Doctor,
   and I'm happy to state I finally won out over it."
                                                   Elwood P. Dowd



Mon, 02 May 2005 01:32:12 GMT  

Quote:
> The difference is that there aren't a lot
> of incompetent "programmers" out there claiming to be able to write
> Common Lisp code.

Exactly. In fact, sick idea: I'd love to track down the worst programmer
who likes Lisp. Maybe hold an anti-contest. I bet they would not be very
bad. Hmmm, maybe I should look at the Lisp archives.

Then we get to argue over whether Lisp makes us write good code, or
whether only good coders like Lisp.

--

  kenny tilton
  clinisys, inc
  ---------------------------------------------------------------
"Well, I've wrestled with reality for thirty-five years, Doctor,
   and I'm happy to state I finally won out over it."
                                                   Elwood P. Dowd



Mon, 02 May 2005 02:08:49 GMT  

Quote:

> Hi!

> Is Lisp being used in programs that do text processing of various
> kinds, such as parsing and manipulating an HTML file, parsing C code
> (and reindenting it), etc.? Would Lisp be a good choice if one is
> to write such software?

I would argue that there is nothing better. The hard thing in
processing languages is not the lexical analysis, or even the parsing;
these can be done quite easily even in languages like C. Where things
get hard is actually building the parse trees and transforming them;
and this is where Lisp really shines.
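
To give a feel for the tree-transformation part, here is a minimal
sketch that treats a parse tree as a nested list and folds constant
arithmetic subexpressions. The tree layout (operator first, operands
after) and the name FOLD-CONSTANTS are assumptions made for the
example, not anything prescribed by a particular parser.

```lisp
;; Hypothetical example: parse trees are plain lists of the form
;; (operator operand...), with symbols and numbers as leaves.
(defun fold-constants (tree)
  "Return TREE with all-numeric +, - and * subexpressions evaluated."
  (if (atom tree)
      tree
      (destructuring-bind (op &rest args) tree
        ;; Fold the children first so constants propagate upwards.
        (let ((folded (mapcar #'fold-constants args)))
          (if (and (member op '(+ - *))
                   folded
                   (every #'numberp folded))
              (apply op folded)
              (cons op folded))))))
```

For instance, (fold-constants '(+ x (* 2 3) (- 10 4))) leaves the
variable alone and reduces the two inner products and differences to
numbers. Division is left out on purpose to avoid dividing by zero
while rewriting.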

To make from-scratch lexing run fast in Lisp, you will need a good
native compiler, and will probably need to make liberal use of
declarations to make some of the run time type checking go away. In a
tight loop that reads and branches on characters, you don't need the
machine to be type checking whether a value is in fact a character or not.
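
A minimal sketch of the kind of declarations meant here, assuming the
input has already been read into a simple string. COUNT-DIGITS is a
made-up function, and how much the declarations actually buy depends
entirely on the compiler.

```lisp
;; Hypothetical example of a declared inner loop over characters.
(defun count-digits (text)
  "Count the decimal digit characters in TEXT, which must be a simple-string."
  (declare (type simple-string text)
           ;; Ask for speed; safety is kept at 1 rather than 0 so that
           ;; most compilers still check the argument type on entry.
           (optimize (speed 3) (safety 1)))
  (let ((count 0))
    (declare (type fixnum count))
    (dotimes (i (length text) count)
      ;; SCHAR is the simple-string accessor, so no generic dispatch
      ;; on the array type is needed inside the loop.
      (when (digit-char-p (schar text i))
        (incf count)))))
```

With the string type declared, a native compiler is free to open-code
the character access instead of checking types on every iteration.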

The built in lexical analyzer, called the Lisp reader, can be
customized to some extent; it's good to evaluate whether a given
language can be munged that way.
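
As a small sketch of that kind of reader customization: the code below
works on a private copy of the standard readtable, makes commas read as
whitespace, and makes square brackets delimit lists. READ-BRACKETED is
a made-up name, and the bracket syntax is just an example input format.

```lisp
;; Hypothetical example: read text such as "[1, 2, [3, 4]]" as nested lists.
(defun read-bracketed (text)
  "Read TEXT using [ ] as list delimiters and commas as whitespace."
  (let ((*readtable* (copy-readtable nil)) ; fresh copy of the standard table
        (*read-eval* nil))                 ; no #. evaluation on foreign data
    ;; [ starts a list that runs up to the matching ].
    (set-macro-character #\[
                         (lambda (stream char)
                           (declare (ignore char))
                           (read-delimited-list #\] stream t)))
    ;; Give ] the same terminating behaviour as ).
    (set-macro-character #\] (get-macro-character #\)))
    ;; Treat commas like spaces.
    (set-syntax-from-char #\, #\Space)
    (values (read-from-string text))))
```

Because the readtable is rebound locally, the changes do not leak into
the rest of the program.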



Mon, 02 May 2005 02:37:43 GMT  