newbie needs a little help parsing a comma delimited file 

Hello all,

I am trying to learn Lisp (I am in the middle of Touretzky's Gentle
Introduction book, and I've got Winston's book as well; I just started
trying to learn Lisp about a week ago).  I have been programming C++ for
the last 10 years, and I work on a financial trading floor.  I would
like to start converting a bunch of C++ code to Lisp (I have Franz's
Lisp for Linux, and am delighted with it).

I would like to do the following:
take an ASCII file which is uniformly comma delimited (except at the
end of each line), i.e.

double,double,double,double,double, double
double,double,double,double,double, double
.
.
double,double,double,double,double, double

for an unknown number of lines, and create a list
in the form ((double,double,double,double,double, double)
             (double,double,double,double,double, double)
             .
             .
             (double,double,double,double,double, double))

I am playing around with with-open-file and the read functions, but I am
not really sure whether to read the file in character by character,
peeking ahead for a comma, or to parse it line by line.

Any source code or pointers would be incredibly helpful.  By converting
my C++ code to Lisp, I know I will eliminate A LOT of past and future
work that I have been slaving over in the C++ world (like representing
integers, doubles, chars, and char* as generic pieces of data), which
Lisp hands me for free.

In fact, for the past 2 years I have been working on genericizing
functions (in the form of template function objects) and objects.  Lisp
does all this already.  It's pretty damned amazing.

(Anyway, here is part of some C++ code: a static member function
which gets fed lines of a file and returns the individual pieces
in a vector of strings.  I know Lisp is going to be easier, once I
get some experience!!)

//------------------------------------------------------------
// Parse a string using delimiter, returning the individual strings in a
// vector.  The boolean indicates whether there is a delimiter at the end
// of the line or not.
vector<string> CRecordset::parseLine( string line, string delimiter,
                                      bool isDelimiterAtEOL )
{
  vector<string> substrings;
  string::size_type pos = 0;
  string::size_type posPrevious = pos;
  string::iterator itBegin = line.begin();

  // find_first_of matches any single character of delimiter
  while (( pos = line.find_first_of(delimiter, pos)) != string::npos) {
    string substring(itBegin + posPrevious, itBegin + pos);
    substrings.push_back(substring);
    pos++;
    posPrevious = pos;
  }

  // handle case of no delimiter at End of Line
  // yet we want that piece of data
  //
  // i.e.   5,10,189
  //
  if (isDelimiterAtEOL == false) {
    string::iterator itEnd = line.end();
    string substring(itBegin + posPrevious, itEnd);
    substrings.push_back(substring);
  }

  return substrings;
}
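
For comparison, a rough Common Lisp counterpart of parseLine might look
like the sketch below.  The names split-line and parse-line are made up
for illustration, the delimiter is assumed to be a single character, and
each field is handed to the Lisp reader, so this should only be fed
trusted, purely numeric data.

```lisp
;; Hypothetical helper: split LINE at every occurrence of DELIMITER,
;; returning the pieces as a list of strings.
(defun split-line (line &optional (delimiter #\,))
  (loop for start = 0 then (1+ end)
        for end = (position delimiter line :start start)
        collect (subseq line start end)   ; END is NIL for the last piece
        while end))

;; Hypothetical helper: turn one delimited line into a list of numbers
;; by letting the reader parse each field.
(defun parse-line (line &optional (delimiter #\,))
  (let ((*read-eval* nil))                ; refuse #. forms in the data
    (mapcar #'read-from-string (split-line line delimiter))))
```

With these definitions, (parse-line "5,10,189") should give (5 10 189).
A delimiter at the end of the line produces a final empty string from
split-line, which parse-line as written does not handle, so that case
would need the same kind of flag the C++ version uses.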



Sun, 17 Mar 2002 03:00:00 GMT  

Quote:

> Hello all,

>  I am trying to learn lisp ( I am in the middle of Touretzky's Gentle
> Introduction book, and I've got Winston's book as well - I just started
> trying to learn lisp about a week ago).  I have been programming C++ for
> the last 10 years, and I work on a financial trading floor.  I would
> like to start converting a bunch of C++
> code to lisp ( I have franz's lisp  for linux, and am delighted with it)

> I would like to do the following:
> take an ascii file which is uniformly comma delimited ( except at the
> end of each line)   i.e.

> double,double,double,double,double, double
> double,double,double,double,double, double
> .

If you want to do this parser in LISP as an exercise in learning LISP,
fine.  However, if you just want to read this data into LISP, why not
have the C++ code write out the data as a LISP list?  You already have
the code to read it into C++, after all.

--
Samir Barjoud



Sun, 17 Mar 2002 03:00:00 GMT  

Quote:

> If you want to do this parser in LISP as an exercise in learning LISP,
> fine.  However, if you just want to read this data into LISP, why not
> have the C++ code write out the data as a LISP list?  You already have
> the code to read it into C++, after all.

> --
> Samir Barjoud


I want to do this purely as an exercise in learning LISP.  I would like
those with experience doing this simple (for others :) ) exercise to
point me towards enlightenment.

dave linenberg



Mon, 18 Mar 2002 03:00:00 GMT  

Quote:


> > If you want to do this parser in LISP as an exercise in learning LISP,
> > fine.  However, if you just want to read this data into LISP, why not
> > have the C++ code write out the data as a LISP list?  You already have
> > the code to read it into C++, after all.

> > --
> > Samir Barjoud

> I want to do this purely as an exercise in learning LISP.  I would like those
> with experience doing
> this simple ( for others :)    ) exercise to point me towards enlightenment.

Part of learning a language, of course, is learning how to do what it
wants you to do in the style it wants you to do it, so Samir's suggestion
*is* learning LISP, in a sense.  Printing out info in C++ style and
then asking LISP to read it in that way will cause you to write a program
in Lisp that is not very lispy, and so what it teaches you about the
language is questionable.

Here's an example of one way to do it that takes advantage of the fact
that you're doing something that involves relatively simple, predictable
syntax and where you might not mind redefining certain syntaxes of Lisp
that might be used differently in other cases.  But I really don't recommend
this as a general solution and am not sure it teaches good practices to
novices.

(defvar *my-readtable* (copy-readtable *readtable*))

(set-syntax-from-char #\, #\Space *my-readtable*)

(defun parse-comma-delimited-numbers-from-string (string)
  (with-input-from-string (string-stream string)
    (parse-comma-delimited-numbers-from-stream string-stream)))

(defun parse-comma-delimited-numbers-from-stream (stream)
  (let ((*readtable* *my-readtable*)
        (*read-eval* nil)) ;disable trojan horses
    (loop for x = (read stream nil nil)
          while x
          collect x)))

(parse-comma-delimited-numbers-from-string "1.0,3.0,5.3d7,6.2")
=> (1.0 3.0 5.3E7 6.2)

;; Note: I tested an earlier version of this but then edited it a lot
;;  to make it prettier so may have broken it somewhere along the way.
;;  The basic concept should work, though.



Mon, 18 Mar 2002 03:00:00 GMT  

+---------------
| (defvar *my-readtable* (copy-readtable *readtable*))
| (set-syntax-from-char #\, #\Space *my-readtable*)
| (defun parse-comma-delimited-numbers-from-string (string)
|   (with-input-from-string (string-stream string)
|     (parse-comma-delimited-numbers-from-stream string-stream)))
| (defun parse-comma-delimited-numbers-from-stream (stream)
|   (let ((*readtable* *my-readtable*)
|         (*read-eval* nil)) ;disable trojan horses
|     (loop for x = (read stream nil nil)
|           while x
|           collect x)))
|
| (parse-comma-delimited-numbers-from-string "1.0,3.0,5.3d7,6.2")
| => (1.0 3.0 5.3E7 6.2)
|
| ;; Note: I tested an earlier version of this but then edited it a lot
| ;;  to make it prettier so may have broken it somewhere along the way.
| ;;  The basic concept should work, though.
+---------------

Hmmm... The test case works fine in CLISP, but CMUCL 18b complains:

        * (parse-comma-delimited-numbers-from-string "1.0,3.0,5.3d7,6.2")
        Reader error at 4 on #<String-Input Stream>:
        Comma not inside a backquote.
        [...thence to debugger...]
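
One way to probe how a particular implementation treats the comma after
set-syntax-from-char is a small self-contained check like the one below.
It is a sketch only, and comma-reads-as-whitespace-p is a made-up name.
As I read the standard it should return T; NIL would point to the kind
of problem reported here.

```lisp
;; Hypothetical check: does the reader treat #\, like a space once its
;; syntax has been copied from #\Space in a private readtable?
(defun comma-reads-as-whitespace-p ()
  (let ((*readtable* (copy-readtable nil)) ; fresh copy of the standard readtable
        (*read-eval* nil))
    (set-syntax-from-char #\, #\Space *readtable*)
    ;; IGNORE-ERRORS turns a reader error into NIL instead of
    ;; dropping into the debugger.
    (equal (ignore-errors (read-from-string "(1,2)"))
           '(1 2))))
```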

-Rob

-----

Applied Networking               http://www.*-*-*.com/
Silicon Graphics, Inc.          Phone: 650-933-1673
1600 Amphitheatre Pkwy.         FAX: 650-933-0511
Mountain View, CA  94043        PP-ASEL-IA



Fri, 22 Mar 2002 03:00:00 GMT  

Quote:

> | (parse-comma-delimited-numbers-from-string "1.0,3.0,5.3d7,6.2")
> | => (1.0 3.0 5.3E7 6.2)
> |
> | ;; Note: I tested an earlier version of this but then edited it a lot
> | ;;  to make it prettier so may have broken it somewhere along the way.
> | ;;  The basic concept should work, though.
> +---------------

> Hmmm... The test case works fine in CLISP, but CMUCL 18b complains:

>         * (parse-comma-delimited-numbers-from-string "1.0,3.0,5.3d7,6.2")
>         Reader error at 4 on #<String-Input Stream>:
>         Comma not inside a backquote.
>         [...thence to debugger...]

Maybe it is time for a bug report. ;-)


Fri, 22 Mar 2002 03:00:00 GMT  
Hi all,

I originally had the question about parsing comma delimited files (I just started learning
Lisp a week ago).  It is VERY common to have to parse such files (exported from Excel, data
files holding equity / forex / interest-rate instrument data, etc.) on the trading floor where
I work.

In any case, I came up with a simple function,

(defun comma-parse (in-string)
  (concatenate 'string "(" (substitute #\Space #\, in-string) ")"))

which simply substitutes spaces for the commas, if there are any, and adds parentheses to
both sides of the string.  If you wrap that with the read-from-string function, i.e.

(read-from-string (comma-parse "1,2,3,4,5"))

you get

(1 2 3 4 5)

which is exactly what I wanted.  This can easily be implemented in a loop or do construct
which reads the file line by line until EOF, collecting the lines into the
desired form.

(   (1 2 3 4 5)
    (1 1 5 9 2)
.
.
    )
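
The line-by-line loop described above might be sketched as follows.
This is only one possible arrangement: read-comma-file is a made-up
name, comma-parse is repeated so the block stands alone, and the file is
assumed to contain nothing but numbers and commas.

```lisp
;; Repeated from the post so this block is self-contained.
(defun comma-parse (in-string)
  (concatenate 'string "(" (substitute #\Space #\, in-string) ")"))

;; Hypothetical helper: return one list of numbers per line of the file.
(defun read-comma-file (path)
  (with-open-file (in path :direction :input)
    (let ((*read-eval* nil))              ; refuse #. forms in the data
      (loop for line = (read-line in nil nil) ; NIL at end of file
            while line
            collect (read-from-string (comma-parse line))))))
```

Note that with the default reader settings a field such as 1.5 is read
as a single-float.  If doubles are wanted, binding
*read-default-float-format* to 'double-float around the call is one way
to get them.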

Although I have pretty much no experience with lisp, I am blown away by how beautiful and
powerful the language is.  Although I haven't even scratched the surface of the language, it
is immediately apparent that the amount of code required to do amazingly powerful things is
quite small.

It's amazing that doing the above in C++ would require quite a bit of code, which would
ultimately be a hack implemented by some programmer(s) implementing containers (lists /
vectors) holding containers of generic objects (string, double, int), each with a char*
constructor, to be read from a file.  I've done this, and it's taken a while (i.e. at least
a year), and Lisp gives this, and a lot more, for free!!

dave linenberg




Sat, 23 Mar 2002 03:00:00 GMT  
 