newbie needs a little help parsing a comma delimited file
Author |
Message |
dave linenber #1 / 7
|
Hello all,

I am trying to learn lisp (I am in the middle of Touretzky's Gentle Introduction book, and I've got Winston's book as well - I just started trying to learn lisp about a week ago). I have been programming C++ for the last 10 years, and I work on a financial trading floor. I would like to start converting a bunch of C++ code to lisp (I have franz's lisp for linux, and am delighted with it).

I would like to do the following: take an ascii file which is uniformly comma delimited (except at the end of each line), i.e.

double,double,double,double,double, double
double,double,double,double,double, double
.
.
double,double,double,double,double, double

for an unknown number of lines, and create a list in the form

( (double,double,double,double,double, double)
  (double,double,double,double,double, double)
  .
  .
  (double,double,double,double,double, double))

I am playing around with the with-open-file and the read functions, but I am not really sure whether to read the file in character by character, peeking ahead for a comma, or to parse it line by line. Any source code or pointers would be incredibly helpful.

By converting my C++ code to lisp, I know I will eliminate A LOT of past and future work that I have been slaving over in the C++ world (like representing integers, doubles, chars, char* as generic pieces of data), which lisp hands me for free. In fact, for the past 2 years I have been working on genericizing functions (in the form of template function objects) and objects. Lisp does all this already. It's pretty damned amazing.

(Anyway, here is part of some c++ code - a static member function which gets fed lines of a file, and then returns the individual pieces in a vector of strings. I know lisp is going to be easier, once I get some experience!!)

//------------------------------------------------------------
// parse a string using delimiter returning individual strings in vector
// boolean indicates whether there is a delimiter at end of line or not
vector<string> CRecordset::parseLine( string line, string delimiter, bool isDelimiterAtEOL )
{
    vector<string> substrings;
    string::size_type pos = 0;
    string::size_type posPrevious = pos;
    string::iterator itBegin = line.begin();
    while (( pos = line.find_first_of(delimiter, pos)) != string::npos)
    {
        string substring(itBegin+posPrevious, itBegin+pos);
        substrings.push_back(substring);
        pos++;
        posPrevious = pos;
    }
    // handle case of no delimiter at End of Line
    // yet we want that piece of data
    //
    // i.e. 5,10,189
    //
    if (isDelimiterAtEOL == false )
    {
        string::iterator itEnd = line.end();
        string substring(itBegin+posPrevious, itEnd);
        substrings.push_back(substring);
    }
    return substrings;
}
|
Sun, 17 Mar 2002 03:00:00 GMT |
|
 |
Samir Barjou #2 / 7
|
Quote:
> Hello all,
> I am trying to learn lisp ( I am in the middle of Touretzky's Gentle
> Introduction book, and I've got Winston's book as well - I just started
> trying to learn lisp about a week ago). I have been programming C++ for
> the last 10 years, and I work on a financial trading floor. I would
> like to start converting a bunch of C++
> code to lisp ( I have franz's lisp for linux, and am delighted with it)
> I would like to do the following:
> take an ascii file which is uniformly comma delimited ( except at the
> end of each line) i.e.
> double,double,double,double,double, double
> double,double,double,double,double, double
> .

If you want to do this parser in LISP as an exercise in learning LISP, fine. However, if you just want to read this data into LISP, why not have the C++ code write out the data as a LISP list? You already have the code to read it into C++, after all.

--
Samir Barjoud
|
Sun, 17 Mar 2002 03:00:00 GMT |
|
 |
dave linenber #3 / 7
|
Quote:
> If you want to do this parser in LISP as an exercise in learning LISP,
> fine. However, if you just want to read this data into LISP, why not
> have the C++ code write out the data as a LISP list? You already have
> the code to read it into C++, after all.
> --
> Samir Barjoud

I want to do this purely as an exercise in learning LISP. I would like those with experience doing this simple (for others :) ) exercise to point me towards enlightenment.

dave linenberg
|
Mon, 18 Mar 2002 03:00:00 GMT |
|
 |
Kent M Pitma #4 / 7
|
Quote:
> > If you want to do this parser in LISP as an exercise in learning LISP,
> > fine. However, if you just want to read this data into LISP, why not
> > have the C++ code write out the data as a LISP list? You already have
> > the code to read it into C++, after all.
> > --
> > Samir Barjoud
> I want to do this purely as an exercise in learning LISP. I would like those
> with experience doing
> this simple ( for others :) ) exercise to point me towards enlightenment.

Part of learning a language, of course, is learning how to do what it wants you to do in the style it wants you to do it, so Samir's suggestion *is* learning LISP, in a sense. Printing out info in C++ style and then asking LISP to read it in that way will cause you to write a program in Lisp that is not very lispy, and so what it teaches you about the language is questionable.

Here's an example of one way to do it that takes advantage of the fact that you're doing something that involves relatively simple, predictable syntax and where you might not mind redefining certain syntaxes of Lisp that might be used differently in other cases. But I really don't recommend this as a general solution and am not sure it teaches good practices to novices.

(defvar *my-readtable* (copy-readtable *readtable*))
(set-syntax-from-char #\, #\Space *my-readtable*)

(defun parse-comma-delimited-numbers-from-string (string)
  (with-input-from-string (string-stream string)
    (parse-comma-delimited-numbers-from-stream string-stream)))

(defun parse-comma-delimited-numbers-from-stream (stream)
  (let ((*readtable* *my-readtable*)
        (*read-eval* nil)) ;disable trojan horses
    (loop for x = (read stream nil nil)
          while x
          collect x)))

(parse-comma-delimited-numbers-from-string "1.0,3.0,5.3d7,6.2")
=> (1.0 3.0 5.3E7 6.2)

;; Note: I tested an earlier version of this but then edited it a lot
;; to make it prettier so may have broken it somewhere along the way.
;; The basic concept should work, though.
|
Mon, 18 Mar 2002 03:00:00 GMT |
|
 |
Rob Warno #5 / 7
|
+---------------
| (defvar *my-readtable* (copy-readtable *readtable*))
| (set-syntax-from-char #\, #\Space *my-readtable*)
| (defun parse-comma-delimited-numbers-from-string (string)
|   (with-input-from-string (string-stream string)
|     (parse-comma-delimited-numbers-from-stream string-stream)))
| (defun parse-comma-delimited-numbers-from-stream (stream)
|   (let ((*readtable* *my-readtable*)
|         (*read-eval* nil)) ;disable trojan horses
|     (loop for x = (read stream nil nil)
|           while x
|           collect x)))
|
| (parse-comma-delimited-numbers-from-string "1.0,3.0,5.3d7,6.2")
| => (1.0 3.0 5.3E7 6.2)
|
| ;; Note: I tested an earlier version of this but then edited it a lot
| ;; to make it prettier so may have broken it somewhere along the way.
| ;; The basic concept should work, though.
+---------------

Hmmm... The test case works fine in CLISP, but CMUCL 18b complains:

  * (parse-comma-delimited-numbers-from-string "1.0,3.0,5.3d7,6.2")

  Reader error at 4 on #<String-Input Stream>:
  Comma not inside a backquote.
  [...thence to debugger...]

-Rob

-----
Applied Networking         http://www.*-*-*.com/
Silicon Graphics, Inc.     Phone: 650-933-1673
1600 Amphitheatre Pkwy.    FAX: 650-933-0511
Mountain View, CA 94043    PP-ASEL-IA
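Since the readtable trick does not behave the same in every implementation tested here, one way to sidestep the question is to cut the line at the commas first, so the reader never sees a comma at all. The following is only a sketch under that assumption; `split-on-commas` and `parse-number-fields` are hypothetical names, not functions from this thread or from any library.

```lisp
;; Hypothetical helper: split LINE at each comma and return the pieces
;; as a list of strings. A line with no comma yields a one-element list.
(defun split-on-commas (line)
  (loop with start = 0
        for end = (position #\, line :start start)
        collect (subseq line start end)   ; END of NIL means "to end of string"
        while end
        do (setf start (1+ end))))

;; Hypothetical helper: read each non-empty field with the ordinary reader.
;; Commas are already gone, so the standard readtable is sufficient.
(defun parse-number-fields (line)
  (let ((*read-eval* nil))                ; refuse #. forms in data files
    (loop for field in (split-on-commas line)
          for trimmed = (string-trim '(#\Space #\Tab) field)
          unless (string= trimmed "")     ; skip the empty field after a trailing comma
            collect (read-from-string trimmed))))

;; Example use:
;; (parse-number-fields "1,2, 3")
```

Note that this still uses the reader on each field, so a field that is not a number will come back as a symbol rather than signal an error; checking each result with `numberp` would be a reasonable addition.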
|
Fri, 22 Mar 2002 03:00:00 GMT |
|
 |
Rainer Josw #6 / 7
|
Quote:
> | (parse-comma-delimited-numbers-from-string "1.0,3.0,5.3d7,6.2")
> | => (1.0 3.0 5.3E7 6.2)
> |
> | ;; Note: I tested an earlier version of this but then edited it a lot
> | ;; to make it prettier so may have broken it somewhere along the way.
> | ;; The basic concept should work, though.
> +---------------
> Hmmm... The test case works fine in CLISP, but CMUCL 18b complains:
> * (parse-comma-delimited-numbers-from-string "1.0,3.0,5.3d7,6.2")
> Reader error at 4 on #<String-Input Stream>:
> Comma not inside a backquote.
> [...thence to debugger...]
Maybe it is time for a bug report. ;-)
|
Fri, 22 Mar 2002 03:00:00 GMT |
|
 |
dave linenber #7 / 7
|
Hi all,

I originally had the question about parsing comma delimited files. (I just started learning lisp a week ago.) It is VERY common to have to parse such files (exported from excel, data files holding equity / forex / interest-rate instrument data etc.) on the trading floor where I work.

In any case, I came up with a simple function,

(defun comma-parse (in-string)
  (concatenate 'string "(" (substitute #\space #\, in-string) ")"))

which simply substitutes spaces for the commas, if they exist, and adds parentheses to both sides of the string. If you wrap that with the read-from-string function, i.e.

(read-from-string (comma-parse "1,2,3,4,5"))
(1 2 3 4 5)

you get exactly what I wanted. This can easily be implemented in a loop or do construct which reads the file line by line until the eof, appending or collecting the lines into the desired form:

( (1 2 3 4 5)
  (1 1 5 9 2)
  .
  .
)

Although I have pretty much no experience with lisp, I am blown away by how beautiful and powerful the language is. Although I haven't even scratched the surface of the language, it is immediately apparent that the amount of code required to do amazingly powerful things is quite small. It's amazing that doing the above in C++ would require quite a bit of code, which would ultimately be a hack implemented by some programmer(s) implementing containers (lists / vectors) holding containers of generic objects (string, double, int), each with a char* constructor, to be read from a file. I've done this, and it's taken a while (i.e. at least a year), and lisp gives this, and a lot more, for free!!

dave linenberg

Quote:
> > | (parse-comma-delimited-numbers-from-string "1.0,3.0,5.3d7,6.2")
> > | => (1.0 3.0 5.3E7 6.2)
> > |
> > | ;; Note: I tested an earlier version of this but then edited it a lot
> > | ;; to make it prettier so may have broken it somewhere along the way.
> > | ;; The basic concept should work, though.
> > +---------------
> > Hmmm... The test case works fine in CLISP, but CMUCL 18b complains:
> > * (parse-comma-delimited-numbers-from-string "1.0,3.0,5.3d7,6.2")
> > Reader error at 4 on #<String-Input Stream>:
> > Comma not inside a backquote.
> > [...thence to debugger...]
> Maybe it is time for a bug report. ;-)
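The line-by-line loop that dave's post describes in words could look roughly like the sketch below. `comma-parse` is repeated from the post so the example stands alone; `read-comma-lines` and `read-comma-file` are hypothetical names chosen for illustration, and binding `*read-eval*` to nil is borrowed from Kent's earlier reply rather than from dave's post.

```lisp
;; comma-parse as given in the post above, repeated so this sketch is
;; self-contained.
(defun comma-parse (in-string)
  (concatenate 'string "(" (substitute #\Space #\, in-string) ")"))

;; Hypothetical helper: read STREAM line by line until end of file and
;; collect one list per line.
(defun read-comma-lines (stream)
  (let ((*read-eval* nil))                ; refuse #. forms in data files
    (loop for line = (read-line stream nil nil)   ; NIL at end of file
          while line
          collect (read-from-string (comma-parse line)))))

;; Hypothetical helper: the same thing for a file named by PATH.
(defun read-comma-file (path)
  (with-open-file (in path :direction :input)
    (read-comma-lines in)))

;; Example use on a string instead of a file:
;; (with-input-from-string (s (format nil "1,2,3~%4,5, 6~%"))
;;   (read-comma-lines s))
;; => ((1 2 3) (4 5 6))
```

Two things to keep in mind with this approach: a blank line reads as the empty list NIL, and any field that is not a number is read as a symbol, so it fits best when the file is known to contain only numbers.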
|
Sat, 23 Mar 2002 03:00:00 GMT |
|
|
|