Parsing Comma delimited files in J 

Greetings,

There have been a number of messages in this group lately
about reading comma delimited *.csv files in J.  This is something
I've been doing for years with the following verb.  It works
quite well for small and modestly sized *.csv files.  Large files will
have to be processed in chunks.
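
One possible way to take a chunk, shown here only as a rough sketch: read a fixed number of bytes with the indexed read foreign 1!:11, then trim the result back to the last complete line before parsing. The scratch file name chunkdemo.csv, the 9 byte chunk size and the names fn, raw, n, chunk and rest are all made up for this illustration.

```j
NB. sketch only: chunkdemo.csv and the 9 byte chunk size are made up
LF    =: 10 { a.
fn    =: < 'chunkdemo.csv'
('a,b',LF,'c,d',LF,'e,f',LF) 1!:2 fn     NB. write a 12 byte scratch file
raw   =: 1!:11 fn , < 0 9                NB. indexed read: 9 bytes from offset 0
n     =: >: raw i: LF                    NB. length up to and including the last LF
chunk =: n {. raw                        NB. whole lines only, ready for parsecsv
rest  =: 1!:11 fn , < n , (1!:4 fn) - n  NB. next read starts where chunk ended
```

Here chunk holds the first two lines and rest holds the third. A chunk that contains no LF at all is not handled by this sketch and would need a longer read.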

NB. J script begins *********************************************

CR=: 13 { a.

NB. assertion error with message
assertmsg=: [ 13!:8 ([: 0&e. ]) $ 12"_

NB. parse comma delimited *.csv text.  The y. argument
NB. is a comma delimited character list.  The x. argument
NB. specifies an alternate delimiter character.  Assumes LF, CRLF
NB. or LFCR delimited lines and text that ends with a line terminator

parsecsv=: 3 : 0
',' parsecsv y.
:
'separator cannot be the " character' assertmsg -. x. -: '"'

NB. line delimited *.csv text to char table, each row prefixed with the delimiter
y. =.  x. ,. ];._2 y. -. CR

NB. bit mask of field delimiters that lie outside " quoted text
b  =. -. }. ~:/\ '"' e.~  ' ' , , y.
b  =. ($y.) $ b *. , x. = y.

NB. use masks to cut lines
b <;._1"1 y.
)

NB. J script ends ***********************************************
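
The quote handling is the least obvious step, so here is a small sketch of the same idea on a single line of text. It is an illustration, not a replacement for the verb: the names s, q, m and f are invented, and it works on a plain list instead of the character table that parsecsv builds.

```j
NB. sketch only: s, q, m and f are made-up names for this illustration
s =: 'a,"b,c",d'
q =: ~:/\ s = '"'            NB. running parity of " marks: 0 0 1 1 1 1 0 0 0
m =: (s = ',') *. -. q       NB. commas outside quotes:     0 1 0 0 0 0 0 1 0
f =: (1 , m) <;._1 ',' , s   NB. prefix a delimiter, then cut at the marked places
```

The comma inside the quotes is not marked in m, so f has three boxed fields, and the middle one still carries its " characters.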

To use this verb do the following:

   NB. read comma delimited text file and parse
   x =.  parsecsv 1!:1 <'c:\talks\rasc\mess2.csv'

   NB. corner elements of parsed file
   10 5 {. x
+-------------+--------+--------+--------+--------+
|<SEEN>       |OBJECT_I|OBJ_ALT_|CONS_ABB|OBJ_TYPE|
+-------------+--------+--------+--------+--------+
|3/27/95 21:25|M1      |NGC1952 |Tau     |PN      | <-- wrong supernova
+-------------+--------+--------+--------+--------+     remnant, never
|5/25/96 1:30 |M10     |NGC6254 |Oph     |GC      |     noticed this before
+-------------+--------+--------+--------+--------+
|             |M100    |NGC4321 |Com     |SG      |
+-------------+--------+--------+--------+--------+
|7/21/96 0:30 |M101    |NGC5457 |UMa     |SG      |
+-------------+--------+--------+--------+--------+
|7/21/96 1:00 |M102    |NGC5457 |UMa     |SG      |
+-------------+--------+--------+--------+--------+
|9/20/96 0:30 |M103    |NGC581  |Cas     |OC      |
+-------------+--------+--------+--------+--------+
|3/25/95 23:40|M104    |NGC4594 |Vir     |SG      |
+-------------+--------+--------+--------+--------+
|             |M105    |NGC3379 |Leo     |EG      |
+-------------+--------+--------+--------+--------+
|             |M106    |NGC4258 |CVn     |SG      |
+-------------+--------+--------+--------+--------+

   NB. top of raw data looks like:

<SEEN>,OBJECT_I,OBJ_ALT_,CONS_ABB,OBJ_TYPE,OBJ_RA,OBJ_DEC,OBJ_MAGV,OBJ_SIZE,OBJ_BURN,OBJ_COMM,SITE,OPTICS
3/27/95 21:25,M1,NGC1952,Tau,PN,534.5,22.01,8.2,6x4,!!,SNR (1054) - Crab Nebula,Home,125a/18e telescope
5/25/96 1:30,M10,NGC6254,Oph,GC,1657.1,-4.06,6.6,8.2,!,VII,Home,7*50 binoculars
,M100,NGC4321,Com,SG,1222.9,15.49,9.4,5.3x4.5,!,Sc - fine spiral,,

    ....

Hope this helps.
------------------------------------------------------------------------
"Natural selection: the ultimate focus group!"
------------------------------------------------------------------------
John D. Baker



Sun, 30 May 1999 03:00:00 GMT  
 