Using file data transparently as internal data? 

I'm trying to learn some Scheme and I have a tricky problem. What I would
like to do is to use Scheme functions to handle veeery long lists. In fact,
there shouldn't be any limit to the length. For example, I'd like to get the
intersection of two or more lists, e.g. (1 2 3 4 5), (1 3 5) and (1 2 4 5),
each saved in a file. Is there a way to use common Scheme functions to do
this nicely, so that the functions themselves wouldn't need to know that they
are handling file data instead of internal data? The idea is not to load the
whole content into memory at once, but as needed, and the system shouldn't get
slower the longer the lists are. So, I guess, the system should read each
list item one by one from a file and save the correct items to another file,
one by one. I have actually managed to write this kind of function, but the
problem is that the code is heavy and ugly and I would have to customize it
for each list operation. There are many good list operations I'd like to use,
but can't because of the size limit. Is there a solution to this?

Thanks,
Jukka



Fri, 11 May 2001 03:00:00 GMT  

Quote:

> I'm trying to learn some scheme and I have a tricky problem.  What I would
> like to do is to use scheme functions to handle veeery long lists.  In fact,
> there shouldn't be any limit to the length.  For example, I'd like to get an
> intersection of two or more lists e.g. (1 2 3 4 5) (1 3 5) and (1 2 4 5) each
> saved on a file.  Is there a way to use common scheme functions to do this
> nicely so that the functions themselves wouldn't need to know that they are
> handling file data instead of internal data.  The idea is not to load the
> whole content to memory at once, but as needed, and the system shouldn't get
> slower the longer the lists are.  So, I guess, the system should read each
> list item one by one from a file and save the correct items to another file,
> one by one.  I have actually managed to make this kind of function, but the
> problem is that the code is heavy and ugly and I would have to customize it
> to each list operation.  There are many good list operations I'd like to use,
> but can't because of the size limit.  Is there a solution to this???

All things are possible.  You might wake up someday in the den in your dam,
complaining about having a horrible nightmare that you were born and raised as
a small-toothed human who posted messages to comp.lang.scheme.  And then chew
a tree down for breakfast.

One good option is to hunt down a database engine (minisql? postgres?), and
base your functionality on an existing Scheme interface to the engine.  Not
perfect, but not bad.

In a more Scheme-idiomatic fashion, I know RScheme (a Kolbly-powered
paren-smashing Scheme implementation) has facilities (the rstore module) for
reading and writing binary Scheme representations of objects to files; it's
pretty snappy.  It probably tries to read the whole file at once, so you might
find yourself breaking lists up into more than one file.  This is a pretty good
option, I think.




Sat, 12 May 2001 03:00:00 GMT  
+---------------
| ...there shouldn't be any limit to the length. For example, I'd like to get an
| intersection of two or more lists e.g. (1 2 3 4 5)  (1 3 5) and (1 2 4 5)
| each saved on a file. Is there a way to use common scheme functions to do
| this nicely so that the functions themselves wouldn't need to know that they
| are handling file data instead of internal data.
+---------------

Yeah. Break your problem into two (almost-)orthogonal parts:

1. Convert all of your application-specific operations to work on Scheme
   "streams" (delayed-evaluation lists using promises -- any of the usual
   Scheme texts will cover "make-stream/stream-car/stream-cdr");

2. Implement a "streamed file" -- a delayed list such that "force"-ing an
   element of such a stream causes the associated list item to be read from
   a file. [You'll also need a way to write such files, but that's a *lot*
   simpler.]

Then just apply your operations (which now work on *any* "stream") as
needed to "streamed files"...
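The two parts above might be sketched roughly as follows. This is only an
illustration, not code from the thread, and it rests on several assumptions:
the names cons-stream, file->stream, stream->file and stream-intersect are
made up for this example; each file holds its items as separate data
("1 2 3 4 5" rather than one parenthesized list, since "read" would pull a
whole list in at once); and the items are numbers sorted in ascending order,
which lets the intersection run as a single merge pass.

```scheme
;; Part 1: a stream is a pair whose tail is a promise (delay/force).
;; cons-stream is a hypothetical name; some texts call it make-stream.
(define-syntax cons-stream
  (syntax-rules ()
    ((_ head tail) (cons head (delay tail)))))

(define the-empty-stream '())
(define (stream-null? s) (null? s))
(define (stream-car s) (car s))
(define (stream-cdr s) (force (cdr s)))

;; Part 2: a "streamed file".  Forcing the tail reads the next datum.
;; Assumes the file contains whitespace-separated data, not one big list.
(define (port->stream port)
  (let ((item (read port)))
    (if (eof-object? item)
        (begin (close-input-port port) the-empty-stream)
        (cons-stream item (port->stream port)))))

(define (file->stream filename)
  (port->stream (open-input-file filename)))

;; Writing is the simple direction: walk the stream, write each item.
(define (stream->file s filename)
  (call-with-output-file filename
    (lambda (port)
      (let loop ((s s))
        (if (not (stream-null? s))
            (begin (write (stream-car s) port)
                   (newline port)
                   (loop (stream-cdr s))))))))

;; An operation that works on *any* stream.
;; Assumes both inputs are numbers in ascending order.
(define (stream-intersect a b)
  (cond ((or (stream-null? a) (stream-null? b)) the-empty-stream)
        ((< (stream-car a) (stream-car b))
         (stream-intersect (stream-cdr a) b))
        ((> (stream-car a) (stream-car b))
         (stream-intersect a (stream-cdr b)))
        (else
         (cons-stream (stream-car a)
                      (stream-intersect (stream-cdr a) (stream-cdr b))))))

;; Possible usage (file names are placeholders):
;; (stream->file
;;  (stream-intersect (file->stream "a.txt")
;;                    (stream-intersect (file->stream "b.txt")
;;                                      (file->stream "c.txt")))
;;  "out.txt")
```

Because promises remember their values, anything that keeps a reference to
the head of a stream also keeps every item read so far; whether memory use
stays flat therefore depends on not holding on to the head and on how the
implementation handles the loop.  What "call-with-output-file" does when the
output file already exists also varies between implementations.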

-Rob

-----

Applied Networking              http://reality.sgi.com/rpw3/
Silicon Graphics, Inc.          Phone: 650-933-1673
2011 N. Shoreline Blvd.         FAX: 650-964-0811
Mountain View, CA  94043        PP-ASEL-IA



Sat, 12 May 2001 03:00:00 GMT  
 