What is APL data storage? 
Author Message
 What is APL data storage?

I'm a computer major student. Part of my assignment is to design a
storage managment system for a heap of variaable-size blocks under
assumption that the heap is used only to store arrays of real numbers.
This assumption is basicaly the one behind APL data storage.
        Can anybody please tell me where can I find the APL data description?


Thank you! Alex.



Thu, 13 Apr 2000 02:00:00 GMT  
 What is APL data storage?

This is a multi-part message in MIME format.
--------------7540FB193ED1EFC2DC2EF10D
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Well, it is not clear to me that the assertion about APL data storage
made in your post is correct,
you might start with Ray Trimble's article:

Trimble, R.;
Storage Management in IBM APL Systems;
IBM Systems Journal, XXX-4; 1991; pp. 456-468

The APL data description is another matter, and there are several sources
(and data descriptions) depending on which APL implementation,
what platform, and internal vs external data storage.

(As I live in Baltimore, and have several copies of the journal issue in
question,
you may contact me directly to arrange to give a copy of the journal to you
if you cannot easily locate it at Towson or UM.
I'm at 410-792-5839)

dave

Quote:

> I'm a computer major student. Part of my assignment is to design a
> storage managment system for a heap of variaable-size blocks under
> assumption that the heap is used only to store arrays of real numbers.
> This assumption is basicaly the one behind APL data storage.
>         Can anybody please tell me where can I find the APL data description?


> Thank you! Alex.

--
dave

Don't be cibohphobic!

--------------7540FB193ED1EFC2DC2EF10D
Content-Type: text/x-vcard; charset=us-ascii; name="vcard.vcf"
Content-Transfer-Encoding: 7bit
Content-Description: Card for David M Weintraub
Content-Disposition: attachment; filename="vcard.vcf"

begin:          vcard
fn:             David M Weintraub
n:              Weintraub;David M
org:            APL/JHU
adr:            Applied Physics Laboratory 4-383;;The Johns Hopkins University;Laurel;MD;20723-6099;USA

title:          Sr Physicist
tel;work:       (410-792/301-953)-5839
tel;fax:        (410-792/301-953)-6515
tel;home:       410-358-3283.  Fax: 410-318-831

x-mozilla-cpt:  ;0
x-mozilla-html: TRUE
version:        2.1
end:            vcard

--------------7540FB193ED1EFC2DC2EF10D--



Thu, 13 Apr 2000 02:00:00 GMT  
 What is APL data storage?

I am a designer and implementor of APL interpreters and compilers,
so I might be able to provide a few suggestions and comments:

a. The idea that APL is heavy on arrays of real numbers
    is a common myth, but not true. Boolean, integer, and character
    arrays dominate the data types; 0- and 1-element arrays represent more
    than half of the array sizes encountered in real applications.
    Measurements of real applications, taken over periods
    of weeks on large systems bear out these statistics. I can provide
    3 or 4 independent citations of studies if you wish.
    However, that is not truly relevant to your problem, which
    is fundamentally the issue of array heap management.

b. The issue of arrays being real numbers only is a red herring. Bits is bits,
     as they say. The only potential design advantage given by that limitation
     is that you can elide type information, and perhaps gain some performance
     advantage by ensuring that data is aligned in memory in a manner
     appropriate to your target machine.

c.  I am familiar with several forms of storage management in APL
     systems. Simply put, they are as follows:
        1. Use the operating system calls for storage allocation and
            deallocation (e.g., malloc/free). This generally presumes that
            data is not going to have to move while allocated, and may
            cause storage fragmentation to the point of system failure
            if array sizes are highly variable and the system executes for
            long periods of time. Operating system calls may be more
           expensive than one of the following techniques.
        2. Allocate your own heap at startup time and manage it yourself.
            This is generally faster than operating system calls, but means
             that you get to write and debug your own storage manager,
             and you also tend to be limited in the heap size you can allocate
             (You have to guess how much dynamic memory will be needed
             during the entire execution. This is the APL " workspace full"
             problem.) . Methods of heap management of this sort are
             many and varied, and I recommend you check a standard
             Algorithms text for methods. An advantage of this scheme
             besides performance is that you can move arrays during
             execution to defragment memory.
        3. Use a fancy storage management system that is a hybrid of the
            two above, perhaps combined with a buddy system, to reduce
            fragmentation and the need to garbage collect/defragment
            storage.

d. Arrays tend to be stored in APL as something like this (Call this an m-entry,
      for m-entry):  [By the way, schemes of this sort date back to the late
      1960's, and the development of APL\360 at IBM.]
          backpointer
          length of the m-entry in bytes
          element count
          reference count
          array type
          array rank (# dimensions)
          array shape
          array elements (variable length)
     The meanings of these items are:
       backpointer- an index or address of the thing that points to this
                            array. The backpointer gives you the information you
                            need to move the array somewhere else in memory,
                            perhaps for heap compaction. When you move the
                            m-entry, you use the backpointer to find the forward
                            pointer to the m-entry, and then adjust that forward
                            pointer to point to the new location of the m-entry
after
                            it has been moved.

       m-entry length: This is used by the storage compacter when the array
         is moved and by the storage deallocator when the array is deallocated.
      element count: Many APL primitives need to know the number of
             elements in the array. For example, the "ravel" primitive (,) that
               turns any array into a vector (list) has to compute the result
               shape. The element count, formed as the times reduction of
               the array shape, is redundant, but is sometimes used often
              enough during execution that it makes sense to keep it around,
              rather than doing the multiplies on each primitive call.
      reference count: The number of "things" that point to this m-entry
          New references to the m-entry increase the reference count;
          deleted references decrease it. When the reference count goes to
           zero, the m-entry is deallocated. This speeds up certain
           operations and makes them take less memory. For example,
          if  HERM is a big matrix, and you call a function foo:   foo HERM
        then you can either copy all of HERM to a new array for use by foo,
        OR you can increase the reference count for HERM and pass it
        directly into foo. Anything that might want to alter foo's argument
        checks the reference count -- if it's 1, then it can safely do the
        changes in place; otherwise it has to make a copy then.
        Reference counts are generally swell. In compiled APL, such
       as my APEX compiler, we can perform static analysis to remove
       most of the reference counting. This is difficult to do efficiently
       (or at all, for that matter) in an interpreter.

     array type: Since APL supports many data types on overloaded
       primitives, and as APL does not use declarations, the type is used
       to identify the array as, e.g., boolean, integer, character, real,
complex.
       You don't need this, as your arrays are all real.
     array rank: The number of dimensions (axes) in the array. That is,
         the number of elements in the shape vector.
     array shape: The shape vector is the number of elements along each
        axis of the array. A chessboard has shape 8 8. If you ravel it
       to turn it into a vector, the result has shape ,64.
     array elements: Finally, we have the array elements themselves,
        stored in row-major order. This ordering makes certain primitives
        faster, allows us to easily exploit identities and use cache to
        advantage, and matches most other language storage layout
        schemes (except fortran).

A good thing to do, now that I've told you this, is to decouple the array data
from the
descriptor, keeping the array  elements in one heap (or portion thereof) and the
descriptors in another. This permits you to share the same data among arrays of
different rank or shape. For example, that would let you store the
chessboard and its ravelled list as the same array.

Another good thing to do is to maintain an "array coordinate map" [See Guibas &
Wyatt
POPL78] that extends this scheme so that common array operations, such as
transpose, reversal, drop, and row indexing, can be done in fixed time, with no
data movement regardless of the size of the arrays involved.

There's more, but that should give you a start...

Bob

Quote:

> I'm a computer major student. Part of my assignment is to design a
> storage managment system for a heap of variaable-size blocks under
> assumption that the heap is used only to store arrays of real numbers.
> This assumption is basicaly the one behind APL data storage.
>         Can anybody please tell me where can I find the APL data description?


> Thank you! Alex.



Fri, 14 Apr 2000 03:00:00 GMT  
 
 [ 3 post ] 

 Relevant Pages 

1. APL data storage

2. Toronto APL SIG meeting - June 23 - APL and OLAP/Data Mining

3. Decimal Data Storage In Btrieve/SSQL

4. Using BLOBs for data storage

5. Eiffel and "raw" data storage

6. Data Storage

7. writing data structures to disk/solid storage

8. Ada Sage Display (Form) Data Storage

9. Newbie looking for info on using all of a 3.5 disk for data storage

10. Local Storage and DATA(24|31)

11. Address of data in Working Storage

12. Cobol / Informix Data Storage Error - Help needed

 

 
Powered by phpBB® Forum Software