std blocks vs blocks+cache ( was: block behavior) 

Quote:




>> >> > 4IM is intentionally not ANS compliant in many ways.  BUFFER is a
>> >> > constant that points to the beginning of the only memory area where
>> >> > a BLOCK is read into.  BLOCK takes a number but doesn't leave anything
>> >> > on the stack.  There's a separate cache that keeps a block from being
>> >> > read from disk more than once, but you can only access the current
>> >> > block, and only at BUFFER.

>> >> Oh dear.

>> > Please elaborate. Do you think it is worse than the classic behaviour
>> > or worse than the behaviour of the standard?

>Yes, much worse than both (they aren't mutually inconsistent).

>> Given that the standard more or less codifies the classic behaviour
>> this is certainly worse, yes.  Having only a single buffer results in
>> some very {*filter*} behaviour.  Load blocks don't work too well, for
>> example.  Database indexes too.

From the standard:

7.3.3 Block buffer regions
The address of a block buffer returned by BLOCK or BUFFER is transient.
A call to BLOCK or BUFFER may render a previously-obtained block-buffer
address invalid, as may a call to any word that:

parses [...]
displays characters on the user output device [...]
controls the user output device, [...]
receives or tests for the presence of characters from the user input
device such as  [...]
waits for a condition or event, such as [...]
manages the block buffers, such as  [...]
performs any operation on a file or file-name directory that implies
I/O, such as [...]
implicitly performs I/O, such as text interpreter nesting and
un-nesting when files are being used (including un-nesting implied by
THROW [...]

If I interpret this correctly this means that you have, among other
things, to save a copy of the buffer if you want to print it out.
A standard system should also not rely on the availability of multiple
buffers in the system. In other words the user has to copy the buffer
to a safe place because it may vanish if the washing machine turns on.
If one takes advantage of particular knowledge of one's system, in
particular the conditions under which the buffer actually goes fishing
and how many buffers can be accessed simultaneously, things are better
because one can spare the cost of the copy thanks to informed
assumptions.

Quote:
>The classic/standard behavior is the result of efforts over the years
>to produce optimal performance and utility.  Actual reads and writes

I'm not sensitive to this kind of argument. There are several obsolete
words in the standard.

Quote:
>are minimized (because that's what determines performance), and
>provision is made for a number of variations that support further
>optimization based on knowledge of how a system will be used.
>For example, in a large multiuser system, a large number of buffers
>will ensure that more blocks can remain in memory longer.  When
>there will be many sequential operations, however, a small number
>of buffers (e.g. 2) gives the best performance.  A single buffer
>such as you have pretty much ensures the worst possible performance.

My interpretation of the standard is wrong, then?
"A call to BLOCK or BUFFER may render a previously-obtained block-buffer
address invalid, as may a call to any word that: [wash dishes]"

Quote:
>Cheers,
>Elizabeth

The interface I have adopted for BUFFER and BLOCK is designed as if only
one buffer is available. It greatly simplifies user applications
because the block buffer is at a constant location.
On the system side, this does not prevent having a block cache to
guarantee good performance. I have implemented such a cache on the
standalone version. It features write-through and read-ahead, and it works
pretty well. In my experience it also simplifies the cache code.
For the DOS version I did not bother with a disk cache,
because an external one (SmartDrive, for instance) can be used.

Besides, I think that the cost of the buffer copy made necessary by
the lack of multiple block buffers is offset by the fact that one
doesn't have to drag addresses in and out or make multiple calls to
BUFFER. If someone would be so kind as to provide me with sample code
for a simple database management application, I could verify this myself.

 Amicalement,
  Astrobe



Sat, 15 Oct 2005 20:06:10 GMT  

Quote:

> The interface I have adopted for BUFFER and BLOCK is designed as if only
> one buffer is available. It greatly simplifies user applications
> because the block buffer is at a constant location.

If your code has the dependency on single buffer at constant location,
your code will most likely break on other systems.

I myself (years ago) found it valuable to have at least 3 buffers:
2 are needed to copy one block to the other and 1 more is needed
for input source interpretation.

As to writing portable code, you have to call BLOCK before [...]
you cannot rely on 2 blocks being in memory
at the same time. BTW, even if you have 2 buffers, you can UPDATE
only one block.

For example, the code for exchanging contents of 2 blocks becomes:

\ working code
: >blk< ( n1 n2 -- )
    DUP BLOCK HERE B/BUF MOVE
    OVER BLOCK OVER BLOCK B/BUF MOVE UPDATE
    DROP
    HERE SWAP BLOCK B/BUF MOVE UPDATE
;

Another possible approach is

\ !!!UNTESTED!!!
: >blk< ( m n -- )
   B/BUF 0
   DO ( m n )

      2OVER DROP BLOCK I + C! UPDATE ( m n m.c[i] )
      OVER BLOCK I + C! UPDATE
   LOOP
;

with 4096 calls to BLOCK per invocation.

Quote:
> On the system side, this does not prevent having a block cache to
> guarantee good performance. I have implemented such a cache on the
> standalone version.

I think the manufacturer of your HDD has already implemented a cache of
2M bytes.

Quote:
> If someone would be so kind as to provide me with sample code
> for a simple database management application, I could verify this myself.

Do you mean block or DB code?
If you are asking for a sample BLOCK word set implementation,
there are many around.


Sat, 15 Oct 2005 23:45:48 GMT  

Quote:

>> The interface I have adopted for BUFFER and BLOCK is designed as if only
>> one buffer is available. It greatly simplifies user applications
>> because the block buffer is at a constant location.
> If your code has the dependency on single buffer at constant location,
> your code will most likely break on other systems.

Not if it's correct.

Quote:
> I myself (years ago) found it valuable to have at least 3 buffers:
> 2 are needed to copy one block to the other and 1 more is needed
> for input source interpretation.
> As to writing portable code, you have to call BLOCK before [...]
> you cannot rely on 2 blocks being in memory
> at the same time. BTW, even if you have 2 buffers, you can UPDATE
> only one block.
> For example, the code for exchanging contents of 2 blocks becomes:

Uh, this is incorrect.

Quote:
> : >blk< ( n1 n2 -- )
>     DUP BLOCK HERE B/BUF MOVE
>     OVER BLOCK OVER BLOCK

                        ^  
   there's a task switch here -- your first block has gone

Quote:
>     B/BUF MOVE UPDATE
>     DROP
>     HERE SWAP BLOCK B/BUF MOVE UPDATE
> ;

Above you say "you cannot rely on 2 blocks being in memory at the same
time", so you know this can't work.
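
A portable way to do the exchange can be sketched as follows, under the assumption that at least 2 * 1024 bytes are free at HERE and that no other task uses that area. Both blocks are copied out first and then written back, so no buffer address is held across another call to BLOCK. SWAP-BLOCKS is a hypothetical name, and B/BUF is defined here because it is not a standard word.

```forth
1024 CONSTANT B/BUF   \ assumption: standard 1024-character blocks

\ Exchange the contents of blocks n1 and n2 (hypothetical word).
\ Every address returned by BLOCK is used at once and never kept.
: SWAP-BLOCKS ( n1 n2 -- )
   DUP  BLOCK HERE          B/BUF MOVE          \ n2 -> scratch A
   OVER BLOCK HERE B/BUF +  B/BUF MOVE          \ n1 -> scratch B
   BLOCK HERE B/BUF + SWAP  B/BUF MOVE UPDATE   \ scratch B -> n2
   BLOCK HERE SWAP          B/BUF MOVE UPDATE ; \ scratch A -> n1
```

This costs four calls to BLOCK and four 1024-byte moves per exchange, and it behaves the same whether the system has one buffer or many.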

Andrew.



Sat, 15 Oct 2005 23:07:38 GMT  

Quote:

> From the standard:
> 7.3.3 Block buffer regions
> The address of a block buffer returned by
> BLOCK or
> BUFFER is transient.
> A call to BLOCK or BUFFER may render a previously-obtained
> block-buffer
> address invalid, as may a call to any word that:
> parses [...]
> displays characters on the user output device [...]
> controls the user output device, [...]
> receives or tests for the presence of characters from the user
> input device such as  [...]
> waits for a condition or event, such as [...]
> manages the block buffers, such as  [...]
> performs any operation on a file or file-name directory that
> implies I/O, such as [...]
> implicitly performs I/O, such as text interpreter nesting and
> un-nesting when files are being used (including un-nesting implied
> by THROW [...]
> If I interpret this correctly this means that you have, among other
> things, to save a copy of the buffer if you want to print it out.

No, it doesn't.

For example, you can do 26 LIST and it does BLOCK and displays the
buffer as text.  Say you want to do it yourself, for some reason you
only want to display lines 3 and 12.

: .LINE ( block# line# -- )
   64 CHARS *
   SWAP BLOCK + 64 TYPE ;

: MY-LIST ( block# -- )
   DUP CR 3 .LINE CR 12 .LINE ;

You don't save the buffer contents, you just get them fresh just
before you use them.  Any time you do something that might mess up the
buffer, grab it again.

The list of things that might mess up the buffer are the same things
that will do PAUSE and a task switch on a traditional cooperative
multitasker.  When you do BLOCK you get a task switch, and when your
task starts up again your buffer is ready.  Then you can use it for
awhile, and the first time you do any sort of I/O some other task
starts up and maybe replaces your block.  (I'm not sure that's how it
works but it looks plausible.)  So you get your block back and use
it.  No big deal.

No matter how many block buffers you have, you can't be sure that some
block buffer you care about won't get overwritten.  So it makes sense
to do BLOCK before you use the buffer, each time.  

Quote:
> A standard system should also not rely on the availability of
> multiple buffers in the system. In other words the user has to copy
> the buffer to a safe place because it may vanish if the washing
> machine turns on.

The washing machine won't turn on until you do PAUSE .  Your buffer
will last until after you've used it, unless you mistakenly do a CR or
something before you use it.

The Korean hForth was the first one I saw that used a single block
buffer.  I was surprised to find it can work with just one buffer, but
it can.  The thing that gave me the most trouble about a single block
buffer was that sometimes I wanted to MOVE data from one block to
another.  That just doesn't work with one buffer, although it works
when you have 2 buffers.



Sat, 15 Oct 2005 23:31:38 GMT  


Quote:
> The technique is not to keep track of the address of the buffer but to
> call BLOCK with the block number whenever the address is needed.

I believe I understand now what to expect when using 4IM.  Your statement
above makes me curious about what happens in other FORTHs which are more
aligned to the ANSI standard.

If one uses BLOCK to retrieve the address of the buffer, what happens to
changes you have made to the buffer but have not necessarily decided to
commit via UPDATE yet?  Assuming no other disk I/O is done in-between, is
each call to BLOCK guaranteed to return the same buffer address?  Will that
buffer still contain your uncommitted changes or will it always be reloaded
with the original disk contents?

Quote:
> > On the system side, this does not prevent having a block cache to
> > guarantee good performance.

> Sure it does.  If there is only one buffer, you have no cache.
> Unless, of course, you want to do yet another layer of buffering
> beneath the Forth buffers, and what's the point of that?  You've made
> the system more complex for no gain.

For what it is worth, I'm thinking there may be an advantage for my intended
use.  It appears that Astrobe has minimized the required footprint in the
x86 64K code/data segment.  For me, this is going to be a big deal and had I
started with a system which allocated multiple buffers then I would have
ended up removing them anyway sooner or later.  On the other hand, I would
have gladly lived without a disk cache.  Now, 4IM has me thinking that the
disk cache buffers can just live up in all those other unused 64K segments
similar to what would happen if I was running DOS and a RAMDISK in HIGH
memory.

I could be wrong about this...I'm still real early in the process here.



Sun, 16 Oct 2005 00:02:36 GMT  

Quote:

> If one uses BLOCK to retrieve the address of the buffer, what
> happens to changes you have made to the buffer but have not
> necessarily decided to commit via UPDATE yet?

You'll lose them.  Or rather, you might lose them.

Quote:
> Assuming no other disk I/O is done in-between, is each call to BLOCK
> guaranteed to return the same buffer address?  Will that buffer
> still contain your uncommitted changes or will it always be reloaded
> with the original disk contents?

It depends on what other system activity is going on.

Andrew.



Sun, 16 Oct 2005 00:14:15 GMT  

Quote:

> No, it doesn't.
> For example, you can do 26 LIST and it does BLOCK and displays the
> buffer as text.  Say you want to do it yourself, for some reason you
> only want to display lines 3 and 12.
> : .LINE ( block# line# -- )
>    64 CHARS *
>    SWAP BLOCK + 64 TYPE ;
> : MY-LIST ( block# -- )
>    DUP CR 3 .LINE CR 12 .LINE ;
> You don't save the buffer contents, you just get them fresh just
> before you use them.  Any time you do something that might mess up the
> buffer, grab it again.

I don't think it's safe to TYPE out of a buffer.

Andrew.



Sun, 16 Oct 2005 00:15:33 GMT  

Quote:


>> ...
>>For example, you can do 26 LIST and it does BLOCK and displays the
>>buffer as text.  Say you want to do it yourself, for some reason you
>>only want to display lines 3 and 12.

>>: .LINE ( block# line# -- )
>>   64 CHARS *
>>   SWAP BLOCK + 64 TYPE ;

>>: MY-LIST ( block# -- )
>>   DUP CR 3 .LINE CR 12 .LINE ;

>>You don't save the buffer contents, you just get them fresh just
>>before you use them.  Any time you do something that might mess up the
>>buffer, grab it again.

> I don't think it's safe to TYPE out of a buffer.

To amplify Andrew's comment a little, the whole issue here revolves
around the historical principle that in a multiuser Forth the users
share the buffer pool.  A buffer address is valid so long as you
don't do anything that can let another user request a buffer (and
maybe get yours).  The list of prohibitions in the standard includes
words that may involve a task swap.

In the case of TYPE, a multitasking system would certainly
relinquish the CPU for the typing.  Therefore, your buffer
address is no longer valid.

Our approach has been to make a word >TYPE that can be used to
type from a buffer, moving the string to be typed to a temporary
location.

Jonah is correct in observing that one should call BLOCK for
every component you want, rather than assuming your buffer
address will persist over time.
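
A minimal sketch of such a word, assuming the string fits in PAD (the standard guarantees only 84 characters there, which is enough for one 64-character line) and that PAD is private to the task. This illustrates the idea only and is not FORTH, Inc.'s actual definition; SAFE-LINE is a hypothetical name.

```forth
\ Copy the string out of the block buffer, then type the copy.
: >TYPE ( c-addr u -- )
   >R  PAD R@ MOVE      \ copy u characters to PAD before any I/O happens
   PAD R> TYPE ;        \ TYPE may task-switch; the buffer is no longer needed

\ Display one 64-character line of a block by way of the copy.
: SAFE-LINE ( block# line# -- )
   64 * SWAP BLOCK +  64 >TYPE ;
```

With these, a word like MY-LIST can call SAFE-LINE in place of .LINE and no longer depends on how TYPE is scheduled.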

Cheers,
Elizabeth

--
==================================================
Elizabeth D. Rather   (US & Canada)   800-55-FORTH
FORTH Inc.                         +1 310-491-3356
5155 W. Rosecrans Ave. #1018  Fax: +1 310-978-9454
Hawthorne, CA 90250
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
==================================================



Sun, 16 Oct 2005 01:30:23 GMT  

Quote:

> [summary of Standard cautions snipped]
>>If I interpret this correctly this means that you have, among other
>>things, to save a copy of the buffer
>>if you want to print it out.

As noted above, a better approach is to buffer each string you're
typing.

Quote:
> A standard system should also not rely on
>>the availability of multiple buffers in the system.

> This isn't a standards compliance issue.  This is a quality of
> implementation issue.  A system with only one block buffer will in
> many cases perform poorly.

Moreover, a _system_ can depend on its own implementation
characteristics.  It's only _programs_ that wish to be
portable that need to be sensitive about portability
issues.  A system should provide whatever resources it
thinks can guarantee good performance for users.

Quote:
>>>The classic/standard behavior is the result of efforts over the years
>>>to produce optimal performance and utility.  Actual reads and writes

>>I'm not sensitive to this kind of argument.

A Standard isn't a tutorial.  Many books on Forth (including
mine) explain the assumptions around the block I/O concept.

Quote:
>>>are minimized (because that's what determines performance), and
>>>provision is made for a number of variations that support further
>>>optimization based on knowledge of how a system will be used.
>>>For example, in a large multiuser system, a large number of buffers
>>>will ensure that more blocks can remain in memory longer.  When
>>>there will be many sequential operations, however, a small number
>>>of buffers (e.g. 2) gives the best performance.  A single buffer
>>>such as you have pretty much ensures the worst possible performance.

>>My interpretation of the standard is wrong, then?
>>"A call to BLOCK or BUFFER may render a previously-obtained
>>block-buffer address invalid, as may a call to any word that:
>>[wash dishes]"

The key word here is "may".  In a multiuser system, it's a statistical
game that depends on what other users are doing.  A block will stay in a
buffer until it is reused for another block; therefore, repeated calls
to BLOCK for a block# which is already in a buffer won't cause I/O.
In a single-user system, there's no competition for buffers.

Quote:
>>The interface I have adopted for BUFFER and BLOCK is designed as if only
>>one buffer is available. It greatly simplifies user applications
>>because the block buffer is at a constant location.

> It doesn't, unless the application is exceedingly badly written.  The
> technique is not to keep track of the address of the buffer but to
> call BLOCK with the block number whenever the address is needed.

Exactly.  The application can't assume the address is constant without
having a dependency on the unusual implementation strategy of having
only one buffer.  And if you're doing something like copying data from
one block to another, having only one buffer will slow you down a lot.
As I noted above, repeated calls to BLOCK for a block already in a
buffer don't cause more I/O; they should be extremely quick.
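
To illustrate the technique for the kind of database code asked about earlier in the thread, here is a sketch of a record accessor that calls BLOCK on every access instead of remembering an address. The names /REC, FIRST-BLK, RECORD and FIELD@, the 64-byte record size and the starting block number are all assumptions made for the example.

```forth
64 CONSTANT /REC         \ assumed record size: 16 records per 1024-byte block
10 CONSTANT FIRST-BLK    \ assumed first block of the data area

\ Address of record n, valid only until the next I/O or task switch.
: RECORD ( n -- addr )
   /REC *  1024 /MOD     \ byte offset -> offset-in-block, block index
   FIRST-BLK + BLOCK + ;

\ Fetch one cell at a cell-aligned byte offset within record n.
: FIELD@ ( offset n -- x )
   RECORD + @ ;
```

Application code never stores the result of RECORD; it asks for the record again whenever it needs it, which costs no disk access while the block is still in a buffer.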

Quote:
>>On the system side, this does not prevent having a block cache to
>>guarantee good performance.

> Sure it does.  If there is only one buffer, you have no cache.
> Unless, of course, you want to do yet another layer of buffering
> beneath the Forth buffers, and what's the point of that?  You've made
> the system more complex for no gain.

The buffer scheme is intended as a cache.  That's its primary
purpose.

Quote:
>>I have implemented such a cache on the standalone version.

> So why not have a few block buffers and be done with it?  Really, this
> is ridiculous.  Have you ever written an application on a system that
> uses the standard scheme?

Having a single buffer and a separate, invisible cache is an added
layer of complexity and inefficiency that you can avoid by
understanding the main purpose of the buffer scheme to begin
with.

Cheers,
Elizabeth




Sun, 16 Oct 2005 01:51:04 GMT  

Quote:


>>If one uses BLOCK to retrieve the address of the buffer, what
>>happens to changes you have made to the buffer but have not
>>necessarily decided to commit via UPDATE yet?

> You'll lose them.  Or rather, you might lose them.

Exactly.  One should use UPDATE immediately following any
change (!, C!, MOVE, etc.) that you wish to ensure is
recorded permanently.

UPDATE merely sets a flag, it isn't expensive since it
doesn't trigger a write until the buffer is needed for
a different block.
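
As a sketch of that habit, a store into a block can be wrapped so that the UPDATE cannot be forgotten. BLOCK-C! and BLOCK-C@ are hypothetical names, not standard words.

```forth
\ Store char c at a byte offset within block blk and mark the buffer updated.
: BLOCK-C! ( c offset blk -- )
   BLOCK + C!  UPDATE ;   \ UPDATE applies to the buffer BLOCK just returned

\ Fetch the char back; BLOCK is called again rather than keeping the address.
: BLOCK-C@ ( offset blk -- c )
   BLOCK + C@ ;
```

Because nothing that could cause I/O or a task switch sits between the store and the UPDATE, the change is recorded whenever the buffer is later reused.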

Quote:
>>Assuming no other disk I/O is done in-between, is each call to BLOCK
>>guaranteed to return the same buffer address?  

Yes.  In a single-user system, you know whether or not you're
doing other disk I/O.  In a multi-user system, other tasks may
be accessing the buffer pool.  Subsequent calls to BLOCK for
a block already in a buffer should return that address with
no I/O done.

Quote:
>>Will that buffer
>>still contain your uncommitted changes or will it always be reloaded
>>with the original disk contents?

> It depends on what other system activity is going on.

Exactly.  So long as your buffer has not been needed for another
block, everything is unchanged (including your updates).  However,
if you make changes and do _not_ mark the block as UPDATED, if
the buffer is reused changes will not be recorded and the next
time you request the block you'll get the last recorded version.

Cheers,
Elizabeth




Sun, 16 Oct 2005 02:05:54 GMT  



Quote:
> >>Assuming no other disk I/O is done in-between, is each call to BLOCK
> >>guaranteed to return the same buffer address?

> Yes.  In a single-user system, you know whether or not you're
> doing other disk I/O.  In a multi-user system, other tasks may
> be accessing the buffer pool.  Subsequent calls to BLOCK for
> a block already in a buffer should return that address with
> no I/O done.

> >>Will that buffer
> >>still contain your uncommitted changes or will it always be reloaded
> >>with the original disk contents?

> > It depends on what other system activity is going on.

> Exactly.  So long as your buffer has not been needed for another
> block, everything is unchanged (including your updates).  However,
> if you make changes and do _not_ mark the block as UPDATED, if
> the buffer is reused changes will not be recorded and the next
> time you request the block you'll get the last recorded version.

Think I've got it - it all seems to make sense.  I hadn't even considered
the issue of a multi-user/multi-tasking environment.  (Which is why you
won't ever see me on a standards committee...Lord knows that it's hard
enough just doing a good job solving today's problem for today's customer -
I can't imagine trying to pull off the balancing job required to address
almost everybody's concerns across the spectrum of hardware and
environments.  More power to those of you who can contribute in that way!)

Thanks,
Bill



Sun, 16 Oct 2005 02:52:43 GMT  

Quote:


> > No, it doesn't.
> > For example, you can do 26 LIST and it does BLOCK and displays
> > the buffer as text.  Say you want to do it yourself, for some
> > reason you only want to display lines 3 and 12.
> > : .LINE ( block# line# -- )
> >    64 CHARS *
> >    SWAP BLOCK + 64 TYPE ;
> > : MY-LIST ( block# -- )
> >    DUP CR 3 .LINE CR 12 .LINE ;
> > You don't save the buffer contents, you just get them fresh just
> > before you use them.  Any time you do something that might mess
> > up the buffer, grab it again.
> I don't think it's safe to TYPE out of a buffer.

You could be right.  If TYPE is a high-level word that uses EMIT and
EMIT does PAUSE then you're right.  If TYPE is a primitive that does
PAUSE when it's done, then it works.

I figured it *ought* to be safe because it gets too inconvenient if it
isn't.  I could be wrong.



Sun, 16 Oct 2005 02:54:43 GMT  

Quote:
>>I don't think it's safe to TYPE out of a buffer.

> You could be right.  If TYPE is a high-level word that uses EMIT and
> EMIT does PAUSE then you're right.  If TYPE is a primitive that does
> PAUSE when it's done, then it works.

> I figured it *ought* to be safe because it gets too inconvenient if it
> isn't.  I could be wrong.

Among the special circumstances called out in the Standard & quoted at
the beginning of this thread is:

"displays characters on the user output device [...]"

Sounds like TYPE to me.

--
==================================================
Elizabeth D. Rather   (US & Canada)   800-55-FORTH
FORTH Inc.                         +1 310-491-3356
5155 W. Rosecrans Ave. #1018  Fax: +1 310-978-9454
Hawthorne, CA 90250
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
==================================================



Sun, 16 Oct 2005 03:41:59 GMT  

Quote:


>> I don't think it's safe to TYPE out of a buffer.

>You could be right.  If TYPE is a high-level word that uses EMIT and
>EMIT does PAUSE then you're right.  If TYPE is a primitive that does
>PAUSE when it's done, then it works.

With a properly written scheduler, synchronous I/O like TYPE is based
on asynchronous I/O plus a call to a scheduler; i.e., TYPE (or EMIT,
if TYPE is based on that) will be something like:

: TYPE ( addr u -- )
  submit-for-output wait-for-completion ;

where wait-for-completion calls the scheduler, and only returns to the
current thread when the output is complete.

In such a setting it's not safe to TYPE something that might change in
another thread (like a buffer); unless, of course, SUBMIT-FOR-OUTPUT
copies the string to be TYPEd for exactly this reason.

- anton
--
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html



Sun, 16 Oct 2005 03:38:41 GMT  

Quote:


> ...
>>> I don't think it's safe to TYPE out of a buffer.
>> You could be right.  If TYPE is a high-level word that uses EMIT and
>> EMIT does PAUSE then you're right.  If TYPE is a primitive that does
>> PAUSE when it's done, then it works.
>> I figured it *ought* to be safe because it gets too inconvenient if it
>> isn't.  I could be wrong.
> Among the special circumstances called out in the Standard & quoted at
> the beginning of this thread is:
> "displays characters on the user output device [...]"
> Sounds like TYPE to me.

Yes, I wanted to hope that TYPE would do PAUSE after it gets the
string, and not before.

That would be much more convenient for me.



Sun, 16 Oct 2005 05:27:25 GMT  
 