TechTips: How do shared files work?

One of the most common techniques used by personal-computer databases is
to store information in shared disk-files.  Two or more workstations
have the same set of files "open" for writing at the same time; the
information written to the file by any workstation can be read
immediately by any other.

Shared disk files can be implemented on any PC network, and require no
dedicated database-server.  This is exactly the sort of configuration
that is needed for many applications, even though database-servers are
becoming more commonplace these days.  If you're sharing data among
fewer than 30 users, all within a self-contained network not
accessible to the outside world, then this strategy works quite well.

But how does it work?

Well, like all networking, this -is- a simple "client/server"
arrangement, even though that term is not usually applied to it.  The
computer that hosts the disk on which the shared file is stored
ultimately carries out all of the requests ... but it does so
"blindly."  It does not know or care what a shared file actually
contains.  It simply reads what it is told to read, and writes what it
is told to write.

And it "locks" what it is told to lock.  Let me explain.

If workstation 'A' wants to update a record in a file, it will do so by
(say) reading the record, changing it, and writing it back.  To be sure
that this operation takes place successfully and is not interfered with
by workstation 'B', workstation 'A' must also use locking:
        * Lock record #1
        * Read record #1
        * Write updated record #1
        * Unlock
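
Here is a minimal Delphi sketch of that sequence, using the Win32
LockFile/UnlockFile calls on a byte range of a shared file.  The
50-byte record size and flat-file layout are assumptions made for the
illustration, not any particular database's format:

        uses Windows, SysUtils;

        const
          RecSize = 50;                  { assumed record length }

        procedure UpdateRecord(const FileName: string; RecNo: Integer);
        var
          H: THandle;
          Offset, Transferred: DWORD;
          Buf: array[0..RecSize-1] of Byte;
        begin
          { open read/write, letting other workstations do the same }
          H := CreateFile(PChar(FileName),
                          GENERIC_READ or GENERIC_WRITE,
                          FILE_SHARE_READ or FILE_SHARE_WRITE, nil,
                          OPEN_EXISTING, 0, 0);
          if H = INVALID_HANDLE_VALUE then RaiseLastWin32Error;
          try
            Offset := DWORD(RecNo) * RecSize;
            { * Lock record: fails if another station holds it }
            if not LockFile(H, Offset, 0, RecSize, 0) then
              RaiseLastWin32Error;
            try
              { * Read record }
              SetFilePointer(H, Offset, nil, FILE_BEGIN);
              ReadFile(H, Buf, RecSize, Transferred, nil);
              { ... change Buf here ... }
              { * Write updated record }
              SetFilePointer(H, Offset, nil, FILE_BEGIN);
              WriteFile(H, Buf, RecSize, Transferred, nil);
            finally
              { * Unlock: others may now re-read this region }
              UnlockFile(H, Offset, 0, RecSize, 0);
            end;
          finally
            CloseHandle(H);
          end;
        end;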

This locking serves two purposes:  (1) it ensures that, once the lock
has been granted, no other workstation will update the same region of
the file; and (2) it signals that the locked region of the file, once
it has been unlocked, is presumed to have changed.  This is important
because of "caching."

Let's say for example that records in this file are very small:  they're
only 50 bytes long.  Well, if we're going to read 50 bytes, why don't we
make good use of the network-connection and read 500 bytes, or 5000, and
hold the other records locally (in a "cache") and dole them out to the
program when needed?  The operating system on our workstation can
certainly do that, and eliminate 10 or 100 or more separate
network-requests in doing so, but =only= if it also has a way to know
when any of the other data it read has become "out of date."

If your workstation, upon being asked to read record #1, goes ahead and
reads a block of records #1-#10, then it can dole the other 9 records
out to you as long as it is somehow able to know if anyone else changes
any of those records (in which case your workstation will have to
request the new data).  The lock/unlock messages that get sent down the
wire enable the various workstations to know that.
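
As an illustration only -- no real redirector is remotely this simple
-- here is a sketch of such a client-side cache.  The TRecordBuf type,
the ten-record block size, and the FetchBlockFromServer stub are all
made up for the example:

        type
          TRecordBuf = array[0..49] of Byte;   { the 50-byte records }
          TBlock = array[0..9] of TRecordBuf;  { ten records per fetch }

        var
          CachedBlockNo: Integer = -1;         { which block we hold }
          CachedBlock: TBlock;

        procedure FetchBlockFromServer(BlockNo: Integer;
          var Block: TBlock);
        begin
          { placeholder: in reality, one 500-byte network read }
        end;

        function ReadRecord(RecNo: Integer): TRecordBuf;
        begin
          if RecNo div 10 <> CachedBlockNo then
          begin
            { one request brings in this record and its 9 neighbors }
            CachedBlockNo := RecNo div 10;
            FetchBlockFromServer(CachedBlockNo, CachedBlock);
          end;
          Result := CachedBlock[RecNo mod 10]; { served locally }
        end;

        procedure OnRegionUnlocked(RecNo: Integer);
        begin
          { another workstation unlocked this record: presume it
            changed, so any cached copy of its block is now stale }
          if RecNo div 10 = CachedBlockNo then
            CachedBlockNo := -1;
        end;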

As you may expect, this caching process is both very important to
network performance AND very complicated.  Success is extremely
timing-dependent.  It relies upon coordination between the file server
and each and every workstation.  If any of the parties "gets it
wrong," however slightly or briefly, the system can break down.

And, historically, it has.  Early versions of SMARTDRV for Windows 3.x
had problems; the VREDIR.VXD on early Windows 9x had problems; the
"opportunistic locking" scheme on early Windows NT 4.x systems had
problems.  All of those problems have been fixed ... but workstations
and servers which -don't- carry those fixes are still out there, maybe
on your network!

[Incidentally, these caching systems have to work a whole lot harder
if your network has faulty cabling, a marginal hub or router, a loose
cable plug, a marginal network-card, or anything else that could
introduce errors into the system.  These problems are difficult to
diagnose without proper testing equipment ... which is expensive, but
which any network-installer (company) worth its salt will own.]

This is why it's important to be sure that the -entire- hardware and
software configuration throughout your network is both up-to-date and,
especially, consistent.  "It only takes one instrument being out of tune
to spoil an entire symphony."  You should also keep a diary, in which
you record everything you do to the network and everything that anyone
observes.  Problems usually arise either (a) immediately after a change
has been made; or (b) as a result of a hardware malfunction or
developing hardware issue.

Even though our company produces a database-repair tool, one that's
used in over 20 countries around the world, the fact remains that
file-sharing networks are, and should be, very reliable.  On any
properly-configured network with current and consistent software
versions, you should be able to share database files dependably.
Problems will occur from time to time, but they should never be
chronic.

------------------------------------------------------------------
Sundial Services :: Scottsdale, AZ (USA) :: (480) 946-8259

Quote:
> Fast(!), automatic table-repair with two clicks of the mouse!
> ChimneySweep(R):  "Click click, it's fixed!" {tm}
> http://www.sundialservices.com/products/chimneysweep



Sun, 31 Aug 2003 01:40:44 GMT  
 TechTips: How do shared files work?
I have to say....  I enjoyed that.... I always like it when complex ideas
are leveled to a nice read!

Thank you... Hope you do something like this again.

Sincerely,
Rodney Wise





Sun, 31 Aug 2003 06:45:52 GMT  
 TechTips: How do shared files work?


Quote:
> One of the most common techniques used by personal-computer databases is
> to store information in shared disk-files.  Two or more workstations
> have the same set of files "open" for writing at the same time; the
> information written to the file by any workstation can be read
> immediately by any other.

> Shared disk files can be implemented on any PC network, and require no
> dedicated database-server.  This is exactly the sort of configuration
> that is needed for many applications, even though database-servers are
> becoming more commonplace these days.  If you're sharing data among less
> than 30 users, all within a self-contained network not accessible to the
> outside world, then this strategy works quite well.

What you describe here seems much like the primitives of a
network-enabled local database, like Access or Paradox.  I believed
they were bad enough to make you want a database server in any
multi-user environment.
I can't immediately see the big gain from doing this when free,
light-weight database servers like MySQL and InterBase do the same
thing more safely, more easily, and with more grace... the task of
creating a fail-safe, file-sharing "pseudo-database" system would not
be my first wish.  Additionally, as a rule of thumb, everything built
on the good ol' "file of records" should be abandoned, IMHO.  For one
thing, it is a true nightmare when development requires changes.

--
Bjoerge Saether
Consultant / Developer
http://www.itte.no
Asker, Norway



Sun, 31 Aug 2003 09:42:32 GMT  
 TechTips: How do shared files work?
Rodney,

There is a series of the 'Tech Tips' here in this newsgroup. Try one
of the archives like deja.com and search for 'Tech Tips'.  Also, you
might take a look at their website, http://sundialservices.com

They have been one of the best for all of us in Delphi land :-)

Dan

On Tue, 13 Mar 2001 17:45:52 -0500, "Rodney Wise"

Quote:

>I have to say....  I enjoyed that.... I always like it when complex ideas
>are leveled to a nice read!

>Thank you... Hope you do something like this again.

>Sincerely,
>Rodney Wise

--
Dan Brennand
CMDC systems, inc.
Configuration Management and Document Control:
visit us at www.cmdcsystems.com



Sun, 31 Aug 2003 12:55:42 GMT  
 TechTips: How do shared files work?
In certain environments, Bjørge, I heartily agree with you.  For
instance, when the customer's environment is fully supported by
computer consultants or an I.S. department that can take care of the
details ... a database server works very well.  There are quite a few
customer-sites that we maintain to this day which use these servers,
and for which, I think, anything less would be inappropriate.

But, nonetheless, shared-file technology still occupies a sizeable
niche in the marketplace and in the installed base, because it
requires no separate server maintenance at all.  These are not the
sort of installations where 20 or 30 users hammer transactions into
the system constantly throughout the day:  they're the more typical
small-business scenario, where the transaction volume is much smaller
and many operations are simple lookups.

So ... without disagreeing with you at all ... I think there is, and
that there always will be, considerable room for both.  

Quote:

> What you describe here seems much like the primitives of a network-enabled
> local database, like Access or Paradox. I believed they were bad enough to
> make you want a database server in multi-user environments.
> Can't immediately see the big gain from doing this when free, light-weight
> databaseservers like MySQL and Interbase are doing this both safer, easier
> and with more grace...the task of creating a fail-safe, file sharing
> "pseudo-database" system would not be my first wish. Additionally, as a rule
> of thumb, everything with good ol' file of records should be abandoned, IMHO.
> A true nightmare as development requires changes, for one.

------------------------------------------------------------------
Sundial Services :: Scottsdale, AZ (USA) :: (480) 946-8259

Quote:
> Fast(!), automatic table-repair with two clicks of the mouse!
> ChimneySweep(R):  "Click click, it's fixed!" {tm}
> http://www.sundialservices.com/products/chimneysweep



Thu, 04 Sep 2003 13:16:15 GMT  
 TechTips: How do shared files work?
There are a few further things to add to this:

-  Sundial's point that a fully-fledged database server is
inappropriate in many smaller applications is absolutely correct.

-  In many cases a third-party database is totally inappropriate for
an application - the more you know about your data, the better you can
'tune' things.

- Allowing multiple applications to directly access raw data files is
downright dangerous.

- It is quite easy to have a separate - hidden - process running on
one of the PCs on a network that communicates with the Apps on the
other PCs via TCP/IP, small files, DCOM (Yuk), or network messages.
Only one program ever touches the data files, all locking is logical,
and only one Data Access component/App is distributed.  It can run on
a dedicated machine or be kicked off on one of the 'Client' PCs.  (A
sketch of this idea follows below.)

- This makes a simple, easy-to-maintain system for smallish multi-user
Apps.

- BTW - distributed data - i.e. keeping local copies of the central
data, updated from the central server - is also pretty viable.

Sure, you will not be able to run NASA on this type of system, but the
majority of systems are actually pretty small.
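
A minimal sketch of that single-gateway idea, using the TServerSocket
component from Delphi's ScktComp unit.  The port number and the
ExecuteRequest routine are assumptions made up for the illustration:

        uses Classes, ScktComp;

        type
          TGateway = class(TComponent)
          private
            FServer: TServerSocket;
            procedure ClientRead(Sender: TObject;
              Socket: TCustomWinSocket);
          public
            constructor Create(AOwner: TComponent); override;
          end;

        function ExecuteRequest(const Cmd: string): string;
        begin
          Result := '';  { placeholder: parse Cmd, touch the data
                           files, build the reply }
        end;

        constructor TGateway.Create(AOwner: TComponent);
        begin
          inherited Create(AOwner);
          FServer := TServerSocket.Create(Self);
          FServer.Port := 5000;              { assumed port }
          FServer.OnClientRead := ClientRead;
          FServer.Active := True;            { listen for client Apps }
        end;

        procedure TGateway.ClientRead(Sender: TObject;
          Socket: TCustomWinSocket);
        begin
          { every request funnels through this one process, so all
            "locking" is just program logic; no other program ever
            opens the data files }
          Socket.SendText(ExecuteRequest(Socket.ReceiveText));
        end;

(The component relies on the usual window-message loop, so this runs
inside a normal Delphi application.)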




Fri, 05 Sep 2003 00:56:12 GMT  
 TechTips: How do shared files work?
You bring up an interesting and very valid point when you mention
"DCOM," even though you say "yuck."

It is certainly straightforward to use Delphi and DCOM technology
(which is now built into Windows) to construct applications that open
objects on remote servers, and likewise to build the remote-server
applications that support those objects.  Windows itself handles many
variations of the necessary thread-control, load-balancing and so
forth.  It also handles the underlying communications between the two
machines.

DCOM objects, running on the selected host machine or machines, can
use various forms of threading, queueing, and locking -- all within
the context of a single machine.  This reduces the points of
hardware-related failure to that one machine.
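
As a minimal sketch, assuming a server application whose COM class is
already registered, plus a type-library import unit (the
StockServer_TLB names here are made-up placeholders, not a real
library):

        uses ComObj, StockServer_TLB;  { hypothetical _TLB unit }

        var
          Srv: IStockServer;
        begin
          { DCOM locates/launches the object on the named machine }
          Srv := CreateRemoteComObject('FILESERVER',
                   CLASS_StockServer) as IStockServer;
          Srv.UpdateRecord(1);  { executes on the server, not here }
        end;

CreateRemoteComObject is the real Delphi ComObj call; everything else
in the example is assumed.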

This is the essence of the so-called "thin client" model, where the
client machine does very little.  It is not (as usual...) a panacea or
a silver bullet, but it =is= something that existing, off-the-shelf or
built-in Windows NT technology makes considerably easier than you
might think.

:-O  Ohmygosh .. I think I just complimented Microsoft!  :-O  ;-)

Quote:

> - Allowing multiple applications to directly access raw data files is
> downright dangerous

> - It is quite easy to have a separate - hidden - process running on
> one of the PCs on a network that communicates via TCP/IP, small files,
> DCOM (Yuk), network messages -  with Apps on other PCs - only one
> program ever touches the data files, all locking is logical - only one
> Data Access component/App is distributed. It can run on a dedicated
> machine or be kicked off on one of 'Client' PCs

> - A simple easy to maintain system for smallish multi user Apps

> - BTW - distributed data - ie: keeping local copies of the central
> data - updated from the central server - is also pretty viable.

> Sure you will not be able to run NASA on this type of system, but the
> majority of systems are actually pretty small.

------------------------------------------------------------------
Sundial Services :: Scottsdale, AZ (USA) :: (480) 946-8259

Quote:
> Fast(!), automatic table-repair with two clicks of the mouse!
> ChimneySweep(R):  "Click click, it's fixed!" {tm}
> http://www.sundialservices.com/products/chimneysweep



Fri, 05 Sep 2003 03:45:25 GMT  
 TechTips: How do shared files work?
We run ChimneySweep nightly to clean up our 418-table system.  Our
day-to-day performance with up to 5 users is excellent.

Several times per month we note that CS has had to repair a "resync"
problem.  I wonder what causes this particular problem and how to
avoid it.

--
Jean Friedberg





Sun, 14 Sep 2003 22:02:52 GMT  
 TechTips: How do shared files work?
Each of the files in a Paradox table has a "change counter" in its
header, and every occurrence of the same counter should match exactly.
When they do not, the index or file is "out of date."

We discovered that, in a very significant number of cases, the counter
is the only thing that is actually wrong; the data itself is
(provably) good.  When this occurs, ChimneySweep resolves the problem
by resynchronizing the counters.

When the counters are wrong and the index data is also truly wrong,
ChimneySweep can temporarily resync the counters to allow the table to
be opened for indexing purposes, and then re-index the table.

Both of these processes are considerably faster than TUtility's
pedantic data copying and, usually, just as effective.
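
To make the idea concrete, a sketch of such a counter check -- with
the loud caveat that CounterOffset here is a made-up number for the
illustration, not the real Paradox header layout:

        uses Classes, SysUtils;

        const
          CounterOffset = $2D;  { HYPOTHETICAL offset, example only }

        function ReadChangeCounter(const FileName: string): LongInt;
        var
          S: TFileStream;
        begin
          { open without denying anyone else access, as a checker
            running against live shared files must }
          S := TFileStream.Create(FileName,
                 fmOpenRead or fmShareDenyNone);
          try
            S.Position := CounterOffset;
            S.ReadBuffer(Result, SizeOf(Result));
          finally
            S.Free;
          end;
        end;

        { a table is "in sync" when every one of its files reports
          the same value, e.g.
          ReadChangeCounter('CUST.DB') = ReadChangeCounter('CUST.PX') }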

I'd suggest that you observe the problem to determine whether only
particular tables are showing the resync issue, or whether it appears
to be more random.  Perhaps there is an undisclosed program-flaw in
your application, or users are resetting their computers at just the
wrong moment, or something along those lines.  An observable,
repeatable pattern (if there is one) is the first strong clue.

Quote:

> We run Chimney Sweep nightly to clean up our 418 table system.  Our
> day-to-day performance with up to 5 users is excellent.

> Several time per month we note that CS has had to repair a "resync" problem.
> Wonder what causes this particular problem and how to avoid it.

> --
> Jean Friedberg


------------------------------------------------------------------
Sundial Services :: Scottsdale, AZ (USA) :: (480) 946-8259

Quote:
> Fast(!), automatic table-repair with two clicks of the mouse!
> ChimneySweep(R):  "Click click, it's fixed!" {tm}
> http://www.sundialservices.com/products/chimneysweep



Mon, 15 Sep 2003 11:32:11 GMT  
 