Massively parallel programming for the Internet 
 Massively parallel programming for the Internet

Newsgroups: comp.parallel,comp.lang.misc,comp.protocols.tcp-ip
Summary: Massively parallel programming for the Internet
Followup-To:
Distribution: world
Organization: International Foundation for Internal Freedom
Keywords:

Are there any projects attempting to build a massively parallel
programming environment using spare cycles on the Internet?
Something where you put a really simple interpreter on the
end machines, mid-level cluster managers spaced periodically,
and end-user control stations?

This was inspired by the project to crack large prime numbers by
mailing large lists of numbers around to volunteers running the
sieve program.  Such a thing could be entirely automated.
David Gelernter's "Mirror Worlds" is a clear articulation of this vision.

It's apparent to me that the Internet is going to have to dramatically
increase the sophistication of its internal "metabolic" processes;
parallel computations which monitor traffic and alter routing tables
will have to be done at some point.  An Internet Nervous System if you will.

Social consequences: well, at some point, you may be required to
run the interpreter as part of your Internet Tax.

Back to tech talk: is anybody using multisets as a basis for
parallel programming?  Multisets looked like a possibly pleasant
paradigm for programming the distributed interpretive system.
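
By multiset I mean a bag: an unordered collection that allows repeats.
A toy sketch in Python (every name here is invented) of why that's a
pleasant paradigm for parallel work -- workers just put and take
elements, with nothing to order and nothing to address:

    from collections import Counter

    class Bag:
        def __init__(self):
            self._items = Counter()          # element -> multiplicity

        def put(self, item):                 # add one copy of an element
            self._items[item] += 1

        def take(self):                      # remove and return any element
            for item, count in self._items.items():
                if count > 0:
                    self._items[item] -= 1
                    return item
            return None                      # bag is empty

    bag = Bag()
    for n in (2, 3, 3, 5):                   # repeats allowed: a bag, not a set
        bag.put(n)
    job = bag.take()
    while job is not None:
        print("worker got", job)
        job = bag.take()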

--

Lance Norskog

Artisputtingtogether. Art  s th ow n  aw y.



Mon, 02 Jun 1997 06:45:05 GMT  
 Massively parallel programming for the Internet
|> Are there any projects attempting to build a massively parallel
|> programming environment using spare cycles on the Internet?
|> Something where you put a really simple interpreter on the
|> end machines, mid-level cluster managers spaced periodically,
|> and end-user control stations?

One project along these lines is the Legion project, being headed by some
folks at the U. of Virginia.  They have a Mosaic WWW server that can
provide some additional details:

        http://www.cs.virginia.edu/~mentat/legion/legion.html

Patrick

---------------------------------------------------------------------------
|     Patrick T. Homer     | Dept. of Computer Science |    cat lover     |
| Post-Doc. Research Assoc.| The University of Arizona | backpacker/hiker |

|      (602) 621-4252      | (602) 621-4246 (FAX)      |  sf enthusiast   |
---------------------------------------------------------------------------



Sat, 07 Jun 1997 05:00:18 GMT  
 Massively parallel programming for the Internet

    Lance> Are there any projects attempting to build a massively
    Lance> parallel programming environment using spare cycles on the
    Lance> Internet?  Something where you put a really simple
    Lance> interpreter on the end machines, mid-level cluster managers
    Lance> spaced periodically, and end-user control stations?

    Lance> This was inspired by the project to crack large prime
    Lance> numbers by mailing large lists of numbers around to
    Lance> volunteers running the sieve program.  Such a thing could
    Lance> be entirely automated.  David Gelernter's "Mirror Worlds"
    Lance> is a clear articulation of this vision.

Gelernter is working on this kind of thing at Yale. His system (or one
of his systems) is called Piranha, and it grabs spare cycles off of
the Yale network for computation.

Piranha programs are written in Linda. A description of parallel
programming in Linda can be found in Carriero and Gelernter's
_How_to_Write_Parallel_Programs_.
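
For flavor, here is a toy, single-process sketch of the four Linda
operations (out, in, rd, eval) in Python; the class and the threading
details are my own invention for illustration, not Piranha's actual
machinery:

    import threading

    class TupleSpace:
        # A toy tuple space.  None in a template acts as a wildcard;
        # real Linda also matches on field types.  Illustrative only.
        def __init__(self):
            self._tuples = []                 # the multiset of tuples
            self._cv = threading.Condition()

        def out(self, *t):                    # add a tuple
            with self._cv:
                self._tuples.append(t)
                self._cv.notify_all()

        def _match(self, template, t):
            return len(template) == len(t) and all(
                p is None or p == f for p, f in zip(template, t))

        def _find(self, template, remove):
            with self._cv:
                while True:
                    for t in self._tuples:
                        if self._match(template, t):
                            if remove:
                                self._tuples.remove(t)
                            return t
                    self._cv.wait()           # block until something arrives

        def in_(self, *template):             # withdraw a matching tuple
            return self._find(template, remove=True)

        def rd(self, *template):              # copy a matching tuple
            return self._find(template, remove=False)

        def eval(self, *t):                   # compute a tuple in a new thread
            def worker():
                self.out(*(f() if callable(f) else f for f in t))
            threading.Thread(target=worker).start()

    ts = TupleSpace()
    ts.eval("square", 7, lambda: 7 * 7)
    print(ts.in_("square", 7, None))          # -> ('square', 7, 49)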

Jake



Sat, 07 Jun 1997 05:10:49 GMT  
 Massively parallel programming for the Internet
: Are there any projects attempting to build a massively parallel
: programming environment using spare cycles on the Internet?
: Something where you put a really simple interpreter on the
: end machines, mid-level cluster managers spaced periodically,
: and end-user control stations?

I have seen several systems which may qualify at some point, but
I think the technology is still in its infancy, and will take
some time to become widely available or used, if it ever is.

The contenders that I know of:
scheme-48:  There is work on turning this implementation of scheme
            into an internet scripting language.  A sample WWW server
            has been set up, see the scheme-48 home page for the
            exact location.
        http://www-swiss.ai.mit.edu/~jar/s48.html

safe-tcl:  I'm not sure of the details of this one.  It appears that
           the idea is to extend tcl in such a way that it can
           be used as an internet scripting language.

GUILE:  The new GNU extension language.  There was talk of making a
        safe GUILE, similar to safe-tcl.  So far, this is still
        vaporware, but I'm sure that at some point, someone will
        create a GUILE-capable WWW server.

Also, on Jeff Dalton's home page, I found a link to a project called
O-Plan, but I'm not sure if it's actually doing any distributed
computation besides what the server does.

     http://www.aiai.ed.ac.uk/~oplan/
     http://www.aiai.ed.ac.uk/~jeff/

: This was inspired by the project to crack large prime numbers by
: mailing large lists of numbers around to volunteers running the
: sieve program.  Such a thing could be entirely automated.
: David Gelernter's "Mirror Worlds" is a clear articulation of this vision.

I'd be surprised if anything as 'organized' as Gelernter's vision
ever springs to life from the internet.  Here's a link I found to
Yale's Linda home page:
      http://www.cs.yale.edu/HTML/YALE/CS/Linda/linda.html
There is also a trellis page somewhere at the same site.

   mike
http://www.xmission.com/~callahan



Sat, 07 Jun 1997 05:11:56 GMT  
 Massively parallel programming for the Internet

Quote:

>Are there any projects attempting to build a massively parallel
>programming environment using spare cycles on the Internet?
>Something where you put a really simple interpreter on the
>end machines, mid-level cluster managers spaced periodically,
>and end-user control stations?

Amoeba (by Andrew Tanenbaum?) is, I think, an implementation of
such a system.


Sat, 07 Jun 1997 10:48:36 GMT  
 Massively parallel programming for the Internet

Quote:

>Are there any projects attempting to build a massively parallel
>programming environment using spare cycles on the Internet?
>Something where you put a really simple interpreter on the
>end machines, mid-level cluster managers spaced periodically,
>and end-user control stations?

>This was inspired by the project to crack large prime numbers by
>mailing large lists of numbers around to volunteers running the
>sieve program.  Such a thing could be entirely automated.
>David Gelernter's "Mirror Worlds" is a clear articulation of this vision.

>It's apparent to me that the Internet is going to have to dramatically
>increase the sophistication of its internal "metabolic" processes;
>parallel computations which monitor traffic and alter routing tables
>will have to be done at some point.  An Internet Nervous System if you will.

>Social consequences: well, at some point, you may be required to
>run the interpreter as part of your Internet Tax.

>Back to tech talk: is anybody using multisets as a basis for
>parallel programming?  Multisets looked like a possibly pleasant
>paradigm for programming the distributed interpretive system.

>--

>Lance Norskog

>Artisputtingtogether. Art  s th ow n  aw y.

Look at Legion at http://cs.virginia.edu

--

Computational Electromagnetics Laboratory             office: 708 491-8887
Northwestern University                               fax:    708 467-3217



Tue, 10 Jun 1997 04:26:06 GMT  
 Massively parallel programming for the Internet

Quote:

>Are there any projects attempting to build a massively parallel
>programming environment using spare cycles on the Internet?

...

Quote:
>David Gelernter's "Mirror Worlds" is a clear articulation of this vision.

...

Quote:
>Back to tech talk: is anybody using multisets as a basis for
>parallel programming?  Multisets looked like a possibly pleasant
>paradigm for programming the distributed interpretive system.

This is an interesting juxtaposition, considering that Linda, Gelernter's
creation, in fact implements distributed multisets.  I haven't seen
"Mirror Worlds" - is this why you asked? In any case, the answer is
"Linda".

--Edward Segall



Wed, 11 Jun 1997 04:35:31 GMT  
 Massively parallel programming for the Internet

 [ Edward & I stuff deleted ]

Quote:
>This is an interesting juxtaposition, considering that Linda, Gelernter's
>creation, in fact implements distributed multisets.  I haven't seen
>"Mirror Worlds" - is this why you asked? In any case, the answer is
>"Linda".

Hmmm... I'll have to look at Linda again.

I had crossed it off because it doesn't let go of the classic
computer science obsession with computational efficiency.  All
of the projects I've scanned make the same mistake.  We're
talking about a system with 10,000 computers randomly executing
parts of one program while very slowly communicating with one
another.  The CPU time is essentially free, it's the messages
that are expensive.  

Given this, building special compilers that carefully optimize
your compiled C or Fortran (which Gelernter did) is a complete
waste of time, because it ignores the relative costs mentioned
above.  When you factor in the cost of delivering N different
binaries to your 10,000 computers, C-Linda is (massively)
silly.  Not to mention the security problems of allowing some
wanker on the Internet to run a binary on your precious
workstation.  You need to put a very simple verifiable
interpreter on those machines which implements a nice dense
program representation.  If I'm going to run some Internet
daemon on my workstation, it's going to be something where
I can examine the code and be damn sure it can't wipe out
any of my data.
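
To make that concrete, here is a toy sketch in Python -- the
instruction set and encoding are entirely invented -- of how small
such a verifiable interpreter can be:

    # Toy "verifiable interpreter": a stack machine whose whole
    # instruction set fits on one screen.  It has no file, network, or
    # memory access beyond its own stack, so there is nothing in it
    # that can touch your data; "fuel" bounds its running time.
    SAFE_OPS = {
        "push": lambda stack, arg: stack.append(arg),
        "add":  lambda stack, arg: stack.append(stack.pop() + stack.pop()),
        "mul":  lambda stack, arg: stack.append(stack.pop() * stack.pop()),
        "dup":  lambda stack, arg: stack.append(stack[-1]),
    }

    def run(program, fuel=10000):
        # program: a list of (opcode, arg) pairs
        stack = []
        for op, arg in program[:fuel]:
            SAFE_OPS[op](stack, arg)       # unknown opcodes raise KeyError
        return stack

    # (3 + 4) * 2
    print(run([("push", 3), ("push", 4), ("add", None),
               ("push", 2), ("mul", None)]))        # -> [14]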

A second problem is that these projects fixate on designing one
language for the problem.  This is hubris.  You're not going to
do a good language/control system the first time, so just build
the infrastructure and let anyone submit jobs in the low-level
interpreter.  Then you and they can independently do research in
languages and interactive debuggers.

Another mistake these projects (Legion, for example) make is
starting small.  Nothing scales up nicely, so you should just
start big and thrash wildly until something seems to work.

The Internet is a software ecology, which slowly creates habitat
niches and demand for new software technologies.  At some point the
niche and demand for a "Netputer" will appear, and I think the
existence of the prime-smashers-by-email project indicates
that it already has.  But there are politics involved; there
must be grass-roots demand (see Mosaic).  Computer users and
site managers must want the Netputer service before they will
install the interpreters en masse.  I don't know how
to create this.  The Internet Nervous System idea is the
only one I've come up with.  INS is predicated on the
theory that the first generation of interpreters will not
be allowed any local storage to make them more palatable,
thus the nodes can only reprocess data and not store anything.

Perhaps this is wrong: what if they each had some disk space?
They could be allowed to search newsgroups via NNTP and
thus you could write a distributed newsgroup searcher.
They could be allowed to be HTTP clients and then they
could implement topologically smart caching of popular web
sites, allowing the netputer to handle overflow situations
like JPL had with the Shoemaker-Levy web site right after
the impact.  (Netnews or its successor could certainly use
this service.)  At some point they could take over DNS
service, public key storage, anonymous remailing, etc.

Someday you may be required to tithe computer time and disk
space to the Internet.  If such is the price of freedom...

--

Lance Norskog

Artisputtingtogether. Art  s th ow n  aw y.



Sat, 21 Jun 1997 03:03:24 GMT  
 Massively parallel programming for the Internet
: Hmmm... I'll have to look at Linda again.

I just posted a request on comp.parallel for information on Linda and
a possible freeware Linda. ( was I just dreaming? )

This is a most interesting thread and I will take some time to digest
it.

Thanks in advance for pointers.




Sat, 21 Jun 1997 14:35:24 GMT  
 Massively parallel programming for the Internet

    thinman> Hmmm... I'll have to look at Linda again.

    thinman> I had crossed it off because it doesn't let go of the
    thinman> classic computer science obsession with computational
    thinman> efficiency.  All of the projects I've scanned make the
    thinman> same mistake.  We're talking about a system with 10,000
    thinman> computers randomly executing parts of one program while
    thinman> very slowly communicating with one another.  The CPU time
    thinman> is essentially free, it's the messages that are
    thinman> expensive.

    thinman> Given this, building special compilers that carefully
    thinman> optimize your compiled C or Fortran (which Gelernter did)
    thinman> is a complete waste of time, because it ignores the
    thinman> relative costs mentioned above.  When you factor in the
    thinman> cost of delivering N different binaries to your 10,000
    thinman> computers, C-Linda is (massively) silly.  Not to mention

Except that the whole point of the C-Linda compiler is to optimize
communication. It divides the tuple-space operations in your program
into groups based on the number of fields in the tuple and their
types, and then subdivides each group with a hash function. Each
group/hash bin is assigned to some node on the network. So most of the
work of locating a tuple is done at compile time, and a running
program doesn't have to query lots of nodes (i.e. incur lots of
communication cost) to find it.
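
A sketch of that partitioning in Python; the grouping rule and the
hash here are stand-ins I made up, but the shape follows the
description above:

    # Sketch of compile-time tuple partitioning: tuples are grouped by
    # (arity, field types), then a hash over a key field picks one node
    # within the group.  Details invented for illustration.
    import zlib

    NODES = ["node%02d" % i for i in range(16)]   # hypothetical node names

    def home_node(t):
        # Group by arity and field types...
        group = (len(t), tuple(type(f).__name__ for f in t))
        # ...then subdivide the group by hashing a key field.  A real
        # compiler chooses which fields to hash by analyzing the whole
        # program; the first field stands in for that choice here.
        h = zlib.crc32(repr((group, t[0])).encode())
        return NODES[h % len(NODES)]

    # Every host computes the same home for a tuple, so finding
    # ("square", 7, 49) costs one message instead of a broadcast.
    print(home_node(("square", 7, 49)))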

The balance between computation and coordination can be altered by
changing the granularity of your parallelization; it is possible to
write "tunable" programs that have a knob for granularity and can be
optimized for (on one end of the scale) a shared-memory multiprocessor
machine or (on the other end) the Internet. Let me recommend (again)
Carriero and Gelernter's _How_to_Write_Parallel_Programs_.
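
For instance, a sketch of such a granularity knob (names invented; a
real program would hang actual work and tuple operations off these
chunks):

    def make_tasks(items, grain):
        # One knob: grain = items per task.  A small grain suits a
        # shared-memory machine (cheap messages, lots of parallelism);
        # a huge grain suits the Internet (few, fat messages).
        return [items[i:i + grain] for i in range(0, len(items), grain)]

    work = list(range(1000000))
    print(len(make_tasks(work, 10)))        # 100000 fine-grained tasks
    print(len(make_tasks(work, 100000)))    # 10 coarse tasks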

The real problem with C-Linda is that it assumes a static program and
execution environment. An all-Internet programming environment would
be subject to all manner of machine and network failures, and Linda is
not designed to deal with them. Westbrook and Zuck (professors at Yale
who work with Gelernter) are working on a fault-tolerant
generalization of Linda called PASO. Moreover, if you allow the
program to be updated as the computation runs, all of C-Linda's
global, compile-time optimization goes out the window. PASO attempts
to address this issue as well, providing for an adaptive organization
of tuples; i.e. groups and the nodes which serve them are determined
as the program runs and data is stored in tuple space.

    thinman> the security problems of allowing some wanker on the
    thinman> Internet to run a binary on your precious workstation.  You
    thinman> need to put a very simple verifiable interpreter on those
    thinman> machines which implements a nice dense program
    thinman> representation.  If I'm going to run some Internet daemon
    thinman> on my workstation, it's going to be something where I can
    thinman> examine the code and be damn sure it can't wipe out any
    thinman> of my data.

There would be plenty of room for participants in the Net computer who
don't actually run the computation: coordination on that scale
requires a large amount of support. Servers would be needed to store
pieces of the distributed memory (assuming that your model includes
one), and some system would be needed just to keep track of who is
participating in the computation. These pieces won't be running random
code.

    thinman> A second problem is that these projects fixate on
    thinman> designing one language for the problem.  This is hubris.
    thinman> You're not going to do a good language/control system the
    thinman> first time, so just build the infrastructure and let
    thinman> anyone submit jobs in the low-level interpreter.  Then
    thinman> you and they can independently do research in languages
    thinman> and interactive debuggers.

The Linda tuple-space model doesn't require that the program be
written in a uniform language--the tuple space acts as an abstraction
barrier, and it is possible for two languages which embed the
tuple-space operations to intercommunicate. I wrote an implementation
of the tuple-space operations embedded in Scheme which interfaced to
the PASO implementation, and I wrote systems composed of cooperating C
and Scheme programs which communicated through tuple space.
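
The barrier works because tuples are plain data; a sketch (the wire
encoding is my invention, not what PASO actually used) of one tuple
crossing between languages:

    import json

    # Tuples are plain data, so any language that can read the wire
    # format can join the computation.
    wire = json.dumps(["square", 7, 49])    # as a C or Scheme peer sees it
    print(json.loads(wire))                 # round-trips identically anywhere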


    thinman> Artisputtingtogether. Art s th ow n aw y.

This is a really interesting discussion!

Jake



Tue, 24 Jun 1997 11:20:23 GMT  
 Massively parallel programming for the Internet

Quote:

> Lance Norskog

> (Technically Sweet) writes:

> >The Internet is a software ecology, which slowly creates habitat
> >niche and demand for new software technologies.
> >The Internet Nervous System idea is the
> >only one I've come up with.

> Of course you have found Bernardo Huberman (Xerox PARC)'s
> The Ecology of Computation? (S-V)


>   Resident Cynic, Rock of Ages Home for Retired Hackers

Although I've always thought of it as a hothouse or greenhouse with
many growing things, including weeds... perhaps James Burke described
the problem, and the solution, best when he said that there are no geniuses
with light bulbs over their heads pulling things from nowhere; instead
they are people who put bits and pieces of various unrelated things together
to make something... and see the below .sig...

-- Peter

*******************************************************************************

The problem with modern technology is not only that it is changing so fast but
that because of specialized fields and their associated 'specialty languages'
we don't understand, decisions are 'made for us'. To maintain our freedom
a way must be found to inform and educate everyone about the rapid changes,
the multi-disciplinary decisions, and their effect.
               -- Excerpted from James Burke & his Connections series

*The Answer is*: 1968: E-mail, computer conferencing, multiple windows, mice.
                    "Augmentation of Man's Intellect!!". -- Doug Englebart    
                    "Increasing the capability of man to approach a complex
                     problem or situation, gain comprehension to suit his
                     particular needs and derive solutions to problems."

              Whose implementation is *** INTERNET ****




Sat, 28 Jun 1997 22:49:46 GMT  
 