Instruction Speed and instruction availability 
 Instruction Speed and instruction availability


Quote:

> | Of course, you can get zapped if you had an instruction, but it disappears
> | in favor of emulation ... at which point the compilers start avoiding it :-)
> | Potential examples: VAX 11/780 -> MicroVAX II (where decimal ops disappeared).
> |  68030 -> 68040 (where sin, cos, et al got moved)
>   I'm told this is even worse, because the compiler had just been "made
> smart enough to use" some of the instructions which slowed down
> drastically.
>   I believe that we may see a day when compilers take as input both the
> program source and a profile of the execution environment which will
> result in generation of more nearly optimal code sequences.

Even this is not enough.  The compiler needs to take into account even
alternate algorithms, which "non-language" instructions are available,
considerations about frequency, and variation in running time (for vector
and parallel machines).  The programmer, especially for library routines
and algorithm development, must know the alternatives.  

It is even questionable whether the compiler can do this without interaction
with the programmer/algorithm designer.  The number of alternatives may become
too large for a brute strength attack, and the variations in the problems
preclude much more than that.

Hardware designers need to get in on this as well, as minor changes in
hardware can drastically affect relative efficiencies.
--
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
Phone: (317)494-6054



Sat, 23 Apr 1994 20:36:33 GMT  
 Instruction Speed and instruction availability

Quote:

>alternate algorithms, which "non-language" instructions are available,
>considerations about frequency, and variation in running time (for vector
>and parallel machines).  The programmer, especially for library routines
>and algorithm development, must know the alternatives.  

>It is even questionable whether the compiler can do this without interaction
>with the programmer/algorithm designer.  The number of alternatives may become
>too large for a brute strength attack, and the variations in the problems
>preclude much more than that.

This does not have to be done by hardware designers, nor does the programmer
necessarily need to "know the alternatives".

The key to solving this sort of problem is to adopt languages which are
semantically advanced, rather than bit-twiddling primitive languages.
For example, an engineer (or by heaven, even a statistician!) might want
to perform set membership. In a primitive language, your comment
certainly holds true, where variations in array size make different
algorithms superior (linear search, hash, binary search...). When the
assumptions have to be embedded in the user's code, two bad things occur:
   a. The clarity of the algorithm is obscured.
   b. Portability is lost: A linear search will beat binary search on
      a parallel machine.

If, on the other hand, you use a high-level language such as APL or J,
you can express membership trivially. In J, I'd write:
     X e. Y      
where e. denotes membership. The compiler or interpreter can then
make appropriate compile time or runtime decisions on the best
algorithm to use.
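
To make the point concrete, here is a hedged C sketch (illustrative names
and thresholds, not taken from any real interpreter) of a membership
primitive that owns the choice of algorithm, so the application programmer
never commits to linear search, hashing, or binary search in source code:

    #include <stdlib.h>
    #include <string.h>

    static int cmp_int(const void *a, const void *b)
    {
        int x = *(const int *)a, y = *(const int *)b;
        return (x > y) - (x < y);
    }

    /* out[i] = 1 if x[i] occurs in y[0..ny-1], else 0. */
    void member(const int *x, size_t nx, const int *y, size_t ny, char *out)
    {
        int *tmp = NULL;

        if (nx * ny > 1024)                   /* big enough to pay for a sort */
            tmp = malloc(ny * sizeof *tmp);

        if (tmp == NULL) {                    /* small case, or no memory: plain scans */
            for (size_t i = 0; i < nx; i++) {
                out[i] = 0;
                for (size_t j = 0; j < ny; j++)
                    if (x[i] == y[j]) { out[i] = 1; break; }
            }
            return;
        }

        memcpy(tmp, y, ny * sizeof *tmp);     /* sort a private copy of Y once... */
        qsort(tmp, ny, sizeof *tmp, cmp_int);
        for (size_t i = 0; i < nx; i++)       /* ...then binary-search each X element */
            out[i] = bsearch(&x[i], tmp, ny, sizeof *tmp, cmp_int) != NULL;
        free(tmp);
    }

A real primitive would also weigh argument types, comparison tolerance, and
available storage, as described below; the point is only that the decision
lives inside the primitive rather than in the user's code.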

When I implemented fast search and membership algorithms in SHARP APL
back in 1971, I used several algorithms, and chose the best one based on:
   available storage    
   argument types
   size of arguments
   comparison tolerance

This approach became common in APL, and our matrix product code
(APL and J have general matrix products, where you specify the
functions to be used. Normal matrix product in APL is specified as
     X   +.*   Y
where + is the reduction to be used, and * (No times sign on this
terminal, but you get the idea) is the multiply used to combine
left and right elements)
used a number of algorithms, which depended on the same parameters as
those above, as well as the two functions.
This made dramatic improvements in performance, with normal matrix
product beating IBM Fortran by a factor of 2.5 to 3 on a 308x IBM S/370.  
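
For readers who have not met the notation, here is a hedged C sketch of
what a general matrix product means (the combining and reducing functions
are parameters; the names are illustrative, and this is not SHARP APL's
actual code):

    #include <stddef.h>

    typedef double (*dyad)(double, double);

    /* c = a f.g b for n-by-n matrices:
       c[i][j] = f-reduction over k of g(a[i][k], b[k][j]).          */
    void inner_product(dyad f, dyad g, const double *a, const double *b,
                       double *c, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            for (size_t j = 0; j < n; j++) {
                double acc = g(a[i * n], b[j]);        /* k = 0 seeds the reduction */
                for (size_t k = 1; k < n; k++)
                    acc = f(acc, g(a[i * n + k], b[k * n + j]));
                c[i * n + j] = acc;
            }
    }

    static double plus(double x, double y)  { return x + y; }
    static double times(double x, double y) { return x * y; }
    /* ordinary matrix product:  inner_product(plus, times, a, b, c, n);  */

(The fold here runs left to right, unlike APL's right-to-left reduction;
for associative functions such as plus and "or" the result is the same.)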

Other matrix products, such as "or.and", on Booleans (partial
transitive closure computation) ran about 1000 times as fast as a normal
C programmer's code, because of attention to parallelism and cache
management which we could exploit within the interpreter.
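
The kind of trick involved is illustrated by the following hedged C sketch
(layout and names are illustrative, not SHARP APL's actual code): pack each
Boolean row into 64-bit words and combine whole words at a time, so that a
single OR handles 64 result elements.

    #include <stdint.h>
    #include <stddef.h>

    /* c = a or.and b for n-by-n Boolean matrices, rows packed into
       64-bit words; words = (n + 63) / 64 words per row.            */
    void bool_or_and(const uint64_t *a, const uint64_t *b, uint64_t *c,
                     size_t n, size_t words)
    {
        for (size_t i = 0; i < n; i++) {
            uint64_t *ci = &c[i * words];
            for (size_t w = 0; w < words; w++)
                ci[w] = 0;
            for (size_t k = 0; k < n; k++)
                /* if a[i][k] is 1, OR all of row k of b into row i of c */
                if (a[i * words + k / 64] & ((uint64_t)1 << (k % 64))) {
                    const uint64_t *bk = &b[k * words];
                    for (size_t w = 0; w < words; w++)
                        ci[w] |= bk[w];
                }
        }
    }

A byte-per-Boolean triple loop touches far more memory and does one
operation per element, which is one source of the kind of gap described
above; the rest is cache behaviour and loop overhead.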

The point I am making is that the implementor, NOT the application
programmer, should have to be aware of the peculiarities of the
underlying machine, in order to get good performance. If you can't
get good performance with interpreted or compiled languages, then
beat on the  compiler writers -- they're not doing their job.

Bob


Snake Island Research Inc  (416) 368-6944   FAX: (416) 360-4694
18 Fifth Street, Ward's Island
Toronto, Ontario M5J 2B9
Canada



Mon, 25 Apr 1994 01:24:42 GMT  
 Instruction Speed and instruction availability

Quote:

> >alternate algorithms, which "non-language" instructions are available,
> >considerations about frequency, and variation in running time (for vector
> >and parallel machines).  The programmer, especially for library routines
> >and algorithm development, must know the alternatives.  

> >It is even questionable whether the compiler can do this without interaction
> >with the programmer/algorithm designer.  The number of alternatives may become
> >too large for a brute strength attack, and the variations in the problems
> >preclude much more than that.

> This does not have to be done by hardware designers, nor does the programmer
> necessarily need to "know the alternatives".

Which programmer?  Also, the choice of algorithms is extremely large in many
situations.  The library routines need to be written, and often they have not
been produced in a reasonable manner.  An adequate language needs to be
available for the writer of library routines, and if the applications
programmer is educated in the problem, rather than merely being a trained
seal when it comes to programming, that programmer will often see how to
use the arcane instructions.

Quote:
> The key to solving this sort of problem is to adopt languages which are
> semantically advanced, rather than bit-twiddling primitive languages.
> For example, an engineer (or by heaven, even a statistician!) might want
> to perform set membership. In a primitive language, your comment
> certainly holds true, where variations in array size make different
> algorithms superior (linear search, hash, binary search...). When the
> assumptions have to be embedded in the user's code, two bad things occur:
>    a. The clarity of the algorithm is obscured.
>    b. Portability is lost: A linear search will beat binary search on
>       a parallel machine.

So how does the compiler decide which search algorithm to use?  What if
a threshold search is needed rather than an equality search?  It is not
clear that a linear search will beat binary search on a parallel machine;
binary searches can also go on in parallel.  And would not a hardware
search instruction run rings around any software implementation?

What if you have a computationally efficient bit-twiddling algorithm?  An
algorithm which produces exponential random variables by examining an
average of half a dozen bits per random variable and then proceeds by
fixed-point masking and combining the integer with the resulting fraction
(I have such routines) certainly requires bit-twiddling.  Will such a
routine be competitive with others which are obviously computationally
more involved?  Not if it is difficult to examine one bit.  There are
other like problems.  

Quote:
> If, on the other hand, you use a high-level language such as APL or J,
> you can express membership trivially. In J, I'd write:
>      X e. Y      
> where e. denotes membership. The compiler or interpreter can then
> make appropriate compile time or runtime decisions on the best
> algorithm to use.

But the compiler must know all of the alternatives.  These alternatives
are highly dependent on the architecture, and they have to be written.
The writing of them will have to be done by knowledgeable programmers,
who frequently must know enough about the subject matter to know what
the algorithms are doing.  The problems I have pointed out have only
been addressed by coming up with slow, clumsy methods.  They have not
been constructed for the purpose of finding what the languages and
hardware do badly, although they have been selected as examples.

Quote:
> When I implemented fast search and membership algorithms in SHARP APL
> back in 1971, I used several algorithms, and chose the best one based on:
>    available storage    
>    argument types
>    size of arguments
>    comparison tolerance
> This approach became common in APL, and our matrix product code
> (APL and J have general matrix products, where you specify the
> functions to be used. Normal matrix product in APL is specified as
>      X   +.*   Y
> where + is the reduction to be used, and * (No times sign on this
> terminal, but you get the idea) is the multiply used to combine
> left and right elements)
> used a number of algorithms, which depended on the same parameters as
> those above, as well as the two functions.
> This made dramatic improvements in performance, with normal matrix
> product beating IBM Fortran by a factor of 2.5 to 3 on a 308x IBM S/370.  

Normal matrix product in a reasonable HLL of the type you mentioned would
just use the ordinary overloaded multiplication symbol.  The other products
available in APL are the unusual ones.  A language cannot have an adequate
number of types; the structs and classes in C and C++ should be called types,
which they really are, and other types should be added at will.

Quote:
> Other matrix products, such as "or.and", on Booleans (partial
> transitive closure computation) ran about 1000 times as fast as a normal
> C programmer's code, because of attention to parallelism and cache
> management which we could exploit within the interpreter.
> The point I am making is that the implementor, NOT the application
> programmer, should have to be aware of the peculiarities of the
> underlying machine, in order to get good performance. If you can't
> get good performance with interpreted or compiled languages, then
> beat on the  compiler writers -- they're not doing their job.

The current implementors do not know anywhere near enough to do the job.
Nor do I think they ever will know enough.  Unless the HLLs allow the
algorithm producer to indicate the alternatives, such libraries as IMSL
must necessarily be slow and clumsy.  Portable software need not use the
identical algorithm, and the present applications programmer may need to
indicate the choices.
--
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
Phone: (317)494-6054

{purdue,pur-ee}!pop.stat!hrubin(UUCP)


Mon, 25 Apr 1994 22:35:32 GMT  
 Instruction Speed and instruction availability

Quote:


>>The key to solving this sort of problem is to adopt languages which are
>>semantically advanced ...

This is correct, although I believe that the example languages (APL and J)
are still too low-level.  They (and F9x) are in the right direction for
those who will work on still higher levels, and the wrong direction for
those who will work on lower levels.


Quote:
(Herman Rubin) writes:
>The current implementors do not know anywhere near enough to do the job.

This is probably true.  (The presence of fuzzy qualifiers like `anywhere
near' makes it hard to assess the truth value of a statement.  In this case,
dropping them produces a true statement.)

Quote:
>Nor do I think they ever will know enough.

This is also probably true, because as we learn more, `the job' changes.

Quote:
>Unless the HLLs allow the algorithm producer to indicate the alternatives,
>such libraries as IMSL must necessarily be slow and clumsy.

We are not talking about IMSL.

Dr. Rubin is constantly griping about languages that do not let him
use an efficient syntax to express the series of bit operations for
producing particular random numbers.  What he seems completely unable
to grasp is that, while the problems being solved involve producing
random numbers, `producing random numbers' is not the end goal.  The
end goal is something like `determine the proper structure for this
object in order to get optimal heat flow'.

The correct language used to solve that problem appears above.  It is
known as `natural language'.  Anything else is too low level, because
it involves making some particular decision as to how to approach the
solution.

Herman Rubin will of course object that someone (or more correctly,
many `someone's, both separately and jointly) will have to make such
decisions and implement them.  This is true.  There is a word for
such people; the word is `implementors'.  I can now restate the issue
in Herman's own words:

Quote:
>The current implementors do not know anywhere near enough to do the job.

`The job' is to produce languages on such a level that one can write
something like this:

        Try Robinson's model of energy flow, with the same input data
        as before.

and have the computers run a suitably efficient algorithm.  We do not
know how to do this, and by the time we know how to do this, this will
not be what we need to do anymore.  In the meantime, we have implementors
do it for us.
--
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 510 486 5427)



Fri, 29 Apr 1994 04:56:54 GMT  
 Instruction Speed and instruction availability

Quote:

>So how does the compiler decide which search algorithm to use?  What if
>a threshold search is needed rather than an equality search?  It is not
>clear that a linear search will beat binary search on a parallel machine;
>binary searches can also go on in parallel.  And would not a hardware
>search instruction run rings around any software implementation?

Sigh. Let me try once more, and maybe put this thread to a long-overdue
rest.

a. Yes, indeedy. A linear search may or may NOT beat binary search
   on a parallel machine, and yes indeedy, binary searches can also
   go in parallel. BUT, what runs fast on machine X may not run fast
   on machine Y. You either suffer from non-portability, or use
   high-level semantics and trust (or recommend improvements to) your
   compiler and interpreter writers.

   I, as an implementor, have the job of spending a large portion of my
   time ENSURING that primitives execute faster than the code produced
   by 99% of the application programmers on the planet. If I can do
   that, I consider I have achieved success, because I have given them
   tools to use which they could not have built themselves, and which
   make their job (solving the application problem) faster, simpler,
   cheaper, and more portable.

b. Hardware search instructions do NOT always run rings around
   any software implementation ( I think this was the beginning of
   this tattered thread). A few examples from the harsh world of
   Reality:
      a. IBM S/360 Translate and Test instruction.
         This basically does table lookup on bytes until you hit
         a non-zero.

         If you are looking at enough bytes, it will beat a loop,
         (Usually. ON SOME S/360s.) For short stuff, a loop is
         faster, and doesn't gunk up R1 and R2, which in turn
         wreaks havoc with register allocation, because R1 and R2
         are BOLTED into the definition.    

         For long stuff, long means less than 257 bytes, thanks to
         SS format instruction limits.

         If you are looking, for example, for the first non-zero bit
         (or the first zero bit) in a Boolean list, TRT is MUCH
         slower than a loop which ganders at a word at a time (see the
         C sketch below).

         Bottom Line: TRT is a fun instruction, which sometimes happens
         to pay off a bit. But it doesn't pay off very big, compared to
         loops.

      b. IBM S/370 Compare Characters Long instruction.
         This bit of Edsel design, and its friend Move Characters Long,
         offers less flexibility than a loop (MVC loop can perform
         data smearing of any sort; MVCL cannot. Data smearing is
         a swell way to initialize large arrays).
         However, in spite of eating 4 registers, and STILL needing
         a loop around it in general (address space size is 2gig,
         but CLCL, MVCL can only deal with 16meg objects), the
         damn CLCL runs about one THIRD the speed of a CLC loop on
         a 3084.

So, your desired Swiss army knife approach to computer design is no
better than the Swiss army knife approach to aircraft maintenance tools.
Give me a proper torque wrench any day.
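
To illustrate the word-at-a-time point in (a), here is a hedged C sketch
(names illustrative; bit 0 is taken to be the low-order bit of each word).
Most words are dismissed with a single compare, which is why a plain loop
beats a byte-wise hardware scan on long Boolean vectors.

    #include <stdint.h>
    #include <stddef.h>

    /* Return the index of the first 1 bit in v[0 .. nwords*64 - 1], or -1. */
    long first_set_bit(const uint64_t *v, size_t nwords)
    {
        for (size_t w = 0; w < nwords; w++)
            if (v[w] != 0) {                  /* 64 Booleans rejected per compare */
                uint64_t x = v[w];
                long bit = 0;
                while ((x & 1) == 0) {        /* locate the bit within the word */
                    x >>= 1;
                    bit++;
                }
                return (long)(w * 64) + bit;
            }
        return -1;                            /* no bit set anywhere */
    }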

Quote:

>The current implementors do not know anywhere near enough to do the job.
>Nor do I think they ever will know enough.  Unless the HLLs allow the

Thanks for the vote of confidence. May I recommend that you cease
letting your lack of experience with well-designed, well-engineered
products color your perceptions.


Snake Island Research Inc  (416) 368-6944   FAX: (416) 360-4694
18 Fifth Street, Ward's Island
Toronto, Ontario M5J 2B9
Canada



Fri, 29 Apr 1994 00:42:00 GMT  
 Instruction Speed and instruction availability

Quote:


> (Herman Rubin) writes:
> >The current implementors do not know anywhere near enough to do the job.
> This is probably true.  (The presence of fuzzy qualifiers like `anywhere
> near' make it hard to assess the truth value of a statement.  In this case,
> dropping them produces a true statement.)
> >Nor do I think they ever will know enough.
> This is also probably true, because as we learn more, `the job' changes.
> >Unless the HLLs allow the algorithm producer to indicate the alternatives,
> >such libraries as IMSL must necessarily be slow and clumsy.
> We are not talking about IMSL.

Those users who do not know how to produce their basic algorithms must
rely on such things as IMSL, which is certainly one of the most
available collections of programs with the necessary flexibility.

Now if there were a decent language in which the programs could be
written which would take into account what can be done, it is quite
possible that the organization putting it out might use it.  But if
they are constrained to writing in the existing HLLs, they cannot do
a good job.

Quote:
> Dr. Rubin is constantly griping about languages that do not let him
> use an efficient syntax to express the series of bit operations for
> producing particular random numbers.  What he seems completely unable
> to grasp is that, while the problems being solved involve producing
> random numbers, `producing random numbers' is not the end goal.  The
> end goal is something like `determine the proper structure for this
> object in order to get optimal heat flow'.

This is but one example.  The number theorists gripe about not being
able to do efficient multiprecision integer arithmetic.  Those doing
research in numerical methods need to do multiprecision real arithmetic,
and they probably curse whoever insisted on normalized floating point,
and not having at least simple multiprecision floating point available.
The present double is really what single should be, and this is so
considered on the Cray (48 bit mantissa) and the Cyber 205 (47 bits).
Both of these make some provision for less accurate numbers.  But the
Cray makes it an adventure to go beyond 48, while the Cyber does not
make it an adventure to go beyond 47.  
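
As one hedged illustration of the cost involved: the standard error-free
"two-sum" step below recovers the rounding error of an addition exactly on
normalized IEEE-style hardware (round-to-nearest assumed), but only by
spending several extra floating-point operations per addition -- the sort
of overhead that better hardware support could eliminate.

    /* Compute s and e such that s + e == a + b exactly, with s = fl(a + b). */
    void two_sum(double a, double b, double *s, double *e)
    {
        double sum = a + b;
        double bv  = sum - a;        /* the part of b actually absorbed into sum */
        double av  = sum - bv;       /* the part of a actually absorbed into sum */
        *s = sum;
        *e = (a - av) + (b - bv);    /* what rounding threw away */
    }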

There are other types of problems.  There are many microbranching
algorithms.  On most of the modern machines, these have real drawbacks.
In many cases, hardware could do these well.  This is especially
important for multiprocessing, really slowing down SIMD, for example.

Quote:
> The correct language used to solve that problem appears above.  It is
> known as `natural language'.  Anything else is too low level, because
> it involves making some particular decision as to how to approach the
> solution.

What is natural language?  The computer does not and will not in the
foreseeable future understand natural language.  Until computers can
do such things as clearly understand what is being said in spoken
language by a multitude of speakers speaking different languages,
they cannot understand anything beyond simple instructions.  We tell
the users (or should) that a computer will do whatever it is told, no
matter how stupid.  But it must be told in language it understands.

Quote:
> Herman Rubin will of course object that someone (or more correctly,
> many `someone's, both separately and jointly) will have to make such
> decisions and implement them.  This is true.  There is a word for
> such people; the word is `implementors'.  I can now restate the issue
> in Herman's own words:
> >The current implementors do not know anywhere near enough to do the job.
> `The job' is to produce languages on such a level that one can write
> something like this:
>    Try Robinson's model of energy flow, with the same input data
>    as before.
> and have the computers run a suitably efficient algorithm.  We do not
> know how to do this, and by the time we know how to do this, this will
> not be what we need to do anymore.  In the meantime, we have implementors
> do it for us.

I am quite familiar with what the standard statistical packages can do
and also cannot do.  I do not have enough data to assess the harm they do.
One cannot but wonder at what the results mean when totally inappropriate
procedures are used because there are computer packages to do them.

The implementors must include people who understand both the numerical
or symbolic problems involved and the computer capabilities.  Since the
language designers and hardware designers know little of the mathematical
problems, the hardware and software designers are going to have to allow
them to do post facto algorithm construction.

As I tell my students, one must do analysis before doing numerical analysis.
This last semester I assigned a problem which, done crudely, could take
2^1000 steps.  Naturally, there is an easier way.
--
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
Phone: (317)494-6054

{purdue,pur-ee}!pop.stat!hrubin(UUCP)



Fri, 29 Apr 1994 23:57:18 GMT  
 Instruction Speed and instruction availability

Regarding what the previous poster says below, I thought that I should point out a warning:

      b. IBM S/370 Compare Characters Long instruction.
         This bit of Edsel design, and its friend Move Characters Long,
         offers less flexibility than a loop (MVC loop can perform
         data smearing of any sort; MVCL cannot. Data smearing is
         a swell way to initialize large arrays).

Data smearing is a dangerous thing (potentially slow) to do on
advanced memory systems.
--


Intel Corp., M/S JF1-19, 5200 NE Elam Young Pkwy,
Hillsboro, Oregon 97124-6497

This is a private posting; it does not indicate opinions or positions
of Intel Corp.

Intel Inside (tm)



Sat, 30 Apr 1994 05:09:41 GMT  
 Instruction Speed and instruction availability

Quote:


>Regarding what the previous poster says below, I thought that I should point out a warning:

>Data smearing is a dangerous thing to do on advanced memory systems.

While Andy may be right, I have no idea WHY he says this. (This is one
of the things in the net that drives me batty. People make bald
statements like "Eschew Brevity", without any support for the statement).

My experience on RISC machines is limited, so I am not sure if you are
referring to compiler limitations, or storage subsystem problems.
On older machines, in particular the S/370 series from IBM, I noted that
the performance of smeared MVC on IBM-designed machines was VERY sensitive
to data alignment of source and sink. This is a pity, because  the machines
were microcoded, and a bit of smarts in the Ucode to handle smears would
have provided superior performance, rather than lousy performance.

The Hitachi machines, on the other hand, had Excellent movers, which would
provide maximum performance regardless of data alignment. They obviously
put more effort into their Ucode for MVC.

Back to your comment on "dangerous", Andy:
   Is smearing slow?
   Or potentially slow?
   Or are the semantics of the machine architecture such that
the result of smearing is undefined? (MVC was specified to operate
properly on smearing, regardless of specific underlying  machine).

Bob

PS: Why would I want to smear? I wanted the speediest possible implementation
of the APL "reshape" primitive. This uses the elements of the right
argument (over and over if needed) to make an array of the shape specifed
by the left  argument. For example:
      3 4 reshape 'abcde'
abcd
eabc
deab

So, you move the  right argument (or piece thereof, if it's bigger than
the result) to the result, and if you aren't done yet, smear the
beginning of the result to the end of where you moved stuff into the  
result.

For extra credit, figure out how to use MVC to do this with Booleans!
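
For anyone without an MVC handy, here is a hedged C sketch of the same
copy-then-smear idea (memcpy on non-overlapping halves stands in for the
overlapped MVC; names are illustrative):

    #include <string.h>
    #include <stddef.h>

    /* Fill dst[0..dstlen-1] by cyclically repeating src[0..srclen-1]. */
    void reshape_bytes(char *dst, size_t dstlen, const char *src, size_t srclen)
    {
        size_t filled = (srclen < dstlen) ? srclen : dstlen;

        memcpy(dst, src, filled);             /* seed the result with the argument */
        while (filled < dstlen) {
            size_t chunk = (filled < dstlen - filled) ? filled : dstlen - filled;
            memcpy(dst + filled, dst, chunk); /* smear: replicate what is already there */
            filled += chunk;
        }
    }

With dstlen = 12 and src = "abcde" this produces "abcdeabcdeab", the ravel
of the 3 4 reshape above.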

Snake Island Research Inc  (416) 368-6944   FAX: (416) 360-4694
18 Fifth Street, Ward's Island
Toronto, Ontario M5J 2B9
Canada



Mon, 02 May 1994 01:16:26 GMT  
 Instruction Speed and instruction availability

Quote:
Herman Rubin writes:
> The number theorists gripe about not being
> able to do efficient multiprecision integer arithmetic.  Those doing
> research in numerical methods need to do multiprecision real arithmetic,
> and they probably curse whoever insisted on normalized floating point,
> and not having at least simple multiprecision floating point available.

From the announcement of the GNU MP library:

  GNU MP is a portable library for arbitrary precision arithmetic,
  operating on signed integers and rational numbers.  It has a rich set of
  functions, and the functions have a regular interface.

  The speed of GNU MP is about 5 to 100 times that of Berkeley MP for small
  operands.  The speed-up increases with the operand sizes for certain
  operations, where GNU MP has asymptotically faster algorithms.

Think of that, Herman.  While you were cursing the darkness, someone was
lighting a single candle.  You could do this too, but then there is the
danger you might have to open your eyes first.
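
For concreteness, this is roughly what calling such a library looks like
from C.  The sketch uses the present-day GNU MP (GMP) mpz interface, which
may differ in detail from the 1991 release being announced; link with -lgmp.

    #include <gmp.h>
    #include <stdio.h>

    int main(void)
    {
        mpz_t f;
        mpz_init_set_ui(f, 1);             /* f = 1 */

        for (unsigned long i = 2; i <= 100; i++)
            mpz_mul_ui(f, f, i);           /* f = f * i, exact at any size */

        gmp_printf("100! = %Zd\n", f);     /* prints all 158 digits */
        mpz_clear(f);
        return 0;
    }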

-- Jon





Tue, 03 May 1994 13:01:50 GMT  
 Instruction Speed and instruction availability

Quote:

> Herman Rubin writes:
> > The number theorists gripe about not being
> > able to do efficient multiprecision integer arithmetic.  Those doing
> > research in numerical methods need to do multiprecision real arithmetic,
> > and they probably curse whoever insisted on normalized floating point,
> > and not having at least simple multiprecision floating point available.
> From the announcement:
>   GNU MP is a portable library for arbitrary precision arithmetic,

                ^^^^^^^^

Quote:
>   operating on signed integers and rational numbers.  It has a rich set of
>   functions, and the functions have a regular interface.
>   The speed of GNU MP is about 5 to 100 times that of Berkeley MP for small
>   operands.  The speed-up increases with the operand sizes for certain
>   operations, where GNU MP has asymptotically faster algorithms.
> Think of that, Herman.  While you were cursing the darkness, someone was
> lighting a single candle.  You could do this too, but then there is the
> danger you might have to open your eyes first.

I suspect that there are at least thousands of mathematicians who could have
done this, given competent programmers to do the grunt work.  This has nothing
on the problems with the Berkeley frexp, for which anyone who understood
machine language on any machine could have quickly turned out an improvement
by a factor comparable with 100.

But the problem is not answered.  The improvement stated is only for part of
the problem, and the requirement of portability greatly constrains the
solution.  Give me a few programmers, support for a week of my time and
the necessary amount of their time, and I will do at least a comparable
job.  Now I might be able to spare my time, but I have only occasional
access to even one programmer.  Also, is this any better than the packages
produced by Silverman or by Bach, and I do not know how many others?

Also, exactly what operations are provided by hardware makes a big
difference, even for integer MP.  I have not seen the GNU package,
but exactly which operations are available in integer and in
floating arithmetic make big differences in how things should be done.
There are even big differences in how MP numbers should be arranged in
units, which are heavily dependent on the architecture.  It is not clear
that the integer arithmetic should be done in the integer unit rather
than the floating unit.

The real question raised is HARDWARE support.  The GNU MP package, and the
others I mentioned, are for integer and rational arithmetic.  Fixed-point
and floating-point real arithmetic are not covered.  They provide some
different problems.  Even simple modifications in hardware, as for example
providing for non-normalized arithmetic and easy communication between
integer and floating units, will produce speedups on the order of 5-10
by eliminating the need for checking normalization, making the use of
different widths for different purposes easy, eliminating messy alignment,
etc.  No amount of software development can overcome these problems.  To
say that these are not needed is no more sensible than to say that floating
arithmetic is not needed, as it can always be simulated.
--
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
Phone: (317)494-6054

{purdue,pur-ee}!pop.stat!hrubin(UUCP)



Wed, 04 May 1994 22:58:24 GMT  
 Instruction Speed and instruction availability

Quote:

>(Jon Krueger) writes:
>>Think of that, Herman.  While you were cursing the darkness, someone was
>>lighting a single candle.  You could do this too, but then there is the
>>danger you might have to open your eyes first.


Quote:
(Herman Rubin) writes:
>But the problem is not answered.

What problem?

Quote:
>Also, exactly what operations are provided by hardware makes a big
>difference, even for integer MP.

So what?

Quote:
>I have not seen the GNU package ...

but you are quite ready to spend time declaiming its faults (no doubt
it has plenty of those, but so what?).  Yet you seem unwilling to spend
time producing what *you* want, or even to provide constructive
comments (things like `procedure P could make use of invariant I to
eliminate code C from inner loop L' or `I think it would be better to
provide procedures L1, L2, and M rather than a single combined
procedure').  This is why people find you so intensely irritating.

Quote:
>The real question raised is HARDWARE support.

No.  Hardware support is only indirectly related to interface
specifications.  The GNU MP library is an implementation of an
interface specification.  As such, it provides two separate things:

  - An interface specification.  Programmers can write higher level
    code---and please note that `high level code' has little to do with
    `high level languages'---using this interface, without knowing
    anything as to how the internals work.

  - Actual (portable) code to accomplish the interface.  This can be
    used to `get off the ground'.  When programs using the interface
    prove too slow for some particular task, they can use the same
    interface, if it is well-designed, for access to a more efficient
    machine-specific interface.  This is where the hardware support
    comes in, if it comes in at all.

This does not, of course, provide the absolute fastest method, which
is to use specially-tailored algorithms for specially-tailored hardware.
What it does provide is something that is often *good enough*.
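
A hedged C sketch of those two pieces (hypothetical names, not GNU MP's
actual interface): a declaration that callers program against, plus a
portable definition that a machine-specific version can displace without
callers changing a line.

    /* mp.h -- the interface specification */
    #include <stddef.h>
    #include <stdint.h>

    /* Add the n-limb numbers a and b into r; return the carry out. */
    uint32_t mp_add_n(uint32_t *r, const uint32_t *a, const uint32_t *b, size_t n);

    /* mp_add_portable.c -- the "get off the ground" implementation */
    #ifndef HAVE_ASM_MP_ADD_N      /* a machine-specific version may replace this */
    uint32_t mp_add_n(uint32_t *r, const uint32_t *a, const uint32_t *b, size_t n)
    {
        uint32_t carry = 0;
        for (size_t i = 0; i < n; i++) {
            uint64_t s = (uint64_t)a[i] + b[i] + carry;   /* widen to keep the carry */
            r[i]  = (uint32_t)s;
            carry = (uint32_t)(s >> 32);
        }
        return carry;
    }
    #endif
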
--
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 510 486 5427)



Fri, 06 May 1994 19:43:44 GMT  
 