Call by reference vs. call by value 

I came across a problem with a procedure that resembled the following:

type Simple_Vector is array (1 .. 2) of Float;

procedure Vector_Manipulation (X : in Simple_Vector;
                               Y : out Simple_Vector) is
begin

    Y (1) := X (1) + X (2);
    Y (2) := X (1) * X (2);

end Vector_Manipulation;

I was told that calling this procedure with the same in and out
parameters as follows would give unexpected results:

    Vector_Manipulation (X => Test_Vector, Y => Text_Vector);

The reason is that both parameters may be passed by reference, so
modifying Y in the first line modifies X as well, and the value of X
used in the second line is not what was desired.  I was surprised by
this; it sounds like a typical problem in C or C++, not something likely
to happen in Ada.  So I did some research and concluded from section 6.2
of the Ada 95 RM that it is entirely up to the compiler to determine how
arrays of simple elementary values are passed.  I recall also reading
that ACT got itself in a bit of trouble by changing the behavior of GNAT
to use call by reference for all records.

My reasons for posting this are twofold.  The first reason is to verify
that this is correct, and see if anyone else has been bitten by this
particular undefined behavior.  The second reason is to express some
disappointment.  I think in this area, C++ has a big advantage in that
it forces the programmer to be aware of whether a function uses
call-by-reference or call-by-value semantics.  This is somewhat of a pain
at times, but at least it is obvious and clearly defined.  I had no idea
that Ada's behavior was undefined and wrongly assumed that in parameters
were always passed by value.

I realize that the solution to this problem in this case is to simply
make one assignment using an array aggregate, or change the procedure to
a function.  I have recommended this course of action.  I still think
that this type of error should not occur.  In many situations, the
solution may not be so simple.  Ada's behavior should not be so
"implementation dependent".

- Chris
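
The two fixes Chris mentions (a single aggregate assignment, or a
function) can be sketched as follows.  This is only an illustration under
stated assumptions: the names Aggregate_Fix, Manipulated and Other_Vector
and the test values (2.0, 3.0) are made up for the example and do not
come from the original code.  The sketch relies on the language rule that
the right-hand side of an assignment is evaluated before the target is
updated.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Aggregate_Fix is

   type Simple_Vector is array (1 .. 2) of Float;

   --  Procedure form: one aggregate assignment.  Both components of X
   --  are read while the aggregate is evaluated, before anything is
   --  stored into Y, so aliased actuals no longer matter.
   procedure Vector_Manipulation (X : in Simple_Vector;
                                  Y : out Simple_Vector) is
   begin
      Y := (1 => X (1) + X (2),
            2 => X (1) * X (2));
   end Vector_Manipulation;

   --  Function form (hypothetical name): the result is a new value, so
   --  there is no out parameter that could alias X.
   function Manipulated (X : Simple_Vector) return Simple_Vector is
   begin
      return (1 => X (1) + X (2),
              2 => X (1) * X (2));
   end Manipulated;

   Test_Vector  : Simple_Vector := (2.0, 3.0);
   Other_Vector : Simple_Vector := (2.0, 3.0);

begin
   --  Same variable for both actual parameters, as in the post.
   Vector_Manipulation (X => Test_Vector, Y => Test_Vector);
   Other_Vector := Manipulated (Other_Vector);

   Put_Line (Float'Image (Test_Vector (1)) & Float'Image (Test_Vector (2)));
   Put_Line (Float'Image (Other_Vector (1)) & Float'Image (Other_Vector (2)));
end Aggregate_Fix;
```

With the sample values, both vectors should end up holding the sum 5.0
and the product 6.0, whichever parameter passing mechanism the compiler
picks.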



Wed, 06 Jan 1999 03:00:00 GMT  
 Call by reference vs. call by value

: I came across a problem with a procedure that resembled the following:

: type Simple_Vector is array (1 .. 2) of Float;

: procedure Vector_Manipulation (X : in Simple_Vector;
:                                Y : out Simple_Vector) is
: begin

:     Y (1) := X (1) + X (2);
:     Y (2) := X (1) * X (2);

: end Vector_Manipulation;

: I was told that calling this procedure with the same in and out
: parameters as follows would give unexpected results:

:     Vector_Manipulation (X => Test_Vector, Y => Text_Vector);
                                                    ^
                                                  Test_Vector

I think you have a typo and meant that both actual parameters were the
same variable.  I just recently answered this exact question for a
colleague.  Here is a copy of my reply...

I don't know if this is it, but check out LRM 6.2...

paragraph (1) - "...The value of a variable is said to be updated when an
assignment is performed to the variable, and also (indirectly) when the
variable is used as actual parameter of a subprogram call...that updates
its value"

paragraph (7) - "...an implementation may achieve these effects by
reference, that is, by arranging that every use of a formal parameter (to
read or to update its value) be treated as a use of the associated actual
parameter, throughout the execution of the subprogram call."

Since the out parameter is the same variable as the in parameter, if the
compiler writers chose to use "reference" for both the in and out parameter
then the value of the in parameter would change every time the out parameter
changed.  Sounds hokie but they could probably get away with it with the
language referenced above.

--
Not necessarily the opinion of the company...
--
---------------------------------------------------------------------------
         James A. Krzyzanowski - Senior Software Engineer - AFATDS
Magnavox Electronic Systems Company * Fort Wayne, IN 46808 * (219) 429-6446

     "I'd rather be right than politically correct !!!" - Rush is Right
---------------------------------------------------------------------------
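
The effect James describes cannot be forced portably for this array type,
but the two possible outcomes can be simulated deterministically.  The
sketch below is an assumption-laden illustration, not compiler output:
As_If_By_Reference and As_If_By_Copy are hypothetical helpers, each
taking one in out parameter that stands in for the shared actual, and the
test values (2.0, 3.0) are invented.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Aliasing_Demo is

   type Simple_Vector is array (1 .. 2) of Float;

   --  Models "both parameters by reference, same actual": X and Y are
   --  one and the same object, so a single parameter plays both roles.
   procedure As_If_By_Reference (V : in out Simple_Vector) is
   begin
      V (1) := V (1) + V (2);   --  "Y (1) := X (1) + X (2)"
      V (2) := V (1) * V (2);   --  "X (1)" has already been overwritten
   end As_If_By_Reference;

   --  Models "by copy": the body works on a private copy of X, and the
   --  result is copied back to the actual at the end.
   procedure As_If_By_Copy (V : in out Simple_Vector) is
      X : constant Simple_Vector := V;   --  copy-in
      Y : Simple_Vector;
   begin
      Y (1) := X (1) + X (2);
      Y (2) := X (1) * X (2);
      V := Y;                            --  copy-out
   end As_If_By_Copy;

   By_Ref  : Simple_Vector := (2.0, 3.0);
   By_Copy : Simple_Vector := (2.0, 3.0);

begin
   As_If_By_Reference (By_Ref);   --  leaves (5.0, 15.0)
   As_If_By_Copy (By_Copy);       --  leaves (5.0, 6.0)

   Put_Line (Float'Image (By_Ref (1)) & Float'Image (By_Ref (2)));
   Put_Line (Float'Image (By_Copy (1)) & Float'Image (By_Copy (2)));
end Aliasing_Demo;
```

The second component differs (15.0 versus 6.0) because the reference-style
version multiplies by the sum it has just stored.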



Wed, 06 Jan 1999 03:00:00 GMT  
 Call by reference vs. call by value

James said

"Since the out parameter is the same variable as the in parameter, if the
compiler writers chose to use "reference" for both the in and out parameter
then the value of the in parameter would change every time the out parameter
changed.  Sounds hokie but they could probably get away with it with the
language referenced above."

Well passing arrays by reference may seem hokie (isn't that word spelled
hoky -- don't really know, it is not in my dictionary :-), but it is
quite usual, and is certainly what you want for really large arrays.
Most Ada programs will pass all arrays by reference, as will most Fortran
programs. To me, a compiler that passed all arrays by copy would be
*really* broken -- at least we didn't go that far in GNAT!



Wed, 06 Jan 1999 03:00:00 GMT  
 Call by reference vs. call by value

Chris said

   This is somewhat of a pain
   at times, but at least it is obvious and clearly defined.  I had no idea
   that Ada's behavior was undefined and wrongly assumed that in parameters
   were always passed by value.

Well you can always get bitten by not knowing a language rule that is
clear, and you can always be surprised if you wrongly assume something!
Ada's behavior is not undefined, it is implementation defined (there is
an important difference!)

It is quite odd to assume that all IN parameters are passed by value.  Do
you really expect million-element arrays to be copied when they are passed
to procedures?  I think not.

   I realize that the solution to this problem in this case is to simply
   make one assignment using an array aggregate, or change the procedure to
   a function.  I have recommended this course of action.  I still think
   that this type of error should not occur.  In many situations, the
   solution may not be so simple.  Ada's behavior should not be so
   "implementation dependent".

Here is a case where the language semantics is non-deterministic for
efficiency reasons. To force call by value in all cases would be clearly
unacceptable for large arrays. On the other hand, to force call by
reference is also inefficient on some machines. Yes, the semantics would
be cleaner if there were no implementation dependence here, but efficiency
issues certainly cannot be ignored.

The question is, does someone who knows Ada get into trouble with this rule?
Clearly if you don't know a rule in the language you can always get into
trouble, you can't expect Ada to automatically correct your misconceptions
about the language. This is certainly a lesson that you need to know a
language well to avoid trouble, but it is not convincing that there is a
problem here.

Actually, I think most people who make a mistake here make a mistake the
other way round, they assume that arrays (and even records) have to be
passed by reference. The same thing happens in Fortran (which shares Ada's
approach of leaving it up to the implementation to decide whether to pass
by copy or reference) -- people often assume Fortran requires by-reference.

You actually remember the GNAT situation wrong, and if you remembered it
right, it would have reminded you of the issue. We changed record passing
from being by reference for large records to being always by value, as you
would like to see. This turned out to be completely unacceptable in some
situations because of severe degradation of performance -- so this is a
nice object lesson that your recommendation is infeasible!

P.S. We also got quite a few complaints of regressions that turned out to be
people assuming that records were passed by reference, and their programs
depended on reference semantics!

P.P.S. In C++ you can't pass arrays anyway so the issue does not arise, so it
is wrong to say that the mechanism for this is defined in C++! If you could
pass arrays in C or C++, then the same problem would arise.



Wed, 06 Jan 1999 03:00:00 GMT  
 Call by reference vs. call by value



Quote:
>My reasons for posting this are twofold.  The first reason is to verify
>that this is correct, ...

It is.

Quote:
>... and see if anyone else has been bitten by this
>particular undefined behavior.

Yes, they have.

Quote:
>...  The second reason is to express some
>disappointment.

I agree.  Making this implementation dependent is bad language design,
IMHO.

However, your suggested solution -- to always pass 'in' parameters by
reference -- is unacceptable from an efficiency point of view.  It means
you would have to copy huge amounts of data around when passing arrays
from one procedure to another to another to another.

It seems to me that a language can be designed so as to avoid the
problem you are complaining about, while still retaining the desirable
efficiency.  But it wouldn't be Ada.

- Bob



Thu, 07 Jan 1999 03:00:00 GMT  
 Call by reference vs. call by value

Quote:

>P.S. we also got quite a few complaints of regressions that turned out to be
>people assuming that records were passed by reference, and their programs
>depended on reference semantics!

But this is proof that the implementation-definedness is indeed a
bug-causing problem.  I'm convinced there must be a better solution that
does not introduce huge amounts of copying.

- Bob

P.S. In thinking about this issue, one point is that for remote
procedure calls, you pretty much *have* to pass by copy, even for very
large things.  So a rule saying "all arrays are passed by reference" or
"all arrays that are not statically known to be small are passed by
reference (for some definition of small)" would break the Distributed
Systems Annex.



Thu, 07 Jan 1999 03:00:00 GMT  
 Call by reference vs. call by value

Robert Duff said

"However, your suggested solution -- to always pass 'in' parameters by
reference -- is unacceptable from an efficiency point of view.  It means
you would have to copy huge amounts of data around when passing arrays
from one procedure to another to another to another."

OOPS, typo! He meant "always pass 'in' parameters by copy".

Incidentally, Algol-68 requires that the equivalent of in parameters
always be passed by copy (it really cannot be otherwise in the Algol-68
semantic framework). However the only really successful implementation
of Algol-68 (Ian Currie's Algol-68R), completely ignored this and passed
all arrays by reference anyway -- he never got ONE complaint from a user
(although see the 1968 Munich proceedings to see him being denounced by
the Algol-68 cognoscenti -- Barry Mailloux accused him of heresy, and
reminded him that there was only one source of truth -- the Algol 68
report :-)



Thu, 07 Jan 1999 03:00:00 GMT  
 Call by reference vs. call by value

Bob Duff said

"P.S. In thinking about this issue, one point is that for remote
procedure calls, you pretty much *have* to pass by copy, even for very
large things.  So a rule saying "all arrays are passed by reference" or
"all arrays that are not statically known to be small are passed by
reference (for some definition of small)" would break the Distributed
Systems Annex."

If you want to think more about this issue, worry about packed arrays
too. Requiring call by reference would mean that ALL packed arrays have
to be passed using general bit pointers, which would be unacceptably
inefficient in the normal case where slices are not passed. Same
thing for records, all records would have to be passed by bit address,
just in case the record you are passing is a field in a record with a
record rep clause.

This is not an easy problem to solve -- if there were a simple solution,
I think it might have been found by now, but there doesn't seem to be one.



Thu, 07 Jan 1999 03:00:00 GMT  
 Call by reference vs. call by value

Quote:

> Chris said

>    This is somewhat of a pain
>    at times, but at least it is obvious and clearly defined.  I had no idea
>    that Ada's behavior was undefined and wrongly assumed that in parameters
>    were always passed by value.

> Well you can always get bitten by not knowing a language rule that is
> clear, and you can always be surprised if you wrongly assume something!
> Ada's behavior is not undefined, it is implementation defined (there is
> an important difference!)

> It is quite odd to assume that all IN parameters are passed by value.  Do
> you really expect million-element arrays to be copied when they are passed
> to procedures?  I think not.

Of course I have to admit ignorance here.  I never really thought
about it in Ada.  For some reason I assumed that the compiler would in
some way ensure that this type of error would not occur, either by
passing the array by value, or reordering the assignments so the
problem would not occur.  I realize I expected too much.

Quote:
> Here is a case where the language semantics is non-deterministic for
> efficiency reasons. To force call by value in all cases would be clearly
> unacceptable for large arrays. On the other hand, to force call by
> reference is also inefficient on some machines. Yes, the semantics would
> be cleaner if there were no implementation dependence here, but efficiency
> issues certainly cannot be ignored.

Well, I guess I never conceived a machine might exist where call-by-
value for anything bigger than a word is more efficient than call by
reference, so my first instinct would be to dictate that
call-by-reference be used in all cases.  I believed the opposite was
true, however, because I wrongly thought that the designers of Ada
valued safety over performance, which is not really true.

Quote:
> The question is, does someone who knows Ada get into trouble with
> this rule?  Clearly if you don't know a rule in the language you can
> always get into trouble, you can't expect Ada to automatically
> correct your misconceptions about the language. This is certainly a
> lesson that you need to know a language well to avoid trouble, but
> it is not convincing that there is a problem here.

Of course one should know a language well, but in this case the
language "wasn't very friendly".  My point about C/C++ is that the
programmer is forced to think about this issue immediately.  All
arrays are passed by reference, and all structs are passed by value -
no ambiguity about it.  You have to be aware of this to accomplish
anything in the language.  I learned of this problem within a month of
programming in C, but I've been using Ada for a year, and this is the
first time I've had this issue pop up.  Maybe this reflects poorly on
me, but what can I say!

Quote:
> Actually, I think most people who make a mistake here make a mistake the
> other way round, they assume that arrays (and even records) have to be
> passed by reference.

Sure, don't doubt it.  It makes perfect sense.

Quote:
> You actually remember the GNAT situation wrong, and if you remembered it
> right, it would have reminded you of the issue. We changed record passing
> from being by reference for large records to being always by value, as you
> would like to see. This turned out to be completely unacceptable in some
> situations because of severe degradation of performance -- so this is a
> nice object lesson that your recommendation is infeasible!

For the record, my recommendation is simply to clarify the issue.
It would be nice if there were no implementation defined behavior, in
a perfect computer world, but I'm not that naive.

One possibility that hasn't been explored is this: define the rule as
call-by-value *by default*, and give me a pragma or something to get
the more efficient call-by-reference behavior.  Then I would be forced
to be aware of the pitfalls.  It's not the greatest solution, but at
least it gives the programmer more control.

Quote:
> P.P.S. In C++ you can't pass arrays anyway so the issue does not arise, so it
> is wrong to say that the mechanism for this is defined in C++! If you could
> pass arrays in C or C++, then the same problem would arise.

That's exactly my point, it is defined by not being possible!

--
-------------------------------------------------------------------------------
Chris Felaco                               Phone: x4631 (Raynet 444, Local 842)

-------------------------------------------------------------------------------



Fri, 08 Jan 1999 03:00:00 GMT  
 Call by reference vs. call by value

Quote:

> P.P.S. In C++ you can't pass arrays anyway so the issue does not arise, so it
> is wrong to say that the mechanism for this is defined in C++! If you could
> pass arrays in C or C++, then the same problem would arise.

This is really a C or C++ issue, but I still am not sure I understand.  I seem
to remember passing arrays as function parameters in both C and C++.  Since an
array reference is always implemented with a pointer in C and C++, all such
passing is pass by reference.  Why then do you say you can't pass arrays anyway?


Fri, 08 Jan 1999 03:00:00 GMT  
 Call by reference vs. call by value

Chris said

"Well, I guess I never conceived a machine might exist where call-by-
value for anything bigger than a word is more efficient than call by
reference, so my first instinct would be to dictate that
call-by-reference be used in all cases.  I believed the opposite was
true, however, because I wrongly thought that the designers of Ada
valued safety over performance, which is not really true."

Well as I hope is clear from my earlier post, your conception was not
broad enough :-)

Cases where values larger than one word are more efficiently passed by
copy:

  - when bit alignment is involved (packed case)
  - several words can be passed in registers (according to many ABIs)
  - caller and callee are in different address spaces

and there are undoubtedly more.

Note that the idea of allowing implementation freedom here is not new.
Fortran-66 is quite explicit in leaving it up to the implementation
whether parameters are passed by value or reference.



Fri, 08 Jan 1999 03:00:00 GMT  
 Call by reference vs. call by value

Karl asks

"This is really a C or C++ issue, but I still am not sure I understand.  I seem
to remember passing arrays as function parameters in both C and C++.  Since an
array reference is always implemented with a pointer in C and C++, all such
passing is pass by reference.  Why then do you say you can't pass arrays anyway?"

You can pass a pointer *by value* in C, C++, or Ada, but this is NOT a
call by reference. In C you can pass a *pointer* to an array, but not an
array itself. In Ada you can pass either a pointer to an array (which
obviously has reference semantics at the array abstraction level, even
though it is a call by value), OR you can pass an array. It is when an
array is passed, something you cannot do in C, that the problem arises.

Why not just go to the C model of passing pointers around? you might ask!
The trouble with this approach is it requires far more aliasing than is
comfortable in a language of this class.
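
The "pointer passed by value" case can be sketched in Ada with an access
type.  This is an illustrative sketch only: Vector_Access, Via_Pointers
and the test values are hypothetical names and data chosen for the
example.  Because access values are elementary, they are passed by copy,
yet both copies designate the same array, so the aliasing is explicit and
the outcome does not depend on the compiler.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Pointer_Demo is

   type Simple_Vector is array (1 .. 2) of Float;
   type Vector_Access is access all Simple_Vector;

   --  X and Y are access values passed by copy, but they may designate
   --  the same array object.
   procedure Via_Pointers (X : in Vector_Access; Y : in Vector_Access) is
   begin
      Y (1) := X (1) + X (2);   --  implicit dereference of Y and X
      Y (2) := X (1) * X (2);   --  sees the new X (1) when X = Y
   end Via_Pointers;

   Test_Vector : aliased Simple_Vector := (2.0, 3.0);

begin
   --  Same object designated by both arguments.
   Via_Pointers (Test_Vector'Access, Test_Vector'Access);

   Put_Line (Float'Image (Test_Vector (1)) & Float'Image (Test_Vector (2)));
end Pointer_Demo;
```

Here the array always ends up as (5.0, 15.0): the reference semantics are
visible in the source through the access type and the aliased declaration,
which is the kind of explicit aliasing the C model requires everywhere.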



Fri, 08 Jan 1999 03:00:00 GMT  
 Call by reference vs. call by value

Quote:

>If you want to think more about this issue, worry about packed arrays
>too. Requiring call by reference would mean that ALL packed arrays have
>to be passed using general bit pointers, which would be unacceptably
>inefficient in the normal case where slices are not passed. Same
>thing for records, all records would have to be passed by bit address,
>just in case the record you are passing is a field in a record with a
>record rep clause.

Pascal solved this problem in a simple (but rather ugly) way: It's
illegal to pass a component of a packed array or record as a parameter.
The *programmer* must declare a temp variable, and make a copy.  And
Pascal doesn't have slices.

I suspect part of the reason for Ada's rules (which of course predate
the DS annex by a decade) is to avoid the ugliness of the Pascal rule.
Unfortunately, the Ada rule introduces an implementation dependence that
I find uncomfortable.

- Bob



Fri, 08 Jan 1999 03:00:00 GMT  
 Call by reference vs. call by value


Quote:

>Of course I have to admit ignorance here.  I never really thought
>about it in Ada.  For some reason I assumed that the compiler would in
>some way ensure that this type of error would not occur, either by
>passing the array by value, or reordering the assignments so the
>problem would not occur.  I realize I expected too much.

You're not alone -- I've heard many professional Ada programmers who are
surprised at the rule, or are surprised that there's a difference
between by-copy and by-reference.  That makes me nervous about this
particular implementation dependence.

Quote:
>Well, I guess I never conceived a machine might exist where call-by-
>value for anything bigger than a word is more efficient than call by
>reference,

Well, a word, or maybe 2 or 3 words -- it depends, for example, on how
many machine registers are lying around.  But that's basically right,
call-by-value is more efficient for small-ish things.  But machines have
different word sizes.  And it's not clear what fits in a word, even when
you know the word size.  For example, suppose I have a record containing
4 Character components (not packed).  Some compilers will allocate 4
words, some will allocate 4 bytes -- it depends on the machine, and to
some extent on the whim of the compiler writer.

Quote:
>... so my first instinct would be to dictate that
>call-by-reference be used in all cases.

All cases?  Surely you don't mean *all*.  Surely you want scalars passed
by copy (in a register, normally).  Maybe you meant "all array/record
cases"?  Well, that solves the implementation dependence problem, but
it's less efficient.  A packed array of 32 booleans fits in a word, and
I would like it to be passed by copy, in a register.  Also, Robert Dewar
pointed out elsewhere the problem with passing components of packed
objects (record-within-record, for example) by reference, when they're
not byte-aligned.  And slices.

Quote:
>...  I believed the opposite was
>true, however, because I wrongly thought that the designers of Ada
>valued safety over performance, which is not really true.

Right, the designers made trade-offs, and this is one that went in favor
of efficiency over safety.  (As I said elsewhere, I suspect it's
possible to get both, but not simple.)

Quote:
>> The question is, does someone who knows Ada get into trouble with
>> this rule?  Clearly if you don't know a rule in the language you can
>> always get into trouble, you can't expect Ada to automatically
>> correct your misconceptions about the language. This is certainly a
>> lesson that you need to know a language well to avoid trouble, but
>> it is not convincing that there is a problem here.

This is the standard argument that gets trotted out in favor of all
manner of C pitfalls -- "the programmer should know the language".
True, but unfortunately, many programmers do not know their language
that deeply, and even ones who do make mistakes.  I don't buy it in the
C case, and I don't buy it in this particular Ada case.

C has many things that are officially undefined, but programmers don't
know that, or don't notice it in particular cases, and write
non-portable and/or buggy code.  Ada has just a few undefined things, so
the problem is less severe, but it's still a problem.

Quote:
>Of course one should know a language well, but in this case the
>language "wasn't very friendly".

I agree.  My answer is, "Yes it's unfriendly/unsafe, but it's an
efficiency trade-off, and efficiency is important, too."

Quote:
>For the record, my recommendation is simply to clarify the issue.

Well, the RM is pretty clear on this point, I think.

Quote:
>It would be nice if there were no implementation defined behavior, in
>a perfect computer world, but I'm not that naive.

I agree.  It should be a language design goal to eliminate
implementation dependence.  However, it's not always feasible to do so.
For example, how could you eliminate the implementation dependence
inherent in delay statements?

Quote:
>One possibility that hasn't been explored is this: define the rule as
>call-by-value *by default*, and give me a pragma or something to get
>the more efficient call-by-reference behavior.  Then I would be forced
>to be aware of the pitfalls.  It's not the greatest solution, but at
>least it gives the programmer more control.

Yes, that's one reasonable solution.  It shouldn't be a pragma, but
should use some sensible syntax.  Some Pascal compilers do something
like this -- there are by-value parameters, and CONST parameters (which
are by-reference, but still in-mode), and VAR parameters (which are
by-ref/in-out).  Maybe CONST parameters are part of standard Pascal now;
I don't know.  Anyway, this solution has some problems (passing of
components of packed things, forcing the programmer to specify something
that *usually* doesn't matter, ...)

The SPARK subset of Ada solves the problem by making sure you can't
write code that can tell the difference between by-copy and by-ref.  In
particular, exception handling is not allowed, and aliasing of
parameters is not allowed.  This leads to a pretty restricted version of
the language, though.

- Bob



Fri, 08 Jan 1999 03:00:00 GMT  
 Call by reference vs. call by value

This is another Ada feature well covered by the SPARK subset.  The rules
of SPARK (which are checked by the SPARK Examiner) prohibit all cases
of aliasing where program meaning might be affected by the parameter
passing mechanism used. A SPARK program has copy-in, copy-out semantics
regardless of the compiler used to compile it.

Peter



Sat, 09 Jan 1999 03:00:00 GMT  
 