Big number 



Quote:
><!doctype html public "-//w3c//dtd html 4.0 transitional//en">
><html>
>I have a source file like this:
><p>123456789876543212345
><br>23456789876543245667
><br>...
><p>The rows is about 70000.&nbsp; I have to sum up these numbers.&nbsp;
>However, i note that only 6 significant digits r shown in the output.&nbsp;
>Any way to minimize rounding error in my gawk script?

Please don't post using HTML.

 To answer your question, break each number up into separate
 fields representing the ones, thousands, millions, billions, etc.

 Then add up the ones, the thousands, and so on, to get a
 sum_of_the_ones, a sum_of_the_thousands, etc., and combine these
 (carrying any overflow into the next field) to get the printed
 output.
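
A minimal sketch of this approach in awk, wrapped in a shell function. The function name sum_chunks, the 6-digit field width, and the array names are choices made for this illustration, not part of the suggestion above. It assumes every input line holds one non-negative integer written as plain digits; other lines are skipped. With 6-digit fields and about 70000 rows, each per-field sum stays far below 2^53, so ordinary awk arithmetic remains exact.

```shell
# Hypothetical helper: sum arbitrarily long non-negative integers, one per line.
sum_chunks() {
    awk '
    NF && $1 ~ /^[0-9]+$/ {
        s = $1
        # Peel 6-digit groups off the right-hand end; part[1] is the lowest.
        for (i = 1; length(s) > 0; i++) {
            n = length(s)
            if (n > 6) { part[i] += substr(s, n - 5); s = substr(s, 1, n - 6) }
            else       { part[i] += s; s = "" }
            if (i > max) max = i
        }
    }
    END {
        # Propagate carries from the lowest group upwards.
        carry = 0
        top = 0
        for (i = 1; i <= max || carry > 0; i++) {
            t = part[i] + carry
            carry = int(t / 1000000)
            digit[i] = t - carry * 1000000
            top = i
        }
        # Drop all-zero leading groups, then print with zero padding.
        while (top > 1 && digit[top] == 0) top--
        if (top == 0) { print 0; exit }
        out = sprintf("%d", digit[top])
        for (i = top - 1; i >= 1; i--) out = out sprintf("%06d", digit[i])
        print out
    }' "$@"
}
```

It would be run as: sum_chunks thefilename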

 Chuck Demas
 Needham, Mass.

--
  Eat Healthy    |   _ _   | Nothing would be done at all,

  Die Anyway     |    v    | That no one could find fault with it.



Tue, 06 Nov 2001 03:00:00 GMT  


Quote:
> <unhtmled>
> I have a source file like this:
> 123456789876543212345
>  23456789876543245667
> ...
> The rows is about 70000. I have to sum up these numbers. However,
> i note that only 6 significant digits r shown in the output. Any
> way to minimize rounding error in my gawk script?

...

The first number above has 21 decimal digits. That's already beyond the
precision of IEEE double precision floating point (the typical awk
numeric representation), which holds only about 15 to 17 significant
decimal digits. If you need an exact result, check whether your system
has an arbitrary precision calculator such as bc, found on most unix
systems. If it does, use awk only to convert your file into a valid
(though lengthy) series of expressions for that calculator. In other
words, use an awk script to transform your source data into a
calculator script.

For bc,

awk 'NR==1{print "a=" $0;next}
{print "a=a+" $0}
# quit is printed on its own line so that bc prints a before it exits
END{print "a"; print "quit"}' source > bcscript

then

bc bcscript
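
The precision limit described above can be checked directly. This is only a small sketch: the function name lost_precision is made up for the example, and it assumes an awk that stores numbers as IEEE doubles, which is the usual case.

```shell
# Hypothetical check: is adding 1 to the first number from the question lost?
lost_precision() {
    awk 'BEGIN {
        x = 123456789876543212345
        # Adjacent doubles near 1.2e20 are 16384 apart, so x + 1 rounds back to x.
        if (x + 1 == x) print "x + 1 == x"
        else            print "x + 1 != x"
    }'
}
```

Calling lost_precision on such an awk reports that the two values compare equal, which is why the sum has to be handed to a tool that works on the digits themselves.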




Tue, 06 Nov 2001 03:00:00 GMT  


% 123456789876543212345
% 23456789876543245667
% ...
%
% The rows is about 70000.  I have to sum up these numbers.
% However, i note that only 6 significant digits r shown in the output.
% Any way to minimize rounding error in my gawk script?

awk is no longer your friend when you start using numbers this big.
You could use rexx to do this (Regina rexx can be retrieved through the
web page at http://www.lightlink.com/~hessling). The script would be

 numeric digits 40
 file = 'thefilename'
 sum = 0
 do until lines(file) = 0
   sum = sum + linein(file)
   end
 say sum

Alternatively, you could use awk to preprocess input for dc or bc. On a
traditional unix system, bc is a pre-processor for dc, so dc would be
faster:

 awk '{ print | "dc" }
      NR > 1 { print "+" | "dc" }
      END { print "p" | "dc"
            close("dc")
      }' thefilename

But bc is more comprehensible, and on systems with GNU bc it isn't a
pre-processor for dc. That's not to say that dc isn't still faster
(I don't know about that), but bc could be as fast.

 awk '{ print "a+=" $0 | "bc" }
      END { print "a" | "bc"
            close("bc")
      }' thefilename

--

Patrick TJ McPhee
East York  Canada



Wed, 07 Nov 2001 03:00:00 GMT  

[I had written]

%  awk '{ print | "dc" }
%       NR > 1 { print "+" | "dc" }
%       END { print "p" | "dc"
%             close("dc")
%       }' thefilename

But now I've sobered up and decided that this is simpler and probably
measurably faster if you have enough data:

  awk 'BEGIN { print "0" | "dc" }
       { print $0 "+" | "dc" }
       END { print "p" | "dc"
             close("dc")
       }' thefilename

What this does, for anyone not familiar with dc, is feed dc a script
like this:
 0
 123456789876543212345+
 23456789876543245667+
 p

dc is a stack-based calculator. This input pushes 0 on the stack, then the
first number, then adds them together and leaves the result on the
stack, and so on until the end, when p prints the number on top of the
stack.

My first version does one fewer addition, at the cost of repeatedly
testing whether we're on the first line.

I was just curious about performance, so I ran some tests on all four
solutions I've put forward (adding up 70,002 numbers ranging from
6968382597342 to 768602705255855100000).  These are the averages of
real, user, and system time reported by time for three runs of each
solution:
                  real   user   sys
 rexx (regina):  6.50s  6.34s 0.07s
 nawk/bc:       17.73s 16.56s 0.95s
 mawk/bc:       12.62s 12.28s 0.13s
 nawk/dc(1):    42.30s 40.35s 1.41s
 mawk/dc(1):    35.90s 35.50s 0.11s
 nawk/dc(2):    41.81s 40.01s 1.36s
 mawk/dc(2):    35.60s 35.34s 0.08s

I used GNU bc here because it allows variable names longer than one
character. Since the system bc is a preprocessor for dc, I have to
assume GNU bc is also a lot faster than the system bc would be.
--

Patrick TJ McPhee
East York  Canada



Wed, 07 Nov 2001 03:00:00 GMT  
 