awk array elements 


Quote:




>> > Is there an easy way to print the total number of elements in an awk
>> > array?

>> Other than counting the entries,  in awk/nawk/gawk there is no other
>> way of sizing the array.  Are you having some problem counting?

>Well, gawk's asort function returns the number of elements, so it
>is possible the OP might be able to get a count for free as a side
>effect of something else he is doing.

>One question that probably belongs in comp.lang.awk is why
>does the awk interpreter not keep a count of the number of
>elements in each (associative) array that could be made available
>via a function call?

Like TAWK does?  Is that what you are getting at?


Tue, 26 Apr 2005 08:01:34 GMT  

Quote:


...
>>One question that probably belongs in comp.lang.awk is why
>>does the awk interpreter not keep a count of the number of
>>elements in each (associative) array that could be made available
>>via a function call?

>Like TAWK does?  Is that what you are getting at?

It's purely for convenience/laziness. There are two ways to assign entries
in an array: using the split function and assigning specific entries. The
former returns the number of entries created. The latter doesn't, but script
code could be added to count the array entries after each assignment, e.g.,
using the index "" for the array count,

a[""] += !(x in a)
a[x] = y
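
A minimal sketch of the counter idiom above, wrapped in a shell call so it can be run as-is. The sample keys and the names `keys`, `k` and `out` are made up for illustration. It assumes "" is never used as a real index, and any later `for (k in a)` loop would have to skip that entry.

```shell
out=$(awk 'BEGIN {
    # Sample keys; "x" appears twice and must only be counted once.
    n = split("x y x z", keys, " ")
    for (i = 1; i <= n; i++) {
        k = keys[i]
        a[""] += !(k in a)   # add 1 only when k is a new index
        a[k] = i
    }
    print a[""]              # number of distinct keys stored in a
}')
echo "$out"
```

The test has to come before the assignment, because `a[k] = i` is what makes `k in a` true.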



Tue, 26 Apr 2005 08:27:15 GMT  

Quote:



>...
>>>One question that probably belongs in comp.lang.awk is why
>>>does the awk interpreter not keep a count of the number of
>>>elements in each (associative) array that could be made available
>>>via a function call?

>>Like TAWK does?  Is that what you are getting at?

>It's purely for convenience/laziness. There are two ways to assign entries
>in an array: using the split function and assigning specific entries. The
>former returns the number of entries created. The latter doesn't, but script
>code could be added to count the array entries after each assignment, e.g.,
>using the index "" for the array count,

>a[""] += !(x in a)
>a[x] = y

Isn't everything in AWK "purely for convenience/laziness"?

I.e., it is all just a shorthand for the equivalent C program, which is, in
turn just shorthand for the equivalent machine code?

And, so your point is?



Tue, 26 Apr 2005 09:12:46 GMT  
Which, in turn, is just a convenience over programming a Turing
Machine.
Quote:

>Isn't everything in AWK "purely for convenience/laziness"?

>I.e., it is all just a shorthand for the equivalent C program, which is, in
>turn just shorthand for the equivalent machine code?

>And, so your point is?



Tue, 26 Apr 2005 09:30:36 GMT  

Quote:


> ...
> >>One question that probably belongs in comp.lang.awk is why
> >>does the awk interpreter not keep a count of the number of
> >>elements in each (associative) array that could be made available
> >>via a function call?

> >Like TAWK does?  Is that what you are getting at?

> It's purely for convenience/laziness. There are two ways to assign entries
> in an array: using the split function and assigning specific entries. The
> former returns the number of entries created. The latter doesn't, but script
> code could be added to count the array entries after each assignment, e.g.,
> using the index "" for the array count,

> a[""] += !(x in a)
> a[x] = y

I was thinking, "Hmm, AWK really could do that much more efficiently
internally, it's only an INC in some sort of traversal."

But then I thought, "Why would you want to know how many elements there are
in an array, anyway?"

If you want to traverse elements based on count, then that count must
correspond to something.  For instance, consider

    delete a; a[1] = x; a[3] = y
    for (i = 1; i <= acount(a); i++) doSomething(i)

You can't use i to reference into a, so what good is it?  This only works if
a is uniformly, incrementally, filled.  split() and asort() do this.  And in
user space, like Harlan's example, something like this would typically be
used:

    count=0; delete a
    :
    a[++count] = data

(or 'a[++a[0]] = data' if you want to package the count in with a.)

But then, you already have 'count'.
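
A small runnable sketch of that fill-and-count pattern, assuming one item per input line. The sample input and the name `out` are invented for the example.

```shell
out=$(printf 'e\nt\na\n' | awk '
    { a[++count] = $0 }      # count doubles as the next index
    END {
        # Indices run 1..count with no gaps, so a numeric loop is safe.
        for (i = 1; i <= count; i++) printf "%d:%s ", i, a[i]
        print count
    }')
echo "$out"
```

Because the array is filled at 1, 2, 3, ..., the count and the highest index are the same number, which is the only situation where looping up to a count makes sense.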

Suppose, though, you have incremental data coming in:

    1 e; 2 t; 3 a; 4 o; 5 n; 6 i; 7 s; 8 r; 9 h; 10 l

    BEGIN {RS=";"} {a[$1]=$2}
    END {for (i=1; i<= acount(a); i++) print a[i] }

This could be of use...but why not just use:

    END {for (i=1; i<= $1; i++) print a[i] }

There is the case of wanting to pre-count data for random output (or at
least, output in indeterminate order).  Here I want a page header showing
total number of pages.

    END {
        pages = int(acount(a) / 50) + (acount(a) % 50 > 0)
        c = 0; p = 0
        for (i in a) {
            if (!(c++ % 50)) print "Page " ++p " of " pages
            print i
        }
    }

But I think this is rare enough that I wouldn't mess with the language.
(But even in this case, I think I would be counting things as I put them in
the array, like Harlan suggests, so I could use the count for other
worthwhile things during processing.)
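
A sketch of that alternative: count distinct keys as they go into the array, then compute the page total without any acount(). The page size variable `per`, the array `seen`, the sample input and `out` are assumptions for illustration only.

```shell
out=$(printf 'a\nb\nc\nb\na\n' | awk -v per=2 '
    !($1 in seen) { seen[$1]; n++ }   # n counts distinct keys only
    END {
        # Round up: full pages plus one more if anything is left over.
        pages = int(n / per) + (n % per > 0)
        print n, pages
    }')
echo "$out"
```

Here three distinct keys at two per page give two pages, and `n` is also available during processing, not just in END.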

Did I miss any other reasons one would need a count, but not yet have it
already at hand?

    - Dan



Tue, 26 Apr 2005 12:23:59 GMT  
...

Quote:
>And, so your point is?

Anything easily done in the awk script itself with awk code (such as keeping
track of the number of entries in an array) doesn't need to be built into the
language. If you want a language with overstuffed syntax, use Perl. At least
it's available on any platform anyone would want to run it on, unlike TAWK.


Tue, 26 Apr 2005 15:00:56 GMT  

Quote:


>...
>>And, so your point is?

>Anything easily done in the awk script itself with awk code (such as keeping
>track of the number of entries in an array) doesn't need to be built into the
>language.

You obviously missed my point.  If what you said was true, then nothing
should be built-in (since anything could be done in user-space, if needed).

Quote:
>If you want a language with overstuffed syntax, use Perl. At least it's
>available on any platform anyone would want to run it on, unlike TAWK.

Your envy is showing.


Tue, 26 Apr 2005 19:43:59 GMT  

...

Quote:
>I was thinking, "Hmm, AWK really could do that much more efficiently
>internally, it's only an INC in some sort of traversal."

A basic principle of real software is that it is better for the
implementors to implement commonly needed features than for users to do so.
You can google for the reasons (look for "Kenny McCormack" in
"comp.lang.awk") if it isn't obvious to you why this is so.

Or, to put it another way, I actually do understand why the GAWK
implementor(s) want to keep the feature set small, but can't understand why
us users should be expected to feel the same way.  Obviously, a good
example of "group-think".

Quote:
>But then I thought, "Why would you want to know how many elements there are
>in an array, anyway?"

Why would I ever need to know the arc tangent of y/x?  In all my years of
writing programs in dozens of languages, I've never needed this, but yet
that function is there in most of those languages.

A common use of the "length(A)" construct in TAWK is like this:

/somecondition/ { A[$1] = $0 }
END     {
        if (length(A)) # Did I get anything at all?
        ...
        }

Obviously, syntactic sugar, but, as I've said repeatedly, all of AWK is
syntactic sugar - since real men use Turing machines (to quote another
poster).

Another version of the above is:

END     {
        print "There are",length(A),"element" (length(A)==1 ? "" : "s"),
                "in the array"
        }

Quote:
>If you want to traverse elements based on count, then that count must
>correspond to something.  For instance, consider

>    delete a; a[1] = x; a[3] = y
>    for (i = 1; i <= acount(a); i++) doSomething(i)

>You can't use i to reference into a, so what good is it?  This only works if
>a is uniformly, incrementally, filled.  split() and asort() do this.

Nobody is doubting that you can get around the limitations of a primitive
implementation.  It is just more fun and satisfying to work in a
full-featured one.

Quote:
>Did I miss any other reasons one would need a count, but not yet have it
>already at hand?

See above.  By the way, have you ever used atan2()?


Tue, 26 Apr 2005 19:58:36 GMT  

Quote:



> ...
>>I was thinking, "Hmm, AWK really could do that much more efficiently
>>internally, it's only an INC in some sort of traversal."

> A basic principle of real software is that it is better for the
> implementors to implement commonly needed features than for users to do so.
> You can google for the reasons (look for "Kenny McCormack" in
> "comp.lang.awk") if it isn't obvious to you why this is so.

> Or, to put it another way, I actually do understand why the GAWK
> implementor(s) want to keep the feature set small, but can't understand why
> us users should be expected to feel the same way.  Obviously, a good
> example of "group-think".

>>But then I thought, "Why would you want to know how many elements there are
>>in an array, anyway?"

> Why would I ever need to know the arc tangent of y/x?  In all my years of
> writing programs in dozens of languages, I've never needed this, but yet
> that function is there in most of those languages.

> A common use of the "length(A)" construct in TAWK is like this:

/somecondition/ { A[$1] = $0 }

Is there some reason you can't

   /somecondition/ { A[$1] = $0;ct++ }
   END   {
         if (ct) # Did I get anything at all?
         ...
         }

Or you could:

   END { for (any in A) { any=1;break }
         if (any) ...

You still haven't made a case for the feature.

--
Dan Mercer

If responding by email, include the phrase 'from usenet'
in the subject line to avoid spam filtering.


Quote:
> END        {
>    if (length(A)) # Did I get anything at all?
>    ...
>    }

> Obviously, syntactic sugar, but, as I've said repeatedly, all of AWK is
> syntactic sugar - since real men use Turing machines (to quote another
> poster).

> Another version of the above is:

> END        {
>    print "There are",length(A),"element" (length(A)==1 ? "" : "s"),
>            "in the array"
>    }

>>If you want to traverse elements based on count, then that count must
>>correspond to something.  For instance, consider

>>    delete a; a[1] = x; a[3] = y
>>    for (i = 1; i <= acount(a); i++) doSomething(i)

>>You can't use i to reference into a, so what good is it?  This only works if
>>a is uniformly, incrementally, filled.  split() and asort() do this.

> Nobody is doubting that you can get around the limitations of a primitive
> implementation.  It is just more fun and satisfying to work in a
> full-featured one.

>>Did I miss any other reasons one would need a count, but not yet have it
>>already at hand?

> See above.  By the way, have you ever used atan2()?

Opinions expressed herein are my own and may not represent those of my employer.


Tue, 26 Apr 2005 21:57:19 GMT  

Quote:





>> ...
>>>I was thinking, "Hmm, AWK really could do that much more efficiently
>>>internally, it's only an INC in some sort of traversal."

>> A basic principle of real software is that it is better for the
>> implementors to implement commonly needed features than for users to do so.
>> You can google for the reasons (look for "Kenny McCormack" in
>> "comp.lang.awk") if it isn't obvious to you why this is so.

>> Or, to put it another way, I actually do understand why the GAWK
>> implementor(s) want to keep the feature set small, but can't understand why
>> us users should be expected to feel the same way.  Obviously, a good
>> example of "group-think".

>>>But then I thought, "Why would you want to know how many elements there are
>>>in an array, anyway?"

>> Why would I ever need to know the arc tangent of y/x?  In all my years of
>> writing programs in dozens of languages, I've never needed this, but yet
>> that function is there in most of those languages.

>> A common use of the "length(A)" construct in TAWK is like this:

>/somecondition/ { A[$1] = $0 }

>Is there some reason you can't

>   /somecondition/ { A[$1] = $0;ct++ }

This is ugly, but the most common way to do it.

Quote:
>   END   {
>         if (ct) # Did I get anything at all?
>         ...
>         }

>Or you could:

>   END { for (any in A) { any=1;break }

This is putrid.

Quote:
>         if (any) ...

>You still haven't made a case for the feature.

Are you as dense as you seem?

Of course there are workarounds.  My point is that it is better if it is a
builtin.  You don't have to agree with that, but it would be nice if you
could understand it.



Tue, 26 Apr 2005 22:41:54 GMT  

Quote:






>>> ...
>>>>I was thinking, "Hmm, AWK really could do that much more efficiently
>>>>internally, it's only an INC in some sort of traversal."

>>> A basic principle of real software is that it is better for the
>>> implementors to implement commonly needed features than for users to do so.
>>> You can google for the reasons (look for "Kenny McCormack" in
>>> "comp.lang.awk") if it isn't obvious to you why this is so.

>>> Or, to put it another way, I actually do understand why the GAWK
>>> implementor(s) want to keep the feature set small, but can't understand why
>>> us users should be expected to feel the same way.  Obviously, a good
>>> example of "group-think".

>>>>But then I thought, "Why would you want to know how many elements there are
>>>>in an array, anyway?"

>>> Why would I ever need to know the arc tangent of y/x?  In all my years of
>>> writing programs in dozens of languages, I've never needed this, but yet
>>> that function is there in most of those languages.

>>> A common use of the "length(A)" construct in TAWK is like this:

>>/somecondition/ { A[$1] = $0 }

>>Is there some reason you can't

>>   /somecondition/ { A[$1] = $0;ct++ }

> This is ugly, but the most common way to do it.

>>   END   {
>>         if (ct) # Did I get anything at all?
>>         ...
>>         }

>>Or you could:

>>   END { for (any in A) { any=1;break }

> This is putrid.

One would have to ask why,  since it is the least expensive method of
testing whether an array holds any data.

Quote:

>>         if (any) ...

>>You still haven't made a case for the feature.

> Are you as dense as you seem?

> Of course there are workarounds.  My point is that it is better if it is a
> builtin.  You don't have to agree with that, but it would be nice if you
> could understand it.

Well,  maybe if you would stop thinking of them as arrays and start thinking
of them as hashes you would understand why a count hasn't been implemented.

Even when you act as though they are arrays,  i.e.

   X[1] = y

You are really hashing the string "1" and storing the value of whatever
is in y under it.  And you have to be careful about accidentally creating
entries:

   $ awk 'BEGIN {
   > if (X[22]) print 1
   > else print 2
   > for (i in X) print i
   > exit
   > }'
   2
   22

In short,  if you think about awk arrays as hashes you will probably
be safe.  If you think too much of them as arrays you will probably
come to grief.
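
A minimal sketch of how to avoid that trap: the `in` operator tests membership without creating the element, while merely reading X[22] creates it. The names `before`, `n1`, `n2`, `v` and `out` are made up for the example.

```shell
out=$(awk 'BEGIN {
    before = (22 in X) ? "present" : "absent"   # membership test, no side effect
    n1 = 0; for (i in X) n1++                   # still zero elements
    v = X[22]                                   # reading X[22] creates it (empty)
    n2 = 0; for (i in X) n2++                   # now one element
    print before, n1, n2
}')
echo "$out"
```

So `if (22 in X)` is the safe way to ask whether an entry exists, and `if (X[22])` is only appropriate when creating the entry is harmless.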

Seriously,  if this bothers you then you probably shouldn't program
in awk - pick a more procedural language like perl.  I've often
thought AWK programming required a twisted mind to really do right.

--
Dan Mercer

If responding by email, include the phrase 'from usenet'
in the subject line to avoid spam filtering.

Opinions expressed herein are my own and may not represent those of my employer.



Tue, 26 Apr 2005 23:09:06 GMT  
...

Quote:
>Well,  maybe if you would stop thinking of them as arrays and start thinking
>of them as hashes you would understand why a count hasn't been implemented.

It has been implemented.  I use it all the time.

I really don't see why you keep pretending it doesn't exist.



Tue, 26 Apr 2005 23:34:04 GMT  

Quote:


> ...
> >I was thinking, "Hmm, AWK really could do that much more efficiently
> >internally, it's only an INC in some sort of traversal."

> A basic principle of real software is that it is better for the
> implementors to implement commonly needed features than for users to do so.
> You can google for the reasons (look for "Kenny McCormack" in
> "comp.lang.awk") if it isn't obvious to you why this is so.

> Or, to put it another way, I actually do understand why the GAWK
> implementor(s) want to keep the feature set small, but can't understand why
> us users should be expected to feel the same way.  Obviously, a good
> example of "group-think".

> >But then I thought, "Why would you want to know how many elements there are
> >in an array, anyway?"

> Why would I ever need to know the arc tangent of y/x?  In all my years of
> writing programs in dozens of languages, I've never needed this, but yet
> that function is there in most of those languages.

> A common use of the "length(A)" construct in TAWK is like this:

> /somecondition/ { A[$1] = $0 }
> END {
> if (length(A)) # Did I get anything at all?
> ...
> }

> Obviously, syntactic sugar, but, as I've said repeatedly, all of AWK is
> syntactic sugar - since real men use Turing machines (to quote another
> poster).

> Another version of the above is:

> END {
> print "There are",length(A),"element" (length(A)==1 ? "" : "s"),
> "in the array"
> }

> >If you want to traverse elements based on count, then that count must
> >correspond to something.  For instance, consider

> >    delete a; a[1] = x; a[3] = y
> >    for (i = 1; i <= acount(a); i++) doSomething(i)

> >You can't use i to reference into a, so what good is it?  This only works if
> >a is uniformly, incrementally, filled.  split() and asort() do this.

> Nobody is doubting that you can get around the limitations of a primitive
> implementation.  It is just more fun and satisfying to work in a
> full-featured one.

> >Did I miss any other reasons one would need a count, but not yet have it
> >already at hand?

> See above.  By the way, have you ever used atan2()?

Hi Kenny-

Quote:
> A basic principle of real software is that it is better for the
> implementors to implement commonly needed features than
> for users to do so.

"Commonly" is the operative word.  I would maintain that this is not a
common need.

Quote:
> Obviously, syntactic sugar...

No, it wouldn't be.  "Syntactic sugar," as I understand the term, is a
modification to the grammar of a language, not to its built-in function set.
For instance, a printf() function is only a function.  But modifying the
grammar of the language to allow a printf keyword that works like the print
statement is syntactic sugar.  Syntactic sugar is required to go places
where the ordinary language definition can't...like the variable number of
parameters to Pascal's write() and writeln().  Implementing length() or
count() or acount() hardly requires modification of the grammar of the
language.

Quote:
> By the way, have you ever used atan2()?

Actually, yes.  But that is beside the point you are trying to make.  Many
languages don't implement atan2(); they leave it in user space.  This
function, and other rarely used ones, typically come to languages in
libraries.  Since AWK is (was) interpreted, and real math is (was) slow in
user space, I'm guessing the designers knew many people would be using,
say, sin() and cos(), and just brought along the whole library,
somewhat mechanically.

Quote:
> >Did I miss any other reasons one would need a count, but not yet have it
> >already at hand?
> if (length(A)) # Did I get anything at all?

This is a good example of something I missed.  Strictly speaking, though,
you aren't interested in the number of elements at all...you just want to
know if there are any elements in the array.  So what you really want is the
isempty() function, not length().  But--wait--oh no--featuritis is creeping
in--and the language is now two functions bigger to learn!

    - Dan



Wed, 27 Apr 2005 00:29:06 GMT  

Quote:


> ...
> >Well,  maybe if you would stop thinking of them as arrays and start thinking
> >of them as hashes you would understand why a count hasn't been implemented.

> It has been implemented.  I use it all the time.

> I really don't see why you keep pretending it doesn't exist.

Kenny -
What do you mean?  You have pointed out that your vendor has implemented
it in their AWK-like language, but do gawk, awk, mawk,
POSIX-compliant awk, OTU specs, whatever, have this feature already?

Quote:
> I really don't see why you keep pretending it doesn't exist.

For those of us that didn't pay extra for our AWK, I don't think it does.  (I
guess that's what paying extra gets you.)

It sounds like the only reason to use the intrinsic length is to test
whether or not an array is empty.  Do you use it for other purposes, as well?

    - Dan



Wed, 27 Apr 2005 00:37:49 GMT  

Quote:

<snip>

> > >Did I miss any other reasons one would need a count, but not yet have it
> > >already at hand?

> > if (length(A)) # Did I get anything at all?

> This is a good example of something I missed.  Strictly speaking, though,
> you aren't interested in the number of elements at all...you just want to
> know if there are any elements in the array.  So what you really want is the
> isempty() function, not length().  But--wait--oh no--featuritis is creeping
> in--and the language is now two functions bigger to learn!

>     - Dan

function isempty(a,    i)       {
        for (i in a)    return 0;
        return 1;
}

Works in gawk, even if the array was never defined. YMMV on other awks.

(OK, the semicolons are syntactic sugar, but they make C programmers
like me more comfortable)
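
A short usage sketch for an isempty() helper of this kind, run from the shell. The array name `A`, the variables `e`, `f`, `g` and `out` are made up for the example, and behaviour on awks other than gawk is not something this thread establishes.

```shell
out=$(awk '
function isempty(arr,   k) {      # k is a local variable
    for (k in arr) return 0       # at least one index exists
    return 1
}
BEGIN {
    e = isempty(A)                # A has no elements yet
    A["x"] = 1
    f = isempty(A)                # one element
    delete A["x"]
    g = isempty(A)                # empty again after the delete
    print e, f, g
}')
echo "$out"
```

The loop returns on the first index it sees, so the cost does not grow with the size of the array.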

--
Jim Mellander
Incident Response Manager
Computer Protection Program
Lawrence Berkeley National Laboratory
(510) 486-7204

Your fortune for today is:

Questions are never indiscreet, answers sometimes are.
                -- Oscar Wilde



Wed, 27 Apr 2005 02:41:34 GMT  
 