awk questions !!! 

I have a test file separated into several column fields and
want to get those lines whose 4th field is greater than or equal to a
specified argument. It looks like this:

prompt> cat test.csh
awk '$4 >= $2' src_file > dest_file

Question:
1) The $2 is intended to be the argument of "test.csh", i.e.
test.csh {argument}, which is passed on to the "awk" utility.

How do I stop "awk" from interpreting $2 as its own second field
in '$4 >= $2'?

I have tried the following, suggested by someone in the newsgroups, but
they all failed.
*******************************
 awk '$4 >= '"$2" src_file > dest_file
 or
 awk '$4 >= '"$2"'{print}' src_file > dest_file
 or
 awk '$4 >= num {print} ' -v num="$2" src_file > dest_file
 or
 awk '$4 >= num ' -v num="$2" src_file > dest_file
*******************************

THANKS !!!



Wed, 10 Sep 2003 12:03:28 GMT  
 awk questions !!!

By using the '-v' option, you can pass a named variable, and accompanying
value, from the command-line to 'awk'. So, you could try:

     awk -v sParm=$2 '{ if ($4 >= sParm) print $0; }' src_file > dest_file
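
A minimal sketch of the -v approach, assuming a whitespace-separated file. The sample rows and the shell variable `threshold` (standing in for the script argument) are made up for illustration:

```shell
# Hypothetical sample data: four whitespace-separated columns.
src_file=$(mktemp)
printf '%s\n' 'a x 1 10' 'b y 2 25' 'c z 3 40' > "$src_file"

threshold=25   # stands in for the script argument

# -v assigns the awk variable before the program starts; quoting the
# shell expansion keeps the value intact as a single word.
result=$(awk -v sParm="$threshold" '$4 >= sParm' "$src_file")
printf '%s\n' "$result"

rm -f "$src_file"
```

Only the rows whose 4th field is at least 25 are printed. Inside the single quotes, $4 is left alone by the shell and reaches awk as a field reference.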

You probably also noticed that the 'awk' source code needs to be enclosed
within '{' and '}' characters, did you not ?

I hope this helps.



Wed, 10 Sep 2003 14:23:41 GMT  
 awk questions !!!

% I have tried the followings told by someone in the newsgroups but still
% failed.
% *******************************
%  awk '$4 >= '"$2" src_file > dest_file
%  or
%  awk '$4 >= '"$2"'{print}' src_file > dest_file

It seems like these should have worked. What happened? I could
see there being a problem if $2 has leading spaces, but otherwise
it should be fine provided it's set to a number.
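
To see what awk actually receives with the quote-splicing approach, here is a small sketch. The variable names `threshold` and `program` and the sample rows are assumptions for illustration, not part of the original script:

```shell
threshold=25   # hypothetical stand-in for the script argument

# The shell joins the single-quoted and double-quoted pieces into one word,
# so awk never sees the shell variable, only its value.
program='$4 >= '"$threshold"
printf '%s\n' "$program"

# Run the spliced program on made-up sample data.
result=$(printf '%s\n' 'a x 1 10' 'b y 2 25' 'c z 3 40' | awk "$program")
printf '%s\n' "$result"
```

If the variable is empty, for instance because the script was called with fewer arguments than expected, the program text ends in `$4 >= ` and awk reports a syntax error. Printing the spliced program first, as above, is a cheap way to check for that.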

%  or
%  awk '$4 >= num {print} ' -v num="$2" src_file > dest_file

-v has to come before the script (i.e., this should be

  awk -v num="$2" '$4 >= num' src_file > dest_file

). This will work with awk on most machines, but not on systems
which ship `old' (pre-1989) awk as well as `new' awk, which is the
dialect normally discussed in this newsgroup. A specific example
of such a system is Solaris, where you must either use `nawk', or
put /usr/xpg4/bin in your path ahead of /usr/bin, after which you
can use `awk'.
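
One way to cope with both kinds of system from a script is sketched below. It simply prefers nawk when one is installed; the variable name `AWK` and the sample rows are assumptions for illustration:

```shell
# Prefer nawk where it exists (e.g. on systems whose plain awk is the old
# dialect); otherwise fall back to awk. AWK is a hypothetical variable name.
if command -v nawk >/dev/null 2>&1; then
    AWK=nawk
else
    AWK=awk
fi

# -v comes before the program text; the sample rows are made up.
result=$(printf '%s\n' 'a x 1 10' 'b y 2 25' | "$AWK" -v num=25 '$4 >= num')
printf '%s\n' "$result"
```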

Apart from that, it's difficult to comment on the problem without
knowing what happened with each of the things you tried.
--

Patrick TJ McPhee
East York  Canada



Thu, 11 Sep 2003 03:50:57 GMT  
 awk questions !!!
AWKSTR="awk '\$4 >= $2' src_file > dest_file"
eval "$AWKSTR"

or

awk -v VAL=$2 '$4 >= VAL' src_file > dest_file





Fri, 12 Sep 2003 05:52:42 GMT  
 awk questions !!!
The -v option is only needed when the variable must be set in BEGIN blocks, which
run before the operands (assignments and file list) are processed. Here you only need to do this:
    awk '{ if ($4 >= sParm) print $0; }' sParm=$2 src_file > dest_file
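
The difference between the two ways of passing a variable shows up only in BEGIN. A small sketch, with a made-up one-line input file and a hypothetical variable `v`:

```shell
# Hypothetical one-line input file.
src_file=$(mktemp)
printf '%s\n' 'only line' > "$src_file"

prog='BEGIN { print "BEGIN: [" v "]" } { print "main: [" v "]" }'

# Operand assignment: processed when awk reaches it in the argument list,
# which is after BEGIN has already run.
as_operand=$(awk "$prog" v=5 "$src_file")

# -v assignment: processed before BEGIN runs.
as_option=$(awk -v v=5 "$prog" "$src_file")

printf '%s\n' "$as_operand" "$as_option"
rm -f "$src_file"
```

With the operand form, BEGIN sees an empty `v` while the main rule sees 5; with -v both see 5.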

regards,
Ben

Quote:

> By using the '-v' option, you can pass a named variable, and accompanying
> value, from the command-line to 'awk'. So, you could try:

>      awk -v sParm=$2 '{ if ($4 >= sParm) print $0; }' src_file > dest_file

> You probably also noticed that the 'awk' source code needs to be enclosed
> within '{' and '}' characters, did you not ?

> I hope this helps.



Mon, 15 Sep 2003 01:29:21 GMT  
 awk questions !!!

Quote:

>     awk -v sParm=$2 '{ if ($4 >= sParm) print $0; }' src_file > dest_file

>You probably also noticed that the 'awk' source code needs to be enclosed
>within '{' and '}' characters, did you not ?

Anthony, your last statement is not correct. The awk code relating to
the BEGIN {...} and END {...} blocks certainly needs to be within {...},
but the main code does *not* need to be enclosed within {...}.

There is no harm in doing so, EXCEPT that using {...} precludes use of the
abbreviated forms of IF statements within the program, e.g., in your awk
program above, your use of {...} has forced you into writing the IF test
explicitly as "if ($4 >= sParm) ..." whereas were you to omit the braces
that piece of code could then be written without the IF ( ) characters:

awk -v sParm=$2 '$4 >= sParm {print $0}' src_file > dest_file
and in this example further simplified to:
awk -v sParm=$2 '$4 >= sParm' src_file > dest_file
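
A quick check that the three spellings select the same lines, using made-up sample rows and a hypothetical threshold:

```shell
data=$(printf '%s\n' 'a x 1 10' 'b y 2 25' 'c z 3 40')   # made-up sample rows
limit=25                                                  # hypothetical threshold

explicit_if=$(printf '%s\n' "$data" | awk -v sParm="$limit" '{ if ($4 >= sParm) print $0 }')
pattern_action=$(printf '%s\n' "$data" | awk -v sParm="$limit" '$4 >= sParm { print $0 }')
pattern_only=$(printf '%s\n' "$data" | awk -v sParm="$limit" '$4 >= sParm')

# All three forms select the same lines.
[ "$explicit_if" = "$pattern_action" ] && [ "$pattern_action" = "$pattern_only" ] && echo same
printf '%s\n' "$pattern_only"
```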

I have been using mawk for over a decade, and have always been peeved
that it wouldn't accept the abbreviated forms of statements such as I see
in awk manuals and in this group. Only recently did I discover that this
was my fault, and entirely due to my insistence on enclosing the main code
within braces. Braces around the main code are superfluous, best omit them.

Quote:
>I hope this helps.

Ditto.
--
John Savage            (for email, replace "ks" with "k" and delete "n")


Tue, 16 Sep 2003 14:32:51 GMT  
 awk questions !!!

Quote:

> The awk code relating to
> the BEGIN {...} and END {...} blocks certainly needs to be within {...},
> but the main code does *not* need to be enclosed within {...}.

That is somewhat misleading. For example,

awk 'print $1'

does not work without braces.

Quote:
> There is no harm in doing so, EXCEPT that using {...} precludes use of the
> abbreviated forms of IF statements within the program

I guess they can be called "abbreviated forms of IF" but I find it
even more misleading.

If you think of them that way, how would you explain that, e.g.,

awk '$1 > 1'
awk '$1 > 1 { print }'
awk '{ if ($1 > 1) print }'
awk '{ if ($1 > 1) { print } }'

work but

awk '{ $1 > 1 }'
awk '$1 > 1 print'
awk 'if ($1 > 1) print'
awk 'if ($1 > 1)'
awk 'if ($1 > 1) { print }'
awk '{ if ($1 > 1) }'

do not?

You can explain those away as idiosyncrasies but if you look at
it the right way (tm), they appear as logical consequences
of simple rules.

Quote:
> Braces around the main code are superfluous, best omit them.

It is indeed not necessary to enclose the entire awk program
in braces; in fact, it is incorrect except in rare (?) special cases.
Function definitions apart, awk programs consist of
patterns (conditions) which *must not* be enclosed in braces
and statement blocks which *must* be enclosed in braces:

pattern { statements }
pattern { statements }
pattern { statements }
...

There "pattern" can be any conditional expression or regular
expression, or two such separated by comma, or "BEGIN" or "END".

It can also be omitted, which is interpreted as "always true".
By enclosing the entire program in braces you are in effect
doing just that, you have only one pattern-statement pair
with empty pattern, and you are losing quite a lot of
expressive power of awk.

The '{ statements }' part can also be omitted, in which
case it defaults to '{ print }'. (In this case newline
after the pattern is required, otherwise it can be omitted.)
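
The rules above can be sketched in one small program that uses every combination: a special pattern, a pattern with an action, a pattern alone, and an action alone. The input rows and the variable `count` are made up for illustration:

```shell
result=$(printf '%s\n' 'alpha 1' 'beta 2' 'gamma 3' | awk '
BEGIN       { print "start" }        # special pattern, runs before input
$2 >= 2     { print "big:", $1 }     # pattern with action
/^alpha/                             # pattern only: default action is { print }
            { count++ }              # action only: runs for every line
END         { print "lines:", count }
')
printf '%s\n' "$result"
```

Note that the newline after `/^alpha/` is what ends that rule; without it the following braces would become its action.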

In general you cannot replace such patterns with IF-statements in any
obvious way, nor vice versa. In simple cases yes, but not always.

Besides BEGIN and END, range patterns cannot be converted into
if-statements directly. On the other hand, if-statements that occur
anywhere but the main level (performed for every line) can't be
converted into patterns.

For example,

awk '/begin geek code/,/^$/'

would be a major hassle to do with if-statements:

awk '{ if ($0 ~ /begin geek code/) flag=1
       if (flag) print
       if (flag && $0 ~ /^$/) flag=0
     }'

In modern awk you can omit the "$0 ~" but it's still messy.
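
A sketch comparing the range pattern with a flag-based version on made-up input. The flag version here is my own arrangement: it prints before clearing the flag so that the closing blank line is included, as it is with the range pattern. Each output line is prefixed with `|` so the blank line stays visible:

```shell
input=$(printf '%s\n' 'header' 'begin geek code' 'GCS d-' '' 'trailer')  # made-up sample

# Range pattern: prints from the start match through the end match, inclusive.
range_out=$(printf '%s\n' "$input" | awk '/begin geek code/,/^$/' | sed 's/^/|/')

# Flag-based sketch: print first, then clear the flag once the end line
# has been seen, so the end line itself is still printed.
flag_out=$(printf '%s\n' "$input" | awk '
/begin geek code/ { inside = 1 }
inside            { print }
inside && /^$/    { inside = 0 }
' | sed 's/^/|/')

printf '%s\n' "$range_out"
[ "$range_out" = "$flag_out" ] && echo "same output"
```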

--
Tapani Tarvainen



Tue, 16 Sep 2003 17:56:48 GMT  
 awk questions !!!

Quote:

>> The awk code relating to
>> the BEGIN {...} and END {...} blocks certainly needs to be within {...},
>> but the main code does *not* need to be enclosed within {...}.

>>That is somewhat misleading. For example,

Your claim is itself misleading.

The OP indicated that he was starting out with a program template like:

BEGIN {    }
{ main code here }
END {      }

So I pointed out that he should use a program template of the form:

BEGIN {    }
main code here
END {      }

Quote:
>awk 'print $1'

>does not work without braces.

Of course it won't work, because 'print $1' does not constitute a valid
line of awk code.
--
John Savage            (for email, replace "ks" with "k" and delete "n")


Sat, 20 Sep 2003 17:24:58 GMT  
 awk questions !!!


% BEGIN {    }
% { main code here }
% END {      }

Which is reasonable for some purposes.

% So I pointed out that he should use a program template of the form:
%
% BEGIN {    }
% main code here
% END {      }

Which will almost never work.

% >awk 'print $1'
% >
% >does not work without braces.
%
% Of course it won't work, because 'print $1' does not constitute a valid
% line of awk code.

Of course it is, unless you're using some peculiar definition of `valid',
or an obscure definition of a `line of awk code'.

An awk program is made up of a sequence of pattern/action pairs, which
have one of these forms:

 pattern
 { action }
 pattern { action }

where pattern is one of BEGIN, END, an expression, or two expressions
separated by a comma, and action consists of zero or more `valid
lines of awk code'. Ignoring the case where the action is completely
omitted, there always have to be braces around the `main code', meaning
the code that makes up an action.

In general, though, if this works:

 { code }

then this will not work

  code

and it's not terribly helpful to say that it will, is it?
--

Patrick TJ McPhee
East York  Canada



Sat, 20 Sep 2003 10:30:36 GMT  
 awk questions !!!

Quote:
Patrick TJ McPhee writes:

>> BEGIN {    }
>> { main code here }
>> END {      }

>Which is reasonable for some purposes.

As I indicated, I have often done so.

Quote:
>> So I pointed out that he should use a program template of the form:

>> BEGIN {    }
>> main code here
>> END {      }

>Which will almost never work.

Actually, it will always work.

% >awk 'print $1'
% >
% >does not work without braces.
%
% Of course it won't work, because 'print $1' does not constitute a valid
% line of awk code.

Quote:
>Of course it is, unless you're using some peculiar definition of `valid',

The interpreter defines what it sees as valid or not, nothing to do with me.
If it were a valid line of code, then that one line program would work. But
it doesn't, because 'print $1' (w/out the quotes) is not a valid one liner.

Take the original line of code from the command line, viz., awk 'print $1'.
As presented, 'print $1' (w/out the quotes) is not a valid line of awk code
because to the interpreter it constitutes neither a pattern nor an action.
The absence of braces means it attempts interpretation as a pattern, but it
fails, though this can be turned into a pattern by enclosing it within
slashes, as /print $1/. Alternatively, it can be caused to be interpreted
as an action by enclosing it within braces, as {print $1}. But, as written,
that line of code is accepted as providing neither a pattern nor an action.

Quote:
>Ignoring the case where the action is completely
>omitted, there always have to be braces around the `main code', meaning

Completely wrong.

Quote:
>the code that makes up an action.

Huh? Why would it make up an action? On the template example provided, it
clearly shows main code as that part of the program which lies outside the
BEGIN and END actions. In general, the main program will not be one
single action, but rather will be multiple action-pattern pairs.

Quote:
>In general, though, if this works:

> { code }

>then this will not work

>  code

>and it's not terribly helpful to say that it will, is it?

I'm not aware anyone claimed otherwise. Certainly I didn't.

In general, though, if this works:

  code

then this will not work

  { code }

and it's not terribly helpful for you to say that it will, is it?

In an awk program, the main code, in general, comprises many pattern-action
pairs, and arbitrarily enclosing it all with a further pair of braces will
cause the interpreter to choke.

If, for some reason, you elect to code the main body of the program as one
giant action, it is entirely feasible to do so, but it is not necessary
that it be. Indeed, in general, it is better that it not be coded as a
single action but rather as multiple pattern-action pairs.
--
John Savage            (for email, replace "ks" with "k" and delete "n")



Thu, 25 Sep 2003 18:58:52 GMT  
 awk questions !!!

Quote:

> Patrick TJ McPhee writes:
> >In general, though, if this works:

> > { code }

> >then this will not work

> >  code

> >and it's not terribly helpful to say that it will, is it?

> I'm not aware anyone claimed otherwise. Certainly I didn't.

Then we misunderstood you.
Which was the whole point, actually:
How to explain when braces are needed and when not
without being misunderstood.

Quote:
> >> BEGIN {    }
> >> { main code here }
> >> END {      }

That is no good as "awk template", just as you say, but this

Quote:
> >> BEGIN {    }
> >> main code here
> >> END {      }

is not much good either:
If you don't already understand what awk means by "patterns"
and "actions", that will only confuse things further.
If you do understand them, you won't need such a template.

Quote:
> % >awk 'print $1'
> % >
> % >does not work without braces.
> %
> % Of course it won't work, because 'print $1' does not constitute a valid
> % line of awk code.

No? Then why does

awk '
{
print $1
}'

work?

The point being, "line" is not a good term here.
Newlines are significant in awk but not the way your sentence
seems to imply. Of course you didn't actually mean that,
as your explanation makes clear:

Quote:
> 'print $1' (w/out the quotes) is not a valid one liner.

but again the whole point is what kind of terminology is clear
and what is confusing.

Quote:
> In general, though, if this works:

>   code

> then this will not work

>   { code }

That depends on what you mean by "code". Quite a few people
will interpret it as synonym for "action", or "statements
inside action block", in which case it *will* work.

No, I'm not saying that's a good interpretation either.

Quote:
> In an awk program, the main code, in general, comprises many pattern-action
> pairs, and arbitrarily enclosing it all with a further pair of braces will
> cause the interpreter to choke.

> If, for some reason, you elect to code the main body of the program as one
> giant action, it is entirely feasible to do so, but it is not necessary
> that it be. Indeed, in general, it is better that it not be coded as a
> single action but rather as multiple pattern-action pairs.

With that I agree completely.

I don't, however, see much point in making a big distinction between
"main code" and BEGIN- and END-blocks, and the template you
suggested above I find misleading. I prefer the simple template

pattern { action }
pattern { action }
...

although I wouldn't object much to this either:

BEGIN { action }
pattern { action }
pattern { action }
...
END { action }

even though it seems to imply BEGIN- and END-blocks are mandatory,
which has never been true,
and that there must be just one of each and they must be
in the beginning and end, which is not true in modern awk's
(anything since nawk, or anything POSIX.2 compliant).
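
A small sketch of that last point, assuming a POSIX-style awk: several BEGIN blocks, and an END block that is not written at the end. The input lines are made up:

```shell
result=$(printf '%s\n' 'a' 'b' | awk '
END   { print "lines:", NR }
BEGIN { print "first BEGIN" }
      { print }
BEGIN { print "second BEGIN" }
')
printf '%s\n' "$result"
```

The BEGIN actions run in the order they appear, before any input is read, and the END action runs after the last line regardless of where it sits in the program text.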

But substituting just "code" for the "pattern { action }..." part
will only confuse people, IMHO.

--
Tapani Tarvainen



Thu, 25 Sep 2003 12:13:13 GMT  
 awk questions !!!


% % Of course it won't work, because 'print $1' does not constitute a valid
% % line of awk code.
%
% >Of course it is, unless you're using some peculiar definition of `valid',
%
% The interpreter defines what it sees as valid or not, nothing to do with me.
% If it were a valid line of code, then that one line program would work. But
% it doesn't, because 'print $1' (w/out the quotes) is not a valid one liner.

I think I went on to suggest you might be using a circular definition of
`line of code'. If your definition of `line of code' is `anything that
can be given to awk without braces around it, and awk accepts it', then
of course you don't have to put braces around it, but it's not really very
helpful.

[...]

% >Ignoring the case where the action is completely
% >omitted, there always have to be braces around the `main code', meaning
%
% Completely wrong.
%
% >the code that makes up an action.
%
% Huh? Why would it make up an action?

This is a definition. Sometimes when people are having a discussion and
it's not clear that they're talking about the same thing, one of the people
will give a definition, so the others will know what he means. What you mean
by `main code' is `anything which isn't part of a BEGIN or END action'.
I don't know why you stop there. This is not what I was talking about.

% >In general, though, if this works:
% >
% > { code }
% >
% >then this will not work
% >
% >  code
% >
% >and it's not terribly helpful to say that it will, is it?

% I'm not aware anyone claimed otherwise. Certainly I didn't.

You suggested replacing this

Quote:
>> BEGIN {    }
>> { main code here }
>> END {      }

with this

Quote:
>> BEGIN {    }
>> main code here
>> END {      }

You might feel that this is not a suggestion to simply remove the
braces, but in fact is an admonishment to resist the temptation to put
entire programs in a single action, and re-write that action to use
pattern/action pairs, but you would be wrong if you felt that way. What
you suggested is that people should take exactly the same code
(represented by `main code here'), and remove the braces. Perhaps
you didn't mean to suggest this, but that doesn't mean it shouldn't
be corrected.

--

Patrick TJ McPhee
East York  Canada



Fri, 26 Sep 2003 02:31:26 GMT  
 awk questions !!!

Quote:
>Patrick TJ McPhee writes:

...
>>> So I pointed out that he should use a program template of the form:

>>> BEGIN {    }
>>> main code here
>>> END {      }

>>Which will almost never work.

>Actually, it will always work.

Depends on what you mean by 'main code here'.

Quote:
>%>awk 'print $1'
>%>
>%>does not work without braces.
>%
>%Of course it won't work, because 'print $1' does not constitute a valid
>%line of awk code.

>>Of course it is, unless you're using some peculiar definition of `valid',

>The interpreter defines what it sees as valid or not, nothing to do with me.

Yes, the interpreter decides, but you're misconstruing what's going on. By
'lines of code' do you mean 'statement'? In awk and most other languages in
the same family as C, some statements are also expressions but others aren't
expressions. Statements do things, expressions return values.

'print $1' is most definitely a valid statement, but it's not an expression.
Therefore, it can't be used as a _pattern_ (code appearing outside braces)
in awk because patterns _must_ be expressions.
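
The distinction can be checked from the shell. In this sketch the input line and the variable names are made up; the first call is expected to be rejected with a syntax error, the second to run normally:

```shell
# A statement used where a pattern (an expression) is expected: syntax error.
if printf '%s\n' 'one two' | awk 'print $1' >/dev/null 2>&1; then
    as_pattern=accepted
else
    as_pattern=rejected
fi

# The same statement inside an action block is fine.
as_action=$(printf '%s\n' 'one two' | awk '{ print $1 }')

printf '%s\n' "$as_pattern" "$as_action"
```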

...

Quote:
>Take the original line of code from the command line, viz., awk 'print $1'.
>As presented, 'print $1' (w/out the quotes) is not a valid line of awk code
>because to the interpreter it constitutes neither a pattern nor an action.

Not quite. It's a valid statement but not an expression. It's used in a
context that requires an expression. That's why it doesn't work. It _is_,
however, a valid 'line' of awk code because it would be perfectly valid on a
text line all by itself _within_ an action.

Quote:
>The absence of braces means it attempts interpretation as a pattern, but it
>fails, though this can be turned into a pattern by enclosing it within
>slashes, as /print $1/.

Again, not quite. It fails interpretation as an _expression_. As for /print
$1/, non sequitur - you're changing a valid statement into a regular
expression. Different operation than the valid statement 'print $1'.

...

Quote:
>>Ignoring the case where the action is completely
>>omitted, there always have to be braces around the `main code', meaning

>Completely wrong.

Details, details. If the only operation desired is printing $0, then no
braces are required. If any other operation is desired (meaning that
statements must be used), then braces around those statements are most
certainly required.

...

Quote:
>In general, though, if this works:

>  code

>then this will not work

>  { code }

>and it's not terribly helpful for you to say that it will, is it?

Getting confused again. If

code

works without any braces, it _must_ be a valid expression, and the implicit
action would be '{print}' if code evaluates to anything other than 0 or "".
In that case,

{ code }

would be valid. It _would_ still work, but more likely than not it wouldn't
do anything.
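
A sketch of that last case with made-up input, assuming an awk that accepts a bare expression as a statement (the common implementations I am aware of do):

```shell
data=$(printf '%s\n' '0 zero' '2 two' '5 five')   # made-up sample

# As a pattern, the comparison selects lines and the default action prints them.
as_pattern=$(printf '%s\n' "$data" | awk '$1 > 1')

# Inside braces the same comparison is an expression statement: it is
# evaluated and discarded, so nothing is printed.
as_action=$(printf '%s\n' "$data" | awk '{ $1 > 1 }')

printf 'pattern form:\n%s\n' "$as_pattern"
printf 'action form: [%s]\n' "$as_action"
```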



Sat, 27 Sep 2003 07:21:03 GMT  
 