Determining max field widths 

Is there a way, when examining a CSV file with gawk, to determine the
maximum width of each field and then use those maximums to write the
output as a fixed-width file? For example:

Input:-
001,02020,393003,38933
002,212222,22398233,3237283
003,33,2333,3323

Example output:-
001  02020   393003   38933
002 212222 22398233 3237283
003     33     2333    3323

Thanks in advance.

Mika



Tue, 02 Dec 2003 19:50:45 GMT  

Sure, look at the length of each field in a for loop:

{
    for (i=1; i<=NF; i++) {
        data[NF,i] = $i
        l = length($i)
        if (l > a[i]) { a[i] = l }
    }
}

Then you can use the values in array a to set the field widths
when printing the data array in the END section.

I left some work for you to do.  :-)
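
For reference, one way the remaining work could look. This is only a sketch, not Chuck's code: it wraps awk in a shell function with the made-up name fixed_width, indexes the data by NR, keeps the field count of each record in a hypothetical nf array, and builds the printf width by string concatenation so that it does not depend on gawk-only features.

```sh
# Hypothetical helper: reads CSV on stdin, writes right-justified columns.
fixed_width() {
    awk -F, '
    {
        nf[NR] = NF                      # field count of this record
        for (i = 1; i <= NF; i++) {
            data[NR, i] = $i             # keyed by record number NR
            if (length($i) > w[i]) w[i] = length($i)
        }
    }
    END {
        for (r = 1; r <= NR; r++) {
            for (i = 1; i <= nf[r]; i++)
                printf "%s%" w[i] "s", (i > 1) ? " " : "", data[r, i]
            print ""
        }
    }'
}

printf '001,02020,393003,38933\n002,212222,22398233,3237283\n003,33,2333,3323\n' | fixed_width
```

Because the fields are stored and printed as strings, the leading zeroes in the sample data come through unchanged. Everything is held in memory until END, so this only suits input that fits comfortably in awk arrays.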

Chuck Demas

--
  Eat Healthy    |   _ _   | Nothing would be done at all,

  Die Anyway     |    v    | That no one could find fault with it.



Wed, 03 Dec 2003 00:49:54 GMT  

Quote:



>Sure, look at the length of each field in a for loop

>{for (i=1;i<=NF;i++){
>      data[NF,i]=$i

--> Shouldn't that be data[NR,i]=$i? NF is the field count, so every record would overwrite the same entries.

Quote:
>      l=length($i)
>      if ( l > a[i] ) { a[i] = l }
>          }
>      }

>then you can use the values in the array a to set the fieldwidths
>when printing the data array in the END section.

This all assumes the file is small enough to fit within the limits
imposed on the size of associative arrays.
If it won't, this will require a fairly trivial double pass over the data.
Rgds

Mark
---
Mark Katz
Mark-IT, London. Delivering MR-IT/Internet solutions
Tel: (44) 20-8731 7516, Fax: (44) 20-8458 9554; http://www.mark-it.co.uk



Wed, 03 Dec 2003 01:37:14 GMT  

Hello,

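Try:

$ cat temp.awk
# pass 1: record the widest value in each column
NR==FNR { for (i=1; i<=NF; i++) { l=length($i); if (l > max[i]) max[i]=l }; next }
# pass 2: right-justify each field to that width
{ for (i=1; i<=NF; i++) printf "%*s%s", max[i], $i, (i<NF) ? " " : ""; print "" }

$ gawk -F, -f temp.awk infile infile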
001  02020   393003   38933
002 212222 22398233 3237283
003     33     2333    3323  

Inspired by a solution to a similar problem that none other than Chuck
Demas himself once posted to this group... :-)

Good luck

Michael Heiming



Wed, 03 Dec 2003 16:12:49 GMT  

You'll either need to read in (and have your script memorize) all data in
the file, or read through the file twice -- the first time only recording
column widths, the second time reading and printing.

Which of these two to implement depends on your needs: memorizing all the
data won't work if you have huge amounts of data, but reading through
twice will prevent your script from being used within a pipeline (you
can't rewind a pipeline to read through it again).
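
A possible middle ground, sketched under assumptions: if the input arrives on a pipe but is too large to hold in awk arrays, the stream can be spooled to a scratch file first and that file read twice. The function name fixed_width_spooled and the use of mktemp are illustrative choices, not something from this thread.

```sh
# Hypothetical helper: spool stdin to a scratch file, then make two passes.
fixed_width_spooled() {
    tmp=$(mktemp) || return 1
    cat > "$tmp"                         # save the piped input so it can be read twice
    awk -F, '
    NR == FNR {                          # first read: column widths only
        for (i = 1; i <= NF; i++)
            if (length($i) > w[i]) w[i] = length($i)
        next
    }
    {                                    # second read: print padded fields
        for (i = 1; i <= NF; i++)
            printf "%s%" w[i] "s", (i > 1) ? " " : "", $i
        print ""
    }' "$tmp" "$tmp"
    rm -f "$tmp"
}

printf 'a,bb\nccc,d\n' | fixed_width_spooled
```

This trades memory for disk space: awk itself only ever holds the width array, at the cost of writing the whole input out once.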

Also, be careful, or you may lose some leading zeroes in the data.
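
A small illustration of how the zeroes can go missing (zero_demo is a made-up name for this sketch): a field printed with a string conversion keeps its text as read, while a numeric conversion, or any arithmetic on the field, turns it into a number first.

```sh
# Hypothetical demo: the same field printed three ways.
zero_demo() {
    awk -F, '{
        printf "[%6s]\n", $2        # string conversion keeps the text: [ 02020]
        printf "[%6d]\n", $2        # numeric conversion drops the zero: [  2020]
        printf "[%6s]\n", $2 + 0    # so does arithmetic on the field:   [  2020]
    }'
}

printf '001,02020\n' | zero_demo
```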
--
Wolf  a.k.a.  Juha Laiho     Espoo, Finland

         !M V PS(+) PE Y+ PGP(+) t- 5 !X R !tv b+ !DI D G e+ h--- r+++ y+++
"...cancel my subscription to the resurrection!" (Jim Morrison)



Wed, 03 Dec 2003 13:25:01 GMT  
 