2-file awk search-and-replace 

Hi!
For my data-mining class I'm trying to learn how to use awk. I was able to
extract some data from a file and put it into a file with the following
layout:

5 Attcap-shape:s
6 Attcap-shape:x
7 Attcap-surface:f
8 Attcap-surface:g
9 Attcap-surface:s
10 Attcap-surface:y
11 Attcap-color:b
12 Attcap-color:c
13 Attcap-color:e
14 Attcap-color:g

The integer is a property ID and the text is its description.

Now, in another file, I want to replace the numbers with their
corresponding property descriptions. My second file is formatted like
this:
36 86    83    1.000    0.667
36 89    83    1.000    0.613
51    83 86    1.000    0.567
51 86    83    1.000    0.567
51 83    86    1.000    0.567
51    83 89 34   1.000    0.567
51 89 12   83 23   1.000    0.567

The columns are: leftSideOfRule  rightSideOfRule  Accuracy  Frequency

Concretely, on each line I want to replace the first NF-2 fields with
their corresponding descriptions. I've been on this for the past 2
hours, and I can't seem to get it right. Should I maybe use sed or some
other utility for this type of multi-file handling?

Thank you very much in advance,

Yannick

/// Yannick Wurm
/// Bioinformatique et Modelisation
/// Insa de Lyon



Sat, 09 Apr 2005 00:43:51 GMT  


I'm not sure what you're doing, but reading one file into an array
should simplify things.

Try posting what you tried, along with better sample data showing what
should go in and what should come out.

Chuck Demas



Sat, 09 Apr 2005 03:06:30 GMT  


% For my data-mining class I'm trying to learn how to use awk. I was able to

I'll give you some hints. You can read a file using getline:

 while ((getline < "file1") > 0)
   process_data($1, $2)

you can do that in a BEGIN block to, say, load the contents of a reference
file into an array.

Arrays in awk can be indexed on either numeric or non-numeric values. In
this case, you could have array element 5 refer to Attcap-shape:s with

 propid[5] = "Attcap-shape:s"

If $1 == 5 and $2 == "Attcap-shape:s", this is equivalent to

 propid[$1] = $2

Finally, you can look up the array value associated with, say, the NF-2nd
field using

 propid[$(NF-2)]
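
Put together, the three hints above might look like the following sketch. The file names (props.txt, rules.txt) and the sample data are made up for illustration, and the reference file's name is passed in through a variable called propfile, which is also an assumption of this sketch.

```shell
# Hypothetical sample files, created in a temporary directory.
dir=$(mktemp -d)
printf '%s\n' '5 Attcap-shape:s' '6 Attcap-shape:x' '12 Attcap-color:c' > "$dir/props.txt"
printf '5 12\t6\t1.000\t0.567\n' > "$dir/rules.txt"

lookup_out=$(awk -v propfile="$dir/props.txt" '
BEGIN {
    # Load the reference file into an array: id -> description.
    while ((getline < propfile) > 0)
        propid[$1] = $2
    close(propfile)
}
{
    # Look up the field two places before the last one.
    print $(NF-2), "=>", propid[$(NF-2)]
}' "$dir/rules.txt")

printf '%s\n' "$lookup_out"
```

The parentheses around (getline < propfile) matter: without them the comparison with 0 can be parsed as part of the file name expression.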

I hope that's helpful.
--

Patrick TJ McPhee
East York  Canada



Sat, 09 Apr 2005 03:09:54 GMT  
Thank you very much, Patrick and Charles!

I almost have it working now. Here is my code:
BEGIN {
  while ((getline < "champcorrespondances.txt") > 0) {
    description[$1] = $2;
  }
}

{
  print "ceci: "
  for (i=0;i<(NF-2);i++) {
    print description[$i];
  }
  print "   accuracy: ";
  print $(NF-1);
  print "   frequency: ";
  print $NF;
}

I will have to use split() to separate my input tables (there are two types
of separators: tabs and spaces), but first I'd like my output to be a little
prettier. Awk always spits out everything onto separate lines. Is there any
way to append one string to another and do a single print at the end
of the whole process?

Thanks again,

Yannick.

ps: this is what my awk prints out:

ceci:

Attgill-attachment:f
Attstalk-surface-above-ring:s
Attveil-type:p
Attring-number:o
   accuracy:
1.000
   frequency:
0.559
ceci:

Attstalk-surface-below-ring:s
Attveil-color:w
Attring-number:o
Attgill-attachment:f
   accuracy:
1.000
   frequency:
0.530



Sat, 09 Apr 2005 19:05:15 GMT  


[...]

% I will have to use split() to separate my input tables (there are two types
% separators: tabs and spaces), but first I'd like my output to be a little

The default field separator splits on any number of tabs and spaces, so
you should be OK.
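
A quick way to check this is to feed awk a line that mixes tabs and runs of spaces and print the field count. The input line below is invented for the test; it only imitates the layout of the rules file.

```shell
# One line with mixed separators: spaces, single tabs, and a double tab.
split_out=$(printf '51 \t 83\t\t89 34   1.000\t0.567\n' |
    awk '{
        print NF                      # number of fields awk found
        for (i = 1; i <= NF; i++)     # show each field in brackets
            printf "[%s]", $i
        print ""
    }')

printf '%s\n' "$split_out"
```

With the default FS, each run of blanks and tabs counts as a single separator, so no empty fields appear and no split() call is needed.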

% prettier. Awk always spits out everything onto separate lines. Is there any

You have two choices. You can build up a string:

    desc = description[$1]
    for (i = 2; i <= NF-2; i++) {
      desc = desc " " description[$i];
    }
    print desc

or you can use printf:

    printf "%s", description[$i]
    printf "accuracy: %s  frequency: %s\n", $(NF-1), $NF

note that printf won't insert spaces between items, so you'll have to deal
with that yourself.
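
For reference, here is one way the whole job could be sketched, combining the getline loop from earlier in the thread with the string-building approach and a single printf per input line. The file names, the propfile variable, and the sample data are hypothetical. Leaving an ID unchanged when it has no entry in the reference file is a choice made for this sketch, not something the thread specifies.

```shell
# Hypothetical sample files, created in a temporary directory.
dir=$(mktemp -d)
cat > "$dir/props.txt" <<'EOF'
5 Attcap-shape:s
6 Attcap-shape:x
7 Attcap-surface:f
11 Attcap-color:b
EOF
printf '5 7\t11\t1.000\t0.567\n6\t7 99\t1.000\t0.613\n' > "$dir/rules.txt"

cat > "$dir/replace.awk" <<'EOF'
BEGIN {
    # Load the id -> description table before any rule line is read.
    while ((getline < propfile) > 0)
        description[$1] = $2
    close(propfile)
}
{
    desc = ""
    # The last two fields are accuracy and frequency; the rest are ids.
    for (i = 1; i <= NF - 2; i++) {
        # Assumption: an id with no entry in the table is kept as-is.
        word = ($i in description) ? description[$i] : $i
        sep = (i > 1) ? " " : ""
        desc = desc sep word
    }
    printf "%s  accuracy: %s  frequency: %s\n", desc, $(NF-1), $NF
}
EOF

replace_out=$(awk -v propfile="$dir/props.txt" -f "$dir/replace.awk" "$dir/rules.txt")
printf '%s\n' "$replace_out"
```

Note that this version joins all descriptions with single spaces, so the tab that separated the left and right sides of a rule in the input is not preserved.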

--

Patrick TJ McPhee
East York  Canada



Sun, 10 Apr 2005 00:12:59 GMT  