merging two files 
 merging two files

Hi - I have two files, each with 2 columns, and would like to do the
following:

- for every line read from file1, put $1 and $2 into a temp variable,

- then compare $2 of every line in file2 with $2 from file1,
if $2 of both files are equal, replace $2 in file2 with $1 of file1,
otherwise, don't change anything in file2,

- go back and read the next line from file1 and repeat the process

I was able to strip out all of this info from my data, but I'm having a
hard time with the above problem. Thanks for any help!

Sent via Deja.com http://www.deja.com/
Before you buy.



Sat, 12 Apr 2003 10:05:43 GMT  
 merging two files
ever have that feeling of deja vu?
maybe it's just a glitch ;)

In a hurry, I'd smash off something like:

gawk 'ARGIND==1{storage[$2]=$1;next};$2 in storage{$2=storage[$2]};{print}' file1 file2

this assumes there is a SINGLE entry in file1 for a given $2

if you're not using gawk, then identify the first file with FILENAME or
by the number of arguments, or with a dummy "trigger" file with a single
marker row.
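
For anyone following along without gawk, here is a minimal sketch of the FILENAME route mentioned above. The sample rows and the temp directory are made up for illustration, and it assumes file1 and file2 are two different names (listing the same file twice would defeat the FILENAME test).

```shell
# Hypothetical sample data, invented for this sketch
tmp=$(mktemp -d)
printf 'alpha k1\nbeta k2\n' > "$tmp/file1"
printf 'x k2\ny k3\nz k1\n' > "$tmp/file2"

result=$(awk '
    # first file operand: remember $1 under the key $2
    FILENAME == ARGV[1] { storage[$2] = $1; next }
    # remaining input: swap in the stored value when the key is known
    $2 in storage       { $2 = storage[$2] }
                        { print }
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With this sample data, rows of file2 whose $2 appears in file1 come out with $2 replaced and the others pass through untouched.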

it also assumes there is a reasonable number of rows in file1.
if you have a large number of rows (in relation to computing resources),
then I'd go for a disk based sort, sorting the two files together with
the entry from file1 first, using this as a flag to say to modify the next
entry(ies) which match the value.
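
A rough sketch of that sort-based idea, under a few assumptions of mine: rows are tagged with a source number (1 for file1, 2 for file2) and with their original line number so the file2 order can be restored afterwards, fields are single-space separated, and there is at most one file1 row per key. The sample data is made up.

```shell
# Hypothetical sample data, invented for this sketch
tmp=$(mktemp -d)
printf 'alpha k1\nbeta k2\n' > "$tmp/file1"
printf 'x k2\ny k3\nz k1\n' > "$tmp/file2"

result=$(
    {
        # tagged row layout: key, source, original line number, $1, $2
        awk '{ print $2, 1, NR, $1, $2 }' "$tmp/file1"
        awk '{ print $2, 2, NR, $1, $2 }' "$tmp/file2"
    } |
    # group by key, with the file1 row ahead of the file2 rows for that key
    LC_ALL=C sort -k1,1 -k2,2n |
    awk '
        # file1 row: acts as the flag for the file2 rows that follow it
        $2 == 1   { key = $1; repl = $4; next }
        # file2 row with the same key: replace its second column
        $1 == key { $5 = repl }
                  { print $3, $4, $5 }
    ' |
    # put the file2 rows back in their original order, then drop the line number
    LC_ALL=C sort -k1,1n |
    cut -d' ' -f2-
)

printf '%s\n' "$result"
rm -rf "$tmp"
```

Only one key and one replacement value are held in memory at a time; the heavy lifting is left to sort, which can spill to temporary files when the input is large.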

jen

humm I'm sure I've seen that cat before :)
--

Quote:

> Hi - I have two files, each with 2 columns, and would like to do the
> following:

> - for every line read from file1, put $1 and $2 into a temp variable,

> - then compare $2 of every line in file2 with $2 from file1,
> if $2 of both files are equal, replace $2 in file2 with $1 of file1,
> otherwise, don't change anything in file2,

> - go back and read the next line from file1 and repeat the process

> I was able to strip out all of this info from my data, but I'm having a
> hard time with the above problem. Thanks for any help!




Sat, 12 Apr 2003 12:37:44 GMT  
 merging two files

Quote:

>Hi - I have two files, each with 2 columns, and would like to do the
>following:

>- for every line read from file1, put $1 and $2 into a temp variable,

>- then compare $2 of every line in file2 with $2 from file1,
>if $2 of both files are equal, replace $2 in file2 with $1 of file1,
>otherwise, don't change anything in file2,

>- go back and read the next line from file1 and repeat the process

If you're on a unix or unix-like system, the join command [man join(1)]
would be a quicker solution than writing an awk script to duplicate its
functionality.
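
A hedged sketch of how join might be applied here. Note that join wants both inputs sorted on the join field, and on its own it does not pass unmatched file2 rows through unchanged, so this version adds -a 2 plus a small awk step to pick the right column. The '-' placeholder is my own choice and assumes no real value in the data is a lone '-'; the sample rows are made up and single-space separated.

```shell
# Hypothetical sample data, invented for this sketch
tmp=$(mktemp -d)
printf 'alpha k1\nbeta k2\n' > "$tmp/file1"
printf 'x k2\ny k3\nz k1\n' > "$tmp/file2"

# join needs both inputs sorted on the join field (column 2 in each file)
LC_ALL=C sort -k2,2 "$tmp/file1" > "$tmp/f1.sorted"
LC_ALL=C sort -k2,2 "$tmp/file2" > "$tmp/f2.sorted"

# -a 2 keeps file2 rows that have no partner in file1,
# -e fills the missing file1 column with the placeholder,
# -o picks: file2 $1, file1 $1 (or placeholder), file2 $2
result=$(LC_ALL=C join -1 2 -2 2 -a 2 -e '-' -o 2.1,1.1,2.2 \
             "$tmp/f1.sorted" "$tmp/f2.sorted" |
         awk '{ print $1, ($2 == "-" ? $3 : $2) }')

printf '%s\n' "$result"
rm -rf "$tmp"
```

The output comes back in key order rather than in the original file2 order, which may or may not matter for the task.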




Sat, 12 Apr 2003 12:59:53 GMT  
 merging two files

Quote:

> [snip]

> gawk 'ARGIND==1 ... [snip]

> this assumes there is a SINGLE entry in file1 for a given $2

> if you're not using gawk, then identify the first file with FILENAME or
> by the number of arguments, or with a dummy "trigger" file with a single
> marker row.

Or use the canonical idiom FNR == NR, which works in all versions
of awk and is functionally equivalent to using ARGIND == 1 in gawk.
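
For reference, a small sketch of the FNR == NR form applied to the one-liner quoted above, on made-up sample data. One caveat worth knowing: if the first file is empty, FNR == NR is true for the rows of the second file instead, so the two tests only agree when file1 has at least one row. The second run below illustrates that case.

```shell
# Hypothetical sample data, invented for this sketch
tmp=$(mktemp -d)
printf 'alpha k1\nbeta k2\n' > "$tmp/file1"
printf 'x k2\ny k3\nz k1\n' > "$tmp/file2"
: > "$tmp/empty"

prog='
    # FNR == NR holds only while the first non-empty input is being read
    FNR == NR     { storage[$2] = $1; next }
    $2 in storage { $2 = storage[$2] }
                  { print }
'

result=$(awk "$prog" "$tmp/file1" "$tmp/file2")

# With an empty first file, every file2 row matches FNR == NR and is
# swallowed by the first rule, so nothing is printed at all
result_empty=$(awk "$prog" "$tmp/empty" "$tmp/file2")

printf '%s\n' "$result"
rm -rf "$tmp"
```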

--
Jim Monty

Tempe, Arizona USA



Sat, 12 Apr 2003 03:00:00 GMT  
 merging two files
Thanks! Works with Jim's suggestion FNR==NR.

Regards,
Rob




Sat, 12 Apr 2003 03:00:00 GMT  
 merging two files
Ha! Fine for the trivial case, but not so good when you are
reading several files, and in some cases the same file
a couple of times (I do a LOT of splicing files together :)
For a giggle I've included one of the typical (little)
scripts I do all the time (at the bottom of the message so
everyone doesn't have to suffer my code :)
Moral of the story - use gawk, you won't regret it!
Jennifer
Quote:


> > [snip]

> > gawk 'ARGIND==1 ... [snip]

> > this assumes there is a SINGLE entry in file1 for a given $2

> > if you're not using gawk, then identify the first file with FILENAME or
> > by the number of arguments, or with a dummy "trigger" file with a single
> > marker row.

> Or use the canonical idiom FNR == NR, which works in all versions
> of awk and is functionally equivalent to using ARGIND == 1 in gawk.

> --
> Jim Monty

> Tempe, Arizona USA

--
here's me doing the data warehouse thing without the warehouse :)
sample junk merge command (done on the command line on an IRIX server)
for reference, the .expand file is about 5 MB and the midnightmerge is about 480 MB
(oh, the \ continuations are added just so the mailer doesn't mess it up too badly,
I just bash it out on a single command line - good for the geek cred if anyone is watching :)

gawk -F, 'ARGIND==1 {lookfor[$2 "," $5]="-1,-1,-1,-1,-1,-1" ;next}; \
ARGIND==2{aoai[$2]=$3;next}; \
ARGIND==4{print $0 "," lookfor[$2 "," $5] "," aoai[$5];next}; \
length($9)==3{key=$5 "," $7;if(key in lookfor){ \
lookfor[key]=$1 "," $2 "," $3 "," $4 "," $12 "," $15}}'  ss.priv-new.expand \
/data1/CRD/update2000/indexB/_XAO_A prices.midnightmerge ss.priv-new.expand > ss.priv-new.out
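
For readers without gawk, the per-file dispatch in a command like the one above can be approximated by counting files by hand. This is only a sketch: the counter name fileno and the sample data are my own, and unlike ARGIND the counter does not advance for an empty file, so it drifts if any input has no rows. It reads the same file twice, once to count keys and once to print each row with its count.

```shell
# Hypothetical sample data, invented for this sketch
tmp=$(mktemp -d)
printf 'a k1\nb k2\nc k1\n' > "$tmp/data"

result=$(awk '
    # FNR resets to 1 at the start of each non-empty input file
    FNR == 1    { fileno++ }
    # first pass: count how often each key occurs
    fileno == 1 { count[$2]++; next }
    # second pass over the same file: append the count to each row
    fileno == 2 { print $0, count[$2] }
' "$tmp/data" "$tmp/data")

printf '%s\n' "$result"
rm -rf "$tmp"
```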



Sun, 13 Apr 2003 11:25:10 GMT  
 