deleting records in a file using AWK, SED, or kshell 

OOPS

I read it wrong; I thought you wanted to remove all entries on the given
date.

I will rewrite the script to grep out and store the dates before the
specified date.

David Waffen



Mon, 22 Jan 2001 03:00:00 GMT  

Quote:

>I am looking for an efficient way to delete entries
>in a text file that fall within a given time period.

>My text file looks like:
>=================
>Jun 05 1998 12:09:45
>John Wayne
>Actor
>USA
>=================
>Jul 08 1998 04:45:56
>Madonna
>Singer
>France
>=================
>....

>Now the user could provide a date: MM/DD/YY HH:MM:SS
>and the script would find all entries before that date and remove
>them from the file. An entry is delimited by the "============" marker.

>Is there anything in AWK, NAWK, SED, that makes this
>easy? I'm not at all familiar with PERL and would rather refrain
>from using it. Is this a time consuming process, esp. if the
>script would make a copy of the file, write all matching entries
>to it and then remove the original file??

OK, the first thing to notice is that you just want to know
if it is an earlier date and time, not a specific time earlier.

That means that you can map the date and time-of-day onto another
function that is monotonic, but not necessarily continuous.

That means: create a number that relates to the date and a number
that relates to the time, but these numbers don't have to _be_ the
date and time.

Call them DDD (Ding-Dong-Date), and DDT (Ding-Dong-Time).

DDD=(year_number)*10000 +(month_number)*100 +(day_number)

DDT=(Hour_number)*10000 + (Minute_number)*100 + (Second_number)

If you convert the record info to DDD and DDT, and the input
info to DDD and DDT, it is easy to compare the two.

FWIW, 12:35:45 converts to a DDT=123545 so just removing the colons
will do the trick.

It is similarly easy to convert 01/17/1998 to a DDD=19980117

This new frame of reference will make date and time comparisons easy,
especially if one combines them as one floating point number, DDD.DDT
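A minimal sketch of why the combined number is convenient: once both stamps are in DDD.DDT form, "is it earlier?" becomes one numeric comparison. The helper name earlier is made up for illustration, and the two stamps are the sample entries from the question.

```shell
# Hypothetical helper: succeeds (exit status 0) when stamp A is strictly
# earlier than stamp B. Both arguments are in DDD.DDT form.
earlier() {
    # Adding 0 forces awk to compare the values as numbers, not strings.
    awk -v a="$1" -v b="$2" 'BEGIN { exit !(a + 0 < b + 0) }'
}

# Jun 05 1998 12:09:45 versus Jul 08 1998 04:45:56
if earlier 19980605.120945 19980708.044556; then
    echo "first stamp is earlier"
fi
```

The stamps carry 14 significant digits, which still fits comfortably in the double-precision numbers awk uses.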

To make things easy to visualize, you can consider making an
intermediate file in this format:

=================
19980605.120945
Jun 05 1998 12:09:45
John Wayne
Actor
USA
=================
19980708.044556
Jul 08 1998 04:45:56
Madonna
Singer
France
=================

This is a relatively simple task in awk or gawk.  Something like:

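gawk '
# Month offsets are plain decimals: current gawk reads a leading 0 in
# program text as octal, so 0100 would not mean one hundred.
BEGIN{ mo["Jan"]=100; mo["Feb"]=200;  mo["Mar"]=300;  mo["Apr"]=400;
       mo["May"]=500; mo["Jun"]=600;  mo["Jul"]=700;  mo["Aug"]=800;
       mo["Sep"]=900; mo["Oct"]=1000; mo["Nov"]=1100; mo["Dec"]=1200;}
# A date line has four fields and starts with a month name.
NF==4 && ($1 in mo) {
      DDD=($3*10000) + mo[$1] + $2;
      DDT=$4;
      gsub(/:/,"",DDT);
      my_time= DDD "." DDT;
      print my_time}
{print}' infile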

Then one can take this, and instead of printing my_time, one can
compare my_time to a number created from user input supplied in
DDD and DDT format, and set a flag which controls printing of lines.

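gawk '
# Month offsets are plain decimals (a leading 0 would be read as octal).
BEGIN{ mo["Jan"]=100; mo["Feb"]=200;  mo["Mar"]=300;  mo["Apr"]=400;
       mo["May"]=500; mo["Jun"]=600;  mo["Jul"]=700;  mo["Aug"]=800;
       mo["Sep"]=900; mo["Oct"]=1000; mo["Nov"]=1100; mo["Dec"]=1200;}
# A date line has four fields and starts with a month name.
NF==4 && ($1 in mo) {
      DDD=($3*10000) + mo[$1] + $2;
      DDT=$4;
      gsub(/:/,"",DDT);
      flag=0;
      my_time= DDD "." DDT;
      # +0 forces a numeric comparison instead of a string comparison
      if (my_time+0 >= user_time+0) {flag=1} }
NR==1 || flag==1 {print}' user_time="$ddd.$ddt" infile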

Which requires that $ddd and $ddt be created before the above code.
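One possible way to build $ddd and $ddt from the user's MM/DD/YY HH:MM:SS input is sketched here. The function name to_stamp is hypothetical, and the handling of two-digit years (70-99 read as 19xx, 00-69 read as 20xx) is an assumption that the original question does not settle, so adjust it to your data.

```shell
# Hypothetical helper: turn "MM/DD/YY HH:MM:SS" (two- or four-digit year)
# into a DDD.DDT stamp such as 19980117.123545.
to_stamp() {
    echo "$1" | awk '{
        split($1, d, "/")              # d[1]=month, d[2]=day, d[3]=year
        year = d[3] + 0
        # ASSUMPTION: two-digit years 00-69 mean 20xx, 70-99 mean 19xx.
        if (year < 70)       year += 2000
        else if (year < 100) year += 1900
        ddt = $2
        gsub(/:/, "", ddt)             # 12:35:45 -> 123545
        printf "%04d%02d%02d.%s\n", year, d[1], d[2], ddt
    }'
}

user_time=$(to_stamp "01/17/98 12:35:45")
ddd=${user_time%.*}    # date part, 19980117
ddt=${user_time#*.}    # time part, 123545
echo "$ddd.$ddt"
```

With ddd and ddt set this way, they can be passed to awk as user_time="$ddd.$ddt".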

I thought I'd leave something for you and others to do, or modify. :-)
I also didn't try this or attempt to clean it up much.

Note how the flag is reset on the date line.
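On the question of copying the file and then replacing the original, here is a rough single-pass sketch. The wrapper name prune_before is hypothetical, it assumes mktemp is installed (common, but not required by POSIX), and it assumes the cutoff is already in DDD.DDT form. Entries dated at or after the cutoff are kept.

```shell
# Hypothetical wrapper: prune_before FILE DDD.DDT
# Writes the surviving entries to a temporary file next to FILE, then
# renames it over FILE only if awk succeeded.
prune_before() {
    file=$1
    cutoff=$2
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    awk -v cutoff="$cutoff" '
        BEGIN { split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", n, " ")
                for (i = 1; i <= 12; i++) mo[n[i]] = i * 100 }
        # Date line: four fields, the first one a known month name.
        NF == 4 && ($1 in mo) {
            ddt = $4
            gsub(/:/, "", ddt)
            stamp = ($3 * 10000 + mo[$1] + $2) "." ddt
            keep = (stamp + 0 >= cutoff + 0)   # flag is recomputed per entry
        }
        # Always keep the leading marker; after that, print only kept entries.
        NR == 1 || keep { print }
    ' "$file" > "$tmp" && mv "$tmp" "$file"
}
```

A call would look like prune_before records.txt 19980701.000000 (the file name is only an example). The file is read once, so the cost grows linearly with its size. Note that mktemp creates the temporary file with restrictive permissions, so the replaced file may end up with different permissions than the original.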

I hope this is helpful.  Certainly the date mapping approach
makes things much easier.  Think of it as a Stardate, from Star Trek.
I liked that show, and its followers/sequels/spinoffs.

Chuck Demas
Needham, Mass.

--
  Eat Healthy    |   _ _   | Nothing would be done at all,

  Die Anyway     |    v    | That no one could find fault with it.



Thu, 25 Jan 2001 03:00:00 GMT  
 