reading flat-file db and replacing a word 

how do i make my program read a flat-file db, replace a certain word and
then save it as another file?


Tue, 20 Apr 2004 00:46:21 GMT  
 reading flat-file db and replacing a word

Quote:

>how do i make my program read a flat-file db, replace a certain word and
>then save it as another file?

Perl FAQ, part 5:

   "How do I change one line in a file/
    delete a line in a file/
    insert a line in the middle of a file/
    append to the beginning of a file?"

--
    Tad McClellan                          SGML consulting

    Fort Worth, Texas



Tue, 20 Apr 2004 01:06:46 GMT  
 reading flat-file db and replacing a word


Quote:
> how do i make my program read a flat-file db, replace a certain word and
> then save it as another file?

You do it with perl.


Tue, 20 Apr 2004 04:10:43 GMT  
 reading flat-file db and replacing a word
This should work:

open(FILE1, "filename1");
$file1 = <FILE1>;
close (FILE1);

$file1 =~ s/$word1/$word2/gi;

open(FILE2, ">filename2");
print FILE2 $file1;
close(FILE2);

You can mess around with the search and replace to make it do exactly
what you want and it might help to use arrays and a foreach loop
depending on what kind of search and replace you're doing.

- s. (sameer iyengar)
(http://www.yenga.com)

Quote:

> how do i make my program read a flat-file db, replace a certain word and
> then save it as another file?



Tue, 20 Apr 2004 06:02:24 GMT  
 reading flat-file db and replacing a word

  s> This should work:
  s> open(FILE1, "filename1");

you don't check the return of open.

  s> $file1 = <FILE1>;

and what about the rest of the lines in the file?

  s> $file1 =~ s/$word1/$word2/gi;

  s> open(FILE2, ">filename2");

again no check on open.

  s> print FILE2 $file1;
  s> close(FILE2);

  s> You can mess around with the search and replace to make it do exactly
  s> what you want and it might help to use arrays and a foreach loop
  s> depending on what kind of search and replace you're doing.

there is no such operator as search and replace in perl. it is
substitute. it does not do a search, but a regular expression match.
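A minimal sketch of the snippet with those points addressed: both opens are checked, and every line is processed rather than only the first. The sub name replace_word and its arguments are made up for illustration, and the \b anchors are an assumption that "word" means a whole word; drop them to match inside longer words as the original s/// did.

```perl
use strict;
use warnings;

# Copy $in to $out, replacing every whole-word occurrence of $old with $new.
# Returns the number of replacements made.
sub replace_word {
    my ($in, $out, $old, $new) = @_;

    open(my $src, '<', $in)  or die "Cannot read '$in': $!";
    open(my $dst, '>', $out) or die "Cannot write '$out': $!";

    my $count = 0;
    while (my $line = <$src>) {            # every line, not just the first
        # \Q...\E treats $old as literal text rather than as a regex.
        my $n = ($line =~ s/\b\Q$old\E\b/$new/g);
        $count += $n if $n;
        print {$dst} $line;
    }

    close($src);
    close($dst) or die "Cannot close '$out': $!";
    return $count;
}
```

Called as replace_word('filename1', 'filename2', 'abc', 'def'), it leaves the input untouched and holds only one line in memory at a time.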

uri

--

SYStems ARCHitecture and Stem Development ------ http://www.stemsystems.com
Search or Offer Perl Jobs  --------------------------  http://jobs.perl.org



Tue, 20 Apr 2004 06:11:13 GMT  
 reading flat-file db and replacing a word

Quote:


>>how do i make my program read a flat-file db, replace a certain word and
>>then save it as another file?

> Perl FAQ, part 5:

>    "How do I change one line in a file/
>     delete a line in a file/
>     insert a line in the middle of a file/ append to the beginning of a
>     file?"

This is of interest to me, too, but the FAQ doesn't really help, as I
don't want to read the entire file into an array or variable. And I can't
find a tutorial which describes how to solve that in another way. Isn't
there a method using seek and tell to replace or delete an entry in a file?

Would be glad to have a site on the net which describes that in detail.

Carsten

P.S. The problem is not that I have to replace and delete entries at the
same time, it's only replace or delete not both.



Tue, 20 Apr 2004 15:25:35 GMT  
 reading flat-file db and replacing a word


Quote:
> how do i make my program read a flat-file db, replace a certain word and
> then save it as another file?

$oldword = 'abc';
$newword = 'def';

open (IFIL, 'infile.txt');
open (OFIL, '>outfile.txt');

while ($lin = <IFIL>)
   {
   $lin =~ s/$oldword/$newword/g;
   print OFIL $lin;
   }

close (OFIL);
close (IFIL);

K Brownhill



Tue, 20 Apr 2004 15:41:54 GMT  
 reading flat-file db and replacing a word

Quote:



>>>how do i make my program read a flat-file db, replace a certain word and
>>>then save it as another file?

>> Perl FAQ, part 5:

>>    "How do I change one line in a file/
>>     delete a line in a file/
>>     insert a line in the middle of a file/ append to the beginning of a
>>     file?"

>This is of interest for me, too. But the FAQ doesn't really help.

I think it does help you, you just didn't recognize it  :-)

If there is something in the FAQ answer that you don't understand,
you can ask about it here and we'll try and help clarify.

Quote:
>As I
>don't want to read the entire file in an array or variable.

The FAQ answer gives code that does not read the entire file into
an array (more than one example even). It shows how to do it with
only a single line in memory at any time.
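A rough sketch of that line-at-a-time idea (not the FAQ's own code): read the old file one line at a time, write the lines you want to keep to a temporary file, then rename it over the original. The sub name edit_file, the callback convention and the ".tmp" suffix are assumptions made for this example.

```perl
use strict;
use warnings;

# Rewrite $file line by line through a temporary file.
# $edit receives each line and returns the text to keep,
# or undef to drop the line. Only one line is in memory at a time.
sub edit_file {
    my ($file, $edit) = @_;
    my $tmp = "$file.tmp";                 # hypothetical temp-file name

    open(my $in,  '<', $file) or die "Cannot read '$file': $!";
    open(my $out, '>', $tmp)  or die "Cannot write '$tmp': $!";

    while (my $line = <$in>) {
        my $kept = $edit->($line);
        print {$out} $kept if defined $kept;
    }

    close($in);
    close($out) or die "Cannot close '$tmp': $!";
    rename($tmp, $file) or die "Cannot rename '$tmp' to '$file': $!";
    return;
}
```

Deleting a record is then a callback that returns undef for the matching line, and replacing one is a callback that returns the new text. The temporary file exists only for the duration of the call.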

Quote:
>And I can't
>find a tutorial which describes how to solve that in another way.

The FAQ answer does show "other ways" so I don't know what you
mean there...

Quote:
>Isn't
>there a method using seek and tell to replace or delete an entry in a file?

Reread the part that starts with:
   "(There are exceptions in special circumstances".

If your circumstances are not those special circumstances, then
seek() isn't going to do it for you.

Quote:
>Would be glad to have a site on the net which describes that in detail.

I fail to see the shortcomings in the FAQ answer that you allude to.

What is wrong with the answer given in the FAQ?

--
    Tad McClellan                          SGML consulting

    Fort Worth, Texas



Tue, 20 Apr 2004 16:12:40 GMT  
 reading flat-file db and replacing a word


| > how do i make my program read a flat-file db, replace a certain word and
| > then save it as another file?

use strict;
use warnings;

| $oldword = 'abc';
| $newword = 'def';
|
| open (IFIL, 'infile.txt');
| open (OFIL, '>outfile.txt');

Always check the return value of open as Uri suggested in his answer to
another post in this thread.

| while ($lin = <IFIL>)
|    {
|    $lin =~ s/$oldword/$newword/g;

What if the string to be replaced contains newlines?

|    print OFIL $lin;
|    }
|
| close (OFIL);
| close (IFIL);
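If the string to be replaced can span lines, a line-by-line loop will never see it in one piece. One way around that, sketched here under the assumption that the file fits comfortably in memory, is to read the whole file into one scalar first. The sub name replace_in_whole_file is made up for this example.

```perl
use strict;
use warnings;

# Replace every literal occurrence of $old (which may contain newlines)
# with $new, reading $in in one piece and writing the result to $out.
# Returns the number of replacements made.
sub replace_in_whole_file {
    my ($in, $out, $old, $new) = @_;

    open(my $src, '<', $in) or die "Cannot read '$in': $!";
    my $text = do { local $/; <$src> };    # undefined $/ means "slurp"
    close($src);
    $text = '' unless defined $text;

    my $count = ($text =~ s/\Q$old\E/$new/g) || 0;

    open(my $dst, '>', $out) or die "Cannot write '$out': $!";
    print {$dst} $text;
    close($dst) or die "Cannot close '$out': $!";
    return $count;
}
```

The trade-off is memory: the whole file is held in a single string, so this suits small files better than large ones.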

Steffen
--
$_=q;0cb212c210b0bb010c0113bb0c410c0b516c0bb3d212c2b0b0b016b6cb2b2c21010c0
b41110b3bba0e0c0d2c4b2b6bc013d2c0d0b01012b0b0;;s/\n//g;s/(\d)/$1<2?$1:'0'x




Tue, 20 Apr 2004 16:50:51 GMT  
 reading flat-file db and replacing a word

Quote:

> I think it does help you, you just didn't recognize it  :-)

Hmm, maybe I haven't explained clearly what I want to do.
OK, I will read the FAQ again, but in the meantime here is what
I want to do:

I have a CSV File which looks like this

ad001;5;1011002;User X Y,

where the first field contains alphanumeric characters, the second is
numeric only, the third is numeric with a length of 7, and the 4th could be
nearly anything.

So I want to change or delete a whole line, but what I don't want to do
is make temporary copies of the file itself or read the whole content
into an array. Currently I read everything into an array, but this should
change as the file size increases. Currently the file is only 63K, but I
don't think it's a good idea to have 500K - 1MB stored in an array on a
shared host.

So I wonder if there is a way of replacing or deleting a whole line in the
text file without making temporary copies or reading the whole content
into an array. But as I said, it's too late for today, and I will reread
the FAQ.

Quote:
> If there is something in the FAQ answer that you don't understand, you
> can ask about it here and we'll try and help clarify.

Thanx for the offer :-)

Quote:

> The FAQ answer gives code that does not read the entire file into an
> array (more than one example even). It shows how to do it with only a
> single line in memory at any time.

> Reread the part that starts with:
>    "(There are exceptions in special circumstances".

I'll do it, but tomorrow; today my eyes are flickering because I sat too
long in front of my computer. It would be nice if you are there tomorrow,
too.
Quote:

> I fail to see the shortcomings in the FAQ answer that you allude to.

> What is wrong with the answer given in the FAQ?



Wed, 21 Apr 2004 03:39:23 GMT  
 reading flat-file db and replacing a word

Quote:
> So I wonder if there is a way of replacing or deleting a whole line in the
> text file without making temporary copies and reading in the whole content
> into an array. But as I said, it's too late for today, and I will reread
> the FAQ.

This is just physics. If the length of the file doesn't change, you can seek
and write over the top of the bytes. If the length of the file changes, then
you must make a gap or close a gap.
That means moving all the data after this point.

You can close a gap by moving the data after it back in blocks, then pad out
the file.
To open a gap you need to read the bit you are going to clobber each time
before you clobber it.
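Closing a gap can be sketched roughly as follows. Instead of padding out the file afterwards, this version shortens it with truncate, which is a swapped-in choice and not something stated above. The sub name close_gap and the 8192-byte block size are assumptions for illustration.

```perl
use strict;
use warnings;

# Remove $len bytes starting at byte offset $pos by shifting everything
# after the gap toward the start of the file, one block at a time,
# then cutting the leftover tail off the end.
sub close_gap {
    my ($file, $pos, $len) = @_;
    my $block = 8192;                      # copy size; an arbitrary choice

    open(my $fh, '+<', $file) or die "Cannot open '$file': $!";
    binmode($fh);

    my ($read_at, $write_at) = ($pos + $len, $pos);
    while (1) {
        seek($fh, $read_at, 0) or die "Cannot seek: $!";
        my $got = read($fh, my $buf, $block);
        die "Cannot read: $!" unless defined $got;
        last if $got == 0;                 # reached end of file
        seek($fh, $write_at, 0) or die "Cannot seek: $!";
        print {$fh} $buf or die "Cannot write: $!";
        $read_at  += $got;
        $write_at += $got;
    }

    truncate($fh, $write_at) or die "Cannot truncate: $!";
    close($fh) or die "Cannot close '$file': $!";
    return;
}
```

Deleting a line then means finding its starting offset and its length including the newline, and passing both to close_gap. Everything after the line still has to be copied, which is the cost described above.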

You can avoid this by using fixed length fields like most databases do.

Or you can design your data structure with a delete flag so that to make a
line bigger you overwrite the start with XXXXX or whatever then stick the
new longer data at the end. Then do a garbage collection at the weekend.

You can make a copy in memory or in another file.

None of this is difficult to do; it is just tedious. There are no magical
solutions: the data must be moved one way or another. Any high(er) level
language which gives you an INSERT INTO or whatever still has to adopt one
of these strategies to do it; it is just abstracted from the programmer.

Well, there is one magical solution which I use: you put each record in a
different file, with the index field as the file name. That way the OS does
all the work for you. I admit this is not always a good solution, but
sometimes it is.
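A small sketch of the one-file-per-record idea, assuming keys made only of letters and digits (like ad001) so they are safe to use as file names. All sub names here are hypothetical.

```perl
use strict;
use warnings;
use File::Spec;

# Map a record key to a file inside $dir, refusing anything that
# could escape the directory.
sub record_path {
    my ($dir, $key) = @_;
    die "Unsafe key '$key'" unless $key =~ /\A[A-Za-z0-9]+\z/;
    return File::Spec->catfile($dir, $key);
}

# Create or replace a record: the whole file is simply rewritten.
sub save_record {
    my ($dir, $key, $data) = @_;
    my $path = record_path($dir, $key);
    open(my $fh, '>', $path) or die "Cannot write '$path': $!";
    print {$fh} $data;
    close($fh) or die "Cannot close '$path': $!";
    return;
}

# Return the record's content, or undef if there is no such record.
sub load_record {
    my ($dir, $key) = @_;
    my $path = record_path($dir, $key);
    open(my $fh, '<', $path) or return;
    my $data = do { local $/; <$fh> };
    close($fh);
    return $data;
}

# Deleting a record is just removing its file.
sub delete_record {
    my ($dir, $key) = @_;
    return unlink(record_path($dir, $key));
}
```

Replace and delete never touch any other record, at the price of one small file per record.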

--

Stuart Gall
------------------------------------------------
This message is not provable.



Wed, 21 Apr 2004 11:16:42 GMT  
 reading flat-file db and replacing a word
Quote:

> Or you can design your data structure with a delete flag so that to make
> a line bigger you overwrite the start with XXXXX or whatever then stick
> the new longer data at the end. Then do a garbage collection at the
> weekend.

> You can make a copy in memory or in another file.

I have given it some thought and, well, I can hear everyone saying "use a DB
then", but this is what I came up with:

I have my file, for every line in the file I use a fixed line length

like:

ab1000;5;1011002;some comments, here#########################
#############################################################
dd500;6;1011115;some comments, here##########################

then for input, the # character is forbidden, because it is used as a
separator. When replacing an entry I do:

open (FILE, "+< file.dat")
or die "Could not open File $!\n";
while(<FILE>)
        {
        if ($_=~m/^($expression)#+$/)
                {
                $pos = tell(FILE);
                $size = length $1;
                }
        }
$pos -= $size;
seek(FILE, $pos, 0);
$fill = $fixed_size - $size;
$fill = '#'x$fill;
$replace .= $fill;
print FILE $replace;
close(FILE);

If I want to delete a line I simply fill it with # until it reaches the
fixed line length. Every delete is counted and the result is stored in a
separate file. If I want to add a line, I search for a line beginning
with # and fill the line up with the string to add.

Every day or weekend I look up the separate file where the delete and add
operations are counted, and if the number of deletes reaches more than
20% of the adds, for example, a cron job rewrites the file without the
empty ##### lines.

What do you think of that? Will this be possible solution or did I
overlook something important?

Carsten



Wed, 21 Apr 2004 22:02:47 GMT  
 reading flat-file db and replacing a word

Quote:
> What do you think of that? Will this be possible solution or did I
> overlook something important?

It is one of the possible solutions.
Your code has a bug, I think:
Quote:
>$size = length $1;
>}
>}
>$pos -= $size;

size is the length of the data not the record.
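One way to avoid that offset arithmetic altogether is to remember where each line starts, by calling tell before reading it, and to seek back to that offset when the line matches. A minimal sketch, assuming every line holds exactly $fixed_size characters plus a newline; the sub name replace_record is made up, and $expression is treated as a regex for the data part as in the code above.

```perl
use strict;
use warnings;

# Overwrite the first record whose data part matches $expression with
# $replace, padded with '#' so the line length stays at $fixed_size.
# Returns 1 if a record was replaced, 0 otherwise.
sub replace_record {
    my ($file, $expression, $replace, $fixed_size) = @_;
    die "Replacement is longer than a record"
        if length($replace) > $fixed_size;

    open(my $fh, '+<', $file) or die "Could not open '$file': $!";
    binmode($fh);

    my $found = 0;
    my $start = tell($fh);                 # offset of the line about to be read
    while (my $line = <$fh>) {
        if ($line =~ /^(?:$expression)#*$/) {
            seek($fh, $start, 0) or die "Cannot seek: $!";
            print {$fh} $replace . '#' x ($fixed_size - length $replace);
            $found = 1;
            last;
        }
        $start = tell($fh);                # next line begins here
    }

    close($fh) or die "Cannot close '$file': $!";
    return $found;
}
```

Because the replacement is padded to the same width as the old record, the trailing newline and all following records stay where they are.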

I would suggest using syswrite and not print, then there is no danger of a
bug in the code overwriting other records.
Also I don't like the '#' delimiter; why don't you terminate the string with a
semicolon?
That way you can just split and discard the last term when you read.
Also when you write you just add ";######################" (lots of #'s) on
the end and use syswrite to truncate the string.
You could use # as a rogue value in the first field to indicate the record
is deleted.

Perhaps it is better to forget the <FILE> method altogether; if you are going
to use fixed length fields you could just use syswrite and sysread.

Something like (Untested)

$LEN=length of record
open (FILE, "+< file.dat")
    or die "Could not open File $!\n";
while(sysread FILE,$Rec,$LEN)
{
    if (test the fields for a match)
    {
        sysseek(FILE, -$LEN, 1);
        $Fields[5]='#'x$LEN;

        syswrite FILE,$replace,$LEN;
        last;
    }
}

close(FILE);
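A runnable take on the same idea, offered as a sketch under a few assumptions: each record is exactly $len bytes including its trailing newline, the "test the fields for a match" step is passed in as a code reference, and padding is done with '#' as elsewhere in this thread. It positions with sysseek, since perlfunc advises against mixing the buffered seek with sysread and syswrite. The sub name replace_fixed_record is hypothetical.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_CUR);

# Replace the first record for which $match->($record) is true.
# Returns 1 if a record was replaced, 0 otherwise.
sub replace_fixed_record {
    my ($file, $len, $match, $replace) = @_;

    my $body = $len - 1;                   # room before the trailing newline
    die "Replacement is longer than a record" if length($replace) > $body;
    my $new = $replace . '#' x ($body - length $replace) . "\n";

    open(my $fh, '+<', $file) or die "Could not open '$file': $!";
    binmode($fh);

    my $found = 0;
    while (1) {
        my $got = sysread($fh, my $rec, $len);
        die "Cannot read: $!" unless defined $got;
        last if $got < $len;               # end of file or a short fragment
        next unless $match->($rec);

        # Step back over the record just read and overwrite it.
        sysseek($fh, -$len, SEEK_CUR) or die "Cannot seek: $!";
        my $written = syswrite($fh, $new, $len);
        die "Short write" unless defined $written && $written == $len;
        $found = 1;
        last;
    }

    close($fh) or die "Cannot close '$file': $!";
    return $found;
}
```

Since every read and write is exactly one record long, a bug in the matching code can damage at most the record it is looking at.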

--

Stuart Gall
------------------------------------------------
This message is not provable.



Thu, 22 Apr 2004 01:19:35 GMT  
 reading flat-file db and replacing a word
Quote:


>> Or you can design your data structure with a delete flag so that to make
>> a line bigger you overwrite the start with XXXXX or whatever then stick
>> the new longer data at the end. Then do a garbage collection at the
>> weekend.

>> You can make a copy in memory or in another file.

>I have given it some thought and, well, I can hear everyone saying "use a DB
>then", but this is what I came up with:

>I have my file, for every line in the file I use a fixed line length

                                                    ^^^^^^^^^^^^^^^^^

That gives you the keys to the kingdom. If you are willing to
live with fixed length records, then you meet the "special
circumstances":

   "Another is replacing a sequence of bytes with
    another sequence of the same length."

But that doesn't help with deleting.

Quote:
>ab1000;5;1011002;some comments, here#########################
>then for input, the # character is forbidden, because it is used as a
>separator.

Eh? That looks to me like semi-colon is being used as a separator,
and # is being used as a padding character.

Why not use fixed length fields within your fixed length records?

Then there would be no need for spending space on separators at all.
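Fixed-length fields map naturally onto pack and unpack with the A format, which pads with spaces when packing and strips trailing spaces when unpacking. The field widths below (8, 4, 7 and 40) are invented for this sketch; only the 7 comes from the description of the third field earlier in the thread.

```perl
use strict;
use warnings;

my $TEMPLATE   = 'A8 A4 A7 A40';           # hypothetical field widths
my $RECORD_LEN = 8 + 4 + 7 + 40;           # 59 bytes, no separators needed

# Build one fixed-length record from a list of four field values.
sub pack_record {
    my @fields = @_;
    return pack($TEMPLATE, @fields);
}

# Split a fixed-length record back into its four fields.
sub unpack_record {
    my ($record) = @_;
    return unpack($TEMPLATE, $record);
}

# Byte offset of record number $n (counting from 0) in the file.
sub record_offset {
    my ($n) = @_;
    return $n * $RECORD_LEN;
}
```

With every record the same size, record number n always starts at n times the record length, so a seek can jump straight to it and an update is a same-length overwrite. Note that pack silently truncates a value that is longer than its field.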

[ snip a bunch o' code ]

Quote:
>If I want to delete a line I just simply fill it with # until it reached
>the fixed line length. Every delete is counted and the result is stored in
>separate file. If I want to add a line, I search for a line beginning
>with # and fill the line up with the string to add.

That's a lot of housekeeping for very little gain (reduced disk space).

Quote:
>Every day or weekend I look up the separate file where the delete and add
>operations are counted, and if the amount of delete's reaches more than
>20% of the add's for example a cron job is rewriting the file without the
>empty ##### lines.

Yet more housekeeping.

Quote:
>What do you think of that?

I think it is awful (no offense).

It is high maintenance. It has high development costs. It will
have high maintenance/debugging costs.

Quote:
>Will this be possible solution or did I
>overlook something important?

I think you may have overlooked money. Lots of people count
that in the "important" category  :-)

I still have not seen a cost-benefit analysis justifying such
an expenditure of effort. Let's back up a bit. You said:

CM> what I actually don't want to do is, to make temporary
CM> copies of the file itself or read the whole content into an array
           ^^^^^^^^^^^                                  ^^^^^^^^^^^^^

The FAQ shows you how to avoid your second objection, but you've
never said why you do not want to make a copy of the file itself.

   Why do you not want to make a copy of the file itself?

I don't see anything in your description that would preclude
just using the -i command line switch and ripping through
your data with a while <> loop.
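Inside a script, the effect of the -i switch can be had by setting $^I and @ARGV before the <> loop, an idiom that also appears in perlfaq5. A minimal sketch; the sub name edit_in_place and the ".bak" backup suffix are choices made for this example.

```perl
use strict;
use warnings;

# Replace every literal occurrence of $old with $new in $file,
# keeping the original as "$file.bak".
sub edit_in_place {
    my ($file, $old, $new) = @_;

    # Same effect as running perl with -i.bak on $file.
    local ($^I, @ARGV) = ('.bak', $file);

    while (my $line = <>) {
        $line =~ s/\Q$old\E/$new/g;
        print $line;                       # goes into the rewritten file
    }
    return;
}
```

From the command line, the equivalent would be something like perl -i.bak -pe 's/abc/def/g' file.dat (file name hypothetical). Perl still writes a new file behind the scenes, but only one line is in memory at a time.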

Development time would be, I dunno, 10 or 20 times more if
you must not use a temporary file. Disk space is cheap.
Programmer time is expensive.

I have not seen anything that is forcing you to go with the
(greatly) more expensive route.

Why do you feel compelled to spend so much time in order to
save disk space that will be used for only a few seconds?

Perhaps you have a good reason and just haven't shared it
with us?

--
    Tad McClellan                          SGML consulting

    Fort Worth, Texas



Thu, 22 Apr 2004 02:19:44 GMT  
 reading flat-file db and replacing a word

Quote:

> size is the length of the data not the record.

> I would suggest using syswrite and not print, then there is no danger of
> a bug in the code overwriting other records. Also I dont like the '#'
> delimiter why dont you terminate the string with a semicolon?

As I said just a thought. :-)

Quote:

> Something like (Untested)

I can test :-)

Quote:
> $LEN=length of record
> open (FILE, "+< file.dat")
> or die "Could not open File $!\n";
> while(sysread FILE,$Rec,$LEN)
> {

>     if (test the fields for a match)
> {
> sysseek(FILE, -$LEN, 1);
> $Fields[5]='#'x$LEN;

> syswrite FILE,$replace,$LEN;
> last;
> }}
> close(FILE);

Thanks for your input. I was afraid of getting flamed when posting my code
snippet; glad that this has not happened.

As I mentioned above, it was just a thought. The current situation is
that the records are not of a fixed length; to get there I would have to
port all the files. At the moment I have records like this:

aaa100;5;1011002;Comments, Notations,

The problem is that only the 3rd field has a fixed length; all other
fields can vary. For instance, I tried to find a way to delete a whole
line, but how do I do that? I mean delete a whole line within a file,
where the last character is always \n. And when an entry is changed, the
new entry could be shorter than the old one, so I cannot simply overwrite
the old one.

Sorry if I'm asking about something which may be very simple.
I also think the long-term plan is to port all files to a fixed record
length or use MySQL, but you know, if your boss wants everything done by
yesterday at the latest, there is no room for finding perfect solutions.
That's why I'm currently loading everything into an array and making my
modifications there. This is still possible as the files have sizes around
63K, but what about the future?

Again, thanks so far for your input; you have already helped me a lot.

Carsten



Thu, 22 Apr 2004 02:22:05 GMT  
 