Substituting __ for :: in every HTML file, and renaming the HTML files

I am using vhdldoc, which creates HTML files.
File names are of the form: XXX::YY.html
Within the HTML files, links to other files have the same form (with the ::).
This works great on Linux/Unix, but the Windows file system does not allow
file names containing ::.
QUESTION: How do I write an awk program that
1. traverses every .html file,
2. renames each file to the same name, except that "::" is replaced with "__", and
3. changes every instance of "::" to "__" within the file?

I used awk many, many years ago and have totally forgotten it!
Thanks,
---------------------------------------------------------------------------
Ben Cohen     Publisher, Trainer, Consultant    (310) 721-4830  

Author of following textbooks:
* Real Chip Design and Verification Using Verilog and VHDL, 2002 isbn 0-9705394-2-8
* Component Design by Example, 2001 isbn 0-9705394-0-1
* VHDL Coding Styles and Methodologies, 2nd Edition, 1999 isbn 0-7923-8474-1
* VHDL Answers to Frequently Asked Questions, 2nd Edition, isbn 0-7923-8115
------------------------------------------------------------------------------



Tue, 28 Dec 2004 12:21:32 GMT  

Quote:

> Am using vhdldoc, which creates HTML files.
> File names of form: XXX::YY.html
> Within HTML files: Links to other files of same form (with the ::)
> This works great on Linux/Unix, but the Windows file system does not allow
> file names containing ::.
> QUESTION: How do I write an awk program that
> 1. traverses every .html file
> 2. rename the file with same name, except "::" is substituted with "__"
> 3. Change every instance of "::" to "__" within the file.
> ------------------------------------------------------------------------------

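Try:

        # At the first line of each input file, derive the output name
        FNR == 1 {
                file = FILENAME ; gsub( /::/, "__", file)
        }

        # Fix every line and write it to the renamed file
        {
                gsub( /::/, "__" )
                print > file
        }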

Jurgen



Tue, 28 Dec 2004 18:44:57 GMT  


Quote:
> try

> FNR == 1 {
> file = FILENAME ; gsub( /::/, "__", file)
> }

> {
> gsub( /::/, "__" )
> print > file
> }

> Jurgen

You may want to add a close() in there if you have a lot of files.

Quote:
> FNR == 1 {

      if (file != "") close(file)

>     file = FILENAME ; gsub( /::/, "__", file)
> }



Tue, 28 Dec 2004 21:03:34 GMT  
Thanks for the reply.  The following worked, but I still have one simple question.

    FNR == 1 {
        file = FILENAME ; gsub( /::/, "__", file)
    }

    {
        gsub( /::/, "__" )
        gsub("%3A%3A", "__")
        print > file
    }

To run it:

    awk -f awkcmd.awk *

Do I need the close(file)?  It worked without it.
With the suggested change,

Quote:
> You may want to add a close() in there if you have a lot of files.
> > FNR == 1 {
>    if (file != "") close(file)
> >     file = FILENAME ; gsub( /::/, "__", file)
> > }

it bombed!

Thanks,
Ben


Wed, 29 Dec 2004 05:49:35 GMT  

Quote:

> Am using vhdldoc, which creates HTML files.
> File names of form: XXX::YY.html
> Within HTML files: Links to other files of same form (with the ::)
> This works great on Linux/Unix, but the Windows file system does not allow
> file names containing ::.
> QUESTION: How do I write an awk program that
> 1. traverses every .html file
> 2. rename the file with same name, except "::" is substituted with "__"
> 3. Change every instance of "::" to "__" within the file.

> I used awk many many years ago, and totally forgot it!
> Thanks,

If .html files are under single directory,
    for h in *.html; do
        sed 's/::/__/g' $h > $h.tmp && mv $h.tmp $h
    done

If they are under directory tree,
    find . -name '*.html' | while read h; do
        sed 's/::/__/g' $h > $h.tmp && mv $h.tmp $h
    done

--

8-CPU Cluster, Hosting, NAS, Linux, LaTeX, python, vim, mutt, tin



Wed, 29 Dec 2004 07:01:54 GMT  

Quote:
> Thanks for the reply.  The following worked, but I still have one simple question.
> FNR == 1 {
> file = FILENAME ; gsub( /::/, "__", file)
> }
>
> {
> gsub( /::/, "__" )
> gsub("%3A%3A", "__")
> print > file
> }
>
> to activate:
> awk -f awkcmd.awk *
>
> do I need the close(file)?
> Worked without it.
> The statement:
> > FNR == 1 {
> >    if (file != "") close(file)
> >     file = FILENAME ; gsub( /::/, "__", file)
> > }
> Bombed!

I'm not sure why it bombed with that line.  Perhaps a more detailed
description would help.  I'm using gawk 3.0.4 on Windows 98, and this program,
which has a similar structure, works for me:
    FNR == 1 {
        if (file != "") close(file)
        file = FILENAME ; file = "_" file
    }
    {
        $0 = "_" $0
        print > file
    }

Anyway, as for why you would want to close files in the first place:
whenever you write to a new file (>), append to an existing file (>>), or
read from a file with getline (<), awk opens the file and keeps it open in
case you want to, say, write or read another line.  When awk is done, all
of the files it opened are closed.  But some (most?) platforms limit the
number of files that can be open at one time, and even without that
restriction, keeping too many files open is a resource drain.

So, if you were "fixing" 1000 vhdldoc HTML files in one pass (awk -f fixer.awk
*.html), each output file would stay open from the first time you printed to
it until the end of the run.  While working on the first file, one output
file would be open; while working on the last file, 1000 would be open.
awk closes the input files it's working with as it goes along.

The gawk manual has this to say:
"If you use more files than the system allows you to have open, gawk
attempts to multiplex the available open files among your data files. gawk's
ability to do this depends upon the facilities of your operating system, so
it may not always work. It is therefore both good practice and good
portability advice to always use close on your files when you are done with
them."

Try it without the "if()".  All the test does is keep awk from initially
closing the file "".  Usually close("") does nothing, but I'm always a little
paranoid that "" might be interpreted as stdout or stdin on some platforms.

    - Dan
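
Putting the pieces of the thread so far together, a minimal sketch (not any poster's exact script) might look like the following. The wrapper name fix_colon_names and the awk variables out and skip are made up for illustration. It also assumes that a file whose name contains no "::" should be left alone, since printing to the very file awk is reading would clobber it.

```shell
#!/bin/sh
# fix_colon_names FILE...   (hypothetical helper name)
# For each FILE whose name contains "::", write a fixed copy under the
# name with "::" replaced by "__".  The originals are left in place.
fix_colon_names() {
    awk '
    FNR == 1 {
        if (out != "") close(out)      # done with the previous output file
        out = FILENAME
        gsub(/::/, "__", out)
        skip = (out == FILENAME)       # assumption: never overwrite the input
        if (skip) out = ""
    }
    skip { next }
    {
        gsub(/::/, "__")               # links written literally
        gsub(/%3A%3A/, "__")           # links written URL-encoded
        print > out
    }
    ' "$@"
}
```

It would be called as `fix_colon_names *.html`. At most one output file is open at any time, which is the point of the close() call.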



Wed, 29 Dec 2004 10:32:12 GMT  

Quote:

> If .html files are under single directory,
>     for h in *.html; do
> sed 's/::/__/g' $h > $h.tmp && mv $h.tmp $h
>     done

> If they are under directory tree,
>     find . -name '*.html' | while read h; do
> sed 's/::/__/g' $h > $h.tmp && mv $h.tmp $h
>     done


I don't follow sed as well as I should, given that awk and sed are in the same O'Reilly
book [See, everyone, I'm keeping on-topic!].  I think I see how the contents
of the file whose name is stored in the variable 'h' are changed and stuck
in a temp file--but how is the filename itself fixed, per requirement (2),
"rename the file..." ?

    - Dan



Wed, 29 Dec 2004 10:39:01 GMT  

Quote:

>> > 1. traverses every .html file
>> > 2. rename the file with same name, except "::" is substituted with "__"
>> > 3. Change every instance of "::" to "__" within the file.

>> If .html files are under single directory,
>>     for h in *.html; do
>> sed 's/::/__/g' $h > $h.tmp && mv $h.tmp $h
>>     done

>> If they are under directory tree,
>>     find . -name '*.html' | while read h; do
>> sed 's/::/__/g' $h > $h.tmp && mv $h.tmp $h
>>     done
> I don't follow sed as well as I should, given that awk and sed are in the same O'Reilly
> book [See, everyone, I'm keeping on-topic!].  I think I see how the contents
> of the file whose name is stored in the variable 'h' are changed and stuck
> in a temp file--but how is the filename itself fixed, per requirement (2),
> "rename the file..." ?

Sharp eye!  I missed that.  To change the filename as well (in bash),
    for h in *::*.html; do
        to=${h//::/__}
        sed 's/::/__/g' "$h" > "$to" && rm "$h"
    done
or
    find . -name '*::*.html' | ...

--

8-CPU Cluster, Hosting, NAS, Linux, LaTeX, python, vim, mutt, tin
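
The find variant above is left elided in the post. One way it could be completed, as a sketch only and not necessarily what the poster had in mind, is shown below. The function name fix_tree is hypothetical. The sketch assumes that only the file's own name (not the names of its parent directories) should change, and that no file name contains a newline.

```shell
#!/bin/sh
# fix_tree DIR   (hypothetical helper name)
# Under DIR, replace "::" with "__" in the contents and in the name of
# every .html file whose name contains "::".
fix_tree() {
    find "$1" -type f -name '*::*.html' | while IFS= read -r h; do
        dir=$(dirname "$h")
        base=$(basename "$h")
        # rename only the last path component, so that a directory
        # name containing "::" is not rewritten by accident
        to=$dir/$(printf '%s\n' "$base" | sed 's/::/__/g')
        # write the fixed copy first; remove the original only on success
        sed 's/::/__/g' "$h" > "$to" && rm "$h"
    done
}
```

It uses plain POSIX sh and sed for the name as well, so it does not depend on the bash-only ${var//pat/rep} expansion.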



Wed, 29 Dec 2004 12:35:09 GMT  
Hello,


Quote:
> do I need the close(file)?
> Worked without it.

Whenever you use

        print > file

with a new value of the variable file, a new file is opened.
The previous one is not closed; it remains open.

So it depends on your OS how many files you can have open simultaneously;
it may be 16, 256, 65536 ...

Generally, it's advisable to close() files in situations like this:
you write to a file for a while, then move on to another one, and the
previous one won't be used any more.

More detailed description is available here:
http://www.gnu.org/manual/gawk/html_node/Close-Files-And-Pipes.html

Quote:
> The statement:
> > FNR == 1 {
> >    if (file != "") close(file)
> >     file = FILENAME ; gsub( /::/, "__", file)
> Bombed!

Wow!  Can we have more details?  Was it just monitor which bombed or
was there also a thin blue strip of strange smelling smoke going out of
your keyboard?

Cheers,
        Stepan



Fri, 31 Dec 2004 14:52:54 GMT  

Quote:

> You may want to add a close() in there if you have a lot of files.

Thank you, Dan!

Of course, you are right; I forgot the "close" call.

As you and Stepan already said, the rule of thumb is: whenever files are accessed
explicitly by name (or through a pipe) within an awk script, a "close" call
should be added.
If this happens in a loop or with an unknown number of files, "close"
calls are a must.

Sorry, Jurgen



Sat, 01 Jan 2005 14:12:06 GMT  


Quote:

> > You may want to add a close() in there if you have a lot of files.

> Thank you, Dan!

> Of course, you are right; I forgot the "close" call.

> As you and Stepan already said, the rule of thumb is: whenever files are accessed
> explicitly by name (or through a pipe) within an awk script, a "close" call
> should be added.
> If this happens in a loop or with an unknown number of files, "close"
> calls are a must.

> Sorry, Jurgen

To be 100% safe you need to close _any_ command or file involved
in a pipe or redirection; otherwise there is a risk of the system running
out of memory, swap space, etc.  This advice is particularly true when
using gawk (or other versions of awk) on Win32 systems.

HTH
--
Peter S Tillier
"Who needs perl when you can write dc and sokoban in sed?"
peter{dot}tillier<at>btinternet[dot]com
To reply direct to me please use the above address
not the "Reply To" which activates a spam trap.
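
For the pipe case mentioned above, closing works the same way as for files: close() takes the exact command string that was used to open the pipe. The following is only an illustrative sketch; the function name sort_each and the ".sorted" suffix are invented, and it assumes file names that contain no double quotes, dollar signs or backquotes, since the name is pasted into a shell command.

```shell
#!/bin/sh
# sort_each FILE...   (hypothetical helper name)
# For every FILE, write its unique lines, sorted, to FILE.sorted.
sort_each() {
    awk '
    FNR == 1 {
        if (cmd != "") close(cmd)      # wait for the previous sort to finish
        cmd = "sort -u > \"" FILENAME ".sorted\""
    }
    { print | cmd }                    # one sort process per input file
    END { if (cmd != "") close(cmd) }  # close the last one explicitly
    ' "$@"
}
```

Without the close() in the FNR == 1 rule, every sort process would stay alive until awk exits.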



Sun, 02 Jan 2005 08:22:28 GMT  


Quote:




> To be 100% percent safe you need to close _any_ command or file involved
> in a pipe or redirection otherwise there is a risk of the system running
> out of memory, swap space, etc..  This advice is particularly true where
> using gawk (or other versions of awk) on Win32 systems.


In my experience, even calling system() too many times (at least in my old
3.0.4 gawk), on 9x or NT, can lead to bad things.
    - Dan


Sun, 02 Jan 2005 09:27:08 GMT  

Quote:






> In my experience, even calling system() too many times (at least in my old
> 3.0.4 gawk), on 9x or NT, can lead to bad things.
>     - Dan

Yep, that can do it too.  I think that it's to do with the way that M$
Win32 manages its memory.  I get lots of problems (and very slooow
response) when using Cygwin on Win32 platforms, but at least it works.

Regards
Peter
--
Peter S Tillier  peter{dot}tillier<at>btinternet[dot]com
To email me direct please use the above address
This post represents the views of the author and does not necessarily
accurately represent the views of BT



Sun, 02 Jan 2005 15:15:12 GMT  
On 13 Jul 2002 04:35:09 GMT, William Park

Quote:


>Sharp eye!  I missed that.  To change the filename as well (in bash),
>    for h in *::*.html; do
>        to=${h//::/__}
>        sed 's/::/__/g' "$h" > "$to" && rm "$h"
>    done
>or
>    find . -name '*::*.html' | ...

I see that this script only processes files whose names contain the substring
'::', whereas the OP specified:
1. traverses every .html file

What about files whose names have no :: but whose contents contain
occurrences of :: ?

Terry



Tue, 11 Jan 2005 06:51:39 GMT  

Quote:

>>Sharp eye!  I missed that.  To change the filename as well (in bash),
>>    for h in *::*.html; do
>>       to=${h//::/__}
>>       sed 's/::/__/g' "$h" > "$to" && rm "$h"
>>    done
>>or
>>    find . -name '*::*.html' | ...

> I see that this script only processes files whose names contain the substring
> '::', whereas the OP specified:
> 1. traverses every .html file

> What about files whose names have no :: but whose contents contain
> occurrences of :: ?

> Terry

Well, you know what to do in that case.  Hint: '*.html'

--

8-CPU Cluster, Hosting, NAS, Linux, LaTeX, python, vim, mutt, tin
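
Spelling the hint out for a single directory, a hedged sketch might be the following. The function name fix_dir is hypothetical, and the ".tmp" suffix is an assumption that no real file in the directory already uses that name. Every .html file has its contents fixed, and only those whose names contain "::" end up renamed.

```shell
#!/bin/sh
# fix_dir DIR   (hypothetical helper name)
# Replace "::" with "__" inside every .html file in DIR, and also in the
# file's name when the name contains "::".
fix_dir() {
    for h in "$1"/*.html; do
        [ -f "$h" ] || continue                  # the glob matched nothing
        base=$(basename "$h")
        to=$1/$(printf '%s\n' "$base" | sed 's/::/__/g')
        # always rewrite the contents through a temporary file
        sed 's/::/__/g' "$h" > "$h.tmp" || { rm -f "$h.tmp"; continue; }
        mv "$h.tmp" "$to"
        # if the name changed, the original is now redundant
        [ "$to" = "$h" ] || rm "$h"
    done
}
```

Going through a temporary file means sed never writes to the file it is still reading, which matters for the files whose names do not change.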



Tue, 11 Jan 2005 09:08:07 GMT  
 