remove rtf encoding from txt files 

Hi all,

Is there an elegant VBA way to remove RTF encoding from txt files? I
have huge (60MB) txt files (exports from a database) containing
'{*filter*}' Eastern European and Baltic characters as well as RTF
encoding that should be partially removed, i.e. all codes except the
font codes should go, and the text itself should stay of course.
Opening the files in Word is not an option because they are too large.
Being a real amateur at VBA, I couldn't come up with anything better
than the following.

With smaller test files I have tried cheating with Word 97. I read the
file line by line (each text segment is on a single line) and write
each line to a dummy file that contains the exact RTF preamble of the
source file, putting {\rtf before and } after. I then rename the dummy
file with an .rtf extension and open it in Word 97, so that this one
segment is displayed exactly as formatted.

If the input string does not contain any font codes ({\fxx), I select
the displayed text, assign the selection (which contains only
unformatted text) to a string, and write that unformatted string to
the target file. If it does contain a font code, I do not select it
but write it 'as is' (with RTF encoding) to the target file with a
marker, so that I can remove the superfluous formatting after
reimporting the lot into the database. Unfortunately, as you may have
guessed, this database does not allow searching on formatting
characteristics.
This really is an awkward procedure. It is extremely slow with all the
file opening, closing and renaming, and after a while Word 97 gets
tired and protests with an 'out of virtual memory' message. By the
way, is there a VBA command to free up memory?

Sorry to be so long-winded, but I hope I've stated my problem clearly.
Surely someone will tell me that my approach so far is pathetic and
redirect my efforts?

Many thanks,
Mark
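
The line-by-line triage described in the post above does not strictly
need Word to render each segment. As a rough illustration, here is a
minimal sketch in Python (not VBA, and not something posted in this
thread) that streams a file one line at a time, so memory use stays
flat. Lines containing a font code are written unchanged behind a
marker; all other lines have their RTF codes stripped. Everything named
here is an assumption made for the example: the MARKER string, the
function names, and the default code page cp1250 (cp1257 may be the
better guess for Baltic text). The tokenizer is deliberately simplified
and is not a full RTF parser: it does not interpret \uN Unicode escapes
and does not skip hidden destination groups.

```python
import re

# Hypothetical marker put in front of lines kept with their RTF codes.
MARKER = "@@RTF@@ "

# A font code such as \f0 or \f12 (does not match \fs24 or \fonttbl).
FONT_CODE = re.compile(r"\\f\d+")

# One simplified RTF token. Alternatives are tried in this order.
TOKEN = re.compile(
    r"\\'([0-9a-fA-F]{2})"    # group 1: hex-escaped byte, e.g. \'e8
    r"|\\([\\{}])"            # group 2: escaped literal \\ \{ \}
    r"|\\[a-zA-Z]+-?\d* ?"    # control word plus one optional space
    r"|\\[^a-zA-Z\r\n]"       # any other control symbol, dropped
    r"|[{}]"                  # group delimiters, dropped
)


def strip_rtf(line, codepage="cp1250"):
    """Return the text of one line with its RTF codes removed."""
    def replace(match):
        if match.group(1) is not None:
            # Decode the escaped byte with the assumed code page.
            raw = bytes.fromhex(match.group(1))
            return raw.decode(codepage, errors="replace")
        if match.group(2) is not None:
            return match.group(2)
        return ""
    return TOKEN.sub(replace, line)


def convert_line(line, codepage="cp1250"):
    """Keep font-coded lines as is (marked); strip all other lines."""
    if FONT_CODE.search(line):
        return MARKER + line
    return strip_rtf(line, codepage)


def convert_file(source, target, codepage="cp1250"):
    """Stream source to target line by line without loading it all."""
    with open(source, encoding=codepage, newline="") as src, \
         open(target, "w", encoding=codepage, newline="") as dst:
        for line in src:
            dst.write(convert_line(line, codepage))
```

A call such as convert_file("export.txt", "cleaned.txt") (both file
names hypothetical) would process the export in a single pass. The same
read-test-write loop should carry over to VBA using Line Input and
Print #, though that port is untested here.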



Sat, 20 Nov 2004 05:43:53 GMT  

SaveAs Text



Steve Hudson - Word Heretic, Sydney, Australia



Fri, 26 Nov 2004 12:38:23 GMT  
 