How to Read/Write RTF and Word Files? 

How to Read/Write RTF and Word Files?

Any suggestions on modules or code which
will allow reading and writing of the above?

Thanks



Thu, 06 Nov 2003 02:23:12 GMT  
[This followup was posted to comp.lang.python and a copy was sent to the
cited author.]


Quote:
> How to Read/Write RTF and Word Files?

> Any suggestions on modules or code which
> will allow reading and writing of the above?

> Thanks

http://www.wvware.com/


Fri, 07 Nov 2003 06:23:24 GMT  

Quote:

>How to Read/Write RTF and Word Files?

>Any suggestions on modules or code which
>will allow reading and writing of the above?

>Thanks

If my app was definitely going to be running on Windows with MS Word installed,
I would use COM.
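
A minimal sketch of the COM route, assuming Windows, an installed copy of MS Word, and the pywin32 package (none of which the post specifies beyond "COM"). The function name read_word_text is made up for illustration; Documents.Open, Content.Text, Close and Quit are part of Word's automation object model.

```python
import os


def read_word_text(path):
    """Return the plain text of a Word document by automating Word via COM.

    Assumption: runs on Windows with MS Word and pywin32 installed.
    """
    # Imported lazily because pywin32 only exists on Windows.
    import win32com.client

    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        # Word expects an absolute path.
        doc = word.Documents.Open(os.path.abspath(path))
        try:
            return doc.Content.Text
        finally:
            doc.Close(False)  # False: do not save changes
    finally:
        word.Quit()
```

Since Word itself opens the file, the same approach should cover RTF as well as .doc, at the cost of tying the application to one platform.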

---
Alan.



Fri, 07 Nov 2003 06:48:37 GMT  

Quote:

> How to Read/Write RTF and Word Files?

> Any suggestions on modules or code which
> will allow reading and writing of the above?

RTF is a published format. The standard isn't as rigorously followed as
one would like, but it is more consistently implemented than HTML.
Microsoft defined RTF and you can find the documentation on their site.
Start at the knowledge base article:

http://support.microsoft.com/support/kb/articles/Q86/9/99.ASP
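
Because RTF is plain text with backslash control words, writing a simple document needs no library at all. The sketch below is only an illustration: the function names rtf_escape and make_rtf and the choice of font are assumptions, and it covers writing only, not parsing.

```python
import struct


def rtf_escape(text):
    """Escape plain text for inclusion in an RTF document body."""
    out = []
    for ch in text:
        if ch in "\\{}":
            # Backslash and braces are RTF syntax and must be escaped.
            out.append("\\" + ch)
        elif ch == "\n":
            out.append("\\par\n")
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Non-ASCII: emit \uN? per UTF-16 code unit, where N is a
            # signed 16-bit decimal and "?" is the fallback character.
            raw = ch.encode("utf-16-le")
            for unit in struct.unpack("<%dH" % (len(raw) // 2), raw):
                if unit >= 0x8000:
                    unit -= 0x10000
                out.append("\\u%d?" % unit)
    return "".join(out)


def make_rtf(text):
    """Wrap escaped text in a minimal RTF header with one font."""
    header = "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Times New Roman;}}\\f0\\fs24 "
    return header + rtf_escape(text) + "}"
```

Writing the result of make_rtf("Hello") to a file named with an .rtf extension should give something a word processor can open.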

Microsoft Word uses a proprietary and undocumented format for .doc files.
The Word file format has changed significantly across versions of Word.
Whether by reverse-engineering or licensing the spec from Microsoft, quite
a few companies have implemented at least some Word import/export
capabilities.

Microsoft also gives away a Word document reader. I don't believe they have
a Linux version, though. But you don't need Word to view Word documents if
you have Windows or Mac OS running. The free Word reader may work under
Wine; it certainly works under VMWare.

Greg Jorgensen
PDXperts LLC
Portland, Oregon, USA



Fri, 07 Nov 2003 10:37:22 GMT  


Quote:

> > How to Read/Write RTF and Word Files?
    ...
> Microsoft Word uses a proprietary and undocumented format for .doc files.
> The Word file format has changed significantly across versions of Word.
> Whether by reverse-engineering or licensing the spec from Microsoft, quite
> a few companies have implemented at least some Word import/export
> capabilities.

Yep.  And there are opensource programs that attempt the same
feat (presumably after reverse-engineering, in this case).

http://www.fe.msk.ru/~vitus/catdoc/ offers such tools that manage
to extract some text from MS Word (and Excel) files most of the
time, and also points to other such tools.  These tools aren't
written in Python, but you might use them from Python, or maybe recode
in Python the algorithms & heuristics they embody.
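
One way to "use them from Python" is to run the command-line tool as a subprocess and capture its output. This is a sketch under assumptions: it presumes a catdoc executable is installed and on PATH, the function name extract_text is made up, and decoding the output as UTF-8 is a guess that depends on how catdoc is configured.

```python
import shutil
import subprocess


def extract_text(path, tool="catdoc"):
    """Run an external text extractor on `path` and return what it prints."""
    exe = shutil.which(tool)
    if exe is None:
        raise FileNotFoundError("%s not found on PATH" % tool)
    # check=True raises CalledProcessError if the tool exits with an error.
    result = subprocess.run([exe, path], capture_output=True, check=True)
    # Assumption: the tool writes UTF-8; undecodable bytes are replaced.
    return result.stdout.decode("utf-8", errors="replace")
```

The tool parameter makes it easy to swap in one of the other extractors the catdoc page points to, provided it also takes a file name and prints text to standard output.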

Alex



Fri, 07 Nov 2003 16:28:23 GMT  

Quote:



> > > How to Read/Write RTF and Word Files?
>     ...
> > Microsoft Word uses a proprietary and undocumented format for .doc files.
> > The Word file format has changed significantly across versions of Word.
> > Whether by reverse-engineering or licensing the spec from Microsoft, quite
> > a few companies have implemented at least some Word import/export
> > capabilities.

> Yep.  And there are opensource programs that attempt the same
> feat (presumably after reverse-engineering, in this case).

> http://www.fe.msk.ru/~vitus/catdoc/ offers such tools that manage
> to extract some text from MS Word (and Excel) files most of the
> time, and also points to other such tools.  These tools aren't
> in Python, but you might use them from Python, or maybe recode in
> Python the algorithms & heuristics they embody.

> Alex

WxWare which I tersely mentioned previously in this thread claims to do a
pretty good job of round tripping various Word versions up through Word
2000. It does both RTF and doc formats.

Dave LeBlanc



Sat, 08 Nov 2003 00:22:06 GMT  
Hi,

I have read some discussions on this mailing list about DIP (dependency
inversion principle) and theoretically everything is clear to me.

So if we have two classes A and B such that A->B (A depends on B)
and B->A then we have a cyclic dependency.
One way to break this cycle is to use DIP.
According to DIP we define an abstract interface C such that:
B uses C
A implements C

and we finish with acyclic dependencies: A->B, B->C, A->C.

Now, I want to ask you about implementation.

1. Would you really declare class ("interface") C in Python?
2. Because Python has only implicit interfaces, is it not enough just to
   implement C interface in A?
3. Finally, if I decide not to really make a C class but only
   to implement this interface in A, I will finish with almost the same
   situation as at the beginning: two classes A and B where A->B and
   B->(C part of A).
   I am a little confused at this point :-)  ... it seems like I did really
   nothing.

Your thoughts ?

-- Sasa Zivkov



Sat, 08 Nov 2003 01:15:03 GMT  
Quote:
> WxWare which I tersely mentioned previously in this thread claims to do a
> pretty good job of round tripping various Word versions up through Word
> 2000. It does both RTF and doc formats.

> Dave LeBlanc

Sorry, my mistake: it's WvWare, not WxWare

Dave LeBlanc



Sat, 08 Nov 2003 02:17:53 GMT  

Quote:

> Hi,

> I have read some discussions on this mailing list about DIP (dependency
> inversion principle) and theoretically everything is clear to me.

> So if we have two classes A and B such that A->B (A depends on B)
> and B->A then we have a cyclic dependency.
> One way to break this cycle is to use DIP.
> According to DIP we define an abstract interface C such that:
> B uses C
> A implements C

> and we finish with acyclic dependencies: A->B, B->C, A->C.

> Now, I want to ask you about implementation.

> 1. Would you really declare class ("interface") C in python ?

No.

Quote:
> 2. Because Python has only implicit interfaces, is it not enough just to
>    implement C interface in A?

Yes.

Quote:
> 3. Finally, if I decide not to really make a C class but only
>    to implement this interface in A, I will finish with almost the same
>    situation as at the beginning: two classes A and B where A->B and
>    B->(C part of A).
>    I am a little confused at this point :-)  ... it seems like I did really
>    nothing.

Precisely.

Quote:

> Your thoughts ?

In a statically typed language you have this problem, and you'll find the
A->B, B->C, A->C dependencies again in the import/include/use structure
needed to get your modules compiled.

In Python the equivalent checks are done at runtime, so there is no
need to predeclare C.
Still, I prefer to document the C interface somewhere in my Python code
so it can be easily reused.

Note that you can even delay the implementation of interface C in class A
until you actually need it. E.g.:

class B:
    def __init__(self, glasses):
        self.glasses = glasses

    def lookatThing(self, thing):
        # thing conforms to C: it must provide inspect(glasses).
        thing.inspect(self.glasses)

class A:
    def __init__(self):
        self.theB = B('goggles')

    def receiveVisitor(self, whichAudit):
        # Dynamically give self an inspect() method for this visit only.
        if whichAudit == 1:
            self.inspect = self.audit1
        else:
            self.inspect = self.audit2
        self.theB.lookatThing(self)
        del self.inspect

    # Both audits accept the glasses argument that B passes to inspect().
    def audit1(self, glasses):
        print('audited1')

    def audit2(self, glasses):
        print('audited2')

Fortunately you only have to try to imagine how you would do this in a
statically typed language.

Have fun,
Ype
--
email at xs4all.nl



Sat, 08 Nov 2003 04:10:46 GMT  

Quote:
> Hi,

> I have read some discussions on this mailing list about DIP (dependency
> inversion principle) and theoretically everything is clear to me.

> So if we have two classes A and B such that A->B (A depends on B)
> and B->A then we have a cyclic dependency.

Yes.  But take care that what defines "a dependency" is not
language-independent.

In Python (and any other signature-based-polymorphism setup,
such as C++ templates!), "uses" does not necessarily create a
dependency.  If all the "using" code asks of the "used" one is
that the latter provide methods x, y, z, ..., with certain sigs,
then the dependency of the "using" code is on the method names
and signatures, _intrinsically_.
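
A small sketch of that point, with made-up class names (Reporter, Invoice, Sensor are illustrative and not from the thread): the "using" code names no class at all, only a method.

```python
class Reporter:
    """The 'using' code: it asks only that `thing` provide describe()."""

    def report(self, thing):
        return "Report: " + thing.describe()


class Invoice:
    def describe(self):
        return "invoice #1"


class Sensor:
    def describe(self):
        return "sensor reading"


# Reporter never imports or mentions Invoice or Sensor; its only
# dependency is on the method name and signature.
reporter = Reporter()
lines = [reporter.report(Invoice()), reporter.report(Sensor())]
```

Either class can be replaced or moved to another module without touching Reporter, which is the sense in which the dependency is on the signature rather than on the class.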

Quote:
> One way to break this cycle is to use DIP.

In pure signature-based polymorphism, as I hope I just showed, the
"inversion of dependency" turns out to be *intrinsic*.  Which is part of
the power of signature-based polymorphism, whether it be in the
friendly guise of Python or the less-friendly one of C++ templates.

Quote:
> According to DIP we define an abstract interface C such that:
> B uses C
> A implements C

> and we finish with acyclic dependencies: A->B, B->C, A->C.

> Now, I want to ask you about implementation.

> 1. Would you really declare class ("interface") C in python ?

Not in Python as it stands today.

Quote:
> 2. Because Python has only implicit interfaces, is it not enough just to
>    implement C interface in A?

There is no "C interface" as a unity, or entity.  Just sets of
methods with signatures.

Which isn't to say it couldn't be even better to HAVE a way to
express, objectify, 'C'.  Though I would not narrowly call it
an 'interface' but a 'protocol', and a more general concept than
'implements' is 'is-adaptable-to'.  See PEP 246, at URL
http://python.sourceforge.net/peps/pep-0246.html, for a
dream-way it could be handled...!-)

Quote:
> 3. Finally, if I decide not to really make a C class but only
>    to implement this interface in A, I will finish with almost the same
>    situation as at the beginning: two classes A and B where A->B and
>    B->(C part of A).
>    I am a little confused at this point :-)  ... it seems like I did really
>    nothing.

In a way, because, in sig-based polymorphism, there WAS no need
to do anything special:-).

But if you defined a protocol, even if the language itself does not let
you express this crucial design idea, you would in fact have made a
step forward.  Instead of designing in terms of concrete classes, i.e.,
implementation, you'd have designed in terms of the protocol, its
user, and its supplier.  It WOULD be even better if this key design
idea could be expressed directly and explicitly in the language...
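
For readers on much later Python versions than this thread assumes: typing.Protocol (PEP 544, Python 3.8 and up) is one way to write such a protocol down explicitly. It is not the adaptation mechanism of PEP 246, and the names below (Inspectable, look_at, receive_visitor) are illustrative only.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Inspectable(Protocol):
    """The protocol 'C': anything that provides inspect(glasses)."""

    def inspect(self, glasses: str) -> str: ...


class B:
    def __init__(self, glasses: str) -> None:
        self.glasses = glasses

    def look_at(self, thing: Inspectable) -> str:
        # B depends on the protocol only, never on class A.
        return thing.inspect(self.glasses)


class A:
    # A does not inherit from Inspectable; matching the method is enough.
    def __init__(self) -> None:
        self.the_b = B("goggles")

    def inspect(self, glasses: str) -> str:
        return "inspected with " + glasses

    def receive_visitor(self) -> str:
        return self.the_b.look_at(self)
```

A static type checker can then verify that A satisfies Inspectable, while at runtime isinstance() against a runtime_checkable protocol only checks that the method exists, not its signature.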

Alex



Sat, 08 Nov 2003 05:11:21 GMT  

Quote:

> Microsoft defined RTF

I thought IBM defined RTF.  Microsoft define a non-proprietary
standard?  Doesn't seem to fit the MS way of doing things.

-SteveN-



Sat, 08 Nov 2003 08:13:57 GMT  
 