making python man pages looks hard 

I'm having fun watching tchrist learn python. I learned perl from Tom
back in '90 or so. Anyway... I said my piece on perl vs. python back in '96,
and I don't have much to add since then:

Subject: Re: Python, Tcl and Perl, oh my! (was Re: tcl vs. perl)
    Date: 1996/06/26
    Newsgroups: comp.lang.perl.misc, comp.lang.tcl, comp.lang.perl.tk, comp.lang.python
http://www.*-*-*.com/

But, having spent most of my career translating technical
documentation from one format to another, the gripe about a
lack of man pages for python got me thinking.
I happen to be a *big fan* of the python documentation[2] as is,
but I don't see any reason why it shouldn't be available as
man pages too. Or at least... I didn't.

[2] http://www.*-*-*.com/

Then I looked at the source[3]

[3] http://www.*-*-*.com/

The source format is LaTeX. Converting from LaTeX is hard/messy:

==============
For example, I conjecture that it is impossible to write a program that
will extract the third word from a TeX document. It would be an easy
task for 80% of the TeX documents out there -- just skip over some
formatting stuff and grab the third bunch of characters surrounded by
whitespace. But that "formatting stuff" might be a program that
generates 100 words from the hyphenation dictionary. So the simple
lexical scan of the TeX source would find a word that is not the third
word of the document when printed.

This may seem like an obscure and unimportant problem, but I
assure you that the problem of converting TeX tables to FrameMaker
MIF is just as unsolvable.

So while "programmable" document formats have the advantage that
features can be added on a per-document basis, they suffer the
disadvantage that these features cannot be recovered by the
machine and translated in an automated fashion.

excerpted from Toward Closure on HTML
1994/04/07
http://www.*-*-*.com/
==============
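
The conjecture in the excerpt can be illustrated with a small sketch. This is
not a TeX parser; naive_third_word is a hypothetical helper that does the
"simple lexical scan" described above (drop control sequences and braces,
split on whitespace) and never expands macros. The sample strings are made up
for the illustration.

```python
import re

def naive_third_word(tex_source):
    """Return the third whitespace-delimited word after a purely lexical scan.

    Control sequences and grouping braces are thrown away; macros are never
    expanded, which is exactly the weakness the excerpt points at.
    """
    # Drop control sequences such as \section or \emph, then the braces.
    text = re.sub(r"\\[A-Za-z]+\*?", " ", tex_source)
    text = text.replace("{", " ").replace("}", " ")
    words = text.split()
    return words[2] if len(words) >= 3 else None

# The easy case: the scan agrees with the printed text "Intro The quick ...".
simple = r"\section{Intro} The \emph{quick} brown fox"
print(naive_third_word(simple))   # quick

# A macro definition moves words around. TeX would print
# "Dear Guido van Rossum and friends", whose third word is "van",
# but the scan sees the definition body first and answers "Rossum".
tricky = r"\def\name{Guido van Rossum} Dear \name and friends"
print(naive_third_word(tricky))   # Rossum
```

Getting the second case right means expanding \name, and in general that means
running TeX's macro processor rather than scanning the text.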

The author of one of the python doc tools seems to agree:

===============
# Why not start from LaTeX rather than HTML?
# I could hack latex2html itself to produce Texinfo instead, or fix up
# partparse.py (which already translates LaTeX to Texinfo).
#  Pros:
#   * has high-level information such as index entries, original formatting
#  Cons:
#   * those programs are complicated to read and understand
#   * those programs try to handle arbitrary LaTeX input, track catcodes,
#     and more:  I don't want to go to that effort.  HTML isn't as powerful
#     as LaTeX, so there are fewer subtleties.
#   * the result wouldn't work for arbitrary HTML documents; it would be
#     nice to eventually extend this program to HTML produced from Docbook,
#     Frame, and more.

excerpt from
# html2texi.pl -- Convert HTML documentation to Texinfo format

# Time-stamp: <1999-01-12 21:34:27 mernst>
part of [3]
===============

The bewildering array of scripts, tools, and hacks used to
generate the HTML version of the python docs is frightening!
It suggests to me that *very few people* maintain the
python docs. That's good for consistency, but it's sort
of a cathedral[4] approach: there's a sharp line between
the "blessed" modules and Everything Else.

[4] http://www.*-*-*.com/ ~esr/writings/cathedral-bazaar/

It would be fairly easy to convert the HTML to nroff... I think there
are tools that do that... rosettaman or something? Yes... it
seems to have an option to convert back to roff format.
http://www.*-*-*.com/
ftp://ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z

The trick would be dividing up the sections. The python HTML docs
aren't self-contained like the perl man pages.
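
One way to picture "dividing up the sections" is to cut a page at its
headings. The sketch below assumes, purely for illustration, that each
candidate man page starts at an <h1> or <h2>; the real python HTML docs may
not be laid out that way. SectionSplitter, split_sections and the sample page
are made up. Only html.parser from the standard library is used.

```python
from html.parser import HTMLParser

class SectionSplitter(HTMLParser):
    """Collect [heading, body] pairs, opening a new pair at each <h1>/<h2>."""

    HEADINGS = {"h1", "h2"}

    def __init__(self):
        super().__init__()
        self.sections = []          # list of [heading_text, body_text]
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self.sections.append(["", ""])
            self._in_heading = True

    def handle_endtag(self, tag):
        if tag in self.HEADINGS:
            self._in_heading = False

    def handle_data(self, data):
        if not self.sections:
            return                  # text before the first heading is ignored
        index = 0 if self._in_heading else 1
        self.sections[-1][index] += data

def split_sections(html):
    """Return a list of (heading, body) tuples with whitespace normalized."""
    parser = SectionSplitter()
    parser.feed(html)
    parser.close()
    return [(h.strip(), " ".join(b.split())) for h, b in parser.sections]

# Made-up sample page, not taken from the real documentation.
page = """
<h1>string -- Common string operations</h1>
<p>This module defines some constants.</p>
<h1>re -- Regular expression operations</h1>
<p>This module provides <b>regular expression</b> matching.</p>
"""
for heading, body in split_sections(page):
    print(heading, "=>", body)
```

This only finds the cut points. Links that cross from one section to another
would still have to be rewritten, and that is where the "not self-contained"
problem shows up.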

I took a quick look at the python doc-sig, but I didn't find much
relevant info... they seem to be focussed on a javadoc
work-alike... hmm... maybe that is relevant; is the python
library reference source expected to move into docstrings?
That would be cool.

Anyway... I had hoped to contribute more, but this looks harder
than I expected, and I'm done for the day.

parting shot: CPAN is cool, but I find it frustrating that I can't
read the documentation for a module without downloading
and unpacking the module. For example, I can browse
the list of modules,

http://www.*-*-*.com/

but say I find one I'm interested in:

Mac::Comm::
::OT_PPP        RdpO Control Open Transport PPP / Remote Access   CNANDOR

the only link goes to the Author contact info. Gee thanks.

"Uniform Resource Identifiers (URIs, aka URLs) are short strings that identify
resources in the web: documents, images, downloadable files, services,
electronic mailboxes, and other resources. They make resources available under
a variety of naming schemes and access methods such as HTTP, FTP, and
Internet mail addressable in the same simple way. They reduce the tedium of "log
in to this server, then issue this magic command ..." down to a single click."
 -- http://www.*-*-*.com/

--
Dan Connolly
http://www.*-*-*.com/



Tue, 05 Feb 2002 03:00:00 GMT  
 making python man pages looks hard


[snip]
! parting shot: CPAN is cool, but I find it frustrating that I can't
! read the documentation for a module without downloading
! and unpacking the module. For example, I can browse
! the list of modules,
!
! http://www.perl.com/CPAN/modules/00modlist.long.html
!
! but say I find one I'm interested in:
!
! Mac::Comm::
! ::OT_PPP        RdpO Control Open Transport PPP / Remote Access   CNANDOR
!
! the only link goes to the Author contact info. Gee thanks.

that is usually no longer a valid complaint --- try the CPAN
front end at 'theory.uwinnipeg.ca', in particular, the search
engine:

http://theory.uwinnipeg.ca/search/cpan-search.html

I entered Mac::Com in the box and got back:

Mac-Comm-OT_PPP-1.20.tar.gz [readme / module list] 3.6 1998-01-06 CNANDOR
Mac::Comm::OT_PPP [documentation]: Interface to Open Transport PPP

(and yes, those are links to the .tgz itself, the readme, a list of
modules in the package; and, surprise, "[documentation]" links to the
html'ified manpage).

regards
andrew

--
      They're not soaking, they're rusting!
          -- my wife on my dishwashing habits



Tue, 05 Feb 2002 03:00:00 GMT  
 making python man pages looks hard
     [courtesy cc of this posting mailed to cited author]

In comp.lang.python,

:parting shot: CPAN is cool, but I find it frustrating that I can't
:read the documentation for a module without downloading
:and unpacking the module. For example, I can browse
:the list of modules,
:
:http://www.perl.com/CPAN/modules/00modlist.long.html

Try

    http://search.cpan.org/

--tom
--
You have to admit that it's difficult to misplace the Perl sources.  :-)



Tue, 05 Feb 2002 03:00:00 GMT  
 making python man pages looks hard
(posted and mailed)

Quote:

> I happen to be a *big fan* of the python documentation[2] as is
> but I don't see any reason why it shouldn't be available as
> man pages too. Or at least... I didn't.
> The source format is LaTeX. Converting from LaTeX is hard/messy:

> ...
> For example, I conjecture that it is impossible to write a program that
> will extract the third word from a TeX document. It would be an easy
> task for 80% of the TeX documents out there -- just skip over some
> formatting stuff and grab the third bunch of characters surrounded by
> whitespace.
> ...

luckily, Fred Drake has already done most of the hard work
here -- check the Doc/tools/sgmlconv directory:

...

$ more Doc/tools/sgmlconv/README
These scripts and Makefile fragment are used to convert the Python
documentation in LaTeX format to SGML.  XML is also supported as a
target, but is unlikely to be used.

This material is preliminary and incomplete.

...

not sure how incomplete, though.  I don't have things set up
so I can try the current release of this (from the CVS archive),
but maybe Fred can give us a status update?

(maybe he has, I cannot say I've bothered to read *all* the
messages on the group today...  sigh...)

anyway, if this stuff works, writing an esis2man converter
cannot be *that* hard.

</F>



Tue, 05 Feb 2002 03:00:00 GMT  
 making python man pages looks hard

    DC> The bewildering array of scripts, tools, and hacks used to
    DC> generate the HTML version of the python docs is frightening!

I think that's the Quote of the Day.  Wouldn't you agree Fred? :)

    DC> It suggests to me that *very few people* maintain the
    DC> python docs.

few == one.  Fred Drake, a.k.a. Dr. Doc.

speaking-as-a-beta-tester-of-fred's-html-documentation-ly y'rs,
-Barry



Tue, 05 Feb 2002 03:00:00 GMT  
 making python man pages looks hard

 > The source format is LaTeX. Converting from LaTeX is hard/messy:

Dan,
  It sure is!  You'll get no argument from me on that one.

 > For example, I conjecture that it is impossible to write a program that
 > will extract the third word from a TeX document. It would be an easy
 > task for 80% of the TeX documents out there -- just skip over some
 > formatting stuff and grab the third bunch of characters surrounded by

  Not impossible, but painful enough in the general case that I have
no plans to write the code to do it (in any language!).

Quote:
Fredrik Lundh writes:

 > luckily, Fred Drake has already done most of the hard work
 > here -- check the Doc/tools/sgmlconv directory:

  Dang, my secret has been found!  ;-)
  Yes, there's a good bit of potentially interesting material there.

 > not sure how incomplete, though.  I don't have things set up
 > so I can try the current release of this (from the CVS archive),
 > but maybe Fred can give us a status update?

  I don't have time today, but I'll try to send a status report to the
Python doc-sig next week.  For anyone interested but not subscribed to
the doc-sig mailing list, see http://www.*-*-*.com/ for more
information.

 > anyway, if this stuff works, writing an esis2man converter
 > cannot be *that* hard.

  Parsing esis isn't hard (it's in the XML package!), and generating
man pages isn't hard ("print" is a language statement in Python,
right?).  The difficulty is the semantic mapping; what do you want on
your manpages, how do you want them organized, etc.  The existing
structure may not be trivial to transform into manpages, regardless of
syntax.  Having a DOM instance for the library reference doesn't make
manpages a one-liner.  ;-)
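
A rough sketch of that split between "easy" and "hard": it assumes the
nsgmls-style ESIS line format, where a line starting with "(" opens an
element, ")" closes one and "-" carries character data. The element names and
the ROFF_START/ROFF_END tables are invented for the example; they are not
what the sgmlconv tools emit. Reading the events takes a few lines, while the
tables stand in for the semantic mapping that somebody still has to design.

```python
def parse_esis(lines):
    """Yield (kind, value) events from ESIS-style lines.

    Only start tags, end tags and character data are handled; attribute
    records and every other record type are skipped in this sketch.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        code, rest = line[0], line[1:]
        if code == "(":
            yield ("start", rest)
        elif code == ")":
            yield ("end", rest)
        elif code == "-":
            # A record end inside data is written as the two characters \n.
            # Other escapes are left alone here (a simplification).
            yield ("data", rest.replace("\\n", "\n"))

# Hypothetical mapping from element names to roff requests.
# Deciding what belongs in these tables is the real work.
ROFF_START = {"title": ".SH ", "para": ".PP\n", "function": "\\fB"}
ROFF_END = {"title": "\n", "para": "\n", "function": "\\fR"}

def esis_to_roff(lines):
    """Turn ESIS-style lines into roff text using the tables above."""
    out = []
    for kind, value in parse_esis(lines):
        if kind == "start":
            out.append(ROFF_START.get(value, ""))
        elif kind == "end":
            out.append(ROFF_END.get(value, ""))
        else:
            out.append(value)
    return "".join(out)

# Made-up input; unknown elements such as "section" produce no output.
sample = [
    "(section",
    "(title",
    "-SYNOPSIS",
    ")title",
    "(para",
    "-Call ",
    "(function",
    "-split",
    ")function",
    "- to break a string apart.",
    ")para",
    ")section",
]
print(esis_to_roff(sample))
```

The sketch does not escape backslashes or leading dots in the data, and it
says nothing about which parts of the library reference should become which
pages.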

  -Fred

--

Corporation for National Research Initiatives



Tue, 05 Feb 2002 03:00:00 GMT  
 making python man pages looks hard

    Tom> Try

    Tom>     http://search.cpan.org/

Tom,

Just out of curiosity, how is CPAN populated and kept in sync?  I went to
the Winnipeg site at

    http://theory.uwinnipeg.ca/search/cpan-search.html

and searched for Frontier (Ken MacLeod's Perl XML-RPC package) and found a
couple links to a readme and a tgz.  Clicking the readme link failed the
first time, but succeeded the second (guess it was choosing different
mirrors and got a hoser the first time).

I then tried the same search at search.cpan.org and got no relevant hits.

Skip Montanaro  | http://www.mojam.com/

847-971-7098



Tue, 05 Feb 2002 03:00:00 GMT  
 making python man pages looks hard

Quote:

> The source format is LaTeX. Converting from LaTeX is hard/messy:

Well - the docs aren't in standard LaTeX, are they? According to
www.python.org/doc/doc:

  "With almost no basic TeX or LaTeX markup in use, [...] the markup
  syntax is about the only evidence of LaTeX in the actual document
  sources."

Since the format seems to be clearly and simply defined, it shouldn't
be *that* difficult... Anyway - a transition to XML seems to be on its
way, which will make all this trivial. (Actually, it wouldn't be that
hard to go from the HTML versions either, I guess, with a little
imagination. My point is - there is no reason to be spooked by the
complexity of LaTeX here...)

--

  Magnus              Making no sound / Yet smouldering with passion
  Lie          The firefly is still sadder / Than the moaning insect
  Hetland                                       : Minamoto Shigeyuki



Tue, 05 Feb 2002 03:00:00 GMT  
 making python man pages looks hard
I've contacted the maintainer who will get back to you on it.

--tom



Tue, 05 Feb 2002 03:00:00 GMT  
 making python man pages looks hard

Quote:


> > The source format is LaTeX. Converting from LaTeX is hard/messy:

> Well - the docs aren't in standard LaTeX, are they? According to
> www.python.org/doc/doc:

>   "With almost no basic TeX or LaTeX markup in use, [...] the markup
>   syntax is about the only evidence of LaTeX in the actual document
>   sources."

> Since the format seems to be clearly and simply defined, it shouldn't
> be *that* difficult... Anyway - a transition to XML seems to be on its
> way, which will make all this trivial. (Actually, it wouldn't be that
> hard to go from the HTML versions either, I guess, with a little
> imagination. My point is - there is no reason to be spooked by the
> complexity of LaTeX here...)

In I:\cvsroot\python\dist\src\Doc\tools\sgmlconv
I find some README and a couple of scripts.
latex2esis.py
esis2sgml.py

and it looks like it can produce XML?

Then I'd take an XML parser and try to generate man pages :-)
I just have no idea about *that* format.
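
For a feel of *that* format: a man page is roff source built from a few
macros, mainly .TH for the title line, .SH for section headings and .PP for
paragraphs. The sketch below reads a small XML document with
xml.etree.ElementTree and prints such a page. The <module>/<summary>/<para>
layout is an assumption made up for the example, not the output of
esis2sgml.py, and module_to_man and roff_escape are hypothetical names.

```python
import xml.etree.ElementTree as ET

def roff_escape(text):
    """Escape backslashes and protect lines that start with a dot or quote."""
    text = text.replace("\\", "\\e")
    lines = []
    for line in text.splitlines():
        if line.startswith((".", "'")):
            line = "\\&" + line     # zero-width prefix so roff sees plain text
        lines.append(line)
    return "\n".join(lines)

def module_to_man(xml_text, section="3"):
    """Render a hypothetical <module> element as man page source."""
    root = ET.fromstring(xml_text)
    name = root.get("name")
    summary = root.findtext("summary", "").strip()
    out = [
        '.TH "%s" %s' % (name.upper(), section),
        ".SH NAME",
        "%s \\- %s" % (name, roff_escape(summary)),
        ".SH DESCRIPTION",
    ]
    for para in root.findall("para"):
        # Flatten any inline markup and normalize whitespace.
        text = " ".join("".join(para.itertext()).split())
        out.append(".PP")
        out.append(roff_escape(text))
    return "\n".join(out) + "\n"

# Made-up input document.
doc = """<module name="string">
  <summary>Common string operations</summary>
  <para>This module defines some constants.</para>
  <para>Most functions are also available as string methods.</para>
</module>"""
print(module_to_man(doc))
```

Saved to a file, the output should be viewable with something like
"nroff -man file | more". Function signatures, cross references and index
entries are left out entirely.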

ciao - chris     (starting a "who writes the shortest man.py")

--

Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://www.*-*-*.com/
10553 Berlin                 :     PGP key -> http://www.*-*-*.com/
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     we're tired of banana software - shipped green, ripens at home



Tue, 05 Feb 2002 03:00:00 GMT  
 making python man pages looks hard
Please contact me if you need help building the man pages.

ergowolf

Quote:

> I've contacted the maintainer who will get back to you on it.

> --tom



Wed, 06 Feb 2002 03:00:00 GMT  
 