Pdx Design History
Table of Contents
1999-08-14
Experimented with pod and with sdf. I like the syntax of pod except for the over...back mechanism. I like the functionality of sdf but I think it got out of control. In particular, I want to allow user-definition of name-value pairs. That generalizes my earlier efforts at a configuration section.
1999-08-15
Began project. A lot of time spent learning modules (a la h2xs), references, and objects. Gradually reworked the architecture to make easy addition of functionality. Added:
1999-08-16
Bootstrapping the design history.
Nested Lists
Added nested lists, with various styles.
=cut
Added treatment of =cut. Eventually, would like to do full-blown literate programming.
Cleaner parsing
For a while I had trouble detecting lines beginning with spaces. I tried to order the regex checks in process_file by frequency of occurrence. Obviously, this means that non-special (i.e., non "=") lines should go high in the list. I tried:
    elsif ($line =~ /^\s*[^ =]/) {$self->out($line);}
However, this matched on indented "=" lines, resulting in failed parses. Finally found that I had some tabs and actually needed:
    elsif ($line =~ /^\s*[^ \t=]/) {$self->out($line);}
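The difference between the two patterns is easy to miss, so here is a minimal sketch of the same check in Python (the language Pdx was later rebuilt in). The names NAIVE, FIXED, and is_plain are invented for this illustration and are not part of Pdx. With the first pattern, the regex engine backtracks and lets the leading tab itself stand in for the "ordinary" character, so a tab-indented "=" line passes as plain text.

```python
import re

# First attempt: a tab is neither a space nor "=", so after backtracking
# the tab itself satisfies the character class.
NAIVE = re.compile(r'^\s*[^ =]')

# Corrected pattern: tabs are excluded from the character class as well.
FIXED = re.compile(r'^\s*[^ \t=]')


def is_plain(line, pattern=FIXED):
    """Return True if the line looks like ordinary (non "=") text."""
    return pattern.search(line) is not None


if __name__ == '__main__':
    for sample in ['plain text', '   indented text', '=head1 Title',
                   '  =head1 Title', '\t=head1 Title']:
        print(repr(sample), is_plain(sample, NAIVE), is_plain(sample, FIXED))
```

Only the tab-indented "=" line is classified differently by the two patterns, which matches the symptom described above.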
Include
Next, we do include_verbatim:
dummy.pdx
    line 1: This is I<included> from the dummy.pdx file.
    line 2: An embedded include, to another dir:
    =include ../doc/dummy2.pdx
    line 4
    line 5
    line 6: final line
Next we do a pdx include, which may be nested. So far, we don't check for circular includes. We want to include on the first pass and just use the results on the second pass. We'll use a splice command, using ndx as the offset. We mark before and after with "|".
dummy.pdx|
    line 1: This is included from the dummy.pdx file.
    line 2: An embedded include, to another dir:
    line 1: This is included from the dummy.pdx file.
    line 2: An embedded include, to another dir:
    line 1: This is I<included> from the dummy.pdx file.
    line 2: An embedded include, to another dir:
    =include_verbatim dummy2.pdx
    line 4
    line 5
    line 6: final line
    line 4
    line 5
    line 6: final line
    line 4
    line 5
    line 6: final line
|
1999-08-17
Changed the list marker for descriptions from "d" to "[]", comparable to LaTeX. This allows us to later support setting starting values for the enumeration markers -- not that I ever expect to do that. Changed the input args from a string to a list, thus allowing binaries like pdx2html:
    #!/usr/bin/perl
    use Pdx;
    my $pdx = new Pdx;
    $pdx->parse('--2html', @ARGV);
Time to redo the architecture. I'd wanted to:
But that means the various output drivers are competing for namespace. I'll try going with 1 output at a time, and use "require" to get the driver.
1999-08-18
Moved the format-specific code to separate files. It took quite a while to find the proper combination of idioms.
Still not right for references back from Html.pl to Pdx.pm.
1999-08-20
Restructure
Completely redid the architecture (again). Now structured as:
Pdx/ AUTHORS Base/ Base.pm Changes MANIFEST Makefile.PL RCS/ COPYING Changes Drivers/ Changes Html.pm MANIFEST Makefile.PL RCS/ MANIFEST Makefile.PL RCS/ README bin/ pdx2html doc/ deshist.pdx go rebuild design history (deshist.pdx) test.pl
The inheritance works like this:
Pdx/Base/Base.pm
    package Pdx::Base::Base;
    ...
    @ISA = qw(Exporter AutoLoader);
Pdx/Drivers/Html.pm
    package Pdx::Drivers::Html;
    ...
    @ISA = qw(Pdx::Base::Base Exporter AutoLoader);
    ...
    sub new {
        my $classname = shift @_;
        my $self = $classname->SUPER::new(@_);
        return $self;
    }
More fonts
Added underlined text.
More includes
Fixed =include_verbatim.
Fixed =include, which inserts directly into the input stream:
    line 1: This is included from the dummy.pdx file.
    line 2: An embedded include, to another dir:
    line 1: This is included from the dummy.pdx file.
    line 2: An embedded include, to another dir:
    line 1: This is I<included> from the dummy.pdx file.
    line 2: An embedded include, to another dir:
    =include_verbatim dummy2.pdx
    line 4
    line 5
    line 6: final line
    line 4
    line 5
    line 6: final line
    line 4
    line 5
    line 6: final line
Added legacy pod over... back:
Tables
Links
External links, e.g., README.
Internal links, e.g., Table of Contents
Driver-specific code
This is a single line of raw HTML. This is a block of raw HTML, with
It's time for the LaTeX module. After coding up Latex.pm, I found I had to rework some of Html.pm. Fortunately, it all got simpler and smaller as I went.
Filters
To allow pdx to be used as a filter, it needs to be able to read from STDIN and write to STDOUT. That is done via:
    pdx2html --stdin --stdout
Of course, these options can be used independently as well.
Groff
Took a shot at it. The groff and tbl documentation were inadequate for my needs. For example, which input option is used to shift output from ps to dvi? How are table rows formatted? It seemed to generate a valid ps file, but the formatting was pitiful.
1999-08-21
Ok, it's clean enough to use. I'll start building test scripts instead of working from this design history itself. Things to consider:
More Includes
Here is include_for html:
This is a block of pdx to be included only in html files. Thus it could be used for autogenerated chunks of html, generated elsewhere and placed here in a begin block. E.g.:
Here is include_for latex:
Tables
Faced with Latex table layout, I changed the =table command to be:
    =table [widths]; grid; caption
where:
    widths is a ','-delimited list given in percentages, e.g., [25,75]
    grid is 'grid' or 'nogrid' (actually, grid or anything else)
    caption is the caption of the table
This was easy to insert into Html.pm, but I stumbled over Latex.pm. Need to use textwidth, with the colwidth percentages, to get values for the tabular column format string, e.g.:
    {|p{1.25in}|p{3.8in}|}
After an hour of looking around in tex and latex docs, I couldn't find the tex commands to compute directly in tex, so I just stuck in a default in Latex.pm's new method. Tweaked this to get the table to line up approximately. I tried tabular* for a while, but its textwidth option was fighting with the p values, so I went back to straight tabular. Anyway, it looks respectable now, and the long line wraps nicely. Need a fancier table? Build it in a driver-specific begin block.
...Found the magic incantation:
    p{0.25\textwidth}
So I removed the hack approximation of textwidth. Then worked on providing breaks in a table cell. First, we'll use an HTML-style break tag, and convert that to linebreaks in various drivers:
    <BR> --> \\
We then set each cell inside a minipage so breaks will take effect. I've found that a simple break is adequate to get crude formatting inside a table.
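As an illustration of the width handling described above, here is a rough Python sketch that parses a =table line and turns the percentages into a tabular column format string. The function names parse_table and tabular_format are made up for this example; the real Latex.pm code is not shown in these notes and may differ. Note that borders and column padding also take up space, so columns that sum to 100% may still need trimming.

```python
import re

# Assumed line shape, taken from the description above:
#   =table [widths]; grid; caption
TABLE_PAT = re.compile(r'^=table\s*\[([^\]]*)\]\s*;\s*([^;]*?)\s*;\s*(.*)$')


def parse_table(line):
    """Split a =table line into (widths, grid, caption)."""
    m = TABLE_PAT.match(line.strip())
    if m is None:
        raise ValueError('not a =table line: %r' % line)
    widths = [int(w) for w in m.group(1).split(',') if w.strip()]
    grid = (m.group(2) == 'grid')      # 'grid' or anything else
    caption = m.group(3).strip()
    return widths, grid, caption


def tabular_format(widths, grid=True):
    """Build a LaTeX column spec, each width a fraction of \\textwidth."""
    sep = '|' if grid else ''
    cols = ['p{%.2f\\textwidth}' % (w / 100.0) for w in widths]
    return '{' + sep + sep.join(cols) + sep + '}'


if __name__ == '__main__':
    widths, grid, caption = parse_table('=table [25,75]; grid; Example')
    print(tabular_format(widths, grid))
```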
=cfg
This is a convenient way to set a lot of =define statements. Actually, I'm putting it in to be compatible with my current version of pdx2html. The idea is a collection of name=value pairs:
    =cfg
    creator_name = Harry George
    =end cfg
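A =cfg block of this shape can be read with a few lines of code. The following is only a sketch in Python under the assumption that each pair sits on its own line between "=cfg" and "=end cfg"; parse_cfg is a hypothetical helper, not the function used in Pdx.

```python
def parse_cfg(lines):
    """Collect name=value pairs found between '=cfg' and '=end cfg'."""
    settings = {}
    inside = False
    for raw in lines:
        line = raw.strip()
        if line == '=cfg':
            inside = True
        elif line == '=end cfg':
            inside = False
        elif inside and '=' in line:
            # Split on the first "=" only, so values may contain "=".
            name, value = line.split('=', 1)
            settings[name.strip()] = value.strip()
    return settings


if __name__ == '__main__':
    doc = ['=cfg', 'creator_name = Harry George', '=end cfg']
    print(parse_cfg(doc))
```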
Preparing the distribution
Even though I started these modules with h2xs, cleanup was a struggle. Finally found the magic:
    perl -MExtUtils::MakeMaker -e "WriteMakefile()"
That allowed several mods under the Pdx dir, without actually having a Pdx module.
1999-08-22
Next I found that the "Pdx::Html" notation works only for a fully installed module, or for the make-induced /blib/lib/Pdx/Html.pm, which mimics the final format. Otherwise, I was stumbling around with stuff like "Pdx::Html::Html". Finally put symbolic links into Pdx dir to the .pm's. Then I added the parent of the Pdx dir to my PERL5LIB env var. Finally my scripts would run under development conditions. I'll have to take out the symbolic links when it comes time to do make dist.
Next I tidied up the Html/test.pl so it was fairly driver-independent. Then actually began the tests. Html went fine. After copying to Latex, the Latex tests went fine until xrefs. Working on that now. Seems to be use of underscore in a ref.
...Yep, that was it. Underscores (even escaped with backslash) are not allowed. Also, misspelled "makeindex" in go2, so my index wasn't showing. All is well now.
Tried running deshist with and without table of contents. Decided to stay with table of contents.
Ahhh. The sun is out, so no more computing for a while.
1999-08-23
Users asked for font control (esp. color), numbered sections in HTML, and lists inside table cells.
Color: Use, e.g.:
    =for html <FONT COLOR = "RED">
    some red text
    =
    =for html <FONT COLOR = "BLACK">
    some black text
some red text some black text
But FONT is deprecated, so one should really use style sheets. That means providing user-defined preambles. And, since the default preamble may be ok, we provide preamble_append. These are driver unique, so we get:
    =preamble_for html
    (block of preamble, totally replacing defaults)
    =preamble_append_for html
    (add to default after defaults but prior to </HEAD>)
    =postamble_for html
    (no known need yet, but I put it in)
Thus a user can hard code a style sheet into the document, or can =include another file (e.g., a site standard) which has a preamble_for statement. Tested the hard-coded approach in test08.
Numbered sections: This is primarily an HTML or text driver problem, since Latex already does numbering. I looked through the specifications for HTML and CSS, and didn't see a treatment of numbered sections (headings). So I'll do one myself.
...Just completed it. Provided an option for --numbered_heads, which can be set from the commandline or from the cfg section (or from a define of course). In Base.pm, that splits to the driver_head and driver_head_numbered methods. In Latex, it is just a matter of a "*" (as in section vs section*). In HTML, I coded up a stack for the form:
    level:num.num.num...
e.g.
    1. 1.1 1.2 2. 2.1. 2.1.1. 3.
becomes:
    1:1. 2:1.1. 2:1.2. 1:2. 2:2.1. 3:2.1.1. 1:3.
I pop the stack until at the right level, then make a new "1", or increment the existing last num as appropriate. Then join the resulting number to the given name, and save that for the toc.
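The stack handling described above can be sketched as follows. This is a Python reconstruction from the prose, not the original Perl; the class name HeadNumberer and its attributes are invented for the example, and it assumes heading levels start at 1.

```python
class HeadNumberer:
    """Assign dotted section numbers (1., 1.1., 1.2., 2., ...) to headings."""

    def __init__(self):
        self.stack = []   # one counter per open heading level
        self.toc = []     # (level, numbered title) pairs saved for the toc

    def number(self, level, name):
        # Pop counters that belong to deeper levels than the new heading.
        while len(self.stack) > level:
            self.stack.pop()
        if len(self.stack) == level:
            # Same level as the previous heading here: increment the last num.
            self.stack[-1] += 1
        else:
            # Deeper than before: start new counter(s) at 1.
            while len(self.stack) < level:
                self.stack.append(1)
        label = ''.join('%d.' % n for n in self.stack)
        title = label + ' ' + name
        self.toc.append((level, title))
        return title


if __name__ == '__main__':
    numberer = HeadNumberer()
    for level in [1, 2, 2, 1, 2, 3, 1]:
        print('%d:%s' % (level, numberer.number(level, 'Heading')))
```

Feeding it the level sequence 1, 2, 2, 1, 2, 3, 1 reproduces the numbering listed in the example above.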
1999-08-25
Reread material on pod, sdf, sgmltools, and docbook. Next driver is probably docbook -- that gives easy translation to many other formats. Even then, the pdx syntax is terser than the others, giving faster development productivity. Speaking of which, I changed the syntax of
    =define xyz 123
to
    =def xyz=123
(with optional spaces around the equal sign). Now, what tagging for value substitution? Std templates use "@", so we'll try:
    =def xyz = 123
    ...
    There were @@xyz@@ ways to solve the problem.
Allow only [a-z0-9_] in the define name, to cut the chance of accidental matches. Also, turn expansion on and off via expand_p. Default is off. Put it in nextline, so it is always picked up. We'll have to turn it off and on surrounding verbatims. But we'll leave it active (at user's discretion) for includes, since they may be used to load templates.
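For illustration, the =def syntax and the @@name@@ substitution could look roughly like this in Python. The helpers parse_def and expand are assumptions made for this sketch; only the name rule [a-z0-9_], the @@...@@ tagging, and the expand_p switch (default off) come from the notes above. Unknown names are left untouched here, which is a design choice of the sketch rather than documented behavior.

```python
import re

# "=def name=value" with optional spaces around the equal sign.
DEF_PAT = re.compile(r'^=def\s+([a-z0-9_]+)\s*=\s*(.*)$')
# "@@name@@" where name is limited to [a-z0-9_].
MACRO_PAT = re.compile(r'@@([a-z0-9_]+)@@')


def parse_def(line):
    """Return (name, value) for a =def line, or None if it is not one."""
    m = DEF_PAT.match(line.strip())
    if m is None:
        return None
    return m.group(1), m.group(2).strip()


def expand(line, defs, expand_p=False):
    """Replace @@name@@ with its defined value when expansion is on."""
    if not expand_p:
        return line

    def lookup(match):
        # Leave unknown names as-is rather than guessing a value.
        return str(defs.get(match.group(1), match.group(0)))

    return MACRO_PAT.sub(lookup, line)


if __name__ == '__main__':
    defs = dict([parse_def('=def xyz = 123')])
    print(expand('There were @@xyz@@ ways to solve the problem.', defs, True))
```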
1999-08-27
After several hours of debugging yesterday and today, I found an extraneous backtick (`) in the Base.pm. Only in perl would a program try to compile and run with random chars. So I installed python and took a look. I like the modula-3 aspects. I don't like the enforced indents -- even though I'd probably do them anyway. Also downloaded python-based Zope. Maybe I'll convert Pdx to python.
Next I found and resolved a normal bug: The first line of a pdx-style list item was being printed without a space or newline. As a result it was effectively concatenated with the next line. Put in a newline and all is well. Ok, back to the user's guide.
...Added a "_force_p" to force update even if the timestamp is up-to-date. Tweaked Html.driver_table_row to make cells top aligned.
...Ok, ok, I need to do escape chars. I'd been holding out for ... being an emphasize mode, but I'm getting killed by trying to describe markups for Pdx itself. I'll escape lt, gt, amp, quot, and the pound sign.
...Did the escapes.
...When I got to graphics, realized I needed a "center" environment, so added it. Also noted the break function using <BR >
...Completed the user's guide.
1999-08-28
Built the release process. Had to add a mkdist script, MANIFEST.SKIP, and put the pdx2html and pdx2latex scripts into their respective EXE_FILES. The mkdist script is:
    rm *~
    rm testdata
    mkdir testdata
    cp ../testdata/*.pdx testdata
    cp ../testdata/oracle??.html testdata
    perl Makefile.PL verbose
    make test
    make distcheck
    make dist
    mv *.tar.gz ..
    rm -rf testdata
    ln -s ../testdata testdata
The idea is to use the linked testdata dir for normal development, and then hardcode it locally during the distribution process. I do several tests, then move the .gz file to the Pdx level. All the dirs have basically the same mkdist (with adjustments for tests). Had to restructure the dirs:
Pdx/ Base/ AUTHORS COPYING MANIFEST MANIFEST.SKIP README Base.pm Changes Makefile.PL doc/ deshist.pdx devguide.pdx userguide.pdx (other files) go script to build html's for docs mkdist RCS/ HTML (same structure for all drivers) AUTHORS COPYING MANIFEST MANIFEST.SKIP README Html.pm Makefile.PL pdx2html test.pl testdata/ usually a link to ../testdata but rebuilt during mkdist go "make test" killtest remove an oracle mkdist RCS
For a while I was copying to my working proj/perl dir and installing there via PREFIX. Once that was working, I did the real thing -- installing as root into the official perl dirs. Then tweaked doc's go script to use the official pdx2html and pdx2latex. We are now bootstrapped to the installed Pdx series. Of course, the test scripts still point to the local Pdx sources.
Hmmm, maybe I should generate the html and pdf files prior to installation, in case folks can't build them. Makes for a big tar file for Base, but let's try it.
...Did it. It comes to 250K.
...Made a mkdriver script to clone Html for new drivers. Made a mkperlproj to help build mkdriver. Ready to get going with Docbook.
1999-09-02
Started Docbook but had to detour to do homework. Need a way to do TeX math, so let's put in a =texmath...=end marker. Non-TeX drivers can generate a graphics file and include that. See test10.
...Adding to Latex was of course trivial. For Html, I used latex2html's pstogif and made a script "math2gif", based on my earlier "mathtoepsi". Looked around for some ps2png, but haven't found anything so far.
Found that once I officially installed Pdx, I had trouble finding the in-development version. I have the PERLLIB env var set ok, but it was finding the official one first. So I just moved the official Pdx to Pdx.save for the duration.
1999-09-14
Have been using the package for a while. Decided to drop the hardcoded pre/post ambles in favor of stylesheets. Cleaner code and more powerful.
2000-01-12
Completely rebuilt it in python. Generally easier to read and maintain. I took out the pod-style verbatim, which was an odd duck for the parser anyway. The new dirtree is also simpler:
RCS AUTHORS COPYING INSTALL MANIFEST Pdx/ RCS Base.py Docbook.py Html.py Latex.py __init__.py go1 run Base tests go2 run Html tests go3 run Latex tests go4 run Docbook tests testdata/ (testxx.pdx and oraclexx.html/tex/sgml) README TODO VERSION bin doc go run mkdist go1 run setup.py build go2 run setup.py test go3 run setup.py install mkdist.py build the distribution and tar/gzip it mkdocs.py build the docs, using the installed system pdx2docbook.py script for docbook output pdx2html.py script for html output pdx2latex.py script for latex output setup.py similar in intent to distutils, but not yet using it styles/ (stylesheets)
New functionality for the =cfg and =def:
    =table [10,40];grid;.
    =row B<Code> & B<Example> & B<Description>
    =row == & xyz == foo & use foo as-is
    =row +== & xyz +== foo & append foo to existing xyz as-is
    =row = & xyz = foo & use foo after processing in-line markups
    =row += & xyz += foo & append foo to existing xyz after processing in-line markups
I also took out the --force mechanism. It was failing to handle changes to "included" files. If you want to avoid reprocessing a file, do up a script of your own.
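The four assignment forms can be handled with one pattern and two small decisions: whether to run the in-line markup pass, and whether to append. The sketch below is an assumption-laden illustration, not the Pdx source: apply_assignment is a hypothetical name, the markup callback stands in for whatever in-line processing the driver does, and "append" is taken to mean plain string concatenation.

```python
import re

# Longest operators first so "+==" is not read as "+=" followed by "=".
ASSIGN_PAT = re.compile(r'^\s*([a-z0-9_]+)\s*(\+==|==|\+=|=)\s*(.*?)\s*$')


def apply_assignment(data, line, markup=lambda text: text):
    """Apply one 'name OP value' line to the data dictionary.

    Returns True if the line was an assignment, False otherwise.
    """
    m = ASSIGN_PAT.match(line)
    if m is None:
        return False
    name, op, value = m.groups()
    if not op.endswith('=='):
        # "=" and "+=": process in-line markups before storing.
        value = markup(value)
    if op.startswith('+'):
        # "+=" and "+==": append to the existing value, if any.
        value = data.get(name, '') + value
    data[name] = value
    return True


if __name__ == '__main__':
    data = {}
    # str.upper is only a stand-in for real in-line markup processing.
    for line in ['xyz == foo', 'xyz +== bar', 'abc = foo', 'abc += bar']:
        apply_assignment(data, line, markup=str.upper)
    print(data)
```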
2000-02-26
Again rebuilt the dirtree. This time to support the mypythonproj and setup.py approach.
---boilerplate---
AUTHORS COPYING INSTALL MANIFEST README TODO VERSION setup.py
---the actual code---
RCS/ __init__.py Base.py Docbook.py Html.py Latex.py
---scripts for the bin dir---
pdx2docbook.py pdx2html.py pdx2latex.py
---documentation---
doc/
    go
    pdx.gif               for the web page banner
    default_cfg.pdx       settings for the documents
    article_style.pdx     tuned for this project
    deshist.pdx           design history (includes the perl Pdx notes)
    devguide.pdx          developer's guide
    manual.pdx            user's manual
    user_install.pdx      included in manual.pdx
    user_tutorial.pdx     included in manual.pdx
    hello.pdx             example document
    mktexdocs             generate documents in TeX
    pdxpix.fig,eps,jpg    example picture
    userguideR23.gif      example math
    userguideR24.gif      example math
---testing---
---stylesheets---
Pdx/styles/
    default_cfg_.pdx      generic defaults file
    article_style.pdx     generic stylesheet
2000-02-27
Lots of work on the documentation. In the process I discovered that the self.data (the dictionary for macros and def's) expects string values, and I need to do "str(value)" to assure they are prepared. Fixing that took care of some troubles with turning commandline flags on and off.
2000-04-17
Converted to distutils "setup.py" last week, and published to web. Per a comment by F. Lundh in c.l.p, used "readlines(BUFFERSIZE)", with BUFFERSIZE set to 100000. The process_file collection of regular expressions is a candidate for a better algorithm. Took a look at something like:
    tag_pat=re.compile(r'^\s*=([a-z_]+)')
    head_pat=re.compile(r'[1-5]\s+(.+)')
    ...
    tag_map={'head':(Base.do_head,head_pat),....}
    ...
    m=tag_pat.search(line)
    if m:
        tag=m.group(1)
        if tag_map.has_key(tag):
            (funct,tag_re)=tag_map[tag]
            self.funct(line)
Didn't get it completely working, so stayed with current brute force approach. Look at it later.
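For reference, one way the table-driven idea could be made to run is sketched below in present-day Python. This is not what Pdx ended up using (the notes say the brute force approach stayed); the do_head body, the output list, and the process_line name are placeholders invented for the example. Two details differ from the fragment above: the stored function is called as funct(self, ...) rather than self.funct(...), and dict.get replaces has_key, which no longer exists in Python 3.

```python
import re

TAG_PAT = re.compile(r'^\s*=([a-z_]+)\s*(.*)$')
HEAD_PAT = re.compile(r'^([1-5])\s+(.+)$')


class Base:
    def __init__(self):
        self.output = []

    def out(self, text):
        self.output.append(text)

    def do_head(self, match):
        # Placeholder handler: a real driver would emit format-specific markup.
        self.out('HEAD%s %s' % (match.group(1), match.group(2)))

    # Maps a tag name to (handler, pattern for the rest of the line).
    TAG_MAP = {'head': (do_head, HEAD_PAT)}

    def process_line(self, line):
        line = line.rstrip('\n')
        m = TAG_PAT.search(line)
        if m:
            entry = self.TAG_MAP.get(m.group(1))
            if entry:
                funct, arg_pat = entry
                args = arg_pat.search(m.group(2))
                if args:
                    # funct is a plain function taken from the class body,
                    # so the instance is passed explicitly.
                    funct(self, args)
                    return
        # Anything unrecognized is passed through unchanged.
        self.out(line)


if __name__ == '__main__':
    base = Base()
    for text in ['=head1 Intro\n', 'plain text\n']:
        base.process_line(text)
    print(base.output)
```

A single tag lookup replaces the chain of regex tests, at the cost of a second, tag-specific match for the arguments.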
2000-05-20
Worked on styles. Have Latex seminar.sty and foils.sty. Have a cleaned up HTML article, with sidebars et al. Then tidied up the test process --- for html anyway; need to do it for the others. I looked at Guido's regression test mechanism. Either I didn't understand it or it isn't doing what I need, so I'll stay with my own for a while longer.
DocBook XML DTD v 4.0 came out. I downloaded it and will work on generating it properly for the docbook driver.
Next worked on redirection of includes. The problem is this: Assume file1.pdx is a chunk of documentation which includes file1a from another dir. Next, file2 (in yet another dir) wants to include file1. How does pdx find the right file1a? I run into this when file1 is a viewfoil presentation with includes of its own, but I want to also include file1 in class notes. We need a way to redirect the basedir of an include. This should operate recursively. However, I've done a quick hack:
    =def savedir=@includedir@
    =def includedir=xyz
    =include file1a
    =def includedir @savedir@
The include function looks at the home dir of the original pdx file (i.e. file2), and then at xyz trying to find file1. That is, the main file's path is the default source for includes, with 1 alternative path. Rethinking, I really should be looking at the algorithm for cpp's #include. But I'll try this: As each include is started, push the abspath onto a stack of paths, and pop off the stack at the end. For each new include, compute the normed path of the concatenation of all the paths on the stack.
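The path-stack idea in the last paragraph might be sketched like this. It is one reading of the plan, not code from Pdx: the class IncludeResolver and its methods are hypothetical, and the sketch pushes the directory part of each include (relative to its parent) so that joining the whole stack gives the directory of the file currently being read.

```python
import os.path


class IncludeResolver:
    """Resolve nested include paths relative to the including file."""

    def __init__(self, main_file):
        # The stack starts with the absolute directory of the main file.
        self.dirs = [os.path.dirname(os.path.abspath(main_file))]

    def resolve(self, name):
        """Normed path of all stacked dirs joined with the include name."""
        return os.path.normpath(os.path.join(*(self.dirs + [name])))

    def enter(self, name):
        """Start an include: return its path and push its directory."""
        path = self.resolve(name)
        self.dirs.append(os.path.dirname(name))
        return path

    def leave(self):
        """Finish the current include: pop its directory."""
        self.dirs.pop()


if __name__ == '__main__':
    # file2 includes file1 from another dir; file1 includes file1a.
    resolver = IncludeResolver('/docs/notes/file2.pdx')
    print(resolver.enter('../slides/file1.pdx'))
    print(resolver.enter('pics/file1a.pdx'))
    resolver.leave()
    resolver.leave()
```

With this arrangement file1a is looked up relative to file1, wherever file1 itself was included from.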
Creator: Harry George
Updated/Created: 2001-09-03