Second sprint week of docbook2odf development

So, after a week of working on it, I think docbook2odf is stable enough to use for a while. I'm not quite calling it v2.0.0, but it is almost there. "Beta" might fit for now, but I would like it to be a bit more functional before giving it a version number.

Reasons

I had three main reasons for doing this:

  • I wanted to take my Markdown/DocBook 5 system and generate things cleanly for the writing group. I mostly had this, but a few things (such as lists) were broken in that system, so getting them working would make my life easier.
  • I needed to simplify generation of Smashwords documents. Though I'm not fond of it, Smashwords uses a Word document to do all their work. Having something that simplifies the process takes most of the effort out of making those books, and having it as part of my normal generation of EPUB, MOBI, and PDF files makes it that much easier.
  • I wanted to have something that generated Standard Manuscript Format (SMF) for submissions. This actually does it, and I made a dedicated style for generating SMF, including putting the address on the first lines, estimated word counts (a placeholder for now), and even the newlines. Overall, it does a fairly good imitation of SMF.
    • I defaulted to "Courier New" because I like the font, but it is easy to switch to "Times New Roman".
    • Scene breaks are easily changed because I couldn't find consensus on how to do it. I stuck with "#" centered, but you could easily change that to "*" or "# # #" or something else.
    • I also made it easier to add/remove "END" at the end of the piece.
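As a rough illustration of the scene break and "END" options above, here is a minimal Python sketch. It is not how docbook2odf does it (that lives in the stylesheets); the function name, the parameters, and the 60-character line width are all made up for this example.

```python
def render_smf_body(scenes, scene_break="#", end_marker="END", width=60):
    """Join scenes with a centered scene break and an optional end marker.

    Hypothetical helper for illustration only; docbook2odf handles this
    in its stylesheets, not in Python.
    """
    lines = []
    for index, scene in enumerate(scenes):
        if index > 0:
            # Center the break between scenes, dropping trailing spaces.
            lines.append(scene_break.center(width).rstrip())
        lines.append(scene)
    if end_marker:
        # Pass end_marker=None (or "") to leave the marker off.
        lines.append(end_marker.center(width).rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_smf_body(["First scene.", "Second scene."], scene_break="# # #"))
```

Switching to "*" or dropping the "END" line is then just a matter of changing one argument, which is the same idea as the overrides in the SMF style.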

At the moment, the current HEAD does all this for every element I use in the books and stories in my Git repository. Now, I don't use anything more than chapters and scene breaks, but I did put in something for epigraphs and attributions. It even embeds graphical covers into the ODF document.

Status

At the moment, the new branch (currently HEAD on master) is a complete rewrite using namespace-aware stylesheets. I think the problem another developer had was using non-namespaced documents, so I wrote some Perl scripts to create both versions from the same source. I also added a lot of examples and documentation, both to show various features and to make sure it all works.
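To give an idea of what generating a non-namespaced version from a namespace-aware stylesheet involves, here is a minimal sketch in Python. This is not the Perl script from the repository. It assumes the stylesheet binds the DocBook namespace to the "d" prefix, and it uses plain text substitution, so it would also rewrite a literal "d:" that happened to appear in ordinary text. The function name is made up for this example.

```python
import re

# Assumption: the namespace-aware stylesheet declares the DocBook 5
# namespace with the "d" prefix and excludes it from the output.
DOCBOOK_NS_DECL = re.compile(r'\s+xmlns:d="http://docbook\.org/ns/docbook"')
EXCLUDE_PREFIXES = re.compile(r'\s+exclude-result-prefixes="d"')
# A "d:" prefix directly in front of an element name, as in d:para.
D_PREFIX = re.compile(r'\bd:(?=[A-Za-z_])')


def strip_docbook_prefix(stylesheet_text):
    """Return a non-namespaced copy of a namespace-aware stylesheet.

    Hypothetical helper: a text-level approximation, not an XML-aware
    transformation.
    """
    text = DOCBOOK_NS_DECL.sub("", stylesheet_text)
    text = EXCLUDE_PREFIXES.sub("", text)
    return D_PREFIX.sub("", text)


if __name__ == "__main__":
    sample = '<xsl:template match="d:para/d:emphasis"/>'
    print(strip_docbook_prefix(sample))
```

A real conversion would want to be smarter about where the prefix appears, but this captures the basic idea of keeping one source and deriving the other version from it.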

The items that aren't done, and that I consider necessary for v2, are:

  • verbatim.xsl
    • screen, programlisting, synopsis
  • synop.xsl (programming inlines)
    • varname, filename, constant, guilabel, guibutton, guimenu
    • accel
  • section.xsl
    • above/d:subtitle
  • notes.xsl (slides)
  • info.xsl
    • Formatting of authors, names, copyrights
  • block.xsl
    • d:blockquote
  • tables.xsl
    • handling tables
  • slides.xsl
    • handling slides
  • paragraph.xsl
    • d:formalpara (has d:title)
  • inline.xsl
    • d:email
    • d:uri
    • d:credit
  • bibliography.xsl
    • Lots of stuff

Changes

  • Reimplemented most of the stylesheets as namespace-aware versions
    • Moved old version into 'old-xsl/' for research purposes
    • About 80% of the old code is implemented in new version
    • Switched to a slightly different method for doing styles and overrides
  • Changed 'docbook2odf' to allow for single-file selections
    • The old 'opf.xsl' is embedded into the executable
  • Started to provide some standard formats, such as Standard Manuscript Format.
  • Modified image processing to fit more with the content and viewport formulas.
    • Allow scaling of images with only one dimension specified.
    • Allow for images to be anchored to a character or page.
    • Now only picks one supported image and allows for text fallback.
  • Switched to using the common functionality of the other docbook-xsl-ns stylesheets.
    • Requires these stylesheets to be called in a relative directory.
    • TOC generation is controlled via the $generate.toc params.
    • Added custom rule for TOC generation at the end (to support ebook creation).
  • Added headers and footers into the stylesheets.
  • Added processing to allow embedding of cover images.
  • Added processing (via Makefile) to generate non-namespaced versions.
  • Added examples (both namespace and non-namespace) of various features.
  • Added processing to create legal pages for ebooks.
  • Nested quotes work properly.
  • Examples use 'jing' to verify they match DocBook 5 RNG schema.
  • Normalized on the "d:" prefix for the DocBook namespace and used exclude-result-prefixes, to match docbook-xsl-ns.
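The "scaling of images with only one dimension specified" item comes down to keeping the aspect ratio of the original image. A minimal Python sketch of that calculation follows; the function and parameter names are made up for this example, and the actual stylesheets follow the DocBook content and viewport rules, which cover more cases than this.

```python
def resolve_image_size(intrinsic_width, intrinsic_height, width=None, height=None):
    """Fill in a missing width or height while keeping the aspect ratio.

    Hypothetical helper for illustration; all values are assumed to be in
    the same unit.
    """
    if intrinsic_width <= 0 or intrinsic_height <= 0:
        raise ValueError("intrinsic dimensions must be positive")
    if width is None and height is None:
        # Nothing requested: use the image's own size.
        return (float(intrinsic_width), float(intrinsic_height))
    if width is None:
        # Only the height was given, so derive the width from the ratio.
        width = height * intrinsic_width / intrinsic_height
    elif height is None:
        # Only the width was given, so derive the height from the ratio.
        height = width * intrinsic_height / intrinsic_width
    return (float(width), float(height))


if __name__ == "__main__":
    print(resolve_image_size(800, 600, width=400))
```

When both dimensions are given, they are used as-is, even if that distorts the image.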

Future Plans

I want to restore the features that existed in the old version. That is one of my priorities, but it will take a little while since I need to spend the rest of December writing. I also have lots of other projects, but I'll slowly get those features in there.

The original writers of this seem to have abandoned it. I'd like to use a different license (MIT) instead of the original GPL-2, but I don't know if I can do that. I sent an email, so we'll see if they respond.

I think I now have enough of the basics that I could eventually write a "direct to docx" version. That will be MIT licensed, and I'll make it so I can roll it into Author Intrusion.
