Author Intrusion: Persistence plans

It's coming up on the end of the week and I've been steadily moving forward with Author Intrusion. Most of the features are theoretically available, according to the unit tests, but there are always a lot of little things that seem to surprise me as I add new features. This system can quickly get out of hand if I'm not careful.

Of the last two major features (basically broad strokes of development), the next one is an interesting one: persistence. This includes saving and loading files, but I want this to be a flexible system, which means it is a bit more than just writing out a file.

Source Control Management (SCM)

I use source control rather heavily. The number of times I've accidentally blown away a fresh chapter or lost a hard drive has given me an appreciation for systems like Git or Subversion (though I pretty much only use Git now).

One thing no SCM handles well is binary files. This includes Microsoft Word's DOC and DOCX formats along with LibreOffice's ODT. When you change one of these files, an SCM can only tell you that a binary changed, and it uploads the entire file again. If you save frequently, you'll quickly get 113-263 versions of the files, all of them just a mess of hex bytes.

With text files, the SCM identifies the lines that changed and uploads a delta, the minimum change needed to update the file. It becomes easier to see that you changed a specific paragraph or removed a section. It also takes up less space, and it is easier to figure out what you were doing at the time.
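
To make the idea of a line-based delta concrete, here is a minimal sketch using Python's standard difflib module. The file name and sample paragraphs are made up for illustration, and this only shows what a line-level diff looks like; it is not a description of how Git or Subversion store or transfer changes internally.

```python
import difflib


def line_delta(old_lines, new_lines, name="chapter-01.txt"):
    """Return a unified diff between two revisions as a list of lines."""
    return list(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile=f"a/{name}",
            tofile=f"b/{name}",
        )
    )


# Two hypothetical revisions of a chapter, one paragraph per line.
old = ["The sun rose.\n", "She walked to the well.\n", "The bucket was empty.\n"]
new = ["The sun rose.\n", "She ran to the well.\n", "The bucket was empty.\n"]

delta = line_delta(old, new)
print("".join(delta))
```

Only the changed paragraph shows up with "-" and "+" markers; the untouched paragraphs appear as context, which is what makes a commit on a text file readable.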

Manipulating styles

I'm always changing something when it comes to displaying text. This is one reason I like to keep formatting separate from content, because I'll blow a day trying to get the lines underneath a header to look "just right" across 127 short stories.

With ODT files, it was really easy to do this. Since ODT (and DOCX) are just ZIP files, I could decompress the file and play with the contents. For ODT, the style formatting is in a separate file from the content. Since I don't rename stylesheets, it meant that I could take one document, format it, and then write a quick shell script to update all the others.
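
As a rough illustration of that kind of script (written here in Python rather than shell), the sketch below copies the styles.xml entry from one ODT into another. The function name and the example paths are hypothetical; it assumes both files are ordinary, unencrypted ODT packages, and it is not the script I actually used.

```python
import zipfile
from pathlib import Path

STYLES_ENTRY = "styles.xml"  # ODF keeps document styles in this archive member


def copy_styles(source_odt, target_odt):
    """Replace styles.xml in target_odt with the one from source_odt."""
    source_odt, target_odt = Path(source_odt), Path(target_odt)

    with zipfile.ZipFile(source_odt) as src:
        new_styles = src.read(STYLES_ENTRY)

    # Rebuild the target archive next to the original, then swap it in.
    tmp = target_odt.with_name(target_odt.name + ".tmp")
    with zipfile.ZipFile(target_odt) as zin, zipfile.ZipFile(tmp, "w") as zout:
        for item in zin.infolist():
            if item.filename == STYLES_ENTRY:
                data = new_styles
            else:
                data = zin.read(item.filename)
            # Passing the ZipInfo keeps each entry's position and compression
            # method (ODF expects an uncompressed "mimetype" entry first).
            zout.writestr(item, data)
    tmp.replace(target_odt)


# Hypothetical usage: push one formatted document's styles to all the others.
# for odt in Path("stories").glob("*.odt"):
#     copy_styles("template.odt", odt)
```

Because the content lives in a different entry (content.xml), the text of each story is left untouched.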

In effect, each major component of an ODT or DOCX file is a separate file inside the ZIP archive. I want to do the same thing for Author Intrusion.

This would work well if I want to have shared components in a project. For example, if 53 of my short stories all have the same formatting, I'd love to have the ability to share that file once instead of writing some script to enforce the style.

It will also benefit the source control side since I will be able to identify what type of changes were made in a commit. If only the content file changes, it tells me one thing. If the formatting file changes, another.

The drawback of multiple files is that it is harder to manage the project. Instead of copying a file, you'll be copying a directory. I have an idea of how to handle that, but the default is to take over an entire directory for a project.

Reduce clutter

I like clean folders. Not a ton of different files scattered about. It makes it harder to read and find what I'm looking for. And, since I'm writing this, I want the default save layout to have the same philosophy.

Since I'm breaking up the project into different files, I'm also going to put most of them into separate folders underneath a directory.

  • Project.aiproj
  • Settings
    • Project.xml
    • Fantasy Dictionary.xml
    • Fedran.xml
  • Chapters
    • Chapter 1.xml
    • Chapter 2.xml

This will keep the top-level folder fairly clean. I'm keeping ".aiproj" for the main file because I need to be able to let the user double-click and have it open the right program. For the others, I'd rather be obvious about their file format.

Single file option

One of the things I'm trying to do is support the way other people work, not only how I work. So, a later version is going to let you take the entire massive directory structure and put it into a single ZIP archive automatically. This is basically what Microsoft Word and LibreOffice do. In that case, there will be a single file:

  • Sand and Blood.aiprojz

If you rename it to "Sand and Blood.zip" and open it, you'll have the big structure inside. That way, I get my ease of source control and specificity while allowing single files for those who prefer it.
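
A minimal sketch of that round trip, using Python's standard zipfile module, might look like the following. The function names are hypothetical and nothing here reflects the eventual Author Intrusion implementation; it also assumes the archive is written outside the project directory so it does not try to include itself.

```python
import zipfile
from pathlib import Path


def pack_project(project_dir, archive_path):
    """Write every file under project_dir into a single ZIP archive."""
    project_dir = Path(project_dir)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(project_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the project root so the archive
                # unpacks into the same layout.
                archive.write(path, path.relative_to(project_dir).as_posix())


def unpack_project(archive_path, project_dir):
    """Expand a single-file project back into a directory structure."""
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(project_dir)


# Hypothetical usage:
# pack_project("Sand and Blood", "Sand and Blood.aiprojz")
# unpack_project("Sand and Blood.aiprojz", "Sand and Blood (unpacked)")
```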

Layout options

Now, one thing I've learned is that no one will ever agree on how to arrange a project. I don't even agree with myself most of the time. So, I'm going to build in some flexibility on how the project is saved.

There are two ways I lay out writing projects. I have my novels, which look like this:

  • Novel Directory/
    • book.xml (DocBook 5)
    • chapters/
      • chapter-00.txt (Markdown)
      • chapter-01.txt

I also have my short stories layout:

  • World Directory/
    • short-story.txt (Markdown)
    • short-story-series/
      • short-story-02.txt
      • short-story-03.txt
    • short-story-04.txt

No doubt, there are a million different layouts for the authors. Since I can't figure out all of them, I'm going to steal an idea from Visual Studio and provide macros to figure out the layout. So, when a user defines their preferred layout, they'll just populate fields like this:

  • ProjectFilename: {ProjectDir}/Project.aiproj
  • ProjectSettingsFilename: {ProjectDir}/Settings.xml
  • ChaptersDir: {ProjectDir}/Chapter
  • ChaptersPath: {ChaptersDir}/Chapter {ChapterNumber}.xml

(I was going to use $(ProjectDir) to match Visual Studio exactly, but it was either use single-character delimiters or write my own. I went with not writing my own; I'm writing enough from scratch on this.)
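
To show how fields like these could resolve into real paths, here is a small sketch of {Name} macro expansion in Python. It is a hand-rolled regular expression written only for illustration, not the existing code I'm reusing in Author Intrusion; the expand function, the error handling, and the /home/writer/novel path are all assumptions.

```python
import re

# A macro is a name wrapped in single-character delimiters, e.g. {ProjectDir}.
MACRO = re.compile(r"\{(\w+)\}")


def expand(template, values, _seen=()):
    """Replace {Name} macros in template, resolving nested macros recursively."""

    def substitute(match):
        name = match.group(1)
        if name in _seen:
            raise ValueError(f"circular macro: {name}")
        if name not in values:
            raise KeyError(name)
        # A macro's value may itself contain macros, so expand it as well.
        return expand(str(values[name]), values, _seen + (name,))

    return MACRO.sub(substitute, template)


layout = {
    "ProjectDir": "/home/writer/novel",  # hypothetical project location
    "ChaptersDir": "{ProjectDir}/Chapter",
    "ChaptersPath": "{ChaptersDir}/Chapter {ChapterNumber}.xml",
}

print(expand("{ChaptersPath}", {**layout, "ChapterNumber": 3}))
# /home/writer/novel/Chapter/Chapter 3.xml
```

Because ChaptersPath is defined in terms of ChaptersDir, which is defined in terms of ProjectDir, changing one field moves everything underneath it.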

The standard directory layout (the one with all the files) will just be the default for new projects. I'm also planning (in a later release) to let the writer pick another project and say "set it up like this". Or provide additional settings as the application evolves.

A second reason for the flexible layout is to allow libraries (another thing from programming). For example, all of my short stories in a given setting typically have the same style guide, dictionaries, and organization. I'm not that original. In these cases, I might have this:

  • ProjectFilename: {ProjectDir}/{ProjectName}.aiproj
  • ProjectSettingsFilename: {ProjectDir}/Settings/{ProjectName}.xml
  • ExternalSettingsDir (e.g., dictionaries): {ProjectDir}/Settings/

In this case, I would put the structure and common settings in a single place and then have a bunch of project files at the top level. When I add a word to the dictionary for one, it will become available in all of them.


As you can probably guess, this is not going to be a simple application. There are already simple text editors out there and they do their job beautifully. I'm trying to write something that is very specific to writing but still acknowledges that highly creative people (e.g., authors) have their own needs.

One thing I don't like is programs that tell me how to work. I know there is a general trend for simplistic applications with few options. It makes life easier for the developer (fewer options to duplicate to find a bug), makes it easier to comprehend for users (fewer features to grok), and generally reduces the difficulty of maintaining the application. The problem is that I don't always work that way. So, if I want an application that works the way I want it to, I have three choices:

  • Keep looking for one that works for me
  • Write one that works exactly the way I want
  • Write one that is flexible enough to work the way I want

Obviously, I don't think the first one is an option, or I wouldn't be doing this. The second is actually a hard one, because I continue to evolve. I change things like how I like stuff arranged or organized. I need to acknowledge that as part of "exactly the way I want", so I'm trying to write the third version (make it flexible).