Source Control

If you want a dry explanation, I can use Wikipedia to describe source control (also known as version control):

Version control (also known as revision control, source control, and source code management) is the software engineering practice of controlling, organizing, and tracking different versions in history of computer files; primarily source code text files, but generally any type of file.

However, I can speed run through it by describing how I got into embracing source control.

White Papers

I started programming in 1981. Before then, it was copying save games to a different directory when the game didn't provide slots. (Interestingly, I rarely bother with more than one slot these days.) At the time, when I was coding or writing for school, I would just make copies as I went. Mostly this was to compensate for making mistakes or programs crashing.

As I worked on bigger projects, crashes became more frequent. I would lose hours of work after inserting a couple hundred graphs into a white paper, or the entire system would corrupt as I struggled to create some grid of equipment lists for whatever tabletop role-playing game I was creating. That led to a tic that remains with me even today: when I pause, I have a tendency to tap Control-S as I'm thinking (or, when I was using Emacs, C-x C-s, which is great except that Control-X is “cut” in most programs).

I would also use the “Save As…” functionality a lot, affixing a number at the end. On really big projects, I would hit 100+ copies, which led to another pattern of mine: I almost always use 001 for my first number so the files always sort alphabetically and I have room to grow. You'll probably see that three-digit number later in this plot.
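To show why the leading zeros matter, here is a minimal Python sketch. The `numbered_name` helper and the `novel` file names are made up for illustration; the only point is that plain alphabetical sorting keeps zero-padded copies in order and scrambles unpadded ones.

```python
def numbered_name(stem: str, number: int, ext: str, width: int = 3) -> str:
    """Build a "Save As" style file name with a zero-padded counter."""
    return f"{stem}-{number:0{width}d}{ext}"


# Hypothetical copies of the same document, in the order they were saved.
padded = [numbered_name("novel", n, ".txt") for n in (1, 2, 10, 100)]
unpadded = [f"novel-{n}.txt" for n in (1, 2, 10, 100)]

# Alphabetical order matches save order only when the numbers are padded.
print(sorted(padded) == padded)      # True
print(sorted(unpadded) == unpadded)  # False: "novel-10.txt" sorts before "novel-2.txt"
```

Three digits gives room for 999 copies before the ordering breaks down again.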

The Blue Line

The “making copies” thing worked pretty well for about twenty years. Then I got a job in Chicago. At the time, I would get on at the National Street station in Elgin and take the two-hour ride to downtown, work for my usual nine to eleven hours, and then ride another two hours back. Along the way, I got tired of reading books and decided to write my first novel.

I propped up my laptop and wrote. Then I would copy my novel to my work machine so I could write during my lunch hour. Then copy it back to the laptop for the train ride home. Once I got home, I would copy the file to my home machine and then everything worked.

Every once in a while, though, I would get the steps wrong and accidentally overwrite a two-hour train ride's worth of writing with an old copy. And when I say “once in a while,” I mean about once every few weeks. I was not very detail-oriented at the time.

That got frustrating, so I started looking at options. My dad used a program called RCS for his work at the laboratory, so I started using that. It was a command-line program that allowed me to check files in and show changes from the previous version. Which made a big difference to me. However, RCS had a lot of limitations and I still accidentally overwrote my files, though a lot less frequently.

Since I'm always looking for improvements and trying to find some way of coordinating changes across multiple servers, I ended up learning about CVS but eventually settled on Subversion. Both CVS and Subversion had something that RCS didn't: they worked with servers. They were designed for multiple people working in the same project, possibly making changes to the same file. While I wasn't writing with anyone else, I did have three machines, which basically acted as three people working on the same novel. When I made a change to a file, the server would tell me that someone (me in another location) had already made changes to that file and I needed to merge or handle that conflict. Or it would tell me that I had “checked out” the file at home, so I couldn't edit it on my laptop until I checked in the files at home first.

That was a major turning point for writing for me. I stopped copying files from one machine to another and just treated each location as a different user. I didn't have to worry about overwriting a chapter with an old version because each computer already knew which order or what version I was on.

Git

In 2005, Git was created. I was already heavily into the Linux scene at the time (having fallen in love with it in 1995), so I was aware of Linus Torvalds and his work there. And Git was just an attempt to solve a specific problem (which is out of scope for this plot). I didn't really get into Git until 2009, and when I did, it was because it introduced two new things: disconnected source control and first-class branches.

The biggest problem with Subversion was that it needed a server to coordinate the work. That meant if I was going to write during my lunch hour, I had to either connect to my personal server to write or write on my laptop in the lunch room, which had no Internet connectivity. With no Internet, I was basically uploading one large change when I got home. I wasn't overwriting files, but I would occasionally make a large sweeping refactor that would conflict with work I had done on a different computer. The bigger the changes, the more time I spent merging the results together to make sure everything was good.

With Git, I could “commit” a change (think of it as “saving” but on a directory scale) without Internet access, label it with what I was doing, and keep going. Then later, Git would merge these commits together. Since Git was designed from the ground up to coordinate commits, it would automatically merge the results and only stop to ask for help when two changes were made to the same place. That meant that as long as I avoided editing the same paragraph on two separate computers, I could write freely and rarely worry about a conflict.
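The “only stop when two changes hit the same place” rule can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how Git is implemented: Git works on lines rather than paragraphs and copes with insertions and deletions, while this hypothetical `merge_paragraphs` function assumes every version has the same number of paragraphs. The chapter text is made up.

```python
def merge_paragraphs(base, mine, theirs):
    """Three-way merge of equal-length paragraph lists.

    Returns (merged, conflicts), where conflicts holds the indexes of
    paragraphs that were changed differently on both sides.
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles paragraphs edited in place")
    merged, conflicts = [], []
    for index, (b, m, t) in enumerate(zip(base, mine, theirs)):
        if m == t:
            merged.append(m)  # both sides agree, or neither side changed it
        elif m == b:
            merged.append(t)  # only the other machine changed it
        elif t == b:
            merged.append(m)  # only this machine changed it
        else:
            merged.append(m)  # keep mine for now, but flag it for a human
            conflicts.append(index)
    return merged, conflicts


# The version both machines started from, then one edit on each machine.
base = ["Chapter one.", "Chapter two.", "Chapter three."]
laptop = ["Chapter one, revised.", "Chapter two.", "Chapter three."]
desktop = ["Chapter one.", "Chapter two.", "Chapter three, revised."]

merged, conflicts = merge_paragraphs(base, laptop, desktop)
print(merged)     # both revisions survive
print(conflicts)  # [] because the edits touched different paragraphs
```

If both machines had rewritten the first paragraph in different ways, its index would show up in `conflicts`, which is the point where a real tool stops and asks for help.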

Branches

The other thing Git had was first-class branch support. Subversion had branches, which effectively meant copying the entire directory to another location, making changes, and then going through the painful process of merging the two branches together. An entire list of “best practices” sprang up because it was such an extensive process.

And Git did away with that. Branches were effectively just different chains of commits, each commit built on top of the one before it. The mechanism that let Git merge two computers' work together also worked for branches. It would coordinate the changes and only ask when there was something it couldn't merge. And over the years, it's gotten a lot better at merging.
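One way to picture “chains of commits” is as a small linked structure. The Python sketch below is a toy model under simplifying assumptions: every commit has a single parent (real Git merge commits have more than one), and the `Commit` class, helper functions, and branch names are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    """One saved step; it only knows the commit it was built on."""
    label: str
    parent: Optional["Commit"] = None


def history(tip: Optional[Commit]) -> list:
    """Walk from a branch tip back to the first commit."""
    chain = []
    while tip is not None:
        chain.append(tip.label)
        tip = tip.parent
    return chain


def common_ancestor(a: Commit, b: Commit) -> Optional[str]:
    """Find the most recent commit shared by two chains."""
    seen = set(history(a))
    for label in history(b):
        if label in seen:
            return label
    return None


# Two branches are just two names pointing at different tips
# that share the same earlier commits.
outline = Commit("outline")
chapter_1 = Commit("chapter 1", outline)
branches = {
    "main": Commit("chapter 2", chapter_1),
    "rewrite-ending": Commit("new ending", chapter_1),
}

print(history(branches["main"]))  # ['chapter 2', 'chapter 1', 'outline']
print(common_ancestor(branches["main"], branches["rewrite-ending"]))  # chapter 1
```

The shared commit is what makes merging tractable: it gives the tool a common starting point to compare both branches against, and nothing gets copied to create the branch in the first place.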