I still have a few SCCS directories hanging around from back in the '90s-ish, I think. RCS and CVS haven't left any trace in my system (they were work only, as was SourceSafe). And my live projects are SVN and git...
Yep, here's the first delta in one of the SCCS dirs...
^As 00031/00000/00000
^Ad D 1.1 98/04/27 21:02:41 dhd 1 0
^Ac date and time created 98/04/27 21:02:41 by dhd
I am super curious about your experience with SCCS! I have never really used it for personal projects, so the blog post is purely based on research and a bit of trying. Really curious what you loved and what annoyed you about SCCS!
We used SCCS on a satellite constellation demonstration system. Two developers shared archives via an NFS-mounted shared file system. SCCS was a “check out” model, in that if you wanted to make a change, you first acquired a lock, made your edits, then checked it in, releasing the lock.
Collections of revisions were self-managed, as you say.
RCS came along and I think that’s where the $Id$ construct came in. This would expand to the version number of that particular file - if you declared const char * const idstr = "$Id$"; then you could run strings(1) on your binary and see what components were there. Of course, include-guard #ifdefs would affect this.
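From memory, the idiom looked roughly like this (a sketch; the expanded string in the comment is only illustrative, RCS fills in the real file name, revision, date and author at checkout time):

    /* version.c: embed an RCS $Id$ keyword so it ends up in the binary */
    #include <stdio.h>

    /* Before checkout this is the bare keyword; after "co" it reads
       something like "$Id: version.c,v 1.4 1998/04/27 21:02:41 dhd Exp $" */
    static const char *const idstr = "$Id$";

    int main(void) {
        printf("%s\n", idstr);  /* or just run strings(1) / ident(1) on the binary */
        return 0;
    }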
Finally, CVS (maybe) had the idea of tagging, which would associate a single label across a current snapshot of different files and their versions.
In 1993-94, ClearCase came on the scene and tried monetizing this concept. A site license cost $30k in 1994 dollars and required a kernel patch for it to work. Try explaining that to your DEC, HP or Sun representative. The software implemented its own filesystem: you would set some kind of version label on a directory, and then magically all files and subfolders would appear as of that date. This included libraries, binaries, data files, even crash dumps.
This is all from memory, someone can fact check and correct the finer details.
ClearCase was basically a spinoff of Apollo's DSEE (same engineers), which didn't need kernel patches, because Domain/OS actually had an object filesystem and so the reference-to-repo was just a filesystem object. (As with ClearCase, it wasn't so much the software cost as the 1-3 FTEs you needed to be sacrificial experts to tune and manage it.) (insert "mimic a fraction of our power" meme here, though that applies to a lot of Apollo stuff :-)
RCS had tags but they were single file tags; the two big things CVS gave you were "this collection of RCS files means something as a whole" (which made tags and branches meaningful) and "you can sort-of-efficiently update from a repo on the other side of the country". (The former was the entire point: CVS was written so Solbourne could maintain their multi-cpu-SunOS fork and still pull changes from upstream; the latter was from Cygnus having a similar upstream+local problem and developers distributed across the planet.)
ClearCase. Ugh. I only had to use it on one job, in the 2000s; man, was it terrible to use. Very slow, required weird file system drivers and access. Horrific GUI. The semantic model was simply broken.
Systems like ClearCase are where “release manager” type roles were born. One poor guy would become the expert, trying to bang a release out of it over weeks or months.
As others have mentioned, using SCCS and RCS leaves you constantly having to ask others to unlock certain files so you can have a turn editing them.
Getting groups to move to CVS was hard. Devs were terrified of conflict resolution if they edited the same file concurrently as another dev. Took patient explaining that it would only be an issue if both devs modified the same lines, and even then it would guide you through the experience.
But they did let me, a young engineer, convert them from SCCS to CVS, so yay!
I used sccs on Sun workstations back in the 1990s. It wasn’t terrible but if you had more than a few team members then chasing people to check their code in for a build was painful and sometimes you’d hit a point where the project wouldn’t compile because of incompatible changes.
We switched to PVCS which made it easier to have isolated work spaces, plus a bunch of scripts to build those so developers could effectively emulate feature branches. Merging was a bit painful but at least you could work without holding other people up.
I’ve used other tools since but git is damn good compared to those older tools. I do miss being able to embed version strings but in reality tagging is superior if you have disciplined release control.
I used SCCS for a short period before finding RCS. IIRC the commands were a bit different and the files SCCS managed were larger. I also think SCCS had something similar to $Id$, but I forget what it was. Of course I could be misremembering.
I also ran across SCCS for Linux somewhere, I forgot where I saw it. Maybe you can look for it and play with it.
> arguable it took the rise of Git and Github for them to be used nearly everywhere.
This is fairly accurate according to my experience. It seemed absolutely crazy to me when I started computer science at uni in the early 2000s that nobody was using version control. It wasn't even mentioned as a thing we might want to look into. I had to go and look for something and found SVN. I ran my own server and thought it would be perfect for when we had a group piece of work, but I was unable to convince the rest of the group to install the weird software and cooperate. By '09 I'd discovered git and it was a breath of fresh air.
> SCCS stored each version’s delta along with metadata ... In principle, it is how most version control systems work today.
I think this is one of the most misleading and confusing things people seem to get taught about version control systems. Whether they store deltas or not underneath is actually an irrelevant implementation detail. But surely it's easier to think of version control systems as storing versions? It's right there in the name! It's not called a delta control system.
> I think this is one of the most misleading and confusing things people seem to get taught about version control systems. Whether they store deltas or not underneath is actually an irrelevant implementation detail. But surely it's easier to think of version control systems as storing versions? It's right there in the name! It's not called a delta control system.
Post author here!
I agree! One of the fascinating bits of SCCS is that it actually exposed the idea of deltas right away to the user. Of course in the 70s the number of users was limited: pretty much exclusively programmers familiar with the idea, and probably all people who knew each other :).
Nowadays a VCS should just care about snapshotting your work. In my experience, only Git is so confusing that people need to understand the underpinnings of the system (what's a GC, what's a tree, etc). Systems like SVN, Mercurial & Co all seem to mostly do the right thing for the user with minimal knowledge on the user side.
Fun side note: I had a heated argument with Olivia Mackall, the original author of Mercurial, when I wrote Mercurial bookmarks. I initially called them refs, since they were modeled after git refs. Olivia insisted on calling them bookmarks, since it's the more intuitive English name for what it's doing (bookmarking a part of a book of 'history'). Nowadays I much prefer the name bookmarks, since it's an analogy that makes it easy to reason about what it's doing, rather than a technical term that needs to be learned.
> Systems like SVN, Mercurial & Co all seem to mostly do the right thing for the user with minimal knowledge on the user side.
I have another blog idea for you: what are these "right things"? I used Mercurial a bit when I was new to programming, and back then Git and Mercurial seemed more or less the same with different command names. Today, I almost exclusively use (and hate) Git, but I find it hard to see what alternatives there would be in the DVCS space, other than something like Darcs/Pijul (perhaps my imagination has been stunted by too much exposure to Git). It would be great to have someone with the knowledge lay it out comprehensively and explicitly, so that the next generation of VCS developers can build upon it.
I do have thoughts about what the next generation of source control systems should look like! It hopefully will be the last post in the full series. So the estimated arrival time for that post, by extrapolation from how long it took me to write this one, is roughly 1-2 years :P
IMHO there are different ways to design a version control system:
1. The SCCS/Git way, aka the hacker way: look at what we can do with existing stuff, and use that to build something that can do the job. For example, if you're one of the world's experts on filesystem implementations, you can actually produce a fantastic tool like Git.
2. The mathematician's way: start with a model of what collaboration is, and expand from there. If your model is simple enough, you may have to use complex algorithms, but there is hope that the UI will match the initial intuition even for non-technical users. Darcs did this, using the model that work produces diffs, and conflicts are when diffs can't be reordered. Unfortunately this is slow and not too scalable. Pijul does almost the same, but doesn't restrict itself to just functions, also using points in the computation, which makes it much faster (but way harder to implement and a bit less flexible; no free lunch).
3. The Hooli way: take an arbitrary existing VCS, say Git. Get one of your company's user interviewers, and try to please interviewees by tweaking the command names and arguments.
The tradeoff between 1 and 2 is that 1 is much more likely to produce a new usable and scalable system fast, but may result in leaky abstractions, bad merges and hacks everywhere, while 2 may have robust abstractions if the project goes to completion, but that may take years. OTOH, method 3 is the fastest and safest method, but may not produce anything new.
So, I am the main author of Pijul, and I also don't quite see how to do much better (I'm definitely working on improvements, but not technically radical). But the causal relationship isn't the one you may be thinking: it is because I thought this was the ultimate thing we could have that I started the project, not the other way around.
By the time it was open sourced, the DVCS wars were already at an end. Git (due to Github) was used everywhere. Bazaar saw its last release the same year BK was open sourced (2016), and Mercurial usage dropped slowly.
I should have added to my original comment that I wasn't criticising the article, which I really enjoyed. Glad it was taken that way!
It's really interesting that those early systems really did deal with deltas. Unbelievable that ideas from systems that had been and gone before many people's lifetimes still prevail today. When was the last time users needed to know about deltas? Possibly skipping ahead impatiently here...
Arguably git still does this: quite a lot of the implementation is exposed to the user. But, as someone who learned RCS: that paper was exceptionally clear and the whole thing was extremely comprehensible, right down to the ,v format being human readable.
In retrospect VCS had a long way to go, but going from nothing to RCS was huge.
That is weird. I started in 2003 and we absolutely had SVN provided by the department... at latest by 2005ish, maybe even 2003, or it was CVS. I can only say this for sure because of one course where we actively used it.
I have not used sccs or rcs for version control as such, but I have used them a little.
The BSD archives published by Kirk McKusick include the source repository in sccs, so I have used sccs a little to drill down into nuggets of history there.
More weirdly, we had an ancient and janky terminal-based user interface to allow colleagues to edit configuration files in a controlled manner. It avoided concurrency problems by using rcs under the covers to implement locking.
In 1983, when I started using Unix, I went through the manuals for Unix System 7 trying to learn about every program that came with the system. So I tried SCCS, but at that time I wasn't developing software as part of a larger team so I decided it wasn't useful to me.
Many years later I bought the book "Applying RCS and SCCS: From Source Control to Project Control", half-read it, and decided I still didn't feel it was something useful to me. It wasn't until 1998 or 1999 that I started using CVS, then a few years later switched to SVN, and finally to Git (between SVN and Git I spent a few years using Unity's Asset Server).
This got me thinking about what the Linux kernel was using before BitKeeper (of course before Git). Did it not have version control? Was it just Torvalds' tree, maintained by applying patches from the list, and distributed via archives or rsync? (Oh, and if anyone knows how to get a copy of the BK or pre-BK sources, let me know!)
A strange case I've run across from the SCCS and RCS era is Plan 9. The history of the Plan 9 kernel is stored as ed scripts[0], which produce revisions per file, essentially like ad hoc SCCS deltas. I'm not sure if it was assembled as such after the fact or recorded like that all along. That method seems to have only been used for the kernel and the rest (such as the libraries) was snapshotted daily on a file server starting in 2002.
Did Linux have version control? That would depend on your definition of version control. It did not have a canonical version control system, but it did have the data structure for it in the form of the linux-kernel mailing list. From that a series of patches could be extracted that mirrors development quite accurately.
All subsystem maintainers had their own way of working with this. Some used svn, some used bespoke scripts.
The main reason for creating bk and git was that none of the existing version control systems matched the workflow of sending patches via a mailing list. This is quite clearly reflected in the design of git, a tool for quickly wiping and recreating a whole project directory.
Linux development history was imported to git, and you can follow it back in time since long before the introduction of git, so in that sense some form of version control can be said to have existed.
The quilt tool by Andrew Morton is one of the tools used for maintenance of the kernel (https://man7.org/linux/man-pages/man1/quilt.1.html). It's actually quite nice to use. Early versions of git and mercurial had tooling around similar workflows on top of git/hg repositories (guilt being one of these tools).
My understanding is that even during the BK years, some maintainers refused to use it and would continue to use tarballs + patches.
Torvalds' git repo only goes back to 2005; is there another one with more history? Because I wouldn't describe 2005 as "long before the introduction of git".
Yup, Linux was just patches and a sequence of tarballs.
Another weird one is a distributed version control system over UUCP [1], although not distributed in the modern sense.
[1] O'Donovan, Brian, and Jane B. Grimson. "A distributed version control system for wide area networks." Software Engineering Journal 5.5 (1990): 255-262.
I worked at Sun Microsystems for the last 3 years before the acquisition by Oracle.
We used a version control system based on SCCS. Just as CVS is based on the RCS format, this one was based on SCCS.
But it was distributed! The only way to clone a repo, or push or pull, was direct file system access (so NFS was used, of course), without any protocol over HTTP, SSH or TCP.
It was very cumbersome and difficult to understand and master. I could say it was more complicated than git, because it felt alien; I don't know why and cannot pinpoint the reasons.
It had a generator for web reviews (like modern pull-request reviews), but again, the generator produced static HTML pages with the various diffs, and you needed some place to host them; they were non-interactive, nothing like inline comments and such.
At the very end, right before the acquisition, we converted the repos to Mercurial, but the web-review tool was ported to hg with exactly the same output. And then I left Sun, errr, now Oracle.
This reminds me, I eventually learned that there were two styles of cooperating with others in a lock-based system, each with different advantages and disadvantages:
(1) Lock the file(s) as soon as you start coding. Commit when finished coding.
(2) Do all your coding against unlocked files. When finished coding, then lock the files, and copy over your edits to them. Then commit.
Style 1 is fine if there is low contention for the files, like a team with few members or a part of the codebase that is rarely changed. You never have to manually copy your edits to an updated file. If everyone uses this style, there's no such thing as a conflict. It's not even possible.
You also don't necessarily have to coordinate offline with anyone to prevent situations where people unknowingly proceed in incompatible directions with the same file. Coordination is still better, but locks naturally prevent this to some extent.
You do need to be responsive to emails to avoid blocking others. And it's possible to forget you have something locked when you don't even need it to be, like if you were just starting to code and paused or abandoned that work.
You may have disagreements about whose work is more important and whether you should release a lock or make the other person wait.
Style 2 is helpful if there is high contention. You need to plan and coordinate more since otherwise it's possible two people could make incompatible changes to the same file without knowing it until the last moment.
If the base version of a file changed (since you started coding against it), you have to manually copy over your changes. This definitely involves tedium and may require resolving conflicts.
This style is a lot like what we're accustomed to today with Git (and Subversion and CVS before that). But updates and merging are totally manual instead of mostly automated. Back in ancient times, codebases were smaller, so although that was definitely a burden, it was doable.
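For the curious, style 1 with RCS boiled down to a three-command loop (a sketch from memory; co -l takes the lock, and ci -u checks in, releases it, and leaves a read-only working copy):

    $ co -l parser.c    # check out locked; a teammate's "co -l" now fails
    $ vi parser.c       # edit while holding the lock
    $ ci -u parser.c    # check in a new revision, release the lock,
                        # and keep an unlocked read-only copy

Style 2 used the same commands, just deferring the co -l until the very end, right before copying your edits in.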
Joke's on y'all, our "release lock" in first semester introduction to programming for team assignments was the name of our ZIP file and whoever cc-blasted every member of the team with "team07_assignment03_v07.zip" called dibs on it :)
It goes without saying that "it works on my machine" was our mantra. Those were the days!
Back in '94 I was supporting a development team and one of my users came to me "I typed ci instead of vi. It asked me a question and I said yes and now my file is gone." I told her "Don't panic, I know exactly what happened and how to fix it."
The cited RCS paper is actually somewhat contentious among Unix greybeards. They claim the paper compared RCS against an obsolete version of SCCS. Supposedly a contemporary SCCS would have easily outperformed RCS.
Of course, RCS was freely available, while SCCS needed to be licensed from Bell Labs, so RCS was much more widely used.
Do you know if that was just a matter of implementation or a matter of data structure? In my mind, the RCS approach makes more sense for common operations such as checking out the most recent version, particularly if history is deep. (Writing would still be the same, given that the most recent version is stored at the top in RCS, rather than the file being append-only.)
Would love to get more flavor on this, and happy to amend the post with more information!
I’ve been waiting to see ‘luckydude weigh in, but here he is by proxy. When some time ago he and I talked about this (here, or on #Tcl), the thing I recall most from our conversation is “Tichy pulled the wool over our eyes.”
I still use RCS for my personal projects. I had moved to github, but M/S ended that experiment with their changes, so it's back to RCS and anon ftp on sdf. One thing I really like about RCS is that it will update tags in the source during check-in, making it easy to find the version of a binary. I would have stuck with git if git did that.
Anyway, I was working at a Fortune 500 company (which I just left); the group I was in had no SC until I was hired. I forced them to use RCS, and once we got a server I had them move to CVS.
Then a couple of years ago, we moved to git after corporate got serious about SC (Source Code Control). They contracted with github soon after the Microsoft purchase, so I moved our CVS to github. FWIW, when I left, many people were still strongly resisting SC.
I know some source was lost, and I wonder if the attitudes have finally changed.
I know that, but the main thing I do not like about git is that the source is not updated with version tags like in RCS. So when I dumped github I decided to move back to RCS because of $Id$.
If git had something like that I would have stuck with it.
* Tags in the binary which can be listed using ident(1)
* If I have an object checked out, no one can check in their changes without talking to me. Most other SC deals with this automatically, but I am old school :)
* Works nicely with emacs, but on OpenBSD the emacs interface is broken; you need to install GNU RCS for OpenBSD :(
One thing I really like about git is gpg signing on commit, but on the BSDs pinentry* does not prompt when used from inside emacs, so signing fails when using gpg.
Others have pointed out that you can retrofit a similar or even more powerful mechanism into git.
Alternatively, CVS and SVN have RCS-style keyword expansion built in. Either of these is an obvious improvement over RCS, so I don't think there are any good reasons not to upgrade.
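For reference, the git retrofit can be as small as one attribute line, though note that git's built-in ident expands $Id$ to the blob's object name rather than a human-readable revision, so it's a weaker substitute (a sketch):

    # .gitattributes: enable $Id$ expansion on checkout
    *.c ident
    # afterwards, "$Id$" in a .c file checks out as "$Id: <40-hex blob sha> $"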
When looking at a Linux kernel build problem last week I noticed how gmake still tries to handle SCCS by default. So if another level of obfuscation is needed for the next supply chain attack it is ready to be used.
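You can see those built-in rules for yourself; a rough sketch (make -p dumps make's internal rule database, which should show pattern rules like "%:: s.%" invoking $(GET) whenever an s.foo exists beside a missing foo):

    $ make -p -f /dev/null 2>/dev/null | grep -A 2 '^%:: s\.'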
> Note that this posts focuses on source control systems, meaning systems meant for storing versions of source code. Other version control systems that focus primarily binary data will not be covered.
I cannot overstate what a mistake this is. The fact that Git is effectively incapable of storing binary data is an utterly terrible thing. Our systems, tools, and pipelines are so much more fragile and complicated than they need to be because of this limitation.
Interesting post. I have to say I agree with the main thrusts of the article, although there are some differences in my opinions I won't go into.
I work with a couple source available pieces of software that have their dependencies all packaged in the repo and it's a joy to work with compared to trying to compile a typical open source package.
I’d be curious to hear your differences in opinion! It’s always nice to hear different perspectives. There are sooooo many different workflows there’s always something new to learn.
> it's a joy to work with compared to trying to compile a typical open source package.
I know right. Once you experience it and realize how delightful building a project can be it’s hard to go back. I swear people don’t realize that things don’t have to be as painful as they are!
Nice post. I actually still use RCS for version controlling basic dotfiles, the kind that still just plonk a single text file in your home directory.
It works really well for those. None of this extra overhead of sticking your entire $HOME in a git directory, or weird symlinking to some monorepo or whatever. Nice, intuitive, easy to manage, easy to back up and sync, perfect for local config files you're not looking to turn into your next big project, just a file you want to potentially roll back from time to time.
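If anyone wants to try it, the whole workflow is roughly this (a sketch; the .zshrc is just an example, and with an RCS/ subdirectory present the ,v archive files stay tucked out of sight):

    $ cd ~ && mkdir -p RCS     # keep the ,v files out of $HOME proper
    $ ci -u .zshrc             # first check-in creates RCS/.zshrc,v
    $ co -l .zshrc             # lock before editing
    $ ci -u .zshrc             # record the new revision
    $ rcsdiff -r1.1 .zshrc     # see what changed since revision 1.1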
I should do that. I used to use RCS for everything, and my recollection is that after I read the man page once or twice, I knew almost everything I needed to know. SCCS had a bit more of a learning curve. As for git, I almost always have to read not just the man page, but actually look on-line for clarification about almost anything. I think of git as the C++ of version control systems. (Not sure how far that analogy goes--I don't know what programming language I would compare RCS to, Pascal?)
Early on, companies would tend to take version control like SCCS, and layer their own SCM systems atop it, to address some of the most obvious team project pain points.
The bespoke layers I saw were usually kludgey, but sometimes they worked well enough, in a "worse is better" kind of way. Other times, people were greatly relieved to eventually get a better off-the-shelf SCM system.
This building atop SCCS is an example of starting with the minimum and adding what you need. It has different pros and cons compared to a more typical contemporary approach of starting with the maximum off-the-shelf framework/platform/service, and figuring out how to get it to do the simple thing you actually need.
There's a sweet spot in between the two, of sometimes using some big/medium-complexity off-the-shelf thing, and other times keeping it simpler, while making better decisions than flipping a coin. Though that tends to get lost in the reality of resume-driven-development, when the incentive of a decision-influencer to add a keyword to resume isn't aligned with the goals of the current project or current company. (But companies only have themselves to blame for this, when they hire by the keyword. Of course, selecting for and rewarding pursuit of keywords, that's what they'll get. Similar with hiring for Leetcode.)
I've been purchasing older O'Reilly books off of Ebay for the past couple of years. My most recent purchase was Applying RCS and SCCS by Don Bolinger, which is split between RCS, SCCS, and describing software configuration management in general.
The SCCS interleave format lived in BitKeeper and PVCS/Dimensions.
While researching which SCS to use at Honeywell in the early 2000s, I read the RCS thesis, and it was short, concise, and a fantastic read. I recommend any SCS n00bs read it first.
Thankfully at the time, I chose SVN over CVS and although SVN was only at v0.8 my small team never lost any data. I never regretted my decision.
For those following along, the blog post was updated to reflect some of the comments. Most importantly, it now includes an email that Marc Rochkind, the author of SCCS, wrote me, with some anecdotes from the time.
As an unrelated side note: Walter Tichy retired from KIT a few years back but continued teaching until recently. Because he simply cannot stop, he is now teaching in Kutaisi, Georgia.
He was by far the best known professor in computer science, much loved by his students: https://www.reddit.com/r/KaIT/search/?q=Tichy . Guess the memes are only half as funny if you don't understand German.
> Yep, here's the first delta in one of the SCCS dirs...
That's from the file served then and now as: http://www.exnet.com/About_Us.html
And I have found some there from 1995, which is about when we would have taken the HTTP server live...