Screens, Research and Hypertext


Version Control and Transclusion

Git, but for content.

It's tempting to think of transclusion as a fancy word for embedding. After all, they do both involve inserting a live piece of content from one website inside another website. But there is one really important difference between embedding as it has been implemented on the web and transclusion as envisioned by hypertext theorists.

When you embed a piece of content, you pull through whatever content exists at that URL. When you are fully in control of the content at both the source URL and the destination of the embed, that's not a problem. But when you embed a URL whose content you do not control, then things can get complicated.

If someone updates the content at a URL that you have embedded, then that new content will show up on your site, too.

In a lot of contexts, that sort of instant updating is a feature, not a bug. As John Schwartz points out, the Nuffield Trust, for example, treats each of its charts as a unique piece of content, often embedding the same chart into multiple outputs.

The International Budget Partnership takes this approach one step further, using Drupal's tokening system to dynamically assemble prose passages in country reports based on a spreadsheet full of data values.

In both those cases, live updates are exactly what you want. If you find a typo in a chart or change the data threshold for a particular passage of text, then it should update your content everywhere it's embedded. That's the entire dream of create once, publish everywhere.

But that's not always the behavior you want—particularly when you are embedding someone else's content. If your think tank has conducted an analysis based on a Spring 2021 government data set, then you don't want your charts to automatically update when the Winter 2021 data set is released. Your analysis was based on that specific set of data. It may or may not hold with the arrival of updated figures.

That's what makes transclusion different from embedding. Proper transclusions require some sort of version control. Maggie Appleton envisions something like Git, but for content:

Transclusion is more like a saved version of the original – a cached snapshot of it at a moment in time. You can update that snapshot to a newer version if you want to, but it's not automatic. Even if the original author deletes it, your transcluded version stays visible.

It works the same way Git does—it's a version control system that stores the state of a file at a particular point in time.
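
To make the distinction concrete, here is a minimal sketch of the two behaviors. It is an illustration only, not how any real embedding or transclusion system is built: the `Source`, `Embed`, and `Transclusion` classes and the example URL are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    """A piece of content at a URL, with every published version kept (hypothetical)."""
    url: str
    versions: list[str] = field(default_factory=list)

    def publish(self, text: str) -> int:
        """Store a new version and return its index."""
        self.versions.append(text)
        return len(self.versions) - 1

    def latest(self) -> str:
        return self.versions[-1]


@dataclass
class Embed:
    """Live embed: always shows whatever is at the source right now."""
    source: Source

    def render(self) -> str:
        return self.source.latest()


@dataclass
class Transclusion:
    """Snapshot: keeps its own copy of the text as it was at one moment."""
    source: Source
    pinned_version: int
    snapshot: str

    @classmethod
    def take(cls, source: Source) -> "Transclusion":
        version = len(source.versions) - 1
        return cls(source, version, source.versions[version])

    def render(self) -> str:
        # Reads the stored copy, so later edits upstream do not change it.
        return self.snapshot


data = Source("https://example.org/dataset")  # placeholder URL
data.publish("Spring 2021 figures")

embed = Embed(data)
quote = Transclusion.take(data)

data.publish("Winter 2021 figures")

print(embed.render())  # Winter 2021 figures
print(quote.render())  # Spring 2021 figures
```

The embed follows the source to the Winter release, while the transclusion still shows the Spring figures the analysis was based on. Because the snapshot is stored alongside the version number, it would also survive the source being deleted.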

As with code in Git, the organization that writes the content—call them the Centre for Ideas—controls the main branch. When your think tank transcludes a passage from Centre for Ideas, it creates a fork of the content at that exact date and time.

If Centre for Ideas later updates that content, they push their change to the main branch. You then get a notification. Now you can look at the changes in the main branch and then choose either to pull those changes to your copy or to keep your own forked version in place.

That way you can accept the changes that don't affect your analysis (like when the Centre for Ideas realizes the proper way to spell "center") and reject the ones that do (say, when the Centre changes its methodology and the new data set is no longer directly comparable with the old one).
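
The review step described above could be sketched roughly as follows. This is an assumption-laden toy, not a real workflow: `PinnedCopy` and its methods are hypothetical names, the sample sentences are invented, and the "notification" is reduced to checking whether a diff exists. The only real library call is `difflib.unified_diff` from Python's standard library.

```python
import difflib
from dataclasses import dataclass


@dataclass
class PinnedCopy:
    """Your copy of someone else's passage, held at the version you reviewed (hypothetical)."""
    text: str

    def pending_changes(self, upstream_text: str) -> list[str]:
        """Return a line diff between your copy and the upstream text (empty if identical)."""
        return list(difflib.unified_diff(
            self.text.splitlines(),
            upstream_text.splitlines(),
            fromfile="your copy",
            tofile="upstream",
            lineterm="",
        ))

    def pull(self, upstream_text: str) -> None:
        """Accept the upstream change by replacing your copy."""
        self.text = upstream_text


copy = PinnedCopy("The centre surveyed 1,000 households.")

# Change 1: a spelling fix. Review the diff, then accept it.
spelling_fix = "The center surveyed 1,000 households."
diff = copy.pending_changes(spelling_fix)
if diff:
    print("\n".join(diff))
    copy.pull(spelling_fix)

# Change 2: a methodology change. Review the diff, then keep your copy
# by simply not calling pull().
new_method = "The center surveyed 500 businesses."
diff = copy.pending_changes(new_method)
if diff:
    print("\n".join(diff))

print(copy.text)  # The center surveyed 1,000 households.
```

The point of the design is that nothing changes in your copy unless you call `pull()`: the default is to keep what you reviewed, and updating is a deliberate act.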

For more context

What is transclusion, anyway?

What to read next

Transclusion is just one part of building a better web.

Other items of interest

Single source publishing is the holy grail for online publishers.

Version control is one ingredient for transclusion. Better URLs are the second.

Version control and better URLs are great. But you need better writing, too.