MP

TIL: use svn to download subdirectory of GitHub repo

Sometimes you just want to download a subdirectory of a GitHub repo. My use case today was vendoring the apache-avro Rust crate. The crate is published on crates.io, and its source lives in the apache/avro repo on GitHub, which also contains the Avro implementations for the other supported languages.

Vendoring it proved to be complicated, because the Rust package is nestled inside the repo at the path lang/rust/avro/ and because I needed to use the unpublished version in git, since the currently published version does not support schema references. Normally for vendoring I use the excellent cargo clone, but that only works for published crates.

Pulling down the entire avro repository would have been excessive. It's quite large, and I only needed the Rust bits, so ideally I could just download the lang/rust path and be done with it. This turns out to be more complicated than you'd expect! I found a Stack Overflow post on the topic, where a lot of the answers were more involved than I wanted. I didn't want to track the repo state locally. If I'm going to update, I just want to blow it away and re-download.

This looked like a pretty solid option:

git archive --format tar --remote ssh://server.org/path/to/git HEAD docs/usage > /tmp/usage_docs.tar

But GitHub doesn't support pulling down an archive this way via the https scheme, and if you build this into a script to share with others, you first need to ensure that everyone who's going to use it has SSH set up properly, which I couldn't guarantee.

I thought about writing a script to download the repo to a temporary directory, mv out the bits I care about, and then delete the rest, but then I saw what seemed like an absolutely bonkers suggestion to use... svn, of all things.
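For reference, the clone-and-discard script I had in mind would look roughly like the sketch below. It is a minimal version, not something I ended up using: the function name vendor_subdir and its three arguments are made up for illustration, and it assumes git and mktemp are on the PATH. It only needs git, so it does not depend on GitHub's Subversion support.

```shell
#!/bin/sh
# Sketch: vendor one subdirectory of a git repo without keeping the clone.
# vendor_subdir is a hypothetical helper name, not part of git.
# Usage: vendor_subdir REPO_URL SUBDIR DEST
vendor_subdir() {
    repo_url=$1
    subdir=$2
    dest=$3

    # Scratch space for the throwaway clone.
    tmp=$(mktemp -d) || return 1
    status=1

    # --depth 1 fetches only the latest commit of the default branch.
    if git clone --quiet --depth 1 "$repo_url" "$tmp/repo" &&
        [ -d "$tmp/repo/$subdir" ]; then
        # Blow away any previous copy, then move the wanted bits into place.
        rm -rf "$dest"
        mkdir -p "$(dirname "$dest")"
        mv "$tmp/repo/$subdir" "$dest"
        status=$?
    fi

    # Delete the rest of the clone, whether or not the move succeeded.
    rm -rf "$tmp"
    return "$status"
}
```

For my case the call would be vendor_subdir https://github.com/apache/avro.git lang/rust .vendor/apache-avro. The downside is the one that put me off: even a shallow clone still downloads every language's files just to keep one directory.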

Turns out that GitHub supports a number of svn use-cases, presumably to help people migrate, so to download a subdirectory of the main branch of a repo, you can run:

svn export --force https://github.com/apache/avro.git/trunk/lang/rust .vendor/apache-avro

You take the tree/master bit from the regular GitHub link and replace it with trunk, and then you're off to the races. The --force argument lets the export write into the target directory if it already exists, overwriting the files that are there.
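If you do this for more than one repo, the URL rewrite is easy to script. Here is a small sketch of the substitution described above; github_tree_to_svn is a made-up helper name, and it assumes the link points at the default branch (master unless you pass another name as the second argument), since trunk only stands in for that branch.

```shell
#!/bin/sh
# Sketch: turn a GitHub "tree" link into the matching svn trunk URL.
# github_tree_to_svn is a hypothetical helper name.
# Usage: github_tree_to_svn URL [DEFAULT_BRANCH]
github_tree_to_svn() {
    url=$1
    branch=${2:-master}

    case $url in
        */tree/"$branch"/*)
            # Everything before /tree/<branch>/ is the repo URL,
            # everything after it is the path inside the repo.
            repo=${url%%/tree/"$branch"/*}
            path=${url#*/tree/"$branch"/}
            printf '%s/trunk/%s\n' "$repo" "$path"
            ;;
        *)
            echo "not a /tree/$branch/ link: $url" >&2
            return 1
            ;;
    esac
}
```

With that in place, the export becomes svn export --force "$(github_tree_to_svn https://github.com/apache/avro/tree/master/lang/rust)" .vendor/apache-avro.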

If you're running a Nix-mediated development environment, you can just throw subversionClient into your dependencies and be done with it.

Created: 2023-07-25

Tags: github, svn, til