www.byjp.me/content/posts/git-download-subpath/index.md
JP Hastings-Spital b8cc9ee66d Remove leading newlines

title: git download-subpath
emoji: 🤓
date: 2024-05-14T13:08:10+01:00
summary: I made a git helper tool that lets you retrieve files from a subpath of git without downloading the whole of the (possibly huge) repo.
tags: git, computers, tools
topics: Technology
syndications: https://gist.github.com/jphastings/6560bb173399fc3a155913b33e5f0f0c

My blog has ~250MB of photos in it (I archived my Instagram here: a decade of photos), which means that every time I want a few files from my blog's repo I've had to pull down all of that data to get at them.

This isn't a problem on my laptop where I work on my blog (as it's already cloned), but I also keep my IndieKit config in there, which I need to copy to the server it runs from. I don't want to have to download hundreds of MB (and growing!) of photo data every time I want to update it.

So I built git-download-subpath, which is a bash script around git's partial clone functionality.

$ git download-subpath
# usage: git-download-subpath <repo> <subpath> [destination]

$ git download-subpath https://github.com/by-jp/www.byjp.me indiekit
# Successfully downloaded to ./indiekit

This script (which you can find & download below) completes these steps:

  1. Clones a "no tree" (treeless), shallow copy of the repo to a temporary directory without checking anything out (just the latest commit, not the directory listings or file contents)
  2. Sets up a "no cone" sparse checkout of the desired subpath and checks it out, so git only fetches the files under that path
  3. Copies that subpath into the working directory (or the destination, if specified)
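
As a rough sketch, the three steps map onto the commands below. The function name `sparse_fetch` and its argument order are made up for illustration (the destination is assumed to be an existing directory); the git flags are the same ones the full script uses.

```shell
#!/bin/bash
# Hypothetical helper, not part of the real script: fetch one subpath of a repo.
# usage: sparse_fetch <repo-url> <subpath> <existing-destination-dir>
sparse_fetch() {
  local repo="$1" subpath="$2" dest="$3" tmp status
  tmp=$(mktemp -d) || return 1
  (
    cd "${tmp}" || exit 1
    # Step 1: latest commit only (--depth=1), trees and file contents
    # fetched on demand (--filter=tree:0), nothing checked out yet (-n).
    git clone -q -n --depth=1 --filter=tree:0 "${repo}" repo || exit 1
    cd repo || exit 1
    # Step 2: restrict the checkout to the subpath, then check out;
    # git only fetches the objects that checkout needs.
    git sparse-checkout set --no-cone "${subpath}" || exit 1
    git checkout -q || exit 1
    # Step 3: copy the result into the destination directory.
    cp -r "${subpath}" "${dest}/"
  )
  status=$?
  rm -rf "${tmp}"
  return "${status}"
}
```

Calling `sparse_fetch https://github.com/by-jp/www.byjp.me indiekit "$(pwd)"` should leave an `indiekit` directory in the current directory, without any of the photo data being downloaded.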

I'm no bash expert, so there may be subtle bugs here; let me know if you spot them!

Copy the following bash to a file called git-download-subpath somewhere in your $PATH (I keep it in /usr/local/bin), and mark it as executable with chmod +x git-download-subpath:

#!/bin/bash
set -e

repo="$1"
subpath="$2"
if [[ -z "${repo}" || -z "${subpath}" ]]; then
  echo "usage: git-download-subpath <repo> <subpath> [destination]"
  exit 1
fi

if [[ -z "$3" ]]; then
  dest=$(pwd)
else
  dest=$(realpath "$3")
fi
if [[ ! -d "${dest}" ]]; then
  >&2 echo "${dest} is not a directory that exists"
  exit 3
fi
if [[ -d "${dest}/${subpath}" ]]; then
  >&2 echo "${dest}/${subpath} already exists, exiting to avoid replacing"
  exit 3
fi

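# Cross-platform create temporary dir: https://unix.stackexchange.com/a/84980/
tmpdir=$(mktemp -d 2>/dev/null || mktemp -d -t 'git-download-subpath')
# Remove the temporary clone when the script exits, whether it succeeds or fails
trap 'rm -rf "${tmpdir}"' EXIT
cd "${tmpdir}"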

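# -n: no checkout yet; --depth=1: latest commit only; --filter=tree:0: fetch trees & files on demand
if ! git clone -n --depth=1 --filter=tree:0 "${repo}" repo > /dev/null 2>&1; then
  >&2 echo "Couldn't clone '${repo}'."
  exit 4
fi
cd repo
# Only the subpath is checked out, so only its files get fetched
git sparse-checkout set --no-cone "${subpath}" > /dev/null 2>&1
git checkout > /dev/null 2>&1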
if [[ ! -d "${subpath}" ]]; then
  >&2 echo "The directory '${subpath}' doesn't exist in this repo."
  exit 5
fi
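# Recreate any parent directories so a nested subpath really lands at ${dest}/${subpath}
target_parent=$(dirname "${dest}/${subpath}")
mkdir -p "${target_parent}"
cp -r "${subpath}" "${target_parent}/"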

echo "Successfully downloaded to ${dest}/${subpath}"
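
The file name matters when installing this: git runs any executable called `git-<name>` that it finds on `$PATH` when you type `git <name>`, which is why the script can be invoked as `git download-subpath`. Here's a small demonstration with a throwaway command; `git-hello` and the temporary directory are purely illustrative.

```shell
#!/bin/bash
# Illustrative only: "git-hello" is a made-up command name.
bindir=$(mktemp -d)

# Write a tiny executable whose file name starts with "git-".
cat > "${bindir}/git-hello" <<'EOF'
#!/bin/bash
echo "hello from a custom git subcommand"
EOF
chmod +x "${bindir}/git-hello"

# With that directory on PATH, git finds the file and runs it as "git hello".
hello_output=$(PATH="${bindir}:${PATH}" git hello)
echo "${hello_output}"

rm -rf "${bindir}"
```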