Fixing Git LFS for PLR required rewriting git history

background

Git LFS adds the ability to track large files in git without the .git history size exploding, by storing small pointer files in the repository instead of the file contents. PyHamilton used LFS to track the VENUS executables and files it requires.

As many of you know, PLR was originally a fork of PyHamilton. At no point was code actually shared between PyHamilton and what would become PLR, and so I decided to make PLR a standalone repo on GitHub (I think only the README and setup.py files have lived on). However, in the process of separating the repos, the files referenced by LFS got lost. This means that using git lfs with PLR gives an error.

what i have done

To fix the LFS issue, I have taken the commit where I started PLR on the original main branch as a starting point, and applied all subsequent commits onto that. The unrelated PyHamilton commits before that, including the problematic ones referencing LFS pointers, have been removed. Obviously, I preserved all the authorship information, timing, branching, etc. for all PLR commits in the history. On GitHub, all the code, issues, historical PRs, and discussions still exist.

For posterity, I have kept a history of the original main branch on archive-main. This branch will not be used or maintained, and simply serves as a reference.
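
If you want to check whether a local branch still belongs to the old history, one option is to test whether two refs share a common ancestor. This is only a minimal sketch: the helper name `shares_history` is made up for illustration, and it assumes you have already run `git fetch origin` in your clone.

```shell
# Hypothetical helper: succeeds if the two given refs share a common ancestor.
# `git merge-base` exits non-zero when the histories are unrelated.
shares_history() {
  git merge-base "$1" "$2" >/dev/null 2>&1
}

# Example, run inside your PLR clone after `git fetch origin`:
#   shares_history main origin/main && echo "related" || echo "unrelated"
#   shares_history main origin/archive-main && echo "main is still on the old history"
```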

Some additional benefits:

  • Reduced repo size from 110.61 MB to 46.75 MB (21953 to 10881 objects), so cloning should be roughly 2x as fast
  • Commit history will more accurately reflect people’s contributions to PLR
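
To see the size of your own clone, something like the following should work. It is a sketch only: `repo_size` is a hypothetical wrapper, and the numbers it prints may not match the ones above exactly, since I am not assuming how those were measured.

```shell
# Hypothetical helper: print object counts and on-disk size for a clone.
# With no argument it inspects the repository in the current directory.
repo_size() {
  git -C "${1:-.}" count-objects -vH
}

# Example:
#   repo_size path/to/your/plr/clone
# Look at the "in-pack" (object count) and "size-pack" (packed size) lines.
```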

GitHub automatically closed all open PRs as a result of this change. See instructions on how to re-open below.

TODO for you: how to clone

Since the git history of your local clone and the git history of the online repository no longer share a common ancestor, you have to freshly clone the repo (or reset your local main to the new history) to continue pulling and contributing in the future. A simple git pull is not enough.

:rotating_light: IN EITHER OPTION, I RECOMMEND DUPLICATING THE DIRECTORY CONTAINING YOUR PLR WORK AS A BACKUP :rotating_light:
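
One possible way to make that backup is a plain recursive copy that includes the `.git` directory. A minimal sketch; `backup_dir` and the timestamped naming scheme are my own choices for illustration, not something the repo provides.

```shell
# Hypothetical helper: copy a directory (including .git) to a timestamped
# sibling directory and print the path of the copy.
backup_dir() {
  src="${1%/}"   # strip a trailing slash, if any
  dest="${src}-backup-$(date +%Y%m%d-%H%M%S)"
  cp -a "$src" "$dest" && echo "$dest"
}

# Example (the directory name is an assumption, use wherever your clone lives):
#   backup_dir ~/code/pylabrobot
```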

  • Option 1

The easiest option is to remove the PLR folder and clone the repo again. Be careful that you’re not accidentally deleting local changes or other files stored in this directory!

  • Option 2

Another option is to do this:

git switch main
git fetch --all
git reset --hard origin/main

With this option, your other local branches will be preserved.

Be careful: this will discard any uncommitted changes and any local commits on main which have not been merged into PLR!
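
If you are unsure whether your local main has unmerged commits, one way to hedge is to park the old tip on a separate branch before running `git reset --hard origin/main`. A sketch under assumptions: `keep_old_tip` and the `old-<branch>-backup` name are invented for illustration.

```shell
# Hypothetical helper: create a backup branch pointing at the current tip of
# the given branch (default: main), so the old commits stay reachable.
keep_old_tip() {
  branch="${1:-main}"
  git branch "old-${branch}-backup" "$branch"
}

# Example, run before the reset:
#   keep_old_tip main
# Afterwards, assuming origin/archive-main matches the old main, this lists
# commits that only ever existed locally:
#   git log --oneline origin/archive-main..old-main-backup
```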

TODO for you: how to fix PRs

If you have work on branches which had not been merged into main before the history rewrite, use the following to rebase your commits onto the new history (this assumes your local main already points at the new history, as described above):

git checkout <your pr branch>

# rebase your changes onto new main
git rebase main

# replace the history on github with your updated local, correct history
git push --force

Then, open a new PR to PLR.
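
Because the old and new histories share no commits, a plain `git rebase main` has to work out by itself which old commits to replay. If it stops on conflicts in commits that are not yours, an alternative is to tell git explicitly which range to move with `git rebase --onto`. This is a sketch, not the method used for the rewrite: `rebase_pr_branch` is a hypothetical helper, and it assumes that `origin/archive-main` contains the commit your branch was started from and that your local `main` is already on the new history.

```shell
# Hypothetical helper: replay only the commits on a PR branch that are not
# part of the archived old history, on top of the new main.
rebase_pr_branch() {
  pr_branch="$1"
  old_base="${2:-origin/archive-main}"  # assumption: contains your fork point
  new_base="${3:-main}"                 # assumption: already on the new history
  git rebase --onto "$new_base" "$old_base" "$pr_branch"
}

# Example:
#   git fetch --all
#   rebase_pr_branch my-feature-branch
```

After this, the force push and new PR steps are the same as above.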


I am deeply sorry about the inconvenience caused by this. We are still very early with PLR, and it feels like the sooner I make this necessary but breaking change, the better. Doing it later will impact many more users. Now we have a clean slate, and are well set up for future success!


This is a good change and worth doing!
