As Sacha Willems posted: "Shrinking a git(hub) repository isn’t just about deleting locally present files but requires cleaning up the history as files that have been removed are still present in the repository’s history and therefore still contribute to it’s size."
With the GitHub action Branch-Pruner, you can easily reduce the size of a GitHub repository by manually and/or automatically truncating the old commit history of a selected branch. This means that you can delete all commits with previous and unused file versions up to an arbitrarily selected point in your Git history without losing newer commits with newer file versions of the selected branch tree.
Normally YOU SHOULD NEVER DO THIS, and there are huge drawbacks. However, in some cases it is really useful to get rid of the old stuff on a regular basis, e.g., if your repository size is growing continuously and you only ever need the latest commit history, or when Git commands like `push` and `pull` become generally slow. Then it's time for the Branch-Pruner. It will speed you up again 😉.
I, Sitdisch, created the Branch-Pruner because I needed a GitHub action that would periodically auto-crop my repo size, and there was no action out there before. My solution approach is based on this blog post by Thomas Sutton and this blog post by Alin Ruscior. Thanks to both.
The Branch-Pruner rewrites the entire commit history of the branch being pruned. The new history starts with the branch tree of the selected `new-first-commit`. All subsequent commits keep their original order and are still authored by their original authors.
But the drawbacks are:
- in the `new-first-commit`, the files are marked as created
- all commits have new timestamps and commit hashes
- all commits are committed by the selected `User` (default: `github-actions[bot]`)
- all forks and other branches have nothing to compare with the pruned branch anymore
- cuts can't be undone.
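The rewrite and its side effects can be tried out safely in a throwaway repository. The following is a minimal sketch of a common orphan-branch-plus-rebase technique; it is an assumption that the Branch-Pruner works along these lines, and the branch name `pruned-root`, the identities, and the file names are made up for this demo.

```shell
set -e

# Throwaway repository with five commits by an "original" author.
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" symbolic-ref HEAD refs/heads/main
git -C "$repo" config user.name "Original Author"
git -C "$repo" config user.email "author@example.com"
for i in 1 2 3 4 5; do
    echo "version $i" > "$repo/file.txt"
    git -C "$repo" add file.txt
    git -C "$repo" commit -q -m "commit $i"
done

old_head=$(git -C "$repo" rev-parse main)
old_tree=$(git -C "$repo" rev-parse 'main^{tree}')
new_first=$(git -C "$repo" rev-parse main~2)   # the new history should start here

# Switch identity to stand in for the user selected for the pruning run.
git -C "$repo" config user.name "pruner-bot"
git -C "$repo" config user.email "pruner-bot@example.com"

# 1. Create a parentless commit that holds the tree of the new first commit.
git -C "$repo" checkout -q --orphan pruned-root "$new_first"
git -C "$repo" commit -q -m "Truncated history"

# 2. Replay every commit after the new first commit onto that root.
git -C "$repo" rebase -q --onto pruned-root "$new_first" main
git -C "$repo" branch -q -D pruned-root

commit_count=$(git -C "$repo" rev-list --count main)
new_head=$(git -C "$repo" rev-parse main)
new_tree=$(git -C "$repo" rev-parse 'main^{tree}')
tip_author=$(git -C "$repo" log -1 --format=%an main)
tip_committer=$(git -C "$repo" log -1 --format=%cn main)
echo "commits: $commit_count, author: $tip_author, committer: $tip_committer"
```

In this sketch the branch content stays identical while the commit hashes change, the replayed commits keep their author, and the committer becomes the identity that ran the rewrite. Publishing such a rewritten branch to a remote would additionally require a force push.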
Oh, you're still here? Then let's do it.
- add the `branch-pruner.yml` workflow file to a repository
  - the path has to be `.github/workflows/branch-pruner.yml`
  - it doesn't have to be the repository you want to prune; e.g., you can simply fork the `myactionway/branch-pruner-workflow` repository
- create a new encrypted repository secret [procedure]
  - add the secret to the same repository where you added the workflow file
  - give the secret a name, e.g. `BRANCH_PRUNER_TOKEN`
  - the value of the secret must be the value of the personal access token for the repository to be pruned
    - procedure for creating a personal access token
    - select only the minimum scopes and permissions required for your use case, e.g. repo and workflow
- adapt your `branch-pruner.yml` file
  - for manual triggers
    - all you have to do is enter your secret name, e.g. `BRANCH_PRUNER_TOKEN`

          env:
            # Token for all triggers
            TOKEN: ${{ secrets.BRANCH_PRUNER_TOKEN }}

      - CONSIDER: never enter the actual value of the personal access token
    - procedure for manually running a workflow
      - CONSIDER: currently, you can't change the token in the UI
  - for other triggers
    - adapt this section

          ##############################################################
          # DEFINE YOUR TOKEN, INPUTS AND TRIGGERS IN THE FOLLOWING
          ##############################################################

          # TOKEN and INPUTS as environment variables
          env:
            # Token for all triggers
            TOKEN: # e.g. ${{ secrets.BRANCH_PRUNER_TOKEN }}
            #
            # Inputs for not manually triggered workflows
            NEW-FIRST-COMMIT: # e.g. commit-hash or HEAD~N etc.
            REPOSITORY: # target repo e.g. 'dummy/mytargetrepo'
            BRANCH: # branch to be pruned e.g. 'master'
            USER-NAME: # user who should commit e.g. 'dummy'
            USER-EMAIL: # e.g. '[email protected]'

          # TRIGGERS
          on:
            # push:
            # schedule:
            #   - cron: '00 23 28 * *'

      - CONSIDER:
        - token: never enter the actual value of the personal access token
        - inputs:
          - if any input is blank, one of these default values will be used instead

                DEFAULT-REPOSITORY: ${{ github.repository }} # is the repo with this file
                DEFAULT-BRANCH: 'master'
                DEFAULT-USER-NAME: 'github-actions[bot]'
                DEFAULT-USER-EMAIL: '41898282+github-actions[bot]@users.noreply.github.com'

          - choose your `new-first-commit` carefully; e.g., `HEAD~N` is really useful for autonomously truncating a branch on a regular basis so that its new history starts N commits before HEAD. However, know what you are doing: `HEAD~N` or `HEAD^N` may not be the commits you're targeting. For more information about `HEAD~N` and `HEAD^N`, look e.g. here.
        - trigger-schedule:
          - e.g. `cron: '00 23 28 * *'` executes the Branch-Pruner on the 28th day of every month at 23:00 (UTC)
          - you can check your inputs here
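To see why `HEAD~N` and `HEAD^N` can point at different commits, the following sketch builds a tiny history with one merge commit and resolves both notations. The repository, the branch name `feature`, and the helper `commit_file` are hypothetical and exist only for this demo.

```shell
set -e

repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" symbolic-ref HEAD refs/heads/main
git -C "$repo" config user.name "Demo"
git -C "$repo" config user.email "demo@example.com"

# Demo helper: commit one new file named after the commit message.
commit_file() {
    echo "$1" > "$repo/$1.txt"
    git -C "$repo" add "$1.txt"
    git -C "$repo" commit -q -m "$1"
}

commit_file A                                  # main:    A
git -C "$repo" checkout -q -b feature
commit_file B                                  # feature: A - B
git -C "$repo" checkout -q main
commit_file C                                  # main:    A - C
git -C "$repo" merge -q --no-ff -m M feature   # M has parents C (first) and B (second)

# ~N walks N steps back along first parents; ^N picks the Nth parent of one commit.
first_parent=$(git -C "$repo" log -1 --format=%s 'HEAD~1')
second_parent=$(git -C "$repo" log -1 --format=%s 'HEAD^2')
grandparent=$(git -C "$repo" log -1 --format=%s 'HEAD~2')
echo "HEAD~1=$first_parent HEAD^2=$second_parent HEAD~2=$grandparent"
```

Here `HEAD~2` goes two generations back along the first-parent line, while `HEAD^2` selects the second parent of the merge commit, so on a branch with merges the two can name completely different commits.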
- workflow trigger `schedule` doesn't fire
  - in my experience, a workflow file with this trigger must be placed in the default branch
  - in this chat Brightran said: "... The workaround is to push something to trigger them. ..." and Hless said: "... It appears to me that it takes while before schedules actions run at all in a new repo". In my experience, they are right.
- this error `fatal: refusing to merge unrelated histories` occurs when you pull the pruned branch back to your local machine
  - possible solution [source]: `git fetch --all`, then `git reset --hard origin/<PRUNED_BRANCH>` (replace `<PRUNED_BRANCH>` with the name of the pruned branch)
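The fetch-and-reset recovery can be rehearsed locally before touching a real clone. This sketch uses a bare repository as a stand-in for GitHub; all paths, the branch names `main` and `pruned`, and the identity are made up for the demo. Keep in mind that `git reset --hard` discards local commits and uncommitted changes on that branch.

```shell
set -e

work=$(mktemp -d)

# A bare repository plays the role of the remote.
git init -q --bare "$work/remote.git"
git -C "$work/remote.git" symbolic-ref HEAD refs/heads/main

# The "publisher" creates three commits and pushes them.
git init -q "$work/publisher"
git -C "$work/publisher" symbolic-ref HEAD refs/heads/main
git -C "$work/publisher" config user.name "Demo"
git -C "$work/publisher" config user.email "demo@example.com"
for i in 1 2 3; do
    echo "version $i" > "$work/publisher/file.txt"
    git -C "$work/publisher" add file.txt
    git -C "$work/publisher" commit -q -m "commit $i"
done
git -C "$work/publisher" remote add origin "$work/remote.git"
git -C "$work/publisher" push -q origin main

# This clone stands in for "your local machine" with the old history.
git clone -q "$work/remote.git" "$work/local"

# Simulate a pruning run: replace the remote branch with an unrelated history.
git -C "$work/publisher" checkout -q --orphan pruned
git -C "$work/publisher" commit -q -m "Truncated history"
git -C "$work/publisher" push -q --force origin pruned:main

# Recovery on the local machine: fetch, then move the branch to the new history.
git -C "$work/local" fetch -q --all
git -C "$work/local" reset -q --hard origin/main

local_head=$(git -C "$work/local" rev-parse HEAD)
remote_head=$(git -C "$work/remote.git" rev-parse main)
local_count=$(git -C "$work/local" rev-list --count HEAD)
echo "local history now has $local_count commit(s)"
```

After the reset, the local branch points at exactly the commit the remote branch has, and the old local history is no longer part of the branch.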