version control - Managing large binary files with Git

Thursday, 7 March 2019

I am looking for opinions on how to handle large binary files that my source code (a web application) depends on. We are currently discussing several alternatives:

Copy the binary files by hand.
- Pro: Not sure.
- Contra: I am strongly against this, as it increases the likelihood of errors when setting up a new site or migrating the old one. It adds yet another hurdle to clear.

Manage them all with Git.
- Pro: Removes the possibility of 'forgetting' to copy an important file.
- Contra: Bloats the repository and makes the code base less flexible to manage; checkouts, clones, etc. will take quite a while.

Separate repositories.
- Pro: Checking out/cloning the source code is fast as ever, and the images are properly archived in their own repository.
- Contra: Removes the simplicity of having one and only one Git repository for the project. It surely introduces some other problems I haven't thought about.

What are your experiences/thoughts regarding this?

Also: Does anybody have experience with multiple Git repositories and managing them in one project?

The files are images for a program that generates PDFs containing them. They will not change very often (on the order of years), but they are essential to the program: it will not work without them.

Answer

If the program won't work without the files, it seems like splitting them into a separate repo is a bad idea. We have large test suites that we break out into a separate repo, but those are truly "auxiliary" files.

However, you may be able to manage the files in a separate repo and then use git-submodule to pull them into your project in a sane way. So you'd still have the full history of all your source, while the main repository records only the one relevant commit of your images submodule. (As I understand it, the submodule's own clone still carries its history unless you fetch it shallowly.) The git-submodule facility should help you keep the correct version of the code in line with the correct version of the images.
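As a rough illustration of the submodule approach, here is a minimal sketch that builds two throwaway repositories in a temporary directory. The names `images`, `app`, `app-clone`, `assets/images` and `logo.png` are made up for the demo and do not come from the question. The `protocol.file.allow=always` setting is only needed because the demo uses a local path as the submodule URL, which recent Git versions refuse by default; with a normal remote URL you would leave it out.

```shell
#!/bin/sh
# Sketch only: every repository name and path below is hypothetical.
set -eu

work=$(mktemp -d)
cd "$work"

# Commit identity used only for this throwaway demo.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# 1. A separate repository that holds only the binary files.
git init -q images
printf 'fake image bytes' > images/logo.png
git -C images add logo.png
git -C images commit -q -m "Add logo"

# 2. The application repository, which pins one commit of the images repo.
git init -q app
cd app
# The -c override permits a local-path submodule URL (assumption: demo only).
git -c protocol.file.allow=always submodule --quiet add "$work/images" assets/images
git commit -q -m "Track images as a submodule"

# 3. A fresh clone fetches the code plus the pinned images commit.
cd "$work"
git -c protocol.file.allow=always clone -q --recurse-submodules app app-clone
ls app-clone/assets/images
```

When the images do change, the usual flow would be to commit in the images repository, run `git submodule update --remote` in the application repository, and commit the updated submodule pointer there, so each revision of the code names the exact revision of the images it needs.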

Here's a good introduction to submodules from the Git Book.
