Tuesday, January 17, 2017

Git Large File Storage

One of my projects needs some pretty big datasets that GitHub can't handle by itself.  Its error message helpfully suggested I install Git Large File Storage (LFS).

So, I downloaded and installed it.  Then it needs to be initialized.  I tried that first in the git shell from RStudio, but that didn't work (it said it was not a command), so I opened a regular command prompt instead.  There "git lfs install" worked (it said "Git LFS initialized").
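A quick way to see whether a given shell can run LFS at all is to ask for its version before trying to initialize.  This is only a sketch: the variable name lfs_state is my own, and it assumes a POSIX-style shell such as Git Bash rather than the Windows command prompt.

```shell
# Ask git for the LFS version; this fails if the shell cannot find git-lfs.
if git lfs version >/dev/null 2>&1; then
    lfs_state="found"
    git lfs version
else
    lfs_state="missing"
    echo "git lfs is not available in this shell; try a different shell"
fi
echo "git lfs: $lfs_state"
```

If it reports "found", running "git lfs install" in that same shell should work.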

Next, I needed to track some giant .csv files, but I didn't want to track all .csv files.  The main instructions for using LFS call for tracking by file extension.

This part was me noodling around, and it didn't work, but it has some ideas that might work in other situations.  I found a modification for tracking a single file, which would let me track only some files by putting all the big ones in a folder.  I tried following the instructions for uploading a file here, but nothing happened when I tried the first file.  I realized it might be too small, so I put in a bigger file (>100 MB), and it still didn't work.  I tried tracking just a single file, and that didn't do it either: git lfs ls-files and git lfs status still didn't show it.  Changing the pattern to cover all .csv files seemed to modify all of them, which is not what I wanted.

So I went back to the last good version (using git reset --hard SHAnumber).  Unfortunately, reset somehow kept locking up RStudio.  I finally got it to open after I paused my computer's Dropbox syncing (I was on a very slow connection) and opened the project some other way than the recent projects menu.  I'm not sure whether one or both of those helped.  Then the .csv files that had been modified went away when I deleted the .gitattributes file that had the LFS settings in it (not my custom.gitattributes file); apparently track/untrack just writes to .gitattributes.
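To illustrate the .gitattributes point, here is a small sketch in a throwaway repository.  It writes by hand the line that git lfs track "*.csv" normally adds, then uses plain git check-attr to show which paths pick up the LFS filter.  The file names small.csv and myfolder/big.csv are made up for the example, and the files don't need to exist for the check to work.

```shell
# Work in a throwaway repository so nothing real is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q

# The line that `git lfs track "*.csv"` appends to .gitattributes.
echo '*.csv filter=lfs diff=lfs merge=lfs -text' > .gitattributes

# Every .csv path matches, small files included (hypothetical names).
broad_small=$(git check-attr filter -- small.csv)
broad_big=$(git check-attr filter -- myfolder/big.csv)
echo "$broad_small"
echo "$broad_big"

# Deleting .gitattributes drops the LFS setting again.
rm .gitattributes
after_removal=$(git check-attr filter -- small.csv)
echo "$after_removal"
```

The first two checks report the filter as lfs and the last one reports it as unspecified, which matches what I saw: the broad pattern touched every .csv, and removing the file undid it.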

Once I had reset back to where I was before I started messing with everything, I tried again, this time tracking the folder AND ITS CONTENTS.  This does it!

git lfs track "myfolder/**"
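As a rough check of what that pattern covers, the sketch below writes the matching .gitattributes line by hand in a throwaway repository and asks git which paths get the LFS filter.  The paths are hypothetical, and this only tests the pattern; it does not run LFS itself.

```shell
# Work in a throwaway repository so nothing real is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q

# The line that `git lfs track "myfolder/**"` writes to .gitattributes.
echo 'myfolder/** filter=lfs diff=lfs merge=lfs -text' > .gitattributes

# Files inside the folder (at any depth) match; files outside do not.
in_folder=$(git check-attr filter -- myfolder/big.csv)
nested=$(git check-attr filter -- myfolder/sub/other.csv)
outside=$(git check-attr filter -- small.csv)
echo "$in_folder"
echo "$nested"
echo "$outside"
```

Only the paths under myfolder report the lfs filter, so .csv files elsewhere in the project stay as ordinary git files.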

However, it won't show anything staged in regular git status or git lfs ls-files until it is committed.  To see what is staged before that, you have to use git lfs status.
