A few months ago, I wrote about Microsoft’s move to git and the scaling problems they were having to overcome. Now Brian Harry has posted an update on their work. They have what they claim is the largest git repository on the planet. At 3.5 million files and 300 GB, there’s no reason to doubt that description.
When Harry first wrote about the move back in February, they had a working system but hadn’t yet moved an appreciable number of engineers over to it. As a result, they didn’t know how well it would scale when thousands of Windows engineers started using it. They now have about 3,500 of the roughly 4,000 Windows engineers switched over and have collected some useful performance statistics. They also discovered that some of their assumptions about how to speed things up were a bit off, and they have since corrected the resulting problems.
I’ll let you read Harry’s post (or this Ars Technica article) to see where they stand now. The TL;DR is that the speeds are similar to those of their old system, but they are still working to improve them. One huge improvement is, as you’d expect from git, in branching. That went from an ordeal almost too painful to endure to a simple and easy process. The other news is that Microsoft is making its work available to the public and is moving to an open development model where users can contribute if they wish.
Give Harry’s new post a read. You almost certainly don’t have the problems with your repository that Microsoft does with theirs, but it’s interesting nonetheless and shows what can be done to use git with large code bases.