Learn about your project from git history

Marcin Piczkowski
Dec 4, 2018 · 3 min read

When I enter a new project, apart from learning how to build and run it, I also like to check what to pay attention to.
Things like:

You can find some of this information using static code analysis tools (e.g. for Java: FindBugs, PMD, Checkstyle, or the IntelliJ IDEA plugin SonarLint).

All of these analyse the current state of the source code, but it turns out that you can also get a lot of valuable information from the history of the project.

You could get information like:

There are already tools that can do this (check the references section below), but it is relatively simple to do such an analysis on your own.

Here is a simple application I wrote in Java which, given a git repository path, prints the 10 most frequently committed files together with the number of commits that touched each of them.

I used the JGit library, which is an implementation of Git in Java.

It works as follows. First, I get the list of commits for the repository.

// With no start point given, git.log() walks the history reachable from HEAD.
Stream<RevCommit> getCommits(Git git) throws GitAPIException {
    return StreamSupport.stream(git.log().call().spliterator(), false);
}

Then for each commit I reference the revision tree which holds information about all the files in this revision.

Stream<ObjectId> getRevTrees(Stream<RevCommit> commitsStream) {
    return commitsStream.map(rev -> rev.getTree().getId());
}

I compare each such tree with the tree from the previous commit and get a diff.

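// Both ids are tree ids (RevCommit.getTree().getId()), not commit ids.
List<DiffEntry> diff(Git git, ObjectId newTree, ObjectId oldTree) throws IOException {
    // The formatter is only used for scanning, so whatever it writes is discarded.
    try (DiffFormatter df = new DiffFormatter(new ByteArrayOutputStream())) {
        df.setRepository(git.getRepository());
        // scan(a, b) expects the old side first and the new side second.
        return df.scan(oldTree, newTree);
    }
}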
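
The pairing step ("compare it with a tree from previous commit") is not shown in the snippets above, so here is a minimal sketch of how it could look. It uses only the JDK, with plain strings standing in for tree ids. It assumes the revisions arrive newest first, which is the default order of `git log`, and it treats the history as linear. The names `ConsecutivePairs` and `pairWithPrevious` are hypothetical and not part of the original application.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ConsecutivePairs {

    // Expects revisions ordered newest first and pairs every revision with the
    // one listed right after it (its predecessor in a linear history).
    // In each returned entry the key is the older revision, the value the newer one.
    static <T> List<Map.Entry<T, T>> pairWithPrevious(List<T> newestFirst) {
        return IntStream.range(0, newestFirst.size() - 1)
                .mapToObj(i -> Map.entry(newestFirst.get(i + 1), newestFirst.get(i)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical ids; in the real application these would be ObjectId values.
        List<String> trees = List.of("tree-c", "tree-b", "tree-a");
        pairWithPrevious(trees).forEach(pair ->
                System.out.println("diff " + pair.getKey() + " -> " + pair.getValue()));
    }
}
```

The oldest revision has no predecessor, so this sketch produces no pair for it, and files added in the very first commit would not be counted.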

I collect the changed files from all the commits, then group them by file path and count the occurrences.
Finally, I sort the paths by how often they occur, most frequent first.
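
The grouping, counting and sorting can be sketched with the standard Java streams API. This is an illustration under assumptions, not the original application code: the input is a plain list with one path per (commit, changed file) pair, and the names `ChangeFrequency` and `topChangedPaths` are hypothetical.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class ChangeFrequency {

    // Counts how often each path occurs and keeps the `limit` most frequent ones,
    // ordered from most to least frequent (ties are broken alphabetically by path).
    static Map<String, Long> topChangedPaths(List<String> changedPaths, int limit) {
        Map<String, Long> counts = changedPaths.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder())
                        .thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .limit(limit)
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        Map.Entry::getValue,
                        (first, second) -> first,
                        LinkedHashMap::new)); // LinkedHashMap keeps the sorted order
    }

    public static void main(String[] args) {
        // Hypothetical input: one entry per (commit, changed file) pair.
        List<String> changedPaths = List.of(
                "build.gradle", "src/Main.java", "build.gradle",
                "README.md", "src/Main.java", "build.gradle");

        topChangedPaths(changedPaths, 10)
                .forEach((path, count) -> System.out.println(path + " - " + count));
    }
}
```

With JGit, the input list could be built from the diff entries of each commit, for example from `DiffEntry.getNewPath()`.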

For example, when you run the application on the master branch of the popular spring-framework repository, you get results like these:

build.gradle — 1374
src/asciidoc/index.adoc — 444
spring-core/src/main/java/org/springframework/core/annotation/AnnotationUtils.java — 375
spring-beans/src/main/java/org/springframework/beans/factory/support/DefaultListableBeanFactory.java — 374
spring-context/src/main/java/org/springframework/context/annotation/ConfigurationClassParser.java — 362
spring-webmvc/src/main/java/org/springframework/web/servlet/config/annotation/WebMvcConfigurationSupport.java — 332
spring-webmvc/src/main/java/org/springframework/web/servlet/mvc/method/annotation/RequestMappingHandlerAdapter.java — 323
spring-web/src/main/java/org/springframework/http/HttpHeaders.java — 323
spring-web/src/main/java/org/springframework/web/client/RestTemplate.java — 320
spring-beans/src/main/java/org/springframework/beans/factory/support/AbstractAutowireCapableBeanFactory.java — 320
...

Based on that, you can work out which files to look at first. Then you would probably want to check where they are used, to drill down deeper into the project.

If this post was interesting, you may also like the following references:
