What if merge tools were APIs?

Imagine you could reduce up to 30% of your manual merges just by making a single REST API call. Wouldn’t it be awesome?

You could invoke this API as a step of your Continuous Integration plan to maximize the branches or pull requests that would be merged without human intervention. It would save precious time in the form of reduced context switches.

You could also use it to develop new merge front-ends in tools like Visual Studio Code, without having to deal with coding an entire merge engine.

Of course, the merge engine wouldn’t be just a wrapper around the regular merge code, or around tools like KDiff3, WinMerge or Meld. No. This merge engine would be much cleverer. It would actually understand the programming language, so when you move a method, function or class around, it would know what to do.

Well, that’s what the upcoming merg.io is about. A REST API around our “semantic merge” engine. It is basically turning the most advanced merge tech into an accessible API so programmers can build great solutions around it.

Do you think it makes sense? We are collecting your support and interest to allocate the resources required to develop it. Please subscribe to the waiting list of merg.io. We will keep you posted with news of the product development, features and more.

What is the semantic merge technology?

Well, think for a second how merges should be. Do you think they should still work on plain text or should they recognize the programming language?

Consider the following merge for a second:

You simply want to merge branch bug3001 back to master. Ok, you run the merge and then one file is in conflict as follows:

Look carefully:

· Originally (base) the file had 3 methods: Sum, Multiply and Sqr in this order.

· One developer (in branch bug3001) simply modified the Sum method. I’m not showing the actual modification to keep things simpler, but you get the point.

· Meanwhile, changes happened on master: the Sum method was moved down AND changed.

How is this merge supposed to be solved?

What a regular 3-way merge tool would do is something as follows:

Since it works on a line-by-line basis, it could find a conflict comparing Sum with Multiply, which obviously doesn’t make any sense.

Now, let’s see what a semantic merge can do:

As you can see, semantic merge detects that the method was moved and changed by one contributor, while modified by the other.

It knows it has to 3-way-merge the contents of the Sum() method and place the merged result in the last position of the file because it was moved.

It is pretty obvious when you think about it. It just means parsing the code first, then figuring out what happened to it, then calculating the merge based on that extra info.
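To make the idea concrete, here is a minimal toy sketch of those three steps. It is not the actual semantic merge engine: it assumes the code has already been parsed into a mapping from method name to body lines, that all three versions contain the same methods, and that a method edited on both sides keeps the same number of lines. The names pick, merge_body and semantic_merge, and the method bodies, are made up for illustration.

```python
from typing import Dict, List, Optional

# A parsed file: method name -> body lines, in the order they appear in the file.
Version = Dict[str, List[str]]


def pick(base, ours, theirs):
    """Classic three-way rule for one value; None means a real conflict."""
    if ours == theirs:
        return ours
    if ours == base:
        return theirs
    if theirs == base:
        return ours
    return None


def merge_body(base: List[str], ours: List[str], theirs: List[str]) -> Optional[List[str]]:
    """Merge one method body, independently of where the method lives."""
    whole = pick(base, ours, theirs)
    if whole is not None:
        return whole
    # Both sides edited the method: try a per-line three-way merge.
    # Toy restriction: only handled when the three bodies have the same length.
    if not (len(base) == len(ours) == len(theirs)):
        return None
    lines = [pick(b, o, t) for b, o, t in zip(base, ours, theirs)]
    return None if None in lines else lines


def semantic_merge(base: Version, ours: Version, theirs: Version) -> Optional[Version]:
    """Merge method order and method contents separately; None on conflict."""
    if not (set(base) == set(ours) == set(theirs)):
        raise ValueError("toy sketch: added or deleted methods are not handled")
    # The position of each method is merged on its own, so a move is not a conflict.
    order = pick(list(base), list(ours), list(theirs))
    if order is None:
        return None
    merged: Version = {}
    for name in order:
        body = merge_body(base[name], ours[name], theirs[name])
        if body is None:
            return None
        merged[name] = body
    return merged


base = {"Sum": ["int r = a + b;", "return r;"],
        "Multiply": ["return a * b;"],
        "Sqr": ["return a * a;"]}
# bug3001 only modified Sum.
bug3001 = {"Sum": ["long r = a + b;", "return r;"],
           "Multiply": ["return a * b;"],
           "Sqr": ["return a * a;"]}
# master moved Sum to the end AND changed it.
master = {"Multiply": ["return a * b;"],
          "Sqr": ["return a * a;"],
          "Sum": ["int r = a + b;", "return Check(r);"]}

merged = semantic_merge(base, master, bug3001)
print(list(merged))    # method order of the merged file
print(merged["Sum"])   # merged body of Sum
```

Because the order of the methods and their contents are merged separately, the move made on master and the edit made on bug3001 never get compared against each other, which is exactly what trips up a line-based tool.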

Why merg.io reduces manual intervention

We ran what we call a “merge replay” on more than 10,000 GitHub repositories. A replay basically consists of cloning a repo, finding all the merges and repeating them.

Look at the following diagram:

If we want to repeat the merge between A and B to create C, all we do is create a branch on A and try to merge again from B. It would create something as follows:

This new C’ (which we don’t need to actually commit, just create it to calculate the merge) is where we compare the result of semantic merge with a traditional text-based merge. If the text merge tool reports conflicts that must be solved manually while semantic merges everything automatically, then we know we can avoid a manual intervention, saving precious time.
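The text-based half of a replay can be sketched with plain Git commands. This is not the tool we built, just a rough illustration under a few assumptions: it runs on a throwaway clone (it checks out commits and discards local changes), it skips merges with more than two parents, and it uses a detached HEAD on A instead of a real branch. The function names are made up for the example; the semantic half would repeat the same merges with a semantic merge tool and is not shown.

```python
import subprocess
from typing import List, Tuple


def parse_merges(rev_list_output: str) -> List[Tuple[str, str, str]]:
    """Parse `git rev-list --merges --parents` output into (C, A, B) triples."""
    merges = []
    for line in rev_list_output.splitlines():
        fields = line.split()
        if len(fields) == 3:  # merge commit plus exactly two parents
            merges.append((fields[0], fields[1], fields[2]))
    return merges


def git(repo: str, *args: str) -> subprocess.CompletedProcess:
    """Run a git command inside `repo` and capture its output."""
    return subprocess.run(["git", "-C", repo, *args],
                          capture_output=True, text=True)


def replay_text_merges(repo: str) -> List[Tuple[str, bool]]:
    """Repeat every two-parent merge; True means Git merged it automatically."""
    listing = git(repo, "rev-list", "--merges", "--parents", "HEAD").stdout
    results = []
    for merge, parent_a, parent_b in parse_merges(listing):
        # Stand on A, then try to merge from B without committing (this is C').
        git(repo, "checkout", "--detach", "--force", parent_a)
        attempt = git(repo, "merge", "--no-commit", "--no-ff", parent_b)
        results.append((merge, attempt.returncode == 0))
        # Throw C' away so the next replay starts clean.
        git(repo, "merge", "--abort")
    return results
```

Calling replay_text_merges on a local clone returns one entry per replayed merge; the merges marked False are the ones where Git's text merge stopped and asked for help.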

Well, what we found out is that:

· We can avoid about 20–30% of the manual interventions when merging full branches (or pull requests). Here a branch only counts if every file can be merged automatically: if a branch has 3 files to merge, and one can’t be fully automated, the branch doesn’t count, even if the other 2 files were automatically merged by semantic.

· More than 40% of manual conflicts can be avoided on individual files. Here, if 2 out of 3 files can be automated, they count toward the final results, so the number is better.

Now, what I mean is that these numbers are relative to a standard 3-way text-based merge tool, or to Git’s built-in 3-way merge resolution strategy: semantic avoids 20–30% (or 40%) of the manual work those tools leave behind.
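The difference between the two ways of counting is easier to see with a small worked example. The sketch below reflects my reading of the two rules above, not the code of our replay tool: each replayed merge is a mapping from file name to a pair (text merge was automatic, semantic merge was automatic), and the data is invented for illustration.

```python
from typing import Dict, List, Tuple

# One replayed merge: file name -> (text_ok, semantic_ok)
Replay = Dict[str, Tuple[bool, bool]]


def branch_level_reduction(replays: List[Replay]) -> float:
    """Share of manual branch merges that semantic makes fully automatic."""
    manual = [r for r in replays if not all(text for text, _ in r.values())]
    fixed = [r for r in manual if all(sem for _, sem in r.values())]
    return len(fixed) / len(manual) if manual else 0.0


def file_level_reduction(replays: List[Replay]) -> float:
    """Share of manually merged files that semantic merges automatically."""
    files = [pair for r in replays for pair in r.values()]
    manual = [pair for pair in files if not pair[0]]
    fixed = [pair for pair in manual if pair[1]]
    return len(fixed) / len(manual) if manual else 0.0


replays = [
    # 3 conflicting files, semantic automates 2: the branch does not count,
    # but the 2 files do.
    {"a.cs": (False, True), "b.cs": (False, True), "c.cs": (False, False)},
    # 1 conflicting file, automated by semantic: the whole branch counts.
    {"d.cs": (False, True)},
    # Text merge was already automatic: nothing to reduce.
    {"e.cs": (True, True)},
]

print(branch_level_reduction(replays))  # 1 of 2 manual branches
print(file_level_reduction(replays))    # 3 of 4 manual files
```

With this made-up data one of the two manual branches becomes fully automatic, while three of the four conflicting files do, which is why the per-file number always looks at least as good as the per-branch one.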

We built a tool to do the replays, and here you can find the results of a very well-known repo: https://gmaster.io/mergedroid/analyze/report/gmasterscm/gitextensions

Or you can even check the Git repo itself: https://gmaster.io/mergedroid/analyze/report/gmasterscm/git

Why an API?

Well, we are tool developers. We develop our own full version control stack: www.plasticscm.com, and also a standalone 3-way merge tool you can buy and use today with Git: www.semanticmerge.com, and a full Git GUI for Windows capable of doing cross-file semantic merges and diffs: www.gmaster.io.

But we think an API could be extremely useful for developers because it is all about building on top. While SemanticMerge is a GUI for Windows (there are plans for Mac, but just Windows so far) and you can definitely use it to automate some merges on the server side, building on top of it is never as simple as making a REST API call. First you have to install some software, which makes things more complicated in cloud environments than just using an API.

Our intention is to unleash the power of the semantic merge core so that teams can implement cleverer pull request integration heuristics, doing dry-runs to find conflicts earlier, reducing context switches when possible and so on.

It won’t be hard to customize a Jenkins build step (or Bamboo, TeamCity) and merge the branches using merg.io instead of the default algorithm.

And, it would be awesome to see diff plugins on top of Visual Studio Code (not hard just reusing Monaco) or IntelliJ.

Sign up!

We published a teaser page http://merg.io to explain what the technology is about and figure out if you are interested in it. If you are, please enter your email there so we can get in touch when we have more news. Whether we go forward and build it or simply realize it doesn’t make sense will depend on the number of people who leave their emails.