Published in Swish Labs

Machine Learning for Contracts Analysis — Put Your Human Mind Where It Really Matters.

Reviewing documents has historically been a manual undertaking. It usually means tediously reading large amounts of text just to find a few specific pieces of information.

In many cases, a summary that extracts only the information we need would save a tremendous amount of time.

Lawyers, real estate brokers, HR specialists, hospital staff, financial brokers, and many others share a common goal: to gain visibility into their documents.

In this post, we focus on how AI can help businesses gain contract visibility.

Humans and contracts.

Contracts are a powerful way of creating trust between two or more parties. They ensure clauses are respected, payments are delivered, work is done, and so on.

Having a contract is helpful (and, very often, necessary), but the interaction we humans have with these multi-page documents is not an easy one.

Scanning a contract for abusive clauses, searching for specific information like a price or a date, and amending information across a set of contracts are all highly time-consuming tasks. And time is money.

Words, words, words, how many can a human brain digest, and how fast?

Law firms, government agencies, HR managers, real estate brokers, and many others face the challenge of monitoring a huge number of contracts. These contracts often lack consistency and are difficult to manage, because most organizations don’t store the information in their contracts in a database, let alone have an efficient way to extract it.

For example, law firms must inform their clients when a contract’s expiry date is approaching or when legislative amendments affect a contract. Law enforcement agencies may need to pay particular attention to contracts involving specific parties and large payments. Contractors need to keep track of agreed payments and deliverables.

Many of these tasks can be automated by extracting specific contract elements (e.g. termination dates, agreed-upon payments, legislation references, contracting parties). However, extracting elements from contracts is currently a mostly manual process, which is tedious and costly.
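To make the expiry-date example concrete: once termination dates exist as structured data, the reminder becomes a few lines of code. The sketch below is only an illustration. The records, the field names, and the 90-day window are all made up for the example.

```python
from datetime import date, timedelta

# Hypothetical records, as they might look once contract elements have been extracted.
contracts = [
    {"title": "Office Lease", "termination_date": date(2024, 3, 15)},
    {"title": "Supply Agreement", "termination_date": date(2025, 1, 15)},
    {"title": "Consulting Agreement", "termination_date": date(2024, 2, 10)},
]

def expiring_soon(contracts, today, window_days=90):
    """Return the titles of contracts that terminate within `window_days` of `today`."""
    cutoff = today + timedelta(days=window_days)
    return [
        c["title"]
        for c in contracts
        if today <= c["termination_date"] <= cutoff
    ]

print(expiring_soon(contracts, today=date(2024, 1, 1)))
# ['Office Lease', 'Consulting Agreement']
```

The hard part is not this check but getting reliable termination dates out of free text in the first place, which is where machine learning comes in.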

Let the robot help.

In recent years, the use of machine learning (ML) for natural language processing (NLP) has seen great success in performing tasks previously only done by humans.

Most notably, the use of deep learning has surged in the field, allowing models to learn from large amounts of data.

As a result, ML extraction software can learn from a business’s data to quickly uncover valuable insights. Being able to get a quick picture of a contract can increase the productivity and efficiency of contracting for many businesses.

For example, these businesses will be able to extract contract data much more quickly than would otherwise be possible with a team of lawyers, allowing them to review contracts more rapidly.

Companies will also be able to find large amounts of contract data with greater ease, which lets them increase the number of contracts they can negotiate and execute.

Now the question arises: How do we build ML models for contract element extraction?

How does it work?


The first step is data.

Machine learning works by feeding a model (an algorithm) enough data to train it on the task we are building it to perform. So, if we want a model to extract information from contracts, we have to train it on large sets of data: contracts.

Learn more about ML model training here.

We need to provide the model with examples of contracts that are annotated with contract elements of interest for training.

Figure 1: Typical structure of a contract, with possible contract elements of interest, highlighted

The core dataset must contain contracts annotated with clause headings (Fig. 1, points 4) so that our model can learn to identify them. Similarly, we require annotations of contract elements, which can include, for example:

  • Contract Title (Fig. 1, point 1)
  • Contracting Parties (Fig. 1, points 3)
  • Start (Fig. 1, point 2), Effective (point 5), Termination (point 8) Dates, Contract Period (point 7), Value (point 9)
  • Governing Law (Fig. 1, point 10), Jurisdiction (point 6), Legislation Refs (point 6)
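One way to picture such annotations is as labelled character spans over the contract text, which are then turned into one label per token for training. The sketch below is an assumption about the data format, not a description of any particular dataset. The sentence, the label names (`START_DATE`, `PARTY`), and the helper functions are hypothetical.

```python
import re

text = "This Agreement is made on 1 May 2020 between Acme Ltd and Beta Inc."

def span(label, phrase):
    """Build a character-span annotation for the first occurrence of `phrase`."""
    start = text.index(phrase)
    return {"label": label, "start": start, "end": start + len(phrase)}

# Hypothetical annotations, as a human annotator might mark them.
annotations = [
    span("START_DATE", "1 May 2020"),
    span("PARTY", "Acme Ltd"),
    span("PARTY", "Beta Inc"),
]

def token_labels(text, annotations, element_type):
    """Label each whitespace token 1 if it overlaps a span of `element_type`, else 0."""
    spans = [(a["start"], a["end"]) for a in annotations if a["label"] == element_type]
    labelled = []
    for match in re.finditer(r"\S+", text):
        inside = any(match.start() < end and match.end() > start for start, end in spans)
        labelled.append((match.group(), int(inside)))
    return labelled

print(token_labels(text, annotations, "PARTY"))
```

Calling `token_labels` once per element type yields a separate binary labelling for each type, which fits a setup where each element type gets its own extractor.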

Algorithm (Warning: Technical Disclaimer)

After we have gathered and processed our data, we can proceed to develop the model.

Several algorithms have been compared by the authors of this paper, with the conclusion that the best overall method is using deep learning.

One approach to building the machine learning model is to use a bidirectional LSTM (BILSTM) operating on word, part-of-speech (POS) tag, and token-shape embeddings.
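For readers unfamiliar with token shapes: a token's shape is a coarse description of its characters, which helps a model notice that "31/12/2020" looks like a date even if it has never seen that exact string. The mapping below (uppercase to `X`, lowercase to `x`, digit to `d`, with repeated symbols collapsed) is one common convention and an assumption on our part. The exact scheme used in a given system may differ.

```python
import re

def token_shape(token):
    """Map a token to a coarse shape: uppercase -> X, lowercase -> x, digit -> d."""
    shape = []
    for ch in token:
        if ch.isupper():
            shape.append("X")
        elif ch.islower():
            shape.append("x")
        elif ch.isdigit():
            shape.append("d")
        else:
            shape.append(ch)  # keep punctuation and symbols as they are
    # Collapse runs of the same symbol so "Agreement" and "Party" share the shape "Xx".
    return re.sub(r"(.)\1+", r"\1", "".join(shape))

for token in ["Agreement", "LLC", "31/12/2020", "$5,000"]:
    print(token, "->", token_shape(token))
```

Each distinct shape then gets its own embedding, just like each word and each POS tag.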

In common terms, word embeddings are vectors of numbers representing particular words, such that words appearing in similar contexts are close together in the vector space.
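A toy example may help with the idea of "closeness". The three-dimensional vectors below are invented for illustration. Real embeddings are learned from large amounts of text and typically have dozens to hundreds of dimensions. Cosine similarity is one standard way to measure how close two vectors are.

```python
import math

# Made-up 3-dimensional embeddings, for illustration only.
embeddings = {
    "lessor": [0.9, 0.1, 0.2],
    "landlord": [0.8, 0.2, 0.1],
    "invoice": [0.1, 0.9, 0.7],
}

def cosine(u, v):
    """Cosine similarity: near 1.0 for vectors pointing the same way, lower otherwise."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

similar = cosine(embeddings["lessor"], embeddings["landlord"])
different = cosine(embeddings["lessor"], embeddings["invoice"])
print(f"lessor vs landlord: {similar:.2f}")
print(f"lessor vs invoice:  {different:.2f}")
```

With these toy values, "lessor" scores much closer to "landlord" than to "invoice", which is the behaviour we hope learned embeddings show for words used in similar contexts.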

To further improve results, we can stack an additional LSTM on top of the BILSTM (BILSTM-LSTM).

We build a separate extractor for each contract element type (e.g. contracting parties).

Figure 3: BILSTM-(LSTM)-LR extractor for a particular contract element type
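For the technically curious, here is a minimal sketch of what such an extractor could look like in PyTorch. It is not the implementation from the paper. The layer sizes, vocabulary sizes, and the class name `ElementExtractor` are placeholders, and we read the "LR" in the figure caption as a logistic-regression output layer, which is an assumption. The model is untrained, so its outputs are meaningless until it is fitted on annotated contracts.

```python
import torch
import torch.nn as nn

class ElementExtractor(nn.Module):
    """BILSTM-LSTM-LR sketch: one probability per token for a single element type."""

    def __init__(self, n_words, n_pos, n_shapes, word_dim=50, pos_dim=16, shape_dim=8, hidden=64):
        super().__init__()
        # One embedding table per input feature (all sizes are placeholders).
        self.word_emb = nn.Embedding(n_words, word_dim)
        self.pos_emb = nn.Embedding(n_pos, pos_dim)
        self.shape_emb = nn.Embedding(n_shapes, shape_dim)
        # Bidirectional LSTM reads the token sequence in both directions.
        self.bilstm = nn.LSTM(word_dim + pos_dim + shape_dim, hidden,
                              batch_first=True, bidirectional=True)
        # Additional LSTM stacked on top of the BILSTM outputs.
        self.lstm = nn.LSTM(2 * hidden, hidden, batch_first=True)
        # Logistic-regression layer: a linear map followed by a sigmoid.
        self.lr = nn.Linear(hidden, 1)

    def forward(self, words, pos, shapes):
        x = torch.cat([self.word_emb(words), self.pos_emb(pos), self.shape_emb(shapes)], dim=-1)
        h, _ = self.bilstm(x)
        h, _ = self.lstm(h)
        return torch.sigmoid(self.lr(h)).squeeze(-1)

# Random token ids standing in for 2 contracts of 12 tokens each.
model = ElementExtractor(n_words=1000, n_pos=20, n_shapes=10)
words = torch.randint(0, 1000, (2, 12))
pos = torch.randint(0, 20, (2, 12))
shapes = torch.randint(0, 10, (2, 12))
probs = model(words, pos, shapes)
print(probs.shape)  # torch.Size([2, 12])
```

Each value in `probs` is the model's estimate that the corresponding token belongs to the element type this extractor is responsible for. Training would minimize a binary cross-entropy loss against the per-token annotations.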

In a deployed system, we would apply the extractor for each corresponding contract element type separately.

As such, each extractor can focus on identifying a single element type, which should improve the system’s accuracy.
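The "one extractor per element type" design can be sketched as a simple registry that runs every extractor over the same contract and collects the results. To keep the example short and runnable, the extractors here are simple regular expressions standing in for trained models. They are purely illustrative and far less robust than the neural extractors described above. All names are hypothetical.

```python
import re

# Stand-in extractors: regexes in place of trained models (illustration only).
def extract_dates(text):
    return re.findall(r"\b\d{1,2} [A-Z][a-z]+ \d{4}\b", text)

def extract_parties(text):
    return re.findall(r"\b[A-Z][a-z]+ (?:Ltd|Inc|LLC)\b", text)

# One extractor per contract element type.
EXTRACTORS = {
    "dates": extract_dates,
    "contracting_parties": extract_parties,
}

def analyze(text):
    """Apply each element-type extractor separately and gather the results."""
    return {element: extractor(text) for element, extractor in EXTRACTORS.items()}

sample = "Acme Ltd and Beta Inc agree that this contract terminates on 31 December 2025."
print(analyze(sample))
# {'dates': ['31 December 2025'], 'contracting_parties': ['Acme Ltd', 'Beta Inc']}
```

A practical benefit of this layout is that a new element type can be added, or an existing extractor retrained, without touching the others.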

Business Tool of the Future, Today.

Machine learning contracting software has the ability to completely revolutionize the way businesses interact with their contracts.

Take, for example, JP Morgan’s AI program COIN (Contract Intelligence), which reportedly completes in seconds work that took their lawyers 360,000 hours.

By better managing the bottleneck that is contract management, JP Morgan greatly increased their contracting efficiency and productivity.

Better contract management increases efficiency and productivity and saves costs. An estimate shared by the Harvard Business Review showed that “inefficient contracting causes firms to lose between 5% to 40% of value on a given deal.”

This idea has sparked the interest of machine learning researchers and startups around the world. Organizations see more and more value in investing both time and capital in this domain.

In fact, the startup Kira Systems raised a $50 million Series A for their AI-powered automated contract review software.

Figure 4: Kira automated contract extraction and analysis tool

As awareness regarding the benefits of AI for contracts grows, so does the need for businesses to take advantage of this technology to improve their workflow.

The use of ML and AI in contract management holds the potential to transform large parts of industries such as insurance, finance, investment, real estate, and the corporate world. Really, the technology can be applied to any business or organization that uses contracts, and it can be applied right away: the state of the technology today is advanced enough for companies and professionals to harness the benefits of such a resource-freeing tool.

Swish can create such a tool and support you in implementing contract analyzers in your business. Let’s chat.

This story was brought to you by Noel Vouitsis from the Machine Learning team at Swish.

References Papers:


