Legal NLP Releases New Multichoice and Question-Answering Models for Multistate Professional Responsibility Examination Questions, and More

David Cecchini
John Snow Labs
7 min read · Sep 10, 2023

Version 1.18.0 expands the capabilities of the library with two new models, demo apps and bug fixes

Flan T5 fine-tuned for Multichoice on MPRE

The new model is a fine-tuned version of Flan T5 that extends its reasoning to the legal domain. Passing the Multistate Professional Responsibility Examination (MPRE) is required by most US states as a prerequisite or co-requisite for admission to the bar as an attorney at law.

To create the pipeline, simply use the TextGenerator from Legal NLP:

from johnsnowlabs import nlp, legal


spark = nlp.start()

document_assembler = (
    nlp.DocumentAssembler()
    .setInputCol("question")
    .setOutputCol("document_question")
)

# The generator reads the document annotation produced by the assembler
leg_gen = (
    legal.TextGenerator.pretrained(
        "leggen_flant5_mpre", "en", "legal/models"
    )
    .setInputCols(["document_question"])
    .setOutputCol("generated_text")
    .setMaxNewTokens(150)
    .setStopAtEos(True)
)

pipeline = nlp.Pipeline(stages=[document_assembler, leg_gen])

Then, to use the model, pass a question in the MPRE format: a short context followed by the choices the model selects from. For example:

Conglomerate Corporation owns a little more than half the stock of Giant Company. Conglomerate’s stock, in turn, is public, available on the public stock exchange, as is the remainder of the stock in Giant Company. The president of Conglomerate Corporation has asked Attorney Stevenson to represent Giant Company in a deal by which Giant would make a proposed transfer of certain real property to Conglomerate Corporation. The property in question is unusual because it contains an underground particle collider used for scientific research, but also valuable farmland on the surface, as well as some valuable mineral rights in another part of the parcel. These factors make the property value difficult to assess by reference to the general real-estate market, which means it is difficult for anyone to determine the fairness of the transfer price in the proposed deal. Would it be proper for Attorney Stevenson to facilitate this property transfer at the behest of the president of Conglomerate, if Attorney Stevenson would be representing Giant as the client in this specific matter?

Yes, because Conglomerate Corporation owns more than half of Giant Company, so the two corporate entities are one client for purposes of the rules regarding conflicts of interest.

Yes, because the virtual impossibility of obtaining an appraisal of the fair market value of the property means that the lawyer does not have actual knowledge that the deal is unfair to either party.

No, because the attorney would be unable to inform either client fully about whether the proposed transfer price would be in their best interest.

No, not unless the attorney first obtains effective informed consent of the management of Giant Company, as well as that of Conglomerate, because the ownership of Conglomerate and Giant is not identical, and their interests materially differ in the proposed transaction.
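The example in this post packs the question stem and the answer choices into a single string that begins with a `question:` line. A minimal sketch of a helper that assembles such a string follows. `build_mpre_prompt` is a hypothetical name, not part of Legal NLP, the layout is an assumption inferred from the example, and the stem and choices in the usage lines are short placeholders.

```python
def build_mpre_prompt(stem, choices):
    """Join a question stem and its answer choices into one prompt string.

    Hypothetical helper: the layout (a "question:" line, then the stem,
    then the choices separated by spaces) mirrors the example in this post
    and is an assumption about what the model expects.
    """
    if not choices:
        raise ValueError("at least one answer choice is required")
    # Collapse internal whitespace so line wrapping does not leak into the prompt.
    parts = [" ".join(text.split()) for text in [stem, *choices]]
    return "question:\n" + " ".join(parts)


# Placeholder stem and choices, for illustration only.
prompt = build_mpre_prompt(
    "Would it be proper for the attorney to proceed?",
    ["Yes, because the client consented.", "No, because a conflict exists."],
)
print(prompt)
```

The resulting string can be placed in the `question` column of the data frame in the same way as the hand-written example.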

Load the data into a Spark DataFrame and run the pipeline to obtain the answer:

data = spark.createDataFrame(
    [
        [
            """question:
Conglomerate Corporation owns a little more than half the stock of Giant
Company. Conglomerate’s stock, in turn, is public, available on the public
stock exchange, as is the remainder of the stock in Giant Company.
The president of Conglomerate Corporation has asked Attorney Stevenson to
represent Giant Company in a deal by which Giant would make a proposed
transfer of certain real property to Conglomerate Corporation.
The property in question is unusual because it contains an underground
particle collider used for scientific research, but also valuable farmland
on the surface, as well as some valuable mineral rights in another part of
the parcel. These factors make the property value difficult to assess by
reference to the general real-estate market, which means it is difficult for
anyone to determine the fairness of the transfer price in the proposed deal.
Would it be proper for Attorney Stevenson to facilitate this property
transfer at the behest of the president of Conglomerate, if Attorney
Stevenson would be representing Giant as the client in this specific matter?
Yes, because Conglomerate Corporation owns more than half of Giant Company,
so the two corporate entities are one client for purposes of the rules
regarding conflicts of interest. Yes, because the virtual impossibility
of obtaining an appraisal of the fair market value of the property means
that the lawyer does not have actual knowledge that the deal is unfair to
either party. No, because the attorney would be unable to inform either
client fully about whether the proposed transfer price would be in their
best interest. No, not unless the attorney first obtains effective informed
consent of the management of Giant Company, as well as that of Conglomerate,
because the ownership of Conglomerate and Giant is not identical, and their
interests materially differ in the proposed transaction.""",
        ]
    ]
).toDF("question")

results = pipeline.fit(data).transform(data)

# The output column was named "generated_text" when the generator was configured
results.select("generated_text.result").show(truncate=False)

In this example, the model returns:

Not if the attorney first obtainses efficient informed consent of the administration of Giants, as well and of Conglomerate
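The generated answer above is a paraphrase rather than a verbatim copy of one of the choices, so a post-processing step can be useful for mapping it back to a choice. The sketch below does this with plain Jaccard word overlap. `closest_choice` is a hypothetical helper and not part of Legal NLP, and word overlap is only one simple heuristic among many.

```python
import re


def _tokens(text):
    """Return the set of lowercase alphabetic word tokens in `text`."""
    return set(re.findall(r"[a-z]+", text.lower()))


def closest_choice(generated, choices):
    """Return the index of the choice with the highest Jaccard word overlap."""
    gen = _tokens(generated)

    def jaccard(choice):
        tok = _tokens(choice)
        union = gen | tok
        return len(gen & tok) / len(union) if union else 0.0

    return max(range(len(choices)), key=lambda i: jaccard(choices[i]))


# The four choices and the generated answer from the example in this post.
choices = [
    "Yes, because Conglomerate Corporation owns more than half of Giant "
    "Company, so the two corporate entities are one client for purposes of "
    "the rules regarding conflicts of interest.",
    "Yes, because the virtual impossibility of obtaining an appraisal of the "
    "fair market value of the property means that the lawyer does not have "
    "actual knowledge that the deal is unfair to either party.",
    "No, because the attorney would be unable to inform either client fully "
    "about whether the proposed transfer price would be in their best interest.",
    "No, not unless the attorney first obtains effective informed consent of "
    "the management of Giant Company, as well as that of Conglomerate, because "
    "the ownership of Conglomerate and Giant is not identical, and their "
    "interests materially differ in the proposed transaction.",
]
generated = (
    "Not if the attorney first obtainses efficient informed consent of the "
    "administration of Giants, as well and of Conglomerate"
)

best = closest_choice(generated, choices)
print(best)  # index of the best-matching choice
```

For this example the heuristic selects the fourth choice (index 3), which shares by far the most words with the generated text.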

Question-Answering on MPRE

This new model is trained on a modified version of the MPRE dataset that frames it as a question-answering task: the model receives the question and context without any choices to select from.

To use the model, add the question and context as input to the annotator:

context = """
Mr. Burns, the chief executive officer of Conglomerate Corporation, now
faces criminal charges of discussing prices with the president of a competing
firm. If found guilty, both Mr. Burns and Conglomerate Corporation will be
subject to civil and criminal penalties under state and federal antitrust
laws. An attorney has been representing Conglomerate Corporation.
She has conducted a thorough investigation of the matter, and she has
personally concluded that no such pricing discussions occurred. Both
Conglomerate Corporation and Mr. Burns plan to defend on that ground.
Mr. Burns has asked the attorney to represent him, as well as Conglomerate
Corporation, in the proceedings. The legal and factual defenses of
Conglomerate Corporation and Mr. Burns seem completely consistent at
the outset of the matter. Would the attorney need to obtain informed consent
to a conflict of interest from both Mr."""

question = """
Burns and a separate corporate officer at Conglomerate Corporation before
proceeding with this dual representation?"""


data = spark.createDataFrame(
    [[question, context]]
).toDF("question", "context")

Then, build the pipeline using the QuestionAnswering annotator and the pretrained model legqa_flant5_mpre:

document_assembler = (
    nlp.MultiDocumentAssembler()
    .setInputCols("question", "context")
    .setOutputCols("document_question", "document_context")
)

leg_qa = (
    legal.QuestionAnswering.pretrained(
        "legqa_flant5_mpre", "en", "legal/models"
    )
    .setInputCols(["document_question", "document_context"])
    .setCustomPrompt("question: {QUESTION} context: {CONTEXT}")
    .setMaxNewTokens(50)
    .setOutputCol("answer")
)

pipeline = nlp.Pipeline(stages=[document_assembler, leg_qa])
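The custom prompt passed to the annotator contains two placeholders, {QUESTION} and {CONTEXT}. As a rough illustration of what such a template expands to, the sketch below fills it in plain Python. How the annotator performs the substitution internally is not described here, so `fill_prompt` is a hypothetical stand-in, and the whitespace normalization is a choice made for this sketch only.

```python
import re


def fill_prompt(template, question, context):
    """Substitute {QUESTION} and {CONTEXT} in a prompt template.

    Hypothetical helper for illustration; not the library's implementation.
    """
    values = {
        # Collapse the newlines left behind by triple-quoted strings.
        "QUESTION": " ".join(question.split()),
        "CONTEXT": " ".join(context.split()),
    }
    for name in values:
        if "{" + name + "}" not in template:
            raise ValueError(f"template is missing {{{name}}}")
    # Single pass, so placeholder-like text inside the inputs is left alone.
    return re.sub(
        r"\{(QUESTION|CONTEXT)\}", lambda m: values[m.group(1)], template
    )


filled = fill_prompt(
    "question: {QUESTION} context: {CONTEXT}",
    "\nWho is the client?\n",
    "\nAn attorney represents\na corporation.",
)
print(filled)
```

Printing the filled template for a few rows is a quick way to check that the question and context columns have not been swapped.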

Finally, run the model:

result = pipeline.fit(data).transform(data)


result.select("answer.result").show(truncate=False)

Obtaining:

Yes, because the conflicting positions in the legal and factual defenses require the attorney to obtain the informed consent of both clients before proceeding with the representation.

New demo for Law Stack Exchange Classifier

The Law Stack Exchange model, released in the previous version of Legal NLP, now has a demo app that shows how the model can be used in practice.

Bug fixes

We fixed bugs in the pretrained deidentification pipelines, which were incompatible with newer versions of Spark. They are now compatible with all major releases of the library.

The pipeline removes sensitive information either by masking it (with entity labels, special characters, or fixed-length characters) or by obfuscating it (replacing it with synthetic data). Use it through the PretrainedPipeline named legpipe_deid. The outputs cover four modes:

Masking with entity labels

Masking with special chars

Masking with fixed-length chars

Obfuscated
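The masking and obfuscation modes differ only in what replaces each detected entity. The toy sketch below illustrates the idea in plain Python on an invented sentence with hand-written entity spans. It is not the pipeline's implementation: `deidentify`, the mode names, the sample text, and the reading of "special chars" as one asterisk per original character are all assumptions made for illustration.

```python
def deidentify(text, entities, mode, fake_values=None):
    """Replace entity spans in `text` according to `mode`.

    entities: non-overlapping (start, end, label) character spans.
    fake_values: label -> synthetic replacement, used by "obfuscate".
    """
    fake_values = fake_values or {}
    out = []
    cursor = 0
    for start, end, label in sorted(entities):
        out.append(text[cursor:start])  # keep the text before the entity
        original = text[start:end]
        if mode == "entity_labels":
            out.append(f"<{label}>")
        elif mode == "special_chars":
            out.append("*" * len(original))  # length-preserving mask
        elif mode == "fixed_length_chars":
            out.append("****")  # hides the original length as well
        elif mode == "obfuscate":
            out.append(fake_values.get(label, f"<{label}>"))
        else:
            raise ValueError(f"unknown mode: {mode}")
        cursor = end
    out.append(text[cursor:])  # keep the text after the last entity
    return "".join(out)


# Invented sample text and spans, for illustration only.
text = "Signed by Jane Doe on 2023-01-05."
entities = [(10, 18, "PERSON"), (22, 32, "DATE")]
fakes = {"PERSON": "Alex Roe", "DATE": "2021-07-19"}

for mode in ("entity_labels", "special_chars", "fixed_length_chars", "obfuscate"):
    print(mode, "->", deidentify(text, entities, mode, fakes))
```

Label masks keep the entity type visible for downstream analysis, fixed-length masks hide even the length of the original value, and obfuscation keeps the text readable by substituting synthetic values.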

Fancy trying?

We offer 30-day free licenses with technical support from our legal team of technical and subject-matter experts (SMEs). The trial includes complete access to more than 926 models, including Classification, NER, Relation Extraction, Similarity Search, Summarization, Sentiment Analysis, and Question Answering, as well as 120+ legal language models.

Just go to https://www.johnsnowlabs.com/install/ and follow the instructions!

Don’t forget to check our notebooks and demos.

How to run

Legal NLP is easy to run on both clusters and driver-only environments using the johnsnowlabs library:

! pip install johnsnowlabs
from johnsnowlabs import nlp
nlp.install(force_browser=True)
# Start Spark Session
spark = nlp.start()
# Import the Legal NLP module
from johnsnowlabs import legal

For alternative installation methods and instructions for specific environments, please check the docs.
