Running spaCy as a Service on GKE

k8scale.io
k8scaleio-engineering
Dec 29, 2019

This is a step-by-step guide to running spaCy, a natural language processing library, on a Google Kubernetes Engine (GKE) cluster. These steps assume that you have already created a cluster in your account.

The API uses examples from the spaCy GitHub repository.

App

We use Flask to define two APIs for spaCy:

  1. Fetch entity relations
  2. Extract noun and verb phrases from a given sentence

from flask import Flask, Response, json, jsonify, request
from gevent.pywsgi import WSGIServer
import spacy
import textacy
# extract_currency_relations comes from the local entity_relation module
from entity_relation import extract_currency_relations

# Load the small English model once at startup
nlp = spacy.load('en_core_web_sm')

app = Flask(__name__)

pattern = r'<VERB>?<ADV>*<VERB>+'

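def extract_noun_phrase(text):
    # Return the text of every noun chunk spaCy finds in the input
    doc = nlp(text)
    noun_phrases = [chunk.text for chunk in doc.noun_chunks]
    print(noun_phrases)
    return noun_phrases

def extract_verb_phrase(text):
    # Match verb phrases against the POS pattern defined above.
    # pos_regex_matches is the textacy call used at the time of writing;
    # newer textacy releases may not provide it.
    doc = nlp(text)
    verb_chunks = textacy.extract.pos_regex_matches(doc, pattern)
    verb_phrases = [vb.text for vb in verb_chunks]
    print(verb_phrases)
    return verb_phrases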

@app.route('/extract-phrase', methods=['POST'])
def extract_phrase():
    # The route only accepts POST, so no extra method check is needed
    data = request.get_data()
    data_dict = json.loads(data)
    phrases = {
        "noun": extract_noun_phrase(data_dict["text"]),
        "verb": extract_verb_phrase(data_dict["text"])
    }
    return jsonify(phrases)

@app.route('/extract-relation', methods=['POST'])
def find_relation():
    # The route only accepts POST, so no extra method check is needed
    data = request.get_data()
    data_dict = json.loads(data)
    doc = nlp(data_dict["text"])
    relations = extract_currency_relations(doc)
    # Map the text of the first item in each relation to the second
    output = {}
    for r1, r2 in relations:
        output[r1.text] = r2.text
    return jsonify(output)

@app.route('/ping', methods=['GET'])
def health():
    # Simple health check endpoint
    return "Ok"

if __name__ == '__main__':
    print("Starting the server...")
    port = 8050
    # An empty host string binds to all interfaces
    http_server = WSGIServer(('', port), app)
    print("Server listening on port:", port)
    http_server.serve_forever()

The complete code is on GitHub: https://github.com/k8scaleio/SpaCyServer
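
Both handlers above call json.loads on the raw body and then index "text" directly, so a malformed or incomplete request would surface as a server error. A minimal sketch of a validation helper follows; parse_text_payload is a hypothetical name that is not part of the original app, and the exact error messages are assumptions.

```python
import json


def parse_text_payload(raw_body):
    """Return the "text" field of a JSON request body, or raise ValueError.

    Hypothetical helper for illustration; the original app does not define it.
    """
    try:
        payload = json.loads(raw_body)
    except (TypeError, ValueError) as exc:
        # json.JSONDecodeError and UnicodeDecodeError are both ValueError subclasses
        raise ValueError("request body is not valid JSON") from exc
    if not isinstance(payload, dict):
        raise ValueError("request body must be a JSON object")
    text = payload.get("text")
    if not isinstance(text, str) or not text.strip():
        raise ValueError('"text" must be a non-empty string')
    return text
```

A handler could call this on request.get_data() and return a 400 response when ValueError is raised, instead of letting the exception propagate.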

Docker

Now let’s build the Docker image for the above app. We are going to use an Ubuntu base image to build the container.

Dependencies are pulled from the requirements.txt file.

Build the image by cloning the repository and running the command below:

docker build -t spacy-server:1.0 .

After you have built the image, run it using:

docker run -p 8050:8050 spacy-server:1.0

You can test your server by running a curl command

curl -d '{"text":"Net income was $9.4 million compared to the prior year of $2.7 million."}' -H "Content-Type: application/json" -X POST http://localhost:8050/extract-relation
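
The same request can be sent from Python using only the standard library. This is a sketch, assuming the container is reachable at http://localhost:8050 as in the docker run command above; build_request and post_text are hypothetical helper names, not part of the repository.

```python
import json
import urllib.request


def build_request(base_url, endpoint, text):
    """Build a JSON POST request for one of the spaCy server endpoints."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_text(base_url, endpoint, text, timeout=10):
    """Send the request and decode the JSON response body."""
    request = build_request(base_url, endpoint, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage, with the server from the docker run step listening locally:
# sentence = "Net income was $9.4 million compared to the prior year of $2.7 million."
# print(post_text("http://localhost:8050", "/extract-relation", sentence))
# print(post_text("http://localhost:8050", "/extract-phrase", sentence))
```

Separating request construction from sending keeps the payload format easy to check without a running server.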

GKE deployment

Now that your Docker container is working, let’s deploy it to the Kubernetes cluster we created.

First verify that kubectl is pointing to the correct cluster by running the command below:

kubectl config current-context

If it’s not pointing to the correct cluster, run the command below to fetch its credentials (add --zone or --region if you have not configured a default location in gcloud):

gcloud container clusters get-credentials $CLUSTER_NAME

First we need to tag the image so that we can push it to Google Container Registry (GCR). $PROJECT_NAME is the Google Cloud project in which your cluster is running; GCR paths use the project ID.

docker tag spacy-server:1.0 gcr.io/$PROJECT_NAME/spacy-server:1.0

Now we are ready to push the image to Google Container Registry using the command below:

docker push gcr.io/$PROJECT_NAME/spacy-server:1.0

Once you have pushed your image to GCR, let’s create a deployment file.

This deployment file can be found in the repository as well.
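
The deployment file itself is not reproduced here, so the following is only a sketch of what such a manifest might contain, not the file from the repository. It builds a Deployment and a Service as a Python dictionary and writes them as JSON, which kubectl apply accepts alongside YAML. The resource name, replica count, readiness probe on /ping, and LoadBalancer service on port 80 are all assumptions; build_manifest and write_manifest are hypothetical helpers.

```python
import json


def build_manifest(project, name="spacy-server", tag="1.0", port=8050, replicas=1):
    """Return a Deployment and a Service for the spaCy server as a kind: List."""
    labels = {"app": name}
    image = f"gcr.io/{project}/{name}:{tag}"
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "ports": [{"containerPort": port}],
                        # Assumption: reuse the /ping route as a readiness check
                        "readinessProbe": {
                            "httpGet": {"path": "/ping", "port": port},
                            "initialDelaySeconds": 10,
                        },
                    }],
                },
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            # Assumption: expose the app outside the cluster on port 80
            "type": "LoadBalancer",
            "selector": labels,
            "ports": [{"port": 80, "targetPort": port}],
        },
    }
    return {"apiVersion": "v1", "kind": "List", "items": [deployment, service]}


def write_manifest(path, project):
    """Write the manifest as JSON so it can be passed to kubectl apply -f."""
    with open(path, "w") as handle:
        json.dump(build_manifest(project), handle, indent=2)
```

Calling write_manifest("deployment.json", "my-project") would produce a file that can stand in for $DEPLOYMENT_FILE; both the file name and the project ID here are placeholders.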

Now you can run the Kubernetes deployment command:

kubectl apply -f $DEPLOYMENT_FILE

Now you should be able to access the service from your cluster.
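
One way to confirm this is to poll the /ping route until it answers. The sketch below assumes you already have a base URL for the service, for example an external IP or a local port opened with kubectl port-forward; fetch_status and wait_until_healthy are hypothetical helper names, and the retry counts are arbitrary.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=5):
    """Return the HTTP status code for a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar
        return None


def wait_until_healthy(base_url, attempts=10, delay=3.0, fetch=fetch_status):
    """Poll GET /ping until it returns 200 or the attempts run out."""
    ping_url = base_url.rstrip("/") + "/ping"
    for attempt in range(attempts):
        if fetch(ping_url) == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

The fetch parameter exists so the polling logic can be exercised without a live cluster.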

We have learned how to run spaCy as a service in a Kubernetes cluster.

Follow us on twitter: https://twitter.com/k8scaleio
