Start a K8s SolrCloud with Pre-Installed Custom Libraries — Customizing Solr Helm Chart
Almost all real-world applications on SolrCloud need some custom libraries, most often custom text analyzers, to provide a good search experience. However, Solr does not yet support adding text analyzer libraries dynamically on a live cluster, so we worked out a way to install custom libraries automatically when deploying SolrCloud with the Solr Helm chart.
Solr Limitations of Adding Custom Libraries Dynamically
In real-world use cases, people often need to add custom text analyzers to their SolrCloud to provide better search results for specific languages.
Solr does provide a way to install custom libraries on a live SolrCloud, the Solr Blob Store API, but it does not work for every type of component you may need to add, as the following description from a Solr issue (https://issues.apache.org/jira/browse/SOLR-9175) shows.
It appears that only the Config API and solrconfig.xml support loading custom classes from the Blob Store. It seems to me like any directive for a collection which references a class attribute should also support loading this class from the Blob Store via a runtimeLib="true" attribute. The obvious use case here is custom analyzers, but similarity is also a candidate. The documentation in this area (eg. "add-runtimelib") is pretty vague.
Deploying SolrCloud on K8s with the Solr Helm chart is a breeze, but without support for adding custom libraries dynamically, we would have to install them manually on each Solr instance and then do a rolling restart, which is definitely something we want to avoid.
A Solution By Tweaking Solr Chart
Luckily, after some digging around, we found that the official Solr Docker image, which the Solr Helm chart uses, provides an entry point for running a script before Solr starts. We can add a script there that downloads and installs the libraries we need, so that each Solr instance is loaded with our custom libraries.
At the time of writing, however, the Solr Helm chart does not provide a way to use this entry point, so we have to tweak the chart to make it work.
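To make the mechanism concrete, here is a simplified sketch of the kind of loop such an entry point performs: every *.sh file found in an init directory is sourced before the main process starts. The function name run_init_scripts and the log messages are illustrative assumptions, not the image's actual code. The default directory is the one we mount later in this article.

```shell
#!/bin/bash
# Simplified sketch of an "init directory" hook (hypothetical helper, not the
# actual code shipped in the Solr Docker image).
# Sources every *.sh file in the given directory, in glob (alphabetical) order.
run_init_scripts() {
  local dir="${1:-/docker-entrypoint-initdb.d}"
  local f
  # Nothing to do when the directory does not exist.
  [ -d "$dir" ] || return 0
  for f in "$dir"/*; do
    # An empty directory leaves the glob unexpanded; skip that case.
    [ -e "$f" ] || continue
    case "$f" in
      *.sh)
        echo "running $f"
        . "$f"
        ;;
      *)
        echo "ignoring $f"
        ;;
    esac
  done
}
```

Because the scripts are sourced rather than executed in this sketch, a failing command under set -e would stop the whole startup, which is usually what you want when a required library cannot be installed.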
To start customizing the Solr Helm chart, you need some basic knowledge of its template language. A template directive is enclosed in {{ and }} blocks.
Please download the charts repo from GitHub and make the following customizations.
- Add a ConfigMap file incubator/solr/templates/solr-custom-script.yaml
{{- if not ( eq .Values.initScript "" ) }}
---
apiVersion: "v1"
kind: "ConfigMap"
metadata:
  name: "{{ include "solr.custom-script.configmap-name" . }}"
  labels:
{{ include "solr.common.labels" . | indent 4 }}
data:
  solr-init.sh: |
{{ .Files.Get .Values.initScript | indent 4 }}
{{- end }}
This ConfigMap has a name generated by {{ include "solr.custom-script.configmap-name" . }}, and it loads the custom script content under the key solr-init.sh.
- Add the following lines in incubator/solr/templates/_helpers.tpl
{{/*
Define the name of the custom script configmap
*/}}
{{- define "solr.custom-script.configmap-name" -}}
{{- printf "%s-%s" (include "solr.fullname" .) "custom-script-config-map" | trunc 63 | trimSuffix "-" -}}
{{- end -}}
Here we define a named template, solr.custom-script.configmap-name, which is used to generate the name for the ConfigMap.
- Add the following lines in incubator/solr/templates/statefulset.yaml
volumes:
  ...
{{- if not ( eq .Values.initScript "" ) }}
  - name: solr-init-script
    configMap:
      name: {{ include "solr.custom-script.configmap-name" . }}
      items:
        - key: solr-init.sh
          path: solr-init.sh
{{- end }}
Here, we populate a volume solr-init-script with data stored in the ConfigMap {{ include "solr.custom-script.configmap-name" . }}.
  ...
{{- if not ( eq .Values.initScript "" ) }}
  - name: solr-init-script
    mountPath: /docker-entrypoint-initdb.d
{{- end }}
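Here, we mount the volume solr-init-script at /docker-entrypoint-initdb.d, the entry point directory provided by the official Solr Docker image. This entry belongs in the volumeMounts list of the Solr container.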
- Add the following lines in incubator/solr/values.yaml
# Specify the initial script file name here to be executed before starting Solr
# The file is located relative to the Solr chart root folder
initScript: ""
Here, we expose the parameter initScript to specify the custom script file path (relative to the Solr chart root folder, /charts/incubator/solr/).
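Once the template changes are in place, and once the script file named by initScript exists under the chart root, it can help to render the chart locally and confirm the wiring before deploying anything. The sketch below assumes Helm 3, whose helm template command takes a release name and a chart path. The function name check_init_script_rendered and the release name solr are placeholders chosen for illustration.

```shell
#!/bin/bash
# Hypothetical helper: render the customized chart locally and check that the
# init-script wiring shows up in the output. Assumes Helm 3 is on PATH.
check_init_script_rendered() {
  local chart_dir="$1"    # e.g. ~/charts/incubator/solr
  local init_script="$2"  # e.g. load_lib.sh (relative to the chart root)
  local rendered
  rendered=$(helm template solr "$chart_dir" --set initScript="$init_script") || return 1
  # Both the ConfigMap key and the mount path should be present once
  # initScript is non-empty.
  echo "$rendered" | grep -qF "solr-init.sh" || return 1
  echo "$rendered" | grep -qF "/docker-entrypoint-initdb.d" || return 1
  echo "init script wiring found in rendered manifests"
}

# Example usage (paths are assumptions):
# check_init_script_rendered ~/charts/incubator/solr load_lib.sh
```

Rendering with an empty initScript should produce neither string, which is a quick way to confirm that the conditional blocks leave the default chart behavior unchanged.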
Start a SolrCloud with Installed Libraries
After we customize the Solr Helm chart, we can then go on to deploy the SolrCloud.
- First, we write a custom script to download the libraries we need.
You can create a shell script with any name you like and put the file under /charts/incubator/solr/. Here we use the name load_lib.sh.
#!/bin/bash
set -e
mkdir -p /opt/solr/server/home/lib
wget -O /opt/solr/server/home/lib/mmseg4j-solr-2.4.0.jar https://github.com/pai911/solr-deploy/raw/master/mmseg/mmseg4j-solr-2.4.0.jar
wget -O /opt/solr/server/home/lib/mmseg4j-core-1.10.0.jar https://github.com/pai911/solr-deploy/raw/master/mmseg/mmseg4j-core-1.10.0.jar
echo "this is running inside the container before Solr starts"
My script downloads the mmseg4j text analyzer jars (mmseg4j is a Chinese word segmentation library) from my own GitHub repo. You can modify this file to meet your own needs.
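If you need more than a couple of jars, the download step can be factored into a small helper. The following is only a sketch: the function name install_lib is hypothetical, and it assumes wget is available in the image, as the script above already does. It skips files that are already present, which may save time if the lib directory happens to sit on a persistent volume.

```shell
#!/bin/bash
# Hypothetical helper: download one jar into a lib directory, skipping the
# download when the file is already there. Assumes wget is installed.
install_lib() {
  local url="$1"
  local lib_dir="${2:-/opt/solr/server/home/lib}"
  local target
  target="$lib_dir/$(basename "$url")"
  mkdir -p "$lib_dir"
  if [ -f "$target" ]; then
    echo "already present: $target"
    return 0
  fi
  # Download to a temporary name first so a failed transfer never leaves a
  # truncated jar in the lib directory.
  wget -q -O "$target.part" "$url" || { rm -f "$target.part"; return 1; }
  mv "$target.part" "$target"
  echo "installed: $target"
}

# Example usage (same jars as the script above):
# install_lib https://github.com/pai911/solr-deploy/raw/master/mmseg/mmseg4j-solr-2.4.0.jar
# install_lib https://github.com/pai911/solr-deploy/raw/master/mmseg/mmseg4j-core-1.10.0.jar
```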
- Create a custom value file solr-deploy-vars.yaml
You can define a set of values in this file to customize your SolrCloud. Here, we just need to add one more value, initScript, to tell the Solr chart where to load the init script from.
initScript: "load_lib.sh"
- Deploy the SolrCloud.
The simplest deployment command is as follows.
helm install [NAME] [CHART] -f [VALUE_FILE]
Remember, we have to deploy SolrCloud with our locally customized Solr Helm chart. That is, [CHART] should reference the local chart.
Say we want to deploy the local chart ~/charts/incubator/solr with the name solr, using the value file solr-deploy-vars.yaml. The full command looks like the following.
helm install solr ~/charts/incubator/solr -f solr-deploy-vars.yaml
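Once the pods are running, you can check that the jars actually landed in the lib directory. This is a sketch under assumptions: verify_solr_libs is a hypothetical helper, and the pod name solr-0 in the usage example is a guess, since the real name depends on your release name and the chart's naming template (kubectl get pods will show it).

```shell
#!/bin/bash
# Hypothetical helper: list the lib directory inside a Solr pod and confirm
# that each expected jar is there. Assumes kubectl points at the right cluster.
verify_solr_libs() {
  local pod="$1"
  shift
  local lib_dir="/opt/solr/server/home/lib"
  local listing jar
  listing=$(kubectl exec "$pod" -- ls "$lib_dir") || return 1
  for jar in "$@"; do
    # -x matches whole lines, -F treats the jar name as a literal string.
    if ! echo "$listing" | grep -qxF -- "$jar"; then
      echo "missing: $jar"
      return 1
    fi
  done
  echo "all libraries present in $pod"
}

# Example usage (pod name is an assumption):
# verify_solr_libs solr-0 mmseg4j-solr-2.4.0.jar mmseg4j-core-1.10.0.jar
```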
The Complete Code
To see the complete code, you can go to this gist I created. I’ve also created a pull request to add this capability to the Solr chart, and hopefully it will get merged soon so that no one has to go through this trouble again.