Speed Up Python with Golang?

Sometimes yes, sometimes no.

Bruce H. Cottman, Ph.D.
May 5 · 8 min read
Source: Unsplash

I and perform benchmarking. Our journey begins by installing GoLang and setting up a GoLang Development Environment. I show the benchmarking of the GoLang kmeans implementation and the Python sklearn kmeans.

Introduction

I need a way to speed up my Python Machine Learning solutions to put them in production.

Python is too slow for production Machine Learning applications. I need to switch away from Python.

What I decided to do was: Learn GoLang.

It is almost as fast as C. It is promoted as easy to learn because it has a C-like syntax. There is language-level support for concurrency. It is statically typed but also includes garbage collection.

It is noteworthy, as it is on Windows, Mac, Linux, and other platforms. It supports the new multi-core architectures.

GoLang is open-source and has a growing contributor community.

Importantly for me is that GOLang is used in production for many years at Google. Also, I required that GOLang can call Python. A GOLang wrapper of my Python-based ML kernels is my fast runtime solution for production.

The rest of this article is my journey in using GoLang to speed up Python Machine Learning algorithms.

I create a development environment with the GoLang IDE, along with some associated tools and packages.

Next, I list 19 “free” resources and 5 “not-free” books to learn the GoLang language used in our company.

Finally, I benchmark the Python sklearn kmeans to GoLang kmeans implementation.

Ready to Start: Setting up your Go Development Environment

Here, I downloaded the binary release installer go1.14.4.darwin-amd64.pkg for MacOS 10.14. 6. There are downloads for Windows and Linux as well.

I double-clicked on go.1.14.4.darwin-amd64.pkg that resided in my Mac Downloads folder. I then followed the instructions of the pop-up installation wizard.

Note: The Go installation requires 350 MB on the selected destination disk.

Note: Depending on how you have a security setup on your computer, you may be asked for your password during the installation process.

Note: After installation, the final action is “move Installer to trash?” I clicked, Yes.

Setting up your GoLand IDE

That bias caused my search for GoCharm. It turns out that JetBrains, the company behind PyCharm, calls their GoLang IDE GoLand.

There is a convenient button for downloading the install of GoLand. From the download, I double-clicked on goland-2020.1.3dmg, which resided in my Mac Downloads folder. I then followed the instructions of the pop-up installation wizard.

The installation process put the icon GoLand in my toolbar. I double-clicked the GoLand icon.

The initial GoLand screen on launch. Animation by Rachel Cottman

Next, I clicked on New Project:

Creating First Project in GoLand. Animation by Rachel Cottman

My First Go Program

package mainimport "fmt"func main() {
fmt.Printf("hello world")
}

If you guessed Hello World, you are correct! The Hello World code was put in my first project, awesomeProject1, by clicking on the GoLand tutorial “The Hitchhicker’s Guide to GoLand.”

Note: I am using the Community version of GoLand.

Then, I ran Hello World in GoLand. It worked!

Run “Go Build learning go…”

I proceeded to go through the rest of the GoLand tutorial, “The Hitchhiker’s Guide to GoLand.” If you have been using PyCharm, the tutorial is a quick introduction to GoLand.

If you are not enamored with PyCharm, you may want to use some other IDE. May I suggest VSCode, another popular multi-language support IDE.

In the next section, I list some Learning Resources. That I used.

Learning Resources

If you want five to ten-minute chunks to learn GoLang or need a few topics right now, I recommend

On the “your-must-have” list should be these other twelve GoLang resources available for free on the Web:

  1. A Tour of Go;
  2. Learn Go: Top 30 Go Tutorials for Programmers Of All Levels;
  3. Golang tutorial series;
  4. Go Tutorials and Courses;
  5. Go by Example;
  6. Gophercises;
  7. Go exercisms;
  8. The Complete Guide to Learning Go;
  9. Essential Go is a free book about Go programming language;
  10. How to Write Go Code.
  11. Go Concurrency Patterns
  12. learn-go-with-tests.

A list of GOLang packages (imports!) can be found here:

Also, you can find GOLang packages here:

These packages are part of the Go Project but outside the main Go tree. They are developed under looser compatibility requirements than the Go core. Install them with “go get”.

And packages I find useful for Machine Learning:

  1. https://github.com/pa-m/sklearn
  2. https://github.com/pa-m/randomkit
  3. https://github.com/pa-m/optimize

Performance

  1. easy-to-learn syntax
  2. faster than Python

As I knew C, the first reason was satisfied. I found it similar in C’s basic constructs, with a little Python flavoring and new functionality in concurrency.

I don’t know the mindset of the authors of GoLang, (I will look up the Wiki version.), my bias leads me to believe that GoLang is a significant re-write of C to accommodate distributed computing.

For a reason #2, I reviewed various benchmarks that showed GoLang anywhere from 30x to 50x faster than Python.

Packages and Modules

go get “github.com/pa-m/sklearn/base”
go get “github.com/pa-m/sklearn/cluster”
go get “github.com/pa-m/sklearn/dataset”

I think that GoLang has evolved such that packages boil down to these rules:

  1. Global variables are Camel-cased. They are variables externally exposed from its package.
  2. Packages are the name at the top of a file set by package <name>. All functions and Global variables specified in that file are in package <name>.
  3. You can load that package in your GoLang environment with the command go get <URL-for-package-file> (similar to pip install <package-name>).
  4. You can access definitions in <package-name>using the statement import (<URL-for-package-file>) in your package main file.

The resulting call to a GoLang implementation of the cluster routine kmeans is:

package mainimport (
"fmt"
"github.com/pa-m/sklearn/cluster"
"github.com/pa-m/sklearn/datasets"
"time"
)
func main() {
start := time.Now()
_ = start
kmeansBlobs()
fmt.Printf("elapsed %s s\n", time.Since(start))
}
func kmeansBlobs(){
X,Y := datasets.MakeBlobs(&datasets.MakeBlobsConfig{
NSamples: 10000,
Centers: 10,
ClusterStd: 0.5})
kmeans := &cluster.KMeans{NClusters: 10}
start := time.Now()
_ = start
kmeans.Fit(X, nil)
kmeans.Predict(X, Y)
fmt.Printf("elapsed %s s\n", time.Since(start))
}

Also, you can skim through reading about GoLang packages and modules because you are hopefully using GoLand. GoLand sets your local GOROOT and GOPATH for you on run::

GoLand run of GoLang kmeans-blogs.go

Benchmarks

from sklearn.cluster import KMeans
def km():
kmeans = KMeans(n_clusters=N_CLUSTERS)
kmeans.fit(X)
y_kmeans = kmeans.predict(X)

The number of points(n_p) to the cluster is varied from 100 to 70,000for the kmeans implementations.

Table of Python-sklearn-Kmeans versus GoLang-Kmeans (Y=MS) by X=number of points.
Python-sklearn-Kmeans versus GoLang-Kmeans (Y=MS) by X=number of points.

By observation, we observe the GoLang-kmeans implementation goes up as O(n), while the Python-sklearn-kmeans implementation goes up as O(log(n)).

The Python-sklearn-kmeans implementation uses cython callouts (C language) for overall speed and unique algorithm tweaks to achieve O(log(n)).

I am surprised to see the GoLang-kmeans implementation be faster than the Python-sklearn-kmeans implementation for N < 10,000 points.

My curiosity is triggered; you or I should try:

  1. Explain what is going on with GoLang-kmeans implementation and the Python-sklearn-kmeans speed results at low N data points.
  2. How easy it to use GPUs? Does it make a difference in performance?
  3. How fast is a Python-kmeans implementation without the performance enhancements?
  4. Would GoLang-kmeans implementation speed up with concurrency enhancements? Is it 1/O(n-cores) faster?

Conclusion

I was not surprised by the speeds of the GoLang and Python kmeans implementations. The Python sklearn is highly optimized with cython and log(N) algorithms.

I expect other leading Python Machine Learning packages to benefit little or not at all from switching to GoLang. XGBoost has a Python API on top of a C++ implementation and is concurrent. Lightgbm gives similar results to XGBoost and is usually faster than XGBoost.

I hope to get speedup replacing Python extraneous pre-processing and post-processing code surrounding ML kernels with GoLang code.

But why bother speeding up Python when there are GPUs (speed up) and cloud-computing (scalability), and quantum computing (massive speedup shortly?)?

Here is my reasoning. There are existing, and future devices, such as a couple of billion intelligent phones, drones, electrical sockets (IoT), and others, that may not want to have internet access, nor have GPU(s); nor have Internet access (cloud).

Given this reasoning and 70 years of computing history, there has been a perpetual need for faster programs, and there are different ways tried.

If Python uses C for better performance, perhaps GoLang can use Python to deliver production Machine Learning.

Next Steps

There is a translator for Python 2.x to Golang. It would be so cool if I could find a translator for Python 3.x to Golang.

I am sure there is a way to call Golang from Python and Python from Golang. I also need the callout mechanisms to be secure.

Python has development tools and automated deployment with Continuous Integration/Continuous Deployment (CI/CD). What development tools does GoLang have? We already encountered GoLand, a comprehensive IDE.

All code used is found at https://github.com/bcottman/GoApplication.

I hope you find these blogs useful in your Python and GoLang coding and coding. I am enjoying writing them. I enjoyed even more thinking about the possible futures for Python and GoLang. Please let me know your thoughts on any of these matters.

CodeX

Everything connected with Tech & Code

Bruce H. Cottman, Ph.D.

Written by

Physicist, Machine Learning Scientist and constantly improving Software Engineer. I extrapolate the future from emerging technologies.

CodeX

CodeX

Everything connected with Tech & Code

Bruce H. Cottman, Ph.D.

Written by

Physicist, Machine Learning Scientist and constantly improving Software Engineer. I extrapolate the future from emerging technologies.

CodeX

CodeX

Everything connected with Tech & Code

Medium is an open platform where 170 million readers come to find insightful and dynamic thinking. Here, expert and undiscovered voices alike dive into the heart of any topic and bring new ideas to the surface. Learn more

Follow the writers, publications, and topics that matter to you, and you’ll see them on your homepage and in your inbox. Explore

If you have a story to tell, knowledge to share, or a perspective to offer — welcome home. It’s easy and free to post your thinking on any topic. Write on Medium

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store