Unleashing the power of SikuliX: Automate with vision

Sarumathy P
Published in featurepreneur
4 min read · May 28, 2023

SikuliX is a powerful tool for GUI automation, and it can be a valuable addition to your automation toolkit, especially when dealing with applications that lack API support or require visual interaction.

SikuliX is an open-source automation tool that uses visual recognition to interact with graphical user interfaces (GUIs). It allows you to automate tasks by writing scripts that simulate user interactions based on visual patterns or images on the screen.

SikuliX provides a scripting interface in Python (run through Jython inside the IDE) and a Java API, so you can write automation scripts in a familiar language. Scripts can locate and click buttons, type text, perform mouse actions, and more, based on the images you define.

AGENDA:

  1. SikuliX Installation
  2. Running 1st Script in SikuliX IDE
  3. Methods and Utilities in SikuliX

1. SikuliX Installation:

Head over to the download page: https://launchpad.net/sikuli/+download

Download the jar file that matches your OS. After downloading, open your terminal and cd into the directory where you saved the SikuliX jar file. Then run the following command to open the SikuliX IDE.

java -jar <downloaded_sikuli_file_name>.jar 

Replace <downloaded_sikuli_file_name> with the name of the file you downloaded. In my case, it is

java -jar sikulixide-2.0.5-lux.jar 

Make sure that Java is installed on your system before you run this command. The command opens the SikuliX IDE for scripting.
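If you prefer to script this step, here is a minimal sketch in regular Python (run outside SikuliX) that checks whether Java is on your PATH and builds the same launch command. The helper names build_launch_command and java_available are made up for this example; the jar name is the one from my download, so adjust it to yours.

```python
import shutil


def build_launch_command(jar_path, java_executable="java"):
    """Return the argument list equivalent to `java -jar <jar_path>`."""
    return [java_executable, "-jar", jar_path]


def java_available(java_executable="java"):
    """Return True when the Java launcher can be found on PATH."""
    return shutil.which(java_executable) is not None


# Hypothetical usage with the jar file name used in this article.
command = build_launch_command("sikulixide-2.0.5-lux.jar")
print(" ".join(command))
```

Once java_available() returns True, the list can be handed to subprocess.call(command) to start the IDE.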

You may see errors like the ones listed below in the terminal when the IDE starts. The IDE still opens.

If you want to get rid of these errors, try the following steps.

  1. Failed to load module “canberra-gtk-module”: This error usually occurs when the “libcanberra-gtk-module” package is missing or not properly installed on your system. The module handles sound events in GTK-based applications and is not directly related to SikuliX. To resolve the error, install the package using your system’s package manager. For example, on Ubuntu or Debian-based systems, you can run the following command:

sudo apt-get install libcanberra-gtk-module

  2. SikuliIDEI18N: no locale for en_IN: This message appears when SikuliX cannot find a matching locale for “en_IN” (English, India), the locale it picks up from your system settings. One solution is to set a different locale explicitly. To do this, open a terminal and run the following command:

export LC_ALL=C

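This command sets the “C” locale, a generic locale that is available on virtually every system. It only affects the current terminal session, so run it in the same terminal before launching the IDE.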
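The same override can be applied from a launcher script instead of typing export each time. This is only a sketch in regular Python (run outside SikuliX); env_with_c_locale is a name invented for this example.

```python
import os


def env_with_c_locale(base_env=None):
    """Return a copy of the environment with LC_ALL forced to "C".

    The original mapping is left untouched, so the override only applies
    to whatever process is started with the returned copy.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["LC_ALL"] = "C"
    return env


# Small demonstration with a made-up environment.
example = env_with_c_locale({"PATH": "/usr/bin", "LC_ALL": "en_IN"})
print(example["LC_ALL"])
```

To launch the IDE with it, pass the result as the env argument, for example subprocess.call(["java", "-jar", "sikulixide-2.0.5-lux.jar"], env=env_with_c_locale()).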

After resolving these errors, you should be able to launch Sikuli and use it without encountering these specific issues.

This is how the IDE looks. You can spot “Run” at the top right corner of the window.

2. Running 1st Script in SikuliX IDE:

click(<picture>)
doubleClick(<picture>)

You have to pass the picture you want to click or double-click as an argument. You can insert an already saved image using the “Insert Image” option in the IDE toolbar, but the easiest way is to click “Take Screenshot”. The moment you click Take Screenshot, the IDE minimizes and a frozen copy of your screen appears, on which you can select the area to capture.

When you click Run, the IDE minimizes and executes the commands one by one.

You can also use a for loop (same syntax as Python) to repeat tasks a number of times. For example,

for i in range(0,2):
    doubleClick("1685124458192.png")
    click("1685124477167.png")
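If you want to check the flow of such a loop without touching the screen, one option is to wrap it in a function that receives the click actions as arguments. The sketch below assumes both actions belong inside the loop; repeat_open_and_select is a made-up name, and the two lambdas are stand-ins that only record what would have been clicked.

```python
def repeat_open_and_select(double_click, click, icon_image, item_image, times=2):
    """Double-click icon_image, then click item_image, repeated `times` times."""
    for _ in range(times):
        double_click(icon_image)
        click(item_image)


# Stand-in actions: they record the calls instead of moving the mouse.
performed = []
repeat_open_and_select(
    lambda image: performed.append(("doubleClick", image)),
    lambda image: performed.append(("click", image)),
    "1685124458192.png",
    "1685124477167.png",
)
print(performed)
```

Inside the SikuliX IDE you would pass the real actions instead: repeat_open_and_select(doubleClick, click, "1685124458192.png", "1685124477167.png").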

Logs:

Logs are visible in the right half of the editor after a script runs.

3. Methods and Utilities in SikuliX:

  • Simulate keyboard Input:

To simulate keyboard input in Sikuli, you can use the type() function. The type() function allows you to input text or special keys as if they were typed on the keyboard.

Sikuli requires the specific key names provided by its Key class, such as Key.ENTER, Key.TAB, etc., to represent special keys.

Scripts run directly in the SikuliX IDE already have the Key class available. If you move code into a separate module that you import, add the import explicitly at the top of that module:

from sikuli import Key

Here are the examples:

Typing Text:

type("Hello, world!")  # Types the text "Hello, world!"

Pressing Enter:

type(Key.ENTER)  # Presses the Enter key

Pressing Tab:

type(Key.TAB)  # Presses the Tab key

Backspace:

type("Helli" + Key.BACKSPACE + "o")  # Types "Hello" by deleting the last character and replacing it with "o"

Combining Keys:

type("a", Key.CTRL)  # Holds the Ctrl key while typing "a" (Ctrl+A)
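These calls can be chained into a small routine. The sketch below replaces the text in whichever field has focus: select everything, type the new text, confirm. It assumes the field overwrites a selection when you type, which most text fields do. replace_field_text is a made-up helper; the type function and key constants are passed in so the sequence can be checked without SikuliX, and the "<CTRL>" and "<ENTER>" strings are placeholders, not real key codes.

```python
def replace_field_text(type_fn, ctrl_modifier, enter_key, new_text):
    """Select all text in the focused field, overwrite it, then confirm."""
    type_fn("a", ctrl_modifier)  # text first, modifier second: Ctrl+A
    type_fn(new_text)            # typing replaces the current selection
    type_fn(enter_key)           # confirm the input


# Stand-in for type(): record the arguments of each call.
typed = []


def record(*args):
    typed.append(args)


replace_field_text(record, "<CTRL>", "<ENTER>", "Hello, world!")
print(typed)
```

Inside the SikuliX IDE the call would be replace_field_text(type, Key.CTRL, Key.ENTER, "Hello, world!").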

Some common functions used in Sikuli:

Overall, Sikuli is particularly useful when automating tasks that involve GUI-based applications, such as testing, repetitive tasks, or workflow automation.

Hope it helps.
