Bots to update Wikidata.

Himanshu Maheshwari
7 min read · Apr 26, 2020

Recently I worked on a project for the Social Computing course at my college, IIIT Hyderabad, where the aim was to enhance Wikidata using gamification, with a special focus on the Hindi and Telugu languages. The documentation on how to automate Wikidata edits was scattered, and I spent quite some time putting the pieces together, so I decided to write this article. It talks about bots in Wikidata and how to write one, and it shows a simple script of about 10 lines that edits statements. I also cover how to add a reference to a property value, since I could not find documentation for that.

This article assumes that the reader has a good idea of what Wikidata is and can write simple scripts in Python. Please refer to this link in case you don't know what Wikidata is. Bots (also known as robots) are tools that make edits without the need for human decision-making. Bots can add interwiki links, labels, descriptions, statements, and sources, and can even create items, among other things. Bots can make edits very quickly and can disrupt Wikidata if they are incorrectly designed or operated. That is why bots must be designed with care.

Bot Accounts

Operators are required to create a separate account for their bot. Bot accounts are generally named after either their operator or their function, combined with the word "bot".

Approval Process

To receive approval and a bot flag, the operator must file a request at Wikidata:Requests for permissions/Bot detailing the task the bot is meant to perform. The operator should then do a test run of between 50 and 250 edits so that the community can see that the bot works correctly. Until the request is accepted, the bot cannot edit the main Wikidata. The official documentation does not specify where the test edits must be made, but there is a mirror of Wikidata called Test Wikidata that exists entirely for users to experiment on, so it is good practice to make the test edits there. There are also some Sandbox pages that unapproved bots can edit (though I haven't tried this yet): Sandbox1; Sandbox2; Sandbox3.
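
As a rough illustration of how a trial run could be kept inside that 50 to 250 edit window, the sketch below caps the number of candidate items a script processes. The helper name take_test_batch and the item IDs are hypothetical and not part of Pywikibot; the limits simply mirror the numbers quoted above.

```python
from itertools import islice

# Bounds of the test run described above (50 to 250 edits).
MIN_TEST_EDITS = 50
MAX_TEST_EDITS = 250


def take_test_batch(candidates, limit=MAX_TEST_EDITS):
    """Return at most `limit` candidates for a trial run (hypothetical helper)."""
    if not MIN_TEST_EDITS <= limit <= MAX_TEST_EDITS:
        raise ValueError(
            f"limit must be between {MIN_TEST_EDITS} and {MAX_TEST_EDITS}"
        )
    # islice stops after `limit` elements even if `candidates` is a long generator.
    return list(islice(candidates, limit))


# Example: 1000 made-up item IDs, of which only the first 100 are kept.
batch = take_test_batch((f"Q{n}" for n in range(1, 1001)), limit=100)
print(len(batch), batch[0], batch[-1])
```

The bot would then loop over the returned batch only, so a bug in the selection logic cannot turn a trial run into thousands of edits.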

For further reading, please refer to the link; it is pretty descriptive.

Pywikibot

Now that the account is made, let's come to the main topic: how do you write a bot that edits Wikidata? I used Pywikibot to make the bot. Pywikibot is a Python library and collection of scripts that automate work on MediaWiki sites. In this article, I show how to write a simple script that creates a new statement in Wikidata, and then how to add references or sources to a property value (I could not find official documentation for this). This introduction should be enough to get started on more sophisticated bots. Please refer to the following link after reading the article. I will be using Python 3, and readers are encouraged to do the same.

How to install and configure Pywikibot?

  1. First download this zip file and extract it. You could also download this tar file and extract it.
  2. Next cd into the extracted folder.
  3. After that run:

python3 pwb.py generate_user_files

It first asks for the site you want to edit; choose Wikidata. It then asks for the language you want to edit. Since our bot is not yet approved, we choose test; if your bot is approved, you can choose the wikidata option. Next it asks for your bot's username, so enter it. It then asks whether you want to add any other project (i.e. whether you want to edit another site); for now, we choose "no". Finally it asks for your bot's password, which you can either enter or skip for now. Once everything is done, a user-config file is generated in the same folder.

  4. After that, whenever you want to use Pywikibot, run the following command:

python3 pwb.py login

The first time, it might ask for your password. Do remember to run this command every time you wish to use Pywikibot.
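
For orientation, the generated user-config file usually boils down to a few settings like the ones sketched below. This is only an approximation of what the generator writes for the choices above, and ExampleBot is a placeholder username rather than a real account. The usernames mapping is provided by Pywikibot when it loads the file, so this fragment is not meant to be run on its own.

```python
# user-config.py (sketch; the generated file contains more comments and options)
family = 'wikidata'   # the site chosen in the first prompt
mylang = 'test'       # 'test' selects Test Wikidata; 'wikidata' selects the real site
# Placeholder bot account name for Test Wikidata.
usernames['wikidata']['test'] = 'ExampleBot'
```

If the bot is later approved for the real Wikidata, these are the lines that would need to change.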

How to add a statement?

As mentioned before, first run the command:

python3 pwb.py login

Please note that the script that adds the statement should be in the same directory as above (the extracted Pywikibot folder). Before we start, also note that Test Wikidata is not the same as the actual Wikidata. Many items and properties that exist in the actual Wikidata may not exist in Test Wikidata, or may exist with different IDs, so it is important to check these things before writing the script. For example, the sitcom Friends has an item in the actual Wikidata (Q79784) but had none in Test Wikidata at the time of writing this article. Similarly, atomic number has property ID P167 in Test Wikidata but P1086 in the actual Wikidata, and the entry for India had only 3 statements in Test Wikidata at the time of writing whereas the actual Wikidata had 1341.
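
Because the same concept can carry different IDs on the two sites, one option is to keep the IDs in a small lookup table keyed by site and switch between them in a single place. The sketch below uses only IDs mentioned in this article; the table layout and the property_id helper are my own illustration, not something Pywikibot provides.

```python
# Property IDs quoted in this article, keyed by Pywikibot site code.
PROPERTY_IDS = {
    "test": {"atomic number": "P167", "reference URL": "P93"},
    "wikidata": {"atomic number": "P1086", "reference URL": "P854"},
}


def property_id(site_code, name):
    """Look up the property ID for `name` on the given site (hypothetical helper)."""
    try:
        return PROPERTY_IDS[site_code][name]
    except KeyError:
        # Fail loudly rather than guess: an ID on one site says nothing about the other.
        raise KeyError(f"no known ID for {name!r} on site {site_code!r}") from None


print(property_id("test", "atomic number"))
print(property_id("wikidata", "atomic number"))
```

A script written this way can move from Test Wikidata to the actual Wikidata by changing the site code, provided every ID it needs has been checked and added to the table by hand.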

At the time of writing this article, there was no entry for the element Argon on Test Wikidata, so I created one. It can be found at link.

We will first add a statement about the atomic weight of Argon and then add a reference to it. We will use this link as the reference.

Adding Atomic Weight

The aim is to add a statement with atomic weight as the property and 39.948 u as its value. The first difficulty is finding the property ID for atomic weight. For the actual Wikidata we could search for it, use SPARQL, or use other APIs, but for Test Wikidata the only way I found was to search using the web interface.
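
As a possible alternative that I have not verified on Test Wikidata, Wikibase sites expose a wbsearchentities module through the MediaWiki API, and the sketch below builds such a search URL for properties. The search_url helper is hypothetical, and the assumption that Test Wikidata serves its API at test.wikidata.org/w/api.php is mine. The snippet only constructs the URL and does not send a request.

```python
from urllib.parse import urlencode

# Assumed API endpoints for the two sites.
API_ENDPOINTS = {
    "test": "https://test.wikidata.org/w/api.php",
    "wikidata": "https://www.wikidata.org/w/api.php",
}


def search_url(label, site_code="test", entity_type="property", language="en"):
    """Build a wbsearchentities URL for `label` (hypothetical helper)."""
    params = {
        "action": "wbsearchentities",
        "search": label,
        "language": language,
        "type": entity_type,  # "property" or "item"
        "format": "json",
    }
    return f"{API_ENDPOINTS[site_code]}?{urlencode(params)}"


print(search_url("atomic weight"))
```

Opening the printed URL in a browser should return JSON with the matching property IDs, if the module behaves on Test Wikidata as it does on the actual Wikidata.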

The search shows that the property ID for atomic weight is P95435. With that in hand, we use the following script to add the atomic weight to Argon, starting with loading the item.

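import pywikibot

# Connect to Test Wikidata: site code "test", family "wikidata"
site = pywikibot.Site("test", "wikidata")
repo = site.data_repository()
# Point at the Argon item on Test Wikidata
item = pywikibot.ItemPage(repo, "Q212173")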

Since we are using Test Wikidata, the first argument to pywikibot.Site is "test"; for the actual Wikidata it would be "wikidata". The code above points to item Q212173, i.e. Argon. After this, we use the following code to add the atomic weight.

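# Build a claim for the atomic weight property (a string property on Test Wikidata)
claim = pywikibot.Claim(repo, "P95435")
claim.setTarget("39.948 u")
# Save the claim on the item; the summary shows up in the edit history
item.addClaim(claim, summary="Add any message")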

Voila! This much code is sufficient to add a new statement to an item. We get the following results:

[Screenshots: the Argon item before and after the edit]

In case you want to use other items as values, or want further sophistication, check out this link. Please note that values come in various data types: the atomic weight above had the string data type, and there are various others, which you can check out here. Make sure that you use the proper data type when saving a value.
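
To make the point about data types concrete, here is a sketch of how a value such as "39.948 u" might be prepared for a property that has the quantity data type instead of string. The helper names are hypothetical, and I am assuming a quantity-typed property exists on the site you edit (P95435 above is a string property, so this would not apply to it as-is). The unit is left out because a Wikibase unit has to be an item, and I do not know of a suitable one on Test Wikidata.

```python
from decimal import Decimal


def split_amount_and_unit(text):
    """Split a string such as '39.948 u' into (Decimal amount, unit label)."""
    amount, _, unit = text.strip().partition(" ")
    return Decimal(amount), unit.strip()


def build_quantity_claim(repo, property_id, amount):
    """Return an unsaved claim with a quantity target (hypothetical helper).

    Assumes `property_id` has the quantity data type on `repo`.
    """
    # Imported here so the parsing helper above works without Pywikibot installed.
    import pywikibot

    claim = pywikibot.Claim(repo, property_id)
    claim.setTarget(pywikibot.WbQuantity(amount=amount, site=repo))
    return claim


amount, unit = split_amount_and_unit("39.948 u")
print(amount, unit)
```

The claim returned by build_quantity_claim would still have to be saved with item.addClaim, exactly as in the string example above.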

Adding Reference

In Wikidata it is good practice to cite a source or reference for any value that you add. For example, above we could use this link as a source. Adding a reference URL is itself a kind of new statement: in the example above we added a statement to the item Argon, and the reference URL is in turn attached to that atomic weight statement. The reference URL is a property with ID P93 in Test Wikidata and P854 in the actual Wikidata, and its value is our URL. Please make sure that the URL is well formed, i.e. it starts with "https://" or similar. We add the following code to the previous code to add the reference URL:

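# The reference is itself a claim, here on the reference URL property (P93 on Test Wikidata)
source_claim = pywikibot.Claim(repo, "P93")
source_claim.setTarget("https://www.britannica.com/science/argon-chemical-element")
# Attach the reference to the atomic weight claim, not to the item
claim.addSource(source_claim, summary="Add any message")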

Note that we add the source to the claim and not to the item.

Combining the two snippets above, the overall code to add the atomic weight of Argon and its reference to Test Wikidata is:

import pywikibot

# Connect to Test Wikidata and point at the Argon item
site = pywikibot.Site("test", "wikidata")
repo = site.data_repository()
item = pywikibot.ItemPage(repo, "Q212173")

# Add the atomic weight statement (a string property on Test Wikidata)
claim = pywikibot.Claim(repo, "P95435")
claim.setTarget("39.948 u")
item.addClaim(claim, summary="Add any message")

# Attach the reference URL (P93 on Test Wikidata) to the new claim
source_claim = pywikibot.Claim(repo, "P93")
source_claim.setTarget("https://www.britannica.com/science/argon-chemical-element")
claim.addSource(source_claim, summary="Add any message")
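
One thing to keep in mind is that running the combined script twice would, as far as I can tell, add the same statement a second time. A possible guard is to look at the claims the item already has before calling addClaim. The helpers below are a hypothetical sketch; already_stated expects the mapping found under the "claims" key of item.get(), where each claim offers getTarget().

```python
def already_stated(claims, property_id, value):
    """Return True if `claims` already holds `value` for `property_id`."""
    # `claims` maps property IDs to lists of claim objects with getTarget().
    return any(c.getTarget() == value for c in claims.get(property_id, []))


def add_claim_once(item, claim, summary):
    """Add `claim` to `item` unless an equal value is already present.

    Assumes `item` is a pywikibot.ItemPage and `claim` a pywikibot.Claim.
    Returns True if the claim was added and False if it was skipped.
    """
    claims = item.get()["claims"]
    if already_stated(claims, claim.getID(), claim.getTarget()):
        return False
    item.addClaim(claim, summary=summary)
    return True
```

In the combined script, the item.addClaim call would become add_claim_once(item, claim, "Add any message"), and the reference would only be attached when it returns True.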

Thank you for reading the article. Please check out a simple game that I made to enrich Hindi Wikidata (it can be extended to Telugu with a few changes) at https://github.com/him-mah10/social-computing-project and leave a star :). It was made as a project for the course Social Computing at IIIT Hyderabad, taught by Prof. Vasudeva Varma.

After reading the article, please check the link for more sophisticated techniques. Please do not pollute Test Wikidata or the actual Wikidata; both are very valuable resources. Please reach out to me if there is anything wrong with the article. The article was written on 26 April 2020.

Thank you!
