Pywikibot Introduction

Pywikibot is a Python library that makes it much easier to make automated edits on MediaWiki sites.

**Warning**: You are accountable for every edit you or your python script makes. Be careful and don't get banned!

1. Using a mediawiki site

The first thing pywikibot needs to know is which MediaWiki website to target. There are many official sites, such as Wikipedia, Wiktionary, Wikisource and Wikimedia Commons, and most of them have separate versions for different languages.

PAWS comes with a default website already configured. It is worth opening that website in a browser to see what it looks like.

A MediaWiki website is identified by two important parts: the family and the code. Pywikibot supports a large number of official families and codes, and a local instance or personal deployment of MediaWiki can also be added.

The family tells pywikibot which type of MediaWiki site to use, so that it can read and write data specific to that family. Examples of families are: wikipedia, wiktionary, wikisource, etc.

The code tells pywikibot which variant of the family should be used. Common examples of codes are: en, es, ml, etc. The code depends on the family though. For example, the "commons" family has only the "commons" code.
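As a minimal sketch of how a family and a code together pick out a site, assuming pywikibot is installed and configured: `pywikibot.Site(code, family)` is the library call, while the helper name `open_site` and the list `EXAMPLE_SITES` are only illustrative.

```python
def open_site(code="test", family="wikipedia"):
    """Return the pywikibot Site object for the given code and family."""
    # Imported lazily: importing pywikibot normally expects a
    # configuration file to be present already.
    import pywikibot

    return pywikibot.Site(code, family)


# A few (code, family) pairs of the kind described above.
EXAMPLE_SITES = [
    ("en", "wikipedia"),     # English Wikipedia
    ("ml", "wikisource"),    # Malayalam Wikisource
    ("commons", "commons"),  # the commons family has only one code
    ("test", "wikipedia"),   # test wiki, meant for experiments
]
```

Calling `open_site()` with no arguments selects the test Wikipedia, which is the safest place to experiment.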

2. Logging in

In the PAWS interface, the user is set by default to the account that was used to log in to PAWS. In a local script, we would need to modify the pywikibot configuration file to add the username and password. We will see this later.

We tell pywikibot to log in with the login() function, and then check which user has been logged in.
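A small sketch of those two steps, assuming `site` is a pywikibot Site object for which credentials are already configured. `login()` and `user()` are methods of the site object; the helper name `login_and_report` is made up for this example.

```python
def login_and_report(site):
    """Log in on `site` and return the username pywikibot is logged in as."""
    site.login()        # uses the account pywikibot has been configured with
    return site.user()  # username of the logged-in account, or None
```

On PAWS the returned name should be the account you used to log in to PAWS.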

3. Reading data on Pages

To read data through pywikibot, we use the Page class, which holds information about a page on the MediaWiki website.

First, we create a Page object using the name of the page. Here, we use the page "User:AbdealiJK/Pywikibot_Tutorial" as an example.
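Creating the object could look like the sketch below, assuming `site` is a pywikibot Site object. `pywikibot.Page(site, title)` is the library constructor; the wrapper `get_page` is only a name chosen for illustration.

```python
def get_page(site, title="User:AbdealiJK/Pywikibot_Tutorial"):
    """Return a pywikibot Page object for `title` on `site`."""
    # Imported lazily so that this helper can be defined before
    # pywikibot has been configured.
    import pywikibot

    # Creating the object does not download anything yet; pywikibot
    # contacts the server only when page data is first needed.
    return pywikibot.Page(site, title)
```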

Now we use the Page object to fetch other information about the page. For example, we can get the text of the page.
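Reading the text can be sketched as follows, assuming `page` is a pywikibot Page object. The `text` attribute is part of the Page class; the helper `preview_text` and its `max_chars` parameter are illustrative only.

```python
def preview_text(page, max_chars=200):
    """Return the first `max_chars` characters of a page's wiki markup."""
    # `page.text` holds the raw wiki markup. pywikibot fetches it from
    # the server the first time the attribute is accessed.
    return page.text[:max_chars]
```

Printing only a preview keeps notebook output short for long pages; `page.text` on its own gives the full markup.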

You can get a lot of other information about the page by using the various helper functions provided by pywikibot.
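A few of these helpers are collected in the sketch below, assuming `page` is a pywikibot Page object. The methods called on `page` belong to the Page class; the function `summarize_page` and the dictionary keys are hypothetical names for this example.

```python
def summarize_page(page):
    """Collect a few facts about a pywikibot Page in a dictionary."""
    info = {
        "title": page.title(),      # full title, including the namespace
        "url": page.full_url(),     # address of the page on the wiki
        "exists": page.exists(),    # False if the page was never created
    }
    if info["exists"]:
        # These only make sense for pages that actually exist.
        info["is_redirect"] = page.isRedirectPage()
        info["latest_revision_id"] = page.latest_revision_id
    return info
```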

4. Writing data to Pages

In general, use the test Wikipedia website for writing data, and make sure that you only change pages in your own user space (pages starting with User:<Your user name>), as these are meant for personal use such as testing these scripts.

For example, let's create the Page object for your personal Sandbox page on the test wiki.
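One way to build that object is sketched here, assuming `site` is the test wiki and that logging in has succeeded, so `site.user()` returns your username. The "/Sandbox" subpage name and both helper names are assumptions made for this example.

```python
def sandbox_title(username):
    """Return the title of a user's personal sandbox page."""
    return "User:{}/Sandbox".format(username)


def get_sandbox_page(site):
    """Return the Page object for the logged-in user's sandbox."""
    # Imported lazily so the pure helper above works without pywikibot.
    import pywikibot

    # site.user() is None when nobody is logged in, so log in first.
    return pywikibot.Page(site, sandbox_title(site.user()))
```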

Now let's try writing some wiki markup to the page. For example, let's try creating your profile!
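Writing works by assigning new markup to `page.text` and then calling `page.save()`. The sketch below assumes `page` is a Page object in your own user space on the test wiki; the profile wording, the helper `write_profile` and its parameters are invented for illustration.

```python
PROFILE_TEMPLATE = """\
== About me ==
Hi, I am '''{name}''' and I am learning Pywikibot.

* Languages I speak: {languages}
"""


def write_profile(page, name, languages,
                  summary="Test edit: writing a profile with pywikibot"):
    """Replace the content of `page` with a short profile and save it."""
    page.text = PROFILE_TEMPLATE.format(name=name, languages=languages)
    # The summary is what appears in the page history for this edit.
    page.save(summary=summary)
```

Remember that `save()` makes a real edit under your account, so keep it to your own sandbox while experimenting.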

Note: For more information about wiki markup, visit Help:Wiki markup.

Let's open up the webpage and see if our changes have been added there.

Using Jupyter and IPython, we can even embed the webpage into the notebook.
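Embedding could be done roughly as below, assuming the code runs inside a Jupyter notebook with IPython available and that `page` is a pywikibot Page object. `IFrame` comes from `IPython.display`; the helper `embed_page` and the default sizes are arbitrary choices for this sketch.

```python
def embed_page(page, width=800, height=400):
    """Return an IFrame that displays `page` inside a Jupyter notebook."""
    # Imported lazily so the helper can be defined outside a notebook too.
    from IPython.display import IFrame

    return IFrame(page.full_url(), width=width, height=height)
```

The frame is rendered when the returned object is the last expression in a notebook cell.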

5. Textlib functions

Once you can read content and save new content, you will often want to get a list of the categories or templates used on a page of a MediaWiki instance.

A category is a page in a special namespace (similar to the user space) that is used to classify other pages. For example, the "Python (programming language)" page on Wikipedia has the categories "Category:Class-based programming languages", "Category:Cross-platform free software", "Category:Dynamically typed programming languages" and so on.

To add a category to a page, a link to the category must be added to the MediaWiki page. Hence, something like [[Category:<name of category>]] should be added according to the wiki markup.

A template is a snippet of text which can be included in multiple other pages (something like a #include or import). The wiki markup to add a template is {{<template name>}}, and it can also take arguments, for example {{<template name>|arg1|arg2}}.

Let's get a list of all categories added to the page.
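A sketch of listing categories with textlib, assuming `page` is a pywikibot Page object. `textlib.getCategoryLinks` is the library function that scans wiki markup for category links; the wrapper `list_category_titles` is a hypothetical name.

```python
def list_category_titles(page):
    """Return the titles of all categories linked in the page's markup."""
    # Imported lazily so the helper can be defined before pywikibot
    # has been configured.
    from pywikibot import textlib

    # getCategoryLinks parses the text and returns Category objects.
    categories = textlib.getCategoryLinks(page.text, site=page.site)
    return [category.title() for category in categories]
```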

The textlib functions help to modify the text content of a page for specific needs, like adding or removing categories. Hence, textlib has its own parsers, which read through the text and pull out all the category links they find based on the wiki markup.

Let's try removing a category using the textlib functions.
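One possible approach, not necessarily the one the tutorial originally used, is to read the category links, drop the unwanted one, and write the remaining links back with `textlib.replaceCategoryLinks`. The sketch assumes `page` is a pywikibot Page object; `remove_category` and `category_name` are illustrative names.

```python
def remove_category(page, category_name):
    """Return the page's markup with one category link removed."""
    # Imported lazily so the helper can be defined before pywikibot
    # has been configured.
    import pywikibot
    from pywikibot import textlib

    site = page.site
    unwanted = pywikibot.Category(site, category_name)
    # Keep every category link except the one that should go away.
    kept = [
        category
        for category in textlib.getCategoryLinks(page.text, site=site)
        if category != unwanted
    ]
    # replaceCategoryLinks rewrites the category links in the text so
    # that only the ones in `kept` remain.
    return textlib.replaceCategoryLinks(page.text, kept, site=site)
```

The function only returns the new markup. To apply it, assign the result to `page.text` and call `page.save()`.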

Other useful methods

Textlib contains many other functions that make editing the text of MediaWiki pages easier.
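As one example, `textlib.extract_templates_and_params` lists the templates used in a piece of wiki markup together with their arguments. The sketch assumes a recent pywikibot release, which may need an additional wikitext parser package to be installed; the wrapper `list_templates` is a hypothetical name.

```python
def list_templates(wikitext):
    """Return (template name, parameters) pairs found in wiki markup."""
    # Imported lazily so the helper can be defined before pywikibot
    # has been configured.
    from pywikibot import textlib

    return [
        (name, dict(params))
        for name, params in textlib.extract_templates_and_params(wikitext)
    ]
```

Unnamed arguments such as `arg1` in `{{<template name>|arg1|arg2}}` are reported under their positions, while named arguments keep their names.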

6. Page Generators

There are many situations where it is useful to create a "page generator", which helps iterate over multiple pages that share a common property. For example, suppose you want to find all pages about Wikimedia projects.

We use the python module pprint (pretty print) to format the output in a better way rather than dumping it as a list.
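A sketch of one such generator, assuming `site` is a pywikibot Site object and that the pages of interest share a category. `pagegenerators.CategorizedPageGenerator` is the library generator; the helper `pages_in_category`, its `limit` parameter and the category name used in the note below are assumptions.

```python
def pages_in_category(site, category_name, limit=10):
    """Return the titles of up to `limit` pages in the given category."""
    # Imported lazily so the helper can be defined before pywikibot
    # has been configured.
    import pywikibot
    from pywikibot import pagegenerators

    category = pywikibot.Category(site, category_name)
    # `total` caps how many pages the generator yields, which keeps
    # experiments quick on very large categories.
    generator = pagegenerators.CategorizedPageGenerator(category, total=limit)
    return [page.title() for page in generator]
```

In a notebook, `pprint.pprint(pages_in_category(site, "Wikimedia projects"))` would then print the titles, wrapping long lists over several lines.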

For more information on page generators, check the pywikibot documentation on pagegenerators.

7. Exercises

Exercise 1 - Write a script to remove trailing whitespace from a given page

In many mediawiki pages, we see that editors leave trailing whitespace at the bottom of the page. While this does not matter when the page is rendered for viewing, it adds unnecessary length to the article when downloading the text and raw wikicode.

Write a script to remove the trailing whitespace and keep only 1 newline at the end of the page. (Test this on the test wiki!)

Exercise 2 - Write a script to find the number of devices using Android Operating System

Find the number of pages in the category for devices that use the Android operating system.

8. Setting up pywikibot locally

PAWS provides a way to run pywikibot and related commands through Jupyter notebooks. It already has the requirements that pywikibot scripts need installed, so it is an easy way to get users started. However, it is only one server on the internet, so if everyone used PAWS it would get crowded and slow. In such cases, it may be easier to run these scripts locally on your own desktop or laptop.

Installing basic requirements

First, install the basic requirements: Python, pip and git. How to install them depends on your specific OS.

Installing pywikibot

Pywikibot is currently still a release candidate, hence rather than installing the rc5 from pip, we will get the latest source code at the master branch using git. To do this, run the following command on your terminal or command prompt:

/home/user/git_repos/$ git clone

You will find that the folder pywikibot-core has been created in the current working directory. If you wish to keep it somewhere else, simply move the folder to another directory, or use the cd command to change directory before running the above git command.

Once the git repository has been downloaded, cd into the directory and run:

/home/user/git_repos/pywikibot-core/$ pip install .

This installs the pywikibot repository into your Python installation. The . (dot) is required, as it tells pip to find the Python package in the current directory. Pywikibot also has a lot of optional dependencies which are used to run specific scripts and unit tests. To install all of these (to avoid errors later), run:

/home/user/git_repos/pywikibot-core/$ pip install -r dev-requirements.txt -r requirements.txt

Configuring pywikibot

Once the pywikibot library has been installed, simply use the script provided in the git repo:

/home/user/git_repos/pywikibot-core/$ python login

And answer the questions to create a configuration file (conventionally named user-config.py), which holds your configuration information.
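To give an idea of what such a configuration file contains, the sketch below renders a minimal one as text. `family`, `mylang` and `usernames` are pywikibot configuration variables; the helper `render_user_config` and the username "ExampleBot" are placeholders, and a generated file will usually contain more settings than this.

```python
USER_CONFIG_TEMPLATE = """\
family = '{family}'
mylang = '{code}'
usernames['{family}']['{code}'] = '{username}'
"""


def render_user_config(family, code, username):
    """Return the text of a minimal pywikibot configuration file."""
    return USER_CONFIG_TEMPLATE.format(
        family=family, code=code, username=username
    )


# Show what the file would look like for the test Wikipedia.
print(render_user_config("wikipedia", "test", "ExampleBot"))
```

Note that the password is not part of these three settings.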


For more information about the configuration and other aspects of pywikibot, check the Pywikibot manual.