Studying Wikipedia Edits by Tag

Content

Overview
Learning Outcomes
Accessing Tagged Edits Via Dumps
Accessing the Edit Tag APIs
Example Analyses of Edit Tag Data

Overview

This is an importable notebook that provides simple helpers for gathering edit data from Wikipedia. It is designed for someone who wants to study edits made on mobile devices as opposed to desktop computers.

Learning Outcomes

Accessing Tagged Edits Via Dumps

This is an example of how to parse the history dumps for a wiki (i.e., the complete history of all revisions made to all pages) and filter them for edits that match a given tag.

Now we'll loop through the history dump and record how many mobile vs. non-mobile edits were made in each year.
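The counting step can be sketched as below. This is a minimal illustration, not the notebook's own helper: it assumes the revisions have already been pulled out of the dump as (timestamp, tags) pairs, and the function name `count_edits_by_year` and the default tag name "mobile edit" are choices made for this example.

```python
from collections import defaultdict

# Assumed tag name for edits made from mobile devices; change it to study another tag.
MOBILE_TAG = "mobile edit"


def count_edits_by_year(revisions, tag=MOBILE_TAG):
    """Count tagged vs. untagged edits per year.

    `revisions` is any iterable of (timestamp, tags) pairs, where `timestamp`
    is an ISO 8601 string such as "2020-05-17T09:30:00Z" and `tags` is a
    collection of tag names. How these pairs are read from the dump is
    outside the scope of this sketch.
    """
    counts = defaultdict(lambda: {"mobile": 0, "non_mobile": 0})
    for timestamp, tags in revisions:
        year = int(timestamp[:4])  # ISO timestamps start with the 4-digit year
        key = "mobile" if tag in tags else "non_mobile"
        counts[year][key] += 1
    # Return a plain dict, sorted by year, for easy printing and comparison
    return {year: dict(counts[year]) for year in sorted(counts)}
```

Because the function accepts any iterable, it can be fed a generator that walks the dump, so the full revision history never has to be held in memory.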

Accessing the Edit Tag APIs

The Revisions API can be a much simpler way to access data about edit tags for a given article if you know what articles you are interested in and are interested in relatively few articles (e.g., hundreds or low thousands).
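A request to the Revisions API for one article and one year might look like the sketch below. It assumes English Wikipedia as the target wiki and uses only the standard library; the helper names (`build_params`, `parse_revisions`, `fetch_revisions`) and the User-Agent string are placeholders invented for this example.

```python
import json
import urllib.parse
import urllib.request

# Assumption: English Wikipedia. Swap the host to query another wiki.
API_URL = "https://en.wikipedia.org/w/api.php"
# Placeholder User-Agent; replace with your own tool name and contact details.
USER_AGENT = "edit-tag-example/0.1 (contact: you@example.org)"


def build_params(title, year, cont=None):
    """Build query parameters for all revisions of `title` made in `year`."""
    params = {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids|timestamp|tags",
        "rvlimit": "max",
        "rvdir": "newer",  # oldest first, so rvstart is the earlier bound
        "rvstart": f"{year}-01-01T00:00:00Z",
        "rvend": f"{year}-12-31T23:59:59Z",
    }
    if cont:
        params.update(cont)  # continuation values from the previous response
    return params


def parse_revisions(response):
    """Extract (revid, timestamp, tags) tuples from one API response."""
    revisions = []
    for page in response.get("query", {}).get("pages", []):
        for rev in page.get("revisions", []):
            revisions.append((rev["revid"], rev["timestamp"], rev.get("tags", [])))
    return revisions


def fetch_revisions(title, year):
    """Fetch every revision of `title` in `year`, following continuation."""
    revisions, cont = [], None
    while True:
        query = urllib.parse.urlencode(build_params(title, year, cont))
        request = urllib.request.Request(
            f"{API_URL}?{query}", headers={"User-Agent": USER_AGENT}
        )
        with urllib.request.urlopen(request) as resp:
            data = json.load(resp)
        revisions.extend(parse_revisions(data))
        cont = data.get("continue")
        if not cont:  # no "continue" block means the last batch was reached
            return revisions
```

Calling `fetch_revisions("Nigeria", 2020)` would issue live requests, so it is best run sparingly and with a descriptive User-Agent.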

Here we choose the article titled Nigeria, loop through the dump, and store the revision IDs of the revisions made to it in 2020. We then compare the mobile and non-mobile edits.

Now we'll use the API endpoint to get the revision IDs and tags for revisions to the Nigeria article in 2020.
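Once revision IDs and tags are available from both sources, the comparison can be sketched as follows. This is an illustrative helper, not the notebook's code: it assumes each source has been reduced to a dict mapping revision ID to its tags, and the names `split_by_tag` and `compare_sources` are hypothetical.

```python
def split_by_tag(revisions, tag="mobile edit"):
    """Split a {revid: tags} mapping into (tagged_ids, untagged_ids) sets."""
    tagged = {revid for revid, tags in revisions.items() if tag in tags}
    return tagged, set(revisions) - tagged


def compare_sources(dump_revs, api_revs, tag="mobile edit"):
    """Summarise tagged vs. untagged counts per source and list mismatched IDs."""
    summary = {}
    for name, revs in (("dump", dump_revs), ("api", api_revs)):
        tagged, untagged = split_by_tag(revs, tag)
        summary[name] = {"mobile": len(tagged), "non_mobile": len(untagged)}
    # Revisions present in only one source, e.g. edits made after the dump snapshot
    summary["only_in_dump"] = sorted(set(dump_revs) - set(api_revs))
    summary["only_in_api"] = sorted(set(api_revs) - set(dump_revs))
    return summary
```

Listing the IDs that appear in only one source makes it easier to tell a genuine disagreement from edits that simply postdate the dump.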

NOTE: The APIs are up-to-date, while the MediaWiki dumps are always at least several days behind (i.e., they are snapshots at specific points in time), so the data you get from the dumps might differ from the APIs if edits have been made to a page in the intervening days.

Conclusion: The counts of mobile vs. non-mobile edits fetched from the API are consistent with those extracted from the dumps.

Example Analyses of Edit Tag Data

Here we'll use an example article (Nigeria) to visualise how the use of mobile devices for editing Wikimedia articles has changed over the years. We'll use the API data as it is more reliable.
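One possible analysis is the yearly share of mobile edits for a single article. The sketch below assumes the revisions are available as (revid, timestamp, tags) tuples and that matplotlib is installed for the plotting step; the function names and the "mobile edit" tag default are choices made for this illustration.

```python
def mobile_share_by_year(revisions, tag="mobile edit"):
    """Return {year: percentage of edits carrying `tag`}, sorted by year.

    `revisions` is an iterable of (revid, timestamp, tags) tuples with
    ISO 8601 timestamps, e.g. "2020-05-17T09:30:00Z".
    """
    totals, tagged = {}, {}
    for _revid, timestamp, tags in revisions:
        year = int(timestamp[:4])
        totals[year] = totals.get(year, 0) + 1
        if tag in tags:
            tagged[year] = tagged.get(year, 0) + 1
    return {y: 100.0 * tagged.get(y, 0) / totals[y] for y in sorted(totals)}


def plot_mobile_share(shares, article="Nigeria"):
    """Plot the yearly mobile share as a line chart and return the figure."""
    import matplotlib.pyplot as plt  # third-party; imported only when plotting

    years = list(shares)
    fig, ax = plt.subplots()
    ax.plot(years, [shares[y] for y in years], marker="o")
    ax.set_xlabel("Year")
    ax.set_ylabel("Mobile edits (% of all edits)")
    ax.set_title(f"Share of mobile edits: {article}")
    return fig
```

Plotting a percentage rather than raw counts keeps years with very different editing volumes comparable.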