Studying Wikipedia Page Protections

This notebook provides a tutorial for how to study page protections on Wikipedia either via the Mediawiki dumps or API. It has three stages:

Accessing the Page Protection Dumps

This is an example of how to parse through Mediawiki dumps and determine what sorts of edit protections are applied to a given Wikipedia article.

Accessing the Page Protection APIs

The Page Protection API can be a much simpler way to access data about page protections for a given article if you know what articles you are interested in and are interested in relatively few articles (e.g., hundreds or low thousands).

NOTE: the APIs are up-to-date while the Mediawiki dumps are always at least several days behind -- i.e. for specific snapshots in time -- so the data you get from the Mediawiki dumps might be different from the APIs if permissions have changed to a page's protections in the intervening days.

Example Analyses of Page Protection Data

Here we show some examples of things we can do with the data that we gathered about the protections for various Wikipedia articles. You'll want to come up with some questions to ask of the data as well. For this, you might need to gather additional data such as:

Descriptive statistics

TODO: give an overview of basic details about page protections and any conclusions you reach based on the analyses you do below

Predictive Model

TODO: Train and evaluate a predictive model on the data you gathered for the above descriptive statistics. Describe what you learned from the model or how it would be useful.

Future Analyses

TODO: Describe any additional analyses you can think of that would be interesting (and why) -- even if you are not sure how to do them.