Comparing top read and trending articles

See the research report

Imports and inputs

Generate the sample sets

Gather the current top 5 articles by pageviews and trending edits, and their metadata

Top read articles from yesterday
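The notebook's own request code is not shown here. As a rough sketch, one way to address "yesterday's top read articles" is the Wikimedia REST pageviews `top` endpoint. Whether the study used this endpoint is an assumption, and `top_read_url`, `project`, and `access` are illustrative names rather than anything taken from the original.

```python
from datetime import date, timedelta

def top_read_url(day, project="en.wikipedia", access="all-access"):
    """Build the URL for the most-viewed articles on a given day.

    Assumption: the Wikimedia REST pageviews 'top' endpoint is the data source.
    """
    return (
        "https://wikimedia.org/api/rest_v1/metrics/pageviews/top/"
        f"{project}/{access}/{day:%Y/%m/%d}"
    )

# "Yesterday" relative to the day the notebook is run.
yesterday = date.today() - timedelta(days=1)
url = top_read_url(yesterday)
```

Fetching `url` with any HTTP client should return JSON listing articles with their view counts and ranks for that day.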

- Update 7/17/2017: filters out any article with a trendiness score below 1 or with fewer than 5 editors.
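A minimal sketch of the filter described in the update note, assuming each candidate is stored as a dict keyed by title with hypothetical `trendiness` and `editors` fields (the real field names in the notebook may differ):

```python
def filter_trending(candidates, min_score=1.0, min_editors=5):
    """Keep only articles that meet both thresholds.

    `candidates` maps title -> metadata dict. The 'trendiness' and 'editors'
    keys are hypothetical names chosen for this illustration.
    """
    return {
        title: meta
        for title, meta in candidates.items()
        if meta["trendiness"] >= min_score and meta["editors"] >= min_editors
    }

sample = {
    "Article A": {"trendiness": 2.5, "editors": 7},
    "Article B": {"trendiness": 0.4, "editors": 12},  # score too low
    "Article C": {"trendiness": 3.0, "editors": 3},   # too few editors
}
kept = filter_trending(sample)
```

With the sample data, only "Article A" survives, since the other two each fail one of the thresholds.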

Get the top five articles in each set

Convert the dicts into lists of tuples for ranking, filter out any duplicates in the top 5, and then truncate each list to 5 items
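The ranking step could look roughly like the following. This is one reading of "filtering out any duplicates": titles that have already been seen (for example, because they appear in the other sample set) are skipped before the list is cut to five. The function name `top_n` and the `exclude` parameter are assumptions made for this sketch.

```python
def top_n(scores, n=5, exclude=()):
    """Rank a title -> score dict and return the first n unique (title, score) tuples.

    Titles listed in `exclude` are skipped, so the returned list never
    duplicates an article that was already selected elsewhere.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    seen = set(exclude)
    result = []
    for title, score in ranked:
        if title in seen:
            continue
        seen.add(title)
        result.append((title, score))
        if len(result) == n:
            break
    return result

pageviews = {"A": 900, "B": 850, "C": 700, "D": 650, "E": 600, "F": 550, "G": 500}
top_read = top_n(pageviews)
# Exclude articles that already made the top-read list.
top_other = top_n(pageviews, exclude=[title for title, _ in top_read])
```

Filtering before truncating matters: dropping duplicates after the cut would leave a list shorter than five.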

Store top 5 titles for later reference

Build the pages on testwiki

Get the latest version of each datafile

Remove everything from the full datasets that is not on the top 5 list

Convert the top 5 dicts back into lists of tuples sorted by rank
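The two steps above (restricting the full dataset to the stored top 5 titles, then turning the result back into a rank-ordered list) can be sketched as follows. The `rank` field and the function name `restrict_and_rank` are hypothetical; the layout of the actual datafiles is not shown in this outline.

```python
def restrict_and_rank(dataset, top_titles):
    """Drop entries not in `top_titles`, then return (title, metadata) tuples by rank.

    Assumes each metadata dict carries a numeric 'rank' key (hypothetical).
    """
    wanted = set(top_titles)
    kept = {title: meta for title, meta in dataset.items() if title in wanted}
    return sorted(kept.items(), key=lambda item: item[1]["rank"])

full_dataset = {
    "B": {"rank": 2, "views": 850},
    "Z": {"rank": 40, "views": 12},
    "A": {"rank": 1, "views": 900},
}
ordered = restrict_and_rank(full_dataset, ["A", "B"])
```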

Create the study pages on test.wikipedia.org

Identify any non-commons images
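One possible way to spot images that are not hosted on Commons is to look at the `imagerepository` value that the MediaWiki Action API returns with `prop=imageinfo`: files served from the shared repository are reported as `shared`. The sketch below works on an already-parsed response; it assumes the default (format version 1) response layout, and the helper name `non_commons_images` is made up for illustration.

```python
def non_commons_images(api_response):
    """Return file titles whose repository is not the shared (Commons) one.

    `api_response` is the parsed JSON of an action=query&prop=imageinfo request.
    Assumption: pages are keyed by page ID under response['query']['pages'].
    """
    pages = api_response.get("query", {}).get("pages", {})
    return sorted(
        page["title"]
        for page in pages.values()
        if page.get("imagerepository") != "shared"
    )

# Hand-written example response, not real API output.
example = {
    "query": {
        "pages": {
            "-1": {"title": "File:On commons.jpg", "imagerepository": "shared"},
            "123": {"title": "File:Local only.png", "imagerepository": "local"},
        }
    }
}
to_upload = non_commons_images(example)
```

The resulting list is the set of files that would need a manual upload to the test wiki.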

I will manually upload these to test.wikipedia.org