Jupyter notebooks are a coding environment for Python (and several other programming languages). As in a code editor such as Sublime Text, or in your Terminal or PowerShell, you can run code in a notebook and see the output printed below it.
Jupyter notebooks have several other benefits:
Any Python code that you can run from your text editor or terminal will also run in a Jupyter notebook.
When you press SHIFT+ENTER, the code is executed. If you include print statements, or if your code raises an error, that will be displayed below the code block.
pythonistas = ["John", "Graham", "Eric", "Michael", "Terry J.", "Terry G."]
for p in pythonistas:
    print(p)
John
Graham
Eric
Michael
Terry J.
Terry G.
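The error case mentioned above can be tried with a small sketch like the one below. It reuses the `pythonistas` list and asks for an index that does not exist. The `try`/`except` wrapper and the `message` variable are additions for illustration only, so that the cell prints the error text instead of stopping.

```python
pythonistas = ["John", "Graham", "Eric", "Michael", "Terry J.", "Terry G."]

message = None
try:
    # The list has six items (indexes 0 to 5), so index 6 raises IndexError.
    print(pythonistas[6])
except IndexError as err:
    # Keep the error text so it can be printed below the cell.
    message = "IndexError: " + str(err)
    print(message)
```

Without the `try`/`except`, the notebook would show the full traceback below the cell instead.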
You can insert blocks of text in your notebook, and format them using a plaintext formatting language called Markdown.
"# Heading one"
"## Heading two"
"### Heading three"
"#### Heading four"
Numbered lists:
bold text
**bold text**
italic text
*italic text*
More here: https://daringfireball.net/projects/markdown/syntax
(This part is a little bit manual)
All notebooks are technically public by default. In order to share the public (non-executable) version of any notebook on paws.wikimedia.org, you need to manually change the URL.
import requests

ENDPOINT = 'https://en.wikipedia.org/w/api.php'

# Ask for every revision of one article inside a 24-hour window.
parameters = { 'action' : 'query',
               'prop' : 'revisions',
               'titles' : 'Panama_Papers',
               'format' : 'json',
               'rvdir' : 'newer',
               'rvlimit' : 500,
               'rvstart': '2016-04-03T17:59:05Z',
               'rvend' : '2016-04-04T17:59:05Z',
               'continue' : '' }

num_revisions = 0

done = False
while not done:
    # Each request returns at most rvlimit revisions.
    wp_call = requests.get(ENDPOINT, params=parameters)
    response = wp_call.json()

    pages = response['query']['pages']

    for page_id in pages:
        page = pages[page_id]
        revisions = page['revisions']
        for revision in revisions:
            num_revisions += 1
        print('Done one query, num revisions is now ' + str(num_revisions))

    # If the API reports more results, copy the continuation values
    # into the next request; otherwise stop.
    if 'continue' in response:
        parameters['continue'] = response['continue']['continue']
        parameters['rvcontinue'] = response['continue']['rvcontinue']
    else:
        done = True

print(parameters['titles'] + ' had ' + str(num_revisions) + ' revisions in the first 24 hours')
Done one query, num revisions is now 500
Done one query, num revisions is now 607
Panama_Papers had 607 revisions in the first 24 hours
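The continuation loop above can also be packaged as a reusable function. The following is only a sketch under stated assumptions: `count_revisions`, `fake_fetch` and `fake_batches` are hypothetical names, and the canned responses are made-up stand-ins shaped like the API's JSON so that the logic can be tried without a network connection. Unlike the loop above, it treats a page without a `revisions` key as having zero revisions.

```python
def count_revisions(fetch, parameters):
    """Count revisions across all result batches.

    `fetch` is any function that takes a parameter dict and returns
    the decoded JSON response as a dict.
    """
    params = dict(parameters)  # copy so the caller's dict is not modified
    total = 0
    while True:
        response = fetch(params)
        for page in response['query']['pages'].values():
            # Assumption: a page with no 'revisions' key counts as zero.
            total += len(page.get('revisions', []))
        if 'continue' not in response:
            return total
        # Copy every continuation value (e.g. 'continue', 'rvcontinue')
        # into the next request.
        params.update(response['continue'])


# Hypothetical canned responses standing in for the live API.
fake_batches = [
    {'query': {'pages': {'1': {'revisions': [{}, {}, {}]}}},
     'continue': {'continue': '||', 'rvcontinue': 'token-1'}},
    {'query': {'pages': {'1': {'revisions': [{}, {}]}}}},
]


def fake_fetch(params):
    # Serve the second batch once a continuation token has been sent.
    return fake_batches[1] if 'rvcontinue' in params else fake_batches[0]


total = count_revisions(fake_fetch, {'action': 'query', 'continue': ''})
print(total)  # 3 revisions in the first batch + 2 in the second = 5
```

To run it against the real API, `fetch` could be something like `lambda p: requests.get(ENDPOINT, params=p).json()`, using the `ENDPOINT` and `parameters` defined above.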
Once you've uploaded a file to your PAWS fileserver, you can open it from your Python code the usual way, since it's in the same directory as your notebook.
NAMES_LIST = "yob2011_short.txt"
boys = {}
girls = {}
for line in open(NAMES_LIST, 'r').readlines():
    print(line)
    name, gender, count = line.strip().split(",")
    count = int(count)
    if gender == "F":
        girls[name.lower()] = count
    elif gender == "M":
        boys[name.lower()] = count
Sophia,F,21799 Isabella,F,19850 Emma,F,18761 Olivia,F,17286 Ava,F,15471 Emily,F,14228 Abigail,F,13221 Madison,F,12351 Mia,F,11503 Chloe,F,10966 Elizabeth,F,10050 Ella,F,9567 Addison,F,9286 Natalie,F,8620 Lily,F,8164 Grace,F,7613 Samantha,F,7375 Avery,F,7331 Sofia,F,7314 Aubrey,F,7167 Brooklyn,F,7151 Lillian,F,6900 Victoria,F,6874 Evelyn,F,6695 Hannah,F,6547 Alexis,F,6508 Charlotte,F,6414 Zoey,F,6388 Leah,F,6372 Amelia,F,6356 Zoe,F,6287 Hailey,F,6258 Gabriella,F,6079 Layla,F,6071 Nevaeh,F,6068 Kaylee,F,6027 Alyssa,F,5996 Anna,F,5641 Sarah,F,5532 Allison,F,5447 Savannah,F,5433 Ashley,F,5392 Audrey,F,5206 Taylor,F,5184 Brianna,F,5171 Aaliyah,F,5102 Riley,F,5026 Camila,F,4965 Khloe,F,4942 Zakarri,M,5 Zakhar,M,5 Zakhari,M,5 Zakry,M,5 Zalynn,M,5 Zaman,M,5 Zamaree,M,5 Zamarius,M,5 Zamiel,M,5 Zamiere,M,5 Zandar,M,5 Zandre,M,5 Zandyn,M,5 Zanthony,M,5 Zari,M,5 Zarrion,M,5 Zaryn,M,5 Zathan,M,5 Zaviyon,M,5 Zaya,M,5 Zayen,M,5 Zayir,M,5 Zayvien,M,5 Zecheriah,M,5 Zeid,M,5 Zeik,M,5 Zell,M,5 Zeph,M,5 Zephram,M,5 Zephyrus,M,5 Zepplin,M,5 Zerik,M,5 Zeryk,M,5 Zeyd,M,5 Zeyden,M,5 Zhair,M,5 Zhi,M,5 Zidaan,M,5 Zihan,M,5 Zihao,M,5 Ziheir,M,5 Zimri,M,5 Zyerre,M,5 Zykell,M,5 Zylar,M,5 Zylas,M,5 Zyran,M,5 Zyshawn,M,5 Zytavion,M,5
print(girls['sophia'])
21799
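The same parsing can be sketched with Python's standard `csv` module, which does the comma splitting for you. This is an illustrative variant rather than the notebook's own method: `load_names` is a hypothetical helper, and the three sample rows are copied from the output above so the example runs without the real file.

```python
import csv
import io


def load_names(lines):
    """Split name,gender,count rows into two dicts keyed by lowercase name."""
    boys, girls = {}, {}
    for row in csv.reader(lines):
        if len(row) != 3:
            continue  # skip blank or malformed rows
        name, gender, count = row
        if gender == "F":
            girls[name.lower()] = int(count)
        elif gender == "M":
            boys[name.lower()] = int(count)
    return boys, girls


# Three rows copied from the output above, standing in for the real file.
sample = io.StringIO("Sophia,F,21799\nIsabella,F,19850\nZeph,M,5\n")
boys, girls = load_names(sample)
print(girls["sophia"])  # 21799
print(boys["zeph"])     # 5
```

With the uploaded file, the call would look like `with open(NAMES_LIST, newline="") as f: boys, girls = load_names(f)`, which also closes the file once the block finishes.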
The site that hosts these notebooks (called "WMF Labs") is run by the Wikimedia Foundation and governed by the following Terms of Use: https://wikitech.wikimedia.org/wiki/Wikitech:Labs_Terms_of_use
Of these, the most relevant to us are the rules about the data that can be hosted on WMF Labs servers. Please do NOT place any of the following types of data in your notebook or your home directory:
This means you are NOT allowed to upload (e.g. from a CSV) or download (e.g. a JSON dump from an API query) the following types of data to your PAWS notebooks or home directory:
Failure to comply with these rules may lead to your data or notebooks being deleted and/or your Wikipedia account being blocked.
If you are working with proprietary and/or private data, or you simply don't want to use PAWS, you can run Jupyter on your own computer and share your notebooks through GitHub, which displays them as rendered pages but does not execute them. More information here: https://github.com/blog/1995-github-jupyter-notebooks-3