Jupyter Notebooks

Jupyter notebooks are a coding environment for Python (and several other programming languages). As with a code editor such as Sublime Text, or with your Terminal or PowerShell, you can run code in a notebook and see the output printed below it.

Jupyter notebooks have several other benefits:

  1. You can run individual blocks of code one at a time.
  2. If you run code that prints any output, that output will be saved until the next time you run that block of code.
  3. You can publish your notebook publicly so that others can see your code and output in one place, accessed through a stable URL.
  4. You can easily import someone else's notebook, and customize their code.

Running code

Any Python code that you can run from your text editor or terminal can also be run in a Jupyter notebook.

When you press SHIFT+ENTER, the code in the selected block is executed. Anything your code prints, or the error message if your code raises an error, is displayed below the code block.
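As a minimal sketch of what a code block might contain, the example below prints one line of output and then deliberately triggers an error. The variable names and numbers are made up for illustration. The error is caught so that the whole block can run; if you left it uncaught, the notebook would show a traceback below the block instead.

```python
# Hypothetical example data, not taken from the course materials.
page_views = [120, 340, 95]

total = sum(page_views)
print("Total views:", total)

# Dividing by zero raises a ZeroDivisionError. We catch it here so
# the block finishes running and prints the error message itself.
try:
    average = total / 0
except ZeroDivisionError as error:
    message = f"Error: {error}"
    print(message)
```

Try removing the try/except lines and running the block again to see what an uncaught error looks like in the notebook.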

Formatting text

You can insert blocks of text in your notebook, and format them using a plaintext formatting language called Markdown.

Heading one

"# Heading one"

Heading two

"## Heading two"

Heading three

"### Heading three"

Heading four

"#### Heading four"

Numbered lists:

  1. item one
  2. item two
  3. item three

bold text

**bold text**

italic text

*italic text*

More here: https://daringfireball.net/projects/markdown/syntax

Forking (copying) a Notebook

  1. Get the URL of another public PAWS notebook (example: https://paws.wmflabs.org/paws/user/Jtmorgan/notebooks/DS4UX%20Jupyter%20intro.ipynb)
  2. Add the parameter ?format=raw to the end of the URL to download the raw .ipynb file: https://paws.wmflabs.org/paws/user/Jtmorgan/notebooks/DS4UX%20Jupyter%20intro.ipynb?format=raw
  3. Log in to your PAWS account and use "upload" to upload this copy into your own directory.
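If you prefer to build the download URL in code rather than by hand, the step of adding the raw parameter could be sketched like this. The function name raw_download_url is a hypothetical helper written for this example; it is not part of PAWS or Jupyter.

```python
def raw_download_url(notebook_url: str) -> str:
    """Return the notebook URL with format=raw added to its query string."""
    # Use "&" if the URL already has a query string, otherwise start one with "?".
    separator = "&" if "?" in notebook_url else "?"
    return notebook_url + separator + "format=raw"


example = "https://paws.wmflabs.org/paws/user/Jtmorgan/notebooks/DS4UX%20Jupyter%20intro.ipynb"
raw_url = raw_download_url(example)
print(raw_url)
```

You can paste the printed URL into your browser to download the file, assuming the notebook is still available at that address.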

Publishing a Notebook

(This part is a little bit manual)

All notebooks are technically public by default. In order to share the public (non-executable) version of any notebook on paws.wikimedia.org, you need to manually change the URL.

  1. Go to a Notebook (example: https://paws.wmflabs.org/paws/user/Jtmorgan/notebooks/DS4UX%20Jupyter%20intro.ipynb)
  2. Change "paws" to "paws-public" in both places where it appears in the URL-encoded version of the URL (example: https://paws-public.wmflabs.org/paws-public/User:Jtmorgan/DS4UX%20Jupyter%20intro.ipynb)
  3. Share the new version of the URL with anyone you want to be able to view the notebook. Every time you "save" your original notebook, the public version will reflect those changes.
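The URL change can also be sketched in code. Note that the example public URL in step 2 differs from the original in more than the two "paws" substitutions: the path uses "User:Jtmorgan" in place of "user/Jtmorgan/notebooks". The sketch below reproduces that example URL, which is an assumption drawn only from the example and may not match how PAWS builds public URLs today. The function name public_url is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def public_url(private_url: str) -> str:
    """Rewrite a PAWS notebook URL into the public form shown in step 2.

    Assumption: private paths look like /paws/user/<name>/notebooks/<file>
    and public paths look like /paws-public/User:<name>/<file>.
    """
    parts = urlsplit(private_url)
    # Host: paws.wmflabs.org -> paws-public.wmflabs.org
    host = parts.netloc.replace("paws.", "paws-public.", 1)

    # Path pieces: ["", "paws", "user", "<name>", "notebooks", "<file>", ...]
    segments = parts.path.split("/")
    if (len(segments) < 6
            or segments[1:3] != ["paws", "user"]
            or segments[4] != "notebooks"):
        raise ValueError("URL does not look like a PAWS notebook URL")

    username = segments[3]
    file_path = "/".join(segments[5:])  # keeps the %20 encoding untouched
    path = f"/paws-public/User:{username}/{file_path}"
    return urlunsplit((parts.scheme, host, path, "", ""))


private = "https://paws.wmflabs.org/paws/user/Jtmorgan/notebooks/DS4UX%20Jupyter%20intro.ipynb"
shared = public_url(private)
print(shared)
```

Always open the resulting URL yourself before sharing it, to confirm that it points at the public version of your notebook.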

Query an API

Import files

Once you've uploaded a file to your PAWS fileserver, you can import it into your Python code the usual way, since it's in the same directory.
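For example, reading an uploaded CSV file might look like the sketch below. The file name example_data.csv and its contents are made up for illustration. The first part writes a tiny sample file so that the block runs on its own; skip that part when you have uploaded a real file.

```python
import csv

# Hypothetical file name; replace it with the name of the file you uploaded.
filename = "example_data.csv"

# Create a small sample file so this sketch is self-contained.
with open(filename, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["title", "views"])
    writer.writerow(["Seattle", "120"])
    writer.writerow(["Tacoma", "95"])

# Read the file back. No folder path is needed because the file
# sits in the same directory as the notebook.
with open(filename, newline="", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))

for row in rows:
    print(row["title"], row["views"])
```

csv.DictReader uses the first row as column names, so each row can be looked up by name. Note that every value is read as text, so "120" would need int() before you do arithmetic with it.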

IMPORTANT: data licensing and privacy

The site that hosts these notebooks (called "WMF Labs") is run by the Wikimedia Foundation and governed by the following Terms of Use: https://wikitech.wikimedia.org/wiki/Wikitech:Labs_Terms_of_use

Of these, the most relevant to us are the rules about the data that can be hosted on WMF Labs servers. Please do NOT place any of the following types of data in your notebook or your home directory:

This means you should NOT upload (e.g. from a CSV) or download (e.g. a JSON dump from an API query) the following types of data to your PAWS notebooks or home directory:

Failure to comply with these rules may lead to your data or notebooks being deleted and/or your Wikipedia account being blocked.

Remember: everything you put into your notebook is publicly accessible!

Jupyter notebooks on GitHub

If you are working with proprietary and/or private data, or you simply don't want to use PAWS, you can also run Jupyter notebooks on your own computer and keep them on GitHub, which displays .ipynb files as rendered notebooks. More information here: https://github.com/blog/1995-github-jupyter-notebooks-3

Some example notebooks