Getting started with PAWS¶

Table of Contents¶

  • Introduction to Jupyter and PAWS
  • Getting started with PAWS
  • Quickstart
  • Finding your way around the control panel
  • Create a new notebook
  • Title your notebook
  • Save and download your notebook
  • Share your notebook
  • Fork a notebook
  • Best practices
  • PAWS and Jupyter documentation
  • Additional resources

Introduction to Jupyter and PAWS¶

Jupyter Notebooks¶

Jupyter Notebooks are open-source web apps that allow you to create and share documents that contain live code, equations, visualizations, and text. They are flexible and have many uses: a notebook can function as a lightweight, browser-based development environment that lets you execute code and display text, equations, images, and more, all on the same page.

Uses include:

  • Writing and running live code
  • Creating documentation and tutorials
  • Data cleaning, transformation, and analysis
  • Writing and iterating on Python code
  • Writing and running SQL queries
  • Writing and running resource-light bots
  • Much more...

Little or no programming skill is required to use Jupyter notebooks. They are used by a robust community of users in technology and the sciences. There are many examples and resources for new and advanced users to draw on, making notebooks a powerful tool for users along the technical spectrum.

PAWS (A Web Shell)¶

PAWS: A Web Shell (PAWS) is a service that hosts Jupyter notebooks for use by Wikimedia's contributors. PAWS users can launch, publish, and fork notebooks without having to install Jupyter on a local computer. Users only need a Wikimedia SUL account and an internet-connected web browser to use the service.

PAWS makes it easier for volunteers along the technical spectrum to work in technical spaces and contribute to Wikimedia's technical projects.

Some things you can use PAWS for in Wikimedia technical projects include:

  • Creating documentation and tutorials
  • Performing queries against replica databases
  • Writing and running scripts using Pywikibot to help support Wikimedia projects (Note: for heavy-duty or scheduled jobs, use Toolforge instead.)
  • Keeping notes on your work
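As a rough illustration of the Pywikibot use case in the list above, the sketch below reads the start of a wiki page's text. It assumes Pywikibot is importable in your notebook environment. The function name `fetch_page_intro`, its parameters, and the example page title are hypothetical, chosen only for illustration.

```python
def fetch_page_intro(title, lang="en", family="wikipedia", max_chars=300):
    """Return the first max_chars characters of a page's wikitext."""
    # Imported inside the function so the definition itself loads even
    # where Pywikibot is not installed (an assumption about your setup).
    import pywikibot

    site = pywikibot.Site(lang, family)  # the wiki to read from
    page = pywikibot.Page(site, title)   # a handle to a single page
    return page.text[:max_chars]         # page.text holds the wikitext


# Example call (needs network access, so it is left commented out):
# print(fetch_page_intro("Project Jupyter"))
```

A read-only request like this is light enough for a notebook. Anything scheduled or long-running belongs on Toolforge, as noted above.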

Getting started with PAWS¶

Quickstart¶

1) Launch PAWS in your browser.

2) Sign in with MediaWiki.

3) Allow OAuth permission.

4) Party! You're in! 😄

Finding your way around the control panel¶

You can create and navigate your notebooks and files from the PAWS control panel. Now that you are in the PAWS control panel, you may want to take some time to have fun and explore on your own.

In the upper left corner, you'll find some options for existing notebooks:

The "Files" tab provides you with several options for actions you can take with your notebooks.

The "Running" tab lets you see which terminals and notebooks you have running, what kernel they are on, and their status.

In the upper right corner, you'll find an option to upload or create a new notebook:

As you can see, you have the option to use several different programming languages: Bash, Python 3, and R. You can also create a text file or folder or run a terminal.

Create a new notebook¶

For the purpose of this tutorial, we'll create a Python 3 notebook and navigate through some of the basics of how to use it.

From your PAWS control panel create a new Python 3 notebook.

Your new notebook¶

Your brand new notebook will start with one empty Code cell by default:

Cells¶

Notebooks are organized into separate cells that can contain text or executable code. Each cell is a self-contained unit that can be run on its own. You can change the position of cells within the notebook, merge cells, and insert new cells anywhere you like, allowing you to intermix code and text.

You can run code alone in a cell or run all the cells in order to perform more complex operations.

Note: The order of the cells in the notebook and the order in which they are run are both important. If you run all cells at once, the notebook runs them from top to bottom. However, because you can run cells separately, you could run them out of order (especially if you go back and revise a cell you already created), and this can affect the outcome of your operation.
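To make the effect of run order concrete, here is a minimal sketch in plain Python. Each function stands in for one notebook cell and the `state` dictionary stands in for the variables the kernel keeps in memory. These names are hypothetical and are not part of Jupyter itself.

```python
# `state` plays the role of the kernel's memory, shared by all cells.
state = {}


def cell_1():
    state["total"] = 10  # define a starting value


def cell_2():
    state["total"] = state["total"] + 5  # every run adds 5 again


def cell_3():
    return state["total"]  # report the current value


# Run from top to bottom, as running all cells would.
cell_1()
cell_2()
in_order = cell_3()

# Re-run only "cell 2", then "cell 3": same code, different result.
cell_2()
out_of_order = cell_3()

print(in_order, out_of_order)  # prints: 15 20
```

If a notebook starts giving surprising results, restarting the kernel and running all cells from the top puts it back into a known state.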

Cell Options¶

You can change the nature of the cell from the dropdown on the menu:

  • Code: New cells are code cells by default. You can use this cell to run code in the programming language you've chosen for your notebook.
  • Markdown: Use this cell for documentation, links, and images. Markdown cells hold body text written in GitHub-flavored Markdown; HTML is also supported.
  • Raw NBConvert: You can use Raw NBConvert cells to write output directly or save code that you don’t want to run. These cells do not support formatting and cannot be run.
  • Heading: You can create headers with Markdown or with the heading cell. Headers give your notebook a little more organizational structure.
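The cell types above also appear in the notebook file itself, which is a JSON document. The sketch below builds a tiny notebook by hand, assuming version 4 of the notebook format, where each cell carries a `cell_type` of `code`, `markdown`, or `raw`. The helper `make_cell` is a hypothetical name used only for this example.

```python
import json


def make_cell(cell_type, source):
    """Build one cell in the shape used by notebook format version 4."""
    cell = {"cell_type": cell_type, "metadata": {}, "source": source}
    if cell_type == "code":
        # Code cells also record their outputs and an execution counter.
        cell["execution_count"] = None
        cell["outputs"] = []
    return cell


notebook = {
    "cells": [
        make_cell("markdown", "# My analysis\nThis notebook counts things."),
        make_cell("code", "print(1 + 1)"),
        make_cell("raw", "Passed through unchanged when the notebook is converted."),
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

cell_types = [cell["cell_type"] for cell in notebook["cells"]]
print(cell_types)  # prints: ['markdown', 'code', 'raw']

# The same structure, serialized the way an .ipynb file stores it.
notebook_json = json.dumps(notebook, indent=1)
```

In day-to-day use you switch cell types from the dropdown rather than editing JSON. The sketch only shows what that choice changes in the file.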

Menu options¶

From the top menu you can save, add, remove, copy, move, and run cells. There are keyboard shortcuts for each of these functions, as well.

Click the command palette button to find a comprehensive list of commands available to you.

Title your notebook¶

The first thing you want to do is give your notebook a descriptive title. This will help you find it easily in your control panel. Since all notebooks hosted on PAWS are publicly available, this will allow others to understand the content of your notebook quickly, too.

You can do this by clicking the "Untitled" text at the top left of your screen and entering a new title.

You can change the name of your notebook at any time, either by clicking the title from within your notebook or from your PAWS control panel.

Save and download your notebook¶

While working in PAWS, your notebook will be saved automatically from time to time. You can view when the notebook was last checkpointed and saved at the top of your screen.

You can also save and checkpoint your notebook from the File menu.

You can also download your notebook in a number of file formats. This can be helpful if you are planning on using your notebook for other purposes later.
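One thing you can do with a downloaded `.ipynb` file is read it with Python's standard library, since the file is JSON. The sketch below collects the code cells into a single script. It assumes notebook format version 4, and the `sample` notebook is a made-up stand-in for a real download so that the example runs on its own.

```python
import json
import tempfile
from pathlib import Path


def extract_code(notebook_path):
    """Return the source of every code cell, joined into one script."""
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip Markdown and raw cells
        source = cell.get("source", "")
        # The format allows source as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source)
    return "\n\n".join(chunks)


# Made-up stand-in for a downloaded notebook.
sample = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Notes"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["x = 2\n", "y = x * 3"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": "print(y)"},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example.ipynb"
    path.write_text(json.dumps(sample), encoding="utf-8")
    script = extract_code(path)

print(script)
```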

Share your notebook¶

There are several ways to share your notebook.

1) Click the "Public Link" button

2) Alter the URL manually to get the public link to your notebook: https://public.paws.wmcloud.org/User:YOURUSERNAME/YOURNOTEBOOK.ipynb
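If you share notebooks often, the URL pattern above can be filled in with a small helper. This is only a sketch: `public_link` is a hypothetical name, and percent-encoding spaces and other special characters is an assumption about how such names should appear in the URL.

```python
from urllib.parse import quote

PUBLIC_BASE = "https://public.paws.wmcloud.org"


def public_link(username, notebook_path):
    """Build a public PAWS link following the URL pattern shown above."""
    user = quote(username, safe="")
    # Keep "/" unescaped so notebooks inside folders keep their path.
    path = quote(notebook_path, safe="/")
    return f"{PUBLIC_BASE}/User:{user}/{path}"


print(public_link("YOURUSERNAME", "YOURNOTEBOOK.ipynb"))
# prints: https://public.paws.wmcloud.org/User:YOURUSERNAME/YOURNOTEBOOK.ipynb
```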

Fork a notebook¶

If you want to build on the work of an existing public notebook, you can create a copy of it for your personal use (AKA "fork"), and upload it to your PAWS control panel.

1) Get the URL of another public PAWS notebook. Example: https://public.paws.wmcloud.org/User:YOURUSERNAME/YOURNOTEBOOK.ipynb

2) Add ?format=raw to the end of the URL to download a raw .ipynb file. Example: https://public.paws.wmcloud.org/User:YOURUSERNAME/YOURNOTEBOOK.ipynb?format=raw

3) Log into your PAWS account and use the Upload button to upload this copy into your own control panel.
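Steps 1 and 2 can also be scripted. The sketch below adds the `format=raw` parameter to a public notebook URL and, optionally, saves the result to a file. The function names `raw_url` and `download_notebook` are hypothetical, and the example URL uses the placeholder form from the sharing section. The download itself needs network access, so it is defined here but not called.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import urlopen


def raw_url(public_url):
    """Return public_url with format=raw set in its query string."""
    parts = urlsplit(public_url)
    query = dict(parse_qsl(parts.query))
    query["format"] = "raw"
    return urlunsplit(parts._replace(query=urlencode(query)))


def download_notebook(public_url, destination):
    """Save the raw .ipynb behind public_url to the destination path."""
    with urlopen(raw_url(public_url)) as response:
        data = response.read()
    with open(destination, "wb") as handle:
        handle.write(data)
    return destination


example = "https://public.paws.wmcloud.org/User:YOURUSERNAME/YOURNOTEBOOK.ipynb"
print(raw_url(example))
# prints: https://public.paws.wmcloud.org/User:YOURUSERNAME/YOURNOTEBOOK.ipynb?format=raw
```

Going through the query string rather than appending text means a URL that already carries `?format=raw` is left with a single copy of the parameter.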

Best practices¶

  • PAWS is a service for the Wikimedia Technical community and uses the resources of Wikimedia Cloud Services. You should use it to host notebooks that support that community.
  • All notebooks hosted on PAWS are available to the public. Write your notebook with that in mind. Use Markdown cells between code cells to document what you are doing. This will be useful for others who want to perform similar tasks (and for you later when you return to the notebook).
  • Don't use passwords or private SSH keys with PAWS. For SSH, use phone- or computer-based clients instead of the PAWS terminal.
  • Follow the Python style guide for your code.
  • Avoid using PAWS to run complex operations that require a lot of heavy processing. Consider using Toolforge instead.
  • Pay attention to the order in which you run your cells.
  • Save time by creating a template.ipynb that includes imports of packages, settings, and code snippets you frequently use. You can use Duplicate when you are ready to create a new notebook.
  • Include a licensing statement in your notebook, so others will know whether and how they can reuse the material you publish in it. This website can help you decide which license would work best for you.

Example of a license statement¶

Licensing

Copyright 2020

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

PAWS and Jupyter documentation¶

  • PAWS documentation on Wikitech. Learn more about why PAWS might be the right tool for you, using PAWS and Pywikibot to create scripts to help with automated tasks on wikis, report bugs and request features, and more.
  • Jupyter documentation on readthedocs. Learn more about Jupyter notebooks in general. This documentation is comprehensive and can answer many questions you might have about the capabilities of Jupyter notebooks.

Additional resources¶

  • Getting started with Jupyter notebooks
  • A list of keyboard shortcuts for Jupyter notebooks
  • Markdown cheatsheet
  • Basic Python syntax and plotting
  • Basic Python for ecology
  • Lesser known ways of using notebooks