Accessing the new replicas, changes from the previous cluster

  1. Old cluster, using environment variables
  2. New cluster, using credentials file and host names
  3. Diff between the two versions
  4. Other links and contact info

Old cluster, using environment variables

Previously, we would connect to the databases using information in environment variables, like so:
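As a sketch of that old approach, here is a version reconstructed from the diff further down this page. It assumes `host` is passed to `pymysql.connect` alongside the user and password. The function names `old_connection_settings` and `fetch_titles_old` are ours, introduced only so the example can be loaded without opening a connection.

```python
import os


def old_connection_settings(environ=os.environ):
    """Collect the old-cluster connection settings from environment variables."""
    return {
        "host": environ["MYSQL_HOST"],
        "user": environ["MYSQL_USERNAME"],
        "password": environ["MYSQL_PASSWORD"],
    }


def fetch_titles_old(pattern="%Alicante%"):
    """Run the example query against eswiki_p the way the old cluster required."""
    import pymysql  # imported lazily so the helper above works without it

    query = "SELECT page_title FROM page WHERE page_title LIKE %s LIMIT 5;"
    conn = pymysql.connect(**old_connection_settings())
    try:
        with conn.cursor() as cur:
            # The proxy did not select a database, so we had to do it by hand.
            cur.execute("USE eswiki_p;")
            cur.execute(query, args=(pattern,))
            # page_title comes back as bytes, so decode each row.
            return [str(row[0], encoding="utf-8") for row in cur.fetchall()]
    finally:
        conn.close()
```

Calling `fetch_titles_old()` would return up to five page titles containing "Alicante", provided the three environment variables are set.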

Because of issues with the proxy that MYSQL_HOST pointed to, and because we now have to connect directly to the database we want to query, the environment variables can no longer be used.

New cluster, using credentials file and host names

Instead of using the environment variables, the connection will be very similar to what is done in other environments like Toolforge (*):

Here is the same example from above:
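The following is a sketch of the new-cluster version, again reconstructed from the diff on this page. The host name is left as a required argument because the source only shows the pattern `{wiki}.{analytics,web}` and an empty string. The function names `new_connection_settings` and `fetch_titles` are ours and purely illustrative.

```python
def new_connection_settings(host, database="eswiki_p", credentials=".my.cnf"):
    """Build the keyword arguments for pymysql.connect on the new cluster."""
    return {
        "host": host,  # host of the specific wiki database you want to query
        "read_default_file": credentials,  # user and password come from this file
        "database": database,  # selected at connect time, so no USE statement
    }


def fetch_titles(host, pattern="%Alicante%"):
    """Run the example query against eswiki_p using the credentials file."""
    import pymysql  # imported lazily so the helper above works without it

    query = "SELECT page_title FROM page WHERE page_title LIKE %s LIMIT 5;"
    conn = pymysql.connect(**new_connection_settings(host))
    try:
        with conn.cursor() as cur:
            cur.execute(query, args=(pattern,))
            # page_title comes back as bytes, so decode each row.
            return [str(row[0], encoding="utf-8") for row in cur.fetchall()]
    finally:
        conn.close()
```

The user name and password no longer appear in the code: `read_default_file` tells pymysql to read them from the credentials file.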

Diff between the two versions

Here is the diff of the two code snippets to help you see what changes:

- import os
  import pymysql

- host = os.environ['MYSQL_HOST']
- user = os.environ['MYSQL_USERNAME']
- password = os.environ['MYSQL_PASSWORD']
+ # Host urls are like {wiki}.{analytics,web}
+ host = ""
+ credentials = ".my.cnf"

  query = "SELECT page_title FROM page WHERE page_title LIKE %s LIMIT 5;"

  conn = pymysql.connect(
      host=host,
-     user=user,
-     password=password
+     read_default_file=credentials,
+     database="eswiki_p"
  )
  with conn.cursor() as cur:
-     cur.execute("USE eswiki_p;")
      cur.execute(query, args=('%Alicante%',))
      data = cur.fetchall()
      for row in data:
          print(str(row[0], encoding='utf-8'))


For a small example that connects to different databases on the new cluster, see the notebook "Accessing Wikireplicas from PAWS".
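As a rough sketch of what connecting to several databases involves (this is not the notebook's code), the helper below builds one set of connection settings per wiki. Two things are assumptions: the `suffix` argument stands in for the remainder of the host name, which this page does not show beyond `{wiki}.{analytics,web}`, and the `{wiki}_p` database name is generalized from the `eswiki_p` example. All function names are hypothetical.

```python
def replica_host(wiki, cluster, suffix):
    """Build a host name of the form {wiki}.{cluster}{suffix}."""
    if cluster not in ("analytics", "web"):
        raise ValueError("cluster must be 'analytics' or 'web'")
    # suffix is supplied by the caller; it is not given on this page.
    return f"{wiki}.{cluster}{suffix}"


def connection_settings_for(wikis, cluster, suffix, credentials=".my.cnf"):
    """Return a dict mapping each wiki to its pymysql.connect keyword arguments."""
    return {
        wiki: {
            "host": replica_host(wiki, cluster, suffix),
            "read_default_file": credentials,
            "database": f"{wiki}_p",  # assumed naming, following eswiki_p
        }
        for wiki in wikis
    }
```

Each entry can then be passed on as `pymysql.connect(**settings[wiki])`, which opens one connection per database instead of one shared connection through the old proxy.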

If you'd like to see more examples or have questions, reach out to #wikimedia-cloud on freenode IRC or email