Previously, we would connect to the databases using information in environment variables, like so:
import os
import pymysql
host = os.environ['MYSQL_HOST']
user = os.environ['MYSQL_USERNAME']
password = os.environ['MYSQL_PASSWORD']
query = "SELECT page_title FROM page WHERE page_title LIKE %s LIMIT 5;"
conn = pymysql.connect(
host=host,
user=user,
password=password
)
with conn.cursor() as cur:
cur.execute("USE eswiki_p;")
cur.execute(query, args=('%Alicante%',))
data = cur.fetchall()
for row in data:
print(str(row[0], encoding='utf-8'))
conn.close()
500_Kilómetros_Nocturnos_de_Alicante 500_Km_Nocturnos_de_Alicante Abubilla_(escultura_de_Alicante) Acceso_Alicante_desde_Elche Acceso_Noroeste_a_Alicante
Due to issues with the proxy that MYSQL_HOST
pointed to, and due to the fact that now we have to connect to the DB we want to use directly, we won't be able to use the environment variables.
Instead of using the environment variables, the connection will be very similar to what is done in other environments like Toolforge (*):
~/.my.cnf
Here is the same example from above:
import pymysql
# Host urls are like {wiki}.{analytics,web}.db.svc.wikimedia.cloud
host = "eswiki.analytics.db.svc.wikimedia.cloud"
credentials = ".my.cnf"
query = "SELECT page_title FROM page WHERE page_title LIKE %s LIMIT 5;"
conn = pymysql.connect(
host=host,
read_default_file=credentials,
database="eswiki_p"
)
with conn.cursor() as cur:
cur.execute(query, args=('%Alicante%',))
data = cur.fetchall()
for row in data:
print(str(row[0], encoding='utf-8'))
conn.close()
500_Kilómetros_Nocturnos_de_Alicante 500_Km_Nocturnos_de_Alicante Abubilla_(escultura_de_Alicante) Acceso_Alicante_desde_Elche Acceso_Noroeste_a_Alicante
Here is the diff of the two code snippets to help you see what changes:
- import os
import pymysql
- host = os.environ['MYSQL_HOST']
- user = os.environ['MYSQL_USERNAME']
- password = os.environ['MYSQL_PASSWORD']
+ # Host urls are like {wiki}.{analytics,web}.db.svc.wikimedia.cloud
+ host = "eswiki.analytics.db.svc.wikimedia.cloud"
+ credentials = ".my.cnf"
query = "SELECT page_title FROM page WHERE page_title LIKE %s LIMIT 5;"
conn = pymysql.connect(
host=host,
- user=user,
- password=password
+ read_default_file=credentials,
+ database="eswiki_p"
)
with conn.cursor() as cur:
- cur.execute("USE eswiki_p;")
cur.execute(query, args=('%Alicante%',))
data = cur.fetchall()
for row in data:
print(str(row[0], encoding='utf-8'))
conn.close()
If you want to see a small example connecting to different databases with the new cluster, please see the notebook Accessing Wikireplicas from PAWS
If you'd like to see more examples or have questions, reach out to #wikimedia-cloud
on freenode IRC or email cloud@lists.wikimedia.org