Monday, March 12, 2012

SQLAlchemy and Connection Pooling

Since Django opens a MySQL connection for every ORM access, we recently decided to implement connection pooling with SQLAlchemy. The basic instructions, which basically are Django version-specific, are posted here. It works fairly well though you have to put custom specific code in the Database.connect() routine since some of the parameters normally passed are not serializable, which SQLAlchemy needs to generate a unique key within its connection pool.

Another issue is the use of "CREATE DATABASE" and "DROP DATABASE" when connected to the same database. If you happen to issue these statements to which a database that is currently connected to this same DB, you may find doing subsequent SELECT statements cause a "Database not selected" error. For instance, these set of commands fail:

import sqlalchemy
engine = sqlalchemy.create_engine('mysql://username:password@localhost/testdb')
connection = engine.connect()

connection.execute("DROP DATABASE testdb")
connection.execute("CREATE DATABASE testdb")
connection.execute("SHOW TABLES")


But this command works:
import sqlalchemy
engine = sqlalchemy.create_engine('mysql://username:password@localhost/testdb')
connection = engine.connect()

connection.execute("DROP DATABASE testdb")
connection.execute("USE testdb")
connection.execute("CREATE DATABASE testdb")
connection.execute("SHOW TABLES")

The problem also happens in MySQLdb too:

import MySQLdb
conn = MySQLdb.connect (host="localhost", user="username", passwd="password", db="testdb")
cursor = conn.cursor()
cursor.execute ("DROP DATABASE testdb")
cursor.execute("CREATE DATABASE testdb")
cursor.execute("SHOW TABLES")

Django's test runner avoids this issue by requiring that the original database connection to be used and then issuing the CREATE/DROP on the test database inside the create_test_db() function. Then it closes the connection and sets the test database name, which forces a reconnect.

django/db/backends/creation.py:
def create_test_db(self, verbosity=1, autoclobber=False):
        """                                                                                                                                                                                                  
        Creates a test database, prompting the user for confirmation if the                                                                                                               
        database already exists. Returns the name of the test database created.                                                                                                              
        """
        self._create_test_db(verbosity, autoclobber)
        self.connection.close()
        self.connection.settings_dict["NAME"] = test_database_name

    def _create_test_db(self, verbosity, autoclobber):
                    cursor.execute("DROP DATABASE %s" % qn(test_database_name))
                    cursor.execute("CREATE DATABASE %s %s" % (qn(test_database_name), suffix))

It appears that the connection needs to be invalidated in SQLAlchemy using connection.invalidate() for SQLAlchemy. Otherwise, the issue can occur. Normally Django will close the connection and reinitiate but SQLAlchemy won't know to do this unless we explicitly invalidate it.

Since the connection appears to be reused, this issue appears to be caused by not issuing a "USE " between CREATE/DROP. This issue was encountered when rolling out django-nose, which was posted here.

No comments:

Post a Comment