Python Data Access patterns, part 1

Recently I’ve been reading Clifton Nock’s book Data Access Patterns: Database Interactions in Object-Oriented Applications, which as you’d expect from the title, covers a lot of patterns to do with data access. The first part of the book is dedicated to decoupling patterns, which seems to me to be pretty fundamental to good design of applications which connect with databases. If you want an application to interact with a database, then in general it’s a good idea to place the code handling data access into a separate component. The idea is to encapsulate physical database calls in logical operations, which the application can then use, as opposed to mixing them into the rest of the application code.

The benefits of keeping application code (code which needs to know about objects in the application domain) separate from data access code (code which needs to know about how to get at the database) are many. Maintenance is a lot easier – by exposing only logical operations to your application you can change anything related to the detail of how these are implemented without worrying about breaking code elsewhere in the application. The application code itself is cleaner, as it can focus on the objects without worrying about how these are stored physically. This also affords a nice division of labour, so you could have someone working on the business logic in the application who knows nothing of the database behind it.

Here’s an example of how such a data access component could be implemented in Python, where the application is interacting with a SQLite database. The DataAccessor class (apologies to Clifton Nock here, I both stole the class name and modelled the code pretty closely on his Java version) handles connecting to the database and exposes the operations the application might require: read, insert, update, delete. For the sake of brevity I’ve only included the read and insert operations here – update and delete follow a pretty similar pattern to insert.

import apsw
import os

def sql_trace(stmt, bindings):
    'Echoes all SQL executed'
    print "SQL:", stmt
    if bindings:
        print "Bindings:", bindings
    return True

class DataAccessor(object):
    '''
    Class to handle data access using apsw sqlite wrapper
    '''
    def __init__(self, dbpath, echo=False):
        try:
            if os.path.exists(dbpath):
                self.conn = apsw.Connection(dbpath)
                self.cur = self.conn.cursor()
                if echo:
                    self.cur.setexectrace(sql_trace)
            else:
                raise IOError('Database not found: ' + dbpath)
        except apsw.CantOpenError as detail:
            print "Unable to open db file: ", dbpath, detail
            raise

    def read(self, table, columns=None, where_row=None, sort_cols=None):
        '''Executes a SELECT statement against table.

        Arguments:
        table                 -- name of the table to be read
        columns (optional)    -- list of columns to be read
                              from table
        where_row (optional)  -- dict used to build WHERE
                              clause
        sort_cols (optional)  -- list of (column, order) pairs
                              used to specify order of the
                              rows returned. Needs to be of
                              the form ('<column>', 'ASC'|'DESC')

        Returns: rows returned from the SELECT statement.
        '''
        try:
            stmt = 'SELECT '
            if columns:
                stmt += ', '.join(columns)
            else:
                stmt += '*'

            # from clause
            stmt += "\nFROM " + table

            # where clause
            if where_row:
                stmt += "\nWHERE "
                stmt += "\n  AND ".join([col + "=:" + col \
                                    for col in where_row])

            # order clause
            if sort_cols:
                stmt += "\nORDER BY "
                stmt += ', '.join([col[0] + ' ' + col[1] \
                                    for col in sort_cols])

            stmt += ';'

            # submit and return results
            args = where_row and (stmt, where_row) or (stmt,)

            results = columns and [dict(zip(columns, row)) \
                for row in self.cur.execute(*args)] \
                or [row for row in self.cur.execute(*args)]

            return results

        except apsw.SQLError as sql:
            print 'Error in SQL submitted:', sql
            print 'SQL:', stmt
            if where_row:
                print 'Bindings:', where_row

        except apsw.Error as error:
            print 'APSW Error: ', error

        except Exception as error:
            print 'Error reading from database:', error

        finally:
            self.cur.close()

    def insert(self, table, values):
        '''Executes an INSERT statement against table.

        Arguments:
        table           -- name of the table to be written to
        values          -- list of rows (dicts) to be inserted

        Returns: None
        '''
        try:
            # build list of column names
            cols = values[0].keys()

            # generate insert statement
            stmt = 'INSERT INTO ' + table + ' ('
            stmt += ', '.join(cols)
            stmt += ') VALUES ('
            stmt += ', '.join([":%s" % col for col in cols])
            stmt += ')'

            # submit

            self.cur.execute('BEGIN IMMEDIATE')
            self.cur.executemany(stmt, values)
            self.cur.execute('COMMIT')

            return self.conn.changes()

        except apsw.SQLError as sql:
            print 'Error in SQL submitted:', sql
            print 'SQL:', stmt
            self.cur.execute('ROLLBACK')

        except apsw.Error as error:
            print 'APSW Error: ', error
            self.cur.execute('ROLLBACK')

        except Exception as error:
            print 'Error submitting insert:', error
            self.cur.execute('ROLLBACK')

        finally:
            self.cur.close()

I’ve used the apsw SQLite wrapper here, but if at some point I decided to switch to pysqlite, or to use a MySQL database, I could do so without greatly affecting the calling code. The potential drawback here is that what you gain is offset by the loss of control. It may well be that all your application needs to do is to read from or write to a single table at a time, but what about if it needs to execute a join? Drop or create a table or view? What if you need to do specify a more complex where condition than “column=value”?

It may well be possible to rewrite this class to expose these operations, but I could see this getting to be a lot of work. In the next part I’ll look at some ORM (Object Relational Map) solutions which get around this nicely by mapping an OO structure to a relational model.

Advertisements