----------------------------------------------------------------- PySQLite: Python DB-API 2.0 Compliant Interface Module for SQLite ----------------------------------------------------------------- This document was last updated for PySQLite version 0.4.3. =============== 0. Front Matter =============== 0.1 Copyright notice and License -------------------------------- (c) 2002 Michael Owens (c) 2002-2003 Gerhard Häring Permission to use, copy, modify, and distribute this software and its documentation for any purpose and without fee is hereby granted, provided that the above copyright notice appear in all copies and that both that copyright notice and this permission notice appear in supporting documentation. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 0.2 Abstract ------------ SQLite is a powerful embedded relational database management system in a compact C library, developed by D. Richard Hipp. It offers support for a large subset of SQL92, multiple tables and indices, transactions, not to mention ODBC drivers. The library is self-contained and implemented in under 20,000 lines of code, which itself is uncopyrighted, and free to use for any purpose. It runs on a wide variety of platforms including Linux, FreeBSD, Windows, and has been ported to the Sharp Zaurus. The goal of this project is to develop a Python extension for SQLite that conforms to the Python Database API Specification 2.0, exposes all SQLite functionality to Python programmers and offers additional ease-of-use features. =============== 1. Introduction =============== 1.1 SQLite and Python --------------------- 1.1.1 Overview ~~~~~~~~~~~~~~ SQLite is a powerful embedded database implemented in a small, compact C library. It has been adapted to command-line utilities, specialized C programs, web servers, PDAs, and programming languages, of which Python is but one. Because of its small size, wide platform availability, liberal copyright license, and solid design, it lends itself to wide range of applications in both open source and commercial software. Programmers who have traditionally used simple BTree databases such as gdbm can now take advantage of using a relational database without adding substantial overhead to their applications. Users who often wrote simple reports in Microsoft Visual Basic and Access can now easily use Python and SQLite for free, and the resulting databases and programs are no longer limited to Windows-only platforms. The very same code will also run just as easily on an Apple Macintosh or Sun UltraSPARC. Python is the best scripting language in the world (I say without the least bias). It is clean, simple, elegant, and has an extensive library and user base. Like SQLite, it is available on everything from PDAs to mainframes. It is used in scientific computing, system administration, graphical user applications, search engines, web servers, movie studios, and the list goes on. It is not only an excellent language for professional programmers, but for beginners as well. It has been ported to a wide array of operating systems and architectures and is free on every one of them. 1.1.2 Applications ~~~~~~~~~~~~~~~~~~ The combination of Python and SQLite makes programming with a relational database both open and available, spanning operating systems and architectures, and all at a surprisingly low cost [1]_. PySQLite's application ranges from both sides of the continuum: Python can be a handy administration language with which to make operating on SQLite databases easier, and likewise SQLite can serve as a convenient persistent storage mechanism for use in existing Python applications. 1.1.3 Ease of Use ~~~~~~~~~~~~~~~~~ The ease of use of both SQLite and Python cannot be understated. By itself, the SQLite C library requires only three functions to execute an SQL query and retrieve its results. Python makes it even easier by virtue of its being a scripting language. A complete program to sort a taxonimic database of plants by species and genus name could be done with the following program:: import sqlite cx = sqlite.connect("db") cu = conn.cursor() cu.execute(""" select genus, species, family, category from calflora order by species, genus """) for row in cu.fetchall(): print "%14s, %15s, %19s, %8s, %25s, %i" % (row.genus, row.species, row.family, row.category) cx.close() 1.2 Required Software --------------------- - PySQLite requires Python 2.1 or later. - PySQLite requires SQLite 2.5.6 or later. Using the most current one is recommended in order to enjoy the full feature set. 1.3 Thread Safety ----------------- SQLite does not directly support threads, but at the same time it does not impede their use either. Detailed information regarding the use of threads is available on the website. In essence, you can use threads safely as long as each thread maintins its own connection and associated cursors. Multiple threads accessing the same connection or cursor is not safe, unless you take precautions using the threading module. Thus, PySQLite is thread safe at level 1, as defined in the DB-API Specification. 1.4 Installation ---------------- The source distribution can be built on both Windows and UNIX. However, building on Windows can be somewhat more difficult due to differences in build tools. All platforms require that both Python and SQLite libraries/headers are already present on the system. 1.4.1 UN*X Installation ~~~~~~~~~~~~~~~~~~~~~~~ The source tarball is currently the only method of installation on UNIX. Fortunately, it is easy and painless. After you untar the source, cd into the pysqlite directory and do the following as root:: python setup.py build python setup.py install 1.4.2 Windows Installation ~~~~~~~~~~~~~~~~~~~~~~~~~~ On Windows, Python's distutils defaults to the Visual C++ compiler from Microsoft. If you want to use other compilers for compiling Python extensions on Windows, look into chapter 3.1 "Using non-Microsoft compilers on Windows" in the "Installing Python Modules" of your Python documentation. It's available online at http://www.python.org/doc/current/inst/non-ms-compilers.html The following are build instructions for the GNU C compiler, Borland C++ and for Microsoft's Visual C++ environment. 1.4.2.1 Using the GNU C Compiler ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ This is the most tested method, as the author uses mingw to create the win32 binaries for PySQLite. As you have read the Python documentation for non-Microsoft compilers by now, you have mingw or Cygwin installed and created the required import library for the Python DLL. Fine, let's continue. 1. From http://www.hwaci.com/sw/sqlite/download.html get the sqlite_source.zip and sqlitedll.zip files. Unpack them all in the same directory. Create an import library for the GNU linker: ``dlltool --def sqlite.def --dllname sqlite.dll --output-lib libsqlite.a`` 2. Unpack the PySQLite sources and open setup.py in your editor. Search for win32. Change the include_dirs and library_dirs variable to point the place where you've unpacked the SQLite files and where you created the import library. 3. Build PySQLite: ``python setup.py build --compiler=mingw32`` 1.4.2.2 Using the Borland Compiler ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ As you have read the Python documentation for non-Microsoft compilers by now, you have installed the Borland C++ compiler and created the required import library for the Python DLL. Fine, let's continue. 1. From http://www.hwaci.com/sw/sqlite/download.html get the sqlite_source.zip and sqlitedll.zip files. Unpack them all in the same directory. 2. Create an import library for the Borland linker: ``implib -a sqlite.lib sqlite.dll`` 3. Unpack the PySQLite sources and open setup.py in your editor. Search for "win32". Change the include_dirs and library_dirs variable to point the place where you've unpacked the SQLite files and where you created the import library. 4. Build PySQLite: ``python setup.py build --compiler=bcpp`` 1.4.2.3 Using Microsoft Visual C++ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1. From http://www.hwaci.com/sw/sqlite/download.html get the sqlite_source.zip and sqlitedll.zip files. Unpack them all in the same directory. 2. Create an import library for the Microsoft linker: ``lib /def:sqlite.def`` 3. Unpack the PySQLite sources and open setup.py in your editor. Search for win32. Change the include_dirs and library_dirs variable to point the place where you've unpacked the SQLite files and where you created the import library. 4. Build PySQLite: ``python setup.py build`` 1.4.2.4 After Building PySQLite ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1. Make sure the sqlite.dll can be found. Either copy it into your system directory, somewhere else in your PATH or in the same directory as the .pyd file from PySQLite. 2. Run the included test suite with these or similar commands:: copy ..\sqlite\sqlite.dll build\lib.win32-2.3 copy test\*.py build\lib.win32-2.3 cd build\lib.win32-2.3 python all_tests.py All the tests should pass. If something goes wrong, report it to the PySQLite developers using the Sourceforge bug tracker. 3. Install PySQLite: ``python setup.py install`` Again make sure the sqlite.dll can be found 1.5 Getting Help ---------------- 1.5.1 Mailing Lists ~~~~~~~~~~~~~~~~~~~ If an answers cannot be found in the documentation, you can post to the PySQLite users' mailing list on SourceForge. To join the list, go to http://lists.sourceforge.net/lists/listinfo/pysqlite-users. ================================ 2. Features Specific to PySQLite ================================ 2.1 API Modifications --------------------- 2.1.1 Functions ~~~~~~~~~~~~~~~ connect() SQLite is a file-based embedded database. As such, you only need to give the path to the database file to the connect function. 2.2 API Additions ----------------- 2.2.1 Data Types ~~~~~~~~~~~~~~~~ SQLite is typeless. It stores all data as text. However, all data returned from PySQLite are cast into their respective Python types as declared in the database schema, provided that said types are standard ANSI SQL type (e.g. VARCHAR, INT, FLOAT, etc.). In addition to using the schema, PySQLite allows you to specify the data types of columns to be retrieved by a select statement using types in a SQL comment. This is done using execute(). When fetchone(), fetchmany() or fetchall() is called, PySQLite converts columns in the returned row to the types specified in the types list. For example, say you have a table defined as such:: create table plants ( id integer primary key, name varchar(25), av_height int, av_lifespan float ); Now you want to select all records in plants and have them properly typed. You would do as follows:: cnx = connect('db') c = cnx.cursor() c.execute('-- types int,str,int,float') c.execute('select * from plants') PySQLite looks for types in SQL comments and if found, stores the list of types internally for later use when rows are fetched from the result set. In this particular example that even without the '-types' mechanism, data types would have still been properly cast since they are ANSI datatypes. 2.2.1.1 Custom Types ~~~~~~~~~~~~~~~~~~~~ While PySQLite supports native Python data types, you can also apply your own data types to result sets as well. When you do this, PySQLite will pass the field value to your type's constructor when converting the column type. The way to do this is to add your type the the converters list, which is done on connect() This is perhaps best illustrated by example:: class my_type: def __init__(self,value): self.value = value def __repr__(self): return "my_type(%.3f)" % float(self.value) def __str__(self): return "%.3f" % float(self.value) cnx = sqlite.connect(db="db", mode=077, converters={'my_type':my_type}) c = conn.cursor() SQL = "select genus, species, family, category, author1, upper_elev_ft " \ " from calflora order by genus, species limit 33" cursor.execute('-- types str,str,str,str,str,my_type') cursor.execute(SQL) The resulting rows fetched will have upper_elev_ft formatted with three decimal places. 2.2.2 Result Sets ~~~~~~~~~~~~~~~~~ PySQLite's result set class is taken from pyPgSQL, which adds many niceties not required by the DB API Specification. As such, fetchone() returns more than just a tuple of fields, but a flexible data structure which can behave as a tuple, but also has attributes with the field names from the returned rows. Thus you can do things as in the following example:: c.execute('select last_name, street, city, zip from contacts') row = c.fetchone() # By field number (the DB-API way) print row[1] # By attribute print row.city print row.zip # As a slice print row[2:-1] 2.2.3 Aggregates and Functions ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The SQLite Extended C API has a nice framework for allowing you to add your own functions and aggregates which can be called from SQL. For example, while SQLite does not support the concept of CURRENT_DATE, you could use the framework to create an equivalent function (current_date()) so that statements like SELECT current_date() or INSERT into orders (cust_id, price, order_date) values (1, 100, current_date() are valid statements within SQLite. The only problem is that the SQLite framework requires you to implement these extensions in C. Fortunately, PySQLite provides a framework which allows you to implement them in Python. The process of creating user-defined functions and aggregates is very similar to creating custom types. The best way to demonstrate these is by example. 2.2.3.1 Functions ~~~~~~~~~~~~~~~~~ The following is from functions.py which is included in the calflora example. It creates a function times_two which takes a single argument and appropriately return a multiple of two. The example applies the times_two argument to the upper_elev_ft field in a simple select statement.:: # Function def times_two(x): return float(x)*2 conn = sqlite.connect("db", 077) # Add the aggregate conn.create_function("times_two", 1, times_two) cursor = conn.cursor() # Expect string, then two float values for columns cursor.execute('-- types str, float, float') # Execute SQL with custom aggregate cursor.execute("select family, upper_elev_ft, times_two(upper_elev_ft) from calflora limit 10") # DB-API extension that requires Python 2.2, else use cursor.fetchall(): for row in cursor: print row 2.2.3.2 Aggregates ~~~~~~~~~~~~~~~~~~ The following is from aggregates.py which is included in the calflora example. It creates an aggregate called variance which computes the variance of a set (or population). This example determines the variance in elevation heights (the upper_elev_ft field) of all plant species listed in the table. :: class variance: def __init__(self): self.X = [] def step(self, x): self.X.append(float(x)) def reset(self): self.X = [] def finalize(self): mu = 0 n = len(self.X) for x in self.X: mu += x mu = float(x)/n sigma = 0 for x in self.X: sigma += (x-mu)**2 sigma = sigma / (n - 1) return sigma conn = sqlite.connect("db", 077) # Add the aggregate conn.create_aggregate("variance", 1, variance) cursor = conn.cursor() # Expect float value for column cursor.execute('-- types float') # Execute SQL with custom aggregate cursor.execute("select variance(upper_elev_ft) from calflora") # Fetch row = cursor.fetchone() # Print print "Upper elevation variance: %.2f" % row[0] .. [1] That is to say free.