Web SQL Database Gotchas

by George White Friday, October 8th, 2010

We have been working on a pretty cool project, an offline Web application targeted to the iPad. Mobile Safari’s Webkit core provides robust support for many leading edge Web technologies. The app is built on a raft of emerging Web applications standards: HTML5, CSS3,  Offline Web Application, Web Storage and Web SQL Database.

One of the key aspects of the application is offline access to a relatively large store of information. There are three collections of data containing more than 10,000 rows each. In the case of one the data sets, there are 95 fields per row. This isn’t a huge amount of data by normal database standards, but it is quite a lot for an offline Web app. To store this data, we turned to the Web SQL Database API.

The Web SQL Database API provides access to a local SQL database via Javascript, providing Web applications with a more robust local data store than has existed in the past.

There are, however, several gotchas to working with Web databases. The first one is the built-in limit on maximum database size. Safari and Mobile Safari place a 50MB per domain limit on the size of the database that the API can create. Within that 50MB, you can have a single database or many, but their size cannot exceed the limit.

The size limit places a cap on the utility of Web databases for certain applications and forces some constraints on application design. If you have a large set of data that needs to be accessible offline that exceeds the database size cap, your design has to take this into account. You will need a scheme for determining what data is “must have” when offline.

We found that the data set for our app will take roughly 30MB, so we aren’t pushing up against the cap yet. But there are some concerns for future expansion and the lack of head room if we need to push more data locally.

A related gotcha is a security restriction that requires user approval to create databases larger than a certain size (in Mobile Safari, any database larger than 5MB). Since the user can refuse to allow creation of your database locally, you need to handle cases where database creation may not succeed.

A larger gotcha that we’ve found is performance. There are two general classes of problem in this context: queries involving large sets of data and complex queries.

The first performance gotcha is due in large part to the way the API is implemented. When you make a SELECT query, for example, the request is passed to the database engine and processed. Any data returned by the query is parsed by the API and returned as a SQLResutSet object. Each row of the result set is basically an object of its own, with each field as a property of the object.

The impact of this design choice is that large result sets lead to the creation of a lot of Javascript objects and can quickly bring the browser to its knees. This problem was magnified in our application because on of the tables has 95 columns (and really couldn’t be decomposed naturally). Making a SELECT query for any significant number of rows with all fields in the query can take several minutes to process on the slower iPad and even desktop Safari balks.

The best mitigation strategy that we’ve found so far is “don’t do this”. This may seem like a “duh” sort of statement, but as is often the case, you don’t know for certain until you try. The SQLite database engine that provides the backend for all current implementations of the Web SQL Database API is a fast file-based component. Tests of the same queries directly against the database shows the SQLite is more than capable of handling this amount of data. But the same query that takes a fraction of a second when run directly agains the database might take 10 minutes or more when run through the Javascript API.

The second major performance gotcha we have faced so far has been with complex queries. Any query involving a JOIN or a sub-query seems to have a significant impact on the performance of the application. Again, these same queries are fast when run directly against the same database file. The conclusion is that the Javascript API adds significant overhead as currently implemented.

A particular place where these performance issues affected our design and implementation was in the use of a local ORM to handle the model for the application. We choose to use persistence.js for this. The problem is that the ORM pulls the entire row for each object and often uses complex queries when filtering. We drive right into the performance pothole.

The upshot of this is that we have had to take a hybrid approach, using the ORM for convenience where we can, but falling back to custom queries that heavily restrict the size of the returned data when we need performance.

Overall, the Web SQL Database API is a useful and powerful standards-based way to store and query large sets of data. Sure, it has a few quirks, but it is a huge step towards making Web application first class citizens on in the offline world.

Update:

As an addendum, the W3C dropped work on the Web SQL Database standard in November 18th, 2010. It looks like the competing Indexed Database API is going to get all of the focus. See http://www.w3.org/TR/IndexedDB/ for details.

Web SQL wasn’t getting any traction with Mozilla or Microsoft and the Chrome team recently announced upcoming support for Indexed Database (in addition to the existing WebKit support for Web SQL). There’s no actual implementations of Indexed Database in a shipping browser just yet and Web SQL is not going to be dropped for a while. So you’re probably safe using Web SQL for the time being, but keep in mind that a transition is coming in the next year or two.