Web SQL Database Gotchas
by George White Friday, October 8th, 2010We have been working on a pretty cool project, an offline Web application targeted to the iPad. Mobile Safari’s Webkit core provides robust support for many leading edge Web technologies. The app is built on a raft of emerging Web applications standards: HTML5, CSS3, Offline Web Application, Web Storage and Web SQL Database.
One of the key aspects of the application is offline access to a relatively large store of information. There are three collections of data containing more than 10,000 rows each. In the case of one the data sets, there are 95 fields per row. This isn’t a huge amount of data by normal database standards, but it is quite a lot for an offline Web app. To store this data, we turned to the Web SQL Database API.
The Web SQL Database API provides access to a local SQL database via Javascript, providing Web applications with a more robust local data store than has existed in the past.
There are, however, several gotchas to working with Web databases. The first one is the built-in limit on maximum database size. Safari and Mobile Safari place a 50MB per domain limit on the size of the database that the API can create. Within that 50MB, you can have a single database or many, but their size cannot exceed the limit.
The size limit places a cap on the utility of Web databases for certain applications and forces some constraints on application design. If you have a large set of data that needs to be accessible offline that exceeds the database size cap, your design has to take this into account. You will need a scheme for determining what data is “must have” when offline.
We found that the data set for our app will take roughly 30MB, so we aren’t pushing up against the cap yet. But there are some concerns for future expansion and the lack of head room if we need to push more data locally.
A related gotcha is a security restriction that requires user approval to create databases larger than a certain size (in Mobile Safari, any database larger than 5MB). Since the user can refuse to allow creation of your database locally, you need to handle cases where database creation may not succeed.
A larger gotcha that we’ve found is performance. There are two general classes of problem in this context: queries involving large sets of data and complex queries.
The first performance gotcha is due in large part to the way the API is implemented. When you make a SELECT query, for example, the request is passed to the database engine and processed. Any data returned by the query is parsed by the API and returned as a SQLResutSet object. Each row of the result set is basically an object of its own, with each field as a property of the object.
The impact of this design choice is that large result sets lead to the creation of a lot of Javascript objects and can quickly bring the browser to its knees. This problem was magnified in our application because on of the tables has 95 columns (and really couldn’t be decomposed naturally). Making a SELECT query for any significant number of rows with all fields in the query can take several minutes to process on the slower iPad and even desktop Safari balks.
The best mitigation strategy that we’ve found so far is “don’t do this”. This may seem like a “duh” sort of statement, but as is often the case, you don’t know for certain until you try. The SQLite database engine that provides the backend for all current implementations of the Web SQL Database API is a fast file-based component. Tests of the same queries directly against the database shows the SQLite is more than capable of handling this amount of data. But the same query that takes a fraction of a second when run directly agains the database might take 10 minutes or more when run through the Javascript API.
The second major performance gotcha we have faced so far has been with complex queries. Any query involving a JOIN or a sub-query seems to have a significant impact on the performance of the application. Again, these same queries are fast when run directly against the same database file. The conclusion is that the Javascript API adds significant overhead as currently implemented.
A particular place where these performance issues affected our design and implementation was in the use of a local ORM to handle the model for the application. We choose to use persistence.js for this. The problem is that the ORM pulls the entire row for each object and often uses complex queries when filtering. We drive right into the performance pothole.
The upshot of this is that we have had to take a hybrid approach, using the ORM for convenience where we can, but falling back to custom queries that heavily restrict the size of the returned data when we need performance.
Overall, the Web SQL Database API is a useful and powerful standards-based way to store and query large sets of data. Sure, it has a few quirks, but it is a huge step towards making Web application first class citizens on in the offline world.
Update:
As an addendum, the W3C dropped work on the Web SQL Database standard in November 18th, 2010. It looks like the competing Indexed Database API is going to get all of the focus. See http://www.w3.org/TR/IndexedDB/ for details.
Web SQL wasn’t getting any traction with Mozilla or Microsoft and the Chrome team recently announced upcoming support for Indexed Database (in addition to the existing WebKit support for Web SQL). There’s no actual implementations of Indexed Database in a shipping browser just yet and Web SQL is not going to be dropped for a while. So you’re probably safe using Web SQL for the time being, but keep in mind that a transition is coming in the next year or two.
thanks!! this is a great little writeup.
Have you ever encountered a constraint_err once you get to a certain number of rows being inserted into the table? You stated a number around 10,000 rows per table – did you get them all stored locally?
Currently I’m looking into what will be the future technology of providing fast robust and unlimited sql databases. I had hopes for Html5 and the WEB SQL Database, but now I understand the extreme bottlenecks JS provides when it comes to large scale data handling and speed.
Considering javascript is not fast in itself, I cannot see it handling large databases faster than Silverlight. This really disappoints me, and might be the game changer for many enterprise developers.
How can Html5 be the future for offline enterprise webapps if it’s based on the outdated and slow javascript.
“But the same query that takes a fraction of a second when run directly agains the database might take 10 minutes or more when run through the Javascript API.”
It makes logical sense and like I said confirms a lot of theories I’ve had. Thanks for sharing this. I guess Silverlight is not dead after all, at least when it comes to enterprise webapps.
Jont,
While there are some problems with the Web database API, I think HTML5 is still a more than viable platform for enterprise web apps. Even with the hurdles we encountered, the app we built is responsive and capable even with relatively large data sets. We still managed to put per 25MB of data into the app with room to spare.
Yes, we had to work around frustrating limitations, but on reflection, I think the core problem is that thinking about the architecture of web apps is different than some other environments.
I also do not believe that javascript is outdated. On the contrary, it is an active and evolving language with exciting potential. The rise of high-performance engines such as Nitro and native implementations of core modern needs like JSON parsing have lead to a very capable platform. Combined with HTML5 and CSS3, javascript is in a great position to create apps for a wide range of uses and environments.
Dean,
We have been able to import fairly large data sets. The largest so far has been about 17,000 rows, with each record containing 6 fields.
I would suggest looking at the data itself if you’re seeing constraint violations. SQLite is generally pretty forgiving of data type mismatchs, but you may have something like duplicate values in a column with a unique constraint.
Currently I’m looking into what will be the future technology of providing fast robust and unlimited sql databases. I had hopes for Html5 and the WEB SQL Database, but now I understand the extreme bottlenecks JS provides when it comes to large scale data handling and speed. Considering javascript is not fast in itself, I cannot see it handling large databases faster than Silverlight. This really disappoints me, and might be the game changer for many enterprise developers. How can Html5 be the future for offline enterprise webapps if it’s based on the outdated and slow javascript. “But the same query that takes a fraction of a second when run directly agains the database might take 10 minutes or more when run through the Javascript API.” It makes logical sense and like I said confirms a lot of theories I’ve had. Thanks for sharing this. I guess Silverlight is not dead after all, at least when it comes to enterprise webapps.
thanks!! this is a great little writeup.
Thanks for really helpful post. You mentioned the time to query the database, but how about inserts? Do you insert those >10000 rows in one go or does it accumulate? How long does it take for you to insert the data?
Ryszard,
The inserts are pretty slow overall for the dataset with a lot of fields. Initially, we tried using our ORM, persistence.js, to do the insert. The result was snail-slow. Following that, we switched over to wrapping our own insert query into a single transaction. Faster, but still not ideal.
In the end, we ended up chunking the insert query into batches of ~2000 rows at a time per transaction. This let us get a reasonable balance between writing the rows and capturing the state of the insert (handled in the transaction’s callback).
One tricky thing is that the rows can take quite a while to write on iOS devices. And since the writes are asynchronous, you really have to pay attention to the callbacks at the query or transaction level and make sure that some UI feedback is presented to the user. This is sort of a “duh” kind of thing, but it’s possible for the user to accidentally abort ongoing write to the database if they navigate away from the page/view.
Thanks for the post. Here’s a product with a powerful database in the cloud with ready-made apps. Just point-and-click to build your custom apps http://www.caspio.com/
Thanks for this article. There is so little discussion of the practical issues relating to webSQL / HTML5 localDatabase.
I wonder: you mention that the webSQL engine itself is pretty fast, and any speed issues are usually the result of the Javascript API. This comes titillatingly close to suggesting that there might be a way to get the data using some kind of direct access to the webSQL and not using openDatabase().
Did I get that right? Is there such a direct access? How would you go about it?
I’ve got a localDatabase with five tables of one, two or maybe three records. I can’t get how it can be so slow.
Thanks again!
Wytze
Wytze,
The underlying engine for WebSQLDB is SQLite, which is relatively fast, embeddable database implemented in C. It was chosen as the basis for both the WebSQLDB and LocalStorage/SessionStorage implementations in several browser because it was well-known and easy to tie in.
Unfortunately, you don’t have any direct access to the engine from the browser. The WebSQLDB API is your only method for getting at the database for querying.
I am a bit surprised to hear that you’re getting very slow performance from such a small set of tables and rows. Are you doing straight SELECT queries or using JOINs a lot?