Let's be honest: working with big databases is a lot of fun. There's something cool about dealing with tables that have hundreds of millions or billions of rows in them, loading huge amounts of data, building star and snowflake schemas for data warehouses/marts, optimizing query performance, and all that jazz. Yes, working with big databases is a lot of fun.
On the other hand, let's be honest: working with big databases is not a lot of fun. There's a lot of pain in dealing with tables that have hundreds of millions or billions of rows in them, waiting for huge amounts of data to be loaded only to have the load job toss its cookies and fail when it's 99% done, building special schemas that leave you wondering whether they make any difference at all, and trying to figure out why a simple two-way join query has been hanging for over an hour. Yes, working with big databases is not a lot of fun.
But here's the thing: big databases aren't going away; in fact, there are more of them than ever before, and they just keep multiplying and getting bigger. I could quote a lot of industry statistics and TDWI reports on this, but that would just bore you because you likely know all this already. You also know that there are a number of database vendors out there who can take on big data, but they also come with big price tags.
Surveys done by MySQL show that data warehousing is the fifth most common use case for MySQL users, so it's understandable that many try to get around the sticker shock of proprietary data warehouse DBs with MySQL. One of the challenges is scaling MySQL for very large data volumes or for complex/analytic queries that need fast response times. Moreover, one of the drawbacks of most generic RDBMSs is a row-based architecture, whereas most data warehouse and read-intensive query workloads are selective, column-based use cases.
Lately, the idea of column-oriented databases has been catching new wind in its sails. I say 'new wind' because Sybase IQ did the column database thing when columns weren't that cool. But now there are a number of column-oriented databases that are really taking off; however, most are still proprietary and cost quite a bit. But what happens when you marry a column-oriented database that's modular in design and can scale both up and out with MySQL and open source? You get Calpont's InfiniDB.
To be frank, I'm as skeptical as they come when claims are made that this or that different 'type' of database is better than a general purpose RDBMS. When object-oriented databases came out, I scoffed. When OLAP/cube databases were hogging all the tech press, I smirked. I believed that neither could really challenge the good ol' relational model, which isn't perfect by any means, but hey: it still has enough punch to relegate those two challengers to a very boutique marketplace. That being the case, why should we entertain a relational database that is vertically vs. horizontally architected?
Because, for querying and processing medium to huge amounts of data, they actually work. And, many times, they work better at finishing the SQL query horse race than a general purpose RDBMS. That's why.
Before column databases came out, experienced data modelers creating data warehouses, marts, and reporting databases knew that by designing vertically partitioned tables that were subsets of an overall table, they could sometimes buy better physical query performance: the rows were shorter, so I/O could be somewhat reduced. Using views, they'd then tie the vertically partitioned tables together to form one logical table that could be referenced. And sometimes this practice worked. But oftentimes it didn't make the grade, because (1) even the vertically partitioned tables were still row-based, and I/O wasn't always reduced by much; (2) it was really hard to predict which columns would be consistently queried together; (3) the practice didn't reduce the need for indexes (and in fact sometimes made matters worse, because you needed indexes to smartly join the vertically partitioned tables in the view definitions); (4) as with predicting which columns would be queried together, index design was difficult because users' query patterns could change at the drop of a hat, and suddenly queries would do full scans instead of index searches. And on and on it went.
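For anyone who never saw this trick in the wild, here's a minimal sketch of it using Python's built-in sqlite3 module (the table, view, and column names are all hypothetical, just to show the shape of the technique):

```python
import sqlite3

# Manual vertical partitioning: one wide logical table split into two
# physical tables sharing a primary key, glued back together by a view.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# "Hot" columns that most queries touch...
cur.execute("CREATE TABLE sales_keys (sale_id INTEGER PRIMARY KEY, "
            "sale_date TEXT, amount REAL)")
# ...and wide, rarely-queried columns kept out of the hot table's rows.
cur.execute("CREATE TABLE sales_detail (sale_id INTEGER PRIMARY KEY, "
            "notes TEXT, region TEXT)")

cur.execute("INSERT INTO sales_keys VALUES (1, '2009-11-01', 99.50), "
            "(2, '2009-11-02', 12.00)")
cur.execute("INSERT INTO sales_detail VALUES (1, 'repeat customer', 'EMEA'), "
            "(2, 'first order', 'APAC')")

# The view presents one logical table to users.
cur.execute("""
CREATE VIEW sales AS
SELECT k.sale_id, k.sale_date, k.amount, d.notes, d.region
FROM sales_keys k JOIN sales_detail d ON k.sale_id = d.sale_id
""")

rows = cur.execute("SELECT sale_id, amount FROM sales ORDER BY sale_id").fetchall()
print(rows)  # [(1, 99.5), (2, 12.0)]
```

A query like the one above only needs columns from sales_keys, so in principle shorter rows mean fewer pages read; but as noted, the join behind the view and the extra indexes often ate up whatever was saved.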
Column-oriented databases designed especially for analytics overcome these limitations by storing, managing, and querying data based on columns rather than rows. Because only the columns a query actually needs are accessed rather than entire rows, the column can act as the index, and both I/O activity and overall query response time can often be reduced. The end result is the ability to interrogate and return query results against either moderate or large amounts of data in a pretty darn efficient manner. A row-based system can do 5-10x the physical I/O that a column-based database does to retrieve the same information. This is why column advocate Philip Howard (Research Director for Bloor Research) says: "Columns provide better performance at a lower cost with a smaller footprint: it is difficult to understand why any company seriously interested in query performance would not consider a column-based solution."
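To make that 5-10x figure concrete, here's a quick back-of-the-envelope sketch. The table dimensions below are made-up illustrative numbers, not anything measured against InfiniDB:

```python
# Why column stores cut I/O: a row store must read whole rows, a column
# store reads only the columns the query touches. All numbers are
# hypothetical and assume uniform fixed-width columns for simplicity.
ROWS = 100_000_000          # rows in a fact table
COLUMNS = 20                # columns in the table
BYTES_PER_COLUMN = 8        # assumed uniform 8-byte columns

cols_touched = 2            # a query aggregating 2 of the 20 columns

row_store_bytes = ROWS * COLUMNS * BYTES_PER_COLUMN          # scans whole rows
column_store_bytes = ROWS * cols_touched * BYTES_PER_COLUMN  # only needed columns

ratio = row_store_bytes / column_store_bytes
print(f"row store reads {row_store_bytes / 2**30:.1f} GiB, "
      f"column store reads {column_store_bytes / 2**30:.1f} GiB "
      f"({ratio:.0f}x less I/O)")
```

Under these assumptions the I/O ratio is simply columns-in-table over columns-touched (20/2 = 10x), which is exactly why narrow, selective analytic queries are the sweet spot for a column store.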
Nice quote, but should you really check out a column database? Can they really help you? Read on…
What's it Have?
Column databases can be a help for sure in the right situation, but there's more that's needed to crunch through large data volumes than just having a vertical data design. In addition to being column-oriented, the open source version of InfiniDB provides the following core feature set, which adds a number of needed items to the mix:
- Column-oriented architecture: mentioned already...
- Multi-threaded design: the engine is multi-threaded and uses today's modern multi-CPU/core architectures to split a single query up across the available processing units.
- Automatic vertical and horizontal partitioning: Being column-oriented, the engine uses vertical partitioning to store data but it also uses a form of logical horizontal range partitioning that does not require special storage placements or design. Using both vertical and logical horizontal range partitioning allows it to reduce I/O in both directions (column and row). Both vertical and horizontal partitioning are automatically handled so you don't have to manually do anything.
- High user concurrency: The engine imposes no hard architectural limit on the number of concurrent users.
- High-speed data loader: To effectively load lots of data, a high-speed load utility is made available. I'll show you how this works in just a second.
- DML support: In addition to supporting the high-speed bulk loading of data, the engine supports full DML (insert, update, delete) operations as well.
- Transactional support: ACID-compliant transactional support is provided in the engine. Commit, rollback, deadlock detection, all that good stuff.
- Crash recovery: the engine has full crash recovery capabilities.
- MVCC design: who doesn't like 'snapshot read'? I do, and the engine has it.
- No need for indexing: Because the engine uses both vertical and logical horizontal partitioning, there is no need for indexing. In essence, the data is the index. In addition, the engine automatically maintains a small, important structure called the Extent Map (explained later), which is used to reduce I/O. Besides no indexes, you also don't have to worry about dealing with materialized views, summary tables (of course, in MySQL, you don't have these anyway…), or manual data partitioning.
- Logical data compression: the engine doesn't have physical/file compression yet (coming…), so for now it uses transparent logical data compression to store data that benefits from being compressed.
- ALTER TABLE support: columns can be added and dropped.
- Performance diagnostics: To help troubleshoot those queries from hell, InfiniDB supplies a nice tracing/diagnostic utility that helps figure out exactly what a query is doing above the normal EXPLAIN output.
InfiniDB also uses MySQL as its front end, so if you're familiar with MySQL, you'll be right at home with InfiniDB. MySQL is used mainly in InfiniDB for security, SQL parsing, and initial query plan output.
In addition to the above open source product feature set, a commercial/Enterprise version of InfiniDB will be offered that has a pay-for scale out option where multiple machines can be used in a massive parallel processing (MPP) configuration. However, for many data warehousing, data mart, and analytic database tasks, the open source version of InfiniDB provides plenty of power to plow through big databases.
The architecture of InfiniDB is modular and consists of three basic components:
- User Module: The User Module is made up of a small MySQL instance and a number of InfiniDB processes that handle concurrency scaling. The User Module is also responsible for breaking down SQL requests and distributing the various parts to one or more Performance Modules that actually retrieve requested data from either memory caches or disk. Finally, the User Module assembles all the query results from the various participating Performance Modules to form the complete query result set that is returned to the user. Note that there can be more than one User Module in an InfiniDB Enterprise configuration, so you can scale out concurrency by adding more User Module nodes.
- Performance Module: The Performance Module is responsible for retrieving and managing data, and passing it back to the User Module(s) to satisfy query requests. The Performance Module selects data from disk and caches it in a shared nothing data cache that is part of the server on which the Performance Module resides. InfiniDB Enterprise accomplishes its MPP benefits by allowing the user to configure as many Performance Modules as they would like; each additional Performance Module adds more cache to the overall database as well as more processing power.
- Storage: InfiniDB can use either local storage or shared storage (e.g. SAN) to store data. A user can have everything configured and running on one server. In the commercial edition of InfiniDB, they can scale out with multiple servers and configure either a shared disk (currently supported) or shared nothing (coming in future release) architecture.
All the above modular components can exist on one server or, in the Enterprise version, be broken up and run on different machines. If you want to compare the architecture and design of InfiniDB to other proprietary analytic databases in the market that do both scale up (use a box's available CPUs) and scale out (MPP), InfiniDB is comparable to Vertica, Sybase IQ, and ParAccel. There are other analytic databases available that scale and do MPP, but all others are row-based (e.g. Greenplum, Aster Data, Teradata, Netezza, etc.) vs. column-oriented.
One interesting feature that allows InfiniDB to handle lots of data is a structure called the Extent Map. It's the Extent Map that removes the need for indexing, manual data partitioning, summary tables, and the like.
An "extent" in InfiniDB is a logical block of space that exists within a physical file, with an extent being anywhere from 8-64MB in size depending on a column's datatype. Each extent holds the same number of rows, with smaller data types using less space on disk. The Extent Map catalogs all extents and their corresponding blocks (identified with logical block identifiers, or LBIDs). The Extent Map also maintains the minimum and maximum values for a column's data within each extent.
The Extent Map lets InfiniDB retrieve only the blocks needed to satisfy a query, but it also provides another benefit: logical range partitioning. This is accomplished via the minimum and maximum values for each extent that are contained within the Extent Map. Extent elimination is first accomplished in InfiniDB via the column-oriented architecture (only needed columns are scanned), but the process is accelerated by this logical horizontal partitioning implemented in the Extent Map.
This automatic extent elimination behavior is well suited for time-based data where data is loaded frequently and often referenced by time. Near real-time loads with queries against the leading edge of the data can easily show good extent elimination for all of the date/time columns as well as an ascending key value. Any column with clustered values is a good candidate for extent elimination.
To eliminate an extent when a column scan involves a filter, that filter value is compared to the minimum and maximum values stored in each extent for the column:
In the above figure, if a WHERE column filter of col1 between 220 and 250 is specified, InfiniDB will eliminate extents 1, 2, and 4 from being scanned, saving ¾ of the I/O and many comparison operations. If an extent cannot possibly contain any rows that would match the filter, that extent is ignored entirely. Additionally, since each extent eliminated for a column also eliminates the same extents for all the other columns in a table, impossible filter chains can be quickly identified without any I/O being required. For example, take the following two columns and their Extent Map information:
If a WHERE filter of col1 between 220 and 250 and col2 < 10000 is specified, InfiniDB will eliminate extents 1, 2, and 4 using the first column filter; then, looking at just the matching extents for col2 (i.e. just extent 3), it will determine that no extents match and return zero rows without doing any I/O at all.
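The walk-through above can be sketched in a few lines of Python. The per-extent min/max values below are hypothetical, chosen so the outcome matches the example: col1's filter keeps only extent 3, and col2's filter then rules extent 3 out as well:

```python
# Extent elimination via per-extent min/max values, Extent Map style.
# Extent id -> (min value, max value) seen in that extent. Hypothetical data.
col1_extents = {1: (1, 100), 2: (101, 200), 3: (201, 300), 4: (301, 400)}
col2_extents = {1: (5000, 9000), 2: (3000, 8000), 3: (10000, 50000), 4: (400, 9999)}

def extents_matching_between(extents, lo, hi):
    """Extents whose [min, max] range overlaps [lo, hi] — only these need scanning."""
    return {eid for eid, (emin, emax) in extents.items() if emax >= lo and emin <= hi}

def extents_matching_less_than(extents, bound):
    """Extents whose minimum value falls below the bound."""
    return {eid for eid, (emin, _emax) in extents.items() if emin < bound}

# col1 BETWEEN 220 AND 250 -> only extent 3 can possibly contain matches
survivors = extents_matching_between(col1_extents, 220, 250)
print(survivors)  # {3}

# AND col2 < 10000 -> intersect with col2's surviving extents (min >= 10000
# everywhere in extent 3, so nothing survives)
survivors &= extents_matching_less_than(col2_extents, 10000)
print(survivors)  # set() -> zero rows returned, no blocks read at all
```

Note that the min/max check can only prove an extent contains *no* matches; a surviving extent still has to be scanned, since overlapping the filter range doesn't guarantee actual matching rows.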