One of the biggest selling points of Oracle’s engineered system Exadata is its storage offloading capability. Storage offloading allows data to be instantly filtered when it is read from storage, reducing the amount of data that is sent and processed by the database. One of the filter operations storage offloading performs is join filtering via bloom filters.
What is a Bloom Filter?
A bloom filter is a space-efficient probabilistic data structure that is used to test whether an element is in a member set.
First developed in 1970 by Burton Howard Bloom , the bloom filter is not a new concept, nor is it specific to Oracle. The company first introduced bloom filters in Oracle 10g, and used them to reduce traffic between parallel query processes when joining tables. The same functionality was then extended to Exadata Storage layer, still requiring parallel query. The decision to utilize bloom filters is made by the Oracle cost-based optimizer (CBO), based on estimated CPU and IO usage.
How Bloom Filters Are Used in Exadata
Bloom filters are effectively used during hash joins between two tables: typically, a smaller table and a larger table. First, Oracle scans the smaller table to identify the rows which satisfy the WHERE clause predicate. A result set is obtained based on the information that satisfies the predicate. Then, a hash table is created based on that result set. While the hash table is being created, a bit vector is also created based on the join column.
The result so far is a hash table of the rows which satisfy the predicate, as well as a bit vector on the join column of the first table. Both pieces of information are sent together as part of the join condition when the second (larger) table is being scanned. The second table goes through its own process of being scanned using the WHERE clause predicate to identify the rows that satisfy the query, creating another result set. A hash table is created on the join column of this result set and will be compared to the values in the bit vector For each row, if the bit vector matches, it is then sent to hash join, . Rows without a matching bit vector are disregarded.
This will efficiently process the data, reducing the amount of data which has to be returned back to the data servers.
On Exadata, the bloom filter or bit vector is passed as an additional predicate so it will be overloaded to the storage cells making bloom filtering very efficient.
When join filtering happens at the storage layer, Oracle can take advantage of storage indexes and immediately eliminate large sections of the table as not being part of the result set. You can actually see when this type of work is happening on Exadata. When you run a SQL profile to see the wait events on the SQL statements running, you will see “cell physical IO bytes saved by storage index” which indicates that a range based bloom filter is in action.
Everything happens in the storage layer, meaning significantly less data is moving from the storage layer to the database servers after the filter is implemented. You’re reducing traffic on the InfiniBand, as well as greatly reducing the time it takes to return applicable rows of data.
Bloom filters basically convert join into filters and that’s why it’s a big deal on Exadata, as they can be offloaded to Cell server.
Bloom filtering provides a simple optimization that can have huge impact on both the SQL processing time and overall system resource utilization.
Join filtering occurs at the storage layer. As such, it will eliminate most of the data transfer to database servers.
The last thing is I want to talk about is the limitations. There are instances when you cannot use bloom filters; for example, when the small tables have indexes in the join columns.
If the optimizer selects an index range scan of that index then it will be executed in serial and not in parallel. Thus preventing a bloom filter being created.
Another limitation is that bloom filters are based on hash values, and depend heavily on the quality of the data. If you have skewed data and there are some significant hash collisions with popular values, this will make the join filtering much less efficient at the storage layer.
Want to know more about Oracle Engineered Systems and Exadata best practices? Click here.