Creating efficient SQL statements

This topic includes the following sections:

Optimizing with restriction clauses
Operators and optimization
AND and OR clauses
ORDER BY clauses
Checking the order of the FROM clause for a SQL89 join
Avoid mixing SQL92 and SQL89 syntax

Once your data files and repository are set up correctly (see Optimizing with keys), the next step in optimization is to make sure each query that accesses the database is well designed and takes advantage of xfODBC’s ability to optimize. Although this topic doesn’t cover the entire subject of creating efficient SQL statements, it does discuss some basic rules and strategies.

See Appendix B: SQL Support for information on supported SQL commands and syntax.
See How xfODBC uses keys for information on how xfODBC uses indexes for sort and restriction clauses (including joins).
See Determining which indexes are used for information on tracking which indexes are used for a query.

Optimizing with restriction clauses

xfODBC uses restriction clauses in two ways: to determine which rows to read (an initial read restriction) and to apply criteria to limit rows (a limit restriction). For optimization, initial read is the most important, but is available only if xfODBC can use a key with the restriction clause. Depending on the restriction criteria and the key order, initial reads may enable xfODBC to skip records. For example, assume you have the following restriction clause:

WHERE last_name >= 'Smith'

If there’s an ascending key on last_name, the initial read starts with ‘Smith’. Only records starting with ‘Smith’ and following will be read, which cuts down the number of reads and improves performance. As another example, say you have a statement with the following clause:

WHERE last_name <= 'Doe'

In this case, the read starts with ‘Doe’ back through to the beginning of the file. Once again, the number of reads is reduced, and performance is improved.

Operators and optimization

For relative files, xfODBC supports optimization only for reads that use the equal operator (=) with the field record number. For example, xfODBC can optimize the SQL statement in Checking the order of the FROM clause for a SQL89 join because the last line of the WHERE clause checks for equality:

AND part_index.record_number = part.record_number

For ISAM and ASCII sequential files, xfODBC supports optimization for reads that use any valid operator (>, >=, =, and so forth).

AND and OR clauses

If keys are available to optimize both sides of an OR clause, xfODBC can optimize the clause. To do this, xfODBC treats each side as a separate statement and then combines the results. If keys are not available to optimize both sides of an OR clause, xfODBC cannot optimize the clause. For more information on OR clause optimization and the SQL command that controls it, see Max number of rows and Notes on MERGESIZE.

An optimizable OR clause is generally preferable to an AND clause, but if you can’t state a restriction clause as an optimizable OR clause, it’s generally better, when possible, to use AND clauses rather than an unoptimizable OR clause. Because all conditions in an AND clause must be met, xfODBC can use the first condition as an initial read if an index can be used with the condition. (See Optimizing with restriction clauses for information on initial reads.) This may limit the number of rows xfODBC is required to read and evaluate. On the other hand, if an OR clause can’t be optimized, xfODBC can’t use the first condition of the clause as an initial read because a row can be included based on either side of an OR clause; xfODBC must read and evaluate every row.

ORDER BY clauses

If a user-defined field is part of an ORDER BY clause, xfODBC won’t use a pre-defined key for optimization.

Checking the order of the FROM clause for a SQL89 join

If your SQL statement has a SQL89 join, the FROM clause is critical. The order of the tables listed in a FROM clause determines the order the tables are evaluated, the contents of the final result set, as well as the time required to generate the result set. The first table specified in the FROM clause is the primary table, and unless there’s a restriction on the table, all rows in the primary table are selected. Rows in other tables are selected only if they meet the criteria specified in the WHERE clause. Because of this, you should order the tables in the FROM clause so that the tables are listed in the order that the rows relate to each other. For example, for the following SQL statement, assume that the order_detail and part tables are relative files with record_number as an index; in addition, assume the part_index table is an ISAM file with part_number as an index. Note that the primary table is order_detail.

SELECT
    order_detail.customer_number,
    order_detail.cust_name,
    order_detail.record_number,
    order_detail.billed_amt,
    order_detail.allowed_amt,
    order_detail.invoice_date,
    part_index.order_number,
    part.name
  FROM
    order_detail, part_index, part 
  WHERE
    order_detail.order_number > 0
    AND order_detail.record_number > 1
    AND order_detail.part_number = part_index.part_number
    AND part_index.record_number = part.record_number
    ORDER BY
      order_detail.order_number

To process the above statement, xfODBC reads order_detail (the primary table) sequentially from the first row to the last row by record_number. Each row of order_detail is tested against the restriction criteria. When a row meets the criteria (in this case order_number > 0 and record_number > 1), the associated row in part_index is located directly by part_number, and then the associated row in the part table is located directly by record_number. When the row is found, the items specified in the SELECT statement are written to a temporary sort file. Then, after reading the last row in the order_detail table, the temporary sort file is sorted by order_number and returned as the resulting set of data.

Avoid mixing SQL92 and SQL89 syntax

When writing a SQL92 join, avoid using a WHERE clause that uses a SQL89 inner join—i.e., matches columns (table1.field1 = table1.field2). This will result in a separate SQL89 join, which will generally reduce performance and may produce incorrect results. For example:

SELECT plants.in_itemid, orders.or_number 
FROM plants 
  LEFT JOIN orders
  ON plants.in_itemid = orders.or_item
WHERE plants.in_price = orders.or_price

This could be rewritten as the following, which can be optimized:

SELECT plants.in_itemid, orders.or_number 
FROM plants 
  LEFT JOIN orders
    ON plants.in_itemid = orders.or_item
      AND plants.in_price = orders.or_price