Join the Select Who Select Using Join

By John Brunett, Senior Software Engineer

The Select statement was introduced back in Synergy/DE 9.3.1, originally conceived as a tool to retrieve status from the new change tracking feature yet to be released in version 10. Since then, it has grown in functionality as useful new ways to use it have been revealed. Once again, with the release of Synergy/DE 10.3.3, Select contains a new addition: the Join construct. Using SQL92 as a basis, this initial implementation of the Join provides functionality equivalent to SQL’s INNER and LEFT OUTER JOIN. We added two new methods to the From class, InnerJoin and LeftJoin, to accomplish this functionality. Both return From objects so they can be combined to form complex, multi-table joins.

The general philosophy of Synergy DBMS has always been “simple is better,” which explains why ISAM speeds tend to outperform a fully functional SQL database. First in designing Select, and now with the Join, we’ve followed that same paradigm. Rather than build in a complex preprocess to reorganize and re-optimize the overall Join, we’ve opted for the WYSIWYG (what you see is what you get) approach, and we accomplish this with certain specific rules. For example, the first (or leftmost) table/file is always the “driving table,” and all subsequent tables are “inner tables.”  All Joins include an ON expression to define how to join two tables in the form of an “equi-join,” where identical fields in both tables match equally. The fields matched on inner tables are required to be key fields. An On expression may contain additional filter expressions separated by a logical AND, which is evaluated during the Join. Additionally, filter conditions are allowed in the Where parameter of the Select, but these may only reference fields in the driving table and are evaluated after the final Join.

If you’re familiar with the look and feel of the Select statement, and its nonconventional “OO method” that loosely resembles the SQL Select, you’ll recognize our effort to convey the same feel with the Join component in a procedural language. Here’s a simple comparison of SQL’s Select vs. Synergy’s Select statement:

      SELECT * FROM Customers

vs.

      sObj = new Select(new From(CUSTOMERS, rec))

The Join simply builds on that concept, always maintaining everything in a single procedural statement.

The following statement shows a simple inner Join of Customers from Orders:

      sObj = new Select(new From(ORDERS, oRec).InnerJoin(new From(CUSTOMERS, cRec),
      &      (On)(oRec.CustNo==cRec.CustNo))).Join()

As you can see, the single statement methodology starts to get a little cumbersome, so breaking up the statement into separate objects can help make it more readable.

      fromOrders = new From(ORDERS, oRec)
      fromCustomers = new From(CUSTOMERS, cRec)
      jObj = new Select(fromOrders.InnerJoin(fromCustomers, (On)(oRec.CustNo==cRec.CustNo))).Join()

Returning to SQL as an example, a JOIN is typically formatted like this:

      SELECT *
      FROM Orders
      INNER JOIN Customers
      ON Orders.CustNo=Customers.CustNo

Beginning in Synergy/DE 10.3.1, the compiler allows a physical line to be broken on a method’s path delimiters, so the Join can be as readable as a SQL statement by using indenting.

      jObj = new Select(
      &      new From(ORDERS, oRec)
      &      .InnerJoin(new From(CUSTOMERS, cRec),
      &      (On)(oRec.CustNo==cRec.CustNo))).Join()

In conjunction with Join, a few other new classes have also been added to the Select namespace: On, Rows and RowEnumerator, JoinSelect, and Sparse. Similar to the Where class, the On class is used to define how two tables should be joined, as well as to set conditions to filter out rows (much like the SQL ON statement). The Rows class represents the resulting set of rows returned by a single Join iteration, and it contains methods and properties to access individual rows. The RowEnumerator and JoinSelect classes represent the “glue” to combine everything into an enumerable entity. And finally, the Sparse class, an alternative to the existing SparseRecord method, allows runtime construction of the sparse data definition for xfServer for Join and non-Join Select rows.

The following is a more complete example of a complex four-table Join.

      foreach rows in new Select(“*”,
      &       new From(ORDERS, oRec)
      &       .InnerJoin(new From(ORDERDETAILS, dRec),
      &       (On)(oRec.OrderID == dRec.OrderID))
      &       .InnerJoin(new From(CUSTOMERS, cRec),
      &       (On)(oRec.CustomerID == cRec.CustomerID))
      &       .InnerJoin(new From(PRODUCTS, pRec),
      &       (On)(drec.ProductID == pRec.ProductID)),
      &       (Where)(oRec.Date .gt. beginDate)).Join()
        begin
              rows.fill(oRec)      ;Fills the local records from
              rows.fill(dRec)      ; the retrieved rows following
              rows.fill(cRec)      ; the specified Sparse spec “*”
              rows.fill(pRec)      ; in this case all fields.
        end

Notice the additional use of a Where expression to filter rows that are actually returned. This could also serve as an example of what not to do. Let’s say the ORDERS file is actually your ORDERHISTORY file that contains every order since your company opened, and you only want to see orders since the beginning of the week. Your result will include only the rows/records that you want, but at a cost of potentially reading every order, order detail, customer, and product record in your database. By moving the expression from the Where parameter to the first On parameter, as shown below, the first Join will only include the desired rows, reducing subsequent I/O and making the operation take a fraction of the time.

      &       (On)(oRec.OrderID == dRec.OrderID .AND. oRec.Date .gt. beginDate))

I hope you will find this latest enhancement to Synergy DBMS useful and decide to “Join the Select.” For more information on Join and what it can do, see Synergex.SynergyDE.Select.Select in the Synergy/DE documentation.