isutl: The Next Generation

By John Brunett, Senior Software Engineer

Over the years, Synergex has provided a few different re-indexing tools for its proprietary ISAM files. Back in the ’80s when DBL version 4 first came out, before it was called “Synergy,” there was no way to re-index an ISAM file if the index (.ism) file was removed or corrupted. So a few years later, we introduced irebld.dbr. It was just a DBL program that read through the data (.is1) file and then did ISAM STOREs to a new file. Not very fancy, but for the time, it did the trick.

As ISAM files started being used more, a better method was needed. To meet that need, we introduced the irecovr and ismvfy utilities in Synergy version 5 in the early ’90s. Irecovr and ismvfy were standalone C programs that were loosely based on the irebld concept; records were read from the data file and then written into a new ISAM file using the low-level SDMS API logic. This sufficed for a few years, but as disks grew larger, so did ISAM files (records in the hundreds of thousands and files in the hundreds of megabytes), and performance became much more of a factor.

Around the time of Synergy/DE 7.3.1, a totally new design replaced both irecovr and ismvfy. Enter the Isutl utility, which took advantage of the computer technology of the time, using large 64K-block sequential reads and writes and building the index in a single pass from sorted key files rather than using the per-record STORE logic, which improved performance nearly tenfold.

While the isutl utility has now been in use for nearly 20 years with little change, there have been massive leaps in disk size as well as ISAM file size (records in the tens of millions and file sizes in the tens of gigabytes). Processing times for these large files have also grown, into hours and multiple hours. So it seemed like time for another design change.

Since the original isutl design, many computer technology advances have become commonly available—multi-core CPUs, terabyte-sized disks, solid state drives, and lots and lots of memory. The old design no longer takes advantage of the available technology. Therefore, over the last year, the isutl utility has undergone a complete redesign of its internal indexing logic, taking advantage of parallel and asynchronous processing using multi-threading and atomic (non-locking) operations.

In coming up with a new design that performs better than the existing isutl, we tested many different prototypes. In the end, what made the biggest all-around impact was reducing I/O—it simply came down to the number of reads and writes. For example, the first change was simple: rather than reading through the data file for each key, isutl now reads through the data once and creates all the key files simultaneously. Because of this change, re-indexing requires more disk space, but the improved performance makes it well worth it. (To limit the amount of disk space used, see below.) This was an easy change, and the benefits will be reflected on even the oldest hardware.

The main focus of the new design effort is to keep the data in transit longer, allowing it to be further processed before writing it to a file to reduce I/O. Additionally, with what I’ll call creative ingenuity—combining one operation with the next—we could eliminate unnecessary I/O. For example, before writing data to the key file, we could send it through the first phase of the sort to eliminate one complete set of reads and writes. This all becomes possible with the introduction of multi-threading.

[Here’s where it gets nerdy. Those who are not interested in the nitty-gritty detail may want to skip over the next few paragraphs.]

In the new isutl design, the “main thread” passes memory-mapped records to a “feeder thread” via a lock-free queue, the “feeder thread” uncompresses the records (if necessary) and extracts the keys passing them to individual “key threads” via a set of key queues, and then the “key threads” use the data as input to phase 1 of the sort, which eventually gets written out as partially sorted runs. Not only is the data held in transit longer before being written out, multiple keys can be processed simultaneously. On systems with many CPU cores, large files with many keys can now be re-indexed in a fraction of the time compared to the old design—another n-fold improvement.

Old design sequence:

  1. Read data file, decompose into individual records (uncompressing if necessary), extract into a single key, write single key to a key file until end of data file. (Read/Write)
  2. Sort phase 1: Read key data, write out partially sorted runs to key file. (Read/Write)
  3. Sort phase 2: Read partially sorted key file and write sorted keys until key file is completely sorted. (Read/Write)
  4. Read sorted key file to establish metrics for the index. (Read)
  5. Read sorted key file again; based on metrics, write index. (Read/Write)
  6. Repeat 1-5 for each alternate key

New design sequence:

  1. Main thread: Read data segments (via memory-mapped I/O), add to data queue → Feeder threads (0-feeder#): Extract data segments from data queue, decompose (uncompressing if necessary) each segment into individual keys, and add to individual key queues → Key threads (1-key#): Remove keys from their respective key queues and then pass though sort phase 1 writing partially sorted key file. (Read/Write)
  2. Sort threads (1-sort#): Read partially sorted key file and calculate metrics for index while writing out sorted key file. (Read/Write)
  3. Index threads (1-index#): Read sorted key file; based on metrics, write index. (Read/Write)

When comparing the two designs, the thing to note is that the number of read and write operations is dramatically reduced in the new design. In addition, the more “machine” you can throw at it, the more it will likely take advantage of and the faster it will become.

With the new design, several configurable parameters occur automatically based on the number of CPU cores available, but as configurable parameters go, you can control different aspects based on your available hardware or circumstances by setting the following IU_ environment variables before running the utility:

IU_FEEDERTHREADS=n
               feeder#: Number of feeder threads   – (0-8) defaults roughly to number of CPUs/2

IU_KEYTHREADS=n
               key#: Number of key threads             – (1-32) defaults to number of keys up to 32

IU_SORTTHREADS=n
               sort#: Number of sort threads            – (1-8) defaults to number of CPUs up to 8

IU_IDXTHREADS
               index#: Number of index threads       – (1-n) defaults to 1

The index# parameter is always set to 1 by default because most systems don’t handle multiple threads writing to the same file very efficiently. However, on systems with extremely fast random I/O (like an NMVe SSD with PCIe or a high-end SANS), setting this number to 2 or higher may result in better performance.

The most useful parameter to know about is key#. Running the new isutl with large files with many keys will perform faster, but it will take up more disk space than before. Setting IU_KEYTHREADS to a value smaller than the number of keys will run in smaller batches (based on the number you choose), consuming less disk space. However, it may not perform quite as fast.

Another way to improve performance is in the form of a command line option, by choosing one or two secondary drives to write temp files. Reading from one drive and writing to another drive always performs better than reading and writing to the same drive. The isutl utility has a temporary file location specification, -t <tempdir>; however, the latest utility design includes a way to specify two separate drives. Specifying (-t d:\temp) as before will write all temporary files to the d:\temp folder, but specifying a second drive (-t d:\temp,e:\temp) will write all key temp files to d:\temp and all sort temp files to e:\temp. Additionally, specifying (-t ,d:\temp) will write only sort temp files to d:\temp. During testing, we found that when only one secondary drive is available, performance is better when just separating the sort temp files (-t ,d:\temp), because with (-t d:\temp) or no temp specification, all sort I/O still occurs on the same drive.

If you need to limit the amount of disk space used, you can use either or both of the above recommendations (i.e., offloading the temp files to a second or third drive or setting IU_KEYTHREADS to a small number).

Due to the reduced I/O per key, the better increase in performance will occur on large ISAM files with many keys, while single key ISAM files may not show much significant change. Because the isutl verify option uses the first two parts of the new design, higher performance can be seen there as well. You may not be aware of it, but the indexing logic found in isutl is also used by fconvert and xfODBC, as well as Synergy/DE 11 xfServer (Windows and Unix) in support of RESILIENT file management. All of these will reflect the same performance improvements.

With its forward-looking new design, I am hopeful that isutl will ride out the continuing changes in computer technology and provide the best performance possible far into the next decade.