How Much Memory Does Your Sort Really Want?
By John Brunett, Senior Software Engineer
The Synergy sort has always been a powerful language feature. However, as disk drives have gotten larger, so, too, have the files being sorted. Files that used to be many megabytes in size are now many gigabytes (and will soon be many terabytes). It has become more difficult for Synergy to choose an optimal memory size that provides the best sort performance across the full range of file sizes. In this article, I will give you some history of the Synergy sort and tell you about some new features added in 10.3.3c.
The sort used by Synergy Language (on Windows and Unix) as well as by the ISAM utilities has been tweaked over the years in regard to memory usage. When the sort was first introduced back in DBL version 4, memory was a precious commodity: Everything had to operate within a minimum amount of memory. At the time, a fixed size of 512K was allocated for the task regardless of how large the file actually was. Considering the maximum amount of memory and the disk sizes available at the time, that number was adequate.
Then, starting in version 7.3.1, the SORTMEM environment variable was introduced with a default of 1 MB, allowing an adjusted range between 512K and 32 MB. As files got larger (due to larger disks) and more memory became available, this higher default was a significant improvement because it allowed for better performance. Shortly thereafter, the isutl and fconvert utilities were introduced using the same sort with a default SORTMEM value of 20 MB. Back then, it wasn’t uncommon for systems to have over 20 MB of memory available, and since ISAM utilities aren’t typically run simultaneously on a multi-user platform (as applications that might be sorting could be), we could throw more memory at it. Then, in Synergy/DE for Windows 9.1.1, the default SORTMEM value for all application sorts was increased to 4 MB, since Windows platforms tended to be single user, single process.
Today, memory and disks are even larger, often making the default SORTMEM value much too small. When the idea for this article originally surfaced, my thought was to suggest how to set SORTMEM appropriately for large files. However, as I dug into the sort, I discovered that it wasn’t such an easy problem to crack. So, I became determined to figure out what that magic number should be.
The sorting system we use is based on a two-phase Tournament Sort that incorporates a replacement selection algorithm (based on Knuth’s Art of Computer Programming, Volume 3, section 5.4.1) to generate sorted runs in the first phase and then an external polyphase merge in the second phase. With this system, a file of any physical size can be successfully sorted with a minimal amount of memory (512K). However, with a limited amount of memory, the sort may have to process many intermediate runs, causing more I/O, which affects performance. For the best possible performance, it would need enough memory to complete the sort in only two passes, regardless of record size, number of records, or “sortedness” of the input file. The rest of this article suggests how to attain the best performance by increasing sort memory. There is no requirement that you follow these suggestions, as the sort will work regardless.
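The first phase can be sketched in Python, with a min-heap standing in for the tournament tree. This is my own illustration of replacement selection as Knuth describes it, not Synergy's implementation; the function name and the heap-based modeling are mine:

```python
import heapq
import itertools

def replacement_selection(records, num_nodes):
    """Sketch of phase 1: replacement selection (Knuth, TAOCP Vol. 3, 5.4.1).

    The real sort uses a tournament tree of up to 8192 nodes; here a
    min-heap keyed on (run number, record) models the same behavior.
    """
    it = iter(records)
    # Fill the tree with the first num_nodes records, all tagged run 0.
    heap = [(0, rec) for rec in itertools.islice(it, num_nodes)]
    heapq.heapify(heap)

    runs, current, current_run = [], [], 0
    while heap:
        run, rec = heapq.heappop(heap)
        if run != current_run:          # the tree has drained for this run
            runs.append(current)
            current, current_run = [], run
        current.append(rec)
        nxt = next(it, None)
        if nxt is not None:
            # A replacement smaller than the record just written out must
            # wait for the next run; otherwise it joins the current run.
            heapq.heappush(heap, (run if nxt >= rec else run + 1, nxt))
    if current:
        runs.append(current)
    return runs
```

On random input this produces sorted runs averaging about twice the tree size, which is why giving phase 1 more memory (more nodes) directly reduces the number of runs phase 2 has to merge.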
To determine what that magic number is, I had to look closer at both sort phases. The memory in the first phase is used to allocate the selection tree, which is an array of nodes that will contain records in the form of sorted runs. The maximum number of nodes is fixed at 8192, but the memory required for those nodes is determined by the record size. A large record that consumes more node space will happily sort with fewer nodes, but it may end up requiring more external merge passes. Here’s a rough formula to determine the amount of memory needed to allocate all 8192 nodes based on the record size:
phase1 memory = 262144 + ((recsize + 6) * 8192)
You can see that if you use the Unix default SORTMEM of 1 MB, a record size of only 89 characters will consume all of the available nodes. (The Windows default SORTMEM of 4 MB will take a 473-character record before consuming all nodes.) A larger record size will sort with fewer tree nodes and may require more merge passes unless more memory is added. But that’s not the whole picture. Even when all of the nodes have been allocated, the second phase may want even more memory.
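The formula inverts easily to find the largest record size that still gets a full tree. Here is a small Python sketch (function names are mine; note the article calls the formula rough, and a straight inversion lands a character or so above the quoted 89/473 figures, so treat the results as approximate):

```python
NODE_MAX = 8192  # maximum selection-tree nodes

def phase1_memory(recsize, nodes=NODE_MAX):
    # Rough formula from the article: fixed overhead plus per-node cost.
    return 262144 + (recsize + 6) * nodes

def max_full_tree_recsize(sortmem_bytes):
    # Largest record size that still allows all 8192 nodes.
    return (sortmem_bytes - 262144) // NODE_MAX - 6
```

For example, `max_full_tree_recsize(1024 * 1024)` comes out to 90 for the 1 MB Unix default, in line with the roughly 89-character limit described above.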
Determining the optimal memory to give phase 2 is a little more complicated. To calculate the exact amount of memory required for a single pass in phase 2, you would first have to know how sorted or unsorted the input file was. That’s not very easy without knowing the contents of the file. As a wild guess, we could pick a random distribution, or one-half sorted for every file. A formula based on that guess would look roughly like this (where #nodes is determined when allocating phase 1):
#sorted_runs = (#records / #nodes) / 2
phase2 memory = (#sorted_runs * (65560 + recsize)) + 262144
However, while I was considering this, I realized that not only is #nodes known after phase 1, but #sorted runs is also known. That means we also know exactly how sorted the file was—no guessing required. To calculate the amount of memory needed to complete the second merge phase in exactly one pass is now very simple, but it’s not something you can set in an environment variable.
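Both estimates can be sketched in Python using the rough formula above (function names are mine; the constants are those quoted in the article):

```python
def phase2_memory(sorted_runs, recsize):
    # Rough formula from the article: per-run buffer plus fixed overhead.
    return sorted_runs * (65560 + recsize) + 262144

def guessed_runs(num_records, num_nodes):
    # The "wild guess" above: assume every file is half sorted, so there
    # are half as many runs as a records/nodes division would suggest.
    return (num_records // num_nodes) // 2
```

After phase 1 the actual run count is known, so `phase2_memory(actual_runs, recsize)` gives the single-pass requirement without guessing, which is exactly the calculation 10.3.3c now performs internally.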
So, as an enhancement to version 10.3.3c, these calculations are now done automatically, and the optimal amount of memory is chosen when possible. However, to ensure the sort doesn’t get too greedy, we’ve also added a few new limits with new environment variables, which you can adjust appropriately for the system being used, if necessary. The SORTMEM environment variable remains and will override the automatic sort memory calculation, but its use is being deprecated. (You may want to check if you’re currently using SORTMEM and consider unsetting it now.) In place of SORTMEM, the new SORTMEMMAX environment variable has been added to limit the upper bound (with an allowed minimum of 512 and no maximum). On Windows, the default for SORTMEMMAX is set to 65536 (64 MB), which makes better use of available system memory. For Unix systems, where multi-user processes compete for system memory, the default is set to 4096 (4 MB). In both cases, sorts that require less memory will only allocate the exact amount needed to minimize both the number of passes and the amount of I/O, and nothing more. In many cases, this will mean less sort memory will be used.
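For example, to cap sort memory for a session you might set something like the following (the value is in KB, matching the 65536 and 4096 defaults quoted above; 32768 is just an illustrative choice, not a recommendation):

```shell
# Cap automatic sort memory at 32 MB for this session.
# SORTMEMMAX is in KB, like the 65536/4096 defaults above.
export SORTMEMMAX=32768
# If the deprecated variable is still set, it overrides the automatic
# calculation, so consider unsetting it as suggested above.
unset SORTMEM
```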
Looking at sort memory usage, here are two examples. The first file contains 5,000,000 128-byte records. Sorted on a Sun machine, the following times were logged based on different SORTMEMMAX settings (optimum is around 20 MB):
512K  - 3 min 47 sec
1 MB  - 1 min 50 sec
4 MB  - 1 min 29 sec
32 MB - 40 sec
For the same file with a larger record size of 1000 bytes, the following times were logged:
512K  - 3 hr 57 min 3 sec
1 MB  - 1 hr 39 min 9 sec
4 MB  - 1 hr 3 min 18 sec
32 MB - 10 min 33 sec
These times will vary from system to system depending on disk I/O speed and the amount of memory available, but the relative differences in time based on memory usage are what I find interesting. So, you may ask, how can you tell if you need to adjust SORTMEMMAX? As a general rule of thumb, Unix files containing more than 1 million records larger than 500 characters and Windows files that exceed 16 million 8K records are candidates. These days, both are probably rare occurrences, but to determine if other files would be good candidates, you can use debug logging. By setting SHOWSORTMEM=1, the amount of memory used and the number of necessary merge passes will be displayed to the console or terminal (if available) on every sort. Setting SHOWSORTMEM_FILE=filename will redirect that information to a log file. Sample output from the 4 MB and 32 MB 1000-byte record sorts above follows. (I didn’t include the 512K or 1 MB entries because they included 4830 and 322 passes, respectively.)
[testfile.ddf]
Total phase 1 consumed 4193344 (4194304/960)
 - tree nodes allocated (pass=1): 3899
Total phase 2 runs: 643
Total phase 2 consumed 4134168 (4194304/60136)
 - tree nodes allocated (pass=2): 58
 - tree nodes allocated (pass=3): 58
 - tree nodes allocated (pass=4): 58
 - tree nodes allocated (pass=5): 58
 - tree nodes allocated (pass=6): 58
 - tree nodes allocated (pass=7): 58
 - tree nodes allocated (pass=8): 58
 - tree nodes allocated (pass=9): 58
 - tree nodes allocated (pass=10): 58
 - tree nodes allocated (pass=11): 58
 - tree nodes allocated (pass=12): 58
 - tree nodes allocated (pass=13): 16
Sort complete 3798 seconds
Total phase 1 consumed 8520688 (8520688/0)
 - tree nodes allocated (pass=1): 8192
Total phase 2 runs: 306
Total phase 2 consumed 20644992 (20644992/0)
 - tree nodes allocated (pass=2): 306
Sort complete 633 seconds
This gives you a way to check your sorts and adjust the maximum memory threshold depending on file sizes and the amount of system memory available. Each pass (beyond the optimal two passes) represents an intermediate state where all of the records have been read and rewritten to disk. As part of the Synergy Runtime, sorts occur during Select when using OrderBy on very large selections, as well as on the SORT statement. And, if you are sorting files on a remote server, the sort occurs as part of xfServer on the server machine.
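When scanning a SHOWSORTMEM_FILE log for sorts that need attention, a few lines of Python can flag any sort that went beyond the optimal two passes. This is my own helper based on the output format shown above, not a Synergy tool:

```python
import re

def merge_passes(log_text):
    """Return the highest pass number found in SHOWSORTMEM output.

    Anything beyond two total passes means intermediate runs were
    read and rewritten to disk, costing extra I/O.
    """
    passes = [int(m) for m in re.findall(r"pass=(\d+)", log_text)]
    return max(passes, default=0)
```

Applied to the samples above, the 4 MB sort reaches pass 13 while the 32 MB sort stops at the optimal pass 2.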
In addition to application sorting, sorts also occur when using the isutl and fconvert utilities. The process of reindexing and reordering uses the sort, but since these operations tend to be done individually as a maintenance procedure, we increased the defaults and added a new environment variable separate from the one used by applications. You can set the environment variable IU_SORTMEMMAX to adjust ISAM utility sort memory, which now has a utility default of 64 MB on both Windows and Unix systems. Reordering multi-key ISAM files with many (tens of millions) large records will benefit from this change.
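Utility sorts honor their own cap independently of application sorts, so raising it before a large maintenance run might look like this (131072, i.e. 128 MB, is an illustrative value of my choosing, not a documented recommendation):

```shell
# Raise the ISAM utility sort cap (value in KB) before a large reorder;
# this affects isutl and fconvert sorts, not application sorts.
export IU_SORTMEMMAX=131072
```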
So in the end, the sort has just become a little smarter, and we’ve given you debug tools to help you optimize. Knowing exactly how much memory it wants is no longer necessary (except for those of you who push the envelope in file sizes). Now the sort will automatically adjust the memory it uses, and you can better control application performance by analyzing usage and making a few simple adjustments.