Snapshots in a Snap

Recover previous data states with ISAM change tracking

By John Brunett, Senior Software Engineer, Synergy/DE

New in Synergy/DE 10.1, change tracking is one of the biggest major additions to Synergy DBMS since its inception. Built directly into the DBMS subsystem, change tracking is an entirely transparent mechanism. Once configured, a file automatically tracks all types of changes — inserts, updates, and deletes — within the same working file. Then, using the Select class, you can develop routines to query those changes as you require.

Internally, change tracking is simply another index key (hidden, so it doesn’t interfere with your other keys). Unlike other methods of change logging (like keeping two copies of the same record, one in the file and the other in a log file), change tracking can, in most cases, just move a link from one index to another — there’s no need to write the record somewhere else, because the original copy is still there.

A new concept associated with change tracking is a snapshot. Typically with a mechanism that tracks changes, there needs to be a way to isolate changes made during a specific time period. Recording the date of every change is one option, but doing so for thousands of updates could cause costly overhead, when all you really may want are changes made between two specific times. A snapshot is a timed synchronization point that can represent a specific time (e.g., a start time or an end time). Changes made after a snapshot is applied are grouped in chronological order with respect to that snapshot until another snapshot is applied, which ends that group and starts a new group. Changes that follow the last applied snapshot are considered to be part of the next snapshot. Using the Select class, you can retrieve those changes by specifying two snapshots (start and end) by number. Snapshots can be applied at key time intervals (like end-of-day or end-of-week) for effective retrieval.

The file itself is like any other ISAM file; all READ, READS, STORE, WRITE and Select I/O operations perform as they always have. But to get at the change history, we’ve added a few new methods to the Select class. The Where.Changes and Where.NetChange methods take two snapshot numbers as arguments and return a single record for each enumerator iteration, representing a change or net change that was made to a file between the two snapshot times. So, across several snapshots, Where.Changes shows all recorded changes made throughout those snapshots. It may show an insert of a record followed by several updates of the same record. Over that same span of snapshots, Where.NetChange shows the net result of those recorded changes. In other words, the above selected record would be a single insert with the contents of the last update, or no record at all if it had been inserted and deleted during that period.

To make it easier to programmatically specify snapshot numbers, we’ve included relative numbering. A positive snapshot number represents the actual recorded snapshot number, but a negative snapshot number represents a relative snapshot. It’s possible to use relative snapshots without knowing the current snapshot.

Using the new CTInfo class, you can get information like change type (insert, update, delete) by using the CTInfo.CTState property. If necessary, you can also access back and forth across snapshots to see different versions of the same record by using the CTInfo.Older and CTInfo.Newer properties.

Identifying a change record depends on the CTState bit pattern that gets returned.

public enum CTState
    Untracked               0x00
    Insert                  0x01
    Update                  0x02
    Delete                  0x04
    ModifiedByDelete        0x10
    ModifiedByUpdate        0x20
    Changed                 0x30
    PriorVersion            0x40
endenum

Correctly identifying a change record requires using the bitwise logical operators, which can be a little tricky. Here is some sample code:

ctstate = selenum.GetCTInfo.GetCTState
if (ctstate.band.CTState.Insert) then
    display(ttchn, "[Insert")
else if (ctstate.band.CTState.Delete) then
    display(ttchn, "[Delete")
else if (ctstate.band.CTState.Update) then
    display(ttchn, "[Update")
else
    display(ttchn, "[No Change data")
if (ctstate.band.CTState.ModifiedByDelete) then
    display(ttchn, ":ModifiedByDelete")
else if (ctstate.band.CTState.ModifiedByUpdate)
    display(ttchn, ":ModifiedByUpdate")
if (ctstate.band.CTState.PriorVersion)
    display(ttchn, ":PriorVersion")
if (.not.(int)(ctstate.band.CTState.Changed).and..not.(int)(ctstate.band.CTState.Delete)) then
    writes(ttchn, ":Active]")
else
    writes(ttchn, "]")

A retrieved change record flagged with Insert reflects the record contents as they were first stored; Update reflects the new record contents after they were updated; and Delete reflects the record contents as they were when they were deleted. If an Insert or Update record includes the Changed enumeration (ModifiedByUpdate or ModifiedByDelete), an updated version or deletion of that record exists in a newer snapshot; otherwise that record is currently active. An “active record” is simply the record version that will be retrieved when accessing the file through regular I/O (READ/READS). An “inactive record” is a prior record version and cannot be changed.

Something to consider when deciding when to issue a snapshot is how changes are recorded during a snapshot. Multiple changes made to the same record during the same snapshot will get folded into a single net change. For example, an Update following an Insert during the same snapshot will be converted to a single change and marked as a net Insert at the interval the Update was made. Similarly, a Delete following an Insert made during the same snapshot will result in all traces of that record being removed entirely. If a finer level of change tracking is desired, snapshots need to be applied more frequently. You might consider a daily snapshot if you plan on implementing data replication.

You can also select on a certain type of change. For example, if you just want inserts, you can include the following in your Where expression:

Where.Changes(snapshot_1, snapshot_2).and.Where.ChangeType(CTState.Insert)

Another use of change tracking is the ability to view a file as it was at a certain time in history. The Where.Snapshot method takes a single snapshot number as its only argument to identify the end time condition for a file. Then, enumerated records represent the active (and possibly inactive) contents of a file as they existed at the time the snapshot was made. Records can be inserted, updated, or even deleted after the snapshot occurred without altering the original content of that snapshot. An example of how this feature can be used is in end-of-period processing. A bank can issue a snapshot at 5:00 pm and proceed to calculate end-of-day figures or generate a report, while customers continue to access and update their accounts without interfering with that process.

To manage a file with change tracking, we’ve added a new utility called ctutl. Ctutl enables you to apply snapshots as well as free them. A powerful function of this utility is the “rollback” function: all changes to a file can be rolled back to a specific snapshot. You’ll also use this utility to free old snapshots and the change history preceding them. The “free” function is necessary to manage the space used by change tracking. Neglecting to free old snapshots can cause a file to exceed your disk capacity or exceed the maximum 255 snapshot limit, so don’t forget to free.

Whether you need to replicate DBMS data to another database, roll back changes after failure, or report end-of-day figures, change tracking can be your path. For more information, see “Change tracking” and “Synergex.SynergyDE.Select.CTInfo” in the Synergy/DE documentation.