By John Brunett, Senior Software Engineer
You’ve just spent months reorganizing your application to run on tablets and installing Wi-Fi for your shop floor managers. You rolled out the new version with a big launch party, complete with balloons and belly dancers. But now you’re getting reports from those same managers that the application keeps failing with intermittent network errors for no apparent reason, and you’re the one feeling exposed.
For years, xfServer has provided users a reliable way to access remote files. But those users were sitting at desktop PCs, hard-wired to a network. Now, as more mobile devices are being used to run Synergy applications, users are bumping up against the limits of the technology and wondering what’s happened to that reliable xfServer connection they depend on. It’s simple, really: Access to xfServer is via TCP/IP over a socket. In a wireless environment, the device containing the application may move from one Wi-Fi network access point to another as the user walks about. And when that happens, the connection is dropped, just as if you’d unplugged a desktop PC from the network. This results in a bad channel that needs to be closed and reopened, and then the application needs to somehow restore its context so the user can continue what he was doing. Chances are, most applications aren’t prepared to do this, so the unfortunate result is an $ERR_NETPROB error and some very unhappy users.
Connection Recovery to the rescue!
Connection Recovery, introduced in version 10.3, solves most of these transient connection loss issues. Windows xfServer can now detect a connection loss and suspend its resources (locks and context). The client runtime will automatically attempt to reconnect the next time a remote channel is accessed. Once client and server are reconnected, all channels (including their lock status and context) are restored. And you don’t need to make major changes to your application to implement this.
Connection Recovery is configured with a combination of settings on the client and server. Let’s start on the server side: On the xfServer tab of the Synergy Configuration program you’ll see some new options where you can enable the feature in either Slave or Master mode and set values for various time-outs. By default, whether you’re creating a new xfServer service or editing an existing one, Connection Recovery is enabled in Slave mode. All a client application needs to do is set the environment variable SCSKEEPCONNECT to “On” prior to opening the first file on the server, and Connection Recovery will be enabled for that client. Alternatively, Connection Recovery can be started in Master mode, which enables the feature for all clients without explicitly setting anything on the client. Master mode is a good choice if all the clients for a particular server are the same type of device and experience the same potential for connection losses. Slave mode is a better choice if you have a mixture of clients, where some are mobile devices and some are desktop systems that are unlikely to experience connection problems. (We don’t recommend using Connection Recovery for wired LANs that frequently drop connections; we’d rather see you repair your network.)
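As a rough sketch of the client side of Slave mode, a launcher script could put SCSKEEPCONNECT into the environment before the Synergy application starts, so the variable is already in place when the first server file is opened. The Python below is only an illustration: the helper name and the idea of a wrapper launcher are assumptions, and only the variable name and its “On” value come from the text above.

```python
import os


def client_environment(base_env=None, keep_connect=True):
    """Return a copy of the environment with Connection Recovery requested.

    Only SCSKEEPCONNECT and its "On" value come from the article;
    this helper itself is hypothetical.
    """
    env = dict(os.environ if base_env is None else base_env)
    if keep_connect:
        # Has to be set before the application opens its first server file.
        env["SCSKEEPCONNECT"] = "On"
    else:
        # Desktop clients in Slave mode simply do not opt in.
        env.pop("SCSKEEPCONNECT", None)
    return env
```

A launcher would then pass the result as the env argument of subprocess.run(), together with whatever command starts your application.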
In addition to the mode settings, there are four timers that can be configured: Keep context, Keep locks, Keepalive, and Client retry. You can set these timers by choosing one of the predefined profiles, which offer a convenient way to set values based on the target client type, or by selecting “Custom” and specifying your own values. The predefined profiles will work for most situations and are a good starting point. Once you’ve used the feature for a while, you may want to delve deeper into exactly what these four timers do and how they interact so that you can customize the settings to suit your particular application and the environment in which it is used.
The Keep context timer is the primary timer. It determines how long the server will save the context for a client once a dropped connection is detected. The degree of mobility of a client and how the application is used can help you determine how long you want to suspend a client context. A laptop on the desk of the CEO is less susceptible to connection loss than the tablet being carried by the shop floor manager from one side of the warehouse to the other. But should the CEO decide to carry her laptop to a conference room, she might experience the same connection loss as the shop manager. Also consider how the application is used. If the shop floor manager is constantly making queries, you’ll want a short Keep context time, but if he checks things only every half hour, you’ll want a longer time. By default, Keep context is set to five minutes.
The Keep locks timer defines how long records remain locked when a client context is saved. If your application uses optimistic locking (which is what we recommend), Keep locks can be set to a very low value—even zero. But if your application holds onto record locks, you’ll probably want a larger value. You should also consider the amount of locking activity done by your application. If it’s high, you’ll probably want to set Keep locks to a fairly short time so that other users can continue doing their work. By default, locks are held as long as the context is suspended, but regardless, once the context expires, the locks will be released.
The Keepalive timer, which configures the actual TCP/IP keepalive timer, may be the trickiest to configure properly. This timer fires periodically to tell the server to check on the status of connections. The smaller you make this value, the quicker the server can detect a connection loss and suspend the client context. But this efficiency comes at the cost of more frequent TCP/IP keepalive pings, which may be undesirable in a busy network. On the other hand, a higher value may end up leaving client contexts suspended (with record locks) for times well over the Keep context and Keep locks timers. If you’re unsure, just go with the default of five minutes.
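To see why a long Keepalive interval can stretch the effective hold time, it may help to put numbers on it. The sketch below uses a deliberately simplified model, which is an assumption rather than a description of xfServer internals: the server notices a drop at most one Keepalive interval after it happens, and only then do the Keep context and Keep locks timers start. The function name is hypothetical, and all values are in whatever single unit you choose (minutes here).

```python
def worst_case_hold(keepalive, keep_context, keep_locks=None):
    """Estimate how long after a drop the context and locks may stay held.

    Simplified model (an assumption): detection takes up to one keepalive
    interval, then the Keep context and Keep locks timers begin.
    """
    if keep_locks is None:
        # Per the article, locks are held as long as the context by default.
        keep_locks = keep_context
    # Locks never outlive the context: they are released when it expires.
    lock_window = min(keep_locks, keep_context)
    return {
        "context": keepalive + keep_context,
        "locks": keepalive + lock_window,
    }


print(worst_case_hold(5, 5))     # {'context': 10, 'locks': 10}
print(worst_case_hold(1, 5, 0))  # {'context': 6, 'locks': 1}
```

Under this model, the five-minute defaults can leave locks held for up to ten minutes after the actual drop, while a one-minute Keepalive with Keep locks at zero cuts that to about one minute.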
The final timer is the Client retry timer. As the name suggests, this controls how long the client will continue to try to reconnect to the server once it detects the disconnect. At first, you might think this timer should be set the same as the Keep context timer. After all, once the client stops retrying and fails, why would you want the context to stay around? But the truth is, the server is constantly listening on the network connection, whereas the client application is waiting for the user to select some menu item that happens to access the server connection that was lost, and that’s when the Client retry timer starts ticking. So, the question is: Once the shop floor manager clicks on that particular menu item, how long do you want to make him wait if the connection is permanently lost? Even if he’s extraordinarily patient, if it takes longer than a minute, he’s going to start thinking about killing the app and starting over. So you’ll want to set the Client retry timer to a time long enough to weather a temporary outage, but not longer than the average user is likely to wait. Once the timer expires, the application will throw a “Server connection retry failure” error ($ERR_SRVCONRTY). But, with a slight modification to your application, you can make a great improvement in the user experience. We added a new IOHooks method called reconnect_hook(), which is called when the Client retry time expires, prior to the $ERR_SRVCONRTY error being generated. Your application can pop up a message and request a Yes/No response as to whether to continue retrying the connection. If you use this approach, you can set the Client retry timer to less than a minute, and let the user decide how long he wants to wait.
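The retry-then-ask pattern described above can be modeled in a few lines. This Python sketch is not the IOHooks API and does not call the Synergy runtime; every name in it (reconnect_with_prompt, try_connect, keep_waiting) is hypothetical. It only illustrates the control flow: retry for a short window, then let the user decide whether to start another window.

```python
import time


def reconnect_with_prompt(try_connect, retry_window, keep_waiting,
                          clock=time.monotonic, sleep=time.sleep, pause=1.0):
    """Retry until connected, asking the user each time the window runs out.

    try_connect()  -> True once the connection is back
    keep_waiting() -> True if the user answers "Yes" to another round
    Returns True on reconnect, False if the user gives up.
    """
    while True:
        deadline = clock() + retry_window
        while clock() < deadline:
            if try_connect():
                return True
            sleep(pause)
        # Retry window expired: this is the point where a hook would run,
        # before the retry-failure error reaches the application.
        if not keep_waiting():
            return False
```

Because the user is consulted at the end of every window, the window itself can stay short, which matches the advice to set Client retry to less than a minute when a hook is in place.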
By now you’ve likely come to the conclusion that configuring Connection Recovery effectively is a matter of balancing competing goals. The end user wants resources to stay around long enough for the application to reconnect so that he doesn’t lose his work. But the system administrator doesn’t want to tie up system resources by holding on to contexts and locks when the client isn’t going to reconnect any time soon. To help you in this balancing act, three of the timers (Keep context, Keep locks, and Keepalive) can also be set on the client with environment variables (SCSKEEPCONTEXT, SCSKEEPLOCKS, and SCSKEEPALIVE, respectively) that override the server settings. So, you can pick values that work for most users most of the time and set those on the server, then override one or more of them for individual clients as necessary. Also, keep in mind that if you have more than one xfServer, you can set the Connection Recovery values differently for the different servers and then allocate client resources according to their connection recovery needs.
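The override rule can be pictured as a simple merge: start from the server’s values and replace any timer for which the client has set the matching environment variable. The Python sketch below is a model of that rule, not xfServer code. The function and dictionary key names are hypothetical, and values are kept as opaque strings because the units and format of these variables are not covered here. Client retry has no client-side variable in this description, so it always comes from the server.

```python
# Timer name -> client environment variable that overrides it.
CLIENT_OVERRIDES = {
    "keep_context": "SCSKEEPCONTEXT",
    "keep_locks": "SCSKEEPLOCKS",
    "keepalive": "SCSKEEPALIVE",
}


def effective_timers(server_settings, client_env):
    """Merge server timer values with any client-side overrides.

    Values are passed through untouched; their units and format are
    not assumed here.
    """
    result = dict(server_settings)  # leave the caller's dict unchanged
    for timer, variable in CLIENT_OVERRIDES.items():
        if variable in client_env:
            result[timer] = client_env[variable]
    return result
```

For example, a tablet client that sets only SCSKEEPLOCKS would get its own Keep locks value while inheriting the server’s Keep context, Keepalive, and Client retry settings.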
There are a few more things to understand. Connection Recovery is not magic: Once the application is terminated, the original context is no longer available. This is why we recommend you implement the new IOHooks method so that the user knows what is going on and can make an informed decision about quitting the application and losing his work or waiting for the connection to be recovered. Also, if the application is idle after a network loss (that is, it doesn’t attempt to access a file on the server), the server will go ahead and release the context when the Keep context value expires. If the user then resumes accessing files, the application will try to connect, and the result will be a “Server session has expired or has been terminated” error.
To help you manage Connection Recovery, we made some improvements to the Monitor utility for Windows (synxfmon). If you’re not already familiar with synxfmon, you might want to check it out. In addition to telling you which files are opened and by whom, it now reports when a client connection has been suspended and for how long. You can then choose to terminate those suspended contexts if necessary.
With version 10.3, xfServer can be as mobile as your application. Once you implement Connection Recovery, your shop floor managers are going to be so happy that they might even throw you a party. Maybe they’ll even invite the belly dancers back.