Caching and Synchronization

This section discusses the caching and synchronization of Curl applets and applet support files.
You can find specific recommendations on configuring Curl applications for efficient and effective caching and synchronization in the following sections:

Overview of Synchronization and Caching

Synchronization is relevant only to content that is cached locally on the client machine. For Curl applets, such content includes packages, manifests, and all content loaded over the Internet via HTTP. The section Deployment Issues discusses caching and resynchronization when using minimal style browser resident HTTP.
There are three levels of caching that can affect Curl content:
  1. HTTP caching affects all content downloaded using the HTTP protocol, i.e. with URLs having the "http:" or "https:" prefixes. HTTP caching applies to files loaded directly using read-open and related APIs as well as files loaded indirectly as the result of importing a package or a manifest.

    HTTP caching results in faster load times of content overall because the content does not have to be downloaded from the server if it is already in the cache. Files remain in the HTTP cache until they are replaced by a newer version, removed to make space when the cache becomes full, or expire in accordance with one of several cache directives that might appear in the file's HTTP header. Expiration is usually only configured for dynamically generated HTTP content (such as that generated by a CGI script) and is not commonly used with statically deployed files. See RFC 2616: Hypertext Transfer Protocol—HTTP/1.1 for details of headers that affect caching and synchronization.

    Note that on Windows, the Curl RTE uses the same HTTP cache as Internet Explorer (even when the applet is loaded by a different browser, such as Mozilla Firefox) and is affected by the user's Temporary Internet Settings from the Internet Properties control panel.
  2. In-memory caching is used for packages and manifests. The in-memory cache is maintained for the duration of a single session of the Curl RTE, until the RTE is shut down.

    In-memory caching of packages and manifests allows multiple applet instances or subsequent loads of the same applet to share components that are already loaded. It avoids the cost of recompiling and reloading packages that are already in use by the RTE. The RTE flushes unused packages and manifests from memory at various times, such as several seconds after an applet finishes loading.

    The image text procedure also caches data, but only within a single process. It does not share its data between multiple applets.
  3. Persistent caching, which was first introduced in the 3.0 API, is used only for packages. Packages are cached on disk in binary form and do not need to be recompiled when they are loaded. Persistently cached packages are removed from the cache when they are known to be out-of-date, or in order to make space in the cache as it fills up.

    By default, all packages are persistently cached. Persistent caching can be controlled or disabled by adding a package-caching-style specification to the applet or script declaration.
The Curl RTE provides synchronization mechanisms that affect all three layers of caching.
The RTE synchronizes all content relative to the time returned by process-resync-as-of, which is never later than the start time of the process. This time can be set to the time specified by the resync-as-of keyword in an applet, script, or manifest declaration, or it can be set to the process start time by one of the forced resynchronization mechanisms described in Forcing Synchronization.
Packages and manifests can specify a maximum amount of time they may be cached without resynchronization by declaring a cache-duration in their package/manifest declaration.
Content is determined to be up-to-date based on cached modification timestamps of the underlying files. Changing the modification time of a file can cause it to be considered out-of-date, even if its content has not been changed. In some cases, after the RTE has reloaded a package or manifest, it may discover that it has not changed and continue to use a previously cached version.
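The relationship between process-resync-as-of and cached content can be pictured with a small model. The following Python sketch is not Curl RTE code: the function names are hypothetical, and the rule that a cached item is rechecked when its last synchronization predates the resync time is an assumption drawn from the description above.

```python
from datetime import datetime, timezone


def effective_resync_time(process_start: datetime,
                          declared_resync_as_of: datetime) -> datetime:
    """Model of process-resync-as-of: never later than the process start time."""
    return min(declared_resync_as_of, process_start)


def needs_resync(last_synchronized: datetime, resync_time: datetime) -> bool:
    """Assumed rule: recheck items last synchronized before the resync time."""
    return last_synchronized < resync_time


if __name__ == "__main__":
    start = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
    far_future = datetime(3000, 1, 1, tzinfo=timezone.utc)
    # A far-future declaration is clamped to the process start time, so
    # everything cached before this process started is rechecked.
    print(effective_resync_time(start, far_future) == start)
```

Under this model, a resync time set to the most recent deployment leaves anything synchronized after that deployment untouched, which is why a precise time is cheaper than a far-future one.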

Forcing Synchronization

Synchronization is forced when process-resync-as-of is set to the process's start time, which can come about in a number of ways:
Automatic Synchronization
The General tab of the Curl Control Panel contains settings that affect the forced synchronization of applets. The user may arrange for all applets to be resynchronized when loaded, limit forced synchronization to debuggable applets (this is the default setting), or disable automatic forced synchronization.

Debuggable applets originating from the local filesystem are always synchronized regardless of this setting.

Applets launched from the Run menu of the Curl IDE are also automatically synchronized.
Programmatic Synchronization
Specifying a time later than the process start time for resync-as-of in the applet or script declaration forces synchronization. For instance, an applet with the declaration:

{applet resync-as-of = {utc-date-time "3000-01-01"}}


is always synchronized as of the process start time. However, because synchronization can require significant extra overhead at load time, we recommend that the applet's resync-as-of declaration refer to the time when the most recent content used by the applet was deployed to the HTTP server.

A better technique, introduced in Curl 7.0, is to specify a file whose last modification time will be used as the synchronization time. Then you only need to touch the file in order to force resynchronization. For instance, the following declaration will cause the applet to use the last modification time of the manifest as the synchronization time:

{applet
    resync-file = "manifest.mcurl",
    manifest = "manifest.mcurl"
}
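Touching the resync file can be made part of a deployment script. The following Python sketch shows one way to do it; the function name is hypothetical, and the sketch assumes that it runs on the machine whose file timestamps the HTTP server reports.

```python
from pathlib import Path


def touch_resync_file(path: Path) -> float:
    """Set the file's modification time to now and return the new mtime."""
    # Path.touch creates the file if it is missing; otherwise it only
    # updates the modification time and leaves the content unchanged.
    path.touch(exist_ok=True)
    return path.stat().st_mtime
```

For the declaration above, the deployment step would call touch_resync_file(Path("manifest.mcurl")) after copying new content to the server.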




If the applet or script does not declare resync-as-of or resync-file directly, but specifies a manifest, the RTE uses the manifest's resync-as-of setting, if one exists, to set the process-resync-as-of time.

If there is no resync-as-of/resync-file declaration in the applet, the manifest itself may not be synchronized, so if you use this technique, the manifest should specify a cache-duration to ensure that it is synchronized within an acceptable window. See Using cache-duration in Packages and Manifests for more details on the cache-duration attribute.

The function request-resync-on-reload can also be used to request forced synchronization of an applet the next time it is loaded.
Manual Synchronization
Users can manually force synchronization of applets by changing the Curl Control Panel settings described above, but this is not convenient when synchronization is only needed once in a while.

Instead, users can select the Resync on Reload option on the right-click context menu of the applet running in the browser, and then reload the applet in the browser. Since this context menu may be overridden by the applet, this option might not be available for all applets. However, if the Curl IDE is installed, the same option is also available on the control-right-click context menu.

Users of Curl scripts can force synchronization from the command line using the --resync flag. For instance:

> curl --resync my-script.xcurl

Synchronizing Individual Components and Files

The process-resync-as-of setting applies to all files and components loaded in the process, but sometimes it is useful to synchronize only a subset of the files or components used in the process. This section describes some mechanisms for doing so.

Using cache-duration in Packages and Manifests

A cache-duration declaration in a manifest can be used to ensure that the manifest has been synchronized. The cache-duration attribute specifies the amount of time that a cached component, such as a package or manifest, can go without having to be resynchronized. For instance, the following declaration ensures that the package is synchronized within one hour of the time it is imported:
{package ALPHA,
    cache-duration = 1hour
}
{let public constant alpha:int = 1}
Apart from its use to force synchronization of manifests when the applet lacks a resync-as-of, the cache-duration attribute should be used extremely sparingly. It should be used only in situations in which a particular package or manifest changes infrequently but needs to be synchronized more frequently than other components that depend on it and are not expected to change. For example, given the ALPHA package shown above and another package that imports it:
{package BETA}
{import public alpha from ALPHA}
{let public constant beta:int = 1}
When an applet imports ALPHA more than an hour after it was last synchronized, the RTE synchronizes the package again. This means that any changes to the constant alpha are guaranteed to be observed by any applet that imports it an hour or more after the change. When an applet imports BETA, on the other hand, the package does not need to be synchronized even though BETA depends on ALPHA. If ALPHA is modified, then BETA is recompiled to use the most recent version of ALPHA but BETA itself is not resynchronized, so the applet may not reflect changes to BETA's underlying HTTP files.
For this reason, use of cache-duration can cause problems when a single change spans more than one package. For example, if alpha were to be renamed to alpha-constant in both ALPHA and BETA, only the change to the ALPHA package might be noticed after the cache duration expires, and BETA would then fail to compile.
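The ALPHA and BETA scenario can be modeled in a few lines. The following Python sketch is an illustration only, not the RTE's implementation: the function name is hypothetical, and the expiry rule (resynchronize once the cached copy is at least as old as its cache-duration) is an assumption based on the description above.

```python
from datetime import datetime, timedelta
from typing import Optional


def cache_expired(last_synchronized: datetime,
                  cache_duration: Optional[timedelta],
                  now: datetime) -> bool:
    """Assumed rule: a component with a cache-duration is resynchronized
    once that much time has passed since it was last synchronized."""
    if cache_duration is None:
        # No cache-duration declared (like BETA): the component is only
        # resynchronized through the process-wide resync time.
        return False
    return now - last_synchronized >= cache_duration
```

With ALPHA declared as 1hour and BETA declaring nothing, ALPHA is rechecked on an import two hours after its last synchronization while BETA is not, which is the window in which the rename described above breaks compilation.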

Using with-file-caching-style to Synchronize HTTP Files

You can use the with-file-caching-style statement to change the file caching setting used while dynamically reading HTTP content within the scope of the statement.
For example, the following statement forces the file to be synchronized when it is read.
let content:StringBuf =
    {with-file-caching-style FileCachingStyle.resynchronize do
        {read-from {choose-location}}
    }
See FileCachingStyle for a description of the different HTTP synchronization settings, and see with-file-caching-style for a more extensive executable example.
This approach can be used to force HTTP synchronization for the underlying import-package and import-manifest files but only when the package or manifest was not already found in the in-memory or persistent caches.
The with-file-caching-style statement can also be used to disable caching of data used by the image text proc.
Note that with-file-caching-style may not be used around top-level include or import statements.

Synchronizing Dynamically Imported Manifests

Normally the main manifest and its delegate manifests are loaded implicitly based on the manifest statement in the applet or script declaration. However, manifests can also be loaded dynamically using import-manifest. The default arguments for import-manifest honor the normal process and component synchronization settings described above, but the caller may explicitly force synchronization after the process start time by setting the check-out-of-date? flag to true and wrapping the call in with-file-caching-style to force synchronization of the HTTP files:
let manifest:ComponentManifest =
    {with-file-caching-style FileCachingStyle.resynchronize do
        {import-manifest
            manifest-url,
            check-out-of-date? = true
        }
    }
The import-package syntax does not have a comparable option to the import-manifest function's check-out-of-date? keyword, and there is no direct way to force synchronization of individual imported packages. However, since packages depend on the identity of their default manifest (see get-default-manifest), changing the manifest containing the location of the package and using that to import the package should result in a newly synchronized version of the package:
let package:Package =
    {with-file-caching-style FileCachingStyle.resynchronize do
        {import-package
            package-selector,
            manifest = manifest
        }
    }
For this approach to work, the manifest must contain the entry used to locate the package being imported and the contents of the manifest must change in some way.
Since packages and manifests are not released from memory until all the processes that use them have terminated, it may waste memory to forcibly reload packages and manifests in this fashion. Furthermore, such forced reloading increases the likelihood of mixing packages with incompatible APIs, which can result in various types of errors. Because of these problems, we strongly recommend against forcing synchronization of individual manifests and packages in this fashion unless it is absolutely necessary. If you think you may need to do this, consider instead putting frequently changing code in a separate file and loading it using evaluate into a separate OpenPackage.

Configuration Recommendations Summary

This section provides some recommendations on how to configure Curl applications so that they perform caching and synchronization efficiently, and maintain correct program function. The following list summarizes the recommendations. The points summarized here are discussed in greater detail in Configuration Recommendations Details.

Configuration Recommendations Details

Server Configuration

The recommendations are:
You should keep the clocks synchronized on all machines involved in developing, deploying and serving Curl applications. The Curl RTE's synchronization mechanisms as well as those of Web browsers depend on file modification times for static content.
Keeping development and deployment machines synchronized is important because it affects the modification timestamps of files that are eventually deployed to the HTTP server. Keeping server machines synchronized is important because it may affect low-level HTTP caching behavior. Keeping server and development machines synchronized makes it much less likely that a server has files with timestamps in the future, which produces bad caching results as discussed below.
The appropriate technique for synchronizing machines depends on the platform, but is not difficult. On Windows, time synchronization can be configured from the "Date and Time" control panel. You may also need to configure a server machine as a time server and/or ensure that your firewall will allow NTP requests to external time servers. On Linux, you will need to set up the NTP service.
We also recommend configuring the HTTP server to give short expiration times to Curl applets, but not to other files loaded directly or indirectly by the applet. The Curl RTE can make sure that the files it loads are synchronized according to the applet's resync-as-of declaration, but has no control over the synchronization of the applet file itself since that is controlled by the web browser. The end-user can force synchronization in the browser by holding down either Control (for IE) or Shift (for Netscape/Mozilla) when pressing the reload button (usually mapped to the F5 key), but that requires the end user to know when to do that. The expiration period depends on the nature of your application, but unless the applet is very large and you do not need users to notice changes quickly, it usually makes sense to expire it immediately. Setting expirations for other files used by the applet is not recommended because it will result in extra unnecessary HTTP synchronization.
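The header policy recommended above can be summarized as a small function. The following Python sketch only illustrates the policy; in practice you would express it in your HTTP server's configuration (for example, with Apache's mod_expires). The function name is hypothetical, and the sketch assumes that applet files are the ones with a .curl extension.

```python
from pathlib import PurePosixPath
from typing import Dict

# Assumption: main applet files are identified by the ".curl" extension.
APPLET_SUFFIXES = {".curl"}


def cache_headers(url_path: str) -> Dict[str, str]:
    """Return extra response headers for a requested path."""
    suffix = PurePosixPath(url_path).suffix.lower()
    if suffix in APPLET_SUFFIXES:
        # Expire the applet file immediately so that the browser
        # revalidates it with the server on every load.
        return {"Cache-Control": "max-age=0"}
    # Packages, manifests, and other files get no explicit expiration;
    # the RTE synchronizes them according to the applet's declaration.
    return {}
```

For example, cache_headers("/app/start.curl") returns the immediate-expiration header, while cache_headers("/app/manifest.mcurl") returns no extra headers.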

Managing File Modification Times

The recommendations are:
The last-modified timestamp is the key file attribute controlling caching and synchronization of non-dynamically generated content in both the Curl RTE and the underlying HTTP layer. When serving content originating from files installed on the server, web servers turn the file's last-modified timestamp into an HTTP Last-Modified header. This value is saved by HTTP caches when caching the file's content, and by the Curl RTE when caching Curl packages which depend on the file. Changes to the modification time are interpreted as an indication that the content has changed and needs to be reloaded.
The HTTP standard specifies that the server may not return a Last-Modified header with a time in the future with respect to the server's time. If a file's timestamp is in the future, the server should convert it to the current time before setting the Last-Modified header. The result is that putting a file on the server with a modified timestamp in the future causes the file to be reported as having a different Last-Modified time every time it is loaded from the server, until the server's clock catches up with the timestamp. When this situation occurs with Curl package or manifest files, it looks to the Curl RTE as if the content is changing when it is not, which results in needless recompilation. The result can be a substantial increase in load time for large applications, making it important to avoid this situation. Fortunately, this situation is highly unlikely if the clocks on all machines involved in developing, deploying, and serving the Curl applications are kept synchronized.
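A pre-deployment check can catch future-dated files before they reach the server. The following Python sketch models the clamping behavior described above and scans a directory tree for offending files; both function names are hypothetical, and the scan assumes that the local clock agrees with the server's clock.

```python
from pathlib import Path
from typing import List


def reported_last_modified(file_mtime: float, server_now: float) -> float:
    """Model of the clamping described above: never report a future time."""
    return min(file_mtime, server_now)


def files_with_future_mtime(root: Path, now: float) -> List[Path]:
    """List files under root whose modification time is later than now."""
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.stat().st_mtime > now
    )
```

In this model a file stamped one hour ahead is reported with the current time on each request, so two requests a minute apart see Last-Modified values a minute apart even though the file has not changed. Running files_with_future_mtime(Path("deploy"), time.time()) on a staging directory (the name deploy is an example) should return an empty list before deployment.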
For similar reasons, it is also highly important that the same files have the same timestamps on all mirrored servers. Otherwise, a client reading a file from one server at one time and later from another server will notice the difference in Last-Modified time, consider the file to be out of date, and require reloading it and rebuilding any components that depend on it. This behavior could eliminate much of the benefit of having load balanced HTTP servers.
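One way to verify this is to compare modification times across the document roots of the mirrors. The following Python sketch assumes that both roots are reachable as local or mounted directories; the function names are hypothetical. Timestamps are truncated to whole seconds because the Last-Modified header has one-second resolution.

```python
from pathlib import Path
from typing import Dict, List


def mtimes(root: Path) -> Dict[str, int]:
    """Map each file's path relative to root to its mtime in whole seconds."""
    return {
        p.relative_to(root).as_posix(): int(p.stat().st_mtime)
        for p in root.rglob("*")
        if p.is_file()
    }


def mismatched(root_a: Path, root_b: Path) -> List[str]:
    """Return relative paths that are missing on one side or differ in mtime."""
    a, b = mtimes(root_a), mtimes(root_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty result means the two trees would report the same Last-Modified values. When copying files between mirrors, use a method that preserves timestamps, such as Python's shutil.copy2.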
In order to minimize unnecessary reloading and recompilation of Curl packages, you should not change modification times of files if the file content has not changed with respect to what was previously on the web server.
You should not replace files on the server with files having earlier modification times, because the client may not properly update those files. The design of the HTTP protocol assumes that the Last-Modified time only moves forward. Consequently, most HTTP client and server implementations do not update content when the time is changed to an earlier time. When server files have earlier modification times, the Web browser and the Curl RTE may fail to notice changes on the server until the client's browser HTTP cache has been cleared of the offending files.
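The reason earlier timestamps go unnoticed can be seen in how a conditional request is typically evaluated. The following Python sketch models a server's handling of the If-Modified-Since header in simplified form; the function name is hypothetical, times are plain integer seconds, and real servers may differ in detail.

```python
def conditional_get_status(file_mtime: int, if_modified_since: int) -> int:
    """Return the HTTP status for a conditional GET.

    The server answers 304 (Not Modified) when the file is not newer than
    the time the client sent, and 200 with the full content otherwise.
    """
    return 304 if file_mtime <= if_modified_since else 200
```

If the client cached a file with a Last-Modified time of 2000 and the server file is later replaced by an older version with a modification time of 1000, conditional_get_status(1000, 2000) yields 304, so the client keeps its stale copy.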
This situation is most likely to occur when you revert the content on a server to an earlier version. Such a reversion may involve replacing newer files with older ones that have correspondingly older timestamps. To prevent this problem, always update the file modification times before deploying to the server.
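Both timestamp rules (leave unchanged files alone, and never move a timestamp backward) can be built into the deployment step. The following Python sketch shows one possible approach; the function name and the directory layout are hypothetical. It does not handle file deletion, and it assumes that the deploying machine's clock is synchronized with the server's.

```python
import filecmp
import os
import shutil
from pathlib import Path
from typing import List


def deploy(staging: Path, live: Path) -> List[str]:
    """Copy changed files from staging to live; return the paths updated."""
    updated = []
    for src in sorted(p for p in staging.rglob("*") if p.is_file()):
        rel = src.relative_to(staging)
        dst = live / rel
        if dst.exists() and filecmp.cmp(src, dst, shallow=False):
            # Same content: keep the deployed file and its timestamp.
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(src, dst)  # copies content only, not timestamps
        os.utime(dst, None)        # stamp with the current time
        updated.append(rel.as_posix())
    return updated
```

Because changed files are always stamped with the current time, reverting to an older version still moves the modification time forward, and files whose content is unchanged are never touched.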

Applets

The recommendations are:
Once the Web browser has loaded the main applet file and passed its contents to the Curl RTE, the RTE controls synchronization of the remaining files used by the applet. You can control how the RTE performs this synchronization with a resync-file declaration in the applet's header, such as the following:
{curl 8.0 applet}
{applet
    manifest = "manifest.mcurl",
    resync-file = ""
}
This declaration tells the Curl RTE to synchronize all files and cached content (such as Curl packages and manifests) used by the applet as of the last modification time of the applet file itself adjusted by the observed time difference between the client and server machines. Other reasonable choices for the value of the resync-file are the manifest file or an empty file used solely for this purpose. If all files and other cached content have been synchronized more recently than the specified time, and have been cached, the Curl RTE does not need to perform any further communication with the server, other than that required by the application itself. It is important that you remember to update the synchronization file's modification time when you update content on your webserver when using this technique.
In the rare case that your applet is hosted by a web server that is set to the wrong time and is not under your control, and your resync file's modification time is set to the correct time (i.e., it was not set on the web server), you will probably want to disable automatic time adjustment by adding the declaration:
resync-adjust? = false
We no longer recommend using the resync-as-of declaration to control synchronization because it is harder to update and does not adjust for differences between the client and server clocks. It may still be useful when you want to force your applet to always resynchronize by setting it to a date far in the future.
The Curl RTE provides an option on the Control Panel General tab to force resynchronization of applets, but you are advised not to depend on the client machine using this setting. As the application deployer, you cannot be sure that all clients have set forced resynchronization, and any end user can easily change the setting. Giving the applet a resync-file declaration is much better, because it allows you to change the declaration as needed without having to change the configuration of any client machines.
Also note that there is a known bug present in the 4.0 and 5.0 versions of the Curl RTE, where setting the forced resynchronization control panel option causes extra synchronization to occur during lazy background caching of packages, which can lead to bad performance. This bug has been fixed in the 4.0.4 and 5.0.2 patch releases.
Since you expect the applet file to be expired frequently, you should keep its size small in order to minimize its download cost. You should place as much of the applet's functionality as possible in Curl packages, which can be cached in binary form and provide substantial time savings when loading the applet. You can also move the contents of the applet following the applet declaration into a separate file which is included by the applet. The effect of these strategies is that when an applet is reloaded but does not need to be resynchronized, only the small main applet file needs to be retrieved by the web browser, and the remaining files can come from the client's HTTP cache.

Caching and Synchronization API Summary

The following API elements are relevant to caching and synchronization of Curl applets:

Resources

HTTP Monitoring Tools

References

HTTP Protocol

Apache HTTP Server