Caching

The following articles explain how Trickster leverages caching.

1 - Cache Options

Trickster supports a number of caches.

Supported Caches

Trickster supports several cache types:

  • In-Memory (default)
  • Filesystem
  • bbolt
  • BadgerDB
  • Redis (basic, cluster, and sentinel)

The sample configuration, trickster/examples/conf/example.full.yaml, demonstrates how to select and configure a particular cache type, as well as how to configure generic cache configurations such as Retention Policy.
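
For illustration, here is a minimal sketch of selecting a cache type and binding it to a backend. The key names follow the conventions of example.full.yaml, but verify them against the sample configuration for your version:

caches:
  fs1:
    provider: filesystem   # one of: memory, filesystem, bbolt, badger, redis

backends:
  default:
    provider: reverseproxycache
    origin_url: 'http://example.com/'
    cache_name: fs1        # binds this backend to the cache defined above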

In-Memory

In-Memory Cache is the default cache type, used when no other cache type is configured. The In-Memory cache utilizes a Golang sync.Map object for caching, which ensures atomic reads/writes against the cache with no possibility of data collisions. This option is good for both development environments and most smaller dashboard deployments.

When running Trickster in a Docker container, ensure the node hosting the container has enough memory available to accommodate your configured cache size, or your container may be shut down by Docker with an Out of Memory error (#137). Similarly, when orchestrating with Kubernetes, set resource allocations accordingly.
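
For example, with Kubernetes, a standard resources stanza can bound the container's memory; the sizes below are placeholders to adjust to your configured cache size:

containers:
  - name: trickster
    image: trickstercache/trickster
    resources:
      requests:
        memory: 512Mi
      limits:
        memory: 1Gi   # should comfortably exceed the configured cache size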

Filesystem

The Filesystem Cache is a popular option when you have a larger dashboard setup (e.g., many different dashboards with many varying queries, Dashboard as a Service for several teams running their own Prometheus instances, etc.) that requires more storage space than you wish to accommodate in RAM. A Filesystem Cache configuration keeps the Trickster RAM footprint small, and is generally comparable in performance to In-Memory. Trickster performance can be degraded when using the Filesystem Cache if disk i/o becomes a bottleneck (e.g., many concurrent dashboard users).

The default Filesystem Cache path is /tmp/trickster. The sample configuration demonstrates how to specify a custom cache path. Ensure that the user account running Trickster has read/write access to the custom directory, or the application will exit on startup when it tests filesystem access. All users generally have access to /tmp, so there is no permissions concern in the default case.
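
A minimal sketch of a custom cache path, assuming the filesystem options demonstrated in the sample configuration:

caches:
  default:
    provider: filesystem
    filesystem:
      cache_path: /var/cache/trickster   # the Trickster user needs read/write access here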

bbolt

BoltDB is a popular key/value store created by Ben Johnson; CoreOS’s bbolt fork is the version implemented in Trickster. A bbolt store is a filesystem-based solution that stores the entire database in a single file. Trickster, by default, creates the database at trickster.db and uses a bucket name of ‘trickster’ for storing key/value data. See the example config file for details on customizing this aspect of your Trickster deployment. The guidance about filesystem permissions in the Filesystem Cache section above also applies to a bbolt Cache.
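
A minimal sketch overriding the default database file and bucket name, assuming the bbolt options shown in the example config file:

caches:
  default:
    provider: bbolt
    bbolt:
      filename: /var/lib/trickster/trickster.db
      bucket: trickster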

BadgerDB

BadgerDB works similarly to bbolt, in that it is a filesystem-based key/value datastore. BadgerDB provides its own native object lifecycle management (TTL) and other additional features that distinguish it from bbolt. See the configuration for more info on using BadgerDB with Trickster.

Redis

Note: Trickster does not come with a Redis server. You must provide a pre-existing Redis endpoint for Trickster to use.

Redis is a good option for larger dashboard setups that also have heavy user traffic, where you might see degraded performance with a Filesystem Cache. This allows Trickster to scale better than a Filesystem Cache, but you will need to provide your own Redis instance at which to point your Trickster instance. The default Redis endpoint is redis:6379, and should work for most docker and kube deployments with containers or services named redis. The sample configuration demonstrates how to customize the Redis endpoint. In addition to supporting TCP endpoints, Trickster supports Unix sockets for Trickster and Redis running on the same VM or bare-metal host.

Ensure that your Redis instance is located close to your Trickster instance in order to minimize additional roundtrip latency.

In addition to basic Redis, Trickster also supports Redis Cluster and Redis Sentinel. Refer to the sample configuration for customizing the Redis client type.
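
A minimal sketch of a basic Redis cache configuration; the option names follow the sample configuration, and client_type may alternatively be set to cluster or sentinel:

caches:
  default:
    provider: redis
    redis:
      client_type: standard    # or 'cluster' / 'sentinel'
      protocol: tcp            # 'unix' is also supported for same-host sockets
      endpoint: 'redis:6379'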

Purging the Cache

Cache purges should not be necessary, but in the event that you wish to do so, the following steps should be followed based upon your selected Cache Type.

A future release will provide a mechanism to fully purge the cache (regardless of the underlying cache type) without stopping a running Trickster instance.

Purging In-Memory Cache

Since this cache type runs inside the virtual memory allocated to the Trickster process, bouncing the Trickster process or container will effectively purge the cache.

Purging Filesystem Cache

To completely purge a Filesystem-based Cache, you will need to:

  • Docker/Kube: delete the Trickster container (or mounted volume) and run a new one
  • Metal/VM: Stop the Trickster process and manually run rm -rf /tmp/trickster (or your custom-configured directory).

Purging Redis Cache

Connect to your Redis instance and issue a FLUSHDB (or FLUSHALL) command. Note that if your Redis instance serves applications other than Trickster, flushing will clear the cache for all dependent applications.

Purging bbolt Cache

Stop the Trickster process and delete the configured bbolt file.

Purging BadgerDB Cache

Stop the Trickster process and delete the configured BadgerDB path.

Cache Status

Trickster reports several cache statuses in metrics, logs, and tracing, which are listed and described in the table below.

Status        Description
kmiss         The requested object was not in cache and was fetched from the origin
rmiss         The object is in cache, but the specific data range requested (timestamps or byte ranges) was not
hit           The object was fully cached and served from cache to the client
phit          The object was cached for some of the data requested, but not all
nchit         The response was served from the Negative Cache
rhit          The object was served from cache to the client after being revalidated for freshness against the origin
proxy-only    The request was proxied 1:1 to the origin and not cached
proxy-error   The upstream request needed to fulfill an associated client request returned an error

2 - Byte Range Request Support

Trickster’s HTTP Reverse Proxy Cache offers best-in-class acceleration and caching of Byte Range Requests.

Much like its Time Series Delta Proxy Cache, Trickster’s Reverse Proxy Cache will determine what ranges are cached, and only request from the origin any uncached ranges needed to service the client request, reconstituting the ranges within the cache object. This ensures minimal response time for all Range requests.

In addition to supporting requests with a single Range (Range: bytes=0-5), Trickster also supports Multipart Range Requests (Range: bytes=0-5, 10-20).

Fronting Origins That Do Not Support Multipart Range Requests

In the event that an upstream origin supports serving a single Range, but does not support serving Multipart Range Requests, which is quite common, Trickster can transparently enable that support on behalf of the origin. To do so, Trickster offers a unique feature called Upstream Range Dearticulation, which separates any ranges needed from the origin into individual, parallel HTTP requests that Trickster then reconstitutes. This behavior can be enabled for any origin that only supports serving a single Range, by setting the backend configuration value dearticulate_upstream_ranges: true, as in this example:

backends:
  default:
    provider: reverseproxycache
    origin_url: 'http://example.com/'
    dearticulate_upstream_ranges: true

If you know that your clients will be making Range requests (even if they are not Multipart), check to ensure the configured origin supports Multipart Range requests. Use curl to request any static object from the origin, for which you know the size, and include a Multipart Range request; like curl -v -H 'Range: bytes=0-1, 3-4' 'http://example.com/object.js'. If the origin returns 200 OK and the entire object body, instead of 206 Partial Content and a multipart body, enable Upstream Range Dearticulation to ensure optimal performance.

This is important because a partial hit could result in multiple ranges being needed from the origin - even for a single-Range client request, depending upon what ranges are already in cache. If Upstream Range Dearticulation is disabled in this case, full objects could be unnecessarily returned from the Origin to Trickster, instead of small delta ranges, irrespective of the object’s overall size. This may or may not impact your use case.

Rule of thumb: If the origin does not support Multipart requests, enable Upstream Range Dearticulation in Trickster to compensate. Conversely, if the origin does support Multipart requests, do not enable Upstream Range Dearticulation.

Disabling Multipart Ranges to Clients

One of the great benefits of using Upstream Range Dearticulation is that it transparently enables Multipart Range support for clients, when fronting any origin that already supports serving just a single Range.

There may, however, be cases where you do not want to enable Multipart Range support for clients (since the paired Origin does not), but need Upstream Range Dearticulation to optimize Partial Hit fulfillments. For those cases, Trickster offers a setting to disable Multipart Range support for clients while Upstream Range Dearticulation is enabled. Set multipart_ranges_disabled: true, as in the example below, and Trickster will strip Multipart Range Request headers, resulting in a 200 OK response with the full body. Client single-Range requests are unaffected by this setting. This should only be set if you have a specific use case where clients should not be able to make Multipart Range requests.

backends:
  default:
    provider: reverseproxycache
    origin_url: 'http://example.com/'
    dearticulate_upstream_ranges: true
    multipart_ranges_disabled: true

Partial Hit with Object Revalidation

As explained above, whenever the client makes a Range request, and only part of the Range is in the Trickster cache, Trickster will fetch the uncached Ranges from the Origin, then reconstitute and cache all of the accumulated Ranges, while also replying to the client with its requested Ranges.

In the event that a cache lookup 1) results in a partial hit, 2) that is no longer fresh, 3) but can be revalidated, based on either a) the Origin’s provided caching directives or b) the Trickster operator’s explicit path-based Header configs that override them, Trickster will revalidate the client’s requested-but-cached range from the origin with the appropriate revalidation headers.

In a Partial Hit with Revalidation, the revalidation request is made as a separate, parallel request to the origin alongside the uncached range request(s). If the revalidation succeeds, the cached range is merged with the newly-fetched range as if it had never expired. If the revalidation fails, the Origin will return the range needed by the client that was previously cached, or potentially the entire object - either of which are used to complete the ranges needed by the client and update the cache and caching policy for the object.

Range Miss with Object Revalidation

Trickster recognizes when an object exists in cache, but has none of the client’s requested Ranges. This is a state that lies between Cache Miss and Partial Hit, and is known as “Range Miss.” Range Misses can happen frequently on Range-requested objects.

When a Range Miss occurs against an object that also requires revalidation, Trickster will not initiate a parallel revalidation request, since none of the client’s requested Ranges are actually eligible for revalidation. Instead, Trickster will use the Response Headers returned by the Range Miss Request to perform a local revalidation of the cache object. If the object is revalidated, the new Ranges are merged with the cached Ranges before writing to cache based on the newly received Caching Policy. If the object is not revalidated, the cache object is created anew solely from the Range Miss Response.

Multiple Parts Require Revalidation

A situation can arise where a partial cache hit has multiple ranges that require revalidation before they can be used to satisfy the client. In these cases, Trickster will check whether Upstream Range Dearticulation is enabled for the origin to determine how to resolve the condition. If Upstream Range Dearticulation is not enabled, Trickster trusts that the upstream origin will support Multipart Range Requests, and will include just the client’s needed-and-cached-but-expired ranges in the revalidation request. If Upstream Range Dearticulation is enabled, Trickster will forward the client’s requested Ranges, without modification, in the revalidation request to the origin. This behavior means Trickster currently does not support multiple parallel revalidation requests: whenever the cache object requires revalidation, there will be only 1 revalidation request upstream, plus 0 to N additional parallel upstream range requests as required to fulfill a partial hit.

If-Range Not Yet Supported

Trickster currently does not support revalidation based on If-Range request headers, for use with partial download resumptions by clients. If-Range headers are simply ignored by Trickster and passed through to the origin, which can result in unexpected behavior with the Trickster cache for that object.

We plan to provide full support for If-Range as part of Trickster 1.1 or 2.0.

Mockster Byte Range

For verification of Trickster’s compatibility with Byte Range Requests (as well as Time Series data), we created a golang library and accompanying standalone application dubbed Mockster. Mockster’s Byte Range library simply prints out the Lorem ipsum ... sample text, pared down to the requested range or multipart ranges, with a few bells and whistles that allow you to customize its response for unit testing purposes. We make extensive use of Mockster in unit testing to verify the integrity of Trickster’s output after performing operations like merging disparate range parts, extracting ranges from other ranges, or from a full body, compressing adjacent ranges into a single range in the cache, etc.

It is fairly straightforward to run or import Mockster into your own applications. For examples of using it for Unit Testing, check out /pkg/proxy/engines/objectproxycache_test.go.

3 - Collapsed Forwarding

Keep requests efficient with collapsed forwarding.

Collapsed Forwarding is a feature common among Reverse Proxy Cache solutions like Squid, Varnish, and Apache Traffic Server. It works by ensuring that only a single request to the upstream origin is performed for any object on a cache miss or revalidation attempt, no matter how many users are requesting the object at the same time.

Trickster has support for two types of Collapsed Forwarding: Basic (default) and Progressive.

Basic Collapsed Forwarding

Basic Collapsed Forwarding is the default functionality for Trickster, and works by waitlisting all requests for a cacheable object while a cache miss is being serviced for the object, and then serving the waitlisted requests once the cache has been populated.

The feature is further detailed in the following diagram:

Diagram of basic collapsed forwarding in Trickster

Progressive Collapsed Forwarding

Progressive Collapsed Forwarding (PCF) is an improvement upon the basic version, in that it eliminates the waitlist and serves all simultaneous requests concurrently while the object is still downloading from the server, similar to Apache Traffic Server’s “read-while-write” feature. This may be useful in low-latency applications such as DASH or HLS video delivery, since PCF minimizes Time to First Byte latency for extremely popular objects.

The feature is further detailed in the following diagram:

Diagram of progressive collapsed forwarding in Trickster

PCF for Proxy-Only Requests

Trickster provides a unique feature that implements PCF in Proxy-Only configurations, to bring the benefits of Collapsed Forwarding to HTTP Paths that are not configured to be routed through the Reverse Proxy Cache. See the Paths documentation for more info on routing.

The feature is further detailed in the following diagram:

Diagram of progressive collapsed forwarding for proxy-only requests in Trickster

How to enable Progressive Collapsed Forwarding

When configuring path configs as described in the Paths documentation, you simply need to add progressive_collapsed_forwarding: true to any path config using the proxy or proxycache handlers.

Example:

backends:
  test:
    paths:
      thing1:
        path: /test_path1/
        match_type: prefix
        handler: proxycache
        progressive_collapsed_forwarding: true
      thing2:
        path: /test_path2/
        match_type: prefix
        handler: proxy
        progressive_collapsed_forwarding: true

See the example.full.yaml for more configuration examples.

How to test Progressive Collapsed Forwarding

An easy way to test PCF is to set up your favorite file server (Lighttpd, Nginx, Apache httpd, etc.) to host a large file, turn on PCF for that path config in Trickster, and make simultaneous requests for the file. If the networking between your machine and Trickster has enough bandwidth, you should see both downloads streaming at a rate equivalent to that of the origin request.

Example:

  • Run a Lighttpd instance or docker container on your local machine and make a large file available to be served
  • Run Trickster locally
  • Make multiple curl requests of the same object

You should see the speed of the origin request limited by your disk I/O, and the speed between your client and Trickster limited by memory/CPU.

4 - Negative Caching

Cache responses to prevent system overload.

Negative Caching means caching undesired HTTP responses for a very short period of time, in order to avoid overwhelming a system that would otherwise scale normally when desired, cacheable HTTP responses are being returned. For example, Trickster can be configured to cache 404 Not Found or 500 Internal Server Error responses for a short period of time, to ensure that a thundering herd of HTTP requests for a non-existent object, or unexpected downtime of a critical service, does not create an I/O bottleneck in your application pipeline.

Trickster supports negative caching of any status code >= 300 and < 600, on a per-Backend basis. In your Trickster configuration file, associate the desired Negative Cache Map to the desired Backend config. See the example.full.yaml, or refer to the snippet below for more information.

The Negative Cache Map must be an all-inclusive list of explicit status codes; there is currently no wildcard or status code range support for Negative Caching entries. By default, the Negative Cache Map is empty for all backend configs. The Negative Cache only applies to Cacheable Objects, and does not apply to Proxy-Only configurations.

For any response code handled by the Negative Cache, the response object’s effective cache TTL is explicitly overridden to the value of that code’s Negative Cache TTL, regardless of any response headers provided by the Backend concerning cacheability. All response headers are left intact and unmodified by Trickster’s Negative Cache, such that Negative Caching is transparent to the client. The X-Trickster-Result response header will indicate that a response was served from the Negative Cache by providing a cache status of nchit.

Multiple negative cache configurations can be defined, and are referenced by name in the backend config. By default, a backend uses the ‘default’ Negative Cache config, which, by default, is empty. The default can be easily populated in the config file, and additional configs can easily be added, as demonstrated below.

The format of a negative cache map entry is 'status_code': ttl_in_ms.

Example Negative Caching Config

negative_caches:
  default:
    '404': 3000 # cache 404 responses for 3 seconds
  foo:
    '404': 3000
    '500': 5000
    '502': 5000

backends:
  default:
    provider: rpc
    # by default will assume negative_cache_name = 'default'
  another:
    provider: rpc
    negative_cache_name: foo

5 - Trickster Caching Retention Policies

How long Trickster retains data

Basic HTTP Backends

Trickster will respect HTTP 1.0, 1.1 and 2.0 caching directives from both the downstream client and the upstream origin when determining object cacheability and TTL. You can override the TTL by setting a custom Cache-Control header on a per-Path Config basis.
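
As a sketch of such an override, assuming the response_headers Path Config option described in the Paths documentation, a per-path Cache-Control header might be injected like this:

backends:
  default:
    provider: reverseproxycache
    origin_url: 'http://example.com/'
    paths:
      images:
        path: /images/
        match_type: prefix
        handler: proxycache
        response_headers:
          Cache-Control: max-age=86400   # overrides the origin's TTL for this path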

Cache Object Evictions

If you use a Trickster-managed cache (Memory, Filesystem, bbolt), then a maximum cache size is maintained by Trickster. You can configure the maximum size in number of bytes, number of objects, or both. See the example configuration for more information.
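
A hedged sketch of a size-bounded memory cache, using the index option names as they appear in example.full.yaml (verify against your version):

caches:
  default:
    provider: memory
    index:
      max_size_objects: 512        # begin evicting when the object count exceeds this
      max_size_bytes: 536870912    # or when the total size exceeds 512 MB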

Once the cache has reached its configured maximum size of objects or bytes, Trickster will undergo an eviction routine that removes cache objects until the size has fallen below the configured maximums. Trickster-managed caches maintain a last access time for each cache object and utilize a Least Recently Used (LRU) methodology when selecting objects for eviction.

Caches whose object lifetimes are not managed internally by Trickster (Redis, BadgerDB) will use their own policies and methodologies for evicting cache records.

Time Series Backends

For non-time series responses from a TSDB, Trickster will adhere to HTTP caching rules as directed by the downstream client and upstream origin.

For time series data responses, Trickster will cache as follows:

TTL Settings

TTL settings for each Backend configured in Trickster can be customized independently of each other, and separate TTL configurations are available for time series objects and fast forward data. See examples/conf/example.full.yaml for more info on configuring default TTLs.
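
A hedged sketch of per-backend TTL overrides; the *_ttl_ms key names are assumptions drawn from example.full.yaml and should be verified for your version:

backends:
  default:
    provider: prometheus
    origin_url: 'http://prometheus:9090'
    timeseries_ttl_ms: 21600000   # retain time series objects for 6 hours
    fastforward_ttl_ms: 15000     # retain fast forward data for 15 seconds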

Time Series Data Retention

Separately from the TTL of a time series cache object, Trickster allows you to control the size of each timeseries object, represented as a count of maximum timestamps in the cache object, on a per origin basis. This configuration is known as the timeseries_retention_factor (TRF), and has a default of 1024. Most dashboards for most users request and display approximately 300-to-400 timestamps, so the default TRF allows users to still recall recently-displayed data from the Trickster cache for a period of time after the data has aged off of real-time views.

If you have users with a high-resolution dashboard configuration (e.g., a 24-hour view with a 1-minute step, amounting to 1440 data points per graph), then you may benefit from increasing the timeseries_retention_factor accordingly. If you use a managed cache (see caches) and increase the timeseries_retention_factor, the overall size of your cache will not change; the result will be fewer objects in cache, with the timeseries objects having a larger share of the overall cache size with more aged data.
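
For example, to comfortably retain that 1440-timestamp view, you might raise the factor on the relevant backend; a minimal sketch using the key name given above:

backends:
  default:
    provider: prometheus
    origin_url: 'http://prometheus:9090'
    timeseries_retention_factor: 2048   # default is 1024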

Time Series Data Evictions

Once the TRF is reached for a time series cache object, Trickster will undergo a timestamp eviction process for the record in question. Unlike the Cache Object Eviction, which removes an object from cache completely, TRF evictions examine the data set contained in a cache object and remove timestamped data in order to reduce the object size down to the TRF.

Time Series Data Evictions apply to all cached time series data sets, regardless of whether or not the cache object lifecycle is managed by Trickster.

Trickster provides two eviction methodologies (timeseries_eviction_method) for time series data: oldest (default) and lru. The methodology is configurable per origin.

When timeseries_eviction_method is set to oldest, Trickster maintains time series data by calculating the “oldest cacheable timestamp” value upon each request, using time.Now().Add(step * timeseries_retention_factor * -1). Any queries for data older than the oldest cacheable timestamp are intelligently offloaded to the proxy since they will never be cached, and no data that is older than the oldest cacheable timestamp will be stored in the query’s cache record.

When timeseries_eviction_method is set to lru, Trickster will not calculate an oldest cacheable timestamp, but rather maintain a last-accessed time for each timestamp in the cache object, and evict the Least-Recently-Used items in order to maintain the cache size.
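
For example, a backend could opt into the lru methodology like this (a minimal sketch using the key name given above):

backends:
  default:
    provider: prometheus
    origin_url: 'http://prometheus:9090'
    timeseries_eviction_method: lru   # default is 'oldest'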

The advantage of the oldest methodology is better cache performance, at the cost of not caching very old data. Thus, Trickster will be more performant computationally while providing a slightly lower cache hit rate. The lru methodology, since it requires accessing the cache on every request and maintaining access times for every timestamp, is computationally more expensive, but can achieve a higher cache hit rate, since it permits caching data of any age so long as it is accessed frequently enough to avoid eviction.

Most users will find the oldest methodology to meet their needs, so it is recommended to use lru only if you have a specific use case (e.g., dashboards with data from a diverse set of time ranges, where caching only relatively young data does not suffice).