Update caching service to allow proper distinction between error states

Problem

The current implementation of our caching layer hides errors from its callers, and this needs to be addressed.

If an error occurs while caching data such as LDAP responses, HTTP responses, or large data processes, the CachingService currently logs the stack trace, suppresses the error, and returns an empty Optional. This works well for some cases, but it is insufficient when a meaningful error needs to be passed to the client.

EXAMPLE: The RESTEasy client throws a ResteasyWebApplicationException for both 404 and 500 responses. This causes an error while creating a cache key, but the cache only returns an empty Optional. When fetching a user from Accounts or fetching from Foundationdb-api, this makes it impossible to tell the difference between data not existing and the external service being down without looking at the logs. Because we generally assume an empty Optional means the data doesn't exist, the final response to the client is always a 404.

In the case of the sync script, this could lead to user permissions being removed if LDAP or Accounts were experiencing a service disruption, because a failed lookup looks the same as a missing user.
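
The ambiguity described above can be illustrated with a minimal sketch. This is not the actual CachingService code; `SuppressingCache` and its `get` method are hypothetical stand-ins that only mimic the log-suppress-return-empty behavior.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical stand-in for the current behavior; names are illustrative only.
final class SuppressingCache {
    // Runs the loader; on any runtime error, logs it and returns an empty Optional.
    static <T> Optional<T> get(Supplier<T> loader) {
        try {
            return Optional.ofNullable(loader.get());
        } catch (RuntimeException e) {
            // Stands in for logging the stack trace.
            System.err.println("cache load failed: " + e);
            return Optional.empty();
        }
    }
}
```

With this shape, a loader that returns `null` (data not found) and a loader that throws (service down) both produce `Optional.empty()`, so the caller has nothing to base a 404-versus-5xx decision on.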

Proposed Solution

Create a cache wrapper object with some relevant fields:

  • An Optional containing the cached data
  • An Optional containing the error if there is one
  • Any additional fields that we deem relevant

Instead of returning an Optional, the caching service would return this object. Also, in the case of an error, we could allow for a refresh if the error is not a 404.
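
One possible shape for the wrapper is sketched below. The class name `CacheResult` and its methods (`of`, `failed`, `load`, `getData`, `getError`) are assumptions made for illustration, not existing API, and the final design may differ.

```java
import java.util.Optional;
import java.util.concurrent.Callable;

// Hypothetical wrapper object; all names here are illustrative assumptions.
final class CacheResult<T> {
    private final Optional<T> data;          // the cached data, if any
    private final Optional<Exception> error; // the error, if one occurred

    private CacheResult(Optional<T> data, Optional<Exception> error) {
        this.data = data;
        this.error = error;
    }

    // Successful lookup; a null value means the data does not exist.
    static <T> CacheResult<T> of(T value) {
        return new CacheResult<>(Optional.ofNullable(value), Optional.empty());
    }

    // Failed lookup; the error is kept instead of being suppressed.
    static <T> CacheResult<T> failed(Exception e) {
        return new CacheResult<>(Optional.empty(), Optional.of(e));
    }

    // Runs the loader and captures any exception in the result.
    static <T> CacheResult<T> load(Callable<T> loader) {
        try {
            return of(loader.call());
        } catch (Exception e) {
            return failed(e);
        }
    }

    Optional<T> getData() { return data; }
    Optional<Exception> getError() { return error; }
}
```

A caller could then treat "no data and no error" as a genuine 404, and inspect the captured error to decide whether to surface a server error or trigger a refresh. How the status code is read from the exception depends on the client library and is left out of this sketch.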

/cc @malowe