Create Web Analytics service

Something that has been missing from our APIs is a way to check how often an API is used and what parameters were passed in those calls. This feature has been discussed internally for a while as a way to determine which APIs are in use, which aren't, and which might not be performing well in production/staging environments. While some of this is possible by parsing the NGINX access logs, other analytics would be difficult to produce that way, such as average processing times, payload sizes, and the exact parameters passed to each call.

This issue details the first phase of this feature, where the results are ephemeral and aren't retained between runs. Retaining them is a more complex use case that needs more logic and thought to keep the data accessible but compact. In future phases, we would work to serialize the data to disk in a compressed form and allow reports to be requested on the serialized data.

Implementation details

To implement phase one of this functionality, we'd need a request filter, a response filter, and a service for tracking the results posted by the filters. To serve the data, we'll want a resource that queries the results table we maintain and lets us display those results in the browser.

Request filter

A new request filter will be created to begin tracking the request. At the start of the request, the filter will record the current time (obtained by calling now) in the request context properties. This will be used later to calculate how long the request took to complete. To include pre-processing in the measured time, we'll set @Priority(1) on the class so that it runs as early as possible.

Response filter

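A new response filter will be created to use the earlier tracking and submit an entry to the new service defined below. The total runtime will be calculated from the start time that the request filter stored in the request context. Additionally, we want this filter to run as late as possible so that the measured time includes the other response filters. Assuming these are JAX-RS filters, response filters run in descending priority order and a filter without @Priority defaults to Priorities.USER (5000), so leaving the priority unset would not place it last; giving the response filter a low value such as @Priority(1) should.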
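
The timing handoff between the two filters can be sketched without any framework code. This is only an illustration under stated assumptions: the class name RequestTiming, the property key, and the use of a plain Map in place of the request context properties are all hypothetical, and the sketch uses System.nanoTime() rather than a wall-clock now so that clock adjustments cannot skew the elapsed time.

```java
import java.util.Map;

/** Hypothetical helper; a Map stands in for the request context properties. */
final class RequestTiming {
    // Hypothetical property key, not taken from the issue.
    static final String START_KEY = "analytics.startNanos";

    private RequestTiming() {}

    /** Request-filter side: record the start time as early as possible. */
    static void markStart(Map<String, Object> properties) {
        // nanoTime is monotonic, unlike a wall-clock timestamp.
        properties.put(START_KEY, System.nanoTime());
    }

    /** Response-filter side: elapsed milliseconds, or -1 if no start was recorded. */
    static long elapsedMillis(Map<String, Object> properties) {
        Object start = properties.get(START_KEY);
        if (!(start instanceof Long)) {
            return -1L; // the request filter never ran for this request
        }
        long startNanos = (Long) start;
        return (System.nanoTime() - startNanos) / 1_000_000L;
    }
}
```

In the real filters, markStart would be called from the request filter and elapsedMillis from the response filter, each against the properties of the same request.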

WebAnalyticsService

To track and maintain the data, we'll require a new service that holds it and supports some basic querying for the resource that will serve it. In this service, we will want to track the following data points:

base URL of the request w/o params
params passed in the request, flattened to a string for ease of storage
time elapsed to process the request
status code of the request

These stats will allow us to sort results and look for patterns in the future, both in the time taken to complete certain requests and in requests that commonly result in non-successful status codes.

To limit the memory impact in phase 1 of running this API, we should cull old stats past a configurable limit, with a default of 20000 entries. This should give us a good snapshot most of the time while limiting the impact of the analytics on the host of long-running services.
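
A minimal sketch of how the store and its culling could look, assuming a simple synchronized in-memory queue is acceptable for phase 1. The entry class, field names, and method names here are illustrative and not part of the issue; only the four data points and the default limit of 20000 come from the text above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** One tracked request; class and field names are illustrative. */
final class AnalyticsEntry {
    final String baseUrl;     // request URL without query params
    final String params;      // params flattened to a single string
    final long elapsedMillis; // time taken to process the request
    final int status;         // HTTP status code of the response

    AnalyticsEntry(String baseUrl, String params, long elapsedMillis, int status) {
        this.baseUrl = baseUrl;
        this.params = params;
        this.elapsedMillis = elapsedMillis;
        this.status = status;
    }
}

/** In-memory store that keeps only the newest maxEntries entries. */
final class WebAnalyticsService {
    static final int DEFAULT_MAX_ENTRIES = 20_000;

    private final int maxEntries;
    private final Deque<AnalyticsEntry> entries = new ArrayDeque<>();

    WebAnalyticsService() {
        this(DEFAULT_MAX_ENTRIES);
    }

    WebAnalyticsService(int maxEntries) {
        if (maxEntries < 1) {
            throw new IllegalArgumentException("maxEntries must be positive");
        }
        this.maxEntries = maxEntries;
    }

    /** Appends an entry and culls the oldest ones once the limit is exceeded. */
    synchronized void record(AnalyticsEntry entry) {
        entries.addLast(entry);
        while (entries.size() > maxEntries) {
            entries.removeFirst();
        }
    }

    /** Returns a copy so callers can iterate without holding the lock. */
    synchronized List<AnalyticsEntry> snapshot() {
        return new ArrayList<>(entries);
    }
}
```

Culling on every insert keeps the memory bound strict; a batched cleanup would be an alternative if lock contention from the response filter turned out to matter.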

WebAnalyticsResource

To serve this data in the first phase of this functionality, we'll want to add a resource that follows the same pattern as the cache resource for security. To make these secure resources easier to manage, we can use the same key object that secures the cache resource in this class as well.

This resource will need two endpoints to be fully useful. The first is a summary endpoint that collates the results and gives the following data, with each base URL being given its own section:

base URL
average time to complete
max time to complete
number of calls
number of 2xx results

This doesn't represent a perfect solution, as it means each distinct value of a dynamic path segment will spawn its own section in the analytics report. This is acceptable for the first phase, as more investigation is needed into approaches that handle these requests better without large overhead.
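
The summary collation could be sketched as a single pass that groups entries by base URL. All class and method names below are hypothetical, and CallSample is a cut-down stand-in for a stored entry that carries only the fields the summary needs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Stand-in for a stored entry, limited to what the summary needs. */
final class CallSample {
    final String baseUrl;
    final long elapsedMillis;
    final int status;

    CallSample(String baseUrl, long elapsedMillis, int status) {
        this.baseUrl = baseUrl;
        this.elapsedMillis = elapsedMillis;
        this.status = status;
    }
}

/** Per-URL aggregate holding the summary fields listed above. */
final class UrlSummary {
    final String baseUrl;
    long calls;
    long successes;  // responses with a 2xx status
    long maxMillis;
    private long totalMillis;

    UrlSummary(String baseUrl) {
        this.baseUrl = baseUrl;
    }

    void add(long elapsedMillis, int status) {
        calls++;
        totalMillis += elapsedMillis;
        maxMillis = Math.max(maxMillis, elapsedMillis);
        if (status >= 200 && status < 300) {
            successes++;
        }
    }

    double averageMillis() {
        return calls == 0 ? 0.0 : (double) totalMillis / calls;
    }
}

final class SummaryBuilder {
    private SummaryBuilder() {}

    /** Groups samples by base URL, preserving first-seen order. */
    static List<UrlSummary> summarize(List<CallSample> samples) {
        Map<String, UrlSummary> byUrl = new LinkedHashMap<>();
        for (CallSample s : samples) {
            byUrl.computeIfAbsent(s.baseUrl, UrlSummary::new)
                 .add(s.elapsedMillis, s.status);
        }
        return new ArrayList<>(byUrl.values());
    }
}
```

Because the grouping key is the literal base URL, /api/organizations/1 and /api/organizations/2 end up in separate sections, which is the limitation described above.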

The second endpoint will take an encoded URL as a query string param and return all results whose base URL matches the passed string. Matching results will be returned as JSON, which gives us more flexibility in how we look at them in the future and leaves the option open for more complex filtering and reporting.

For the query param mentioned above, we'll need to sanitize it first by stripping anything that matches a regex like [^a-zA-Z0-9{}_\/-], which removes suspicious characters, and then replace each literal / with \/ to make sure that regex patterns can properly read the path parts. We can then support dynamic paths by replacing each placeholder, matched with something like \{[^{}]*\}, with [^\/]+, and then creating a pattern from the resulting string. The placeholder pattern needs to accept empty braces and stop at the first closing brace; a greedy {[\S_]+} would skip {} and could swallow several path segments at once. This would allow us to, for example, check /api/organizations/{}, generating a regex like \/api\/organizations\/[^\/]+. This regex work should give us good enough lookups of analytics without getting too deep in the weeds on the first pass.
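
The sanitize-then-build steps can be sketched with java.util.regex. The class and method names are hypothetical, and the sketch makes two assumptions beyond the text: placeholders are matched non-greedily up to the first closing brace (so an empty {} is accepted), and any unbalanced brace left over is dropped so that the final pattern always compiles.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PathPatterns {
    // Anything outside this whitelist is stripped before building the regex.
    private static final Pattern UNSAFE = Pattern.compile("[^a-zA-Z0-9{}_/-]");
    // A placeholder such as {} or {id} that does not span another brace.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{[^{}]*\\}");
    // Regex fragment matching exactly one path segment: [^\/]+
    private static final String SEGMENT = Matcher.quoteReplacement("[^\\/]+");

    private PathPatterns() {}

    static Pattern fromQuery(String rawPath) {
        String sanitized = UNSAFE.matcher(rawPath).replaceAll("");
        // Escape slashes as described above; Java does not require it, but it is harmless.
        String escaped = sanitized.replace("/", "\\/");
        String regex = PLACEHOLDER.matcher(escaped).replaceAll(SEGMENT);
        // Drop unbalanced braces, which would otherwise be an invalid quantifier.
        regex = regex.replace("{", "").replace("}", "");
        return Pattern.compile(regex);
    }
}
```

With this sketch, PathPatterns.fromQuery("/api/organizations/{}") yields a pattern that matches /api/organizations/42 in full but not /api/organizations/42/users, since each placeholder covers a single segment.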