eclipsefdn-api-common issues
https://gitlab.eclipse.org/eclipsefdn/it/api/eclipsefdn-api-common/-/issues
2024-03-20T14:17:36Z

# Quarkus 3.8 LTS upgrade
https://gitlab.eclipse.org/eclipsefdn/it/api/eclipsefdn-api-common/-/issues/100
2024-03-20T14:17:36Z · Martin Lowe (martin.lowe@eclipse-foundation.org)

In early July, Quarkus 3.2 will no longer receive security updates, as it will reach its one-year end of life. At that point, we will need to migrate to the new LTS, Quarkus 3.8. There is currently one breaking change that may impact us, though investigation will be needed to check the scope of the impact. One of the migration guides mentions that anything below the announced minimum database versions will cause the API to fail to start (https://github.com/quarkusio/quarkus/wiki/Migration-Guide-3.5#hibernate-orm), though it also notes that if you explicitly set the version, older databases may start but could experience issues.
The only DB that should be on the old MariaDB version is one of our internal API DBs, which isn't exposed to the public, so in the worst-case scenario we might be able to branch our release and backport fixes in the short term.

Due: 2024-06-07

# Add cleanup task for distributed CSRF table
https://gitlab.eclipse.org/eclipsefdn/it/api/eclipsefdn-api-common/-/issues/98
2024-03-20T14:17:36Z · Martin Lowe (martin.lowe@eclipse-foundation.org)

The current persistence runtime adds support for distributed CSRF but doesn't account for cleanup of the table. Since this is user data, we should remove any entries older than an hour on a regular schedule to ensure we don't retain private data longer than necessary.

# Add alternate to caching layers to use external/distributed caching
https://gitlab.eclipse.org/eclipsefdn/it/api/eclipsefdn-api-common/-/issues/97
2024-03-20T14:17:36Z · Martin Lowe (martin.lowe@eclipse-foundation.org)

Currently, we use an in-memory cache for our services. As we continue to integrate with external services and scale up the size and number of our pods, the need for a distributed cache increases. To better account for these cases, we should look at using a distributed cache technology like Redis to supplement our current in-memory caching strategies.
Ideally, we would be able to switch standard and loading caches independently between Redis and the in-memory Caffeine cache as needed, giving applications greater flexibility when handling both expensive-to-calculate values and quick, inexpensive ones.
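As a rough illustration of that seam, here's a minimal sketch of routing individual named caches to different backends. All names here are hypothetical, and a plain map stands in for the real Caffeine and Redis clients:

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical backend abstraction; a real version would wrap Caffeine
// and a Redis client behind this interface.
interface CacheBackend {
    Optional<String> get(String key);
    void put(String key, String value);
}

// Stand-in for the in-memory (Caffeine-style) backend.
final class InMemoryBackend implements CacheBackend {
    private final Map<String, String> store = new ConcurrentHashMap<>();
    public Optional<String> get(String key) { return Optional.ofNullable(store.get(key)); }
    public void put(String key, String value) { store.put(key, value); }
}

// Routes each named cache to a backend, defaulting to in-memory, so a
// single expensive cache can be moved to Redis without touching the rest.
final class CacheRouter {
    private final Map<String, CacheBackend> backends = new ConcurrentHashMap<>();
    private final CacheBackend fallback = new InMemoryBackend();

    void route(String cacheName, CacheBackend backend) { backends.put(cacheName, backend); }

    String computeIfAbsent(String cacheName, String key, Function<String, String> loader) {
        CacheBackend b = backends.getOrDefault(cacheName, fallback);
        return b.get(key).orElseGet(() -> {
            String v = loader.apply(key);
            b.put(key, v);
            return v;
        });
    }
}
```

The point of the sketch is the per-cache routing decision, not the backends themselves; loading-cache semantics (the `loader` function) stay the same regardless of where the values live.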
We currently don't have access to a Redis instance to develop against, so that would need to be taken care of before we could begin implementing this feature.

# Investigate adding SQL comments to persistence queries for logging
https://gitlab.eclipse.org/eclipsefdn/it/api/eclipsefdn-api-common/-/issues/90
2024-03-20T14:17:35Z · Martin Lowe (martin.lowe@eclipse-foundation.org)

In SQL, comments can be included with queries to provide additional context to the call in the DB's event logs. For the most part, Hibernate (our DB engine) doesn't seem to support comments very well, and Quarkus might not have added an option to configure this setting. We will need to do some internal testing to see whether adding HQL comments to queries results in proper comments in the SQL log.

# Move logging and cache resources to be non-application endpoints
https://gitlab.eclipse.org/eclipsefdn/it/api/eclipsefdn-api-common/-/issues/83
2024-03-20T14:12:26Z · Martin Lowe (martin.lowe@eclipse-foundation.org)

With our current setup, applications that don't have a proper root will never be able to properly use the caching and logging endpoints. To fix this, we should create custom non-application endpoints that exist outside the normal scope of endpoints and can be mounted on a different root to make them available at runtime.
To do this, we would need to convert the core package to have a deployment/runtime split, allowing build-time augmentation to bind the above resources into the management interface. Details on how this is implemented are available in https://github.com/quarkusio/quarkus/blob/main/docs/src/main/asciidoc/management-interface-reference.adoc

# Investigate handling HTTP auth for EF staging services
https://gitlab.eclipse.org/eclipsefdn/it/api/eclipsefdn-api-common/-/issues/82
2024-03-20T14:10:52Z · Martin Lowe (martin.lowe@eclipse-foundation.org)

While testing projects-staging to validate behaviour, I noticed we still aren't handling HTTP auth for staging servers. This breaks our connections any time we attempt to connect to staging API services, which isn't ideal for proper integration testing.
We should add an option to include an HTTP auth header when a token is set in config.

# Page size limit not being observed by persistence
https://gitlab.eclipse.org/eclipsefdn/it/api/eclipsefdn-api-common/-/issues/75
2024-03-20T14:09:27Z · Martin Lowe (martin.lowe@eclipse-foundation.org)

While working with Eric on the EFCP integration, we noticed that the page size limit was misbehaving on queries and was overall very inconsistent in how it was being obeyed.

# Create Web Analytics service
https://gitlab.eclipse.org/eclipsefdn/it/api/eclipsefdn-api-common/-/issues/74
2024-03-20T14:09:21Z · Martin Lowe (martin.lowe@eclipse-foundation.org)

Something that has been missing from our APIs is a way to check how often an API is used and what parameters were passed in those calls. This has been a feature talked about internally for a while as a way to determine which APIs are in use, which aren't, and which might not be performing well in production/staging environments. While this is possible with NGINX access logs if we truly wanted to parse them, some useful analytics would be difficult to produce that way, such as average processing times, payload sizes, and the exact parameters passed to the call.
This issue details the first phase of this feature, in which results are ephemeral and aren't retained between runs; retaining data between runs represents a more complex use case that would need more logic and thought to keep the data accessible but compact. In future phases, we would work to serialize this data to disk in a compressed form and allow reports to be requested on the serialized data.
## Implementation details
To implement phase one of this functionality, we'd need a request filter, a response filter, and a service for tracking the results posted by the filters. To serve the data, we'll want a resource that queries the results table we maintain and allows us to display those results in the browser.
### Request filter
A new request filter will be created to begin tracking the request. At the start of the request, the current time will be inserted into the request context properties; this will be used later to calculate how long the request took to complete. To make sure this happens as early as possible, we'll set `@Priority(1)` on the class, which should make it run as early as possible so that pre-processing is included in the measured time.
### Response filter
A new response filter will be created to use the earlier tracking and submit an entry to the new service defined below. The total runtime will be calculated from the start time set in the request context by the above filter. Additionally, we'll leave the priority of the response filter unset, as that gives it the lowest priority we can, and we want it to run as late as possible.
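The timing handshake between the two filters could look roughly like this; a plain map stands in for the JAX-RS request context properties, and the property name is hypothetical:

```java
import java.util.Map;
import java.util.concurrent.TimeUnit;

// Sketch of the timing handshake: the request filter records a start time
// as a context property, and the response filter reads it back to compute
// the elapsed time. "analytics.start" is a hypothetical property name.
public class RequestTiming {
    static final String START_PROP = "analytics.start";

    // What the @Priority(1) request filter would do as early as possible.
    static void onRequest(Map<String, Object> ctx) {
        ctx.put(START_PROP, System.nanoTime());
    }

    // What the (default-priority) response filter would do as late as possible.
    static long elapsedMillis(Map<String, Object> ctx) {
        long start = (Long) ctx.get(START_PROP);
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }
}
```

Using a monotonic clock (`System.nanoTime()`) rather than wall-clock time avoids skew if the system clock adjusts mid-request.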
### WebAnalyticsService
To track and maintain the data, we'll need a new service to hold it and allow basic querying by the resource that will serve it. In this service, we will want to track the following data points:
- base URL of the request w/o params
- params passed in the request, flattened to a string for ease of storage
- time elapsed to process the request
- status code of the request
These stats will allow us to sort results and look for patterns in the future, both in the time taken to complete certain requests and in requests that commonly result in non-successful status codes.
To limit the memory impact of running this API in phase 1, we should cull old stats past a configurable limit, with a default of 20000 entries. This should give us a good snapshot most of the time while limiting the impact of the analytics on long-running services.
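A minimal sketch of that cull-on-insert behaviour, assuming a hypothetical `AnalyticsStore` with a configurable limit (the record fields mirror the data points listed above):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical store sketch: keep at most `limit` entries (default 20000
// per the issue), dropping the oldest first when the limit is exceeded.
public class AnalyticsStore {
    record Entry(String baseUrl, String params, long elapsedMillis, int status) {}

    private final Deque<Entry> entries = new ArrayDeque<>();
    private final int limit;

    AnalyticsStore(int limit) { this.limit = limit; }
    AnalyticsStore() { this(20000); }

    synchronized void record(Entry e) {
        entries.addLast(e);
        // Cull old stats past the configured limit.
        while (entries.size() > limit) {
            entries.removeFirst();
        }
    }

    synchronized int size() { return entries.size(); }
}
```

A deque keeps both the append and the eviction O(1), so the cull adds no meaningful per-request cost.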
### WebAnalyticsResource
To serve this data in the first phase of this functionality, we'll want to add a resource that follows the same security pattern as the cache resource. To make management of these secured resources easier, we can reuse the same key object we use to secure the cache resource in this class as well.
This resource will need two endpoints to be fully useful. The first is a summary endpoint that collates the results and returns the following data, with each base URL given its own section:
- base URL
- average time to complete
- max time to complete
- number of calls
- number of 2xx results
This isn't a perfect solution, as dynamic path entries will each spawn their own section in the analytics report. This is acceptable for the first phase; more investigation will be needed into approaches that better handle these requests without large overhead.
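The summary collation itself could be sketched like this (record and class names are hypothetical), computing average and max completion time, call count, and 2xx count per base URL:

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical collation sketch for the summary endpoint: one Summary per
// base URL with average/max time, call count, and number of 2xx results.
public class AnalyticsSummary {
    record Entry(String baseUrl, long elapsedMillis, int status) {}
    record Summary(String baseUrl, double avgMillis, long maxMillis, long calls, long ok2xx) {}

    static Map<String, Summary> summarize(List<Entry> entries) {
        Map<String, Summary> out = new TreeMap<>();
        for (Entry e : entries) {
            // Fold each entry in as a single-call Summary; for the merge,
            // b.avgMillis() is just the new entry's elapsed time.
            out.merge(e.baseUrl(),
                    new Summary(e.baseUrl(), e.elapsedMillis(), e.elapsedMillis(), 1,
                            e.status() / 100 == 2 ? 1 : 0),
                    (a, b) -> new Summary(a.baseUrl(),
                            (a.avgMillis() * a.calls() + b.avgMillis()) / (a.calls() + 1),
                            Math.max(a.maxMillis(), b.maxMillis()),
                            a.calls() + 1,
                            a.ok2xx() + b.ok2xx()));
        }
        return out;
    }
}
```

Keeping the aggregation as a pure function over the stored entries means the resource can recompute the report on demand rather than maintaining running totals.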
The second endpoint will take an encoded URL as a query string param and return all endpoint results that match the passed string. Matched results will be returned as JSON for better flexibility in how we look at them in the future, giving us the option of more complex filtering and reporting.
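A sketch of the matching for this lookup endpoint, assuming the sanitize-then-expand regex approach described for it (using `\{[^}]*\}` as a slight variant of the placeholder pattern so the empty `{}` placeholder also matches):

```java
import java.util.regex.Pattern;

// Hypothetical helper: strip characters outside the allowed set, escape
// slashes, then turn {placeholder} segments into a [^/]+ wildcard so
// dynamic paths can be matched against recorded base URLs.
public class PathLookup {
    static Pattern toPattern(String rawPath) {
        // Strip suspicious characters per the suggested character class.
        String sanitized = rawPath.replaceAll("[^a-zA-Z0-9{}_/-]", "");
        // Escape literal slashes so they read as path separators in the regex.
        String escaped = sanitized.replace("/", "\\/");
        // Expand {placeholder} (including empty "{}") into a one-segment wildcard.
        String dynamic = escaped.replaceAll("\\{[^}]*\\}", "[^/]+");
        return Pattern.compile(dynamic);
    }
}
```

For example, `/api/organizations/{}` compiles to a pattern equivalent to `\/api\/organizations\/[^/]+`, which matches any single trailing path segment but not deeper paths.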
For the query param mentioned above, we'll need to sanitize it first with a regex like `[^a-zA-Z0-9{}_\/-]`, which will strip suspicious characters, and then replace literal `/` with `\/` so that regex patterns can properly read the path parts. We can then support dynamic paths by replacing `{[\S_]+}` with `[^\/]+` and compiling a pattern from the resulting string. This would allow us, for example, to check `/api/organizations/{}`, generating a regex like `\/api\/organizations\/[^\/]+`. This regex work should give us good-enough lookups of analytics without getting too deep into the weeds on the first pass.

# Add an API to get a list of repositories for a project
https://gitlab.eclipse.org/eclipsefdn/it/api/eclipsefdn-api-common/-/issues/68
2024-03-21T19:49:51Z · Wayne Beaton

We have a need to get a list of repositories for individual projects. Currently, we need this to gather project metrics and to identify repositories that require review by the IP Team. As we move to implement metrics via an external provider, we're going to need a consistent and fully-supported means of getting the list of repositories.
Here's what the scripts that I maintain currently do.
We start by getting the project metadata via API call to `projects.eclipse.org` (e.g., `https://projects.eclipse.org/json/project/adoptium.aqavit`). From that, we get:
* A list of repository URLs from the `source_repositories` field;
* Zero or more GitHub organisations from the `github_org` field; and
* Zero or more GitLab groups from the `gl_project_group` field and _excluded_ groups from the `gl_excl_sub_groups` field.
There is potentially some duplication between what is listed in the `source_repositories` field and what we find in the GitHub organisation, so I filter for that.
For each of the GitHub organisations, I use the GitHub API to get the list of repositories.
For each of the GitLab organisations, I use the GitLab API to get the list of repositories. I do this recursively, so that subgroups are included. I prune branches in excluded groups during the recursion. Note that, AFAICT, this "excluded groups" feature isn't actually exploited by any projects currently.
With a full list of repositories gathered in this manner, I exclude some repositories that I know to be mirrors or that otherwise do not contain Eclipse project code. This includes a number of OpenJDK repositories from the Adoptium subprojects and a bunch of mirror/third-party repositories under Oniro. I also skip all "website" repositories under the assumption that they do not contain project code. These exclusions are all done with a nasty bit of hard-coded regular expressions that an earlier version of myself decided would be a temporary hack.
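As a toy illustration of that pruning (the helper and its inputs are hypothetical; as noted, the real exclusions are hard-coded regular expressions):

```java
import java.util.List;

// Illustrative sketch: drop repositories that sit under an excluded GitLab
// subgroup (gl_excl_sub_groups values) and skip "website" repositories.
public class RepoFilter {
    static List<String> filter(List<String> repoUrls, List<String> excludedGroups) {
        return repoUrls.stream()
                // Prune anything under an excluded group path.
                .filter(url -> excludedGroups.stream().noneMatch(g -> url.contains("/" + g + "/")))
                // Skip "website" repositories by name.
                .filter(url -> !repoName(url).contains("website"))
                .toList();
    }

    // Last path segment with any trailing .git removed.
    static String repoName(String url) {
        String trimmed = url.endsWith(".git") ? url.substring(0, url.length() - 4) : url;
        return trimmed.substring(trimmed.lastIndexOf('/') + 1);
    }
}
```

In a real implementation the excluded-group check would ideally happen during the recursive group walk (pruning whole subtrees) rather than as a post-filter over URLs.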
I currently have this all implemented twice: once in PHP, and once in Java.
I would very much like to have this all implemented once and maintained by the same team that decides how all of this information is represented.
By way of example... The [Eclipse Dash metadata](https://projects.eclipse.org/json/project/technology.dash) contains this (trimmed)...
```
...
"gl_excl_sub_groups" : [
  {
    "value" : "eclipse/technology/dash/sync-script-testing/exclude-test"
  }
],
"gl_project_group" : [
  {
    "value" : "eclipse/technology/dash"
  }
],
...
"source_repo" : [
  {
    "name" : "dash-licenses",
    "path" : "https://github.com/eclipse/dash-licenses",
    "type" : "github",
    "url" : "https://github.com/eclipse/dash-licenses"
  }
],
...
```
Which gives me this list of repositories:
```
https://github.com/eclipse/dash-licenses
https://gitlab.eclipse.org/eclipse/technology/dash/eclipse-api-for-java.git
https://gitlab.eclipse.org/eclipse/technology/dash/eclipse-project-code.git
https://gitlab.eclipse.org/eclipse/technology/dash/org.eclipse.dash.handbook.git
```

# Investigate integrating with a metrics/monitoring system for analytics
https://gitlab.eclipse.org/eclipsefdn/it/api/eclipsefdn-api-common/-/issues/43
2024-03-20T14:00:49Z · Martin Lowe (martin.lowe@eclipse-foundation.org)

To better provide support for different APIs, we want to start tracking how often certain APIs are hit, along with extra data points like the IP of the request, the endpoint, and the timestamp of the hit. These would allow us to build custom usage reports for our endpoints and give us insight into APIs that we may be able to either deprecate or enhance based on usage patterns.
Quarkus has [bindings for Prometheus](https://quarkus.io/blog/micrometer-prometheus-openshift/) built in that allow for easy integration with systems that work with the standard data types, and allow for manipulation of tags and other metadata as well. It looks like we do have a Prometheus instance around the cluster, though I don't know if it's available for application usage or if we'd want to spin up a new instance.

# Update EntityMapper to add support to create instances
https://gitlab.eclipse.org/eclipsefdn/it/api/eclipsefdn-api-common/-/issues/32
2024-03-22T19:52:09Z · Martin Lowe (martin.lowe@eclipse-foundation.org)

Currently, we don't have the code that will create references for DB entities on load/conversion. This was coded into a few projects and should probably be added to the base of the Mapper Service to better support this use case. See https://gitlab.eclipse.org/eclipsefdn/it/api/git-eca-rest-api/-/merge_requests/115

Zachary Sabourin

# Add response filter to calculate and add ETag to headers
https://gitlab.eclipse.org/eclipsefdn/it/api/eclipsefdn-api-common/-/issues/24
2024-03-20T13:51:59Z · Martin Lowe (martin.lowe@eclipse-foundation.org)

Previously, we had a helper to add the ETag, but it hasn't been in use for a long time as it was clunky to use. We should look at generating ETags as part of a response filter rather than through the helper.
This way, it is built into our response system rather than being something we have to intentionally calculate.

Martin Lowe (martin.lowe@eclipse-foundation.org)

# Build broken due to missing test dependency - docker
https://gitlab.eclipse.org/eclipsefdn/it/api/eclipsefdn-api-common/-/issues/1
2024-03-20T13:43:32Z · Martin Lowe (martin.lowe@eclipse-foundation.org)

After merging a previous changeset that incremented the Quarkus version to 1.13.7 (latest), our builds began to break. This breakage is due to a missing dependency on Docker within the container, a new dependency introduced by Quarkus to implement `@TestContainer` interfacing for tests.
This annotation builds a temporary Docker image on the host containing anonymous, disposable images to help run tests. Usually, you have to request these images, but it looks like Hibernate ORM/JPA contains logic to look for configured test instances and otherwise creates a container for data. This looks incredibly valuable, but it will mean that builds consume more time and resources on the cluster as they manage local Docker containers.
@mbarbero @fgurr what are your thoughts on this? Should we look for a way to bypass this, or should we embrace our container overlords?

Frederic Gurr (frederic.gurr@eclipse-foundation.org) · Mikaël Barbero