The three front-end nodes serving this content have high usage spikes at times. What causes this is still unknown, but it's been happening on-and-off for years.
Note that, per bug 564113, users are actually seeing 502 Bad Gateway responses from the wiki, so this is more than general slowness: it causes the reverse proxy to time out.
There are many requests on Marketplace which take the webheads > 1s to
respond. Although the responses are cacheable, the request string contains
so many parameters about the client that it makes all of these requests
quite unique.
Witness:
$ time wget -S 'https://marketplace.eclipse.org/popular/top/api/p?client=org.eclipse.epp.mpc.core&client.version=1.8.1.v20191106-1317&os=win32&ws=win32&nl=en_GB&java.version=1.8.0_251&product=org.eclipse.epp.package.java.product&product.version=4.14.0.I20191210-0610&runtime.version=3.17.0.v20191122-2104&platform.version=4.14.0.v20191210-0612'
If you fetch that same request again, you get a cached response which is
great. But change one of java.version, product.version, etc. and it's
another 3s of processing time to resolve that request.
This one takes 5s: https://marketplace.eclipse.org/category/free-tagging/fileExtension_accounts.json%2CfileExtension_json/api/p?client=org.eclipse.epp.mpc.core&client.version=1.8.0.v20190725-1807&os=win32&ws=win32&nl=en_US&java.version=1.8.0_201&product=org.eclipse.epp.package.java.product&product.version=4.13.0.I20190916-1045&runtime.version=3.16.0.v20190823-1314&platform.version=4.13.0.v20190916-1045
Eventually, more requests come in than can be resolved in a timely fashion,
and the server CPUs get overloaded.
I believe MP API calls are the cause of the issues.
The marketplace REST API server is designed to return listings based off these variables.
For example, if you are using Eclipse 2020-03, we will only return listings that are compatible with your Eclipse version.
I do agree that we don't need all these parameters. For example, we are not using &nl=en_US. The problem is that two nearly identical environments will not share a cached response if even one value differs slightly.
We are currently working with MPC to improve these requests for version 2 of the API.
@Denis, would it help if we give you a full list of params that we need for the request so that we can configure Nginx to ignore the others?
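To make the idea concrete, here is a minimal sketch of what "ignoring the others" would mean for the cache key. It is written in Python purely for illustration; the real change would live in the Nginx configuration. The allow-list `NEEDED_PARAMS` and the function name `cache_key` are hypothetical, and the actual list of required parameters is still to be confirmed.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

# Hypothetical allow-list; the parameters we really need are still to be agreed on.
NEEDED_PARAMS = ("client", "os", "product.version")


def cache_key(url, needed=NEEDED_PARAMS):
    """Return the path plus only the allow-listed query parameters, in a fixed order."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))
    kept = [(name, params[name]) for name in needed if name in params]
    return parts.path + "?" + urlencode(kept)
```

With a key like this, two requests that differ only in `nl` or `java.version` would map to the same cached response.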
I think one of the problems that we have with the hosting of Eclipse Marketplace right now is: A request today is more expensive than a request from last year.
I took a quick look at our stats from May 2019 and compared them with those from last month.
Marketplace served 19,777,485 API requests in May 2020 and 20,264,673 in May 2019, so the request volume is roughly unchanged.
The difference from last year is that we have 4 more Eclipse releases to support:
2020-03
2019-12
2019-09
2019-06
This is causing more requests to hit our php-nodes instead of being cached by our reverse proxy.
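A rough way to see the effect: for a single API path, the number of distinct cache keys is bounded by the product of the distinct values of each query parameter, so adding releases multiplies the key space. The sketch below uses made-up counts purely for illustration; they are hypothetical and not taken from our logs.

```python
from math import prod


def distinct_keys(values_per_param):
    """Upper bound on distinct cache keys for one API path."""
    return prod(values_per_param.values())


# Hypothetical counts of distinct values per parameter (not measured).
last_year = {"product.version": 4, "java.version": 10, "nl": 20, "os": 3}
# Same traffic mix, but with four more releases in circulation.
this_year = {**last_year, "product.version": 8}

print(distinct_keys(last_year), distinct_keys(this_year))
```

Under these assumed numbers, doubling the number of supported releases doubles the number of possible keys, while the request volume stays the same, so each key is hit less often.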
Could bug 564308 be a duplicate, or a symptom of the same issue?
I believe so. I started to deploy some fixes today in an effort to improve performance.
You created that bug around the same time that I was pruning a lot of deprecated data in our database in an effort to make our marketplace listing entities smaller.
Historically, we had two use-cases for these parameters.
One was statistics - seeing the spread of different MPC, Eclipse and Java versions across different locales.
The other was filtering of compatible solutions.
Regarding the statistics, I have no clue if and what of this information is actually used at the moment - however, all of it is also available in the MPC's user agent string.
Regarding the filtering, I think we can drastically reduce this information. The only information that we really need is:
os: win32/linux/macos
java.version: cut off to the major release level (8, 9, 11, ...)
platform.version: targeted Eclipse release, cut off to the minor version (4.13, ...)
Unless needed for the stats, we can completely remove the other properties:
client
client.version
ws
nl
product
product.version
runtime.version
As far as I know the Marketplace server logic, I think cutting off the platform.version like this shouldn't be an issue. I'm not sure about the java.version. Will 8,9,... work here?
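For reference, the cut-off described above could look roughly like this. This is only a sketch in Python of the truncation rules, not MPC code. The helper names `java_major` and `platform_minor` are hypothetical, and the Java handling assumes the legacy `1.x` scheme for Java 8 and earlier.

```python
import re


def java_major(version):
    """Reduce a java.version string to its major release level."""
    parts = re.split(r"[._+-]", version)
    if parts[0] == "1" and len(parts) > 1:
        return parts[1]  # legacy 1.x scheme, e.g. 1.8.0_251 is Java 8
    return parts[0]


def platform_minor(version):
    """Reduce a platform.version string to major.minor."""
    return ".".join(version.split(".")[:2])
```

For the request shown earlier, `java_major("1.8.0_251")` gives `"8"` and `platform_minor("4.14.0.v20191210-0612")` gives `"4.14"`.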
And another thing to keep in mind: We will still have to deal with old clients that will keep sending the long format. Hopefully the MPC update prompt will do its thing and get people to update to the new version, which would catch most of the versions since Photon. Not sure what the numbers are wrt old clients though.
Chris, I guess we'll want to make the same reduction in the new API as well (remove product and cut off versions sent by the client).
platform.version would be the one to keep, since it also applies to 3rd-party RCPs, while product.version doesn't.
> At this moment, we only need the product.version (Eclipse Version) and the
> OS to decide which listing to return.
Don't we also use the java.version? If we don't, I'm fine with cutting it. But in light of bug 483383 / bug 561865, we will be needing it eventually...
> Regarding the statistics, I have no clue if and what of this information is
> actually used at the moment - however, all of it is also available in the
> MPC's user agent string.
We started to capture some stats in 2012 or 2013 but that only lasted a month or two.
If we plan on gathering stats in the future, we should use the information from the user-agent at the Nginx level.
> Regarding the filtering, I think we can drastically reduce this information.
> The only information that we really need is
> os: win32/linux/macos
> java.version: cut off to the major release level (8,9,11,...)
> platform.version: targeted eclipse release, cut off to the minor version
> (4.13, ...)
> Unless needed for the stats, we can completely remove the other properties:
> client
> client.version
> ws
> nl
> product
> product.version
> runtime.version
I stand corrected. We would need to keep "client": when the client is set to org.eclipse.epp.mpc.core, we filter commercial listings from non-Eclipse members out of the API results.
I don't think this is a problem. The order will be important here, though: we would need to make sure that this is the first query variable in the request.
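A small sketch of what a fixed parameter order could look like on the client side, in Python for illustration only. `PARAM_ORDER` and `build_query` are hypothetical names, and the parameter set after `client` is a guess at the reduced list, not a decision.

```python
from urllib.parse import urlencode

# Assumed order: "client" first, as discussed; the rest is a guess at the reduced set.
PARAM_ORDER = ("client", "os", "platform.version")


def build_query(params):
    """Serialise params in PARAM_ORDER, skipping absent ones and dropping all others."""
    return urlencode([(name, params[name]) for name in PARAM_ORDER if name in params])
```

Because the order is fixed, two clients that send the same values always produce the same query string, no matter how they assembled their parameters.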
> As far as I know the Marketplace server logic, I think cutting off the
> platform.version like this shouldn't be an issue. I'm not sure about the
> java.version. Will 8,9,... work here?
We are not using this value and I don't plan on adding this feature in the current API.
However, we hope to support this with version 2 of the Marketplace REST API:
I think we can remove it and plan on adding it back with version 2.
> And another thing to keep in mind: We will still have to deal with old
> clients that will keep sending the long format. Hopefully, the MPC update
> prompt will do its thing and get people to update to the new version, which
> would catch most of the versions since Photon. Not sure what the numbers are
> wrt old clients though.
+1. From a user experience perspective, folks with an older version of MPC will continue to have a slow experience. If we do this right, folks who have this change in their MPC client will notice faster load times from the marketplace server.
With this said, we will break this API with version 2 of our API.
I am thinking that once we launch version 2 of this API, we will announce that version 1 is deprecated and decide on a date, acceptable to all, for removing it from service.
> Chris, I guess we'll want to make the same reduction in the new API as well
> (remove product and cut off versions sent by the client).
The expectation for the new API is that we should only include parameters that are needed to alter the result.
We should use the user-agent value to pass any information that we want for gathering statistics. I don't want to spend CPU cycles storing and aggregating these stats while we are processing the request.
(In reply to Carsten Reckord from comment #24)
> Don't we also use the java.version? If we don't, I'm fine with cutting it.
> But in light of bug 483383 / bug 561865, we will be needing it eventually...
I posted information about this earlier in this comment!