This script lists all repositories for a project, not only those in a single GitHub org or GitLab group (as in the PMI). When a project (e.g. Eclipse Kuksa) has repos both in the GitHub Eclipse org (e.g. https://github.com/eclipse/kuksa.val) and in its own project org (e.g. https://github.com/eclipse-kuksa), the script lists them all.
The script also lists repositories that are empty, and repositories that are used for websites (in case you wondered).
This is the problem with the Projects endpoint... I've already had to interpret and extrapolate from it twice. It would be good to have one single endpoint that gives us all of the information.
@epoirier The code that I've provided uses GitHub APIs to identify repositories in GitHub orgs, the GitLab API to identify repositories (recursively) in GitLab groups, and then adds repositories that are listed individually. The Java version uses actual libraries to do this. The PHP version makes REST calls directly and interprets the JSON results.
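To make the three steps concrete, here is a minimal sketch of the collection logic, not the actual Java or PHP implementation. The function name and the three injected lookup callables are hypothetical; in practice they would wrap the GitHub and GitLab API calls.

```python
from typing import Callable, Iterable, List, Set


def collect_project_repos(
    github_orgs: Iterable[str],
    gitlab_groups: Iterable[str],
    individual_repos: Iterable[str],
    list_github_org_repos: Callable[[str], Iterable[str]],
    list_gitlab_group_repos: Callable[[str], Iterable[str]],
    list_gitlab_subgroups: Callable[[str], Iterable[str]],
) -> List[str]:
    """Return the sorted, de-duplicated repository URLs for one project.

    The three callables are assumptions for this sketch: each takes an org
    or group name and returns repository URLs (or subgroup names).
    """
    repos: Set[str] = set()

    # 1. Every repository in each GitHub org associated with the project.
    for org in github_orgs:
        repos.update(list_github_org_repos(org))

    # 2. Every repository in each GitLab group, descending into subgroups.
    pending = list(gitlab_groups)
    seen_groups: Set[str] = set()
    while pending:
        group = pending.pop()
        if group in seen_groups:
            continue  # guard against visiting the same group twice
        seen_groups.add(group)
        repos.update(list_gitlab_group_repos(group))
        pending.extend(list_gitlab_subgroups(group))

    # 3. Repositories that are listed individually on the project.
    repos.update(individual_repos)

    return sorted(repos)
```

Passing the lookups in as parameters keeps the sketch independent of any particular client library and makes it easy to exercise with canned data.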
I have tried to use the excluded groups feature where I can, but you'll likely notice that I've had to hardcode a few exceptions to avoid, for example, including the OpenJDK repositories that Adoptium clones as "project repositories" (and similarly for Oniro). I need to make a real effort to configure projects using the exclusion features so that I can remove the hardcoded content.
You'll also notice that the Java implementation hardcodes fewer exceptions. This is because the context in which that implementation is used never encounters Adoptium or Oniro.
FYI, I started experimenting with some code that tries to determine whether or not a repository has "interesting" content. A repository that only contains a README, LICENSE, ... is not interesting (in an initial contribution review context, we would skip an uninteresting repository until it becomes interesting). I'll likely push that update later today.
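A rough sketch of what such a check could look like, assuming we already have the list of file paths in the repository. The set of "boilerplate" names below is my assumption, not the list the experimental code actually uses.

```python
from pathlib import PurePosixPath
from typing import Iterable

# Hypothetical set of file stems that do not count as "interesting".
# Only README and LICENSE come from the discussion; the rest are guesses.
BOILERPLATE_STEMS = {
    "readme",
    "license",
    "notice",
    "contributing",
    "code_of_conduct",
    "security",
}


def has_interesting_content(paths: Iterable[str]) -> bool:
    """Return True if at least one file is not a boilerplate file."""
    for path in paths:
        name = PurePosixPath(path).name
        # "README.md" -> "readme", "LICENSE" -> "license"
        stem = name.split(".", 1)[0].lower()
        if stem not in BOILERPLATE_STEMS:
            return True
    return False
```

An empty repository comes out as not interesting under this definition, so a reviewer would skip it along with README-only repositories.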
It might be helpful to include some metadata along with the list of repositories.
For example, it would be helpful to know whether or not the repository isEmpty or hasInterestingContent.
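As an illustration only, each entry in the list could carry those two flags next to the repository URL. The `url` field name and the overall JSON shape are assumptions here, not an agreed format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterable


@dataclass
class RepoEntry:
    url: str  # assumed field name
    isEmpty: bool  # flag named in the discussion
    hasInterestingContent: bool  # flag named in the discussion


def to_json(entries: Iterable[RepoEntry]) -> str:
    """Serialise the entries as a JSON array of objects."""
    return json.dumps([asdict(entry) for entry in entries], indent=2)


if __name__ == "__main__":
    print(to_json([
        RepoEntry("https://github.com/eclipse/kuksa.val", False, True),
    ]))
```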
I suggested that we extend our existing Project API to ensure that all GitHub and GitLab repo URLs are included in the body. This data should be available to us via the Dash database.
We used to track all the GitHub and GitLab repos in the PMI, but over time the purpose of those fields changed when we started to track GitLab groups and GitHub orgs.
This should not replace the existing GitLab and GitHub fields. The idea here is to merge these two datasets and remove duplicates.
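A minimal sketch of the merge-and-deduplicate step, assuming each dataset is a plain list of repository URLs. The normalisation rules (ignore a trailing slash, a `.git` suffix, and letter case) are my assumptions about what should count as a duplicate.

```python
from typing import Dict, Iterable, List


def normalize(url: str) -> str:
    """Build a comparison key for a repository URL (assumed rules)."""
    url = url.strip().rstrip("/")
    if url.endswith(".git"):
        url = url[: -len(".git")]
    return url.lower()


def merge_repo_urls(*sources: Iterable[str]) -> List[str]:
    """Merge several URL lists, keeping the first spelling of each repo."""
    merged: Dict[str, str] = {}
    for source in sources:
        for url in source:
            # setdefault keeps the entry that was seen first
            merged.setdefault(normalize(url), url.strip())
    # dicts preserve insertion order, so the result is in first-seen order
    return list(merged.values())
```

Keeping the first-seen spelling means the existing field values win over anything added later, which fits the goal of not replacing the current GitLab and GitHub fields.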
These changes may have an impact on our sync script. We will need to ensure that @zacharysabourin and @malowe are involved before we deploy any changes to our API.
I plan to make this an MBO for us to implement this functionality in D9 in Q1. It will go live when the PMI D9 is ready for production.
@epoirier Can you share a link from staging to allow @wbeaton and @bbaldassari2kd to review the changes and provide feedback on the format if they have any special requirements?