modeling.sirius

About this document

This document is an R notebook, generated dynamically from the data extracted for the project. It lists all datasets published for the project, with basic numbers, figures and a quick summary for each, and serves as a test case to check that all the required data is present and roughly consistent with requirements. All plots and tables are computed from the actual data as provided in the downloads.

To re-execute the document, simply render it with the project ID as a parameter:

render("datasets_report.inc", params = list(project_id = "modeling.sirius"))

This report was generated on 2021-02-28.

Downloads

Downloads are composed of gzip’d CSV and JSON files. CSV files always have a header to name the fields, which makes it easy to import in analysis software like R:

data <- read.csv(file='myfile.csv', header=T)
names(data)
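Since the downloads are gzip'd, it may help to see that a compressed CSV can be read without a separate decompression step. The following is a minimal sketch in base R; the file name example.csv.gz and its two columns are hypothetical sample data created on the fly so that the snippet is self-contained.

```r
# Build a small gzip'd CSV in a temporary directory (hypothetical sample data).
tmp <- file.path(tempdir(), "example.csv.gz")
sample.data <- data.frame(date = c("2021-01-01", "2021-01-02"),
                          commits = c(3, 1))
con <- gzfile(tmp, open = "w")
write.csv(sample.data, con, row.names = FALSE)
close(con)

# gzfile() decompresses transparently, so read.csv sees plain CSV text.
data <- read.csv(file = gzfile(tmp), header = TRUE)
names(data)
```

The same call should work on any of the .csv.gz downloads once the path points to the downloaded file.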

List of datasets generated for the project:

  • Git
    • Git Commits (CSV) – Full list of commits with id, message, time, author, committer, and added, deleted and modified lines.
    • Git Commits Evol (CSV) – Evolution of number of commits and authors by day.
    • Git Log (TXT) – the raw export of git log.
  • Bugzilla
  • Eclipse Forums
    • Forums Posts (CSV) – list of all forum posts for this project.
    • Forums threads (CSV) – list of all forum threads for this project.
  • Eclipse PMI
    • PMI Checks (CSV) – list of all checks applied to the Project Management Infrastructure entries for the project.

Git

Git commits

Download: git_commits_evol.csv.gz

data <- read.csv(file=file_git_commits_evol, header=T)

File is git_commits_evol.csv, and has 3 columns for 50 entries.

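library(xts)       # time series container
library(dygraphs)  # interactive plots; also provides the %>% pipe

# Running total of commits, then index the series by day.
data$commits_sum <- cumsum(data$commits)
data.xts <- xts(x = data[,c('commits_sum', 'commits', 'authors')], order.by=as.POSIXct(as.character(data$date), format="%Y-%m-%d"))

# Build an empty series covering every day between the first and last entry.
time.min <- index(data.xts[1,])
time.max <- index(data.xts[nrow(data.xts),])
all.dates <- seq(time.min, time.max, by="days")
empty <- xts(order.by = all.dates)

# Merge so that days without commits appear in the series, then set them to 0.
merged.data <- merge(empty, data.xts, all=T)
merged.data[is.na(merged.data)] <- 0

# Plot the daily number of commits with a range selector.
p <- dygraph(merged.data[,c('commits')],
        main = paste('Daily commits for ', project_id, sep=''),
        width = 800, height = 250 ) %>%
      dyRangeSelector()
p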
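The step that fills days without commits with zeros can also be sketched without xts. The example below is only an illustration in base R on hypothetical data (the data frame sparse and its values are made up); it uses a left join against a complete calendar rather than the xts merge used in this report.

```r
# Hypothetical sparse daily data: nothing recorded on 2021-01-02 and 2021-01-03.
sparse <- data.frame(
  date = as.Date(c("2021-01-01", "2021-01-04")),
  commits = c(3, 2)
)

# Build the complete calendar between the first and last observed day.
all.days <- data.frame(date = seq(min(sparse$date), max(sparse$date), by = "day"))

# A left join keeps every calendar day; days without data come back as NA.
filled <- merge(all.days, sparse, by = "date", all.x = TRUE)

# Replace the missing values by 0 and make sure rows are in date order.
filled$commits[is.na(filled$commits)] <- 0
filled <- filled[order(filled$date), ]
filled
```

Filling with 0 is appropriate for per-day counts such as commits; a cumulative column would need its last value carried forward instead.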


Git log

Download: git_log.txt.gz

File is git_log.txt, and full log has 914 lines.


Bugzilla

Bugzilla issues

Download: bugzilla_issues.csv.gz

data <- read.csv(file=file_bz_issues, header=T)

File is bugzilla_issues.csv, and has 17 columns for 2702 issues.

Bugzilla open issues

Download: bugzilla_issues_open.csv.gz

data <- read.csv(file=file_bz_issues_open, header=T)

File is bugzilla_issues_open.csv, and has 17 columns for 905 issues (all open).

Bugzilla evolution

Download: bugzilla_evol.csv.gz

data <- read.csv(file=file_bz_evol, header=T)

File is bugzilla_evol.csv, and has 3 columns for 1174 weeks.

Let’s try to plot the monthly number of submissions for the project:
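As a rough sketch of how weekly rows could be rolled up into monthly totals before plotting: the column names date and issues_created below are hypothetical (the actual header of bugzilla_evol.csv is not shown in this report), and the values are made-up sample data. Each week is assigned to the month of its date.

```r
# Hypothetical weekly data: one row per week, with the number of submissions.
weekly <- data.frame(
  date = as.Date(c("2020-11-02", "2020-11-09", "2020-11-30", "2020-12-07")),
  issues_created = c(4, 1, 2, 5)
)

# Label each week with its year and month, e.g. "2020-11".
weekly$month <- format(weekly$date, "%Y-%m")

# Sum the weekly submissions within each month.
monthly <- aggregate(issues_created ~ month, data = weekly, FUN = sum)
monthly
```

A week that straddles two months is counted entirely in the month of its date, which is a simplification.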

Versions

Download: bugzilla_versions.csv.gz

data <- read.csv(file=file_bz_versions, header=T)

File is bugzilla_versions.csv, and has 2 columns for 40 versions.

Components

Download: bugzilla_components.csv.gz

data <- read.csv(file=file_bz_components, header=T)

File is bugzilla_components.csv, and has 2 columns for 9 components.

data.sorted <- data[order(data$Bugs, decreasing = T),]

g <- gvisColumnChart(data.sorted, options=list(title='List of product components', legend="{position: 'none'}", width="automatic", height="300px"))
plot(g)

Eclipse Forums

Forums posts

Download: eclipse_forums_posts.csv.gz

data <- read.csv(file=file_forums_posts, header=T)

File is eclipse_forums_posts.csv, and has 6 columns for 7702 posts. The evolution of posts over time is plotted by week:

data$created.date <- as.POSIXct(data$created_date, origin="1970-01-01")
posts.xts <- xts(data, order.by = data$created.date)

time.min <- index(posts.xts[1,])
time.max <- index(posts.xts[nrow(posts.xts)])
all.dates <- seq(time.min, time.max, by="weeks")
empty <- xts(order.by = all.dates)

merged.data <- merge(empty, posts.xts$id, all=T)
merged.data[is.na(merged.data) == T] <- 0

posts.weekly <- apply.weekly(x=merged.data, FUN = nrow)
names(posts.weekly) <- c("posts")

p <- dygraph(
  data = posts.weekly[-1,],
  main = paste('Weekly forum posts for ', project_id, sep=''),
  width = 800, height = 250 ) %>%
  dyAxis("x", drawGrid = FALSE) %>%
  dySeries("posts", label = "Weekly posts") %>%
  dyOptions(stepPlot = TRUE) %>%
  dyRangeSelector()
p
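The weekly grouping can be illustrated in base R as well. This is a minimal sketch on hypothetical timestamps, assuming that created_date holds seconds since 1970-01-01 (as the origin argument above suggests) and that weeks start on Monday; it is not the xts-based computation used for the plot.

```r
# Hypothetical created_date values, in seconds since 1970-01-01 UTC:
# 2020-12-12, 2020-12-11 and 2020-12-06 (all at midnight).
created_date <- c(1607731200, 1607644800, 1607212800)
created <- as.POSIXct(created_date, origin = "1970-01-01", tz = "UTC")

# Label each post with the Monday that starts its week.
# %u is the ISO weekday number (1 = Monday, ..., 7 = Sunday).
day <- as.Date(created)
week.start <- day - (as.integer(format(day, "%u")) - 1)

# Count posts per week.
posts.weekly <- as.data.frame(table(week = week.start), stringsAsFactors = FALSE)
names(posts.weekly) <- c("week", "posts")
posts.weekly
```

Weeks without any post do not appear in this table; they would have to be added with a count of 0 before plotting a continuous series.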

The 10 most recent posts on the forums:

data$created.date <- as.POSIXct(data$created_date, origin="1970-01-01")
# Keep html_url while building the subject links; it is dropped before printing.
posts.table <- head(data[,c('id', 'subject', 'created.date', 'author_id', 'html_url')], 10)
posts.table$subject <- paste('<a href="', posts.table$html_url, '">', posts.table$subject, '</a>', sep='')
posts.table$html_url <- NULL
posts.table$created.date <- as.character(posts.table$created.date)
names(posts.table) <- c('ID', 'Subject', 'Post date', 'Post author')

print(
    xtable(head(posts.table, 10),
        caption = paste('10 most recent posts on', project_id, 'forum.', sep=" "),
        digits=0, align="lllll"), type="html",
    html.table.attributes='class="table table-striped"',
    caption.placement='bottom',
    include.rownames=FALSE,
    sanitize.text.function=function(x) { x }
)

10 most recent posts on modeling.sirius forum.

| ID | Subject | Post date | Post author |
|---|---|---|---|
| 1835772 | Re: Nodes with Pins | 2020-12-12 00:38:23 | 230824 |
| 1835756 | Sirius Support for ETL and other Epsilon Languages | 2020-12-11 15:58:57 | 228437 |
| 1835711 | Re: How to prevent associations to parent containers | 2020-12-10 20:51:12 | 228437 |
| 1835709 | Constrain edges to be unique | 2020-12-10 20:50:20 | 228437 |
| 1835616 | Re: Is there a way to support UML Aggregation in Sirius? | 2020-12-08 21:58:04 | 228437 |
| 1835614 | How to prevent associations to parent containers | 2020-12-08 21:53:26 | 228437 |
| 1835599 | Re: Nodes with Pins | 2020-12-08 09:10:41 | 49151 |
| 1835534 | Re: Edges display - Bug / Missing feature | 2020-12-06 06:24:32 | 226924 |
| 1835530 | Nodes with Pins | 2020-12-06 05:31:11 | 230824 |
| 1835516 | Re: background color of Project explorer in eclipse | 2020-12-05 13:36:21 | 49151 |


Forums threads

Download: eclipse_forums_threads.csv.gz

data <- read.csv(file=file_forums_threads, header=T)

File is eclipse_forums_threads.csv, and has 8 columns for 2310 threads. A wordcloud with the main words used in threads is presented below.
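A wordcloud of this kind is driven by word frequencies. The following is a minimal sketch of that counting step in base R, on a few sample subjects standing in for data$subject; the stop-word list is a small illustrative assumption, not the one used for this report.

```r
# Sample thread subjects standing in for data$subject.
subjects <- c("Nodes with Pins",
              "Constrain edges to be unique",
              "Edges display - Bug / Missing feature")

# Lower-case, split on anything that is not a letter, drop empty strings.
words <- unlist(strsplit(tolower(subjects), "[^a-z]+"))
words <- words[nchar(words) > 0]

# Remove a few very common words (a tiny illustrative stop list).
stop.words <- c("a", "to", "be", "with", "the", "of", "in")
words <- words[!(words %in% stop.words)]

# Frequency table, most frequent word first.
freq <- sort(table(words), decreasing = TRUE)
head(freq, 5)
```

The resulting frequencies are what a wordcloud package would take as input to scale each word.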

The list of the 10 last active threads on the forums:

data$last.post.date <- as.POSIXct(data$last_post_date, origin="1970-01-01")
# Keep html_url while building the subject links; it is dropped before printing.
threads.table <- head(data[,c('id', 'subject', 'last.post.date', 'last_post_id', 'replies', 'views', 'html_url')], 10)
threads.table$subject <- paste('<a href="', threads.table$html_url, '">', threads.table$subject, '</a>', sep='')
threads.table$html_url <- NULL
threads.table$last.post.date <- as.character(threads.table$last.post.date)
# last_post_id holds the ID of the last post, not its author.
names(threads.table) <- c('ID', 'Subject', 'Last post date', 'Last post ID', 'Replies', 'Views')

print(
    xtable(threads.table,
        caption = paste('10 last active threads on', project_id, 'forum.', sep=" "),
        digits=0, align="lllllll"), type="html",
    html.table.attributes='class="table table-striped"',
    caption.placement='bottom',
    include.rownames=FALSE,
    sanitize.text.function=function(x) { x }
)

10 last active threads on modeling.sirius forum.

| ID | Subject | Last post date | Last post ID | Replies | Views |
|---|---|---|---|---|---|
| 1106261 | Sirius Support for ETL and other Epsilon Languages | 2020-12-11 15:58:57 | 1835756 | 0 | 263 |
| 1106247 | Constrain edges to be unique | 2020-12-10 20:50:20 | 1835709 | 0 | 343 |
| 1106217 | How to prevent associations to parent containers | 2020-12-10 20:51:12 | 1835711 | 1 | 320 |
| 1106185 | Nodes with Pins | 2020-12-12 00:38:23 | 1835772 | 2 | 436 |
| 1106176 | background color of Project explorer in eclipse | 2020-12-05 13:36:21 | 1835516 | 1 | 86 |
| 1106164 | [ANN] Sirius 6.4.0 | 2020-12-04 11:31:40 | 1835484 | 0 | 1343 |
| 1106136 | Edges display - Bug / Missing feature | 2020-12-06 06:24:32 | 1835534 | 2 | 712 |
| 1106128 | How to add an extra property to a container style specification? | 2020-12-04 16:32:58 | 1835487 | 1 | 641 |
| 1106116 | ‘Viewpoints selection’ option not displayed on right click | 2020-12-04 10:58:52 | 1835483 | 1 | 792 |
| 1106099 | Cant add custom class to Services method | 2020-11-30 12:54:25 | 1835294 | 1 | 304 |


PMI

PMI Checks

Download: eclipse_pmi_checks.csv.gz

data <- read.csv(file=file_pmi_checks, header=T)

File is eclipse_pmi_checks.csv, and has 3 columns for 17 checks.

checks.table <- head(data[,c('Description', 'Value', 'Results')], 10)

print(
    xtable(checks.table,
        caption = paste0('Extract of the first 10 PMI checks for ',
                         project_id, '.'),
        digits=0, align="llll"), type="html",
    html.table.attributes='class="table table-striped"',
    caption.placement='bottom',
    include.rownames=FALSE,
    sanitize.text.function=function(x) { x }
)

Extract of the first 10 PMI checks for modeling.sirius.

| Description | Value | Results |
|---|---|---|
| Checks if the URL can be fetched using a simple get query. | https://bugs.eclipse.org/bugs/enter_bug.cgi?product=Sirius | OK: Create URL could be successfully fetched. |
| Checks if the URL can be fetched using a simple get query. | https://bugs.eclipse.org/bugs/buglist.cgi?product=Sirius | OK: Query URL could be successfully fetched. |
| Sends a get request to the given CI URL and looks at the headers in the response (200 404..). Also checks if the URL is really a Hudson instance (through a call to its API). | https://ci.eclipse.org/sirius/ | OK. Fetched CI URL. OK. CI URL is a Hudson instance. Title is [master] |
| Checks if the Dev ML URL can be fetched using a simple get query. | https://dev.eclipse.org/mailman/listinfo/sirius-dev | OK: Dev ML URL could be successfully fetched. |
| Checks if the URL can be fetched using a simple get query. | http://www.eclipse.org/sirius/doc | OK: Documentation URL could be successfully fetched. |
| Checks if the URL can be fetched using a simple get query. | http://www.eclipse.org/sirius/download.html | OK: Download URL could be successfully fetched. |
| Checks if the Forums URL can be fetched using a simple get query. | http://eclipse.org/forums/eclipse.sirius | OK. Forum [Sirius Forum] correctly defined. OK: Forum [Sirius Forum] URL could be successfully fetched. |
| Checks if the URL can be fetched using a simple get query. | http://wiki.eclipse.org/Sirius/Getting_Started | OK: Documentation URL could be successfully fetched. |
| Checks if the Mailing lists URL can be fetched using a simple get query. | https://dev.eclipse.org/mailman/listinfo/sirius-dev | OK. [sirius-dev] ML correctly defined with email. OK: [sirius-dev] ML URL could be successfully fetched. |
| Checks if the URL can be fetched using a simple get query. | | Failed: no URL defined for plan. |