Commit b625c7a

add a validation re. the GitHub issue in the register and fix some test failures, add quick start
1 parent 179c8f8 commit b625c7a

15 files changed

Lines changed: 686 additions & 50 deletions

.claude/settings.local.json

Lines changed: 3 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -29,7 +29,9 @@
2929
"Bash(do echo \"=== $file ===\")",
3030
"Bash(done)",
3131
"Bash(1)",
32-
"Bash(tee:*)"
32+
"Bash(tee:*)",
33+
"Bash(xelatex:*)",
34+
"Bash(echo:*)"
3335
],
3436
"deny": [],
3537
"ask": []

.github/workflows/R-CMD-check.yaml

Lines changed: 1 addition & 0 deletions
@@ -28,6 +28,7 @@ jobs:
2828
env:
2929
R_REMOTES_NO_ERRORS_FROM_WARNINGS: true
3030
RSPM: ${{ matrix.config.rspm }}
31+
GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
3132

3233
steps:
3334
- uses: actions/checkout@v4
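
The added `GITHUB_PAT` variable gives the check run a token for GitHub API calls. A minimal sketch of how test or validation code could detect whether a token is present before going online; `has_github_token()` is a hypothetical helper that is not part of this commit, and also looking at `GITHUB_TOKEN` is an assumption.

```r
# Hypothetical helper (not part of the package): TRUE if any of the given
# environment variables holds a non-empty value.
has_github_token <- function(vars = c("GITHUB_PAT", "GITHUB_TOKEN")) {
  any(nzchar(Sys.getenv(vars, unset = "")))
}

if (!has_github_token()) {
  message("No GitHub token found; unauthenticated API requests have a much lower rate limit.")
}
```

A guard like this could let network-dependent tests be skipped cleanly on machines without a token.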

CLAUDE.md

Lines changed: 1 addition & 0 deletions
@@ -80,6 +80,7 @@ When bumping the version in `DESCRIPTION`, add a new section to `NEWS.md` with t
8080
- `validate_codecheck_yml_crossref()` - Validates paper metadata against CrossRef API; compares title and author information with CrossRef data
8181
- `validate_codecheck_yml_orcid()` - Validates author and codechecker names against ORCID API; queries ORCID records using rorcid package; compares names in ORCID records with local metadata; requires ORCID authentication by default (set `skip_on_auth_error = TRUE` to skip validation when authentication is unavailable)
8282
- `validate_contents_references()` - Comprehensive validation wrapper; runs both CrossRef and ORCID validations; provides unified summary; supports strict mode for certificate rendering; requires ORCID authentication by default (users can opt-in to skipping via `skip_on_auth_error = TRUE`)
83+
- `validate_certificate_github_issue()` - Validates certificate identifier exists in GitHub register issues; checks issue state (warns if closed) and assignment (warns if unassigned); stops with error if no matching issue found; supports strict mode where warnings become errors; automatically skips validation for placeholder certificates (R/validation.R:1204)
8384

8485
**Zenodo integration**: Functions for uploading certificates to Zenodo:
8586

NAMESPACE

Lines changed: 1 addition & 0 deletions
@@ -27,6 +27,7 @@ export(update_codecheck_yml_from_lifecycle)
2727
export(upload_zenodo_certificate)
2828
export(upload_zenodo_metadata)
2929
export(validate_certificate_for_rendering)
30+
export(validate_certificate_github_issue)
3031
export(validate_codecheck_yml)
3132
export(validate_codecheck_yml_crossref)
3233
export(validate_codecheck_yml_orcid)

NEWS.md

Lines changed: 10 additions & 0 deletions
@@ -1,5 +1,15 @@
11
# codecheck (development version)
22

3+
## GitHub Issue Validation
4+
5+
* **New validation function**: Added `validate_certificate_github_issue()` to verify that certificate identifiers exist in the codecheckers/register GitHub repository
6+
* **Issue state checking**: Warns if the certificate's GitHub issue is closed (indicating the CODECHECK is already complete and published)
7+
* **Assignment validation**: Warns if the certificate's GitHub issue is unassigned (no codechecker assigned yet)
8+
* **Strict mode**: Optional strict mode (`strict = TRUE`) treats warnings as errors, stopping certificate processing if issues are found
9+
* **Placeholder handling**: Automatically skips validation for placeholder certificate identifiers
10+
* **Comprehensive error handling**: Provides clear error messages for missing issues, API rate limits, and authentication problems
11+
* **GitHub Actions integration**: Updated R-CMD-check workflow to include GITHUB_PAT token for API access during testing
12+
313
## ORCID Validation Improvements
414

515
* **Graceful authentication handling**: ORCID validation functions now handle authentication failures gracefully with clear error messages instead of requiring interactive login

R/latex.R

Lines changed: 12 additions & 10 deletions
@@ -185,14 +185,11 @@ cite_certificate <- function(metadata) {
185185
##' @return NULL (outputs directly via cat() for knitr/rmarkdown)
186186
##' @keywords internal
187187
render_error_box <- function(filename, error_msg) {
188-
cat("\\begin{center}\n")
189-
cat("\\fcolorbox{red}{yellow!20}{\\parbox{0.9\\textwidth}{\n")
190-
cat("\\textbf{\\textcolor{red}{\\Large \\ding{56}} Cannot include file: \\texttt{",
191-
filename, "}}\\\\\n", sep = "")
192-
cat("\\vspace{0.2cm}\n")
193-
cat("\\textbf{Error:} ", gsub("_", "\\\\_", error_msg, fixed = TRUE), "\n", sep = "")
194-
cat("}}\n")
195-
cat("\\end{center}\n\n")
188+
# Use simple markdown formatting that will be reliably converted by pandoc
189+
# Avoid complex LaTeX to prevent compilation errors
190+
cat("**ERROR: Cannot include file:** `", filename, "`\n\n", sep = "")
191+
cat("**Reason:** ", error_msg, "\n\n", sep = "")
192+
cat("---\n\n")
196193
}
197194

198195
##' Render single-page image for certificate output
@@ -235,12 +232,17 @@ render_manifest_image <- function(path, comment) {
235232
paste("Failed to convert", format_name, "image:", e$message))
236233
})
237234
} else {
238-
# PNG, JPG, JPEG - include directly
235+
# PNG, JPG, JPEG - validate image before including
239236
tryCatch({
237+
# Validate that the file is actually a valid image using magick
238+
# This prevents LaTeX compilation errors from corrupted image files
239+
img <- magick::image_read(path)
240+
241+
# If validation succeeds, include the image
240242
cat(paste0("![", comment, "](", path, ")\n"))
241243
}, error = function(e) {
242244
render_error_box(basename(path),
243-
paste("Failed to include image:", e$message))
245+
paste("Failed to read image file (possibly corrupted):", e$message))
244246
})
245247
}
246248
}
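
The change above decodes the image with `magick::image_read()` before including it. As a dependency-free illustration of the same idea, the following sketch only checks the PNG and JPEG file signatures. This is a weaker, different technique than the one in the commit: it catches empty or mislabelled files but not every corrupted image. `looks_like_image()` is a hypothetical name.

```r
# Sketch: compare the first bytes of a file against the PNG and JPEG
# signatures. Cheaper than decoding the image, but catches fewer problems.
looks_like_image <- function(path) {
  if (!file.exists(path)) return(FALSE)
  header <- readBin(path, what = "raw", n = 8L)
  png_sig <- as.raw(c(0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A))
  jpg_sig <- as.raw(c(0xFF, 0xD8, 0xFF))
  is_png <- length(header) >= 8L && identical(header[1:8], png_sig)
  is_jpg <- length(header) >= 3L && identical(header[1:3], jpg_sig)
  is_png || is_jpg
}

# A text file with a .png extension does not pass the signature check
fake <- tempfile(fileext = ".png")
writeLines("this is not an image", fake)
looks_like_image(fake)
```

A check like this could run before the more expensive decode, with the error box rendered when either step fails.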

R/manifest.R

Lines changed: 18 additions & 6 deletions
@@ -23,10 +23,12 @@ copy_manifest_files <- function(root, metadata, dest_dir,
2323
outputs = sapply(manifest, function(x) x$file)
2424
src_files = file.path(root, outputs)
2525
missing = !file.exists(src_files)
26+
27+
# Warn about missing files but continue processing
2628
if (any(missing)) {
27-
err = paste("Manifest files missing:\n",
28-
paste(src_files[missing], sep='\n'))
29-
stop(err)
29+
warning("Manifest files missing:\n",
30+
paste(src_files[missing], collapse='\n'),
31+
"\nThese files will be marked as missing in the certificate.")
3032
}
3133

3234
dest_files = file.path(dest_dir,
@@ -41,13 +43,23 @@ copy_manifest_files <- function(root, metadata, dest_dir,
4143
}
4244
}
4345

44-
if (overwrite) message("Overwriting output files: ", toString(dest_files))
45-
file.copy(src_files, dest_files, overwrite = overwrite)
46+
# Only copy files that exist
47+
existing_files = !missing
48+
if (any(existing_files)) {
49+
if (overwrite && any(file.exists(dest_files[existing_files]))) {
50+
message("Overwriting output files: ", toString(dest_files[existing_files]))
51+
}
52+
file.copy(src_files[existing_files], dest_files[existing_files], overwrite = overwrite)
53+
}
54+
55+
# Get file sizes, using NA for missing files
56+
file_sizes = rep(NA_real_, length(dest_files))
57+
file_sizes[existing_files] = file.size(dest_files[existing_files])
4658

4759
manifest_df = data.frame(output=outputs,
4860
comment=sapply(manifest, function(x) x$comment),
4961
dest=dest_files,
50-
size=file.size(dest_files),
62+
size=file_sizes,
5163
stringsAsFactors = FALSE)
5264
manifest_df
5365
}
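
The new behaviour (warn about missing files, copy only the existing ones, record `NA` sizes for the rest) can be tried in isolation on temporary files. This is a simplified sketch, not the package function: `copy_existing()` is a hypothetical stand-in that leaves out the metadata handling and the overwrite message.

```r
# Hypothetical, reduced version of the pattern in copy_manifest_files():
# warn about missing sources, copy the rest, and report NA sizes for gaps.
copy_existing <- function(src_files, dest_files) {
  missing <- !file.exists(src_files)
  if (any(missing)) {
    warning("Manifest files missing:\n",
            paste(src_files[missing], collapse = "\n"), call. = FALSE)
  }
  file.copy(src_files[!missing], dest_files[!missing], overwrite = TRUE)
  sizes <- rep(NA_real_, length(dest_files))
  sizes[!missing] <- file.size(dest_files[!missing])
  data.frame(dest = dest_files, size = sizes, stringsAsFactors = FALSE)
}

src_dir <- tempfile("src"); dir.create(src_dir)
dest_dir <- tempfile("dest"); dir.create(dest_dir)
writeLines(c("a,b", "1,2"), file.path(src_dir, "table.csv"))

outputs <- c("table.csv", "figure.png")  # figure.png is deliberately absent
manifest_df <- suppressWarnings(
  copy_existing(file.path(src_dir, outputs), file.path(dest_dir, outputs))
)
is.na(manifest_df$size)
```

Keeping one row per manifest entry, with `NA` for the size, lets later rendering steps mark the entry as missing instead of aborting the whole certificate.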

R/validation.R

Lines changed: 202 additions & 0 deletions
@@ -1158,3 +1158,205 @@ validate_certificate_for_rendering <- function(yml_file = "codecheck.yml",
11581158

11591159
return(invisible(TRUE))
11601160
}
1161+
1162+
#' Validate certificate identifier exists in GitHub register issues
1163+
#'
1164+
#' Checks if the certificate identifier from a codecheck.yml file has a corresponding
1165+
#' issue in the codecheckers/register GitHub repository. The function:
1166+
#' \itemize{
1167+
#' \item Checks that a matching issue exists for the certificate identifier
1168+
#' \item Warns if the issue is closed (certificate already completed)
1169+
#' \item Warns if the issue is unassigned (no codechecker assigned yet)
1170+
#' \item Stops with an error if no matching issue is found
1171+
#' }
1172+
#'
1173+
#' @param yml_file Path to the codecheck.yml file (defaults to "codecheck.yml" in the working directory)
1174+
#' @param metadata Optional. Pre-loaded metadata list. If NULL, will be loaded from yml_file
1175+
#' @param repo GitHub repository in format "owner/repo". Defaults to "codecheckers/register"
1176+
#' @param strict Logical. If TRUE, treats warnings as errors. Default is FALSE
1177+
#'
1178+
#' @return Invisibly returns a list with the validation result:
1179+
#' \describe{
1180+
#' \item{valid}{Logical indicating if validation passed}
1181+
#' \item{certificate}{The certificate identifier checked}
1182+
#' \item{issue_number}{GitHub issue number if found, otherwise NULL}
1183+
#' \item{issue_state}{Issue state ("open" or "closed") if found}
1184+
#' \item{issue_assignees}{List of assignees if found}
1185+
#' \item{warnings}{Character vector of warning messages}
1186+
#' \item{errors}{Character vector of error messages}
1187+
#' }
1188+
#'
1189+
#' @examples
1190+
#' \dontrun{
1191+
#' # Validate certificate in current directory
1192+
#' validate_certificate_github_issue()
1193+
#'
1194+
#' # Validate with strict mode (warnings become errors)
1195+
#' validate_certificate_github_issue(strict = TRUE)
1196+
#'
1197+
#' # Validate specific file
1198+
#' validate_certificate_github_issue("path/to/codecheck.yml")
1199+
#' }
1200+
#'
1201+
#' @author Daniel Nüst
1202+
#' @importFrom gh gh
1203+
#' @export
1204+
validate_certificate_github_issue <- function(yml_file = "codecheck.yml",
1205+
metadata = NULL,
1206+
repo = "codecheckers/register",
1207+
strict = FALSE) {
1208+
1209+
# Load metadata if not provided
1210+
if (is.null(metadata)) {
1211+
if (!file.exists(yml_file)) {
1212+
stop("codecheck.yml file not found at: ", yml_file)
1213+
}
1214+
metadata <- yaml::read_yaml(yml_file)
1215+
}
1216+
1217+
# Get certificate identifier
1218+
certificate <- metadata$certificate
1219+
1220+
if (is.null(certificate) || certificate == "") {
1221+
stop("Certificate identifier not found in codecheck.yml",
1222+
call. = FALSE)
1223+
}
1224+
1225+
# Check if certificate is a placeholder
1226+
if (is_placeholder_certificate(yml_file = yml_file,
1227+
metadata = metadata,
1228+
strict = FALSE)) {
1229+
message("Certificate identifier '", certificate, "' appears to be a placeholder. ",
1230+
"Skipping GitHub issue validation.")
1231+
return(invisible(list(
1232+
valid = TRUE,
1233+
certificate = certificate,
1234+
issue_number = NULL,
1235+
issue_state = NULL,
1236+
issue_assignees = NULL,
1237+
warnings = character(0),
1238+
errors = character(0),
1239+
skipped = TRUE
1240+
)))
1241+
}
1242+
1243+
# Split repo into owner and name
1244+
repo_parts <- strsplit(repo, "/")[[1]]
1245+
if (length(repo_parts) != 2) {
1246+
stop("repo must be in format 'owner/repo'", call. = FALSE)
1247+
}
1248+
1249+
# Certificate pattern in issue titles: YYYY-NNN
1250+
cert_pattern <- paste0("\\b", certificate, "\\b")
1251+
1252+
# Search for issues with the certificate ID (search all states)
1253+
tryCatch({
1254+
# Search in all issues (open + closed); .limit = Inf pages past the first 100 results
1255+
all_issues <- gh::gh("GET /repos/:owner/:repo/issues",
1256+
owner = repo_parts[1],
1257+
repo = repo_parts[2],
1258+
state = "all",
1259+
per_page = 100, .limit = Inf)
1260+
1261+
# Find matching issue
1262+
matching_issue <- NULL
1263+
for (issue in all_issues) {
1264+
if (grepl(cert_pattern, issue$title)) {
1265+
matching_issue <- issue
1266+
break
1267+
}
1268+
}
1269+
1270+
# Initialize result
1271+
warnings <- character(0)
1272+
errors <- character(0)
1273+
valid <- TRUE
1274+
1275+
# Check if issue was found
1276+
if (is.null(matching_issue)) {
1277+
error_msg <- paste0(
1278+
"No GitHub issue found for certificate '", certificate, "' ",
1279+
"in repository '", repo, "'. ",
1280+
"Please ensure an issue exists in the register before proceeding."
1281+
)
1282+
errors <- c(errors, error_msg)
1283+
stop(error_msg, call. = FALSE)
1284+
}
1285+
1286+
issue_number <- matching_issue$number
1287+
issue_state <- matching_issue$state
1288+
issue_assignees <- matching_issue$assignees
1289+
1290+
# Check if issue is closed
1291+
if (issue_state == "closed") {
1292+
warning_msg <- paste0(
1293+
"GitHub issue #", issue_number, " for certificate '", certificate, "' ",
1294+
"is already CLOSED. This usually means the CODECHECK has been completed and published. ",
1295+
"If you are still working on it, consider reopening the issue."
1296+
)
1297+
warnings <- c(warnings, warning_msg)
1298+
warning(warning_msg, call. = FALSE)
1299+
1300+
if (strict) {
1301+
valid <- FALSE
1302+
}
1303+
}
1304+
1305+
# Check if issue is unassigned
1306+
if (length(issue_assignees) == 0) {
1307+
warning_msg <- paste0(
1308+
"GitHub issue #", issue_number, " for certificate '", certificate, "' ",
1309+
"is UNASSIGNED. Please assign a codechecker to this issue."
1310+
)
1311+
warnings <- c(warnings, warning_msg)
1312+
warning(warning_msg, call. = FALSE)
1313+
1314+
if (strict) {
1315+
valid <- FALSE
1316+
}
1317+
}
1318+
1319+
# If strict mode and we have warnings, stop
1320+
if (strict && !valid) {
1321+
stop("Certificate validation failed in strict mode: ",
1322+
paste(warnings, collapse = "; "),
1323+
call. = FALSE)
1324+
}
1325+
1326+
# Success message
1327+
if (valid && length(warnings) == 0) {
1328+
message("Certificate '", certificate, "' validated: ",
1329+
"Found in GitHub issue #", issue_number, " (", issue_state, ")")
1330+
}
1331+
1332+
return(invisible(list(
1333+
valid = valid,
1334+
certificate = certificate,
1335+
issue_number = issue_number,
1336+
issue_state = issue_state,
1337+
issue_assignees = issue_assignees,
1338+
issue_title = matching_issue$title,
1339+
warnings = warnings,
1340+
errors = errors,
1341+
skipped = FALSE
1342+
)))
1343+
1344+
}, error = function(e) {
1345+
# Re-throw our own validation errors unchanged (missing issue, strict-mode failure)
1346+
if (grepl("No GitHub issue found|failed in strict mode", e$message)) {
1347+
stop(e)
1348+
}
1349+
# Handle GitHub API errors
1350+
if (grepl("HTTP 404", e$message) || grepl("Not Found", e$message)) {
1351+
stop("GitHub repository '", repo, "' not found or not accessible. ",
1352+
"Please check the repository name and your access permissions.",
1353+
call. = FALSE)
1354+
} else if (grepl("API rate limit", e$message) || grepl("403", e$message)) {
1355+
stop("GitHub API rate limit exceeded. ",
1356+
"Please set a GITHUB_PAT environment variable with a valid GitHub token.",
1357+
call. = FALSE)
1358+
} else {
1359+
stop("Error accessing GitHub API: ", e$message, call. = FALSE)
1360+
}
1361+
})
1362+
}
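
The matching and checking logic of `validate_certificate_github_issue()` can be exercised offline on a mocked issue list. The sketch below assumes issues arrive as a list of lists with `number`, `title`, `state`, and `assignees` fields, as the function above reads them. The titles and the helpers `find_issue()` and `check_issue()` are hypothetical and only illustrate the word-boundary match and the two warning conditions.

```r
# Mocked issue data (hypothetical titles and numbers), shaped like the
# fields the validation function reads from each issue.
issues <- list(
  list(number = 10L, title = "Paper A | 2024-001", state = "closed",
       assignees = list(list(login = "alice"))),
  list(number = 11L, title = "Paper B | 2024-0012", state = "open",
       assignees = list()),
  list(number = 12L, title = "Paper C | 2024-002", state = "open",
       assignees = list())
)

# Return the first issue whose title contains the certificate ID as a whole
# token, or NULL if there is none. The \\b anchors stop "2024-001" from
# matching inside "2024-0012".
find_issue <- function(issues, certificate) {
  pattern <- paste0("\\b", certificate, "\\b")
  for (issue in issues) {
    if (grepl(pattern, issue$title)) return(issue)
  }
  NULL
}

# Collect the conditions that the validation function would warn about.
check_issue <- function(issue) {
  problems <- character(0)
  if (identical(issue$state, "closed")) problems <- c(problems, "closed")
  if (length(issue$assignees) == 0) problems <- c(problems, "unassigned")
  problems
}

hit <- find_issue(issues, "2024-002")
check_issue(hit)
```

In strict mode, any non-empty result from a check like `check_issue()` would be escalated to an error instead of a warning.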

README.Rmd

Lines changed: 31 additions & 6 deletions
@@ -21,25 +21,50 @@ knitr::opts_chunk$set(
2121
[![DOI](https://zenodo.org/badge/256862293.svg)](https://zenodo.org/badge/latestdoi/256862293)
2222
<!-- badges: end -->
2323

24-
`codecheck` is an assistant for conducting CODECHECKs, written in the R language and distributed as an R package.
25-
The goal of codecheck is to ease the process to create a CODECHECK-ready workspace, and to conduct the actual CODECHECK.
26-
Furthermore, the package contains some helper functions for managing the [CODECHECK register](https://codecheck.org.uk/register/).
24+
`codecheck` is an R package to assist codecheckers in creating CODECHECK-ready workspaces and conducting codechecks. This package focuses on the technical workflow for codecheckers using R. It also contains helper functions for managing the [CODECHECK register](https://codecheck.org.uk/register/).
2725

28-
**Learn more about CODECHECK on [https://codecheck.org.uk/](https://codecheck.org.uk/).**
26+
For general information about the CODECHECK initiative and community processes, visit [https://codecheck.org.uk/](https://codecheck.org.uk/).
2927

3028
## Installation
3129

3230
The package is not on [CRAN](https://CRAN.R-project.org) yet.
33-
Install the development version from [GitHub](https://github.com/codecheckers/codecheck) with:
31+
Install the current version from [GitHub](https://github.com/codecheckers/codecheck) with:
3432

3533
``` r
3634
# install.packages("remotes")
3735
remotes::install_github("codecheckers/codecheck")
3836
```
3937

38+
## Quick Start
39+
40+
For first-time codecheckers using this R package:
41+
42+
1. **Fork the research repository** - Fork to the [codecheckers organization](https://github.com/codecheckers) on GitHub
43+
2. **Clone and navigate to the repository root** - Run R from the top-level directory of the research project
44+
3. **Create CODECHECK files** - Run `codecheck::create_codecheck_files()` to generate:
45+
- A `codecheck.yml` configuration file with metadata (certificate ID, authors, manifest, etc.)
46+
- A `codecheck/` directory with report templates
47+
4. **Define the manifest** - List all computational outputs (figures, tables, data files) that you've successfully reproduced in the `manifest` section of `codecheck.yml`
48+
5. **Complete the certificate** - Fill in the report template and render it
49+
6. **Create a record on Zenodo** (or OSF or ResearchEquals) - Submit the draft to your CODECHECK editor or contact person for feedback, e.g., via a sharing link or the CODECHECK Zenodo community, and push `codecheck.yml` to the repository
50+
51+
## Key Concepts
52+
53+
- **Certificate** - The final report documenting your CODECHECK, which includes metadata, the manifest, and your assessment.
54+
- **codecheck.yml** - The configuration file containing all CODECHECK metadata (paper details, authors, manifest, etc.)
55+
- **Manifest** - A list of computational output files (figures, data files, tables) that you have successfully reproduced during the CODECHECK. Each manifest entry includes the file path and a brief description. The manifest is defined in the `codecheck.yml`.
56+
4057
## Usage
4158

42-
See the [main vignette](https://github.com/codecheckers/codecheck/blob/master/vignettes/codecheck_overview.Rmd).
59+
See the [getting started guide](https://codecheck.org.uk/codecheck/articles/codecheck_overview.html) for step-by-step instructions on using the template, and the [workflow descriptions](https://codecheck.org.uk/workflows/) for the overall procedures.
60+
61+
**Note on certificate templates**: The R Markdown template created by this package can be used in multiple ways:
62+
63+
- Execute code in various languages (R, Python, bash, etc.) using knitr's language engines
64+
- Simply write your certificate narrative without executing any code
65+
- Mix both approaches as needed
66+
67+
If you prefer working with Jupyter Notebooks (especially for Python-based projects), see the [Python CODECHECK template](https://github.com/codecheckers/codecheck-py) based on a Jupyter Notebook.
4368

4469
## Development
4570
