library(papercheck)
#>
#>
#> *******************************************
#> ✅ Welcome to PaperCheck.
#> For support and examples visit:
#> https://scienceverse.github.io/papercheck/
#>
#> ⚠️ This is alpha software; please check any
#> results. False positives and negatives will
#> occur at unknown rates.
#> *******************************************
Modules are user-created patterns for checking a paper or set of papers. Module specifications are written in the same format as functions in R packages, using roxygen2 for documentation.
#' Module Name
#'
#' @description
#' A short description of the module
#'
#' @author Author Name (\email{name@email.com})
#'
#' @references
#' # Optional reference to include in reports
#'
#' @import dplyr
#'
#' @param paper a paper object or paperlist object
#' @param ... further arguments (not used)
#'
#' @returns a list with table, summary, traffic light, and report text
#'
#' @examples
#' module_run(psychsci, "module_name")
module_name <- function(paper, ...) {
  # detailed table of results ----
  pattern <- "significant"
  table <- search_text(paper, pattern)

  # summary output for paperlists ----
  # must have id column as the id of each paper, one row per paper
  # further columns to be added to a master summary table
  summary_table <- dplyr::count(table, id, name = "n_significant")

  # determine the traffic light ----
  # possible values: na, info, red, yellow, green, fail
  tl <- if (nrow(table)) "info" else "na"

  # report text for each possible traffic light ----
  report <- c(
    na = "Not applicable",
    info = "This table is provided for your information",
    red = "This is a potential problem",
    yellow = "There may be a problem",
    green = "No problems found",
    fail = "The check failed, sorry"
  )

  # return a list ----
  list(
    table = table,
    summary = summary_table,
    na_replace = 0,
    traffic_light = tl,
    report = report[[tl]]
  )
}
Roxygen Documentation
The module file starts with standard function documentation using roxygen2. Every roxygen documentation line starts with #'.
Title
On the first line, give your module a short title, which will be used as a section header in reports.
#' Module Name
Description
You can skip a line and write a one-sentence description, which will be shown in module_list(), or optionally start this with @description.
#' @description
#' A short description of the module
Details
You can write more detailed help under the tag @details, which will be shown when calling module_help(). This is optional.
#' @details
#' Here is more information about the module to help you use or understand it.
#'
#' You can skip more lines to break up paragraphs.
#'
#' * make a list
#' * check it twice
If you have experience writing R functions with roxygen, you can also omit the @description and @details tags and rely on paragraph spacing to distinguish description from details.
Author
Include the module authors so they can get credit! Add a new @author tag for each author, and optionally add their email address.
#' @author Lisa DeBruine (\email{debruine@gmail.com})
#' @author Daniel Lakens (\email{lakens@gmail.com})
References
Optionally include references that you would want available to users. If you are building a module that uses citable resources, please list them here.
#' @references
#' The Retraction Watch Database [Internet].
#' New York: The Center for Scientific Integrity. 2018.
#' ISSN: 2692-4579. [Cited 2025-05-20].
#' Available from: http://retractiondatabase.org/.
Import
If you are using packages other than papercheck, add each with an @import statement.
#' @import dplyr
#' @import tidyr
Technically, you can then use functions from these packages in your function code without the package name prefix, but it is still best practice to use the package name prefix for all functions, like dplyr::case_when().
Parameters
Document each argument of the function with a @param tag. All papercheck modules require the first argument to be paper. The last argument can optionally be ... (three dots). This allows the module_run() function to pass any arguments, and your code can use them by name (e.g., extra_args <- list(...)).
#' @param paper a paper object or paperlist object
#' @param ... further arguments (not used)
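As a minimal sketch of reading an optional argument from ..., the hypothetical module below looks for a pattern argument and falls back to a default. The function name module_sketch() and the pattern argument are assumptions made for illustration; they are not part of papercheck.

```r
# Hypothetical module: reads an optional `pattern` argument from `...`.
# The paper object is not inspected here; only argument handling is shown.
module_sketch <- function(paper, ...) {
  extra_args <- list(...)

  # [[ ]] matches names exactly (unlike $, which allows partial matching)
  pattern <- extra_args[["pattern"]]
  if (is.null(pattern)) pattern <- "significant"

  pattern
}

module_sketch(paper = NULL)                        # falls back to the default
module_sketch(paper = NULL, pattern = "marginal")  # uses the supplied value
```

Looking arguments up by name keeps the module tolerant of extra arguments it does not recognise.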
Returns
It is good practice to explain what your function returns. This is usually the default list with table, summary, traffic light, and report text, but you can edit this. It’s just a human-readable string.
#' @returns a list with table, summary, traffic light, and report text
Examples
You can add an example of how to use this module with the module_run() function. Choose a paper or list of papers that demonstrates the purpose of the module without taking too much time to run.
#' @examples
#' module_run(psychsci, "module_name")
Function Code
The module function is written like any R package function, with the requirement that the first argument be paper. Set module_name to your module name, which must be a valid R variable name. Your module script should also have the same name, with a .R suffix (e.g., module_name.R).
module_name <- function(paper, ...) {
  # detailed table of results ----
  # summary output for paperlists ----
  # determine the traffic light ----
  # report text for each possible traffic light ----
  # return a list ----
}
You can define helper functions below your main module function, but the first function defined in the script is the one that will be run on the paper object.
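A minimal sketch of that file layout follows. Everything in it is hypothetical: the helper count_matches() is invented for the example, and paper is treated as a plain character vector of sentences so that the code runs without papercheck. A real module would work on a paper object and return the full list described in the sections below.

```r
# module_name.R (hypothetical layout)

# Main module function: defined first, so it is the one that gets run
module_name <- function(paper, ...) {
  # ASSUMPTION: `paper` is a character vector of sentences (illustration only)
  n <- count_matches(paper, "significant")

  # only the traffic light is returned, to keep the sketch short
  list(
    traffic_light = if (n > 0) "info" else "na"
  )
}

# Helper function: defined below the main function. R looks it up when
# module_name() is called, not when module_name() is defined.
count_matches <- function(text, pattern) {
  sum(grepl(pattern, text, fixed = TRUE))
}

result <- module_name(c("The effect was significant.", "We ran a t-test."))
result$traffic_light
```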
A module can technically do anything you want with the paper input, but you will need to follow the template below for your module to work automatically with reports and the metascience workflow.
If you are using your modules to build a report, you need to specify what type of output corresponds to good practice or practice that may need improvement. We do this through “traffic_light” and “report”.
Table
Most modules will need to structure their output in a table that can be shown in a report. The search_text() function below creates a table with a row for each sentence that contains the word “significant”.
# detailed table of results ----
pattern <- "significant"
table <- search_text(paper, pattern)
You will need to make sure that your module works with both single paper objects and lists of paper objects. The papercheck functions search_text() and llm() are already vectorised for paper lists.
Summary
For the metascience workflow, it is useful to create a table with a row for each paper in a list, and some columns that summarise the results. You can use nested tables if you want some of your cells to contain multiple values.
# summary output for paperlists ----
# must have id column as the id of each paper, one row per paper
# further columns to be added to a master summary table
summary_table <- dplyr::count(table, id, name = "n_significant")
Your summary table might omit some papers from the whole list because no relevant text was found. You don’t have to add them into your table, as the module_run() function will do that automatically for you. However, you may want the values of your summary variables to be something other than NA for these missing papers. You can set the value of na_replace in the return list (below) to this default value. For example, if you are returning a summary of the count of sentences with the word “significant”, you can replace NAs with 0.
If you are returning more than one summary column and have different replacement values, use a named list.
na_replace <- list(
  n_significant = 0,
  paper_type = "unknown"
)
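To illustrate the effect of a named na_replace list, the sketch below fills in missing papers using base R only. It is not how module_run() is implemented; the paper ids and the all_ids table are made up for the example.

```r
# Hypothetical ids of every paper in a paperlist
all_ids <- data.frame(
  id = c("paper_a", "paper_b", "paper_c"),
  stringsAsFactors = FALSE
)

# Summary returned by a module: paper_b had no matches, so it is absent
summary_table <- data.frame(
  id = c("paper_a", "paper_c"),
  n_significant = c(3L, 1L),
  paper_type = c("empirical", "review"),
  stringsAsFactors = FALSE
)

na_replace <- list(
  n_significant = 0,
  paper_type = "unknown"
)

# Add the missing papers back in; their summary columns start out as NA
full <- merge(all_ids, summary_table, by = "id", all.x = TRUE)

# Replace the NAs column by column with the matching default value
for (col in names(na_replace)) {
  is_missing <- is.na(full[[col]])
  full[[col]][is_missing] <- na_replace[[col]]
}

full
```

Each column gets a default of its own type, which is why a single value such as 0 is not enough once the summary mixes counts and labels.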
Traffic Light
The traffic lights are used in single-paper reports to give a quick visual overview of the module results. There are six kinds of traffic lights:
🟢 no problems detected;
🟡 something to check;
🔴 possible problems detected;
🔵 informational only;
⚪️ not applicable;
⚫️ check failed
You will need to write some code to determine which traffic light applies to your case. If you don’t include a traffic light, but do include a table in the returned list, the following rule will be applied for the traffic light.
# determine the traffic light ----
# possible values: na, info, red, yellow, green, fail
tl <- if (nrow(table)) "info" else "na"
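If the default rule is too coarse, the decision can go in a small helper. The sketch below shows one hypothetical rule: the function name choose_traffic_light() and the threshold red_at are assumptions for illustration, and whether many matches indicate a problem depends entirely on your module.

```r
# Hypothetical rule: no rows is green, a few rows is yellow, many rows is red
choose_traffic_light <- function(table, red_at = 5) {
  n <- nrow(table)
  if (n == 0) {
    "green"
  } else if (n < red_at) {
    "yellow"
  } else {
    "red"
  }
}

# Hand-built tables standing in for module results
few  <- data.frame(id = rep("paper_a", 2))
many <- data.frame(id = rep("paper_a", 6))
none <- few[0, , drop = FALSE]

choose_traffic_light(none)
choose_traffic_light(few)
choose_traffic_light(many)
```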
Report Text
Reports need to explain concepts or give resources for further learning. This is often specific to the outcome of a check, so you can use the pattern below to customise the report text for each traffic light.
# report text for each possible traffic light ----
report <- c(
  na = "Not applicable",
  info = "This table is provided for your information",
  red = "This is a potential problem",
  yellow = "There may be a problem",
  green = "No problems found",
  fail = "The check failed, sorry"
)
Return
Structure the returned values in a list, with the names table, summary, na_replace, traffic_light and report.
# return a list ----
list(
  table = table,
  summary = summary_table,
  na_replace = 0,
  traffic_light = tl,
  report = report[[tl]]
)
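Before using a module in a report, it can help to sanity-check the returned list. The helper below, check_module_output(), is hypothetical and not part of papercheck. It only checks the element names and traffic light values described above, and it is stricter than necessary, since the traffic light can be omitted when a table is returned.

```r
# Hypothetical validator for the list a module returns
check_module_output <- function(x) {
  expected <- c("table", "summary", "na_replace", "traffic_light", "report")
  valid_lights <- c("na", "info", "red", "yellow", "green", "fail")

  missing_names <- setdiff(expected, names(x))
  if (length(missing_names) > 0) {
    stop("Missing elements: ", paste(missing_names, collapse = ", "))
  }

  tl <- x[["traffic_light"]]
  if (!tl %in% valid_lights) {
    stop("Unknown traffic light: ", tl)
  }

  invisible(TRUE)
}

# A hand-built return list standing in for real module output
example_output <- list(
  table = data.frame(id = "paper_a", text = "The effect was significant."),
  summary = data.frame(id = "paper_a", n_significant = 1L),
  na_replace = 0,
  traffic_light = "info",
  report = "This table is provided for your information"
)

check_module_output(example_output)
```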