Reads the provenance collected by rdtLite or rdt from execution of a script or commands in the console. Creates a human-readable summary of the provenance, including details on the computing environment, loaded and attached libraries, scripts used (if any), input and output files, console output, and error and warning messages. The summary is optionally saved to a text file and all related provenance files are optionally packaged in a zip file.
Install from GitHub:
# install.packages("devtools")
devtools::install_github("End-to-end-provenance/provSummarizeR")
Once installed, load the package:
library("provSummarizeR")
The summarize functions can be used in one of three ways.
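# Summarize the last provenance collected in the current R session
prov.summarize ()
# Summarize provenance stored in a JSON file
prov.summarize.file ("prov_script/prov.json")
# Run a script, collect its provenance, and summarize it
prov.summarize.run ("script.R")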
All three functions have six optional parameters: details, check, console, save, create.zip, and notes.
If details is TRUE, loaded and attached packages (with version numbers), file details (timestamp, hash value, saved copy) and message details (script and line numbers) are displayed. The default value of details is FALSE.
If check is TRUE, the user’s file system is checked to see if input files, output files, and scripts (in their original locations) are unchanged, changed, or missing. File status is marked as follows: unchanged [:], changed [+], missing [-], or not checked [ ]. The default value of check is TRUE.
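The status markers can be illustrated with a small sketch. This is not the package's implementation; it only shows one way a file could be classified, assuming an md5 hash was recorded when the provenance was collected. The function name `file_status` and the argument `recorded_md5` are hypothetical.

```r
# Illustrative only: classify a file using the status markers described above.
# `recorded_md5` stands in for a hash stored in the provenance (hypothetical).
file_status <- function(path, recorded_md5 = NULL) {
  if (is.null(recorded_md5)) return("[ ]")   # nothing to compare: not checked
  if (!file.exists(path)) return("[-]")      # file is missing
  current_md5 <- unname(tools::md5sum(path)) # hash of the file as it is now
  if (identical(current_md5, recorded_md5)) "[:]" else "[+]"
}

# Small demonstration with a temporary file
tmp <- tempfile(fileext = ".csv")
writeLines("a,b", tmp)
recorded <- unname(tools::md5sum(tmp))

print(file_status(tmp, recorded))  # file has not been touched
writeLines("a,b,c", tmp)
print(file_status(tmp, recorded))  # contents differ from the recorded hash
unlink(tmp)
print(file_status(tmp, recorded))  # file no longer exists
print(file_status(tmp))            # no recorded hash supplied
```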
If console is TRUE, the summary is displayed in the console. The default value of console is TRUE.
If save is TRUE, the summary is saved to the text file prov-summary.txt or prov-summary-details.txt (depending on the value of details) and is stored in the current working directory. The default value of save is FALSE.
If create.zip is TRUE, the provenance directory is packaged into a timestamped zip file and placed in the current working directory. This file contains a copy of all input and output files and scripts used, as well as the provenance summary text file, if save is TRUE. It also includes the prov.json file containing the detailed execution trace. The default value of create.zip is FALSE.
If notes is TRUE, notes are included for how to interpret the provenance summary. The default value of notes is TRUE.
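As a rough sketch of how these parameters might be combined, the call below asks for a detailed summary that is saved to disk but not printed in the console. The variable names `opts` and `prov_file` are only illustrative, and the path reuses the example path shown earlier. The call is guarded so that it runs only when the package is installed and the file exists.

```r
# Options taken from the parameter descriptions above
opts <- list(
  details = TRUE,   # include package versions, file details, message details
  console = FALSE,  # do not print the summary in the console
  save    = TRUE,   # write the summary to a text file
  notes   = FALSE   # omit the interpretation notes
)

# Example path from earlier in this document; adjust to your own provenance
prov_file <- "prov_script/prov.json"

# Only call the package when it is installed and the JSON file is present
if (requireNamespace("provSummarizeR", quietly = TRUE) && file.exists(prov_file)) {
  do.call(provSummarizeR::prov.summarize.file, c(list(prov_file), opts))
}
```

With details and save both TRUE, the description above implies the summary would be written to prov-summary-details.txt in the current working directory.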
Creating the zip file depends on use of an external zip program. This feature has been tested with zip for Unix/Mac OS and with 7z on Windows. It may or may not work with other zip programs. To use a program other than zip, set the R_ZIPCMD environment variable.
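One way to set R_ZIPCMD from within R is sketched below. The path is a hypothetical install location for 7z on Windows and will likely differ on your machine. The setting applies only to the current R session.

```r
# Hypothetical location of the zip program; change this to match your system
zip_program <- "C:/Program Files/7-Zip/7z.exe"

# Remember the previous value so it can be restored later if needed
old_zipcmd <- Sys.getenv("R_ZIPCMD", unset = NA)

# Point R at the chosen zip program for this session
Sys.setenv(R_ZIPCMD = zip_program)

# Confirm the value that R will now report
print(Sys.getenv("R_ZIPCMD"))
```

To make the setting persistent, the same variable can be defined in your .Renviron file instead.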
Here is an example of what the summary looks like (details = TRUE). The first line contains the name of the main R script. The ENVIRONMENT section includes details describing how and when the script was executed and how the provenance was collected. The LIBRARIES section lists all libraries that were loaded or attached along with their version numbers. The SCRIPTS section lists the main script and any scripts that were sourced. The PRE-EXISTING section lists any variables in the global environment that were used but not set by the script or console session. The INPUTS section lists any input files or URLs. The OUTPUTS section lists any output files. The CONSOLE section lists any output to the screen. The ERRORS section lists any error or warning messages that were generated when the script was executed. The NOTE section explains how to interpret the provenance summary.
```
PROVENANCE SUMMARY for basicTest.R (details)
Executed at 2022-06-10T10.03.33EDT
Total execution time was 4.6 seconds
Script last modified at 2019-01-02T12.43.36EST
Executed with R version 4.2.0 (2022-04-22 ucrt)
Platform was x86_64-w64-mingw32/x64 (64-bit)
Operating system was Windows 10 x64 (build 19044)
Provenance was collected with rdtLite 1.3.1
Provenance is stored in C:/Prov/prov_basicTest
Hash algorithm is md5
base 4.2.0
datasets 4.2.0
ggplot2 3.3.6
graphics 4.2.0
grDevices 4.2.0
methods 4.2.0
rdtLite 1.3.1
stats 4.2.0
utils 4.2.0
1[:] C:/TEST/basicTest.R
1[ ]
2022-06-10 10:03:38
1[:] C:/TEST/shortdata.csv
2022-06-10 10:03:38
2[:] C:/TEST/airt-vs-prec.pdf
2022-06-10 10:03:38
Error in file(file, "rt"): cannot open the connection
Line 66 in basicTest.R
NOTE: Files are listed in the order of execution (script 1 = main script).
The status of each file in its original location is marked as follows:
File unchanged [:], File changed [+], File missing [-], Not checked [ ].
Copies of original files are available on the provenance directory.
```
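If a saved summary needs to be inspected programmatically, the file status lines can be picked out with a regular expression. The sketch below is not part of the package. It assumes status lines have the form index, marker in square brackets, then path, as in the sample above, and it uses a few of the sample lines inline rather than reading a file.

```r
# A few lines copied from the sample summary above
summary_lines <- c(
  "1[:] C:/TEST/basicTest.R",
  "2022-06-10 10:03:38",
  "1[:] C:/TEST/shortdata.csv",
  "2[:] C:/TEST/airt-vs-prec.pdf"
)

# index, one-character marker in brackets, then the path
pattern <- "^[[:space:]]*([0-9]+)\\[(.)\\][[:space:]]+(.+)$"
hits <- regmatches(summary_lines, regexec(pattern, summary_lines))
hits <- Filter(function(m) length(m) == 4, hits)  # drop lines that did not match

# Meaning of each marker, as given in the NOTE section
status_names <- c(":" = "unchanged", "+" = "changed",
                  "-" = "missing", " " = "not checked")

files <- data.frame(
  index  = as.integer(vapply(hits, `[`, "", 2)),
  status = unname(status_names[vapply(hits, `[`, "", 3)]),
  path   = vapply(hits, `[`, "", 4),
  stringsAsFactors = FALSE
)
print(files)
```

To work from a saved summary instead, `summary_lines` could be filled with `readLines()` on the summary text file.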