Creates a plot with two bar graphs. One shows the absolute number of duplicate records for each data source, while the other shows the proportion of records that are duplicated within each data source. This function requires a dataset that has been run through dupeSummary().
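As a rough illustration of what the two panels summarise, the base R sketch below tabulates counts and within-source proportions for a toy data frame. The column names `dataSource` and `duplicateStatus` are hypothetical and chosen for this example only; they are not necessarily the columns that dupeSummary() produces, and this is not the function's actual implementation.

```r
# Hypothetical toy data: column names are assumptions for illustration only
records <- data.frame(
  dataSource = c("GBIF", "GBIF", "GBIF", "SCAN", "SCAN", "ALA"),
  duplicateStatus = c("duplicate", "unique", "unique",
                      "duplicate", "kept duplicate", "unique"),
  stringsAsFactors = FALSE
)

# Absolute number of records per status within each data source
counts <- table(records$dataSource, records$duplicateStatus)

# Proportion of each status within each data source (each row sums to 1)
proportions <- prop.table(counts, margin = 1)

print(counts)
print(proportions)
```

The first table corresponds to the absolute-count panel and the second to the proportion panel.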
A data frame or tibble. Occurrence records as input.
Character. The path to a directory (folder) in which the output should be saved.
Character. The name of the output file, ending in '.pdf'.
The position of the legend as coordinates. Default = c(0.85, 0.8).
Numeric. The height of the plot in inches. Default = 7.
Numeric. The width of the plot in inches. Default = 7.
Other arguments to be used to change factor levels of data sources.
A vector of colours for the levels duplicate, kept duplicate, and unique. Default = c("#F2D2A2","#B9D6BC", "#349B90").
Logical. If TRUE, return the plot to the environment. Default = FALSE.
Outputs a .pdf figure.
# This example will show a warning for the factor levels that are not present in the specific
# test dataset
dupePlotR(
data = beesFlagged,
# The directory (folder) in which to save the plot
# Should be something like: #paste0(OutPath_Figures, "/duplicatePlot_TEST.pdf"),
outPath = tempdir(),
fileName = "duplicatePlot_TEST.pdf",
# Colours in order: duplicate, kept duplicate, unique
dupeColours = c("#F2D2A2","#B9D6BC", "#349B90"),
# Plot height and width (in inches)
base_height = 7, base_width = 7,
legend.position = c(0.85, 0.8),
# Extra arguments are passed to forcats::fct_recode() to rename data sources on the plot
GBIF = "GBIF", SCAN = "SCAN", iDigBio = "iDigBio", USGS = "USGS", ALA = "ALA",
ASP = "ASP", CAES = "CAES", 'B. Mont.' = "BMont", 'B. Minckley' = "BMin", Ecd = "Ecd",
Gaiarsa = "Gai", EPEL = "EPEL", Lic = "Lic", Bal = "Bal", Arm = "Arm"
)
#> Loading required namespace: forcats
#> Loading required namespace: cowplot
#> Warning: Unknown levels in `f`: CAES, BMont, BMin, Ecd, Gai, EPEL, Lic, Bal, Arm
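The warning above arises because forcats::fct_recode() is asked to rename levels that are absent from the data. The minimal sketch below reproduces that behaviour on a small hypothetical factor (the `sources` vector and the "Atlas" label are made up for this example); it assumes the forcats package is installed.

```r
library(forcats)

# Hypothetical factor of data sources; its levels are ALA, GBIF and SCAN
sources <- factor(c("GBIF", "SCAN", "GBIF", "ALA"))

# The syntax is new_name = "old_level". "BMont" is not a level of `sources`,
# so forcats warns about it; the valid recoding (ALA to Atlas) is still applied
recoded <- fct_recode(sources, "B. Mont." = "BMont", "Atlas" = "ALA")

levels(recoded)
```

The warning is harmless here: unknown levels are skipped, so the same set of recoding arguments can be reused across datasets that contain different data sources.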