R/toyData_beesRaw.R
beesRaw.Rd
A small bee occurrence dataset with flags generated by BeeBDC, used to run the example script and to test
functions. For data types, see ColTypeR().
data("beesRaw", package = "BeeBDC")
An object of class "tibble"
Occurrence code generated in bdc or BeeBDC
Full scientificName as shown on DiscoverLife
Family name
Subfamily name
Genus name
Subgenus name
Full name with subspecies name - ALA column
The species name only
The subspecies name only
The full name, with authorship and date information if known, of the currently valid (zoological) or accepted (botanical) taxon.
The taxonomic rank of the most specific name in the scientificName.
The authorship information for the scientificName formatted according to the conventions of the applicable nomenclaturalCode.
A brief phrase or a standard term ("cf.", "aff.") to express the determiner's doubts about the Identification.
A list (concatenated and separated) of taxa names terminating at the rank immediately superior to the taxon referenced in the taxon record.
A list (concatenated and separated) of references (publication, global unique identifier, URI) used in the Identification.
A list (concatenated and separated) of nomenclatural types (type status, typified scientific name, publication) applied to the subject.
A list (concatenated and separated) of previous assignments of names to the Organism.
This term is meant to allow the capture of an unaltered original identification/determination, including identification qualifiers, hybrid formulas, uncertainties, etc. This term is meant to be used in addition to scientificName (and identificationQualifier etc.), not instead of it.
A list (concatenated and separated) of names of people, groups, or organizations who assigned the Taxon to the subject.
The date on which the subject was determined as representing the Taxon.
The geographic latitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic center of a Location. Positive values are north of the Equator, negative values are south of it. Legal values lie between -90 and 90, inclusive.
The geographic longitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic center of a Location. Positive values are east of the Greenwich Meridian, negative values are west of it. Legal values lie between -180 and 180, inclusive.
The name of the next smaller administrative region than country (state, province, canton, department, region, etc.) in which the Location occurs.
The name of the continent in which the Location occurs.
The specific description of the place.
The name of the island on or near which the Location occurs.
The full, unabbreviated name of the next smaller administrative region than stateProvince (county, shire, department, etc.) in which the Location occurs.
The full, unabbreviated name of the next smaller administrative region than county (city, municipality, etc.) in which the Location occurs. Do not use this term for a nearby named place that does not contain the actual location.
A legal document giving official permission to do something with the resource.
A GBIF-defined issue.
The date-time or interval during which an Event occurred. For occurrences, this is the date-time when the event was recorded. Not suitable for a time in a geological context.
The time or interval during which an Event occurred.
The integer day of the month on which the Event occurred.
The integer month in which the Event occurred.
The four-digit year in which the Event occurred, according to the Common Era Calendar.
The specific nature of the data record. Recommended best practice is to use the standard label of one of the Darwin Core classes: PreservedSpecimen, FossilSpecimen, LivingSpecimen, MaterialSample, Event, HumanObservation, MachineObservation, Taxon, Occurrence, MaterialCitation.
The name of the country or major administrative unit in which the Location occurs.
The nature or genre of the resource. Examples: StillImage, MovingImage, Sound, PhysicalObject, Event, Text.
A statement about the presence or absence of a Taxon at a Location. Examples: present, absent.
An identifier given to the Occurrence at the time it was recorded. Often serves as a link between field notes and an Occurrence record, such as a specimen collector's number.
A list (concatenated and separated) of names of people, groups, or organizations responsible for recording the original Occurrence. The primary collector or observer, especially one who applies a personal identifier (recordNumber), should be listed first.
An identifier for the set of information associated with an Event (something that occurs at a place and time). May be a global unique identifier or an identifier specific to the data set.
A spatial region or named place.
The names of, references to, or descriptions of the methods or protocols used during an Event. Examples: UV light trap, mist net, bottom trawl, ad hoc observation | point count, Penguins from space: faecal stains reveal the location of emperor penguin colonies, https://doi.org/10.1111/j.1466-8238.2009.00467.x, Takats et al. 2001.
The amount of effort expended during an Event. Examples: 40 trap-nights, 10 observer-hours, 10 km by foot, 30 km by car.
The number of individuals present at the time of the Occurrence. Integer.
A number or enumeration value for the quantity of organisms. Examples: 27 (organismQuantity) with individuals (organismQuantityType); 12.5 (organismQuantity) with percentage biomass (organismQuantityType); r (organismQuantity) with Braun Blanquet Scale (organismQuantityType); many (organismQuantity) with individuals (organismQuantityType).
A decimal representation of the precision of the coordinates given in the decimalLatitude and decimalLongitude.
The horizontal distance (in meters) from the given decimalLatitude and decimalLongitude describing the smallest circle containing the whole of the Location. Leave the value empty if the uncertainty is unknown, cannot be estimated, or is not applicable (because there are no coordinates). Zero is not a valid value for this term.
Occurrence records in the ALA can be filtered by using the spatially valid flag. This flag combines a set of tests applied to the record to assess how reliable its spatial data components are.
An identifier (preferably unique) for the record within the data set or collection.
The identifier assigned by GBIF for each record.
An identifier for the set of data. May be a global unique identifier or an identifier specific to a collection or institution.
The name (or acronym) in use by the institution having custody of the object(s) or information referred to in the record. Examples: MVZ, FMNH, CLO, UCMP.
The name identifying the data set from which the record was derived.
A list (concatenated and separated) of previous or alternate fully qualified catalog numbers or other human-used identifiers for the same Occurrence, whether in the current or any other data set or collection.
An identifier for the Occurrence (as opposed to a particular digital record of the occurrence). In the absence of a persistent global unique identifier, construct one from a combination of identifiers in the record that will most closely make the occurrenceID globally unique.
The GBIF-assigned taxon identifier number.
An identifier for the collection or dataset from which the record was derived.
The verbatim (originally-provided) scientific name
The verbatim original representation of the date and time information for an Event.
A list (concatenated and separated) of identifiers or names of taxa and the associations of this Occurrence to each of them.
A list (concatenated and separated) of identifiers of other Organisms and the associations of this Organism to each of them.
One of a) an indicator of the existence of, b) a reference to (publication, URI), or c) the text of notes taken in the field about the Event.
The sex of the biological individual(s) represented in the Occurrence.
A description of the usage rights applicable to the record.
A person or organization owning or managing rights over the resource.
Information about who can access the resource or an indication of its security status.
A list (concatenated and separated) of identifiers (publication, bibliographic reference, global unique identifier, URI) of literature associated with the Occurrence.
A bibliographic reference for the resource as a statement indicating how this record should be cited (attributed) when used.
A related resource that is referenced, cited, or otherwise pointed to by the described resource.
Additional information that exists, but that has not been shared in the given record.
Additional information that exists, but that has not been shared in the given record.
Variable indicating presence/absence of location coordinates.
Variable indicating validity of geospatial data associated with record.
Year associated with Occurrence.
Variable with identifying value for the Occurrence.
Variable indicating whether the Occurrence is a duplicate or not.
A list (concatenated and separated) of identifiers of other Occurrence records and their associations to this Occurrence.
Comments or notes about the Location.
BeeBDC assigned source of the data. Often written when the data is formatted by a BeeBDC::xxx_readr function or similar.
The verbatim (originally-provided) scientific name
This data set was created by generating a random subset of 100 rows from the full, unfiltered and unflagged BeeBDC dataset from the publication: Dorey, J.B., Fischer, E.E., Chesshire, P.R., Nava-Bolaños, A., O’Reilly, R.L., Bossert, S., Collins, S.M., Lichtenberg, E.M., Tucker, E., Smith-Pardo, A., Falcon-Brindis, A., Guevara, D.A., Ribeiro, B.R., de Pedro, D., Hung, J.K.-L., Parys, K.A., McCabe, L.M., Rogan, M.S., Minckley, R.L., Velazco, S.J.E., Griswold, T., Zarrillo, T.A., Jetz, W., Sica, Y.V., Orr, M.C., Guzman, L.M., Ascher, J., Hughes, A.C. & Cobb, N.S. (2023) A globally synthesised and flagged bee occurrence dataset and cleaning workflow. Scientific Data, 10, 1–17. https://www.doi.org/10.1038/S41597-023-02626-W
beesRaw <- BeeBDC::beesRaw
head(beesRaw)
#> # A tibble: 6 × 90
#> database_id scientificName family subfamily genus subgenus subspecies species
#> <chr> <chr> <chr> <chr> <chr> <chr> <lgl> <chr>
#> 1 Dorey_data_… Pseudoanthidi… Megac… Megachil… Pseu… NA NA Pseudo…
#> 2 Dorey_data_… Macrotera arc… Andre… Panurgin… Macr… NA NA Macrot…
#> 3 Dorey_data_… Xanthesma fur… Colle… Euryglos… Xant… NA NA Xanthe…
#> 4 Dorey_data_… Exomalopsis s… Apidae Apinae Exom… NA NA Exomal…
#> 5 Dorey_data_… Osmia bicolor… Megac… Megachil… Osmia NA NA Osmia …
#> 6 Paige_data_… Augochlorella… Halic… Halictin… Augo… NA NA Augoch…
#> # ℹ 82 more variables: specificEpithet <chr>, infraspecificEpithet <chr>,
#> # acceptedNameUsage <lgl>, taxonRank <chr>, scientificNameAuthorship <chr>,
#> # identificationQualifier <lgl>, higherClassification <chr>,
#> # identificationReferences <lgl>, typeStatus <chr>,
#> # previousIdentifications <chr>, verbatimIdentification <chr>,
#> # identifiedBy <chr>, dateIdentified <chr>, decimalLatitude <dbl>,
#> # decimalLongitude <dbl>, stateProvince <chr>, continent <chr>, …
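
The field descriptions above state that legal latitudes lie between -90 and 90, legal longitudes between -180 and 180, and that zero is not a valid coordinate uncertainty. The following is a minimal base-R sketch of how those rules could be checked. It is not a BeeBDC function: `coordsInRange()` and the `toy` data frame are hypothetical, and the column name `coordinateUncertaintyInMeters` is assumed from the Darwin Core term that matches the description. It runs on made-up rows, so it does not need BeeBDC installed.

```r
# Hypothetical toy records (not taken from beesRaw); column names follow
# the Darwin Core fields described in this page.
toy <- data.frame(
  decimalLatitude = c(-34.9, 95.0, 12.5, NA),
  decimalLongitude = c(138.6, 20.0, -181.0, 45.0),
  coordinateUncertaintyInMeters = c(100, 50, 0, NA)
)

# Hypothetical helper: TRUE only when both coordinates are present and
# fall inside the legal ranges given in the field descriptions.
coordsInRange <- function(lat, lon) {
  !is.na(lat) & !is.na(lon) &
    lat >= -90 & lat <= 90 &
    lon >= -180 & lon <= 180
}

toy$coordsLegal <- coordsInRange(toy$decimalLatitude, toy$decimalLongitude)

# Zero is documented as invalid; a missing uncertainty is allowed.
toy$uncertaintyLegal <- is.na(toy$coordinateUncertaintyInMeters) |
  toy$coordinateUncertaintyInMeters > 0

print(toy)
```

The same two expressions could be applied to `beesRaw` after loading it, provided the columns are present under these names.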