A small bee occurrence dataset with flags generated by BeeBDC, used to run the example scripts and to test functions. For data types, see ColTypeR().

data("beesRaw", package = "BeeBDC")

Format

An object of class "tibble"

database_id

Occurrence code generated in bdc or BeeBDC

scientificName

Full scientificName as shown on DiscoverLife

family

Family name

subfamily

Subfamily name

genus

Genus name

subgenus

Subgenus name

subspecies

Full name with subspecies name - ALA column

specificEpithet

The species name only

infraspecificEpithet

The subspecies name only

acceptedNameUsage

The full name, with authorship and date information if known, of the currently valid (zoological) or accepted (botanical) taxon.

taxonRank

The taxonomic rank of the most specific name in the scientificName.

scientificNameAuthorship

The authorship information for the scientificName formatted according to the conventions of the applicable nomenclaturalCode.

identificationQualifier

A brief phrase or a standard term ("cf.", "aff.") to express the determiner's doubts about the Identification.

higherClassification

A list (concatenated and separated) of taxa names terminating at the rank immediately superior to the taxon referenced in the taxon record.

identificationReferences

A list (concatenated and separated) of references (publication, global unique identifier, URI) used in the Identification.

typeStatus

A list (concatenated and separated) of nomenclatural types (type status, typified scientific name, publication) applied to the subject.

previousIdentifications

A list (concatenated and separated) of previous assignments of names to the Organism.

verbatimIdentification

This term is meant to allow the capture of an unaltered original identification/determination, including identification qualifiers, hybrid formulas, uncertainties, etc. This term is meant to be used in addition to scientificName (and identificationQualifier etc.), not instead of it.

identifiedBy

A list (concatenated and separated) of names of people, groups, or organizations who assigned the Taxon to the subject.

dateIdentified

The date on which the subject was determined as representing the Taxon.

decimalLatitude

The geographic latitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic center of a Location. Positive values are north of the Equator, negative values are south of it. Legal values lie between -90 and 90, inclusive.

decimalLongitude

The geographic longitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic center of a Location. Positive values are east of the Greenwich Meridian, negative values are west of it. Legal values lie between -180 and 180, inclusive.

stateProvince

The name of the next smaller administrative region than country (state, province, canton, department, region, etc.) in which the Location occurs.

continent

The name of the continent in which the Location occurs.

locality

The specific description of the place.

island

The name of the island on or near which the Location occurs.

county

The full, unabbreviated name of the next smaller administrative region than stateProvince (county, shire, department, etc.) in which the Location occurs.

municipality

The full, unabbreviated name of the next smaller administrative region than county (city, municipality, etc.) in which the Location occurs. Do not use this term for a nearby named place that does not contain the actual location.

license

A legal document giving official permission to do something with the resource.

issue

A GBIF-defined issue.

eventDate

The date-time or interval during which an Event occurred. For occurrences, this is the date-time when the event was recorded. Not suitable for a time in a geological context.

eventTime

The time or interval during which an Event occurred.

day

The integer day of the month on which the Event occurred.

month

The integer month in which the Event occurred.

year

The four-digit year in which the Event occurred, according to the Common Era Calendar.

basisOfRecord

The specific nature of the data record. Recommended best practice is to use the standard label of one of the Darwin Core classes. Examples: PreservedSpecimen, FossilSpecimen, LivingSpecimen, MaterialSample, Event, HumanObservation, MachineObservation, Taxon, Occurrence, MaterialCitation.

country

The name of the country or major administrative unit in which the Location occurs.

type

The nature or genre of the resource. Examples: StillImage, MovingImage, Sound, PhysicalObject, Event, Text.

occurrenceStatus

A statement about the presence or absence of a Taxon at a Location. Examples: present, absent.

recordNumber

An identifier given to the Occurrence at the time it was recorded. Often serves as a link between field notes and an Occurrence record, such as a specimen collector's number.

recordedBy

A list (concatenated and separated) of names of people, groups, or organizations responsible for recording the original Occurrence. The primary collector or observer, especially one who applies a personal identifier (recordNumber), should be listed first.

eventID

An identifier for the set of information associated with an Event (something that occurs at a place and time). May be a global unique identifier or an identifier specific to the data set.

Location

A spatial region or named place.

samplingProtocol

The names of, references to, or descriptions of the methods or protocols used during an Event. Examples: UV light trap, mist net, bottom trawl, ad hoc observation | point count, Penguins from space: faecal stains reveal the location of emperor penguin colonies, https://doi.org/10.1111/j.1466-8238.2009.00467.x, Takats et al. 2001.

samplingEffort

The amount of effort expended during an Event. Examples: 40 trap-nights, 10 observer-hours, 10 km by foot, 30 km by car.

individualCount

The number of individuals present at the time of the Occurrence. Integer.

organismQuantity

A number or enumeration value for the quantity of organisms. Examples: 27 (organismQuantity) with individuals (organismQuantityType); 12.5 (organismQuantity) with percentage biomass (organismQuantityType); r (organismQuantity) with Braun Blanquet Scale (organismQuantityType); many (organismQuantity) with individuals (organismQuantityType).

coordinatePrecision

A decimal representation of the precision of the coordinates given in the decimalLatitude and decimalLongitude.

coordinateUncertaintyInMeters

The horizontal distance (in meters) from the given decimalLatitude and decimalLongitude describing the smallest circle containing the whole of the Location. Leave the value empty if the uncertainty is unknown, cannot be estimated, or is not applicable (because there are no coordinates). Zero is not a valid value for this term.

spatiallyValid

Occurrence records in the ALA can be filtered using the spatially valid flag. This flag combines a set of tests applied to the record to assess how reliable its spatial data components are.

catalogNumber

An identifier (preferably unique) for the record within the data set or collection.

gbifID

The identifier assigned by GBIF for each record.

datasetID

An identifier for the set of data. May be a global unique identifier or an identifier specific to a collection or institution.

institutionCode

The name (or acronym) in use by the institution having custody of the object(s) or information referred to in the record. Examples: MVZ, FMNH, CLO, UCMP.

datasetName

The name identifying the data set from which the record was derived.

otherCatalogNumbers

A list (concatenated and separated) of previous or alternate fully qualified catalog numbers or other human-used identifiers for the same Occurrence, whether in the current or any other data set or collection.

occurrenceID

An identifier for the Occurrence (as opposed to a particular digital record of the occurrence). In the absence of a persistent global unique identifier, construct one from a combination of identifiers in the record that will most closely make the occurrenceID globally unique.

taxonKey

The GBIF-assigned taxon identifier number.

collectionID

An identifier for the collection or dataset from which the record was derived.

verbatim_scientificName

The verbatim (originally-provided) scientific name

verbatimEventDate

The verbatim original representation of the date and time information for an Event.

associatedTaxa

A list (concatenated and separated) of identifiers or names of taxa and the associations of this Occurrence to each of them.

associatedOrganisms

A list (concatenated and separated) of identifiers of other Organisms and the associations of this Organism to each of them.

fieldNotes

One of a) an indicator of the existence of, b) a reference to (publication, URI), or c) the text of notes taken in the field about the Event.

sex

The sex of the biological individual(s) represented in the Occurrence.

rights

A description of the usage rights applicable to the record.

rightsHolder

A person or organization owning or managing rights over the resource.

accessRights

Information about who can access the resource or an indication of its security status.

associatedReferences

A list (concatenated and separated) of identifiers (publication, bibliographic reference, global unique identifier, URI) of literature associated with the Occurrence.

bibliographicCitation

A bibliographic reference for the resource as a statement indicating how this record should be cited (attributed) when used.

references

A related resource that is referenced, cited, or otherwise pointed to by the described resource.

informationWithheld

Additional information that exists, but that has not been shared in the given record.

isDuplicateOf

An identifier for the record of which this record is considered a duplicate.

hasCoordinate

Variable indicating presence/absence of location coordinates.

hasGeospatialIssues

Variable indicating validity of geospatial data associated with record.

occurrenceYear

Year associated with Occurrence.

id

Variable with an identifying value for the Occurrence.

duplicateStatus

Variable indicating whether the Occurrence is a duplicate or not.

associatedOccurrences

A list (concatenated and separated) of identifiers of other Occurrence records and their associations to this Occurrence.

locationRemarks

Comments or notes about the Location.

dataSource

BeeBDC assigned source of the data. Often written when the data is formatted by a BeeBDC::xxx_readr function or similar.

References

This data set was created by generating a random subset of 100 rows from the full, unfiltered and unflagged, BeeBDC dataset from the publication: Dorey, J.B., Fischer, E.E., Chesshire, P.R., Nava-Bolaños, A., O’Reilly, R.L., Bossert, S., Collins, S.M., Lichtenberg, E.M., Tucker, E., Smith-Pardo, A., Falcon-Brindis, A., Guevara, D.A., Ribeiro, B.R., de Pedro, D., Hung, J.K.-L., Parys, K.A., McCabe, L.M., Rogan, M.S., Minckley, R.L., Velazco, S.J.E., Griswold, T., Zarrillo, T.A., Jetz, W., Sica, Y.V., Orr, M.C., Guzman, L.M., Ascher, J., Hughes, A.C. & Cobb, N.S. (2023) A globally synthesised and flagged bee occurrence dataset and cleaning workflow. Scientific Data, 10, 1–17. https://www.doi.org/10.1038/S41597-023-02626-W

Examples


beesRaw <- BeeBDC::beesRaw
head(beesRaw)
#> # A tibble: 6 × 90
#>   database_id  scientificName family subfamily genus subgenus subspecies species
#>   <chr>        <chr>          <chr>  <chr>     <chr> <chr>    <lgl>      <chr>  
#> 1 Dorey_data_… Pseudoanthidi… Megac… Megachil… Pseu… NA       NA         Pseudo…
#> 2 Dorey_data_… Macrotera arc… Andre… Panurgin… Macr… NA       NA         Macrot…
#> 3 Dorey_data_… Xanthesma fur… Colle… Euryglos… Xant… NA       NA         Xanthe…
#> 4 Dorey_data_… Exomalopsis s… Apidae Apinae    Exom… NA       NA         Exomal…
#> 5 Dorey_data_… Osmia bicolor… Megac… Megachil… Osmia NA       NA         Osmia …
#> 6 Paige_data_… Augochlorella… Halic… Halictin… Augo… NA       NA         Augoch…
#> # ℹ 82 more variables: specificEpithet <chr>, infraspecificEpithet <chr>,
#> #   acceptedNameUsage <lgl>, taxonRank <chr>, scientificNameAuthorship <chr>,
#> #   identificationQualifier <lgl>, higherClassification <chr>,
#> #   identificationReferences <lgl>, typeStatus <chr>,
#> #   previousIdentifications <chr>, verbatimIdentification <chr>,
#> #   identifiedBy <chr>, dateIdentified <chr>, decimalLatitude <dbl>,
#> #   decimalLongitude <dbl>, stateProvince <chr>, continent <chr>, …
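
The Format section states legal ranges for decimalLatitude and decimalLongitude, and notes that zero is not a valid value for coordinateUncertaintyInMeters. The following is a minimal base R sketch of how those rules could be checked. It uses a small invented data frame rather than beesRaw itself, and the columns coordsLegal and uncertaintyLegal are hypothetical names introduced here for illustration; they are not part of the dataset, and this is not how BeeBDC itself flags records.

```r
# Hypothetical toy records: column names follow the Format section,
# but every value below is invented for illustration only.
toy <- data.frame(
  database_id = c("toy_1", "toy_2", "toy_3", "toy_4"),
  decimalLatitude = c(-34.9, 91.2, 12.5, NA),
  decimalLongitude = c(138.6, 20.0, -181.0, NA),
  coordinateUncertaintyInMeters = c(100, 0, 250, NA),
  stringsAsFactors = FALSE
)

# Latitude must lie in [-90, 90] and longitude in [-180, 180], inclusive.
# Records with missing coordinates evaluate to NA rather than TRUE or FALSE.
toy$coordsLegal <- toy$decimalLatitude >= -90 & toy$decimalLatitude <= 90 &
  toy$decimalLongitude >= -180 & toy$decimalLongitude <= 180

# Zero is not a valid uncertainty; an unknown uncertainty is left empty (NA),
# which is treated as acceptable here.
toy$uncertaintyLegal <- is.na(toy$coordinateUncertaintyInMeters) |
  toy$coordinateUncertaintyInMeters > 0

print(toy[, c("database_id", "coordsLegal", "uncertaintyLegal")])
```

The same two expressions should apply unchanged to beesRaw once it is loaded, since they rely only on the three documented column names.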