The Global Biodiversity Information Facility (GBIF) is an international network and data infrastructure funded by the world's governments providing global data that document the occurrence of species. GBIF currently integrates datasets documenting over 1.6 billion species occurrences, growing daily. The GBIF occurrence dataset combines data from a wide array of sources including specimen-related data from natural history museums, observations from citizen science networks and environment recording schemes. While these data are constantly changing at GBIF.org, periodic snapshots are taken and made available on AWS.
Public Domain Mark 1.0: https://creativecommons.org/publicdomain/mark/1.0/
License information was derived automatically
Dataset that provides a direct link to the Cook Islands' data hosted on the GBIF website/records.
The Global Biodiversity Information Facility (GBIF) was established by governments in 2001 to encourage free and open access to biodiversity data via the Internet. Through a global network of countries and organizations, GBIF promotes and facilitates the mobilization, access, discovery and use of information about the occurrence of organisms over time and across the planet. GBIF provides three core services and products:
# An information infrastructure: an Internet-based index of a globally distributed network of interoperable databases that contain primary biodiversity data (information on museum specimens, field observations of plants and animals in nature, and results from experiments), so that data holders across the world can access and share them.
# Community-developed tools, standards and protocols: the tools data providers need to format and share their data.
# Capacity-building: the training, access to international experts and mentoring programs that national and regional institutions need to become part of a decentralized network of biodiversity information facilities.
GBIF and its many partners work to mobilize the data, and to improve search mechanisms, data and metadata standards, web services, and the other components of an Internet-based information infrastructure for biodiversity. GBIF makes available data that are shared by hundreds of data publishers from around the world. These data are shared according to the GBIF Data Use Agreement, which includes the provision that users of any data accessed through or retrieved via the GBIF Portal will always give credit to the original data publishers.
* Explore Species: Find data for a species or other group of organisms. Information on species and other groups of plants, animals, fungi and micro-organisms, including species occurrence records, as well as classifications and scientific and common names.
* Explore Countries: Find data on the species recorded in a particular country, territory or island. Information on the species recorded in each country, including records shared by publishers from throughout the GBIF network.
* Explore Datasets: Find data from a data publisher, dataset or data network. Information on the data publishers, datasets and data networks that share data through GBIF, including summary information on 10028 datasets from 419 data publishers.
GBIF is an international organisation that is working to make the world's biodiversity data freely accessible. The GBIF data portal is a service that provides access to millions of data records that are being shared via the GBIF network. These data are generously made available through the GBIF network by a wide range of institutions and organisations from around the world. The two types of data currently being shared through the GBIF Network are:
* Species occurrence records (based on specimens and observations): information about the occurrence of species at particular times and places.
* Names and classifications of organisms: information on the names (both scientific and common) used for species and on the classification of those organisms into taxonomic hierarchies.
Public Domain Mark 1.0: https://creativecommons.org/publicdomain/mark/1.0/
License information was derived automatically
Dataset that provides a direct link to PNG's data hosted on the GBIF website/records.
Contact emails: info@gbif.org / helpdesk@gbif.org
GBIF, the Global Biodiversity Information Facility, is an international network and data infrastructure funded by the world's governments and aimed at providing anyone, anywhere, open access to data about all types of life on Earth. Coordinated through its Secretariat in Copenhagen, the GBIF network of participating countries and organizations, working through participant nodes, provides data-holding institutions around the world with common standards and open-source tools that enable them to share information about where and when species have been recorded. This knowledge derives from many sources, ranging from museum specimens collected in the 18th and 19th centuries to geotagged smartphone photos shared by amateur naturalists in recent days and weeks. The GBIF network draws all these sources together through the use of data standards, such as Darwin Core, which forms the basis for the bulk of GBIF.org's index of hundreds of millions of species occurrence records. Publishers provide open access to their datasets using machine-readable Creative Commons licence designations, allowing scientists, researchers and others to apply the data in hundreds of peer-reviewed publications and policy papers each year. Many of these analyses, which cover topics from the impacts of climate change and the spread of invasive and alien pests to priorities for conservation and protected areas, food security and human health, would not be possible without this open access. GBIF arose from a 1999 recommendation by the Biodiversity Informatics Subgroup of the Organization for Economic Cooperation and Development's Megascience Forum. This report concluded that "An international mechanism is needed to make biodiversity data and information accessible worldwide", arguing that this mechanism could produce many economic and social benefits and enable sustainable development by providing sound scientific evidence.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The electronic catalog of the entomology type collection at the California Academy of Sciences, San Francisco.
Bird observations at different spatial scales at several sites and regions across deserts in California. All data pulled from GBIF.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The type collections of Fabaceae are part of the targeted datasets of the Herbarium Bogoriense mobilization program. The program was initiated to facilitate studies and enhance access to herbarium collections for the botanical community and others.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The Global Biodiversity Information Facility (GBIF) indexes thousands of biodiversity datasets from Natural History Collections, citizen science initiatives (e.g., iNaturalist, eBird), and other sources. As part of the indexing process, GBIF associates at least two identifiers with indexed records: a record id (aka gbifID) and a dataset id (aka dataset key). These ids are central to looking up records, referencing data, and packaging interpreted data products.
This publication contains an exhaustive list of GBIF IDs and ids associated by their data providers as derived from:
GBIF.org (01 March 2023) GBIF Occurrence Download https://doi.org/10.15468/dl.pk3trq
The resource (size: ~260GB) provided by GBIF had content id hash://sha256/c8bac8acb28c8524c53589b3a40e322dbbbdadf5689fef2e20266fbf6ddf6b97 and was used to generate the resource included in this publication using
preston cat 'zip:hash://sha256/c8bac8acb28c8524c53589b3a40e322dbbbdadf5689fef2e20266fbf6ddf6b97!/0015281-230224095556074.csv' \
  | cut -f 1,2,3,37,38,39 \
  | gzip \
  > gbifid.tsv.gz
with the content id of gbifid.tsv.gz (size: ~35GB) being hash://sha256/a339e32e10edaad585f61f2ded06cbb23e0618c65a6360db18d7d729054940a8 .
The first 10 lines of gbifid.tsv.gz, as extracted via
preston cat --remote https://zenodo.org/record/7789866/files,https://linker.bio hash://sha256/a339e32e10edaad585f61f2ded06cbb23e0618c65a6360db18d7d729054940a8 \
  | gunzip \
  | head
are:
gbifID      datasetKey                            occurrenceID  institutionCode  collectionCode  catalogNumber
2997162320  c71c8000-9fc7-422c-804a-ce6abe751771  3399442       CEPEC            CEPEC           CEPEC00109669
2997162309  c71c8000-9fc7-422c-804a-ce6abe751771  2733085       CEPEC            CEPEC           CEPEC00000818
2997162317  c71c8000-9fc7-422c-804a-ce6abe751771  2733086       CEPEC            CEPEC           CEPEC00000888
2997162313  c71c8000-9fc7-422c-804a-ce6abe751771  3399443       CEPEC            CEPEC           CEPEC00109744
2997162306  c71c8000-9fc7-422c-804a-ce6abe751771  2733087       CEPEC            CEPEC           CEPEC00000889
2997162316  c71c8000-9fc7-422c-804a-ce6abe751771  3399440       CEPEC            CEPEC           CEPEC00109605
2997162324  c71c8000-9fc7-422c-804a-ce6abe751771  2733088       CEPEC            CEPEC           CEPEC00000890
2997162308  c71c8000-9fc7-422c-804a-ce6abe751771  3399441       CEPEC            CEPEC           CEPEC00109615
2997162303  c71c8000-9fc7-422c-804a-ce6abe751771  2733089       CEPEC            CEPEC           CEPEC00000891
Note that at the time of writing, the HTML resources associated with the GBIF record id (gbifID) 2997162320 and the dataset key c71c8000-9fc7-422c-804a-ce6abe751771 (both taken from the first data row of the example above) are available via:
https://gbif.org/occurrence/2997162320
and
https://gbif.org/dataset/c71c8000-9fc7-422c-804a-ce6abe751771
respectively.
This resource was initially created to support integration with Bionomia (https://bionomia.net), by helping to associate the people identifiers provided by Bionomia with their original records via their GBIF ids. Bionomia re-uses GBIF record ids as a way to define links between records and the people (e.g., curators, collectors, identifiers) who worked on them.
In other words, this resource provides a versioned translation table from the GBIF data universe (as defined by GBIF record ids, and dataset keys) to the data collections that exist (and evolve) independent of it.
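As a rough illustration of using the file as a translation table, the following Python sketch reads the six tab-separated columns into a dictionary keyed by gbifID and rebuilds the two GBIF URLs shown earlier. The function names are hypothetical, the two sample rows are copied from the excerpt above, and the full ~35GB file would realistically need streaming or a database rather than an in-memory dictionary.

```python
import csv
import gzip
import io


def load_translation_table(stream):
    """Map each gbifID to the identifiers assigned by the data provider.

    `stream` is any text stream of tab-separated rows with a header line,
    as in gbifid.tsv.gz after decompression.
    """
    reader = csv.DictReader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    table = {}
    for row in reader:
        table[row["gbifID"]] = {
            "datasetKey": row["datasetKey"],
            "occurrenceID": row["occurrenceID"],
            "institutionCode": row["institutionCode"],
            "collectionCode": row["collectionCode"],
            "catalogNumber": row["catalogNumber"],
        }
    return table


def occurrence_url(gbif_id):
    """URL pattern taken from the example links in this description."""
    return f"https://gbif.org/occurrence/{gbif_id}"


def dataset_url(dataset_key):
    """URL pattern taken from the example links in this description."""
    return f"https://gbif.org/dataset/{dataset_key}"


# Header plus two data rows copied from the excerpt above.
SAMPLE = (
    "gbifID\tdatasetKey\toccurrenceID\tinstitutionCode\tcollectionCode\tcatalogNumber\n"
    "2997162320\tc71c8000-9fc7-422c-804a-ce6abe751771\t3399442\tCEPEC\tCEPEC\tCEPEC00109669\n"
    "2997162309\tc71c8000-9fc7-422c-804a-ce6abe751771\t2733085\tCEPEC\tCEPEC\tCEPEC00000818\n"
)

# Round-trip through gzip in memory to mimic reading the compressed file.
compressed = gzip.compress(SAMPLE.encode("utf-8"))
with gzip.open(io.BytesIO(compressed), mode="rt", encoding="utf-8", newline="") as handle:
    table = load_translation_table(handle)

record = table["2997162320"]
print(record["catalogNumber"], occurrence_url("2997162320"), dataset_url(record["datasetKey"]))
```

For the real file, passing the path to `gzip.open` in place of the in-memory buffer should work the same way, subject to the memory caveat above.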
Note that the resource identified by hash://sha256/c8bac8acb28c8524c53589b3a40e322dbbbdadf5689fef2e20266fbf6ddf6b97 was not included in this publication because it was too big (260GB) to fit. You may be able to retrieve the resource from its original location at https://api.gbif.org/v1/occurrence/download/request/0015281-230224095556074.zip .
https://pacific-data.sprep.org/dataset/data-portal-license-agreements/resource/de2a56f5-a565-481a-8589-406dc40b5588
Dataset that provides a direct link to Nauru's data hosted on the GBIF website/records.
https://pacific-data.sprep.org/dataset/data-portal-license-agreements/resource/de2a56f5-a565-481a-8589-406dc40b5588
Dataset that provides a direct link to the Solomon Islands' data hosted on the GBIF website/records.
Taxonomic database of living organisms. GBIF, the Global Biodiversity Information Facility, is an international network and data infrastructure funded by the world's governments and aimed at providing anyone, anywhere, open access to data about all types of life on Earth.
https://pacific-data.sprep.org/dataset/data-portal-license-agreements/resource/de2a56f5-a565-481a-8589-406dc40b5588
Dataset that provides a direct link to Kiribati's data hosted on the GBIF website/records.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Data from the Global Biodiversity Information Facility were extracted using R (version 3.2.0) on 9 July 2015 using the rgbif package (version 0.9.0) (Chamberlain, S., Ram, K., Barve, V. & Mcglinn, D. (2015) Package ‘rgbif’: Interface to the Global 'Biodiversity' Information Facility 'API' http://cran.r-project.org/web/packages/rgbif/rgbif.pdf). The ‘rights’ statement was extracted for all occurrence datasets with one or more observations. A total of 12,458 datasets were extracted, but only about 11% of them have an explicit data-usage-rights statement at the dataset level. However, some datasets use the occurrence-level ‘rights’ and ‘accessRights’ fields instead. To capture these, where a rights statement was missing at the dataset level, the rights information was obtained from the first record of the dataset.
The datasets were categorized into 13 different types depending on the origin of the observations.
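The fallback described above (dataset-level statement first, otherwise the first occurrence record) can be sketched in Python as follows. This is not the original R/rgbif code: the function name, the dictionary-based record, and the choice to check ‘rights’ before ‘accessRights’ are assumptions made for illustration.

```python
def resolve_rights(dataset_rights, first_record):
    """Return (statement, level) for one dataset.

    dataset_rights: the dataset-level rights statement, or None/"" if absent.
    first_record:   dict of fields from the dataset's first occurrence record
                    (hypothetical representation), or None if unavailable.
    """
    # Prefer the explicit dataset-level statement when one exists.
    if dataset_rights and dataset_rights.strip():
        return dataset_rights.strip(), "dataset"

    # Otherwise fall back to the occurrence-level fields of the first record.
    # Assumption: 'rights' is checked before 'accessRights'.
    for field in ("rights", "accessRights"):
        value = (first_record or {}).get(field)
        if value and value.strip():
            return value.strip(), f"occurrence:{field}"

    # Nothing found at either level.
    return None, "missing"


# Illustrative inputs only; the statements below are placeholders.
print(resolve_rights("Example dataset-level statement", {"rights": "ignored"}))
print(resolve_rights("", {"accessRights": "Example record-level statement"}))
print(resolve_rights(None, {}))
```

Returning the level alongside the statement makes it possible to count how many datasets relied on the record-level fallback.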
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This FAIRsharing record describes: The Global Biodiversity Information Facility (GBIF) is an international open data infrastructure for biodiversity, funded by governments. GBIF encourages institutions to publish data according to common standards. GBIF operates through a network of nodes, coordinating the biodiversity information facilities of Participant countries and organizations. It provides a single point of access (through GBIF.org and its web services) to hundreds of millions of records, shared freely by hundreds of institutions worldwide, making it the biggest biodiversity database on the Internet. Many GBIF Participant countries have set up national portals to better inform their citizens and policy makers about their own biodiversity. Many GBIF Participants also support data publishers/holders by setting up a data hosting centre (DHC) where the data can be deposited and shared through GBIF.org. Each DHC must meet a strict set of criteria (https://github.com/gbif/ipt/wiki/dataHostingCentres#data-hosting-centre-criteria), demonstrating that it is trustworthy. Trusted IPT DHCs are grouped in this collection for your convenience. To get started depositing/sharing your data using an IPT DHC, locate the one nearest you, and then request your own account, which will allow you to manage your own datasets. To understand how to use the IPT and publish your data, follow this simple how-to-publish guide: https://github.com/gbif/ipt/wiki/howToPublish
https://pacific-data.sprep.org/dataset/data-portal-license-agreements/resource/de2a56f5-a565-481a-8589-406dc40b5588
Dataset that provides a direct link to Tonga's data hosted on the GBIF website/records.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A set of six databases used in a study of the biogeography of Greater Caribbean reef fishes entitled:
Comparing biodiversity databases: Greater Caribbean reef-fishes as a case study
Iliana Chollett, D. Ross Robertson
Database Authors: D Ross Robertson and Ernesto Peña, Smithsonian Tropical Research Institute, Panamá
This set of six databases contains georeferenced location records from six sources as described below. These six sources provided georeferenced records of occurrence of fishes found in the Greater Caribbean study area (6-33° N, 57-100° W). Each occurrence record consists of a species name and associated latitude and longitude. Databases included in the comparisons made here are from five major online aggregators. Since their content overlaps to some extent, and OBIS, iDigBio and FishNet collaborate with GBIF, their data might be expected to produce similar biogeographic patterns. STRI includes a curated compendium of data from those five aggregators, enriched with data from many additional sources.
Only reef-associated fish species were included in the present analysis. These mostly represent demersal species known to occur on hard bottoms (coral, rock and oyster substrata), but also include species living on rubble, sand and vegetated bottoms within and around the immediate fringes of reefs, and pelagic species regularly found on reefs. All exotic and non-resident species and species other than reef-associated fishes were excluded from all databases prior to comparisons. Non-residents were defined as otherwise widespread species only rarely seen in the study area. Shore-fishes, including what are generally regarded as reef fishes, include those found in the waters of continental and insular shelves, i.e. between 0-200m. Reef-fish assemblages dominated by shallow-water taxa extend down to that depth in the study area (Baldwin et al. 2018). We used the shelf edge as a breakpoint and excluded records in areas deeper than 200m, identifying those areas using the General Bathymetric Chart of the Oceans (Kapoor, 1981; GEBCO Compilation Group, 2019).
Before the analyses, for all databases, duplicate records were deleted. Subsequently, records in the Pacific or on land were deleted. We used the Global Self-consistent, Hierarchical, High-resolution Geography Database (Wessel & Smith, 1996) to identify these areas. The spatial distribution of species-records in each database is shown in Figure 1 of the publication.
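The cleaning steps described above can be approximated with a short Python sketch. It is only an illustration under several assumptions: the study-area bounds are read as 6-33° N and 57-100° W, each record is assumed to already carry a seafloor depth looked up from bathymetry (the `depth_m` field is hypothetical), and duplicates are taken to mean identical species name and coordinates. Removing records on land or in the Pacific needs a coastline dataset and is not reproduced here; a plain bounding box does not exclude Pacific waters west of Central America.

```python
from typing import NamedTuple


class Record(NamedTuple):
    species: str
    lat: float      # decimal degrees, positive north
    lon: float      # decimal degrees, negative west
    depth_m: float  # hypothetical: seafloor depth at the record's location


# Study-area bounds as read from the description (assumption, see lead-in).
LAT_MIN, LAT_MAX = 6.0, 33.0
LON_MIN, LON_MAX = -100.0, -57.0
MAX_DEPTH_M = 200.0  # shelf-edge breakpoint used in the description


def clean(records):
    """Drop exact duplicates, out-of-area records and records deeper than 200 m."""
    seen = set()
    kept = []
    for rec in records:
        key = (rec.species, rec.lat, rec.lon)
        if key in seen:
            continue  # duplicate of a record already processed
        seen.add(key)
        inside = LAT_MIN <= rec.lat <= LAT_MAX and LON_MIN <= rec.lon <= LON_MAX
        if not inside:
            continue  # outside the study-area bounding box
        if rec.depth_m > MAX_DEPTH_M:
            continue  # beyond the shelf edge
        kept.append(rec)
    return kept


# Placeholder records, not real occurrence data.
records = [
    Record("Species A", 18.0, -66.0, 30.0),
    Record("Species A", 18.0, -66.0, 30.0),   # exact duplicate
    Record("Species B", 40.0, -70.0, 20.0),   # north of the study area
    Record("Species C", 12.0, -80.0, 450.0),  # deeper than 200 m
    Record("Species D", 25.0, -90.0, 200.0),  # exactly on the shelf edge
]
cleaned = clean(records)
print([rec.species for rec in cleaned])
```

Exact-match deduplication will miss the near-duplicates with slightly different coordinates that aggregators often contain; handling those would need a tolerance or a record identifier.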
Global Biodiversity Information Facility (GBIF, https://www.gbif.org/): GBIF is an international network and research infrastructure aimed at providing open access to data about all types of life on earth. GBIF works through participant nodes using common standards and open-source tools that enable them to share information. Data from among the 49,000+ datasets hosted by GBIF that were used here range from those on museum specimens collected since the 18th century, to published scientific checklists, to curated local checklists produced by trained science sources such as the Atlantic and Gulf Rapid Assessment Program (https://www.agrra.org/), to geotagged smartphone photos (that act as vouchers allowing verification) shared by amateur and scientific naturalists through iNaturalist (https://www.inaturalist.org/), to unvouchered, unverified and unverifiable observation records from untrained divers such as those contributing to DiveBoard (http://www.diveboard.com). GBIF data are standardized in Darwin Core format. GBIF data were obtained from a polygon of the study area and subject to taxonomic review and selection after downloading (accessed through the GBIF portal, https://www.gbif.org/, on or about 2019-05-19).
Ocean Biogeographic Information System (OBIS, https://obis.org/): OBIS is a global open-access data and information clearing-house on marine biodiversity (OBIS, 2019) that was adopted as a project of the International Oceanographic Data and Information Exchange of the Intergovernmental Oceanographic Commission of UNESCO. Its range of sources is similar to that of GBIF. OBIS hosts data from organizations or programs that join it as one of 13 “nodes”, and harvests the data from the IPT (Integrated Publishing Toolkit), where providers publish their data. The IPT is developed and maintained by GBIF, and OBIS is a major contributor of marine data to GBIF. Data are standardized in Darwin Core format. OBIS data were obtained for the region of study by downloading data on each family, then retaining only data inside the study area, which were then subject to taxonomic review and selection (accessed through the OBIS portal, https://obis.org/, on or about 2019-05-19).
Integrated Digitized Biocollections (iDigBio, https://portal.idigbio.org/portal/search): iDigBio is sponsored by the US National Science Foundation and run by the University of Florida, and provides digital data from public, non-federal US collections. Data are standardized in Darwin Core format and provided “as is”. iDigBio joined the GBIF network in 2017. iDigBio records were downloaded from a polygon of the region of study and subject to taxonomic review and selection (accessed through the iDigBio portal, https://portal.idigbio.org/portal/search, on or about 2019-05-19).
FishNet2 (http://www.fishnet2.net/): FishNet2 is a collaborative effort that aggregates data on fish collections around the world to share and distribute data on specimen holdings from ~75 museums, universities and other institutions. FishNet2 distributes data in Darwin Core, and data are provided “as is”. FishNet2 is part of the VertNet network, which has contributed to GBIF since 2013 and became part of iDigBio in 2016. While FishNet2 has made substantial efforts to georeference the location-record data it hosts, many hosted records still lack georeferencing. FishNet2 data were obtained from a polygon of the study area and subject to taxonomic review after downloading (accessed through the FishNet2 portal, www.fishnet2.org, 2019-05-19).
FishBase (http://www.fishbase.org): FishBase is a global biodiversity information system supervised by a consortium of nine non-USA international institutions, and hosts data on fin fishes and elasmobranchs (Froese & Pauly, 2009). Information presented in FishBase is extracted from the scientific literature, reports and museum or aggregator (GBIF) databases, and standardized by a team of specialists. Data from FishBase were downloaded for the following ecosystems: Caribbean Sea, Gulf of Mexico, Southeast U.S. Continental Shelf, Atlantic Ocean, Sargasso Sea and Bermuda, and subject to taxonomic review and selection after downloading (2019-05-19).
Smithsonian Tropical Research Institute (STRI; https://biogeodb.stri.si.edu/caribbean/en/pages): The STRI database was compiled by DRR and Ernesto Peña at STRI’s Naos Marine Laboratory, and represents about 15 years' accumulation of curated data (see below) from the following sources: data downloaded at roughly two-year intervals from the five aggregators; data from online databases of various museums that supply aggregators (data directly downloaded from a museum sometimes differ from those available in an aggregator from the same museum), including the Swedish Museum of Natural History, the American Museum of Natural History, the Natural History Museum of Denmark, the Gulf Coast Research Laboratory, the Colombian Museum of Natural Marine History, the United States National Museum, and the United States Geological Survey; data from national aggregators of Colombia (Sistema de Información Sobre Biodiversidad de Colombia, https://sibcolombia.net/, and Sistema de Información Ambiental Marina de Colombia, https://siam.invemar.org.co/), Mexico (La Comisión Nacional para el Conocimiento y Uso de la Biodiversidad, CONABIO; http://www.conabio.gob.mx/informacion/gis/), and Costa Rica (Museo de Zoologia de la Universidad de Costa Rica, http://museo.biologia.ucr.ac.cr/); verified (by DRR) underwater photographs of fishes taken at known locations; peer-reviewed publications containing location information (species descriptions; taxonomic revisions of species, genera and families; regional and local checklists); fisheries reports; digital tagging data for species such as elasmobranchs; diving surveys and collections of local faunas by DRR (e.g. Robertson et al. 2019).
In addition, selected data are incorporated from two sources that collect species lists at sites scattered throughout the Greater Caribbean: the Atlantic and Gulf Rapid Reef Assessment program (AGRRA, https://www.agrra.org/: Kramer & Lang, 2003) and trained citizen scientists who contribute data on fishes to the Reef Environmental Education Foundation’s database (REEF: Pattengill-Semmens & Semmens, 2003). The bibliographic module (https://biogeodb.stri.si.edu/caribbean/en/library) of Robertson & VanTassel (2019) contains ~1700 publications linked to species names, among them the publications from which location data were extracted.
Data from the aggregators are presented “as is” and the aggregators themselves do not do data curation. Duplicates (and occasionally triplicates and quadruplicates) of the same museum record are often included from multiple sources (e.g. the original museum source, derivative checklists, an aggregator), sometimes with slightly different georeferenced coordinates. Data available in one year may subsequently disappear from an aggregator, and different data may be available for the same species under different names (e.g. the old and new names when a species is reassigned to another genus). Errors, sometimes large errors (Robertson, 2008), are common in aggregator data, from museums as well as other sources, and longstanding errors can seem to take on a
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Plant and animal checklists, with conservation status information, are fundamental for conservation management. Historical field data, more recent data of digital origin and data-sharing platforms provide useful sources for collating species locality data. However, different biodiversity datasets have different formats and inconsistent naming systems. Additionally, most digital data sources do not provide an easy option for download by protected area. Further, data-entry-ready software is not readily available for conservation organization staff with limited technical skills to collate these heterogeneous data and create distribution maps and checklists for protected areas. The insights presented here are the outcome of conceptualizing a biodiversity information system for South African National Parks. We recognize that a fundamental requirement for achieving better standardization, sharing and use of biodiversity data for conservation is capacity building, internet connectivity, national institutional data management support and collaboration. We focus on some of the issues that need to be considered for capacity building, data standardization and data support. We outline the need for using taxonomic backbones and standardizing biodiversity data and the utility of data from the Global Biodiversity Information Facility and other available sources in this process. Additionally, we make recommendations for the fields needed in relational databases for collating species data that can be used to inform conservation decisions and outline steps that can be taken to enable easier collation of biodiversity data, using South Africa as a case study.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This database contains information on the specimens registered so far in the fish collection of the Swedish Museum of Natural History. The database is managed with the Artedian collection management system in MS-Access. The main tables describe collecting events and collection objects, which are related in a one-to-many relationship. The database is Darwin Core compatible but uses alternative table names and also includes loan routines, document management and other components of collection management. Principal items managed are objects (catalog number, identification, number of specimens, sex, size, location in collection, collection space, preservative, identifier, etc.); localities (collecting events with date, collector, geographic information including coordinates, and more); collecting site habitat; loans (with extensive metadata); and accession information (date of accession, donors, acquisition conditions, specimen preservation progress, etc.). DNA samples are managed as subsamples of objects. Specimen and collecting site images are managed with extensive metadata. The database has a web presentation at http://artedi.nrm.se/nrmfish.
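The one-to-many relationship between collecting events and collection objects can be pictured with a small Python sketch. The class and field names are hypothetical and are not the Artedian table names (which the description says differ from Darwin Core); the values are placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CollectingEvent:
    event_id: int   # hypothetical primary key
    locality: str
    date: str
    collector: str


@dataclass(frozen=True)
class CollectionObject:
    catalog_number: str
    event_id: int   # foreign key: each object belongs to exactly one event
    identification: str
    specimen_count: int


def objects_by_event(objects):
    """Group collection objects under the collecting event they came from."""
    grouped = defaultdict(list)
    for obj in objects:
        grouped[obj.event_id].append(obj)
    return dict(grouped)


# Placeholder rows for illustration only.
event = CollectingEvent(1, "Example locality", "2000-01-01", "Example collector")
objects = [
    CollectionObject("EXAMPLE-1", 1, "Example species A", 3),
    CollectionObject("EXAMPLE-2", 1, "Example species B", 1),
]
grouped = objects_by_event(objects)
print(event.locality, [obj.catalog_number for obj in grouped[event.event_id]])
```

One event can thus carry many catalogued objects, while locality, date and collector are stored only once per event.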