Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Paleobiology Database (PBDB) is a non-governmental, non-profit public resource for paleontological data. It has been organized and operated by a multi-disciplinary, multi-institutional, international group of paleobiological researchers. Its purpose is to provide global, collection-based occurrence and taxonomic data for organisms of all geological ages, as well as data services to allow easy access to data for independent development of analytical tools, visualization software, and applications of all types. The Database’s broader goal is to encourage and enable data-driven collaborative efforts that address large-scale paleobiological questions.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Paleobiology Database is a public resource for the global scientific community. It has been organized and operated by a multi-disciplinary, multi-institutional, international group of paleobiological researchers. Its purpose is to provide global, collection-based occurrence and taxonomic data for marine and terrestrial animals and plants of any geological age, as well as web-based software for statistical analysis of the data. The project's wider, long-term goal is to encourage collaborative efforts to answer large-scale paleobiological questions by developing a useful database infrastructure and bringing together large data sets.
http://paleobiodb.org/
The Paleobiology Database is a public database of paleontological data that anyone can use, maintained by an international non-governmental group of paleontologists. https://paleobiodb.org/#/
A non-governmental, non-profit public database for paleontological data providing researchers and the public with information about the entire fossil record. It has been organized and operated by a multi-disciplinary, multi-institutional, international group of paleobiological researchers. Its purpose is to provide global, collection-based occurrence and taxonomic data for organisms of all geological ages, as well as data services to allow easy access to data for independent development of analytical tools, visualization software, and applications of all types. The Database's broader goal is to encourage and enable data-driven collaborative efforts that address large-scale paleobiological questions. Paleontological data files are accepted for upload; however, PaleoBioDB requires some basic data types to be included before an upload can be performed. The Application Programming Interface (API) gives scientists, students, and developers programmatic access to taxonomic, spatial, and temporal data contained within the database.
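As an illustration of the programmatic access described above, the following minimal sketch builds a request URL for occurrence records without sending it. The base URL, the `occs/list` endpoint, and the `base_name`, `interval`, and `show` parameters are assumptions based on version 1.2 of the PBDB data service and should be checked against the current API documentation; `occurrence_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumed base URL of the PBDB data service (version 1.2); verify before use.
PBDB_BASE = "https://paleobiodb.org/data1.2"


def occurrence_url(base_name, interval, fmt="csv", show=("coords",)):
    """Build (but do not send) a request URL for occurrences of a taxon in a time interval.

    base_name: taxon name whose occurrences (including subtaxa) are wanted
    interval:  named geological interval, e.g. "Miocene"
    fmt:       response format suffix, e.g. "csv" or "json"
    show:      optional extra output blocks, e.g. ("coords",)
    """
    params = {"base_name": base_name, "interval": interval}
    if show:
        params["show"] = ",".join(show)
    return f"{PBDB_BASE}/occs/list.{fmt}?{urlencode(params)}"


url = occurrence_url("Canidae", "Miocene")
print(url)
```

The resulting URL can be opened in a browser or passed to any HTTP client; keeping URL construction separate from the download makes the query easy to record alongside an analysis.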
https://www.gnu.org/licenses/gpl-3.0.html
Scripts and protocols used to generate the results, and data tables necessary to run the scripts. To run the code in these scripts, we recommend using the RStudio interface and saving all scripts and the files in the data_tables.zip archive in a single folder. This folder should then be set as your working directory in R, for example: setwd("C://Users/keichenseer/data_tables")
This is the supplementary data repository of the Paleobiology paper titled Bedrock Geological Map Predictions for Phanerozoic Fossil Occurrences. Geographically-explicit, taxonomically resolved fossil occurrences are necessary for reconstructing macroevolutionary patterns and for testing a wide range of hypotheses in the Earth and life sciences. Heterogeneity in the spatial and temporal distribution of fossil occurrences in the Paleobiology Database (PBDB) is attributable to several different factors, including turnover among biological communities, socioeconomic disparities in the intensity of paleontological research, and geological controls on the distribution and fossil yield of sedimentary deposits. Here we use the intersection of global geologic map data from Macrostrat and fossil collections in the PBDB to assess the extent to which the potentially fossil-bearing, surface-expressed sedimentary record has yielded fossil occurrences. We find a significant and moderately strong posi...
A Plant Proteome DataBase for Arabidopsis thaliana and maize (Zea mays). The PPDB stores experimental data from in-house proteome and mass spectrometry analysis, curated information about protein function, protein properties and subcellular localization. Importantly, proteins are particularly curated for possible (intra) plastid location and their plastid function. Protein accessions identified in published Arabidopsis (and other Brassicaceae) proteomics papers are cross-referenced to rapidly determine previous experimental identification by mass spectrometry. All protein-encoding gene models in the Arabidopsis nuclear and organellar genomes, as assembled by TAIR, as well as all maize EST assemblies (ZmGI), as assembled by the DFCI Maize Gene Index project, are uploaded in PPDB and linked to each other via a BLAST alignment. Thus every predicted protein in both species can be searched for experimental and other information (even if not experimentally identified).
A plant promoter database that provides information on transcription start sites (TSSs), core promoter structure and regulatory element groups (REGs) as putative and comprehensive transcriptional regulatory elements. Microarray data-based predictions have been appended as REG annotations which inform their putative physiological roles.
Attribution 3.0 (CC BY 3.0) https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
Geographic range is used as a correlate of extinction risk for extant and extinct organisms across the fields of conservation and paleobiology. However, the exact method used to measure geographic range, and the biases and limitations of each method, are rarely discussed explicitly despite their potential to impact conclusions. Here I examine and quantify properties of five commonly used measures of geographic range (convex hull area, maximum pairwise great circle distance, latitudinal range, longitudinal range, and cell count) along with a rarely used measure (minimum spanning tree distance) in the context of three datasets: a simulated dataset of two shapes with known areal limits, a paleontological occurrence dataset of pre-Cenozoic brachiopod genera from the Paleobiology Database (PBDB), and 50,000 occurrence records of bird species in the western hemisphere from the eBird database.
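To make four of the measures named above concrete, here is a minimal sketch that computes latitudinal range, longitudinal range, maximum pairwise great circle distance, and cell count from (latitude, longitude) pairs. It is not the study's code: the haversine formula, the mean Earth radius, the 5-degree cell size, and the function names are all assumptions chosen for illustration, and convex hull area and minimum spanning tree distance are left out.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant for this sketch)


def great_circle_km(p, q):
    """Haversine great circle distance between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def range_measures(points, cell_deg=5.0):
    """Return simple geographic range measures for a list of (lat, lon) occurrences.

    The longitudinal range is a naive max minus min and ignores wrapping
    across the 180-degree meridian. cell_deg is a hypothetical grid size.
    """
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    # Occupied equal-angle grid cells, identified by integer row/column indices.
    cells = {(math.floor(lat / cell_deg), math.floor(lon / cell_deg)) for lat, lon in points}
    max_gcd = max((great_circle_km(p, q) for p, q in combinations(points, 2)), default=0.0)
    return {
        "latitudinal_range_deg": max(lats) - min(lats),
        "longitudinal_range_deg": max(lons) - min(lons),
        "max_great_circle_km": max_gcd,
        "cell_count": len(cells),
    }


# Made-up coordinates, for illustration only.
occurrences = [(0.0, 0.0), (0.0, 90.0), (10.0, 45.0)]
measures = range_measures(occurrences)
print(measures)
```

Even on this toy input the measures respond differently to the same points: the two equatorial occurrences drive the longitudinal range and great circle distance, while the latitudinal range depends only on the third point.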
https://doi.org/10.5061/dryad.xwdbrv1k9
All supplementary data files for this study, including R-scripts for analyses, an animation of fossil and stratigraphic column locations through time, tables of rock units matched to fossil occurrences, tables of fossil occurrence assigned ages, and correlations for Ediacaran and Cambrian rock and fossil quantities as separate time periods are included in this Dryad repository.
Supplementary Figure, Tables, and captions. Highlighting of cells in Tables S3, S4, and S5 indicates statistical significance at different confidence levels. Green highlighting indicates a correlation that is significant at the 95% level, and yellow indicates a correlation that is significant at the 90% level. No coloring indicates correlations that are not ...
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Cheilostomata is the most diverse and ecologically dominant order of bryozoans living today. We apply a Bayesian framework to estimate macroevolutionary rates of cheilostomes since the Late Jurassic across four datasets: I) manually curated genus ranges, II) published text-mined genus ranges, III) non-revised Paleobiology Database (PBDB) records, IV) revised and augmented PBDB records. All datasets revealed increased origination rates in the Albian, and a twin K-Pg and Danian extinction rate peak. High origination rates in the late Selandian-Ypresian in Dataset I indicate the onset of an ascophoran-grade radiation. Lineage-through-time plots confirm the macroevolutionary lag preceding the radiation of cheilostomes in the mid-Cretaceous, and their renewed diversification in the late Paleocene and Eocene. A multivariate birth-death model indicates that origination rates are shaped by diversity-dependent dynamics coupled with a positive correlation with sea surface temperature, while extinction rates negatively correlate with sea level. Text-mined data provide broadly similar rate dynamics as manually curated data, although discrepancies could be attributed to the omission of key literature in Dataset II, and the inclusion of new published and unpublished data, and revised ranges in Dataset I. Revision and augmentation of PBDB occurrences were necessary to generate rate profiles akin to those of Datasets I and II and highlight the risks of using unedited occurrence data. Our results support the widely held assumption that diversification dynamics are controlled by both biotic and abiotic factors and pave the way for integrating fossils with molecular phylogenies to study these processes in more detail.
This dataset accompanies an analysis of fossil information loss across the Cenozoic and integrates palaeoclimatic reconstructions, lithological sedimentary data, and fossil occurrences. The dataset includes processed climatic variables (temperature and precipitation) derived from the HadCM3 model at 14 geological intervals from 66 to 0 Ma, categorized into Köppen-Geiger climate zones. Lithological data were extracted from a generalized global geological map (Chorlton, 2007) to isolate sedimentary formations with fossil preservation potential. Fossil occurrence records were obtained from the Paleobiology Database (PBDB) as of February 2025. The R script provided merges and analyses these data layers to assess the spatial-temporal overlap between climate zones, sedimentary coverage, and fossil distribution. Outputs include estimates of information loss in the fossil record due to the absence of suitable depositional environments within specific climate zones. This dataset facilitates repr...
# Data from: Silent past: Biogeographic gaps in the Cenozoic fossil archive
Dataset DOI: 10.5061/dryad.34tmpg4wk
This dataset was generated as part of the analyses presented in “Silent Past: Biogeographic Gaps in the Cenozoic Fossil Archive” (Palaeogeography, Palaeoclimatology, Palaeoecology, 2025). The data compilation integrates paleoclimatic model outputs, sedimentary basin reconstructions, and fossil occurrence data to explore the spatial and environmental representativeness of the Cenozoic fossil record.
Specifically, the dataset combines:
Occurrence dataset: A relatively large (~1500) dataset of fossil mammal occurrence data for the Paleocene, Eocene and Oligocene (66 Ma - 23 Ma) of Mongolia and Northern China above 30 degrees North. Occurrence data comprise species or genus name, specimen information where possible, the geological unit the specimen was found in, age (range) of specimen and/or geological unit and any other relevant information. Data were taken from multiple sources. The majority comes from the Palaeobiology Database (PBDB), an open-access community dataset of global fossil occurrences (and some trait data) for all time periods and taxonomic groups. Our dataset used only the mammal records from our study region and time period. A very small amount of data (tens of occurrences) was taken from the NOW (New and Old Worlds) Database of fossil mammals (NOW database), another open-access community dataset. This database contains only occurrence and trait data for fossil mammals throughout geological history and across the world. Additional occurrence data (~100) were collected first hand from the literature by Dr Gemma Benevento.
Body Size dataset: Lower first molar (m1) length and width (which can be used to estimate mammal body size) were collected for approximately 60% of the individual species in the occurrence dataset (~430 species).
Conservation planners and resource managers are concerned about ecological resilience and survival of species as climate and sea level change. The fossil record contains an excellent means to test species responses to changing conditions. This dataset utilizes molluscan faunal data extracted from a fossil database – the Paleobiology Database (PBDB; https://paleobiodb.org/classic) – for the late Pleistocene through Holocene (129,000 years before present (ybp) to present), limited to the south Florida region, as a way to address the question of how many molluscan taxa survived the significant changes to Florida’s coastline over approximately the last 129,000 years. The initial PBDB download was cleaned by eliminating duplicate entries and invalid taxa. After the data cleaning and validation, 347 taxa remained (327 late Pleistocene, and 20 Holocene); of these, 314 are considered valid taxa for this study (294 late Pleistocene, 20 Holocene). The remaining 33 taxa had some uncertainty in their taxonomic standing that could not be resolved, but the names were retained for portions of the analysis. All 347 taxa were compared to databases and published lists of extant mollusks to determine which taxa have survived to the present, and if they are still found within Florida. When only the 314 valid species are examined for the late Pleistocene and Holocene, 93% of the taxa are still alive today, indicating survival throughout the last glacial cycle; 7% went extinct; and <1% were locally extirpated. Surviving species drop to 86% and extinct species rise to 13% if the 33 uncertain taxa are included for the late Pleistocene and Holocene. If just the late Pleistocene (0.129 Ma to 0.0117 Ma) valid taxa are compared to extant fauna, 92% survived, 8% went extinct, and less than 1% were locally extirpated.
These data suggest that the molluscan fauna of south Florida are relatively resilient to significant changes, information that can be of value as resource managers develop conservation plans for changing conditions. The work described here is funded by the Greater Everglades Priority Ecosystem Science program of the USGS.
https://spdx.org/licenses/CC0-1.0.html
Large-scale analysis of the fossil record requires aggregation of palaeontological data from individual fossil localities. Prior to computers these synoptic datasets were compiled by hand, a laborious undertaking that took years of effort and forced palaeontologists to make difficult choices about what types of data to tabulate. The advent of desktop computers ushered in palaeontology’s first digital revolution – online literature-based databases, such as the Paleobiology Database (PBDB). However, the published literature represents only a small proportion of the palaeontological data housed in museum collections. Although this issue has long been appreciated, the magnitude, and thus potential significance, of these so-called “dark data” has been difficult to determine. Here, in the early phases of a second digital revolution in palaeontology – the digitization of museum collections – we provide an estimate of the magnitude of palaeontology’s dark data. Digitization of our nine institutions’ holdings of Cenozoic marine invertebrate collections from California, Oregon, and Washington in the United States reveals that they represent 23 times the number of unique localities currently available in the Paleobiology Database. These data, and the vast quantity of similarly untapped dark data in other museum collections, will, when digitally mobilized, enhance palaeontologists’ ability to make inferences about the patterns and processes of past evolutionary and ecological changes.
Occurrence dataset: A large dataset of fossil mammal occurrence data for the Quaternary (Pleistocene and Holocene) of Europe. Occurrence data comprise species or genus name, specimen information where possible, the geological unit the specimen was found in, age (range) of specimen and/or geological unit and any other relevant information. Data were taken from multiple sources, including the Palaeobiology Database (PBDB), an open-access community dataset of global fossil occurrences (and some trait data) for all time periods and taxonomic groups. Our dataset used only the mammal records from our study region and time period. Data were also taken from the NOW (New and Old Worlds) Database of fossil mammals (NOW database), another open-access community dataset. This database contains only occurrence and trait data for fossil mammals throughout geological history and across the world. All additional occurrence data were collected first hand from the literature.
Trait dataset: Trait data for species in the occurrence dataset, including (but not limited to) body size data, collected as lower first molar length and width.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Short R code and lists of trace fossils, marine taxa, and egg taxa that were removed from our data before analysis
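The filtering step described here can be sketched as follows. This is an illustrative Python analogue, not the authors' R code: the record layout, the `taxon` field name, and the example names are all hypothetical.

```python
def remove_excluded(occurrences, excluded_taxa, key="taxon"):
    """Return the occurrence records whose taxon name is not on the exclusion list.

    Names are compared case-insensitively after trimming whitespace.
    The 'taxon' field name is an assumption made for this sketch.
    """
    excluded = {name.strip().lower() for name in excluded_taxa}
    return [occ for occ in occurrences if occ[key].strip().lower() not in excluded]


# Made-up records, for illustration only.
records = [
    {"taxon": "Allosaurus", "collection": 1},
    {"taxon": "Grallator", "collection": 2},  # a trace fossil (footprint) name
    {"taxon": "Camarasaurus", "collection": 3},
]
kept = remove_excluded(records, ["Grallator"])
print([occ["taxon"] for occ in kept])
```

Keeping the exclusion lists as separate files, as this dataset does, means the same filter can be rerun unchanged when the occurrence download is refreshed.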
Understanding how biodiversity has changed through time and space is a central aim of paleobiology. To elucidate accurate biodiversity patterns in deep time, regional case studies, where sampling biases can be minimized, are needed. The Upper Jurassic Morrison Formation of the western USA crops out over 1.2 million km2 and covers 12 degrees of latitude. It was deposited over a ~9-million-year time period and was home to some of the most iconic dinosaurs. Utilizing a new, high-resolution chronostratigraphic framework for the formation, tetrapod occurrences from the Paleobiology Database were temporally and spatially mapped to examine patterns of diversity change through time and space, and the geographic ranges of taxa were examined to shed light on niche partitioning. Latitudinally, diversity was found to peak in the center of the basin, perhaps due to the availability of water resources. Diversity increased over time in the Morrison Formation, and there is no evidence to indicate a dec...
All vertebrate occurrences in the Morrison Formation were downloaded from the Paleobiology Database (PBDB; paleobiodb.org; accessed 23/12/2022). The data were visually inspected and occurrences related to eggshells or tracks were removed, leaving only those pertaining to body fossils. This resulted in 1397 occurrences. Taxonomy was cleansed following the recent literature. Occurrences were manually attributed to systems tracts described in Maidment & Muxworthy (2019) based on stratigraphic logs or descriptions in the literature for each locality and supplemented with first-hand observations of a number of quarries. A full list of quarries, systems tracts, and references for the stratigraphic location are provided in the spreadsheet “Quarry data.csv” in the Online Supplementary Material available with the manuscript.
As not all references provided stratigraphic logs or descriptions, it was not always possible to attribute quarries to stratigraphic locations, but 1144 occurrences (82%...
# Data and code for: Diversity through space and time in the Upper Jurassic Morrison Formation, western USA
https://doi.org/10.5061/dryad.6m905qg77
This dataset provides raw data and code for all analyses carried out in the above paper. There are eight .xlsx files that contain raw data, and four scripts that implement the analyses carried out in R.
The R script 'Diversity_analysis_code.R' plots raw generic occurrences, tetrapod-bearing collections and abundance against latitude and systems tract, and carries out correlation tests to examine whether these are statistically correlated with each other. It uses the data files "Genera_with_latitude.xlsx", "Collections with time.xlsx", "Corrected abundance with time.xlsx", "Collections_with_latitude.xlsx" and "Corrected abundance with latitude.xlsx".
The R script 'iNext_code.R' sample standardizes the raw generic occurrence data for each degree of latitude and for each systems tract using the iNE...
This data set contains abundance data for fossil mollusk genera from the Late Cretaceous of the U.S. Coastal Plain published by Sohl and Koch (1983, 1984, 1987). It also contains global stratigraphic ranges, global geographic ranges, and taxonomic information for genera, downloaded from the Paleobiology Database (PBDB) at http://paleodb.org in February 2008. This data set is used to examine the link between rarity and extinction across the end-Cretaceous mass extinction in Coastal Plain mollusks.
200 years after the naming of the first dinosaur, taxonomic studies remain an important component of dinosaur research. Around 50 new dinosaurs are named each year, and are discovered from across the globe. The rate of new dinosaur discovery shows no signs of slowing, but not all geographic areas and temporal windows have been equally investigated. The potential for new dinosaur discoveries in India and Africa seems particularly high, while the Carnian, when dinosaurs probably originated, and the Middle Jurassic, when the major clades diversified, offer the best opportunities to make discoveries that will fundamentally change our understanding of dinosaur evolution. A major challenge to the discovery of new dinosaurs is funding. Frontier fieldwork is sometimes viewed as too risky to fund, while basic taxonomic work is considered to lack impact. As a consequence, we risk an ‘extinction of experience’, where researchers have limited training in the basic field and specimen-based research ...
Collector curves – All dinosaur regular genera and species, both valid and invalid, were downloaded from the Paleobiology Database (PBDB; paleobiodb.org) on 17th December 2024. The data were cleaned to remove Avialae, ichnotaxa, and ootaxa. Taxa that were listed as invalid due to misspellings, obsolete variants, or that were renamed for grammatical or linguistic reasons were removed. Nomina dubia, nomina nuda, objective and subjective synonyms, and recombinations were retained. Collector curves (Fig. 1) were built in R 3.4.0 [124]. Code and raw data are available in the Supplementary Material.
Time-calibrated phylogeny – A consensus dinosaur phylogeny was manually produced in Mesquite [125]. First and last appearance data were collected for all taxa in the phylogeny and are listed in the data file provided in the Supplementary Material.
First and last appearances generally correspond to the earliest and latest dates of the Stage from which the taxon is known, unless more accurate info...
# New frontiers in dinosaur exploration
https://doi.org/10.5061/dryad.05qfttfd3
These data were collected to review the state of dinosaur taxonomy and systematics today, as part of an invited review titled 'New Frontiers in Dinosaur Exploration'. The raw data tables in xlsx and csv format were downloaded from the Paleobiology Database or Scopus and then cleansed according to the methods provided here and in the publication. The .txt file was compiled from the literature, while the .nex file is a phylogenetic tree that represents a consensus dinosaur phylogeny and was hand-built in Mesquite.
Description: A file showing the first and last appearance data for dinosaur taxa in the phylogenetic tree. This file is needed for time-calibration of the phylogenetic tree (DinotreeR1.nex) and is used in the code "Time-calibration_palaeotree.R".