100+ datasets found
  1. Classification of Mars Terrain Using Multiple Data Sources - Dataset - NASA...

    • data.nasa.gov
    Updated Mar 31, 2025
    + more versions
    Cite
    nasa.gov (2025). Classification of Mars Terrain Using Multiple Data Sources - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/classification-of-mars-terrain-using-multiple-data-sources
    Explore at:
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    Classification of Mars Terrain Using Multiple Data Sources. Alan Kraut, David Wettergreen. Abstract: Images of Mars are being collected faster than they can be analyzed by planetary scientists. Automatic analysis of images would enable more rapid and more consistent image interpretation and could draft geologic maps where none yet exist. In this work we develop a method for incorporating images from multiple instruments to classify Martian terrain into multiple types. Each image is segmented into contiguous groups of similar pixels, called superpixels, with an associated vector of discriminative features. We have developed and tested several classification algorithms to associate a best class to each superpixel. These classifiers are trained using three different manual classifications with between 2 and 6 classes. Automatic classification accuracies of 50 to 80% are achieved in leave-one-out cross-validation across 20 scenes using a multi-class boosting classifier.

  2. Addresses (Open Data)

    • catalog.data.gov
    • data-academy.tempe.gov
    • +11 more
    Updated Nov 22, 2025
    + more versions
    Cite
    City of Tempe (2025). Addresses (Open Data) [Dataset]. https://catalog.data.gov/dataset/addresses-open-data
    Explore at:
    Dataset updated
    Nov 22, 2025
    Dataset provided by
    City of Tempe
    Description

    This dataset is a compilation of address point data for the City of Tempe. It contains a point location and the official address (as defined by the Building Safety Division of Community Development) for all occupiable units and any other official addresses in the City. Several additional attributes may be populated for an address, but not for every address. Contact: Lynn Flaaen-Hanna, Development Services Specialist. Contact E-mail. Link: Map that Lets You Explore and Export Address Data. Data Source: The initial dataset was created by combining several datasets and then reviewing the information to remove duplicates and identify errors. This published dataset is the system of record for Tempe addresses going forward, with the address information being created and maintained by the Building Safety Division of Community Development. Data Source Type: ESRI ArcGIS Enterprise Geodatabase. Preparation Method: N/A. Publish Frequency: Weekly. Publish Method: Automatic. Data Dictionary.

  3. Replication Data for: Scaling Data from Multiple Sources

    • dataone.org
    • dataverse.harvard.edu
    Updated Nov 22, 2023
    Cite
    Enamorado, Ted; Lopez-Moctezuma, Gabriel; Ratkovic, Marc (2023). Replication Data for: Scaling Data from Multiple Sources [Dataset]. http://doi.org/10.7910/DVN/FOUVEL
    Explore at:
    Dataset updated
    Nov 22, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Enamorado, Ted; Lopez-Moctezuma, Gabriel; Ratkovic, Marc
    Description

    We introduce a method for scaling two data sets from different sources. The proposed method estimates a latent factor common to both datasets as well as an idiosyncratic factor unique to each. In addition, it offers a flexible modeling strategy that permits the scaled locations to be a function of covariates, and efficient implementation allows for inference through resampling. A simulation study shows that our proposed method improves over existing alternatives in capturing the variation common to both datasets, as well as the latent factors specific to each. We apply our proposed method to vote and speech data from the 112th U.S. Senate. We recover a shared subspace that aligns with a standard ideological dimension running from liberals to conservatives while recovering the words most associated with each senator's location. In addition, we estimate a word-specific subspace that ranges from national security to budget concerns, and a vote-specific subspace with Tea Party senators on one extreme and senior committee leaders on the other.

  4. Data from: GALLO: An R package for Genomic Annotation and integration of...

    • portalcientifico.unileon.es
    • portalcienciaytecnologia.jcyl.es
    Updated 2020
    Cite
    Fonseca, Pablo, A.S.; Suárez-Vega, Aroa; Marras, Gabriele; Cánovas, Ángela (2020). GALLO: An R package for Genomic Annotation and integration of multiple data source in livestock for positional candidate LOci [Dataset]. https://portalcientifico.unileon.es/documentos/668fc461b9e7c03b01bdb93f
    Explore at:
    Dataset updated
    2020
    Authors
    Fonseca, Pablo, A.S.; Suárez-Vega, Aroa; Marras, Gabriele; Cánovas, Ángela
    Description

    The development of high-throughput sequencing and genotyping methodologies allowed the identification of thousands of genomic regions associated with several complex traits. The integration of multiple sources of biological information is a crucial step required to better understand patterns regulating the development of these traits. Genomic Annotation in Livestock for positional candidate LOci (GALLO) is an R package developed for the accurate annotation of genes and quantitative trait loci (QTLs) located in regions identified in common genomic analyses performed in livestock, such as Genome-Wide Association Studies and transcriptomics using RNA-Sequencing. Moreover, GALLO allows the graphical visualization of gene and QTL annotation results, data comparison among different grouping factors (e.g., methods, breeds, tissues, statistical models, studies, etc.), and QTL enrichment in different livestock species including cattle, pigs, sheep, and chickens, etc. Consequently, GALLO is a useful package for the annotation, identification of hidden patterns across datasets, datamining previously reported associations, as well as the efficient scrutinization of the genetic architecture of complex traits in livestock.

  5. Binance Coin BNB, 1m Full Historical Data

    • kaggle.com
    zip
    Updated Oct 11, 2025
    Cite
    Imran Bukhari (2025). Binance Coin BNB, 1m Full Historical Data [Dataset]. https://www.kaggle.com/datasets/imranbukhari/comprehensive-bnbusd-1m-data/data
    Explore at:
    Available download formats: zip (266775584 bytes)
    Dataset updated
    Oct 11, 2025
    Authors
    Imran Bukhari
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Description

    I am a new developer and I would greatly appreciate your support. If you find this dataset helpful, please consider giving it an upvote!

    Key Features:

    Complete 1m Data: Raw 1m historical data from multiple exchanges, covering the entire trading history of BNBUSD available through their API endpoints. This dataset is updated daily to ensure up-to-date coverage.

    Combined Index Dataset: A unique feature of this dataset is the combined index, which is derived by averaging all other datasets into one (see the attached notebook). This creates the longest continuous, unbroken BNBUSD dataset available on Kaggle, with no gaps and no erroneous values. It gives a much more comprehensive view of the market, i.e. total volume across multiple exchanges.

    Superior Performance: The combined index dataset has demonstrated superior mean absolute error (MAE) performance when training machine learning models, improving on single-source datasets by a whole order of magnitude in MAE.

    Unbroken History: The combined dataset's continuous history is a valuable asset for researchers and traders who require accurate and uninterrupted time series data for modeling or back-testing.

    Figure: BNBUSD Dataset Summary (https://i.imgur.com/aqtuPay.png)

    Figure: Combined Dataset Close Plot (https://i.imgur.com/mnzs2f4.png). This plot illustrates the continuity of the dataset over time, with no gaps in data, making it ideal for time series analysis.

    Included Resources:

    Two Notebooks:

    Dataset Usage and Diagnostics: This notebook demonstrates how to use the dataset and includes a powerful data diagnostics function, which is useful for all time series analyses.

    Aggregating Multiple Data Sources: This notebook walks you through the process of combining multiple exchange datasets into a single, clean dataset. (Currently unavailable, will be added shortly)
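
    The combined index described in this entry can be sketched roughly as follows. This is a minimal illustration, not the dataset author's notebook: it assumes close prices are averaged per timestamp across whichever exchanges report that timestamp and that volumes are summed, and the exchange names, the `combined_index` function, and the sample candles are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def combined_index(series_by_exchange):
    """Average close prices and sum volumes per timestamp across exchanges.

    series_by_exchange maps a (hypothetical) exchange name to a dict of
    {timestamp: (close, volume)}. Averaging closes and summing volumes is
    an assumption made for this sketch.
    """
    closes = defaultdict(list)
    volumes = defaultdict(float)
    for candles in series_by_exchange.values():
        for ts, (close, volume) in candles.items():
            closes[ts].append(close)
            volumes[ts] += volume
    # A timestamp missing on one exchange is still covered if any other has it.
    return {ts: (mean(closes[ts]), volumes[ts]) for ts in sorted(closes)}


# Made-up candles keyed by seconds since some start time.
example = {
    "exchange_a": {0: (100.0, 5.0), 60: (102.0, 4.0)},
    "exchange_b": {60: (104.0, 1.0), 120: (103.0, 2.0)},
}
index = combined_index(example)
```

    Because each timestamp only needs one reporting exchange, gaps in a single source do not create gaps in the combined series, which is the property the description emphasizes.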

  6. DataSheet2_Data Sources for Drug Utilization Research in Brazil—DUR-BRA...

    • frontiersin.figshare.com
    • datasetcatalog.nlm.nih.gov
    xlsx
    Updated Jun 1, 2023
    + more versions
    Cite
    Lisiane Freitas Leal; Claudia Garcia Serpa Osorio-de-Castro; Luiz Júpiter Carneiro de Souza; Felipe Ferre; Daniel Marques Mota; Marcia Ito; Monique Elseviers; Elisangela da Costa Lima; Ivan Ricardo Zimmernan; Izabela Fulone; Monica Da Luz Carvalho-Soares; Luciane Cruz Lopes (2023). DataSheet2_Data Sources for Drug Utilization Research in Brazil—DUR-BRA Study.xlsx [Dataset]. http://doi.org/10.3389/fphar.2021.789872.s002
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Frontiers Media (http://www.frontiersin.org/)
    Authors
    Lisiane Freitas Leal; Claudia Garcia Serpa Osorio-de-Castro; Luiz Júpiter Carneiro de Souza; Felipe Ferre; Daniel Marques Mota; Marcia Ito; Monique Elseviers; Elisangela da Costa Lima; Ivan Ricardo Zimmernan; Izabela Fulone; Monica Da Luz Carvalho-Soares; Luciane Cruz Lopes
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Brazil
    Description

    Background: In Brazil, studies that map electronic healthcare databases in order to assess their suitability for use in pharmacoepidemiologic research are lacking. We aimed to identify, catalogue, and characterize Brazilian data sources for Drug Utilization Research (DUR).

    Methods: The present study is part of the project entitled, “Publicly Available Data Sources for Drug Utilization Research in Latin American (LatAm) Countries.” A network of Brazilian health experts was assembled to map secondary administrative data from healthcare organizations that might provide information related to medication use. A multi-phase approach including internet search of institutional government websites, traditional bibliographic databases, and experts’ input was used for mapping the data sources. The reviewers searched, screened and selected the data sources independently; disagreements were resolved by consensus. Data sources were grouped into the following categories: 1) automated databases; 2) Electronic Medical Records (EMR); 3) national surveys or datasets; 4) adverse event reporting systems; and 5) others. Each data source was characterized by accessibility, geographic granularity, setting, type of data (aggregate or individual-level), and years of coverage. We also searched for publications related to each data source.

    Results: A total of 62 data sources were identified and screened; 38 met the eligibility criteria for inclusion and were fully characterized. We grouped 23 (60%) as automated databases, four (11%) as adverse event reporting systems, four (11%) as EMRs, three (8%) as national surveys or datasets, and four (11%) as other types. Eighteen (47%) were classified as publicly and conveniently accessible online; providing information at national level. Most of them offered more than 5 years of comprehensive data coverage, and presented data at both the individual and aggregated levels. No information about population coverage was found. Drug coding is not uniform; each data source has its own coding system, depending on the purpose of the data. At least one scientific publication was found for each publicly available data source.

    Conclusions: There are several types of data sources for DUR in Brazil, but a uniform system for drug classification and data quality evaluation does not exist. The extent of population covered by year is unknown. Our comprehensive and structured inventory reveals a need for full characterization of these data sources.

  7. Matched Sentinel-2 spectral data and chlorophyll a concentrations 2015-2020

    • catalog.data.gov
    • datasets.ai
    Updated Sep 3, 2023
    + more versions
    Cite
    U.S. EPA Office of Research and Development (ORD) (2023). Matched Sentinel-2 spectral data and chlorophyll a concentrations 2015-2020 [Dataset]. https://catalog.data.gov/dataset/matched-sentinel-2-spectral-data-and-chlorophyll-a-concentrations-2015-2020
    Explore at:
    Dataset updated
    Sep 3, 2023
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    The dataset includes Sentinel-2 spectral data for all bands spatiotemporally matched with available chlorophyll a concentration data from several data sources including the Water Quality Portal.

  8. Data from: Multimorbidity in Australia: Comparing estimates derived using...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Aug 29, 2017
    Cite
    Zwar, Nicholas; Jorm, Louisa; Lujic, Sanja; Hosseinzadeh, Hassan; Simpson, Judy M. (2017). Multimorbidity in Australia: Comparing estimates derived using administrative data sources and survey data [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001779669
    Explore at:
    Dataset updated
    Aug 29, 2017
    Authors
    Zwar, Nicholas; Jorm, Louisa; Lujic, Sanja; Hosseinzadeh, Hassan; Simpson, Judy M.
    Area covered
    Australia
    Description

    Background: Estimating multimorbidity (presence of two or more chronic conditions) using administrative data is becoming increasingly common. We investigated (1) the concordance of identification of chronic conditions and multimorbidity using self-report survey and administrative datasets; (2) characteristics of people with multimorbidity ascertained using different data sources; and (3) whether the same individuals are classified as multimorbid using different data sources.

    Methods: Baseline survey data for 90,352 participants of the 45 and Up Study (a cohort study of residents of New South Wales, Australia, aged 45 years and over) were linked to prior two-year pharmaceutical claims and hospital admission records. Concordance of eight self-report chronic conditions (reference) with claims and hospital data were examined using sensitivity (Sn), positive predictive value (PPV), and kappa (κ). The characteristics of people classified as multimorbid were compared using logistic regression modelling.

    Results: Agreement was found to be highest for diabetes in both hospital and claims data (κ = 0.79, 0.78; Sn = 79%, 72%; PPV = 86%, 90%). The prevalence of multimorbidity was highest using self-report data (37.4%), followed by claims data (36.1%) and hospital data (19.3%). Combining all three datasets identified a total of 46,683 (52%) people with multimorbidity, with half of these identified using a single dataset only, and up to 20% identified on all three datasets. Characteristics of persons with and without multimorbidity were generally similar. However, the age gradient was more pronounced and people speaking a language other than English at home were more likely to be identified as multimorbid by administrative data.

    Conclusions: Different individuals, with different combinations of conditions, are identified as multimorbid when different data sources are used. As such, caution should be applied when ascertaining morbidity from a single data source as the agreement between self-report and administrative data is generally poor. Future multimorbidity research exploring specific disease combinations and clusters of diseases that commonly co-occur, rather than a simple disease count, is likely to provide more useful insights into the complex care needs of individuals with multiple chronic conditions.
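
    The three concordance measures named in this abstract (sensitivity, positive predictive value, and kappa) can be computed from a 2x2 table that compares a data source against the self-report reference. The sketch below uses the standard textbook definitions, with Cohen's kappa assumed as the kappa statistic; the `agreement` function and the counts are hypothetical and are not taken from the study.

```python
def agreement(tp, fp, fn, tn):
    """Return (sensitivity, ppv, kappa) for a 2x2 table against a reference.

    tp: condition present in both the reference and the test source
    fp: present in the test source only
    fn: present in the reference only
    tn: absent in both
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    ppv = tp / (tp + fp)
    observed = (tp + tn) / n
    # Chance agreement from the marginal totals (Cohen's kappa).
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (observed - expected) / (1 - expected)
    return sensitivity, ppv, kappa


# Illustrative counts only.
sn, ppv, kappa = agreement(tp=40, fp=10, fn=10, tn=40)
```

    Kappa discounts the agreement expected by chance, which is why it can be low even when sensitivity and PPV look reasonable for a common condition.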

  9. Data from: VISEM

    • huggingface.co
    Updated Jan 31, 2025
    Cite
    Sperm-net (2025). VISEM [Dataset]. https://huggingface.co/datasets/sperm-net/VISEM
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jan 31, 2025
    Dataset authored and provided by
    Sperm-net
    Description

    Dataset Card for VISEM Dataset

    Dataset Details

    Dataset Description

    The VISEM dataset is a multimodal video dataset designed for the analysis of human spermatozoa. It is one of the few open datasets that combine multiple data sources, including videos, biological analysis data, and participant-related information. The dataset consists of anonymized data from 85 different participants, with a focus on improving research in human reproduction, particularly male… See the full description on the dataset page: https://huggingface.co/datasets/sperm-net/VISEM.

  10. Transportation Projects in Your Neighborhood

    • catalog.data.gov
    • datasets.ai
    • +3 more
    Updated Jul 19, 2025
    Cite
    State of New York (2025). Transportation Projects in Your Neighborhood [Dataset]. https://catalog.data.gov/dataset/transportation-projects-in-your-neighborhood
    Explore at:
    Dataset updated
    Jul 19, 2025
    Dataset provided by
    State of New York
    Description

    This data set contains DOT construction project information. The data is refreshed nightly from multiple data sources, so any downloaded copy becomes stale rather quickly.

  11. Bitcoin BTC, 7 Exchanges, 1h Full Historical Data

    • kaggle.com
    Updated Sep 9, 2025
    + more versions
    Cite
    Imran Bukhari (2025). Bitcoin BTC, 7 Exchanges, 1h Full Historical Data [Dataset]. https://www.kaggle.com/datasets/imranbukhari/comprehensive-btcusd-1h-data
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 9, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Imran Bukhari
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Description

    I am a new developer and I would greatly appreciate your support. If you find this dataset helpful, please consider giving it an upvote!

    Key Features:

    Complete 1h Data: Raw 1h historical data from multiple exchanges, covering the entire trading history of BTCUSD available through their API endpoints. This dataset is updated daily to ensure up-to-date coverage.

    Combined Index Dataset: A unique feature of this dataset is the combined index, which is derived by averaging all other datasets into one (see the attached notebook). This creates the longest continuous, unbroken BTCUSD dataset available on Kaggle, with no gaps and no erroneous values. It gives a much more comprehensive view of the market, i.e. total volume across multiple exchanges.

    Superior Performance: The combined index dataset has demonstrated superior mean absolute error (MAE) performance when training machine learning models, improving on single-source datasets by a whole order of magnitude in MAE.

    Unbroken History: The combined dataset's continuous history is a valuable asset for researchers and traders who require accurate and uninterrupted time series data for modeling or back-testing.

    Figure: BTCUSD Dataset Summary (https://i.imgur.com/OVOyF5A.png)

    Figure: Combined Dataset Close Plot (https://i.imgur.com/6hxG2G3.png). This plot illustrates the continuity of the dataset over time, with no gaps in data, making it ideal for time series analysis.

    Included Resources:

    Two Notebooks:

    Dataset Usage and Diagnostics: This notebook demonstrates how to use the dataset and includes a powerful data diagnostics function, which is useful for all time series analyses.

    Aggregating Multiple Data Sources: This notebook walks you through the process of combining multiple exchange datasets into a single, clean dataset. (Currently unavailable, will be added shortly)

  12. Ethereum ETH, 7 Exchanges, 1h Full Historical Data

    • kaggle.com
    zip
    Updated Oct 11, 2025
    Cite
    Imran Bukhari (2025). Ethereum ETH, 7 Exchanges, 1h Full Historical Data [Dataset]. https://www.kaggle.com/datasets/imranbukhari/comprehensive-ethusd-1h-data/code
    Explore at:
    Available download formats: zip (16024314 bytes)
    Dataset updated
    Oct 11, 2025
    Authors
    Imran Bukhari
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Description

    I am a new developer and I would greatly appreciate your support. If you find this dataset helpful, please consider giving it an upvote!

    Key Features:

    Complete 1h Data: Raw 1h historical data from multiple exchanges, covering the entire trading history of ETHUSD available through their API endpoints. This dataset is updated daily to ensure up-to-date coverage.

    Combined Index Dataset: A unique feature of this dataset is the combined index, which is derived by averaging all other datasets into one (see the attached notebook). This creates the longest continuous, unbroken ETHUSD dataset available on Kaggle, with no gaps and no erroneous values. It gives a much more comprehensive view of the market, i.e. total volume across multiple exchanges.

    Superior Performance: The combined index dataset has demonstrated superior mean absolute error (MAE) performance when training machine learning models, improving on single-source datasets by a whole order of magnitude in MAE.

    Unbroken History: The combined dataset's continuous history is a valuable asset for researchers and traders who require accurate and uninterrupted time series data for modeling or back-testing.

    Figure: ETHUSD Dataset Summary (https://i.imgur.com/1Qgdoqo.png)

    Figure: Combined Dataset Close Plot (https://i.imgur.com/RDKMDjo.png). This plot illustrates the continuity of the dataset over time, with no gaps in data, making it ideal for time series analysis.

    Included Resources:

    Two Notebooks:

    Dataset Usage and Diagnostics: This notebook demonstrates how to use the dataset and includes a powerful data diagnostics function, which is useful for all time series analyses.

    Aggregating Multiple Data Sources: This notebook walks you through the process of combining multiple exchange datasets into a single, clean dataset. (Currently unavailable, will be added shortly)

  13. Data from: Mental Health United States 2010

    • catalog.data.gov
    • data.virginia.gov
    • +1 more
    Updated Sep 6, 2025
    + more versions
    Cite
    Substance Abuse and Mental Health Services Administration (2025). Mental Health United States 2010 [Dataset]. https://catalog.data.gov/dataset/mental-health-united-states-2010
    Explore at:
    Dataset updated
    Sep 6, 2025
    Dataset provided by
    Substance Abuse and Mental Health Services Administration (https://www.samhsa.gov/)
    Area covered
    United States
    Description

    This publication provides behavioral health statistics at the national and state levels from multiple data sources, including the National Survey on Drug Use and Health, the National Health Interview Survey, the Medical Expenditure Panel Survey, and the National Association of State Mental Health Program Directors, as well as peer-reviewed journal articles.

  14. A+ Schools Report to the Community

    • datasets.ai
    • data.wprdc.org
    • +2 more
    33
    Updated Jan 24, 2023
    + more versions
    Cite
    Allegheny County / City of Pittsburgh / Western PA Regional Data Center (2023). A+ Schools Report to the Community [Dataset]. https://datasets.ai/datasets/a-schools-report-to-the-community
    Explore at:
    33Available download formats
    Dataset updated
    Jan 24, 2023
    Dataset authored and provided by
    Allegheny County / City of Pittsburgh / Western PA Regional Data Center
    Description

    This report consolidates information from multiple data sources including PPS, PDE and Pittsburgh charter schools. Data is obtained through downloads from the web or through data requests. Raw data used to generate the reports will be made available as the files are processed.

  15. Data from: ISLSCP II Global Population of the World

    • data.wu.ac.at
    • search.dataone.org
    • +6 more
    bin
    Updated Apr 19, 2010
    + more versions
    Cite
    National Aeronautics and Space Administration (2010). ISLSCP II Global Population of the World [Dataset]. https://data.wu.ac.at/schema/data_gov/OWZkOTYyZWMtMTI4MS00MmFmLTg3YjItOWFkODIzM2NkNjkz
    Explore at:
    Available download formats: bin
    Dataset updated
    Apr 19, 2010
    Dataset provided by
    National Aeronautics and Space Administration
    Description

    Global Population of the World (GPW) translates census population data to a latitude-longitude grid so that population data may be used in cross-disciplinary studies. There are three data files with this data set for the reference years 1990 and 1995. Over 127,000 administrative units and population counts were collected and integrated from various sources to create the gridded data. In brief, GPW was created using the following steps:

    * Population data were estimated for the product reference years, 1990 and 1995, either by the data source or by interpolating or extrapolating the given estimates for other years.
    * Additional population estimates were created by adjusting the source population data to match UN national population estimates for the reference years.
    * Borders and coastlines of the spatial data were matched to the Digital Chart of the World where appropriate and lakes from the Digital Chart of the World were added.
    * The resulting data were then transformed into grids of UN-adjusted and unadjusted population counts for the reference years.
    * Grids containing the area of administrative boundary data in each cell (net of lakes) were created and used with the count grids to produce population densities.

    As with any global data set based on multiple data sources, the spatial and attribute precision of GPW is variable. The level of detail and accuracy, both in time and space, vary among the countries for which data were obtained.

  16. Data from: Compiled Database and Results of the Analysis of Multiple...

    • s.cnmilf.com
    • data.usgs.gov
    • +1 more
    Updated Oct 8, 2025
    Cite
    U.S. Geological Survey (2025). Compiled Database and Results of the Analysis of Multiple Groundwater-Quality Datasets for Idaho [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/compiled-database-and-results-of-the-analysis-of-multiple-groundwater-quality-datasets-for
    Explore at:
    Dataset updated
    Oct 8, 2025
    Dataset provided by
    U.S. Geological Survey
    Description

    Groundwater is an important source of drinking and irrigation water throughout Idaho, and groundwater quality is monitored by various Federal, State, and local agencies. The historical, multi-agency records of groundwater quality include a valuable dataset that has yet to be compiled or analyzed on a statewide level. The purpose of this study is to combine groundwater-quality data from multiple sources into a single database, to summarize this dataset, and to perform bulk analyses to reveal spatial and temporal patterns of water quality throughout Idaho. Data were retrieved from the Water Quality Portal (www.waterqualitydata.us), the Idaho Department of Environmental Quality, and the Idaho Department of Water Resources. Analyses included counting the number of times a sample location had concentrations above Maximum Contaminant Levels (MCL), performing trend tests, and calculating correlations between water-quality analytes.
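
    The exceedance count mentioned in this description can be illustrated with a short sketch. It is not the USGS workflow: the record layout `(location, analyte, concentration)`, the `count_exceedances` function, the well names, and the sample values are hypothetical, and the limits in the lookup table are included only as illustrative inputs.

```python
from collections import Counter


def count_exceedances(samples, mcl):
    """Count, per sample location, the results above the analyte's limit.

    samples: iterable of (location, analyte, concentration) tuples
    mcl: dict mapping analyte name to its limit, in the same units as
         the concentrations (assumed here to be mg/L)
    """
    counts = Counter()
    for location, analyte, concentration in samples:
        limit = mcl.get(analyte)
        # Analytes without a listed limit are skipped rather than guessed.
        if limit is not None and concentration > limit:
            counts[location] += 1
    return counts


# Illustrative limits and made-up results.
limits = {"nitrate": 10.0, "arsenic": 0.010}
results = [
    ("well_1", "nitrate", 12.0),
    ("well_1", "arsenic", 0.004),
    ("well_2", "nitrate", 3.0),
    ("well_1", "nitrate", 11.0),
]
exceedances = count_exceedances(results, limits)
```

    Keeping the limits in a separate lookup table makes it straightforward to rerun the count when records from another agency are merged into the same database.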

  17. 2MASS Survey Merged Point Source Information Table

    • catalog.data.gov
    Updated Sep 19, 2025
    + more versions
    Cite
    NASA/IPAC Infrared Science Archive (2025). 2MASS Survey Merged Point Source Information Table [Dataset]. https://catalog.data.gov/dataset/2mass-survey-merged-point-source-information-table
    Explore at:
    Dataset updated
    Sep 19, 2025
    Dataset provided by
    NASA/IPAC Extragalactic Database
    Description

    The merged source tables contain the mean positions, magnitudes, and uncertainties for sources detected multiple times in each of the 2MASS data sets. The merging was carried out using an autocorrelation of the respective databases to identify groups of extractions that are positionally associated with each other, all lying within a 1.5" radius circular region. A number of confirmation statistics are also provided in the tables that can be used to test for source motion and/or variability, and the general quality of the merge.
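
    The idea of positional association within a 1.5" radius can be sketched as below. This is a simplified greedy grouping, not the autocorrelation procedure used for the 2MASS tables: each detection joins the first group whose initial member lies within the radius, and the group is then averaged. The function names and the sample detections are hypothetical.

```python
import math


def separation_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation in arcseconds between two positions in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which stays accurate for very small separations.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h))) * 3600


def merge_detections(detections, radius_arcsec=1.5):
    """Group (ra, dec, mag) detections by position and average each group.

    The plain average of RA ignores wraparound at 0/360 degrees, which is
    acceptable only for a sketch.
    """
    groups = []
    for det in detections:
        for group in groups:
            ra0, dec0, _ = group[0]
            if separation_arcsec(ra0, dec0, det[0], det[1]) <= radius_arcsec:
                group.append(det)
                break
        else:
            groups.append([det])
    return [tuple(sum(col) / len(group) for col in zip(*group))
            for group in groups]


# Two detections about 0.3" apart and a third roughly 34" away (made up).
detections = [(10.0, 20.0, 12.0), (10.0001, 20.0, 12.2), (10.01, 20.0, 15.0)]
merged = merge_detections(detections)
```

    Comparing the scatter of positions or magnitudes inside each group is the kind of check that the confirmation statistics mentioned in the description support.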

  18. International Comprehensive Ocean-Atmosphere Data Set (ICOADS)...

    • catalog.data.gov
    • ncei.noaa.gov
    Updated Sep 16, 2023
    + more versions
    Cite
    DOC/NOAA/NESDIS/NCEI > National Centers for Environmental Information, NESDIS, NOAA, U.S. Department of Commerce (Point of Contact) (2023). International Comprehensive Ocean-Atmosphere Data Set (ICOADS) Near-Real-Time (NRT) - Daily, Release 3.0.2 [Dataset]. https://catalog.data.gov/dataset/international-comprehensive-ocean-atmosphere-data-set-icoads-near-real-time-nrt-daily-release-3
    Dataset updated
    Sep 16, 2023
    Dataset provided by
    National Oceanic and Atmospheric Administration (http://www.noaa.gov/)
    National Centers for Environmental Information (https://www.ncei.noaa.gov/)
    United States Department of Commerce (http://commerce.gov/)
    National Environmental Satellite, Data, and Information Service
    Description

    The International Comprehensive Ocean-Atmosphere Data Set (ICOADS) is the world's most extensive surface marine meteorological data collection. Building on national and international partnerships, ICOADS provides a variety of user communities with easy access to many different data sources in a consistent format. Data sources range from early historical ship observations to more modern, automated measurement systems including moored buoys and surface drifters. Past versions of the ICOADS dataset were published as monthly files, while a daily version of the product was held for internal use only. NCEI has since developed a reformatted daily product of the dataset that now aligns with the monthly product and is ready for public use. The objective of this initiative is to sustain the quality and usability of this high-profile ICOADS product for stakeholders who have requested an expanded product. ICOADS R3.0.2 Daily is now developed and released.

  19. EU MPA Paper submission Aminian Biquet et al

    • figshare.com
    xlsx
    Updated Aug 16, 2024
    Cite
    Juliette Aminian Biquet (2024). EU MPA Paper submission Aminian Biquet et al [Dataset]. http://doi.org/10.6084/m9.figshare.25103450.v1
    Available download formats: xlsx
    Dataset updated
    Aug 16, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Juliette Aminian Biquet
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Rdata and RMD file for the submission to One Earth by Aminian Biquet et al. See the PDF file for a description of the data files. For 1) the entire dataset containing regulations at activity levels, identifiers of other databases, etc., and 2) the detailed description of raw data sources and protocol, see the publication (in prep. for Data in Brief): Regulations of activities and protection levels in Marine Protected Areas of the European Union gathered from multiple data sources. Aminian-Biquet et al. In prep.

  20. U.S. Community Water Systems Service Boundaries, v1.0.0

    • hydroshare.org
    • beta.hydroshare.org
    zip
    Updated May 18, 2022
    Cite
    HydroShare (2022). U.S. Community Water Systems Service Boundaries, v1.0.0 [Dataset]. https://www.hydroshare.org/resource/5485b0f0278547068972ac7289547ab1
    Available download formats: zip (202.0 MB)
    Dataset updated
    May 18, 2022
    Dataset provided by
    HydroShare
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This is a layer of water service boundaries for 44,919 community water systems that deliver tap water to 306.88 million people in the US. This amounts to 97.22% of the population reportedly served by active community water systems and 90.85% of active community water systems. The layer is based on multiple data sources and a methodology developed by SimpleLab and collaborators called a Tiered, Explicit, Match, and Model approach, or TEMM for short. The name of the approach reflects exactly how the nationwide data layer was developed. The TEMM is composed of three hierarchical tiers, arranged by data and model fidelity. First, we use explicit water service boundaries provided by states. These are spatial polygon data, typically provided at the state level. We call systems with explicit boundaries Tier 1. In the absence of explicit water service boundary data, we use a matching algorithm to match water systems to the boundary of a town or city (Census Place TIGER polygons). When a water system and TIGER place match one-to-one, we label this Tier 2a. When multiple water systems match to the same TIGER place, we label this Tier 2b. Tier 2b reflects overlapping boundaries for multiple systems. Finally, in the absence of an explicit water service boundary (Tier 1) or a TIGER place polygon match (Tier 2a or Tier 2b), a statistical model trained on explicit water service boundary data (Tier 1) is used to estimate a reasonable radius at provided water system centroids, and model a spherical water system boundary (Tier 3).
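
    The tier hierarchy described above can be sketched as a simple decision rule. This is an illustration only, not the code in the SimpleLab repository: the `WaterSystem` fields, the identifiers, and the function `assign_tiers` are hypothetical, and the sketch assumes that only systems without an explicit boundary are counted when deciding between Tier 2a and Tier 2b.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class WaterSystem:
    # Hypothetical record layout, for illustration only.
    system_id: str
    has_explicit_boundary: bool = False
    matched_place: Optional[str] = None  # identifier of a matched TIGER place, if any
    has_centroid: bool = False


def assign_tiers(systems):
    """Return {system_id: tier label}; None means the system cannot be placed."""
    # Assumption: only systems lacking an explicit boundary compete for a place match.
    place_counts = Counter(
        s.matched_place
        for s in systems
        if not s.has_explicit_boundary and s.matched_place is not None
    )
    tiers = {}
    for s in systems:
        if s.has_explicit_boundary:
            tiers[s.system_id] = "Tier 1"
        elif s.matched_place is not None:
            one_to_one = place_counts[s.matched_place] == 1
            tiers[s.system_id] = "Tier 2a" if one_to_one else "Tier 2b"
        elif s.has_centroid:
            tiers[s.system_id] = "Tier 3"  # boundary would be modeled around the centroid
        else:
            tiers[s.system_id] = None
    return tiers


systems = [
    WaterSystem("S1", has_explicit_boundary=True),
    WaterSystem("S2", matched_place="place-1"),
    WaterSystem("S3", matched_place="place-2"),
    WaterSystem("S4", matched_place="place-2"),
    WaterSystem("S5", has_centroid=True),
    WaterSystem("S6"),
]
tiers = assign_tiers(systems)
```

    The checks run in order of fidelity, so a system with an explicit boundary is labeled Tier 1 even if it also matches a place or has a centroid.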

    Several limitations to this data exist, and the layer should be used with these in mind. First, the case of assigning a Census Place TIGER polygon to multiple systems results in an inaccurate assignment of the same exact area to multiple systems; we hope to resolve Tier 2b systems into Tier 2a or Tier 3 in a future iteration. Second, matching algorithms to assign Census Place boundaries require additional validation and iteration. Third, Tier 3 boundaries have modeled radii stemming from a lat/long centroid of a water system facility, but the underlying lat/long centroids for water system facilities are of variable quality. It is critical to evaluate the "geometry quality" column (included from the EPA ECHO data source) when looking at Tier 3 boundaries; fidelity is very low when geometry quality is a county or state centroid, but we did not exclude the data from the layer. Fourth, missing water systems are typically those without a centroid, in a U.S. territory, or missing population and connection data. Finally, Tier 1 systems are assumed to be high fidelity, but rely on the accuracy of state data collection and maintenance.

    All data, methods, documentation, and contributions are open-source and available here: https://github.com/SimpleLab-Inc/wsb.
