Sea surface temperature (SST) plays an important role in a number of ecological processes and can vary over a wide range of time scales, from daily to decadal changes. SST influences primary production, species migration patterns, and coral health. If temperatures are anomalously warm for extended periods of time, drastic changes in the surrounding ecosystem can result, including harmful effects such as coral bleaching. This layer represents the standard deviation of SST (degrees Celsius) of the weekly time series from 2000-2013. Three SST datasets were combined to provide continuous coverage from 1985-2013. The concatenation applies bias adjustment derived from linear regression to the overlap periods of datasets, with the final representation matching the 0.05-degree (~5-km) near real-time SST product. First, a weekly composite, gap-filled SST dataset from the NOAA Pathfinder v5.2 SST 1/24-degree (~4-km), daily dataset (a NOAA Climate Data Record) for each location was produced following Heron et al. (2010) for January 1985 to December 2012. Next, weekly composite SST data from the NOAA/NESDIS/STAR Blended SST 0.1-degree (~11-km), daily dataset was produced for February 2009 to October 2013. Finally, a weekly composite SST dataset from the NOAA/NESDIS/STAR Blended SST 0.05-degree (~5-km), daily dataset was produced for March 2012 to December 2013. The standard deviation of the long-term mean SST was calculated by taking the standard deviation over all weekly data from 2000-2013 for each pixel.
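For orientation, a minimal sketch of the final step only (the per-pixel standard deviation over the 2000-2013 weekly stack), assuming the weekly composites are stored in a NetCDF file with an sst variable over (time, lat, lon); the file name is a placeholder and this is not the original processing code:

```python
# Minimal sketch (not the original processing code): per-pixel standard
# deviation of the 2000-2013 weekly SST composites, assuming they are stored
# in a NetCDF file with an "sst" variable over (time, lat, lon).
import xarray as xr

ds = xr.open_dataset("weekly_sst_composites.nc")           # hypothetical path
weekly = ds["sst"].sel(time=slice("2000-01-01", "2013-12-31"))

# Standard deviation over all weekly values, computed independently per pixel;
# skipna=True ignores weeks masked out (e.g., land or missing data).
sst_std = weekly.std(dim="time", skipna=True)
sst_std.to_netcdf("sst_weekly_std_2000_2013.nc")
```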
This part of the data release contains a grid of standard deviations of bathymetric soundings within each 0.5 m x 0.5 m grid cell. The bathymetry was collected on February 1, 2011, in the Sacramento River from the confluence of the Feather River to Knights Landing. The standard deviations represent one component of bathymetric uncertainty in the final digital elevation model (DEM), which is also available in this data release. The bathymetry data were collected by the USGS Pacific Coastal and Marine Science Center (PCMSC) team with collaboration and funding from the U.S. Army Corps of Engineers. This project used interferometric sidescan sonar to characterize the riverbed and channel banks along a 12-mile reach of the Sacramento River near the town of Knights Landing, California (River Mile 79 through River Mile 91), to aid in understanding fish response to the creation of safe habitat associated with levee restoration efforts in two 1.5-mile reaches of the Sacramento River between River Mile 80 and 86.
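As an illustration of how such a grid can be produced (not the USGS workflow), the per-cell standard deviation of soundings can be sketched with SciPy's 2-D binned statistics; the input file name and coordinate units below are assumptions:

```python
# Illustrative sketch only (not the USGS workflow): gridding the standard
# deviation of bathymetric soundings into 0.5 m x 0.5 m cells.
# x, y are assumed projected easting/northing (m); z is the sounding elevation.
import numpy as np
from scipy.stats import binned_statistic_2d

x, y, z = np.loadtxt("soundings.xyz", unpack=True)     # hypothetical input file

cell = 0.5  # grid-cell size in metres
x_edges = np.arange(x.min(), x.max() + cell, cell)
y_edges = np.arange(y.min(), y.max() + cell, cell)

# Standard deviation of all soundings falling in each 0.5 m cell.
std_grid, _, _, _ = binned_statistic_2d(x, y, z, statistic="std",
                                        bins=[x_edges, y_edges])
```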
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This data is part of the Monthly aggregated Water Vapor MODIS MCD19A2 (1 km) dataset. Check the related identifiers section on the Zenodo side panel to access other parts of the dataset.
General Description
The monthly aggregated water vapor dataset is derived from MCD19A2 v061. The water vapor data measure the column above ground retrieved from MODIS near-IR bands at 0.94 μm. The dataset spans 2000 to 2022 and covers the entire globe. It can be used in many applications, such as water cycle modeling, vegetation mapping, and soil mapping. This dataset includes:
Monthly time-series: Derived from MCD19A2 v061, this data provides a monthly aggregated mean and standard deviation of daily water vapor time-series data from 2000 to 2022. Only positive, non-cloudy pixels were considered valid observations when deriving the mean and standard deviation. The remaining no-data values were filled using the TMWM algorithm. This dataset also includes mean and standard deviation values smoothed with the Whittaker method. The quality assessment layers and the number of valid observations for each month indicate the reliability of the monthly mean and standard deviation values.
Yearly time-series: Derived from the monthly time-series, this data provides yearly aggregated statistics of the monthly data.
Long-term data (2000-2022): Derived from the monthly time-series, this data provides long-term aggregated statistics for the whole series of monthly observations.
Data Details
Time period: 2000–2022
Type of data: Water vapor column above the ground (0.001 cm)
How the data was collected or derived: Derived from MCD19A2 v061 using Google Earth Engine. Cloudy pixels were removed and only positive values of water vapor were considered to compute the statistics. The time-series gap-filling and smoothing were computed using the Scikit-map Python package.
Statistical methods used: Four statistics were derived: standard deviation and percentiles 25, 50, and 75.
Limitations or exclusions in the data: The dataset does not include data for Antarctica.
Coordinate reference system: EPSG:4326
Bounding box (Xmin, Ymin, Xmax, Ymax): (-180.00000, -62.00081, 179.99994, 87.37000)
Spatial resolution: 1/120 d.d. = 0.008333333 (1 km)
Image size: 43,200 x 17,924
File format: Cloud Optimized GeoTIFF (COG)
Support
If you discover a bug, artifact, or inconsistency, or if you have a question, please use one of the following channels:
Technical issues and questions about the code: GitLab Issues
General questions and comments: LandGIS Forum
Name convention
To ensure consistency and ease of use across and within the projects, we follow the standard Open-Earth-Monitor file-naming convention. The convention works with 10 fields that describe important properties of the data. In this way, users can search files, prepare data analyses, etc., without needing to open the files. The fields are:
generic variable name: wv = Water vapor
variable procedure combination: mcd19a2v061.seasconv = MCD19A2 v061 with gap-filling algorithm
Position in the probability distribution / variable type: m = mean | sd = standard deviation | n = number of observations | qa = quality assessment
Spatial support: 1km
Depth reference: s = surface
Time reference begin time: 20000101 = 2000-01-01
Time reference end time: 20221231 = 2022-12-31
Bounding box: go = global (without Antarctica)
EPSG code: epsg.4326 = EPSG:4326
Version code: v20230619 = 2023-06-19 (creation date)
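To illustrate how these fields combine into a file name (illustrative only; the published file names may differ in minor details such as the extension):

```python
# Illustrative only: composing a file name from the 10 convention fields
# listed above by joining them with underscores.
fields = [
    "wv",                    # generic variable name: water vapor
    "mcd19a2v061.seasconv",  # variable procedure combination
    "m",                     # variable type: mean (sd, n, qa are alternatives)
    "1km",                   # spatial support
    "s",                     # depth reference: surface
    "20000101",              # time reference, begin
    "20221231",              # time reference, end
    "go",                    # bounding box: global, without Antarctica
    "epsg.4326",             # EPSG code
    "v20230619",             # version code (creation date)
]
print("_".join(fields) + ".tif")
# -> wv_mcd19a2v061.seasconv_m_1km_s_20000101_20221231_go_epsg.4326_v20230619.tif
```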
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Unsupervised exploratory data analysis (EDA) is often the first step in understanding complex data sets. While summary statistics are among the most efficient and convenient tools for exploring and describing sets of data, they are often overlooked in EDA. In this paper, we show multiple case studies that compare the performance, including clustering, of a series of summary statistics in EDA. The summary statistics considered here are pattern recognition entropy (PRE), the mean, standard deviation (STD), 1-norm, range, sum of squares (SSQ), and X4, which are compared with principal component analysis (PCA), multivariate curve resolution (MCR), and/or cluster analysis. PRE and the other summary statistics are direct methods for analyzing data; they are not factor-based approaches. To quantify the performance of summary statistics, we use the concept of the “critical pair,” which is employed in chromatography. The data analyzed here come from different analytical methods. Hyperspectral images, including one of a biological material, are also analyzed. In general, PRE outperforms the other summary statistics, especially in image analysis, although a suite of summary statistics is useful in exploring complex data sets. While PRE results were generally comparable to those from PCA and MCR, PRE is easier to apply. For example, there is no need to determine the number of factors that describe a data set. Finally, we introduce the concept of divided spectrum-PRE (DS-PRE) as a new EDA method. DS-PRE increases the discrimination power of PRE. We also show that DS-PRE can be used to provide the inputs for the k-nearest neighbor (kNN) algorithm. We recommend PRE and DS-PRE as rapid new tools for unsupervised EDA.
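For orientation, a minimal numpy sketch of row-wise summary statistics of the kind named above (mean, standard deviation, 1-norm, range, sum of squares), applied to a toy spectra matrix; PRE and X4 are defined in the paper and are not reproduced here:

```python
# Sketch only: row-wise summary statistics of a data matrix X (one spectrum
# per row). PRE and X4 are defined in the paper and are not reproduced here.
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((50, 200))            # 50 spectra x 200 channels (toy data)

summaries = {
    "mean":   X.mean(axis=1),
    "std":    X.std(axis=1),
    "1-norm": np.abs(X).sum(axis=1),
    "range":  X.max(axis=1) - X.min(axis=1),
    "ssq":    (X ** 2).sum(axis=1),
}
# Each summary reduces every spectrum to a single number that can then be
# plotted, clustered, or compared against PCA/MCR scores.
```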
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents median household incomes for various household sizes in the United States, as reported by the U.S. Census Bureau. The dataset highlights how median household income varies with the size of the family unit, offering valuable insights into economic trends and disparities across household sizes and aiding data analysis and decision-making.
Key observations
[Chart: United States median household income, by household size (in 2022 inflation-adjusted dollars)]
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
Household Sizes:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for United States median household income. You can refer to it here.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Walmart Dataset (Retail)’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/rutuspatel/walmart-dataset-retail on 28 January 2022.
--- Dataset description provided by original source is as follows ---
Dataset Description:
This is the historical data that covers sales from 2010-02-05 to 2012-11-01, in the file Walmart_Store_sales. Within this file you will find the following fields:
Store - the store number
Date - the week of sales
Weekly_Sales - sales for the given store
Holiday_Flag - whether the week is a special holiday week (1 = holiday week, 0 = non-holiday week)
Temperature - Temperature on the day of sale
Fuel_Price - Cost of fuel in the region
CPI – Prevailing consumer price index
Unemployment - Prevailing unemployment rate
Holiday Events
Super Bowl: 12-Feb-10, 11-Feb-11, 10-Feb-12, 8-Feb-13
Labour Day: 10-Sep-10, 9-Sep-11, 7-Sep-12, 6-Sep-13
Thanksgiving: 26-Nov-10, 25-Nov-11, 23-Nov-12, 29-Nov-13
Christmas: 31-Dec-10, 30-Dec-11, 28-Dec-12, 27-Dec-13
Analysis Tasks
Basic Statistics tasks
1) Which store has maximum sales
2) Which store has the maximum standard deviation, i.e., where the sales vary the most. Also, find the coefficient of variation (the ratio of the standard deviation to the mean)
3) Which store(s) had a good quarterly growth rate in Q3 2012
4) Some holidays have a negative impact on sales. Find the holidays that have higher sales than the mean sales in the non-holiday season, for all stores together
5) Provide a monthly and semester view of sales in units and give insights (a pandas sketch of tasks 1 and 2 appears after this list)
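The first two tasks reduce to groupby aggregations. A minimal pandas sketch, assuming the file is available as Walmart_Store_sales.csv with the columns listed above (the file name and day-first date format are assumptions):

```python
# Sketch of tasks 1-2 only (not a complete solution).
import pandas as pd

df = pd.read_csv("Walmart_Store_sales.csv", parse_dates=["Date"], dayfirst=True)

# 1) Store with the maximum total sales.
total_sales = df.groupby("Store")["Weekly_Sales"].sum()
print("Store with maximum total sales:", total_sales.idxmax())

# 2) Store with the maximum standard deviation of weekly sales, plus the
#    coefficient of variation (standard deviation divided by the mean).
stats = df.groupby("Store")["Weekly_Sales"].agg(["mean", "std"])
stats["cv"] = stats["std"] / stats["mean"]
print("Store with maximum std:", stats["std"].idxmax())
print(stats.loc[stats["std"].idxmax()])
```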
Statistical Model
For Store 1 – Build prediction models to forecast demand
Linear Regression – Utilize variables like date and restructure dates as 1 for 5 Feb 2010 (starting from the earliest date in order). Hypothesize if CPI, unemployment, and fuel price have any impact on sales.
Change dates into days by creating a new variable.
Select the model that gives the best accuracy.
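A hedged sketch of the Store 1 regression setup described above, using scikit-learn; the CSV name, date format, and feature choice are assumptions rather than the original analysis:

```python
# Sketch of the Store 1 linear regression task (illustrative, untuned).
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.read_csv("Walmart_Store_sales.csv", parse_dates=["Date"], dayfirst=True)
s1 = df[df["Store"] == 1].sort_values("Date").reset_index(drop=True)

# Restructure dates as 1, 2, 3, ... starting from the earliest week
# (1 corresponds to 5 Feb 2010 in this dataset).
s1["time_index"] = s1.index + 1

X = s1[["time_index", "CPI", "Unemployment", "Fuel_Price"]]
y = s1["Weekly_Sales"]

model = LinearRegression().fit(X, y)
print(dict(zip(X.columns, model.coef_)))   # sign/size of each effect
print("In-sample R^2:", model.score(X, y))
```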
--- Original source retains full ownership of the source dataset ---
Chlorophyll-a is a widely used proxy for phytoplankton biomass and an indicator of changes in phytoplankton production. As an essential source of energy in the marine environment, the extent and availability of phytoplankton biomass can be highly influential for fisheries production and can dictate trophic structure in marine ecosystems. Changes in phytoplankton biomass are predominantly driven by changes in nutrient availability, through either natural (e.g., turbulent ocean mixing) or anthropogenic (e.g., agricultural runoff) processes. This layer represents the standard deviation of the 8-day time series of chlorophyll-a (mg/m3) from 1998-2018. Data products were generated by the Ocean Colour component of the European Space Agency (ESA) Climate Change Initiative (CCI) project. These files are 8-day, 4-km composites of merged sensor products: Global Area Coverage (GAC), Local Area Coverage (LAC), MEdium Resolution Imaging Spectrometer (MERIS), Moderate Resolution Imaging Spectroradiometer (MODIS) Aqua, Ocean and Land Colour Instrument (OLCI), Sea-viewing Wide Field-of-view Sensor (SeaWiFS), and Visible Infrared Imaging Radiometer Suite (VIIRS). The standard deviation was calculated over all 8-day chlorophyll-a data from 1998-2018 for each pixel. A quality control mask was applied to remove spurious data associated with shallow water, following Gove et al., 2013. Nearshore map pixels with no data were filled with values from the nearest neighboring valid offshore pixel by using a grid of points and the Near Analysis tool in ArcGIS and then converting the points to a raster. Data source: https://oceanwatch.pifsc.noaa.gov/erddap/griddap/esa-cci-chla-8d-v5-0.graph
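The nearest-neighbor fill described above was done in ArcGIS; purely as an illustration, the same effect can be sketched in Python with a Euclidean distance transform (toy array, not the actual workflow):

```python
# Illustrative sketch (not the ArcGIS workflow described above): fill no-data
# pixels with the value of the nearest valid pixel.
import numpy as np
from scipy import ndimage

# Toy grid of per-pixel chlorophyll-a standard deviations; NaN marks no-data
# nearshore pixels that should inherit the nearest valid offshore value.
chl_std = np.array([[0.10, 0.12, np.nan],
                    [0.11, np.nan, np.nan],
                    [0.09, 0.10, 0.13]])

mask = np.isnan(chl_std)
# Indices of the nearest valid (non-NaN) pixel for every cell.
idx = ndimage.distance_transform_edt(mask, return_distances=False,
                                     return_indices=True)
filled = chl_std[tuple(idx)]
```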
Automatically describing images using natural sentences is essential for the inclusion of visually impaired people on the Internet. Although there are many datasets in the literature, most of them contain only English captions, whereas datasets with captions in other languages are scarce.
PraCegoVer arose on the Internet, stimulating social media users to publish images, tag them with #PraCegoVer, and add a short description of their content. Inspired by this movement, we have proposed #PraCegoVer, a multi-modal dataset with Portuguese captions based on posts from Instagram. It is the first large dataset for image captioning in Portuguese with freely annotated images.
Dataset Structure
The dataset comprises a directory containing the images and the file dataset.json, which contains a list of JSON objects with the attributes:
user: anonymized user that made the post;
filename: image file name;
raw_caption: raw caption;
caption: clean caption;
date: post date.
Each instance in dataset.json is associated with exactly one image in the images directory, whose file name is given by the attribute filename. Also, we provide a sample with five instances, so that users can download the sample to get an overview of the dataset before downloading it completely.
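As an illustration of this layout (a minimal sketch, assuming dataset.json and the images directory sit in the working directory):

```python
# Minimal sketch: pair each caption in dataset.json with its image file,
# based on the attributes listed above (filename, caption).
import json
import os

with open("dataset.json", encoding="utf-8") as f:
    instances = json.load(f)

for item in instances[:5]:
    image_path = os.path.join("images", item["filename"])
    print(image_path, "->", item["caption"])
```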
Download Instructions
If you just want to have an overview of the dataset structure, you can download sample.tar.gz. But, if you want to use the dataset, or any of its subsets (63k and 173k), you must download all the files and run the following commands to uncompress and join the files:
cat images.tar.gz.part* > images.tar.gz
tar -xzvf images.tar.gz
Alternatively, you can download the entire dataset from the terminal using the Python script download_dataset.py available in the PraCegoVer repository. In this case, you first have to download the script and create an access token here. Then, you can run the following command to download and uncompress the image files:
python download_dataset.py --access_token=
1. Colour patterns are used by many species to make decisions that ultimately affect their Darwinian fitness. Colour patterns consist of a mosaic of patches that differ in geometry and visual properties. Although traditionally pattern geometry and colour patch visual properties are analysed separately, these components are likely to work together as a functional unit. Despite this, the combined effect of patch visual properties, patch geometry, and the effects of the patch boundaries on animal visual systems, behaviour and fitness are relatively unexplored.
2. Here we describe Boundary Strength Analysis (BSA), a novel way to combine the geometry of the edges (boundaries among the patch classes) with the receptor noise estimate (ΔS) of the intensity of the edges. The method is based upon known properties of vertebrate and invertebrate retinas. The mean and SD of ΔS (mΔS, sΔS) of a colour pattern can be obtained by weighting each edge class ΔS by its length, separately for chromatic and achromatic ΔS.
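The description above implies length-weighted moments of ΔS; a minimal numpy sketch under that assumption (the published BSA method may include refinements not shown here):

```python
# Sketch under the assumption that mΔS and sΔS are ordinary length-weighted
# moments of the edge-class ΔS values (toy numbers, illustrative only).
import numpy as np

delta_s = np.array([3.1, 1.4, 5.2, 2.0])    # ΔS of each edge class
length  = np.array([120., 340., 60., 200.]) # total boundary length per class

w = length / length.sum()
m_delta_s = np.sum(w * delta_s)                              # weighted mean
s_delta_s = np.sqrt(np.sum(w * (delta_s - m_delta_s) ** 2))  # weighted SD
```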
Upvote! The database contains +40,000 records on US Gross Rent & Geo Locations. The field description of the database is documented in the attached PDF file. To access all 325,272 records on a scale roughly equivalent to a neighborhood (census tract), see the link below and make sure to upvote. Upvote right now, please. Enjoy!
Get the full free database with coupon code: FreeDatabase, See directions at the bottom of the description... And make sure to upvote :) coupon ends at 2:00 pm 8-23-2017
The data set was originally developed for real estate and business investment research. Income is a vital element when determining both quality and socioeconomic features of a given geographic location. The following data was derived from over 36,000 files and covers 348,893 location records.
Only proper citing is required; please see the documentation for details. Have fun!!!
Golden Oak Research Group, LLC. “U.S. Income Database Kaggle”. Publication: 5, August 2017. Accessed, day, month year.
For any questions, you may reach us at research_development@goldenoakresearch.com. For immediate assistance, you may reach me at 585-626-2965.
Please note: this is my personal number, and email is preferred.
Check our data's accuracy: Census Fact Checker
Don't settle. Go big and win big. Optimize your potential. Access all gross rent records and more on a scale roughly equivalent to a neighborhood; see the link below:
A small startup with big dreams, giving the everyday, up-and-coming data scientist professional-grade data at affordable prices. It's what we do.
Link to the ScienceBase Item Summary page for the item described by this metadata record. Service Protocol: Link to the ScienceBase Item Summary page for the item described by this metadata record. Application Profile: Web Browser. Link Function: information
Overview: Actual Natural Vegetation (ANV): probability of occurrence for the Common hazel in its realized environment for the period 2000-2022.
Traceability (lineage): This is an original dataset produced with a machine learning framework which used a combination of point datasets and raster datasets as inputs. The point dataset is a harmonized collection of tree occurrence data, comprising observations from National Forest Inventories (EU-Forest), GBIF, and LUCAS. The complete dataset is available on Zenodo. The raster datasets used as input are: harmonized and gap-filled time series of seasonal aggregates of the Landsat GLAD ARD dataset (bands and spectral indices); monthly time series of air and surface temperature and precipitation from a reprocessed version of the Copernicus ERA5 dataset; long-term averages of bioclimatic variables from CHELSA; tree species distribution maps from the European Atlas of Forest Tree Species; elevation, slope, and other elevation-derived metrics; and long-term monthly averages of snow probability and cloud fraction from MODIS. For a more comprehensive list refer to Bonannella et al. (2022) (in review, preprint available at: https://doi.org/10.21203/rs.3.rs-1252972/v1).
Scientific methodology: Probability and uncertainty maps were the output of a spatiotemporal ensemble machine learning framework based on stacked regularization. Three base models (random forest, gradient-boosted trees, and generalized linear models) were first trained on the input dataset, and their predictions were used to train an additional model (logistic regression) which provided the final predictions. More details on the whole workflow are available in the listed publication.
Usability: Probability maps can be used to detect potential forest degradation and compositional change across the time period analyzed. Some possible applications for these topics are explained in the listed publication.
Uncertainty quantification: Uncertainty is quantified by taking the standard deviation of the probabilities predicted by the three components of the spatiotemporal ensemble model.
Data validation approaches: Distribution maps were validated using a spatial 5-fold cross-validation following the workflow detailed in the listed publication.
Completeness: The raster files completely cover the Geo-harmonizer region as defined by the landmask raster dataset available here.
Consistency: Areas outside of the calibration area of the point dataset (Iceland, Norway) usually have high uncertainty values. This is not only a problem of extrapolation but also of poor representation, in the feature space available to the model, of the conditions present in these countries.
Positional accuracy: The rasters have a spatial resolution of 30 m.
Temporal accuracy: The maps cover the period 2000-2020; each map covers a certain number of years according to the following scheme: (1) 2000-2002, (2) 2002-2006, (3) 2006-2010, (4) 2010-2014, (5) 2014-2018 and (6) 2018-2020.
Thematic accuracy: Both probability and uncertainty maps contain values from 0 to 100: in the case of probability maps, they indicate the probability of occurrence of a single individual of the target species, while uncertainty maps indicate the standard deviation of the ensemble model.
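A schematic scikit-learn sketch of the stacking scheme and uncertainty layer described above (toy data, default hyperparameters, and scikit-learn stand-ins for the three base learners; not the actual Geo-harmonizer pipeline):

```python
# Schematic sketch of the described ensemble: three base classifiers, a
# logistic-regression meta-learner, and uncertainty as the standard deviation
# of the base-model probabilities.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

base = [
    ("rf",  RandomForestClassifier(n_estimators=100, random_state=0)),
    ("gbt", GradientBoostingClassifier(random_state=0)),
    ("glm", LogisticRegression(max_iter=1000)),
]
ensemble = StackingClassifier(estimators=base,
                              final_estimator=LogisticRegression(max_iter=1000))
ensemble.fit(X, y)

# Final occurrence probability from the stacked model.
prob = ensemble.predict_proba(X)[:, 1]

# Uncertainty: standard deviation of the probabilities predicted by the
# three fitted base models (mirroring the uncertainty layer described above).
base_probs = np.column_stack([est.predict_proba(X)[:, 1]
                              for est in ensemble.named_estimators_.values()])
uncertainty = base_probs.std(axis=1)
```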
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains supplementary data for my PhD thesis "Learning Reduced Models for Large-Scale Agent-Based Systems". Chapters 1-3, 7, and A do not have supplementary data.
Chapter 4
Large_deviation_example.zip contains the trajectory for Figure 4.8.
mean_exit_time* contains the raw data to compute the mean exit time and standard deviation for the ABM process (JP) and SDE process (CLE). It contains additionally a precomputed mean and standard deviation as well as the corresponding numbers of agents.
transition_matrix* contain the computed box discretizations as MATLAB and Numpy files as used for Figures 4.2-4.4, 4.6 and Tables 4.1 and 4.2.
Chapter 5
CVM_2021-07-09-15-53_training_data.npz contains the training data for Figure 5.7 a and b.
CVM_2021-09-29-07-13_distribution.npz contains the raw data for Figure 5.7 c.
The remaining data for Chapter 5 can be found in the related dataset doi.org/10.5281/zenodo.4522119.
Chapter 6
CVM_pareto_estimate contains trajectory data required for Figure 6.6 b to estimate points in the Pareto Front using the civil violence model.
CVM_training_data contains the training data to construct the surrogate model. Each data set consists of CVM_*_cops_train.npz as training set, CVM_*_cops_trajectory.npz as sample trajectory and CVM_*_cops.pkl to compute the training data.
CVM_covering_iterations_8.mat Pareto set covering after 8 iterations for the civil violence model. Required for Figure 6.6 a.
CVM_pareto_set+front.npz is required for Figure 6.6 b.
CVM_surrogate_model.mat contains the surrogate model for the civil violence model.
Expl_iterations_* contains Pareto set coverings after 8 and 12 iterations for Example 6.1.4 and Figure 6.1.
VM_covering_iterations_12.mat contains the Pareto set covering depicted in Figure 6.4 a.
VM_ODE_covering_iterations_12_subset_front.mat contains the Pareto set covering depicted in Figure 6.5 and 6.5 c.
VM_ODE_covering_iterations_12_subset.mat contains the Pareto set covering depicted in Figure 6.5 and 6.5 d.
VM_ODE_covering_iterations_12.mat contains the Pareto set covering depicted in Figure 6.4 b.
VM_surrogate_model.mat contains the surrogate model for the extended voter model.
VM_test_points_non_pareto.npz contains Non-Pareto points in Figure 6.5 and 6.5 d.
VM_test_points_pareto.npz contains Pareto points in Figure 6.5 and 6.5 c.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was generated from raw data obtained at
Data were processed with the R package EpiEstim (methodology in the associated preprint). Briefly, instantaneous R was estimated within a 5-day time window. Prior mean and standard deviation values for R were set at 3 and 1. The serial interval was estimated using a parametric distribution with uncertainty (offset gamma). We compared the results at two time points (day 7 and day 21 after the first case was registered in each region) from different Brazilian states in order to make inferences about the epidemic dynamics.
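For orientation, a minimal Python re-implementation of the Cori et al. estimator that EpiEstim is based on, using the gamma prior (mean 3, SD 1) and 5-day window mentioned above; the incidence series and serial-interval distribution are toy values, and EpiEstim's handling of serial-interval uncertainty is omitted:

```python
# Minimal, illustrative sketch of the Cori et al. instantaneous-R estimator
# (not the EpiEstim code itself; serial-interval uncertainty is omitted).
import numpy as np

incidence = np.array([1, 2, 2, 4, 6, 9, 12, 15, 20, 24, 30, 35])  # toy counts
w = np.array([0.1, 0.3, 0.3, 0.2, 0.1])   # toy discretized serial interval

prior_mean, prior_sd = 3.0, 1.0
a0 = (prior_mean / prior_sd) ** 2          # gamma prior shape
b0 = prior_sd ** 2 / prior_mean            # gamma prior scale

# Infection pressure Lambda_t = sum_k I_{t-k} w_k.
lam = np.array([np.sum(incidence[max(0, t - len(w)):t][::-1] * w[:min(t, len(w))])
                for t in range(len(incidence))])

window = 5
for t in range(window, len(incidence)):
    I_sum = incidence[t - window + 1:t + 1].sum()
    L_sum = lam[t - window + 1:t + 1].sum()
    shape, rate = a0 + I_sum, 1.0 / b0 + L_sum
    print(f"day {t}: posterior mean R = {shape / rate:.2f}")
```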
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data is part of the Global Ensemble Digital Terrain Model (GEDTM30) dataset. Check the related identifiers section below to access other parts of the dataset.
This is the first release of the Multiscale Land Surface Parameters (LSPs) of Global Ensemble Digital Terrain Model (GEDTM30). Use for testing purposes only. This work was funded by the European Union. However, the views and opinions expressed are solely those of the author(s) and do not necessarily reflect those of the European Union or the European Commission. Neither the European Union nor the granting authority can be held responsible for them. The data is provided "as is." The Open-Earth-Monitor project consortium, along with its suppliers and licensors, hereby disclaims all warranties of any kind, express or implied, including, without limitation, warranties of merchantability, fitness for a particular purpose, and non-infringement. Neither the Open-Earth-Monitor project consortium nor its suppliers and licensors make any warranty that the website will be error-free or that access to it will be continuous or uninterrupted. You understand that you download or otherwise obtain content or services from the website at your own discretion and risk.
LSPs are derivative products of the GEDTM30 that represent measures of local topographic position, curvature, hydrology, light, and shadow. A pyramid representation is implemented to generate multiscale resolutions of 30m, 60m, 120m, 240m, 480m, and 960m for each LSP. The parametrization is powered by Whitebox Workflows in Python. To see the documentation, please visit our GEDTM30 GitHub (https://github.com/openlandmap/GEDTM30).
This dataset includes:
Due to Zenodo's storage limitations, the high resolution LSP data are provided via external links:
Layer | Scale factor | Data Type | No Data |
---|---|---|---|
Difference from Mean Elevation | 100 | Int16 | 32,767 |
Geomorphons | 1 | Byte | 255 |
Hillshade | 1 | UInt16 | 65,535 |
LS Factor | 1,000 | UInt16 | 65,535 |
Maximal Curvature | 1,000 | Int16 | 32,767 |
Minimal Curvature | 1,000 | Int16 | 32,767 |
Negative Openness | 100 | UInt16 | 65,535 |
Positive Openness | 100 | UInt16 | 65,535 |
Profile Curvature | 1,000 | Int16 | 32,767 |
Ring Curvature | 10,000 | Int16 | 32,767 |
Shape Index | 1,000 | Int16 | 32,767 |
Slope in Degree | 100 | UInt16 | 65,535 |
Specific Catchment Area | 1,000 | UInt16 | 65,535 |
Spherical Standard Deviation of the Normals | 100 | Int16 | 32,767 |
Tangential Curvature | 1,000 | Int16 | 32,767 |
Topographic Wetness Index | 100 | Int16 | 32,767 |
If you discover a bug, artifact, or inconsistency, or if you have a question, please raise a GitHub issue here.
To ensure consistency and ease of use across and within the projects, we follow the standard Ai4SoilHealth and Open-Earth-Monitor file-naming convention. The convention works with 10 fields that describe important properties of the data. In this way, users can search files, prepare data analyses, etc., without needing to open the files.
For example, for twi_edtm_m_120m_s_20000101_20221231_go_epsg.4326_v20241230.tif, the fields are:
generic variable name: twi = topographic wetness index
variable procedure combination: edtm = derived from the ensemble digital terrain model (GEDTM30)
Position in the probability distribution / variable type: m = mean
Spatial support: 120m
Depth reference: s = surface
Time reference begin time: 20000101 = 2000-01-01
Time reference end time: 20221231 = 2022-12-31
Bounding box: go = global
EPSG code: epsg.4326 = EPSG:4326
Version code: v20241230 = 2024-12-30 (creation date)
The data sets contain the major results of the article “Improving information extraction from model data using sensitivity-weighted performance criteria“ written by Guse et al. (2020). In this article, it is analysed how a sensitivity-weighted performance criterion improves parameter identifiability and model performance. More details are given in the article. The files of this dataset are described as follows.
Parameter sampling: FAST parameter sampling.xlsx: To estimate the sensitivity, the Fourier Amplitude Sensitivity Test (FAST) was used (R routine FAST, Reusser, 2013). Each column shows the values of a model parameter of the SWAT model (Arnold et al., 1998). All parameters are explained in detail in Neitsch et al. (2011). The FAST parameter sampling defines the number of model runs; for twelve model parameters, as in this case, 579 model runs are required. The same parameter sets were used for all catchments.
Daily sensitivity time series: Sensitivity_2000_2005.xlsx: Daily time series of parameter sensitivity for the period 2000-2005 for three catchments in Germany (Treene, Saale, Kinzig). Each column shows the sensitivity of one parameter of the SWAT model. The methodological approach of the temporal dynamics of parameter sensitivity (TEDPAS) was developed by Reusser et al. (2011) and first applied to the SWAT model in Guse et al. (2014). As sensitivity index, the first-order partial variance is used, i.e., the ratio of the partial variance of one parameter to the total variance. The sensitivity is thus always between 0 and 1, and the sum in one row, i.e., the sensitivity of all model parameters on one day, cannot be higher than 1.
Parameter sampling: LH parameter sampling.xlsx: To calculate parameter identifiability, Latin Hypercube sampling was used to generate 2000 parameter sets (R package FME, Soetaert and Petzoldt, 2010). Each column shows the values of a model parameter of the SWAT model (Arnold et al., 1998). All parameters are explained in detail in Neitsch et al. (2011). The same parameter sets were used for all catchments.
Performance criteria with and without sensitivity weights: RSR_RSRw_cal.xlsx:
• Calculation of the RSR once and of the RSR_w separately for each model parameter.
• RSR: typical RSR (RMSE divided by the standard deviation).
• RSR_w: RSR with weights according to the daily sensitivity time series. The calculation was carried out in all three catchments.
• The column RSR shows the results of the RSR (RMSE divided by standard deviation) for the different model runs.
• The column RSR[_parameter name] shows the calculation of the RSR_w for the specific model parameter.
• RSR_w gives weights to each day based on the daily parameter sensitivity (as shown in Sensitivity_2000_2005.xlsx). This means that days with a higher parameter sensitivity are weighted more strongly.
In the methodological approach, the best 25% of the model runs (best 500 model runs) were selected and the model parameters were constrained to the most appropriate parameter values (see the methodological description in the article).
Performance criteria for the three catchments: GOFrun_[catchment name]_RSR.xlsx: These three tables are organised identically and are available for the three catchments in Germany (Treene, Saale, Kinzig). Using the different parameter ranges for the catchments as defined in the previous steps, 2000 model simulations were carried out. For this, a Latin Hypercube sampling was used (R package FME, Soetaert and Petzoldt, 2010).
The three tables show the results of 2000 model simulations for ten different performance criteria, for the two methodological approaches (RSR and swRSR) and two periods (calibration: 2000-2005; validation: 2006-2010).
Performance criteria for the three catchments: GOFrun_[catchment name]_MAE.xlsx: These three tables show the results of 2000 model simulations for ten different performance criteria, for the two methodological approaches (MAE and swMAE) and two periods (calibration: 2000-2005; validation: 2006-2010).
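The description defines the RSR as the RMSE divided by the standard deviation of the observations and states that the sensitivity-weighted variant weights each day by the daily parameter sensitivity. A sketch under the assumption of simple normalized daily weights (the exact weighting in Guse et al. (2020) may differ):

```python
# Sketch only: RSR and an assumed, simplified sensitivity-weighted variant.
import numpy as np

def rsr(obs, sim):
    # RMSE divided by the standard deviation of the observations.
    rmse = np.sqrt(np.mean((obs - sim) ** 2))
    return rmse / np.std(obs)

def rsr_weighted(obs, sim, sensitivity):
    # Assumed form: daily squared errors and the observation variance are
    # weighted by the normalized daily first-order sensitivity.
    w = sensitivity / sensitivity.sum()
    rmse_w = np.sqrt(np.sum(w * (obs - sim) ** 2))
    std_w = np.sqrt(np.sum(w * (obs - np.sum(w * obs)) ** 2))
    return rmse_w / std_w
```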
We constrain the densities of Earth- to Neptune-size planets around very cool (T_e = 3660-4660 K) Kepler stars by comparing 1202 Keck/HIRES radial velocity measurements of 150 nearby stars to a model based on Kepler candidate planet radii and a power-law mass-radius relation. Our analysis is based on the presumption that the planet populations around the two sets of stars are the same. The model can reproduce the observed distribution of radial velocity variation over a range of parameter values, but, for the expected level of Doppler systematic error, the highest Kolmogorov-Smirnov probabilities occur for a power-law index α ≈ 4, indicating that rocky-metal planets dominate the planet population in this size range. A single population of gas-rich, low-density planets with α = 2 is ruled out unless our Doppler errors are ≥5 m/s, i.e., much larger than expected based on observations and stellar chromospheric emission. If small planets are a mix of γ rocky planets (α = 3.85) and 1-γ gas-rich planets (α = 2), then γ > 0.5 unless Doppler errors are ≥4 m/s. Our comparison also suggests that Kepler's detection efficiency relative to ideal calculations is less than unity. One possible source of incompleteness is target stars that are misclassified subgiants or giants, for which the transits of small planets would be impossible to detect. Our results are robust to systematic effects, and plausible errors in the estimated radii of Kepler stars have only moderate impact.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Identification of errors or anomalous values, collectively considered outliers, assists in exploring data or through removing outliers improves statistical analysis. In biomechanics, outlier detection methods have explored the ‘shape’ of the entire cycles, although exploring fewer points using a ‘moving-window’ may be advantageous. Hence, the aim was to develop a moving-window method for detecting trials with outliers in intra-participant time-series data. Outliers were detected through two stages for the strides (mean 38 cycles) from treadmill running. Cycles were removed in stage 1 for one-dimensional (spatial) outliers at each time point using the median absolute deviation, and in stage 2 for two-dimensional (spatial–temporal) outliers using a moving window standard deviation. Significance levels of the t-statistic were used for scaling. Fewer cycles were removed with smaller scaling and smaller window size, requiring more stringent scaling at stage 1 (mean 3.5 cycles removed for 0.0001 scaling) than at stage 2 (mean 2.6 cycles removed for 0.01 scaling with a window size of 1). Settings in the supplied Matlab code should be customised to each data set, and outliers assessed to justify whether to retain or remove those cycles. The method is effective in identifying trials with outliers in intra-participant time series data.
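A minimal Python sketch of the stage-1 idea only (MAD-based flagging at each time point across cycles); the full two-stage method, including the t-statistic scaling and the moving-window stage, is implemented in the authors' supplied Matlab code, and the threshold and cycle-rejection rule below are assumptions:

```python
# Sketch of stage 1 only: one-dimensional outliers per time point via the
# median absolute deviation (MAD), across strides of a treadmill-running trial.
import numpy as np

cycles = np.random.default_rng(1).normal(size=(38, 101))  # 38 cycles x 101 points
cycles[5] += 4.0                                          # plant an outlier cycle

med = np.median(cycles, axis=0)
mad = np.median(np.abs(cycles - med), axis=0)

k = 3.0                                                   # assumed threshold
outlier_points = np.abs(cycles - med) > k * mad           # per-point flags

# Assumed rule for flagging a whole cycle: more than 5% of its points deviant.
flagged_cycles = np.where(outlier_points.mean(axis=1) > 0.05)[0]
print("Cycles flagged for inspection/removal:", flagged_cycles)
```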
NetCDF file of the SREF standard deviation of wind speed and direction that was used to inject variability in the FDDA input. Variable U_NDG_OLD contains the standard deviation of wind speed (m/s); variable V_NDG_OLD contains the standard deviation of wind direction (deg). This dataset is not publicly accessible because it is a NetCDF file that is 3.9 GB. It can be accessed through the following means: on the HPC system sol (2016), in the asm archive here: /asm/grc/JGR_ENSEMBLE_ScienceHub/figure1.nc. Format: Figure 1 data. This is the variability of wind speed and direction of the four-dimensional data assimilation inputs. The variability includes the 14 members of the ensemble. This dataset is associated with the following publication: Gilliam, R., C. Hogrefe, J. Godowitch, S. Napelenok, R. Mathur, and S.T. Rao. Impact of inherent meteorology uncertainty on air quality model predictions. Journal of Geophysical Research: Atmospheres. American Geophysical Union, Washington, DC, USA, 120(23): 12,259-12,280 (2015).
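For users with access to the archived file, a minimal xarray sketch reading the two variables named above:

```python
# Sketch, assuming access to the archived file on the HPC system.
import xarray as xr

ds = xr.open_dataset("/asm/grc/JGR_ENSEMBLE_ScienceHub/figure1.nc")
wind_speed_std = ds["U_NDG_OLD"]   # standard deviation of wind speed (m/s)
wind_dir_std = ds["V_NDG_OLD"]     # standard deviation of wind direction (deg)
print(wind_speed_std.shape, wind_dir_std.shape)
```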