https://creativecommons.org/publicdomain/zero/1.0/
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Some say climate change is the biggest threat of our age while others say it’s a myth based on dodgy science. We are turning some of the data over to you so you can form your own view.
Even more than with other data sets that Kaggle has featured, there’s a huge amount of data cleaning and preparation that goes into putting together a long-term study of climate trends. Early data was collected by technicians using mercury thermometers, where any variation in the visit time impacted measurements. In the 1940s, the construction of airports caused many weather stations to be moved. In the 1980s, there was a move to electronic thermometers that are said to have a cooling bias.
Given this complexity, a range of organizations collate climate trends data. The three most cited land and ocean temperature data sets are NOAA’s MLOST, NASA’s GISTEMP and the UK’s HadCRUT.
We have repackaged the data from a newer compilation put together by Berkeley Earth, which is affiliated with Lawrence Berkeley National Laboratory. The Berkeley Earth Surface Temperature Study combines 1.6 billion temperature reports from 16 pre-existing archives. It is nicely packaged and allows for slicing into interesting subsets (for example by country). They publish the source data and the code for the transformations they applied. They also use methods that allow weather observations from shorter time series to be included, meaning fewer observations need to be thrown away.
In this dataset, we have included several files:
Global Land and Ocean-and-Land Temperatures (GlobalTemperatures.csv):
Other files include:
The raw data comes from the Berkeley Earth data page.
https://earth.esa.int/eogateway/documents/20142/1564626/Terms-and-Conditions-for-the-use-of-ESA-Data.pdf
The Fundamental Data Record (FDR) for Atmospheric Composition UVN v.1.0 dataset is a cross-instrument Level-1 product [ATMOS_L1B] generated in 2023 and resulting from the ESA FDR4ATMOS project. The FDR contains selected Earth Observation Level 1b parameters (irradiance/reflectance) from the nadir-looking measurements of the ERS-2 GOME and Envisat SCIAMACHY missions for the period ranging from 1995 to 2012. The data record offers harmonised cross-calibrated spectra with focus on spectral windows in the Ultraviolet-Visible-Near Infrared regions for the retrieval of critical atmospheric constituents like ozone (O3), sulphur dioxide (SO2), nitrogen dioxide (NO2) column densities, alongside cloud parameters.
The FDR4ATMOS products should be regarded as experimental due to the innovative approach and the current use of a limited-sized test dataset to investigate the impact of harmonization on the Level 2 target species, specifically SO2, O3 and NO2. Presently, this analysis is being carried out within follow-on activities. The FDR4ATMOS V1 is currently being extended to include the MetOp GOME-2 series.
Product format
In many respects, the FDR product has improved compared to the existing individual mission datasets:
GOME solar irradiances are harmonised using a validated SCIAMACHY solar reference spectrum, solving the problem of the fast-changing etalon present in the original GOME Level 1b data;
Reflectances for both GOME and SCIAMACHY are provided in the FDR product. GOME reflectances are harmonised to degradation-corrected SCIAMACHY values, using collocated data from the CEOS PIC sites;
SCIAMACHY data are scaled to the lowest integration time within the spectral band using high-frequency PMD measurements from the same wavelength range. This simplifies the use of the SCIAMACHY spectra which were split in a complex cluster structure (with own integration time) in the original Level 1b data;
The harmonization process applied mitigates the viewing angle dependency observed in the UV spectral region for GOME data;
Uncertainties are provided.
Each FDR product provides, within the same file, irradiance/reflectance data for UV-VIS-NIR spectral regions across all orbits on a single day, including therein information from the individual ERS-2 GOME and Envisat SCIAMACHY measurements. FDR has been generated in two formats: Level 1A and Level 1B targeting expert users and nominal applications respectively. The Level 1A [ATMOS_L1A] data include additional parameters such as harmonisation factors, PMD, and polarisation data extracted from the original mission Level 1 products. The ATMOS_L1A dataset is not part of the nominal dissemination to users. In case of specific requirements, please contact EOHelp. Please refer to the README file for essential guidance before using the data.
All the new products are conveniently formatted in NetCDF. Free standard tools, such as Panoply, can be used to read NetCDF data. Panoply is sourced and updated by external entities. For further details, please consult our Terms and Conditions page.
Uncertainty characterisation
One of the main aspects of the project was the characterization of Level 1 uncertainties for both instruments, based on metrological best practices. The following documents are provided:
General guidance on a metrological approach to Fundamental Data Records (FDR)
Uncertainty Characterisation document
Effect tables
NetCDF files containing example uncertainty propagation analysis and spectral error correlation matrices for SCIAMACHY (Atlantic and Mauretania scene for 2003 and 2010) and GOME (Atlantic scene for 2003): reflectance_uncertainty_example_FDR4ATMOS_GOME.nc, reflectance_uncertainty_example_FDR4ATMOS_SCIA.nc
Known Issues
Non-monotonous wavelength axis for SCIAMACHY in FDR data version 1.0
In the SCIAMACHY OBSERVATION group of the atmospheric FDR v1.0 dataset (DOI: 10.5270/ESA-852456e), the wavelength axis (lambda variable) is not monotonically increasing. This issue affects all spectral channels (UV, VIS, NIR) in the SCIAMACHY group, while GOME OBSERVATION data remain unaffected. The root cause of the issue lies in the incorrect indexing of the lambda variable during the NetCDF writing process. Notably, the wavelength values themselves are calculated correctly within the processing chain.
Temporary Workaround
The wavelength axis is correct in the first record of each product. As a workaround, users can extract the wavelength axis from the first record and apply it to all subsequent measurements within the same product. The first record can be retrieved by setting the first two indices (time and scanline) to 0 (assuming counting of array indices starts at 0). Note that this process must be repeated separately for each spectral range (UV, VIS, NIR) and every daily product. Since the wavelength axis of SCIAMACHY is highly stable over time, using the first record introduces no expected impact on retrieval results.
Python pseudo-code example: lambda_...
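A minimal sketch of the workaround described above, written in plain Python rather than against the actual NetCDF files. It assumes the wavelength variable is indexed as [time][scanline][pixel]; the function name, the toy values, and the use of nested lists are illustrative assumptions and are not taken from the FDR product or its README.

```python
def apply_first_record_wavelengths(lam):
    """Return a copy of lam where every record uses the axis of record [0][0].

    lam is assumed to be indexed as lam[time][scanline][pixel].
    """
    # First record: time index 0 and scanline index 0, as in the workaround.
    reference = list(lam[0][0])
    return [[list(reference) for _ in scanlines] for scanlines in lam]


# Toy axis for one spectral range: the first record is monotonic,
# later records are scrambled to mimic the indexing issue.
lam_uv = [
    [[240.0, 240.1, 240.2], [240.1, 240.0, 240.2]],
    [[240.2, 240.0, 240.1], [240.0, 240.2, 240.1]],
]
fixed_uv = apply_first_record_wavelengths(lam_uv)
```

With real products the same step would have to be repeated for each spectral range (UV, VIS, NIR) and for every daily file, as noted above.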
The Unpublished Digital Geologic-GIS Map of the Hagerman Fossil Beds National Monument Area, Idaho is composed of GIS data layers and GIS tables in a 10.1 file geodatabase (hfba_geology.gdb), a 10.1 ArcMap (.mxd) map document (hfba_geology.mxd), individual 10.1 layer (.lyr) files for each GIS data layer, an ancillary map information document (hafo_geology.pdf) which contains source map unit descriptions, as well as other source map text, figures and tables, metadata in FGDC text (.txt) and FAQ (.pdf) formats, and a GIS readme file (hafo_geology_gis_readme.pdf). This dataset/map was previously released by the GRI with a GRI MapCode of HAFO. However, because this dataset/map is smaller-scale and in some ways less detailed than a newer dataset/map (now assigned the GRI MapCode of HAFO), its GRI MapCode has been changed to HFBA (GRI abbreviation for Hagerman Fossil Beds Area). Please read the hafo_geology_gis_readme.pdf for information pertaining to the proper extraction of the file geodatabase and other map files. To request GIS data in ESRI 10.1 shapefile format contact Stephanie O'Meara (stephanie.omeara@colostate.edu; see contact information below). The data is also available as a 2.2 KMZ/KML file for use in Google Earth; however, this format version of the map is limited in data layers presented and in access to GRI ancillary table information. Google Earth software is available for free at: http://www.google.com/earth/index.html. Users are encouraged to only use the Google Earth data for basic visualization, and to use the GIS data for any type of data analysis or investigation. The data were completed as a component of the Geologic Resources Inventory (GRI) program, a National Park Service (NPS) Inventory and Monitoring (I&M) Division funded program that is administered by the NPS Geologic Resources Division (GRD). Source geologic maps and data used to complete this GRI digital dataset were provided by the following: U.S. Geological Survey. 
Detailed information concerning the sources used and their contribution to the GRI product are listed in the Source Citation section(s) of this metadata record (hfba_geology_metadata.txt or hfba_geology_metadata_faq.pdf). Users of this data are cautioned about the locational accuracy of features within this dataset. Based on the source map scale of 1:48,000 and United States National Map Accuracy Standards, features are within (horizontally) 24.4 meters or 80 feet of their actual location as presented by this dataset. Users of this data should thus not assume the location of features is exactly where they are portrayed in Google Earth, ArcGIS or other software used to display this dataset. All GIS and ancillary tables were produced as per the NPS GRI Geology-GIS Geodatabase Data Model v. 2.3 (available at: https://www.nps.gov/articles/gri-geodatabase-model.htm). The GIS data projection is NAD83, UTM Zone 11N; however, for the KML/KMZ format the data is projected upon export to WGS84 Geographic, the native coordinate system used by Google Earth. The data is within the area of interest of Hagerman Fossil Beds National Monument.
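The 24.4 meter (80 foot) figure quoted above can be reproduced from the map scale. The sketch below applies the United States National Map Accuracy Standards horizontal tolerance (1/30 inch at map scale for scales larger than 1:20,000, otherwise 1/50 inch); the function and constant names are illustrative only.

```python
INCH_IN_METERS = 0.0254
FOOT_IN_METERS = 0.3048


def nmas_horizontal_tolerance_m(scale_denominator):
    """Ground distance (meters) matching the NMAS horizontal tolerance."""
    # Scales larger than 1:20,000 use 1/30 inch; 1:20,000 and smaller use 1/50 inch.
    divisor = 30 if scale_denominator < 20000 else 50
    map_inches_on_ground = scale_denominator / divisor
    return map_inches_on_ground * INCH_IN_METERS


tolerance_m = nmas_horizontal_tolerance_m(48000)
tolerance_ft = tolerance_m / FOOT_IN_METERS
print(f"{tolerance_m:.1f} m, {tolerance_ft:.0f} ft")
```

For the 1:48,000 source map this gives 48,000 / 50 = 960 inches on the ground, which is 80 feet or about 24.4 meters.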
Complete Genome of a Family of five - Two Parents, Three Siblings (Genome Phenotype SNPs Raw Data)
Genomics is a branch of molecular biology concerned with the structure, function, variation, evolution and mapping of genomes. Several companies offer next-generation sequencing of human genomes, ranging from the complete 3 billion base pairs to a few thousand phenotype SNPs. I used 23andMe (using Illumina HumanOmniExpress-24) to obtain this family's phenotype SNPs. I am sharing the entire raw dataset of the family of five (father, mother and three brothers) here for the international research community for the following reasons:
I am a firm believer in open datasets, transparency, and the right to learn, research, explore, and educate. I do not want to restrict the knowledge flow for mere privacy concerns. Hence, I am offering this entire family DNA raw data for the world to use for research without worrying about privacy.
Most of the available test datasets for research come from the Western world and we don’t see much from developing countries. I decided to share this data to bridge the gap and I expect others to follow the trend.
I would be the happiest man on earth if a life can be saved, knowledge can be learned, an idea can be explored, or a fact can be found using this DNA dataset. Please use it the way you will.
Family Origin: Pakistani
Country of Grandparents/Ancestors: India (Kerana, Uttar Pradesh - UP)
Files: Father, Mother, Child 1, Child 2, Child 3 (All CSVs)
Size: 75 MB
Sources: 23andMe Personalized Genome Reports
The research community is still progressively working in this domain and it is agreed upon by professionals that genomics is still in its infancy. You now have the chance to explore this novel domain via this dataset and become one of the few genomics early adopters.
The dataset is a complete genome extracted from www.23andme.com, represented as a sequence of SNPs using the following symbols: A (adenine), C (cytosine), G (guanine), T (thymine), D (base deletions), I (base insertions), and '_' or '-' if the SNP for a particular location is not accessible. It contains chromosomes 1-22, X, Y, and mitochondrial DNA.
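As a rough illustration of working with these symbols, the sketch below tallies how often each one appears across genotype calls in a CSV-like file. The column names (rsid, chromosome, position, genotype), the '#' comment convention, and the sample rows are assumptions made for the example and are not confirmed by the dataset description.

```python
import csv
import io
from collections import Counter

NO_CALL_SYMBOLS = {"_", "-"}  # symbols used when a SNP is not accessible


def tally_genotype_symbols(lines):
    """Count each symbol across all genotype calls; no-calls share one key."""
    counts = Counter()
    # Skip comment lines (assumed to start with '#') before parsing the CSV.
    reader = csv.DictReader(line for line in lines if not line.startswith("#"))
    for row in reader:
        for symbol in row["genotype"].strip():
            key = "no_call" if symbol in NO_CALL_SYMBOLS else symbol
            counts[key] += 1
    return counts


# Hypothetical excerpt; real files may use different headers or delimiters.
sample = io.StringIO(
    "# hypothetical excerpt\n"
    "rsid,chromosome,position,genotype\n"
    "rs0000001,1,1000,AA\n"
    "rs0000002,1,2000,CT\n"
    "rs0000003,X,3000,--\n"
    "rs0000004,MT,4000,G\n"
    "rs0000005,2,5000,DI\n"
)
counts = tally_genotype_symbols(sample)
```

Running the same tally on each family member's file gives a quick check of how many positions were not called before any comparison between relatives.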
A complete list of the exact SNPs (base pairs) available and their data-set index can be found at https://api.23andme.com/res/txt/snps.b4e00fe1db50.data
For more information about how the data-set was extracted follow https://api.23andme.com/docs/reference/#genomes
Moreover, for a more detailed understanding of the data-set content please acquaint yourself with the description of https://api.23andme.com/docs/reference/#genotypes
Users are allowed to use, copy, distribute and cite the dataset as follows: “Zeeshan-ul-hassan Usmani, Family of Five Genomic Dataset by 23andMe, Kaggle Dataset Repository, March 7, 2021.”
You may use the following human genome database sites for help:
GenBank - https://www.ncbi.nlm.nih.gov/genbank/
The Human Genome Project - https://www.genome.gov/hgp/
Genomes OnLine Database (GOLD) - https://gold.jgi.doe.gov
Complete Genomics - http://www.completegenomics.com/public-data/
Some ideas worth exploring:
Any individuals in the dataset more susceptible to cancer?
Does he/she tend to gain weight?
Where is his/her place of origin?
Which genes determine certain biological features (cancer susceptibility, fat generation rate, hair color, etc.)?
How do these phenotype SNPs compare with other similar datasets from the Western world?
How do the family members differ in genomic makeup? Which traits are silent, and which ones are dominant?
What would be the likely cause of death for any given person?
What are the most likely diseases/illnesses this family is going to face in their lifetime?
What is unique about this dataset?
Can you compare the genomes within this family and see which diseases will have less or more impact on a given family member?
Can you delineate recombination sites precisely, identify sequence errors or find rare SNPs?
What else can you extract from this dataset when it comes to personal traits, intelligence level, ancestry and body makeup?
Notice to Data Users: The documentation for this data set was provided solely by the Principal Investigator(s) and was not further developed, thoroughly reviewed, or edited by NSIDC. Thus, support for this data set may be limited.
This data set consists of a sampling of each type of Hierarchical Data Format version 4 (HDF4) data that are archived at the eight National Aeronautics and Space Administration (NASA) Earth Science Data Centers (ESDCs). The data were sampled for a collaborative study between The HDF Group, the Goddard Earth Sciences Data and Information Services Center (GES-DISC), and the National Snow and Ice Data Center (NSIDC) in order to assess the complex internal byte layout of HDF files. Based on the results of this assessment, methods for producing a map of the layout of the HDF4 files held by NASA were prototyped using a markup-language-based HDF tool. The resulting maps allow a separate program to read the file without recourse to the HDF application programming interface (API). Data products selected for the study, and a table summarizing the results, are available via HTTPS.
The ECOsystem Spaceborne Thermal Radiometer Experiment on Space Station (ECOSTRESS) mission measures the temperature of plants to better understand how much water plants need and how they respond to stress. ECOSTRESS is attached to the International Space Station (ISS) and collects data globally between 52° N and 52° S latitudes. A map of the acquisition coverage can be found in Figure 2 on the ECOSTRESS website.
The ECOSTRESS Tiled Surface Energy Balance Instantaneous L3 Global 70 m (ECO_L3T_SEB) Version 2 data product provides estimated incoming surface radiation (Rg) and net radiation (Rn) aligned with each daytime ECOSTRESS overpass. The Rg was generated using the Forest Light Environmental Simulator (FLiES) radiative transfer model implemented in an artificial neural network using Cloud Optical Thickness (COT) and Aerosol Optical Thickness (AOT) from Goddard Earth Observing System Version 5 (GEOS-5) Forward Processing (FP) along with albedo from ECOSTRESS Tiled Ancillary NDVI and Albedo Level 2 Global 70 m (ECO_L2T_STARS) Version 2 as variables. The Rg output from the FLiES model was bias corrected to Rg from GEOS-FP. The Rn is an output from the Breathing Earth System Simulator (BESS) algorithm. This data product is tiled using a modified version of the Military Grid Reference System (MGRS), which divides Universal Transverse Mercator (UTM) zones into square tiles that are 109.8 km by 109.8 km with a 70 meter (m) spatial resolution.
The ECO_L3T_SEB Version 2 data product is provided in Cloud Optimized GeoTIFF (COG) format with each data layer distributed as a separate COG. This product contains four layers including Rg, Rn, cloud mask, and water mask.
Known Issues
Data acquisition gap: ECOSTRESS was launched on June 29, 2018, and moved to autonomous science operations on August 20, 2018, following a successful in-orbit checkout period. On September 29, 2018, ECOSTRESS experienced an anomaly with its primary mass storage unit (MSU). 
ECOSTRESS has a primary and secondary MSU (A and B). On December 5, 2018, the instrument was switched to the secondary MSU and science operations resumed. On March 14, 2019, the secondary MSU experienced a similar anomaly, temporarily halting science acquisitions. On May 15, 2019, a new data acquisition approach was implemented, and science acquisitions resumed. To optimize the new acquisition approach TIR bands 2, 4, and 5 are being downloaded. The data products are as previously, except the bands not downloaded contain fill values (L1 radiance and L2 emissivity). This approach was implemented from May 15, 2019, through April 28, 2023. Data acquisition gap: From February 8 to February 16, 2020, an ECOSTRESS instrument issue resulted in a data anomaly that created striping in band 4 (10.5 micron). These data products have been reprocessed and are available for download. No ECOSTRESS data were acquired on February 17, 2020, due to the instrument being in SAFEHOLD. Data acquired following the anomaly have not been affected. Data acquisition: ECOSTRESS has now successfully returned to 5-band mode after being in 3-band mode since 2019. This feature was successfully enabled following a Data Processing Unit firmware update (version 4.1) to the payload on April 28, 2023. To better balance contiguous science data scene variables, 3-band collection is currently being interleaved with 5-band acquisitions over the orbital day/night periods. Missing Cloud Layer Alert: All users of ECOSTRESS Tiled and Gridded L3 Soil Moisture and Surface Energy Balance v002 products (ECO_L3T_SM, ECO_L3G_SM, ECO_L3T_SEB and ECO_L3G_SEB) should be aware that the ‘cloud mask’ layer may be unavailable for a select number of granules for the year 2023. Users are encouraged to get that information from the corresponding Level 2 Standard Cloud Mask products (ECO_L2_CLOUD and ECO_L2G_CLOUD) to assess if a pixel is clear or cloudy (see section 3 of the User Guide). 
Solar Array Obstruction: Some ECOSTRESS scenes may be affected by solar array obstructions from the International Space Station (ISS), potentially impacting data quality of obstructed pixels. The 'FieldOfViewObstruction' metadata field is included in all Version 2 products to indicate possible obstructions: * Before October 24, 2024 (orbits prior to 35724): The field is present but was not populated and does not reliably identify affected scenes. * On or after October 24, 2024 (starting with orbit 35724): The field is populated and generally accurate, except for late December 2024, when a temporary processing error may have caused false positives. * A list of scenes confirmed to be affected by obstructions is available and is recommended for verifying historical data (before October 24, 2024) and scenes from late December 2024. The ISS native pointing information is coarse relative to ECOSTRESS pixels, so ECOSTRESS geolocation is improved through image matching with a basemap. Metadata in the L1B_GEO file shows the success of this geolocation improvement, using categorizations "best", "good", "suspect", and "poor". We recommend that users use only "best" and "good" scenes for evaluations where geolocation is important (e.g., comparison to field sites). For some scenes, this metadata is not reflected in the higher-level products (e.g., land surface temperature, evapotranspiration, etc.). While this metadata is always available in the geolocation product, to save users additional download, we have produced a summary text file that includes the geolocation quality flags for all scenes from launch to present. At a later date, all higher-level products will reflect the geolocation quality flag correctly (the field name is GeolocationAccuracyQA).
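The recommendation to keep only "best" and "good" scenes can be sketched as a simple filter. The record layout below (a list of dictionaries with a scene identifier) is hypothetical; only the field name GeolocationAccuracyQA and the four category labels come from the description above, and the format of the summary text file is not assumed here.

```python
TRUSTED_CATEGORIES = {"best", "good"}


def scenes_with_trusted_geolocation(scenes):
    """Keep scenes whose geolocation QA flag is 'best' or 'good'."""
    return [
        scene
        for scene in scenes
        if scene["GeolocationAccuracyQA"].strip().lower() in TRUSTED_CATEGORIES
    ]


# Hypothetical scene records for illustration only.
scenes = [
    {"scene": "example_001", "GeolocationAccuracyQA": "best"},
    {"scene": "example_002", "GeolocationAccuracyQA": "suspect"},
    {"scene": "example_003", "GeolocationAccuracyQA": "Good"},
    {"scene": "example_004", "GeolocationAccuracyQA": "poor"},
]
trusted = scenes_with_trusted_geolocation(scenes)
trusted_ids = [scene["scene"] for scene in trusted]
```

Lower-casing the flag before comparison is a defensive choice in case capitalization differs between products.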
UCSD Anomaly Detection Dataset
The UCSD Anomaly Detection Dataset was acquired with a stationary camera mounted at an elevation, overlooking pedestrian walkways. The crowd density in the walkways was variable, ranging from sparse to very crowded. In the normal setting, the video contains only pedestrians. Abnormal events are due to either:
the circulation of non-pedestrian entities in the walkways
anomalous pedestrian motion patterns
Commonly occurring anomalies include bikers, skaters, small carts, and people walking across a walkway or in the grass that surrounds it. A few instances of people in wheelchairs were also recorded. All abnormalities are naturally occurring, i.e. they were not staged for the purposes of assembling the dataset. The data was split into 2 subsets, each corresponding to a different scene. The video footage recorded from each scene was split into various clips of around 200 frames.
Peds1: clips of groups of people walking towards and away from the camera, and some amount of perspective distortion. Contains 34 training video samples and 36 testing video samples.
Peds2: scenes with pedestrian movement parallel to the camera plane. Contains 16 training video samples and 12 testing video samples.
For each clip, the ground truth annotation includes a binary flag per frame, indicating whether an anomaly is present at that frame. In addition, a subset of 10 clips for Peds1 and 12 clips for Peds2 are provided with manually generated pixel-level binary masks, which identify the regions containing anomalies. This is intended to enable the evaluation of performance with respect to the ability of algorithms to localize anomalies.
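The per-frame binary flags lend themselves to a frame-level evaluation. The sketch below computes the true positive and false positive rates of a detector at a single threshold; it is one common way to use such annotations, not necessarily the protocol used by the dataset authors, and the scores and labels are made up for illustration.

```python
def frame_level_rates(scores, labels, threshold):
    """Return (true positive rate, false positive rate) over the frames.

    scores: per-frame anomaly scores; labels: per-frame flags (1 = anomaly).
    """
    predicted = [score >= threshold for score in scores]
    true_pos = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
    false_pos = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
    positives = sum(labels)
    negatives = len(labels) - positives
    tpr = true_pos / positives if positives else 0.0
    fpr = false_pos / negatives if negatives else 0.0
    return tpr, fpr


# Made-up scores for a five-frame clip with anomalies in the last three frames.
scores = [0.1, 0.4, 0.8, 0.9, 0.3]
labels = [0, 0, 1, 1, 1]
tpr, fpr = frame_level_rates(scores, labels, threshold=0.5)
```

Sweeping the threshold and collecting the resulting pairs traces out a ROC curve for the clip.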
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the first release of the Global Ensemble Digital Terrain Model (GEDTM30). Use for testing purposes only. A publication describing the methods used has been submitted to PeerJ and is currently under review. This work was funded by the European Union. However, the views and opinions expressed are solely those of the author(s) and do not necessarily reflect those of the European Union or the European Commission. Neither the European Union nor the granting authority can be held responsible for them. The data is provided "as is." The Open-Earth-Monitor project consortium, along with its suppliers and licensors, hereby disclaims all warranties of any kind, express or implied, including, without limitation, warranties of merchantability, fitness for a particular purpose, and non-infringement. Neither the Open-Earth-Monitor project consortium nor its suppliers and licensors make any warranty that the website will be error-free or that access to it will be continuous or uninterrupted. You understand that you download or otherwise obtain content or services from the website at your own discretion and risk.
The dataset is currently produced by resampling to 0.00025 degrees, which corresponds to 0.9 arc-seconds rather than 1 arc-second. We are currently working on producing a 0.000277778 degree GEDTM30 to resolve the confusion. See the issue details here.
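The resolution mismatch follows directly from the unit conversion (1 degree = 3600 arc-seconds). A short check, with illustrative names:

```python
ARCSEC_PER_DEGREE = 3600


def degrees_to_arcsec(degrees):
    """Convert an angular pixel size from degrees to arc-seconds."""
    return degrees * ARCSEC_PER_DEGREE


current_arcsec = degrees_to_arcsec(0.00025)      # pixel size of this release
planned_arcsec = degrees_to_arcsec(0.000277778)  # pixel size of the planned grid
print(f"{current_arcsec:.3f} arcsec vs {planned_arcsec:.3f} arcsec")
```

So the 0.00025 degree grid is 10% finer than a true 1 arc-second grid, while 0.000277778 degrees is 1/3600 of a degree to within rounding.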
GEDTM30 is presented as a 1-arc-second (~30m) global Digital Terrain Model (DTM) generated using machine-learning-based data fusion. It was trained using a global-to-local Random Forest model with ICESat-2 and GEDI data, incorporating almost 30 billion high-quality points. To see the documentation, please visit our GEDTM30 GitHub(https://github.com/openlandmap/GEDTM30).
This dataset covers the entire world and can be used for applications such as topography, hydrology, and geomorphometry analysis.
This dataset includes:
Due to Zenodo's storage limitations, the original GEDTM30 dataset and its standard deviation map are provided via external links:
| Layer | Scale | Data Type | No Data |
|---|---|---|---|
| Ensemble Digital Terrain Model | 10 | Int32 | -2,147,483,647 |
| Standard Deviation EDTM | 100 | UInt16 | 65,535 |
| Global-to-local mask | 1 | Byte | 255 |
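A minimal sketch of how the table might be applied when reading pixel values. It assumes that the Scale column is the factor by which real values were multiplied before being stored as integers, so that dividing recovers them; this interpretation is not confirmed here and should be checked against the GEDTM30 documentation. The layer keys and function name are hypothetical.

```python
# Values copied from the table above; the dictionary keys are illustrative.
LAYER_SPECS = {
    "edtm": {"scale": 10, "nodata": -2147483647},
    "std": {"scale": 100, "nodata": 65535},
    "mask": {"scale": 1, "nodata": 255},
}


def decode_pixel(stored_value, layer):
    """Return the rescaled value, or None for the layer's no-data code.

    Assumes stored = real * scale, which is an unverified reading of the table.
    """
    spec = LAYER_SPECS[layer]
    if stored_value == spec["nodata"]:
        return None
    return stored_value / spec["scale"]


elevation = decode_pixel(12345, "edtm")
missing = decode_pixel(-2147483647, "edtm")
spread = decode_pixel(250, "std")
```

Checking the no-data code before rescaling avoids turning sentinel values into very large or very small fake measurements.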
The primary development of GEDTM30 is documented in the GEDTM30 GitHub repository (https://github.com/openlandmap/GEDTM30). The current version (v1) code is compressed and uploaded as GEDTM30-main.zip. To access the latest development, please visit our GitHub page.
If you discover a bug, artifact, or inconsistency, or if you have a question, please raise a GitHub issue here.
To ensure consistency and ease of use across and within the projects, we follow the standard Ai4SoilHealth and Open-Earth-Monitor file-naming convention. The convention works with 10 fields that describe important properties of the data. In this way users can search files, prepare data analysis etc, without needing to open files.
For example, for gedtm_rf_m_120m_s_20060101_20151231_go_epsg.4326.3855_v20250611.tif, the fields are:
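As a rough illustration of the convention, and assuming the fields are separated by underscores, the example file name can be split into its 10 positional fields as follows. The function name is hypothetical, and no meaning is assigned to individual fields here.

```python
def split_filename_fields(filename):
    """Split a file name into underscore-separated fields, dropping the extension."""
    stem = filename.rsplit(".", 1)[0]  # remove the trailing ".tif"
    return stem.split("_")


example = "gedtm_rf_m_120m_s_20060101_20151231_go_epsg.4326.3855_v20250611.tif"
fields = split_filename_fields(example)
for position, value in enumerate(fields, start=1):
    print(position, value)
```

Because the dots inside the ninth field are kept intact, only the final extension is stripped before splitting.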
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
A new version is available at https://doi.org/10.57760/sciencedb.11805
GEOSatDB is a semantic representation of Earth observation satellites and sensors that can be used to easily discover available Earth observation resources for specific research objectives.
Relevant Papers
Ming Lin, Meng Jin, Juanzi Li & Yuqi Bai (2024) GEOSatDB: global civil earth observation satellite semantic database, Big Earth Data, DOI: 10.1080/20964471.2024.2331992
Background
The widespread availability of coordinated and publicly accessible Earth observation (EO) data empowers decision-makers worldwide to comprehend global challenges and develop more effective policies. Space-based satellite remote sensing, which serves as the primary tool for EO, provides essential information about the Earth and its environment by measuring various geophysical variables. This contributes significantly to our understanding of the fundamental Earth system and the impact of human activities.
Over the past few decades, many countries and organizations have markedly improved their regional and global EO capabilities by deploying a variety of advanced remote sensing satellites. The rapid growth of EO satellites and advances in on-board sensors have significantly enhanced remote sensing data quality by expanding spectral bands and increasing spatio-temporal resolutions. However, users face challenges in accessing available EO resources, which are often maintained independently by various nations, organizations, or companies. As a result, a substantial portion of archived EO satellite resources remains underutilized. Enhancing the discoverability of EO satellites and sensors can help make effective use of the vast amount of EO resources that continue to accumulate at a rapid pace, thereby better supporting global change research.
Methodology
This study introduces GEOSatDB, a comprehensive semantic database specifically tailored for civil Earth observation satellites. The foundation of the database is an ontology model conforming to standards set by the International Organization for Standardization (ISO) and the World Wide Web Consortium (W3C). This conformity enables data integration and promotes the reuse of accumulated knowledge. Our approach advocates a novel method for integrating Earth observation satellite information from diverse sources. It notably incorporates a structured prompt strategy utilizing a large language model to derive detailed sensor information from vast volumes of unstructured text.
Dataset Information
The downloadable files in RDF Turtle format are located in the data directory and contain a total of 130,134 statements:
GEOSatDB_ontology.ttl: Ontology modeling of concepts, relations, and properties.
satellite.ttl: 2,365 Earth observation satellites and their associated entities.
sensor.ttl: 1,021 Earth observation sensors and their associated entities.
sensor2satellite.ttl: relations between Earth observation satellites and sensors.
In addition, a user-friendly portal is under development to facilitate easy access to GEOSatDB. The portal currently offers preliminary SPARQL query functionality, enabling the execution of SPARQL query examples.
GEOSatDB undergoes quarterly updates, involving the addition of new satellites and sensors, revisions based on expert feedback, and the implementation of additional enhancements.
This map comes from a preliminary release of the Gridded Population of the World, Version 4 (GPWv4). GPWv4 is a gridded data product that depicts global population data from the 2010 round of Population and Housing Censuses at a scale and extent sufficient to demonstrate the spatial relationship between human populations and the environment across the globe. This population grid provides globally-consistent and spatially-explicit data for use in research, policy making, and communications and is compatible with data sets from social, economic, and Earth science fields.
GPWv4 is constructed from national or subnational input areal units of varying resolutions. The native grid cell size is 30 arc-seconds, or ~1 km at the equator. Separate grids are available for population count, population density, estimated land area, and data quality indicators.
The full GPWv4 data collection will consist of population estimates for the years 2000, 2005, 2010, 2015, and 2020, and will include grids for estimates of total population, age, sex, and urban/rural status. However, this preliminary release consists only of total population estimates for the year 2010. This data is being released now to allow users early access to the population grids.
Source: Columbia University, CIESIN
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
"GPWv4 is a gridded data product that depicts global population data from the 2010 round of Population and Housing Censuses. The Population Density, 2020 layer represents persons per square kilometer for year 2020.
Data Summary
GPWv4 is constructed from national or subnational input areal units of varying resolutions. The native grid cell size is 30 arc-seconds, or ~1 km at the equator. Separate grids are available for population count, population density, estimated land area, and data quality indicators, which include the water mask represented in this service. Population estimates are derived by extrapolating the raw census counts to estimates for the 2010 target year. The development of GPWv4 builds upon previous versions of the data set (Tobler et al., 1997; Deichmann et al., 2001; Balk et al., 2006).
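The "~1 km at the equator" figure can be checked with a short calculation. The sketch below uses a spherical approximation with the WGS84 equatorial radius; the function name and the choice of a sphere are simplifications made for illustration and are not part of the GPWv4 documentation.

```python
import math

EARTH_RADIUS_KM = 6378.137  # WGS84 equatorial radius, used as a sphere radius


def cell_width_km(latitude_deg, cell_arcsec=30):
    """East-west width of a grid cell at a given latitude (spherical approximation)."""
    km_per_degree = 2 * math.pi * EARTH_RADIUS_KM / 360
    cell_degrees = cell_arcsec / 3600
    return km_per_degree * cell_degrees * math.cos(math.radians(latitude_deg))


width_equator = cell_width_km(0)
width_60n = cell_width_km(60)
print(f"{width_equator:.3f} km at the equator, {width_60n:.3f} km at 60 degrees")
```

The east-west width shrinks with the cosine of latitude, so a 30 arc-second cell is roughly 0.93 km wide at the equator and about half that at 60 degrees.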
The full GPWv4 data collection will consist of population estimates for the years 2000, 2005, 2010, 2015, and 2020, and will include grids for estimates of total population, age, sex, and urban/rural status. However, this release consists only of total population estimates for the year 2020. This data is being released now to allow users access to the population grids.
Recommended Citation: Center for International Earth Science Information Network - CIESIN - Columbia University. 2016. Gridded Population of the World, Version 4 (GPWv4): Population Density. Palisades, NY: NASA Socioeconomic Data and Applications Center (SEDAC). http://dx.doi.org/10.7927/H4NP22DQ. Accessed DAY MONTH YEAR"
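The statement that a 30 arc-second cell is roughly 1 km at the equator can be checked with a short calculation. The sketch below is illustrative only: it assumes a spherical Earth with a mean radius of 6371.0088 km, and the function names are hypothetical rather than part of any GPWv4 tooling. It also shows how the east-west width of a cell shrinks with latitude.

```python
import math

EARTH_RADIUS_KM = 6371.0088   # mean Earth radius; spherical approximation (assumption)
CELL_DEG = 30 / 3600          # 30 arc-seconds expressed in degrees


def cell_size_km(lat_deg):
    """Return (east-west width, north-south height) in km of a cell centred at lat_deg."""
    km_per_degree = math.pi * EARTH_RADIUS_KM / 180
    height = CELL_DEG * km_per_degree
    # Meridians converge towards the poles, so the width scales with cos(latitude).
    width = height * math.cos(math.radians(lat_deg))
    return width, height


def cell_area_km2(lat_deg):
    """Approximate cell area in square km (small-cell approximation)."""
    width, height = cell_size_km(lat_deg)
    return width * height


if __name__ == "__main__":
    for lat in (0, 30, 60):
        w, h = cell_size_km(lat)
        print(f"lat {lat:>2}: {w:.3f} km x {h:.3f} km, area {cell_area_km2(lat):.3f} km^2")
```

Because cell area varies with latitude, a population count grid and a population density grid are not interchangeable without an area grid such as the estimated land area layer mentioned above.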
Geotweet Archive v2.0
The Harvard Center for Geographic Analysis (CGA) maintains the Geotweet Archive, a global record of tweets spanning time, geography, and language. The primary purpose of the Archive is to make a comprehensive collection of geo-located tweets available to the academic research community. The Archive extends from 2010 to the present and is updated daily. The number of tweets in the collection totals approximately 10 billion, and it is stored on Harvard University’s High Performance Computing (HPC) cluster. The Harvard HPC supports many applications for working with big spatio-temporal datasets, including two geospatial tools recently deployed by the CGA: OmniSci Immerse and PostGIS.
The Geotweet Archive consists of tweets which carry two types of geospatial signature:
1) GPS-based longitude/latitude generated by the originating device
2) Place-name-centroid-based longitude/latitude from the bounding box provided by Twitter, based on the user-defined place designation (typically a town name).
Any tweet which carries one or both of these signatures is included in the Archive. Approximately 1-2% of all tweets contain such geographic coordinates (this percentage needs verification and may vary over time).
The current version of the Archive is Version 2.0. The original Version 1.0 archive began in 2012 as part of a project with Ben Lewis of CGA and then Harvard graduate student Todd Mostak, to develop a GPU-powered spatial database called GEOPS. GEOPS formed the basis for technology startup MapD Technologies, which is now OmniSci. OmniSci Immerse software now runs on Harvard’s High Performance Computing (HPC) environment to support interactive exploration and analytics with the Geotweet Archive and any other large datasets.
Version 2.0 of the archive represents the results of a merge between the CGA archive, and an archive developed by the Department of Geoinformatics at the University of Salzburg in Austria, as well as several other archives. Clemens Havas and Bernd Resch at University of Salzburg, and Devika Kakkar of Harvard CGA collaborated to deploy Version 2.0.
========================================================
Schema of Geotweet Archive v2.0
Field name----TYPE----Description
message_id----BIGINT----Tweet ID
tweet_date----TIMESTAMP----Date and time of tweet from Twitter (utc)
tweet_text----TEXT ENCODING----Text content of tweet
tags----TEXT ENCODING DICT----Tweet hashtags
tweet_lang----TEXT ENCODING DICT----Language that the tweet is in
source----TEXT ENCODING DICT----Operating system or application type used to create the tweet
place*----TEXT ENCODING NONE----The geographic place as defined by the user, usually a town name. A bounding box determined by Twitter based on this field, from which centroids (see longitude and latitude fields) and the spatial_error field are derived, and used when not overridden by a GPS coordinate. See Twitter tweet object for place.
retweets----SMALLINT----Number of retweets as of last time it was checked
tweet_favorites----SMALLINT----Now known as ‘likes’
photo_url----TEXT ENCODING DICT----URL of any image referenced
quoted_status_id----BIGINT----ID number for quote status
user_id----BIGINT----User ID number
user_name----TEXT ENCODING NONE----User name
user_location*----TEXT ENCODING NONE----User defined location, usually a city or town. See Twitter user object.
followers----SMALLINT----Followers as of the last time checked
friends----SMALLINT----Number of users followed by this user
user_favorites----INT----Number of topics the user is interested in
status----INT----Code for what user is doing as of last time it was checked
user_lang----TEXT ENCODING DICT----User defined language
latitude----FLOAT----Latitude from GPS or bounding box based on Place field
longitude----FLOAT----Longitude from GPS or bounding box based on Place field
data_source*----TEXT ENCODING DICT----The source crawler or dataset for the tweet
gps----TEXT ENCODING DICT----Flag for whether lon/lat is from GPS or town name bounding box (SRID – 4326). When both are present, the GPS coordinate takes priority.
spatialerror----FLOAT----Estimate in meters horizontal error for lon/lat coordinate. 10m for GPS coordinates, error for bounding boxes calculated as radius of circle with area of bounding box.
=====================================================
*data_source----Code
U. Salzburg REST API crawler----1
Harvard CGA streaming crawler----2
U. Salzburg streaming API crawler----3
Ryan Qi Wang and Harvard Medical School datasets----4
U. Heidelberg dataset----5
Archive.org dataset----6
----------------------------------------------------------------------------------------------
Note: Before April of 2015 the default for GPS coordinate capture was turned on for Twitter users. After this date users have had to opt-in to share their precise location. This is one reason for the large decrease in volume of geotweets after this date. A number of automated...
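The schema describes two rules that are easy to misread in prose: a GPS coordinate takes priority over a place bounding box, and the spatial error is 10 m for GPS points or, for bounding boxes, the radius of a circle with the same area as the box. The following is a minimal sketch of those two rules, not the Archive's actual processing code. It assumes a spherical Earth, a bounding box given as (west, south, east, north) in degrees that does not cross the antimeridian, and a centroid taken as the box midpoint; all function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_008.8   # mean Earth radius in metres (spherical assumption)
GPS_ERROR_M = 10.0             # value stated in the schema for GPS coordinates


def bbox_spatial_error_m(west, south, east, north):
    """Radius in metres of a circle whose area equals that of the lon/lat bounding box."""
    # Area of a lon/lat rectangle on a sphere: R^2 * dlon * (sin(north) - sin(south)).
    dlon = math.radians(east - west)
    band = math.sin(math.radians(north)) - math.sin(math.radians(south))
    area_m2 = EARTH_RADIUS_M ** 2 * dlon * band
    return math.sqrt(area_m2 / math.pi)


def tweet_position(gps, bbox):
    """Return (longitude, latitude, spatial_error_m, from_gps).

    gps  : (lon, lat) tuple or None
    bbox : (west, south, east, north) tuple or None
    """
    if gps is not None:
        # When both signatures are present, the GPS coordinate takes priority.
        lon, lat = gps
        return lon, lat, GPS_ERROR_M, True
    west, south, east, north = bbox
    lon = (west + east) / 2    # midpoint used as the centroid (assumption)
    lat = (south + north) / 2
    return lon, lat, bbox_spatial_error_m(west, south, east, north), False


if __name__ == "__main__":
    print(tweet_position((-71.1, 42.37), (-71.2, 42.3, -71.0, 42.4)))
    print(tweet_position(None, (0.0, -0.5, 1.0, 0.5)))
```

Filtering on the resulting error value is one way to separate precise GPS points from town-level centroids when the flag field alone is not enough.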
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains top of atmosphere (TOA) outgoing longwave radiation from the Geostationary Earth Radiation Budget (GERB) instrument on board the Meteosat-9 geostationary satellite, for the period from May 2007 until December 2012. In this dataset (labelled 'GERB-HR-ED01-1-1 rlut 1hrCM'), the data provided consist of monthly-mean diurnal cycles, with each day resolved into 1-hour means. Data are only available for eight months of each year (January, February, May, June, July, August, November and December), due to operational constraints of the GERB instrument.
This is version 1.1 of the product and contains significant improvements over the original version. These improvements include lower estimated uncertainties, because missing data have been filled where possible, and consequently greater availability of data within the period provided. Users are strongly encouraged to use this latest version of the products.
It has been produced in Obs4MIPs (Observations for Model Intercomparisons Project) format, as part of an activity to increase the use of GERB satellite observational data for the modelling and model analysis communities. This is not currently a standard GERB satellite instrument product, but does represent an effort on behalf of the GERB project team to identify a product that is appropriate for routine model evaluation. The data have been reprocessed and reformatted, utilising additional data sources where necessary, to create a product primarily intended for comparison with climate model output.
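A monthly-mean diurnal cycle resolved into 1-hour means, as described for this product, has 24 values per month: for each hour of the day, the mean over all days of that month. The sketch below illustrates that averaging on plain (timestamp, value) pairs. It is not the GERB processing chain, the function name is hypothetical, and the sample values are made up for the demonstration.

```python
from collections import defaultdict
from datetime import datetime


def monthly_mean_diurnal_cycle(records):
    """Average hourly values by hour of day within each calendar month.

    records : iterable of (datetime, value) pairs, one value per hourly mean
    returns : {(year, month): list of 24 means, None where an hour has no data}
    """
    sums = defaultdict(lambda: [0.0] * 24)
    counts = defaultdict(lambda: [0] * 24)
    for timestamp, value in records:
        key = (timestamp.year, timestamp.month)
        sums[key][timestamp.hour] += value
        counts[key][timestamp.hour] += 1
    return {
        key: [s / c if c else None for s, c in zip(sums[key], counts[key])]
        for key in sums
    }


if __name__ == "__main__":
    # Illustrative values only, not real GERB measurements.
    sample = [
        (datetime(2007, 5, 1, 0), 200.0),
        (datetime(2007, 5, 2, 0), 220.0),
        (datetime(2007, 5, 1, 12), 300.0),
    ]
    cycle = monthly_mean_diurnal_cycle(sample)
    print(cycle[(2007, 5)][0], cycle[(2007, 5)][12])
```

Hours with no contributing data are returned as None rather than zero, so that gaps are not mistaken for measurements.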
This data set, part of the National Aeronautics and Space Administration (NASA) Making Earth System Data Records for Use in Research Environments (MEaSUREs) program, offers users a 25 kilometer (km) daily record of surface/near-surface melting on the Greenland Ice Sheet. The presence of melting is determined from brightness temperature data acquired during a 34-year span by three satellite-borne microwave radiometers: the Scanning Multichannel Microwave Radiometer (SMMR), the Special Sensor Microwave/Imager (SSM/I), and the Special Sensor Microwave Imager/Sounder (SSMIS). This data set consists of daily files that report the presence of surface/near-surface melting in 25 km x 25 km grid cells spanning the Greenland Ice Sheet. The onset of melting is determined from satellite brightness temperature data acquired from 1 January 1979 through 31 December 2012.
Mote, T. L. 2014. MEaSUREs Greenland Surface Melt Daily 25km EASE-Grid 2.0, Version 1. Boulder, Colorado USA. NASA National Snow and Ice Data Center Distributed Active Archive Center. doi: https://doi.org/10.5067/MEASURES/CRYOSPHERE/nsidc-0533.001. 13 Nov 2020.
NEW GOES-19 Data!! On April 4, 2025 at 1500 UTC, the GOES-19 satellite will be declared the Operational GOES-East satellite. All products and services, including NODD, for GOES-East will transition to GOES-19 data at that time. GOES-19 will operate out of the GOES-East location of 75.2°W starting on April 1, 2025 and through the operational transition. Until the transition time and during the final stretch of Post Launch Product Testing (PLPT), GOES-19 products are considered non-operational regardless of their validation maturity level. Shortly following the transition of GOES-19 to GOES-East, all data distribution from GOES-16 will be turned off. GOES-16 will drift to the storage location at 104.7°W. GOES-19 data should begin flowing again on April 4th once this maneuver is complete.
NEW GOES 16 Reprocess Data!! The reprocessed GOES-16 ABI L1b data mitigates systematic data issues (including data gaps and image artifacts) seen in the Operational products, and improves the stability of both the radiometric and geometric calibration over the course of the entire mission life. These data were produced by recomputing the L1b radiance products from input raw L0 data using improved calibration algorithms and look-up tables, derived from data analysis of the NIST-traceable, on-board sources. In addition, the reprocessed data products contain enhancements to the L1b file format, including limb pixels and pixel timestamps, while maintaining compatibility with the operational products. The datasets currently available span the operational life of GOES-16 ABI, from early 2018 through the end of 2024. The Reprocessed L1b dataset shows improvement over the Operational L1b products but may still contain data gaps or discrepancies. Please provide feedback to Dan Lindsey (dan.lindsey@noaa.gov) and Gary Lin (guoqing.lin-1@nasa.gov). More information can be found in the GOES-R ABI Reprocess User Guide.
NOTICE: As of January 10th 2023, GOES-18 assumed the GOES-West position and all data files are deemed both operational and provisional, so no ‘preliminary, non-operational’ caveat is needed. GOES-17 is now offline and has been shifted to approximately 105 degrees West, where it will remain in on-orbit storage. GOES-17 data will no longer flow into the GOES-17 bucket. Operational GOES-West products can be found in the GOES-18 bucket.
GOES satellites (GOES-16, GOES-17, GOES-18 & GOES-19) provide continuous weather imagery and
monitoring of meteorological and space environment data across North America.
GOES satellites provide the kind of continuous monitoring necessary for
intensive data analysis. They hover continuously over one position on the surface.
The satellites orbit high enough to allow for a full-disc view of the Earth. Because
they stay above a fixed spot on the surface, they provide a constant vigil for the
atmospheric "triggers" for severe weather conditions such as tornadoes, flash floods,
hailstorms, and hurricanes. When these conditions develop, the GOES satellites are able
to monitor storm development and track their movements. SUVI products are available in both NetCDF and FITS formats.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
AusENDVI (Australian Empirical NDVI) is a monthly, 5-km gridded estimate of NDVI across Australia from 1982-2022. It is built by calibrating and harmonising NOAA's Climate Data Record AVHRR NDVI data to MODIS MCD43A4 NDVI using a gradient boosting ensemble decision tree method. Additionally, the datasets are gapfilled using a synthetic NDVI dataset. The methods are extensively described in an Earth System Science Data publication.
AusENDVI consists of several datasets. Each dataset has a description in the attributes of the NetCDF file that describes its provenance. The naming convention is "AusENDVI_.nc".
AusENDVI-clim_gapfilled_1982_2013. Calibrated and harmonised Climate Data Record AVHRR NDVI data from Jan. 1982 to Dec. 2013. This version of the dataset used climate data in the calibration and harmonisation process and has the best agreement statistics with MODIS MCD43A4 NDVI. The dataset has been gap filled using the methods described in the accompanying publication.
AusENDVI-clim_MCD43A4_gapfilled_1982_2022. This dataset consists of calibrated and harmonised NOAA Climate Data Record AVHRR NDVI data from Jan. 1982 to Feb. 2000, joined with MODIS-MCD43A4 NDVI data from Mar. 2000 to Dec. 2022. This version of the dataset used climate data in the calibration and harmonisation process. The dataset has been gapfilled using the methods described in the accompanying publication.
AusENDVI-noclim_1982_2013. Calibrated and harmonised Climate Data Record AVHRR NDVI data from Jan. 1982 to Dec. 2013. This version of the dataset did not use climate data in the calibration and harmonisation process and the dataset has not been gap filled.
AusENDVI-synthetic_1982_2022. This dataset consists of synthetic NDVI data that was built by training a model on the joined AusENDVI-clim and MODIS-MCD43A4 NDVI timeseries using climate, woody-cover-fraction, and atmospheric CO2 as predictors. The synthetic NDVI is used for gap filling.
All datasets are in 'EPSG:4326' projection, and have a spatial resolution of 0.05 degrees. Geographic coordinate information is contained in the spatial_ref variable.
A Jupyter Notebook is also provided that shows how to load, plot, QC mask, reproject, and gap-fill AusENDVI datasets. The notebook is effectively a 'readme' file.
The notebook is also available to view/download here
An open-source github repository details the methods used to create these datasets
https://github.com/cbur24/AusENDVI
A few small changes to the datasets were implemented in version 0.2.0:
All datasets now have their values clipped to the range 0-1
The AusENDVI-clim dataset is now gapfilled, and includes a QC layer
The merged AusENDVI-noclim_MCD43A4_1982_2022 dataset was removed to simplify the number of datasets included in the repository. Users who want to join the 'noclim' and MODIS datasets can do so by clipping out MCD43A4 from the AusENDVI-clim_MCD43A4_gapfilled_1982_2022 dataset.
The accompanying Jupyter Notebook 'readme' has been updated.
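The version notes mention that users can rebuild a joined 'noclim' plus MODIS series by clipping the MCD43A4 portion out of the merged dataset, and that values are clipped to the range 0-1. The sketch below illustrates that logic for a single pixel using plain Python dictionaries, so it runs without any NetCDF library. It is not taken from the AusENDVI notebook: the function names are hypothetical, gaps are assumed to be stored as None, and the March 2000 switch-over date is taken from the dataset description above.

```python
from datetime import date

# First MCD43A4 month in the merged dataset, per the dataset description.
MODIS_START = date(2000, 3, 1)


def clip_ndvi(value):
    """Clip an NDVI value to the range 0-1; gaps (None) pass through unchanged."""
    if value is None:
        return None
    return min(1.0, max(0.0, value))


def join_noclim_with_modis(noclim, merged):
    """Join the 'noclim' AVHRR series with the MODIS part of the merged series.

    noclim, merged : dicts mapping a month-start date to NDVI for one pixel
    returns        : dict sorted by date, AVHRR before MODIS_START, MODIS from it
    """
    avhrr_part = {t: v for t, v in noclim.items() if t < MODIS_START}
    modis_part = {t: v for t, v in merged.items() if t >= MODIS_START}
    joined = {**avhrr_part, **modis_part}
    return {t: clip_ndvi(v) for t, v in sorted(joined.items())}


if __name__ == "__main__":
    # Illustrative values only.
    noclim = {date(2000, 1, 1): 0.30, date(2000, 2, 1): None, date(2000, 3, 1): 0.90}
    merged = {date(2000, 2, 1): 0.50, date(2000, 3, 1): 0.40, date(2000, 4, 1): 1.20}
    for month, ndvi in join_noclim_with_modis(noclim, merged).items():
        print(month, ndvi)
```

With gridded files the same selection would normally be done along the time dimension of each array rather than per pixel; the dictionary form here only keeps the example self-contained.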
This dataset contains operational near-real-time Level 2 ocean surface wind vector retrievals from the Advanced Scatterometer (ASCAT) on MetOp-B at 25 km sampling resolution (note: the effective resolution is 50 km). It is a product of the European Organization for the Exploitation of Meteorological Satellites (EUMETSAT) Ocean and Sea Ice Satellite Application Facility (OSI SAF) provided through the Royal Netherlands Meteorological Institute (KNMI). The wind vector retrievals are currently processed using the CMOD.n geophysical model function using a Hamming filter to spatially average the Sigma-0 data in the ASCAT L1B data. Each file is provided in netCDF version 3 format, and contains one full orbit derived from 3-minute orbit granules. Latency is approximately 2 hours from the latest measurement. The beginning of the orbit is defined by the first wind vector cell measurement within the first 3-minute orbit granule that starts north of the Equator in the ascending node. ASCAT is a C-band dual swath fan beam radar scatterometer providing two independent swaths of backscatter retrievals in sun-synchronous polar orbit aboard the MetOp-B platform. For more information on the MetOp-B mission, please visit: https://www.eumetsat.int/our-satellites/metop-series . For more timely announcements, users are encouraged to register with the KNMI scatterometer email list: scat@knmi.nl. Users are also highly advised to check the dataset user guide periodically for updates and new information on known problems and issues. All intellectual property rights of the OSI SAF products belong to EUMETSAT. The use of these products is granted to every interested user, free of charge. If you wish to use these products, EUMETSAT's copyright credit must be shown by displaying the words "copyright (year) EUMETSAT" on each of the products used.
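The description says a Hamming filter is used to spatially average the Sigma-0 data. As a rough illustration of what Hamming-weighted averaging means, the sketch below applies the standard Hamming window, w(i) = 0.54 - 0.46 cos(2πi/(N-1)), to a one-dimensional run of samples. This is a generic textbook example, not the OSI SAF or KNMI processing: the real filter operates on two-dimensional L1B data with its own footprint, the function names are hypothetical, and the samples are assumed to be in linear (not dB) units.

```python
import math


def hamming_weights(n):
    """Return the n coefficients of the standard Hamming window."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def hamming_average(samples):
    """Weighted mean of samples, with weights peaking at the centre sample."""
    weights = hamming_weights(len(samples))
    total = sum(w * s for w, s in zip(weights, samples))
    return total / sum(weights)


if __name__ == "__main__":
    print([round(w, 2) for w in hamming_weights(5)])
    # Samples near the centre dominate; those at the edges contribute little.
    print(hamming_average([0.0, 0.0, 1.0, 0.0, 0.0]))
```

Weighting of this kind explains how a product sampled every 25 km can have a coarser effective resolution: each cell value blends contributions from a wider neighbourhood.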
The Feature Layer made available to the Living Atlas has been adapted from the 625k Geology dataset freely available from the BGS website. The attribution and labels of the geological areas (or polygons) have been simplified to make the data more accessible to a wider audience. The dataset is aimed at students with an interest in Earth Sciences and amateur geologists who want to find out more. The LEX_RCS & LEX_ROCK codes have been preserved to allow users to reference the layers to the 625k Geology Dataset.
About BGS Geology 625k:
BGS Geology 625k is a generalised digital geological map dataset based on BGS’s published poster maps of the UK (north and south). Bedrock-related themes were created by generalisation of 1:50 000 data to make the 2007 fifth edition bedrock geology map. Superficial geology-related themes were digitised from the 1977 first edition Quaternary map (north and south). Many BGS geology maps are now available digitally. The Digital Geological Map of Great Britain project (formerly known as DiGMapGB) has prepared 1:625 000, 1:250 000, 1:50 000 and 1:10 000-scale datasets for England, Wales, and Scotland. Work continues to upgrade these. Geological maps are often the foundation for many other earth science-related maps and are of potential use to a wide range of end users. This dataset uses the themes:
Bedrock Geology
Superficial Geology
Linear features (faults)
More information on the BGS 625k Geology Dataset can be found on the BGS website. The 625k Geology data can also be viewed alongside other BGS datasets in the GeoIndex viewer. The currency of this data is August 2022. While there are no planned regular updates, BGS continuously reviews its data products and will release new versions of the BGS Geology 625k when available.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Roads, railways and utility easements are integral components of human society, allowing for the safe and efficient transport of people and goods. There are few places on earth that are not currently traversed or impacted by the vast networks of linear infrastructure. The ecological impacts of linear infrastructure and vehicles are numerous, diverse and, in most cases, deleterious. Recognition and amelioration of these impacts is becoming widespread around the world, and new roads and other linear infrastructure are increasingly planned to avoid high-quality areas and designed to minimise or mitigate the deleterious effects. Importantly, the negative effects of the existing infrastructure are also being reduced during routine maintenance and upgrade projects, as well as targeted retrofits to fix specific problem areas.
(1) Global road length, number of vehicles and rate of per capita travel are high and predicted to increase significantly over the next few decades.
(2) The ‘road-effect zone’ is a useful conceptual framework to quantify the negative ecological and environmental impacts of roads and traffic.
(3) The effects of roads and traffic on wildlife are numerous, varied and typically deleterious.
(4) The density and configuration of road networks are important considerations in road planning.
(5) The costs to society of wildlife-vehicle collisions can be high.
(6) The strategies of avoidance, minimisation, mitigation and offsetting are increasingly being adopted around the world – but it must be recognised that some impacts are unavoidable and unmitigable.
(7) Road ecology is an applied science which underpins the quantification and mitigation of road impacts.
The global rates of road construction and private vehicle ownership as well as travel demand will continue to rise for the foreseeable future, including at a rapid rate in many developing countries.
The challenge currently facing society is to build a more efficient transportation system that facilitates economic growth and development, reduces environmental impacts and protects biodiversity and ecosystem functions. The legacy of the decisions we make today and the roads and railways we construct tomorrow will be with us for many years to come.
https://creativecommons.org/publicdomain/zero/1.0/