description: These files contain the environmental data, as particular emissions or resources associated with BEA sectors, that are used in the USEEIO model. They are organized by emission or resource type, as described in the manuscript. The main files (without SI) show the final "satellite tables" in the 'Exchanges' sheet, which give emissions or resource use per USD for 2013. The other sheets in these files provide metadata for the creation of the tables, including general information, sources, etc. The 'export' sheet is used for saving the satellite table for CSV export. The data dictionary describes the fields in this sheet. The supporting files provide all the details of the data transformation and organization for the development of the satellite tables. This dataset is associated with the following publication: Yang, Y., W. Ingwersen, T. Hawkins, and D. Meyer. USEEIO: A new and transparent United States environmentally extended input-output model. JOURNAL OF CLEANER PRODUCTION. Elsevier Science Ltd, New York, NY, USA, 158: 308-318, (2017).
The Satellite dataset forms a practical vertical federated learning (VFL) scenario for location identification based on satellite imagery. Each area of interest (AOI), with its unique location identifier, is captured by 16 satellite visits. Assuming each visit is carried out by a distinct satellite organization, these organizations aim to collectively train a model to classify the land type of the location without sharing original images. The Satellite dataset encompasses four land types as labels, namely Amnesty POI (4.8%), ASMSpotter (8.9%), Landcover (61.3%), and UNHCR (25.0%), making the task a 4-class classification problem over 3927 locations, with 62,832 images across 16 parties.
This ZIP file comprises 32 CSV files, corresponding to training and testing datasets split at a ratio of 8:2. Each training and testing file contains 3,142 and 785 flattened images from a party, respectively.
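The counts quoted above can be cross-checked with a few lines of arithmetic. This is only a sketch: the rounding rule used for the 8:2 split is an assumption, chosen because it reproduces the stated file sizes.

```python
# Figures taken from the dataset description.
N_LOCATIONS = 3927
N_PARTIES = 16
TRAIN_FRACTION = 0.8  # 8:2 train/test split

# Assumption: the training share is rounded to the nearest whole image.
n_train = round(N_LOCATIONS * TRAIN_FRACTION)
n_test = N_LOCATIONS - n_train

# One training file and one testing file per party.
n_csv_files = 2 * N_PARTIES

# Every location is seen once by every party.
n_images = N_LOCATIONS * N_PARTIES

print(n_train, n_test, n_csv_files, n_images)
```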
This dataset consists of ground-based multi-GNSS (Global Navigation Satellite System) Broadcast Ephemeris Data. GNSS satellites provide autonomous geo-spatial positioning with global coverage. GNSS data sets from ground receivers at the CDDIS consist primarily of the data from the U.S. Global Positioning System (GPS) and the Russian GLObal NAvigation Satellite System (GLONASS). Since 2011, the CDDIS GNSS archive includes data from other GNSS (Europe's Galileo, China's Beidou, Japan's Quasi-Zenith Satellite System/QZSS, the Indian Regional Navigation Satellite System/IRNSS, and worldwide Satellite Based Augmentation Systems/SBASs), which are similar to the U.S. GPS in terms of the satellite constellation, orbits, and signal structure. Colleagues at the Technical University in Munich (TUM) and Deutsches Zentrum für Luft- und Raumfahrt (DLR) provide to the CDDIS a merged, multi-GNSS broadcast ephemeris file containing GPS, GLONASS, Galileo, BeiDou, QZSS, and SBAS ephemerides. The file is generated from real-time streams and contains all unique broadcast navigation messages for the day.
Full-rate data include all valid satellite returns and are thus larger in volume; these data are not routinely provided by all stations in the laser tracking network. Full-rate data are useful for both engineering evaluation and scientific applications (e.g., studying the performance of retroreflectors, discerning satellite signatures, understanding the statistical nature of satellite returns, calibration of satellite targets, validating system quality of laser station co-locations, etc.). Although many of these studies are of an engineering nature, the results have an important impact on the quality of the scientific output. Full-rate data are transmitted in daily files containing all data received in the previous 24-hour period. The CDDIS then updates monthly, satellite-specific files from these daily files. The summary files summarize the data passes of the monthly full-rate data files.
CRD format started testing in 2008 and became operational in January 2011. ILRS/CSTG formats were used for normal point data from 1976 through 2011.
These data are the Goddard Satellite-based Surface Turbulent Fluxes Version-3 (GSSTF3) Dataset recently produced through a MEaSUREs funded project led by Dr. Chung-Lin Shie (UMBC/GEST, NASA/GSFC), converted to HDF-EOS5 format. The stewardship of this HDF-EOS5 dataset is part of the MEaSUREs project. This suite of GSSTF version 3 products is the 0.25x0.25 deg resolution version of the GSSTF 2c collections. It does not, however, contain the "WB" variable, 'lowest 500-m precipitable water' (g/cm**2). This is the Daily (24-hour) product; data are projected to an equidistant grid that covers the globe at 0.25 degree cell size, resulting in data arrays of 1440x720 size. As in previous versions, the daily fluxes were first produced for each individual available SSM/I satellite tape (e.g., F08, F10, F11, F13, F14 and F15). Then, the Combined daily fluxes were produced by averaging (equally weighted) over the available flux data/files from the various satellites. These Combined daily flux data are considered the "final" GSSTF, Version 3, and are stored in this HDF-EOS5 collection. There is only one set of GSSTF, Version 3, Combined data: "Set1". The "individual" daily flux data files, produced for each individual satellite, are also available in HDF-EOS5, although from different collections: GSSTF_Fxx_3, where Fxx are the individual satellites (F08, F10, etc.). The input data sets used for this recent GSSTF production include upgraded and improved datasets such as the Special Sensor Microwave Imager (SSM/I) Version-6 (V6) product of brightness temperature [Tb], total precipitable water [W], and wind speed [U] produced by Wentz of Remote Sensing Systems (RSS), as well as the NCEP/DOE Reanalysis-2 (R2) product of sea skin temperature [SKT], 2-meter air temperature [Tair], and sea level pressure [SLP]. Relevant to this MEaSUREs project, these are converted to HDF-EOS5, and are stored in the GSSTF_NCEP_3 collection.
Please use these products with care and proper citations, i.e., properly indicating your applications with, e.g., "using the combined 2001 data file of Set1" or "using the 2001 F13 data file".

APPENDIX SET1
---------------
The following list summarizes individual satellites used to produce the Combined SET1.
(1) Y1987/: F08/ 1987/07-12: F08 (Note: 1987/12 is filled with missing value due to data scarcity)
(2) Y1988/: F08/ 1988/01-12: F08
(3) Y1989/: F08/ 1989/01-12: F08
(4) Y1990/: F08/ F10/ 1990/01-12: F08 (Note: F10 started in 1990/12, but N/A due to data scarcity)
(5) Y1991/: F08/ F10/ 1991/01-12: F08+F10
(6) Y1992/: F10/ F11/ 1992/01-12: F10+F11
(7) Y1993/: F10/ F11/ 1993/01-12: F10+F11
(8) Y1994/: F10/ F11/ 1994/01-12: F10+F11
(9) Y1995/: F10/ F11/ F13/ 1995/01-12:
    01-04: F10+F11
    05-12: F10+F11+F13
(10) Y1996/: F10/ F11/ F13/ 1996/01-12: F10+F11+F13
(11) Y1997/: F10/ F11/ F13/ F14/ 1997/01-12:
    01-04: F10+F11+F13
    05/01-11/14: F10+F11+F13+F14
    11/15-12/31: F11+F13+F14
(12) Y1998/: F11/ F13/ F14/ 1998/01-12: F11+F13+F14
(13) Y1999/: F11/ F13/ F14/ 1999/01-12: F11+F13+F14
(14) Y2000/: F11/ F13/ F14/ F15/ 2000/01-12:
    01/01-05/16: F11+F13+F14+F15
    05/17-12/31: F13+F14+F15
(15) Y2001/: F13/ F14/ F15/ 2001/01-12: F13+F14+F15
(16) Y2002/: F13/ F14/ F15/ 2002/01-12: F13+F14+F15
(17) Y2003/: F13/ F14/ F15/ 2003/01-12: F13+F14+F15
(18) Y2004/: F13/ F14/ F15/ 2004/01-12: F13+F14+F15
(19) Y2005/: F13/ F14/ F15/ 2005/01-12: F13+F14+F15
(20) Y2006/: F13/ F14/ F15/ 2006/01-12: F13+F14
(21) Y2007/: F13/ F14/ F15/ 2007/01-12: F13+F14
(22) Y2008/: F13/ F14/ F15/ 2008/01-12:
    01-07: F13+F14
    08-12: F13

Special notes:
(a) For Y2006, Y2007 and Y2008, the current Combined daily data files do not include the F15 Individual daily data files due to problematic calibration in F15. The Combined daily files will be updated for those three years once an improved set of Individual daily data files is produced using corrected and updated SSM/I F15 input files.
(b) The current Combined daily data files are produced with at most 4 combined satellites, i.e., F10, F11, F13 and F14 for May-Nov 1997, and F11, F13, F14 and F15 for Jan-May 2000.
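The grid size and the equally weighted combination described for GSSTF3 can be illustrated with a small sketch. It is not the project's production code: the use of NaN to mark missing cells, the per-cell handling of missing satellites, and the function name `combine_daily` are assumptions made for illustration only.

```python
import numpy as np

# A 0.25-degree global grid gives the 1440x720 array size quoted in the description.
CELL_DEG = 0.25
n_lon = int(360 / CELL_DEG)
n_lat = int(180 / CELL_DEG)

def combine_daily(fields):
    """Equally weighted mean over the satellites that have data in each cell.

    fields: list of 2-D arrays of identical shape, one per satellite,
    with np.nan marking missing cells (an assumption of this sketch).
    """
    stack = np.stack(fields)
    valid = ~np.isnan(stack)
    count = valid.sum(axis=0)
    total = np.where(valid, stack, 0.0).sum(axis=0)
    # Cells with no valid satellite stay missing.
    return np.where(count > 0, total / np.maximum(count, 1), np.nan)

# Tiny synthetic fields standing in for two individual-satellite daily files.
f13 = np.array([[1.0, np.nan], [3.0, np.nan]])
f14 = np.array([[3.0, 4.0], [np.nan, np.nan]])
combined = combine_daily([f13, f14])
print(n_lon, n_lat)
print(combined)
```

For the real files, each individual-satellite field would be a 1440x720 array read from the corresponding GSSTF_Fxx_3 collection, and the satellites passed in would follow the per-period list in the appendix.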
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
This repository contains the VME dataset (images and annotation files), as well as the scripts for constructing the CDSI dataset.
VME is a satellite imagery dataset built for vehicle detection in the Middle East. VME images (satellite_images folder) are under the CC BY-NC-ND 4.0 license (https://creativecommons.org/licenses/by-nc-nd/4.0/), whereas the remaining folders (annotations_HBB, annotations_OBB, CDSI_construction_scripts) are under the CC BY 4.0 license (https://creativecommons.org/licenses/by/4.0/).
VME_CDSI_datasets.zip has four components:
annotations_HBB, annotations_OBB, and CDSI_construction_scripts are available in our GitHub repository.
Please cite our dataset & paper with the preferred format as shown in the "Citation" section
@article{al-emadi_vme_2025,
title = {{VME: A Satellite Imagery Dataset and Benchmark for Detecting Vehicles in the Middle East and Beyond}},
volume = {12},
issn = {2052-4463},
url = {https://doi.org/10.1038/s41597-025-04567-y},
doi = {10.1038/s41597-025-04567-y},
pages = {500},
number = {1},
journal = {Scientific Data},
author = {Al-Emadi, Noora and Weber, Ingmar and Yang, Yin and Ofli, Ferda},
date = {2025-03-25},
publisher={Springer Nature},
year={2025}
}
Global Navigation Satellite System (GNSS) daily 30-second sampled data are available from the Crustal Dynamics Data Information System (CDDIS). GNSS satellites provide autonomous geo-spatial positioning with global coverage. GNSS data sets from ground receivers at the CDDIS consist primarily of the data from the U.S. Global Positioning System (GPS) and the Russian GLObal NAvigation Satellite System (GLONASS). Other GNSS (Europe’s Galileo, China’s Beidou, Japan’s Quasi-Zenith Satellite System/QZSS, the Indian Regional Navigation Satellite System/IRNSS, and worldwide Satellite Based Augmentation Systems/SBASs) are similar to the U.S. GPS in terms of the satellite constellation, orbits, and signal structure; CDDIS began archiving data from these systems in 2011. These data include hourly files of observation (30-second sampling), broadcast ephemeris, and meteorological messages in RINEX format, as well as other files (e.g., hourly meteorological data) from a global network of permanent ground-based receivers.
This dataset consists of ground-based Global Navigation Satellite System (GNSS) Observation Data (1-second sampling, sub-hourly files) from the NASA Crustal Dynamics Data Information System (CDDIS). GNSS provide autonomous geo-spatial positioning with global coverage. GNSS data sets from ground receivers at the CDDIS consist primarily of the data from the U.S. Global Positioning System (GPS) and the Russian GLObal NAvigation Satellite System (GLONASS). Since 2011, the CDDIS GNSS archive includes data from other GNSS (Europe’s Galileo, China’s Beidou, Japan’s Quasi-Zenith Satellite System/QZSS, the Indian Regional Navigation Satellite System/IRNSS, and worldwide Satellite Based Augmentation Systems/SBASs), which are similar to the U.S. GPS in terms of the satellite constellation, orbits, and signal structure. The sub-hourly GNSS observation files (un-compacted) contain 15 minutes of GPS or multi-GNSS observation (1-second sampling) data in RINEX format from a global permanent network of ground-based receivers, one file per 15 minutes per site. More information about these data is available on the CDDIS website at https://cddis.nasa.gov/Data_and_Derived_Products/GNSS/high-rate_data.html.
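As a rough guide to the volume implied by this layout, the following sketch derives the number of epochs per file and files per site per day from the sampling interval and file span stated above. It assumes every epoch in a 15-minute window is present, which real receivers will not always deliver.

```python
# Values stated in the description of the high-rate product.
SAMPLING_S = 1        # 1-second sampling
FILE_SPAN_MIN = 15    # one file per 15 minutes per site

# Assumption: no data gaps, so every second in the window has an epoch.
epochs_per_file = FILE_SPAN_MIN * 60 // SAMPLING_S
files_per_site_day = 24 * 60 // FILE_SPAN_MIN
epochs_per_site_day = epochs_per_file * files_per_site_day

print(epochs_per_file, files_per_site_day, epochs_per_site_day)
```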
NEW GOES-19 Data!! On April 4, 2025 at 1500 UTC, the GOES-19 satellite will be declared the Operational GOES-East satellite. All products and services, including NODD, for GOES-East will transition to GOES-19 data at that time. GOES-19 will operate out of the GOES-East location of 75.2°W starting on April 1, 2025 and through the operational transition. Until the transition time and during the final stretch of Post Launch Product Testing (PLPT), GOES-19 products are considered non-operational regardless of their validation maturity level. Shortly following the transition of GOES-19 to GOES-East, all data distribution from GOES-16 will be turned off. GOES-16 will drift to the storage location at 104.7°W. GOES-19 data should begin flowing again on April 4th once this maneuver is complete.
NEW GOES 16 Reprocess Data!! The reprocessed GOES-16 ABI L1b data mitigates systematic data issues (including data gaps and image artifacts) seen in the Operational products, and improves the stability of both the radiometric and geometric calibration over the course of the entire mission life. These data were produced by recomputing the L1b radiance products from input raw L0 data using improved calibration algorithms and look-up tables, derived from data analysis of the NIST-traceable, on-board sources. In addition, the reprocessed data products contain enhancements to the L1b file format, including limb pixels and pixel timestamps, while maintaining compatibility with the operational products. The datasets currently available span the operational life of GOES-16 ABI, from early 2018 through the end of 2024. The Reprocessed L1b dataset shows improvement over the Operational L1b products but may still contain data gaps or discrepancies. Please provide feedback to Dan Lindsey (dan.lindsey@noaa.gov) and Gary Lin (guoqing.lin-1@nasa.gov). More information can be found in the GOES-R ABI Reprocess User Guide.
NOTICE: As of January 10th 2023, GOES-18 assumed the GOES-West position and all data files are deemed both operational and provisional, so no ‘preliminary, non-operational’ caveat is needed. GOES-17 is now offline and has been shifted to approximately 105 degrees West, where it will remain in on-orbit storage. GOES-17 data will no longer flow into the GOES-17 bucket. Operational GOES-West products can be found in the GOES-18 bucket.
GOES satellites (GOES-16, GOES-17, GOES-18 & GOES-19) provide continuous weather imagery and
monitoring of meteorological and space environment data across North America.
GOES satellites provide the kind of continuous monitoring necessary for
intensive data analysis. They hover continuously over one position on the surface.
The satellites orbit high enough to allow for a full-disc view of the Earth. Because
they stay above a fixed spot on the surface, they provide a constant vigil for the
atmospheric "triggers" for severe weather conditions such as tornadoes, flash floods,
hailstorms, and hurricanes. When these conditions develop, the GOES satellites are able
to monitor storm development and track their movements. SUVI products are available in both NetCDF and FITS formats.
The ARGO ship classification dataset holds 1750 labelled images from PlanetScope-4-Band satellites. The dataset creation process and results on the dataset are published in the demo paper:
{CITE}
The imagery is provided as numpy binary files. All image data is licensed by Planet Labs PBC. The channel ordering is BGRN. The dataset is provided in two folders named "ship" and "non_ship". Those folders correspond to the original labels created during automated dataset creation. The files are numbered.
Two additional .csv files are provided. The shipsAIS_2017_Zone17.csv file holds the AIS information on the imagery contained in the ship folder. The data was retrieved from marinecadastre.gov.
During the experiments, errors emerged in the automatically created dataset; these are described further in the paper. The manual relabelling is supplied in the corrected_labels.csv file.
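Because the channel ordering is BGRN, most display tools need the bands reordered before use. The sketch below shows one way to do that. It assumes each array is stored channels-last with shape (height, width, 4), which the description does not state, and it uses a synthetic array in place of a real file; the commented path is hypothetical.

```python
import numpy as np

def bgrn_to_rgb(image):
    """Reorder a channels-last BGRN array (H, W, 4) to RGB (H, W, 3), dropping NIR."""
    if image.ndim != 3 or image.shape[-1] != 4:
        raise ValueError("expected an array of shape (height, width, 4)")
    # Index 2 is red, 1 is green, 0 is blue in BGRN order.
    return image[..., [2, 1, 0]]

# Synthetic stand-in for something like np.load("ship/0.npy") (hypothetical path).
demo = np.zeros((2, 2, 4), dtype=np.uint16)
demo[..., 0] = 10  # blue
demo[..., 1] = 20  # green
demo[..., 2] = 30  # red
demo[..., 3] = 40  # near-infrared

rgb = bgrn_to_rgb(demo)
print(rgb.shape)
```

If the files turn out to be channels-first, transpose them to (H, W, 4) before calling the function.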
https://www.cognitivemarketresearch.com/privacy-policy
According to Cognitive Market Research, the global commercial satellite imaging market size will be USD 10.21 billion in 2024 and will expand at a compound annual growth rate (CAGR) of 10.95% from 2024 to 2031.
• The global commercial satellite imaging market will expand at a CAGR of 10.95% between 2024 and 2031.
• North America held the major market of more than XX% of the global revenue with a market size of USD XX million in 2023 and will grow at a compound annual growth rate (CAGR) of XX% from 2024 to 2031.
• Europe accounted for a share of over XX% of the global market size of USD XX million.
• Asia Pacific held a market of around XX% of the global revenue with a market size of USD XX million in 2023 and will grow at a compound annual growth rate (CAGR) of XX% from 2024 to 2031.
• Latin America's market will have more than XX% of the global revenue with a market size of USD XX million in 2023 and will grow at a compound annual growth rate (CAGR) of XX% from 2024 to 2031.
• Middle East and Africa held the major market of around XX% of the global revenue with a market size of USD XX million in 2023 and will grow at a compound annual growth rate (CAGR) of XX% from 2024 to 2031.
• The Geospatial Data Acquisition segment is set to rise due to the need for evaluating a range of economic factors, such as farming methods, infrastructure, urbanization, and environmental effects. Governments and businesses in the private sector are also investing heavily in satellite imaging to obtain information on urban planning and natural resources.
• The commercial satellite imaging market is driven by the increasing use of satellite images for real-time data access in defense applications, government support, rising demand for high-resolution imaging for various end-use applications, and technological advancements leading to high-resolution satellite imaging.
• North America held the highest commercial satellite imaging market revenue share in 2023.
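The headline figures (USD 10.21 billion in 2024, 10.95% CAGR through 2031) imply a compound growth path that can be reproduced with the standard CAGR formula. The sketch assumes annual compounding from a 2024 base, which is the usual reading but is not spelled out in the report summary; `projected_size` is a hypothetical helper name.

```python
# Headline figures quoted from the report summary.
BASE_YEAR = 2024
BASE_SIZE_USD_BN = 10.21
CAGR = 0.1095

def projected_size(year):
    """Market size in USD billion under annual compounding from the base year."""
    return BASE_SIZE_USD_BN * (1 + CAGR) ** (year - BASE_YEAR)

size_2031 = projected_size(2031)
print(round(size_2031, 2))
```

Under these assumptions the 2031 figure is roughly double the 2024 base.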
Current Scenario of the Commercial Satellite Imaging Market
Key Drivers of the Commercial Satellite Imaging Market
Increasing the Use of Satellite Images for Real-Time Data Access in Defence Applications to Accelerate Market Growth
A comprehensive understanding of the Area of Interest (AOI), supported by satellite imagery, has become both an advantage and a necessity in today's asymmetric warfare. Digital Elevation Models (DEMs) and 3D models of rural and urban regions may be produced quickly and reliably with the help of Airbus' Pleiades Neo military satellite. To determine if a target is mobile or fixed, high-resolution photos are particularly helpful. Furthermore, assets or targets can be recognized, identified, and detected down to the finest detail due to the Very High Resolution (VHR). Additionally, reliable topography data from satellite imagery helps the armed forces plan ahead and gain a comprehensive understanding of the situation. For instance, according to a report published by the Government of India, Digital Video Broadcasting-Satellite Version 2 (DVB-S2) technology has been added to the satellite-based communication network to improve efficiency and make the best use of available spectrum. More than 785 DCPW, State/UT Police, and CAPF-updated VSATs have been deployed.
(Source: https://www.mha.gov.in/sites/default/files/AnnualreportEnglish_04102023.pdf)
According to a news report by Airbus, Poland and Airbus Defence and Space have signed a deal for the development, production, launch, and in-orbit delivery of two high-performance optical Earth observation satellites as part of a geospatial intelligence system. Moreover, the contract includes the provision of Very High Resolution (VHR) imagery.
Thus, the increasing use of satellite images for real-time data access in defense applications accelerates market growth.
Government Support Will Drive the Commercial Satellite Imaging Market
Governments throughout the world are realizing increasingly how important sate...
Citation: If using this dataset please cite the following in your work:
@misc{VotDasNemSri2010, author = "Petr Votava and Kamalika Das and Rama Nemani and Ashok N. Srivastava", year = "2010", title = "MODIS surface reflectance data repository", url = "https://c3.ndc.nasa.gov/dashlink/resources/331/", institution = "NASA Ames Research Center" }
Petr Votava, Kamalika Das, Rama Nemani, Ashok N. Srivastava. (2010). MODIS surface reflectance data repository. NASA Ames Research Center.

Data Description: The California satellite dataset using the MODerate-resolution Imaging Spectroradiometer (MODIS) product MCD43A4 provides reflectance data adjusted using a bidirectional reflectance distribution function (BRDF) to model the values as if they were taken from nadir view. Both Terra and Aqua data are used in the generation of this product, providing the highest probability for quality input data. More information at: https://lpdaac.usgs.gov/lpdaac/products/modis_products_table/nadir_brdf_adjusted_reflectance/16_day_l3_global_500m/v5/combined

Data Organization: The nine data folders correspond to three years of data. Under this top level directory structure are separate files for each band (1 - 7) and each 8-day period of the particular year. Within the period the best observations were selected for each location.

File Naming Conventions: Each of the files represents a 2D dataset with the naming convention as follows:
MCD43A4.CA_1KM.005.<YYYYDDD>.<BAND>.flt32
where <YYYYDDD> is the beginning year-day of the period, with YYYY = year and DDD = day of year (001 - 366), and <BAND> identifies the particular (spectral) band of the observations (band 1 - band 7). Since the indexing is 0-based, the range of indexes in the file names is 0 - 6 (where 0 = band 1, and 6 = band 7).

The spectral band wavelength ranges for the MODIS acquisitions are as follows:
BAND1 620 - 670 nm
BAND2 841 - 876 nm
BAND3 459 - 479 nm
BAND4 545 - 565 nm
BAND5 1230 - 1250 nm
BAND6 1628 - 1652 nm
BAND7 2105 - 2155 nm

File Specifications: Each file is a single 2D dataset.
DATA TYPE: 32-bit floating point (IEEE754) with little-endian byte ordering
NUMBER OF ROWS: 1203
NUMBER OF COLUMNS: 738
FILL VALUES (observations that are either not valid or not on land, such as ocean etc.): -999.0

Overview:
DATASET: MODIS 8-day Surface Reflectance BRDF-adjusted from Terra and Aqua
COLLECTION: 5
DATA TYPE: IEEE754 float (32-bit float)
BYTE ORDER: LITTLE ENDIAN (Intel)
DIMS: 1203 rows x 738 columns
FILL VALUE: -999.0
SPATIAL RESOLUTION: 1km
PROJECTION: Lambert Azimuthal Equal Area
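Given the file specifications above (little-endian 32-bit floats, 1203 rows by 738 columns, fill value -999.0), a single band file could be read along the following lines. This is a sketch only: row-major storage is an assumption, `read_flt32` is a hypothetical helper, and the demonstration writes a synthetic file rather than using real data.

```python
import os
import tempfile

import numpy as np

N_ROWS, N_COLS = 1203, 738
FILL_VALUE = -999.0

def read_flt32(path):
    """Read one band file into a masked array with fill values masked out."""
    data = np.fromfile(path, dtype="<f4")  # little-endian 32-bit float
    if data.size != N_ROWS * N_COLS:
        raise ValueError("unexpected number of values in file")
    # Assumption: values are stored row by row (row-major order).
    grid = data.reshape(N_ROWS, N_COLS)
    return np.ma.masked_equal(grid, FILL_VALUE)

# Demonstration with a synthetic file (the file name is hypothetical).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.flt32")
    synthetic = np.full((N_ROWS, N_COLS), 0.25, dtype="<f4")
    synthetic[0, 0] = FILL_VALUE
    synthetic.tofile(demo_path)
    band = read_flt32(demo_path)

print(band.shape, band.count())
```

Masking the fill value up front keeps ocean and invalid cells out of later statistics such as means or band ratios.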
A zip file containing two archives: one containing KML-formatted contour files, each corresponding to a similarly named KMZ SAR image in the other archive. Each entry in each archive indicates date, time, and satellite within the file name, e.g., the file 20100630120052_Radarsat1_contours.kml corresponds to a Radarsat-1 image taken June 30th, 2010 at 12:00:52 UTC, and is a "contours" file, i.e., a collection of contours indicating the areal extent of apparent surface oil within the image's field of view.
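The file-name convention described above can be decoded programmatically. The sketch below assumes every name follows the same timestamp_satellite_kind pattern as the one example given; `parse_contour_name` is a hypothetical helper, not part of the dataset.

```python
from datetime import datetime, timezone

def parse_contour_name(filename):
    """Split a name like 20100630120052_Radarsat1_contours.kml into its parts."""
    stem = filename.rsplit(".", 1)[0]
    stamp, satellite, kind = stem.split("_", 2)
    # The 14-digit prefix is read as YYYYMMDDhhmmss in UTC.
    acquired = datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return acquired, satellite, kind

acquired, satellite, kind = parse_contour_name("20100630120052_Radarsat1_contours.kml")
print(acquired.isoformat(), satellite, kind)
```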
This dataset contains the Marine Optical Buoy (MOBY) "gold" satellite sensor calibrated files. These are TXT ASCII files. The files have Lw and Es, Lwn, or Lw (and Lwn) using Modeled KL, which have been weighted by the specific satellite channels' relative spectral response. The MOBY project has included only data that is good or questionable. These MOBY gold files are specific to the depth pairs used to make the measurement: Lw1 (1 m), Lw2 (1 m), and Lw7 (5 m). Lw1 is the Water-Leaving Radiance calculated from LuTop and Kl1 (LuTop-LuMid). Lw2 is the Water-Leaving Radiance calculated from LuTop and Kl2 (LuTop-LuBot). Lw7 is the Water-Leaving Radiance calculated from LuMid and Kl3 (LuMid-LuBot). The time period covered by the MOBY "gold" satellite sensor calibrated dataset begins at 1997-07-29. The project is ongoing and continuous. More data is added as updates to the data record become available. MOBY is an autonomous radiometric buoy stationed in the waters off Lanai, Hawaii. MOBY has been the primary in-water oceanic observatory for the vicarious calibration of U.S. satellite ocean color sensors since 1997. The MOBY project data set has been evaluated by NOAA, NASA, and NIST for its accuracy and precision numerous times since 1997. The MOBY measurements are calibrated against SI standards before and after every deployment to ensure the data sets continue to provide accurate and precise data. The satellite ocean color vicarious calibration community uses the data to validate the satellite-measured radiance. MOBY was designed to measure sunlight both incident on and scattered out of the ocean. These measurements are provided in near real-time for the vicarious calibration procedures conducted by ocean color scientists. It is a NOAA-funded project that provides for the vicarious calibration of ocean color satellites such as SeaWiFS and MODIS. Currently, MOBY provides data to JPSS VIIRS and to non-NOAA agency partners, including Copernicus Sentinel 3A and 3B.
https://spdx.org/licenses/CC0-1.0.html
For the purposes of training AI-based models to identify (map) road features in rural/remote tropical regions on the basis of true-colour satellite imagery, and subsequently testing the accuracy of these AI-derived road maps, we produced a dataset of 8904 satellite image ‘tiles’ and their corresponding known road features across Equatorial Asia (Indonesia, Malaysia, Papua New Guinea).
Methods
The main dataset shared here was derived from a set of 200 input satellite images, also provided here. These 200 images are effectively ‘screenshots’ (i.e., reduced-resolution copies) of high-resolution true-colour satellite imagery (~0.5-1m pixel resolution) observed using the Elvis Elevation and Depth spatial data portal (https://elevation.fsdf.org.au/), which here is functionally equivalent to the more familiar Google Earth. Each of these original images was initially acquired at a resolution of 1920x886 pixels. Actual image resolution was coarser than the native high-resolution imagery. Visual inspection of these 200 images suggests a pixel resolution of ~5 meters, given the number of pixels required to span features of familiar scale, such as roads and roofs, as well as the ready discrimination of specific land uses, vegetation types, etc. These 200 images generally spanned either forest-agricultural mosaics or intact forest landscapes with limited human intervention. Sloan et al. (2023) present a map indicating the various areas of Equatorial Asia from which these images were sourced.
IMAGE NAMING CONVENTION
A common naming convention applies to satellite images’ file names:
XX##.png
where:
XX – denotes the geographical region / major island of Equatorial Asia of the image, as follows: ‘bo’ (Borneo), ‘su’ (Sumatra), ‘sl’ (Sulawesi), ‘pn’ (Papua New Guinea), ‘jv’ (Java), ‘ng’ (New Guinea [i.e., Papua and West Papua provinces of Indonesia])
INTERPRETING ROAD FEATURES IN THE IMAGES
For each of the 200 input satellite images, its road features were visually interpreted and manually digitized to create a reference image dataset with which to train, validate, and test AI road-mapping models, as detailed in Sloan et al. (2023). The reference dataset of road features was digitized using the ‘pen tool’ in Adobe Photoshop. The pen’s ‘width’ was held constant over varying scales of observation (i.e., image ‘zoom’) during digitization. Consequently, at relatively small scales at least, digitized road features likely incorporate vegetation immediately bordering roads. The resultant binary (Road / Not Road) reference images were saved as PNG images with the same image dimensions as the original 200 images.
IMAGE TILES AND REFERENCE DATA FOR MODEL DEVELOPMENT
The 200 satellite images and the corresponding 200 road-reference images were both subdivided (aka ‘sliced’) into thousands of smaller image ‘tiles’ of 256x256 pixels each. Subsequent to image subdivision, subdivided images were also rotated by 90, 180, or 270 degrees to create additional, complementary image tiles for model development. In total, 8904 image tiles resulted from image subdivision and rotation. These 8904 image tiles are the main data of interest disseminated here. Each image tile entails the true-colour satellite image (256x256 pixels) and a corresponding binary road reference image (Road / Not Road).
Of these 8904 image tiles, Sloan et al. (2023) randomly selected 80% for model training (during which a model ‘learns’ to recognize road features in the input imagery), 10% for model validation (during which model parameters are iteratively refined), and 10% for final model testing (during which the final accuracy of the output road map is assessed). Here we present these data in two folders accordingly:
‘Training’ – contains 7124 image tiles used for model training in Sloan et al. (2023), i.e., 80% of the original pool of 8904 image tiles.
‘Testing’ – contains 1780 image tiles used for model validation and model testing in Sloan et al. (2023), i.e., 20% of the original pool of 8904 image tiles, being the combined set of image tiles for model validation and testing in Sloan et al. (2023).
IMAGE TILE NAMING CONVENTION
A common naming convention applies to image tiles’ directories and file names, in both the ‘training’ and ‘testing’ folders:
XX##_A_B_C_DrotDDD
where:
XX – denotes the geographical region / major island of Equatorial Asia of the original input 1920x886 pixel image, as follows: ‘bo’ (Borneo), ‘su’ (Sumatra), ‘sl’ (Sulawesi), ‘pn’ (Papua New Guinea), ‘jv’ (Java), ‘ng’ (New Guinea [i.e., Papua and West Papua provinces of Indonesia])
A, B, C and D – can all be ignored. These values, each one of 0, 256, 512, 768, 1024, 1280, 1536, and 1792, are effectively ‘pixel coordinates’ in the corresponding original 1920x886-pixel input image. They were recorded within the names of image tiles’ sub-directories and files merely to ensure that each name is unique.
rot – implies an image rotation. Not all image tiles are rotated, so ‘rot’ will appear only occasionally.
DDD – denotes the degree of image-tile rotation, e.g., 90, 180, 270. Not all image tiles are rotated, so ‘DDD’ will appear only occasionally.
Note that the designator ‘XX##’ is directly equivalent to the filenames of the corresponding 1920x886-pixel input satellite images, detailed above. Therefore, each image tile can be ‘matched’ with its parent full-scale satellite image. For example, in the ‘training’ folder, the subdirectory ‘Bo12_0_0_256_256’ indicates that the image tile therein (also named ‘Bo12_0_0_256_256’) would have been sourced from the full-scale image ‘Bo12.png’.
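Matching a tile to its parent image can be automated by parsing the tile name. The following is a minimal sketch of that idea, assuming all names follow the XX##_A_B_C_DrotDDD pattern exactly; the regular expression and the helper name `parse_tile_name` are illustrative, and the rotated name used in the demonstration is a hypothetical example built from the documented one.

```python
import re

# Two-letter region code, image number, four pixel coordinates, optional rotation.
TILE_NAME = re.compile(
    r"^(?P<region>[A-Za-z]{2})(?P<number>\d+)"
    r"_(?P<a>\d+)_(?P<b>\d+)_(?P<c>\d+)_(?P<d>\d+)"
    r"(?:rot(?P<rotation>\d+))?$"
)

def parse_tile_name(name):
    """Return (parent image file name, rotation in degrees) for a tile name."""
    match = TILE_NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognised tile name: {name}")
    parent = f"{match['region']}{match['number']}.png"
    # Tiles without a 'rot' suffix are treated as unrotated (0 degrees).
    rotation = int(match["rotation"]) if match["rotation"] else 0
    return parent, rotation

plain = parse_tile_name("Bo12_0_0_256_256")
rotated = parse_tile_name("Bo12_0_0_256_256rot90")  # hypothetical rotated tile
print(plain, rotated)
```

Grouping tiles by the returned parent name is one way to keep all tiles of one source image in the same data split.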
This dataset consists of ground-based Global Navigation Satellite System (GNSS) Combined Broadcast Ephemeris Data (daily files of all distinct navigation messages received in one day) from the NASA Crustal Dynamics Data Information System (CDDIS). GNSS provide autonomous geo-spatial positioning with global coverage. GNSS data sets from ground receivers at the CDDIS consist primarily of the data from the U.S. Global Positioning System (GPS) and the Russian GLObal NAvigation Satellite System (GLONASS). Since 2011, the CDDIS GNSS archive includes data from other GNSS (Europe’s Galileo, China’s Beidou, Japan’s Quasi-Zenith Satellite System/QZSS, the Indian Regional Navigation Satellite System/IRNSS, and worldwide Satellite Based Augmentation Systems/SBASs), which are similar to the U.S. GPS in terms of the satellite constellation, orbits, and signal structure. The daily GNSS broadcast ephemeris files contain one day of mixed GNSS navigation (30-second sampling) data in RINEX format from a global permanent network of ground-based receivers, one file per site. More information about these data is available on the CDDIS website at https://cddis.nasa.gov/Data_and_Derived_Products/GNSS/daily_30second_data.html.
This dataset contains NASA Langley Satellite Cloud products from the High Ice Water Content (HIWC) Radar Study project that took place in Fort Lauderdale, Florida. These data are derived from the GOES-13 satellite and are pixel-level cloud products. NASA-Langley cloud and radiation products are produced using the VISST (Visible Infrared Solar-infrared Split-Window Technique), SIST (Solar-infrared Infrared Split-Window Technique) and SINT (Solar-infrared Infrared Near-Infrared Technique). The data files are in netCDF format and are grouped into .tar files by day.
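Because the netCDF files are grouped into one .tar archive per day, a first step in working with them is usually to see which files an archive holds. The sketch below uses only Python's standard `tarfile` module. The function name `list_netcdf_members` and the ‘.nc’ file extension are assumptions, since the file naming is not given here.

```python
import tarfile


def list_netcdf_members(tar_path, suffix=".nc"):
    """Return the sorted names of regular files in a tar archive that end
    with the given suffix (assumed to be '.nc' for netCDF files)."""
    with tarfile.open(tar_path) as archive:
        return sorted(
            member.name
            for member in archive.getmembers()
            if member.isfile() and member.name.endswith(suffix)
        )
```

Calling `list_netcdf_members` on a downloaded daily archive returns the file names without extracting anything, which helps when only a few time steps are needed.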
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
UCS-JSpOC-soy-panel-22.csv

This dataset combines and cleans data from the Union of Concerned Scientists and Space-Track.org to create a panel of satellites, operators, and years. This dataset is used in the paper "Oligopoly competition between satellite constellations will reduce economic welfare from orbit use". The final dataset can also be downloaded from the replication files for that paper: https://doi.org/10.57968/Middlebury.23816994.v1

A "living" version of this repository can be found at: https://github.com/akhilrao/orbital-ownership-data

# Repository structure

* /UCS data contains Excel and CSV data files from the Union of Concerned Scientists, as well as output files generated from data cleaning. You can find the UCS Satellite Database here: https://www.ucsusa.org/resources/satellite-database . Historical data was obtained from Dr. Teri Grimwood.
* /Space-Track data contains JSON data from Space-Track.org, files to help identify operator names for harmonization in UCS_text_cleaner.R, and output generated from cleaning and merging data.
  * API queries to generate the JSON files can be found in json_cleaned_script.R. They are restated below for convenience. These queries were run on January 1, 2023 to produce the data used in "Oligopoly competition between satellite constellations will reduce economic welfare from orbit use".
    * 33999/OBJECT_TYPE/PAYLOAD/orderby/INTLDES asc/emptyresult/show
* /Current R scripts contains R scripts to process the data.
  * combined_scripts.R loads and cleans UCS data. It takes the raw CSV files from /UCS data as input and produces UCS_Combined_Data.csv as output.
  * UCS_text_cleaner.R harmonizes various text fields in the UCS data, including operator and owner names. Best efforts were made to ensure correctness and completeness, but some gaps may remain.
  * json_cleaned_script.R loads and cleans Space-Track data, and merges it with the cleaned and combined UCS data.
  * panel_builder.R uses the cleaned and merged files to construct the satellite-operator-year panel dataset with annual satellite histories and operator information. The logic behind the dataset construction approach is described in this blog post: https://akhilrao.github.io/blog//data/2020/08/20/build_stencil_cut/
* /Output_figures contains figures produced by the scripts. Some are diagnostic, some are just interesting.
* /Output_data contains the final data outputs.
* /data-cleaning-notes contains Excel and CSV files used to assist in harmonizing text fields in UCS_text_cleaner.R. They are included here for completeness.

# Creating the dataset

To reproduce the UCS-JSpOC-soy-panel-22.csv dataset:

1. Ensure R is installed along with the required packages
2. Run the scripts in /Current R scripts in the following order:
   * combined_scripts.R (this will call UCS_text_cleaner.R)
   * json_cleaned_script.R
   * panel_builder.R
3. The output file UCS-JSpOC-soy-panel-22.csv, along with several intermediate files used to create it, will be generated in /Output data
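The run order above can also be driven from a small wrapper. The following Python sketch is not part of the repository. It assumes that `Rscript` is on the PATH, that the scripts can be launched from the repository root, and that the scripts directory is named `Current R scripts`. The helper names `build_commands` and `run_pipeline` are hypothetical.

```python
import subprocess
from pathlib import Path

# Order taken from the instructions above. UCS_text_cleaner.R is not listed
# because combined_scripts.R is stated to call it.
SCRIPT_ORDER = ["combined_scripts.R", "json_cleaned_script.R", "panel_builder.R"]


def build_commands(scripts_dir="Current R scripts"):
    """Return one Rscript command per pipeline step, in run order."""
    return [["Rscript", str(Path(scripts_dir) / name)] for name in SCRIPT_ORDER]


def run_pipeline(scripts_dir="Current R scripts"):
    """Run each step in turn and stop at the first failure (check=True)."""
    for command in build_commands(scripts_dir):
        subprocess.run(command, check=True)
```

If the R scripts rely on relative paths or a particular working directory, `run_pipeline` would need a matching `cwd` argument. That detail is not specified here.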
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
This is the AI-ready benchmark dataset (OPSSAT-AD) containing the telemetry data acquired on board OPS-SAT, a CubeSat mission that has been operated by the European Space Agency.
It is accompanied by the paper with baseline results obtained using 30 supervised and unsupervised classic and deep machine learning algorithms for anomaly detection. They were trained and validated using the training-test dataset split introduced in this work, and we present a suggested set of quality metrics that should always be calculated when evaluating new anomaly detection algorithms on OPSSAT-AD. We believe that this work may become an important step toward building a fair, reproducible, and objective validation procedure that can be used to quantify the capabilities of the emerging anomaly detection techniques in an unbiased and fully transparent way.
The two included files are:
segments.csv with the telemetry signals acquired from the ESA OPS-SAT satellite,
dataset.csv with the extracted, synthetic features computed for each manually split and labeled telemetry segment.
Please have a look at our two papers commenting on this dataset:
The benchmark paper with results of 30 supervised and unsupervised anomaly detection models for this collection: Ruszczak, B., Kotowski, K., Nalepa, J., Evans, D.: The OPS-SAT benchmark for detecting anomalies in satellite telemetry, 2024, preprint arXiv:2407.04730,
the conference paper in which we presented some preliminary results for this dataset: Ruszczak, B., Kotowski, K., Andrzejewski, J., et al. (2023). Machine Learning Detects Anomalies in OPS-SAT Telemetry. Computational Science – ICCS 2023. LNCS, vol 14073. Springer, Cham, DOI:10.1007/978-3-031-35995-8_21.
This dataset consists of ground-based Satellite Laser Ranging observation data (normal points, monthly files) from the NASA Crustal Dynamics Data Information System (CDDIS). SLR provides unambiguous range measurements to mm precision that can be aggregated over the global network to provide very accurate satellite orbits, time histories of station position and motion, and many other geophysical parameters. SLR operates in the optical region and is the only space geodetic technique that measures unambiguous range directly. Analysis of SLR data contributes to the terrestrial reference frame, modeling of the spatial and temporal variations of the Earth's gravitational field, and monitoring of millimeter-level variations in the location of the center of mass of the total Earth system (solid Earth-atmosphere-oceans). In addition, SLR provides precise orbit determination for spaceborne radar altimeter missions. It provides a means for sub-nanosecond global time transfer, and a basis for special tests of the Theory of General Relativity. Analysis Centers (ACs) of the International Laser Ranging Service (ILRS) retrieve SLR data on regular schedules to produce precise station positions and velocities for stations in the ILRS network. The monthly SLR normal point observation files contain one month of SLR data from a global network of stations ranging to satellites equipped with retroreflectors. Data are available in ILRS data format (older data sets) and/or the Consolidated Ranging Data (CRD) format. More information about these data is available on the CDDIS website at https://cddis.nasa.gov/Data_and_Derived_Products/SLR/Normal_point_data.html.
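To give a sense of what millimeter-level ranging implies, the sketch below converts a laser pulse's round-trip time of flight into a one-way range, and back. It is a simplified illustration only: it ignores atmospheric refraction, system delays, and the other corrections applied in real SLR processing, and the function names are hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # exact, by definition of the metre


def one_way_range_m(round_trip_s):
    """One-way range from a round-trip time of flight (vacuum, no corrections)."""
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


def round_trip_s_for_range(range_m):
    """Round-trip time of flight corresponding to a one-way range."""
    return 2.0 * range_m / SPEED_OF_LIGHT_M_PER_S


# Timing resolution needed to resolve a 1 mm change in range.
print(round_trip_s_for_range(0.001))
```

Under these simplifications, a 1 mm change in range corresponds to roughly 6.7 picoseconds of round-trip time.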
description: These files contain the environmental data, as particular emissions or resources associated with BEA sectors, that are used in the USEEIO model. They are organized by emission or resource type, as described in the manuscript. The main files (without SI) show the final "satellite tables" in the 'Exchanges' sheet, which have emissions or resource use per USD for 2013. The other sheets in these files provide metadata for the creation of the tables, including general information, sources, etc. The 'export' sheet is used for saving the satellite table for csv export. The data dictionary describes the fields in this sheet. The supporting files provide all the details of data transformation and organization for the development of the satellite tables. This dataset is associated with the following publication: Yang, Y., W. Ingwersen, T. Hawkins, and D. Meyer. USEEIO: A new and transparent United States environmentally extended input-output model. JOURNAL OF CLEANER PRODUCTION. Elsevier Science Ltd, New York, NY, USA, 158: 308-318, (2017).