18 datasets found

Predictive modeling of treatment resistant depression using data from STAR*D...
plos.figshare.com
docx
Updated Jun 1, 2023
Cite
Zhi Nie; Srinivasan Vairavan; Vaibhav A. Narayan; Jieping Ye; Qingqin S. Li (2023). Predictive modeling of treatment resistant depression using data from STAR*D and an independent clinical study [Dataset]. http://doi.org/10.1371/journal.pone.0197268
Available download formats: docx
Unique identifier
https://doi.org/10.1371/journal.pone.0197268
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS ONE
Authors
Zhi Nie; Srinivasan Vairavan; Vaibhav A. Narayan; Jieping Ye; Qingqin S. Li
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Identification of risk factors of treatment resistance may be useful to guide treatment selection, avoid inefficient trial-and-error, and improve major depressive disorder (MDD) care. We extended the work in predictive modeling of treatment resistant depression (TRD) via partition of the data from the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) cohort into a training and a testing dataset. We also included data from a small yet completely independent cohort RIS-INT-93 as an external test dataset. We used features from enrollment and level 1 treatment (up to week 2 response only) of STAR*D to explore the feature space comprehensively and applied machine learning methods to model TRD outcome at level 2. For TRD defined using QIDS-C16 remission criteria, multiple machine learning models were internally cross-validated in the STAR*D training dataset and externally validated in both the STAR*D testing dataset and RIS-INT-93 independent dataset with an area under the receiver operating characteristic curve (AUC) of 0.70–0.78 and 0.72–0.77, respectively. The upper bound for the AUC achievable with the full set of features could be as high as 0.78 in the STAR*D testing dataset. A model developed using the top 30 features identified by a feature selection technique (k-means clustering followed by a χ² test) achieved an AUC of 0.77 in the STAR*D testing dataset. In addition, the model developed using overlapping features between STAR*D and RIS-INT-93 achieved an AUC of > 0.70 in both the STAR*D testing and RIS-INT-93 datasets. Among all the features explored in STAR*D and RIS-INT-93 datasets, the most important feature was early or initial treatment response or symptom severity at week 2. These results indicate that prediction of TRD prior to undergoing a second round of antidepressant treatment could be feasible even in the absence of biomarker data.
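The AUC figures quoted in this description can be read as the probability that a randomly chosen TRD case receives a higher predicted score than a randomly chosen non-TRD case. The sketch below illustrates that definition in plain Python. The helper name `auc` and the labels and scores are made up for illustration; they are not STAR*D or RIS-INT-93 data, and this is not the authors' evaluation code.

```python
def auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Illustrative values only (1 = TRD, 0 = non-TRD); not taken from the study.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
print(auc(labels, scores))
```

The quadratic loop keeps the definition visible. For large cohorts a rank-based formulation would be the usual choice.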
SWOT Level 2 River Single-Pass Vector Reach Data Product, Version 2.0
catalog.data.gov
Updated Apr 10, 2025
Cite
NASA/JPL/PODAAC (2025). SWOT Level 2 River Single-Pass Vector Reach Data Product, Version 2.0 [Dataset]. https://catalog.data.gov/dataset/swot-level-2-river-single-pass-vector-reach-data-product-version-2-0-58b1b
Dataset updated
Apr 10, 2025
Dataset provided by
NASA/JPL/PODAAC
Description
The SWOT Level 2 River Single-Pass Vector Reach Data Product from the Surface Water Ocean Topography (SWOT) mission provides water surface elevation, slope, width, and discharge derived from the high rate (HR) data stream from the Ka-band Radar Interferometer (KaRIn). SWOT launched on December 16, 2022 from Vandenberg Air Force Base in California into a 1-day repeat orbit for the "calibration" or "fast-sampling" phase of the mission, which completed in early July 2023. After the calibration phase, SWOT entered a 21-day repeat orbit in August 2023 to start the "science" phase of the mission, which is expected to continue through 2025. Water surface elevation, slope, width, and discharge are provided for river reaches (approximately 10 km long) and nodes (approximately 200 m spacing) identified in the prior river database, and distributed as feature datasets covering the full swath for each continent-pass. These data are generally produced for inland and coastal hydrology surfaces, as controlled by the reloadable KaRIn HR mask. The dataset is distributed in ESRI Shapefile format. This collection is a sub-collection of its parent: https://podaac.jpl.nasa.gov/dataset/SWOT_L2_HR_RiverSP_2.0 It contains only river reaches.
Protected Areas Database of the United States (PAD-US) 3.0 (ver. 2.0, March...
catalog.data.gov
data.usgs.gov
Updated Jul 6, 2024
Cite
U.S. Geological Survey (2024). Protected Areas Database of the United States (PAD-US) 3.0 (ver. 2.0, March 2023) [Dataset]. https://catalog.data.gov/dataset/protected-areas-database-of-the-united-states-pad-us-3-0-ver-2-0-march-2023
Dataset updated
Jul 6, 2024
Dataset provided by
United States Geological Survey http://www.usgs.gov/
Area covered
United States
Description
The USGS Protected Areas Database of the United States (PAD-US) is the nation's inventory of protected areas, including public land and voluntarily provided private protected areas, identified as an A-16 National Geospatial Data Asset in the Cadastre Theme ( https://communities.geoplatform.gov/ngda-cadastre/ ). The PAD-US is an ongoing project with several published versions of a spatial database including areas dedicated to the preservation of biological diversity, and other natural (including extraction), recreational, or cultural uses, managed for these purposes through legal or other effective means. The database was originally designed to support biodiversity assessments; however, its scope expanded in recent years to include all open space public and nonprofit lands and waters. Most are public lands owned in fee (the owner of the property has full and irrevocable ownership of the land); however, permanent and long-term easements, leases, agreements, Congressional (e.g. 'Wilderness Area'), Executive (e.g. 'National Monument'), and administrative designations (e.g. 'Area of Critical Environmental Concern') documented in agency management plans are also included. The PAD-US strives to be a complete inventory of U.S. public land and other protected areas, compiling “best available” data provided by managing agencies and organizations. The PAD-US geodatabase maps and describes areas using thirty-six attributes and five separate feature classes representing the U.S. protected areas network: Fee (ownership parcels), Designation, Easement, Marine, Proclamation and Other Planning Boundaries. An additional Combined feature class includes the full PAD-US inventory to support data management, queries, web mapping services, and analyses. The Feature Class (FeatClass) field in the Combined layer allows users to extract data types as needed. 
A Federal Data Reference file geodatabase lookup table (PADUS3_0Combined_Federal_Data_References) facilitates the extraction of authoritative federal data provided or recommended by managing agencies from the Combined PAD-US inventory. This PAD-US Version 3.0 dataset includes a variety of updates from the previous Version 2.1 dataset (USGS, 2020, https://doi.org/10.5066/P92QM3NT ), achieving goals to: 1) Annually update and improve spatial data representing the federal estate for PAD-US applications; 2) Update state and local lands data as state data-steward and PAD-US Team resources allow; and 3) Automate data translation efforts to increase PAD-US update efficiency. The following list summarizes the integration of "best available" spatial data to ensure public lands and other protected areas from all jurisdictions are represented in the PAD-US (other data were transferred from PAD-US 2.1). Federal updates - The USGS remains committed to updating federal fee owned lands data and major designation changes in annual PAD-US updates, where authoritative data provided directly by managing agencies are available or alternative data sources are recommended. The following is a list of updates or revisions associated with the federal estate: 1) Major update of the Federal estate (fee ownership parcels, easement interest, and management designations where available), including authoritative data from 8 agencies: Bureau of Land Management (BLM), U.S. Census Bureau (Census Bureau), Department of Defense (DOD), U.S. Fish and Wildlife Service (FWS), National Park Service (NPS), Natural Resources Conservation Service (NRCS), U.S. Forest Service (USFS), and National Oceanic and Atmospheric Administration (NOAA). The federal theme in PAD-US is developed in close collaboration with the Federal Geographic Data Committee (FGDC) Federal Lands Working Group (FLWG, https://communities.geoplatform.gov/ngda-govunits/federal-lands-workgroup/ ). 
2) Improved the representation (boundaries and attributes) of the National Park Service, U.S. Forest Service, Bureau of Land Management, and U.S. Fish and Wildlife Service lands, in collaboration with agency data-stewards, in response to feedback from the PAD-US Team and stakeholders. 3) Added a Federal Data Reference file geodatabase lookup table (PADUS3_0Combined_Federal_Data_References) to the PAD-US 3.0 geodatabase to facilitate the extraction (by Data Provider, Dataset Name, and/or Aggregator Source) of authoritative data provided directly (or recommended) by federal managing agencies from the full PAD-US inventory. A summary of the number of records (Frequency) and calculated GIS Acres (vs Documented Acres) associated with features provided by each Aggregator Source is included; however, the number of records may vary from source data as the "State Name" standard is applied to national files. The Feature Class (FeatClass) field in the table and geodatabase describe the data type to highlight overlapping features in the full inventory (e.g. Designation features often overlap Fee features) and to assist users in building queries for applications as needed. 4) Scripted the translation of the Department of Defense, Census Bureau, and Natural Resource Conservation Service source data into the PAD-US format to increase update efficiency. 5) Revised conservation measures (GAP Status Code, IUCN Category) to more accurately represent protected and conserved areas. For example, Fish and Wildlife Service (FWS) Waterfowl Production Area Wetland Easements changed from GAP Status Code 2 to 4 as spatial data currently represents the complete parcel (about 10.54 million acres primarily in North Dakota and South Dakota). Only aliquot parts of these parcels are documented under wetland easement (1.64 million acres). These acreages are provided by the U.S. Fish and Wildlife Service and are referenced in the PAD-US geodatabase Easement feature class 'Comments' field. 
State updates - The USGS is committed to building capacity in the state data-steward network and the PAD-US Team to increase the frequency of state land updates, as resources allow. The USGS supported efforts to significantly increase state inventory completeness with the integration of local parks data in the PAD-US 2.1, and developed a state-to-PAD-US data translation script during PAD-US 3.0 development to pilot in future updates. Additional efforts are in progress to support the technical and organizational strategies needed to increase the frequency of state updates. The PAD-US 3.0 included major updates to the following three states: 1) California - added or updated state, regional, local, and nonprofit lands data from the California Protected Areas Database (CPAD), managed by GreenInfo Network, and integrated conservation and recreation measure changes following review coordinated by the data-steward with state managing agencies. Developed a data translation Python script (see Process Step 2 Source Data Documentation) in collaboration with the data-steward to increase the accuracy and efficiency of future PAD-US updates from CPAD. 2) Virginia - added or updated state, local, and nonprofit protected areas data (and removed legacy data) from the Virginia Conservation Lands Database, provided by the Virginia Department of Conservation and Recreation's Natural Heritage Program, and integrated conservation and recreation measure changes following review by the data-steward. 3) West Virginia - added or updated state, local, and nonprofit protected areas data provided by the West Virginia University, GIS Technical Center. For more information regarding the PAD-US dataset please visit, https://www.usgs.gov/gapanalysis/PAD-US/. For more information about data aggregation please review the PAD-US Data Manual available at https://www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/pad-us-data-manual . 
A version history of PAD-US updates is summarized below (See https://www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/pad-us-data-history for more information):
1) First posted - April 2009 (Version 1.0 - available from the PAD-US: Team pad-us@usgs.gov).
2) Revised - May 2010 (Version 1.1 - available from the PAD-US: Team pad-us@usgs.gov).
3) Revised - April 2011 (Version 1.2 - available from the PAD-US: Team pad-us@usgs.gov).
4) Revised - November 2012 (Version 1.3) https://doi.org/10.5066/F79Z92XD
5) Revised - May 2016 (Version 1.4) https://doi.org/10.5066/F7G73BSZ
6) Revised - September 2018 (Version 2.0) https://doi.org/10.5066/P955KPLE
7) Revised - September 2020 (Version 2.1) https://doi.org/10.5066/P92QM3NT
8) Revised - January 2022 (Version 3.0) https://doi.org/10.5066/P9Q9LQ4B
Comparing protected area trends between PAD-US versions is not recommended without consultation with USGS as many changes reflect improvements to agency and organization GIS systems, or conservation and recreation measure classification, rather than actual changes in protected area acquisition on the ground.

Analytic Engineer

kaggle.com

Updated Jun 7, 2022

Cite

NxtWave Data Engineers (2022). Analytic Engineer [Dataset]. https://www.kaggle.com/datasets/nxtwavedataengineers/data-engineer

Dataset updated

Jun 7, 2022

Dataset provided by

Kaggle http://kaggle.com/

Authors

NxtWave Data Engineers

Description

Context:

You are an Analytics Engineer at an EdTech company focused on improving customer learning experiences. Your team relies on in-depth analysis of user data to enhance the learning journey and inform product feature updates.

Objective:

Your mission is to transform raw data into structured views that enable data analysts to effectively monitor and analyze user activities, performance patterns, and feedback. These insights are critical for data-informed decision-making within the Customer Success team.

Dataset Overview:

Your company organizes content in a hierarchical structure, categorized as Track → Course → Topic → Lesson. Each lesson can take various formats, such as videos, practice exercises, exams, etc.
Any learning activity done by the user on a lesson is stored in logs in the user_lesson_progress_log table. A user can have multiple logs for a lesson in a day.
You have user registration data that store registration information and demographic information of users.
A user can give feedback on a lesson multiple times.

DB Diagram: https://dbdiagram.io/d/627100b17f945876b6a93e54 (use the ‘Highlight’ option to understand the relationships)

Table Descriptions

track_table: Contains all tracks

Column	Description	Schema
track_id	unique id for an individual track	string
track_title	name of the track	string

course_table: Contains all courses

Column	Description	Schema
course_id	unique id for an individual course	string
track_id	track id to which this course belongs	string
course_title	name of the course	string

topic_table: Contains all topics

Column	Description	Schema
topic_id	unique id for an individual topic	string
course_id	course id to which this topic belongs	string
topic_title	name of the topic	string

lesson_table: Contains all lessons

Column	Description	Schema
lesson_id	unique id for individual lesson	string
topic_id	topic id to which this lesson belongs	string
lesson_title	name of the lesson	string
lesson_type	type of the lesson i.e., it may be practice, video, exam	string
duration_in_sec	ideal duration of the lesson (in seconds) in which user can complete the lesson	float
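The Track → Course → Topic → Lesson hierarchy described by the four tables above can be flattened into one row per lesson by following the foreign keys upward. The sketch below does this with plain Python dictionaries. The sample rows and the helper name `flatten_hierarchy` are hypothetical; only the column names come from the table descriptions.

```python
# Hypothetical sample rows; column names follow the table descriptions above.
tracks = [{"track_id": "t1", "track_title": "Data"}]
courses = [{"course_id": "c1", "track_id": "t1", "course_title": "SQL"}]
topics = [{"topic_id": "p1", "course_id": "c1", "topic_title": "Joins"}]
lessons = [
    {"lesson_id": "l1", "topic_id": "p1", "lesson_title": "Inner join",
     "lesson_type": "video", "duration_in_sec": 300.0},
    {"lesson_id": "l2", "topic_id": "p1", "lesson_title": "Join quiz",
     "lesson_type": "exam", "duration_in_sec": 600.0},
]


def flatten_hierarchy(tracks, courses, topics, lessons):
    """Return one dict per lesson with its topic, course and track columns."""
    track_by_id = {t["track_id"]: t for t in tracks}
    course_by_id = {c["course_id"]: c for c in courses}
    topic_by_id = {p["topic_id"]: p for p in topics}
    rows = []
    for lesson in lessons:
        # Walk up the hierarchy: lesson -> topic -> course -> track.
        topic = topic_by_id[lesson["topic_id"]]
        course = course_by_id[topic["course_id"]]
        track = track_by_id[course["track_id"]]
        rows.append({**track, **course, **topic, **lesson})
    return rows


flat = flatten_hierarchy(tracks, courses, topics, lessons)
for row in flat:
    print(row["track_title"], row["course_title"], row["topic_title"], row["lesson_title"])
```

A lesson whose topic_id has no match raises a KeyError here. Whether such orphans exist in the real tables is not stated in the description.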

user_registrations: Contains the registration information of the users. A user has only one entry

Column	Description	Schema
user_id	unique id for an individual user	string
registration_date	date at which a user registered	string
user_info	contains information about the users. The field stores address, education_info, and profile in JSON format	string
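Because user_info is stored as a JSON string, it has to be parsed before its three parts can be used as separate columns. The sketch below uses Python's standard json module. The top-level keys address, education_info and profile come from the description above, while the nested keys and values in the sample are invented for illustration.

```python
import json


def parse_user_info(raw):
    """Split a user_info JSON string into (address, education_info, profile).

    Keys missing from the JSON object are returned as None.
    """
    data = json.loads(raw)
    return data.get("address"), data.get("education_info"), data.get("profile")


# Hypothetical sample value; the nested structure is an assumption.
sample = (
    '{"address": {"city": "Exampleville"}, '
    '"education_info": {"degree": "BSc"}, '
    '"profile": {"language": "en"}}'
)
address, education_info, profile = parse_user_info(sample)
print(address, education_info, profile)
```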

user_lesson_progress_log: Any learning activity done by the user on a lesson is stored in logs. A user can have multiple logs for a lesson in a day. Every time a lesson completion percentage of a user is updated, a log is recorded here.

Column	Description	Schema
id	unique id for each entry	string
user_id	unique id for an individual user	string
lesson_id	unique id for a particular lesson	string
overall_completion_percentage	total completion percentage of the lesson at the time of log	float
completion_percentage_difference	Difference between the overall_completion_percentage of the lesson and the immediately preceding overall_completion_percentage	float
activity_recorded_datetime_in_utc	datetime at which the user has done some activity on the lesson	datetime

Example: Suppose user u1 starts lesson lesson1 and completes 10% of it on May 1st 2022 at 8:00:00 UTC, then completes a further 30% on May 1st 2022 at 10:00:00 UTC and a further 20% on May 3rd 2022 at 10:00:00 UTC. The logs are then recorded as follows:

id	user_id	lesson_id	overall_completion_percentage	completion_percentage_difference	activity_recorded_datetime_in_utc
id1	u1	lesson1	10	10	2022-05-01 08:00:00
id2	u1	lesson1	40	30	2022-05-01 10:00:00
id3	u1	lesson1	60	20	2022-05-03 10:00:00
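The difference column in the example rows can be derived from the overall completion percentage by ordering each user's logs for a lesson by time and subtracting the previous value. The sketch below reproduces the three example rows. The tuple layout and the helper name `add_differences` are assumptions made for illustration, and the first log for a lesson is assumed to be compared against 0.

```python
from datetime import datetime

# (id, user_id, lesson_id, overall_completion_percentage, activity time in UTC)
logs = [
    ("id1", "u1", "lesson1", 10.0, "2022-05-01 08:00:00"),
    ("id2", "u1", "lesson1", 40.0, "2022-05-01 10:00:00"),
    ("id3", "u1", "lesson1", 60.0, "2022-05-03 10:00:00"),
]


def add_differences(logs):
    """Append the change in overall completion since the previous log."""
    ordered = sorted(
        logs,
        key=lambda r: (r[1], r[2], datetime.strptime(r[4], "%Y-%m-%d %H:%M:%S")),
    )
    previous = {}  # (user_id, lesson_id) -> last overall percentage seen
    out = []
    for log_id, user_id, lesson_id, overall, ts in ordered:
        key = (user_id, lesson_id)
        diff = overall - previous.get(key, 0.0)
        previous[key] = overall
        out.append((log_id, user_id, lesson_id, overall, diff, ts))
    return out


with_diffs = add_differences(logs)
for row in with_diffs:
    print(row)
```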

user_feedback: The table contains the feedback data given by the users. A user can give feedback to a lesson multiple times. Each feedback contains multiple questions. Each question and response is stored in an entry.

Column	Description	Schema
id	unique id for each entry	string
feedback_id	unique id for each feedback	string
creation_datetime	datetime at which user gave a feedback	string
user_id	user id who gave the feedback	float
lesson_id	...

Asset database for the Gloucester subregion on 12 February 2016
cloud.csiss.gmu.edu
researchdata.edu.au
+2more
Updated Dec 13, 2019
Cite
Australia (2019). Asset database for the Gloucester subregion on 12 February 2016 [Dataset]. https://cloud.csiss.gmu.edu/uddi/dataset/72a47bec-1393-49d6-b379-0e48551d26a9
Dataset updated
Dec 13, 2019
Dataset provided by
Australia
Description
Abstract

The dataset was derived by the Bioregional Assessment Programme from multiple source datasets. The source datasets are identified in the Lineage field in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.

This Gloucester dataset contains v8.2 of the Asset database (GLO_asset_database_20160212.mdb), a Geodatabase version for GIS mapping purposes (GLO_asset_database_20160212_GISOnly.gdb), the draft Water Dependent Asset Register spreadsheet (BA-NSB-GLO-130-WaterDependentAssetRegister-AssetList-v20160212.xlsx), a data dictionary (GLO_asset_database_doc_20160212.doc), a folder (Indigenous_doc) containing documentation associated with the Indigenous water asset project, and a folder (NRM_DOC) containing documentation associated with the Water Asset Information Tool (WAIT) process as outlined below.

The Gloucester Asset database v8.2 supersedes the previous version of the GLO Asset database in asset-relevant tables and feature classes only (i.e. AssetDecisions, AssetList, Element_to_Asset, ElementList, tbl_Indigenous_water_asset, tbl_GAL_Species_TEC_decisions_review_23112015 in GLO_asset_database_20160212.mdb and GM_GLO_AssetList_pt, GM_GLO_ElementList_pt in GLO_asset_database_20160212_GISOnly.gdb). This version of the GLO asset database has been updated as follows:

(1) Total number of registered water assets was increased by 18 due to:

(a) Three assets had their M2 test result changed to "Yes" following the review done by the Ecologist group.

(b) 15 indigenous water assets from OWS were added.

The Asset database is registered to the BA repository as an ESRI personal geodatabase (.mdb - doubling as a MS Access database) that can store, query, and manage non-spatial data while the spatial data is in a separate file geodatabase joined by AID/Element ID/BARID. Under the BA program, a spatial assets database is developed for each defined bioregional assessment project. The spatial elements that underpin the identification of water dependent assets are identified in the first instance by regional NRM organisations (via the WAIT tool) and supplemented with additional elements from national and state/territory government datasets. All reports received associated with the WAIT process for Gloucester are included in the zip file as part of this dataset. Elements are initially included in the preliminary assets database if they are partly or wholly within the subregion's preliminary assessment extent (Materiality Test 1, M1). Elements are then grouped into assets which are evaluated by project teams to determine whether they meet the second Materiality Test (M2). Assets meeting both Materiality Tests comprise the water dependent asset list. Descriptions of the assets identified in the Gloucester subregion are found in the "AssetList" table of the database. In this version of the database only M1 has been assessed. Assets are the spatial features used by project teams to model scenarios under the BA program. Detailed attribution does not exist at the asset level. Asset attribution includes only the core set of BA-derived attributes reflecting the BA classification hierarchy, as described in Appendix A of "GLO_asset_database_doc_20160212.doc", located in the zip file as part of this dataset. The "Element_to_Asset" table contains the relationships and identifies the elements that were grouped to create each asset. 
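The grouping and filtering logic described above (elements grouped into assets, and assets that pass both Materiality Tests forming the water dependent asset list) can be sketched as follows. The identifiers, the in-memory layout and the helper name `water_dependent_asset_list` are hypothetical; the real relationships live in the "Element_to_Asset" and "AssetList" tables, whose exact columns are documented in GLO_asset_database_doc_20160212.doc rather than here.

```python
from collections import defaultdict

# Hypothetical (element_id, asset_id) pairs standing in for Element_to_Asset rows.
element_to_asset = [("e1", "a1"), ("e2", "a1"), ("e3", "a2")]

# Hypothetical per-asset Materiality Test outcomes.
asset_tests = {
    "a1": {"M1": True, "M2": True},
    "a2": {"M1": True, "M2": False},
}


def water_dependent_asset_list(element_to_asset, asset_tests):
    """Group elements by asset and keep assets that pass both M1 and M2."""
    elements_by_asset = defaultdict(list)
    for element_id, asset_id in element_to_asset:
        elements_by_asset[asset_id].append(element_id)
    return {
        asset_id: sorted(elements)
        for asset_id, elements in elements_by_asset.items()
        if asset_tests.get(asset_id, {}).get("M1")
        and asset_tests.get(asset_id, {}).get("M2")
    }


registered = water_dependent_asset_list(element_to_asset, asset_tests)
print(registered)
```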
Detailed information describing the database structure and content can be found in the document "GLO_asset_database_doc_20160212.doc" located in the zip file. The public version of this asset database can be accessed via the following dataset: Asset database for the Gloucester subregion on 12 February 2016 Public v02 (https://data.gov.au/data/dataset/5def411c-dbc4-4b75-b509-4230964ce0fa).

Purpose

Used for Gloucester subregion for bioregional assessments

Dataset History

VersionID Date Notes

1.0 17/03/2014 Initial database

1.01 19/03/2014 Update classification using latest one

2.0 23/05/2014 Update asset area for some assets

3.0 9/07/2014 updated to include new assets and elements identified by community.

4.0 29/08/2014 updated assets and elements from WSP

5.0 4/09/2014 Table AssetDecisions is added to record decision making process and decisions about M2 are also added in table asset list

6.0 8/04/2015 195/9 Groundwater economic point elements/assets were added in while 81/7 Groundwater economic point elements/assets were turned off

7.0 27/05/2015 The receptor data (tables: ReceptorList, tbl_Receptors_GDE, tbl_Receptors_GW, tbl_Receptors_SW and tbl_Receptors_SW_Catchment_Ref_Only; and spatial data: GM_GLO_ReceptorList_pt) is added

7.1 21/08/2015 (1) Delete (a) line 26 from tab "Description" and (b) column E from tab "Receptor register" about "Depth" parameters in BA-NSB-GLO-140-ReceptorRegister-v20150821.xlsx (2) Delete field of "Depth" from table "ReceptorList" in GLO_asset_database_20150821.mdb (3) Add two fields of "InRegister" and "Registered Date" to table "ReceptorList" in GLO_asset_database_20150821.mdb for the consistency with other subregions in the future

8 16/09/2015 (1) (a) Update Latitude, Longitude, LandscapeClass using the latest data from GLO project team and update the values for RegisteredDate, and Group using "GDE", "SW" and "GW" in table ReceptorList in GLO_asset_database_20150916.mdb; (b) Create draft BA-NSB-GLO-140-ReceptorRegister-v20150916.xlsx (2) Update tbl_Receptors_GDE, tbl_Receptors_GW and tbl_Receptors_SW in GLO_asset_database_20150916.mdb, using the latest data from GLO project team. (3) Update GM_GLO_ReceptorList_pt in GLO_asset_database_20150916_GISOnly.gdb, using the latest data from GLO project team

8.1 29/10/2015 (a) Update LandscapeClass field in table ReceptorList for all 222 economic Receptors to match the latest decision about this parameter (b) Create draft BA-NSB-GLO-140-ReceptorRegister-v20151029.xlsx

8.2 12/02/2016 (1) Total number of registered water assets was increased by 18 due to: (a) The 3 assets changed M2 test to "Yes" from the review done by Ecologist group. The original data is included in the database as the table tbl_GLO_Species_TEC_decisions_review_23112015 (b) 15 indigenous water assets from OWS were added. The data and documents from OWS are included in subdirectory Indigenous_doc (c) The draft new Water Dependent Asset Register file (BA-NSB-GLO-130-WaterDependentAssetRegister-AssetList-v20160212.xlsx) was created

The source metadata was updated to meet the purpose of the Bioregional Assessment Programme

Dataset Citation

Bioregional Assessment Programme (2014) Asset database for the Gloucester subregion on 12 February 2016. Bioregional Assessment Derived Dataset. Viewed 18 July 2018, http://data.bioregionalassessments.gov.au/dataset/72a47bec-1393-49d6-b379-0e48551d26a9.

Dataset Ancestors

Derived From Standard Instrument Local Environmental Plan (LEP) - Heritage (HER) (NSW)

Derived From NSW Office of Water GW licence extract linked to spatial locations - GLO v5 UID elements 27032014

Derived From Asset database for the Gloucester subregion on 21 August 2015

Derived From Gloucester digitised coal mine boundaries

Derived From Groundwater Dependent Ecosystems supplied by the NSW Office of Water on 13/05/2014

Derived From [NSW Office of Water GW licence extract linked to spatial locations GLOv4 UID
Standardization in Quantitative Imaging: A Multi-center Comparison of...
cancerimagingarchive.net
stage.cancerimagingarchive.net
+1more
n/a, nifti and zip +1
Cite
The Cancer Imaging Archive, Standardization in Quantitative Imaging: A Multi-center Comparison of Radiomic Feature Values [Dataset]. http://doi.org/10.7937/tcia.2020.9era-gg29
Available download formats: xlsx, n/a, nifti and zip
Unique identifier
https://doi.org/10.7937/tcia.2020.9era-gg29
Dataset authored and provided by
The Cancer Imaging Archive
License
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
Time period covered
Jun 9, 2020
Dataset funded by
National Cancer Institute http://www.cancer.gov/
Description
This dataset was used by the NCI's Quantitative Imaging Network (QIN) PET-CT Subgroup for their project titled: Multi-center Comparison of Radiomic Features from Different Software Packages on Digital Reference Objects and Patient Datasets. The purpose of this project was to assess the agreement among radiomic features when computed by several groups by using different software packages under very tightly controlled conditions, which included common image data sets and standardized feature definitions. The image datasets (and Volumes of Interest – VOIs) provided here are the same ones used in that project and reported in the publication listed below (ISSN 2379-1381 https://doi.org/10.18383/j.tom.2019.00031). In addition, we have provided detailed information about the software packages used (Table 1 in that publication) as well as the individual feature value results for each image dataset and each software package that was used to create the summary tables (Tables 2, 3 and 4) in that publication. For that project, nine common quantitative imaging features were selected for comparison, including features that describe morphology, intensity, shape, and texture and that are described in detail in the Image Biomarker Standardisation Initiative (IBSI, https://arxiv.org/abs/1612.07003) and publication (Zwanenburg A, Vallières M, et al. The Image Biomarker Standardization Initiative: Standardized Quantitative Radiomics for High-Throughput Image-based Phenotyping. Radiology. 2020 May;295(2):328-338. doi: https://doi.org/10.1148/radiol.2020191145). There are three datasets provided – two image datasets and one dataset consisting of four excel spreadsheets containing feature values.

The first image dataset is a set of three Digital Reference Objects (DROs) used in the project, which are: (a) a sphere with uniform intensity, (b) a sphere with intensity variation (c) a nonspherical (but mathematically defined) object with uniform intensity. These DROs were created by the team at Stanford University and are described in (Jaggi A, Mattonen SA, McNitt-Gray M, Napel S. Stanford DRO Toolkit: digital reference objects for standardization of radiomic features. Tomography. 2019;6:–.) and are a subset of the DROs described in DRO Toolkit. Each DRO is represented in both DICOM and NIfTI format and the VOI was provided in each format as well (DICOM Segmentation Object (DSO) as well as NIfTI segmentation boundary).

The second image dataset is the set of 10 patient CT scans, originating from the LIDC-IDRI dataset, that were used in the QIN multi-site collection of Lung CT data with Nodule Segmentations project ( https://doi.org/10.7937/K9/TCIA.2015.1BUVFJR7 ). In that QIN study, a single lesion from each case was identified for analysis and then nine VOIs were generated using three repeat runs of three segmentation algorithms (one from each of three academic institutions) on each lesion. To eliminate one source of variability in our project, only one of the VOIs previously created for each lesion was identified and all sites used that same VOI definition. The specific VOI chosen for each lesion was the first run of the first algorithm (algorithm 1, run 1). DICOM images were provided for each dataset and the VOI was provided in both DICOM Segmentation Object (DSO) and NIfTI segmentation formats.

The third dataset is a collection of four excel spreadsheets, each of which contains detailed information corresponding to one of the four tables in the publication, for example the raw feature values and the summary tables for Tables 2, 3 and 4 reported in the publication cited (https://doi.org/10.18383/j.tom.2019.00031). These tables are:

Software Package details: This table contains detailed information about the software packages used in the study (and listed in Table 1 in the publication) including version number and any parameters specified in the calculation of the features reported.

DRO results: This contains the original feature values obtained for each software package for each DRO as well as the table summarizing results across software packages (Table 2 in the publication).

Patient Dataset results: This contains the original feature values for each software package for each patient dataset (1 lesion per case) as well as the table summarizing results across software packages and patient datasets (Table 3 in the publication).

Harmonized GLCM Entropy Results: This contains the values for the “Harmonized” GLCM Entropy feature for each patient dataset and each software package as well as the summary across software packages (Table 4 in the publication).
COVID-19 Case Surveillance Public Use Data
data.cdc.gov
opendatalab.com
application/rdfxml +5
Updated Jul 9, 2024
Cite
CDC Data, Analytics and Visualization Task Force (2024). COVID-19 Case Surveillance Public Use Data [Dataset]. https://data.cdc.gov/Case-Surveillance/COVID-19-Case-Surveillance-Public-Use-Data/vbim-akqf
Explore at:
Available download formats: application/rdfxml, tsv, csv, json, xml, application/rssxml
Dataset updated
Jul 9, 2024
Dataset provided by
Centers for Disease Control and Prevention (http://www.cdc.gov/)
Authors
CDC Data, Analytics and Visualization Task Force
License
https://www.usa.gov/government-works
Description
Note: Reporting of new COVID-19 Case Surveillance data will be discontinued July 1, 2024, to align with the process of removing SARS-CoV-2 infections (COVID-19 cases) from the list of nationally notifiable diseases. Although these data will continue to be publicly available, the dataset will no longer be updated.

Authorizations to collect certain public health data expired at the end of the U.S. public health emergency declaration on May 11, 2023. The following jurisdictions discontinued COVID-19 case notifications to CDC: Iowa (11/8/21), Kansas (5/12/23), Kentucky (1/1/24), Louisiana (10/31/23), New Hampshire (5/23/23), and Oklahoma (5/2/23). Please note that these jurisdictions will not routinely send new case data after the dates indicated. As of 7/13/23, case notifications from Oregon will only include pediatric cases resulting in death.

This case surveillance public use dataset has 12 elements for all COVID-19 cases shared with CDC and includes demographics, any exposure history, disease severity indicators and outcomes, presence of any underlying medical conditions and risk behaviors, and no geographic data.

CDC has three COVID-19 case surveillance datasets:
COVID-19 Case Surveillance Public Use Data with Geography: Public use, patient-level dataset with clinical data (including symptoms), demographics, and county and state of residence. (19 data elements)
COVID-19 Case Surveillance Public Use Data: Public use, patient-level dataset with clinical and symptom data and demographics, with no geographic data. (12 data elements)
COVID-19 Case Surveillance Restricted Access Detailed Data: Restricted access, patient-level dataset with clinical and symptom data, demographics, and state and county of residence. Access requires a registration process and a data use agreement. (33 data elements)
The following apply to all three datasets:
Data elements can be found on the COVID-19 case report form located at www.cdc.gov/coronavirus/2019-ncov/downloads/pui-form.pdf.
Data are considered provisional by CDC and are subject to change until the data are reconciled and verified with the state and territorial data providers.
Some data cells are suppressed to protect individual privacy.
The datasets will include all cases with the earliest date available in each record (date received by CDC or date related to illness/specimen collection) at least 14 days prior to the creation of the current datasets. This 14-day lag allows case reporting to be stabilized and ensures that time-dependent outcome data are accurately captured.
Datasets are updated monthly.
Datasets are created using CDC’s Policy on Public Health Research and Nonresearch Data Management and Access and include protections designed to protect individual privacy.
For more information about data collection and reporting, please see https://www.cdc.gov/coronavirus/2019-ncov/covid-data/about-us-cases-deaths.html.
For more information about the COVID-19 case surveillance data, please see https://www.cdc.gov/coronavirus/2019-ncov/covid-data/faq-surveillance.html
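The 14-day lag rule in the list above can be sketched in a few lines. This is only an illustration under stated assumptions, not CDC's actual pipeline: the field names (`cdc_report_dt`, `onset_dt`, `pos_spec_dt`) and the record layout are assumed for the example.

```python
from datetime import date, timedelta

LAG_DAYS = 14  # lag stated in the dataset description

# Assumed (hypothetical) names for the date fields carried by each record.
DATE_FIELDS = ("cdc_report_dt", "onset_dt", "pos_spec_dt")


def earliest_date(record):
    """Return the earliest non-missing date in a record, or None if all are missing."""
    present = [record.get(f) for f in DATE_FIELDS if record.get(f) is not None]
    return min(present) if present else None


def eligible_records(records, created_on):
    """Keep records whose earliest date is at least LAG_DAYS before the creation date."""
    cutoff = created_on - timedelta(days=LAG_DAYS)
    kept = []
    for record in records:
        first = earliest_date(record)
        if first is not None and first <= cutoff:
            kept.append(record)
    return kept


if __name__ == "__main__":
    sample = [
        {"cdc_report_dt": date(2024, 6, 1), "onset_dt": date(2024, 5, 20)},
        {"cdc_report_dt": date(2024, 6, 25)},
    ]
    print(len(eligible_records(sample, created_on=date(2024, 7, 1))))
```

Using the earliest available date per record, rather than a single fixed field, mirrors the wording of the description; how ties or missing dates are really handled is not stated in the source.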

Overview

The COVID-19 case surveillance database includes individual-level data reported to U.S. states and autonomous reporting entities, including New York City and the District of Columbia (D.C.), as well as U.S. territories and affiliates. On April 5, 2020, COVID-19 was added to the Nationally Notifiable Condition List and classified as “immediately notifiable, urgent (within 24 hours)” by a Council of State and Territorial Epidemiologists (CSTE) Interim Position Statement (Interim-20-ID-01). CSTE updated the position statement on August 5, 2020, to clarify the interpretation of antigen detection tests and serologic test results within the case classification (Interim-20-ID-02). The statement also recommended that all states and territories enact laws to make COVID-19 reportable in their jurisdiction, and that jurisdictions conducting surveillance should submit case notifications to CDC. COVID-19 case surveillance data are collected by jurisdictions and reported voluntarily to CDC.

For more information: NNDSS Supports the COVID-19 Response | CDC.

The deidentified data in the “COVID-19 Case Surveillance Public Use Data” include demographic characteristics, any exposure history, disease severity indicators and outcomes, clinical data, laboratory diagnostic test results, and presence of any underlying medical conditions and risk behaviors. All data elements can be found on the COVID-19 case report form located at www.cdc.gov/coronavirus/2019-ncov/downloads/pui-form.pdf.

COVID-19 Case Reports

COVID-19 case reports have been routinely submitted using nationally standardized case reporting forms. On April 5, 2020, CSTE released an Interim Position Statement with national surveillance case definitions for COVID-19 included. Current versions of these case definitions are available here: https://ndc.services.cdc.gov/case-definitions/coronavirus-disease-2019-2021/.

All cases reported on or after were requested to be shared by public health departments to CDC using the standardized case definitions for laboratory-confirmed or probable cases. On May 5, 2020, the standardized case reporting form was revised. Case reporting using this new form is ongoing among U.S. states and territories.

Data are Considered Provisional

The COVID-19 case surveillance data are dynamic; case reports can be modified at any time by the jurisdictions sharing COVID-19 data with CDC. CDC may update prior cases shared with CDC based on any updated information from jurisdictions. For instance, as new information is gathered about previously reported cases, health departments provide updated data to CDC. As more information and data become available, analyses might find changes in surveillance data and trends during a previously reported time window. Data may also be shared late with CDC due to the volume of COVID-19 cases.
Annual finalized data: To create the final NNDSS data used in the annual tables, CDC works carefully with the reporting jurisdictions to reconcile the data received during the year until each state or territorial epidemiologist confirms that the data from their area are correct.
Access Addressing Gaps in Public Health Reporting of Race and Ethnicity for COVID-19, a report from the Council of State and Territorial Epidemiologists, to better understand the challenges in completing race and ethnicity data for COVID-19 and recommendations for improvement.

Data Limitations

To learn more about the limitations in using case surveillance data, visit FAQ: COVID-19 Data and Surveillance.

Data Quality Assurance Procedures

CDC’s Case Surveillance Section routinely performs data quality assurance procedures (i.e., ongoing corrections and logic checks to address data errors). To date, the following data cleaning steps have been implemented:
Questions that have been left unanswered (blank) on the case report form are reclassified to a Missing value, if applicable to the question. For example, in the question “Was the individual hospitalized?” where the possible answer choices include “Yes,” “No,” or “Unknown,” the blank value is recoded to Missing because the case report form did not include a response to the question.
Logic checks are performed for date data. If an illogical date has been provided, CDC reviews the data with the reporting jurisdiction. For example, if a symptom onset date in the future is reported to CDC, this value is set to null until the reporting jurisdiction updates the date appropriately.
Additional data quality processing to recode free text data is ongoing. Data on symptoms, race and ethnicity, and healthcare worker status have been prioritized.
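The first two cleaning steps listed above can be illustrated as follows. This is a minimal sketch under assumptions: the field names `hosp_yn` and `onset_dt` and the dictionary record layout are hypothetical, and the real procedure also involves review with the reporting jurisdiction, which code cannot capture.

```python
from datetime import date

# Assumed (hypothetical) names of categorical questions whose blanks become "Missing".
CATEGORICAL_FIELDS = ("hosp_yn",)


def clean_record(record, today):
    """Apply two of the described cleaning steps to a copy of one record."""
    cleaned = dict(record)

    # Step 1: a blank answer is recoded to the "Missing" value.
    for field in CATEGORICAL_FIELDS:
        value = cleaned.get(field)
        if value is None or str(value).strip() == "":
            cleaned[field] = "Missing"

    # Step 2: an illogical (future) symptom onset date is set to null.
    onset = cleaned.get("onset_dt")
    if onset is not None and onset > today:
        cleaned["onset_dt"] = None

    return cleaned


if __name__ == "__main__":
    raw = {"hosp_yn": "", "onset_dt": date(2030, 1, 1)}
    print(clean_record(raw, today=date(2024, 7, 1)))
```

Valid answers such as "Yes", "No" or "Unknown" pass through unchanged; only blanks are recoded.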

Data Suppression

To prevent release of data that could be used to identify people, data cells are suppressed for low frequency (<5) records and indirect identifiers (e.g., date of first positive specimen). Suppression includes rare combinations of demographic characteristics (sex, age group, race/ethnicity). Suppressed values are re-coded to the NA answer option; records with data suppression are never removed.
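A small-cell suppression rule of this kind might look like the sketch below. It is an illustration only: the field names are hypothetical, and suppressing all three demographic fields together is a simplifying assumption, since the source does not say which cells CDC recodes for a given rare combination.

```python
from collections import Counter

THRESHOLD = 5  # combinations seen fewer than 5 times are treated as rare

# Assumed (hypothetical) names for the demographic fields.
DEMOGRAPHIC_FIELDS = ("sex", "age_group", "race_ethnicity")


def suppress_rare_combinations(records):
    """Recode rare demographic combinations to "NA" without dropping any record."""
    counts = Counter(tuple(r[f] for f in DEMOGRAPHIC_FIELDS) for r in records)
    result = []
    for record in records:
        key = tuple(record[f] for f in DEMOGRAPHIC_FIELDS)
        row = dict(record)
        if counts[key] < THRESHOLD:
            for field in DEMOGRAPHIC_FIELDS:
                row[field] = "NA"
        result.append(row)  # records are kept, only cell values change
    return result
```

The output always has the same number of rows as the input, matching the statement that records with data suppression are never removed.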

For questions, please contact Ask SRRG (eocevent394@cdc.gov).

Additional COVID-19 Data

COVID-19 data are available to the public as summary or aggregate count files, including total counts of cases and deaths by state and by county. These
Asset database for the Clarence-Moreton bioregion on 16 September 2015
data.gov.au
researchdata.edu.au
Updated Nov 19, 2019
Cite
Bioregional Assessment Program (2019). Asset database for the Clarence-Moreton bioregion on 16 September 2015 [Dataset]. https://data.gov.au/data/dataset/activity/e7940ec8-ec73-4cc5-bc4e-0c85f98354f1
Explore at:
Dataset updated
Nov 19, 2019
Dataset provided by
Bioregional Assessment Program
Description
Abstract

The dataset was derived by the Bioregional Assessment Programme from multiple source datasets. The source datasets are identified in the Lineage field in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.

This V7 database has been updated to include the Receptor data from Clarence-Moreton assessment team. The relevant tables of ReceptorList and tbl_Receptors and the spatial data of GM_CLM_ReceptorList_pt were added to this version. The Clarence-Moreton Asset database v7 supersedes the previous version of the Asset database only in Receptor relevant tables/ feature class (i.e. ReceptorList, tbl_Receptors and GM_CLM_ReceptorList_pt ).

This dataset contains v7 of the Asset database (CLM_asset_database_20150916.mdb), a Geodatabase version for GIS mapping purposes (CLM_asset_database_20150916_GISOnly.gdb), the draft Receptor Register spreadsheet (BA-CLM-CLM-140-ReceptorRegister-V20150916.xlsx), a data dictionary (CLM_asset_database_doc_20150916.doc), and a folder (NRM_DOC) containing documentation associated with the WAIT process as outlined below.

Under the BA program, a spatial assets database is developed for each defined bioregional assessment project. The spatial elements that underpin the identification of water dependent assets are identified in the first instance by regional NRM organisations (via the WAIT tool) and supplemented with additional elements from national and state/territory government datasets. All reports received associated with the WAIT process for Clarence-Moreton are included in the zip file as part of this dataset. Elements are initially included in the preliminary assets database if they are partly or wholly within the subregion's preliminary assessment extent (Materiality Test 1, M1). Elements are then grouped into assets which are evaluated by project teams to determine whether they meet the second Materiality Test (M2). Assets meeting both Materiality Tests comprise the water dependent asset list. Descriptions of the assets identified in the Clarence-Moreton subregion are found in the "AssetList" table of the database. In this version of the database M1 and M2 have both been assessed. Assets are the spatial features used by project teams to model scenarios under the BA program. Detailed attribution does not exist at the asset level. Asset attribution includes only the core set of BA-derived attributes reflecting the BA classification hierarchy, as described in Appendix A of "CLM_asset_database_doc_20150916.doc", located in the zip file as part of this dataset. The "Element_to_Asset" table contains the relationships and identifies the elements that were grouped to create each asset. Detailed information describing the database structure and content can be found in the document "CLM_asset_database_doc_20150916.doc" located in the zip file. Some of the source data used in the compilation of this dataset is restricted.
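The relationship between the two materiality tests and the element-to-asset grouping described above can be sketched as follows. The in-memory structures and the keys `asset_id`, `m1` and `m2` are hypothetical stand-ins for illustration; they are not the actual column names of the "AssetList" or "Element_to_Asset" tables.

```python
def water_dependent_assets(assets):
    """Return the assets that pass both Materiality Test 1 (M1) and Test 2 (M2)."""
    return [a for a in assets if a["m1"] and a["m2"]]


def elements_for_asset(element_to_asset, asset_id):
    """List the element IDs that were grouped to create one asset."""
    return sorted(elem for elem, asset in element_to_asset if asset == asset_id)


if __name__ == "__main__":
    # Hypothetical example data.
    assets = [
        {"asset_id": 1, "m1": True, "m2": True},
        {"asset_id": 2, "m1": True, "m2": False},
    ]
    links = [(101, 1), (102, 1), (103, 2)]  # (element_id, asset_id) pairs
    for asset in water_dependent_assets(assets):
        print(asset["asset_id"], elements_for_asset(links, asset["asset_id"]))
```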

Dataset History

OBJECTID VersionID Date_ Notes

1 1 2/07/2014 Initial database.

2 2 15/08/2014 Initial database with new WSP assets

3 3 9/09/2014 Add 87 line assets from early SEQLD WAIT data; updated NSW RegERiv and GWMP assets, and changed AID (5010 to 5180 to 15010 to 15180)

4 3.1 15/09/2014 Updated class "Groundwater-dependent ecosystems" to "Groundwater-dependent ecosystem"

5 4 20/02/2015 Add additional eight datasets from the community consultation and assessment team: QLD RE 11, NSW GDE, QLD Wetland System 100K, QLD GDE Surface Areas, QLD GDE Line, QLD GDE Terrestrial Areas, NGIS QLD Bores and AHGF Network Stream. Turn off National GDE assets

6 5 3/06/2015 As requested by the CLM assessment team, NGIS economic assets with AIDs from 17009 to 17013 inclusive replace the previous NGIS ecological assets (from the same QLD NGIS elements) with AIDs from 16800 to 16803 inclusive. Turn off assets with AIDs from 16800 to 16803

7 6 6/07/2015 This v6 CLM Asset database includes M2 test results ("Does the asset pass the water dependency test?" or WDTest) from the Clarence-Moreton assessment project team.

8 6.1 19/08/2015 (1) Corrected the spelling error of PAE_Region to Clarence-Moreton for all assets and elements; (2)(a) extracted long (>255 characters) WD rationale for 2 assets in tab "Water-dependent asset register" and 4 assets in tab "Asset list" in 1.30 Excel file, (b) recreated in BA-CLM-CLM-130-WaterDependentAssetRegister-AssetList-V20150819.xlsx; (3) modified queries (Find_All_Asset_List and Find_Waterdependent_asset_register) for (2)(a)

9 7 16/09/2015 (1)(a) Add table ReceptorList in CLM_asset_database_20150916.mdb, using the Excel file from CLM project team, (b) create draft BA-CLM-CLM-140-ReceptorRegister-V20150916.xlsx; (2) add table tbl_Receptors in CLM_asset_database_20150916.mdb and GM_CLM_ReceptorList_pt in CLM_asset_database_20150916_GISOnly.gdb, using the spatial data from CLM project team; (3) add SQL query "Find_used_Receptor" for extracting all used receptors for the register

Dataset Citation

Bioregional Assessment Programme (2014) Asset database for the Clarence-Moreton bioregion on 16 September 2015. Bioregional Assessment Derived Dataset. Viewed 10 July 2017, http://data.bioregionalassessments.gov.au/dataset/e7940ec8-ec73-4cc5-bc4e-0c85f98354f1.

Dataset Ancestors

Derived From QLD Dept of Natural Resources and Mines, Groundwater Entitlements 20131204

Derived From Combined Surface Waterbodies for the Clarence-Moreton bioregion

Derived From Queensland QLD - Regional - NRM - Water Asset Information Tool - WAIT - databases

Derived From Version 02 Asset list for Clarence Morton 8/8/2014 - ERIN ORIGINAL DATA

Derived From Asset database for the Clarence-Moreton bioregion on 11 December 2014, minor version v20150603

Derived From NSW Office of Water Surface Water Entitlements Locations v1_Oct2013

Derived From Geofabric Surface Catchments - V2.1

Derived From Matters of State environmental significance (version 4.1), Queensland

Derived From Geofabric Surface Network - V2.1

Derived From Communities of National Environmental Significance Database - RESTRICTED - Metadata only

Derived From Ramsar Wetlands of Australia

Derived From National Groundwater Dependent Ecosystems (GDE) Atlas

Derived From QLD Dept of Natural Resources and Mines, Groundwater Entitlements linked to bores v3 03122014

Derived From Multi-resolution Valley Bottom Flatness MrVBF at three second resolution CSIRO 20000211

Derived From National Groundwater Information System (NGIS) v1.1

Derived From Birds Australia - Important Bird Areas (IBA) 2009

Derived From Geofabric Surface Network - V2.1.1

Derived From Queensland QLD Regional CMA Water Asset Information WAIT tool databases RESTRICTED Includes ALL Reports

Derived From Queensland wetland data version 3 - wetland areas.

Derived From Multi-resolution Ridge Top Flatness at 3 second resolution CSIRO 20000211

Derived From South East Queensland GDE (draft)

Derived From Geofabric Surface Cartography - V2.1

Derived From Version 01 Asset list for Clarence Morton 10/3/2014 - ERIN ORIGINAL DATA

Derived From National Groundwater Dependent Ecosystems (GDE) Atlas (including WA)

Derived From CLM - 16swo NSW Office of Water Surface Water Offtakes - Clarence Moreton v1 24102013

Derived From Species Profile and Threats Database (SPRAT) - Australia - Species of National Environmental Significance Database (BA subset - RESTRICTED - Metadata only)

Derived From QLD Dept of Natural Resources and Mines, Surface Water Entitlements 131204

Derived From GEODATA TOPO 250K Series 3, File Geodatabase format (.gdb)

Derived From [Asset database for the Clarence-Moreton bioregion on 19 August
Connectomix test dataset 2
openneuro.org
Updated Dec 8, 2024
Cite
Antonin Rovai (2024). Connectomix test dataset 2 [Dataset]. http://doi.org/10.18112/openneuro.ds005699.v1.0.0
Explore at:
Unique identifier
https://doi.org/10.18112/openneuro.ds005699.v1.0.0
Dataset updated
Dec 8, 2024
Dataset provided by
OpenNeuro (https://openneuro.org/)
Authors
Antonin Rovai
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
Connectomix test dataset 2

This dataset is designed to test features of Connectomix. For more information, please visit the GitHub repository:

https://github.com/ln2t/connectomix

The dataset contains data for 4 participants.

Rawdata

Anatomical: T1w image (defaced)

Functional: A standard 6-minute resting-state scan

Groups: the participants are split into control and patient groups in the participants.tsv file. This split has been done artificially and serves testing purposes only.

Derivatives

fmriprep: preprocessed data using fMRIPrep

connectomix: results of the connectomix software (see below for details)

Commands

The exact commands to run the analyses depend on your installation of fMRIPrep. In what follows, we simply assume that fmriprep is the command for fMRIPrep. We show here the simplest version of the commands; adapt them to your setup (e.g. if you use Docker).

We also assume that the data are at the following locations:

bids_dir='/data/ds005699'
derivatives_dir='/data/ds005699/derivatives'

Preprocessing (fMRIPrep)

fmriprep $bids_dir ${derivatives_dir}/fmriprep participant --fs-license-file /path/to/fs/license

Analysis: connectomix

Note: The following has been tested for connectomix version 1.0.1.

First, set up the path to the connectomix script:

connectomix_cmd='/path/to/connectomix/connectomix/connectomix.py'

Second, set up the path to the config directory:

config_dir='/data/ds005625/code/connectomix/config'

Participant-level

$connectomix_cmd ${bids_dir} ${derivatives_dir}/connectomix participant --derivatives fmriprep="${derivatives_dir}/fmriprep" --config "${config_dir}/participant_level_config.yaml"

Group-level

Notes:

- This is an example of an independent two-sample t-test.
- Since the dataset contains only four subjects (two subjects per group), the number of possible permutations is very low. For this reason, the number of computed permutations is set to 4, and connectomix can then complete the group-level analysis. Of course, realistic cases should include not only many more participants, but also a much larger number of permutations (see connectomix documentation).

$connectomix_cmd ${bids_dir} ${derivatives_dir}/connectomix group --config "${config_dir}/group_level_config.yaml"
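To see why so few permutations are available here, one can count the distinct ways of reassigning group labels in a two-sample design. The sketch below is plain combinatorics and makes no claim about how connectomix itself enumerates or samples permutations; the function name is made up for illustration.

```python
from math import comb


def n_label_assignments(n_group_a, n_group_b):
    """Number of distinct ways to split all subjects into groups of the given sizes."""
    return comb(n_group_a + n_group_b, n_group_a)


if __name__ == "__main__":
    # Two subjects per group, as in this test dataset.
    print(n_label_assignments(2, 2))
    # A larger, more realistic design for comparison.
    print(n_label_assignments(20, 20))
```

With two subjects per group there are only 6 distinct labelings (including the observed one), so a permutation null distribution cannot be resolved finely, whereas 20 subjects per group already gives more than 10^11 labelings.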
SWOT Level 2 River Single-Pass Vector Reach Data Product for Ukraine
ihp-wins.unesco.org
data.dev-wins.com
pdf, shp
Updated Jan 31, 2025
Cite
(2025). SWOT Level 2 River Single-Pass Vector Reach Data Product for Ukraine [Dataset]. https://ihp-wins.unesco.org/dataset/swot-level-2-river-single-pass-vector-reach-data-product-for-ukraine
Explore at:
Available download formats: shp, pdf
Dataset updated
Jan 31, 2025
License
http://www.opendefinition.org/licenses/cc-by-sa
Description
The SWOT Level 2 River Single-Pass Vector Reach Data Product from the Surface Water and Ocean Topography (SWOT) mission provides water surface elevation, slope, width, and discharge derived from the high rate (HR) data stream from the Ka-band Radar Interferometer (KaRIn). SWOT launched on December 16, 2022 from Vandenberg Air Force Base in California into a 1-day repeat orbit for the "calibration" or "fast-sampling" phase of the mission, which completed in early July 2023. After the calibration phase, SWOT entered a 21-day repeat orbit in August 2023 to start the "science" phase of the mission, which is expected to continue through 2025.

Water surface elevation, slope, width, and discharge are provided for river reaches (approximately 10 km long) and nodes (approximately 200 m spacing) identified in the prior river database, and distributed as feature datasets covering the full swath for each continent-pass. These data are generally produced for inland and coastal hydrology surfaces, as controlled by the reloadable KaRIn HR mask. The dataset is distributed in ESRI Shapefile format. Please note that this collection contains SWOT Version C science data products.

This collection is a sub-collection of its parent (https://podaac.jpl.nasa.gov/dataset/SWOT_L2_HR_RiverSP_2.0) and contains only river reaches.
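The approximate figures quoted above (reaches about 10 km long, nodes about 200 m apart, a 21-day repeat orbit) imply some simple back-of-the-envelope numbers. The sketch below only does that arithmetic; the function names are made up, and real reach lengths and node counts vary from reach to reach.

```python
REACH_LENGTH_M = 10_000  # approximate reach length from the description
NODE_SPACING_M = 200     # approximate node spacing from the description
SCIENCE_REPEAT_DAYS = 21  # repeat period of the science orbit


def approx_nodes_per_reach(reach_length_m=REACH_LENGTH_M, spacing_m=NODE_SPACING_M):
    """Rough number of nodes along one reach at the nominal spacing."""
    return round(reach_length_m / spacing_m)


def complete_repeat_cycles(n_days, repeat_days=SCIENCE_REPEAT_DAYS):
    """Number of complete orbit repeat cycles that fit in a span of days."""
    return n_days // repeat_days


if __name__ == "__main__":
    print(approx_nodes_per_reach())     # nodes per nominal reach
    print(complete_repeat_cycles(365))  # complete 21-day cycles in one year
```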
Protected Areas Database of the United States (PAD-US) 2.1
catalog.data.gov
data.usgs.gov
Updated Jul 6, 2024
Cite
U.S. Geological Survey (2024). Protected Areas Database of the United States (PAD-US) 2.1 [Dataset]. https://catalog.data.gov/dataset/protected-areas-database-of-the-united-states-pad-us-2-1
Explore at:
Dataset updated
Jul 6, 2024
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Area covered
United States
Description
NOTE: A more current version of the Protected Areas Database of the United States (PAD-US) is available: PAD-US 3.0 https://doi.org/10.5066/P9Q9LQ4B. The USGS Protected Areas Database of the United States (PAD-US) is the nation's inventory of protected areas, including public land and voluntarily provided private protected areas, identified as an A-16 National Geospatial Data Asset in the Cadastre Theme (https://communities.geoplatform.gov/ngda-cadastre/). The PAD-US is an ongoing project with several published versions of a spatial database including areas dedicated to the preservation of biological diversity, and other natural (including extraction), recreational, or cultural uses, managed for these purposes through legal or other effective means. The database was originally designed to support biodiversity assessments; however, its scope expanded in recent years to include all public and nonprofit lands and waters. Most are public lands owned in fee (the owner of the property has full and irrevocable ownership of the land); however, long-term easements, leases, agreements, Congressional (e.g. 'Wilderness Area'), Executive (e.g. 'National Monument'), and administrative designations (e.g. 'Area of Critical Environmental Concern') documented in agency management plans are also included. The PAD-US strives to be a complete inventory of public land and other protected areas, compiling “best available” data provided by managing agencies and organizations. The PAD-US geodatabase maps and describes areas using over twenty-five attributes and five feature classes representing the U.S. protected areas network in separate feature classes: Fee (ownership parcels), Designation, Easement, Marine, Proclamation and Other Planning Boundaries. Five additional feature classes include various combinations of the primary layers (for example, Combined_Fee_Easement) to support data management, queries, web mapping services, and analyses. 
This PAD-US Version 2.1 dataset includes a variety of updates and new data from the previous Version 2.0 dataset (USGS, 2018 https://doi.org/10.5066/P955KPLE ), achieving the primary goal to "Complete the PAD-US Inventory by 2020" (https://www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/science/pad-us-vision) by addressing known data gaps with newly available data. The following list summarizes the integration of "best available" spatial data to ensure public lands and other protected areas from all jurisdictions are represented in PAD-US, along with continued improvements and regular maintenance of the federal theme. Completing the PAD-US Inventory: 1) Integration of over 75,000 city parks in all 50 States (and the District of Columbia) from The Trust for Public Land's (TPL) ParkServe data development initiative (https://parkserve.tpl.org/) added nearly 2.7 million acres of protected area and significantly reduced the primary known data gap in previous PAD-US versions (local government lands). 2) First-time integration of the Census American Indian/Alaskan Native Areas (AIA) dataset (https://www2.census.gov/geo/tiger/TIGER2019/AIANNH) representing the boundaries for federally recognized American Indian reservations and off-reservation trust lands across the nation (as of January 1, 2020, as reported by the federally recognized tribal governments through the Census Bureau's Boundary and Annexation Survey) addressed another major PAD-US data gap. 3) Aggregation of nearly 5,000 protected areas owned by local land trusts in 13 states, aggregated by Ducks Unlimited through data calls for easements to update the National Conservation Easement Database (https://www.conservationeasement.us/), increased PAD-US protected areas by over 350,000 acres. 
Maintaining regular Federal updates: 1) Major update of the Federal estate (fee ownership parcels, easement interest, and management designations), including authoritative data from 8 agencies: Bureau of Land Management (BLM), U.S. Census Bureau (Census), Department of Defense (DOD), U.S. Fish and Wildlife Service (FWS), National Park Service (NPS), Natural Resources Conservation Service (NRCS), U.S. Forest Service (USFS), National Oceanic and Atmospheric Administration (NOAA). The federal theme in PAD-US is developed in close collaboration with the Federal Geographic Data Committee (FGDC) Federal Lands Working Group (FLWG, https://communities.geoplatform.gov/ngda-govunits/federal-lands-workgroup/). 2) Complete National Marine Protected Areas (MPA) update from the National Oceanic and Atmospheric Administration (NOAA) MPA Inventory, including conservation measure ('GAP Status Code', 'IUCN Category') review by NOAA.
Other changes: 1) PAD-US field name change - The "Public Access" field name changed from 'Access' to 'Pub_Access' to avoid unintended scripting errors associated with the script command 'access'. 2) Additional field - The "Feature Class" (FeatClass) field was added to all layers within PAD-US 2.1 (only included in the "Combined" layers of PAD-US 2.0 to describe which feature class data originated from). 3) Categorical GAP Status Code default changes - National Monuments are categorically assigned GAP Status Code = 2 (previously GAP 3), in the absence of other information, to better represent biodiversity protection restrictions associated with the designation. The Bureau of Land Management Areas of Critical Environmental Concern (ACECs) are categorically assigned GAP Status Code = 3 (previously GAP 2) as the areas are administratively protected, not permanent. More information is available upon request. 4) Agency Name (FWS) geodatabase domain description changed to U.S. Fish and Wildlife Service (previously U.S. Fish & Wildlife Service). 5) Select areas in the provisional PAD-US 2.1 Proclamation feature class were removed following a consultation with the data steward (Census Bureau). Tribal designated statistical areas are purely a geographic area for providing Census statistics with no land base. Most affected areas are relatively small; however, 4,341,120 acres and 37 records were removed in total. Contact Mason Croft (masoncroft@boisestate) for more information about how to identify these records. For more information regarding the PAD-US dataset please visit https://usgs.gov/gapanalysis/PAD-US/. For more information about data aggregation please review the Online PAD-US Data Manual available at https://www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/pad-us-data-manual .
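Scripts written against PAD-US 2.0 may need adjusting for the field rename and the new categorical GAP defaults described above. The following is a hedged sketch on plain dictionaries: 'Access' and 'Pub_Access' come from the text, while the `Des_Tp` and `GAP_Sts` keys, the designation labels and the record layout are assumptions made for illustration, and the defaults are applied only when no GAP code is present.

```python
# Categorical defaults stated for PAD-US 2.1 (used in the absence of other information).
DEFAULT_GAP_BY_DESIGNATION = {
    "National Monument": 2,
    "Area of Critical Environmental Concern": 3,
}


def migrate_record(record):
    """Return a copy of a 2.0-style record using the 2.1 field name and GAP defaults."""
    out = dict(record)

    # Field rename: 'Access' became 'Pub_Access' in PAD-US 2.1.
    if "Access" in out:
        out["Pub_Access"] = out.pop("Access")

    # Apply a categorical default only when no GAP code is given (assumed key names).
    if out.get("GAP_Sts") is None:
        default = DEFAULT_GAP_BY_DESIGNATION.get(out.get("Des_Tp"))
        if default is not None:
            out["GAP_Sts"] = default

    return out


if __name__ == "__main__":
    print(migrate_record({"Access": "OA", "Des_Tp": "National Monument", "GAP_Sts": None}))
```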
SWOT Level 2 River Single-Pass Vector Node Data Product for Ukraine
data.dev-wins.com
ihp-wins.unesco.org
0, shp
Updated Feb 20, 2025
Cite
(2025). SWOT Level 2 River Single-Pass Vector Node Data Product for Ukraine [Dataset]. https://data.dev-wins.com/dataset/swot-level-2-river-single-pass-vector-node-data-product-for-ukraine
Explore at:
Available download formats: shp, 0
Dataset updated
Feb 20, 2025
License
http://www.opendefinition.org/licenses/cc-by-sa
Area covered
Ukraine
Description
The SWOT Level 2 River Single-Pass Vector Node Data Product from the Surface Water and Ocean Topography (SWOT) mission provides water surface elevation, slope, width, and discharge derived from the high rate (HR) data stream from the Ka-band Radar Interferometer (KaRIn). SWOT launched on December 16, 2022 from Vandenberg Air Force Base in California into a 1-day repeat orbit for the "calibration" or "fast-sampling" phase of the mission, which completed in early July 2023. After the calibration phase, SWOT entered a 21-day repeat orbit in August 2023 to start the "science" phase of the mission, which is expected to continue through 2025.

Water surface elevation, slope, width, and discharge are provided for river reaches (approximately 10 km long) and nodes (approximately 200 m spacing) identified in the prior river database, and distributed as feature datasets covering the full swath for each continent-pass. These data are generally produced for inland and coastal hydrology surfaces, as controlled by the reloadable KaRIn HR mask. The dataset is distributed in ESRI Shapefile format. Please note that this collection contains SWOT Version C science data products.

This collection is a sub-collection of its parent (https://podaac.jpl.nasa.gov/dataset/SWOT_L2_HR_RiverSP_2.0) and contains only river nodes.
ANV - Probability distribution for Corylus avellana
data.opendatascience.eu
Updated Jan 2, 2021
Cite
(2021). ANV - Probability distribution for Corylus avellana [Dataset]. https://data.opendatascience.eu/geonetwork/srv/search?type=dataset
Explore at:
Dataset updated
Jan 2, 2021
Description
Overview: Actual Natural Vegetation (ANV): probability of occurrence for the Common hazel in its realized environment for the period 2000 - 2022 Traceability (lineage): This is an original dataset produced with a machine learning framework which used a combination of point datasets and raster datasets as inputs. Point dataset is a harmonized collection of tree occurrence data, comprising observations from National Forest Inventories (EU-Forest), GBIF and LUCAS. The complete dataset is available on Zenodo. Raster datasets used as input are: harmonized and gapfilled time series of seasonal aggregates of the Landsat GLAD ARD dataset (bands and spectral indices); monthly time series air and surface temperature and precipitation from a reprocessed version of the Copernicus ERA5 dataset; long term averages of bioclimatic variables from CHELSA, tree species distribution maps from the European Atlas of Forest Tree Species; elevation, slope and other elevation-derived metrics; long term monthly averages snow probability and long term monthly averages of cloud fraction from MODIS. For a more comprehensive list refer to Bonannella et al. (2022) (in review, preprint available at: https://doi.org/10.21203/rs.3.rs-1252972/v1). Scientific methodology: Probability and uncertainty maps were the output of a spatiotemporal ensemble machine learning framework based on stacked regularization. Three base models (random forest, gradient boosted trees and generalized linear models) were first trained on the input dataset and their predictions were used to train an additional model (logistic regression) which provided the final predictions. More details on the whole workflow are available in the listed publication. Usability: Probability maps can be used to detect potential forest degradation and compositional change across the time period analyzed. Some possible applications for these topics are explained in the listed publication. 
Uncertainty quantification: Uncertainty is quantified by taking the standard deviation of the probabilities predicted by the three components of the spatiotemporal ensemble model.
Data validation approaches: Distribution maps were validated using a spatial 5-fold cross validation following the workflow detailed in the listed publication.
Completeness: The raster files perfectly cover the entire Geo-harmonizer region as defined by the landmask raster dataset available here.
Consistency: Areas outside the calibration area of the point dataset (Iceland, Norway) usually have high uncertainty values. This is a problem not only of extrapolation but also of poor representation, in the feature space available to the model, of the conditions present in these countries.
Positional accuracy: The rasters have a spatial resolution of 30m.
Temporal accuracy: The maps cover the period 2000 - 2020. Each map covers a certain number of years according to the following scheme: (1) 2000--2002, (2) 2002--2006, (3) 2006--2010, (4) 2010--2014, (5) 2014--2018 and (6) 2018--2020.
Thematic accuracy: Both probability and uncertainty maps contain values from 0 to 100. In the probability maps, the values indicate the probability of occurrence of a single individual of the target species; in the uncertainty maps, they indicate the standard deviation of the ensemble model.
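The uncertainty definition above can be illustrated with a minimal sketch. It assumes the three base-model probabilities for one pixel are available as fractions in [0, 1] and rescales them to the 0 to 100 range of the maps. The description does not say whether the population or the sample standard deviation is used, so the population form here is an assumption, as is the function name.

```python
from statistics import pstdev


def ensemble_uncertainty(base_probs):
    """Spread of the three base-model probabilities for one pixel, on a 0-100 scale.

    base_probs: probabilities (fractions in [0, 1]) from the random forest,
    gradient boosted trees and generalized linear model.
    Assumption: population standard deviation (the source does not specify).
    """
    if len(base_probs) != 3:
        raise ValueError("expected one probability per base model (3 in total)")
    scaled = [100.0 * p for p in base_probs]
    return pstdev(scaled)


# Base models that disagree produce a non-zero uncertainty value.
print(round(ensemble_uncertainty([0.60, 0.70, 0.80]), 2))  # roughly 8.16
```

A pixel where all three base models agree gets an uncertainty of 0, which matches the observation that poorly represented regions, where the models diverge, show high values.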
Galaxy, star, quasar dataset
scidb.cn
Updated Feb 3, 2023
Li Xin (2023). Galaxy, star, quasar dataset [Dataset]. http://doi.org/10.57760/sciencedb.07177
Unique identifier
https://doi.org/10.57760/sciencedb.07177
Dataset updated
Feb 3, 2023
Dataset provided by
Science Data Bank
Authors
Li Xin
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The data used in this paper come from the 16th data release of SDSS (SDSS-DR16). SDSS-DR16 contains a total of 930,268 photometric images, with 1.2 billion observed sources and tens of millions of spectra. The data were downloaded from the official SDSS website; specifically, they were obtained through the SkyServer API by running SQL queries in the CasJobs sub-site. Because the current SDSS photometric table PhotoObj can only classify observed sources as point sources or surface sources, the target sources are better classified as galaxies, stars and quasars through their spectra. We therefore obtained calibrated sources in CasJobs by crossing SpecPhoto with the PhotoObj star list, and obtained the target position information (right ascension and declination). Calibrated sources allow the classes to be told apart precisely and quickly. Each calibrated source is labeled with the parameter "Class" as "galaxy", "star", or "quasar". Observation day areas 3462, 3478, 3530 and 4 other areas in SDSS-DR16 were selected as experimental data, because a large number of sources can be obtained in these areas to provide rich sample data for the experiment. For example, there are 9891 sources in the 3462-day area, including 2790 galactic sources, 2378 stellar sources and 4723 quasar sources. There are 3862 sources in the 3478-day area, including 1759 galactic sources, 577 stellar sources and 1526 quasar sources. FITS is a commonly used data format in the astronomical community. By cross-matching the star list and the FITS files in the local celestial region, we obtained images in the 5 bands u, g, r, i and z of 12499 galaxy sources, 16914 quasar sources and 16908 star sources as training and testing data.
1.1 Image Synthesis
SDSS photometric data include photometric images in the five bands u, g, r, i and z, and these images are packaged in single-band format in FITS files. Images of different bands contain different information.
Since the three bands g, r and i contain more feature information and less noise, astronomical researchers typically use the g, r and i bands, corresponding to the R, G and B channels of the image, to synthesize photometric images. Generally, different bands cannot be synthesized directly: if three bands are combined directly, the images of the different bands may not be aligned. Therefore, this paper adopts the RGB multi-band image synthesis software written by He Zhendong et al. to synthesize images in the g, r and i bands. This method effectively avoids the alignment problem between bands. Each photometric image in this paper is 2048×1489 pixels.
1.2 Data tailoring
The target images are first clipped. Image clipping can be done with image segmentation tools; this paper implements the process in Python. During clipping, we convert the right ascension and declination of each source in the star list into pixel coordinates on the photometric image through the coordinate conversion formula, and determine the position of the source from the pixel coordinates. The coordinates are taken as the center point, and clipping is carried out with a rectangular box. We found that the input image size affects the experimental results. Therefore, according to the target size of the source, we selected three different cutting sizes: 40×40, 60×60 and 80×80. Through experiment and analysis, we found that the convolutional neural network learns better and reaches higher accuracy on data with a small image size. In the end, we chose to cut the surface-source galaxies and the point-source quasars and stars to 40×40.
1.3 Division of training and test data
For the algorithm to achieve accurate recognition performance, enough image samples are needed. The selection of the training set, verification set and test set is an important factor affecting the final recognition accuracy.
In this paper, the training set, verification set and test set are split in the ratio 8:1:1. The verification set is used to tune the algorithm, and the test set is used to evaluate the generalization ability of the final algorithm. Table 1 shows the specific data partitioning information. The total sample size is 34,000 source images, including 11543 galaxy sources, 11967 star sources, and 10490 quasar sources.
1.4 Data preprocessing
In this experiment, the training set and test set are used as the training and test input of the algorithm after data preprocessing. The quantity and quality of the data largely determine the recognition performance of the algorithm. The training set and the test set are preprocessed differently. In the training set, we first apply vertical flip, horizontal flip and scaling to the cropped image to enrich the data samples and enhance the generalization ability of the algorithm. Since the features of celestial sources are flip-invariant, the labels of galaxies, stars and quasars do not change under these transformations. In the test set, preprocessing is simpler than for the training set: we apply simple scaling to the input image and use the result as the test input.
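The tailoring, splitting and flipping steps described above can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' code: images are represented as nested lists, the source position is assumed to be already converted to pixel coordinates, sources too close to the image border are simply skipped, and the scaling step is omitted. All function names are hypothetical.

```python
import random


def crop_centered(image, row, col, size=40):
    """Cut a size x size box centred on pixel (row, col).

    Returns None when the box would extend past the image edge
    (assumption: such sources are skipped rather than padded).
    """
    half = size // 2
    top, left = row - half, col - half
    if top < 0 or left < 0 or top + size > len(image) or left + size > len(image[0]):
        return None
    return [r[left:left + size] for r in image[top:top + size]]


def flip_augment(patch):
    """Return the original patch plus its vertical and horizontal flips."""
    vertical = patch[::-1]                    # reverse the row order
    horizontal = [r[::-1] for r in patch]     # reverse each row
    return [patch, vertical, horizontal]


def split_8_1_1(items, seed=0):
    """Shuffle and split items into training, verification and test sets (8:1:1)."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 8 // 10
    n_val = len(items) // 10
    return items[:n_train], items[n_train:n_train + n_val], items[n_train + n_val:]


train, val, test = split_8_1_1(range(34000))
print(len(train), len(val), len(test))  # 27200 3400 3400
```

With the 34,000 source images mentioned above, the split gives 27,200 training images and 3,400 each for verification and testing. Flips are applied to training patches only, consistent with the simpler test-set preprocessing.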
Asset database for the Hunter subregion on 24 February 2016 Public 20170112...
researchdata.edu.au
cloud.csiss.gmu.edu
Updated Mar 20, 2017
Bioregional Assessment Program (2017). Asset database for the Hunter subregion on 24 February 2016 Public 20170112 v02 [Dataset]. https://researchdata.edu.au/asset-database-hunter-20170112-v02/2993836
Dataset updated
Mar 20, 2017
Dataset provided by
data.gov.au
Authors
Bioregional Assessment Program
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Abstract

The dataset was derived by the Bioregional Assessment Programme. This dataset was derived from multiple datasets including Natural Resource Management regions, and Australian and state and territory government databases. You can find a link to the parent datasets in the Lineage Field in this metadata statement. The History Field in this metadata statement describes how this dataset was derived. A single asset is represented spatially in the asset database by single or multiple spatial features (point, line or polygon). Individual points, lines or polygons are termed elements.

This data set holds the publicly-available version of the database of water-dependent assets that was compiled for the bioregional assessment (BA) of the Hunter subregion as part of the Bioregional Assessment Technical Programme. Though all life is dependent on water, for the purposes of a bioregional assessment, a water-dependent asset is an asset potentially impacted by changes in the groundwater and/or surface water regime due to coal resource development. The water must be other than local rainfall. Examples include wetlands, rivers, bores and groundwater dependent ecosystems.

This dataset contains the unrestricted publicly-available components of spatial and non-spatial (attribute) data of the (restricted) Asset database for the Hunter subregion on 24 February 2016 (a39290ac-3925-4abc-9ecb-b91e911f008f). The database is provided primarily as an ESRI File geodatabase (.gdb), which is able to be opened in readily available open source software such as QGIS. Other formats include the Microsoft Access database (.mdb in ESRI Personal Geodatabase format), industry-standard ESRI Shapefiles and tab-delimited text files of all the attribute tables.

The restricted version of the Hunter Asset database has a total count of 182 277 Elements (and 2 545 Assets). In the public version of the Hunter Asset database, 69 330 spatial Element features (~38%) have been removed from the Element List and Element Layer(s), and 1124 spatial Assets (~44%) have been removed from the spatial Asset Layer(s).

The elements/assets removed from the restricted Asset Database are from the following data sources:

1) Species Profile and Threats Database (SPRAT) - RESTRICTED - Metadata only (7276dd93-cc8c-4c01-8df0-cef743c72112)

2) Threatened migratory shorebird habitat mapping DECCW May 2006 (cc0b62a0-ded7-4c14-b954-1552337b395e)

3) Australia, Register of the National Estate (RNE) - Spatial Database (RNESDB) (878f6780-be97-469b-8517-54bd12a407d0)

4) Communities of National Environmental Significance Database - RESTRICTED - Metadata only (c01c4693-0a51-4dbc-bbbd-7a07952aa5f6)

5) Hunter CMA GDEs (DPI pre-release) - RESTRICTED - Metadata only (469d6d2e-900f-47a7-a137-946b89b3d188)

These important assets are included in the bioregional assessment, but are unable to be publicly distributed by the Bioregional Assessment Programme due to restrictions in their licensing conditions. Please note that many of these data sets are available directly from their custodian. For more precise details please see the associated explanatory Data Dictionary document enclosed with this dataset.

Purpose

The data are for any external party that wants to access the asset database used for the assessment. The Bioregional Assessment Technical Programme (BATP) is required to release these wherever possible, to comply with the requirements of transparency and repeatability.

Dataset History

The public version of the asset database retains all of the unrestricted components of the Asset database for the Hunter subregion on 24 February 2016 - any material that is unable to be published or redistributed to a third party by the BA Programme has been removed from the database. The data presented corresponds to the assets published in product 1.3: Description of the water-dependent asset register and asset list for the Hunter subregion on 24 February 2016, and the associated Water-dependent asset register and asset list for the Hunter subregion on 24 February 2016.

Individual spatial features or elements are initially included in the database if they are partly or wholly within the subregion's preliminary assessment extent (Materiality Test 1, M1). In accordance with BA submethodology M02: Compiling water-dependent assets, individual spatial elements are then grouped into assets, which are evaluated by project teams to determine whether they meet materiality test 2 (M2), that is, whether they are considered to be water dependent.

Following delivery of the first pass asset list, project teams make a determination as to whether an asset (comprised of one or more elements) is water dependent, as assessed against the materiality tests detailed in the BA Methodology. These decisions are provided to ERIN by the assessment team and incorporated into the AssetList table in the Asset database.

Development of the Asset Register from the Asset database:

Decisions for M0 (fit for BA purpose), M1 (PAE) and M2 (water dependent) determine which assets are included in the "asset list" and "water-dependent asset register" which are published as Product 1.3.

The rule sets are applied as follows:

M0        M1     M2     Result
No        n/a    n/a    Asset is not included in the asset list or the water-dependent asset register
(≠ No)    No     n/a    Asset is not included in the asset list or the water-dependent asset register
(≠ No)    Yes    No     Asset is included in the published asset list but not in the water-dependent asset register
(≠ No)    Yes    Yes    Asset is included in both the asset list and the water-dependent asset register
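The rule set above can be written as a small decision function. This is a sketch for illustration only: the function name and the use of the strings "Yes", "No" and "n/a" as inputs are assumptions, and any M0 value other than "No" is treated as passing, following the "(≠ No)" rows.

```python
def asset_outcome(m0, m1, m2):
    """Apply the M0/M1/M2 rule set to one asset.

    Returns a pair (in_asset_list, in_water_dependent_asset_register).
    """
    if m0 == "No":       # not fit for BA purpose
        return (False, False)
    if m1 != "Yes":      # not within the preliminary assessment extent
        return (False, False)
    # The asset is in the published asset list; it is in the register only if water dependent.
    return (True, m2 == "Yes")


print(asset_outcome("Yes", "Yes", "No"))  # (True, False)
```

The second element of the result can never be true when the first is false, which reflects that the water-dependent asset register is a subset of the asset list.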

Assessment teams are then able to use the database to assign receptors and impact variables to water-dependent assets and to develop a receptor register, as detailed in BA submethodology M03: Assigning receptors to water-dependent assets. The receptor register is then incorporated into the asset database.

At this stage of its development, the Asset database for the Hunter subregion on 24 February 2016, which this document describes, does not contain receptor information.

Dataset Citation

Bioregional Assessment Programme (2015) Asset database for the Hunter subregion on 24 February 2016 Public 20170112 v02. Bioregional Assessment Derived Dataset. Viewed 13 March 2019, http://data.bioregionalassessments.gov.au/dataset/9d16592c-543b-42d9-a1f4-0f6d70b9ffe7.

Dataset Ancestors

Derived From GW Element Bores with Unknown FTYPE Hunter NSW Office of Water 20150514

Derived From Travelling Stock Route Conservation Values

Derived From NSW Wetlands

Derived From Climate Change Corridors Coastal North East NSW

Derived From Communities of National Environmental Significance Database - RESTRICTED - Metadata only

Derived From Climate Change Corridors for Nandewar and New England Tablelands

Derived From National Groundwater Dependent Ecosystems (GDE) Atlas

Derived From Asset database for the Hunter subregion on 27 August 2015

Derived From Birds Australia - Important Bird Areas (IBA) 2009

Derived From Estuarine Macrophytes of Hunter Subregion NSW DPI Hunter 2004

Derived From Hunter CMA GDEs (DRAFT DPI pre-release)

Derived From Camerons Gorge Grassy White Box Endangered Ecological Community (EEC) 2008

Derived From Asset database for the Hunter subregion on 16 June 2015

Derived From Spatial Threatened Species and Communities (TESC) NSW 20131129

Derived From Asset database for the Hunter subregion on 24 February 2016

Derived From Threatened migratory shorebird habitat mapping DECCW May 2006

Derived From Gosford Council Endangered Ecological Communities (Umina woodlands) EEC3906

Derived From NSW Office of Water Surface Water Offtakes - Hunter v1 24102013

Derived From National Groundwater Dependent Ecosystems (GDE) Atlas (including WA)

Derived From Asset list for Hunter - CURRENT

Derived From NSW Office of Water Surface Water Entitlements Locations v1_Oct2013

Derived From Species Profile and Threats Database (SPRAT) - Australia - Species of National Environmental Significance Database (BA subset - RESTRICTED - Metadata only)

Derived From Ramsar Wetlands of Australia

Derived From
Parcels Composite of NJ (Download)
hub.arcgis.com
anrgeodata.vermont.gov
Updated Jun 13, 2025
New Jersey Office of GIS (2025). Parcels Composite of NJ (Download) [Dataset]. https://hub.arcgis.com/documents/d543ddcc1e6844319ffa826fee52fccf
Dataset updated
Jun 13, 2025
Dataset authored and provided by
New Jersey Office of GIS
Description
The statewide composite of parcels (cadastral) data for New Jersey was developed during the Parcels Normalization Project in 2008-2014 by the NJ Office of Information Technology, Office of GIS (NJOGIS). The normalized parcels data are compatible with the NJ Department of the Treasury system currently used by Tax Assessors. This composite of parcels data serves as one of NJ's framework GIS datasets. Stewardship and maintenance of the data will continue to be the purview of county and municipal governments, but the statewide composite will be maintained by NJOGIS.
Parcel attributes were normalized to a standard structure, specified in the NJ GIS Parcel Mapping Standard, to store parcel information and provide a PIN (parcel identification number) field that can be used to match records with suitably-processed property tax data. The standard is available for viewing and download at https://geoapps.nj.gov/njgin/parcel/NJGIS_ParcelMappingStandardv3.2.pdf. This feature class includes only those minimal attributes. The statewide property tax table is available as a separate download "MOD-IV Tax List Search Plus Database of New Jersey" or combined with the parcels as a separate download "Parcels and MOD-IV Composite of New Jersey." Also available separately are countywide parcels and tables of property ownership and tax information extracted from the NJ Division of Taxation database.
The polygons delineated in this dataset do not represent legal boundaries and should not be used to provide a legal determination of land ownership. Parcels are not survey data and should not be used as such. Please note that these parcel datasets are not intended for use as tax maps. They are intended to provide reasonable representations of parcel boundaries for planning and other purposes. Please see Data Quality / Process Steps for details about updates to this composite since its first publication.
***NOTE*** For users who incorporate NJOGIS services into web maps and/or web applications, please sign up for the NJ Geospatial Forum discussion listserv for early notification of service changes. Visit https://nj.gov/njgf/about/listserv/ for more information.
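As an illustration of how the PIN field mentioned above could be used to match parcel records with property tax records, here is a minimal sketch. Only the PIN field name comes from the description; the other field names, the PIN values and the function name are hypothetical, and real tax data would first need the "suitable processing" the description refers to.

```python
def join_on_pin(parcels, tax_rows):
    """Attach tax attributes to parcels that share the same PIN.

    Returns (joined_records, unmatched_pins). Parcel attributes take
    precedence when both records carry a field with the same name.
    """
    tax_by_pin = {row["PIN"]: row for row in tax_rows}
    joined, unmatched = [], []
    for parcel in parcels:
        tax = tax_by_pin.get(parcel["PIN"])
        if tax is None:
            unmatched.append(parcel["PIN"])
        else:
            joined.append({**tax, **parcel})
    return joined, unmatched


# Hypothetical records for illustration only.
parcels = [{"PIN": "pin-001", "acres": 0.25}, {"PIN": "pin-002", "acres": 1.5}]
tax_rows = [{"PIN": "pin-001", "owner_class": "residential"}]
joined, unmatched = join_on_pin(parcels, tax_rows)
print(joined, unmatched)
```

Keeping the list of unmatched PINs makes it easy to spot parcels for which no tax record was found.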
USFS CA MT Units Priority shp
hub.arcgis.com
arc-gis-hub-home-arcgishub.hub.arcgis.com
Updated Mar 26, 2023
Adventure Scientists (2023). USFS CA MT Units Priority shp [Dataset]. https://hub.arcgis.com/datasets/AdvSci::usfs-ca-mt-units-priority-shp?uiVersion=content-views
Dataset updated
Mar 26, 2023
Dataset authored and provided by
Adventure Scientists
Description
The USGS Protected Areas Database of the United States (PAD-US) is the nation's inventory of protected areas, including public land and voluntarily provided private protected areas, identified as an A-16 National Geospatial Data Asset in the Cadastre Theme (https://communities.geoplatform.gov/ngda-cadastre/). The PAD-US is an ongoing project with several published versions of a spatial database including areas dedicated to the preservation of biological diversity, and other natural (including extraction), recreational, or cultural uses, managed for these purposes through legal or other effective means. The database was originally designed to support biodiversity assessments; however, its scope expanded in recent years to include all public and nonprofit lands and waters. Most are public lands owned in fee (the owner of the property has full and irrevocable ownership of the land); however, long-term easements, leases, agreements, Congressional (e.g. 'Wilderness Area'), Executive (e.g. 'National Monument'), and administrative designations (e.g. 'Area of Critical Environmental Concern') documented in agency management plans are also included. The PAD-US strives to be a complete inventory of public land and other protected areas, compiling “best available” data provided by managing agencies and organizations. The PAD-US geodatabase maps and describes areas using over twenty-five attributes and five feature classes representing the U.S. protected areas network in separate feature classes: Fee (ownership parcels), Designation, Easement, Marine, Proclamation and Other Planning Boundaries. Five additional feature classes include various combinations of the primary layers (for example, Combined_Fee_Easement) to support data management, queries, web mapping services, and analyses. 
This PAD-US Version 2.1 dataset includes a variety of updates and new data from the previous Version 2.0 dataset (USGS, 2018 https://doi.org/10.5066/P955KPLE), achieving the primary goal to "Complete the PAD-US Inventory by 2020" (https://www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/science/pad-us-vision) by addressing known data gaps with newly available data. The following list summarizes the integration of "best available" spatial data to ensure public lands and other protected areas from all jurisdictions are represented in PAD-US, along with continued improvements and regular maintenance of the federal theme.
Completing the PAD-US Inventory:
1) Integration of over 75,000 city parks in all 50 States (and the District of Columbia) from The Trust for Public Land's (TPL) ParkServe data development initiative (https://parkserve.tpl.org/) added nearly 2.7 million acres of protected area and significantly reduced the primary known data gap in previous PAD-US versions (local government lands).
2) First-time integration of the Census American Indian/Alaskan Native Areas (AIA) dataset (https://www2.census.gov/geo/tiger/TIGER2019/AIANNH) representing the boundaries for federally recognized American Indian reservations and off-reservation trust lands across the nation (as of January 1, 2020, as reported by the federally recognized tribal governments through the Census Bureau's Boundary and Annexation Survey) addressed another major PAD-US data gap.
3) Aggregation of nearly 5,000 protected areas owned by local land trusts in 13 states, compiled by Ducks Unlimited through data calls for easements to update the National Conservation Easement Database (https://www.conservationeasement.us/), increased PAD-US protected areas by over 350,000 acres.
Maintaining regular Federal updates:
1) Major update of the Federal estate (fee ownership parcels, easement interest, and management designations), including authoritative data from 8 agencies: Bureau of Land Management (BLM), U.S. Census Bureau (Census), Department of Defense (DOD), U.S. Fish and Wildlife Service (FWS), National Park Service (NPS), Natural Resources Conservation Service (NRCS), U.S. Forest Service (USFS), National Oceanic and Atmospheric Administration (NOAA). The federal theme in PAD-US is developed in close collaboration with the Federal Geographic Data Committee (FGDC) Federal Lands Working Group (FLWG, https://communities.geoplatform.gov/ngda-govunits/federal-lands-workgroup/).
2) Complete National Marine Protected Areas (MPA) update from the National Oceanic and Atmospheric Administration (NOAA) MPA Inventory, including conservation measure ('GAP Status Code', 'IUCN Category') review by NOAA.
Other changes:
1) PAD-US field name change - The "Public Access" field name changed from 'Access' to 'Pub_Access' to avoid unintended scripting errors associated with the script command 'access'.
2) Additional field - The "Feature Class" (FeatClass) field was added to all layers within PAD-US 2.1 (it was only included in the "Combined" layers of PAD-US 2.0, to describe which feature class the data originated from).
3) Categorical GAP Status Code default changes - National Monuments are categorically assigned GAP Status Code = 2 (previously GAP 3), in the absence of other information, to better represent biodiversity protection restrictions associated with the designation. The Bureau of Land Management Areas of Critical Environmental Concern (ACECs) are categorically assigned GAP Status Code = 3 (previously GAP 2), as the areas are administratively protected rather than permanently protected. More information is available upon request.
4) Agency Name (FWS) geodatabase domain description changed to U.S. Fish and Wildlife Service (previously U.S. Fish & Wildlife Service).
5) Select areas in the provisional PAD-US 2.1 Proclamation feature class were removed following a consultation with the data steward (Census Bureau). Tribal designated statistical areas are purely a geographic area for providing Census statistics with no land base. Most affected areas are relatively small; however, 4,341,120 acres and 37 records were removed in total. Contact Mason Croft (masoncroft@boisestate) for more information about how to identify these records.
For more information regarding the PAD-US dataset please visit https://usgs.gov/gapanalysis/PAD-US/. For more information about data aggregation please review the Online PAD-US Data Manual available at https://www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/pad-us-data-manual.
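Scripts written against PAD-US 2.0 attribute names may need adjusting for the two field changes listed above. The sketch below shows one way to adapt a record held as a plain dictionary; the function name, the sample values and the dictionary representation are assumptions for illustration, and only the field names 'Access', 'Pub_Access' and 'FeatClass' come from the description.

```python
def to_v21_fields(record, feature_class):
    """Return a copy of a PAD-US 2.0-style record using the 2.1 field names.

    - 'Access' is renamed to 'Pub_Access'.
    - 'FeatClass' is filled in when the record does not already carry it.
    """
    out = dict(record)
    if "Access" in out:
        out["Pub_Access"] = out.pop("Access")
    out.setdefault("FeatClass", feature_class)
    return out


# Hypothetical attribute values for illustration only.
old = {"Unit_Nm": "Example Unit", "Access": "OA"}
print(to_v21_fields(old, "Fee"))
```

The input record is left unchanged, so the same data can still be passed to code that expects the 2.0 names.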
‘PLACES: County Data (GIS Friendly Format), 2021 release’ analyzed by...
analyst-2.ai
Updated Feb 12, 2022
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘PLACES: County Data (GIS Friendly Format), 2021 release’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/data-gov-places-county-data-gis-friendly-format-2021-release-a9b7/68cba9fb/?iid=034-326&v=presentation
Dataset updated
Feb 12, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘PLACES: County Data (GIS Friendly Format), 2021 release’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/e128e2f2-02af-4605-81aa-97ebdb8b2fc8 on 12 February 2022.

--- Dataset description provided by original source is as follows ---

This dataset contains model-based county-level estimates for the PLACES 2021 release in GIS-friendly format. PLACES is the expansion of the original 500 Cities Project and covers the entire United States, that is, 50 states and the District of Columbia (DC), at county, place, census tract, and ZIP Code Tabulation Area (ZCTA) levels. It represents a first-of-its-kind effort to release information uniformly on this large scale for local areas at 4 geographic levels. Estimates were provided by the Centers for Disease Control and Prevention (CDC), Division of Population Health, Epidemiology and Surveillance Branch. The project was funded by the Robert Wood Johnson Foundation (RWJF) in conjunction with the CDC Foundation. Data sources used to generate these model-based estimates include Behavioral Risk Factor Surveillance System (BRFSS) 2019 or 2018 data, Census Bureau 2019 or 2018 county population estimates, and American Community Survey (ACS) 2015–2019 or 2014–2018 estimates. The 2021 release uses 2019 BRFSS data for 22 measures and 2018 BRFSS data for 7 measures (all teeth lost, dental visits, mammograms, cervical cancer screening, colorectal cancer screening, core preventive services among older adults, and sleeping less than 7 hours a night). Seven measures are based on the 2018 BRFSS data because the relevant questions are only asked every other year in the BRFSS. These data can be joined with the census 2015 county boundary file in a GIS system to produce maps for 29 measures at the county level. An ArcGIS Online feature service is also available for users to make maps online or to add data to desktop GIS software. https://cdcarcgis.maps.arcgis.com/home/item.html?id=3b7221d4e47740cab9235b839fa55cd7

--- Original source retains full ownership of the source dataset ---
Predictive modeling of treatment resistant depression using data from STAR*D and an independent clinical study

26 scholarly articles cite this dataset (View in Google Scholar)
