Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Identification of risk factors of treatment resistance may be useful to guide treatment selection, avoid inefficient trial-and-error, and improve major depressive disorder (MDD) care. We extended the work in predictive modeling of treatment-resistant depression (TRD) by partitioning the data from the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) cohort into a training and a testing dataset. We also included data from a small yet completely independent cohort, RIS-INT-93, as an external test dataset. We used features from enrollment and level 1 treatment (up to week 2 response only) of STAR*D to explore the feature space comprehensively and applied machine learning methods to model TRD outcome at level 2. For TRD defined using QIDS-C16 remission criteria, multiple machine learning models were internally cross-validated in the STAR*D training dataset and externally validated in both the STAR*D testing dataset and the RIS-INT-93 independent dataset, with an area under the receiver operating characteristic curve (AUC) of 0.70–0.78 and 0.72–0.77, respectively. The upper bound for the AUC achievable with the full set of features could be as high as 0.78 in the STAR*D testing dataset. A model developed using the top 30 features identified by a feature selection technique (k-means clustering followed by a χ² test) achieved an AUC of 0.77 in the STAR*D testing dataset. In addition, the model developed using the features that overlap between STAR*D and RIS-INT-93 achieved an AUC of > 0.70 in both the STAR*D testing and RIS-INT-93 datasets. Among all the features explored in the STAR*D and RIS-INT-93 datasets, the most important feature was early or initial treatment response or symptom severity at week 2. These results indicate that prediction of TRD prior to undergoing a second round of antidepressant treatment could be feasible even in the absence of biomarker data.
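The AUC values reported above can be read as the probability that a randomly chosen TRD patient receives a higher predicted risk than a randomly chosen non-TRD patient. The following is a minimal sketch of that computation; the function name `roc_auc` and the labels and scores are made up for illustration and are not taken from STAR*D or RIS-INT-93.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. labels are 0/1, scores are numbers.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Illustrative values only (hypothetical, not from either cohort).
example_labels = [1, 0, 1, 0, 1, 0]
example_scores = [0.9, 0.3, 0.6, 0.7, 0.8, 0.2]
example_auc = roc_auc(example_labels, example_scores)
```

The pairwise form is quadratic in the number of patients, which is acceptable for a small illustration; a rank-based implementation would be preferable for large cohorts.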
The SWOT Level 2 River Single-Pass Vector Reach Data Product from the Surface Water Ocean Topography (SWOT) mission provides water surface elevation, slope, width, and discharge derived from the high rate (HR) data stream from the Ka-band Radar Interferometer (KaRIn). SWOT launched on December 16, 2022 from Vandenberg Air Force Base in California into a 1-day repeat orbit for the "calibration" or "fast-sampling" phase of the mission, which completed in early July 2023. After the calibration phase, SWOT entered a 21-day repeat orbit in August 2023 to start the "science" phase of the mission, which is expected to continue through 2025. Water surface elevation, slope, width, and discharge are provided for river reaches (approximately 10 km long) and nodes (approximately 200 m spacing) identified in the prior river database, and distributed as feature datasets covering the full swath for each continent-pass. These data are generally produced for inland and coastal hydrology surfaces, as controlled by the reloadable KaRIn HR mask. The dataset is distributed in ESRI Shapefile format. This collection is a sub-collection of its parent: https://podaac.jpl.nasa.gov/dataset/SWOT_L2_HR_RiverSP_2.0 It contains only river reaches.
The USGS Protected Areas Database of the United States (PAD-US) is the nation's inventory of protected areas, including public land and voluntarily provided private protected areas, identified as an A-16 National Geospatial Data Asset in the Cadastre Theme ( https://communities.geoplatform.gov/ngda-cadastre/ ). The PAD-US is an ongoing project with several published versions of a spatial database including areas dedicated to the preservation of biological diversity, and other natural (including extraction), recreational, or cultural uses, managed for these purposes through legal or other effective means. The database was originally designed to support biodiversity assessments; however, its scope expanded in recent years to include all open space public and nonprofit lands and waters. Most are public lands owned in fee (the owner of the property has full and irrevocable ownership of the land); however, permanent and long-term easements, leases, agreements, Congressional (e.g. 'Wilderness Area'), Executive (e.g. 'National Monument'), and administrative designations (e.g. 'Area of Critical Environmental Concern') documented in agency management plans are also included. The PAD-US strives to be a complete inventory of U.S. public land and other protected areas, compiling “best available” data provided by managing agencies and organizations. The PAD-US geodatabase maps and describes areas using thirty-six attributes and five separate feature classes representing the U.S. protected areas network: Fee (ownership parcels), Designation, Easement, Marine, Proclamation and Other Planning Boundaries. An additional Combined feature class includes the full PAD-US inventory to support data management, queries, web mapping services, and analyses. The Feature Class (FeatClass) field in the Combined layer allows users to extract data types as needed. 
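As a rough illustration of extracting one data type from the Combined layer, the sketch below filters attribute records on the FeatClass field. The sample records, the `Unit_Nm` key, and the helper name `select_feature_class` are assumptions made for this example; only the FeatClass field and the class names come from the description above, and the exact stored values should be checked against the geodatabase.

```python
def select_feature_class(records, feature_class):
    """Return the records whose FeatClass matches, ignoring case and padding."""
    wanted = feature_class.strip().lower()
    return [
        r for r in records
        if str(r.get("FeatClass", "")).strip().lower() == wanted
    ]


# Hypothetical rows standing in for attributes read from the Combined layer.
combined = [
    {"Unit_Nm": "Example Forest", "FeatClass": "Fee"},
    {"Unit_Nm": "Example Wilderness", "FeatClass": "Designation"},
    {"Unit_Nm": "Example Wetland", "FeatClass": "Easement"},
    {"Unit_Nm": "Example Refuge", "FeatClass": "Fee"},
]

fee_only = select_feature_class(combined, "Fee")
```

Filtering to a single class before summing acreage avoids double counting where, for example, Designation features overlap Fee features.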
A Federal Data Reference file geodatabase lookup table (PADUS3_0Combined_Federal_Data_References) facilitates the extraction of authoritative federal data provided or recommended by managing agencies from the Combined PAD-US inventory. This PAD-US Version 3.0 dataset includes a variety of updates from the previous Version 2.1 dataset (USGS, 2020, https://doi.org/10.5066/P92QM3NT ), achieving goals to: 1) Annually update and improve spatial data representing the federal estate for PAD-US applications; 2) Update state and local lands data as state data-steward and PAD-US Team resources allow; and 3) Automate data translation efforts to increase PAD-US update efficiency. The following list summarizes the integration of "best available" spatial data to ensure public lands and other protected areas from all jurisdictions are represented in the PAD-US (other data were transferred from PAD-US 2.1). Federal updates - The USGS remains committed to updating federal fee owned lands data and major designation changes in annual PAD-US updates, where authoritative data provided directly by managing agencies are available or alternative data sources are recommended. The following is a list of updates or revisions associated with the federal estate: 1) Major update of the Federal estate (fee ownership parcels, easement interest, and management designations where available), including authoritative data from 8 agencies: Bureau of Land Management (BLM), U.S. Census Bureau (Census Bureau), Department of Defense (DOD), U.S. Fish and Wildlife Service (FWS), National Park Service (NPS), Natural Resources Conservation Service (NRCS), U.S. Forest Service (USFS), and National Oceanic and Atmospheric Administration (NOAA). The federal theme in PAD-US is developed in close collaboration with the Federal Geographic Data Committee (FGDC) Federal Lands Working Group (FLWG, https://communities.geoplatform.gov/ngda-govunits/federal-lands-workgroup/ ). 
2) Improved the representation (boundaries and attributes) of the National Park Service, U.S. Forest Service, Bureau of Land Management, and U.S. Fish and Wildlife Service lands, in collaboration with agency data-stewards, in response to feedback from the PAD-US Team and stakeholders. 3) Added a Federal Data Reference file geodatabase lookup table (PADUS3_0Combined_Federal_Data_References) to the PAD-US 3.0 geodatabase to facilitate the extraction (by Data Provider, Dataset Name, and/or Aggregator Source) of authoritative data provided directly (or recommended) by federal managing agencies from the full PAD-US inventory. A summary of the number of records (Frequency) and calculated GIS Acres (vs Documented Acres) associated with features provided by each Aggregator Source is included; however, the number of records may vary from source data as the "State Name" standard is applied to national files. The Feature Class (FeatClass) field in the table and geodatabase describes the data type to highlight overlapping features in the full inventory (e.g. Designation features often overlap Fee features) and to assist users in building queries for applications as needed. 4) Scripted the translation of the Department of Defense, Census Bureau, and Natural Resources Conservation Service source data into the PAD-US format to increase update efficiency. 5) Revised conservation measures (GAP Status Code, IUCN Category) to more accurately represent protected and conserved areas. For example, Fish and Wildlife Service (FWS) Waterfowl Production Area Wetland Easements changed from GAP Status Code 2 to 4 because the spatial data currently represent the complete parcel (about 10.54 million acres, primarily in North Dakota and South Dakota), whereas only aliquot parts of these parcels are documented under wetland easement (1.64 million acres). These acreages are provided by the U.S. Fish and Wildlife Service and are referenced in the PAD-US geodatabase Easement feature class 'Comments' field. 
State updates - The USGS is committed to building capacity in the state data-steward network and the PAD-US Team to increase the frequency of state land updates, as resources allow. The USGS supported efforts to significantly increase state inventory completeness with the integration of local parks data in the PAD-US 2.1, and developed a state-to-PAD-US data translation script during PAD-US 3.0 development to pilot in future updates. Additional efforts are in progress to support the technical and organizational strategies needed to increase the frequency of state updates. The PAD-US 3.0 included major updates to the following three states: 1) California - added or updated state, regional, local, and nonprofit lands data from the California Protected Areas Database (CPAD), managed by GreenInfo Network, and integrated conservation and recreation measure changes following review coordinated by the data-steward with state managing agencies. Developed a data translation Python script (see Process Step 2 Source Data Documentation) in collaboration with the data-steward to increase the accuracy and efficiency of future PAD-US updates from CPAD. 2) Virginia - added or updated state, local, and nonprofit protected areas data (and removed legacy data) from the Virginia Conservation Lands Database, provided by the Virginia Department of Conservation and Recreation's Natural Heritage Program, and integrated conservation and recreation measure changes following review by the data-steward. 3) West Virginia - added or updated state, local, and nonprofit protected areas data provided by the West Virginia University, GIS Technical Center. For more information regarding the PAD-US dataset please visit, https://www.usgs.gov/gapanalysis/PAD-US/. For more information about data aggregation please review the PAD-US Data Manual available at https://www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/pad-us-data-manual . 
A version history of PAD-US updates is summarized below (see https://www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/pad-us-data-history for more information): 1) First posted - April 2009 (Version 1.0 - available from the PAD-US Team: pad-us@usgs.gov). 2) Revised - May 2010 (Version 1.1 - available from the PAD-US Team: pad-us@usgs.gov). 3) Revised - April 2011 (Version 1.2 - available from the PAD-US Team: pad-us@usgs.gov). 4) Revised - November 2012 (Version 1.3) https://doi.org/10.5066/F79Z92XD 5) Revised - May 2016 (Version 1.4) https://doi.org/10.5066/F7G73BSZ 6) Revised - September 2018 (Version 2.0) https://doi.org/10.5066/P955KPLE 7) Revised - September 2020 (Version 2.1) https://doi.org/10.5066/P92QM3NT 8) Revised - January 2022 (Version 3.0) https://doi.org/10.5066/P9Q9LQ4B. Comparing protected area trends between PAD-US versions is not recommended without consultation with USGS, as many changes reflect improvements to agency and organization GIS systems, or to conservation and recreation measure classification, rather than actual changes in protected area acquisition on the ground.
You are an Analytics Engineer at an EdTech company focused on improving customer learning experiences. Your team relies on in-depth analysis of user data to enhance the learning journey and inform product feature updates.
Content is organized as Track → Course → Topic → Lesson. Each lesson can take various formats, such as videos, practice exercises, exams, etc.
User activity on lessons is recorded in the user_lesson_progress_log table. A user can have multiple logs for a lesson in a day.
DB Diagram: https://dbdiagram.io/d/627100b17f945876b6a93e54 (use the ‘Highlight’ option to understand the relationships)
track_table: Contains all tracks
Column | Description | Schema |
---|---|---|
track_id | unique id for an individual track | string |
track_title | name of the track | string |
course_table: Contains all courses
Column | Description | Schema |
---|---|---|
course_id | unique id for an individual course | string |
track_id | id of the track to which this course belongs | string |
course_title | name of the course | string |
topic_table: Contains all topics
Column | Description | Schema |
---|---|---|
topic_id | unique id for an individual topic | string |
course_id | id of the course to which this topic belongs | string |
topic_title | name of the topic | string |
lesson_table: Contains all lessons
Column | Description | Schema |
---|---|---|
lesson_id | unique id for an individual lesson | string |
topic_id | id of the topic to which this lesson belongs | string |
lesson_title | name of the lesson | string |
lesson_type | type of the lesson, e.g., practice, video, or exam | string |
duration_in_sec | ideal time (in seconds) in which a user can complete the lesson | float |
user_registrations: Contains the registration information of the users. Each user has only one entry.
Column | Description | Schema |
---|---|---|
user_id | unique id for an individual user | string |
registration_date | date at which a user registered | string |
user_info | contains information about the users. The field stores address, education_info, and profile in JSON format | string |
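Because user_info is stored as a JSON string, it has to be parsed before its fields can be used. The sketch below assumes the top-level keys are address, education_info, and profile, as described in the table; the nested keys and the sample values are invented for illustration, and `parse_user_info` is a hypothetical helper name.

```python
import json


def parse_user_info(raw):
    """Parse the user_info JSON string; return an empty dict on bad input."""
    if not raw:
        return {}
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError):
        return {}
    return parsed if isinstance(parsed, dict) else {}


# Hypothetical row; the nested keys ("city", "degree", "name") are assumptions.
sample = (
    '{"address": {"city": "Pune"}, '
    '"education_info": {"degree": "BSc"}, '
    '"profile": {"name": "u1"}}'
)
info = parse_user_info(sample)
city = info.get("address", {}).get("city")
```

Returning an empty dict for malformed rows keeps downstream lookups simple, at the cost of hiding bad data; counting such rows separately may be worthwhile in a real analysis.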
user_lesson_progress_log: Any learning activity done by a user on a lesson is stored in logs. A user can have multiple logs for a lesson in a day. Every time a user's completion percentage for a lesson is updated, a log is recorded here.
Column | Description | Schema |
---|---|---|
id | unique id for each entry | string |
user_id | unique id for an individual user | string |
lesson_id | unique id for a particular lesson | string |
overall_completion_percentage | total completion percentage of the lesson at the time of log | float |
completion_percentage_difference | difference between the overall_completion_percentage of the lesson and the immediately preceding overall_completion_percentage | float |
activity_recorded_datetime_in_utc | datetime at which the user has done some activity on the lesson | datetime |
Example: Suppose user u1 starts lesson lesson1 and completes 10% of it at 2022-05-01 08:00:00 UTC, then a further 30% at 2022-05-01 10:00:00 UTC and a further 20% at 2022-05-03 10:00:00 UTC. The logs are recorded as follows:
id | user_id | lesson_id | overall_completion_percentage | completion_percentage_difference | activity_recorded_datetime_in_utc |
---|---|---|---|---|---|
id1 | u1 | lesson1 | 10 | 10 | 2022-05-01 08:00:00 |
id2 | u1 | lesson1 | 40 | 30 | 2022-05-01 10:00:00 |
id3 | u1 | lesson1 | 60 | 20 | 2022-05-03 10:00:00 |
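The relationship between the two percentage columns in the example can be sketched as follows. This assumes each log is a dictionary keyed by the column names above, that the first log for a user and lesson is compared against 0, and that timestamps are strings in the format shown; `add_completion_differences` is a hypothetical helper name.

```python
from collections import defaultdict


def add_completion_differences(logs):
    """Recompute completion_percentage_difference per (user_id, lesson_id).

    Logs are processed in timestamp order; the first log of each
    (user, lesson) pair is compared against 0.
    """
    previous = defaultdict(float)  # last overall percentage per (user, lesson)
    result = []
    ordered = sorted(logs, key=lambda r: r["activity_recorded_datetime_in_utc"])
    for log in ordered:
        key = (log["user_id"], log["lesson_id"])
        overall = log["overall_completion_percentage"]
        result.append(
            {**log, "completion_percentage_difference": overall - previous[key]}
        )
        previous[key] = overall
    return result


logs = [
    {"id": "id1", "user_id": "u1", "lesson_id": "lesson1",
     "overall_completion_percentage": 10,
     "activity_recorded_datetime_in_utc": "2022-05-01 08:00:00"},
    {"id": "id2", "user_id": "u1", "lesson_id": "lesson1",
     "overall_completion_percentage": 40,
     "activity_recorded_datetime_in_utc": "2022-05-01 10:00:00"},
    {"id": "id3", "user_id": "u1", "lesson_id": "lesson1",
     "overall_completion_percentage": 60,
     "activity_recorded_datetime_in_utc": "2022-05-03 10:00:00"},
]
differences = [
    r["completion_percentage_difference"] for r in add_completion_differences(logs)
]
```

Sorting the timestamp strings works here only because the `YYYY-MM-DD HH:MM:SS` format orders lexicographically; other formats would need to be parsed first.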
user_feedback: The table contains the feedback data given by the users. A user can give feedback on a lesson multiple times. Each feedback contains multiple questions. Each question and its response are stored as one entry.
Column | Description | Schema |
---|---|---|
id | unique id for each entry | string |
feedback_id | unique id for each feedback | string |
creation_datetime | datetime at which user gave a feedback | string |
user_id | user id who gave the feedback | float |
lesson_id | ... |
The dataset was derived by the Bioregional Assessment Programme from multiple source datasets. The source datasets are identified in the Lineage field in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.
This Gloucester dataset contains v8.2 of the Asset database (GLO_asset_database_20160212.mdb), a Geodatabase version for GIS mapping purposes (GLO_asset_database_20160212_GISOnly.gdb), the draft Water Dependent Asset Register spreadsheet (BA-NSB-GLO-130-WaterDependentAssetRegister-AssetList-v20160212.xlsx), a data dictionary (GLO_asset_database_doc_20160212.doc), a folder (Indigenous_doc) containing documentation associated with the Indigenous water asset project, and a folder (NRM_DOC) containing documentation associated with the Water Asset Information Tool (WAIT) process as outlined below.
The Gloucester Asset database v8.2 supersedes the previous version of the GLO Asset database in the asset-relevant tables and feature classes only (i.e. AssetDecisions, AssetList, Element_to_Asset, ElementList, tbl_Indigenous_water_asset, tbl_GAL_Species_TEC_decisions_review_23112015 in GLO_asset_database_20160212.mdb and GM_GLO_AssetList_pt, GM_GLO_ElementList_pt in GLO_asset_database_20160212_GISOnly.gdb). This version of the GLO asset database has been updated as follows:
(1) The total number of registered water assets increased by 18 because:
(a) Three assets had their M2 test result changed to "Yes" following the review done by the Ecologist group.
(b) 15 Indigenous water assets from OWS were added.
The Asset database is registered to the BA repository as an ESRI personal geodatabase (.mdb - doubling as a MS Access database) that can store, query, and manage non-spatial data, while the spatial data is in a separate file geodatabase joined by AID/Element ID/BARID. Under the BA program, a spatial assets database is developed for each defined bioregional assessment project. The spatial elements that underpin the identification of water dependent assets are identified in the first instance by regional NRM organisations (via the WAIT tool) and supplemented with additional elements from national and state/territory government datasets. All reports received associated with the WAIT process for Gloucester are included in the zip file as part of this dataset. Elements are initially included in the preliminary assets database if they are partly or wholly within the subregion's preliminary assessment extent (Materiality Test 1, M1). Elements are then grouped into assets which are evaluated by project teams to determine whether they meet the second Materiality Test (M2). Assets meeting both Materiality Tests comprise the water dependent asset list. Descriptions of the assets identified in the Gloucester subregion are found in the "AssetList" table of the database. In this version of the database only M1 has been assessed. Assets are the spatial features used by project teams to model scenarios under the BA program. Detailed attribution does not exist at the asset level. Asset attribution includes only the core set of BA-derived attributes reflecting the BA classification hierarchy, as described in Appendix A of "GLO_asset_database_doc_20160212.doc", located in the zip file as part of this dataset. The "Element_to_Asset" table contains the relationships and identifies the elements that were grouped to create each asset. 
Detailed information describing the database structure and content can be found in the document "GLO_asset_database_doc_20160212.doc" located in the zip file. The public version of this asset database can be accessed via the following dataset: Asset database for the Gloucester subregion on 12 February 2016 Public v02 (https://data.gov.au/data/dataset/5def411c-dbc4-4b75-b509-4230964ce0fa).
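The grouping of elements into assets and the two-step materiality screening described above can be sketched as follows. The record layout, the keys `M1` and `M2`, and the sample IDs are hypothetical stand-ins; they do not reflect the actual column names in the AssetList or Element_to_Asset tables, which are documented in the data dictionary.

```python
from collections import defaultdict


def group_elements(element_to_asset):
    """Map each asset ID to the list of element IDs grouped under it."""
    grouped = defaultdict(list)
    for element_id, asset_id in element_to_asset:
        grouped[asset_id].append(element_id)
    return dict(grouped)


def water_dependent_assets(assets):
    """Keep the IDs of assets that pass both Materiality Tests (M1 and M2)."""
    return [a["AID"] for a in assets if a["M1"] and a["M2"]]


# Hypothetical (element ID, asset ID) pairs and asset decisions.
links = [(101, 1), (102, 1), (103, 2), (104, 3)]
assets = [
    {"AID": 1, "M1": True, "M2": True},
    {"AID": 2, "M1": True, "M2": False},
    {"AID": 3, "M1": True, "M2": True},
]

elements_by_asset = group_elements(links)
register = water_dependent_assets(assets)
```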
Used for Gloucester subregion for bioregional assessments
VersionID | Date | Notes |
---|---|---|
1.0 | 17/03/2014 | Initial database |
1.01 | 19/03/2014 | Update classification using latest one |
2.0 | 23/05/2014 | Update asset area for some assets |
3.0 | 9/07/2014 | updated to include new assets and elements identified by community. |
4.0 | 29/08/2014 | updated assets and elements from WSP |
5.0 | 4/09/2014 | Table AssetDecisions is added to record decision making process and decisions about M2 are also added in table asset list |
6.0 | 8/04/2015 | 195/9 Groundwater economic point elements/assets were added in while 81/7 Groundwater economic point elements/assets were turned off |
7.0 | 27/05/2015 | The receptor data (tables: ReceptorList, tbl_Receptors_GDE, tbl_Receptors_GW, tbl_Receptors_SW and tbl_Receptors_SW_Catchment_Ref_Only; and spatial data: GM_GLO_ReceptorList_pt) is added |
7.1 | 21/08/2015 | (1) Delete (a) line 26 from tab "Description" and (b) column E from tab "Receptor register" about "Depth" parameters in BA-NSB-GLO-140-ReceptorRegister-v20150821.xlsx (2) Delete field of "Depth" from table "ReceptorList" in GLO_asset_database_20150821.mdb (3) Add two fields of "InRegister" and "Registered Date" to table "ReceptorList" in GLO_asset_database_20150821.mdb for the consistency with other subregions in the future |
8 | 16/09/2015 | (1) (a) Update Latitude, Longitude, LandscapeClass using the latest data from GLO project team and update the values for RegisteredDate, and Group using "GDE", "SW" and "GW" in table ReceptorList in GLO_asset_database_20150916.mdb; (b) Create draft BA-NSB-GLO-140-ReceptorRegister-v20150916.xlsx (2) Update tbl_Receptors_GDE, tbl_Receptors_GW and tbl_Receptors_SW in GLO_asset_database_20150916.mdb, using the latest data from GLO project team. (3) Update GM_GLO_ReceptorList_pt in GLO_asset_database_20150916_GISOnly.gdb, using the latest data from GLO project team |
8.1 | 29/10/2015 | (a) Update LandscapeClass field in table ReceptorList for all 222 economic Receptors to match the latest decision about this parameter (b) Create draft BA-NSB-GLO-140-ReceptorRegister-v20151029.xlsx |
8.2 | 12/02/2016 | (1) Total number of registered water assets was increased by 18 due to: (a) The 3 assets changed M2 test to "Yes" from the review done by Ecologist group. The original data is included the database as the table tbl_GLO_Species_TEC_decisions_review_23112015 (b) 15 indigenous water assets from OWS were added. The data and documents from OWS are included in subdirectory Indigenous_doc (c) The draft new Water Dependent Asset Register file (BA-NSB-GLO-130-WaterDependentAssetRegister-AssetList-v20160212.xlsx) was created |
The source metadata was updated to meet the purpose of the Bioregional Assessment Programme
Bioregional Assessment Programme (2014) Asset database for the Gloucester subregion on 12 February 2016. Bioregional Assessment Derived Dataset. Viewed 18 July 2018, http://data.bioregionalassessments.gov.au/dataset/72a47bec-1393-49d6-b379-0e48551d26a9.
Derived From Standard Instrument Local Environmental Plan (LEP) - Heritage (HER) (NSW)
Derived From NSW Office of Water GW licence extract linked to spatial locations - GLO v5 UID elements 27032014
Derived From Asset database for the Gloucester subregion on 21 August 2015
Derived From Gloucester digitised coal mine boundaries
Derived From Groundwater Dependent Ecosystems supplied by the NSW Office of Water on 13/05/2014
Derived From [NSW Office of Water GW licence extract linked to spatial locations GLOv4 UID
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
This dataset was used by the NCI's Quantitative Imaging Network (QIN) PET-CT Subgroup for their project titled: Multi-center Comparison of Radiomic Features from Different Software Packages on Digital Reference Objects and Patient Datasets. The purpose of this project was to assess the agreement among radiomic features when computed by several groups by using different software packages under very tightly controlled conditions, which included common image data sets and standardized feature definitions.
The image datasets (and Volumes of Interest – VOIs) provided here are the same ones used in that project and reported in the publication listed below (ISSN 2379-1381 https://doi.org/10.18383/j.tom.2019.00031). In addition, we have provided detailed information about the software packages used (Table 1 in that publication) as well as the individual feature value results for each image dataset and each software package that was used to create the summary tables (Tables 2, 3 and 4) in that publication.
For that project, nine common quantitative imaging features were selected for comparison, including features that describe morphology, intensity, shape, and texture. These features are described in detail by the Image Biomarker Standardisation Initiative (IBSI, https://arxiv.org/abs/1612.07003) and in the associated publication (Zwanenburg A, Vallières M, et al. The Image Biomarker Standardization Initiative: Standardized Quantitative Radiomics for High-Throughput Image-based Phenotyping. Radiology. 2020 May;295(2):328-338. doi: https://doi.org/10.1148/radiol.2020191145).
There are three datasets provided: two image datasets and one dataset consisting of four Excel spreadsheets containing feature values.
https://www.usa.gov/government-works
Note: Reporting of new COVID-19 Case Surveillance data will be discontinued July 1, 2024, to align with the process of removing SARS-CoV-2 infections (COVID-19 cases) from the list of nationally notifiable diseases. Although these data will continue to be publicly available, the dataset will no longer be updated.
Authorizations to collect certain public health data expired at the end of the U.S. public health emergency declaration on May 11, 2023. The following jurisdictions discontinued COVID-19 case notifications to CDC: Iowa (11/8/21), Kansas (5/12/23), Kentucky (1/1/24), Louisiana (10/31/23), New Hampshire (5/23/23), and Oklahoma (5/2/23). Please note that these jurisdictions will not routinely send new case data after the dates indicated. As of 7/13/23, case notifications from Oregon will only include pediatric cases resulting in death.
This case surveillance public use dataset has 12 elements for all COVID-19 cases shared with CDC and includes demographics, any exposure history, disease severity indicators and outcomes, presence of any underlying medical conditions and risk behaviors, and no geographic data.
The COVID-19 case surveillance database includes individual-level data reported to U.S. states and autonomous reporting entities, including New York City and the District of Columbia (D.C.), as well as U.S. territories and affiliates. On April 5, 2020, COVID-19 was added to the Nationally Notifiable Condition List and classified as “immediately notifiable, urgent (within 24 hours)” by a Council of State and Territorial Epidemiologists (CSTE) Interim Position Statement (Interim-20-ID-01). CSTE updated the position statement on August 5, 2020, to clarify the interpretation of antigen detection tests and serologic test results within the case classification (Interim-20-ID-02). The statement also recommended that all states and territories enact laws to make COVID-19 reportable in their jurisdiction, and that jurisdictions conducting surveillance should submit case notifications to CDC. COVID-19 case surveillance data are collected by jurisdictions and reported voluntarily to CDC.
For more information:
NNDSS Supports the COVID-19 Response | CDC.
The deidentified data in the “COVID-19 Case Surveillance Public Use Data” include demographic characteristics, any exposure history, disease severity indicators and outcomes, clinical data, laboratory diagnostic test results, and presence of any underlying medical conditions and risk behaviors. All data elements can be found on the COVID-19 case report form located at www.cdc.gov/coronavirus/2019-ncov/downloads/pui-form.pdf.
COVID-19 case reports have been routinely submitted using nationally standardized case reporting forms. On April 5, 2020, CSTE released an Interim Position Statement with national surveillance case definitions for COVID-19 included. Current versions of these case definitions are available here: https://ndc.services.cdc.gov/case-definitions/coronavirus-disease-2019-2021/.
All cases reported on or after were requested to be shared by public health departments to CDC using the standardized case definitions for laboratory-confirmed or probable cases. On May 5, 2020, the standardized case reporting form was revised. Case reporting using this new form is ongoing among U.S. states and territories.
To learn more about the limitations in using case surveillance data, visit FAQ: COVID-19 Data and Surveillance.
CDC’s Case Surveillance Section routinely performs data quality assurance procedures (i.e., ongoing corrections and logic checks to address data errors). To date, the following data cleaning steps have been implemented:
To prevent release of data that could be used to identify people, data cells are suppressed for low frequency (<5) records and indirect identifiers (e.g., date of first positive specimen). Suppression includes rare combinations of demographic characteristics (sex, age group, race/ethnicity). Suppressed values are re-coded to the NA answer option; records with data suppression are never removed.
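A minimal sketch of the low-frequency suppression rule described above is given here. It assumes records are dictionaries keyed by sex, age_group, and race_ethnicity (hypothetical field names), treats a demographic combination seen fewer than 5 times as rare, and recodes the affected values to "NA" without dropping the record. It is an illustration only and not CDC's actual procedure.

```python
from collections import Counter

DEMOGRAPHIC_FIELDS = ("sex", "age_group", "race_ethnicity")  # assumed names
THRESHOLD = 5  # combinations seen fewer than 5 times are suppressed


def suppress_rare_combinations(records):
    """Recode demographics to "NA" for rare combinations; never drop records."""
    counts = Counter(tuple(r[f] for f in DEMOGRAPHIC_FIELDS) for r in records)
    cleaned = []
    for r in records:
        combo = tuple(r[f] for f in DEMOGRAPHIC_FIELDS)
        if counts[combo] < THRESHOLD:
            # Copy the record so the input list is left unchanged.
            r = {**r, **{f: "NA" for f in DEMOGRAPHIC_FIELDS}}
        cleaned.append(r)
    return cleaned


# Made-up records: one common combination (6 rows) and one rare one (2 rows).
common = {"sex": "Female", "age_group": "18 to 49 years", "race_ethnicity": "Unknown"}
rare = {"sex": "Male", "age_group": "65+ years", "race_ethnicity": "Unknown"}
records = [dict(common) for _ in range(6)] + [dict(rare) for _ in range(2)]
cleaned = suppress_rare_combinations(records)
```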
For questions, please contact Ask SRRG (eocevent394@cdc.gov).
COVID-19 data are available to the public as summary or aggregate count files, including total counts of cases and deaths by state and by county. These
The dataset was derived by the Bioregional Assessment Programme from multiple source datasets. The source datasets are identified in the Lineage field in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.
This V7 database has been updated to include the Receptor data from the Clarence-Moreton assessment team. The tables ReceptorList and tbl_Receptors and the spatial data GM_CLM_ReceptorList_pt were added in this version. The Clarence-Moreton Asset database v7 supersedes the previous version of the Asset database only in the Receptor-relevant tables and feature class (i.e. ReceptorList, tbl_Receptors and GM_CLM_ReceptorList_pt).
This dataset contains v7 of the Asset database (CLM_asset_database_20150916.mdb), a Geodatabase version for GIS mapping purposes (CLM_asset_database_20150916_GISOnly.gdb), the draft Receptor Register spreadsheet (BA-CLM-CLM-140-ReceptorRegister-V20150916.xlsx), a data dictionary (CLM_asset_database_doc_20150916.doc), and a folder (NRM_DOC) containing documentation associated with the WAIT process as outlined below.
Under the BA program, a spatial assets database is developed for each defined bioregional assessment project. The spatial elements that underpin the identification of water dependent assets are identified in the first instance by regional NRM organisations (via the WAIT tool) and supplemented with additional elements from national and state/territory government datasets. All reports received associated with the WAIT process for Clarence-Moreton are included in the zip file as part of this dataset. Elements are initially included in the preliminary assets database if they are partly or wholly within the subregion's preliminary assessment extent (Materiality Test 1, M1). Elements are then grouped into assets which are evaluated by project teams to determine whether they meet the second Materiality Test (M2). Assets meeting both Materiality Tests comprise the water dependent asset list. Descriptions of the assets identified in the Clarence-Moreton subregion are found in the "AssetList" table of the database. In this version of the database M1 and M2 have both been assessed. Assets are the spatial features used by project teams to model scenarios under the BA program. Detailed attribution does not exist at the asset level. Asset attribution includes only the core set of BA-derived attributes reflecting the BA classification hierarchy, as described in Appendix A of "CLM_asset_database_doc_20150916.doc", located in the zip file as part of this dataset. The "Element_to_Asset" table contains the relationships and identifies the elements that were grouped to create each asset. Detailed information describing the database structure and content can be found in the document "CLM_asset_database_doc_20150916.doc" located in the zip file. Some of the source data used in the compilation of this dataset is restricted.
OBJECTID VersionID Date_ Notes
1 1 2/07/2014 Initial database.
2 2 15/08/2014 Initial database with new WSP assets.
3 3 9/09/2014 Add 87 line assets from early SEQLD WAIT data; updated NSW RegERiv and GWMP assets, and changed AIDs (5010 to 5180 became 15010 to 15180).
4 3.1 15/09/2014 Updated class "Groundwater-dependent ecosystems" to "Groundwater-dependent ecosystem".
5 4 20/02/2015 Add additional eight datasets from the community consultation and assessment team: QLD RE 11, NSW GDE, QLD Wetland System 100K, QLD GDE Surface Areas, QLD GDE Line, QLD GDE Terrestrial Areas, NGIS QLD Bores and AHGF Network Stream. Turn off National GDE assets.
6 5 3/06/2015 As requested by the CLM assessment team, the NGIS economic assets with AIDs 17009 to 17013 inclusive replace the previous NGIS ecological assets (from the same QLD NGIS elements) with AIDs 16800 to 16803 inclusive. Turn off assets with AIDs 16800 to 16803.
7 6 6/07/2015 This v6 CLM Asset database includes M2 test results ("Does the asset pass the water dependency test?", or WDTest) from the Clarence-Moreton assessment project team.
8 6.1 19/08/2015 (1) Corrected the spelling of PAE_Region to Clarence-Moreton for all assets and elements; (2)(a) extracted long (>255 characters) WD rationale for 2 assets in tab "Water-dependent asset register" and 4 assets in tab "Asset list" in the 1.30 Excel file, (b) recreated in BA-CLM-CLM-130-WaterDependentAssetRegister-AssetList-V20150819.xlsx; (3) modified queries (Find_All_Asset_List and Find_Waterdependent_asset_register) for (2)(a).
9 7 16/09/2015 (1)(a) Add table ReceptorList in CLM_asset_database_20150916.mdb, using the Excel file from the CLM project team, (b) create draft BA-CLM-CLM-140-ReceptorRegister-V20150916.xlsx; (2) add table tbl_Receptors in CLM_asset_database_20150916.mdb and GM_CLM_ReceptorList_pt in CLM_asset_database_20150916_GISOnly.gdb, using the spatial data from the CLM project team; (3) add SQL query "Find_used_Receptor" for extracting all used receptors for the register.
Bioregional Assessment Programme (2014) Asset database for the Clarence-Moreton bioregion on 16 September 2015. Bioregional Assessment Derived Dataset. Viewed 10 July 2017, http://data.bioregionalassessments.gov.au/dataset/e7940ec8-ec73-4cc5-bc4e-0c85f98354f1.
Derived From QLD Dept of Natural Resources and Mines, Groundwater Entitlements 20131204
Derived From Combined Surface Waterbodies for the Clarence-Moreton bioregion
Derived From Queensland QLD - Regional - NRM - Water Asset Information Tool - WAIT - databases
Derived From Version 02 Asset list for Clarence Morton 8/8/2014 - ERIN ORIGINAL DATA
Derived From Asset database for the Clarence-Moreton bioregion on 11 December 2014, minor version v20150603
Derived From NSW Office of Water Surface Water Entitlements Locations v1_Oct2013
Derived From Geofabric Surface Catchments - V2.1
Derived From Matters of State environmental significance (version 4.1), Queensland
Derived From Geofabric Surface Network - V2.1
Derived From Communities of National Environmental Significance Database - RESTRICTED - Metadata only
Derived From Ramsar Wetlands of Australia
Derived From National Groundwater Dependent Ecosystems (GDE) Atlas
Derived From QLD Dept of Natural Resources and Mines, Groundwater Entitlements linked to bores v3 03122014
Derived From Multi-resolution Valley Bottom Flatness MrVBF at three second resolution CSIRO 20000211
Derived From National Groundwater Information System (NGIS) v1.1
Derived From Birds Australia - Important Bird Areas (IBA) 2009
Derived From Geofabric Surface Network - V2.1.1
Derived From Queensland QLD Regional CMA Water Asset Information WAIT tool databases RESTRICTED Includes ALL Reports
Derived From Queensland wetland data version 3 - wetland areas.
Derived From Multi-resolution Ridge Top Flatness at 3 second resolution CSIRO 20000211
Derived From South East Queensland GDE (draft)
Derived From Geofabric Surface Cartography - V2.1
Derived From Version 01 Asset list for Clarence Morton 10/3/2014 - ERIN ORIGINAL DATA
Derived From National Groundwater Dependent Ecosystems (GDE) Atlas (including WA)
Derived From CLM - 16swo NSW Office of Water Surface Water Offtakes - Clarence Moreton v1 24102013
Derived From QLD Dept of Natural Resources and Mines, Surface Water Entitlements 131204
Derived From GEODATA TOPO 250K Series 3, File Geodatabase format (.gdb)
Derived From [Asset database for the Clarence-Moreton bioregion on 19 August
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset is designed to test features of Connectomix. For more information, please visit the GitHub repository:
https://github.com/ln2t/connectomix
The dataset contains data for 4 participants, split into control and patient groups in the participants.tsv file. This split has been made artificially and serves testing purposes only.
The exact commands to run the analyses depend on your installation of fMRIPrep. In what follows, we simply assume that fmriprep is the command for fMRIPrep.
We show here the simplest version of the commands, on the assumption that you will adapt them to your setup (e.g. if you use Docker).
We also assume that the data are at the following locations:
bash
bids_dir='/data/ds005699'
derivatives_dir='/data/ds005699/derivatives'
fmriprep $bids_dir ${derivatives_dir}/fmriprep participant --fs-license-file /path/to/fs/license
Note: The following has been tested for connectomix version 1.0.1.
First, set up the path to the connectomix script:
bash
connectomix_cmd='/path/to/connectomix/connectomix/connectomix.py'
Second, set up the path to the config directory:
bash
config_dir='/data/ds005625/code/connectomix/config'
$connectomix_cmd ${bids_dir} ${derivatives_dir}/connectomix participant --derivatives fmriprep="${derivatives_dir}/fmriprep" --config "${config_dir}/participant_level_config.yaml"
Notes:
- This is an example of an independent two-sample t-test.
- Since the dataset contains only four subjects (two per group), the number of possible permutations is very low. For this reason, the number of computed permutations is set to 4, so that connectomix can complete the group-level analysis. Realistic cases should of course include not only many more participants but also a much larger number of permutations (see the connectomix documentation).
$connectomix_cmd ${bids_dir} ${derivatives_dir}/connectomix group --config "${config_dir}/group_level_config.yaml"
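To make the remark about the low number of permutations concrete, the following Python sketch enumerates the distinct ways of assigning four subjects to two groups of two. It is an illustration only: the subject labels are hypothetical, and it does not reproduce how connectomix itself generates or samples permutations.

```python
from itertools import combinations
from math import comb

# Hypothetical subject labels; the real ones are listed in participants.tsv.
subjects = ["sub-01", "sub-02", "sub-03", "sub-04"]
group_size = 2

# Each choice of 2 subjects for the first group fixes the second group.
relabelings = [
    (first, tuple(s for s in subjects if s not in first))
    for first in combinations(subjects, group_size)
]
n_relabelings = len(relabelings)
```

With only `comb(4, 2)` distinct relabelings available, any requested number of permutations has to stay very small, which is consistent with the value of 4 used in the group-level configuration.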
http://www.opendefinition.org/licenses/cc-by-sa
The SWOT Level 2 River Single-Pass Vector Reach Data Product from the Surface Water and Ocean Topography (SWOT) mission provides water surface elevation, slope, width, and discharge derived from the high rate (HR) data stream from the Ka-band Radar Interferometer (KaRIn). SWOT launched on December 16, 2022 from Vandenberg Air Force Base in California into a 1-day repeat orbit for the "calibration" or "fast-sampling" phase of the mission, which completed in early July 2023. After the calibration phase, SWOT entered a 21-day repeat orbit in August 2023 to start the "science" phase of the mission, which is expected to continue through 2025.
Water surface elevation, slope, width, and discharge are provided for river reaches (approximately 10 km long) and nodes (approximately 200 m spacing) identified in the prior river database, and distributed as feature datasets covering the full swath for each continent-pass. These data are generally produced for inland and coastal hydrology surfaces, as controlled by the reloadable KaRIn HR mask. The dataset is distributed in ESRI Shapefile format. Please note that this collection contains SWOT Version C science data products.
This collection is a sub-collection of its parent (https://podaac.jpl.nasa.gov/dataset/SWOT_L2_HR_RiverSP_2.0) and contains only river reaches.
NOTE: A more current version of the Protected Areas Database of the United States (PAD-US) is available: PAD-US 3.0 https://doi.org/10.5066/P9Q9LQ4B. The USGS Protected Areas Database of the United States (PAD-US) is the nation's inventory of protected areas, including public land and voluntarily provided private protected areas, identified as an A-16 National Geospatial Data Asset in the Cadastre Theme (https://communities.geoplatform.gov/ngda-cadastre/). The PAD-US is an ongoing project with several published versions of a spatial database including areas dedicated to the preservation of biological diversity, and other natural (including extraction), recreational, or cultural uses, managed for these purposes through legal or other effective means. The database was originally designed to support biodiversity assessments; however, its scope expanded in recent years to include all public and nonprofit lands and waters. Most are public lands owned in fee (the owner of the property has full and irrevocable ownership of the land); however, long-term easements, leases, agreements, Congressional (e.g. 'Wilderness Area'), Executive (e.g. 'National Monument'), and administrative designations (e.g. 'Area of Critical Environmental Concern') documented in agency management plans are also included. The PAD-US strives to be a complete inventory of public land and other protected areas, compiling “best available” data provided by managing agencies and organizations. The PAD-US geodatabase maps and describes areas using over twenty-five attributes and five feature classes representing the U.S. protected areas network in separate feature classes: Fee (ownership parcels), Designation, Easement, Marine, Proclamation and Other Planning Boundaries. Five additional feature classes include various combinations of the primary layers (for example, Combined_Fee_Easement) to support data management, queries, web mapping services, and analyses. 
This PAD-US Version 2.1 dataset includes a variety of updates and new data from the previous Version 2.0 dataset (USGS, 2018 https://doi.org/10.5066/P955KPLE ), achieving the primary goal to "Complete the PAD-US Inventory by 2020" (https://www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/science/pad-us-vision) by addressing known data gaps with newly available data. The following list summarizes the integration of "best available" spatial data to ensure public lands and other protected areas from all jurisdictions are represented in PAD-US, along with continued improvements and regular maintenance of the federal theme.
Completing the PAD-US Inventory:
1) Integration of over 75,000 city parks in all 50 States (and the District of Columbia) from The Trust for Public Land's (TPL) ParkServe data development initiative (https://parkserve.tpl.org/) added nearly 2.7 million acres of protected area and significantly reduced the primary known data gap in previous PAD-US versions (local government lands).
2) First-time integration of the Census American Indian/Alaskan Native Areas (AIA) dataset (https://www2.census.gov/geo/tiger/TIGER2019/AIANNH) representing the boundaries for federally recognized American Indian reservations and off-reservation trust lands across the nation (as of January 1, 2020, as reported by the federally recognized tribal governments through the Census Bureau's Boundary and Annexation Survey) addressed another major PAD-US data gap.
3) Aggregation of nearly 5,000 protected areas owned by local land trusts in 13 states, aggregated by Ducks Unlimited through data calls for easements to update the National Conservation Easement Database (https://www.conservationeasement.us/), increased PAD-US protected areas by over 350,000 acres.
Maintaining regular Federal updates:
1) Major update of the Federal estate (fee ownership parcels, easement interest, and management designations), including authoritative data from 8 agencies: Bureau of Land Management (BLM), U.S. Census Bureau (Census), Department of Defense (DOD), U.S. Fish and Wildlife Service (FWS), National Park Service (NPS), Natural Resources Conservation Service (NRCS), U.S. Forest Service (USFS), National Oceanic and Atmospheric Administration (NOAA). The federal theme in PAD-US is developed in close collaboration with the Federal Geographic Data Committee (FGDC) Federal Lands Working Group (FLWG, https://communities.geoplatform.gov/ngda-govunits/federal-lands-workgroup/).
2) Complete National Marine Protected Areas (MPA) update from the National Oceanic and Atmospheric Administration (NOAA) MPA Inventory, including conservation measure ('GAP Status Code', 'IUCN Category') review by NOAA.
Other changes:
1) PAD-US field name change: the "Public Access" field name changed from 'Access' to 'Pub_Access' to avoid unintended scripting errors associated with the script command 'access'.
2) Additional field: the "Feature Class" (FeatClass) field was added to all layers within PAD-US 2.1 (in PAD-US 2.0 it was only included in the "Combined" layers, to describe which feature class the data originated from).
3) Categorical GAP Status Code default changes: National Monuments are categorically assigned GAP Status Code = 2 (previously GAP 3), in the absence of other information, to better represent biodiversity protection restrictions associated with the designation. The Bureau of Land Management Areas of Critical Environmental Concern (ACECs) are categorically assigned GAP Status Code = 3 (previously GAP 2) as the areas are administratively protected, not permanently protected. More information is available upon request.
4) Agency Name (FWS) geodatabase domain description changed to U.S. Fish and Wildlife Service (previously U.S. Fish & Wildlife Service).
5) Select areas in the provisional PAD-US 2.1 Proclamation feature class were removed following a consultation with the data steward (Census Bureau). Tribal designated statistical areas are purely geographic areas for providing Census statistics, with no land base. Most affected areas are relatively small; however, 4,341,120 acres and 37 records were removed in total. Contact Mason Croft (masoncroft@boisestate) for more information about how to identify these records.
For more information regarding the PAD-US dataset please visit https://usgs.gov/gapanalysis/PAD-US/. For more information about data aggregation please review the Online PAD-US Data Manual available at https://www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/pad-us-data-manual .
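Scripts written against PAD-US 2.0 may still look up the public access attribute under its old field name. The following Python sketch shows one way to read the value under either name. It assumes each record has already been loaded into a plain dictionary keyed by field name; the helper name and the sample values are hypothetical placeholders.

```python
def get_public_access(record):
    """Return the public access value from a PAD-US attribute record.

    Tries the PAD-US 2.1 field name first, then the PAD-US 2.0 name.
    `record` is assumed to be a dict mapping field names to values.
    """
    for field in ("Pub_Access", "Access"):
        if field in record:
            return record[field]
    raise KeyError("record has neither 'Pub_Access' nor 'Access'")


# Hypothetical records; the values are placeholders, not real PAD-US rows.
record_v21 = {"Pub_Access": "OA", "FeatClass": "Fee"}
record_v20 = {"Access": "RA"}

access_v21 = get_public_access(record_v21)
access_v20 = get_public_access(record_v20)
```

Checking the newer name first keeps the helper correct if a record ever carries both fields.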
http://www.opendefinition.org/licenses/cc-by-sa
The SWOT Level 2 River Single-Pass Vector Node Data Product from the Surface Water and Ocean Topography (SWOT) mission provides water surface elevation, slope, width, and discharge derived from the high rate (HR) data stream from the Ka-band Radar Interferometer (KaRIn). SWOT launched on December 16, 2022 from Vandenberg Air Force Base in California into a 1-day repeat orbit for the "calibration" or "fast-sampling" phase of the mission, which completed in early July 2023. After the calibration phase, SWOT entered a 21-day repeat orbit in August 2023 to start the "science" phase of the mission, which is expected to continue through 2025.
Water surface elevation, slope, width, and discharge are provided for river reaches (approximately 10 km long) and nodes (approximately 200 m spacing) identified in the prior river database, and distributed as feature datasets covering the full swath for each continent-pass. These data are generally produced for inland and coastal hydrology surfaces, as controlled by the reloadable KaRIn HR mask. The dataset is distributed in ESRI Shapefile format. Please note that this collection contains SWOT Version C science data products.
This collection is a sub-collection of its parent (https://podaac.jpl.nasa.gov/dataset/SWOT_L2_HR_RiverSP_2.0) and contains only river nodes.
Overview: Actual Natural Vegetation (ANV): probability of occurrence for the Common hazel in its realized environment for the period 2000 - 2022
Traceability (lineage): This is an original dataset produced with a machine learning framework which used a combination of point datasets and raster datasets as inputs. The point dataset is a harmonized collection of tree occurrence data, comprising observations from National Forest Inventories (EU-Forest), GBIF and LUCAS. The complete dataset is available on Zenodo. Raster datasets used as input are: harmonized and gapfilled time series of seasonal aggregates of the Landsat GLAD ARD dataset (bands and spectral indices); monthly time series of air and surface temperature and precipitation from a reprocessed version of the Copernicus ERA5 dataset; long term averages of bioclimatic variables from CHELSA; tree species distribution maps from the European Atlas of Forest Tree Species; elevation, slope and other elevation-derived metrics; long term monthly averages of snow probability and long term monthly averages of cloud fraction from MODIS. For a more comprehensive list refer to Bonannella et al. (2022) (in review, preprint available at: https://doi.org/10.21203/rs.3.rs-1252972/v1).
Scientific methodology: Probability and uncertainty maps were the output of a spatiotemporal ensemble machine learning framework based on stacked regularization. Three base models (random forest, gradient boosted trees and generalized linear models) were first trained on the input dataset and their predictions were used to train an additional model (logistic regression) which provided the final predictions. More details on the whole workflow are available in the listed publication.
Usability: Probability maps can be used to detect potential forest degradation and compositional change across the time period analyzed. Some possible applications for these topics are explained in the listed publication.
Uncertainty quantification: Uncertainty is quantified by taking the standard deviation of the probabilities predicted by the three components of the spatiotemporal ensemble model.
Data validation approaches: Distribution maps were validated using a spatial 5-fold cross validation following the workflow detailed in the listed publication.
Completeness: The raster files perfectly cover the entire Geo-harmonizer region as defined by the landmask raster dataset available here.
Consistency: Areas which are outside of the calibration area of the point dataset (Iceland, Norway) usually have high uncertainty values. This is not only a problem of extrapolation but also of poor representation, in the feature space available to the model, of the conditions that are present in these countries.
Positional accuracy: The rasters have a spatial resolution of 30m.
Temporal accuracy: The maps cover the period 2000 - 2020; each map covers a certain number of years according to the following scheme: (1) 2000-2002, (2) 2002-2006, (3) 2006-2010, (4) 2010-2014, (5) 2014-2018 and (6) 2018-2020.
Thematic accuracy: Both probability and uncertainty maps contain values from 0 to 100: in the case of probability maps, they indicate the probability of occurrence of a single individual of the target species, while uncertainty maps indicate the standard deviation of the ensemble model.
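The uncertainty measure described above (the standard deviation of the probabilities predicted by the three base models) can be sketched per pixel as follows. This is an illustration under stated assumptions, not the published workflow: the probability values are hypothetical, they are assumed to be on the same 0 to 100 scale as the maps, and the population standard deviation is assumed because the text does not say which variant was used.

```python
from statistics import pstdev


def ensemble_uncertainty(base_probabilities):
    """Standard deviation of the base-model probabilities for one pixel.

    Assumes the population standard deviation and inputs on a 0-100 scale.
    """
    return pstdev(base_probabilities)


# Hypothetical probabilities for one pixel from the three base models
# (random forest, gradient boosted trees, generalized linear model).
pixel_probabilities = [60.0, 70.0, 80.0]
uncertainty = ensemble_uncertainty(pixel_probabilities)

# When the base models agree exactly, the uncertainty is zero.
agreeing = ensemble_uncertainty([55.0, 55.0, 55.0])
```

Pixels where the three base models disagree strongly therefore receive high uncertainty values, which matches the behaviour reported for areas outside the calibration region.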
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The data used in this paper come from the 16th data release of SDSS (SDSS-DR16), which contains a total of 930,268 photometric images, 1.2 billion observed sources and tens of millions of spectra. The data were downloaded from the official SDSS website, through the SkyServer API, using SQL queries in the CasJobs sub-site. The current SDSS photometric table PhotoObj can only classify observed sources as point sources or extended sources, whereas spectra allow the target sources to be classified more precisely as galaxies, stars and quasars. We therefore obtained calibrated sources in CasJobs by cross-matching SpecPhoto with the PhotoObj catalog, together with the target position information (right ascension and declination). Calibrated sources can be told apart precisely and quickly: each one is labeled through the parameter "Class" as "galaxy", "star" or "quasar". Sky regions 3462, 3478 and 3530 and four other regions in SDSS-DR16 were selected as experimental data, because a large number of sources can be obtained in these regions, providing rich sample data for the experiment. For example, there are 9891 sources in region 3462, including 2790 galaxies, 2378 stars and 4723 quasars. There are 3862 sources in region 3478, including 1759 galaxies, 577 stars and 1526 quasars. FITS is a data format commonly used in the astronomical community. By cross-matching the catalog with the FITS files of these sky regions, we obtained images in the five bands u, g, r, i and z for 12499 galaxy sources, 16914 quasar sources and 16908 star sources as training and testing data.

1.1 Image Synthesis
SDSS photometric data include images in five bands, u, g, r, i and z, each packaged as a single-band FITS file. Images in different bands contain different information. Since the g, r and i bands contain more feature information and less noise, astronomical researchers typically use these three bands as the R, G and B channels when synthesizing photometric images. The bands generally cannot be combined directly, because the images in different bands may not be aligned. This paper therefore adopts the RGB multi-band image synthesis software written by He Zhendong et al. to synthesize images from the g, r and i bands, which effectively avoids the alignment problem. Each photometric image in this paper is 2048×1489 pixels.

1.2 Data tailoring
We first crop the target images. Cropping could be done with image segmentation tools; in this paper it is implemented in Python. During cropping, the right ascension and declination of each source in the catalog are converted into pixel coordinates on the photometric image through the coordinate conversion formula, and these pixel coordinates give the position of the source. The coordinates are taken as the center point and a rectangular box is cropped around them. We found that the input image size affects the experimental results. Therefore, according to the target size of the sources, we tried three crop sizes: 40×40, 60×60 and 80×80. Through experiment and analysis, we found that the convolutional neural network learns better and reaches higher accuracy on the small image size. In the end, we cropped the extended-source galaxies and the point-source quasars and stars to 40×40.

1.3 Division of training and test data
Accurate recognition requires enough image samples, and the selection of the training, validation and test sets is an important factor affecting the final recognition accuracy. In this paper, the training, validation and test sets are split in the ratio 8:1:1. The validation set is used to tune the algorithm, and the test set is used to evaluate the generalization ability of the final algorithm. Table 1 shows the specific data partitioning information. The total sample size is 34,000 source images, including 11543 galaxy sources, 11967 star sources and 10490 quasar sources.

1.4 Data preprocessing
In this experiment, the training set and test set are used as the training and test input of the algorithm after data preprocessing. The quantity and quality of the data largely determine the recognition performance of the algorithm. The training set and the test set are preprocessed differently. For the training set, we first apply vertical flips, horizontal flips and scaling to the cropped images to enrich the data samples and improve the generalization ability of the algorithm. Since the features of celestial sources are invariant under flipping, the labels of galaxies, stars and quasars do not change after these transformations. For the test set, preprocessing is simpler: we only scale the input image and use the result as the test input.
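The cropping, flipping and 8:1:1 splitting steps described above can be sketched in plain Python as follows. This is a minimal illustration, not the authors' code: the function names, the synthetic frame and the random seed are hypothetical, images are represented as lists of rows, the crop window is assumed to lie fully inside the frame, and the scaling step is omitted.

```python
import random


def crop_centered(image, cx, cy, size=40):
    """Cut a size x size window centred on pixel (cx, cy).

    `image` is a list of rows; the window is assumed to lie fully inside it.
    """
    half = size // 2
    return [row[cx - half:cx + half] for row in image[cy - half:cy + half]]


def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Mirror the image top to bottom."""
    return image[::-1]


def split_8_1_1(samples, seed=0):
    """Shuffle and split samples into training, validation and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 8 // 10
    n_val = len(shuffled) // 10
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


# Small synthetic frame (150 rows x 200 columns) standing in for a real image.
frame = [[1000 * y + x for x in range(200)] for y in range(150)]
cutout = crop_centered(frame, cx=100, cy=75)
augmented = [cutout, flip_horizontal(cutout), flip_vertical(cutout)]

# Indices standing in for the 34,000 source images.
train, val, test = split_8_1_1(range(34000))
```

Under this sketch, 34,000 samples are divided into 27,200 training, 3,400 validation and 3,400 test samples. Each flipped cutout keeps the class label of the original.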
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset was derived by the Bioregional Assessment Programme. This dataset was derived from multiple datasets including Natural Resource Management regions, and Australian and state and territory government databases. You can find a link to the parent datasets in the Lineage Field in this metadata statement. The History Field in this metadata statement describes how this dataset was derived. A single asset is represented spatially in the asset database by single or multiple spatial features (point, line or polygon). Individual points, lines or polygons are termed elements.
This data set holds the publicly-available version of the database of water-dependent assets that was compiled for the bioregional assessment (BA) of the Hunter subregion as part of the Bioregional Assessment Technical Programme. Though all life is dependent on water, for the purposes of a bioregional assessment, a water-dependent asset is an asset potentially impacted by changes in the groundwater and/or surface water regime due to coal resource development. The water must be other than local rainfall. Examples include wetlands, rivers, bores and groundwater dependent ecosystems.
This dataset contains the unrestricted publicly-available components of spatial and non-spatial (attribute) data of the (restricted) Asset database for the Hunter subregion on 24 February 2016 (a39290ac-3925-4abc-9ecb-b91e911f008f). The database is provided primarily as an ESRI File geodatabase (.gdb), which is able to be opened in readily available open source software such as QGIS. Other formats include the Microsoft Access database (.mdb in ESRI Personal Geodatabase format), industry-standard ESRI Shapefiles and tab-delimited text files of all the attribute tables.
The restricted version of the Hunter Asset database has a total count of 182 277 Elements (and 2 545 Assets). In the public version of the Hunter Asset database, 69 330 spatial Element features (~38%) have been removed from the Element List and Element Layer(s), and 1124 spatial Assets (~44%) have been removed from the spatial Asset Layer(s).
The elements/assets removed from the restricted Asset Database are from the following data sources:
1) Species Profile and Threats Database (SPRAT) - RESTRICTED - Metadata only (7276dd93-cc8c-4c01-8df0-cef743c72112)
2) Threatened migratory shorebird habitat mapping DECCW May 2006 (cc0b62a0-ded7-4c14-b954-1552337b395e)
3) Australia, Register of the National Estate (RNE) - Spatial Database (RNESDB) (878f6780-be97-469b-8517-54bd12a407d0)
4) Communities of National Environmental Significance Database - RESTRICTED - Metadata only (c01c4693-0a51-4dbc-bbbd-7a07952aa5f6)
5) Hunter CMA GDEs (DPI pre-release) - RESTRICTED - Metadata only (469d6d2e-900f-47a7-a137-946b89b3d188)
These important assets are included in the bioregional assessment, but are unable to be publicly distributed by the Bioregional Assessment Programme due to restrictions in their licensing conditions. Please note that many of these data sets are available directly from their custodian. For more precise details please see the associated explanatory Data Dictionary document enclosed with this dataset.
The data are for any external party that wants to access the asset database used for the assessment. The BATP is required to release these wherever possible, to comply with the requirements of transparency and repeatability.
The public version of the asset database retains all of the unrestricted components of the Asset database for the Hunter subregion on 24 February 2016 - any material that is unable to be published or redistributed to a third party by the BA Programme has been removed from the database. The data presented corresponds to the assets published in product 1.3: Description of the water-dependent asset register and asset list for the Hunter subregion on 24 February 2016, and the associated Water-dependent asset register and asset list for the Hunter subregion on 24 February 2016.
Individual spatial features or elements are initially included in the database if they are partly or wholly within the subregion's preliminary assessment extent (Materiality Test 1, M1). In accordance with BA submethodology M02: Compiling water-dependent assets, individual spatial elements are then grouped into assets, which are evaluated by project teams to determine whether they meet Materiality Test 2 (M2), that is, whether they are considered to be water dependent.
Following delivery of the first pass asset list, project teams make a determination as to whether an asset (comprised of one or more elements) is water dependent, as assessed against the materiality tests detailed in the BA Methodology. These decisions are provided to ERIN by the assessment team and incorporated into the AssetList table in the Asset database.
Development of the Asset Register from the Asset database:
Decisions for M0 (fit for BA purpose), M1 (PAE) and M2 (water dependent) determine which assets are included in the "asset list" and "water-dependent asset register" which are published as Product 1.3.
The rule sets are applied as follows:
M0      M1    M2    Result
No      n/a   n/a   Asset is not included in the asset list or the water-dependent asset register
(≠ No)  No    n/a   Asset is not included in the asset list or the water-dependent asset register
(≠ No)  Yes   No    Asset included in published asset list but not in water-dependent asset register
(≠ No)  Yes   Yes   Asset included in both asset list and water-dependent asset register
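The rule set above can be written as a small decision function. The following Python sketch is an illustration only: the function name and the "Yes"/"No" string encoding are assumptions, and combinations that the table does not list are rejected rather than guessed.

```python
def classify_asset(m0, m1, m2):
    """Apply the M0/M1/M2 rule set to one asset.

    Returns a tuple (in_asset_list, in_water_dependent_asset_register).
    Decisions are assumed to be encoded as strings such as "Yes" and "No".
    """
    if m0 == "No":
        return (False, False)   # not fit for BA purpose
    if m1 == "No":
        return (False, False)   # outside the PAE
    if m1 == "Yes" and m2 == "No":
        return (True, False)    # listed, but not water dependent
    if m1 == "Yes" and m2 == "Yes":
        return (True, True)     # listed and water dependent
    raise ValueError(f"combination not covered by the rule set: {m0}, {m1}, {m2}")


listed_only = classify_asset("Yes", "Yes", "No")
listed_and_registered = classify_asset("Yes", "Yes", "Yes")
excluded = classify_asset("No", "n/a", "n/a")
```

Any M0 value other than "No" is treated as passing, which follows the "(≠ No)" notation of the table.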
Assessment teams are then able to use the database to assign receptors and impact variables to water-dependent assets and to develop a receptor register, as detailed in BA submethodology M03: Assigning receptors to water-dependent assets. The receptor register is then incorporated into the asset database.
At this stage of its development, the Asset database for the Hunter subregion on 24 February 2016, which this document describes, does not contain receptor information.
Bioregional Assessment Programme (2015) Asset database for the Hunter subregion on 24 February 2016 Public 20170112 v02. Bioregional Assessment Derived Dataset. Viewed 13 March 2019, http://data.bioregionalassessments.gov.au/dataset/9d16592c-543b-42d9-a1f4-0f6d70b9ffe7.
Derived From GW Element Bores with Unknown FTYPE Hunter NSW Office of Water 20150514
Derived From Travelling Stock Route Conservation Values
Derived From NSW Wetlands
Derived From Climate Change Corridors Coastal North East NSW
Derived From Communities of National Environmental Significance Database - RESTRICTED - Metadata only
Derived From Climate Change Corridors for Nandewar and New England Tablelands
Derived From National Groundwater Dependent Ecosystems (GDE) Atlas
Derived From Asset database for the Hunter subregion on 27 August 2015
Derived From Birds Australia - Important Bird Areas (IBA) 2009
Derived From Estuarine Macrophytes of Hunter Subregion NSW DPI Hunter 2004
Derived From Hunter CMA GDEs (DRAFT DPI pre-release)
Derived From Camerons Gorge Grassy White Box Endangered Ecological Community (EEC) 2008
Derived From Asset database for the Hunter subregion on 16 June 2015
Derived From Spatial Threatened Species and Communities (TESC) NSW 20131129
Derived From Asset database for the Hunter subregion on 24 February 2016
Derived From Threatened migratory shorebird habitat mapping DECCW May 2006
Derived From Gosford Council Endangered Ecological Communities (Umina woodlands) EEC3906
Derived From NSW Office of Water Surface Water Offtakes - Hunter v1 24102013
Derived From National Groundwater Dependent Ecosystems (GDE) Atlas (including WA)
Derived From Asset list for Hunter - CURRENT
Derived From NSW Office of Water Surface Water Entitlements Locations v1_Oct2013
Derived From Ramsar Wetlands of Australia
The statewide composite of parcels (cadastral) data for New Jersey was developed during the Parcels Normalization Project in 2008-2014 by the NJ Office of Information Technology, Office of GIS (NJOGIS). The normalized parcels data are compatible with the NJ Department of the Treasury system currently used by Tax Assessors. This composite of parcels data serves as one of NJ's framework GIS datasets. Stewardship and maintenance of the data will continue to be the purview of county and municipal governments, but the statewide composite will be maintained by NJOGIS.
Parcel attributes were normalized to a standard structure, specified in the NJ GIS Parcel Mapping Standard, to store parcel information and provide a PIN (parcel identification number) field that can be used to match records with suitably-processed property tax data. The standard is available for viewing and download at https://geoapps.nj.gov/njgin/parcel/NJGIS_ParcelMappingStandardv3.2.pdf. This feature class includes only those minimal attributes. The statewide property tax table is available as a separate download "MOD-IV Tax List Search Plus Database of New Jersey" or combined with the parcels as a separate download "Parcels and MOD-IV Composite of New Jersey." Also available separately are countywide parcels and tables of property ownership and tax information extracted from the NJ Division of Taxation database.
The polygons delineated in this dataset do not represent legal boundaries and should not be used to provide a legal determination of land ownership. Parcels are not survey data and should not be used as such. Please note that these parcel datasets are not intended for use as tax maps. They are intended to provide reasonable representations of parcel boundaries for planning and other purposes.
Please see Data Quality / Process Steps for details about updates to this composite since its first publication.
***NOTE*** For users who incorporate NJOGIS services into web maps and/or web applications, please sign up for the NJ Geospatial Forum discussion listserv for early notification of service changes. Visit https://nj.gov/njgf/about/listserv/ for more information.
The USGS Protected Areas Database of the United States (PAD-US) is the nation's inventory of protected areas, including public land and voluntarily provided private protected areas, identified as an A-16 National Geospatial Data Asset in the Cadastre Theme (https://communities.geoplatform.gov/ngda-cadastre/). The PAD-US is an ongoing project with several published versions of a spatial database including areas dedicated to the preservation of biological diversity, and other natural (including extraction), recreational, or cultural uses, managed for these purposes through legal or other effective means. The database was originally designed to support biodiversity assessments; however, its scope expanded in recent years to include all public and nonprofit lands and waters. Most are public lands owned in fee (the owner of the property has full and irrevocable ownership of the land); however, long-term easements, leases, agreements, Congressional (e.g. 'Wilderness Area'), Executive (e.g. 'National Monument'), and administrative designations (e.g. 'Area of Critical Environmental Concern') documented in agency management plans are also included. The PAD-US strives to be a complete inventory of public land and other protected areas, compiling “best available” data provided by managing agencies and organizations. The PAD-US geodatabase maps and describes areas using over twenty-five attributes and five feature classes representing the U.S. protected areas network in separate feature classes: Fee (ownership parcels), Designation, Easement, Marine, Proclamation and Other Planning Boundaries. Five additional feature classes include various combinations of the primary layers (for example, Combined_Fee_Easement) to support data management, queries, web mapping services, and analyses. 
This PAD-US Version 2.1 dataset includes a variety of updates and new data from the previous Version 2.0 dataset (USGS, 2018, https://doi.org/10.5066/P955KPLE), achieving the primary goal to "Complete the PAD-US Inventory by 2020" (https://www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/science/pad-us-vision) by addressing known data gaps with newly available data. The following list summarizes the integration of "best available" spatial data to ensure public lands and other protected areas from all jurisdictions are represented in PAD-US, along with continued improvements and regular maintenance of the federal theme.
Completing the PAD-US Inventory:
1) Integration of over 75,000 city parks in all 50 States (and the District of Columbia) from The Trust for Public Land's (TPL) ParkServe data development initiative (https://parkserve.tpl.org/) added nearly 2.7 million acres of protected area and significantly reduced the primary known data gap in previous PAD-US versions (local government lands).
2) First-time integration of the Census American Indian/Alaskan Native Areas (AIA) dataset (https://www2.census.gov/geo/tiger/TIGER2019/AIANNH) representing the boundaries for federally recognized American Indian reservations and off-reservation trust lands across the nation (as of January 1, 2020, as reported by the federally recognized tribal governments through the Census Bureau's Boundary and Annexation Survey) addressed another major PAD-US data gap.
3) Aggregation of nearly 5,000 protected areas owned by local land trusts in 13 states, compiled by Ducks Unlimited through data calls for easements to update the National Conservation Easement Database (https://www.conservationeasement.us/), increased PAD-US protected areas by over 350,000 acres.
Maintaining regular Federal updates:
1) Major update of the Federal estate (fee ownership parcels, easement interest, and management designations), including authoritative data from 8 agencies: Bureau of Land Management (BLM), U.S. Census Bureau (Census), Department of Defense (DOD), U.S. Fish and Wildlife Service (FWS), National Park Service (NPS), Natural Resources Conservation Service (NRCS), U.S. Forest Service (USFS), National Oceanic and Atmospheric Administration (NOAA). The federal theme in PAD-US is developed in close collaboration with the Federal Geographic Data Committee (FGDC) Federal Lands Working Group (FLWG, https://communities.geoplatform.gov/ngda-govunits/federal-lands-workgroup/).
2) Complete National Marine Protected Areas (MPA) update from the National Oceanic and Atmospheric Administration (NOAA) MPA Inventory, including conservation measure ('GAP Status Code', 'IUCN Category') review by NOAA.
Other changes:
1) PAD-US field name change - The "Public Access" field name changed from 'Access' to 'Pub_Access' to avoid unintended scripting errors associated with the script command 'access'.
2) Additional field - The "Feature Class" (FeatClass) field was added to all layers within PAD-US 2.1 (only included in the "Combined" layers of PAD-US 2.0 to describe which feature class data originated from).
3) Categorical GAP Status Code default changes - National Monuments are categorically assigned GAP Status Code = 2 (previously GAP 3), in the absence of other information, to better represent biodiversity protection restrictions associated with the designation. The Bureau of Land Management Areas of Critical Environmental Concern (ACECs) are categorically assigned GAP Status Code = 3 (previously GAP 2) as the areas are administratively protected, not permanent. More information is available upon request.
4) Agency Name (FWS) geodatabase domain description changed to U.S. Fish and Wildlife Service (previously U.S. Fish & Wildlife Service).
5) Select areas in the provisional PAD-US 2.1 Proclamation feature class were removed following a consultation with the data steward (Census Bureau). Tribal designated statistical areas are purely geographic areas for providing Census statistics and have no land base. Most affected areas are relatively small; however, 4,341,120 acres and 37 records were removed in total. Contact Mason Croft (masoncroft@boisestate) for more information about how to identify these records.
For more information regarding the PAD-US dataset, please visit https://usgs.gov/gapanalysis/PAD-US/. For more information about data aggregation, please review the Online PAD-US Data Manual available at https://www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/pad-us-data-manual.
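Scripts that must read both PAD-US 2.0 and 2.1 need to cope with the rename of the "Public Access" field from 'Access' to 'Pub_Access'. The following is a minimal sketch of one way to do that, assuming each attribute record has already been loaded as a plain Python dictionary; the function name `get_public_access` and the sample values are hypothetical and are not part of PAD-US.

```python
def get_public_access(record):
    """Return the public-access value from a PAD-US attribute record.

    `record` is assumed to be a dict of field name -> value.
    The 2.1 field name ('Pub_Access') is tried first, then the
    2.0 field name ('Access').
    """
    for field in ("Pub_Access", "Access"):
        if field in record:
            return record[field]
    raise KeyError("record has neither 'Pub_Access' nor 'Access'")


# Hypothetical records; the values are placeholders, not real PAD-US codes.
record_v21 = {"Pub_Access": "value-a", "FeatClass": "Fee"}
record_v20 = {"Access": "value-b"}

print(get_public_access(record_v21))
print(get_public_access(record_v20))
```

Checking the newer name first means a record that somehow carries both fields resolves to the 2.1 value.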
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘PLACES: County Data (GIS Friendly Format), 2021 release’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/e128e2f2-02af-4605-81aa-97ebdb8b2fc8 on 12 February 2022.
--- Dataset description provided by original source is as follows ---
This dataset contains model-based county-level estimates for the PLACES 2021 release in GIS-friendly format. PLACES is the expansion of the original 500 Cities Project and covers the entire United States, comprising 50 states and the District of Columbia (DC), at county, place, census tract, and ZIP Code Tabulation Area (ZCTA) levels. It represents a first-of-its-kind effort to release information uniformly on this large scale for local areas at 4 geographic levels. Estimates were provided by the Centers for Disease Control and Prevention (CDC), Division of Population Health, Epidemiology and Surveillance Branch. The project was funded by the Robert Wood Johnson Foundation (RWJF) in conjunction with the CDC Foundation. Data sources used to generate these model-based estimates include Behavioral Risk Factor Surveillance System (BRFSS) 2019 or 2018 data, Census Bureau 2019 or 2018 county population estimates, and American Community Survey (ACS) 2015–2019 or 2014–2018 estimates. The 2021 release uses 2019 BRFSS data for 22 measures and 2018 BRFSS data for 7 measures (all teeth lost, dental visits, mammograms, cervical cancer screening, colorectal cancer screening, core preventive services among older adults, and sleeping less than 7 hours a night). Seven measures are based on the 2018 BRFSS data because the relevant questions are only asked every other year in the BRFSS. These data can be joined with the census 2015 county boundary file in a GIS to produce maps for 29 measures at the county level. An ArcGIS Online feature service is also available for users to make maps online or to add data to desktop GIS software: https://cdcarcgis.maps.arcgis.com/home/item.html?id=3b7221d4e47740cab9235b839fa55cd7
--- Original source retains full ownership of the source dataset ---
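The dataset description above mentions joining the county estimates with a county boundary file. The following is a minimal sketch of such an attribute join in plain Python, assuming both tables have been read into lists of dictionaries and share a 5-digit county FIPS code. The column names `CountyFIPS`, `GEOID`, and `MEASURE_A`, and the sample rows, are hypothetical; check the actual files for the real field names.

```python
def normalize_fips(code):
    """Return a 5-character county FIPS string, restoring leading zeros."""
    return str(code).strip().zfill(5)


def join_estimates_to_boundaries(estimates, boundaries,
                                 est_key="CountyFIPS", bnd_key="GEOID"):
    """Attach estimate columns to boundary rows that share a FIPS code.

    Key column names are assumptions, not taken from the dataset.
    Returns (joined_rows, unmatched_boundary_keys).
    """
    by_fips = {normalize_fips(row[est_key]): row for row in estimates}
    joined, unmatched = [], []
    for bnd in boundaries:
        key = normalize_fips(bnd[bnd_key])
        est = by_fips.get(key)
        if est is None:
            unmatched.append(key)  # boundary with no estimate row
            continue
        merged = dict(bnd)
        merged.update({k: v for k, v in est.items() if k != est_key})
        joined.append(merged)
    return joined, unmatched


# Hypothetical sample rows for illustration only.
estimates = [{"CountyFIPS": 1001, "MEASURE_A": 12.3},
             {"CountyFIPS": "34001", "MEASURE_A": 9.8}]
boundaries = [{"GEOID": "01001", "NAME": "County A"},
              {"GEOID": "34001", "NAME": "County B"},
              {"GEOID": "99999", "NAME": "County C"}]

joined, unmatched = join_estimates_to_boundaries(estimates, boundaries)
print(len(joined), unmatched)
```

Normalizing the key on both sides guards against FIPS codes that lost their leading zero when a spreadsheet or CSV reader parsed them as integers, which is a common cause of silently unmatched counties.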