Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data set presents the results of the quality assessment of data sources by the working group on data collection for the identification of emerging risks related to food and feed. For this assessment, the WG defined text descriptors and quality parameters (i.e. link with indicators, data type, geographic and period coverage, language, edition, timeliness, accessibility, clarity and comparability). These data sources were linked to eleven priority indicators (i.e. the ESCO indicators) and qualitatively assessed and profiled.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: Digital data sources have become ubiquitous in modern culture in the era of digital technology but often tend to be under-researched because of restricted access to data sources due to fragmentation, privacy issues, or industry ownership, and the methodological complexity of demonstrating their measurable impact on human health. Even though new big data sources have shown unprecedented potential for disease diagnosis and outbreak detection, we need to investigate results in the existing literature to gain a comprehensive understanding of their impact on and benefits to human health.
Objective: A systematic review of systematic reviews on identifying digital data sources and their impact area on people's health, including challenges, opportunities, and good practices.
Methods: A multidatabase search was performed. Peer-reviewed papers published between January 2010 and November 2020 relevant to digital data sources on health were extracted, assessed, and reviewed.
Results: The 64 reviews are covered by three domains, that is, universal health coverage (UHC), public health emergencies, and healthier populations, defined in WHO's General Programme of Work, 2019–2023, and the European Programme of Work, 2020–2025. In all three categories, social media platforms are the most popular digital data source, accounting for 47% (N = 8), 84% (N = 11), and 76% (N = 26) of studies, respectively. The second most utilized data source is electronic health records (EHRs) (N = 13), followed by websites (N = 7) and mass media (N = 5). In all three categories, the most studied impact of digital data sources is on prevention, management, and intervention of diseases (N = 40), and there are also many studies (N = 10) on their use as a tool in early warning systems for infectious diseases. However, they could also pose health hazards (N = 13), for instance, by exacerbating mental health issues and promoting smoking and drinking behavior among young people.
Conclusions: The digital data sources presented are essential for collecting and mining information about human health. The key impact of social media, electronic health records, and websites is in the area of infectious diseases and early warning systems, and in the area of personal health, that is, on mental health and smoking and drinking prevention. However, further research is required to address privacy, trust, transparency, and interoperability to leverage the potential of data held in multiple datastores and systems. This study also identified an apparent gap: systematic reviews are not common for novel big data streams, Internet of Things (IoT) data streams, and sensor, mobile, and GPS data researched using artificial intelligence, complex network, and other computer science methods.
Groundwater is a major drinking water resource but its quality with regard to organic micropollutants (MPs) is insufficiently assessed. Therefore, we aimed to investigate Swiss groundwater more comprehensively using liquid chromatography high-resolution tandem mass spectrometry (LC-HRMS/MS). First, samples from 60 sites were classified as having high or low urban or agricultural influence based on 498 target compounds associated with either urban or agricultural sources. Second, all LC-HRMS signals were related to their potential origin (urban, urban and agricultural, agricultural, or not classifiable) based on their occurrence and intensity in the classified samples. A considerable fraction of estimated concentrations associated with urban and/or agricultural sources could not be explained by the 139 detected targets. The most intense nontarget signals were automatically annotated with structure proposals using MetFrag and SIRIUS4/CSI:FingerID with a list of >988,000 compounds. Additionally, suspect screening was performed for 1162 compounds with predicted high groundwater mobility from primarily urban sources. Finally, 12 nontargets and 11 suspects were identified unequivocally (Level 1), while 17 further compounds were tentatively identified (Level 2a/3). Amongst these were 13 pollutants thus far not reported in groundwater, such as the industrial chemicals 2,5-dichlorobenzenesulfonic acid (19 detections, up to 100 ng L⁻¹), phenylphosphonic acid (10 detections, up to 50 ng L⁻¹), triisopropanolamine borate (2 detections, up to 40 ng L⁻¹), O-des[2-aminoethyl]-O-carboxymethyl dehydroamlodipine, a transformation product (TP) of the blood pressure regulator amlodipine (17 detections), and the TP SYN542490 of the herbicide metolachlor (Level 3, 33 detections, estimated concentrations up to 100–500 ng L⁻¹). One monitoring site was far more contaminated than other sites based on estimated total concentrations of potential MPs, which was supported by the elucidation of site-specific nontarget signals such as the carcinogen chlorendic acid and various naphthalenedisulfonic acids. Many compounds remained unknown, but overall, source-related prioritisation proved an effective approach to support identification of compounds in groundwater.
Background: In Brazil, studies that map electronic healthcare databases in order to assess their suitability for use in pharmacoepidemiologic research are lacking. We aimed to identify, catalogue, and characterize Brazilian data sources for Drug Utilization Research (DUR).
Methods: The present study is part of the project entitled “Publicly Available Data Sources for Drug Utilization Research in Latin American (LatAm) Countries.” A network of Brazilian health experts was assembled to map secondary administrative data from healthcare organizations that might provide information related to medication use. A multi-phase approach including internet search of institutional government websites, traditional bibliographic databases, and experts’ input was used for mapping the data sources. The reviewers searched, screened and selected the data sources independently; disagreements were resolved by consensus. Data sources were grouped into the following categories: 1) automated databases; 2) Electronic Medical Records (EMR); 3) national surveys or datasets; 4) adverse event reporting systems; and 5) others. Each data source was characterized by accessibility, geographic granularity, setting, type of data (aggregate or individual-level), and years of coverage. We also searched for publications related to each data source.
Results: A total of 62 data sources were identified and screened; 38 met the eligibility criteria for inclusion and were fully characterized. We grouped 23 (60%) as automated databases, four (11%) as adverse event reporting systems, four (11%) as EMRs, three (8%) as national surveys or datasets, and four (11%) as other types. Eighteen (47%) were classified as publicly and conveniently accessible online, providing information at the national level. Most of them offered more than 5 years of comprehensive data coverage, and presented data at both the individual and aggregated levels. No information about population coverage was found. Drug coding is not uniform; each data source has its own coding system, depending on the purpose of the data. At least one scientific publication was found for each publicly available data source.
Conclusions: There are several types of data sources for DUR in Brazil, but a uniform system for drug classification and data quality evaluation does not exist. The extent of population covered by year is unknown. Our comprehensive and structured inventory reveals a need for full characterization of these data sources.
This dataset was updated in May 2025.
This ownership dataset was generated primarily from CPAD data, which already tracks the majority of ownership information in California. CPAD is utilized without any snapping or clipping to FRA/SRA/LRA. CPAD has some important data gaps, so additional data sources are used to supplement the CPAD data. Currently this includes the most recent available data from BIA, DOD, and FWS. Additional sources may be added in subsequent versions. Decision rules were developed to identify priority layers in areas of overlap.
Starting in 2022, the ownership dataset was compiled using a new methodology. Previous versions attempted to match federal ownership boundaries to the FRA footprint, and used a manual process for checking and tracking Federal ownership changes within the FRA, with CPAD ownership information only being used for SRA and LRA lands. The manual portion of that process was proving difficult to maintain, and the new method (described below) was developed in order to decrease the manual workload and to increase accountability by using an automated process by which any final ownership designation could be traced back to a specific dataset.
The current process for compiling the data sources includes:
- Clipping input datasets to the California boundary.
- Filtering the FWS data on the Primary Interest field to exclude lands that are managed by but not owned by FWS (e.g. Leases, Easements, etc.).
- Supplementing the BIA Pacific Region Surface Trust lands data with the Western Region portion of the LAR dataset which extends into California.
- Filtering the BIA data on the Trust Status field to exclude areas that represent mineral rights only.
- Filtering the CPAD data on the Ownership Level field to exclude areas that are Privately owned (e.g. HOAs).
- In the case of overlap, sources were prioritized as follows: FWS > BIA > CPAD > DOD.
- As an exception to the above, DOD lands on FRA which overlapped with CPAD lands that were incorrectly coded as non-Federal were treated as an override, such that the DOD designation could win out over CPAD.
In addition to this ownership dataset, a supplemental _source dataset is available which designates the source that was used to determine the ownership in this dataset.
Data Sources:
- GreenInfo Network's California Protected Areas Database (CPAD2023a). https://www.calands.org/cpad/; https://www.calands.org/wp-content/uploads/2023/06/CPAD-2023a-Database-Manual.pdf
- US Fish and Wildlife Service FWSInterest dataset (updated December, 2023). https://gis-fws.opendata.arcgis.com/datasets/9c49bd03b8dc4b9188a8c84062792cff_0/explore
- Department of Defense Military Bases dataset (updated September 2023). https://catalog.data.gov/dataset/military-bases
- Bureau of Indian Affairs, Pacific Region, Surface Trust and Pacific Region Office (PRO) land boundaries data (2023) via John Mosley John.Mosley@bia.gov
- Bureau of Indian Affairs, Land Area Representations (LAR) and BIA Regions datasets (updated Oct 2019). https://biamaps.doi.gov/bogs/datadownload.html
Data Gaps & Changes:
Known gaps include several BOR, ACE and Navy lands which were not included in CPAD nor the DOD MIRTA dataset. Our hope for future versions is to refine the process by pulling in additional data sources to fill in some of those data gaps. Additionally, any feedback received about missing or inaccurate data can be taken back to the appropriate source data where appropriate, so fixes can occur in the source data, instead of just in this dataset.
25_1: The CPAD Input dataset was amended to merge large gaps in certain areas of the state known to be erroneous, such as Yosemite National Park, and to eliminate overlaps from the original input. The FWS input dataset was updated in February of 2025, and the DOD input dataset was updated in October of 2024. The BIA input dataset was the same as was used for the previous ownership version.
24_1: Input datasets this year included numerous changes since the previous version, particularly the CPAD and DOD inputs. Of particular note was the re-addition of Camp Pendleton to the DOD input dataset, which is reflected in this version of the ownership dataset. We were unable to obtain an updated input for tribal data, so the previous inputs were used for this version.
23_1: A few discrepancies were discovered between data changes that occurred in CPAD when compared with parcel data. These issues will be taken to CPAD for clarification for future updates, but ownership23_1 reflects the data as it was coded in CPAD at the time. In addition, there was a change in the DOD input data between last year and this year, with the removal of Camp Pendleton. An inquiry was sent for clarification on this change, but ownership23_1 reflects the data per the DOD input dataset.
22_1: Represents an initial version of ownership with a new methodology which was developed under a short timeframe. A comparison with previous versions of ownership highlighted some data gaps with the current version. Some of these known gaps include several BOR, ACE and Navy lands which were not included in CPAD nor the DOD MIRTA dataset. Our hope for future versions is to refine the process by pulling in additional data sources to fill in some of those data gaps. In addition, any topological errors (like overlaps or gaps) that exist in the input datasets may thus carry over to the ownership dataset. Ideally, any feedback received about missing or inaccurate data can be taken back to the relevant source data where appropriate, so fixes can occur in the source data, instead of just in this dataset.
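The overlap rule stated for this ownership dataset (FWS > BIA > CPAD > DOD, with the DOD exception over miscoded CPAD lands) can be illustrated with a minimal Python sketch. This is not the publisher's actual workflow, which operates on GIS layers; the function name `resolve_owner` and the `dod_override` flag are hypothetical and only model the decision rule for a single overlap area.

```python
# Priority order stated in the dataset description: FWS > BIA > CPAD > DOD.
PRIORITY = ["FWS", "BIA", "CPAD", "DOD"]

# Order used for the documented exception, where DOD may win out over CPAD.
# Only the relative position of DOD and CPAD changes (assumption: FWS and BIA
# keep their precedence, since the description mentions CPAD only).
OVERRIDE_PRIORITY = ["FWS", "BIA", "DOD", "CPAD"]


def resolve_owner(candidates, dod_override=False):
    """Return the winning source for one overlap area, or None if uncovered.

    candidates: collection of source names whose polygons cover the area.
    dod_override: hypothetical flag marking DOD land on FRA that overlaps
        CPAD land incorrectly coded as non-Federal.
    """
    order = OVERRIDE_PRIORITY if dod_override else PRIORITY
    for source in order:
        if source in candidates:
            return source
    return None


print(resolve_owner({"CPAD", "DOD"}))                     # CPAD
print(resolve_owner({"CPAD", "DOD"}, dod_override=True))  # DOD
```

Recording the returned source name alongside each area mirrors the role of the supplemental _source dataset, which traces every ownership designation back to a specific input.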
By Data Society [source]
This dataset provides county-level mortality and health indicators that are useful for measuring the impact of health policies in the United States. It includes data elements and values from over a dozen categories, including Demographics, Leading Causes of Death, Summary Measures of Health, Measures of Birth and Death, Relative Health Importance, Vulnerable Populations and Environmental Health, Preventive Services Use, Risk Factors and Access to Care. Additionally, this dataset offers Healthy People 2010 Targets and US Percentages or Rates for easy comparison across states. With information for each county in each indicator domain, it gives insight into American population health at the local level, including localized trends in disease outbreaks or immunizations.
This dataset contains various data elements related to the mortality and health of the US population at various levels such as county, state, etc. This dataset is an ideal source of information for researchers and policy makers who are interested in exploring patterns in the mortality and health of US citizens.
In order to use this dataset effectively, it is important to understand the different indicators included as well as how to interpret these indicators. In this guide we will look at each indicator domain separately so that users can easily identify which relevant data elements they need for their analysis.
Demographics: The Demographics indicator domain includes data elements related to demographic characteristics such as age composition, gender composition etc. These indicators can be used to explore trends across different parts of the country or identify disparities among populations.
Leading Causes of Death: The Leading Causes of Death indicator domain contains information on fatalities by cause over a set period of time -- either two years or five years depending on availability -- so that researchers can identify causes that pose major threats to public health overall or in more specific regions such as certain counties. It is important to note that these largely report figures based on death certificates which may not always tell an exact story due to reporting inaccuracies caused by both individual factors and registration biases across counties/states over time.
**Summary Measures of Health**: The Summary Measures of Health indicator domain includes measures commonly used for gauging overall population health, such as birth rates and death rates, but also key quality-of-life considerations like prevalence rates and physical activity rates. These can be used together with other data sources (such as income information) when analyzing population health outcomes from a broader perspective than individual diseases or conditions would allow.
**Measures of Birth and Death**: This category provides further insight into the summary-level figures mentioned earlier by providing observations about frequency, timing, type, etc. where available. Additionally, it offers insights about trends related to, among others, out-migration and in-migration, mortality ratio changes, births outside hospitals, marriage age, and labor force participation, all of which matter when trying to solve complex issues related to improving life expectancy.
**Relative Health Importance & Vulnerable Populations and Environment Capacity**: This section covers two closely intertwined fields, socioeconomic status disparities and environment quality, revealing how they interact around boundaries and neighborhoods to influence risk factors (not only those related to medical matters), including aspects such as disabilities, insurance coverage, alcohol use and smoking habits, road fatalities, veh
- Using the Health Status Indicators as input features, machine learning models can be built to predict county-level mortality rate, which can then be used as an important indicator for health and medical resource allocation.
- The data can also be used to analyze the social determinants of health in different counties by combining with socioeconomic indicators such as poverty, population density and educational attainment levels.
- Additionally, the dataset could help assess th...
Jurisdictional Unit, 2022-05-21. For use with WFDSS, IFTDSS, IRWIN, and InFORM.
This is a feature service which provides Identify and Copy Feature capabilities. If fast-drawing at coarse zoom levels is a requirement, consider using the tile (map) service layer located at https://nifc.maps.arcgis.com/home/item.html?id=3b2c5daad00742cd9f9b676c09d03d13.
Overview
The Jurisdictional Agencies dataset is developed as a national land management geospatial layer, focused on representing wildland fire jurisdictional responsibility, for interagency wildland fire applications, including WFDSS (Wildland Fire Decision Support System), IFTDSS (Interagency Fuels Treatment Decision Support System), IRWIN (Interagency Reporting of Wildland Fire Information), and InFORM (Interagency Fire Occurrence Reporting Modules). It is intended to provide federal wildland fire jurisdictional boundaries on a national scale. The agency and unit names are an indication of the primary manager name and unit name, respectively, recognizing that:
- There may be multiple owner names.
- Jurisdiction may be held jointly by agencies at different levels of government (i.e. State and Local), especially on private lands.
- Some owner names may be blocked for security reasons.
- Some jurisdictions may not allow the distribution of owner names.
Private ownerships are shown in this layer with JurisdictionalUnitIdentifier=null, JurisdictionalUnitAgency=null, JurisdictionalUnitKind=null, and LandownerKind="Private", LandownerCategory="Private". All land inside the US country boundary is covered by a polygon.
Jurisdiction for privately owned land varies widely depending on state, county, or local laws and ordinances, fire workload, and other factors, and is not available in a national dataset in most cases.
For publicly held lands the agency name is the surface managing agency, such as Bureau of Land Management, United States Forest Service, etc. The unit name refers to the descriptive name of the polygon (e.g. Northern California District, Boise National Forest, etc.).
These data are used to automatically populate fields on the WFDSS Incident Information page.
This data layer implements the NWCG Jurisdictional Unit Polygon Geospatial Data Layer Standard.
Relevant NWCG Definitions and Standards
Unit
2. A generic term that represents an organizational entity that only has meaning when it is contextualized by a descriptor, e.g. jurisdictional.
Definition Extension: When referring to an organizational entity, a unit refers to the smallest area or lowest level. Higher levels of an organization (region, agency, department, etc.) can be derived from a unit based on organization hierarchy.
Unit, Jurisdictional
The governmental entity having overall land and resource management responsibility for a specific geographical area as provided by law.
Definition Extension: 1) Ultimately responsible for the fire report to account for statistical fire occurrence; 2) Responsible for setting fire management objectives; 3) Jurisdiction cannot be re-assigned by agreement; 4) The nature and extent of the incident determines jurisdiction (for example, Wildfire vs. All Hazard); 5) Responsible for signing a Delegation of Authority to the Incident Commander.
See also: Unit, Protecting; Landowner
Unit Identifier
This data standard specifies the standard format and rules for Unit Identifier, a code used within the wildland fire community to uniquely identify a particular government organizational unit.
Landowner Kind & Category
This data standard provides a two-tier classification (kind and category) of landownership.
Attribute Fields
- JurisdictionalAgencyKind: Describes the type of unit Jurisdiction using the NWCG Landowner Kind data standard. There are two valid values: Federal, and Other. A value may not be populated for all polygons.
- JurisdictionalAgencyCategory: Describes the type of unit Jurisdiction using the NWCG Landowner Category data standard. Valid values include: ANCSA, BIA, BLM, BOR, DOD, DOE, NPS, USFS, USFWS, Foreign, Tribal, City, County, OtherLoc (other local, not in the standard), State. A value may not be populated for all polygons.
- JurisdictionalUnitName: The name of the Jurisdictional Unit. Where an NWCG Unit ID exists for a polygon, this is the name used in the Name field from the NWCG Unit ID database. Where no NWCG Unit ID exists, this is the “Unit Name” or other specific, descriptive unit name field from the source dataset. A value is populated for all polygons.
- JurisdictionalUnitID: Where it could be determined, this is the NWCG Standard Unit Identifier (Unit ID). Where it is unknown, the value is ‘Null’. Null Unit IDs can occur because a unit may not have a Unit ID, or because one could not be reliably determined from the source data. Not every land ownership has an NWCG Unit ID. Unit ID assignment rules are available from the Unit ID standard, linked above.
- LandownerKind: The landowner kind value associated with the polygon. May be inferred from jurisdictional agency, or by lack of a jurisdictional agency. A value is populated for all polygons. There are three valid values: Federal, Private, or Other.
- LandownerCategory: The landowner category value associated with the polygon. May be inferred from jurisdictional agency, or by lack of a jurisdictional agency. A value is populated for all polygons. Valid values include: ANCSA, BIA, BLM, BOR, DOD, DOE, NPS, USFS, USFWS, Foreign, Tribal, City, County, OtherLoc (other local, not in the standard), State, Private.
- DataSource: The database from which the polygon originated. Be as specific as possible, identify the geodatabase name and feature class in which the polygon originated.
- SecondaryDataSource: If the Data Source is an aggregation from other sources, use this field to specify the source that supplied data to the aggregation. For example, if Data Source is "PAD-US 2.1", then for a USDA Forest Service polygon, the Secondary Data Source would be "USDA FS Automated Lands Program (ALP)". For a BLM polygon in the same dataset, Secondary Source would be "Surface Management Agency (SMA)."
- SourceUniqueID: Identifier (GUID or ObjectID) in the data source. Used to trace the polygon back to its authoritative source.
- MapMethod: Controlled vocabulary to define how the geospatial feature was derived. Map method may help define data quality. MapMethod will be Mixed Method by default for this layer as the data are from mixed sources. Valid values include: GPS-Driven; GPS-Flight; GPS-Walked; GPS-Walked/Driven; GPS-Unknown Travel Method; Hand Sketch; Digitized-Image; DigitizedTopo; Digitized-Other; Image Interpretation; Infrared Image; Modeled; Mixed Methods; Remote Sensing Derived; Survey/GCDB/Cadastral; Vector; Phone/Tablet; Other
- DateCurrent: The last edit or update of this GIS record. Date should follow the assigned NWCG Date Time data standard, using 24 hour clock, YYYY-MM-DDhh.mm.ssZ, ISO8601 Standard.
- Comments: Additional information describing the feature.
- GeometryID: Primary key for linking geospatial objects with other database systems. Required for every feature. This field may be renamed for each standard to fit the feature.
- JurisdictionalUnitID_sansUS: NWCG Unit ID with the "US" characters removed from the beginning. Provided for backwards compatibility.
- JoinMethod: Additional information on how the polygon was matched to information in the NWCG Unit ID database.
- LocalName: LocalName for the polygon provided from PADUS or other source.
- LegendJurisdictionalAgency: Jurisdictional Agency, but smaller landholding agencies, or agencies of indeterminate status, are grouped for more intuitive use in a map legend or summary table.
- LegendLandownerAgency: Landowner Agency, but smaller landholding agencies, or agencies of indeterminate status, are grouped for more intuitive use in a map legend or summary table.
- DataSourceYear: Year that the source data for the polygon were acquired.
Data Input
This dataset is based on an aggregation of 4 spatial data sources: Protected Areas Database US (PAD-US 2.1), data from Bureau of Indian Affairs regional offices, the BLM Alaska Fire Service/State of Alaska, and Census Block-Group Geometry. NWCG Unit ID and Agency Kind/Category data are tabular and sourced from UnitIDActive.txt, in the WFMI Unit ID application (https://wfmi.nifc.gov/unit_id/Publish.html). Areas with unknown Landowner Kind/Category and Jurisdictional Agency Kind/Category are assigned LandownerKind and LandownerCategory values of "Private" by use of the non-water polygons from the Census Block-Group geometry.
PAD-US 2.1:
This dataset is based in large part on the USGS Protected Areas Database of the United States - PAD-US 2.1. PAD-US is a compilation of authoritative protected areas data between agencies and organizations that ultimately results in a comprehensive and accurate inventory of protected areas for the United States to meet a variety of needs (e.g. conservation, recreation, public health, transportation, energy siting, ecological, or watershed assessments and planning). Extensive documentation on PAD-US processes and data sources is available.
How these data were aggregated:
Boundaries, and their descriptors, available in spatial databases (i.e. shapefiles or geodatabase feature classes) from land management agencies are the desired and primary data sources in PAD-US. If these authoritative sources are unavailable, or the agency recommends another source, data may be incorporated by other aggregators such as non-governmental organizations. Data sources are tracked for each record in the PAD-US geodatabase (see below).
BIA and Tribal Data:
BIA and Tribal land management data are not available in PAD-US. As such, data were aggregated from BIA regional offices. These data date from 2012 and were substantially updated in 2022. Indian Trust Land affiliated with Tribes, Reservations, or BIA Agencies: These data are not considered the system of record and are not intended to be used as such. The Bureau of Indian Affairs (BIA), Branch of Wildland Fire Management (BWFM) is not the originator of these data. The
By Health [source]
The Behavioral Risk Factor Surveillance System (BRFSS) offers an expansive collection of data on health-related quality of life (HRQOL) from 1993 to 2010. Over this time period, the Health-Related Quality of Life dataset consists of a comprehensive survey reflecting the health and well-being of non-institutionalized US adults aged 18 years or older. The data collected can help track and identify unmet population health needs, recognize trends, identify disparities in healthcare, identify determinants of public health, inform decision making and policy development, and evaluate programs within public healthcare services.
The HRQOL surveillance system has developed a compact set of HRQOL measures, such as a summary measure indicating unhealthy days, which have been validated for population health surveillance purposes and have been widely implemented in practice since 1993. The dataset provides the year recorded, location abbreviations and descriptions, category and topic overviews, the questions asked in surveys, and further details such as the types and units of the data values collected from respondents, along with their sample sizes and geographical locations.
This dataset tracks the Health-Related Quality of Life (HRQOL) from 1993 to 2010 using data from the Behavioral Risk Factor Surveillance System (BRFSS). This dataset includes information on the year, location abbreviation, location description, type and unit of data value, sample size, category and topic of survey questions.
Using this dataset on BRFSS: HRQOL data between 1993-2010 will allow for a variety of analyses related to population health needs. The compact set of HRQOL measures can be used to identify trends in population health needs as well as determine disparities among various locations. Additionally, responses to survey questions can be used to inform decision making and program and policy development in public health initiatives.
- Analyzing trends in HRQOL over the years by location to identify disparities in health outcomes between different populations and develop targeted policy interventions.
- Developing new models for predicting HRQOL indicators at a regional level, and using this information to inform medical practice and public health implementation efforts.
- Using the data to understand differences between states in terms of their HRQOL scores and establish best practices for healthcare provision based on that understanding, including areas such as access to care, preventative care services availability, etc
If you use this dataset in your research, please credit the original authors. Data Source
See the dataset description for more information.
File: rows.csv
| Column name | Description |
|:---|:---|
| Year | Year of survey. (Integer) |
| LocationAbbr | Abbreviation of location. (String) |
| LocationDesc | Description of location. (String) |
| Category | Category of survey. (String) |
| Topic | Topic of survey. (String) |
| Question | Question asked in survey. (String) |
| DataSource | Source of data. (String) |
| Data_Value_Unit | Unit of data value. (String) |
| Data_Value_Type | Type of data value. (String) |
| Data_Value_Footnote_Symbol | Footnote symbol for data value. (String) |
| Data_Value_Std_Err | Standard error of the data value. (Float) |
| Sample_Size | Sample size used in sample. (Integer) |
| Break_Out | Break out categories used. (String) |
| Break_Out_Category | Type break out assessed. (String) |
| **GeoLocation*...
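As a minimal sketch of how the listed columns might be used, the Python snippet below sums `Sample_Size` per year and location with the standard-library `csv` module. The column names `Year`, `LocationAbbr`, and `Sample_Size` come from the table above; the function name `total_sample_size` and the in-memory sample rows are made up for illustration, and the handling of blank or comma-formatted cells is an assumption about how the file may be formatted.

```python
import csv
import io
from collections import defaultdict


def total_sample_size(csv_file):
    """Sum Sample_Size per (Year, LocationAbbr), skipping blank or non-numeric cells."""
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        # Assumption: sample sizes may be blank or contain thousands separators.
        raw = (row.get("Sample_Size") or "").replace(",", "").strip()
        if not raw.isdigit():
            continue
        totals[(int(row["Year"]), row["LocationAbbr"])] += int(raw)
    return dict(totals)


# Made-up rows for illustration only; these are not values from rows.csv.
sample = io.StringIO(
    "Year,LocationAbbr,Sample_Size\n"
    "2009,CA,100\n"
    "2009,CA,50\n"
    "2010,NY,\n"
)
print(total_sample_size(sample))  # {(2009, 'CA'): 150}
```

With the real file, the same function could be called as `total_sample_size(open("rows.csv", newline=""))`, assuming the file sits in the working directory.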
https://datacatalog.worldbank.org/public-licenses?fragment=cc
This dataset contains metadata (title, abstract, date of publication, field, etc) for around 1 million academic articles. Each record contains additional information on the country of study and whether the article makes use of data. Machine learning tools were used to classify the country of study and data use.
Our data source of academic articles is the Semantic Scholar Open Research Corpus (S2ORC) (Lo et al. 2020). The corpus contains more than 130 million English language academic papers across multiple disciplines. The papers included in the Semantic Scholar corpus are gathered directly from publishers, from open archives such as arXiv or PubMed, and crawled from the internet.
We placed some restrictions on the articles to make them usable and relevant for our purposes. First, only articles with an abstract and parsed PDF or latex file are included in the analysis. The full text of the abstract is necessary to classify the country of study and whether the article uses data. The parsed PDF and latex file are important for extracting important information like the date of publication and field of study. This restriction eliminated a large number of articles in the original corpus. Around 30 million articles remain after keeping only articles with a parsable (i.e., suitable for digital processing) PDF, and around 26% of those 30 million are eliminated when removing articles without an abstract. Second, only articles from the year 2000 to 2020 were considered. This restriction eliminated an additional 9% of the remaining articles. Finally, articles from the following fields of study were excluded, as we aim to focus on fields that are likely to use data produced by countries’ national statistical system: Biology, Chemistry, Engineering, Physics, Materials Science, Environmental Science, Geology, History, Philosophy, Math, Computer Science, and Art. Fields that are included are: Economics, Political Science, Business, Sociology, Medicine, and Psychology. This third restriction eliminated around 34% of the remaining articles. From an initial corpus of 136 million articles, this resulted in a final corpus of around 10 million articles.
Due to the intensive computer resources required, a set of 1,037,748 articles was randomly selected from the 10 million articles in our restricted corpus as a convenience sample.
The empirical approach employed in this project utilizes text mining with Natural Language Processing (NLP). The goal of NLP is to extract structured information from raw, unstructured text. In this project, NLP is used to extract the country of study and whether the paper makes use of data. We will discuss each of these in turn.
To determine the country or countries of study in each academic article, two approaches are employed based on information found in the title, abstract, or topic fields. The first approach uses regular expression searches based on the presence of ISO3166 country names. A defined set of country names is compiled, and the presence of these names is checked in the relevant fields. This approach is transparent, widely used in social science research, and easily extended to other languages. However, there is a potential for exclusion errors if a country’s name is spelled non-standardly.
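The regular expression approach can be sketched as follows. This is a minimal illustration under stated assumptions rather than the project's actual code: the country list is a small hand-picked subset (a real run would load the full ISO 3166 name list), and the function name `find_countries` is hypothetical.

```python
import re

# Illustrative subset only; a real run would load the full ISO 3166 name list.
COUNTRY_NAMES = ["Kenya", "Brazil", "India", "Viet Nam", "South Africa"]

# Try longer names first so multi-word names win over any shorter overlap.
_alternatives = "|".join(
    re.escape(name) for name in sorted(COUNTRY_NAMES, key=len, reverse=True)
)
# \b word boundaries stop "India" from matching inside "Indiana".
_pattern = re.compile(r"\b(" + _alternatives + r")\b", flags=re.IGNORECASE)
_canonical = {name.lower(): name for name in COUNTRY_NAMES}


def find_countries(*fields):
    """Return the sorted set of listed country names found in any text field."""
    found = set()
    for text in fields:
        for match in _pattern.finditer(text or ""):
            found.add(_canonical[match.group(1).lower()])
    return sorted(found)


title = "Household surveys in Kenya and Brazil"
abstract = "We also study rural india and the state of Indiana."
print(find_countries(title, abstract))

# A non-standard spelling is missed, which is the exclusion error noted above.
print(find_countries("Evidence from Vietnam"))
```

The second call returns an empty list because the listed form is "Viet Nam". This is the kind of gap that a second detection method is meant to cover.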
The second approach is based on Named Entity Recognition (NER), which uses machine learning to identify objects from text, utilizing the spaCy Python library. The Named Entity Recognition algorithm splits text into named entities, and NER is used in this project to identify countries of study in the academic articles. SpaCy supports multiple languages and has been trained on multiple spellings of countries, overcoming some of the limitations of the regular expression approach. If a country is identified by either the regular expression search or NER, it is linked to the article. Note that one article can be linked to more than one country.
The second task is to classify whether the paper uses data. A supervised machine learning approach is employed, where 3500 publications were first randomly selected and manually labeled by human raters using the Mechanical Turk service (Paszke et al. 2019).[1] To make sure the human raters had a similar and appropriate definition of data in mind, they were given the following instructions before seeing their first paper:
Each of these documents is an academic article. The goal of this study is to measure whether a specific academic article is using data and from which country the data came.
There are two classification tasks in this exercise:
1. Identifying whether an academic article is using data from any country.
2. Identifying from which country that data came.
For task 1, we are looking specifically at the use of data. Data is any information that has been collected, observed, generated or created to produce research findings. As an example, a study that reports findings or analysis using survey data uses data. Some clues that a study uses data include whether a survey or census is described, a statistical model is estimated, or a table of means or summary statistics is reported.
After an article is classified as using data, please note the type of data used. The options are population or business census, survey data, administrative data, geospatial data, private sector data, and other data. If no data is used, then mark "Not applicable". In cases where multiple data types are used, please click multiple options.[2]
For task 2, we are looking at the country or countries that are studied in the article. In some cases, no country may be applicable. For instance, if the research is theoretical and has no specific country application. In some cases, the research article may involve multiple countries. In these cases, select all countries that are discussed in the paper.
We expect between 10 and 35 percent of all articles to use data.
The median amount of time that a worker spent on an article, measured as the time between when the article was accepted to be classified by the worker and when the classification was submitted was 25.4 minutes. If human raters were exclusively used rather than machine learning tools, then the corpus of 1,037,748 articles examined in this study would take around 50 years of human work time to review at a cost of $3,113,244, which assumes a cost of $3 per article as was paid to MTurk workers.
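The cost and time figures above can be checked with a few lines of arithmetic. Two assumptions in this sketch are not stated in the text: the median time is treated as the typical time per article, and "years" is read as continuous elapsed time (24 hours a day). The 2,000-hour working year is included only for contrast.

```python
ARTICLES = 1_037_748
COST_PER_ARTICLE_USD = 3
MEDIAN_MINUTES_PER_ARTICLE = 25.4

# Total payment if every article were hand-labelled at the MTurk rate.
total_cost_usd = ARTICLES * COST_PER_ARTICLE_USD

# Total labelling time, treating the median as the typical time per article.
total_hours = ARTICLES * MEDIAN_MINUTES_PER_ARTICLE / 60

# Assumption: continuous elapsed time, 24 hours a day and 365 days a year.
continuous_years = total_hours / (24 * 365)

# For contrast (assumption): one full-time worker at about 2,000 hours a year.
full_time_worker_years = total_hours / 2000

print(f"cost: ${total_cost_usd:,}")
print(f"hours: {total_hours:,.0f}")
print(f"continuous years: {continuous_years:.1f}")
print(f"full-time worker years: {full_time_worker_years:.1f}")
```

Under the continuous-time reading the result is about 50 years, which matches the figure in the text. A single full-time worker would need considerably longer.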
A model is next trained on the 3,500 labelled articles. We use a distilled version of the BERT (Bidirectional Encoder Representations from Transformers) model to encode raw text into a numeric format suitable for predictions (Devlin et al. 2018). BERT is pre-trained on a large corpus comprising the Toronto Book Corpus and Wikipedia. The distilled version (DistilBERT) is a compressed model that is 60% the size of BERT, retains 97% of its language understanding capabilities, and is 60% faster (Sanh, Debut, Chaumond, Wolf 2019). We use PyTorch to produce a model to classify articles based on the labeled data. Of the 3,500 articles that were hand coded by the MTurk workers, 900 are fed to the machine learning model. The number was limited to 900 because of computational constraints in training the NLP model. A classification of “uses data” was assigned if the model predicted an article used data with at least 90% confidence.
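The 90% confidence rule can be illustrated without the model itself. The sketch below assumes the classifier emits two raw scores (logits) per article, ordered as "no data" then "uses data". The ordering, the function names, and the example scores are all assumptions for illustration, not the project's PyTorch code.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_uses_data(logits, threshold=0.90):
    """Label an article "uses data" only if the model is at least 90% confident.

    Assumes logits = [score_no_data, score_uses_data].
    """
    p_uses_data = softmax(logits)[1]
    return p_uses_data >= threshold, p_uses_data


# Hypothetical model outputs for two articles.
print(classify_uses_data([0.0, 3.0]))  # confident enough to label "uses data"
print(classify_uses_data([0.0, 2.0]))  # falls below the 90% threshold
```

A high threshold such as this one trades recall for precision: articles the model is unsure about are left unlabelled rather than counted as using data.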
The performance of the models classifying articles to countries and as using data or not can be compared to the classification by the human raters. We consider the human raters as giving us the ground truth. This may underestimate the model performance if the workers at times got the allocation wrong in a way that would not apply to the model. For instance, a human rater could mistake the Republic of Korea for the Democratic People’s Republic of Korea. If both humans and the model perform the same kind of errors, then the performance reported here will be overestimated.
The model was able to predict whether an article made use of data with 87% accuracy evaluated on the set of articles held out of the model training. The correlation between the number of articles written about each country using data estimated under the two approaches is given in the figure below. The number of articles represents an aggregate total of
By City of Chicago
This public health dataset contains a comprehensive selection of indicators related to natality, mortality, infectious disease, lead poisoning, and economic status from Chicago community areas. It is an invaluable resource for those interested in understanding the current state of public health within each area in order to identify any deficiencies or areas of improvement needed.
The data includes 27 indicators, such as birth and death rates, the percentage of births with prenatal care beginning in the first trimester, preterm birth rates, breast cancer incidence per hundred thousand female population, and all-sites cancer rates per hundred thousand population. Each indicator is reported by geographic area, so trends can be analyzed at the local level. Stakeholders can also use the dataset to measure performance on these indicators or to compare community areas side by side.
The dataset is a useful tool for those working toward better public health outcomes in Chicago's communities: it gives insight into trends specific to geographic areas, which can motivate further research and evidence-based practice.
To use this dataset effectively to assess the public health of one or more areas in the city:
- Understand which data is available: The indicators included in this dataset are listed above. Know what is included, and how each indicator is defined, so that conclusions drawn from the data are accurate.
- Identify areas of interest: Once you are familiar with the data, identify the community areas you want to study more closely or compare with one another.
- Choose your variables: Decide which variables are most relevant to your study and frame specific research questions about them based on what you want to learn from the dataset.
- Analyze the data: With variables selected, compare the corresponding values across community areas using statistical tests such as t-tests or correlations. This helps answer questions like “Are there significant differences between two outputs?” and shows how Chicago community areas compare on the public health statistics tracked by this dataset.
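The analysis step can be sketched with the standard library alone. The example below computes a Pearson correlation between two indicators and a Welch t statistic for two groups of community areas. All numbers are invented for illustration, and the function names are hypothetical. Turning the t statistic into a p-value needs a t distribution, which a statistics package would provide.

```python
from math import sqrt
from statistics import mean, variance


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of values."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mx) ** 2 for x in xs))
    spread_y = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def welch_t(group_a, group_b):
    """Welch's t statistic for two groups with possibly unequal variances."""
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


# Made-up indicator values for five community areas (illustration only).
birth_rate = [12.0, 15.5, 18.2, 14.1, 16.9]
preterm_pct = [9.0, 10.4, 12.1, 9.8, 11.5]
print(round(pearson_r(birth_rate, preterm_pct), 3))

# Made-up values of one indicator for two groups of community areas.
print(round(welch_t([2.0, 4.0, 6.0], [1.0, 2.0, 3.0]), 3))
```

Welch's version of the t-test is used here because community areas differ widely in size and there is no reason to assume equal variances between groups.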
- Creating interactive maps that show data on public health indicators by Chicago community area to allow users to explore the data more easily.
- Designing a machine learning model to predict future variations in public health indicators by Chicago community area such as birth rate, preterm births, and childhood lead poisoning levels.
- Developing an app that enables users to search for public health information in their own community areas and compare with other areas within the city or across different cities in the US
If you use this dataset in your research, please credit the original authors.
See the dataset description for more information.
File: public-health-statistics-selected-public-health-indicators-by-chicago-community-area-1.csv
| Column name | Description |
|:-----------------------------------------------|:--------------------------------------------------------------------------------------------------|
| Community Area | Unique identifier for each community area in Chicago. (Integer) |
| Community Area Name | Name of the community area in Chicago. (String) |
| Birth Rate | Number of live births per 1,000 population. (Float) |
| General Fertility Rate | Number of live births per 1,000 women aged 15-44. (Float) ...
There is no single, simple way to measure racial equity. Instead, researchers and governments pull from a range of different data sources to identify ways in which people might experience racial equity gaps – using ‘indicators’. With this knowledge, the Mayor's Office of Racial Equity (ORE) and the Office of the Chief Technology Officer (OCTO) developed the District’s first Racial Equity Dashboard. The dashboard serves as a key supportive action in the District’s goal of eliminating racial and ethnic inequities, as articulated in the recently launched Districtwide Racial Equity Action Plan (REAP), primarily Goal 2.1: “Identify and measure long-standing racial equity indicators in partnership with other agencies, community-based organizations (CBOs), and District residents.” The Dashboard is a comprehensive, multi-page, online platform providing timely, relevant, and accessible data on 42 racial equity indicators across 7 dimensions of life in the District. This tool will allow the public and District government staff to learn more about racial equity in the District and track our progress towards eliminating racial and ethnic equity gaps and improving outcomes for all District residents. To keep the Dashboard meaningful and relevant, ORE cannot include every known, valid indicator of racial equity – in effect, those included in this Dashboard are subject to change based on ongoing stakeholder input and continued data availability.
Background: There has been increasing concern regarding the potential effects of the commercialization of research. Methods: In order to examine the relationships between funding source, trial outcome and reporting quality, recent issues of five peer-reviewed, high impact factor, general medical journals were hand-searched to identify a sample of 100 randomized controlled trials (20 trials/journal). Relevant data, including funding source (industry/not-for-profit/mixed/not reported) and statistical significance of primary outcome (favouring new treatment/favouring conventional treatment/neutral/unclear), were abstracted. Quality scores were assigned using the Jadad scale and the adequacy of allocation concealment. Results: Sixty-six percent of trials received some industry funding. Trial outcome was not associated with funding source (p = .461). There was a preponderance of favourable statistical conclusions among published trials, with 67% reporting results that favoured a new treatment whereas 6% favoured the conventional treatment. Quality scores were not associated with funding source or trial outcome. Conclusions: It is not known whether the absence of significant associations between funding source, trial outcome and reporting quality reflects a true absence of an association or is an artefact of inadequate statistical power, reliance on voluntary disclosure of funding information, a focus on trials recently published in the top medical journals, or some combination thereof. Continued and expanded monitoring of potential conflicts is recommended, particularly in light of new guidelines for disclosure that have been endorsed by the ICMJE.
By US Open Data Portal, data.gov
The National Weather Service (NWS) provides Storm Data, which details personal injuries and damage estimates resulting from many types of severe weather events in the United States. With records from as early as 1950 to the present, Storm Data lets users select storms by county or other custom criteria. It covers hurricanes, tornadoes, thunderstorms, hail, floods, drought conditions, lightning strikes, high winds, and snowfall, among many other natural phenomena of varying intensity. The records are organized chronologically by state. Because NWS releases updates periodically, the most recent Storm Data may lag by up to 120 days. The NCDC Storm Events Database is a reliable source for analysis or education, whether the goal is to make informed decisions about safety measures or to study historic trends related to climate change.
This dataset contains detailed information about various types of storms that have occurred in the United States since 1950, including hurricanes, tornadoes, hail storms, and floods. The data is organized by county and can be used to analyze weather patterns or conduct research on severe weather events in particular regions of the country. Here are some steps to help you get started using this dataset:
Select a geographic area: Start by selecting the state or county whose storm events you want to explore. This narrows the results to the relevant records.
Filter for the desired storm type(s): Next, filter for the type of storm event that interests you, such as hurricanes, snowstorms, or lightning strikes. You can refine the results further with criteria such as date range and damage estimates.
Analyze the resulting dataset: Finally, analyze the remaining fields in the filtered data, such as fatalities, damage estimates, and cities affected. Look for trends in storm severity over time or in the distribution of events by location, and consider creating charts or graphs to visualize the findings.
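The three steps above can be sketched in a few lines. The records and field names below (`state`, `county`, `event_type`, `year`, `damage_usd`, `deaths`) are hypothetical stand-ins chosen for illustration. The real Storm Data files may use different column names, so treat this as a pattern rather than a ready-made loader.

```python
from collections import defaultdict

# Hypothetical records: field names and values are illustrative only.
EVENTS = [
    {"state": "TEXAS", "county": "HARRIS", "event_type": "Hail",
     "year": 1999, "damage_usd": 50_000, "deaths": 0},
    {"state": "TEXAS", "county": "HARRIS", "event_type": "Tornado",
     "year": 1999, "damage_usd": 250_000, "deaths": 1},
    {"state": "TEXAS", "county": "DALLAS", "event_type": "Hail",
     "year": 2005, "damage_usd": 120_000, "deaths": 0},
    {"state": "KANSAS", "county": "SEDGWICK", "event_type": "Hail",
     "year": 2005, "damage_usd": 30_000, "deaths": 0},
]


def select_events(events, state=None, county=None, event_types=None, years=None):
    """Steps 1 and 2: narrow by geography, then by storm type and year range."""
    selected = []
    for event in events:
        if state is not None and event["state"] != state:
            continue
        if county is not None and event["county"] != county:
            continue
        if event_types is not None and event["event_type"] not in event_types:
            continue
        if years is not None and not (years[0] <= event["year"] <= years[1]):
            continue
        selected.append(event)
    return selected


def damage_by_year(events):
    """Step 3: total estimated damage per year, sorted chronologically."""
    totals = defaultdict(int)
    for event in events:
        totals[event["year"]] += event["damage_usd"]
    return dict(sorted(totals.items()))


texas_hail = select_events(EVENTS, state="TEXAS", event_types={"Hail"})
print(damage_by_year(texas_hail))
```

The per-year totals returned by `damage_by_year` are in a shape that most charting tools accept directly, which covers the visualization suggestion in the last step.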
- Looking into the correlation between severe weather events and climate change by tracking historical data points from 1950 onwards with the NCDC Storm Events Database.
- Identifying trends in storm damages, such as those caused by hail, high winds, and other weather phenomena that could lead to better preparedness strategies for businesses or individuals who are vulnerable to certain risks in their area.
- Analyzing which states or counties have experienced the most severe weather events over the years and using this information to inform mitigation planning ahead of potential disasters in those areas.
If you use this dataset in your research, please credit the original authors.
Unknown License - Please check the dataset description for more information.
If you use this dataset in your research, please credit US Open Data Portal, data.gov.
A primary objective of a data science project is to analyze data and prepare it for training in the related machine learning project. Gathering the necessary data from the beauty domain is a crucial step toward accurate results. To ensure that the data gathered is sufficient and relevant, it is vital to identify the appropriate data sources and analyze them. Homemade remedy recipes are becoming increasingly popular around the world, and numerous remedy recipe videos are available on YouTube and Google. The information provided above is required to recommend a remedy based on the conditions. The data set contains 18 different types of skin conditions that were identified by the user through surveys.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Public health-related decision-making on policies aimed at controlling the COVID-19 pandemic outbreak depends on complex epidemiological models that are compelled to be robust and use all relevant available data. This data article provides a new combined worldwide COVID-19 dataset obtained from official data sources with improved systematic measurement errors and a dedicated dashboard for online data visualization and summary. The dataset adds new measures and attributes to the normal attributes of official data sources, such as daily mortality, and fatality rates. We used comparative statistical analysis to evaluate the measurement errors of COVID-19 official data collections from the Chinese Center for Disease Control and Prevention (Chinese CDC), World Health Organization (WHO) and European Centre for Disease Prevention and Control (ECDC). The data is collected by using text mining techniques and reviewing pdf reports, metadata, and reference data. The combined dataset includes complete spatial data such as countries area, international number of countries, Alpha-2 code, Alpha-3 code, latitude, longitude, and some additional attributes such as population. The improved dataset benefits from major corrections on the referenced data sets and official reports such as adjustments in the reporting dates, which suffered from a one to two days lag, removing negative values, detecting unreasonable changes in historical data in new reports and corrections on systematic measurement errors, which have been increasing as the pandemic outbreak spreads and more countries contribute data for the official repositories. Additionally, the root mean square error of attributes in the paired comparison of datasets was used to identify the main data problems. The data for China is presented separately and in more detail, and it has been extracted from the attached reports available on the main page of the CCDC website. 
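The paired comparison described above rests on the root mean square error between two sources reporting the same attribute. The sketch below shows that calculation together with a simple treatment of negative values. The daily counts are invented, and masking negatives as missing (rather than correcting them) is an assumption made for illustration, not necessarily the authors' procedure.

```python
from math import sqrt


def mask_negative(values):
    """Replace negative counts with None so they are excluded from comparisons."""
    return [v if v is None or v >= 0 else None for v in values]


def rmse(series_a, series_b):
    """Root mean square error over the dates where both sources have a value."""
    pairs = [
        (a, b) for a, b in zip(series_a, series_b)
        if a is not None and b is not None
    ]
    if not pairs:
        raise ValueError("no overlapping observations to compare")
    return sqrt(sum((a - b) ** 2 for a, b in pairs) / len(pairs))


# Invented daily case counts from two hypothetical sources, for illustration only.
source_one = [100, 120, -5, 150]
source_two = [100, 118, 130, 150]

cleaned_one = mask_negative(source_one)
print(cleaned_one)
print(round(rmse(cleaned_one, source_two), 4))
```

A large RMSE for one attribute across a pair of sources points to the dates and fields worth inspecting by hand, for example a reporting-date lag of one or two days.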
This dataset is a comprehensive and reliable source of worldwide COVID-19 data that can be used in epidemiological models assessing the magnitude and timeline for confirmed cases, long-term predictions of deaths or hospital utilization, the effects of quarantine, stay-at-home orders and other social distancing measures, the pandemic’s turning point or in economic and social impact analysis, helping to inform national and local authorities on how to implement an adaptive response approach to re-opening the economy, re-open schools, alleviate business and social distancing restrictions, design economic programs or allow sports events to resume.
United States agricultural researchers have many options for making their data available online. This dataset aggregates the primary sources of ag-related data and determines where researchers are likely to deposit their agricultural data. These data serve as both a current landscape analysis and also as a baseline for future studies of ag research data.
Purpose: As sources of agricultural data become more numerous and disparate, and collaboration and open data become more expected if not required, this research provides a landscape inventory of online sources of open agricultural data. An inventory of current agricultural data sharing options will help assess how the Ag Data Commons, a platform for USDA-funded data cataloging and publication, can best support data-intensive and multi-disciplinary research. It will also help agricultural librarians assist their researchers in data management and publication. The goals of this study were to:
- establish where agricultural researchers in the United States (land grant and USDA researchers, primarily ARS, NRCS, USFS and other agencies) currently publish their data, including general research data repositories, domain-specific databases, and the top journals
- compare how much data is in institutional vs. domain-specific vs. federal platforms
- determine which repositories are recommended by top journals that require or recommend the publication of supporting data
- ascertain where researchers not affiliated with funding or initiatives possessing a designated open data repository can publish data
Approach: The National Agricultural Library team focused on Agricultural Research Service (ARS), Natural Resources Conservation Service (NRCS), and United States Forest Service (USFS) style research data, rather than ag economics, statistics, and social sciences data.
To find domain-specific, general, institutional, and federal agency repositories and databases that are open to US research submissions and have some amount of ag data, resources including re3data, libguides, and ARS lists were analysed. Primarily environmental or public health databases were not included, but places where ag grantees would publish data were considered.
Search methods: We first compiled a list of known domain specific USDA / ARS datasets / databases that are represented in the Ag Data Commons, including ARS Image Gallery, ARS Nutrition Databases (sub-components), SoyBase, PeanutBase, National Fungus Collection, i5K Workspace @ NAL, and GRIN. We then searched using search engines such as Bing and Google for non-USDA / federal ag databases, using Boolean variations of “agricultural data” / “ag data” / “scientific data” + NOT + USDA (to filter out the federal / USDA results). Most of these results were domain specific, though some contained a mix of data subjects. We then used search engines such as Bing and Google to find top agricultural university repositories using variations of “agriculture”, “ag data” and “university” to find schools with agriculture programs. Using that list of universities, we searched each university web site to see if their institution had a repository for their unique, independent research data if not apparent in the initial web browser search. We found both ag specific university repositories and general university repositories that housed a portion of agricultural data. Ag specific university repositories are included in the list of domain-specific repositories. Results included Columbia University – International Research Institute for Climate and Society, UC Davis – Cover Crops Database, etc. If a general university repository existed, we determined whether that repository could filter to include only data results after our chosen ag search terms were applied.
General university databases that contain ag data included Colorado State University Digital Collections, University of Michigan ICPSR (Inter-university Consortium for Political and Social Research), and University of Minnesota DRUM (Digital Repository of the University of Minnesota). We then split out NCBI (National Center for Biotechnology Information) repositories. Next we searched the internet for open general data repositories using a variety of search engines, and repositories containing a mix of data, journals, books, and other types of records were tested to determine whether that repository could filter for data results after search terms were applied. General subject data repositories include Figshare, Open Science Framework, PANGAEA, Protein Data Bank, and Zenodo. Finally, we compared scholarly journal suggestions for data repositories against our list to fill in any missing repositories that might contain agricultural data. Extensive lists of journals in which USDA published in 2012 and 2016 were compiled, combining search results in ARIS, Scopus, and the Forest Service's TreeSearch, plus the USDA web sites Economic Research Service (ERS), National Agricultural Statistics Service (NASS), Natural Resources Conservation Service (NRCS), Food and Nutrition Service (FNS), Rural Development (RD), and Agricultural Marketing Service (AMS). The top 50 journals' author instructions were consulted to see if they (a) ask or require submitters to provide supplemental data, or (b) require submitters to submit data to open repositories. Data are provided for Journals based on a 2012 and 2016 study of where USDA employees publish their research studies, ranked by number of articles, including 2015/2016 Impact Factor, Author guidelines, Supplemental Data?, Supplemental Data reviewed?, Open Data (Supplemental or in Repository) Required? and Recommended data repositories, as provided in the online author guidelines for each of the top 50 journals.
Evaluation: We ran a series of searches on all resulting general subject databases with the designated search terms. From the results, we noted the total number of datasets in the repository, type of resource searched (datasets, data, images, components, etc.), percentage of the total database that each term comprised, any dataset with a search term that comprised at least 1% and 5% of the total collection, and any search term that returned greater than 100 and greater than 500 results. We compared domain-specific databases and repositories based on parent organization, type of institution, and whether data submissions were dependent on conditions such as funding or affiliation of some kind.
Results: A summary of the major findings from our data review:
- Over half of the top 50 ag-related journals from our profile require or encourage open data for their published authors.
- There are few general repositories that are both large AND contain a significant portion of ag data in their collection. GBIF (Global Biodiversity Information Facility), ICPSR, and ORNL DAAC were among those that had over 500 datasets returned with at least one ag search term and had that result comprise at least 5% of the total collection.
- Not even one quarter of the domain-specific repositories and datasets reviewed allow open submission by any researcher regardless of funding or affiliation.
See included README file for descriptions of each individual data file in this dataset.
Resources in this dataset:
- Resource Title: Journals. File Name: Journals.csv
- Resource Title: Journals - Recommended repositories. File Name: Repos_from_journals.csv
- Resource Title: TDWG presentation. File Name: TDWG_Presentation.pptx
- Resource Title: Domain Specific ag data sources. File Name: domain_specific_ag_databases.csv
- Resource Title: Data Dictionary for Ag Data Repository Inventory. File Name: Ag_Data_Repo_DD.csv
- Resource Title: General repositories containing ag data. File Name: general_repos_1.csv
- Resource Title: README and file inventory. File Name: README_InventoryPublicDBandREepAgData.txt
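The evaluation thresholds described for this inventory (at least 1% or 5% of a collection, and more than 100 or 500 results) can be expressed as a small helper. This is a sketch under stated assumptions: the function name, the output keys, and the example counts are hypothetical, and the "large and significant" rule is read from the results summary as more than 500 results that also make up at least 5% of the collection.

```python
def flag_search_term(hits, total_records):
    """Apply the share and count thresholds to one search term in one repository."""
    if total_records <= 0:
        raise ValueError("total_records must be positive")
    share = hits / total_records
    return {
        "share_pct": round(100 * share, 2),
        "at_least_1pct": share >= 0.01,
        "at_least_5pct": share >= 0.05,
        "over_100_results": hits > 100,
        "over_500_results": hits > 500,
        # Reading of the results summary: large AND a significant share.
        "large_and_significant": hits > 500 and share >= 0.05,
    }


# Hypothetical counts for two repositories, for illustration only.
print(flag_search_term(hits=600, total_records=10_000))
print(flag_search_term(hits=150, total_records=100_000))
```

Keeping the count and share tests separate matters because a very large general repository can return hundreds of ag datasets that are still a negligible fraction of its holdings.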
Deep radio observations at 1.4 GHz for the Extended Chandra Deep Field South were performed in 2007 June through September and presented in a first data release (Miller et al. 2008, ApJS, 179, 114). The survey was made using six separate pointings of the Very Large Array with over 40 hr of observation per pointing. In the current study, the authors improve on the data reduction to produce a second data release (DR2) mosaic image. This DR2 image covers an area of about a third of a square degree, reaches a best rms sensitivity of 6 microjansky (µJy), and has a typical sensitivity of 7.4 µJy per 2.8" by 1.6" beam. The authors also present a more comprehensive catalog, including sources down to peak flux densities of five or more times the local rms noise, along with information on source sizes and relevant pointing data. In their paper, they discuss in some detail the consideration of whether sources are resolved under the complication of a radio image created as a mosaic of separate pointings, each suffering some degree of bandwidth smearing, and the accurate evaluation of the flux densities of such sources. Finally, the radio morphologies and optical/near-IR counterpart identifications are used to identify 17 likely multiple-component sources so as to arrive at a catalog of 883 radio sources (and also 49 individual components of the 17 multi-component sources), which is roughly double the number of sources contained in the first data release. In order to cover the full E-CDF-S area at near-uniform sensitivity, the authors pointed the VLA at six separate coordinate locations arranged in a hexagonal grid around the adopted center of the CDF-S, viz. RA, Dec (J2000) 03h 32m 28.00s, -27° 48' 30.0". The observations were spread over many days on account of the low declination of the field and typically amounted to 5 hr of time per calendar date. The details of the individual pointings are:
Pointing ID   R.A. (J2000)   Dec. (J2000)   rms sensitivity for final image
ECDFS 1       03:33:22.25    -27:48:30.0    10.5 µJy
ECDFS 2       03:32:55.12    -27:38:03.0     9.4 µJy
ECDFS 3       03:32:00.88    -27:38:03.0     9.7 µJy
ECDFS 4       03:31:33.75    -27:48:30.0     9.5 µJy
ECDFS 5       03:32:00.88    -27:58:57.0    10.0 µJy
ECDFS 6       03:32:55.12    -27:58:57.0     9.3 µJy
The images corresponding to the six individual pointings were combined to form the final mosaic image (shown in Figure 1 of the reference paper). This HEASARC table contains the catalog of 883 radio sources (Table 3 in the reference paper) and also the catalog of 49 individual components of the 17 multi-component sources (Table 4 in the reference paper), so that there are a total of 932 entries in the present table. To allow users to easily distinguish these types of entry, the HEASARC created a parameter type_flag which is set to 'S' for the 883 source entries and to 'C' for the 49 component entries. The HEASARC created names for the sources following the standard CDS and IAU recommendations for position-based names and using the prefix '[MBF2013]' for Miller, Bonzini, Fomalont (2013), the first 3 authors and the date of publication of the reference paper. For the components, we have used names based on the positions of the parent sources and the suffixes 'A', 'B', etc., in order of increasing J2000.0 RA. Thus, for the multi-component source [MBF2013] J033115.0-275518, which has 3 components, there are 4 entries in this table, one for the entire source and one for each component, e.g.:
Name                         | type_flag | RA (J2000.0) | Dec (J2000.0)
[MBF2013] J033115.0-275518   | S         | 03 31 15.04  | -27 55 18.8
[MBF2013] J033115.0-275518 A | C         | 03 31 13.99  | -27 55 19.9
[MBF2013] J033115.0-275518 B | C         | 03 31 15.06  | -27 55 18.9
[MBF2013] J033115.0-275518 C | C         | 03 31 17.05  | -27 55 15.2
The 17 sources thought to consist of multiple components associated with a single host object are each listed with a single aggregate integrated flux density. Gaussian fits to the individual components associated with these sources are listed separately in the component entries.
This table was created by the HEASARC in May 2013 based on CDS Catalog J/ApJS/205/13 files table3.dat and table4.dat. This is a service provided by NASA HEASARC.
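The naming and flagging scheme described above can be illustrated with a short Python sketch. It assumes the position-based name is formed by truncating (not rounding) the RA seconds to 0.1 s and the Dec arcseconds to 1", which is consistent with the example entry; the function name `position_name` and the in-memory `rows` list are hypothetical and are not part of any HEASARC interface.

```python
from decimal import Decimal, ROUND_DOWN


def position_name(ra_hms, dec_dms, prefix="[MBF2013]"):
    """Build a 'JHHMMSS.s-DDMMSS' style name from sexagesimal strings.

    Assumption: coordinates are truncated, not rounded.
    """
    rh, rm, rs = ra_hms.split()
    dec = dec_dms.strip()
    sign = "-" if dec.startswith("-") else "+"
    dd, dm, ds = dec.lstrip("+-").split()
    # Truncate RA seconds to one decimal place and Dec arcseconds to an integer.
    ra_sec = Decimal(rs).quantize(Decimal("0.1"), rounding=ROUND_DOWN)
    dec_sec = Decimal(ds).quantize(Decimal("1"), rounding=ROUND_DOWN)
    ra_part = f"{int(rh):02d}{int(rm):02d}{float(ra_sec):04.1f}"
    dec_part = f"{int(dd):02d}{int(dm):02d}{int(dec_sec):02d}"
    return f"{prefix} J{ra_part}{sign}{dec_part}"


# The four example entries from the table above, as (name, type_flag) pairs.
rows = [
    ("[MBF2013] J033115.0-275518", "S"),
    ("[MBF2013] J033115.0-275518 A", "C"),
    ("[MBF2013] J033115.0-275518 B", "C"),
    ("[MBF2013] J033115.0-275518 C", "C"),
]

# Separate whole sources ('S') from components ('C') and group the
# components under their parent source by stripping the letter suffix.
sources = [name for name, flag in rows if flag == "S"]
components = {}
for name, flag in rows:
    if flag == "C":
        parent, suffix = name.rsplit(" ", 1)
        components.setdefault(parent, []).append(suffix)

parent_name = position_name("03 31 15.04", "-27 55 18.8")
```

Selecting only the 'S' rows gives one entry per radio source, which avoids counting a multi-component source more than once when working with the full 932-entry table.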
Techsalerator’s News Event Data in North America offers a comprehensive and detailed dataset designed to provide businesses, analysts, journalists, and researchers with a thorough view of significant news events across North America. This dataset captures and categorizes major events reported from a diverse range of news sources, including press releases, industry news sites, blogs, and PR platforms, providing valuable insights into regional developments, economic shifts, political changes, and cultural events.
Key Features of the Dataset:
Extensive Coverage: The dataset aggregates news events from a wide array of sources, including company press releases, industry-specific news outlets, blogs, PR sites, and traditional media. This broad coverage ensures a diverse range of information from multiple reporting channels.
Categorization of Events: News events are categorized into various types such as business and economic updates, political developments, technological advancements, legal and regulatory changes, and cultural events. This categorization helps users quickly find and analyze information relevant to their interests or sectors.
Real-Time Updates: The dataset is updated regularly to include the most current events, ensuring that users have access to up-to-date news and can stay informed about recent developments as they happen.
Geographic Segmentation: Events are tagged with their respective countries and territories within North America. This geographic segmentation allows users to filter and analyze news events based on specific locations, facilitating targeted research and analysis.
Event Details: Each event entry includes comprehensive details such as the date of occurrence, source of the news, a description of the event, and relevant keywords. This thorough detailing helps users understand the context and significance of each event.
Historical Data: The dataset includes historical news event data, enabling users to track trends and conduct comparative analysis over time. This feature supports longitudinal studies and provides insights into how news events evolve.
Advanced Search and Filter Options: Users can search and filter news events based on criteria such as date range, event type, location, and keywords. This functionality allows for precise and efficient retrieval of relevant information.
North American Countries and Territories Covered:
Countries: Canada, Mexico, United States
Territories:
American Samoa (U.S. territory)
French Polynesia (French overseas collectivity; included for regional relevance)
Guam (U.S. territory)
New Caledonia (French special collectivity; included for regional relevance)
Northern Mariana Islands (U.S. territory)
Puerto Rico (U.S. territory)
Saint Pierre and Miquelon (French overseas territory; geographically close to North America and included for regional comprehensiveness)
Wallis and Futuna (French overseas collectivity; included for regional relevance)
Benefits of the Dataset:
Strategic Insights: Businesses and analysts can use the dataset to gain insights into significant regional developments, economic conditions, and political changes, aiding in strategic decision-making and market analysis.
Market and Industry Trends: The dataset provides valuable information on industry-specific trends and events, helping users understand market dynamics and identify emerging opportunities.
Media and PR Monitoring: Journalists and PR professionals can track relevant news across North America, enabling them to monitor media coverage, identify emerging stories, and manage public relations efforts effectively.
Academic and Research Use: Researchers can utilize the dataset for longitudinal studies, trend analysis, and academic research on various topics related to North American news and events.
Techsalerator’s News Event Data in North America is a crucial resource for accessing and analyzing significant news events across the continent. By providing detailed, categorized, and up-to-date information, it supports effective decision-making, research, and media monitoring across diverse sectors.
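The search and filter criteria listed above (date range, event type, location, keywords) could be applied to event records along the following lines. This is a minimal Python sketch only: the `NewsEvent` fields mirror the event details named in the description, but the actual delivery format and field names of the dataset are not given here, so the record layout, the `filter_events` helper, and the sample rows are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class NewsEvent:
    # Hypothetical record layout based on the "Event Details" description.
    occurred: date
    source: str
    description: str
    event_type: str
    country: str
    keywords: List[str] = field(default_factory=list)


def filter_events(events, start=None, end=None, event_type=None,
                  country=None, keyword=None):
    """Return events matching every criterion that is not None."""
    matches = []
    for ev in events:
        if start is not None and ev.occurred < start:
            continue
        if end is not None and ev.occurred > end:
            continue
        if event_type is not None and ev.event_type != event_type:
            continue
        if country is not None and ev.country != country:
            continue
        # Keyword matching is case-insensitive and requires an exact keyword hit.
        if keyword is not None and keyword.lower() not in (
            k.lower() for k in ev.keywords
        ):
            continue
        matches.append(ev)
    return matches


# Made-up placeholder rows for illustration; not taken from the dataset.
EVENTS = [
    NewsEvent(date(2024, 1, 15), "press release", "placeholder event 1",
              "business", "Canada", ["merger"]),
    NewsEvent(date(2024, 2, 3), "blog", "placeholder event 2",
              "regulatory", "Mexico", ["energy"]),
    NewsEvent(date(2024, 3, 9), "industry news", "placeholder event 3",
              "business", "United States", ["Merger", "retail"]),
]

hits = filter_events(EVENTS, start=date(2024, 2, 1),
                     event_type="business", keyword="merger")
```

Treating each criterion as optional keeps the helper usable for both broad queries (a single country) and narrow ones (a date range combined with a type and keyword).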
This Data folder contains the MATLAB code, final product, and tables used in Parker, L.E.; Zhang, N.; Abatzoglou, J.T.; Ostoja, S.M.; Pathak, T.B. Observed Changes in Agroclimate Metrics Relevant for Specialty Crop Production in California. Agronomy 2022, 12, 205. https://doi.org/10.3390/agronomy12010205.
Data Source: The primary data source for this study was the GridMET dataset, which provides high-resolution meteorological data across the contiguous United States.
Analytical Tools: We employed MATLAB for all data processing and analysis. The specific scripts and methodologies used are included within the dataset to facilitate replication.
Spatial Analysis: Geographic Information Systems (GIS) were utilized to overlay crop cover data with climate metrics, allowing for analysis of regional impacts. This includes shapefiles for California’s state and county boundaries, as well as specific agricultural regions.
Agroclimatic Metric Calculation: Metrics such as growing degree days, frost days, and reference evapotranspiration were computed to assess climate trends and their agricultural impacts.
Trend Analysis: We applied statistical techniques to identify significant changes and trends in climate metrics over the 40-year study period.
The collection and use of climatological data from the GridMET dataset comply with public data use agreements. No personally identifiable information (PII) or restricted environmental data were used in this study.
This dataset encompasses a detailed analysis of agroclimate metrics relevant to specialty crop production in California over the period from 1981 to 2020. Using the GridMET meteorological dataset, we calculated 12 distinct agroclimatic metrics that are crucial for understanding the impact of climate variability on agricultural outputs.
This dataset is intended for researchers and policymakers interested in agricultural planning and climate adaptation strategies.
It provides a valuable resource for developing sustainable agricultural practices in response to changing climatic conditions in California.
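Two of the metrics named above, growing degree days and frost days, can be sketched in a few lines. The study itself used MATLAB, and its exact metric definitions are not given in this description, so the Python below is an illustration under stated assumptions: the simple daily-average method for growing degree days, a base temperature of 10 °C, and a frost threshold of 0 °C on daily minimum temperature. The function names and sample temperatures are hypothetical.

```python
def growing_degree_days(tmin, tmax, base=10.0):
    """Sum of daily (mean temperature - base), counting only positive values.

    Assumption: simple averaging method with an illustrative 10 degC base;
    the study's own definition may differ.
    """
    total = 0.0
    for lo, hi in zip(tmin, tmax):
        daily_mean = (lo + hi) / 2.0
        total += max(0.0, daily_mean - base)
    return total


def frost_days(tmin, threshold=0.0):
    """Count days whose minimum temperature falls below the threshold.

    Assumption: frost is defined as Tmin < 0 degC.
    """
    return sum(1 for t in tmin if t < threshold)


# Made-up daily temperatures in degC for three days.
tmin = [-2.0, 5.0, 12.0]
tmax = [8.0, 15.0, 24.0]

gdd = growing_degree_days(tmin, tmax)  # daily means 3, 10, 18 -> 0 + 0 + 8
n_frost = frost_days(tmin)             # only the first day is below 0 degC
```

Applied to each grid cell and year of a daily temperature record, functions like these yield annual series that can then be examined for trends over the study period.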
Techsalerator’s News Event Data in Asia offers a detailed and expansive dataset designed to provide businesses, analysts, journalists, and researchers with comprehensive insights into significant news events across the Asian continent. This dataset captures and categorizes major events reported from a diverse range of news sources, including press releases, industry news sites, blogs, and PR platforms, offering valuable perspectives on regional developments, economic shifts, political changes, and cultural occurrences.
Key Features of the Dataset:
Extensive Coverage: The dataset aggregates news events from a wide range of sources such as company press releases, industry-specific news outlets, blogs, PR sites, and traditional media. This broad coverage ensures a diverse array of information from multiple reporting channels.
Categorization of Events: News events are categorized into various types including business and economic updates, political developments, technological advancements, legal and regulatory changes, and cultural events. This categorization helps users quickly find and analyze information relevant to their interests or sectors.
Real-Time Updates: The dataset is updated regularly to include the most current events, ensuring users have access to the latest news and can stay informed about recent developments as they happen.
Geographic Segmentation: Events are tagged with their respective countries and regions within Asia. This geographic segmentation allows users to filter and analyze news events based on specific locations, facilitating targeted research and analysis.
Event Details: Each event entry includes comprehensive details such as the date of occurrence, source of the news, a description of the event, and relevant keywords. This thorough detailing helps users understand the context and significance of each event.
Historical Data: The dataset includes historical news event data, enabling users to track trends and perform comparative analysis over time. This feature supports longitudinal studies and provides insights into the evolution of news events.
Advanced Search and Filter Options: Users can search and filter news events based on criteria such as date range, event type, location, and keywords. This functionality allows for precise and efficient retrieval of relevant information.
Asian Countries and Territories Covered:
Central Asia: Kazakhstan, Kyrgyzstan, Tajikistan, Turkmenistan, Uzbekistan
East Asia: China, Hong Kong (Special Administrative Region of China), Japan, Mongolia, North Korea, South Korea, Taiwan
South Asia: Afghanistan, Bangladesh, Bhutan, India, Maldives, Nepal, Pakistan, Sri Lanka
Southeast Asia: Brunei, Cambodia, East Timor (Timor-Leste), Indonesia, Laos, Malaysia, Myanmar (Burma), Philippines, Singapore, Thailand, Vietnam
Western Asia (Middle East): Armenia, Azerbaijan, Bahrain, Cyprus, Georgia, Iraq, Israel, Jordan, Kuwait, Lebanon, Oman, Palestine, Qatar, Saudi Arabia, Syria, Turkey (partly in Europe, but often included in Asia contextually), United Arab Emirates, Yemen
Benefits of the Dataset:
Strategic Insights: Businesses and analysts can use the dataset to gain insights into significant regional developments, economic conditions, and political changes, aiding in strategic decision-making and market analysis.
Market and Industry Trends: The dataset provides valuable information on industry-specific trends and events, helping users understand market dynamics and identify emerging opportunities.
Media and PR Monitoring: Journalists and PR professionals can track relevant news across Asia, enabling them to monitor media coverage, identify emerging stories, and manage public relations efforts effectively.
Academic and Research Use: Researchers can utilize the dataset for longitudinal studies, trend analysis, and academic research on various topics related to Asian news and events.
Techsalerator’s News Event Data in Asia is a crucial resource for accessing and analyzing significant news events across the continent. By offering detailed, categorized, and up-to-date information, it supports effective decision-making, research, and media monitoring across diverse sectors.