69 datasets found
  1. Rapid Update Cycle (RUC) model: hybrid analysis data, 20-km resolution

    • catalog.data.gov
    • data.amerigeoss.org
    • +1 more
    Updated Nov 12, 2020
    Cite
    Atmospheric Radiation Measurement Data Center (2020). Rapid Update Cycle (RUC) model: hybrid analysis data, 20-km resolution [Dataset]. https://catalog.data.gov/dataset/rapid-update-cycle-ruc-model-hybrid-analysis-data-20-km-resolution
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    Atmospheric Radiation Measurement Data Center
    Description

    No description found

  2. NAM Impact and Risk Analysis Database v01

    • researchdata.edu.au
    Updated Dec 11, 2018
    Cite
    Bioregional Assessment Program (2018). NAM Impact and Risk Analysis Database v01 [Dataset]. https://researchdata.edu.au/nam-impact-risk-database-v01/2987800
    Dataset updated
    Dec 11, 2018
    Dataset provided by
    data.gov.au
    Authors
    Bioregional Assessment Program
    Description

    Abstract

    The Namoi Impact and Risk Analysis Database (Analysis Database) is a fit-for-purpose geospatial information system developed for the Impact and Risk Analysis (Component 3-4) products of the Bioregional Assessment Technical Programme (BATP). The Analysis Database brings together many of the data sets of the scientific disciplines of the Programme and includes modelling results from hydrogeology and hydrology, landscape classes and economic, sociocultural and ecological assets. These data sets are listed in the Data Register for each subregion and can be found on the Bioregional Assessments web site (http://www.bioregionalassessments.gov.au/).

    An Analysis Database of common design and schema was implemented for each individual subregion where a full Impact and Risk Analysis was completed. To populate each database, input datasets were transformed, normalised and inserted into their respective Analysis Database in accord with the common design and schema. The approach enabled the universal treatment of data analysis across all bioregions despite data being of a different specification and origin.

    The Analysis Database provided for this subregion is an exact replica of the original used for the assessment of the subregion, with the exception that a small amount of spatial data for individual Assets subject to restrictions has been removed before publication. The restrictions typically apply to threatened species spatial data, but occasionally restrictive licensing conditions imposed by some custodians prevented publication of some data. The database is constructed using the Open Source platform PostgreSQL coupled with PostGIS. This technology was considered to better enable the provenance and transparency requirements of the Programme. The files provided here have been prepared using the PostgreSQL version 9.5 SQL Dump function, pg_dump.

    A detailed description of the Analysis Database, its design, structure and application is provided in the supporting documentation: http://data.bioregionalassessments.gov.au/dataset/05e851cf-57a5-4127-948a-1b41732d538c

    Purpose

    The Namoi Impact and Risk Analysis Database (Analysis Database) is the geospatial database for completing the Impact and Risk Analysis component of a Bioregional Assessment. This includes the creation of the results, tables and maps that appear in the relevant Products of each assessment. The database also manages the data used by the BA Explorer.

    An individual instance of the Analysis Database was developed for each subregion where a component 3-4 Impact and Risks Assessment was conducted. With the exception of the subregion-specific data contained within it and the removal of restricted data records, each analysis database is of identical design and structure.

    Dataset History

    This Analysis Database is an instance of PostgreSQL version 9.5 hosted on Linux Red Hat Enterprise Linux version 4.8.5-4. PostgreSQL geospatial capabilities are provided by PostGIS version 2.2.

    Data pre-processing and upload into each PostgreSQL database were completed using FME Desktop (Oracle Edition) version 2016.1.2.1. Analysis data and results are provided to users and systems via the geospatial services of GeoServer version 2.9.1. Scientific analysis and mapping were undertaken by connecting a range of data using a combination of Microsoft Excel, QGIS and ArcMap systems.

    During the Programme and for its working life, the Analysis Database was hosted and managed on instances of Amazon Web Services managed by Geoscience Australia and the Bureau of Meteorology.

    Dataset Citation

    Bioregional Assessment Programme (2018) NAM Impact and Risk Analysis Database v01. Bioregional Assessment Derived Dataset. Viewed 11 December 2018, http://data.bioregionalassessments.gov.au/dataset/1549c88d-927b-4cb5-b531-1d584d59be58.

  3. Housing Market Value Analysis 2021

    • data.wprdc.org
    • gimi9.com
    • +1 more
    geojson, html, pdf +2
    Updated Jul 8, 2025
    Cite
    Allegheny County (2025). Housing Market Value Analysis 2021 [Dataset]. https://data.wprdc.org/dataset/market-value-analysis-2021
    Available download formats: xlsx(22669), html, zip(1996574), pdf(28782887), zip(2039140), pdf(881980), geojson(10301172)
    Dataset updated
    Jul 8, 2025
    Dataset provided by
    Allegheny County
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    In 2021, Allegheny County Economic Development (ACED), in partnership with the Urban Redevelopment Authority of Pittsburgh (URA), completed a Market Value Analysis (MVA) for Allegheny County. This analysis serves as an update to previous MVAs commissioned separately by ACED and the URA, and it combines them into a single MVA for the whole of Allegheny County (inclusive of the City of Pittsburgh). The MVA is a unique tool for characterizing markets because it creates an internally referenced index of a municipality’s residential real estate market. It identifies areas that are the highest demand markets as well as areas of greatest distress, and the various market types in between. The MVA offers insight into the variation in market strength and weakness within and between traditional community boundaries because it uses Census block groups as the unit of analysis. Where market types abut each other on the map becomes instructive about the potential direction of market change and, ultimately, the appropriateness of types of investment or intervention strategies.

    This MVA used data that help define the local real estate market. The data cover the 2017-2019 period and include:

    • Residential Real Estate Sales
    • Mortgage Foreclosures
    • Residential Vacancy
    • Parcel Year Built
    • Parcel Condition
    • Building Violations
    • Owner Occupancy
    • Subsidized Housing Units

    The MVA uses a statistical technique known as cluster analysis, forming groups of areas (i.e., block groups) that are similar along the MVA descriptors noted above. The goal is to form groups whose members share similar characteristics, while each group differs from the others. Using this technique, the MVA condenses vast amounts of data for the universe of all properties to a manageable, meaningful typology of market types that can inform area-appropriate programs and decisions regarding the allocation of resources.
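
The grouping idea described above can be sketched in a few lines of Python. The description does not say which clustering algorithm the MVA used, so plain k-means stands in here purely as an illustration. The two indicators (vacancy rate, owner-occupancy rate), the six block groups and all values are invented for the example and are not taken from the dataset.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means: returns one cluster label per point."""
    # Deterministic start: use the first k points as the initial centroids.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical block groups: (vacancy rate, owner-occupancy rate), invented values.
block_groups = [(0.02, 0.80), (0.25, 0.30), (0.03, 0.75),
                (0.30, 0.25), (0.04, 0.82), (0.28, 0.35)]
labels = kmeans(block_groups, k=2)
print(labels)
```

With real indicators on different scales (sale prices, counts, percentages), the variables would normally be standardized before clustering so that no single descriptor dominates the distance calculation.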

    Please refer to the presentation and executive summary for more information about the data, methodology, and findings.

  4. Update: Research Data Management in Canada

    • hsscommons.rs-dev.uvic.ca
    • hsscommons.ca
    Updated Oct 23, 2023
    Cite
    Caroline Winter (2023). Update: Research Data Management in Canada [Dataset]. http://doi.org/10.80230/8Z97-A987
    Dataset updated
    Oct 23, 2023
    Dataset provided by
    Canadian HSS Commons
    Authors
    Caroline Winter
    Area covered
    Canada
    Description

    In March 2021, the Government of Canada announced the release of the Tri-Agency Research Data Management Policy (RDM Policy). A draft of this policy for consultation was released in May 2018 (see “Tri-Agency Research Data Management Policy”).

  5. Household Health Survey 2012-2013, Economic Research Forum (ERF)...

    • datacatalog.ihsn.org
    • catalog.ihsn.org
    Updated Jun 26, 2017
    Cite
    Central Statistical Organization (CSO) (2017). Household Health Survey 2012-2013, Economic Research Forum (ERF) Harmonization Data - Iraq [Dataset]. https://datacatalog.ihsn.org/catalog/6937
    Dataset updated
    Jun 26, 2017
    Dataset provided by
    Central Statistical Organization (CSO)
    Economic Research Forum
    Kurdistan Regional Statistics Office (KRSO)
    Time period covered
    2012 - 2013
    Area covered
    Iraq
    Description

    Abstract

    The harmonized data set on health, created and published by the ERF, is a subset of Iraq Household Socio Economic Survey (IHSES) 2012. It was derived from the household, individual and health modules, collected in the context of the above mentioned survey. The sample was then used to create a harmonized health survey, comparable with the Iraq Household Socio Economic Survey (IHSES) 2007 micro data set.

    ----> Overview of the Iraq Household Socio Economic Survey (IHSES) 2012:

    Iraq is considered a leader in household expenditure and income surveys: the first was conducted in 1946, followed by surveys in 1954 and 1961. After the establishment of the Central Statistical Organization, household expenditure and income surveys were carried out every 3-5 years (1971/1972, 1976, 1979, 1984/1985, 1988, 1993, 2002/2007). As part of the cooperation between the CSO and the World Bank (WB), the Central Statistical Organization (CSO) and the Kurdistan Region Statistics Office (KRSO) launched fieldwork on IHSES on 1/1/2012. The survey was carried out over a full year, covering all governorates including those in the Kurdistan Region.

    The survey has six main objectives. These objectives are:

    1. Provide data for poverty analysis and measurement, and monitor, evaluate and update the implementation of the Poverty Reduction National Strategy issued in 2009.
    2. Provide a comprehensive data system to assess household social and economic conditions and prepare indicators related to human development.
    3. Provide data that meet the needs and requirements of national accounts.
    4. Provide detailed indicators on consumption expenditure that support decision-making related to production, consumption, export and import.
    5. Provide detailed indicators on the sources of household and individual income.
    6. Provide the data necessary for formulating a new consumer price index.

    The raw survey data provided by the Statistical Office were then harmonized by the Economic Research Forum, to create a comparable version with the 2006/2007 Household Socio Economic Survey in Iraq. Harmonization at this stage only included unifying variables' names, labels and some definitions. See: Iraq 2007 & 2012- Variables Mapping & Availability Matrix.pdf provided in the external resources for further information on the mapping of the original variables on the harmonized ones, in addition to more indications on the variables' availability in both survey years and relevant comments.

    Geographic coverage

    National coverage: Covering a sample of urban, rural and metropolitan areas in all the governorates including those in Kurdistan Region.

    Analysis unit

    1- Household/family. 2- Individual/person.

    Universe

    The survey was carried out over a full year covering all governorates including those in Kurdistan Region.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    ----> Design:

    The sample size was 25,488 households for the whole of Iraq: 216 households in each of the 118 districts, drawn as 2,832 clusters of 9 households each, distributed across districts and governorates for rural and urban areas.

    ----> Sample frame:

    The listing and numbering results of the 2009-2010 Population and Housing Survey were adopted in all the governorates, including the Kurdistan Region, as the frame for selecting households. The sample was selected in two stages. Stage 1: primary sampling units (blocks) within each stratum (district), for urban and rural areas, were systematically selected with probability proportional to size, giving 2,832 units (clusters). Stage 2: 9 households were selected from each primary sampling unit to create a cluster. The total sample was therefore 25,488 households distributed across the governorates, with 216 households in each district.

    ----> Sampling Stages:

    In each district, the sample was selected in two stages. Stage 1: based on the 2010 listing and numbering frame, 24 sample points were selected within each stratum through systematic sampling with probability proportional to size, in addition to the implicit breakdown into urban and rural and the geographic breakdown (sub-district, quarter, street, county, village and block). Stage 2: using households as secondary sampling units, 9 households were selected from each sample point using systematic equal probability sampling. Sampling frames for each stage can be developed based on the 2010 building listing and numbering without updating household lists. In some small districts, random selection of primary sampling units may yield fewer than 24 units; in that case a sampling unit is selected more than once, and two or more clusters may be drawn from the same enumeration unit when necessary.
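
The sample-size figures quoted above follow from simple multiplication, and the Stage 1 selection can be illustrated with a generic systematic probability-proportional-to-size (PPS) routine. The sketch below is a textbook version, not the survey team's actual procedure; the function name, the block sizes and the fixed `start` value are assumptions made for the example.

```python
DISTRICTS = 118
CLUSTERS_PER_DISTRICT = 24
HOUSEHOLDS_PER_CLUSTER = 9

# Figures quoted in the sampling description.
total_clusters = DISTRICTS * CLUSTERS_PER_DISTRICT               # 2832
total_households = total_clusters * HOUSEHOLDS_PER_CLUSTER       # 25488
per_district = CLUSTERS_PER_DISTRICT * HOUSEHOLDS_PER_CLUSTER    # 216

def systematic_pps(sizes, n, start):
    """Select n units with probability proportional to size.

    sizes: size measure of each unit (e.g. households per block).
    start: fraction in [0, 1) of the sampling interval; random in practice,
           fixed here so the example is reproducible.
    Returns the indices of the selected units (repeats are possible).
    """
    interval = sum(sizes) / n
    targets = [(start + i) * interval for i in range(n)]
    picks, cumulative, t = [], 0, 0
    for index, size in enumerate(sizes):
        cumulative += size
        # Every target falling inside this unit's cumulative range selects it.
        while t < n and targets[t] < cumulative:
            picks.append(index)
            t += 1
    return picks

# Hypothetical block sizes, invented for illustration.
print(systematic_pps([10, 50, 20, 20], n=2, start=0.5))
# A unit larger than the sampling interval can be selected more than once.
print(systematic_pps([10, 90], n=2, start=0.5))
```

The second call shows the situation the text mentions for small districts: when one unit is larger than the sampling interval, it is drawn for more than one cluster.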

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    ----> Preparation:

    The questionnaire of the 2006 survey was adopted as the basis for the 2012 questionnaire, with many revisions. Two rounds of pre-tests were carried out. Revisions were made based on feedback from the fieldwork team, World Bank consultants and others, and further revisions were made before the version used in the pilot survey in September 2011. After the pilot survey, additional revisions were made based on the challenges and feedback that emerged during its implementation, producing the final version used in the actual survey.

    ----> Questionnaire Parts:

    The questionnaire consists of four parts, each with several sections:

    Part 1: Socio-Economic Data:

    • Section 1: Household Roster
    • Section 2: Emigration
    • Section 3: Food Rations
    • Section 4: Housing
    • Section 5: Education
    • Section 6: Health
    • Section 7: Physical measurements
    • Section 8: Job seeking and previous job

    Part 2: Monthly, Quarterly and Annual Expenditures:

    • Section 9: Expenditures on Non-Food Commodities and Services (past 30 days)
    • Section 10: Expenditures on Non-Food Commodities and Services (past 90 days)
    • Section 11: Expenditures on Non-Food Commodities and Services (past 12 months)
    • Section 12: Expenditures on Non-food Frequent Food Stuff and Commodities (7 days)
    • Section 12, Table 1: Meals Had Within the Residential Unit
    • Section 12, Table 2: Number of Persons Participate in the Meals within Household Expenditure Other Than its Members

    Part 3: Income and Other Data:

    • Section 13: Job
    • Section 14: Paid jobs
    • Section 15: Agriculture, forestry and fishing
    • Section 16: Household non-agricultural projects
    • Section 17: Income from ownership and transfers
    • Section 18: Durable goods
    • Section 19: Loans, advances and subsidies
    • Section 20: Shocks and strategy of dealing in the households
    • Section 21: Time use
    • Section 22: Justice
    • Section 23: Satisfaction in life
    • Section 24: Food consumption during past 7 days

    Part 4: Diary of Daily Expenditures: The expenditure diary is an essential component of this survey. It is left with the household to record all daily purchases, such as expenditures on food and frequent non-food items (gasoline, newspapers, etc.), over 7 days. Two pages were allocated for recording each day's expenditures, so the roster consists of 14 pages.

    Cleaning operations

    ----> Raw Data:

    Data Editing and Processing: To ensure accuracy and consistency, the data were edited at the following stages:

    1. Interviewer: Checks all answers on the household questionnaire, confirming that they are clear and correct.
    2. Local Supervisor: Checks to make sure that questions have been correctly completed.
    3. Statistical analysis: After exporting data files from Excel to SPSS, the Statistical Analysis Unit uses program commands to identify irregular or non-logical values in addition to auditing some variables.
    4. World Bank consultants in coordination with the CSO data management team: The World Bank technical consultants use additional programs in SPSS and Stata to examine and correct remaining inconsistencies within the data files. The software detects errors by analyzing questionnaire items according to the expected parameter for each variable.

    ----> Harmonized Data:

    • The SPSS package is used to harmonize the Iraq Household Socio Economic Survey (IHSES) 2007 with Iraq Household Socio Economic Survey (IHSES) 2012.
    • The harmonization process starts with raw data files received from the Statistical Office.
    • A program is generated for each dataset to create harmonized variables.
    • Data are saved at the household and individual level in SPSS and then converted to Stata for dissemination.

    Response rate

    The Iraq Household Socio Economic Survey (IHSES) reached a total of 25488 households. The number of households that refused to respond was 305, and the response rate was 98.6%. The highest interview rates were in Ninevah and Muthanna (100%), while the lowest rate was in Sulaimaniya (92%).

  6. 2015 Municipal and Industrial Water Use Databases

    • opendata.utah.gov
    application/rdfxml +5
    Updated Aug 20, 2022
    Cite
    (2022). 2015 Municipal and Industrial Water Use Databases [Dataset]. https://opendata.utah.gov/dataset/2015-Municipal-and-Industrial-Water-Use-Databases/hbit-64ni
    Available download formats: json, csv, tsv, application/rssxml, xml, application/rdfxml
    Dataset updated
    Aug 20, 2022
    Description

    Water use and supply data for 2015 joined to spatial boundaries. GPCD = Gallons Per Capita Day or Gallons Per Person Per Day. Supply and Use numbers are in Acre Feet Per Year (ACFT).
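
Because use is reported in acre-feet per year while GPCD is a per-person daily figure, relating the two takes a unit conversion. The sketch below is a minimal illustration, assuming the standard conversion of roughly 325,851 US gallons per acre-foot and a 365-day year. The function name and the example system (1,120 acre-feet per year serving 5,000 people) are hypothetical and are not drawn from the database.

```python
GALLONS_PER_ACRE_FOOT = 325_851  # US gallons in one acre-foot (rounded)

def gpcd(use_acft_per_year, population, days_per_year=365):
    """Convert annual use in acre-feet to gallons per capita per day."""
    gallons_per_year = use_acft_per_year * GALLONS_PER_ACRE_FOOT
    return gallons_per_year / (population * days_per_year)

# Hypothetical system: 1,120 acre-feet per year serving 5,000 people.
example = gpcd(1120, 5000)
print(round(example))  # roughly 200 gallons per person per day
```

The database may compute its own GPCD columns differently (for example, by using a different population basis), so treat this only as a check on the units.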


    This database contains municipal, institutional, commercial and industrial water use data gathered by the Utah Division of Water Rights for the 2015 calendar year. The Utah Division of Water Resources has analyzed water use data every five years since 1990; however, this new 2015 dataset marks a significant methodologic and data accuracy milestone.

    The updated and improved methodology is based on recommendations from a 2015 Legislative Audit, 2017 Legislative Audit Update and a 2018 third party analysis of our processes. All recommendations necessary for this data release have been implemented. Changes in recommended secondary water use estimate inputs, as well as the transfer of second homes from the commercial category to the residential category, are examples of updates that impact categorical or total use estimates.

    While we are encouraged by the improvements, these changes make comparing the 2015 numbers to past water use data problematic due to the significant methodology differences. As a result, we will be using the 2015 data as the new baseline for comparison and planning moving forward. The audit reports and third party recommendations can be found at: https://dwre-utahdnr.opendata.arcgis.com/pages/municipal-and-industrial.

    Likewise, comparisons from region to region within Utah are problematic due to differences in climate, number of vacation homes and other factors. Comparisons between Utah’s water use numbers and data from other states have little value given there is no nationally consistent methodology standard for analyzing and reporting water use numbers.

    It should be noted that administrative processes were changed in 2016 to ensure community water system data corrections are updated in the Utah Division of Water Rights’ database and website; however, these updated processes did not occur for the 2015 data. As a result, the data released in this database will often differ from what is reflected on the Utah Division of Water Rights’ website. That said, this data underwent both legislative auditor and third party review, and our division is confident that it is reflective of regional water use and useful for planning purposes.

    Utah’s Open Water Data Portal can be found at https://dwre-utahdnr.opendata.arcgis.com/. The division believes that data accessibility and transparency are vital as water decisions become more complicated and critical.

  7. HadISD: Global sub-daily, surface meteorological station data, 1931-2022,...

    • data-search.nerc.ac.uk
    • catalogue.ceda.ac.uk
    Updated Jul 24, 2021
    Cite
    (2021). HadISD: Global sub-daily, surface meteorological station data, 1931-2022, v3.3.0.2022f [Dataset]. https://data-search.nerc.ac.uk/geonetwork/srv/search?keyword=dewpoint
    Dataset updated
    Jul 24, 2021
    Description

    This is version v3.3.0.2022f of the Met Office Hadley Centre's Integrated Surface Database, HadISD. These data are global sub-daily surface meteorological data. The quality-controlled variables in this dataset are: temperature, dewpoint temperature, sea-level pressure, wind speed and direction, and cloud data (total, low, mid and high level). Past significant weather and precipitation data are also included, but have not been quality controlled, so their quality and completeness cannot be guaranteed. Quality control flags and data values that were removed during the quality control process are provided in the qc_flags and flagged_values fields, and ancillary data files provide the station listing with IDs, names and location information.

    The data are provided as one NetCDF file per station. Station data files in the station_data folder have the format "station_code"_HadISD_HadOBS_19310101-20230101_v3.3.1.2022f.nc. The station codes can be found under the docs tab. The station codes file has five columns as follows: 1) station code, 2) station name, 3) station latitude, 4) station longitude, 5) station height.

    To keep informed about updates, news and announcements, follow the HadOBS team on Twitter @metofficeHadOBS. For more detailed information, e.g. bug fixes, routine updates and other exploratory analysis, see the HadISD blog: http://hadisd.blogspot.co.uk/

    References: When using the dataset in a paper you must cite the following papers (see Docs for link to the publications) and this dataset (using the "citable as" reference):

    • Dunn, R. J. H. (2019), HadISD version 3: monthly updates, Hadley Centre Technical Note.
    • Dunn, R. J. H., Willett, K. M., Parker, D. E., and Mitchell, L.: Expanding HadISD: quality-controlled, sub-daily station data from 1931, Geosci. Instrum. Method. Data Syst., 5, 473-491, doi:10.5194/gi-5-473-2016, 2016.
    • Dunn, R. J. H., et al. (2012), HadISD: A Quality Controlled global synoptic report database for selected variables at long-term stations from 1973-2011, Clim. Past, 8, 1649-1679, 2012, doi:10.5194/cp-8-1649-2012
    • Smith, A., N. Lott, and R. Vose, 2011: The Integrated Surface Database: Recent Developments and Partnerships. Bulletin of the American Meteorological Society, 92, 704–708, doi:10.1175/2011BAMS3015.1

    For a homogeneity assessment of HadISD, please see the following reference:

    • Dunn, R. J. H., K. M. Willett, C. P. Morice, and D. E. Parker. "Pairwise homogeneity assessment of HadISD." Climate of the Past 10, no. 4 (2014): 1501-1522. doi:10.5194/cp-10-1501-2014, 2014.
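
The per-station file-naming pattern quoted in the description can be built and taken apart with a couple of small helpers. This is only a sketch: the helper names are made up, the default period and version strings are copied from the pattern in the description, and the station code `010010-99999` is an illustrative value that should be replaced with a code from the station codes file.

```python
def hadisd_filename(station_code,
                    period="19310101-20230101",
                    version="v3.3.1.2022f"):
    """Build a station file name following the pattern in the description."""
    return f"{station_code}_HadISD_HadOBS_{period}_{version}.nc"

def parse_hadisd_filename(filename):
    """Split a station file name back into its code, period and version."""
    if not filename.endswith(".nc"):
        raise ValueError("expected a NetCDF (.nc) file name")
    stem = filename[:-3]
    # Split from the right so a station code containing "_" would stay intact.
    station_code, _, _, period, version = stem.rsplit("_", 4)
    return {"station_code": station_code, "period": period, "version": version}

# Illustrative station code, not checked against the station codes file.
name = hadisd_filename("010010-99999")
print(name)
print(parse_hadisd_filename(name))
```

The resulting file can then be opened with any NetCDF reader; the qc_flags and flagged_values fields mentioned in the description are where the quality-control information is stored.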

  8. Comparative Analysis of Data-Driven Anomaly Detection Methods

    • data.nasa.gov
    • s.cnmilf.com
    • +2 more
    Updated Mar 31, 2025
    Cite
    nasa.gov (2025). Comparative Analysis of Data-Driven Anomaly Detection Methods [Dataset]. https://data.nasa.gov/dataset/comparative-analysis-of-data-driven-anomaly-detection-methods
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    This paper provides a review of three different advanced machine learning algorithms for anomaly detection in continuous data streams from a ground-test firing of a subscale Solid Rocket Motor (SRM). This study compares Orca, one-class support vector machines, and the Inductive Monitoring System (IMS) for anomaly detection on the data streams. We measure the performance of the algorithms with respect to the detection horizon for situations where fault information is available. These algorithms have also been studied by the present authors (and other co-authors) as applied to liquid propulsion systems. The trade space will be explored between these algorithms for both types of propulsion systems.

  9. ‘Hitters Baseball Data’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Sep 30, 2021
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘Hitters Baseball Data’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-hitters-baseball-data-00a7/90da49b5/?iid=020-554&v=presentation
    Dataset updated
    Sep 30, 2021
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Hitters Baseball Data’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/mathchi/hitters-baseball-data on 30 September 2021.

    --- Dataset description provided by original source is as follows ---

    Baseball Data

    Description

    Major League Baseball Data from the 1986 and 1987 seasons.

    Usage

    Hitters

    Format

    A data frame with 322 observations of major league players on the following 20 variables.

    • AtBat: Number of times at bat in 1986

    • Hits: Number of hits in 1986

    • HmRun: Number of home runs in 1986

    • Runs: Number of runs in 1986

    • RBI: Number of runs batted in in 1986

    • Walks: Number of walks in 1986

    • Years: Number of years in the major leagues

    • CAtBat: Number of times at bat during his career

    • CHits: Number of hits during his career

    • CHmRun: Number of home runs during his career

    • CRuns: Number of runs during his career

    • CRBI: Number of runs batted in during his career

    • CWalks: Number of walks during his career

    • League: A factor with levels A and N indicating player's league at the end of 1986

    • Division: A factor with levels E and W indicating player's division at the end of 1986

    • PutOuts: Number of put outs in 1986

    • Assists: Number of assists in 1986

    • Errors: Number of errors in 1986

    • Salary: 1987 annual salary on opening day in thousands of dollars

    • NewLeague: A factor with levels A and N indicating player's league at the beginning of 1987

    Source

    This dataset was taken from the StatLib library which is maintained at Carnegie Mellon University. This is part of the data that was used in the 1988 ASA Graphics Section Poster Session. The salary data were originally from Sports Illustrated, April 20, 1987. The 1986 and career statistics were obtained from The 1987 Baseball Encyclopedia Update published by Collier Books, Macmillan Publishing Company, New York.

    References

    James, G., Witten, D., Hastie, T., and Tibshirani, R. (2013) An Introduction to Statistical Learning with applications in R, www.StatLearning.com, Springer-Verlag, New York

    Examples

    library(ISLR)  # assumption: the Hitters data frame is loaded from the ISLR package

    summary(Hitters)

    lm(Salary ~ AtBat + Hits, data = Hitters)

    Dataset imported from https://www.r-project.org.

    --- Original source retains full ownership of the source dataset ---

  10. MBC Impact and Risk Analysis Database v01

    • cloud.csiss.gmu.edu
    • researchdata.edu.au
    • +2 more
    Updated Dec 13, 2019
    Cite
    Australia (2019). MBC Impact and Risk Analysis Database v01 [Dataset]. https://cloud.csiss.gmu.edu/uddi/dataset/69075f3e-67ba-405b-8640-96e6cb2a189a
    Dataset updated
    Dec 13, 2019
    Dataset provided by
    Australia
    Description

    Abstract

    The Maranoa-Balonne-Condamine Impact and Risk Analysis Database (Analysis Database) is a fit-for-purpose geospatial information system developed for the Impact and Risk Analysis (Component 3-4) products of the Bioregional Assessment Technical Programme (BATP).

    The version provided here for public download has been slightly modified to remove restricted material such as the co-ordinates of protected or threatened species. This version was used to populate BA Explorer.

    The Analysis Database brings together many of the data sets used in Components 1 and 2 of the assessments and includes hydrology and hydrogeology modelling results, landscape classes and economic, sociocultural and ecological assets. These data sets are listed in the Component 1 and 2 products under the Assessments tab in http://www.bioregionalassessments.gov.au/.

    An Analysis Database of common design and schema was implemented for each subregion where a full Impact and Risk Analysis was completed. To populate each database, input datasets were transformed, normalised and inserted into their respective Analysis Databases in accord with the common design and schema. The approach enabled the universal treatment of data analysis across all bioregions despite data being of different specifications and origins.

    The Analysis Database includes all the data used for the assessment of the subregion with the exception of those datasets that were not provided to the program with an open access licence. The database is constructed using the Open Source platform PostgreSQL coupled with PostGIS. This technology was considered to better enable the provenance and transparency requirements of the Programme. The files provided here have been prepared using the PostgreSQL version 9.5 SQL Dump function - pg_dump.

    A detailed description of the Analysis Database, its design, structure and application is provided in the supporting documentation: http://data.bioregionalassessments.gov.au/dataset/05e851cf-57a5-4127-948a-1b41732d538c

    Purpose

    The Maranoa-Balonne-Condamine Impact and Risk Analysis Database (Analysis Database) is the geospatial database for completing the Impact and Risk Analysis component of the Maranoa-Balonne-Condamine Bioregional Assessment. This includes the creation of the results, tables and maps that appear in the relevant Products of each assessment. The database also manages the data used by the BA Explorer.

    An individual instance of the Analysis Database was developed for each subregion where a Component 3-4 Impact and Risk Analysis was conducted. With the exception of the subregion-specific data contained within it and the removal of restricted data records, each Analysis Database is of identical design and structure.

    Dataset History

    This Analysis Database is an instance of PostgreSQL version 9.5 hosted on Red Hat Enterprise Linux version 4.8.5-4. PostgreSQL geospatial capabilities are provided by PostGIS version 2.2.

    Data pre-processing and upload into each PostgreSQL database were completed using FME Desktop (Oracle Edition) version 2016.1.2.1. Analysis data and results are provided to users and systems via the geospatial services of Geoserver version 2.9.1. Scientific analysis and mapping were undertaken by connecting a range of data using a combination of Microsoft Excel, QGIS and ArcMap systems.

    During the Programme and for its working life, the Analysis Database was hosted and managed on instances of Amazon Web Services managed by Geoscience Australia and the Bureau of Meteorology.

    Dataset Citation

    Bioregional Assessment Programme (2017) MBC Impact and Risk Analysis Database v01. Bioregional Assessment Derived Dataset. Viewed 25 October 2017, http://data.bioregionalassessments.gov.au/dataset/69075f3e-67ba-405b-8640-96e6cb2a189a.

    Dataset Ancestors

  11. National Energy Efficiency Data-Framework (NEED) report: summary of analysis...

    • gov.uk
    • s3.amazonaws.com
    Updated Aug 11, 2023
    + more versions
    Cite
    Department for Business, Energy & Industrial Strategy (2023). National Energy Efficiency Data-Framework (NEED) report: summary of analysis 2021 [Dataset]. https://www.gov.uk/government/statistics/national-energy-efficiency-data-framework-need-report-summary-of-analysis-2021
    Explore at:
    Dataset updated
    Aug 11, 2023
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Department for Business, Energy & Industrial Strategy
    Description

    The National Energy Efficiency Data-Framework (NEED) was set up to provide a better understanding of energy use and energy efficiency in domestic and non-domestic buildings in Great Britain. The framework matches together data about each property, including energy consumption and the energy efficiency measures installed, at household level.

    11 August 2023 Error notice: revisions to the June 2021 Domestic NEED annual report

    We identified 2 processing errors in this edition of the Domestic NEED Annual report and corrected them. The changes are small and do not affect the overall findings of the report, only the domestic energy consumption estimates. The revisions are summarised here:

    Error 1: Local authority consumption estimates

    Error 2: Some properties incorrectly excluded from the Scotland multiple attributes tables

    • Extent of the error: These corrections primarily affect the number in sample column for all years as some properties were incorrectly excluded from the consumption estimates. There have also been revisions to the mean, median, upper and lower quartiles. Using 2019 as an example, around 80% of the updated mean and median values are within 300 kWh of what was previously published.
    • Years affected: 2017-2019
    • Countries affected: Scotland
    • Data tables affected: Multiple attributes tables: Scotland, 2019 (all tables)

    4 August 2021 Error notice: revisions to the June 2021 Domestic NEED annual report

    We identified 2 processing errors in this edition of the Domestic NEED Annual report and corrected them. The changes are small and do not affect the overall findings of the report, only the domestic energy consumption estimates. The impact of energy efficiency measures analysis remains unchanged. The revisions are summarised here:

    Error 1: Some properties incorrectly excluded from the 2019 gas consumption estimates

  12. Global Real-time Database Market Research and Development Focus 2025-2032

    • statsndata.org
    excel, pdf
    Updated Jun 2025
    Cite
    Stats N Data (2025). Global Real-time Database Market Research and Development Focus 2025-2032 [Dataset]. https://www.statsndata.org/report/real-time-database-market-337229
    Explore at:
    pdf, excel (available download formats)
    Dataset updated
    Jun 2025
    Dataset authored and provided by
    Stats N Data
    License

    https://www.statsndata.org/how-to-order

    Area covered
    Global
    Description

    The Real-time Database market has emerged as a pivotal component in the modern digital landscape, facilitating instant data retrieval and updates across a multitude of applications. This dynamic sector serves diverse industries, including e-commerce, healthcare, finance, and gaming, where the demand for immediate ac

  13. A

    Forest Inventory and Analysis Database

    • data.amerigeoss.org
    • agdatacommons.nal.usda.gov
    • +9more
    html, xml
    Updated Jan 5, 2022
    Cite
    United States (2022). Forest Inventory and Analysis Database [Dataset]. https://data.amerigeoss.org/dataset/forest-inventory-and-analysis-database-fb721
    Explore at:
    html, xml (available download formats)
    Dataset updated
    Jan 5, 2022
    Dataset provided by
    United States
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Forest Inventory and Analysis (FIA) research program has been in existence since mandated by Congress in 1928. FIA's primary objective is to determine the extent, condition, volume, growth, and depletion of timber on the Nation's forest land. Before 1999, all inventories were conducted on a periodic basis. The passage of the 1998 Farm Bill requires FIA to collect data annually on plots within each State. This kind of up-to-date information is essential to frame realistic forest policies and programs. Summary reports for individual States are published but the Forest Service also provides data collected in each inventory to those interested in further analysis. Data is distributed via the FIA DataMart in a standard format. This standard format, referred to as the Forest Inventory and Analysis Database (FIADB) structure, was developed to provide users with as much data as possible in a consistent manner among States. A number of inventories conducted prior to the implementation of the annual inventory are available in the FIADB. However, various data attributes may be empty or the items may have been collected or computed differently. Annual inventories use a common plot design and common data collection procedures nationwide, resulting in greater consistency among FIA work units than earlier inventories. Links to field collection manuals and the FIADB user's manual are provided in the FIA DataMart.

  14. d

    Hydrologic Derivatives for Modeling and Analysis (HDMA) database -- South...

    • catalog.data.gov
    • data.usgs.gov
    • +3more
    Updated Jul 6, 2024
    + more versions
    Cite
    U.S. Geological Survey (2024). Hydrologic Derivatives for Modeling and Analysis (HDMA) database -- South America [Dataset]. https://catalog.data.gov/dataset/hydrologic-derivatives-for-modeling-and-analysis-hdma-database-south-america
    Explore at:
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    South America
    Description

    This contains the South American portion of the Hydrologic Derivatives for Modeling and Analysis (HDMA) database. The HDMA database provides comprehensive and consistent global coverage of raster and vector topographically derived layers, including raster layers of digital elevation model (DEM) data, flow direction, flow accumulation, slope, and compound topographic index (CTI); and vector layers of streams and catchment boundaries. The coverage of the data is global (-180°, 180°, -90°, 90°) with the underlying DEM being a hybrid of three datasets: HydroSHEDS (Hydrological data and maps based on SHuttle Elevation Derivatives at multiple Scales), Global Multi-resolution Terrain Elevation Data 2010 (GMTED2010) and the Shuttle Radar Topography Mission (SRTM). For most of the globe south of 60° North, the raster resolution of the data is 3 arc-seconds, corresponding to the resolution of the SRTM. For the areas north of 60°, the resolution is 7.5 arc-seconds (the finest resolution of the GMTED2010 dataset) except for Greenland, where the resolution is 30 arc-seconds. The streams and catchments are attributed with Pfafstetter codes, based on a hierarchical numbering system, that carry important topological information.

  15. JNCC Sentinel-1 indices Analysis Ready Data (ARD) Radar Vegetation Index...

    • catalogue.ceda.ac.uk
    Updated Dec 2, 2021
    + more versions
    Cite
    Joint Nature Conservation Committee (JNCC) (2021). JNCC Sentinel-1 indices Analysis Ready Data (ARD) Radar Vegetation Index (RVI) [Dataset]. https://catalogue.ceda.ac.uk/uuid/22ae54ba3ab14ce8aa6a5271dfddaeb3
    Explore at:
    Dataset updated
    Dec 2, 2021
    Dataset provided by
    Centre for Environmental Data Analysis (http://www.ceda.ac.uk/)
    Authors
    Joint Nature Conservation Committee (JNCC)
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Area covered
    Description

    These data have been created by the Joint Nature Conservation Committee (JNCC) as part of a Defra NCEA project to produce a regional, and ultimately national, system for detecting a change in habitat conditions at a land parcel level. The first stage of the project is focused on Yorkshire, UK, and therefore the dataset includes granules and scenes covering Yorkshire and surrounding areas only. The dataset contains the following indices derived from Defra and JNCC Sentinel-1 Analysis Ready Data.

    RVI and RVIv files are generated for Sentinel-1 orbit 132 (ascending) every 12 days.

    Indices have been generated using the Defra and JNCC Sentinel-1 and Sentinel-2 ARD for the granules and scenes described above. As the project continues, JNCC will expand the geographical coverage of this dataset and will provide continuous updates as ARD becomes available.

  16. f

    Data from: Critical Analysis of CCSD Data Quality

    • figshare.com
    zip
    Updated Jun 3, 2023
    Cite
    K. S. Egorova; Ph. V. Toukach (2023). Critical Analysis of CCSD Data Quality [Dataset]. http://doi.org/10.1021/ci3002815.s001
    Explore at:
    zip (available download format)
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    ACS Publications
    Authors
    K. S. Egorova; Ph. V. Toukach
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Area covered
    Clark County School District
    Description

    Systematization and classification of carbohydrates contribute greatly to the development of modern biomedical sciences. CCSD (CarbBank) data constitute a significant part of nearly all existing carbohydrate databases. However, these data have not been verified since their original deposit. During the expansion of the Bacterial Carbohydrate Structure Database (BCSDB) project, we checked CCSD data quality and found that about 35% of records contained errors. The CCSD data cannot be used without manual verification, while CCSD errors migrate from database to database.

  17. G

    Graph Database Market Report

    • promarketreports.com
    doc, pdf, ppt
    Updated Jan 6, 2025
    Cite
    Pro Market Reports (2025). Graph Database Market Report [Dataset]. https://www.promarketreports.com/reports/graph-database-market-8060
    Explore at:
    doc, pdf, ppt (available download formats)
    Dataset updated
    Jan 6, 2025
    Dataset authored and provided by
    Pro Market Reports
    License

    https://www.promarketreports.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The size of the Graph Database Market was valued at USD 19942.01 million in 2023 and is projected to reach USD 64282.28 million by 2032, with an expected CAGR of 18.20% during the forecast period. The growth is attributed to factors such as increased data complexity, the need for real-time insights, and advancements in AI and ML.

    A Graph Database is a type of NoSQL database designed to represent and store data in the form of graphs, consisting of nodes, edges, and properties. This database model is optimized for handling highly interconnected data, allowing relationships and networks to be represented with ease. The nodes in a graph database represent entities such as people, places, or events, while the edges represent the relationships or connections between these entities. Properties can be attached to both nodes and edges to store additional information, providing a rich structure for complex data sets. Unlike traditional relational databases, which use tables to organize data in rows and columns, graph databases use graph theory to model the relationships between data points, which enables more efficient querying and analysis, especially for large and complex data structures. Graph databases provide efficient storage and analysis of highly interconnected data, making them valuable for fraud detection, social network analysis, and recommendation systems.

    Key players include Oracle Corporation, IBM Corporation, and Amazon Web Services, Inc. Recent developments include:

    • June 2021: Neo4j released version 4.3 of its graph database. Features in this version include graph data analysis, relationship asset indexes, new smart IO scheduling, and parallelized backup.
    • April 2021: MarkLogic Corp. introduced the MarkLogic Data Hub Central low-code/no-code user interface. With the ease and agility of using the data infrastructure, MarkLogic's launch provides organizations with a clear roadmap for cloud modernization.
    • October 2020: Microsoft Corporation unveiled a new artificial intelligence platform that can caption and describe photos. The system is offered through Azure Cognitive Services.
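
    The node, edge and property model described in this entry can be sketched in a few lines. The following is a minimal in-memory illustration, not the storage model of any particular product; the class names, labels and sample data are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    """An entity (person, place, event) with free-form properties."""
    node_id: str
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    """A directed, labelled relationship that can carry its own properties."""
    source: str
    target: str
    label: str
    properties: Dict[str, Any] = field(default_factory=dict)


class PropertyGraph:
    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}
        self.edges: List[Edge] = []

    def add_node(self, node_id: str, **properties: Any) -> None:
        self.nodes[node_id] = Node(node_id, dict(properties))

    def add_edge(self, source: str, target: str, label: str, **properties: Any) -> None:
        # Reject edges whose endpoints have not been added as nodes.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.append(Edge(source, target, label, dict(properties)))

    def neighbours(self, node_id: str, label: str) -> List[str]:
        # Follow outgoing edges with the given label (linear scan, for clarity).
        return [e.target for e in self.edges
                if e.source == node_id and e.label == label]


# Hypothetical sample data.
graph = PropertyGraph()
graph.add_node("alice", kind="person")
graph.add_node("bob", kind="person")
graph.add_node("paris", kind="place")
graph.add_edge("alice", "bob", "KNOWS", since=2020)
graph.add_edge("alice", "paris", "VISITED")
```

    Here a relationship query is a traversal over edges rather than a join between tables, which is the contrast with relational databases that the description draws. A real graph database would index the edges instead of scanning a list.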

  18. f

    Comparability of Mixed IC50 Data – A Statistical Analysis

    • plos.figshare.com
    docx
    Updated May 31, 2023
    Cite
    Tuomo Kalliokoski; Christian Kramer; Anna Vulpetti; Peter Gedeck (2023). Comparability of Mixed IC50 Data – A Statistical Analysis [Dataset]. http://doi.org/10.1371/journal.pone.0061007
    Explore at:
    docx (available download format)
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Tuomo Kalliokoski; Christian Kramer; Anna Vulpetti; Peter Gedeck
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The biochemical half maximal inhibitory concentration (IC50) is the most commonly used metric for on-target activity in lead optimization. It is used to guide lead optimization and to build large-scale chemogenomics analyses and off-target activity and toxicity models based on public data. However, the use of public biochemical IC50 data is problematic, because they are assay specific and comparable only under certain conditions. For large-scale analysis it is not feasible to check each data entry manually, and it is very tempting to mix all available IC50 values from public databases even if assay information is not reported. As previously reported for Ki database analysis, we first analyzed the types of errors, the redundancy and the variability that can be found in the ChEMBL IC50 database. To assess the variability of IC50 data independently measured in two different labs, we searched ChEMBL for cases with at least ten IC50 values for identical protein-ligand systems against the same target. As an insufficient number of such cases is available, the variability of IC50 data was assessed by comparing all pairs of independent IC50 measurements on identical protein-ligand systems. The standard deviation of IC50 data is only 25% larger than the standard deviation of Ki data, suggesting that mixing IC50 data from different assays, even without knowing the details of the assay conditions, only adds a moderate amount of noise to the overall data. The standard deviation of public ChEMBL IC50 data was, as expected, greater than the standard deviation of in-house intra-laboratory/inter-day IC50 data. Augmenting mixed public IC50 data with public Ki data does not deteriorate the quality of the mixed IC50 data, if the Ki is corrected by an offset. For a broad dataset such as the ChEMBL database, a Ki-IC50 conversion factor of 2 was found to be the most reasonable.
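
    To make the closing remark about the conversion factor concrete, the sketch below applies a factor of 2 to turn a Ki value into an IC50-comparable value. It assumes the factor is applied as IC50 ≈ 2 × Ki, which on the negative log10 scale is an offset of log10(2), about 0.3 units; the abstract does not spell out the direction, so treat that as an assumption. The function names are hypothetical.

```python
import math

# Conversion factor quoted in the abstract for a broad dataset such as ChEMBL.
KI_TO_IC50_FACTOR = 2.0


def ki_to_ic50(ki: float, factor: float = KI_TO_IC50_FACTOR) -> float:
    """Scale a Ki value into an IC50-comparable value (same concentration unit).

    ASSUMPTION: the factor is applied as IC50 ~= factor * Ki.
    """
    if ki <= 0:
        raise ValueError("Ki must be a positive concentration")
    return factor * ki


def pki_to_pic50(pki: float, factor: float = KI_TO_IC50_FACTOR) -> float:
    """Apply the same correction on the -log10 scale, where it is an offset."""
    return pki - math.log10(factor)


if __name__ == "__main__":
    print(ki_to_ic50(10.0))    # a Ki of 10 nM, expressed on the IC50 scale
    print(pki_to_pic50(8.0))   # the same correction applied to a pKi of 8
```

    Working on the log scale turns the multiplicative factor into a constant offset, which matches the abstract's wording that the Ki is "corrected by an offset".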

  19. INTEGRAL Public Data Results Catalog

    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    • s.cnmilf.com
    • +2more
    Updated Mar 7, 2025
    Cite
    nasa.gov (2025). INTEGRAL Public Data Results Catalog [Dataset]. https://data.staging.idas-ds1.appdat.jsc.nasa.gov/dataset/integral-public-data-results-catalog
    Explore at:
    Dataset updated
    Mar 7, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    The INTEGRAL Public Data Results Catalog is based on publicly available data from the two main instruments (IBIS and SPI) on board INTEGRAL (see Winkler et al. 2003, A&A, 411, L1 for a description of the INTEGRAL spacecraft and instrument packages). INTEGRAL began collecting data in October 2002. This catalog will be regularly updated as data become public (~14 months after they are obtained). This catalog is a collaborative effort between the INTEGRAL Science Data Center (ISDC) in Switzerland and the NASA Goddard Space Flight Center (GSFC) INTEGRAL Guest Observer Facility (GOF). The results presented here come from a semi-automated analysis and should be considered approximate: they are intended to serve as a guideline to those interested in pursuing more detailed follow-up analyses. The data from the imager ISGRI (Lebrun et al. 2003, A&A, 411, L141) have been analyzed at the INTEGRAL Science Data Centre (ISDC), while the SPI (Vedrenne et al. 2003, A&A, 411, L63) data analysis was performed at GSFC as a service of the INTEGRAL GOF. Note: For cases where two or more proposals have been amalgamated (entries with pi_lname = 'Amalgamated') for a given observation, the same observation is listed for each of the amalgamated proposal numbers. This database table was first created in September 2004. It is based on the online web page maintained by the INTEGRAL GOF at the URL http://heasarc.gsfc.nasa.gov/docs/integral/obslist.html and was updated on a weekly basis whenever that web page was updated. Automatic updates were discontinued in June 2019. Duplicate entries were also removed in June 2019. This is a service provided by NASA HEASARC.

  20. m

    Data from: Project IPAD, a database to catalogue the analysis of Fukushima...

    • data.mendeley.com
    • narcis.nl
    Updated Jul 15, 2020
    + more versions
    Cite
    Peter Martin (2020). Project IPAD, a database to catalogue the analysis of Fukushima Daiichi accident fragmental release material [Dataset]. http://doi.org/10.17632/nz6hjbfs65.3
    Explore at:
    Dataset updated
    Jul 15, 2020
    Authors
    Peter Martin
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    For the full database, please visit: www.projectipad.org

    The 2011 accident at Japan’s Fukushima Daiichi Nuclear Power Plant released a considerable inventory of radioactive material into the local and global environments. While the vast majority of this contamination was in the form of gaseous and aerosol species, of which a large component was distributed out over the neighbouring Pacific Ocean (where it was subsequently deposited), a substantial portion of the radioactive release was in particulate form and was deposited across Fukushima Prefecture. To provide an underpinning understanding of the dynamics of this catastrophic accident, alongside assisting in the off-site remediation and eventual reactor decommissioning activities, the ‘International Particle Analysis Database’, or ‘IPAD’, was established to serve as an interactive repository for the continually expanding analysis dataset of the sub-mm ejecta particulate. In addition to a fully interrogatable database of analysis results for registered users (exploiting multiple search methods), the database also comprises an open-access front-end for members of the public to engage with the multi-national analysis activities by exploring a streamlined version of the data.
