87 datasets found
  1. Data from: Ordered correlation forest

    • tandf.figshare.com
    Updated Feb 10, 2025
    Cite
    Riccardo Di Francesco (2025). Ordered correlation forest [Dataset]. http://doi.org/10.6084/m9.figshare.28218061.v1
    Available download formats
    txt
    Dataset updated
    Feb 10, 2025
    Dataset provided by
    Taylor & Francis
    Authors
    Riccardo Di Francesco
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Empirical studies in various social sciences often involve categorical outcomes with inherent ordering, such as self-evaluations of subjective well-being and self-assessments in health domains. While ordered choice models, such as the ordered logit and ordered probit, are popular tools for analyzing these outcomes, they may impose restrictive parametric and distributional assumptions. This article introduces a novel estimator, the ordered correlation forest, that can naturally handle nonlinearities in the data and does not assume a specific error term distribution. The proposed estimator modifies a standard random forest splitting criterion to build a collection of forests, each estimating the conditional probability of a single class. Under an “honesty” condition, predictions are consistent and asymptotically normal. The weights induced by each forest are used to obtain standard errors for the predicted probabilities and the covariates’ marginal effects. Evidence from synthetic data shows that the proposed estimator achieves better prediction performance than alternative forest-based estimators and can construct valid confidence intervals for the covariates’ marginal effects. Comparisons using various real-world data sets further highlight the advantages of forest-based estimators over parametric models in larger samples while showing that the ordered correlation forest remains competitive in smaller samples.

  2. Virginia Springs/Groundwater Layers - 2023

    • data.virginia.gov
    • hub.arcgis.com
    • +3 more
    Updated Jul 29, 2025
    Cite
    Virginia Department of Environmental Quality (2025). Virginia Springs/Groundwater Layers - 2023 [Dataset]. https://data.virginia.gov/dataset/virginia-springs-groundwater-layers-2023
    Available download formats
    html, ArcGIS GeoServices REST API
    Dataset updated
    Jul 29, 2025
    Dataset authored and provided by
    Virginia Department of Environmental Quality https://deq.virginia.gov/
    Area covered
    Hot Springs
    Description
    The VDEQ Spring SITES database contains data describing the geographic locations and site attributes of natural springs throughout the commonwealth. This data coverage continues to evolve and contains only spring locations known to exist with a reasonable degree of certainty on the date of publication. The dataset does not replace site-specific inventorying or receptor surveys but can be used as a starting point. VDEQ's initial geospatial dataset of approximately 325 springs was formed in 2008 by digitizing historical spring information sheets created by State Water Control Board geologists in the 1970s through early 1990s. Additional data has been consolidated from the EPA STORET database, the U.S. Geological Survey's Ground Water Site Inventory (GWSI) and Geographic Names Information System (GNIS), the Virginia Department of Health SDWIS database, the Virginia DEQ Virginia Water Use Data Set (VWUDS), the Commonwealth of Virginia Division of Water Resources and Power Bulletin No. 1: "Springs of Virginia" by Collins et al., 1930, as well as several VDWR&P Surface Water Supply bulletins from the 1940s and 1950s. A 1992 Virginia Department of Game and Inland Fisheries / Virginia Tech sponsored study by Helfrich et al. titled "Evaluation of the Natural Springs of Virginia: Fisheries Management Implications", a 2004 Rockbridge County groundwater resources report written by Frits van der Leeden, and several smaller datasets from consultants and citizens were evaluated and added to the database when confidence in locational accuracy was high or could be verified with aerial or LIDAR imagery.

    Significant contributions have been made throughout the years by VDEQ Groundwater Characterization staff site visits as well as by other geologists working in the region, including Matt Heller at Virginia Division of Geology and Mineral Resources (VDMME), Wil Orndorff at the Virginia Department of Conservation and Recreation Karst Program (VDCR), and David Nelms and Dan Doctor of the U.S. Geological Survey (USGS). Substantial effort has been made to improve locational accuracy and remove duplication present between data sources. Hundreds of spring locations that were originally obtained using topographic maps or unknown methods were updated to sub-meter locational accuracy using post-processed differential GPS (PPGPS) and through the use of several generations of aerial imagery (2002-2017) obtained from Virginia's Geographic Information Network (VGIN) and 1-meter LIDAR, where available. Scores of new spring locations were also obtained by systematic quadrangle-by-quadrangle analysis in areas of the Shenandoah Valley where 1-meter LIDAR datasets were obtained from the U.S. Geological Survey. Future improvements to the dataset will result when statewide 1-meter LIDAR datasets become available and through continued field work by DEQ staff and other contributors working in the region. Please do not hesitate to contact the author to correct mistakes or to contribute to the database.

    The VDEQ Spring FIELD MEASUREMENTS database contains data describing field derived physio-chemical properties of spring discharges measured throughout the Commonwealth of Virginia. Field visits compiled in this dataset were performed from 1928 to 2019 by geologists with the State Water Control Board, the Virginia Division of Water and Power, the Virginia Department of Environmental Quality, and the U.S. Geological Survey with contributions from other sources as noted. Values of -9999 indicate that measurements were not performed for the referenced parameter. Please do not hesitate to contact the author to add data to the database or correct errors.
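
    Because -9999 marks a measurement that was not performed, it has to be masked before any arithmetic is done on a field-measurement column. The following is a minimal Python sketch of that step, not part of the VDEQ tooling; the function names and the sample readings are hypothetical.

```python
import math

NO_DATA = -9999  # sentinel described above: measurement not performed


def mask_no_data(values):
    """Replace the -9999 sentinel with NaN so it cannot skew statistics."""
    return [math.nan if v == NO_DATA else float(v) for v in values]


def mean_ignoring_missing(values):
    """Average only the measurements that were actually performed."""
    kept = [v for v in mask_no_data(values) if not math.isnan(v)]
    return sum(kept) / len(kept) if kept else math.nan


# Hypothetical field readings for one spring; the second visit has no measurement.
readings = [10, NO_DATA, 14]
print(mean_ignoring_missing(readings))
```

    Averaging the raw column without this step would pull the result far below zero, which is why the sentinel is masked first.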


    The VDEQ_Spring_WQ database is a geodatabase containing groundwater sample information collected from springs throughout Virginia. Sample specific information include: location and site information, measured field parameters, and lab verified quantifications of major ionic concentrations, trace element concentrations, nutrient concentrations, and radiological data. The VDEQ_Spring_WQ database is a subset of the VDEQ GWCHEM database which is a flat-file geodatabase containing groundwater sample information from groundwater wells and springs throughout Virginia. Sample information has been correlated via DEQ Well # and projected using coordinates in VDEQ_Spring_SITES database. The GWCHEM database is comprised of historic groundwater sample data originally archived in the United States Geological Survey (USGS) National Water Information System (NWIS) and the Environmental Protection Agency (EPA) Storage and Retrieval (STORET) data warehouse. Archived STORET data originated as groundwater sample data collected and uploaded by Virginia State Water Control Board Personnel. While groundwater sample data in the STORET data warehouse are static, new groundwater sample data are periodically uploaded to NWIS and spring laboratory WQ data reflect NWIS downloaded on 9/30/2019. Recent groundwater sample data collected by Virginia Department of Environmental Quality (DEQ) personnel as part of the Ambient Groundwater Sampling Program are entered into the database as lab results are made available by the Division of Consolidated Laboratory Services (DCLS). When possible, charge balances were calculated for samples with reported values for major ions including (at a minimum) calcium, magnesium, potassium, sodium, bicarbonate, chloride, and sulfate. Reported values for Nitrate as N, carbonate, and fluoride were included in the charge balance calculation when available. 
    Field-determined values for bicarbonate and carbonate were used in the charge balance calculation when available. For much of the legacy DEQ groundwater sample data, bicarbonate values were derived from lab-reported values of alkalinity (as mg/L CaCO3) under the assumption that there was no contribution by carbonate to the reported alkalinity value. Charge balance values are reported in the "Charge Balance" column of the GWCHEM geodatabase. The closer the charge balance value is to unity (1), the lower the assumed charge balance error.

    In order to preserve the numerical capabilities of the database, non-numeric lab qualifiers were given the following numeric identifiers:

    - (minus sign) = less than the concentration specified to the right of the sign
    -11110 = estimated
    -22220 = presence verified but not quantified
    -33330 = radchem non-detect, below sslc
    -4440 = analyzed for but not detected
    -55550 = greater than the concentration to the right of the zero
    -66660 = sample held beyond normal holding time
    -77770 = quality control failure. Data not valid.
    -88880 = sample held beyond normal holding time. Sample analyzed for but not detected. Value stored is limit of detection for process in use.
    -11120 = Value reported is less than the criteria of detection.
    -9999 = no data (parameter not quantified)
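
    The description does not give the charge balance formula. Since a value near unity indicates low error, the Python sketch below assumes it is the ratio of total cation to total anion milliequivalents; that ratio form, the approximate equivalent weights, the dictionary keys, and the function names are all assumptions made for illustration. Qualifier-coded values other than -9999 are not decoded here.

```python
NO_DATA = -9999  # "no data" identifier from the description

# Approximate equivalent weights in mg per milliequivalent (molar mass / charge).
EQ_WEIGHT = {
    "calcium": 20.04, "magnesium": 12.15, "potassium": 39.10, "sodium": 22.99,
    "bicarbonate": 61.02, "chloride": 35.45, "sulfate": 48.03,
    "nitrate_as_n": 14.01, "carbonate": 30.00, "fluoride": 19.00,
}
CATIONS = ("calcium", "magnesium", "potassium", "sodium")
REQUIRED_ANIONS = ("bicarbonate", "chloride", "sulfate")
OPTIONAL_ANIONS = ("nitrate_as_n", "carbonate", "fluoride")


def bicarbonate_from_alkalinity(alk_mg_per_l_caco3):
    """Convert alkalinity (mg/L as CaCO3) to bicarbonate (mg/L),
    assuming carbonate contributes nothing to the alkalinity."""
    return alk_mg_per_l_caco3 * EQ_WEIGHT["bicarbonate"] / 50.04


def charge_balance(sample):
    """Return cation meq / anion meq, or None if a required ion is missing."""
    def present(name):
        value = sample.get(name)
        return value is not None and value != NO_DATA

    if not all(present(n) for n in CATIONS + REQUIRED_ANIONS):
        return None

    def meq(name):
        return sample[name] / EQ_WEIGHT[name]

    cations = sum(meq(n) for n in CATIONS)
    anions = sum(meq(n) for n in REQUIRED_ANIONS)
    # Optional anions are included only when they were reported.
    anions += sum(meq(n) for n in OPTIONAL_ANIONS if present(n))
    return cations / anions if anions else None
```

    A sample whose cation and anion milliequivalents match returns 1.0, and a sample lacking any of the seven required major ions returns None, mirroring the "when possible" condition in the description.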

    A more in-depth description and hydrogeologic analysis of the database can be found here
    An in-depth data fact sheet can be found here
  3. 2014 - 2017 Regents

    • splitgraph.com
    • data.cityofnewyork.us
    • +1 more
    Updated Jul 5, 2024
    Cite
    cityofnewyork-us (2024). 2014 - 2017 Regents [Dataset]. https://www.splitgraph.com/cityofnewyork-us/2014-2017-regents-cbrh-qrk4/
    Available download formats
    application/vnd.splitgraph.image, application/openapi+json, json
    Dataset updated
    Jul 5, 2024
    Authors
    cityofnewyork-us
    Description

    New York City Department of Education 2014 - 2017 Regents

    Testing and score data includes all administrations of the Regents exam: January, June, and August. It reports the highest score for each student for each Regents exam for each school year. Non-numeric marks are dropped from the data.

    Splitgraph serves as an HTTP API that lets you run SQL queries directly on this data to power Web applications.

    See the Splitgraph documentation for more information.

  4. Quarterly Labour Force Survey 2011, Third Quarter - South Africa

    • microdata.worldbank.org
    Updated Sep 4, 2014
    + more versions
    Cite
    Statistics South Africa (2014). Quarterly Labour Force Survey 2011, Third Quarter - South Africa [Dataset]. https://microdata.worldbank.org/index.php/catalog/1301
    Dataset updated
    Sep 4, 2014
    Dataset authored and provided by
    Statistics South Africa http://www.statssa.gov.za/
    Time period covered
    2011
    Area covered
    South Africa
    Description

    Abstract

    The Quarterly Labour Force Survey (QLFS) is a household-based sample survey conducted by Statistics South Africa (Stats SA). It collects data on the labour market activities of individuals aged 15 years and above who live in South Africa.

    Geographic coverage

    National Coverage

    Analysis unit

    Individuals, households

    Universe

    The QLFS sample covers the non-institutional population except for workers' hostels. However, persons living in private dwelling units within institutions are also enumerated. For example, within a school compound, one would enumerate the schoolmaster's house and teachers' accommodation because these are private dwellings. Students living in a dormitory on the school compound would, however, be excluded.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The QLFS sample covers the non-institutional population except for workers' hostels. However, persons living in private dwelling units within institutions are also enumerated. For example, within a school compound, one would enumerate the schoolmaster's house and teachers' accommodation because these are private dwellings. Students living in a dormitory on the school compound would, however, be excluded.

    Survey requirements and design :

    The Labour Force Survey frame has been developed as a general purpose household survey frame that can be used by all other household surveys irrespective of the sample size requirement of the survey. The sample size for the QLFS is roughly 30 000 dwellings and these are divided equally into four rotation groups, i.e. 7 500 dwellings per rotation group. The sample is based on information collected during the 2001 Population Census conducted by Stats SA. In preparation for the 2001 census, the country was divided into 80 787 enumeration areas (EAs). Some of these EAs are small in terms of the number of households that were enumerated in them at the time of Census 2001. Stats SA's household-based surveys use a Master Sample which comprises EAs that are drawn from across the country. For the purposes of the Master Sample the EAs that contained fewer than 25 households were excluded from the sampling frame, and those that contained between 25 and 99 households were combined with other EAs to form Primary Sampling Units (PSUs). The number of EAs per PSU ranges between one and four. On the other hand, very large EAs represent two or more PSUs. The sample is designed to be representative at the provincial level and within provinces at the metro/non-metro level. Within the metros, the sample is further distributed by geography type. The four geography types are: urban formal, urban informal, farms and tribal. This implies, for example, that within a metropolitan area the sample is designed to be representative of the different geography types that may exist within that metro. The current sample size is 3 080 PSUs. It is equally divided into four sub-groups or panels called rotation groups. The rotation groups are designed in such a way that each of these groups has the same distribution pattern as that which is observed in the whole sample. They are numbered from one to four and these numbers also correspond to the quarters of the year in which the sample will be rotated for the particular group. The sample for the redesigned Labour Force Survey is based on a stratified two-stage design with probability proportional to size (PPS) sampling of primary sampling units (PSUs) in the first stage, and sampling of dwelling units (DUs) with systematic sampling in the second stage.
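
    The two-stage design described above can be illustrated with a short Python sketch. It is not Stats SA's procedure: the description does not say how the PPS draw is implemented, so systematic PPS selection is assumed here, stratification is omitted, and all function names and the toy frame are hypothetical.

```python
import random
from itertools import accumulate


def pps_systematic(sizes, n, rng):
    """Stage 1: pick n unit indices with probability proportional to size,
    using a random start and a fixed step along the cumulative sizes."""
    cum = list(accumulate(sizes))
    step = cum[-1] / n
    start = rng.uniform(0, step)
    chosen, i = [], 0
    for k in range(n):
        point = start + k * step
        while i < len(cum) - 1 and cum[i] <= point:
            i += 1
        chosen.append(i)  # a unit larger than the step can be hit twice
    return chosen


def systematic_sample(units, n, rng):
    """Stage 2: take every (len(units) / n)-th unit after a random start.
    Assumes n <= len(units)."""
    step = len(units) / n
    start = rng.uniform(0, step)
    return [units[min(int(start + k * step), len(units) - 1)] for k in range(n)]


def two_stage_sample(psu_dwellings, n_psus, n_dwellings, seed=0):
    """Select PSUs by PPS (size = dwelling count), then dwellings within each."""
    rng = random.Random(seed)
    psu_ids = sorted(psu_dwellings)
    sizes = [len(psu_dwellings[p]) for p in psu_ids]
    picked = pps_systematic(sizes, n_psus, rng)
    return {
        psu_ids[i]: systematic_sample(psu_dwellings[psu_ids[i]], n_dwellings, rng)
        for i in picked
    }


# Toy frame: 20 PSUs holding 25 to 44 dwellings each.
frame = {p: [(p, d) for d in range(25 + p)] for p in range(20)}
sample = two_stage_sample(frame, n_psus=5, n_dwellings=10)
print({psu: len(dwellings) for psu, dwellings in sample.items()})
```

    Larger PSUs cover more of the cumulative size range and are therefore more likely to be hit in the first stage, which is the defining property of PPS selection.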

    Sample rotation :

    The sampled PSUs have been assigned to 4 rotation groups, and dwellings selected from the PSUs assigned to rotation group "1" are rotated in the first quarter. Similarly, the dwellings selected from the PSUs assigned to rotation group "2" are rotated in the second quarter, and so on. Thus, each sampled dwelling will remain in the sample for four consecutive quarters. It should be noted that the sampling unit is the dwelling, and the unit of observation is the household. Therefore, if a household moves out of a dwelling after being in the sample for, say 2 quarters and a new household moves in then the new household will be enumerated for the next two quarters. If no household moves into the sampled dwelling, the dwelling will be classified as vacant (unoccupied). Each quarter, ¼ of the sampled dwellings rotate out of the sample and are replaced by new dwellings from the same PSU or the next PSU on the list. A total of 3 080 PSUs were selected for the redesigned LFS, and 770 have been assigned to each of the four rotation groups.
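
    The rotation arithmetic above (3 080 PSUs, four groups, four consecutive quarters per dwelling) can be checked with a small Python sketch. The round-robin assignment is an assumption made for illustration; the description only states that each group should mirror the distribution of the whole sample. Function names are hypothetical.

```python
from collections import Counter

N_GROUPS = 4  # rotation groups, numbered 1 to 4


def assign_rotation_groups(psu_ids):
    """Spread PSUs evenly over the rotation groups (round-robin, assumed)."""
    return {psu: i % N_GROUPS + 1 for i, psu in enumerate(psu_ids)}


def rotating_group(quarter_index):
    """Group whose dwellings are rotated in a given quarter (1 = first quarter)."""
    return (quarter_index - 1) % N_GROUPS + 1


def quarters_in_sample(entry_quarter_index):
    """A sampled dwelling stays in the sample for four consecutive quarters."""
    return [entry_quarter_index + k for k in range(N_GROUPS)]


groups = assign_rotation_groups(range(3080))
per_group = Counter(groups.values())
print(per_group)              # 770 PSUs in each of the four groups
print(quarters_in_sample(3))  # dwelling entering in quarter 3 stays through quarter 6
```

    With roughly 30 000 dwellings overall, the same even split gives the 7 500 dwellings per rotation group quoted in the survey design.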

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The questionnaire consists of the following sections:

    Section 1 - Biographical information (marital status, language, migration, education, training, literacy, etc.)
    Section 2 - Economic activities
    Section 3 - Unemployment and economic inactivity
    Section 4 - Main work activities in the last week
    Section 5 - Earnings in the main job
    All sections - Comprehensive coverage of all aspects of the labour market

    Cleaning operations

    Data Processing

    Introduction : The purpose of data processing is to ensure that the information collected from the sampled primary sampling units, dwelling units and households (i.e. the boxes containing QLFS questionnaires) are physically received, stored and processed. The aim is to produce a clean dataset that has all the information contained in the questionnaires. Except for the scanning system, all other elements of the data processing system were developed in-house. One important innovation that is central to the smooth operation of the entire system is the development of barcodes that are linked to a unique number on each questionnaire. This information provides the link between the information recorded in the Master Sample database and other processes such as editing and imputation as well as weighting and variance estimation.

    Processing phases : QLFS data processing is continuous, starting on the second week of every month. Data processing for each quarter must be completed by the first Friday of the subsequent month to ensure that the four-week deadline for publication of the QLFS results is met.

    The phases listed below occur sequentially.

    Receiving of questionnaires : The contents of the boxes containing questionnaires sent from the regional offices are verified when received at the DPC. The questionnaire barcodes captured in the provinces are captured again at the DPC to ensure that all questionnaires have been received.

    Primary preparation : The purpose of primary preparation is to ensure that all questionnaires are correctly stacked and positioned prior to being guillotined.

    Guillotining: The purpose of the guillotine process is to cut off the spines of the questionnaires in order to have pages separated for scanning.

    Secondary preparation : The purpose of secondary preparation is to ensure that the questionnaires are correctly stacked and positioned for scanning. At the same time, quality assurance takes place on the work done during the primary preparation and guillotining processes.

    Scanning : The purpose of scanning and recognition is to convert the questionnaires into an electronic format and Tagged Image File Format (TIFF) images.

    Verification : The purpose of scanning verification is to manually correct un-interpretable characters, missing data and errors detected by validation rules.

    Electronic coding: Industry and occupation codes are assigned using the electronic coding system which converts the respondents' industry and occupation descriptions into numeric codes based on Standard Industry Classification (SIC) and South African Standard Occupation Classification (SASCO). If the system fails to assign a code for either industry or occupation, the coding is assigned manually.

    Automated editing and imputation : QLFS uses the editing and imputation module to ensure that output data is both clean and complete. There are three basic components, called functions, in the Edit and Imputation Module:

    Function A: Record acceptance
    Function B: Edit and imputation
    Function C: Clean up, derived variables and preparation for weighting

    Function A: Record acceptance

    This function is divided into three phases:

    First phase: Pre-function A : The first phase ensures that the records contain valid information in selected Cover Page questions required during edit and imputation and during the subsequent weighting and variance estimation. Any blanks or other errors that need to be corrected are done here before processing of the record can proceed.

    Second phase: Function A record acceptance : The second phase ensures that there is enough demographic and labour market activity information to ensure that editing and imputation can be successfully completed.

    Third phase: Post Function A clean up : This phase ensures that certain data are present where there is evidence that they should be. This involves, for example:
    • Ensuring that if there is written material in the job description questions then there are corresponding industry and occupation codes for them.
    • Ensuring that partial blanks or non-numeric characters that appear in questions where the Survey Officer is required to enter numbers are validated.
    • Ensuring that where there is written material in the space provided for "Other - specify" that the corresponding option is marked.

    Function B: Edit and imputation : Having determined in Function A that the content of the record would support extensive editing and imputation, this function carries out those activities. Editing is the

  5. India HCE: No of Sample Households Reporting Consumption: Haryana: Rural:...

    • ceicdata.com
    Updated Mar 15, 2023
    + more versions
    Cite
    CEICdata.com (2023). India HCE: No of Sample Households Reporting Consumption: Haryana: Rural: Non Food: Consumer Services [Dataset]. https://www.ceicdata.com/en/india/household-consumer-expenditure-haryana-rural/hce-no-of-sample-households-reporting-consumption-haryana-rural-non-food-consumer-services
    Dataset updated
    Mar 15, 2023
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 1, 2005 - Jun 1, 2012
    Area covered
    India
    Variables measured
    Household Income and Expenditure Survey
    Description

    HCE: Number of Sample Households Reporting Consumption: Haryana: Rural: Non Food: Consumer Services data was reported at 1,422.000 Unit in 2012. This records a decrease from the previous number of 1,431.000 Unit for 2010. The data is updated yearly, with a median of 1,431.000 Unit from Jun 2005 to 2012 across 3 observations. The data reached an all-time high of 1,668.000 Unit in 2005 and a record low of 1,422.000 Unit in 2012. The series remains in active status in CEIC and is reported by the Ministry of Statistics and Programme Implementation. The data is categorized under India Premium Database’s Domestic Trade and Household Survey – Table IN.HB040: HCES: Uniform Reference Period (URP): Average Monthly Per Capita Consumption Expenditure (MPCE): by Item Group: Haryana: Rural (Discontinued).

  6. Budget - 2021 Budget Recommendations - Appropriations

    • splitgraph.com
    • data.cityofchicago.org
    • +2 more
    Updated Oct 20, 2020
    + more versions
    Cite
    City of Chicago (2020). Budget - 2021 Budget Recommendations - Appropriations [Dataset]. https://www.splitgraph.com/cityofchicago/budget-2021-budget-recommendations-appropriations-385z-7dwt/
    Available download formats
    application/vnd.splitgraph.image, application/openapi+json, json
    Dataset updated
    Oct 20, 2020
    Dataset authored and provided by
    City of Chicago
    Description

    The dataset details 2021 Budget Recommendations, which is the line-item budget document proposed by the Mayor to the City Council for approval. Budgeted expenditures are identified by department, appropriation account, and funding type: Local, Community Development Block Grant Program (CDBG), and other Grants. “Local” funds refer to those line items that are balanced with locally-generated revenue sources, including but not limited to the Corporate Fund, Water Fund, Midway and O’Hare Airport funds, Vehicle Tax Fund, Library Fund and General Obligation Bond funds.

    This dataset follows the format of the equivalent datasets from past years except that Appropriation Authority and Appropriation Account have changed from Number to Text in order to accommodate non-numeric values.

    For more information about the budget process, visit the Budget Documents page: http://j.mp/lPotWf.

    Splitgraph serves as an HTTP API that lets you run SQL queries directly on this data to power Web applications.

    See the Splitgraph documentation for more information.

  7. Sepulveda et al. The shifting climate portfolio of the Greater Yellowstone...

    • figshare.com
    Updated Jan 20, 2016
    Cite
    Mike Tercek (2016). Sepulveda et al. The shifting climate portfolio of the Greater Yellowstone Area [Dataset]. http://doi.org/10.6084/m9.figshare.1615873.v1
    Available download formats
    txt
    Dataset updated
    Jan 20, 2016
    Dataset provided by
    Figshare http://figshare.com/
    Authors
    Mike Tercek
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    http://dx.doi.org/10.6084/m9.figshare.1615873

    Data sets and supplementary information for Sepulveda et al. The shifting climate portfolio of the Greater Yellowstone Area. All units are metric (mm, degrees C, cubic meters). Missing values are marked with "nan," which stands for "not a number." Missing entries are due either to completely missing data or to time periods that had insufficient data for an accurate calculation. See Sepulveda et al. for details. SNOTEL temperature data were identical to NRCS sources at the time of writing, but NRCS plans to recalculate their historical datasets in the near future. This may result in some differences between the data provided here and the data that will be available from the NRCS web sites in the future.

    Files with "normalized data" were normalized as follows:

    1. For each stream gage or SNOTEL, separately calculate the mean and standard deviation for each parameter during the period spanning water years 1993 - 1994 through 2012 - 2013. These are defined as "mean" and "sd."
    2. For each stream gage or SNOTEL, separately calculate the z value as z = (x-mean) / sd for each parameter for each water year, where x = the annual value for a parameter at a specific gage / SNOTEL. This results in normalized time series for each gage / SNOTEL that are all on a common scale of measure.
    3. To find the zone or "overall" mean for a particular parameter for a particular water year, average all the z-scores available in that year for that parameter. The zonal z values appear in the columns labelled "overall_{parameter}" in the normalized files below.
    4. Stations or gages that do not have complete data during the reference period 1993 - 1994 through 2012 - 2013 (stations with short or poor quality records) have been excluded from normalization. You will notice that normalized values for these stations have all been replaced with "nan." This is to ensure that the same number of gages / stations were used to calculate the overall zone averages and sd (step 2) for each year.

    This restriction was relaxed for the SNOTEL monthly Tmax, Tmin, and Precip files presented below because too many SNOTEL stations were eliminated, and upon examination, it was found that each station only had a handful of missing months. We can relax this condition for all files, but exploratory graphs show that it produces jumpy or discontinuous time series that seem less likely to reveal true patterns.

    Notice that "peak date" and other non-numeric fields cannot be normalized because they are expressed as calendar references, e.g. 6/01/1958. These columns are all replaced with "nan" in the normalized files, but there are numeric equivalents that have been normalized. For example, stream peak dates are available numerically in the variable "peak_index," which is the numbered day of the water year at which peak occurred.

    Files:

    meta.csv - metadata describing the weather stations used.
    tx_reduced_stn_set_snotel_monthly_tmin.csv - monthly averages of daily Tmax from TopoWx (Oyler et al) infilled and corrected station data files
    tx_reduced_stn_set_snotel_monthly_tmax.csv - monthly averages of daily Tmax from TopoWx (Oyler et al) infilled and corrected station data files

    tx_reduced_stn_set_snotel_seasonal_tmax.csv - Seasonal averages of tmax for snotel stations from TopoWx infilled and corrected station files
    tx_reduced_stn_set_snotel_seasonal_tmin.csv - Seasonal averages of tmin for snotel stations from TopoWx infilled and corrected station files
    tx_all_monthly_tmin.csv - monthly averages of daily tmin for all station types in the yellowstone area calculated from TopoWx infilled and corrected data sets
    tx_all_monthly_tmax.csv - monthly averages of daily tmax for all station types in the yellowstone area calculated from TopoWx infilled and corrected data sets
    tx_all_seasonal_tmin.csv - seasonal averages of daily tmin for all station types in the yellowstone area calculated from TopoWx infilled and corrected data sets
    tx_all_seasonal_tmax.csv - seasonal averages of daily tmax for all station types in the yellowstone area calculated from TopoWx infilled and corrected data sets
    tx_all_daily_tmin.csv - daily tmin for all station types in the yellowstone area calculated from TopoWx infilled and corrected data sets
    tx_all_daily_tmax.csv - daily tmax for all station types in the yellowstone area calculated from TopoWx infilled and corrected data sets

    Annual_SNOTEL_stats.csv - For each water year at each station in the GYA, the following variables are reported:
    - Peak Snow Water Equivalent (PWE) - millimeters
    - Peak snow day - day on which peak SWE occurred, expressed as number of days since the start of the water year (October 1)
    - Peak snow date - day on which peak SWE occurred, expressed as year / month
    - Winter Length - Number of days with Snow Water Equivalent (SWE) greater than zero
    - Melt - Number of days between peak SWE and complete melt

    april_1_swe.csv - Snow water equivalent (mm) on April 1 for each year at each station.
    monthly_pwe.csv - For each month during each water year at each station, Peak Snow Water Equivalent (mm). Months are indicated in the header as numbers. For example, gunsight_pass_01 is the PWE for January at the Gunsight Pass SNOTEL station.
    Normalized_monthly_pwe_all_stations.csv - normalized monthly peak swe. See above for normalization procedure
    snotel_monthly_tmax.csv - monthly averages of daily tmax for snotel stations calculated from NRCS data. NOTE all SNOTEL temperature data are identical to NRCS sources at the time of writing but NRCS plans to recalculate their daily historical values in the near future. Future comparisons to NRCS web sites may reveal some differences.
    snotel_monthly_tmin.csv - monthly averages of daily tmin for snotel stations calculated from NRCS data.
    melt_out_dates.csv - Day of complete melt out (zero swe) for each water year, expressed as number of days after October 1
    stream_summaries_05012014.csv - For each water year at each station, the following variables are presented:
    - StationName_median_# (e.g. soda_boundary_01) = Median flow value (cubic meters per second) for the numbered month. Months are numbered 1 - 12
    - StationName_min_# = Minimum daily flow (Cubic meters per second) for the numbered month. Months are numbered 1 - 12.
    - date_half_disch = Calendar date at which centroid of flow was reached
    - index_half_disch = Day of water year (number of days after October 1) on which centroid was reached
    - height_half_disch = flow rate (cubic meters per second) on the date that centroid of flow was reached
    - peak_date = Calendar date of peak flow
    - peak_index = Day of water year (number of days after October 1) on which peak flow occurred
    - peak_cms = Peak Flow (Cubic meters per second)
    - min_date, min_index, min_cms = Same as the last 3 above but for minimum flow during each water year
    - total_vol = total volume (cubic meters) of water for each water year
    - peak_minus_min = Peak flow rate - min flow rate
    - moving_25th_percentile_flow = 25th percentile flow (cubic meters per second) for the 10 year period including the water year listed and the previous 9 years.
    - days_below_moving_25th = # days in the listed water year that had flow below the moving 25th percentile
    - days_below_recent_25th = # days below the 25th percentile flow during the period 1981 - 2010.
    - gradient_index = A hydrograph "spikiness" index which is calculated as the sum of all the first derivative values in a water year
    - days_above_recent_winter_90th = The number of days during November - March in each water year that exceeds the November - March 90th percentile flow. 90th Percentile for Nov - Mar is calculated during the years 2001 - 2010
    - days_below_recent_summer_10th = The numbers of days during July - September in each water year that are below the 10th percentile flow for July - September. 10th percentile calculated over 2001 - 2010.
    - days_below_recent_summer_25th = The number of days during July - September in each water year that are below the 25th percentile flow for July - September. 25th percentile is calculated over 2001 - 2010.
    - est_vals = The number of flow measurements in each water year that are estimated (have data flag e). Estimated values USUALLY occur when there is ice on the gage.
    Normalized_stream_data_all_gages.csv - normalized stream statistics. See normalization procedure above.
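Several of the stream summary variables listed above can be illustrated with a short sketch. The following Python is not the authors' processing code; it assumes one daily mean flow value per day starting on October 1, that day indices count days after October 1 (so October 1 is index 0), and that the centroid of flow is the first day on which cumulative flow reaches half of the annual total. The function name `summarize_water_year` is hypothetical.

```python
from datetime import date, timedelta

SECONDS_PER_DAY = 86_400


def water_year_start(water_year: int) -> date:
    """Water year N begins on October 1 of calendar year N - 1."""
    return date(water_year - 1, 10, 1)


def summarize_water_year(daily_cms, water_year):
    """Summarize one water year of daily mean flows (cubic meters per second).

    Assumption: daily_cms[0] is the flow on October 1 and there are no gaps.
    """
    if not daily_cms:
        raise ValueError("daily_cms must contain at least one value")

    total = sum(daily_cms)

    # Centroid of flow: first day on which cumulative flow reaches half the total.
    running = 0.0
    index_half = len(daily_cms) - 1
    for i, q in enumerate(daily_cms):
        running += q
        if running >= total / 2:
            index_half = i
            break

    peak_index = max(range(len(daily_cms)), key=daily_cms.__getitem__)
    min_index = min(range(len(daily_cms)), key=daily_cms.__getitem__)
    start = water_year_start(water_year)

    return {
        "index_half_disch": index_half,
        "date_half_disch": start + timedelta(days=index_half),
        "height_half_disch": daily_cms[index_half],
        "peak_index": peak_index,
        "peak_date": start + timedelta(days=peak_index),
        "peak_cms": daily_cms[peak_index],
        "min_index": min_index,
        "min_cms": daily_cms[min_index],
        # Daily mean flow (m^3/s) times seconds per day gives daily volume (m^3).
        "total_vol": total * SECONDS_PER_DAY,
        "peak_minus_min": daily_cms[peak_index] - daily_cms[min_index],
    }
```

The percentile-based variables (for example days_below_moving_25th) would need multi-year input and a stated percentile method, so they are left out of this sketch.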

  8. Conflict sample data instance.

    • plos.figshare.com
    xls
    Updated Sep 14, 2023
    Cite
    Tianjun Feng; Jingyao Liu; Chunyan Liang; Xiujuan Tian; Chun Chen; Keke Liu (2023). Conflict sample data instance. [Dataset]. http://doi.org/10.1371/journal.pone.0291504.t008
    Explore at:
    Available download formats: xls
    Dataset updated
    Sep 14, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Tianjun Feng; Jingyao Liu; Chunyan Liang; Xiujuan Tian; Chun Chen; Keke Liu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    To further study the expansion characteristics of left-turning non-motorized vehicles at intersections and the relationship between these characteristics and vehicle-bicycle conflicts, trajectory point data of left-turning non-motorized vehicles are extracted using video trajectory tracking technology, and the cubic curve expansion envelope equation with the highest fitting degree is constructed. To quantify the expansion degree of non-motorized vehicles after starting, two intersections in Guangxi Zhuang Autonomous Region were selected for case analysis, and the numerical range of the expansion degree was obtained for an intersection with a left-turn waiting area and for an intersection without one. The mathematical relationship between the expansion degree and its influencing factors is studied, and a multivariate nonlinear regression equation is established between the expansion degree and the left-turn non-motorized vehicle flow, the number of parallel non-motorized vehicles, and the left-turn green light time. The vehicle-bicycle conflicts caused by the expansion of left-turning non-motorized vehicles are then analyzed, the essential factors affecting the number of non-motorized vehicles are determined, and a multiple linear regression equation is established between the number of non-motorized vehicles and the number of left-turning non-motorized vehicles, the expansion degree, and the number of parallel non-motorized vehicles. The results show that the model has high accuracy. By analyzing the expansion characteristics of left-turning non-motorized vehicles at intersections, the relationship between different influencing factors and the expansion degree is obtained. The vehicle-bicycle conflicts under the influence of these expansion characteristics are then analyzed, providing theoretical ideas for improving traffic efficiency and optimizing traffic organization at intersections.

  9. VDEQ Springs FIELD MEASUREMENTS

    • data.virginia.gov
    Updated Aug 31, 2023
    Cite
    Virginia Department of Environmental Quality (2023). VDEQ Springs FIELD MEASUREMENTS [Dataset]. https://data.virginia.gov/dataset/vdeq-springs-field-measurements
    Explore at:
    Available download formats: zip, arcgis geoservices rest api, csv, geojson, html, gpkg, gdb, txt, xlsx, kml
    Dataset updated
    Aug 31, 2023
    Dataset authored and provided by
    Virginia Department of Environmental Quality (https://deq.virginia.gov/)
    Description
    The VDEQ Spring SITES database contains data describing the geographic locations and site attributes of natural springs throughout the commonwealth. This data coverage continues to evolve and contains only spring locations known to exist with a reasonable degree of certainty on the date of publication. The dataset does not replace site specific inventorying or receptor surveys but can be used as a starting point. VDEQ's initial geospatial dataset of approximately 325 springs was formed in 2008 by digitizing historical spring information sheets created by State Water Control Board geologists in the 1970s through early 1990s. Additional data has been consolidated from the EPA STORET database, the U.S. Geological Survey's Ground Water Site Inventory (GWSI) and Geographic Names Inventory System (GNIS), the Virginia Department of Health SDWIS database, the Virginia DEQ Virginia Water Use Data Set (VWUDS), the Commonwealth of Virginia Division of Water Resources and Power Bulletin No. 1: "Springs of Virginia" by Collins et al., 1930 as well as several VDWR&P Surface Water Supply bulletins from the 1940's - 1950's. A 1992 Virginia Department of Game and Inland Fisheries / Virginia Tech sponsored study by Helfrich et al. titled "Evaluation of the Natural Springs of Virginia: Fisheries Management Implications", a 2004 Rockbridge County groundwater resources report written by Frits van der Leeden, and several smaller datasets from consultants and citizens were evaluated and added to the database when confidence in locational accuracy was high or could be verified with aerial or LIDAR imagery. 
Significant contributions have been made throughout the years by VDEQ Groundwater Characterization staff site visits as well as other geologists working in the region including: Matt Heller at Virginia Division of Geology and Mineral Resources (VDMME), Wil Orndorff at the Virginia Department of Conservation and Recreation Karst Program (VDCR), and David Nelms and Dan Doctor of the U.S. Geological Survey (USGS). Substantial effort has been made to improve locational accuracy and remove duplication present between data sources. Hundreds of spring locations that were originally obtained using topographic maps or unknown methods were updated to sub-meter locational accuracy using post-processed differential GPS (PPGPS) and through the use of several generations of aerial imagery (2002-2017) obtained from Virginia's Geographic Information Network (VGIN) and 1-meter LIDAR, where available. Scores of new spring locations were also obtained by systematic quadrangle-by-quadrangle analysis in areas of the Shenandoah Valley where 1-meter LIDAR datasets were obtained from the U.S. Geological Survey. Future improvements to the dataset will result when statewide 1-meter LIDAR datasets become available and through continued field work by DEQ staff and other contributors working in the region. Please do not hesitate to contact the author to correct mistakes or to contribute to the database.

    The VDEQ Spring FIELD MEASUREMENTS database contains data describing field derived physio-chemical properties of spring discharges measured throughout the Commonwealth of Virginia. Field visits compiled in this dataset were performed from 1928 to 2019 by geologists with the State Water Control Board, the Virginia Division of Water and Power, the Virginia Department of Environmental Quality, and the U.S. Geological Survey with contributions from other sources as noted. Values of -9999 indicate that measurements were not performed for the referenced parameter. Please do not hesitate to contact the author to add data to the database or correct errors.
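Because -9999 marks a measurement that was not performed, it should be treated as missing before any statistics are computed. A minimal Python sketch of that step follows; the helper names are hypothetical and the example values are illustrative, not taken from the database.

```python
import math

NO_MEASUREMENT = -9999  # sentinel described in the dataset documentation


def clean_measurement(value):
    """Convert a raw field value to float, mapping the -9999 sentinel to NaN."""
    v = float(value)
    return math.nan if v == NO_MEASUREMENT else v


def mean_ignoring_missing(values):
    """Average only the values that were actually measured."""
    kept = [v for v in map(clean_measurement, values) if not math.isnan(v)]
    return sum(kept) / len(kept) if kept else math.nan
```

Without this step, a single -9999 would dominate an average of, say, spring temperatures.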


    The VDEQ_Spring_WQ database is a geodatabase containing groundwater sample information collected from springs throughout Virginia. Sample-specific information includes: location and site information, measured field parameters, and lab-verified quantifications of major ionic concentrations, trace element concentrations, nutrient concentrations, and radiological data. The VDEQ_Spring_WQ database is a subset of the VDEQ GWCHEM database, which is a flat-file geodatabase containing groundwater sample information from groundwater wells and springs throughout Virginia. Sample information has been correlated via DEQ Well # and projected using coordinates in the VDEQ_Spring_SITES database. The GWCHEM database consists of historic groundwater sample data originally archived in the United States Geological Survey (USGS) National Water Information System (NWIS) and the Environmental Protection Agency (EPA) Storage and Retrieval (STORET) data warehouse. Archived STORET data originated as groundwater sample data collected and uploaded by Virginia State Water Control Board personnel. While groundwater sample data in the STORET data warehouse are static, new groundwater sample data are periodically uploaded to NWIS, and spring laboratory WQ data reflect NWIS as downloaded on 9/30/2019. Recent groundwater sample data collected by Virginia Department of Environmental Quality (DEQ) personnel as part of the Ambient Groundwater Sampling Program are entered into the database as lab results are made available by the Division of Consolidated Laboratory Services (DCLS). When possible, charge balances were calculated for samples with reported values for major ions including (at a minimum) calcium, magnesium, potassium, sodium, bicarbonate, chloride, and sulfate. Reported values for Nitrate as N, carbonate, and fluoride were included in the charge balance calculation when available. 
Field-determined values for bicarbonate and carbonate were used in the charge balance calculation when available. For much of the legacy DEQ groundwater sample data, bicarbonate values were derived from lab-reported values of alkalinity (as mg/CaCO3) under the assumption that there was no contribution by carbonate to the reported alkalinity value. Charge balance values are reported in the "Charge Balance" column of the GWCHEM geodatabase. The closer the charge balance value is to unity (1), the lower the assumed charge balance error.

    In order to preserve the numerical capabilities of the database, non-numeric lab qualifiers were given the following numeric identifiers:
    - (minus sign) = less than the concentration specified to the right of the sign
    -11110 = estimated
    -22220 = presence verified but not quantified
    -33330 = radchem non-detect, below sslc
    -4440 = analyzed for but not detected
    -55550 = greater than the concentration to the right of the zero
    -66660 = sample held beyond normal holding time
    -77770 = quality control failure. Data not valid.
    -88880 = sample held beyond normal holding time. Sample analyzed for but not detected. Value stored is limit of detection for process in use.
    -11120 = Value reported is less than the criteria of detection.
    -9999 = no data (parameter not quantified)
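The charge balance calculation described above can be sketched as follows. This is an illustration, not DEQ's procedure: it assumes the reported value is the ratio of summed cation to summed anion milliequivalents (consistent with values near 1 indicating low error), that concentrations are in mg/L, and that any negative stored value (a qualifier code or -9999) is unusable. The dictionary keys and function names are hypothetical; the equivalent weights are standard molar mass divided by ionic charge.

```python
# Equivalent weights in mg per milliequivalent (molar mass / |charge|).
CATION_EQ_WT = {"calcium": 20.039, "magnesium": 12.1525,
                "sodium": 22.990, "potassium": 39.098}
ANION_EQ_WT = {"bicarbonate": 61.017, "chloride": 35.453, "sulfate": 48.03,
               "nitrate_as_n": 14.007, "carbonate": 30.004, "fluoride": 18.998}
# Ions the description lists as the minimum needed for a charge balance.
REQUIRED = ("calcium", "magnesium", "potassium", "sodium",
            "bicarbonate", "chloride", "sulfate")


def bicarbonate_from_alkalinity(alkalinity_as_caco3):
    """Convert alkalinity (assumed mg/L as CaCO3) to bicarbonate (mg/L),
    assuming carbonate contributes nothing to the alkalinity."""
    return alkalinity_as_caco3 * 61.017 / 50.043


def _usable(sample, ion):
    value = sample.get(ion)
    # Assumption: negative values are qualifier codes or -9999, not concentrations.
    return value is not None and value >= 0


def charge_balance(sample):
    """Return cation meq/L divided by anion meq/L, or None if not computable."""
    if not all(_usable(sample, ion) for ion in REQUIRED):
        return None
    cations = sum(sample[i] / w for i, w in CATION_EQ_WT.items() if _usable(sample, i))
    anions = sum(sample[i] / w for i, w in ANION_EQ_WT.items() if _usable(sample, i))
    return cations / anions if anions > 0 else None
```

Optional ions (nitrate as N, carbonate, fluoride) enter the sum only when present, mirroring the "when available" wording.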

    A more in-depth description and hydrogeologic analysis of the database can be found here
    An in-depth data fact sheet can be found here
  10. Non-acoustic Speech Dataset

    • zenodo.org
    • data.niaid.nih.gov
    txt
    Updated Sep 20, 2022
    Cite
    Shiji Yuan; Ying Sun; Dezhi Zheng; Xinlei Chen; Ying Ding; Shuai Wang; Shangchun Fan (2022). Non-acoustic Speech Dataset [Dataset]. http://doi.org/10.5281/zenodo.7095741
    Explore at:
    Available download formats: txt
    Dataset updated
    Sep 20, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Shiji Yuan; Ying Sun; Dezhi Zheng; Xinlei Chen; Ying Ding; Shuai Wang; Shangchun Fan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Non-acoustic speech sensing system based on flexible piezoelectric

    Version 1.0.0


    This Read_Me.txt file briefly describes the non-acoustic speech dataset and instructions to access it.

    The non-acoustic speech sensing system based on flexible piezoelectric is designed to satisfy specific needs around testing device models in high-noise, complex environments. The system collected vibration signals from the jaws of six males and five females, covering ten different control commands at 90 dB of background noise. The dataset is reliable, with high intelligibility, and achieves a calculated recognition accuracy of 93.7%. In general, this paper provides a non-acoustic speech dataset for Mandarin, including the parts collected, the number of people collected, and the environment.


    The dataset is available at:

    https://doi.org/10.5281/zenodo.7090120

    The data descriptor paper with details of data collection and cleaning process is under submission. For proper citation of the manuscript, please refer to the latest version of this dataset which includes the details.

    This dataset and its descriptor paper were created by:

    Shiji Yuan, Ying Sun, Dezhi Zheng, Xinlei Chen, Ying Ding, Shuai Wang, Shangchun Fan

    For questions or suggestions, please e-mail Dezhi Zheng


    Description:

    Ten common words were chosen as the core of the vocabulary in this dataset. These ten command words can be used for commands in IoT or robotics applications: "forward", "backward", "right", "left", "stop", "up", "down", "draw", "drop", and "reset".

    The recording software is Adobe Audition 2022, using monophonic recording, a 16-bit storage format, and a 16 kHz sampling frequency; the recorded voice is saved in wav format. The dataset is provided under two storage rules, classified by subject number and by corpus number. In the first rule, the speech data of the 11 subjects are stored in separate folders, with the subject serial number as the folder name. Each folder contains subfolders categorized by corpus. In the second rule, the speech data of the ten corpora are stored in separate folders, and the folder names are the corpus contents. The subject number, corpus number, and recording order are given for each data entry. For example, the data obtained when subject one recorded corpus 10 for the first time is labeled 1-10_1.
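The labeling scheme above (subject number, corpus number, recording order) can be parsed mechanically. The sketch below is an illustration only: it assumes labels follow the pattern shown in the example (1-10_1) and that files may carry a .wav extension, which the description does not state explicitly. `RecordingLabel` and `parse_label` are hypothetical names.

```python
import re
from typing import NamedTuple


class RecordingLabel(NamedTuple):
    subject: int  # subject serial number
    corpus: int   # command word (corpus) number
    take: int     # recording order for this subject and corpus


# Pattern inferred from the single documented example "1-10_1".
_LABEL = re.compile(r"^(\d+)-(\d+)_(\d+)$")


def parse_label(name: str) -> RecordingLabel:
    """Parse a label such as '1-10_1' (optionally ending in '.wav')."""
    stem = name[:-4] if name.lower().endswith(".wav") else name
    match = _LABEL.match(stem)
    if match is None:
        raise ValueError(f"unrecognized label: {name!r}")
    return RecordingLabel(*(int(part) for part in match.groups()))
```

For instance, `parse_label("1-10_1")` yields subject 1, corpus 10, take 1, which matches the example in the description.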

    After the data collection process, a filtering algorithm for automatic detection of low-quality non-acoustic speech data was applied to remove problematic recordings that are very short or very quiet. The script of the data filtering algorithm is provided in this repository.
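The authors' filter is a MATLAB script whose criteria are not given here. Purely to illustrate the idea of rejecting very short or very quiet recordings, the following Python sketch measures duration and RMS amplitude of a 16-bit mono wav file. The thresholds `min_duration_s` and `min_rms` are invented placeholders, and all function names are hypothetical.

```python
import array
import math
import sys
import wave


def rms_of_samples(samples):
    """Root-mean-square amplitude of a sequence of integer samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def duration_and_rms(path):
    """Return (duration in seconds, RMS amplitude) for a 16-bit mono wav file."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2 or wf.getnchannels() != 1:
            raise ValueError("expected 16-bit mono audio")
        n_frames = wf.getnframes()
        rate = wf.getframerate()
        samples = array.array("h", wf.readframes(n_frames))
    if sys.byteorder == "big":
        samples.byteswap()  # wav sample data is little-endian
    return n_frames / rate, rms_of_samples(samples)


def keep_recording(duration_s, rms, min_duration_s=0.3, min_rms=200.0):
    """Keep a recording only if it is long enough and loud enough.

    Both thresholds are assumed values, not those of the published script.
    """
    return duration_s >= min_duration_s and rms >= min_rms
```

Any real use should take the thresholds from the MATLAB script in the repository instead of the placeholders above.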

    For specific detail of the data filtering process, please refer to the script (speech data filtering algorithm in MATLAB) in this repository and the data descriptor paper.

    The dataset in this repository is the processed version. The raw dataset and removed audio files are not included in this repository.



    File list:

    Non-acoustic Speech Dataset.zip

    speech data filtering algorithm.zip

    Readme.txt


  11. Communities and Crime Dataset (Unnormalized Data)

    • kaggle.com
    zip
    Updated Jun 16, 2022
    Cite
    John (2022). Communities and Crime Dataset (Unnormalized Data) [Dataset]. https://www.kaggle.com/datasets/johnp47/communities-and-crime-dataset/versions/1
    Explore at:
    Available download formats: zip (665539 bytes)
    Dataset updated
    Jun 16, 2022
    Authors
    John
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Source:

    Creator: Michael Redmond (redmond '@' lasalle.edu); Computer Science; La Salle University; Philadelphia, PA, 19141, USA -- culled from 1990 US Census, 1995 US FBI Uniform Crime Report, 1990 US Law Enforcement Management and Administrative Statistics Survey, available from ICPSR at U of Michigan. -- Donor: Michael Redmond (redmond '@' lasalle.edu); Computer Science; La Salle University; Philadelphia, PA, 19141, USA -- Date: July 2009

    Data Set Information:

    Many variables are included so that algorithms that select or learn weights for attributes could be tested. However, clearly unrelated attributes were not included; attributes were picked if there was any plausible connection to crime (N=122), plus the attribute to be predicted (Per Capita Violent Crimes). The variables included in the dataset involve the community, such as the percent of the population considered urban, and the median family income, and involving law enforcement, such as per capita number of police officers, and percent of officers assigned to drug units.

    The per capita violent crimes variable was calculated using population and the sum of crime variables considered violent crimes in the United States: murder, rape, robbery, and assault. There was apparently some controversy in some states concerning the counting of rapes. These resulted in missing values for rape, which resulted in incorrect values for per capita violent crime. These cities are not included in the dataset. Many of these omitted communities were from the midwestern USA.

    Data is described below based on original values. All numeric data was normalized into the decimal range 0.00-1.00 using an Unsupervised, equal-interval binning method. Attributes retain their distribution and skew (hence for example the population attribute has a mean value of 0.06 because most communities are small). E.g. An attribute described as 'mean people per household' is actually the normalized (0-1) version of that value.

    The normalization preserves rough ratios of values WITHIN an attribute (e.g. double the value for double the population within the available precision - except for extreme values (all values more than 3 SD above the mean are normalized to 1.00; all values more than 3 SD below the mean are normalized to 0.00)).

    However, the normalization does not preserve relationships between values BETWEEN attributes (e.g. it would not be meaningful to compare the value for whitePerCap with the value for blackPerCap for a community)
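The exact normalization procedure is not spelled out beyond the description above. One possible reading, sketched here in Python as an assumption, is to clip each attribute at 3 standard deviations from its mean, rescale the clipped range to 0-1 within that attribute, and round to two decimals. The function name `normalize_attribute` and the use of the population standard deviation are choices made for this illustration.

```python
import statistics


def normalize_attribute(values, decimals=2):
    """Scale one attribute to 0.00-1.00, clipping values beyond 3 SD of the mean.

    Assumed reading of the description; not the dataset creator's code.
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    # Bounds are the 3 SD limits, but never wider than the observed range.
    lo = max(min(values), mean - 3 * sd)
    hi = min(max(values), mean + 3 * sd)
    if hi == lo:
        return [0.0 for _ in values]
    normalized = []
    for v in values:
        clipped = min(max(v, lo), hi)
        normalized.append(round((clipped - lo) / (hi - lo), decimals))
    return normalized
```

Because each attribute is scaled on its own range, a value of 0.30 in one column is not comparable to 0.30 in another, which is the limitation noted for whitePerCap and blackPerCap.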

    A limitation was that the LEMAS survey was of the police departments with at least 100 officers, plus a random sample of smaller departments. For our purposes, communities not found in both census and crime datasets were omitted. Many communities are missing LEMAS data.

    Attribute Information:

    (125 predictive, 4 non-predictive, 18 potential goal)
    - communityname: Community name - not predictive - for information only (string)
    - state: US state (by 2 letter postal abbreviation) (nominal)
    - countyCode: numeric code for county - not predictive, and many missing values (numeric)
    - communityCode: numeric code for community - not predictive and many missing values (numeric)
    - fold: fold number for non-random 10 fold cross validation, potentially useful for debugging, paired tests - not predictive (numeric - integer)
    - population: population for community (numeric - expected to be integer)
    - householdsize: mean people per household (numeric - decimal)
    - racepctblack: percentage of population that is african american (numeric - decimal)
    - racePctWhite: percentage of population that is caucasian (numeric - decimal)
    - racePctAsian: percentage of population that is of asian heritage (numeric - decimal)
    - racePctHisp: percentage of population that is of hispanic heritage (numeric - decimal)
    - agePct12t21: percentage of population that is 12-21 in age (numeric - decimal)
    - agePct12t29: percentage of population that is 12-29 in age (numeric - decimal)
    - agePct16t24: percentage of population that is 16-24 in age (numeric - decimal)
    - agePct65up: percentage of population that is 65 and over in age (numeric - decimal)
    - numbUrban: number of people living in areas classified as urban (numeric - expected to be integer)
    - pctUrban: percentage of people living in areas classified as urban (numeric - decimal)
    - medIncome: median household income (numeric - may be integer)
    - pctWWage: percentage of households with wage or salary income in 1989 (numeric - decimal)
    - pctWFarmSelf: percentage of households with farm or self employment income in 1989 (numeric - decimal)
    - pctWInvInc: percentage of households with investment / rent income in 1989 (numeric - decimal)
    - pctWSocSec: percentage of households with social security income in 1989 (numeric - decimal)
    - pctWPubAsst: pe...

  12. 2021 Economic Surveys: AB2100NESD01 | Nonemployer Statistics by Demographics...

    • data.census.gov
    Updated Aug 8, 2024
    Cite
    ECN (2024). 2021 Economic Surveys: AB2100NESD01 | Nonemployer Statistics by Demographics series (NES-D): Statistics for Employer and Nonemployer Firms by Industry, Sex, Ethnicity, Race, and Veteran Status for the U.S., States, Metro Areas, and Counties: 2021 (ECNSVY Nonemployer Statistics by Demographics Company Summary) [Dataset]. https://data.census.gov/table/ABSNESD2021.AB2100NESD01?q=Huang+Stephen+D.+Attorney
    Explore at:
    Dataset updated
    Aug 8, 2024
    Dataset provided by
    United States Census Bureau (http://census.gov/)
    Authors
    ECN
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Time period covered
    2021
    Area covered
    United States
    Description

    Release Date: 2024-08-08.

    The Census Bureau has reviewed this data product to ensure appropriate access, use, and disclosure avoidance protection of the confidential source data (Project No. 7504866, Disclosure Review Board (DRB) approval number: 2021 NES-D approval number: CBDRB-FY24-0307; 2022 ABS approval number: CBDRB-FY23-0479).

    Key Table Information:
    Data in this table combines estimates from the Annual Business Survey (employer firms) and the Nonemployer Statistics by Demographics (nonemployer firms).
    Includes U.S. firms with no paid employment or payroll, annual receipts of $1,000 or more ($1 or more in the construction industries) and filing Internal Revenue Service (IRS) tax forms for sole proprietorships (Form 1040, Schedule C), partnerships (Form 1065), or corporations (the Form 1120 series).
    Includes U.S. employer firms estimates of business ownership by sex, ethnicity, race, and veteran status from the 2022 Annual Business Survey (ABS) collection. Data are also obtained from administrative records, the 2017 Economic Census, and other economic surveys.
    Note: For employer data only, the collection year is the year in which the data are collected. A reference year is the year that is referenced in the questions on the survey and in which the statistics are tabulated. For example, the 2022 ABS collection year produces statistics for the 2021 reference year. The "Year" column in the table is the reference year.

    Data Items and Other Identifying Records:
    Data include estimates on:
    - Total number of employer and nonemployer firms
    - Total sales and receipts of employer and nonemployer firms (reported in $1,000s of dollars)
    - Number of nonemployer firms (firms without paid employees)
    - Sales and receipts of nonemployer firms (reported in $1,000s of dollars)
    - Number of employer firms (firms with paid employees)
    - Sales and receipts of employer firms (reported in $1,000s of dollars)
    - Number of employees (during the March 12 pay period)
    - Annual payroll of employer firms (reported in $1,000s of dollars)

    These data are aggregated by the following demographic classifications of firm for:
    - All firms
    - Classifiable (firms classifiable by sex, ethnicity, race, and veteran status)
    - Sex: Female; Male; Equally male/female (50% / 50%)
    - Ethnicity: Hispanic; Equally Hispanic/non-Hispanic (50% / 50%); Non-Hispanic
    - Race: White; Black or African American; American Indian and Alaska Native; Asian; Native Hawaiian and Other Pacific Islander; Minority (Firms classified as any race and ethnicity combination other than non-Hispanic and White); Equally minority/nonminority (50% / 50%); Nonminority (Firms classified as non-Hispanic and White)
    - Veteran Status (defined as having served in any branch of the U.S. Armed Forces): Veteran; Equally veteran/nonveteran (50% / 50%); Nonveteran
    - Unclassifiable (firms not classifiable by sex, ethnicity, race, and veteran status)

    Data Notes:
    - Business ownership is defined as having 51 percent or more of the stock or equity in the business. Data are provided for firms owned equally (50% / 50%) by men and women, by Hispanics and non-Hispanics, by minorities and nonminorities, and by veterans and nonveterans. Firms not classifiable by sex, ethnicity, race, and veteran status are counted and tabulated separately.
    - The detail may not add to the total or subtotal because a Hispanic firm may be of any race; because a firm could be tabulated in more than one racial group; or because the number of nonemployer firm's data are rounded.
    - Nonemployer data do not have standard error or relative standard error columns as these data are from the universe of nonemployer firms, not from a data sample.

    Industry and Geography Coverage:
    The data are shown for the total for all sectors (00) and 2-digit NAICS code levels for:
    - United States
    - States and the District of Columbia
    - Metropolitan Statistical Areas
    - County
    Data are also shown for the 3- and 4-digit NAICS code for:
    - United States
    Nonemployer data are excluded for the following NAICS industries:
    - Crop and Animal Production (NAICS 111 and 112)
    - Rail Transportation (NAICS 482)
    - Postal Service (NAICS 491)
    - Monetary Authorities-Central Bank (NAICS 521)
    - Funds, Trusts, and Other Financial Vehicles (NAICS 525)
    - Management of Companies and Enterprises (NAICS 55)
    - Private Households (NAICS 814)
    - Public Administration (NAICS 92)
    - Industries Not Classified (NAICS 99)
    For more information about NAICS, see NAICS Codes & Understanding Industry Classification Systems. For information about geographies used by economic programs at the Census Bureau, see Economic Census: Economic Geographies.

    Employer Data Footnotes:
    Footnote 660 - Agriculture, forestry...

  13. India HCE: No of Sample Households Reporting Consumption: Chandigarh: Urban:...

    • ceicdata.com
    Cite
    CEICdata.com, India HCE: No of Sample Households Reporting Consumption: Chandigarh: Urban: Non Food: Consumer Services [Dataset]. https://www.ceicdata.com/en/india/household-consumer-expenditure-chandigarh-urban/hce-no-of-sample-households-reporting-consumption-chandigarh-urban-non-food-consumer-services
    Explore at:
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 1, 2005 - Jun 1, 2012
    Area covered
    India
    Variables measured
    Household Income and Expenditure Survey
    Description

    HCE: Number of Sample Households Reporting Consumption: Chandigarh: Urban: Non Food: Consumer Services data was reported at 244.000 Unit in 2012, a decrease from the previous value of 271.000 Unit for 2010. The series is updated yearly, with a median of 271.000 Unit across 3 observations from Jun 2005 to 2012. It reached an all-time high of 278.000 Unit in 2005 and a record low of 244.000 Unit in 2012. The series remains active in CEIC and is reported by the Ministry of Statistics and Programme Implementation. It is categorized under India Premium Database’s Domestic Trade and Household Survey – Table IN.HB029: HCES: Uniform Reference Period (URP): Average Monthly Per Capita Consumption Expenditure (MPCE): by Item Group: Chandigarh: Urban (Discontinued).

  14. India HCE: No of Sample Households Reporting Consumption: Haryana: Urban:...

    • ceicdata.com
    Cite
    CEICdata.com, India HCE: No of Sample Households Reporting Consumption: Haryana: Urban: Non Food: Tobacco [Dataset]. https://www.ceicdata.com/en/india/household-consumer-expenditure-haryana-urban/hce-no-of-sample-households-reporting-consumption-haryana-urban-non-food-tobacco
    Explore at:
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 1, 2005 - Jun 1, 2012
    Area covered
    India
    Variables measured
    Household Income and Expenditure Survey
    Description

    HCE: Number of Sample Households Reporting Consumption: Haryana: Urban: Non Food: Tobacco data was reported at 337.000 Unit in 2012, a decrease from the previous value of 435.000 Unit for 2010. The series is updated yearly, with a median of 435.000 Unit across 3 observations from Jun 2005 to 2012. It reached an all-time high of 488.000 Unit in 2005 and a record low of 337.000 Unit in 2012. The series remains active in CEIC and is reported by the Ministry of Statistics and Programme Implementation. It is categorized under India Premium Database’s Domestic Trade and Household Survey – Table IN.HB041: HCES: Uniform Reference Period (URP): Average Monthly Per Capita Consumption Expenditure (MPCE): by Item Group: Haryana: Urban (Discontinued).

  15. India HCE: No of Sample Households Reporting Consumption: Haryana: Urban:...

    • ceicdata.com
    Cite
    CEICdata.com, India HCE: No of Sample Households Reporting Consumption: Haryana: Urban: Non Food: Clothing [Dataset]. https://www.ceicdata.com/en/india/hces-average-monthly-per-capita-consumption-expenditure-mpce-by-uniform-reference-period-urp-by-item-group-haryana-urban-discontinued/hce-no-of-sample-households-reporting-consumption-haryana-urban-non-food-clothing
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 1, 1994 - Jun 1, 2012
    Area covered
    India
    Variables measured
    Household Income and Expenditure Survey
    Description

    HCE: Number of Sample Households Reporting Consumption: Haryana: Urban: Non Food: Clothing was reported at 716.000 Unit in 2012, an increase from the previous figure of 669.000 Unit for 2010. The series is updated yearly, with a median of 511.500 Unit from Jun 1994 to 2012 across 4 observations. It reached an all-time high of 716.000 Unit in 2012 and a record low of 139.000 Unit in 1994. The series remains active in CEIC and is reported by the Ministry of Statistics and Programme Implementation. It is categorized under India Premium Database’s Domestic Trade and Household Survey – Table IN.HB041: HCES: Uniform Reference Period (URP): Average Monthly Per Capita Consumption Expenditure (MPCE): by Item Group: Haryana: Urban (Discontinued).

  16. I

    India HCE: No of Sample Households Reporting Consumption: Haryana: Urban:...

    • ceicdata.com
    Cite
    CEICdata.com, India HCE: No of Sample Households Reporting Consumption: Haryana: Urban: Non Food: Conveyance [Dataset]. https://www.ceicdata.com/en/india/household-consumer-expenditure-haryana-urban/hce-no-of-sample-households-reporting-consumption-haryana-urban-non-food-conveyance
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 1, 2005 - Jun 1, 2012
    Area covered
    India
    Variables measured
    Household Income and Expenditure Survey
    Description

    HCE: Number of Sample Households Reporting Consumption: Haryana: Urban: Non Food: Conveyance was reported at 1,058.000 Unit in 2012, an increase from the previous figure of 982.000 Unit for 2010. The series is updated yearly, with a median of 982.000 Unit from Jun 2005 to 2012 across 3 observations. It reached an all-time high of 1,058.000 Unit in 2012 and a record low of 717.000 Unit in 2005. The series remains active in CEIC and is reported by the Ministry of Statistics and Programme Implementation. It is categorized under India Premium Database’s Domestic Trade and Household Survey – Table IN.HB041: HCES: Uniform Reference Period (URP): Average Monthly Per Capita Consumption Expenditure (MPCE): by Item Group: Haryana: Urban (Discontinued).

  17. I

    India HCE: No of Sample Households Reporting Consumption: Bihar: Rural: Non...

    • ceicdata.com
    Cite
    CEICdata.com, India HCE: No of Sample Households Reporting Consumption: Bihar: Rural: Non Food: Medical: Non Institutional [Dataset]. https://www.ceicdata.com/en/india/household-consumer-expenditure-bihar-rural/hce-no-of-sample-households-reporting-consumption-bihar-rural-non-food-medical-non-institutional
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 1, 2005 - Jun 1, 2012
    Area covered
    India
    Variables measured
    Household Income and Expenditure Survey
    Description

    HCE: Number of Sample Households Reporting Consumption: Bihar: Rural: Non Food: Medical: Non Institutional was reported at 2,841.000 Unit in 2012, an increase from the previous figure of 2,487.000 Unit for 2010. The series is updated yearly, with a median of 2,731.000 Unit from Jun 2005 to 2012 across 3 observations. It reached an all-time high of 2,841.000 Unit in 2012 and a record low of 2,487.000 Unit in 2010. The series remains active in CEIC and is reported by the Ministry of Statistics and Programme Implementation. It is categorized under India Premium Database’s Domestic Trade and Household Survey – Table IN.HB026: HCES: Uniform Reference Period (URP): Average Monthly Per Capita Consumption Expenditure (MPCE): by Item Group: Bihar: Rural (Discontinued).

  18. I

    India HCE: No of Sample Households Reporting Consumption: Lakshadweep:...

    • ceicdata.com
    Cite
    CEICdata.com, India HCE: No of Sample Households Reporting Consumption: Lakshadweep: Urban: Non Food [Dataset]. https://www.ceicdata.com/en/india/hces-average-monthly-per-capita-consumption-expenditure-mpce-by-uniform-reference-period-urp-by-item-group-lakshadweep-urban-discontinued/hce-no-of-sample-households-reporting-consumption-lakshadweep-urban-non-food
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 1, 2005 - Jun 1, 2012
    Area covered
    India
    Variables measured
    Household Income and Expenditure Survey
    Description

    HCE: Number of Sample Households Reporting Consumption: Lakshadweep: Urban: Non Food was reported at 127.000 Unit in 2012, a decrease from the previous figure of 128.000 Unit for 2010. The series is updated yearly, with a median of 128.000 Unit from Jun 2005 to 2012 across 3 observations. It reached an all-time high of 129.000 Unit in 2005 and a record low of 127.000 Unit in 2012. The series remains active in CEIC and is reported by the Ministry of Statistics and Programme Implementation. It is categorized under India Premium Database’s Domestic Trade and Household Survey – Table IN.HB053: HCES: Uniform Reference Period (URP): Average Monthly Per Capita Consumption Expenditure (MPCE): by Item Group: Lakshadweep: Urban (Discontinued).

  19. I

    India HCE: No of Sample Households Reporting Consumption: Chandigarh: Urban:...

    • ceicdata.com
    Cite
    CEICdata.com, India HCE: No of Sample Households Reporting Consumption: Chandigarh: Urban: Non Food [Dataset]. https://www.ceicdata.com/en/india/household-consumer-expenditure-chandigarh-urban/hce-no-of-sample-households-reporting-consumption-chandigarh-urban-non-food
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 1, 2005 - Jun 1, 2012
    Area covered
    India
    Variables measured
    Household Income and Expenditure Survey
    Description

    HCE: Number of Sample Households Reporting Consumption: Chandigarh: Urban: Non Food was reported at 248.000 Unit in 2012, a decrease from the previous figure of 273.000 Unit for 2010. The series is updated yearly, with a median of 273.000 Unit from Jun 2005 to 2012 across 3 observations. It reached an all-time high of 300.000 Unit in 2005 and a record low of 248.000 Unit in 2012. The series remains active in CEIC and is reported by the Ministry of Statistics and Programme Implementation. It is categorized under India Premium Database’s Domestic Trade and Household Survey – Table IN.HB029: HCES: Uniform Reference Period (URP): Average Monthly Per Capita Consumption Expenditure (MPCE): by Item Group: Chandigarh: Urban (Discontinued).

  20. I

    India HCE: No of Sample Households Reporting Consumption: Delhi: Urban: Non...

    • ceicdata.com
    Updated Mar 15, 2023
    Cite
    CEICdata.com (2023). India HCE: No of Sample Households Reporting Consumption: Delhi: Urban: Non Food [Dataset]. https://www.ceicdata.com/en/india/hces-average-monthly-per-capita-consumption-expenditure-mpce-by-uniform-reference-period-urp-by-item-group-nct-of-delhi-urban-discontinued/hce-no-of-sample-households-reporting-consumption-delhi-urban-non-food
    Dataset updated
    Mar 15, 2023
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 1, 2005 - Jun 1, 2012
    Area covered
    India
    Variables measured
    Household Income and Expenditure Survey
    Description

    HCE: Number of Sample Households Reporting Consumption: Delhi: Urban: Non Food was reported at 887.000 Unit in 2012, an increase from the previous figure of 842.000 Unit for 2010. The series is updated yearly, with a median of 887.000 Unit from Jun 2005 to 2012 across 3 observations. It reached an all-time high of 1,101.000 Unit in 2005 and a record low of 842.000 Unit in 2010. The series remains active in CEIC and is reported by the Ministry of Statistics and Programme Implementation. It is categorized under India Premium Database’s Domestic Trade and Household Survey – Table IN.HB067: HCES: Uniform Reference Period (URP): Average Monthly Per Capita Consumption Expenditure (MPCE): by Item Group: NCT of Delhi: Urban (Discontinued).
