100+ datasets found
  1. Simulation Data Set

    • catalog.data.gov
    • s.cnmilf.com
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Simulation Data Set [Dataset]. https://catalog.data.gov/dataset/simulation-data-set
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures on each week by subtracting the median exposure amount for that week and dividing by the interquartile range (IQR), as in the actual application to the true NC birth records data. The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way, while the medians and IQRs are not given. This further protects against identification of the spatial locations used in the analysis.

    This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed.

    It can be accessed through the following means: File format: R workspace file; “Simulated_Dataset.RData”.

    Metadata (including data dictionary)

    • y: Vector of binary responses (1: adverse outcome, 0: control)
    • x: Matrix of covariates; one row for each simulated individual
    • z: Matrix of standardized pollution exposures
    • n: Number of simulated individuals
    • m: Number of exposure time periods (e.g., weeks of pregnancy)
    • p: Number of columns in the covariate design matrix
    • alpha_true: Vector of “true” critical window locations/magnitudes (i.e., the ground truth that we want to estimate)

    Code

    Abstract

    We provide R statistical software code (“CWVS_LMC.txt”) to fit the linear model of coregionalization (LMC) version of the Critical Window Variable Selection (CWVS) method developed in the manuscript. We also provide R code (“Results_Summary.txt”) to summarize/plot the estimated critical windows and posterior marginal inclusion probabilities.

    Description

    “CWVS_LMC.txt”: This code is delivered to the user in the form of a .txt file that contains R statistical software code. Once the “Simulated_Dataset.RData” workspace has been loaded into R, the text in the file can be used to identify/estimate critical windows of susceptibility and posterior marginal inclusion probabilities.

    “Results_Summary.txt”: This code is also delivered to the user in the form of a .txt file that contains R statistical software code. Once the “CWVS_LMC.txt” code is applied to the simulated dataset and the program has completed, this code can be used to summarize and plot the identified/estimated critical windows and posterior marginal inclusion probabilities (similar to the plots shown in the manuscript).

    Optional Information (complete as necessary)

    Required R packages for running “CWVS_LMC.txt”:

    • msm: Sampling from the truncated normal distribution
    • mnormt: Sampling from the multivariate normal distribution
    • BayesLogit: Sampling from the Polya-Gamma distribution

    Required R packages for running “Results_Summary.txt”:

    • plotrix: Plotting the posterior means and credible intervals

    Instructions for Use

    Reproducibility (Mandatory)

    What can be reproduced: The data and code can be used to identify/estimate critical windows from one of the actual simulated datasets generated under setting E4 from the presented simulation study.

    How to use the information:

    • Load the “Simulated_Dataset.RData” workspace
    • Run the code contained in “CWVS_LMC.txt”
    • Once the “CWVS_LMC.txt” code is complete, run “Results_Summary.txt”

    Format: Below is the replication procedure for the attached data set for the portion of the analyses using a simulated data set:

    Data

    The data used in the application section of the manuscript consist of geocoded birth records from the North Carolina State Center for Health Statistics, 2005-2008. In the simulation study section of the manuscript, we simulate synthetic data that closely match some of the key features of the birth certificate data while maintaining confidentiality of any actual pregnant women.

    Availability

    Due to the highly sensitive and identifying information contained in the birth certificate data (including latitude/longitude and address of residence at delivery), we are unable to make the data from the application section publicly available. However, we will make one of the simulated datasets available for any reader interested in applying the method to realistic simulated birth records data. This will also allow the user to become familiar with the required inputs of the model, how the data should be structured, and what type of output is obtained. While we cannot provide the application data here, access to the North Carolina birth records can be requested through the North Carolina State Center for Health Statistics, and requires an appropriate data use agreement.

    Description

    Permissions: These are simulated data without any identifying information or informative birth-level covariates. The weekly average pregnancy exposures are provided already standardized by the weekly median and IQR, and the medians and IQRs themselves are not given, which further protects against identification of the spatial locations used in the analysis.

    This dataset is associated with the following publication: Warren, J., W. Kong, T. Luben, and H. Chang. Critical Window Variable Selection: Estimating the Impact of Air Pollution on Very Preterm Birth. Biostatistics. Oxford University Press, OXFORD, UK, 1-30, (2019).
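    The weekly standardization described for this dataset (subtract the week's median, divide by the week's IQR) can be sketched as follows. This is only an illustration in Python, not the authors' R code: the function name standardize_weekly, the list-of-rows layout, and the "inclusive" quartile method are all assumptions, and the released file already contains standardized exposures.

```python
from statistics import median, quantiles


def standardize_weekly(z):
    """Standardize each week (column) of z by that week's median and IQR.

    z is a list of rows, one row per individual, one value per week.
    Assumption: quartiles use the "inclusive" method; the source does not
    say which quartile definition was used.
    """
    n_weeks = len(z[0])
    out = [[0.0] * n_weeks for _ in z]
    for j in range(n_weeks):
        col = [row[j] for row in z]
        med = median(col)
        q1, _, q3 = quantiles(col, n=4, method="inclusive")
        iqr = q3 - q1  # a zero IQR would need special handling
        for i, value in enumerate(col):
            out[i][j] = (value - med) / iqr
    return out


# Hypothetical exposures for 5 individuals over 2 weeks.
raw = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
standardized = standardize_weekly(raw)
```

    Because the medians and IQRs are withheld, the transformation cannot be inverted from the released values alone, which is the privacy property the description relies on.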

  2. Falcon distribution data

    • dune.com
    Updated Sep 14, 2025
    + more versions
    Cite
    mrnobody (2025). Falcon distribution data [Dataset]. https://dune.com/discover/content/relevant?resource-type=queries&q=code%3A%22falcon_ethereum.stakedusdf_evt_transfer%22
    Dataset updated
    Sep 14, 2025
    Authors
    mrnobody
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Blockchain data query: Falcon distribution data

  3. Figure 2 Fitted slope of E3C divided by E2C data distribution

    • hepdata.net
    Updated Apr 8, 2024
    Cite
    (2024). Figure 2 Fitted slope of E3C divided by E2C data distribution [Dataset]. http://doi.org/10.17182/hepdata.147275.v1/t25
    Dataset updated
    Apr 8, 2024
    Description

    The fitted slopes of the E3C/E2C data distributions as a function of jet pt are used to illustrate the dependency...

  4. Washington Grocery Store Locations

    • kaggle.com
    zip
    Updated Dec 11, 2024
    Cite
    Malik Muhammad Ahmed (2024). Washington Grocery Store Locations [Dataset]. https://www.kaggle.com/datasets/malikmuhammadahmed/grocery-store-locations
    Available download formats: zip (9395 bytes)
    Dataset updated
    Dec 11, 2024
    Authors
    Malik Muhammad Ahmed
    License

    https://www.usa.gov/government-works/

    Area covered
    Washington
    Description

    Dataset Description: Washington Grocery Store Locations

    This dataset contains detailed information about the locations and operational status of grocery stores in Washington, spanning multiple years. It includes both spatial and temporal data, offering a comprehensive view of how grocery stores are distributed and have evolved over time. Below is a breakdown of the columns included in the dataset:

    1. X, Y: Geographic coordinates of the store's location (by the usual GIS convention, X is longitude and Y is latitude).

    2. STORENAME: The name of the grocery store.

    3. ADDRESS: The physical address of the grocery store.

    4. ZIPCODE: The ZIP code of the store’s location.

    5. PHONE: The contact phone number for the store.

    6. WARD: The local government ward in which the store is located.

    7. SSL: A unique identifier or code related to the store, possibly referring to specific data collection attributes.

    8. NOTES: Additional comments or information about the store.

    9. PRESENT: Temporal indicators showing the presence (likely open or closed) of each store across various years. These columns provide insights into the longevity and temporal trends of grocery store operations.

    10. GIS_ID: A unique identifier for geographic information system (GIS) data.

    11. XCOORD, YCOORD: Coordinates (likely more specific) used for spatial data analysis, providing the exact location of the store.

    12. MAR_ID: A unique identifier for marketing or regional analysis purposes.

    13. GLOBALID: A global unique identifier for the store data.

    14. CREATOR: The individual or system that created the data entry.

    15. CREATED: Timestamp showing when the data entry was created.

    16. EDITOR: The individual or system that edited the data entry.

    17. EDITED: Timestamp showing when the data entry was last edited.

    18. SE_ANNO_CAD_DATA: Specific annotation or data related to CAD (computer-aided design), possibly linked to store location details.

    19. OBJECTID: A unique identifier for the object or record within the dataset.

    Insights We Can Extract:

    • Geographic Distribution: By analyzing the X and Y coordinates along with ZIP codes and wards, we can identify where grocery stores are concentrated and map areas with high or low store density.
    • Temporal Trends: The data in the "PRESENT" columns helps us track the opening and closure patterns of grocery stores over time, providing insights into market trends and store longevity.
    • Service Gaps: We can identify areas with no grocery stores, possibly indicating food deserts or underserved communities, by mapping the stores and comparing coverage across ZIP codes and wards.
    • Operational Trends: By analyzing the temporal data and comparing store turnover, we can uncover patterns in the longevity or turnover of grocery stores.
    • Urban Planning and Accessibility: This dataset could help us assess whether the location of grocery stores aligns with urban infrastructure like transportation routes or population density, which could inform policy decisions to improve grocery access.

    This dataset is invaluable for urban planners, policymakers, and business stakeholders looking to improve food access and urban infrastructure.
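    The geographic-distribution insight above amounts to counting stores per area. A minimal sketch in Python, assuming the data is available as CSV text with the STORENAME and WARD columns listed above; the sample rows and the helper name stores_per_ward are hypothetical and are not taken from the dataset:

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows; only the column names come from the description.
SAMPLE_CSV = """STORENAME,WARD
Store A,Ward A
Store B,Ward B
Store C,Ward A
"""


def stores_per_ward(csv_text):
    """Count how many store records fall in each ward."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["WARD"] for row in reader)


counts = stores_per_ward(SAMPLE_CSV)
```

    Wards that appear in a reference list of wards but not in the resulting counts would be candidates for the service gaps mentioned above.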

  5. Daisy Distribution Importer/Buyer Data in USA, Daisy Distribution Imports...

    • seair.co.in
    Updated Apr 15, 2025
    Cite
    Seair Exim Solutions (2025). Daisy Distribution Importer/Buyer Data in USA, Daisy Distribution Imports Data [Dataset]. https://www.seair.co.in/us-import/i-daisy-distribution.aspx
    Available download formats: .text, .csv, .xml, .xls, .bin
    Dataset updated
    Apr 15, 2025
    Dataset authored and provided by
    Seair Exim Solutions
    Area covered
    United States
    Description

    Find details of Daisy Distribution buyer/importer data in the US (United States), including product description, price, shipment date, quantity, imported products list, major US port names, and overseas supplier/exporter names, at seair.co.in.

  6. Species abundance and distribution data (Excel spreadsheets) from the Long...

    • marine-geo.org
    Updated Feb 12, 2025
    Cite
    MGDS > Marine Geoscience Data System (2025). Species abundance and distribution data (Excel spreadsheets) from the Long Island Sound Phase IIB area (1975-2023) [Dataset]. https://www.marine-geo.org/tools/files/32487
    Dataset updated
    Feb 12, 2025
    Dataset authored and provided by
    MGDS > Marine Geoscience Data System
    License

    Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0): https://creativecommons.org/licenses/by-nc-sa/3.0/
    License information was derived automatically

    Area covered
    Long Island Sound
    Description

    These data files provide species abundance and distribution data from analysis of diver and ROV images for the eastern Long Island Sound Phase IIB study area. The files are in Excel spreadsheet format. Funding was provided by the Long Island Sound Cable Fund Seafloor Habitat Mapping Initiative administered cooperatively by the EPA Long Island Sound Study and the Connecticut Department of Energy and Environmental Protection (DEEP).

  7. Ocean View, DE Population Breakdown by Gender and Age Dataset: Male and...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    + more versions
    Cite
    Neilsberg Research (2025). Ocean View, DE Population Breakdown by Gender and Age Dataset: Male and Female Population Distribution Across 18 Age Groups // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/e1f64851-f25d-11ef-8c1b-3860777c1fe6/
    Available download formats: json, csv
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Ocean View, Delaware
    Variables measured
    Male and Female Population Under 5 Years, Male and Female Population over 85 years, Male and Female Population Between 5 and 9 years, Male and Female Population Between 10 and 14 years, Male and Female Population Between 15 and 19 years, Male and Female Population Between 20 and 24 years, Male and Female Population Between 25 and 29 years, Male and Female Population Between 30 and 34 years, Male and Female Population Between 35 and 39 years, Male and Female Population Between 40 and 44 years, and 8 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the three variables, namely (a) Population (Male), (b) Population (Female), and (c) Gender Ratio (Males per 100 Females), we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau across 18 age groups, ranging from under 5 years to 85 years and above. These age groups are described above in the variables section. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Ocean View by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Ocean View. The dataset can be used to understand the population distribution of Ocean View by gender and age. For example, using this dataset, we can identify the largest age group for both men and women in Ocean View. Additionally, it can be used to see how the gender ratio changes from birth to the most senior age group, and to compare the male-to-female ratio across age groups for Ocean View.

    Key observations

    Largest age group (population): Male # 75-79 years (253) | Female # 75-79 years (268). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Scope of gender :

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.

    Variables / Data Columns

    • Age Group: The age group for the Ocean View population analysis. There are 18 expected values, defined above in the age groups section.
    • Population (Male): The male population of Ocean View in each age group.
    • Population (Female): The female population of Ocean View in each age group.
    • Gender Ratio: Also known as the sex ratio, this column displays the number of males per 100 females in Ocean View for each age group.
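    As a worked example of the Gender Ratio column, the sketch below applies the "males per 100 females" definition to the 75-79 years counts quoted under Key observations (253 males, 268 females). The function name gender_ratio and the choice to return None when there are no females are assumptions made for illustration; the source does not say how that case is handled.

```python
def gender_ratio(males, females):
    """Return the number of males per 100 females.

    Assumption: returns None when the female count is zero, since the
    ratio is undefined in that case.
    """
    if females == 0:
        return None
    return 100.0 * males / females


# Counts for the 75-79 years group, taken from the Key observations above.
ratio_75_79 = gender_ratio(253, 268)
print(round(ratio_75_79, 1))  # prints 94.4
```

    A value below 100 means females outnumber males in that age group.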

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Ocean View Population by Gender.

  8. Lake View, AL Population Breakdown by Gender and Age Dataset: Male and...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    + more versions
    Cite
    Neilsberg Research (2025). Lake View, AL Population Breakdown by Gender and Age Dataset: Male and Female Population Distribution Across 18 Age Groups // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/e1eb3b45-f25d-11ef-8c1b-3860777c1fe6/
    Available download formats: csv, json
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Alabama, Lake View
    Variables measured
    Male and Female Population Under 5 Years, Male and Female Population over 85 years, Male and Female Population Between 5 and 9 years, Male and Female Population Between 10 and 14 years, Male and Female Population Between 15 and 19 years, Male and Female Population Between 20 and 24 years, Male and Female Population Between 25 and 29 years, Male and Female Population Between 30 and 34 years, Male and Female Population Between 35 and 39 years, Male and Female Population Between 40 and 44 years, and 8 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the three variables, namely (a) Population (Male), (b) Population (Female), and (c) Gender Ratio (Males per 100 Females), we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau across 18 age groups, ranging from under 5 years to 85 years and above. These age groups are described above in the variables section. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Lake View by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Lake View. The dataset can be used to understand the population distribution of Lake View by gender and age. For example, using this dataset, we can identify the largest age group for both men and women in Lake View. Additionally, it can be used to see how the gender ratio changes from birth to the most senior age group, and to compare the male-to-female ratio across age groups for Lake View.

    Key observations

    Largest age group (population): Male # 30-34 years (252) | Female # 35-39 years (433). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Scope of gender :

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.

    Variables / Data Columns

    • Age Group: The age group for the Lake View population analysis. There are 18 expected values, defined above in the age groups section.
    • Population (Male): The male population of Lake View in each age group.
    • Population (Female): The female population of Lake View in each age group.
    • Gender Ratio: Also known as the sex ratio, this column displays the number of males per 100 females in Lake View for each age group.
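    Finding the largest age group for each sex, as suggested in the Context section, is a maximum over one of the population columns. The sketch below assumes the rows have been loaded as dictionaries keyed by the column names listed above; the counts are hypothetical placeholders rather than values from the dataset, and largest_age_group is an illustrative helper name.

```python
# Hypothetical rows: only the column names come from the description above.
rows = [
    {"Age Group": "Under 5 years", "Population (Male)": 10, "Population (Female)": 12},
    {"Age Group": "5 to 9 years", "Population (Male)": 15, "Population (Female)": 9},
    {"Age Group": "10 to 14 years", "Population (Male)": 11, "Population (Female)": 14},
]


def largest_age_group(rows, column):
    """Return (age group, count) for the row with the largest value in column."""
    best = max(rows, key=lambda row: row[column])
    return best["Age Group"], best[column]


largest_male = largest_age_group(rows, "Population (Male)")
largest_female = largest_age_group(rows, "Population (Female)")
```

    With ties, max returns the first matching row, so the result then depends on row order.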

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Lake View Population by Gender.

  9. Dollar Tree Distribution Center Importer/Buyer Data in USA, Dollar Tree...

    • seair.co.in
    Updated Apr 19, 2025
    + more versions
    Cite
    Seair Exim Solutions (2025). Dollar Tree Distribution Center Importer/Buyer Data in USA, Dollar Tree Distribution Center Imports Data [Dataset]. https://www.seair.co.in/us-import/i-dollar-tree-distribution-center.aspx
    Available download formats: .text, .csv, .xml, .xls, .bin
    Dataset updated
    Apr 19, 2025
    Dataset authored and provided by
    Seair Exim Solutions
    Area covered
    United States
    Description

    Find details of Dollar Tree Distribution Center buyer/importer data in the US (United States), including product description, price, shipment date, quantity, imported products list, major US port names, and overseas supplier/exporter names, at seair.co.in.

  10. Data from: Lake View Elementary School

    • publicschoolreview.com
    json, xml
    Updated Jul 17, 2024
    + more versions
    Cite
    Public School Review (2024). Lake View Elementary School [Dataset]. https://www.publicschoolreview.com/lake-view-elementary-school-profile/72342
    Available download formats: xml, json
    Dataset updated
    Jul 17, 2024
    Dataset authored and provided by
    Public School Review
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 2025 - Dec 31, 2025
    Description

    The historical dataset of Lake View Elementary School is provided by PublicSchoolReview and contains statistics on the following metrics: Distribution of Students By Grade Trends

  11. Rb Distribution Inc Importer/Buyer Data in USA, Rb Distribution Inc Imports...

    • seair.co.in
    Updated May 31, 2025
    + more versions
    Cite
    Seair Exim Solutions (2025). Rb Distribution Inc Importer/Buyer Data in USA, Rb Distribution Inc Imports Data [Dataset]. https://www.seair.co.in/us-import/i-rb-distribution-inc.aspx
    Available download formats: .text, .csv, .xml, .xls, .bin
    Dataset updated
    May 31, 2025
    Dataset authored and provided by
    Seair Exim Solutions
    Area covered
    United States
    Description

    Find details of Rb Distribution Inc buyer/importer data in the US (United States), including product description, price, shipment date, quantity, imported products list, major US port names, and overseas supplier/exporter names, at seair.co.in.

  12. Box-Cox quantile regression and the distribution of firm sizes (replication...

    • resodate.org
    Updated Oct 2, 2025
    Cite
    José A. F. Machado (2025). Box-Cox quantile regression and the distribution of firm sizes (replication data) [Dataset]. https://resodate.org/resources/aHR0cHM6Ly9qb3VybmFsZGF0YS56YncuZXUvZGF0YXNldC9ib3hjb3gtcXVhbnRpbGUtcmVncmVzc2lvbi1hbmQtdGhlLWRpc3RyaWJ1dGlvbi1vZi1maXJtLXNpemVz
    Dataset updated
    Oct 2, 2025
    Dataset provided by
    Journal of Applied Econometrics
    ZBW Journal Data Archive
    ZBW
    Authors
    José A. F. Machado
    Description

    Using the Box-Cox quantile regression model, we analyse the size distribution of firms in Portuguese manufacturing during the 1980s. Specifically, we estimate the effect of selected industry attributes on the location, scale, skewness and kurtosis of the conditional size distributions of firms. We find that industry attributes affect the size of firms in the same direction across the distribution, but the effects of these variables are typically much greater at the largest quantiles. Over time the distribution shifted towards smaller firms, due mainly to the way the economy responds to industry characteristics rather than to changes of the level of these characteristics. The prediction of lognormality, implied by Gibrat's Law, is soundly rejected by the observed distribution of firm sizes. However, we found that, at least in 1983, lognormality is a reasonable description of the conditional size distribution.

  13. Bay View, OH Population Breakdown by Gender and Age

    • neilsberg.com
    csv, json
    Updated Sep 14, 2023
    + more versions
    Cite
    Neilsberg Research (2023). Bay View, OH Population Breakdown by Gender and Age [Dataset]. https://www.neilsberg.com/research/datasets/660d3044-3d85-11ee-9abe-0aa64bf2eeb2/
    Available download formats: csv, json
    Dataset updated
    Sep 14, 2023
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Bay View
    Variables measured
    Male and Female Population Under 5 Years, Male and Female Population over 85 years, Male and Female Population Between 5 and 9 years, Male and Female Population Between 10 and 14 years, Male and Female Population Between 15 and 19 years, Male and Female Population Between 20 and 24 years, Male and Female Population Between 25 and 29 years, Male and Female Population Between 30 and 34 years, Male and Female Population Between 35 and 39 years, Male and Female Population Between 40 and 44 years, and 8 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. To measure the three variables, namely (a) Population (Male), (b) Population (Female), and (c) Gender Ratio (Males per 100 Females), we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau across 18 age groups, ranging from under 5 years to 85 years and above. These age groups are described above in the variables section. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Bay View by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Bay View. The dataset can be used to understand the population distribution of Bay View by gender and age. For example, using this dataset, we can identify the largest age group for both men and women in Bay View. Additionally, it can be used to see how the gender ratio changes from birth to the most senior age group, and to compare the male-to-female ratio across age groups for Bay View.

    Key observations

    Largest age group (population): Male # 15-19 years (67) | Female # 55-59 years (73). Source: U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Scope of gender :

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.

    Variables / Data Columns

    • Age Group: This column displays the age group for the Bay View population analysis. There are 18 expected values, defined above in the age groups section.
    • Population (Male): This column shows the male population in Bay View.
    • Population (Female): This column shows the female population in Bay View.
    • Gender Ratio: Also known as the sex ratio, this column displays the number of males per 100 females in Bay View for each age group.
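
As an illustration of the Gender Ratio column defined above (males per 100 females), here is a minimal Python sketch. The function name `gender_ratio` and the sample counts are hypothetical and are not taken from the dataset; returning `None` for a zero female count is an assumption of this sketch, not documented Neilsberg behavior.

```python
def gender_ratio(males, females):
    """Return the number of males per 100 females.

    Returns None when the female count is zero, because the ratio is
    undefined in that case (an assumption made for this sketch).
    """
    if females == 0:
        return None
    return 100.0 * males / females


# Hypothetical counts for two age groups (not real Bay View figures).
sample_rows = [
    {"age_group": "Under 5 years", "male": 50, "female": 40},
    {"age_group": "85 years and over", "male": 10, "female": 25},
]

for row in sample_rows:
    ratio = gender_ratio(row["male"], row["female"])
    print(row["age_group"], ratio)
```

A ratio above 100 means more males than females in that age group, and a ratio below 100 means the reverse.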

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Bay View Population by Gender. You can refer to the same here

  14. Figure 06a - Post-fit nonresDNN score distribution

    • hepdata.net
    Updated Jun 16, 2025
    Cite
    (2025). Figure 06a - Post-fit nonresDNN score distribution [Dataset]. http://doi.org/10.17182/hepdata.157024.v1/t5
    Explore at:
    Dataset updated
    Jun 16, 2025
    Description

    Distributions of scores for nonresDNN in 6b data and the background prediction after background-only fits to the observed data. The...

  15. 2022 - 2024 NTD Annual Data - Vehicles (Age Distribution)

    • data.virginia.gov
    • data.transportation.gov
    csv, json, rdf, xsl
    Updated Oct 17, 2025
    Cite
    U.S Department of Transportation (2025). 2022 - 2024 NTD Annual Data - Vehicles (Age Distribution) [Dataset]. https://data.virginia.gov/dataset/2022-2024-ntd-annual-data-vehicles-age-distribution
    Explore at:
    Available download formats: csv, xsl, json, rdf
    Dataset updated
    Oct 17, 2025
    Dataset provided by
    Federal Transit Administration
    Authors
    U.S Department of Transportation
    Description

    This dataset details vehicle types and ages for each transit agency reporting to the NTD in the 2022, 2023, and 2024 report years. Non-dedicated fleets do not report Year of Manufacture and are thus excluded from the Age Distribution table.

    Agencies do not report Useful Life Benchmark for non-dedicated fleets or fleets for which the agency does not have capital replacement responsibility. These fleets are excluded from calculations of the percentage of vehicles meeting or exceeding their useful life.

    In versions of the data tables from before 2014, you can find data on vehicles in the file called "Age Distribution of Active Vehicle Inventory."

    In years 2014-2021, you can find this data in the "Vehicles" data table on NTD Program website, at https://transit.dot.gov/ntd/ntd-data.

    If you have any other questions about this table, please contact the NTD Help Desk at NTDHelp@dot.gov.

  16. Wikipedia Movies Data

    • kaggle.com
    zip
    Updated Jan 17, 2023
    Cite
    The Devastator (2023). Wikipedia Movies Data [Dataset]. https://www.kaggle.com/datasets/thedevastator/wikipedia-movie-data-from-1970-2018/discussion
    Explore at:
    Available download formats: zip (982030 bytes)
    Dataset updated
    Jan 17, 2023
    Authors
    The Devastator
    Description

    Wikipedia Movie Data

    Exploring Production and Distribution Trends Across Four Decades

    By Michael Tauberg [source]

    About this dataset

    This comprehensive dataset spans a substantial sampling of movies from the last five decades, giving insight into the financial and creative successes of Hollywood film productions. Containing various production details such as director, actors, editing team, budget, and overall gross revenue, it can be used to understand how different elements come together to make a movie successful. With information covering all aspects of movie-making – from country of origin to soundtrack composer – this collection offers an unparalleled opportunity for a data-driven dive into the world of cinematic storytelling

    More Datasets

    For more datasets, click here.

    Featured Notebooks

    • 🚨 Your notebook can be here! 🚨!

    How to use the dataset

    The columns are the key to analyzing the data in depth. They range from general information such as the year, name, and language of a movie to more specific details such as the directors and editors on its production team. A good first step is to understand what kind of data exists and to get familiar with the different columns.

    Good luck exploring!

    Research Ideas

    • Analyzing the correlations between budget, gross revenue, and number of awards or nominations won by a movie. Movie-makers and studios can use this data to understand what factors have an impact on the success of a movie and make better creative decisions accordingly.
    • Studying the trend of movies from different countries over time to understand how popular genres are changing over time across regions and countries; this data could be used by international film producers to identify potential opportunities for co-productions with other countries or regions.
    • Identifying unique topics for films (based on writers, directors, music, etc.) that hadn’t been explored in previous decades. Studios can use this data to find unique stories or ideas for new films, which often succeed commercially because of their novelty with audiences.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: Dataset copyright by authors

    You are free to:
    • Share - copy and redistribute the material in any medium or format for any purpose, even commercially.
    • Adapt - remix, transform, and build upon the material for any purpose, even commercially.

    You must:
    • Give appropriate credit - Provide a link to the license, and indicate if changes were made.
    • ShareAlike - You must distribute your contributions under the same license as the original.
    • Keep intact - all notices that refer to this license, including copyright notices.

    Columns

    File: movies_1970_2018.csv

    | Column name    | Description                                               |
    |:---------------|:----------------------------------------------------------|
    | year           | Year the movie was released. (Integer)                    |
    | wiki_ref       | Reference to the Wikipedia page for the movie. (String)   |
    | wiki_query     | Query used to search for the movie on Wikipedia. (String) |
    | producer       | Name of the producer of the movie. (String)               |
    | distributor    | Name of the distributor of the movie. (String)            |
    | name           | Name of the movie. (String)                               |
    | country        | Country of origin of the movie. (String)                  |
    | director       | Name of the director of the movie. (String)               |
    | cinematography | Name of the cinematographer of the movie. (String)        |
    | editing        | Name of the editor of the movie. (String)                 |
    | studio         | Name of the studio that produced the movie. (String)      |
    | budget         | Budget of the movie. (Integer)                            |
    | gross          | Gross box office receipts of the movie. (Integer)         |
    | runtime        | Length of the movie in minutes. (Integer)                 |
    | music          | Name of the composer of the movie's soundtrack. (String)  |
    | writer         | Name of the writer of the movie. (String)                 |
    | starring       | Names of the actors in the movie. (String)                |
    | language       | Language of the movie. (String)                           |
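
As a rough illustration of how the `budget` and `gross` columns listed above might be combined, the following Python sketch computes a gross-to-budget ratio per movie using only the standard library. The inline `SAMPLE` rows are invented for the example and are not taken from movies_1970_2018.csv; the function name `gross_to_budget` and the choice to skip rows with missing or non-numeric values are assumptions of this sketch.

```python
import csv
import io

# Hypothetical rows that use a subset of the documented column names.
SAMPLE = """year,name,budget,gross
1975,Example A,1000000,5000000
1999,Example B,,2000000
2010,Example C,4000000,2000000
"""


def gross_to_budget(csv_text):
    """Map each movie name to gross / budget, skipping unusable rows."""
    ratios = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            budget = int(row["budget"])
            gross = int(row["gross"])
        except (KeyError, TypeError, ValueError):
            # Missing or non-numeric budget/gross: leave the movie out.
            continue
        if budget > 0:
            ratios[row["name"]] = gross / budget
    return ratios


print(gross_to_budget(SAMPLE))
```

With the real file, the same function could be given the file's contents (for example via `open(path, encoding="utf-8").read()`), assuming the budget and gross fields are stored as plain integers as the column table states.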

  17. OHSEM Open Distribution Point View

    • arc-gis-hub-home-arcgishub.hub.arcgis.com
    • data-moco.opendata.arcgis.com
    • +1more
    Updated Jul 11, 2024
    Cite
    Montgomery County, Texas IT-GIS (2024). OHSEM Open Distribution Point View [Dataset]. https://arc-gis-hub-home-arcgishub.hub.arcgis.com/datasets/c1f5e768193d4501aa2703d71c659e01
    Explore at:
    Dataset updated
    Jul 11, 2024
    Dataset authored and provided by
    Montgomery County, Texas IT-GIS
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Area covered
    Description

    This filtered view displays only the Points of Distribution (PODs) that are currently open in Montgomery County. PODs are critical locations where essential supplies such as water, tarps, meals (MREs), blankets, and more are distributed to residents during emergencies. The view is accessible to the public and serves to provide real-time information about active POD locations during crises.

  18. Data from: Distribution and status of five non-native fish species in the...

    • catalog.data.gov
    • search.dataone.org
    • +1more
    Updated Nov 20, 2025
    Cite
    U.S. Geological Survey (2025). Distribution and status of five non-native fish species in the Tampa Bay drainage (USA), a hot spot for fish introductions-Data [Dataset]. https://catalog.data.gov/dataset/distribution-and-status-of-five-non-native-fish-species-in-the-tampa-bay-drainage-usa-a-ho
    Explore at:
    Dataset updated
    Nov 20, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    United States
    Description

    This dataset provides supporting information for the species distribution data used in the associated manuscript. Collections of five non-native fish species were made by a number of institutions, and several capture techniques were used. This dataset also includes number of individuals of each species captured at each locality.

  19. Data from: The Cost of Stochastic Resetting

    • find.data.gov.scot
    • dtechtive.com
    txt, zip
    Updated Apr 19, 2023
    Cite
    University of Edinburgh. School of Physics and Astronomy. Institute for Condensed Matter and Complex Systems (2023). The Cost of Stochastic Resetting [Dataset]. http://doi.org/10.7488/ds/3839
    Explore at:
    Available download formats: txt (0.0011 MB), txt (0.0166 MB), zip (9.699 MB), zip (0.0043 MB)
    Dataset updated
    Apr 19, 2023
    Dataset provided by
    University of Edinburgh. School of Physics and Astronomy. Institute for Condensed Matter and Complex Systems
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Resetting a stochastic process has been shown to expedite the completion time of some complex task, such as finding a target for the first time. Here we consider the cost of resetting by associating a cost to each reset, which is a function of the distance travelled during the reset event. We compute the Laplace transform of the joint probability of first passage time $t_f$, number of resets $N$ and resetting cost $C$, and use this to study the statistics of the total cost. We show that in the limit of zero resetting rate the mean cost is finite for a linear cost function, vanishes for a sub-linear cost function and diverges for a super-linear cost function. This result contrasts with the case of no resetting where the cost is always zero. For the case of an exponentially increasing cost function we show that the mean cost diverges at a finite resetting rate. We explain this by showing that the distribution of the cost has a power-law tail with continuously varying exponent that depends on the resetting rate. The dataset is related to the upcoming paper John C. Sunil, Richard A. Blythe, Martin R. Evans and Satya N. Majumdar (in submission), 'The Cost of Stochastic Resetting'.

  20. Data from: Winter Steelhead Distribution [ds340]

    • catalog.data.gov
    • data.cnra.ca.gov
    • +6more
    Updated Jul 24, 2025
    Cite
    California Department of Fish and Wildlife (2025). Winter Steelhead Distribution [ds340] [Dataset]. https://catalog.data.gov/dataset/winter-steelhead-distribution-ds340-cc0ea
    Explore at:
    Dataset updated
    Jul 24, 2025
    Dataset provided by
    California Department of Fish and Wildlife
    Description

    Winter Steelhead Distribution June 2012 Version This dataset depicts observation-based stream-level geographic distribution of anadromous winter-run steelhead trout, Oncorhynchus mykiss irideus (O. mykiss), in California. It was developed for the express purpose of assisting with steelhead recovery planning efforts. The distributions reported in this dataset were derived from a subset of the data contained in the Aquatic Species Observation Database (ASOD), a Microsoft Access multi-species observation data capture application. ASOD is an ongoing project designed to capture as complete a set of statewide inland aquatic vertebrate species observation information as possible. Please note: A separate distribution is available for summer-run steelhead. Contact information is the same as for the above. ASOD Observation data were used to develop a network of stream segments. These lines are developed by "tracing down" from each observation to the sea using the flow properties of USGS National Hydrography Dataset (NHD) High Resolution hydrography. Lastly these lines, representing stream segments, were assigned a value of either Anad Present (Anadromous present). The end result (i.e., this layer) consists of a set of lines representing the distribution of steelhead based on observations in the Aquatic Species Observation Database. This dataset represents stream reaches that are known or believed to be used by steelhead based on steelhead observations. Thus, it contains only positive steelhead occurrences. The absence of distribution on a stream does not necessarily indicate that steelhead do not utilize that stream. Additionally, steelhead may not be found in all streams or reaches each year. This is due to natural variations in run size, water conditions, and other environmental factors. 
The information in this data set should be used as an indicator of steelhead presence/suspected presence at the time of the observation as indicated by the 'Late_Yr' (Latest Year) field attribute. The line features in the dataset may not represent the maximum extent of steelhead on a stream; rather it is important to note that this distribution most likely underestimates the actual distribution of steelhead. This distribution is based on observations found in the ASOD database. The individual observations may not have occurred at the upper extent of anadromous occupation. In addition, no attempt was made to capture every observation of O. mykiss and so it should not be assumed that this dataset is complete for each stream. The distribution dataset was built solely from the ASOD observational data. No additional data (habitat mapping, barriers data, gradient modeling, etc.) were utilized to either add to or validate the data. It is very possible that an anadromous observation in this dataset has been recorded above (upstream of) a barrier as identified in the Passage Assessment Database (PAD). In the near future, we hope to perform a comparative analysis between this dataset and the PAD to identify and resolve all such discrepancies. Such an analysis will add rigor to and help validate both datasets. This dataset has recently undergone a review. Data source contributors as well as CDFG fisheries biologists have been provided the opportunity to review and suggest edits or additions during a recent review. Data contributors were notified and invited to review and comment on the handling of the information that they provided. The distribution was then posted to an intranet mapping application and CDFG biologists were provided an opportunity to review and comment on the dataset. During this review, biologists were also encouraged to add new observation data. This resulting final distribution contains their suggestions and additions. 
Please refer to "Use Constraints" section below.

Cite
U.S. EPA Office of Research and Development (ORD) (2020). Simulation Data Set [Dataset]. https://catalog.data.gov/dataset/simulation-data-set
Simulation Data Set

Explore at:
Dataset updated
Nov 12, 2020
Dataset provided by
United States Environmental Protection Agency (http://www.epa.gov/)
Description

These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures on each week by subtracting off the median exposure amount on a given week and dividing by the interquartile range (IQR) (as in the actual application to the true NC birth records data). The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way while the medians and IQRs are not given. This further protects identifiability of the spatial locations used in the analysis.

This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: File format: R workspace file; “Simulated_Dataset.RData”.

Metadata (including data dictionary)

• y: Vector of binary responses (1: adverse outcome, 0: control)
• x: Matrix of covariates; one row for each simulated individual
• z: Matrix of standardized pollution exposures
• n: Number of simulated individuals
• m: Number of exposure time periods (e.g., weeks of pregnancy)
• p: Number of columns in the covariate design matrix
• alpha_true: Vector of “true” critical window locations/magnitudes (i.e., the ground truth that we want to estimate)

Code Abstract

We provide R statistical software code (“CWVS_LMC.txt”) to fit the linear model of coregionalization (LMC) version of the Critical Window Variable Selection (CWVS) method developed in the manuscript. We also provide R code (“Results_Summary.txt”) to summarize/plot the estimated critical windows and posterior marginal inclusion probabilities.

Description

“CWVS_LMC.txt”: This code is delivered to the user in the form of a .txt file that contains R statistical software code. Once the “Simulated_Dataset.RData” workspace has been loaded into R, the text in the file can be used to identify/estimate critical windows of susceptibility and posterior marginal inclusion probabilities.

“Results_Summary.txt”: This code is also delivered to the user in the form of a .txt file that contains R statistical software code. Once the “CWVS_LMC.txt” code is applied to the simulated dataset and the program has completed, this code can be used to summarize and plot the identified/estimated critical windows and posterior marginal inclusion probabilities (similar to the plots shown in the manuscript).

Optional Information (complete as necessary)

Required R packages:

• For running “CWVS_LMC.txt”:
  • msm: Sampling from the truncated normal distribution
  • mnormt: Sampling from the multivariate normal distribution
  • BayesLogit: Sampling from the Polya-Gamma distribution
• For running “Results_Summary.txt”:
  • plotrix: Plotting the posterior means and credible intervals

Instructions for Use

Reproducibility (Mandatory)

What can be reproduced: The data and code can be used to identify/estimate critical windows from one of the actual simulated datasets generated under setting E4 from the presented simulation study.

How to use the information:

• Load the “Simulated_Dataset.RData” workspace
• Run the code contained in “CWVS_LMC.txt”
• Once the “CWVS_LMC.txt” code is complete, run “Results_Summary.txt”.

Format: Below is the replication procedure for the attached data set for the portion of the analyses using a simulated data set:

Data

The data used in the application section of the manuscript consist of geocoded birth records from the North Carolina State Center for Health Statistics, 2005-2008. In the simulation study section of the manuscript, we simulate synthetic data that closely match some of the key features of the birth certificate data while maintaining confidentiality of any actual pregnant women.

Availability

Due to the highly sensitive and identifying information contained in the birth certificate data (including latitude/longitude and address of residence at delivery), we are unable to make the data from the application section publicly available. However, we will make one of the simulated datasets available for any reader interested in applying the method to realistic simulated birth records data. This will also allow the user to become familiar with the required inputs of the model, how the data should be structured, and what type of output is obtained. While we cannot provide the application data here, access to the North Carolina birth records can be requested through the North Carolina State Center for Health Statistics, and requires an appropriate data use agreement.

Description

Permissions: These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures on each week by subtracting off the median exposure amount on a given week and dividing by the interquartile range (IQR) (as in the actual application to the true NC birth records data). The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way while the medians and IQRs are not given. This further protects identifiability of the spatial locations used in the analysis.

This dataset is associated with the following publication: Warren, J., W. Kong, T. Luben, and H. Chang. Critical Window Variable Selection: Estimating the Impact of Air Pollution on Very Preterm Birth. Biostatistics. Oxford University Press, OXFORD, UK, 1-30, (2019).
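
The week-by-week standardization described in this entry (subtract the weekly median, divide by the weekly IQR) can be sketched in Python as follows. This is an illustrative sketch only, not the authors' R code: the function name `standardize_by_week`, the toy input matrix, and the use of the "inclusive" quantile method to compute the IQR are all assumptions made here.

```python
from statistics import median, quantiles


def standardize_by_week(exposures):
    """Standardize each week (column) by its median and IQR.

    `exposures` is a list of rows, one per individual, each row holding
    one exposure value per week. Raises ZeroDivisionError if a week has
    an IQR of zero; handling that case is left out of this sketch.
    """
    n_weeks = len(exposures[0])
    standardized_columns = []
    for week in range(n_weeks):
        column = [row[week] for row in exposures]
        # Quartiles via the "inclusive" method (an assumption of this sketch).
        q1, _, q3 = quantiles(column, n=4, method="inclusive")
        week_median = median(column)
        week_iqr = q3 - q1
        standardized_columns.append(
            [(value - week_median) / week_iqr for value in column]
        )
    # Transpose back to one row per individual.
    return [list(row) for row in zip(*standardized_columns)]


# Toy matrix: 5 hypothetical individuals, 2 weeks of exposure.
toy = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
for row in standardize_by_week(toy):
    print(row)
```

Because the published z matrix is already standardized and the medians and IQRs are withheld, a step like this would only be needed when preparing new raw exposure data in the same form.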
