7 datasets found
  1. Network Statistics for Data Science

    • kaggle.com
    zip
    Updated Sep 9, 2024
    Cite
    Master Sniffer (2024). Network Statistics for Data Science [Dataset]. https://www.kaggle.com/datasets/mastersniffer/network-statistics-for-data-science
    Explore at:
    Available download formats: zip (7482 bytes)
    Dataset updated
    Sep 9, 2024
    Authors
    Master Sniffer
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Master Sniffer

    Released under MIT

    Contents

  2. Cavallo, Alberto, and Roberto Rigobon (2016) "The Billion Prices Project:...

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 21, 2023
    Cite
    Cavallo, Alberto; Rigobon, Roberto (2023). Cavallo, Alberto, and Roberto Rigobon (2016) "The Billion Prices Project: Using Online Data for Measurement and Research" - Journal of Economic Perspectives, 31(1) (Spring 2016) [Dataset]. http://doi.org/10.7910/DVN/6RQCRS
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Cavallo, Alberto; Rigobon, Roberto
    Description

    New data-gathering techniques, often referred to as “Big Data,” have the potential to improve statistics and empirical research in economics. In this paper we describe our work with online data at the Billion Prices Project at MIT and discuss key lessons for both inflation measurement and some fundamental research questions in macro and international economics. In particular, we show how online prices can be used to construct daily price indexes in multiple countries and to avoid measurement biases that distort evidence of price stickiness and international relative prices. We emphasize how Big Data technologies are providing macro and international economists with opportunities to stop treating the data as “given” and to get directly involved with data collection.

  3. us-presidential-elections-with-electoral-college

    • huggingface.co
    Updated Oct 26, 2024
    Cite
    Florent Daudens (2024). us-presidential-elections-with-electoral-college [Dataset]. https://huggingface.co/datasets/fdaudens/us-presidential-elections-with-electoral-college
    Explore at:
    Croissant: Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Oct 26, 2024
    Authors
    Florent Daudens
    License

    https://choosealicense.com/licenses/cc0-1.0/

    Area covered
    United States
    Description

    U.S. Presidential Election Constituency Returns (1976-2020)

    Dataset Summary

    This dataset contains state-level constituency returns for U.S. presidential elections from 1976 to 2020, compiled by the MIT Election Data Science Lab. The dataset includes 4,287 observations across 15 variables, offering detailed insights into the voting patterns for presidential elections over four decades. The data sources include the biennially published document “Statistics of the… See the full description on the dataset page: https://huggingface.co/datasets/fdaudens/us-presidential-elections-with-electoral-college.

  4. PATSTAT

    • search.dataone.org
    Updated Dec 20, 2023
    Cite
    European Patent Office (2023). PATSTAT [Dataset]. http://doi.org/10.7910/DVN/KQCGXP
    Dataset updated
    Dec 20, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    European Patent Office
    Description

    The PATSTAT datasets, issued by the European Patent Office, are a collection of statistics covering patent application data. Currently available in the MIT Dataverse: PATSTAT Biblio 2017, 2021, 2022; PATSTAT Legal status 2017. Documentation for PATSTAT, issued separately, can be found on this record.

  5. Presidential Precinct Map: 2020 Election Results

    • kaggle.com
    zip
    Updated Feb 2, 2021
    Cite
    Paul Mooney (2021). Presidential Precinct Map: 2020 Election Results [Dataset]. https://www.kaggle.com/datasets/paultimothymooney/presidential-precinct-map-2020-election-results/code
    Explore at:
    Available download formats: zip (171002921 bytes)
    Dataset updated
    Feb 2, 2021
    Authors
    Paul Mooney
    Description

    Data from https://github.com/TheUpshot/presidential-precinct-map-2020 released under MIT license: https://github.com/TheUpshot/presidential-precinct-map-2020/blob/main/LICENSE. For more detail, see https://www.nytimes.com/interactive/2021/upshot/2020-election-map.html.

    Presidential precinct data for the 2020 general election

    The Upshot scraped and standardized precinct-level election results from around the country, and joined this tabular data to precinct GIS data to create a nationwide election map. This map does not have full coverage for every state: data availability and caveats for each state are listed below, and statistics about data coverage are available here. We are releasing this map's data for attributed re-use under the MIT license in this repository.

    The GeoJSON dataset can be downloaded at: https://int.nyt.com/newsgraphics/elections/map-data/2020/national/precincts-with-results.geojson.gz

    Properties on each precinct polygon:

    • GEOID: unique identifier for the precinct, formed from the five-digit county FIPS code followed by the precinct name/ID (eg, 30003-08 or 39091-WEST MANSFIELD)
    • votes_dem: votes received by Joseph Biden
    • votes_rep: votes received by Donald Trump
    • votes_total: total votes in the precinct, including for third-party candidates and write-ins
    • votes_per_sqkm: total votes divided by the area of the precinct, rounded to one decimal place
    • pct_dem_lead: (votes_dem - votes_rep) / (votes_dem + votes_rep), rounded to one decimal place (eg, -21.3)
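
    The GEOID and pct_dem_lead definitions in the list above can be sketched in Python. This is a minimal illustration, not The Upshot's own code: the helper names split_geoid and pct_dem_lead are hypothetical, and the factor of 100 is an assumption inferred from the example value -21.3, which reads as percentage points rather than a fraction.

```python
from typing import Optional, Tuple


def split_geoid(geoid: str) -> Tuple[str, str]:
    """Split a GEOID into (county FIPS code, precinct name/ID)."""
    # The precinct name may itself contain hyphens, so split only once.
    county_fips, precinct = geoid.split("-", 1)
    if len(county_fips) != 5 or not county_fips.isdigit():
        raise ValueError(f"unexpected county FIPS in GEOID: {geoid!r}")
    return county_fips, precinct


def pct_dem_lead(votes_dem: int, votes_rep: int) -> Optional[float]:
    """Two-party Democratic lead, rounded to one decimal place.

    Assumption: the value is scaled by 100 (percentage points).
    Returns None when no two-party votes were cast.
    """
    two_party = votes_dem + votes_rep
    if two_party == 0:
        return None
    return round(100 * (votes_dem - votes_rep) / two_party, 1)


print(split_geoid("39091-WEST MANSFIELD"))
print(pct_dem_lead(3935, 6065))
```

    Note that the denominator is the two-party total, not votes_total, so third-party and write-in votes do not affect the lead.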

    Due to licensing restrictions, we are unable to include the 2016 election results that appear in our interactive map.

    Please contact dear.upshot@nytimes.com if you have any questions about data quality or sourcing, beyond the caveats we describe below.

    General caveats

    • Where possible, we used official precinct boundaries provided by the states or counties, but in most cases these were not available and we generated boundaries ourselves, using L2 voter-file points to guess the precinct for each census block group; this results in generally accurate precinct boundaries, but can be rough in no- or very-low-population places like business parks or uninhabited rural land.
      • Because of this, spatially joining our precinct GeoJSON to other geographic datasets will most likely yield less-than-ideal output.
    • Some of the results we gathered are unofficial/uncertified, since the certified tabulations hadn't yet been released at time of gathering.
    • A very small portion of the tabular precinct results (roughly 0.01%) could not be joined to the precinct boundaries, and thus these results are not present in the GeoJSON.
    • A few areas, such as rural Maine, Vermont and Hawaii, contain no voters, and those polygons are excluded from the GeoJSON.

    State-by-state data availability and caveats

    symbol   meaning
    ✅       have gathered data, no significant caveats
    ⚠️       have gathered data, but doesn't cover entire state or has other significant caveats
    ❌       precinct data not usable
             precinct data not yet available

    Note: One of the most common causes of precinct data being unusable is "countywide" tabulations. This occurs when a county reports, say, all of its absentee ballots together as a single row in its Excel download (instead of precinct-by-precinct); because we can't attribute those ballots to specific precincts, that means that all precincts in the county will be missing an indeterminate number of votes, and therefore can't be reliably mapped. In these cases, we drop the entire county from our GeoJSON.

    • AL: ❌ absentee and provisional results are reported countywide
    • AK: ❌ absentee, early, and provisional results are reported district-wide
    • AZ: ✅
    • AR: ⚠️ we could not generate or procure precinct maps for Jefferson County or Phillips County
    • CA: ⚠️ only certain counties report results at the precinct level, additional collection is in progress
    • CO: ✅
    • CT: ⚠️ township-level results rather than precinct-level results
    • DE: ✅
    • DC: ✅
    • FL: ⚠️ precinct results not yet available statewide
    • GA: ✅
    • HI: ✅
    • ID: ⚠️ many counties ...
  6. Data from: Data augmentation for disruption prediction via robust surrogate...

    • osti.gov
    • dataverse.harvard.edu
    Updated Jun 6, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States). Plasma Science and Fusion Center (2022). Data augmentation for disruption prediction via robust surrogate models [Dataset]. http://doi.org/10.7910/DVN/FMJCAD
    Dataset updated
    Jun 6, 2022
    Dataset provided by
    Office of Science (http://www.er.doe.gov/)
    General Atomics, San Diego, CA (United States)
    Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States). Plasma Science and Fusion Center
    Description

    The goal of this work is to generate large statistically representative datasets to train machine learning models for disruption prediction provided by data from few existing discharges. Such a comprehensive training database is important to achieve satisfying and reliable prediction results in artificial neural network classifiers. Here, we aim for a robust augmentation of the training database for multivariate time series data using Student-t process regression. We apply Student-t process regression in a state space formulation via Bayesian filtering to tackle challenges imposed by outliers and noise in the training data set and to reduce the computational complexity. Thus, the method can also be used if the time resolution is high. We use an uncorrelated model for each dimension and impose correlations afterwards via coloring transformations. We demonstrate the efficacy of our approach on plasma diagnostics data of three different disruption classes from the DIII-D tokamak. To evaluate if the distribution of the generated data is similar to the training data, we additionally perform statistical analyses using methods from time series analysis, descriptive statistics, and classic machine learning clustering algorithms.
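
    The idea of modelling each dimension independently and imposing correlations afterwards can be illustrated with a small coloring transformation. The sketch below is not the authors' implementation: it uses Gaussian white noise instead of Student-t process samples, a hand-written 2x2 Cholesky factor as the coloring matrix (one common choice), and a target covariance invented for illustration. All function names are hypothetical.

```python
import math
import random


def cholesky_2x2(cov):
    """Lower-triangular L such that L times its transpose equals a 2x2 cov."""
    (a, b), (_, d) = cov
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(d - l21 * l21)
    return [[l11, 0.0], [l21, l22]]


def color(samples, cov):
    """Map uncorrelated unit-variance pairs to pairs with covariance cov."""
    L = cholesky_2x2(cov)
    return [(L[0][0] * z1, L[1][0] * z1 + L[1][1] * z2) for z1, z2 in samples]


def sample_cov(pairs):
    """Unbiased sample covariance matrix entries (var_x, cov_xy, var_y)."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    var_x = sum((x - mx) ** 2 for x, _ in pairs) / (n - 1)
    var_y = sum((y - my) ** 2 for _, y in pairs) / (n - 1)
    cov_xy = sum((x - mx) * (y - my) for x, y in pairs) / (n - 1)
    return var_x, cov_xy, var_y


rng = random.Random(0)  # fixed seed so the run is repeatable
white = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20000)]
target = [[1.0, 0.8], [0.8, 2.0]]  # illustrative target covariance
colored = color(white, target)
print(sample_cov(colored))  # should be close to (1.0, 0.8, 2.0)
```

    Because the coloring step is a single linear map applied after sampling, the per-dimension models stay independent and cheap to fit.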

  7. US Senate Election Returns (1976-2018)

    • kaggle.com
    zip
    Updated Dec 8, 2023
    Cite
    Sahitya Setu (2023). US Senate Election Returns (1976-2018) [Dataset]. https://www.kaggle.com/datasets/sahityasetu/us-senate-election-returns-1976-2018
    Explore at:
    Available download formats: zip (67912 bytes)
    Dataset updated
    Dec 8, 2023
    Authors
    Sahitya Setu
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Area covered
    United States
    Description

    Constituency (state-level) returns for elections to the US Senate from 1976 to 2018. SUMMARY US House of Representatives election results at a district level, for 1976–2018, from the MIT Election Data + Science Lab.

    The original data source is the document Statistics of the Congressional Election, published biennially by the Clerk of the U.S. House of Representatives. 2018 data comes from official state election websites (in some cases, they are marked as unofficial, and will be updated at a later time).

