MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset was created by Master Sniffer
Released under MIT
New data-gathering techniques, often referred to as “Big Data,” have the potential to improve statistics and empirical research in economics. In this paper we describe our work with online data at the Billion Prices Project at MIT and discuss key lessons for both inflation measurement and some fundamental research questions in macro and international economics. In particular, we show how online prices can be used to construct daily price indexes in multiple countries and to avoid measurement biases that distort evidence of price stickiness and international relative prices. We emphasize how Big Data technologies are providing macro and international economists with opportunities to stop treating the data as “given” and to get directly involved with data collection.
CC0 1.0: https://choosealicense.com/licenses/cc0-1.0/
U.S. Presidential Election Constituency Returns (1976-2020)
Dataset Summary
This dataset contains state-level constituency returns for U.S. presidential elections from 1976 to 2020, compiled by the MIT Election Data Science Lab. The dataset includes 4,287 observations across 15 variables, offering detailed insights into the voting patterns for presidential elections over four decades. The data sources include the biennially published document “Statistics of the… See the full description on the dataset page: https://huggingface.co/datasets/fdaudens/us-presidential-elections-with-electoral-college.
The PATSTAT datasets, issued by the European Patent Office, are a collection of statistics covering patent application data. Currently available in the MIT Dataverse:
- PATSTAT Biblio 2017, 2021, 2022
- PATSTAT Legal status 2017
Documentation for PATSTAT, issued separately, can be found on this record.
Data from https://github.com/TheUpshot/presidential-precinct-map-2020 released under MIT license: https://github.com/TheUpshot/presidential-precinct-map-2020/blob/main/LICENSE. For more detail, see https://www.nytimes.com/interactive/2021/upshot/2020-election-map.html.
The Upshot scraped and standardized precinct-level election results from around the country, and joined this tabular data to precinct GIS data to create a nationwide election map. This map does not have full coverage for every state: data availability and caveats for each state are listed below, and statistics about data coverage are available here. We are releasing this map's data for attributed re-use under the MIT license in this repository.
The GeoJSON dataset can be downloaded at: https://int.nyt.com/newsgraphics/elections/map-data/2020/national/precincts-with-results.geojson.gz
Properties on each precinct polygon:
- GEOID: unique identifier for the precinct, formed from the five-digit county FIPS code followed by the precinct name/ID (eg, 30003-08 or 39091-WEST MANSFIELD)
- votes_dem: votes received by Joseph Biden
- votes_rep: votes received by Donald Trump
- votes_total: total votes in the precinct, including for third-party candidates and write-ins
- votes_per_sqkm: total votes divided by the area of the precinct, rounded to one decimal place
- pct_dem_lead: (votes_dem - votes_rep) / (votes_dem + votes_rep), rounded to one decimal place (eg, -21.3)

Due to licensing restrictions, we are unable to include the 2016 election results that appear in our interactive map.
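As a rough illustration of the properties listed above, the following Python sketch reads the gzipped GeoJSON and recomputes pct_dem_lead from the vote counts. It assumes the file is a standard GeoJSON FeatureCollection and that pct_dem_lead is expressed in percentage points (the factor of 100 is inferred from the example value -21.3, not stated in the source). The function names are hypothetical.

```python
import gzip
import json


def pct_dem_lead(votes_dem, votes_rep):
    """Two-party Democratic lead, rounded to one decimal place."""
    two_party = votes_dem + votes_rep
    if two_party == 0:
        return None  # no two-party votes, so the lead is undefined
    # Assumption: scaled to percentage points, inferred from the example -21.3.
    return round(100 * (votes_dem - votes_rep) / two_party, 1)


def county_fips(geoid):
    """The first five characters of GEOID are the county FIPS code."""
    return geoid[:5]


def load_precinct_properties(path):
    """Return the properties dict of every precinct feature in the file.

    Assumption: the file is a gzipped GeoJSON FeatureCollection.
    """
    with gzip.open(path, "rt", encoding="utf-8") as f:
        collection = json.load(f)
    return [feature["properties"] for feature in collection["features"]]
```

The nationwide file is large, so loading it whole with json.load may need a lot of memory; a streaming parser could be a better fit for constrained machines.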
Please contact dear.upshot@nytimes.com if you have any questions about data quality or sourcing, beyond the caveats we describe below.
| symbol | meaning |
|---|---|
| ✅ | have gathered data, no significant caveats |
| ⚠️ | have gathered data, but doesn't cover entire state or has other significant caveats |
| ❌ | precinct data not usable |
| ❓ | precinct data not yet available |
Note: One of the most common causes of precinct data being unusable is "countywide" tabulations. This occurs when a county reports, say, all of its absentee ballots together as a single row in its Excel download (instead of precinct-by-precinct); because we can't attribute those ballots to specific precincts, that means that all precincts in the county will be missing an indeterminate number of votes, and therefore can't be reliably mapped. In these cases, we drop the entire county from our GeoJSON.
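The drop rule described in this note can be sketched in Python as follows. This is only an illustration, not The Upshot's actual pipeline: the row layout (dicts with "county", "precinct", and "votes" keys, where a countywide row has precinct set to None) is a hypothetical stand-in for whatever a county's download looks like.

```python
def drop_countywide_counties(rows):
    """Keep only rows from counties where every row names a precinct.

    Assumption: a countywide tabulation is a row whose "precinct" is None.
    """
    # Any county with at least one countywide row is missing an unknown
    # number of votes in each precinct, so the whole county is dropped.
    tainted = {row["county"] for row in rows if row["precinct"] is None}
    return [row for row in rows if row["county"] not in tainted]


# Hypothetical example rows: county "B" reports absentee ballots countywide.
example_rows = [
    {"county": "A", "precinct": "01", "votes": 120},
    {"county": "A", "precinct": "02", "votes": 95},
    {"county": "B", "precinct": "01", "votes": 80},
    {"county": "B", "precinct": None, "votes": 300},
]
usable_rows = drop_countywide_counties(example_rows)
```

Here usable_rows keeps only the two rows from county "A", because county "B" has a row that cannot be attributed to a precinct.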
- AL: ❌ absentee and provisional results are reported countywide
- AK: ❌ absentee, early, and provisional results are reported district-wide
- AZ: ✅
- AR: ⚠️ we could not generate or procure precinct maps for Jefferson County or Phillips County
- CA: ⚠️ only certain counties report results at the precinct level, additional collection is in progress
- CO: ✅
- CT: ⚠️ township-level results rather than precinct-level results
- DE: ✅
- DC: ✅
- FL: ⚠️ precinct results not yet available statewide
- GA: ✅
- HI: ✅
- ID: ⚠️ many counties ...
The goal of this work is to generate large, statistically representative datasets for training machine learning models for disruption prediction, using data from only a few existing discharges. Such a comprehensive training database is important to achieve satisfactory and reliable prediction results in artificial neural network classifiers. Here, we aim for a robust augmentation of the training database for multivariate time series data using Student-t process regression. We apply Student-t process regression in a state space formulation via Bayesian filtering to tackle challenges imposed by outliers and noise in the training data set and to reduce the computational complexity. Thus, the method can also be used if the time resolution is high. We use an uncorrelated model for each dimension and impose correlations afterwards via coloring transformations. We demonstrate the efficacy of our approach on plasma diagnostics data of three different disruption classes from the DIII-D tokamak. To evaluate whether the distribution of the generated data is similar to that of the training data, we additionally perform statistical analyses using methods from time series analysis, descriptive statistics, and classic machine learning clustering algorithms.
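To make the idea of a coloring transformation concrete, here is a minimal Python sketch. It assumes the common Cholesky-based variant, in which an uncorrelated, unit-variance vector z is mapped to x = L z with L L^T equal to the target covariance. The abstract does not say which factorization the authors use, so this is an illustration of the general technique rather than their method, and all function names are hypothetical.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular L with L @ L^T == cov, for a symmetric
    positive-definite matrix given as a list of lists."""
    n = len(cov)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(cov[i][i] - s)
            else:
                lower[i][j] = (cov[i][j] - s) / lower[j][j]
    return lower


def color(z, lower):
    """Coloring transformation: map an uncorrelated vector z to x = L z."""
    return [sum(lower[i][k] * z[k] for k in range(len(z)))
            for i in range(len(lower))]


def sample_correlated(cov, rng=random):
    """Draw one Gaussian sample with covariance cov (illustrative only)."""
    z = [rng.gauss(0.0, 1.0) for _ in range(len(cov))]
    return color(z, cholesky(cov))
```

The per-dimension samples here are Gaussian for simplicity. In the setting described above they would instead come from the independent per-dimension models, with the same linear map applied afterwards.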
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Constituency (state-level) returns for elections to the US Senate from 1976 to 2018.
SUMMARY
US House of Representatives election results at a district level, for 1976–2018, from the MIT Election Data + Science Lab.
The original data source is the document Statistics of the Congressional Election, published biennially by the Clerk of the U.S. House of Representatives. 2018 data comes from official state election websites (in some cases, they are marked as unofficial, and will be updated at a later time).