The Distributional Financial Accounts (DFAs) provide a quarterly measure of the distribution of U.S. household wealth since 1989, based on a comprehensive integration of disaggregated household-level wealth data with official aggregate wealth measures. The data set contains the level and share of each balance sheet item on the Financial Accounts' household wealth table (Table B.101.h), for various sub-populations in the United States. In our core data set, aggregate household wealth is allocated to each of four percentile groups of wealth: the top 1 percent, the next 9 percent (i.e., 90th to 99th percentile), the next 40 percent (50th to 90th percentile), and the bottom half (below the 50th percentile). Additionally, the data set contains the level and share of aggregate household wealth by income, age, generation, education, and race. The quarterly frequency makes the data useful for studying the business cycle dynamics of wealth concentration, which are typically difficult to observe in lower-frequency data because peaks and troughs often fall between times of measurement. These data will be updated about 10 or 11 weeks after the end of each quarter, making them a timely measure of the distribution of wealth.
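As a rough illustration of the four percentile groups described above, the following sketch splits a toy list of household wealth values into the top 1 percent, next 9 percent, next 40 percent, and bottom half, and returns each group's share of the total. The function name `wealth_shares` and the toy data are hypothetical, and the sketch is unweighted: it does not reproduce the DFA methodology, which integrates survey data with the Financial Accounts aggregates.

```python
def wealth_shares(wealth):
    """Return each percentile group's share of total wealth.

    Assumption: every household counts equally (no survey weights),
    which is a simplification made for illustration only.
    """
    ranked = sorted(wealth, reverse=True)  # richest household first
    n = len(ranked)
    total = sum(ranked)

    # Group boundaries, counted in households from the top of the ranking.
    cut_1 = round(n * 0.01)
    cut_10 = round(n * 0.10)
    cut_50 = round(n * 0.50)

    groups = {
        "top_1": ranked[:cut_1],
        "next_9": ranked[cut_1:cut_10],
        "next_40": ranked[cut_10:cut_50],
        "bottom_50": ranked[cut_50:],
    }
    return {name: sum(values) / total for name, values in groups.items()}


# Toy data: 100 households holding 1, 2, ..., 100 units of wealth.
households = [float(i) for i in range(1, 101)]
shares = wealth_shares(households)
for name, share in shares.items():
    print(f"{name}: {share:.3f}")
```

The shares sum to one by construction, so a change in one group's share is always offset by the others.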
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Major differences from previous work:
For level 2 catch:
- Catches in tons, raised to match nominal values, now take into account the geographic area of the nominal data for improved accuracy.
- Catches in "Number of fish" are converted to weight based on nominal data. The conversion factors used in the previous version are no longer used, as they did not adequately represent the diversity of catches.
- Records in number of fish without corresponding nominal data are no longer removed as they were before, which creates a large difference for this measurement_unit between the two datasets.
- Nominal data from WCPFC include fishing fleet information, and georeferenced data have been raised based on this instead of solely on the triplet year/gear/species, to avoid random reallocations.
- Strata for which catches in tons are raised to match nominal data have had their numbers removed.
- Raising only applies to complete years to avoid overrepresenting specific months, particularly in the early years of georeferenced reporting.
- Strata where georeferenced data exceed nominal data have not been adjusted downward, as it is unclear whether these discrepancies arise from missing nominal data or from different aggregation methods in the two datasets.
- The data are not aggregated to 5-degree squares and thus remain spatially unharmonized. Aggregation can be performed using CWP codes for geographic identifiers. For example, an R function is available: source("https://raw.githubusercontent.com/firms-gta/geoflow-tunaatlas/master/sardara_functions/transform_cwp_code_from_1deg_to_5deg.R")
The level 0 dataset has been modified, creating differences in this new version, notably:
- The species retained are different; only 32 major species are kept.
- Mappings have been somewhat modified based on new standards implemented by FIRMS.
- New rules have been applied for overlapping areas.
- Data are only displayed in 1-degree and 5-degree square areas.
- The data are enriched with "Species group" and "Gear labels" using the fdiwg standards.
These main differences are recapped in Differences_v2018_v2024.zip.
Recommendations:
- To avoid converting data from numbers using nominal strata, we recommend the use of conversion factors, which could be provided by tRFMOs.
- In some strata, nominal data appear higher than georeferenced data, as observed during level 2 processing. These discrepancies may result from errors or differences in aggregation methods. Further analysis will examine these differences in detail to refine treatments accordingly. A summary of differences by tRFMO, based on the number of strata, is included in the appendix.
- Some nominal data have no equivalent in georeferenced data and therefore cannot be disaggregated. One option would be to check, for each nominal record without an equivalent, whether georeferenced data exist in different buffers, and to average the distribution of this footprint; the nominal data would then be disaggregated based on the georeferenced data. This would lead to the creation of data (approximately 3%) and would require reducing or removing all georeferenced data without a nominal equivalent or with a lesser equivalent. Tests are currently being conducted with and without this. It would help improve the footprint of captured biomass but could lead to unexpected discrepancies with current datasets.
For level 0 effort:
In some datasets, namely those from ICCAT and the purse seine (PS) data from WCPFC, the same effort data have been reported multiple times using different units. These have been kept as is, since no official mapping allows conversion between these units. As a result, users should be aware that some ICCAT and WCPFC effort data are deliberately duplicated:
- In the case of ICCAT data, lines with identical strata but different effort units are duplicates reporting the same fishing activity with different measurement units. It is not possible to infer strict equivalence between units, as some contain information about others (e.g., Hours.FAD and Hours.FSC may inform Hours.STD).
- In the case of WCPFC data, effort records were also kept in all originally reported units. Here, duplicates do not necessarily share the same “fishing_mode”, as SETS for purse seiners are reported with an explicit association to fishing_mode, while DAYS are not. This distinction allows SETS records to be separated by fishing mode, whereas DAYS records remain aggregated.
Some limited harmonization, particularly between units such as NET-days and Nets, has not been implemented in the current version of the dataset, but may be considered in future releases if a consistent relationship can be established.
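The raising rules listed for level 2 catch can be sketched as follows. This is a minimal illustration, not the processing code used to build the dataset: the function name `raise_to_nominal`, the record layout, and the sample values are all assumptions. It scales georeferenced catches up to the nominal total within each year/gear/species stratum, only for complete years, and leaves strata untouched when georeferenced catches already exceed nominal ones.

```python
from collections import defaultdict


def raise_to_nominal(georef, nominal, complete_years):
    """Scale georeferenced catches (tons) up to nominal totals per stratum.

    georef: list of dicts with keys year, gear, species, cell, tons
            (hypothetical layout chosen for this sketch).
    nominal: dict mapping (year, gear, species) -> nominal catch in tons.
    complete_years: set of years with complete georeferenced reporting.
    """
    # Total georeferenced catch per stratum.
    totals = defaultdict(float)
    for rec in georef:
        totals[(rec["year"], rec["gear"], rec["species"])] += rec["tons"]

    raised = []
    for rec in georef:
        key = (rec["year"], rec["gear"], rec["species"])
        target = nominal.get(key)
        factor = 1.0
        # Raise only complete years, and never adjust downward.
        if (
            rec["year"] in complete_years
            and target is not None
            and 0 < totals[key] < target
        ):
            factor = target / totals[key]
        raised.append({**rec, "tons": rec["tons"] * factor})
    return raised


georef = [
    {"year": 2000, "gear": "PS", "species": "SKJ", "cell": "A", "tons": 30.0},
    {"year": 2000, "gear": "PS", "species": "SKJ", "cell": "B", "tons": 10.0},
    {"year": 2000, "gear": "LL", "species": "YFT", "cell": "A", "tons": 50.0},
    {"year": 2001, "gear": "PS", "species": "SKJ", "cell": "A", "tons": 10.0},
]
nominal = {
    (2000, "PS", "SKJ"): 80.0,   # georeferenced total 40 -> raised by 2
    (2000, "LL", "YFT"): 40.0,   # georeferenced exceeds nominal -> unchanged
    (2001, "PS", "SKJ"): 100.0,  # incomplete year -> unchanged
}
raised = raise_to_nominal(georef, nominal, complete_years={2000})
raised_tons = [rec["tons"] for rec in raised]
print(raised_tons)
```

For WCPFC data, the stratum key in this sketch would need an additional fishing fleet component, since the description states that raising there is based on fleet as well.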
https://spdx.org/licenses/etalab-2.0.html
This database contains food demand elasticity estimates collected from a literature review carried out in 2015 as part of a contract funded by the International Food Policy Research Institute (IFPRI) (contract n° 2015X144.FEM). It served as a basis for the meta-analysis of price and income elasticities of food demand presented in Femenia (2019).
Data collection: Two reports providing food demand elasticities published by the United States Department of Agriculture (USDA) (Seale et al. (2003) and Muhammad et al. (2011)) are frequently used to calibrate demand functions in global economic models. In these reports, price and income elasticities are estimated for eight broad food categories and for a large number of countries. This broad level of country coverage renders these elasticity data well-suited for calibrating large simulation models. Economists might, however, wish to use other sources of elasticities, for instance when they consider food products at a higher disaggregation level or when they wish to compare results obtained with a calibration of demand parameters based on USDA estimates to those obtained with a calibration based on other estimates given in the literature. The USDA provides a literature review database (USDA, 2005), which contains this type of information. This database collects own price, cross price, expenditure and income demand elasticity estimates from papers that have been published and/or presented in the United States (US) between 1979 and 2005. While the database covers a large variety of products at various aggregation levels, few countries are included. These two sources of data, namely, the USDA’s estimates given in Seale et al. (2003) and Muhammad et al. (2011) and the USDA’s literature review database, were used as a basis to build the database presented here.
We started with the structure of the USDA literature review database, which includes useful information on each elasticity estimate, such as the references of the papers from which the estimates have been collected; the countries, products and time periods concerned; the types of data used to conduct estimations; and the demand models estimated. The elasticities estimated by Seale et al. (2003) and Muhammad et al. (2011) were also included. We then reviewed the primary studies to check the information included in the USDA database and to ensure the consistency of the data. Of the 74 references present in these data, five PhD dissertations were not available to us, which restricted our ability to verify the data and to collect new information, so we decided to exclude these references. In a second step, we searched the economic literature for new references providing food demand elasticity estimates, with a focus on pre-2005 studies dealing with countries other than the US and China and on post-2005 studies regardless of the country. The search was performed with Google Scholar in March 2015 using the following combinations of keywords: “price, elasticities, food, demand” and “income, elasticities, food, demand”. We did not limit our search to published papers; working papers, reports, and papers presented at conferences were also included. A total of 72 references were collected in this way. All price and income elasticity estimates of food demand reported in these references were collected. Among own price elasticities we distinguished uncompensated (Marshallian) price elasticities from compensated (Hicksian) elasticities. The final database contains 25,117 food demand elasticity estimates collected from 148 studies published between 1973 and 2014.
Information included and data coding: In addition to the values of elasticity estimates and the references of the primary studies from which they have been collected, our database incorporates several variables aimed at providing detailed information on the estimated values. These descriptive variables contain information related to the type of data used to estimate the elasticities (time series, panel or cross section), to whether these data have been collected at the micro (household) or macro (country) level, to the decade in which they have been collected, which ranges from 1950 to 2010, and to the countries and products to which these data refer. To homogenize the information on food products, product names as they appear in the primary studies are mapped to the following eight product categories: beverages and tobacco, cereals, dairy products, fruits and vegetables, oils and fats, meat and fish, other food products and non-food products. Given that these categories are in some cases much broader than the product levels considered in primary studies, a variable representing the aggregation level of the primary data is also associated with each observation. The following four aggregation levels are considered: “global food aggregate”; “product category aggregate”, which corresponds to the aforementioned categories;...
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Detailed graphically-documented daily losses of Russian tanks according to ORYX
About the dataset
Detailed graphically-documented daily losses of Russian tanks according to ORYX, broken down by the model of the tank, its generation and by series-year.
Contents
The Excel file contains 3 sheets:
- Model_Level: the cumulative number of tanks lost (destroyed, damaged, abandoned and captured), broken down by the precise model of the tank
- Year_Level: the cumulative number of tanks lost (destroyed, damaged, abandoned and captured), broken down by the decade in which the tank entered production
- SeriesYear_Level: the cumulative number of tanks lost (destroyed, damaged, abandoned and captured), broken down by the decade in which the tank entered production and the series the tank belongs to
The column headers depend on the sheet:
- Model_Level: the headers contain the name of the model. White space, dashes and other punctuation marks have been removed
- Year_Level: the headers contain the decade in which the tank entered production, preceded by an ‘x’. Unknown tanks are grouped under the ‘xUnknown’ header
- SeriesYear_Level: the headers contain the concatenation of the decade (yyyy) and the series of the tank (up to four letters of the series name). Unknown tanks do not have an attributed decade.
What sets this dataset apart?
Method of collection
The data was collected from the ORYX website on a daily basis. Since ORYX does not provide a real-time dataset, I obtain the real-time data by using the Wayback machine and then save the snapshot as HTML code for each date.
Upon loading the HTML file for each day, I filter each level of aggregation by using the h3 tags and only select the Tanks. At the category level, the values are found within an h3 tag, while each specific piece of equipment is found in a bullet list.
Although ORYX provides the number by item, one of its limitations is that the numbers reported may be different from the sum of the individual pieces because of inputting errors or miscategorization. I therefore contrast this information with the sum of the individual pieces of equipment listed according to their state (more on this in the following section).
Cleaning and treatment of the data
There is an extensive cleaning process involved:
- Cleaning of the names to remove typographical errors
- Correcting the aggregate number by equipment if that value is absurd given the number of individual pieces of equipment
- These checks ensure that the error between the aggregate and individual numbers is less than 5 in absolute terms or less than 5%, whichever condition is the most restrictive. The final number in the dataset is the minimum of the aggregate and the individual numbers.
- As there may be revisions after the information is first published, I take the minimum value of the remaining series. This ensures that the numbers I provide are the most conservative possible.
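The consistency check and the conservative adjustment described above can be sketched as follows. This is one possible reading, not the author's actual code: the function names are hypothetical, the 5% threshold is assumed to be measured against the larger of the two numbers, and "the minimum value of the remaining series" is interpreted as a backward running minimum over the current and all later dates.

```python
def is_consistent(aggregate, individual, abs_tol=5, rel_tol=0.05):
    """Check that the two counts agree within the stricter of the two tolerances.

    Assumption: the relative tolerance is taken against the larger count.
    """
    if aggregate == individual:
        return True
    reference = max(aggregate, individual)
    tolerance = min(abs_tol, rel_tol * reference)  # most restrictive condition
    return abs(aggregate - individual) < tolerance


def final_count(aggregate, individual):
    """Keep the smaller of the aggregate and the summed individual counts."""
    return min(aggregate, individual)


def conservative_series(values):
    """Replace each value by the minimum of itself and all later values."""
    result = []
    running_min = float("inf")
    for value in reversed(values):
        running_min = min(running_min, value)
        result.append(running_min)
    return result[::-1]


daily_counts = [10, 12, 11, 15, 14]  # toy cumulative series with later revisions
adjusted = conservative_series(daily_counts)
print(adjusted)
print(is_consistent(100, 103), is_consistent(20, 22), final_count(100, 103))
```

A backward running minimum keeps the adjusted cumulative series non-decreasing, so a later downward revision lowers earlier values rather than producing a drop in the series.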
Frequency of the dataset
I plan on updating the dataset every week, with the dataset made available on Tuesday.
Companion datasets
All my datasets:
- Russian losses (materiel and personnel) according to the Ukrainian Ministry of Defense: https://www.kaggle.com/datasets/ol4ubert/rus-modukr-equipmentpersonnel
- Ukrainian losses (materiel and personnel) according to the Russian Ministry of Defense: https://www.kaggle.com/datasets/ol4ubert/ukr-modrus-equipmentpersonnel
- Russian losses (materiel) according to ORYX: https://www.kaggle.com/datasets/ol4ubert/rus-oryx-equipment
- Ukrainian losses (materiel) according to ORYX: https://www.kaggle.com/datasets/ol4ubert/ukr-oryx-equipment
- Russian tank losses according to ORYX: https://www.kaggle.com/datasets/ol4ubert/rus-oryx-tanks
- Ukrainian tank losses according to ORYX: https://www.kaggle.com/datasets/ol4ubert/ukr-oryx-tanks
- Ukrainian personnel losses (UALosses): https://www.kaggle.com/datasets/ol4ubert/confirmed-ukrainian-military-personnel-losses
- Russian personnel losses (KilledInUkraine): https://www.kaggle.com/datasets/ol4ubert/confirmed-russian-military-officers-losses
- Ukrainian losses in Kursk (materiel and personnel) according to the Russian Ministry of Defense: https://www.kaggle.com/datasets/ol4ubert/ukrainian-military-losses-in-kursk-mod-russia
Any comment is welcome. Please use the Discussion feature or send me an email directly.
This dataset is one of the outputs of the Global Spatially-Disaggregated Crop Production Statistics Data (MapSPAM) for 2010, which includes physical area, harvest area, production and yield, for 42 crops, disaggregated at the input-levels (e.g., irrigated/rainfed and high/low-input) on a 10 km grid globally. Crop production values in this dataset are given per ha for each technology aggregated by categories (crops/food/non-food), with no information on individual crops. Unit of measure: Production per ha for each technology: mt/ha. This new version of MapSPAM, available to download from the Harvard Dataverse Website, marks the third generation of the SPAM data series, following those of 2000 and 2005. More information on the production systems and selected crops is available in the Global Spatially-Disaggregated Crop Production Statistics Data (MapSPAM) full metadata at https://data.apps.fao.org/map/catalog/srv/eng/catalog.search#/metadata/59f7a5ef-2be4-43ee-9600-a6a9e9ff562a
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Zip files enertalk-dataset-{house_number} contain a directory for each house. Each directory holds a set of subdirectories that contain Parquet files for the daily aggregate and appliance-level data. The naming convention for these subdirectories is “” (e.g. “20161124” for November 24, 2016). The Parquet files are named “_.parquet.gzip” (e.g. “01_fridge.parquet.gzip”). In these names, the two-digit integer is uniquely associated with a distinct measuring device in a house. Each Parquet file consists of three columns: “timestamp,” “active_power,” and “reactive_power.” The “timestamp” column contains Unix timestamps in milliseconds, such that 1000 corresponds to one second. The “active_power” column represents active power in watts and the “reactive_power” column represents reactive power in VAR (volt-ampere reactive) units.
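The layout described above can be handled with a few lines of standard-library Python. The sketch below is illustrative only: the function names are hypothetical, the house directory name `00` in the sample path is an assumption, and the parsing pattern is inferred from the two examples given (“20161124” and “01_fridge.parquet.gzip”) rather than from a documented specification.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def parse_enertalk_path(path):
    """Extract the recording day, device number and label from a file path.

    Assumes a layout like <house>/<YYYYMMDD>/<NN>_<label>.parquet.gzip,
    inferred from the examples in the description.
    """
    p = PurePosixPath(path)
    day = datetime.strptime(p.parent.name, "%Y%m%d").date()
    stem = p.name.split(".", 1)[0]           # e.g. "01_fridge"
    device_id, label = stem.split("_", 1)    # two-digit device number, label
    return day, int(device_id), label


def ms_to_datetime(timestamp_ms):
    """Convert a Unix timestamp in milliseconds to a UTC datetime."""
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)


day, device_id, label = parse_enertalk_path("00/20161124/01_fridge.parquet.gzip")
one_second = ms_to_datetime(1000)
print(day, device_id, label, one_second.isoformat())
```

The files themselves could then be loaded with any Parquet reader, for example `pandas.read_parquet`, and the “timestamp” column converted in the same way.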
This dataset is one of the outputs of the Global Spatially-Disaggregated Crop Production Statistics Data (MapSPAM) for 2010, which includes physical area, harvest area, production and yield, for 42 crops, disaggregated at the input-levels (e.g., irrigated/rainfed and high/low-input) on a 10 km grid globally. Harvested area values in this dataset are given for each technology aggregated by categories (crops/food/non-food), with no information on individual crops. Unit of measure: Harvested area for each technology: ha. This new version of MapSPAM, available to download from the Harvard Dataverse Website, marks the third generation of the SPAM data series, following those of 2000 and 2005. More information on the production systems and selected crops is available in the Global Spatially-Disaggregated Crop Production Statistics Data (MapSPAM) full metadata at https://data.apps.fao.org/map/catalog/srv/eng/catalog.search#/metadata/59f7a5ef-2be4-43ee-9600-a6a9e9ff562a
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The new scorecard tracks progress toward the World Bank Group's (WBG) vision to create a world free of poverty on a livable planet. The Scorecard includes three types of indicators:
- Vision indicators reflect the new vision for the WBG, showing the WBG’s ambition and providing high-level measures to gauge the direction and pace of progress in tackling global challenges. Vision indicators contain aggregated and disaggregated development context data for all countries in the world, where data is available. The Scorecard reports the latest available global updates for each of these indicators.
- Client context indicators reflect the circumstances in client countries, including multidimensional aspects of poverty, and are aligned with the Sustainable Development Goals (SDGs). They serve to frame the challenges clients face, and the context in which the WBG operates. Client Context indicators contain aggregated and disaggregated development context data for World Bank client countries, based on country eligibility for financing and where data is available. The Scorecard also reports the latest available update for each of these indicators.
- WBG Results indicators monitor WBG progress on some of the most critical global challenges. Results data include:
  - Active Portfolio Results: contain achieved and expected results of WBG operations based on its active portfolio as of the end of June 2024. Includes aggregated and disaggregated data.
  - Results achieved since July 1st, 2023: contain cumulative results achieved between July 1st, 2023 and June 30, 2024 from active and closed projects. Results achieved before July 1st, 2023 are excluded from this calculation. Includes aggregated data for World Bank, IBRD and IDA only. IFC and MIGA do not currently report this data.
  - Operations Details: operation-level detail is provided for World Bank projects. However, in alignment with IFC and MIGA Access to Information Policies, project-level data is available in an aggregated format on the WBG Scorecard, provided the minimum threshold to secure individual clients' data is satisfied.
This collection includes only a subset of indicators from the source dataset.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Percent of the overall population that is misallocated to unsettled cells (no exclusion), by aggregation level of the input data and output grid cell size.