100+ datasets found
  1. German Credit Dataset Dataset

    • paperswithcode.com
    German Credit Dataset Dataset [Dataset]. https://paperswithcode.com/dataset/german-credit-dataset
    Description

    Two datasets are provided. The original dataset, in the form provided by Prof. Hofmann, contains categorical/symbolic attributes and is in the file "german.data".

    For algorithms that need numerical attributes, Strathclyde University produced the file "german.data-numeric". This file has been edited and several indicator variables added to make it suitable for algorithms which cannot cope with categorical variables. Several attributes that are ordered categorical (such as attribute 17) have been coded as integer. This was the form used by StatLog.
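The two encodings mentioned above (integer codes for ordered categories, indicator variables for unordered ones) can be sketched as follows. This is only an illustration: the level names, the helper functions, and the choice of a job-level attribute and a housing attribute are assumptions made for the example, and the actual column layout of "german.data-numeric" is not reproduced here.

```python
# Hypothetical level lists used only for illustration; they are not taken
# from "german.data-numeric" itself.
JOB_ORDER = ["unskilled_non_resident", "unskilled_resident", "skilled", "highly_qualified"]
HOUSING_LEVELS = ["rent", "own", "for_free"]


def ordinal_code(value, ordered_levels):
    """Code an ordered categorical value as its 1-based rank."""
    return ordered_levels.index(value) + 1


def indicator_columns(value, levels):
    """Expand an unordered categorical value into 0/1 indicator variables."""
    return [1 if value == level else 0 for level in levels]


def encode(record):
    """Return one numeric row: the job rank followed by the housing indicators."""
    return [ordinal_code(record["job"], JOB_ORDER)] + indicator_columns(
        record["housing"], HOUSING_LEVELS
    )


row = encode({"job": "skilled", "housing": "own"})
print(row)  # [3, 0, 1, 0]
```

The ordinal code keeps the ranking information in a single column, while the indicator columns avoid implying an order among categories that have none.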

    This dataset requires use of a cost matrix:

                     Predicted Good   Predicted Bad
    Actual Good      0                1
    Actual Bad       5                0

    The rows represent the actual classification and the columns the predicted classification.

    It is worse to class a customer as good when they are bad (5), than it is to class a customer as bad when they are good (1).
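As a minimal sketch of how such a cost matrix might be applied, the snippet below sums the cost of a set of predictions and picks the label with the lower expected cost for a given estimated probability of "bad". The function names and the toy labels are assumptions made for this example; only the four cost values come from the description above.

```python
# Cost matrix from the description: keys are (actual, predicted).
COST = {
    ("good", "good"): 0,
    ("good", "bad"): 1,
    ("bad", "good"): 5,
    ("bad", "bad"): 0,
}


def total_cost(actual, predicted):
    """Sum the misclassification cost over paired actual/predicted labels."""
    return sum(COST[(a, p)] for a, p in zip(actual, predicted))


def min_cost_label(p_bad):
    """Pick the label with the lower expected cost, given an estimated P(bad)."""
    cost_if_good = p_bad * COST[("bad", "good")]         # costly only if actually bad
    cost_if_bad = (1.0 - p_bad) * COST[("good", "bad")]  # costly only if actually good
    return "bad" if cost_if_bad < cost_if_good else "good"


# Toy labels, invented for the example.
actual = ["good", "bad", "bad", "good"]
predicted = ["good", "good", "bad", "bad"]
example_cost = total_cost(actual, predicted)  # one bad->good (5) plus one good->bad (1)
print(example_cost)  # 6
```

With these costs, predicting "bad" has the lower expected cost as soon as the estimated probability of "bad" exceeds 1/6, which is why a plain 0.5 threshold would be too lenient under this matrix.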

  2. German Credit Scoring Data

    • kaggle.com
    Updated Jan 17, 2024
    Elshan Kazim (2024). German Credit Scoring Data [Dataset]. https://www.kaggle.com/datasets/elsnkazm/german-credit-scoring-data
    Dataset updated
    Jan 17, 2024
    Dataset provided by
    Kaggle
    Authors
    Elshan Kazim
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Context

    This dataset classifies people described by a set of attributes as good or bad credit risks. Link to the original dataset: German Credit Data

    Dataset Characteristics   # Instances   # Features
    Multivariate              1000          20

    Because the original dataset encodes its categorical features as opaque codes that are hard to interpret, we have mapped those codes to descriptive labels.

    Content

    Features and explanations

    1. checking_acc_status (categorical) - Status of existing checking account
      • below_0: ... < 0 DM
      • below_200: 0 <= ... < 200 DM
      • above_200: ... >= 200 DM / salary assignments for at least 1 year
      • no_checking_acc: no checking account
    2. duration (numeric) - Agreed Loan Duration in months
    3. cred_hist (categorical) - Credit history status
      • no_loan_or_paid_duly_other: no credits taken/ all credits paid back duly
      • paid_duly_this_bank: all credits at this bank paid back duly
      • curr_loans_paid_duly: existing credits paid back duly till now
      • delay_in_past: delay in paying off in the past
      • risky_acc_or_curr_loan_other: critical account/ other credits existing (not at this bank)
    4. purpose (categorical) - Loan Request Purpose
      • car_new: car (new)
      • car_used: car (used)
      • furniture_equipment: furniture/equipment
      • radio_tv: radio/television
      • domestic_appliance: domestic appliances
      • repairs: repairs
      • education: education
      • retraining: retraining
      • business: business
      • others: others
    5. loan_amt (numerical) - Credit amount
    6. saving_acc_bonds (categorical) - Savings account/bonds
      • below_100: ... < 100 DM
      • below_500: 100 <= ... < 500 DM
      • below_1000: 500 <= ... < 1000 DM
      • above_1000: ... >= 1000 DM
      • unknown_no_saving_acc: unknown/ no savings account
    7. present_employment_since (categorical) - Present employment since
      • unemployed: unemployed
      • below_1y: ... < 1 year
      • below_4y: 1 <= ... < 4 years
      • below_7y: 4 <= ... < 7 years
      • above_7y: ... >= 7 years
    8. installment_rate (numerical) - Installment rate in percentage of disposable income
    9. personal_stat_gender (categorical) - Personal status and sex
      • male_divorced_separated
      • female_divorced_separated_married
      • male_single
      • male_married_widowed
      • female_single
    10. other_debtors_guarantors (categorical: co-applicant, guarantor, none)
    11. present_residence_since (numerical)
    12. property (categorical)
      • real_estate
      • life_insurance_or_agreements: if not real_estate: building society savings agreement/ life insurance
      • car_or_other: if not others: car or other, not in attribute 6
      • unknown_or_no_property: unknown / no property
    13. age (numerical)
    14. other_installment_plans (categorical: bank, stores, none)
    15. housing (categorical: rent, own, for_free)
    16. num_curr_loans - Number of existing credits at this bank
    17. job (categorical)
      • unemployed_non_resident: unemployed/ unskilled - non-resident
      • unskilled_resident: unskilled - resident
      • skilled_official: skilled employee / official
      • management_or_self_emp: management/ self-employed/highly qualified employee/ officer
    18. num_people_provide_maint (numerical) - Number of people being liable to provide maintenance for
    19. telephone (categorical)
    20. is_foreign_worker (categorical) - Indicates whether the individual is a foreign worker
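The code-to-label mapping described above can be sketched for the first two attributes. The codes A11 to A14 are the ones documented for attribute 1 in the original UCI file; the dictionary name, the helper function, and the sample fragment are assumptions made for this illustration and are not taken from the Kaggle author's conversion.

```python
# Original codes for attribute 1 (status of existing checking account),
# mapped to the readable labels used in the feature list.
CHECKING_ACC_STATUS = {
    "A11": "below_0",          # ... < 0 DM
    "A12": "below_200",        # 0 <= ... < 200 DM
    "A13": "above_200",        # ... >= 200 DM / salary assignments for at least 1 year
    "A14": "no_checking_acc",  # no checking account
}


def decode_prefix(raw_line):
    """Decode the first two whitespace-separated fields of a raw record.

    An unknown code raises KeyError, so malformed rows surface early.
    """
    fields = raw_line.split()
    return {
        "checking_acc_status": CHECKING_ACC_STATUS[fields[0]],
        "duration": int(fields[1]),
    }


# Illustrative fragment only, not a real row from the dataset.
sample = decode_prefix("A12 24")
print(sample)  # {'checking_acc_status': 'below_200', 'duration': 24}
```

The remaining categorical attributes would need their own dictionaries, one per attribute, following the same pattern.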
  3. German Credit Risk

    • kaggle.com
    Updated Dec 14, 2016
    UCI Machine Learning (2016). German Credit Risk [Dataset]. https://www.kaggle.com/uciml/german-credit/code
    Dataset updated
    Dec 14, 2016
    Dataset provided by
    Kaggle
    Authors
    UCI Machine Learning
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Context

    The original dataset contains 1000 entries with 20 categorical/symbolic attributes prepared by Prof. Hofmann. In this dataset, each entry represents a person who takes credit from a bank. Each person is classified as a good or bad credit risk according to the set of attributes. The link to the original dataset can be found below.

    Content

    It is almost impossible to understand the original dataset due to its complicated system of categories and symbols. Thus, I wrote a small Python script to convert it into a readable CSV file. Several columns are simply ignored, because in my opinion either they are not important or their descriptions are obscure. The selected attributes are:

    1. Age (numeric)
    2. Sex (text: male, female)
    3. Job (numeric: 0 - unskilled and non-resident, 1 - unskilled and resident, 2 - skilled, 3 - highly skilled)
    4. Housing (text: own, rent, or free)
    5. Saving accounts (text - little, moderate, quite rich, rich)
    6. Checking account (numeric, in DM - Deutsche Mark)
    7. Credit amount (numeric, in DM)
    8. Duration (numeric, in months)
    9. Purpose (text: car, furniture/equipment, radio/TV, domestic appliances, repairs, education, business, vacation/others)

    Acknowledgements

    Source: UCI

  4. German Dataset

    • ig.shaip.com
    Updated Jun 11, 2023
    Shaip (2023). German Dataset [Dataset]. https://ig.shaip.com/offerings/speech-data-catalog/german-dataset/
    Dataset updated
    Jun 11, 2023
    Dataset authored and provided by
    Shaip
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    High-quality German call-center and IVR dataset for AI and speech models.

  5. German Dataset

    • universe.roboflow.com
    zip
    Updated Jun 15, 2023
    + more versions
    faculty of engineering minia university (2023). German Dataset [Dataset]. https://universe.roboflow.com/faculty-of-engineering-minia-university/german-7b6mo
    Available download formats: zip
    Dataset updated
    Jun 15, 2023
    Dataset authored and provided by
    faculty of engineering minia university
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Variables measured
    Sign Bounding Boxes
    Description

    German

    ## Overview
    
    German is a dataset for object detection tasks - it contains Sign annotations for 898 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  6. German-PD

    • huggingface.co
    Updated Nov 6, 2024
    PleIAs (2024). German-PD [Dataset]. https://huggingface.co/datasets/PleIAs/German-PD
    Dataset updated
    Nov 6, 2024
    Dataset authored and provided by
    PleIAs
    Description

    🇩🇪 German Public Domain 🇩🇪

    German-Public Domain or German-PD is a large collection aiming to aggregate all German monographs and periodicals in the public domain. As of March 2024, it is the biggest German open corpus.

      Dataset summary
    

    The collection contains 260,638 individual texts making up 37,650,706,611 words recovered from multiple sources, including Internet Archive and various European national libraries and cultural heritage institutions. Each parquet file… See the full description on the dataset page: https://huggingface.co/datasets/PleIAs/German-PD.

  7. Ten Thousand German News Articles Dataset

    • kaggle.com
    • tblock.github.io
    zip
    Updated Jan 20, 2022
    Timo Block (2022). Ten Thousand German News Articles Dataset [Dataset]. https://www.kaggle.com/tblock/10kgnad
    Available download formats: zip (21144764 bytes)
    Dataset updated
    Jan 20, 2022
    Authors
    Timo Block
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
    License information was derived automatically

    Description

    (see https://tblock.github.io/10kGNAD/ for the original dataset page)

    This page introduces the 10k German News Articles Dataset (10kGNAD), a German topic classification dataset. The 10kGNAD is based on the One Million Posts Corpus and is available under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. You can download the dataset here.

    Why a German dataset?

    English text classification datasets are common. Examples include the large AG News, the class-rich 20 Newsgroups, and the large-scale DBpedia ontology datasets for topic classification, as well as the commonly used IMDb and Yelp datasets for sentiment analysis. Non-English datasets, especially German datasets, are less common. There is a collection of sentiment analysis datasets assembled by the Interest Group on German Sentiment Analysis. However, to my knowledge, no German topic classification dataset is available to the public.

    Due to grammatical differences between the English and the German language, a classifier might be effective on an English dataset but less effective on a German one. German is more highly inflected than English, and long compound words are quite common. One would need to evaluate a classifier on multiple German datasets to get a sense of its effectiveness.

    The dataset

    The 10kGNAD dataset is intended to solve part of this problem as the first German topic classification dataset. It consists of 10273 German-language news articles from an Austrian online newspaper, categorized into nine topics. These articles are a previously unused part of the One Million Posts Corpus.

    In the One Million Posts Corpus, each article has a topic path, for example Newsroom/Wirtschaft/Wirtschaftpolitik/Finanzmaerkte/Griechenlandkrise. The 10kGNAD uses the second part of the topic path, here Wirtschaft, as the class label. As a result, the dataset can be used for multi-class classification.
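The label rule described here (take the second component of the topic path) can be sketched in a few lines. The function name is an assumption for this example and is not taken from the project's own extraction scripts.

```python
def topic_label(topic_path):
    """Return the second component of a slash-separated topic path."""
    parts = topic_path.split("/")
    if len(parts) < 2:
        raise ValueError(f"topic path has no second component: {topic_path!r}")
    return parts[1]


label = topic_label("Newsroom/Wirtschaft/Wirtschaftpolitik/Finanzmaerkte/Griechenlandkrise")
print(label)  # Wirtschaft
```

Because every article gets exactly one label this way, the result is a single-label, multi-class problem rather than a multi-label one.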

    I created and used this dataset in my thesis to train and evaluate four text classifiers on the German language. By publishing the dataset I hope to support the advancement of tools and models for the German language. Additionally, this dataset can be used as a benchmark dataset for German topic classification.

    Numbers and statistics

    As in most real-world datasets, the class distribution of the 10kGNAD is not balanced. The biggest class, Web, consists of 1678 articles, while the smallest class, Kultur, contains only 539 articles. However, articles from the Web class have on average the fewest words, while articles from the Kultur (culture) class have the second most words.
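To put the imbalance in perspective, the short calculation below derives the class shares from the three counts quoted in the description. The variable names are made up for this example, and treating the Web share as a majority-class baseline assumes the evaluation data keeps the same class proportions.

```python
TOTAL_ARTICLES = 10273  # from the dataset description
LARGEST_CLASS = 1678    # Web
SMALLEST_CLASS = 539    # Kultur

# Accuracy of a trivial classifier that always predicts the largest class.
majority_baseline = LARGEST_CLASS / TOTAL_ARTICLES
smallest_share = SMALLEST_CLASS / TOTAL_ARTICLES
imbalance_ratio = LARGEST_CLASS / SMALLEST_CLASS

print(f"majority baseline: {majority_baseline:.1%}")
print(f"smallest class share: {smallest_share:.1%}")
print(f"largest/smallest ratio: {imbalance_ratio:.2f}")
```

The largest class covers roughly 16% of the articles and the smallest roughly 5%, a ratio of about 3.1 to 1, so a classifier should clearly beat about 16% accuracy before it is considered useful.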

    Splitting into train and test

    I propose a stratified split of 10% for testing and the remaining articles for training. To use the dataset as a benchmark dataset, please use the train.csv and test.csv files located in the project root.
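For readers unfamiliar with the term, a stratified split holds out the same fraction of every class. The sketch below shows the idea using only the Python standard library; it is not the author's split script, the function name and seed are assumptions, and the toy label counts are invented. For benchmark comparisons the provided train.csv and test.csv should still be used.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.1, seed=0):
    """Return (train_indices, test_indices), splitting each class by the same fraction."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Toy labels with hypothetical counts, not the real class distribution.
labels = ["Web"] * 30 + ["Kultur"] * 10
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 36 4
```

Splitting per class rather than over the whole dataset keeps small classes such as Kultur represented in the test set in proportion to their size.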

    Code

    Python scripts to extract the articles and split them into a train and a test set are available in the code directory of this project. Make sure to install the requirements. The original corpus.sqlite3 is required to extract the articles (download here (compressed) or here (uncompressed)).

    License

    Creative Commons License

    This dataset is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Please consider citing the authors of the One Million Posts Corpus if you use the dataset.

  8. German Dataset

    • la.shaip.com
    Updated Dec 8, 2024
    Shaip (2024). German Dataset [Dataset]. https://la.shaip.com/offerings/speech-data-catalog/german-dataset/
    Dataset updated
    Dec 8, 2024
    Dataset authored and provided by
    Shaip
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    High-quality German call-center and IVR dataset for AI and speech models.

  9. German Fake News Dataset "GermanFakeNC"

    • live.european-language-grid.eu
    • explore.openaire.eu
    json
    Updated Apr 15, 2024
    + more versions
    (2024). German Fake News Dataset "GermanFakeNC" [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7564
    Available download formats: json
    Dataset updated
    Apr 15, 2024
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Germany
    Description

    "GermanFakeNC" is a German Fake News Corpus including 490 texts which were retrieved from German alternative online media sources. Every fake statement in the text was verified claim-by-claim by authoritative sources (e.g. from local police authorities, scientific studies, the police press office, etc.). The time interval for most of the news is established from December 2015 to March 2018.

    Steps to reproduce the data are described in the README file.

    Please cite:

    @inproceedings{TPDL_Vogel19, author = {Inna Vogel and Peter Jiang}, title = {Fake News Detection with the New German Dataset "GermanFakeNC"}, booktitle = {Digital Libraries for Open Knowledge - 23rd International Conference on Theory and Practice of Digital Libraries, {TPDL} 2019, Oslo, Norway, September 9-12, 2019, Proceedings}, pages = {288--295}, year = {2019}, url = {https://doi.org/10.1007/978-3-030-30760-8\_25}, doi = {10.1007/978-3-030-30760-8\_25},}

  10. Traffic German Dataset

    • universe.roboflow.com
    zip
    Updated Apr 29, 2024
    Object detection (2024). Traffic German Dataset [Dataset]. https://universe.roboflow.com/object-detection-7sfqy/traffic-german
    Available download formats: zip
    Dataset updated
    Apr 29, 2024
    Dataset authored and provided by
    Object detection
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Variables measured
    Football Player Detection Bounding Boxes
    Description

    Traffic German

    ## Overview
    
    Traffic German is a dataset for object detection tasks - it contains Football Player Detection annotations for 6,523 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  11. hatecheck-german

    • huggingface.co
    Updated Jan 23, 2025
    + more versions
    Paul Röttger (2025). hatecheck-german [Dataset]. https://huggingface.co/datasets/Paul/hatecheck-german
    Dataset updated
    Jan 23, 2025
    Authors
    Paul Röttger
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Dataset Card for Multilingual HateCheck

      Dataset Description
    

    Multilingual HateCheck (MHC) is a suite of functional tests for hate speech detection models in 10 different languages: Arabic, Dutch, French, German, Hindi, Italian, Mandarin, Polish, Portuguese and Spanish. For each language, there are 25+ functional tests that correspond to distinct types of hate and challenging non-hate. This allows for targeted diagnostic insights into model performance. For more details… See the full description on the dataset page: https://huggingface.co/datasets/Paul/hatecheck-german.

  12. GERDA -- German Election Database

    • german-elections.com
    Updated Jan 2, 2006
    Hanno Hilbig (2006). GERDA -- German Election Database [Dataset]. http://www.german-elections.com/
    Dataset updated
    Jan 2, 2006
    Authors
    Hanno Hilbig
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Germany
    Description

    Comprehensive dataset of local, state, and federal election results in Germany, facilitating research on electoral behavior, political representation, and political responsiveness.

  13. German-PD-Newspapers

    • huggingface.co
    Updated Dec 5, 2024
    Sebastian Majstorovic (2024). German-PD-Newspapers [Dataset]. https://huggingface.co/datasets/storytracer/German-PD-Newspapers
    Dataset updated
    Dec 5, 2024
    Authors
    Sebastian Majstorovic
    License

    https://choosealicense.com/licenses/cc0-1.0/

    Description

    Dataset Card for Public Domain Newspapers (German)

    This dataset contains 13 billion words of OCR text extracted from German historical newspapers.

      Dataset Details

      Dataset Description

    • Curated by: Sebastian Majstorovic
    • Language(s) (NLP): German
    • License: Dataset: CC0, Texts: Public Domain

      Dataset Sources [optional]
    

    Repository: https://www.deutsche-digitale-bibliothek.de/newspaper

      Copyright & License
    

    The newspapers texts have been… See the full description on the dataset page: https://huggingface.co/datasets/storytracer/German-PD-Newspapers.

  14. german-hate-speech-superset

    • huggingface.co
    Updated Nov 6, 2024
    Manuel Tonneau (2024). german-hate-speech-superset [Dataset]. https://huggingface.co/datasets/manueltonneau/german-hate-speech-superset
    Dataset updated
    Nov 6, 2024
    Authors
    Manuel Tonneau
    Description

    German Hate Speech Superset

    This dataset is a superset (N=50,545) of posts annotated as hateful or not. It results from preprocessing and merging all available German hate speech datasets in April 2024. These datasets were identified through a systematic survey of hate speech datasets conducted in early 2024. We only kept datasets that:

    • are documented
    • are publicly available
    • focus on hate speech, defined broadly as "any kind of communication in speech, writing or behavior, that…

    See the full description on the dataset page: https://huggingface.co/datasets/manueltonneau/german-hate-speech-superset.

  15. Voxforge German Dataset

    • paperswithcode.com
    Updated Nov 14, 2022
    (2022). Voxforge German Dataset [Dataset]. https://paperswithcode.com/dataset/voxforge-german
    Dataset updated
    Nov 14, 2022
    Description

    VoxForge is an open speech dataset that was set up to collect transcribed speech for use with Free and Open Source Speech Recognition Engines (on Linux, Windows and Mac).

    We will make available all submitted audio files under the GPL license, and then 'compile' them into acoustic models for use with Open Source speech recognition engines such as CMU Sphinx, ISIP, Julius (github) and HTK (note: HTK has distribution restrictions).

  16. German Reichstag Election Data, 1871-1912

    • icpsr.umich.edu
    ascii, sas, spss
    Updated Jan 12, 2006
    Inter-university Consortium for Political and Social Research (2006). German Reichstag Election Data, 1871-1912 [Dataset]. http://doi.org/10.3886/ICPSR00043.v1
    Available download formats: sas, spss, ascii
    Dataset updated
    Jan 12, 2006
    Dataset authored and provided by
    Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
    License

    https://www.icpsr.umich.edu/web/ICPSR/studies/43/terms

    Time period covered
    1871 - 1912
    Area covered
    Global, Germany
    Description

    This data collection contains electoral data at the wahlkreis and staat levels for the Reichstag elections of 1871, 1874, 1877, 1878, 1881, 1884, 1890, 1893, 1898, 1903, 1907, and 1912. The variables for each election provide information on the votes cast for parties, including the Conservative Party, the German Empire Party, the National-Liberals, the Liberal Empire Party, the People's Party, the Social Democrats, the Progress Party, the Catholic Center, the Particularists, the Poles Party, the Protest Party, the Antisemites, the Free-thinking People's Party, the German Reform Party, the Farmers' Union, the Peasants' Union, and splinter parties. Data are also provided on the total population in 1871 and every fifth year between 1875 and 1910, and the proportions of Protestants and of Catholics in the total population for 1871, 1875, 1880, 1885, 1890, 1905, and 1910. Additional variables provide information on the number of eligible voters, valid and invalid votes cast, and voter turnout.

  17. Germany: civilian workforce by gender and foreign workers 1939-1944

    • statista.com
    Updated Dec 18, 2012
    Statista (2012). Germany: civilian workforce by gender and foreign workers 1939-1944 [Dataset]. https://www.statista.com/statistics/1290338/german-workforce-wwii-background/
    Dataset updated
    Dec 18, 2012
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    May 1941 - Sep 1944
    Area covered
    Germany
    Description

    In late May 1939, just three months before the Second World War began in Europe, Germany's workforce was made up of almost 25 million men, 15 million women, and a very small number of foreign workers. The share of German men in the workforce decreased each year thereafter, as more were conscripted into the armed forces, and there were approximately 11 million fewer German male citizens in the workforce by September 1944. The number of German women fluctuated, but remained between 14 and 15 million throughout the given period, and it exceeded the number of German men in 1944. Despite the number of German men in the workforce dropping by 45 percent, the total number of workers in Germany was consistently around 36 million between 1940 and 1944, as this difference was offset by foreign and forced laborers. These workers were mostly drafted from annexed territories in Eastern Europe, and prisoners were transferred from concentration and POW camps to meet the labor demands in various areas of Germany.

  18. The German Cliometrics Database (replication data)

    • journaldata.zbw.eu
    csv, json
    Updated Sep 17, 2024
    Tobias A. Jopp; Mark Spoerer (2024). The German Cliometrics Database (replication data) [Dataset]. http://doi.org/10.15456/vswg.2024078.1048204229
    Available download formats: csv (7674), csv (1184041), json (1643212)
    Dataset updated
    Sep 17, 2024
    Dataset provided by
    ZBW - Leibniz Informationszentrum Wirtschaft
    Authors
    Tobias A. Jopp; Mark Spoerer
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Germany
    Description

    This short article introduces the German Cliometrics Database as the foundation of Jopp and Spoerer (2024), who trace cliometric research on German history. This newly constructed database of every publication which (1) contributes to the historiography of Germany and (2) employs, as a baseline, inferential statistics enables researchers to find cliometric studies related to their own work much more quickly. Even though no full texts are provided along with the data file, the collected abstracts or summaries for every publication in the database allow for some baseline text mining approaches. Along with the remaining information provided, they may also form the basis for broader bibliometric or historiographical studies.

  19. Population numbers in Germany 1990-2023

    • statista.com
    Updated Feb 24, 2025
    Statista Research Department (2025). Population numbers in Germany 1990-2023 [Dataset]. https://www.statista.com/topics/13131/german-election-2025/
    Dataset updated
    Feb 24, 2025
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Statista Research Department
    Area covered
    Germany
    Description

    This statistic shows the development of population numbers in Germany from 1990 to 2023. As of December 31, 2023, the population of Germany amounted to 84.67 million people, an increase compared to the previous year.

  20. Data from: German Socio-Economic Panel

    • dknet.org
    • neuinfo.org
    • +2more
    Updated Jul 31, 2024
    (2024). German Socio-Economic Panel [Dataset]. http://identifiers.org/RRID:SCR_013140
    Dataset updated
    Jul 31, 2024
    Description

    A wide-ranging, representative longitudinal study of private households that permits researchers to track yearly changes in the health and economic well-being of older people relative to younger people in Germany from 1984 to the present. Every year, nearly 11,000 households and more than 20,000 persons were sampled by the fieldwork organization TNS Infratest Sozialforschung. The data provide information on all household members, consisting of Germans living in the Old and New German States, foreigners, and recent immigrants to Germany. The Panel was started in 1984.

    Some of the many topics include household composition, occupational biographies, employment, earnings, health and satisfaction indicators. In addition to standard demographic information, the GSOEP questionnaire also contains objective measures (use of time, use of earnings, income, benefit payments, health, etc.) and subjective measures (level of satisfaction with various aspects of life, hopes and fears, political involvement, etc.) of the German population.

    The first wave, collected in 1984 in the western states of Germany, contains 5,921 households in two randomly sampled sub-groups: 1) German Sub-Sample: people in private households where the head of household was not of Turkish, Greek, Yugoslavian, Spanish, or Italian nationality; 2) Foreign Sub-Sample: people in private households where the head of household was of Turkish, Greek, Yugoslavian, Spanish, or Italian nationality. In each year since 1984, the GSOEP has attempted to re-interview original sample members unless they leave the country.

    A major expansion of the GSOEP was necessitated by German reunification. In June 1990, the GSOEP fielded a first wave in the eastern states of Germany. This sub-sample includes individuals in private households where the head of household was a citizen of the German Democratic Republic. The first wave contains 2,179 households.

    In 1994 and 1995, the GSOEP added a sample of immigrants to the western states of Germany from 522 households who arrived after 1984, which in 2006 included 360 households and 684 respondents. In 1998 a new refreshment sample of 1,067 households was selected from the population of private households. In 2000 a sample was drawn using essentially similar selection rules as the original German sub-sample and the 1998 refreshment sample with some modifications. The 2000 sample includes 6,052 households covering 10,890 individuals. Finally, in 2002, an overrepresentation of high-income households was added with 2,671 respondents from 1,224 households, of which 1,801 individuals (689 households) were still included in the year 2006.

    Data Availability: The data are available to researchers in Germany and abroad in SPSS, SAS, TDA, STATA, and ASCII format for immediate use. Extensive documentation in English and German is available online. The SOEP data are available in German and English, alone or in combination with data from other international panel surveys (e.g., the Cross-National Equivalent Files which contain panel data from Canada, Germany, and the United States). The public use file of the SOEP with anonymous microdata is provided free of charge (plus shipping costs) to universities and research centers. The individual SOEP datasets cannot be downloaded from the DIW Web site due to data protection regulations. Use of the data is subject to special regulations, and data privacy laws necessitate the signing of a data transfer contract with the DIW. The English Language Public Use Version of the GSOEP is distributed and administered by the Department of Policy Analysis and Management, Cornell University. The data are available on CD-ROM from Cornell for a fee. Full instructions for accessing GSOEP data may be accessed on the project website, http://www.human.cornell.edu/che/PAM/Research/Centers-Programs/German-Panel/cnef.cfm

    • Dates of Study: 1984-present
    • Study Features: Longitudinal, International
    • Sample Size:
      • 1984: 12,290 (GSOEP West)
      • 1990: 4,453 (GSOEP East)
      • 2000: 20,000+

    Links:
    • Cornell Project Website: http://www.human.cornell.edu/che/PAM/Research/Centers-Programs/German-Panel/cnef.cfm
    • GSOEP ICPSR: http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/00131
