100+ datasets found
  1. German Credit Dataset

    • paperswithcode.com
    Cite
    German Credit Dataset [Dataset]. https://paperswithcode.com/dataset/german-credit-dataset
    Description

    Two datasets are provided. The original dataset, in the form provided by Prof. Hofmann, contains categorical/symbolic attributes and is in the file "german.data".

    For algorithms that need numerical attributes, Strathclyde University produced the file "german.data-numeric". This file has been edited and several indicator variables added to make it suitable for algorithms which cannot cope with categorical variables. Several attributes that are ordered categorical (such as attribute 17) have been coded as integer. This was the form used by StatLog.

    This dataset requires use of a cost matrix:

              Good   Bad
    Good        0     1
    Bad         5     0

    The rows represent the actual classification and the columns the predicted classification.

    It is worse to class a customer as good when they are bad (5), than it is to class a customer as bad when they are good (1).
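
As a minimal sketch of how the cost matrix above could be applied when scoring a classifier, the snippet below sums the cost over paired actual and predicted labels. The function name `total_cost` and the example labels are hypothetical and not part of the dataset.

```python
# Cost matrix from the description: keys are (actual, predicted).
COST = {
    ("good", "good"): 0,
    ("good", "bad"): 1,
    ("bad", "good"): 5,
    ("bad", "bad"): 0,
}


def total_cost(actual, predicted):
    """Sum the misclassification cost over paired actual/predicted labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(COST[(a, p)] for a, p in zip(actual, predicted))


# Hypothetical labels, for illustration only.
actual = ["good", "bad", "bad", "good"]
predicted = ["good", "good", "bad", "bad"]
print(total_cost(actual, predicted))
```

Comparing models by total cost rather than by accuracy reflects the asymmetry described above: one bad customer classed as good costs as much as five good customers classed as bad.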

  2. Statlog (German Credit Data)

    • data.world
    csv, zip
    Updated Feb 11, 2024
    Cite
    UCI (2024). Statlog (German Credit Data) [Dataset]. https://data.world/uci/statlog-german-credit-data
    Available download formats: zip, csv
    Dataset updated
    Feb 11, 2024
    Dataset provided by
    data.world, Inc.
    Authors
    UCI
    Description

    Source:

    Professor Dr. Hans Hofmann, Institut für Statistik und Ökonometrie, Universität Hamburg, FB Wirtschaftswissenschaften, Von-Melle-Park 5, 2000 Hamburg 13

    Data Set Information:

    Two datasets are provided. The original dataset, in the form provided by Prof. Hofmann, contains categorical/symbolic attributes and is in the file "german.data". For algorithms that need numerical attributes, Strathclyde University produced the file "german.data-numeric". This file has been edited and several indicator variables added to make it suitable for algorithms which cannot cope with categorical variables. Several attributes that are ordered categorical (such as attribute 17) have been coded as integer. This was the form used by StatLog.

    This dataset requires use of a cost matrix (1 = Good, 2 = Bad):

             1    2
        1    0    1
        2    5    0

    The rows represent the actual classification and the columns the predicted classification. It is worse to class a customer as good when they are bad (5), than it is to class a customer as bad when they are good (1).
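
One consequence of these costs can be worked out directly: if a model outputs a probability that a customer is bad, predicting "good" has expected cost 5 * p_bad and predicting "bad" has expected cost 1 * (1 - p_bad), so the cheaper decision switches at p_bad = 1/6. The sketch below assumes such a probability is available; the names `min_cost_prediction` and `THRESHOLD` are illustrative only.

```python
COST_BAD_AS_GOOD = 5  # actual bad, predicted good
COST_GOOD_AS_BAD = 1  # actual good, predicted bad

# Predict "bad" whenever P(bad) exceeds this value (1/6 for the costs above).
THRESHOLD = COST_GOOD_AS_BAD / (COST_GOOD_AS_BAD + COST_BAD_AS_GOOD)


def min_cost_prediction(p_bad):
    """Return the label with the lower expected cost, given P(bad) = p_bad."""
    if not 0.0 <= p_bad <= 1.0:
        raise ValueError("p_bad must be a probability in [0, 1]")
    expected_cost_if_good = COST_BAD_AS_GOOD * p_bad
    expected_cost_if_bad = COST_GOOD_AS_BAD * (1.0 - p_bad)
    return "bad" if expected_cost_if_bad < expected_cost_if_good else "good"


print(THRESHOLD)
print(min_cost_prediction(0.1), min_cost_prediction(0.2))
```

Under this rule a customer is rejected well before the model considers them more likely bad than good, which is the practical effect of the 5-to-1 asymmetry.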

    Attribute Information:

    Attribute 1: (qualitative) Status of existing checking account
      A11 : ... < 0 DM
      A12 : 0 <= ... < 200 DM
      A13 : ... >= 200 DM / salary assignments for at least 1 year
      A14 : no checking account

    Attribute 2: (numerical) Duration in months

    Attribute 3: (qualitative) Credit history
      A30 : no credits taken / all credits paid back duly
      A31 : all credits at this bank paid back duly
      A32 : existing credits paid back duly till now
      A33 : delay in paying off in the past
      A34 : critical account / other credits existing (not at this bank)

    Attribute 4: (qualitative) Purpose
      A40 : car (new)
      A41 : car (used)
      A42 : furniture/equipment
      A43 : radio/television
      A44 : domestic appliances
      A45 : repairs
      A46 : education
      A47 : (vacation - does not exist?)
      A48 : retraining
      A49 : business
      A410 : others

    Attribute 5: (numerical) Credit amount

    Attribute 6: (qualitative) Savings account/bonds
      A61 : ... < 100 DM
      A62 : 100 <= ... < 500 DM
      A63 : 500 <= ... < 1000 DM
      A64 : ... >= 1000 DM
      A65 : unknown / no savings account

    Attribute 7: (qualitative) Present employment since
      A71 : unemployed
      A72 : ... < 1 year
      A73 : 1 <= ... < 4 years
      A74 : 4 <= ... < 7 years
      A75 : ... >= 7 years

    Attribute 8: (numerical) Installment rate in percentage of disposable income

    Attribute 9: (qualitative) Personal status and sex
      A91 : male : divorced/separated
      A92 : female : divorced/separated/married
      A93 : male : single
      A94 : male : married/widowed
      A95 : female : single

    Attribute 10: (qualitative) Other debtors / guarantors
      A101 : none
      A102 : co-applicant
      A103 : guarantor

    Attribute 11: (numerical) Present residence since

    Attribute 12: (qualitative) Property
      A121 : real estate
      A122 : if not A121 : building society savings agreement / life insurance
      A123 : if not A121/A122 : car or other, not in attribute 6
      A124 : unknown / no property

    Attribute 13: (numerical) Age in years

    Attribute 14: (qualitative) Other installment plans
      A141 : bank
      A142 : stores
      A143 : none

    Attribute 15: (qualitative) Housing
      A151 : rent
      A152 : own
      A153 : for free

    Attribute 16: (numerical) Number of existing credits at this bank

    Attribute 17: (qualitative) Job
      A171 : unemployed / unskilled - non-resident
      A172 : unskilled - resident
      A173 : skilled employee / official
      A174 : management / self-employed / highly qualified employee / officer

    Attribute 18: (numerical) Number of people being liable to provide maintenance for

    Attribute 19: (qualitative) Telephone
      A191 : none
      A192 : yes, registered under the customer's name

    Attribute 20: (qualitative) Foreign worker
      A201 : yes
      A202 : no
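
To make the attribute listing concrete, here is a rough sketch of parsing one record into named fields. It assumes each row of "german.data" is whitespace-separated, with the 20 attributes in the order listed above followed by the class label (1 = Good, 2 = Bad); that layout, the column names, and the sample row are assumptions made for illustration.

```python
# Hypothetical column names, one per attribute in the order listed above.
COLUMNS = [
    "checking_status", "duration_months", "credit_history", "purpose",
    "credit_amount", "savings", "employment_since", "installment_rate",
    "personal_status_sex", "other_debtors", "residence_since", "property",
    "age_years", "other_installment_plans", "housing", "existing_credits",
    "job", "num_liable", "telephone", "foreign_worker",
]

# Attributes 2, 5, 8, 11, 13, 16 and 18 are marked numerical in the listing.
NUMERIC = {
    "duration_months", "credit_amount", "installment_rate", "residence_since",
    "age_years", "existing_credits", "num_liable",
}

# Code table for Attribute 1, copied from the listing.
CHECKING_STATUS = {
    "A11": "... < 0 DM",
    "A12": "0 <= ... < 200 DM",
    "A13": "... >= 200 DM / salary assignments for at least 1 year",
    "A14": "no checking account",
}


def parse_row(line):
    """Split one whitespace-separated record into a dict of named fields."""
    fields = line.split()
    if len(fields) != len(COLUMNS) + 1:
        raise ValueError(f"expected 21 fields, got {len(fields)}")
    record = {}
    for name, value in zip(COLUMNS, fields):
        record[name] = int(value) if name in NUMERIC else value
    record["class"] = {"1": "good", "2": "bad"}[fields[-1]]
    return record


# Illustrative row only; built from valid codes in the listing.
sample = "A11 6 A34 A43 1169 A65 A75 4 A93 A101 4 A121 67 A143 A152 2 A173 1 A192 A201 1"
row = parse_row(sample)
print(CHECKING_STATUS[row["checking_status"]], row["credit_amount"], row["class"])
```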

    Relevant Papers:

    N/A

    Papers That Cite This Data Set [1]:

    * Jeroen Eggermont and Joost N. Kok and Walter A. Kosters. Genetic Programming for data classification: partitioning the search space. SAC. 2004.
    * Ke Wang and Shiyu Zhou and Ada Wai-Chee Fu and Jeffrey Xu Yu. Mining Changes of Classification by Correspondence Tracing. SDM. 2003.
    * Avelino J. Gonzalez and Lawrence B. Holder and Diane J. Cook. Graph-Based Concept Learning. FLAIRS Conference. 2001.
    * Oya Ekin and Peter L. Hammer and Alexander Kogan and Pawel Winter. Distance-Based Classification Methods. RUTCOR, Rutgers Center for Operations Research, Rutgers University. 1996.
    * Paul O' Dea and Josephine Griffith and Colm O' Riordan. Combining Feature Selection and Neural Networks for Solving Classification Problems. Information Technology Department, National University of Ireland.
    * Chotirat Ann and Dimitrios Gunopulos. Scaling up the Naive Bayesian Classifier: Using Decision Trees for Feature Selection. Computer Science Department University of California.
    * Paul O' Dea and David Griffith and Colm O' Riordan. DEPARTMENT OF INFORMATION TECHNOLOGY. P. O'Dea (NUI.

    Citation Request:

    Please refer to the Machine Learning Repository's citation policy.

    [1] Papers were automatically harvested and associated with this data set, in collaboration with Rexa.info

    Source: http://archive.ics.uci.edu/ml/datasets/Statlog+%28German+Credit+Data%29

  3. German-Credit-Data-Set-with-Credit-Risk

    • kaggle.com
    Updated Dec 14, 2022
    Cite
    (2022). German-Credit-Data-Set-with-Credit-Risk [Dataset]. https://www.kaggle.com/datasets/benjaminmcgregor/german-credit-data-set-with-credit-risk
    Dataset updated
    Dec 14, 2022
    Description

    This data set has been created by re-integrating the 'credit risk' attribute from Professor Dr. Hans Hofmann's 'German Credit Data' data set on the UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets/statlog+(german+credit+data)) into the refined data set published by UCI (https://www.kaggle.com/datasets/uciml/german-credit).

    I have included the Python script that I wrote to produce this data set.

  4. Replication Data for: German Credit

    • dataverse.harvard.edu
    tsv
    Updated Apr 6, 2016
    Cite
    Harvard Dataverse (2016). Replication Data for: German Credit [Dataset]. http://doi.org/10.7910/DVN/Q8MAW8
    Available download formats: tsv (53493)
    Dataset updated
    Apr 6, 2016
    Dataset provided by
    Harvard Dataverse
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Original data from: https://archive.ics.uci.edu/ml/datasets/Statlog+(German+Credit+Data), using the file "german.data-numeric" version produced by Strathclyde University.

    Changes made:
    - Changed the ordering of Attribute 3 (Credit History) to try to extract a monotone relationship.

    ORIGINAL ORDERING:
    A30 : no credits taken / all credits paid back duly
    A31 : all credits at this bank paid back duly
    A32 : existing credits paid back duly till now
    A33 : delay in paying off in the past
    A34 : critical account / other credits existing (not at this bank)

    NEW ORDERING:
    0 = all credits paid back (A31)
    1 = all credits paid back duly till now (A32)
    2 = no credits taken (A30)
    3 = delay in past (A33)
    4 = critical acct (A34)

    ATTRIBUTES:
    0 CLASS Credit Rating: +1 is bad / -1 is good
    1 BalanceCheque
    2 Loan NumMonth
    3 CreditHistory
    4 CreditAmt
    5 SavingsBalance
    6 Mths in PresentEmployment
    7 PersonStatusSex
    8 PresentResidenceSince
    9 Property
    10 AgeInYears
    11 OtherInstallmentPlans (highest val is NO other installment plans)
    12 NumExistingCreditsThisBank
    13 NumPplLiablMaint
    14 Telephone
    15 ForeignWorker
    16 Purpose-CarNew
    17 Purpose-CarOld
    18 otherdebtor-none (compared to guarantor)
    19 otherdebt-coappl (compared to guarantor)
    20 house-rent (compared to 'for free')
    21 house-owns (compared to 'for free')
    22 job-unemployed (vs mgt)
    23 jobs-unskilled (vs mgt)
    24 job-skilled (vs mgt)
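
The reordering of Credit History described above amounts to a lookup from the original code to its new rank. The sketch below expresses it against the symbolic A30 to A34 codes for readability; how the replication file actually encodes the input (it is derived from "german.data-numeric") is not stated here, so treat the input format and the names used as assumptions.

```python
# New ordering from the description: original code -> new ordinal value.
CREDIT_HISTORY_ORDER = {
    "A31": 0,  # all credits paid back
    "A32": 1,  # all credits paid back duly till now
    "A30": 2,  # no credits taken
    "A33": 3,  # delay in past
    "A34": 4,  # critical acct
}


def recode_credit_history(codes):
    """Map original credit-history codes to the reordered ordinal scale."""
    return [CREDIT_HISTORY_ORDER[code] for code in codes]


# Hypothetical input, for illustration only.
print(recode_credit_history(["A30", "A34", "A31"]))
```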

  5. Ten Thousand German News Articles Dataset

    • kaggle.com
    • tblock.github.io
    zip
    Updated Jan 20, 2022
    Cite
    Timo Block (2022). Ten Thousand German News Articles Dataset [Dataset]. https://www.kaggle.com/tblock/10kgnad
    Available download formats: zip (21144764 bytes)
    Dataset updated
    Jan 20, 2022
    Authors
    Timo Block
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
    License information was derived automatically

    Description

    (see https://tblock.github.io/10kGNAD/ for the original dataset page)

    This page introduces the 10k German News Articles Dataset (10kGNAD), a German topic classification dataset. The 10kGNAD is based on the One Million Posts Corpus and is available under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. You can download the dataset here.

    Why a German dataset?

    English text classification datasets are common. Examples are the big AG News, the class-rich 20 Newsgroups, and the large-scale DBpedia ontology datasets for topic classification, and the commonly used IMDb and Yelp datasets for sentiment analysis. Non-English datasets, especially German datasets, are less common. There is a collection of sentiment analysis datasets assembled by the Interest Group on German Sentiment Analysis. However, to my knowledge, no German topic classification dataset is available to the public.

    Due to grammatical differences between English and German, a classifier might be effective on an English dataset but less effective on a German one. German is more highly inflected than English, and long compound words are much more common. One would need to evaluate a classifier on multiple German datasets to get a sense of its effectiveness.

    The dataset

    The 10kGNAD dataset is intended to solve part of this problem as the first German topic classification dataset. It consists of 10273 German-language news articles from an Austrian online newspaper, categorized into nine topics. These articles are a previously unused part of the One Million Posts Corpus.

    In the One Million Posts Corpus, each article has a topic path, for example Newsroom/Wirtschaft/Wirtschaftpolitik/Finanzmaerkte/Griechenlandkrise. The 10kGNAD uses the second part of the topic path, here Wirtschaft, as the class label. As a result, the dataset can be used for multi-class classification.
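
The labeling rule above can be sketched as a one-line split of the topic path. This is only an illustration of the rule as described, not the project's own extraction script; the function name `class_label` is hypothetical.

```python
def class_label(topic_path):
    """Return the second component of a '/'-separated topic path."""
    parts = topic_path.split("/")
    if len(parts) < 2:
        raise ValueError(f"topic path has no second component: {topic_path!r}")
    return parts[1]


path = "Newsroom/Wirtschaft/Wirtschaftpolitik/Finanzmaerkte/Griechenlandkrise"
print(class_label(path))
```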

    I created and used this dataset in my thesis to train and evaluate four text classifiers on the German language. By publishing the dataset I hope to support the advancement of tools and models for the German language. Additionally, this dataset can be used as a benchmark dataset for German topic classification.

    Numbers and statistics

    As in most real-world datasets, the class distribution of the 10kGNAD is not balanced. The biggest class, Web, consists of 1678 articles, while the smallest class, Kultur, contains only 539 articles. However, articles from the Web class have on average the fewest words, while articles from the Kultur class have the second most words.

    Splitting into train and test

    I propose a stratified split of 10% for testing and the remaining articles for training. To use the dataset as a benchmark dataset, please use the train.csv and test.csv files located in the project root.
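
For readers unfamiliar with the term, a stratified split samples the test fraction separately within each class, so class proportions are preserved. The standard-library sketch below illustrates the idea; it will not reproduce the published train.csv and test.csv, and the function name, seed, and toy labels are assumptions.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.1, seed=0):
    """Return (train_indices, test_indices), sampling each class separately."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Toy labels, for illustration only (not the real class sizes).
labels = ["Web"] * 30 + ["Kultur"] * 10
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

For benchmark comparisons, the provided train.csv and test.csv remain the files to use.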

    Code

    Python scripts to extract the articles and split them into a train and a test set are available in the code directory of this project. Make sure to install the requirements. The original corpus.sqlite3 is required to extract the articles (download here (compressed) or here (uncompressed)).

    License

    Creative Commons License

    This dataset is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Please consider citing the authors of the One Million Posts Corpus if you use the dataset.

  6. Voxforge German Dataset

    • paperswithcode.com
    Updated Aug 2, 2022
    Cite
    (2022). Voxforge German Dataset [Dataset]. https://paperswithcode.com/dataset/voxforge-german
    Dataset updated
    Aug 2, 2022
    Description

    VoxForge is an open speech dataset that was set up to collect transcribed speech for use with Free and Open Source Speech Recognition Engines (on Linux, Windows and Mac).

    We will make available all submitted audio files under the GPL license, and then 'compile' them into acoustic models for use with Open Source speech recognition engines such as CMU Sphinx, ISIP, Julius (github) and HTK (note: HTK has distribution restrictions).

  7. germanquad

    • huggingface.co
    • opendatalab.com
    Updated Jun 16, 2021
    Cite
    deepset (2021). germanquad [Dataset]. https://huggingface.co/datasets/deepset/germanquad
    Dataset updated
    Jun 16, 2021
    Dataset authored and provided by
    deepset (https://www.deepset.ai/)
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    In order to raise the bar for non-English QA, we are releasing a high-quality, human-labeled German QA dataset consisting of 13,722 questions, including a three-way annotated test set. The creation of GermanQuAD is inspired by insights from existing datasets as well as our labeling experience from several industry projects. We combine the strengths of SQuAD, such as high out-of-domain performance, with self-sufficient questions that contain all relevant information for open-domain QA, as in the NaturalQuestions dataset. Unlike in other popular datasets, our training and test sets do not overlap, and they include complex questions that cannot be answered with a single entity or only a few words.

  8. Germany

    • workwithdata.com
    Updated Apr 18, 2024
    Cite
    Work With Data (2024). Germany [Dataset]. https://www.workwithdata.com/place/germany
    Dataset updated
    Apr 18, 2024
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Germany
    Description

    Explore Germany through unique data from multiple sources: key facts, real-time news, interactive charts, detailed maps, and open datasets.

  9. Germany Population: German

    • ceicdata.com
    Updated Mar 15, 2024
    Cite
    CEICdata.com (2024). Germany Population: German [Dataset]. https://www.ceicdata.com/en/germany/population/population-german
    Dataset updated
    Mar 15, 2024
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Time period covered
    Dec 1, 2011 - Dec 1, 2022
    Area covered
    Germany
    Variables measured
    Population
    Description

    Germany Population: German data was reported at 72,034,650 persons in 2022, a decrease from 72,344,071 persons in 2021. The series is updated yearly and averaged 73,301,664 persons from Dec 1970 to 2022, with 53 observations. It reached an all-time high of 75,212,869 persons in 2004 and a record low of 56,478,581 persons in 1986. The series remains active in CEIC and is reported by Statistisches Bundesamt. It is categorized under Global Database’s Germany – Table DE.G001: Population. Population prior to 1990 covers West Germany only.

  10. german dataset

    • kaggle.com
    zip
    Updated Aug 17, 2019
    Cite
    daniel_lopez (2019). german dataset [Dataset]. https://www.kaggle.com/datasets/dnllpz/german-dataset
    Available download formats: zip (17216 bytes)
    Dataset updated
    Aug 17, 2019
    Authors
    daniel_lopez
    Description

    Dataset

    This dataset was created by daniel_lopez

    Contents

  11. Dataset: The plural interpretability of German linking elements...

    • live.european-language-grid.eu
    • zenodo.org
    csv
    Updated Aug 15, 2021
    Cite
    (2021). Dataset: The plural interpretability of German linking elements ("Morphology") [Dataset]. https://live.european-language-grid.eu/catalogue/lcr/7422
    Available download formats: csv
    Dataset updated
    Aug 15, 2021
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
    License information was derived automatically

    Description

    This dataset accompanies a paper to be published in "Morphology" (JOMO, Springer). Under the present DOI, all data generated for this research as well as all scripts used are stored. The paper itself is not CC-licensed, refer to Springer's "Morphology" website for details!

    Abstract

    In this paper, we take a closer theoretical and empirical look at the linking elements in German N1+N2 compounds which are identical to the plural marker of N1 (such as -er with umlaut, as in Häus-er-meer 'sea of houses'). Various perspectives on the actual extent of plural interpretability of these pluralic linking elements are expressed in the literature. We aim to clarify this question by empirically examining to what extent there may be a relationship between plural form and meaning which informs in which sorts of compounds pluralic linking elements appear. Specifically, we investigate whether pluralic linking elements occur especially frequently in compounds where a plural meaning of the first constituent is induced either externally (through plural inflection of the entire compound) or internally (through a relation between the constituents such that N2 forces N1 to be conceptually plural, as in the example above). The results of a corpus study using the DECOW16A corpus and a split-100 experiment show that in the internal but not external plural meaning conditions, a pluralic linking element is preferred over a non-pluralic one, though there is considerable inter-speaker variability, and limitations imposed by other constraints on linking element distribution also play a role. However, we show the overall tendency that German language users do use pluralic linking elements as cues to the plural interpretation of N1+N2 compounds. Our interpretation does not reference a specific morphological framework. Instead, we view our data as strengthening the general approach of probabilistic morphology.

  12. German Language Datasets | Call Center, Virtual Assistant & TTS

    • shaip.com
    • vi.shaip.com
    Updated Jun 11, 2023
    Cite
    Shaip (2023). German Language Datasets | Call Center, Virtual Assistant & TTS [Dataset]. https://www.shaip.com/offerings/speech-data-catalog/german-dataset/
    Dataset updated
    Jun 11, 2023
    Dataset authored and provided by
    Shaip
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Enhance your Conversational AI model with our Off-the-Shelf German Language Datasets. Shaip's high-quality audio datasets are a quick and effective solution for model training.

  13. Vietnamese German Dataset

    • kaggle.com
    zip
    Updated Dec 24, 2018
    Cite
    Nguyễn Trung Hậu (2018). Vietnamese German Dataset [Dataset]. https://www.kaggle.com/flightstar/vietnamese-german-dataset
    Available download formats: zip (456165 bytes)
    Dataset updated
    Dec 24, 2018
    Authors
    Nguyễn Trung Hậu
    License

    Open Database License (ODbL) v1.0 (https://www.opendatacommons.org/licenses/odbl/1.0/)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Nguyễn Trung Hậu

    Released under Database: Open Database, Contents: © Original Authors

    Contents

  14. Texas German Sample Corpus

    • dataverse.tdl.org
    • dataverse-prod.tdl.org
    bin, tsv, txt, wav
    Updated Feb 6, 2024
    Cite
    Margaret Blevins (2024). Texas German Sample Corpus [Dataset]. http://doi.org/10.18738/T8/IOX9ZA
    Available download formats: bin, tsv, txt, wav
    Dataset updated
    Feb 6, 2024
    Dataset provided by
    Texas Data Repository
    Authors
    Margaret Blevins
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    Texas
    Description

    The Texas German Sample Corpus (TGSC) is a collection of annotated transcripts of spoken Texas German (~13.5 hours, 75,000+ tokens). The TGSC was created to implement and test the language-tagging and normalization guidelines proposed in Blevins (2022). Texas German is a set of mixed-language contact varieties of German "spoken in Texas which have descended from the dialects of German brought to Texas in the 19th century" by German-speaking immigrants (Boas 2009: 34).

    The TGSC draws on audio recordings from the Texas German Dialect Archive (TGDA, tgdp.org/dialect-archive) and has the following annotation layers: original TGDA literary transcription, tokenization, language tags, normalization, standard German utterance translation, and the original TGDA word-for-word English translation.

    By using the TGSC database, you agree to the "User Rights and Responsibilities" in accordance with the specifications on https://tgdp.org/dialect-archive/.

    Please cite the following works:
    - For the TGSC: Blevins (2022). The language-tagging & orthographic normalization of spoken mixed-language data, with a focus on Texas German (https://hdl.handle.net/2152/116703).
    - For the TGDA / TGDP (the source material for the TGSC): Boas, Hans C., Marc Pierce, Karen Roesch, Guido Halder, and Hunter Weilbacher (2010). The Texas German Dialect Archive: A Multimedia Resource for Research, Teaching, and Outreach. Journal of Germanic Linguistics, 22(3), 277-296.

  15. Data from: German Weimar Republic Data, 1919-1933

    • icpsr.umich.edu
    ascii, sas, spss
    Updated Dec 22, 2005
    + more versions
    Cite
    Inter-university Consortium for Political and Social Research (2005). German Weimar Republic Data, 1919-1933 [Dataset]. http://doi.org/10.3886/ICPSR00042.v1
    Explore at:
    Available download formats: spss, ascii, sas
    Dataset updated
    Dec 22, 2005
    Dataset authored and provided by
    Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
    License

    https://www.icpsr.umich.edu/web/ICPSR/studies/42/terms

    Time period covered
    1919 - 1933
    Area covered
    Germany
    Description

    This data collection contains electoral and demographic data at several levels of aggregation (Kreis, Land/Regierungsbezirk, and Wahlkreis) for Germany in the Weimar Republic period of 1919-1933. Two datasets are available.

    Part 1, 1919 Data, presents raw and percentage election returns at the Wahlkreis level for the 1919 election to the Nationalversammlung. Information is provided on the number and percentage of eligible voters and the total votes cast for parties such as the German National People's Party, German People's Party, Christian People's Party, German Democratic Party, Social Democratic Party, and Independent Social Democratic Party.

    Part 2, 1920-1933 Data, consists of returns for elections to the Reichstag, 1920-1933, and for the Reichspräsident elections of 1925 and 1932 (including runoff elections in each year), returns for two national referenda, held in 1926 and 1929, and data pertaining to urban population, religion, and occupations, taken from the German Census of 1925. This second dataset contains data at several levels of aggregation and is a merged file. Cross-temporal discrepancies, such as changes in the names of the geographical units and the disappearance of units, have been adjusted for whenever possible. Variables in this file provide information on the total number and percentage of eligible voters and votes cast for parties, including the German National People's Party, German People's Party, German Center Party, German Democratic Party, German Social Democratic Party, German Communist Party, Bavarian People's Party, National Socialist German Workers' Party (Hitler's movement), German Middle Class Party, German Business and Labor Party, Conservative People's Party, and other parties.

    Data are also provided for the total number and percentage of votes cast in the Reichspräsident elections of 1925 and 1932 for candidates Jarres, Held, Ludendorff, Braun, Marx, Hellpach, Thälmann, Hitler, Duesterberg, von Hindenburg, Winter, and others. Additional variables provide information on occupations in the country, including the number of wage earners employed in agriculture, industry and manufacturing, trade and transportation, civil service, army and navy, clergy, public health, welfare, domestic and personal services, and unknown occupations. Other census data cover the total number of wage earners in the labor force and the number of female wage earners employed in all occupations. Also provided are the percentage of the total population living in towns with 5,000 inhabitants or more, and the number and percentage of the population who were Protestants, Catholics, and Jews.

  16. German Consumers/ B2C data in Germany

    • datarade.ai
    Updated Dec 6, 2021
    Cite
    Techsalerator (2021). German Consumers/ B2C data in Germany [Dataset]. https://datarade.ai/data-products/german-consumers-b2c-in-germany-techsalerator
    Explore at:
    Dataset updated
    Dec 6, 2021
    Dataset authored and provided by
    Techsalerator
    Area covered
    Germany
    Description

    With close to 30M records in Germany, Techsalerator has access to some of the highest-quality B2C data in Germany.

    Thanks to our unique tools and data specialists, we can select the ideal targeted dataset based on unique elements such as the location/country, gender, age...

    Whether you are looking for an entire full install, access to one of our APIs, or only a one-time targeted purchase, get in touch with our company and we will fulfill your international data needs.

  17. Domain-Specific Dataset of Difficulty Ratings for German Noun Compounds

    • live.european-language-grid.eu
    txt
    Updated Jan 20, 2021
    Cite
    (2021). Domain-Specific Dataset of Difficulty Ratings for German Noun Compounds [Dataset]. https://live.european-language-grid.eu/catalogue/lcr/4925
    Explore at:
    Available download formats: txt
    Dataset updated
    Jan 20, 2021
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Description

    Dataset with difficulty ratings for 1,030 German closed noun compounds extracted from domain-specific texts on do-it-yourself (DIY), cooking, and automotive topics. It includes two-part compounds for cooking and DIY, and two- to four-part compounds for automotive.

  18. SB-10K german dataset

    • kaggle.com
    zip
    Updated Jan 22, 2024
    Cite
    sary nasser (2024). SB-10K german dataset [Dataset]. https://www.kaggle.com/datasets/sarynasser/sb-10k-german-dataset
    Explore at:
    Available download formats: zip (115068 bytes)
    Dataset updated
    Jan 22, 2024
    Authors
    sary nasser
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by sary nasser

    Released under Apache 2.0


  19. Germany GDP

    • tradingeconomics.com
    • ar.tradingeconomics.com
    • +16 more
    csv, excel, json, xml
    Updated Apr 23, 2024
    Cite
    TRADING ECONOMICS (2024). Germany GDP [Dataset]. https://tradingeconomics.com/germany/gdp
    Explore at:
    Available download formats: excel, xml, json, csv
    Dataset updated
    Apr 23, 2024
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Time period covered
    Dec 31, 1970 - Dec 31, 2022
    Area covered
    Germany
    Description

    The Gross Domestic Product (GDP) in Germany was worth 4082.47 billion US dollars in 2022, according to official data from the World Bank. The GDP value of Germany represents 1.75 percent of the world economy. This dataset provides the latest reported value for Germany GDP, plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus, and news.

  20. GER_SET: Situation Entity Type labelled corpus for German - Dataset - B2FIND...

    • b2find.dkrz.de
    Updated Oct 22, 2023
    + more versions
    Cite
    (2023). GER_SET: Situation Entity Type labelled corpus for German - Dataset - B2FIND [Dataset]. https://b2find.dkrz.de/dataset/85e565ca-7054-5c19-9a38-d19d84d77636
    Explore at:
    Dataset updated
    Oct 22, 2023
    Description

    Semantic clause types, also called Situation Entity (SE) types (Smith, 2003), are linguistic characterizations of aspectual properties shown to be useful for tasks like argumentation structure analysis (Becker et al., 2016), genre characterization (Palmer and Friedrich, 2014), and detection of generic and generalizing sentences (Friedrich et al., 2016). We annotate several texts from different genres (newspaper, commentary, argumentative texts, and Wikipedia articles) with Situation Entity types. This data is in German.

    References:
    - Maria Becker, Alexis Palmer, and Anette Frank (2016). Argumentative texts and Clause Types. Proceedings of the 3rd Workshop on Argument Mining (ACL-Workshop), pp. 21-30.
    - Annemarie Friedrich, Alexis Palmer, and Manfred Pinkal (2016). Situation entity types: automatic classification of clause-level aspect. In Proceedings of ACL 2016.
    - Alexis Palmer and Annemarie Friedrich (2014). Genre distinctions and discourse modes: Text types differ in their situation type distributions. Proceedings of the Workshop on Frontiers and Connections between Argumentation Theory and Natural Language Processing. Forlì-Cesena, Italy.
    - Carlota S. Smith (2003). Modes of discourse: The local structure of texts, volume 103. Cambridge University Press.

Cite
German Credit Dataset Dataset [Dataset]. https://paperswithcode.com/dataset/german-credit-dataset

German Credit Dataset

Explore at:
Description

Two datasets are provided. The original dataset, in the form provided by Prof. Hofmann, contains categorical/symbolic attributes and is in the file "german.data".

For algorithms that need numerical attributes, Strathclyde University produced the file "german.data-numeric". This file has been edited, and several indicator variables have been added, to make it suitable for algorithms that cannot cope with categorical variables. Several attributes that are ordered categorical (such as attribute 17) have been coded as integers. This was the form used by StatLog.

This dataset requires use of a cost matrix:

Actual \ Predicted   Good   Bad
Good                    0     1
Bad                     5     0

The rows represent the actual classification and the columns the predicted classification.

It is worse to classify a customer as good when they are bad (cost 5) than to classify a customer as bad when they are good (cost 1).
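
As a rough illustration of how the cost matrix above could be applied, the sketch below sums the cost of a set of predictions against the actual classes. The function name `total_cost` and the example labels are hypothetical and are not part of the dataset or its documentation. Only the four cost values come from the description.

```python
# Cost matrix from the description: keys are (actual class, predicted class).
COST = {
    ("good", "good"): 0,
    ("good", "bad"): 1,   # good customer classified as bad
    ("bad", "good"): 5,   # bad customer classified as good (the costlier error)
    ("bad", "bad"): 0,
}


def total_cost(actual, predicted):
    """Sum the misclassification cost over paired actual/predicted labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(COST[(a, p)] for a, p in zip(actual, predicted))


# Hypothetical labels, for illustration only (not taken from german.data).
actual = ["good", "good", "bad", "bad", "good"]
predicted = ["good", "bad", "good", "bad", "good"]

# Per-pair costs are 0, 1, 5, 0, 0.
example_cost = total_cost(actual, predicted)
```

Under these costs, a classifier that outputs a probability p that a customer is bad would minimize expected cost by predicting "bad" whenever 5p > 1 - p, that is, whenever p > 1/6, rather than at the usual 0.5 threshold.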
