100+ datasets found

German Credit Dataset Dataset
paperswithcode.com
Cite
German Credit Dataset Dataset [Dataset]. https://paperswithcode.com/dataset/german-credit-dataset
Description
Two datasets are provided. The original dataset, in the form provided by Prof. Hofmann, contains categorical/symbolic attributes and is in the file "german.data".

For algorithms that need numerical attributes, Strathclyde University produced the file "german.data-numeric". This file has been edited and several indicator variables added to make it suitable for algorithms which cannot cope with categorical variables. Several attributes that are ordered categorical (such as attribute 17) have been coded as integer. This was the form used by StatLog.

This dataset requires use of a cost matrix:

                Predicted Good   Predicted Bad
Actual Good     0                1
Actual Bad      5                0

The rows represent the actual classification and the columns the predicted classification.

It is worse to class a customer as good when they are bad (5), than it is to class a customer as bad when they are good (1).
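
The cost matrix above can be applied in two ways: to score a set of predictions, and to choose the cheaper label when a model outputs a probability of "bad". The following is a minimal sketch in plain Python; the function names `total_cost` and `min_cost_label` and the sample labels are illustrative assumptions, not part of the dataset documentation.

```python
# Cost matrix from the description: keys are (actual class, predicted class).
COST = {
    ("good", "good"): 0,
    ("good", "bad"): 1,
    ("bad", "good"): 5,
    ("bad", "bad"): 0,
}


def total_cost(actual, predicted):
    """Sum the misclassification cost over paired actual/predicted labels."""
    return sum(COST[(a, p)] for a, p in zip(actual, predicted))


def min_cost_label(p_bad):
    """Return the label with the lower expected cost, given P(bad) = p_bad."""
    cost_if_predict_good = p_bad * COST[("bad", "good")]
    cost_if_predict_bad = (1 - p_bad) * COST[("good", "bad")]
    return "bad" if cost_if_predict_bad < cost_if_predict_good else "good"


# Hypothetical labels, for illustration only.
actual = ["good", "bad", "bad", "good"]
predicted = ["good", "good", "bad", "bad"]
print(total_cost(actual, predicted))  # 0 + 5 + 0 + 1 = 6
print(min_cost_label(0.10), min_cost_label(0.25))
```

With these costs, predicting "bad" becomes the cheaper choice once the estimated probability of "bad" exceeds 1/6, which is why a plain 0.5 threshold is usually a poor fit for this dataset.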
Statlog (German Credit Data)
data.world
csv, zip
Updated Feb 11, 2024
Cite
UCI (2024). Statlog (German Credit Data) [Dataset]. https://data.world/uci/statlog-german-credit-data
Available download formats: zip, csv
Dataset updated
Feb 11, 2024
Dataset provided by
data.world, Inc.
Authors
UCI
Description
Source:

Professor Dr. Hans Hofmann, Institut für Statistik und Ökonometrie, Universität Hamburg, FB Wirtschaftswissenschaften, Von-Melle-Park 5, 2000 Hamburg 13

Data Set Information:

Two datasets are provided. The original dataset, in the form provided by Prof. Hofmann, contains categorical/symbolic attributes and is in the file "german.data". For algorithms that need numerical attributes, Strathclyde University produced the file "german.data-numeric". This file has been edited and several indicator variables added to make it suitable for algorithms which cannot cope with categorical variables. Several attributes that are ordered categorical (such as attribute 17) have been coded as integer. This was the form used by StatLog.

This dataset requires use of a cost matrix (see below):

        1   2
    1   0   1
    2   5   0

(1 = Good, 2 = Bad)

The rows represent the actual classification and the columns the predicted classification. It is worse to class a customer as good when they are bad (5), than it is to class a customer as bad when they are good (1).

Attribute Information:

Attribute 1: (qualitative) Status of existing checking account
  A11 : ... < 0 DM
  A12 : 0 <= ... < 200 DM
  A13 : ... >= 200 DM / salary assignments for at least 1 year
  A14 : no checking account

Attribute 2: (numerical) Duration in month

Attribute 3: (qualitative) Credit history
  A30 : no credits taken/ all credits paid back duly
  A31 : all credits at this bank paid back duly
  A32 : existing credits paid back duly till now
  A33 : delay in paying off in the past
  A34 : critical account/ other credits existing (not at this bank)

Attribute 4: (qualitative) Purpose
  A40 : car (new)
  A41 : car (used)
  A42 : furniture/equipment
  A43 : radio/television
  A44 : domestic appliances
  A45 : repairs
  A46 : education
  A47 : (vacation - does not exist?)
  A48 : retraining
  A49 : business
  A410 : others

Attribute 5: (numerical) Credit amount

Attribute 6: (qualitative) Savings account/bonds
  A61 : ... < 100 DM
  A62 : 100 <= ... < 500 DM
  A63 : 500 <= ... < 1000 DM
  A64 : ... >= 1000 DM
  A65 : unknown/ no savings account

Attribute 7: (qualitative) Present employment since
  A71 : unemployed
  A72 : ... < 1 year
  A73 : 1 <= ... < 4 years
  A74 : 4 <= ... < 7 years
  A75 : ... >= 7 years

Attribute 8: (numerical) Installment rate in percentage of disposable income

Attribute 9: (qualitative) Personal status and sex
  A91 : male : divorced/separated
  A92 : female : divorced/separated/married
  A93 : male : single
  A94 : male : married/widowed
  A95 : female : single

Attribute 10: (qualitative) Other debtors / guarantors
  A101 : none
  A102 : co-applicant
  A103 : guarantor

Attribute 11: (numerical) Present residence since

Attribute 12: (qualitative) Property
  A121 : real estate
  A122 : if not A121 : building society savings agreement/ life insurance
  A123 : if not A121/A122 : car or other, not in attribute 6
  A124 : unknown / no property

Attribute 13: (numerical) Age in years

Attribute 14: (qualitative) Other installment plans
  A141 : bank
  A142 : stores
  A143 : none

Attribute 15: (qualitative) Housing
  A151 : rent
  A152 : own
  A153 : for free

Attribute 16: (numerical) Number of existing credits at this bank

Attribute 17: (qualitative) Job
  A171 : unemployed/ unskilled - non-resident
  A172 : unskilled - resident
  A173 : skilled employee / official
  A174 : management/ self-employed/ highly qualified employee/ officer

Attribute 18: (numerical) Number of people being liable to provide maintenance for

Attribute 19: (qualitative) Telephone
  A191 : none
  A192 : yes, registered under the customer's name

Attribute 20: (qualitative) foreign worker
  A201 : yes
  A202 : no
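
As a rough illustration of how the attribute list maps onto one record of "german.data", the sketch below parses a single line into named fields. It assumes that fields are whitespace-separated and that the class label (1 = Good, 2 = Bad) follows the 20 attributes; the column names and the sample record are hypothetical and chosen only for readability.

```python
# Hypothetical column names for attributes 1-20, followed by the class label.
COLUMNS = [
    "checking_status", "duration_months", "credit_history", "purpose",
    "credit_amount", "savings", "employment_since", "installment_rate",
    "personal_status_sex", "other_debtors", "residence_since", "property",
    "age_years", "other_installment_plans", "housing", "existing_credits",
    "job", "num_liable", "telephone", "foreign_worker", "credit_class",
]
# Attribute numbers marked "(numerical)" in the attribute list.
NUMERIC = {2, 5, 8, 11, 13, 16, 18}
CLASS_LABELS = {1: "good", 2: "bad"}


def parse_record(line):
    """Turn one whitespace-separated record into a dict of named fields."""
    fields = line.split()
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    record = {}
    for number, (name, value) in enumerate(zip(COLUMNS, fields), start=1):
        if number in NUMERIC:
            record[name] = int(value)
        elif name == "credit_class":
            record[name] = CLASS_LABELS[int(value)]
        else:
            record[name] = value  # keep the symbolic code, e.g. "A43"
    return record


# Illustrative record in the assumed layout.
SAMPLE = ("A11 6 A34 A43 1169 A65 A75 4 A93 A101 4 "
          "A121 67 A143 A152 2 A173 1 A192 A201 1")
record = parse_record(SAMPLE)
print(record["purpose"], record["credit_amount"], record["credit_class"])
```

Keeping the symbolic codes as strings leaves the choice of encoding (one-hot, ordinal, or otherwise) to the downstream model.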

Relevant Papers:

N/A

Papers That Cite This Data Set [1]:

* Jeroen Eggermont and Joost N. Kok and Walter A. Kosters. Genetic Programming for data classification: partitioning the search space. SAC. 2004.
* Ke Wang and Shiyu Zhou and Ada Wai-Chee Fu and Jeffrey Xu Yu. Mining Changes of Classification by Correspondence Tracing. SDM. 2003.
* Avelino J. Gonzalez and Lawrence B. Holder and Diane J. Cook. Graph-Based Concept Learning. FLAIRS Conference. 2001.
* Oya Ekin and Peter L. Hammer and Alexander Kogan and Pawel Winter. Distance-Based Classification Methods. RUTCOR, Rutgers Center for Operations Research, Rutgers University. 1996.
* Paul O' Dea and Josephine Griffith and Colm O' Riordan. Combining Feature Selection and Neural Networks for Solving Classification Problems. Information Technology Department, National University of Ireland.
* Chotirat Ann and Dimitrios Gunopulos. Scaling up the Naive Bayesian Classifier: Using Decision Trees for Feature Selection. Computer Science Department University of California.
* Paul O' Dea and David Griffith and Colm O' Riordan. DEPARTMENT OF INFORMATION TECHNOLOGY. P. O'Dea (NUI.

Citation Request:

Please refer to the Machine Learning Repository's citation policy. [1] Papers were automatically harvested and associated with this data set, in collaboration with Rexa.info

Source: http://archive.ics.uci.edu/ml/datasets/Statlog+%28German+Credit+Data%29
German-Credit-Data-Set-with-Credit-Risk
kaggle.com
Updated Dec 14, 2022
Cite
(2022). German-Credit-Data-Set-with-Credit-Risk [Dataset]. https://www.kaggle.com/datasets/benjaminmcgregor/german-credit-data-set-with-credit-risk
Dataset updated
Dec 14, 2022
Description
This data set has been created by re-integrating the 'credit risk' attribute from Professor Dr. Hans Hofmann's 'German Credit Data' data set on the UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets/statlog+(german+credit+data)) into the refined data set published by UCI (https://www.kaggle.com/datasets/uciml/german-credit).

I have included the Python script that I wrote to produce this data set.
Replication Data for: German Credit
dataverse.harvard.edu
tsv
Updated Apr 6, 2016
Cite
Harvard Dataverse (2016). Replication Data for: German Credit [Dataset]. http://doi.org/10.7910/DVN/Q8MAW8
Available download formats: tsv (53493)
Unique identifier
https://doi.org/10.7910/DVN/Q8MAW8
Dataset updated
Apr 6, 2016
Dataset provided by
Harvard Dataverse
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
Original data from: https://archive.ics.uci.edu/ml/datasets/Statlog+(German+Credit+Data), using the file "german.data-numeric" version produced by Strathclyde University.

Changes made:
- changed the ordering of Attribute 3 (Credit History) to try to extract monotone relationship

ORIGINAL ORDERING:
  A30 : no credits taken/all credits paid back duly
  A31 : all credits at this bank paid back duly
  A32 : existing credits paid back duly till now
  A33 : delay in paying off in the past
  A34 : critical account/other credits existing (not at this bank)

NEW ORDERING:
  0 = all credits paid back (A31)
  1 = all credits paid back duly til now (A32)
  2 = no credits taken (A30)
  3 = delay in past (A33)
  4 = critical acct (A34)

ATTRIBUTES:
  0 CLASS Credit Rating: +1 is bad / -1 is good
  1 BalanceCheque
  2 Loan NurnMonth
  3 CreditHistory
  4 CreditAmt
  5 SavingsBalance
  6 Mths in PresentEmployment
  7 PersonStatusSex
  8 PresentResidenceSince
  9 Property
  10 AgeInYears
  11 OtherInstallmentPlans (highest val is NO other installment plans)
  12 NumExistingCreditsThisBank
  13 NumPplLiablMaint
  14 Telephone
  15 ForeignWorker
  16 Purpose-CarNew
  17 Purpose-CarOld
  18 otherdebtor-none (compared to guarantor)
  19 otherdebt-coappl (compared to guarantor)
  20 house-rent (compared to 'for free')
  21 house-owns (compared to 'for free')
  22 job-unemployed (vs mgt)
  23 jobs-unskilled (vs mgt)
  24 job-skilled (vs mgt)
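
The recoding described above can be sketched as two lookup tables: one for the new Credit History ordering and one for the class label. This is an illustration in terms of the original symbolic codes, not the script used to produce the replication file. The names `NEW_CREDIT_HISTORY_ORDER`, `CLASS_RECODE`, and `recode_credit_history` are hypothetical, and the class mapping assumes the UCI convention of 1 = Good, 2 = Bad.

```python
# New ordinal position for each original Credit History code,
# following the "NEW ORDERING" listed in the description.
NEW_CREDIT_HISTORY_ORDER = {
    "A31": 0,  # all credits paid back
    "A32": 1,  # all credits paid back duly till now
    "A30": 2,  # no credits taken
    "A33": 3,  # delay in past
    "A34": 4,  # critical account
}

# Class recoding: assumed UCI labels (1 = good, 2 = bad) to -1 / +1.
CLASS_RECODE = {1: -1, 2: +1}


def recode_credit_history(codes):
    """Map original A3x codes to the reordered 0-4 scale."""
    return [NEW_CREDIT_HISTORY_ORDER[code] for code in codes]


print(recode_credit_history(["A30", "A31", "A34"]))  # [2, 0, 4]
print([CLASS_RECODE[label] for label in (1, 2)])     # [-1, 1]
```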
Ten Thousand German News Articles Dataset
kaggle.com
tblock.github.io
zip
Updated Jan 20, 2022
Cite
Timo Block (2022). Ten Thousand German News Articles Dataset [Dataset]. https://www.kaggle.com/tblock/10kgnad
Available download formats: zip (21144764 bytes)
Dataset updated
Jan 20, 2022
Authors
Timo Block
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
License information was derived automatically
Description
(see https://tblock.github.io/10kGNAD/ for the original dataset page)

This page introduces the 10k German News Articles Dataset (10kGNAD), a German topic classification dataset. The 10kGNAD is based on the One Million Posts Corpus and available under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. You can download the dataset here.

Why a German dataset?

English text classification datasets are common. Examples are the big AG News, the class-rich 20 Newsgroups and the large-scale DBpedia ontology datasets for topic classification, and the commonly used IMDb and Yelp datasets for sentiment analysis. Non-English datasets, especially German datasets, are less common. There is a collection of sentiment analysis datasets assembled by the Interest Group on German Sentiment Analysis. However, to my knowledge, no German topic classification dataset is available to the public.

Due to grammatical differences between the English and the German language, a classifier might be effective on an English dataset, but not as effective on a German dataset. The German language is more highly inflected, and long compound words are quite common compared to the English language. One would need to evaluate a classifier on multiple German datasets to get a sense of its effectiveness.

The dataset

The 10kGNAD dataset is intended to solve part of this problem as the first German topic classification dataset. It consists of 10273 German-language news articles from an Austrian online newspaper, categorized into nine topics. These articles are a previously unused part of the One Million Posts Corpus.

In the One Million Posts Corpus each article has a topic path, for example Newsroom/Wirtschaft/Wirtschaftpolitik/Finanzmaerkte/Griechenlandkrise. The 10kGNAD uses the second part of the topic path, here Wirtschaft, as the class label. As a result, the dataset can be used for multi-class classification.
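
Deriving the class label from a topic path comes down to splitting on "/" and taking the second component. The sketch below shows the idea; `topic_label` is a hypothetical helper, not a function from the project's code directory.

```python
def topic_label(topic_path):
    """Return the second component of a '/'-separated topic path."""
    parts = topic_path.split("/")
    if len(parts) < 2:
        raise ValueError(f"topic path has no second component: {topic_path!r}")
    return parts[1]


path = "Newsroom/Wirtschaft/Wirtschaftpolitik/Finanzmaerkte/Griechenlandkrise"
print(topic_label(path))  # Wirtschaft
```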

I created and used this dataset in my thesis to train and evaluate four text classifiers on the German language. By publishing the dataset I hope to support the advancement of tools and models for the German language. Additionally, this dataset can be used as a benchmark dataset for German topic classification.

Numbers and statistics

As in most real-world datasets, the class distribution of the 10kGNAD is not balanced. The biggest class, Web, consists of 1678 articles, while the smallest class, Kultur, contains only 539 articles. However, articles from the Web class have on average the fewest words, while articles from the Kultur (culture) class have the second most words.

Splitting into train and test

I propose a stratified split of 10% for testing and the remaining articles for training. To use the dataset as a benchmark dataset, please use the train.csv and test.csv files located in the project root.
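
For readers unfamiliar with stratified splitting, the following sketch holds out roughly 10% of each class so that the test set mirrors the overall class distribution. It is a plain-Python illustration, not the project's own split script; the function name, the seed, and the toy labels are assumptions. For benchmark comparisons, the published train.csv and test.csv remain the reference split.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.10, seed=42):
    """Return (train_indices, test_indices), sampling test items per class."""
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Toy labels for illustration; the real class sizes are much larger.
labels = ["Web"] * 30 + ["Kultur"] * 10
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 36 4
```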

Code

Python scripts to extract the articles and split them into a train and a test set are available in the code directory of this project. Make sure to install the requirements. The original corpus.sqlite3 is required to extract the articles (download here (compressed) or here (uncompressed)).

License

This dataset is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Please consider citing the authors of the One Million Posts Corpus if you use the dataset.
Voxforge German Dataset
paperswithcode.com
Updated Aug 2, 2022
Cite
(2022). Voxforge German Dataset [Dataset]. https://paperswithcode.com/dataset/voxforge-german
Dataset updated
Aug 2, 2022
Description
VoxForge is an open speech dataset that was set up to collect transcribed speech for use with Free and Open Source Speech Recognition Engines (on Linux, Windows and Mac).

We will make available all submitted audio files under the GPL license, and then 'compile' them into acoustic models for use with Open Source speech recognition engines such as CMU Sphinx, ISIP, Julius (github) and HTK (note: HTK has distribution restrictions).
germanquad
huggingface.co
opendatalab.com
Updated Jun 16, 2021
Cite
deepset (2021). germanquad [Dataset]. https://huggingface.co/datasets/deepset/germanquad
Dataset updated
Jun 16, 2021
Dataset authored and provided by
deepset (https://www.deepset.ai/)
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
In order to raise the bar for non-English QA, we are releasing a high-quality, human-labeled German QA dataset consisting of 13 722 questions, incl. a three-way annotated test set. The creation of GermanQuAD is inspired by insights from existing datasets as well as our labeling experience from several industry projects. We combine the strengths of SQuAD, such as high out-of-domain performance, with self-sufficient questions that contain all relevant information for open-domain QA as in the NaturalQuestions dataset. Unlike those of other popular datasets, our training and test datasets do not overlap, and they include complex questions that cannot be answered with a single entity or only a few words.
Germany
workwithdata.com
Updated Apr 18, 2024
Cite
Work With Data (2024). Germany [Dataset]. https://www.workwithdata.com/place/germany
Dataset updated
Apr 18, 2024
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Area covered
Germany
Description
Explore Germany through unique data from multiple sources: key facts, real-time news, interactive charts, detailed maps & open datasets
Germany Population: German
ceicdata.com
Updated Mar 15, 2024
Cite
CEICdata.com (2024). Germany Population: German [Dataset]. https://www.ceicdata.com/en/germany/population/population-german
Dataset updated
Mar 15, 2024
Dataset provided by
CEICdata.com
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Time period covered
Dec 1, 2011 - Dec 1, 2022
Area covered
Germany
Variables measured
Population
Description
Germany Population: German data was reported at 72,034,650.000 Person in 2022. This records a decrease from the previous number of 72,344,071.000 Person for 2021. Germany Population: German data is updated yearly, averaging 73,301,664.000 Person from Dec 1970 to 2022, with 53 observations. The data reached an all-time high of 75,212,869.000 Person in 2004 and a record low of 56,478,581.000 Person in 1986. Germany Population: German data remains in active status in CEIC and is reported by Statistisches Bundesamt. The data is categorized under Global Database's Germany – Table DE.G001: Population. Population prior to 1990 covers West Germany only.
german dataset
kaggle.com
zip
Updated Aug 17, 2019
Cite
daniel_lopez (2019). german dataset [Dataset]. https://www.kaggle.com/datasets/dnllpz/german-dataset
Available download formats: zip (17216 bytes)
Dataset updated
Aug 17, 2019
Authors
daniel_lopez
Description
Dataset

This dataset was created by daniel_lopez

Dataset: The plural interpretability of German linking elements...
live.european-language-grid.eu
zenodo.org
csv
Updated Aug 15, 2021
Cite
(2021). Dataset: The plural interpretability of German linking elements ("Morphology") [Dataset]. https://live.european-language-grid.eu/catalogue/lcr/7422
Available download formats: csv
Dataset updated
Aug 15, 2021
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
License information was derived automatically
Description
This dataset accompanies a paper to be published in "Morphology" (JOMO, Springer). Under the present DOI, all data generated for this research as well as all scripts used are stored. The paper itself is not CC-licensed; refer to Springer's "Morphology" website for details!

Abstract

In this paper, we take a closer theoretical and empirical look at the linking elements in German N1+N2 compounds which are identical to the plural marker of N1 (such as -er with umlaut, as in Häus-er-meer 'sea of houses'). Various perspectives on the actual extent of plural interpretability of these pluralic linking elements are expressed in the literature. We aim to clarify this question by empirically examining to what extent there may be a relationship between plural form and meaning which informs in which sorts of compounds pluralic linking elements appear. Specifically, we investigate whether pluralic linking elements occur especially frequently in compounds where a plural meaning of the first constituent is induced either externally (through plural inflection of the entire compound) or internally (through a relation between the constituents such that N2 forces N1 to be conceptually plural, as in the example above). The results of a corpus study using the DECOW16A corpus and a split-100 experiment show that in the internal but not external plural meaning conditions, a pluralic linking element is preferred over a non-pluralic one, though there is considerable inter-speaker variability, and limitations imposed by other constraints on linking element distribution also play a role. However, we show the overall tendency that German language users do use pluralic linking elements as cues to the plural interpretation of N1+N2 compounds. Our interpretation does not reference a specific morphological framework. Instead, we view our data as strengthening the general approach of probabilistic morphology.
German Language Datasets | Call Center, Virtual Assistant & TTS
shaip.com
vi.shaip.com
Updated Jun 11, 2023
Cite
Shaip (2023). German Language Datasets | Call Center, Virtual Assistant & TTS [Dataset]. https://www.shaip.com/offerings/speech-data-catalog/german-dataset/
Dataset updated
Jun 11, 2023
Dataset authored and provided by
Shaip
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
Enhance your Conversational AI model with our Off-the-Shelf German Language Datasets. Shaip's high-quality audio datasets are a quick and effective solution for model training.
Vietnamese German Dataset
kaggle.com
zip
Updated Dec 24, 2018
Cite
Nguyễn Trung Hậu (2018). Vietnamese German Dataset [Dataset]. https://www.kaggle.com/flightstar/vietnamese-german-dataset
Available download formats: zip (456165 bytes)
Dataset updated
Dec 24, 2018
Authors
Nguyễn Trung Hậu
License
Open Database License (ODbL) v1.0 (https://www.opendatacommons.org/licenses/odbl/1.0/)
License information was derived automatically
Description
Dataset

This dataset was created by Nguyễn Trung Hậu

Released under Database: Open Database, Contents: © Original Authors

Texas German Sample Corpus
dataverse.tdl.org
dataverse-prod.tdl.org
bin, tsv, txt, wav
Updated Feb 6, 2024
Cite
Margaret Blevins (2024). Texas German Sample Corpus [Dataset]. http://doi.org/10.18738/T8/IOX9ZA
Available download formats: wav, bin, txt, tsv
Unique identifier
https://doi.org/10.18738/T8/IOX9ZA
Dataset updated
Feb 6, 2024
Dataset provided by
Texas Data Repository
Authors
Margaret Blevins
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Area covered
Texas
Description
The Texas German Sample Corpus (TGSC) is a collection of annotated transcripts of spoken Texas German (~13.5 hours, 75,000+ tokens). The TGSC was created to implement and test the language-tagging and normalization guidelines proposed in Blevins (2022). Texas German is a set of mixed-language contact varieties of German "spoken in Texas which have descended from the dialects of German brought to Texas in the 19th century" by German-speaking immigrants (Boas 2009: 34). The TGSC is a collection of audio recordings from the Texas German Dialect Archive (TGDA, tgdp.org/dialect-archive) with the following annotation layers: original TGDA literary transcription, tokenization, language tags, normalization, standard German utterance translation, and the original TGDA word-for-word English translation. By using the Texas German Sample Corpus (TGSC) database, you agree to the "User Rights and Responsibilities" in accordance with the specifications on https://tgdp.org/dialect-archive/ .
Please cite the following works:
- For the TGSC: Blevins (2022). The language-tagging & orthographic normalization of spoken mixed-language data, with a focus on Texas German (https://hdl.handle.net/2152/116703).
- For the TGDA / TGDP (the source material for the TGSC): Boas, Hans C., Marc Pierce, Karen Roesch, Guido Halder, and Hunter Weilbacher. (2010). The Texas German Dialect Archive: A Multimedia Resource for Research, Teaching, and Outreach. Journal of Germanic Linguistics, 22(3), 277-296.
Data from: German Weimar Republic Data, 1919-1933
icpsr.umich.edu
ascii, sas, spss
Updated Dec 22, 2005
Cite
Inter-university Consortium for Political and Social Research (2005). German Weimar Republic Data, 1919-1933 [Dataset]. http://doi.org/10.3886/ICPSR00042.v1
Available download formats: spss, ascii, sas
Unique identifier
https://doi.org/10.3886/ICPSR00042.v1
Dataset updated
Dec 22, 2005
Dataset authored and provided by
Inter-university Consortium for Political and Social Research https://www.icpsr.umich.edu/web/pages/
License
https://www.icpsr.umich.edu/web/ICPSR/studies/42/terms
Time period covered
1919 - 1933
Area covered
Germany
Description
This data collection contains electoral and demographic data at several levels of aggregation (kreis, land/regierungsbezirk, and wahlkreis) for Germany in the Weimar Republic period of 1919-1933. Two datasets are available. Part 1, 1919 Data, presents raw and percentagized election returns at the wahlkreis level for the 1919 election to the Nationalversammlung. Information is provided on the number and percentage of eligible voters and the total votes cast for parties such as the German National People's Party, German People's Party, Christian People's Party, German Democratic Party, Social Democratic Party, and Independent Social Democratic Party. Part 2, 1920-1933 Data, consists of returns for elections to the Reichstag, 1920-1933, and for the Reichspräsident elections of 1925 and 1932 (including runoff elections in each year), returns for two national referenda, held in 1926 and 1929, and data pertaining to urban population, religion, and occupations, taken from the German Census of 1925. This second dataset contains data at several levels of aggregation and is a merged file. Crosstemporal discrepancies, such as changes in the names of the geographical units and the disappearance of units, have been adjusted for whenever possible. Variables in this file provide information for the total number and percentage of eligible voters and votes cast for parties, including the German Nationalist People's Party, German People's Party, German Center Party, German Democratic Party, German Social Democratic Party, German Communist Party, Bavarian People's Party, National Socialist German Workers' Party (Hitler's movement), German Middle Class Party, German Business and Labor Party, Conservative People's Party, and other parties.
Data are also provided for the total number and percentage of votes cast in the Reichspräsident elections of 1925 and 1932 for candidates Jarres, Held, Ludendorff, Braun, Marx, Hellpach, Thälmann, Hitler, Duesterberg, Von Hindenburg, Winter, and others. Additional variables provide information on occupations in the country, including the number of wage earners employed in agriculture, industry and manufacturing, trade and transportation, civil service, army and navy, clergy, public health, welfare, domestic and personal services, and unknown occupations. Other census data cover the total number of wage earners in the labor force and the number of female wage earners employed in all occupations. Also provided is the percentage of the total population living in towns with 5,000 inhabitants or more, and the number and percentage of the population who were Protestants, Catholics, and Jews.
German Consumers/ B2C data in Germany
datarade.ai
Updated Dec 6, 2021
Cite
Techsalerator (2021). German Consumers/ B2C data in Germany [Dataset]. https://datarade.ai/data-products/german-consumers-b2c-in-germany-techsalerator
Dataset updated
Dec 6, 2021
Dataset authored and provided by
Techsalerator
Area covered
Germany
Description
With close to 30M records, Techsalerator has access to some of the highest-quality B2C data in Germany.

Thanks to our unique tools and data specialists, we can select the ideal targeted dataset based on unique elements such as location/country, gender, age...

Whether you are looking for an entire full install, access to one of our APIs, or only a one-time targeted purchase, get in touch with our company and we will fulfill your international data needs.
Domain-Specific Dataset of Difficulty Ratings for German Noun Compounds
live.european-language-grid.eu
txt
Updated Jan 20, 2021
Cite
(2021). Domain-Specific Dataset of Difficulty Ratings for German Noun Compounds [Dataset]. https://live.european-language-grid.eu/catalogue/lcr/4925
Available download formats: txt
Dataset updated
Jan 20, 2021
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Dataset with difficulty ratings for 1,030 German closed noun compounds extracted from domain-specific texts in the do-it-yourself (DIY), cooking, and automotive domains. It includes two-part compounds for cooking and DIY, and two- to four-part compounds for automotive.
SB-10K german dataset
kaggle.com
zip
Updated Jan 22, 2024
Cite
sary nasser (2024). SB-10K german dataset [Dataset]. https://www.kaggle.com/datasets/sarynasser/sb-10k-german-dataset
Available download formats: zip (115068 bytes)
Dataset updated
Jan 22, 2024
Authors
sary nasser
License
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Dataset

This dataset was created by sary nasser

Released under Apache 2.0

Germany GDP
tradingeconomics.com
ar.tradingeconomics.com
csv, excel, json, xml
Updated Apr 23, 2024
Cite
TRADING ECONOMICS (2024). Germany GDP [Dataset]. https://tradingeconomics.com/germany/gdp
Available download formats: excel, xml, json, csv
Dataset updated
Apr 23, 2024
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 31, 1970 - Dec 31, 2022
Area covered
Germany
Description
The Gross Domestic Product (GDP) in Germany was worth 4082.47 billion US dollars in 2022, according to official data from the World Bank. The GDP value of Germany represents 1.75 percent of the world economy. This dataset provides the latest reported value for Germany's GDP, plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus, and news.
GER_SET: Situation Entity Type labelled corpus for German - Dataset - B2FIND...
b2find.dkrz.de
Updated Oct 22, 2023
Cite
(2023). GER_SET: Situation Entity Type labelled corpus for German - Dataset - B2FIND [Dataset]. https://b2find.dkrz.de/dataset/85e565ca-7054-5c19-9a38-d19d84d77636
Dataset updated
Oct 22, 2023
Description
Semantic clause types, also called Situation Entity (SE) types (Smith, 2003), are linguistic characterizations of aspectual properties shown to be useful for tasks like argumentation structure analysis (Becker et al., 2016), genre characterization (Palmer and Friedrich, 2014), and detection of generic and generalizing sentences (Friedrich et al., 2016). We annotate several texts from different genres (newspaper, commentary, argumentative texts, and Wikipedia articles) with Situation Entity types. This data is in German.
References:
- Maria Becker, Alexis Palmer, and Anette Frank (2016). Argumentative texts and Clause Types. Proceedings of the 3rd Workshop on Argument Mining (ACL-Workshop), pp. 21-30.
- Annemarie Friedrich, Alexis Palmer, and Manfred Pinkal (2016). Situation entity types: automatic classification of clause-level aspect. In Proceedings of ACL 2016.
- Alexis Palmer and Annemarie Friedrich (2014). Genre distinctions and discourse modes: Text types differ in their situation type distributions. Proceedings of the Workshop on Frontiers and Connections between Argumentation Theory and Natural Language Processing. Forlì-Cesena, Italy.
- Carlota S. Smith (2003). Modes of discourse: The local structure of texts, volume 103. Cambridge University Press.

Cite

German Credit Dataset Dataset [Dataset]. https://paperswithcode.com/dataset/german-credit-dataset

German Credit Dataset Dataset

Description

Two datasets are provided. The original dataset, in the form provided by Prof. Hofmann, contains categorical/symbolic attributes and is in the file "german.data".

For algorithms that need numerical attributes, Strathclyde University produced the file "german.data-numeric". This file has been edited and several indicator variables added to make it suitable for algorithms which cannot cope with categorical variables. Several attributes that are ordered categorical (such as attribute 17) have been coded as integer. This was the form used by StatLog.

This dataset requires use of a cost matrix:

Actual \ Predicted	Good	Bad
Good	0	1
Bad	5	0

The rows represent the actual classification and the columns the predicted classification.

It is worse to class a customer as good when they are bad (5), than it is to class a customer as bad when they are good (1).
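
The cost matrix above can be applied in two ways: to score a set of predictions, and to choose the label with the lower expected cost given a predicted probability of "bad". The following Python sketch shows both. The function names `total_cost` and `min_cost_label`, and the idea of a model that outputs a probability `p_bad`, are illustrative assumptions and are not part of the dataset description.

```python
# Cost matrix from the description: keys are (actual, predicted).
COST = {
    ("good", "good"): 0,
    ("good", "bad"): 1,   # good customer classed as bad
    ("bad", "good"): 5,   # bad customer classed as good
    ("bad", "bad"): 0,
}


def total_cost(actual, predicted):
    """Sum the misclassification cost over paired actual/predicted labels."""
    return sum(COST[(a, p)] for a, p in zip(actual, predicted))


def min_cost_label(p_bad):
    """Return the label with the lower expected cost.

    p_bad is a hypothetical model's estimated probability that the
    customer is bad. Ties are resolved in favour of "good".
    """
    p_good = 1.0 - p_bad
    cost_predict_good = p_bad * COST[("bad", "good")] + p_good * COST[("good", "good")]
    cost_predict_bad = p_bad * COST[("bad", "bad")] + p_good * COST[("good", "bad")]
    return "bad" if cost_predict_bad < cost_predict_good else "good"


if __name__ == "__main__":
    actual = ["good", "bad", "bad", "good"]
    predicted = ["good", "good", "bad", "bad"]
    print(total_cost(actual, predicted))
    print(min_cost_label(0.2), min_cost_label(0.1))
```

Under these costs, predicting "bad" has the lower expected cost whenever `p_bad` exceeds 1/6, which is why a cost-sensitive classifier on this data would flag customers well below the usual 0.5 threshold.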
