Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Gross Domestic Product (GDP) in Germany contracted 0.30 percent in the second quarter of 2025 over the previous quarter. This dataset provides the latest reported value for - Germany GDP Growth Rate - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Unemployment Rate in Germany remained unchanged at 6.30 percent in August. This dataset provides the latest reported value for - Germany Unemployment Rate - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
The study, funded by the German Federal Ministry of Education and Research, is carried out jointly by GESIS – Leibniz Institute for the Social Sciences, the University of Heidelberg and the Berlin Social Science Center as part of the Solikris project. Solikris investigates the effects of crises on solidarity dynamics in society and politics. To this end, the study asks for data on everyday life, social and political issues in Germany and Europe in 2020. The focus is largely on the impact of the COVID-19 pandemic on the opinions and feelings of the surveyed citizens about everyday life and the political situation in their respective countries.
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Mica - Muskrat and Coypu and Raccoon Occurrences collected by ITAW, Germany is an occurrence dataset published by the Research Institute of Nature and Forest (INBO) and ITAW (Institute for Terrestrial and Aquatic Wildlife Research). It is part of the LIFE MICA - Management of Invasive Coypu and muskrat in Europe project on Muskrat monitoring networks in Flanders, The Netherlands and Germany. This dataset contains Muskrat, Raccoon and Coypu counts. Here it is published as a standardized Darwin Core Archive and includes for each occurrence record a recordID, date, location, samplingProtocol, the number of recorded individuals, status (present/absent) and scientific name. Issues with the dataset can be reported at https://github.com/inbo/muskrat-uvw-occurrences/issues
We have released this dataset to the public domain under a Creative Commons Zero waiver. We would appreciate it if you follow the INBO norms for data use (https://www.inbo.be/en/norms-data-use) when using the data. If you have any questions regarding this dataset, don't hesitate to contact us via the contact information provided in the metadata or via opendata@inbo.be.
The dataset on this page was obtained from the 2001 to 2005 editions of the Reader's Digest Magazine. The dataset has been used to evaluate the performance of semantic relatedness measures. In the file, each word choice problem is on a single line. It starts with the target word, followed by the four candidates, followed by the number of the correct candidate. All fields are separated by a colon.
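The line format described above can be read with a few lines of Python. The following is a minimal sketch, not the dataset authors' own tooling. It assumes that the number of the correct candidate is 1-based and that no field contains a colon itself; the example line is invented for illustration and does not come from the file.

```python
from typing import NamedTuple


class WordChoiceProblem(NamedTuple):
    target: str
    candidates: list[str]
    answer_index: int  # 0-based index into candidates


def parse_problem(line: str) -> WordChoiceProblem:
    """Parse one 'target:c1:c2:c3:c4:number' line."""
    fields = [field.strip() for field in line.strip().split(":")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 colon-separated fields, got {len(fields)}")
    target, *candidates, number = fields
    answer = int(number)
    # Assumption: candidates are numbered 1..4 in the file.
    if not 1 <= answer <= 4:
        raise ValueError(f"answer number out of range: {answer}")
    return WordChoiceProblem(target, candidates, answer - 1)


# Invented example line, not taken from the dataset.
example = "brisk:slow:lively:heavy:quiet:2"
problem = parse_problem(example)
print(problem.target, "->", problem.candidates[problem.answer_index])
```

A semantic relatedness measure can then be scored by checking how often its highest-ranked candidate matches `candidates[answer_index]`.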
The Politbarometer has been conducted since 1977 on an almost monthly basis by the Forschungsgruppe Wahlen on behalf of the Second German Television (ZDF). Since 1990, this database has also been available for the new German states. The survey focuses on the opinions and attitudes of the voting-age population in the Federal Republic on current political issues, parties, politicians, and voting behavior. From 1990 to 1995 and from 1999 onward, the Politbarometer surveys were conducted separately both in the newly formed eastern and in the western German states (Politbarometer East and Politbarometer West). The separate monthly surveys of a year are integrated into a cumulative data set that includes all surveys of a year and all variables of the respective year. Starting in 2003, the Politbarometer short surveys, collected with varying frequency throughout the year, are integrated into the annual cumulation. The following topics were repeated identically at each survey period: most important political problems in Germany; voting intention at the next parliamentary elections (opinion poll, ranking) ; party preference; voting behaviour at the last parliamentary elections; coalition preference; sympathy-scale for SPD, CDU, CSU, FDP, die Grünen and PDS; rank of the parties (split); sympathy-scale for selected leading politicians (Joschka Fischer, Angela Merkel, Gerhard Schröder, Edmund Stoiber, Guido Westerwelle and Christian Wulff); judgement of the present economic situation in Germany; the most competent party to resolve the present economic problems; judgement of respondents economic situation in present and in future; judgement of an upward trend in German economy (economic situation expectation); the most competent party for the creation of jobs; self-assessment on a left-right continuum. 2. 
At least in one or in additional months was asked: postal vote; first and second vote; eligible parties and non eligible parties; certainty of ones own voting decision; judgement of the so called grand coalition of CDU/CSU and SPD as well as different coalitions from the parties in the Bundestag; attitude towards an one party government of the CDU/CSU; attitude towards a SPD government with PDS as party for obtaining the majority; voting for a different party, in case the election results would have been known before; satisfaction with the result of the parliamentary elections; attitude to a participation of FDP and PDS in government; the most competent government coalition to resolve the problems in Germany; accessibility of a majority of SPD and die Grünen; preference for SPD in the government or in the opposition; Federal Chancellor preference for Angela Merkel or Gerhard Schröder in general as well as with one grand coalition; additionally preference for a Federal Chancellor; clarification of the chancellor question or the government programme in first place during the negotiations between CDU/CSU and SPD; attitude towards a minority government; preferred minority government; judgement of the election results with regard to the approach to resolve the most important problems in Germany; expectation of a grand coalition by CDU/CSU and SPD; perceived euphoric mood in Germany after the formation of the new government; important contribution of the grand coalition to resolve the problems in Germany, in fighting of unemployment, in resolving the pension problem, the finance problem, the problems in health service, in stimulating the economy as well as supporting families; attitude towards an election of Angela Merkel as a Federal Chancellor; preference for Gerhard Schröder as a Federal Chancellor in a grand coalition; expected authority of Angela Merkel in important political questions; judgement of the competence of Angela Merkel in representing Germany abroad; 
satisfaction with the new government team; expected support of Merkel by the CDU/CSU parliamentary group as well as the SPD parliamentary group in the Bundestag; expected continuance of the grand coalition over the whole legislative period; attitude towards East Germans as party leaders (of CDU and SPD); attitude towards a woman as chancellor; woman as reason for the eligibility of the CDU/CSU; satisfaction with the performances of the Federal Government as well as with the individual parties SPD, die Grünen, CDU/CSU, FDP as well as the Linkspartei.PDS (scale); currently most important politician or politicians in Germany; sympathy scale for selected leading politicians (in addition to those mentioned above: Wolfgang Clement, Hans Eichel, Gregor Gysi, Roland Koch, Horst Köhler, Oskar Lafontaine, Friedrich Merz, Franz Müntefering, Matthias Platzeck, Otto Schily, Ulla Schmidt, Horst Seehofer, Peer Steinbrück and George W. Bush); disunity of SPD, CDU, CSU, die Grünen, FDP and PDS as well as of CDU and CSU with each other; judgement of the relations between the ruling parties SPD and die Grünen and the relations of CDU to CSU; candidate for the chancellorship of the CDU/CSU with the greatest chances of an electoral victory at the next parliamentary elections; assessment of the most suitable time for the decision of the candidate for chancellorship question among the CDU/CSU; assessment of the support of Gerhard Schröder by the SPD, of Angela Merkel by the CDU and CSU, of Edmund Stoiber by the CSU and of Guido Westerwelle by the FDP; comparison of Angela Merkel with Gerhard Schröder with regard to reliability, energy, sympathy, authority, expertise and winner type as well as leadership in government and at the solution of future problems in Germany ; Gerhard Schröder or Angela Merkel as an expected beneficiary of the intended TV duel; TV duel between Gerhard Schröder and Angela Merkel watched; better performance of Schröder or Merkel at the TV duel; change of the 
respondents attitude towards the candidates by the TV duel; the most competent candidate for the chancellorship for the creation of new jobs; Angela Merkel as a person representing the interests of the women and the East Germans; expected election outcome for the CDU with as well as without Merkel; chancellor preference; assignment of the qualities ´progressive´, ´credible´ and ´socially´ to the large parties; satisfaction with democracy; strength of the interest in politics; right people in the leading positions (general, in politics and in the economy); expectation of the future economic situation in Germany; condition of the society in Germany and in comparison with the Western European neighbours; comparison between Germany and Western European neighbouring states regarding the economic situation; Europe, USA or China as the most successful economy region; the presumably strongest economy region within 10 years: Europe, USA or China; perceived conflicts between the poor and the rich, employers and employees, young and old, foreigners and Germans, East Germans and West Germans, as well as between men and women; expected and preferred direction of development for the SPD (to the left or to the right); attitude towards the intended resignation of Franz Müntefering from the SPD party leadership; judgement of Matthias Platzeck as a successor for the SPD party leadership as well as expected strengthening of the cohesion in the SPD due to Matthias Platzeck; correctness and sufficiency of the previous reforms; personal meaning of Hartz IV; judgement of the unemployment benefit (ALG II) and the cuts for long-term unemployed; attitude towards beginning work on lower wage level; assessment of the success of the Hartz IV reforms in respect to creation of new jobs; judgement of the carrying out of the introduction of Hartz IV; attitude towards the standard wage as a minimum wage; attitude towards a stronger taxation of high incomes as well as at a common tax rate of 25%; 
preferred measures of the state for the reduction of the budget deficit: tax increases, reductions of expenditure or additional debts; attitude towards the rise of the retirement age to 67 years; preference for a rise of contributions to the health insurance or for the payment of services on ones own expense; attitude towards the suggested contribution to the health insurance of non-working spouse; expected reform readiness of the grand coalition; Federal Government, enterprise or world economic situation being responsible for unemployment in Germany; assessment of the share of enterprises which cut jobs despite high profits; opinion on the SPD debate: greed for profit of the enterprises leads to endangering the democracy; attitude towards the counter-opinion of the CDU: SPD debate as a diversionary tactics of unemployment issues; opinion about solving the unemployment problem within the next years; assumed agreement between the government and the CDU/CSU opposition for fighting of unemployment; sufficient measures of the Federal Government in fighting unemployment and in comparison to an assumed CDU/CSUs commanded government as well as in comparison to a grand coalition of SPD and CDU/CSU; expected effect of selected measures on the fighting of unemployment (tax reduction for enterprise, loosening of the dismissal protection, working time extension, reduction in the contributions to the social insurance; expected continuance of the government coalition until the next parliamentary elections in 2006; assumed actual majority for Gerhard Schröder in the Bundestag; expectation of a forward brought new election according to the vote of confidence; judgement of a forward brought new election; attitude towards a resignation of Gerhard Schröder; attitude towards a change of the constitution for the self-dissolution of the Bundestag; expectation of value-added tax increase, further cuts in the health system, the abolition of subsidies for home buyers, cuts of the social 
provision as well as keeping the environmental tax after an electoral victory by the CDU/CSU or SPD; judgement of the solution expertise for economic and social
There are a number of reasons Germans don’t use online administration services. For most, either they don’t know what exactly the online offers are, or the service they are looking for isn’t available online, thus meaning an actual appointment has to be made after all.
Privacy and security concerns
In Germany, e-government procedures have been supported by the e-government law since 2013. Despite this, there are still several serious concerns among the population about data security, which contribute to usage barriers and prevent services from becoming better known or more widely used. The main issues are thought to be a lack of security during data transmission and the fear of becoming a transparent citizen due to personal data being collected in one central database. More broadly, the general population was skeptical about whether personal data is safe online.
Use and awareness
When Germans did use e-government services, they tended to search for information from their city administrations about current topics on social networks, as well as use open data portals run by an administration. In terms of e-government service awareness, users were least informed about crowd-sourcing deals available for negotiation between citizens and an administration.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset includes occurrence data of invasive and potentially invasive non-native plant species. The data have been collected since 2010, mostly in the German federal state of Saxony-Anhalt, but also in other federal states of Germany and elsewhere in Europe. Collectors are the staff of the Coordination Centre for alien invasive plant species (KORINA), employees of public authorities, scientists, students and citizens.
Attitude to problems of international policy and the USA-image of the Germans. Topics: Satisfaction with standard of living; contentment with life; attitude to France, USA, Soviet Union, China, Italy and Great Britain; judgement on foreign policy; assessment of the desire for peace and the military strength of the two superpowers; judgement on the US-American and Soviet relation in international matters; trust in the political capabilities of the USA; assessment of the capability of the Soviet Union and America in the areas of economy, culture, science, space research, education and nuclear weapons; judgement on the importance of landing on the moon; attitude to a united Europe; judgement on world population growth and the population development in the FRG; attitude to a birth control program in the FRG and in developing countries; current politician idols in Europe and the rest of the world; trust in the alliance partners; attitude to disarmament, the NATO, nuclear weapons tests, the UN, the admission of China into the United Nations and the Vietnam war; assessment of the race problems in the USA; attitude to American private investments in the FRG; naming groups and organizations in the FRG that are too influential; membership in a trade union; party preference; religiousness. Demography: age (classified); sex; occupation; state. Interviewer rating: willingness of respondent to cooperate; difficulties in answering questions; length of interview; presence of another person; number of contact attempts; social class of respondent; city size; date of interview.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Germany recorded a trade surplus of EUR 14.70 billion in July 2025. This dataset provides - Germany Balance of Trade - actual values, historical data, forecast, chart, statistics, economic calendar and news.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about books. It has 1 row and is filtered where the book is Czechoslovakia before Munich : the German minority problem and British appeasement policy. It features 7 columns including author, publication date, language, and book publisher.
The Politbarometer has been conducted since 1977 on an almost monthly basis by the Research Group for Elections (Forschungsgruppe Wahlen) for the Second German Television (ZDF). Since 1990, this database has also been available for the new German states. The survey focuses on the opinions and attitudes of the voting population in the Federal Republic on current political topics, parties, politicians, and voting behavior. From 1990 to 1995 and from 1999 onward, the Politbarometer surveys were conducted separately in the eastern and western federal states (Politbarometer East and Politbarometer West). The separate monthly surveys of a year are integrated into a cumulative data set that includes all surveys of a year and all variables of the respective year. The Politbarometer short surveys, collected with varying frequency throughout the year, are integrated into the annual cumulation starting from 2003.
The collection contains 5 interviews with Greek immigrants in Germany (2 women and 3 men), recording the post-war difficulties that led them to migrate, the transition from rural life to industrial work, and the problems of returning to Greece. The purpose of the research was also to compare the post-war Greek migration to Germany with the experience of Albanian immigrants in Greece (Collection No 2). On the basis of this comparison, the students produced a radio show entitled "Journey to Infinity", which was broadcast on the then Municipal Radio. Sampling: non-probability (availability). Mode of collection: face-to-face interview.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Abstract: The dataset contains 3D geometries of major tectonic faults in Germany as mesh data in .txt format.
TechnicalRemarks: The dataset contains 3D geometries of tectonic faults in Germany as mesh data. There are three fault sets:
• Vertical fault set: Vertical_Fault_Set
• Andersonian fault set containing: Andersonian_Normal_Faults, Andersonian_Thrust_Faults and Andersonian_Strike_Slip_Faults
• Semi-Realistic fault set containing: Semi_Realistic _Normal_Faults, Semi_Realistic_Thrust_Faults and Semi_Realistic _Strike_Slip_Faults
The three fault sets are of increasing complexity, however their accuracy is limited due to the large scale and the sparse availability of reference data. The data sets are given as .txt files that can be imported to Tecplot EX 2019 using the freely available AddOn GeoStress. If the files should be opened using Hypermesh or used as an input file for Abaqus the following edits have to be made:
• The data type has to be changed to .inp
• The Line *NAME has to be deleted
• The document must end with ***** instead of **
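The three edits for Hypermesh or Abaqus can be scripted. The sketch below is one possible way to do it and is not part of the dataset. It assumes that the line to delete begins with *NAME, that the last non-empty line of the file is exactly **, and that the files are plain text; the function name txt_to_inp is hypothetical.

```python
from pathlib import Path


def txt_to_inp(txt_path: str) -> Path:
    """Write an .inp copy of a fault-set .txt file with the three edits applied."""
    src = Path(txt_path)
    lines = src.read_text().splitlines()

    # Edit 2: delete the *NAME line (assumed to start with "*NAME").
    lines = [ln for ln in lines if not ln.strip().upper().startswith("*NAME")]

    # Ignore trailing blank lines so the terminator check sees the real last line.
    while lines and not lines[-1].strip():
        lines.pop()

    # Edit 3: the document must end with ***** instead of **.
    if lines and lines[-1].strip() == "**":
        lines[-1] = "*****"

    # Edit 1: change the file extension to .inp (the original .txt is left untouched).
    dst = src.with_suffix(".inp")
    dst.write_text("\n".join(lines) + "\n")
    return dst
```

Calling `txt_to_inp("Vertical_Fault_Set.txt")` would write `Vertical_Fault_Set.inp` next to the original file, provided the file is named that way locally.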
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset consists of 1,759,830 multi-spectral image patches from the Sentinel-2 mission, annotated with image- and pixel-level land cover and land usage labels from the German land cover model LBM-DE2018, whose land cover classes are based on the CORINE Land Cover database (CLC) 2018. It includes pixel-synchronous examples from each of the four seasons, plus an additional snowy set, spanning the period from April 2018 to February 2019. The patches were taken from 519,547 unique locations, covering the whole surface area of Germany, with each patch covering an area of 1.2 km x 1.2 km. The set is split into two overlapping grids of roughly 880,000 samples each, which are shifted against each other by half the patch size in both dimensions. The images within each grid do not overlap.
Contents
Each sample includes:
3 10m resolution bands (RGB), 120px x 120px
1 10m resolution band (infrared), 120px x 120px
6 20m resolution bands, 60px x 60px
2 60m resolution bands, 20px x 20px
1 pixel-level label map
2 binary masks for cloud and snow coverage
2 binary masks for easy and medium segmentation difficulty, marking areas <300px and <100px respectively
1 JSON-file containing additional meta-information
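The band sizes listed above all describe the same ground footprint, since resolution times pixel count gives 1.2 km in every case. A small sketch of that arithmetic follows; the dictionary keys are descriptive labels chosen here, not names used in the dataset files.

```python
# (resolution in metres per pixel, patch width in pixels) for each band group.
# Keys are illustrative labels, not file or band names from the dataset.
band_groups = {
    "RGB, 3 bands": (10, 120),
    "infrared, 1 band": (10, 120),
    "20 m bands, 6 bands": (20, 60),
    "60 m bands, 2 bands": (60, 20),
}


def footprint_m(resolution_m: int, size_px: int) -> int:
    """Side length of the patch on the ground, in metres."""
    return resolution_m * size_px


for name, (resolution_m, size_px) in band_groups.items():
    side = footprint_m(resolution_m, size_px)
    print(f"{name}: {size_px} px x {resolution_m} m = {side} m")
```

Because every group covers 1200 m per side, the coarser bands can be upsampled to 120px x 120px when a model needs all bands on one pixel grid.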
The meta.csv contains the following information about each sample:
Which season it belongs to
Which of the two grids it belongs to
Coordinates of the patch center
Whether it was acquired from Sentinel-2 Satellite A or B
Date and time of image acquisition
Snow and cloud coverage percentages
Image-level multi-class labels
Three additional image-level urbanization labels, based on the center pixel (details below)
The path to the sample
Classes
ID: Class
1: Continuous urban fabric
2: Discontinuous urban fabric
3: Industrial or commercial units
4: Road and rail networks and associated land
5: Port areas
6: Airports
7: Mineral extraction sites
8: Dump sites
9: Construction sites
10: Green urban areas
11: Sport and leisure facilities
12: Non-irrigated arable land
13: Vineyards
14: Fruit trees and berry plantations
15: Pastures
16: Broad-leaved forest
17: Coniferous forest
18: Mixed forest
19: Natural grasslands
20: Moors and heathland
21: Transitional woodland/shrub
22: Beaches, dunes, sands
23: Bare rock
24: Sparsely vegetated areas
25: Inland marshes
26: Peat bogs
27: Salt marshes
28: Intertidal flats
29: Water courses
30: Water bodies
31: Coastal lagoons
32: Estuaries
33: Sea and ocean
Urbanization classes
SLRAUM
0: None
1: Ländlicher Raum (~ rural area)
2: Städtischer Raum (~ urban area)
RTYP3
0: None
1: Ländliche Regionen (~ rural areas)
2: Regionen mit Verstädterungsansätzen (~ urbanizing areas)
3: Städtische Regionen (~ urban areas)
KTYP4
0: None
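1: Dünn besiedelte ländliche Kreise (~ sparsely populated rural districts)
2: Kreisfreie Großstädte (~ large independent cities)
3: Ländliche Kreise mit Verdichtungsansätzen (~ rural districts with signs of densification)
4: Städtische Kreise (~ urban districts)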
Further information on the urbanization classes can be found here:
SLRAUM
RTYP3
KTYP4
License of landcover model
Bundesamt für Kartographie und Geodäsie
dl-de/by-2-0 from https://www.govdata.de/dl-de/by-2-0
© GeoBasis-DE / BKG 2022
Source of landcover model
https://www.futurebeeai.com/policies/ai-data-license-agreement
The German Healthcare Chat Dataset is a rich collection of over 12,000 text-based conversations between customers and call center agents, focused on real-world healthcare interactions. Designed to reflect authentic language use and domain-specific dialogue patterns, this dataset supports the development of conversational AI, chatbots, and NLP models tailored for healthcare applications in German-speaking regions.
The dataset captures a wide spectrum of healthcare-related chat scenarios, ensuring comprehensive coverage for training robust AI systems:
This variety helps simulate realistic healthcare support workflows and patient-agent dynamics.
This dataset reflects the natural flow of German healthcare communication and includes:
These elements ensure the dataset is contextually relevant and linguistically rich for real-world use cases.
Conversations range from simple inquiries to complex advisory sessions, including:
Each conversation typically includes these structural components:
This structured flow mirrors actual healthcare support conversations and is ideal for training advanced dialogue systems.
Available in JSON, CSV, and TXT formats, each conversation includes:
Geocoded addresses with enriched data and contact information from schools / educational institutions in Germany. These data provide detailed information on the location, name of the facility, type of facility such as high school, day care center, etc., number of pupils and children, contact options for the respective facility and much more.
Key features: Our school dataset of 32K records includes: • Address, Phone Number, E-Mail Address • Type of school • Number of students • 30+ Categories such as elementary school, grammar school, comprehensive school, community school, high school etc.
Benefits: • One authoritative dataset of all school facilities in Germany • Data updated annually • Consistent format
Applications: Direct marketing, Dialogue marketing, Telemarketing, Outdoor advertising, Location planning, Location-related analyses, Geomarketing, Email campaigns, Construction planning, Property valuation
Industries: Advertising Agencies, Planning Agencies, Data Collection & Internet Portals, Marketing Data Providers, Data Management Platforms, Real estate, Finance, Investment, Housing associations, Planning offices
Problem solving: • How can I address schools directly for my marketing campaign? • Which advertising near a school is a good fit for my target group, such as teachers, carers, mothers / parents or children? • Which outdoor advertising shouldn't be placed near a school (e.g. tobacco, alcohol, adult content)? • How can I directly communicate with schools as part of a telemarketing campaign? • What role do schools play in location planning for retail, gastronomy, transport, leisure, etc.? • How can schools be involved in the traffic and infrastructure planning of cities and municipalities in smart city projects or the establishment of safety zones?
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
(see https://tblock.github.io/10kGNAD/ for the original dataset page)
This page introduces the 10k German News Articles Dataset (10kGNAD), a German topic classification dataset. The 10kGNAD is based on the One Million Posts Corpus and available under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. You can download the dataset here.
English text classification datasets are common. Examples are the big AG News, the class-rich 20 Newsgroups and the large-scale DBpedia ontology datasets for topic classification, and the commonly used IMDb and Yelp datasets for sentiment analysis. Non-English datasets, especially German datasets, are less common. There is a collection of sentiment analysis datasets assembled by the Interest Group on German Sentiment Analysis. However, to my knowledge, no German topic classification dataset is available to the public.
Due to grammatical differences between the English and the German language, a classifier might be effective on an English dataset, but not as effective on a German dataset. Compared to English, the German language is more highly inflected, and long compound words are quite common. One would need to evaluate a classifier on multiple German datasets to get a sense of its effectiveness.
The 10kGNAD dataset is intended to solve part of this problem as the first German topic classification dataset. It consists of 10273 German-language news articles from an Austrian online newspaper, categorized into nine topics. These articles are a so far unused part of the One Million Posts Corpus.
In the One Million Posts Corpus each article has a topic path, for example Newsroom/Wirtschaft/Wirtschaftpolitik/Finanzmaerkte/Griechenlandkrise. The 10kGNAD uses the second part of the topic path, here Wirtschaft, as class label. As a result, the dataset can be used for multi-class classification.
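Deriving the class label from a topic path comes down to splitting on the slash and taking the second component. A minimal sketch, assuming paths are slash-separated as in the example above; the function name topic_label is hypothetical and this is not the project's own extraction script.

```python
def topic_label(topic_path: str) -> str:
    """Return the second component of a slash-separated topic path."""
    parts = topic_path.strip("/").split("/")
    if len(parts) < 2:
        raise ValueError(f"topic path has no second component: {topic_path!r}")
    return parts[1]


path = "Newsroom/Wirtschaft/Wirtschaftpolitik/Finanzmaerkte/Griechenlandkrise"
print(topic_label(path))  # prints "Wirtschaft"
```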
I created and used this dataset in my thesis to train and evaluate four text classifiers on the German language. By publishing the dataset I hope to support the advancement of tools and models for the German language. Additionally, this dataset can be used as a benchmark dataset for German topic classification.
As in most real-world datasets, the class distribution of the 10kGNAD is not balanced. The biggest class, Web, consists of 1678 articles, while the smallest class, Kultur, contains only 539. However, articles from the Web class have on average the fewest words, while articles from the Kultur (culture) class have the second most words.
I propose a stratified split of 10% for testing and the remaining articles for training.
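A stratified split holds out the same fraction of every class, so the unbalanced class distribution is preserved in both parts. The sketch below illustrates the idea with the standard library only. It is not the script shipped with the project, the seed and the rounding rule are assumptions, and the toy labels are invented.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.1, seed=42):
    """Return (train_indices, test_indices), sampling test_fraction per class."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Toy labels for illustration only; the real class sizes differ.
labels = ["Web"] * 30 + ["Kultur"] * 10
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

For benchmark comparisons, the published train and test files should be preferred over a freshly drawn split, since a different seed yields a different test set.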
To use the dataset as a benchmark dataset, please use the train.csv and test.csv files located in the project root.
Python scripts to extract the articles and split them into a train and a test set are available in the code directory of this project.
Make sure to install the requirements.
The original corpus.sqlite3 is required to extract the articles (download here (compressed) or here (uncompressed)).
This dataset is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Please consider citing the authors of the One Million Posts Corpus if you use the dataset.
https://www.futurebeeai.com/policies/ai-data-license-agreement
Welcome to the German General Conversation Speech Dataset — a rich, linguistically diverse corpus purpose-built to accelerate the development of German speech technologies. This dataset is designed to train and fine-tune ASR systems, spoken language understanding models, and generative voice AI tailored to real-world German communication.
Curated by FutureBeeAI, this 30-hour dataset offers unscripted, spontaneous two-speaker conversations across a wide array of real-life topics. It enables researchers, AI developers, and voice-first product teams to build robust, production-grade German speech models that understand and respond to authentic German accents and dialects.
The dataset comprises 30 hours of high-quality audio, featuring natural, free-flowing dialogue between native speakers of German. These sessions range from informal daily talks to deeper, topic-specific discussions, ensuring variability and context richness for diverse use cases.
The dataset spans a wide variety of everyday and domain-relevant themes. This topic diversity ensures the resulting models are adaptable to broad speech contexts.
Each audio file is paired with a human-verified, verbatim transcription available in JSON format.
These transcriptions are production-ready, enabling seamless integration into ASR model pipelines or conversational AI workflows.
The dataset comes with granular metadata for both speakers and recordings.
Such metadata helps developers fine-tune model training and supports use-case-specific filtering or demographic analysis.
This dataset is a versatile resource for multiple German speech and language AI applications.
The study 'Current questions on the economy and transformation' was conducted by forsa on behalf of the Press and Information Office of the Federal Government. In the survey period from 15 April to 17 April 2024, the German population was asked about their opinions on the economic transformation.
Topics: Current challenges of economic development in Germany compared to ten years ago; assessment of the appropriateness of the activities of the following actors with regard to overcoming the economic challenges: federal government, opposition in the Bundestag, state governments, companies and business associations, trade unions; preference for a future orientation of the German economy towards: climate protection and green technologies, established industries; importance of the following aspects with regard to the federal government's actions: higher investment in infrastructure, greater expansion of renewable energies, promotion of climate-neutral industry, promotion of the establishment of future industries, improvement of working conditions, increasing the efficiency of public administration work, relieving companies of bureaucracy, no further debt, expansion of partnerships with Brazil, India and South Africa; attitude towards selected statements: 'Made in Germany' is recognised worldwide as a seal of quality, German economy should also become more independent of other countries in the long term despite higher costs in the short term, Germany needs more skilled workers from abroad.
Demography: sex; age (grouped); school leaving certificate; net household income (grouped); party preference in the next federal election; voting behaviour in the last federal election.
Additionally coded: respondent ID; size of locality; region; weight.