7 datasets found
  1. aggregate-data-italian-cities-from-wikipedia

    • kaggle.com
    Updated May 20, 2020
    Cite
    alepuzio (2020). aggregate-data-italian-cities-from-wikipedia [Dataset]. https://www.kaggle.com/alepuzio/aggregatedataitaliancitiesfromwikipedia/code
    Dataset updated
    May 20, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    alepuzio
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Context

    This dataset is the result of my study of web scraping English Wikipedia in R and my tests of regression and classification modeling in R.

    Content

    The content was created by reading the relevant English Wikipedia articles about Italian cities. I didn't run NLP analysis; I only read the data tables and ranked every city from 0 to N in every aspect. For these values, 0 means "*the city is not ranked in this aspect*" and N means "*the city is in first place, in descending order of importance, in this aspect*". If an aspect has no ranking (for example, when only the existence of airports/harbours is known, with no additional data about traffic or size), then 0 means "*none exist*" and N means "*there are N airports/harbours*". The only non-numeric column holds the city names in their English form, with a few exceptions kept for simplicity (for example, "*Bra (CN)*").
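    The scoring scheme described above can be sketched in a few lines of Python. This is only an illustration of one reading of the description (first place gets N, the last ranked city gets 1, unranked cities get 0); the function name `rank_scores`, the city list, and the ordering are assumptions made for the example, and the dataset itself was built in R.

```python
def rank_scores(ranked_cities, all_cities):
    """Map each city to a score: N for first place, 1 for last ranked, 0 if unranked.

    Hypothetical helper: this is one interpretation of the 0..N scheme,
    not the author's actual R code.
    """
    n = len(ranked_cities)
    scores = {city: 0 for city in all_cities}  # 0 = not ranked in this aspect
    for position, city in enumerate(ranked_cities):
        scores[city] = n - position  # first place -> N, second -> N - 1, ...
    return scores


# Illustrative input only; not taken from the dataset.
all_cities = ["Rome", "Milan", "Naples", "Bra (CN)"]
ranked_in_aspect = ["Rome", "Milan", "Naples"]

scores = rank_scores(ranked_in_aspect, all_cities)
print(scores)
```

    For count-style aspects such as airports or harbours, the description says the value is simply the number of facilities, so no ranking step would be needed there.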

    Acknowledgements

    I thank the Wikimedia Foundation for its work, its mission, and for making available the cover image of this dataset (please read the article "The Ideal City (painting)"). I also thank StackOverflow and Cross-Validated for being the most important hubs of technical knowledge in the world, and everyone on Kaggle for their suggestions.

    Inspiration

    As a beginner in data analysis and modeling (OK, I passed the statistics exam at Politecnico di Milano (Italy), but I haven't worked on this topic for more than 10 years and my memory is getting old ^_^), I worked mostly on data cleaning, dataset building, and building the simplest models.

    You can use this dataset to work out which cities are good to live in, or expand it with other data from Wikipedia (not only by reading the tables but also by reading the article text and extracting data from the unstructured text).

  2. Italy IT: Proportion of People Living Below 50 Percent Of Median Income: %

    • ceicdata.com
    Updated Nov 29, 2022
    Cite
    CEICdata.com (2022). Italy IT: Proportion of People Living Below 50 Percent Of Median Income: % [Dataset]. https://www.ceicdata.com/en/italy/social-poverty-and-inequality
    Dataset updated
    Nov 29, 2022
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 1, 2010 - Dec 1, 2021
    Area covered
    Italy
    Description

    IT: Proportion of People Living Below 50 Percent Of Median Income: % data was reported at 15.300 % in 2021. This records a decrease from the previous number of 15.600 % for 2020. The data is updated yearly, averaging 14.050 % (median) from Dec 1977 to 2021, with 36 observations. The data reached an all-time high of 16.200 % in 1993 and a record low of 9.700 % in 1982. The data remains in active status in CEIC and is reported by the World Bank. The data is categorized under Global Database's Italy – Table IT.World Bank.WDI: Social: Poverty and Inequality.

    Definition: The percentage of people in the population who live in households whose per capita income or consumption is below half of the median income or consumption per capita. The median is measured at 2017 Purchasing Power Parity (PPP) using the Poverty and Inequality Platform (http://www.pip.worldbank.org). For some countries, medians are not reported due to grouped and/or confidential data. The reference year is the year in which the underlying household survey data was collected. In cases for which the data collection period bridged two calendar years, the first year in which data were collected is reported.

    Source: World Bank, Poverty and Inequality Platform. Data are based on primary household survey data obtained from government statistical agencies and World Bank country departments. Data for high-income economies are mostly from the Luxembourg Income Study database. For more information and methodology, please see http://pip.worldbank.org.

    Note: The World Bank's internationally comparable poverty monitoring database now draws on income or detailed consumption data from more than 2000 household surveys across 169 countries. See the Poverty and Inequality Platform (PIP) for details (www.pip.worldbank.org).
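    The indicator's definition (the share of people whose per capita income falls below half of the median) can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions: the incomes are invented, every person carries equal weight, and no PPP conversion is applied, whereas the Poverty and Inequality Platform works from weighted household survey data. The name `share_below_half_median` is hypothetical.

```python
from statistics import median


def share_below_half_median(incomes):
    """Percentage of people with income below 50% of the median income.

    Simplified: unweighted, one value per person, no PPP adjustment.
    """
    threshold = 0.5 * median(incomes)
    below = sum(1 for income in incomes if income < threshold)
    return 100.0 * below / len(incomes)


# Invented per capita incomes for ten people (not real data).
sample = [4, 9, 12, 15, 20, 22, 25, 30, 40, 60]

# Median is 21, so the threshold is 10.5; two of ten people fall below it.
print(share_below_half_median(sample))  # 20.0
```

    Because the threshold moves with the median, this is a relative measure: it can rise even when everyone's income grows, if incomes at the bottom grow more slowly than the median.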

  3. European Values Study 2008: Italy (EVS 2008)

    • search.gesis.org
    • dbk.gesis.org
    • +3 more
    Updated Nov 30, 2010
    Cite
    Rovati, Giancarlo (2010). European Values Study 2008: Italy (EVS 2008) [Dataset]. http://doi.org/10.4232/1.10031
    Dataset updated
    Nov 30, 2010
    Dataset provided by
    GESIS Data Archive
    GESIS search
    Authors
    Rovati, Giancarlo
    License

    https://www.gesis.org/en/institute/data-usage-terms

    Time period covered
    Oct 2, 2009 - Dec 30, 2009
    Area covered
    Italy
    Description

    This version of the survey is not up to date. Please use the updated version included in the EVS integrated data files. This national dataset is only available for replication purposes and for analysis with additional country-specific variables (see 'Further Remarks').

    Two online overviews offer comprehensive metadata on the EVS datasets and variables.

    The extended study description for the EVS 2008 provides country-specific information on the origin and outcomes of the national surveys. The variable overview of the four EVS waves (1981, 1990, 1999/2000, and 2008) allows for identifying country-specific deviations in the question wording within and across the EVS waves.

    These overviews can be found at: Extended Study Description; Variable Overview.

    Moral, religious, societal, political, work, and family values of Europeans.

    Topics:

    1. Perceptions of life: importance of work, family, friends and acquaintances, leisure time, politics and religion; frequency of political discussions with friends; happiness; self-assessment of own health; memberships and unpaid work (volunteering) in: social welfare services, religious or church organisations, education, or cultural activities, labour unions, political parties, local political actions, human rights, environmental or peace movement, professional associations, youth work, sports clubs, women's groups, voluntary associations concerned with health or other groups; tolerance towards minorities (people with a criminal record, of a different race, left/right wing extremists, alcohol addicts, large families, emotionally unstable people, Muslims, immigrants, AIDS sufferers, drug addicts, homosexuals, Jews, gypsies and Christians - social distance); trust in people; estimation of people's fair and helpful behaviour; internal or external control; satisfaction with life.

    2. Work: reasons for people to live in need; importance of selected aspects of occupational work; employment status; general work satisfaction; freedom of decision-taking in the job; importance of work (work ethics, scale); important aspects of leisure time; attitude towards following instructions at work without criticism (obedience work); give priority to nationals over foreigners as well as men over women in jobs.

    3. Religion: Individual or general clear guidelines for good and evil; religious denomination; current and former religious denomination; current frequency of church attendance and at the age of 12; importance of religious celebration at birth, marriage, and funeral; self-assessment of religiousness; churches give adequate answers to moral questions, problems of family life, spiritual needs and social problems of the country; belief in God, life after death, hell, heaven, sin and re-incarnation; personal God versus spirit or life force; own way of connecting with the divine; interest in the sacred or the supernatural; attitude towards the existence of one true religion; importance of God in one's life (10-point-scale); experience of comfort and strength from religion and belief; moments of prayer and meditation; frequency of prayers; belief in lucky charms or a talisman (10-point-scale); attitude towards the separation of church and state.

    4. Family and marriage: most important criteria for a successful marriage (scale); attitude towards childcare (a child needs a home with father and mother, a woman has to have children to be fulfilled, marriage is an out-dated institution, woman as a single-parent); attitude towards marriage, children, and traditional family structure (scale); attitude towards traditional understanding of one's role of man and woman in occupation and family (scale); attitude towards: respect and love for parents, parent's responsibilities for their children and the responsibility of adult children for their parents when they are in need of long-term care; importance of educational goals; attitude towards abortion.

    5. Politics and society: political interest; political participation; preference for individual freedom or social equality; self-assessment on a left-right continuum (10-point-scale); self-responsibility or governmental provision; free decision of job-taking of the unemployed or no permission to refuse a job; advantage or harmfulness of competition; liberty of firms or governmental control; equal incomes or incentives for indivi...

  4. English-Italian Parallel Corpus for the Gaming Domain

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). English-Italian Parallel Corpus for the Gaming Domain [Dataset]. https://www.futurebeeai.com/dataset/parallel-corpora/italian-english-translated-parallel-corpus-for-gaming-domain
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    The English-Italian Gaming Parallel Corpora is a curated bilingual dataset designed to support game localization, machine translation, and language model training for the Gaming industry. It consists of over 50,000 sentence pairs, professionally translated between English and Italian, capturing the linguistic and cultural depth of gaming content.

    Dataset Content

    Volume and Translator Diversity
    Total Sentence Pairs: 50,000+
    Contributors: Over 200 native and professional translators
    Source: All content is original and tailored specifically for the Gaming domain
    Sentence Variety
    Sentence Length: 7 to 25 words
    Sentence Types: Includes simple, compound, and complex sentences
    Forms Covered: Interrogative, imperative, affirmative, and negative sentences
    Voice Diversity: Sentences written in both active and passive voice
    Stylistic Coverage: Includes idioms, metaphors, gaming slang, and figurative expressions
    Discourse Elements: Contains conjunctions, logical connectors, and transitional phrases for natural flow
    Bidirectional Structure: Includes English to Italian and Italian to English translations for robust model training

    Domain-Specific Focus

    Gaming Language Coverage
    Terminology: Covers in-game elements, UI/UX, controls, multiplayer features, and genre-specific phrases
    Dialogue Content: Includes NPC dialogue, tutorial lines, mission briefings, walkthroughs, and strategy guidance
    Communication Scenarios: Reflects live chat, support queries, and multiplayer messaging
    Cross-Domain Inclusion: Contains relevant terms from adjacent domains like entertainment, esports, virtual worlds, and AR/VR
    Format and Structure
    File Formats: Delivered in Excel, with optional conversion to JSON, TMX, XML, XLIFF, XLS, or other standard formats
    Structure Fields: Serial Number, Unique ID, Source Sentence, Source Word Count, Target Sentence, Target Word Count
    Sentence Alignment: Sentence-level parallel pairs with consistent formatting for MT pipelines
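
    As a rough illustration of the record layout listed above, the following Python sketch models one row and checks it against the stated 7 to 25 word sentence length. The class and function names, the whitespace-based word counting, and the sample sentence pair are all assumptions made for the example; they are not taken from the corpus or from any FutureBeeAI tooling.

```python
from dataclasses import dataclass


@dataclass
class SentencePair:
    """One row, mirroring the listed structure fields (Python names are hypothetical)."""
    serial_number: int
    unique_id: str
    source_sentence: str
    source_word_count: int
    target_sentence: str
    target_word_count: int


def count_words(sentence: str) -> int:
    # Assumption: words are whitespace-separated tokens.
    return len(sentence.split())


def check_pair(pair: SentencePair, min_words: int = 7, max_words: int = 25) -> list:
    """Return a list of problems found in one pair (empty if none)."""
    problems = []
    for side, text, declared in (
        ("source", pair.source_sentence, pair.source_word_count),
        ("target", pair.target_sentence, pair.target_word_count),
    ):
        actual = count_words(text)
        if actual != declared:
            problems.append(f"{side}: declared {declared} words, counted {actual}")
        if not min_words <= actual <= max_words:
            problems.append(f"{side}: {actual} words is outside {min_words}-{max_words}")
    return problems


# Invented example row; not a sentence from the dataset.
pair = SentencePair(
    serial_number=1,
    unique_id="demo-0001",
    source_sentence="Press the jump button twice to reach the ledge.",
    source_word_count=9,
    target_sentence="Premi due volte il pulsante di salto per raggiungere la sporgenza.",
    target_word_count=11,
)
print(check_pair(pair))
```

    A check like this is only a sanity pass on the metadata columns; it says nothing about translation quality or alignment.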

    Usage and Applications

    Machine Translation: Train and fine-tune domain-specific MT engines for gaming content
    Game Localization: Adapt games across English-Italian markets while preserving nuance and playability
    NLP Tools: Power predictive keyboards, grammar checkers, spelling correction, and sentence completion models
    LLM Fine-Tuning: Strengthen bilingual comprehension and translation capabilities in large language models
    Dialogue Systems: Enable context-aware, conversational AI for in-game or support environments
    Bilingual Retrieval: Use for cross-language search, sentence matching, and similarity scoring

    Alignment Confidence and Quality Assurance

    All translations are manually verified by native bilingual experts for accuracy, naturalness, and domain relevance
    Each sentence pair is reviewed to ensure semantic alignment and stylistic consistency

  5. Italy Interest Rate

    • tradingeconomics.com
    • de.tradingeconomics.com
    • +13 more
    csv, excel, json, xml
    Cite
    TRADING ECONOMICS, Italy Interest Rate [Dataset]. https://tradingeconomics.com/italy/interest-rate
    Available download formats: excel, csv, json, xml
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 18, 1998 - Jul 24, 2025
    Area covered
    Italy
    Description

    The benchmark interest rate in Italy was last recorded at 4.50 percent. This dataset provides actual values, historical data, forecasts, charts, statistics, an economic calendar, and news for the Italy Interest Rate.

  6. Venice Italian Treebank (VIT)

    • catalog.elra.info
    • live.european-language-grid.eu
    Updated Oct 23, 2014
    Cite
    ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency) (2014). Venice Italian Treebank (VIT) [Dataset]. https://catalog.elra.info/en-us/repository/browse/ELRA-W0040/
    Dataset updated
    Oct 23, 2014
    Dataset provided by
    ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency)
    ELRA (European Language Resources Association)
    License

    https://catalog.elra.info/static/from_media/metashare/licences/ELRA_VAR.pdf

    https://catalog.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf

    Area covered
    Venice
    Description

    The Venice Italian Treebank (VIT) is the result of a collaboration among people working at the Laboratory of Computational Linguistics (LCL) of the University of Venice in the years 1995-2005. It is partly the result of annotation carried out internally with no specific project in mind and no financial support. This work was partly related to the development of a lexicon, a morphological analyzer, a tagger, and a deep parser of Italian. All these resources were finally ready at the beginning of the '90s, when the LCL got involved in the first national projects. The VIT contains about 272,000 words distributed over six different domains, and this is what makes it so relevant for the study of the structure of the Italian language. The following domains were annotated:

    Domain                 Number of words   Time span
    Bureaucratic           20,000            1986
    Politics               40,000            1984
    Economic & financial   12,000            1987
    Literary               10,000            1984
    Scientific             20,000            1985
    News                   170,000           1994

    In addition, some 60,000 tokens of spoken dialogues in different Italian varieties were annotated.

    The annotation follows general X-bar criteria with 29 constituency labels and 102 PoS tags. VIT is also made available in a broad annotation version with 10 constituency labels and 22 PoS tags for machine learning purposes.

    The format is plain text with square bracketing. However, a UPenn-style version, which is readable by the open source query language CorpusSearch, is also provided.

    Version 2 is also available here: http://catalog.elra.info/en-us/repository/browse/ELRA-W0324/

  7. Number of UK citizens living in EU countries 2019

    • statista.com
    Updated Jul 15, 2025
    Cite
    Statista (2025). Number of UK citizens living in EU countries 2019 [Dataset]. https://www.statista.com/statistics/1059795/uk-expats-in-europe/
    Dataset updated
    Jul 15, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2019
    Area covered
    European Union
    Description

    In 2019, there were approximately 302,020 British citizens living in Spain, with a further 293,061 in Ireland and 176,672 in France. By comparison, there were only 604 British people living in Slovenia, the fewest of any European Union member state. While the UK was a member of the European Union, British citizens had the right to live and work in any EU member state. Although these rights were lost for most British citizens after the UK left the EU in 2020, Britons already living in EU states were able to largely retain their previous rights of residence.

    EU citizens living in the UK

    EU citizens living in the UK face the same dilemma that British nationals did regarding their legal status after Brexit. In the same year, there were 902,000 Polish citizens, 404,000 Romanians, and 322,000 people from the Republic of Ireland living in the UK, along with almost two million EU citizens from the other 24 EU member states. To retain their rights after Brexit, EU citizens living in the UK were able to apply for the EU settlement scheme. As of 2025, there have been around 8.4 million applications to this scheme, with Romanian and Polish nationals the most common applicants, at 1.87 million and 1.27 million applications respectively.

    Is support for Brexit waning in 2024?

    As of 2025, the share of people in the UK who think leaving the EU was the wrong decision stood at 56 percent, compared with 31 percent who think it was the correct choice. In general, support for Brexit has declined since April 2021, when 46 percent of people supported Brexit, compared with 43 percent who regretted it. What people think Britain's relationship with the EU should be is, however, still unclear. A survey from November 2023 indicated that just 31 percent thought the UK should rejoin the EU, with a further 11 percent supporting rejoining the single market but not the EU. Only ten percent of respondents were satisfied with the current relationship, while nine percent wished to reduce ties even further.


aggregate-data-italian-cities-from-wikipedia

Elementary data about Italian cities from the specialized articles in Wikipedia
