100+ datasets found
  1. EurLex DataSet

    • kaggle.com
    zip
    Updated May 23, 2025
    + more versions
    Cite
    Beshoy Hakeem (2025). EurLex DataSet [Dataset]. https://www.kaggle.com/datasets/puskas78/eurlex-dataset
    Explore at:
    Available download formats: zip (992021193 bytes)
    Dataset updated
    May 23, 2025
    Authors
    Beshoy Hakeem
    Description

    The CEPS EurLex dataset

    The dataset contains 142,036 EU laws, almost the entire corpus of the EU's digitally available legal acts passed between 1952 and 2019. It encompasses the three types of legally binding acts passed by the EU institutions: 102,304 regulations, 4,070 directives and 35,798 decisions, all in English. The dataset was scraped from the official EU legal database (Eur-lex.eu) and transformed into machine-readable CSV format with the programming languages R and Python. The dataset was collected by the Centre for European Policy Studies (CEPS) for the TRIGGER project (https://trigger-project.eu/). We hope that it will facilitate future quantitative and computational research on the EU.

    Brief description:

    • The dataset is organised in tabular format, with each row representing one law and the columns representing 23 variables.
    • The full text of 134,633 laws is included (column "act_raw_text"). For newer laws, the text was scraped from Eur-lex.eu via the HTML pages, while for older laws, the text was extracted from (scanned) PDF documents (if available in English).
    • 22 additional variables are included, such as 'Act_name', 'Act_type', 'Subject_matter', 'Authors', 'Date_document', 'ELI_link' and 'CELEX' (a unique identifier for every law). Please see the "CEPS_EurLex_codebook.pdf" file for an explanation of all variables.
    • Given its size, the dataset was uploaded in different batches to facilitate usage. Some Excel files are provided for non-technical users. We recommend, however, the use of the CSV files, since Excel does not save large amounts of data properly. EurLex_all.csv is the master file containing all data.
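
    As a rough illustration of working with the CSV files described above, the following Python sketch counts laws per act type using only the standard library. The column name 'Act_type' is taken from the description; the exact header spelling, the delimiter and the sample rows are assumptions and should be checked against the codebook before use.

```python
import csv
import io
from collections import Counter

# The full-text column can hold very long strings, so raise the csv
# module's per-field limit (the default is 131072 characters).
csv.field_size_limit(2**31 - 1)


def count_by_act_type(handle, column="Act_type"):
    """Count rows per value of `column` in a CSV opened as a text stream."""
    reader = csv.DictReader(handle)
    counts = Counter()
    for row in reader:
        # Missing or empty values are grouped under a placeholder label.
        counts[row.get(column) or "(missing)"] += 1
    return counts


# Tiny in-memory stand-in for the real file (all values are made up).
sample = io.StringIO(
    "CELEX,Act_type,Act_name\n"
    "A1,Regulation,Example regulation\n"
    "A2,Directive,Example directive\n"
    "A3,Regulation,Another regulation\n"
    "A4,,Row without a type\n"
)
sample_counts = count_by_act_type(sample)
print(sample_counts)
```

    For the real data, the same function could be called on a handle opened with open("EurLex_all.csv", newline="", encoding="utf-8"); the encoding is a guess. Reading row by row avoids loading the full text of every law into memory at once.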

    Caveats:

    • The Eur-lex.eu website does not consistently provide data for all the variables. In addition, the HTML documents were not always cleanly formatted and text extraction from scanned PDFs is not entirely clean. Some data points are therefore missing for some laws and some laws were excluded entirely.
    • Not all (older) laws were available in English, especially since Ireland and the UK only joined the European Communities in 1973. Non-English laws are excluded from the dataset.

    Other:

    • For details on the types of EU legal acts: https://ec.europa.eu/info/law/law-making-process/types-eu-law_en
    • An example of an experimental analysis with this dataset: https://trigger-project.eu/2019/10/28/a-data-science-approach-to-eu-differentiated-integration/
    • The TRIGGER project is funded by the EU's Horizon 2020 programme, grant number 822735 (2020-02-16)

  2. File status Named Authority List

    • data.europa.eu
    rdf xml, xml, zip
    Updated Sep 26, 2024
    + more versions
    Cite
    Publications Office of the European Union (2024). File status Named Authority List [Dataset]. https://data.europa.eu/data/datasets/file-status?locale=en
    Explore at:
    Available download formats: zip, xml, rdf xml
    Dataset updated
    Sep 26, 2024
    Dataset provided by
    Publications Office of the European Union (http://op.europa.eu/)
    European Union
    Authors
    Publications Office of the European Union
    License

    http://data.europa.eu/eli/dec/2011/833/oj

    Description

    File status is a controlled vocabulary that outlines various stages in the life cycle of different activities. Originally, it detailed the different stages in the life cycle of an act (or in the decision-making process of an act) from initiation to completion, to meet the needs of EUR-Lex. As the need for a more generic status table arose, this asset was expanded to encompass stages for all kinds of products and processes. File status is maintained by the Publications Office of the European Union and disseminated on the EU Vocabularies website.

  3. European Commission - Service for Foreign Policy Instruments - Activity File...

    • iatiregistry.org
    iati-xml
    Updated Nov 20, 2025
    Cite
    European Commission - Service for Foreign Policy Instruments (2025). European Commission - Service for Foreign Policy Instruments - Activity File - Russia [Dataset]. https://iatiregistry.org/dataset/ec-fpi-ru
    Explore at:
    Available download formats: iati-xml (6115)
    Dataset updated
    Nov 20, 2025
    Dataset provided by
    Service for Foreign Policy Instruments
    European Commission (http://ec.europa.eu/)
    Area covered
    Russia
    Description

    European Commission - Service for Foreign Policy Instruments - Activity File - Russia

  4. Standard Eurobarometer STD93 : Standard Eurobarometer 93 - Summer 2020

    • data.europa.eu
    zip
    Updated Jan 6, 2021
    Cite
    Directorate-General for Communication (2021). Standard Eurobarometer STD93 : Standard Eurobarometer 93 - Summer 2020 [Dataset]. https://data.europa.eu/data/datasets/s2262_93_1_93_1_eng?locale=en
    Explore at:
    Available download formats: zip
    Dataset updated
    Jan 6, 2021
    Dataset authored and provided by
    Directorate-General for Communication
    License

    http://data.europa.eu/eli/dec/2011/833/oj

    Description

    The Summer 2020 edition of the standard Europe-wide survey was carried out in 34 countries/territories: the EU, the UK, 5 candidates for EU membership (North Macedonia, Turkey, Montenegro, Serbia and Albania) and the Turkish Cypriot community (in the part of the country not controlled by the Cypriot government). The survey addresses topics such as:

    • the political and economic situation in Europe
    • how Europeans perceive their political institutions (both national governments/parliaments and the EU institutions)
    • attitudes to European citizenship and other key policy areas
    • perceptions of the coronavirus pandemic and its consequences.

    Processed data

    Processed data files for the Eurobarometer surveys are published in .xlsx format.

    • Volume A "Countries/EU" The file contains frequencies and means or other synthetic indicators including elementary bivariate statistics describing distribution patterns of (weighted) replies for each country or territory and for (weighted) EU results.
    • Volume AP "Previous survey trends" The file compares to the previous poll in (weighted) frequencies and means (or other synthetic indicators including elementary bivariate statistics describing distribution patterns of replies); shifts for each country or territory foreseen in Volume A and for (weighted) results.
    • Volume AA "Groups of countries" The file contains (labelled) frequencies and means or other synthetic indicators including elementary bivariate statistics describing distribution patterns of (weighted) replies for groups of countries specified by the managing unit on the part of the EC.
    • Volume AAP "Trends of groups of countries" The file contains shifts compared to the previous poll in (weighted) frequencies and means (or other synthetic indicators including elementary bivariate statistics describing distribution patterns of replies); shifts for each group of countries foreseen in Volume AA and for (weighted) results.
    • Volume B "EU/socio-demographics" The file contains (labelled) frequencies and means or other synthetic indicators including elementary bivariate statistics describing distribution patterns of replies for the EU as a whole (weighted) and cross-tabulated by some 20 sociodemographic, socio-political or other variables, depending on the request from the managing unit on the part of the EC or the managing department of the other contracting authorities.
    • Volume BP "Trends of EU/socio-demographics" The file contains shifts compared to the previous poll in (weighted) frequencies and means (or other synthetic indicators including elementary bivariate statistics describing distribution patterns of replies); shifts for each country or territory foreseen in Volume B above and for (weighted) results.
    • Volume C "Country/socio-demographics" The file contains (labelled) weighted frequencies and means or other synthetic indicators including elementary bivariate statistics describing distribution patterns of replies for each country or territory surveyed separately and cross-tabulated by some 20 socio-demographic, socio-political or other variables (including a regional breakdown).
    • Volume D "Trends" The file compares to previous polls in (weighted) frequencies and means (or other synthetic indicators including elementary bivariate statistics describing distribution patterns of replies); shifts for each country or territory foreseen in Volume A and for (weighted) results.

    For SPSS files and questionnaires, please contact GESIS - Leibniz Institute for the Social Sciences: https://www.gesis.org/eurobarometer

  5. Slope derived from the Digital Elevation Model over Europe from the GSGRDA...

    • sdi.eea.europa.eu
    www:url
    Updated May 1, 2012
    + more versions
    Cite
    (2012). Slope derived from the Digital Elevation Model over Europe from the GSGRDA project (EU-DEM-PRE Slope, resolution 25 m) [Dataset]. https://sdi.eea.europa.eu/catalogue/srv/api/records/b0f63ca4-a269-4769-b384-5eedd64a7522
    Explore at:
    Available download formats: www:url
    Dataset updated
    May 1, 2012
    Time period covered
    Jan 1, 2000 - Dec 31, 2010
    Area covered
    Description

    The EU-DEM is a Digital Surface Model (DSM) representing the first surface as illuminated by the sensors. EU-DEM covers the EEA39 countries and was produced by a consortium led by Indra; Intermap edited the EU-DEM and AGI provided the water mask. The EU-DEM is a 3D raster dataset with elevations captured at 1 arc second postings (2.78E-4 degrees), or about every 30 meters. It is a hybrid product based on SRTM and ASTER GDEM data fused by a weighted averaging approach. Ownership of EU-DEM belongs to the European Commission, DG Enterprise and Industry.
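
    The relation between 1 arc second, 2.78E-4 degrees and roughly 30 meters can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes a spherical Earth with a mean radius of 6,371 km, and the function name is made up for illustration.

```python
import math

ARCSEC_IN_DEGREES = 1 / 3600  # one arc second expressed in degrees
EARTH_RADIUS_M = 6_371_000    # mean Earth radius; a spherical approximation


def arcsec_ground_size_m(latitude_deg=0.0):
    """Approximate north-south and east-west size of one arc second in meters."""
    metres_per_degree = 2 * math.pi * EARTH_RADIUS_M / 360
    north_south = metres_per_degree * ARCSEC_IN_DEGREES
    # Meridians converge towards the poles, so the east-west size shrinks.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


print(f"{ARCSEC_IN_DEGREES:.3g} degrees per arc second")
ns_equator, ew_equator = arcsec_ground_size_m(0.0)
ns_50, ew_50 = arcsec_ground_size_m(50.0)
print(round(ns_equator, 1), round(ew_50, 1))
```

    The north-south spacing stays close to 30 meters everywhere, while the east-west spacing is noticeably smaller at European latitudes, which is one reason such grids are usually reprojected before slope computations.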

    The projection onto an Inspire compliant grid of 25m resolution and the computation of a Slope raster have been performed by the Joint Research Centre of the European Commission (see file documentation/SPEC010_a100421-SLOP.pdf).

  6. Harmonized Tree Species Occurrence Points for Europe

    • zenodo.org
    application/gzip, bin +1
    Updated Jul 19, 2024
    Cite
    Johannes Heisig; Tomislav Hengl (2024). Harmonized Tree Species Occurrence Points for Europe [Dataset]. http://doi.org/10.5281/zenodo.4068253
    Explore at:
    Available download formats: bin, png, application/gzip
    Dataset updated
    Jul 19, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Johannes Heisig; Tomislav Hengl
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data set is a harmonized collection of existing data from GBIF, the EU-Forest project and the LUCAS survey. It has about 3 million observations and is supplemented by variables (e.g. location accuracy, land cover type, canopy height, etc.) which enable precise filtering for specific user applications.

    The RDS file is created from an sf-object and suitable for fast reading in the R-programming environment. The CSV.GZ file contains records as a table with Easting and Northing in Coordinate Reference System ETRS89 / LAEA Europe (= EPSG code 3035) and can be fed in a GIS after being unzipped.

    The code producing this data set is publicly available on GitLab.

    Variables:

    • id = unique point identifier
    • easting = x coordinate
    • northing = y coordinate
    • country = ISO country code
    • species = Latin species name
    • genus = genus name
    • scientific_name = long species name
    • gbif_taxon_key = taxon key from GBIF
    • gbif_genus_key = genus key from GBIF
    • taxon_rank = species or genus
    • year = year of observation
    • accessed_through = database through which data was accessed (GBIF, LUCAS, EU-Forest)
    • dataset_info = data set name (individual sub-data-set)
    • citation = DOI citation of the individual data set
    • license = distribution license
    • location_accuracy = spatial accuracy of observation (meters)
    • flag_location_issue = known location issues present
    • flag_date_issue = known date issues present
    • eoo = Extent of occurrence (applying the concept of natural geographical range used for the EU-Forest data set (Mauri et al., 2017) to all other data points; 1 = point inside species range; 0 = point outside; NA = EOO polygon not available for this species)
    • dbh = Diameter Breast Height (only recorded for observations from the EU-Forest data set (Mauri et al., 2017))
    • lc1 = LUCAS land cover type 1 (only recorded for observations from LUCAS data)
    • lc2 = LUCAS land cover type 2 (only recorded for observations from LUCAS data)
    • landmask_country = land mask overlay 30 meters (NA = not on land)
    • corine = CORINE 2018 land cover type (extracted from the 100 meter raster data set)
    • nightlights = light pollution observed by VIIRS (proxy for remoteness / distance to human structures)
    • canopy_height = canopy height derived from GEDI waveform LiDAR point data
    • natura_2000 = Natura 2000 site code (if a point falls inside a protected area (GIS-layer) this variable contains the site identification code; all sites can be explored on an interactive map)
    • freq_location = number of points with identical location (in some cases one location has multiple observations, differing in species and/or year; this may lead to difficulties in certain modeling tasks)
    • geometry = point geometry in ETRS89 / LAEA Europe

    See this detailed documentation for more insights into each variable.

    If you would like to know more about the creation of this data set, see

    1. the R-Markdown documenting the process (GitLab repository)
    2. the talk at OpenGeoHub Summer School 2020 (Youtube)

    Some advice: This data set is a puzzle with pieces from many different sources. Take some time to explore before including it in your work. Use summary statistics to see which variables have NAs and how many. Choose your filtering criteria wisely. For example, some points with the highest location accuracy have no record for the year of observation. You would exclude these if "year > 1990" were your criterion.
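
    The filtering pitfall described above can be sketched in plain Python. The rows below are made up and only borrow the variable names 'year' and 'location_accuracy' from the list of variables; the two helper functions are hypothetical and simply contrast a strict year filter with one that keeps undated points.

```python
def filter_strict(records, min_year=1990):
    """Keep only records whose year is known and later than min_year."""
    return [r for r in records if r["year"] is not None and r["year"] > min_year]


def filter_keep_unknown(records, min_year=1990):
    """Keep records later than min_year, plus those with no recorded year."""
    return [r for r in records if r["year"] is None or r["year"] > min_year]


# Made-up rows; None stands in for NA.
points = [
    {"id": 1, "year": 2005, "location_accuracy": 50},
    {"id": 2, "year": 1985, "location_accuracy": 100},
    {"id": 3, "year": None, "location_accuracy": 1},  # accurate, but undated
    {"id": 4, "year": None, "location_accuracy": 5},
]

# Summary statistic first: how many records lack a year?
missing_year = sum(1 for r in points if r["year"] is None)
strict_ids = [r["id"] for r in filter_strict(points)]
lenient_ids = [r["id"] for r in filter_keep_unknown(points)]
print(missing_year, strict_ids, lenient_ids)
```

    Whether undated points should be kept depends on the application; the point of the sketch is only that the choice should be made explicitly rather than as a side effect of the comparison.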

    This work has received funding from the European Union's Innovation and Networks Executive Agency (INEA) under Grant Agreement Connecting Europe Facility (CEF) Telecom project 2018-EU-IA-0095 (https://ec.europa.eu/inea/en/connecting-europe-facility/cef-telecom/2018-eu-ia-0095).

  7. European Commission - Directorate-General for International Partnerships -...

    • iatiregistry.org
    iati-xml
    Updated Nov 6, 2025
    Cite
    European Commission - International Partnerships (2025). European Commission - Directorate-General for International Partnerships - Activity File - Somalia [Dataset]. https://iatiregistry.org/dataset/ec-intpa-so
    Explore at:
    Available download formats: iati-xml (8574946)
    Dataset updated
    Nov 6, 2025
    Dataset provided by
    Directorate-General for International Partnerships
    European Commission (http://ec.europa.eu/)
    Description

    European Commission - Directorate-General for International Partnerships - Activity File - Somalia

  8. Copernicus Digital Elevation Model (DEM) for Europe at 100 meter resolution...

    • data.europa.eu
    • data.opendatascience.eu
    • +4more
    tiff
    Updated Feb 20, 2022
    + more versions
    Cite
    (2022). Copernicus Digital Elevation Model (DEM) for Europe at 100 meter resolution (EU-LAEA) derived from Copernicus Global 30 meter DEM dataset [Dataset]. https://data.europa.eu/88u/dataset/74d0e58f-9f51-444e-a5a7-eff4c20f05b1
    Explore at:
    Available download formats: tiff
    Dataset updated
    Feb 20, 2022
    Area covered
    Europe
    Description

    The Copernicus DEM is a Digital Surface Model (DSM) which represents the surface of the Earth including buildings, infrastructure and vegetation. The original GLO-30 provides worldwide coverage at 30 meters (corresponding to 1 arc second). Note that ocean areas do not have tiles; height values there can be assumed to be zero. Data is provided as Cloud Optimized GeoTIFFs. The vertical unit for elevation is meters.

    The Copernicus DEM for Europe at 100 meter resolution (EU-LAEA projection) in COG format has been derived from the Copernicus DEM GLO-30, mirrored on Open Data on AWS, dataset managed by Sinergise (https://registry.opendata.aws/copernicus-dem/).

    Processing steps: The original Copernicus GLO-30 DEM contains a considerable percentage of tiles with non-square pixels. We created a mosaic map in VRT format (https://gdal.org/drivers/raster/vrt.html) and defined within the VRT file the rule to apply cubic resampling while reading the data, i.e. when importing them into GRASS GIS for further processing. We chose cubic instead of bilinear resampling since the height-width ratio of non-square pixels is up to 1:5. Hence, artefacts between adjacent tiles in rugged terrain could be minimized:

        gdalbuildvrt -input_file_list list_geotiffs_MOOD.csv -r cubic -tr 0.000277777777777778 0.000277777777777778 Copernicus_DSM_30m_MOOD.vrt

    In order to reproject the data to EU-LAEA projection while reducing the spatial resolution to 100 m, bilinear resampling was performed in GRASS GIS (using r.proj) and the pixel values were scaled by 1000 (storing the pixels as Integer values) for data volume reduction. In addition, a hillshade raster map was derived from the resampled elevation map (using r.relief in GRASS GIS). Finally, we exported the elevation and hillshade raster maps in Cloud Optimized GeoTIFF (COG) format, along with SLD and QML style files.

  9. German-Portuguese website parallel corpus from the Federal Foreign Office...

    • data.europa.eu
    • live.european-language-grid.eu
    zip
    Updated Dec 21, 2017
    + more versions
    Cite
    Directorate-General for Communications Networks, Content and Technology (2017). German-Portuguese website parallel corpus from the Federal Foreign Office Berlin (Processed) [Dataset]. https://data.europa.eu/88u/dataset/elrc_640
    Explore at:
    Available download formats: zip
    Dataset updated
    Dec 21, 2017
    Dataset authored and provided by
    Directorate-General for Communications Networks, Content and Technology
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Germany
    Description

    German-Portuguese texts extracted from the website of the Federal Foreign Office Berlin. This includes 415 pairs that were translated between September 2013 and the beginning of December 2015 and converted into a .TMX file format. This version resulted from a correction/cleaning of the original tmx and stripping of the file

    This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) actions SMART 2014/1074 and SMART 2015/1091. For further information on the project: http://lr-coordination.eu.

  10. European Mask (MAPPE model)

    • kaggle.com
    zip
    Updated Apr 18, 2019
    Cite
    Joint Research Centre (2019). European Mask (MAPPE model) [Dataset]. https://www.kaggle.com/datasets/joint-research-centre/european-mask-mappe-model/code
    Explore at:
    Available download formats: zip (89131187 bytes)
    Dataset updated
    Apr 18, 2019
    Dataset authored and provided by
    Joint Research Centre (https://joint-research-centre.ec.europa.eu/index_en)
    Description

    Content

    More details about each file are in the individual file descriptions.

    Context

    This is a dataset from the Joint Research Centre hosted by the EU Open Data Portal. The Open Data Portal is found here and they update their information according to the frequency with which the data is collected. Explore Joint Research Centre data using Kaggle and all of the data sources available through the Joint Research Centre organization page!

    • Update Frequency: This dataset is updated daily.

    Acknowledgements

    This dataset is maintained using the EU ODP API and Kaggle's API.

    This dataset is distributed under the following licenses: Dataset License

    Cover photo by Mikayla Mallek on Unsplash
    Unsplash Images are distributed under a unique Unsplash License.

  11. European Commission - Service for Foreign Policy Instruments - Activity File...

    • iatiregistry.org
    iati-xml
    Updated Nov 6, 2025
    + more versions
    Cite
    European Commission - Service for Foreign Policy Instruments (2025). European Commission - Service for Foreign Policy Instruments - Activity File - Europe, regional [Dataset]. https://iatiregistry.org/dataset/ec-fpi-89
    Explore at:
    Available download formats: iati-xml (2919274)
    Dataset updated
    Nov 6, 2025
    Dataset provided by
    Service for Foreign Policy Instruments
    European Commission (http://ec.europa.eu/)
    Area covered
    Europe
    Description

    European Commission - Service for Foreign Policy Instruments - Activity File - Europe, regional

  12. Data from: DATABASE FOR THE ANALYSIS OF ROAD ACCIDENTS IN EUROPE

    • produccioncientifica.ugr.es
    • data.niaid.nih.gov
    • +1more
    Updated 2022
    Cite
    Navarro-Moreno, José; De Oña, Juan; Calvo-Poyo, Francisco (2022). DATABASE FOR THE ANALYSIS OF ROAD ACCIDENTS IN EUROPE [Dataset]. https://produccioncientifica.ugr.es/documentos/668fc484b9e7c03b01bdfcfc
    Explore at:
    Dataset updated
    2022
    Authors
    Navarro-Moreno, José; De Oña, Juan; Calvo-Poyo, Francisco
    Area covered
    Europe
    Description

    This database can be used for macro-level analysis of road accidents on interurban roads in Europe. Through the variables it contains, road accidents can be explained using variables related to economic resources invested in roads, traffic, road network, socioeconomic characteristics, legislative measures and meteorology. This repository contains the data used for the analysis carried out in the papers:

    1. Calvo-Poyo F., Navarro-Moreno J., de Oña J. (2020) Road Investment and Traffic Safety: An International Study. Sustainability 12:6332. https://doi.org/10.3390/su12166332
    2. Navarro-Moreno J., Calvo-Poyo F., de Oña J. (2022) Influence of road investment and maintenance expenses on injured traffic crashes in European roads. Int J Sustain Transp 1–11. https://doi.org/10.1080/15568318.2022.2082344
    3. Navarro-Moreno, J., Calvo-Poyo, F., de Oña, J. (2022) Investment in roads and traffic safety: linked to economic development? A European comparison. Environ. Sci. Pollut. Res. https://doi.org/10.1007/s11356-022-22567

    The file with the database is available in Excel.

    DATA SOURCES

    The database presents data from 1998 up to 2016 from 20 European countries: Austria, Belgium, Croatia, Czechia, Denmark, Estonia, Finland, France, Germany, Ireland, Italy, Latvia, Netherlands, Poland, Portugal, Slovakia, Slovenia, Spain, Sweden and United Kingdom. Crash data were obtained from the United Nations Economic Commission for Europe (UNECE) [2], which offers a sufficient level of disaggregation between crashes occurring inside versus outside built-up areas.

    For data on economic resources invested in roadways, the database of the Organisation for Economic Cooperation and Development (OECD), managed by the International Transport Forum (ITF) [1], deserves mention given its extensive coverage. It collects data on investment in the construction of roads and expenditure on their maintenance, following the definitions of the United Nations System of National Accounts (2008 SNA). Despite some data gaps, the time series are consistent from one country to the next. Moreover, to confirm the consistency and complete missing data, diverse additional sources, mainly the national Transport Ministries of the respective countries, were consulted. All the monetary values were converted to constant prices in 2015 using the OECD price index.

    To obtain the rest of the variables in the database, as well as to ensure consistency in the time series and complete missing data, the following national and international sources were consulted:

    • Eurostat [3]
    • Directorate-General for Mobility and Transport (DG MOVE). European Union [4]
    • The World Bank [5]
    • World Health Organization (WHO) [6]
    • European Transport Safety Council (ETSC) [7]
    • European Road Safety Observatory (ERSO) [8]
    • European Climatic Energy Mixes (ECEM) of the Copernicus Climate Change [9]
    • EU BestPoint-Project [10]
    • Ministerstvo dopravy, Czech Republic [11]
    • Bundesministerium für Verkehr und digitale Infrastruktur, Germany [12]
    • Ministerie van Infrastructuur en Waterstaat, Netherlands [13]
    • National Statistics Office, Malta [14]
    • Ministério da Economia e Transição Digital, Portugal [15]
    • Ministerio de Fomento, Spain [16]
    • Trafikverket, Sweden [17]
    • Ministère de l’environnement de l’énergie et de la mer, France [18]
    • Ministero delle Infrastrutture e dei Trasporti, Italy [19–25]
    • Statistisk sentralbyrå, Norway [26–29]
    • Instituto Nacional de Estatística, Portugal [30]
    • Infraestruturas de Portugal S.A., Portugal [31–35]
    • Road Safety Authority (RSA), Ireland [36]

    DATABASE DESCRIPTION

    The database was built to combine the longest possible time period with the maximum number of countries with a complete dataset (some countries, such as Lithuania, Luxembourg, Malta and Norway, were eliminated from the definitive dataset owing to a lack of data or breaks in the time series of records). Taking into account the above, the definitive database is made up of 19 variables and contains data from 20 countries during the period between 1998 and 2016. Table 1 shows the coding of the variables, as well as their definition and unit of measure.

    Table 1. Database metadata

    Code              Variable and unit
    fatal_pc_km       Fatalities per billion passenger-km
    fatal_mIn         Fatalities per million inhabitants
    accid_adj_pc_km   Accidents per billion passenger-km
    p_km              Billions of passenger-km
    croad_inv_km      Investment in road construction per kilometer, €/km (2015 constant prices)
    croad_maint_km    Expenditure on road maintenance per kilometer, €/km (2015 constant prices)
    prop_motorwa      Proportion of motorways over the total road network (%)
    populat           Population, in millions of inhabitants
    unemploy          Unemployment rate (%)
    petro_car         Consumption of gasoline and petrol derivatives (tons), per passenger car
    alcohol           Alcohol consumption, in liters per capita (age > 15)
    mot_index         Motorization index, in cars per 1,000 inhabitants
    den_populat       Population density, inhabitants/km2
    cgdp              Gross Domestic Product (GDP), in € (2015 constant prices)
    cgdp_cap          GDP per capita, in € (2015 constant prices)
    precipit          Average depth of rain water during a year (mm)
    prop_elder        Proportion of people over 65 years (%)
    dps               Demerit Point System, dummy variable (0: no; 1: yes)
    freight           Freight transport, in billions of ton-km

    ACKNOWLEDGEMENTS

    This database was produced in the framework of the project “Inversión en carreteras y seguridad vial: un análisis internacional (INCASE)”, financed by: FEDER/Ministerio de Ciencia, Innovación y Universidades–Agencia Estatal de Investigación/Proyecto RTI2018-101770-B-I00, within Spain's National Program of R+D+i Oriented to Societal Challenges. Moreover, the authors would like to express their gratitude to the Ministry of Transport, Mobility and Urban Agenda of Spain (MITMA), and the Federal Ministry of Transport and Digital Infrastructure of Germany (BMVI) for providing data for this study.

    REFERENCES

    1. International Transport Forum. OECD iLibrary | Transport infrastructure investment and maintenance.
    2. United Nations Economic Commission for Europe. UNECE Statistical Database. Available online: https://w3.unece.org/PXWeb2015/pxweb/en/STAT/STAT_40-TRTRANS/?rxid=18ad5d0d-bd5e-476f-ab7c-40545e802eeb (accessed on Apr 28, 2020).
    3. European Commission. Database - Eurostat. Available online: https://ec.europa.eu/eurostat/data/database (accessed on Apr 28, 2021).
    4. Directorate-General for Mobility and Transport. European Commission. EU Transport in figures - Statistical Pocketbooks. Available online: https://ec.europa.eu/transport/facts-fundings/statistics_en (accessed on Apr 28, 2021).
    5. World Bank Group. World Bank Open Data | Data. Available online: https://data.worldbank.org/ (accessed on Apr 30, 2021).
    6. World Health Organization (WHO). WHO Global Information System on Alcohol and Health. Available online: https://apps.who.int/gho/data/node.main.GISAH?lang=en (accessed on Apr 29, 2021).
    7. European Transport Safety Council (ETSC). Traffic Law Enforcement across the EU - Tackling the Three Main Killers on Europe’s Roads; Brussels, Belgium, 2011.
    8. Copernicus Climate Change Service. Climate data for the European energy sector from 1979 to 2016 derived from ERA-Interim. Available online: https://cds.climate.copernicus.eu/cdsapp#!/dataset/sis-european-energy-sector?tab=overview (accessed on Apr 29, 2021).
    9. Klipp, S.; Eichel, K.; Billard, A.; Chalika, E.; Loranc, M.D.; Farrugia, B.; Jost, G.; Møller, M.; Munnelly, M.; Kallberg, V.P.; et al. European Demerit Point Systems: Overview of their main features and expert opinions. EU BestPoint-Project 2011, 1–237.
    10. Ministerstvo dopravy. Serie: Ročenka dopravy; Centrum dopravního výzkumu: Prague, Czech Republic.
    11. Bundesministerium für Verkehr und digitale Infrastruktur. Verkehr in Zahlen 2003/2004; Hamburg, Germany, 2004; ISBN 3871542946.
    12. Bundesministerium für Verkehr und digitale Infrastruktur. Verkehr in Zahlen 2018/2019. In Verkehrsdynamik; Flensburg, Germany, 2018; ISBN 9783000612947.
    13. Ministerie van Infrastructuur en Waterstaat. Rijksjaarverslag 2018 a Infrastructuurfonds; The Hague, Netherlands, 2019; ISBN 0921-7371.
    14. Ministerie van Infrastructuur en Milieu. Rijksjaarverslag 2014 a Infrastructuurfonds; The Hague, Netherlands, 2015; ISBN 0921-7371.
    15. Ministério da Economia e Transição Digital. Base de Dados de Infraestruturas - GEE. Available online: https://www.gee.gov.pt/pt/publicacoes/indicadores-e-estatisticas/base-de-dados-de-infraestruturas (accessed on Apr 29, 2021).
    16. Ministerio de Fomento. Dirección General de Programación Económica y Presupuestos. Subdirección General de Estudios Económicos y Estadísticas. Serie: Anuario estadístico; NIPO 161-13-171-0; Centro de Publicaciones. Secretaría General Técnica. Ministerio de Fomento: Madrid, Spain.
    17. Trafikverket. The Swedish Transport Administration Annual report: 2017; 2018; ISBN 978-91-7725-272-6.
    18. Ministère de l’Équipement, du T. et de la M. Mémento de statistiques des transports 2003; Ministère de l’environnement de l’énergie et de la mer, 2005.
    19. Ministero delle Infrastrutture e dei Trasporti. Conto Nazionale delle Infrastrutture e dei Trasporti Anno 2000; Istituto Poligrafico e Zecca dello Stato: Roma, Italy, 2001.
    20. Ministero delle Infrastrutture e dei Trasporti. Conto nazionale dei trasporti 1999. 2000.
    21. Generale, D.; Informativi, S. delle Infrastrutture e dei Trasporti Anno 2004.
    22. Ministero delle Infrastrutture e dei Trasporti. Conto Nazionale delle Infrastrutture e dei Trasporti Anno 2001; 2002.
    23. Ministero delle Infrastrutture e dei

  13. Housing cost overburden rate by degree of urbanisation

    • ec.europa.eu
    Updated Oct 10, 2025
    + more versions
    Cite
    Eurostat (2025). Housing cost overburden rate by degree of urbanisation [Dataset]. http://doi.org/10.2908/ILC_LVHO07D
    Explore at:
    tsv, application/vnd.sdmx.data+csv;version=1.0.0, application/vnd.sdmx.data+csv;version=2.0.0, json, application/vnd.sdmx.data+xml;version=3.0.0, application/vnd.sdmx.genericdata+xml;version=2.1 (available download formats)
    Dataset updated
    Oct 10, 2025
    Dataset authored and provided by
    Eurostat (https://ec.europa.eu/eurostat)
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2003 - 2024
    Area covered
    Malta, Portugal, European Union - 27 countries (2007-2013), Croatia, Greece, Albania, Luxembourg, European Union - 27 countries (from 2020), Lithuania, Finland
    Description

    The European Union Statistics on Income and Living Conditions (EU-SILC) collects timely and comparable multidimensional microdata on income, poverty, social exclusion and living conditions.

    The EU-SILC collection is a key instrument for providing information required by the European Semester ([1]) and the European Pillar of Social Rights, and the main source of data for microsimulation purposes and flash estimates of income distribution and poverty rates.

    The at-risk-of-poverty-or-social-exclusion (AROPE) indicator remains crucial for monitoring European social policies, especially the EU 2030 target on poverty and social exclusion. For more information, please consult EU social indicators.

    The EU-SILC instrument provides two types of data:

    • Cross-sectional data pertaining to a given time or a certain time period with variables on income, poverty, social exclusion and other living conditions.
    • Longitudinal data pertaining to individual-level changes over time, observed periodically over a rotation scheme of four or more years (Annex III (2) of Regulation (EU) 2019/1700).

    EU-SILC collects:

    • annual variables,
    • three-yearly modules,
    • six-yearly modules,
    • ad-hoc new policy needs modules,
    • optional variables.

    The variables collected are grouped by topic and detailed topic and transmitted to Eurostat in four main files (D-File, H-File, R-File and P-file).

    The domain ‘Income and Living Conditions’ covers the following topics: persons at risk of poverty or social exclusion, income inequality, income distribution and monetary poverty, living conditions, material deprivation, and EU-SILC ad-hoc modules, which are structured into collections of indicators on specific topics.

    In 2023, in addition to the annual data, EU-SILC collected the three-yearly module on labour market and housing; the six-yearly module on intergenerational transmission of advantages and disadvantages and on housing difficulties; and the ad hoc subject on household energy efficiency.

    From 2021 onwards, the EU quality reports follow the Single Integrated Metadata Structure (SIMS).

    ([1]) The European Semester is the European Union’s framework for the coordination and surveillance of economic and social policies.

  14. Inability to keep home adequately warm

    • ec.europa.eu
    Updated Oct 10, 2025
    + more versions
    Cite
    Eurostat (2025). Inability to keep home adequately warm [Dataset]. http://doi.org/10.2908/ILC_MDES01
    Explore at:
    application/vnd.sdmx.data+csv;version=2.0.0, json, application/vnd.sdmx.data+csv;version=1.0.0, tsv, application/vnd.sdmx.data+xml;version=3.0.0, application/vnd.sdmx.genericdata+xml;version=2.1 (available download formats)
    Dataset updated
    Oct 10, 2025
    Dataset authored and provided by
    Eurostat (https://ec.europa.eu/eurostat)
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2003 - 2024
    Area covered
    EA13-2007, Euro area (EA11-1999, EA20-2023), EA15-2008, EA16-2009, EA19-2015, EA12-2001, EA17-2011, EA18-2014, Slovenia, United Kingdom, Euro area – 20 countries (from 2023), Finland, European Union - 28 countries (2013-2020), Lithuania, Estonia, Serbia, Kosovo*
    Description

    The European Union Statistics on Income and Living Conditions (EU-SILC) collects timely and comparable multidimensional microdata on income, poverty, social exclusion and living conditions.

    The EU-SILC collection is a key instrument for providing information required by the European Semester ([1]) and the European Pillar of Social Rights, and the main source of data for microsimulation purposes and flash estimates of income distribution and poverty rates.

    The at-risk-of-poverty-or-social-exclusion (AROPE) indicator remains crucial for monitoring European social policies, especially the EU 2030 target on poverty and social exclusion. For more information, please consult EU social indicators.

    The EU-SILC instrument provides two types of data:

    • Cross-sectional data pertaining to a given time or a certain time period with variables on income, poverty, social exclusion and other living conditions.
    • Longitudinal data pertaining to individual-level changes over time, observed periodically over a rotation scheme of four or more years (Annex III (2) of Regulation (EU) 2019/1700).

    EU-SILC collects:

    • annual variables,
    • three-yearly modules,
    • six-yearly modules,
    • ad-hoc new policy needs modules,
    • optional variables.

    The variables collected are grouped by topic and detailed topic and transmitted to Eurostat in four main files (D-File, H-File, R-File and P-file).

    The domain ‘Income and Living Conditions’ covers the following topics: persons at risk of poverty or social exclusion, income inequality, income distribution and monetary poverty, living conditions, material deprivation, and EU-SILC ad-hoc modules, which are structured into collections of indicators on specific topics.

    In 2023, in addition to the annual data, EU-SILC collected the three-yearly module on labour market and housing; the six-yearly module on intergenerational transmission of advantages and disadvantages and on housing difficulties; and the ad hoc subject on household energy efficiency.

    From 2021 onwards, the EU quality reports follow the Single Integrated Metadata Structure (SIMS).

    ([1]) The European Semester is the European Union’s framework for the coordination and surveillance of economic and social policies.

  15. Distribution of households by household type from 2003 onwards

    • ec.europa.eu
    Updated Oct 10, 2025
    + more versions
    Cite
    Eurostat (2025). Distribution of households by household type from 2003 onwards [Dataset]. http://doi.org/10.2908/ILC_LVPH02
    Explore at:
    application/vnd.sdmx.data+csv;version=1.0.0, tsv, application/vnd.sdmx.genericdata+xml;version=2.1, application/vnd.sdmx.data+csv;version=2.0.0, json, application/vnd.sdmx.data+xml;version=3.0.0 (available download formats)
    Dataset updated
    Oct 10, 2025
    Dataset authored and provided by
    Eurostat (https://ec.europa.eu/eurostat)
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2003 - 2024
    Area covered
    Germany, Hungary, Kosovo*, Sweden, Estonia, North Macedonia, Greece, Norway, Spain, European Union
    Description

    The European Union Statistics on Income and Living Conditions (EU-SILC) collects timely and comparable multidimensional microdata on income, poverty, social exclusion and living conditions.

    The EU-SILC collection is a key instrument for providing information required by the European Semester ([1]) and the European Pillar of Social Rights, and the main source of data for microsimulation purposes and flash estimates of income distribution and poverty rates.

    The at-risk-of-poverty-or-social-exclusion (AROPE) indicator remains crucial for monitoring European social policies, especially the EU 2030 target on poverty and social exclusion. For more information, please consult EU social indicators.

    The EU-SILC instrument provides two types of data:

    • Cross-sectional data pertaining to a given time or a certain time period with variables on income, poverty, social exclusion and other living conditions.
    • Longitudinal data pertaining to individual-level changes over time, observed periodically over a rotation scheme of four or more years (Annex III (2) of Regulation (EU) 2019/1700).

    EU-SILC collects:

    • annual variables,
    • three-yearly modules,
    • six-yearly modules,
    • ad-hoc new policy needs modules,
    • optional variables.

    The variables collected are grouped by topic and detailed topic and transmitted to Eurostat in four main files (D-File, H-File, R-File and P-file).

    The domain ‘Income and Living Conditions’ covers the following topics: persons at risk of poverty or social exclusion, income inequality, income distribution and monetary poverty, living conditions, material deprivation, and EU-SILC ad-hoc modules, which are structured into collections of indicators on specific topics.

    In 2023, in addition to the annual data, EU-SILC collected the three-yearly module on labour market and housing; the six-yearly module on intergenerational transmission of advantages and disadvantages and on housing difficulties; and the ad hoc subject on household energy efficiency.

    From 2021 onwards, the EU quality reports follow the Single Integrated Metadata Structure (SIMS).

    ([1]) The European Semester is the European Union’s framework for the coordination and surveillance of economic and social policies.

  16. EU Parliament Eurobarometer Survey

    • kaggle.com
    zip
    Updated Dec 18, 2019
    + more versions
    Cite
    European Parliament (2019). EU Parliament Eurobarometer Survey [Dataset]. https://www.kaggle.com/eu-parliament/eu-parliament-eurobarometer-survey
    Explore at:
    zip (818866397 bytes) (available download formats)
    Dataset updated
    Dec 18, 2019
    Dataset authored and provided by
    European Parliament (http://europarl.europa.eu/)
    Description

    Content

    More details about each file are in the individual file descriptions.

    Context

    This is a dataset from the European Parliament hosted by the EU Open Data Portal. The Open Data Portal updates its information according to the amount of data that is brought in. Explore European Parliament data using Kaggle and all of the data sources available through the European Parliament organization page!

    • Update Frequency: This dataset is updated daily.

    Acknowledgements

    This dataset is maintained using the EU ODP API and Kaggle's API.

    This dataset is distributed under the following licenses: Legal Notice

    Cover photo by Frederic Köberl on Unsplash
    Unsplash Images are distributed under a unique Unsplash License.

  17. Overall life satisfaction by level of satisfaction, age and educational...

    • ec.europa.eu
    Updated Oct 10, 2025
    + more versions
    Cite
    Eurostat (2025). Overall life satisfaction by level of satisfaction, age and educational attainment [Dataset]. http://doi.org/10.2908/ILC_PW05
    Explore at:
    application/vnd.sdmx.data+csv;version=1.0.0, json, tsv, application/vnd.sdmx.genericdata+xml;version=2.1, application/vnd.sdmx.data+csv;version=2.0.0, application/vnd.sdmx.data+xml;version=3.0.0 (available download formats)
    Dataset updated
    Oct 10, 2025
    Dataset authored and provided by
    Eurostat (https://ec.europa.eu/eurostat)
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2013 - 2024
    Area covered
    Netherlands, Slovenia, Croatia, Montenegro, Finland, Lithuania, United Kingdom, Euro area – 20 countries (from 2023), Estonia, European Union
    Description

    The European Union Statistics on Income and Living Conditions (EU-SILC) collects timely and comparable multidimensional microdata on income, poverty, social exclusion and living conditions.

    The EU-SILC collection is a key instrument for providing information required by the European Semester ([1]) and the European Pillar of Social Rights, and the main source of data for microsimulation purposes and flash estimates of income distribution and poverty rates.

    The at-risk-of-poverty-or-social-exclusion (AROPE) indicator remains crucial for monitoring European social policies, especially the EU 2030 target on poverty and social exclusion. For more information, please consult EU social indicators.

    The EU-SILC instrument provides two types of data:

    • Cross-sectional data pertaining to a given time or a certain time period with variables on income, poverty, social exclusion and other living conditions.
    • Longitudinal data pertaining to individual-level changes over time, observed periodically over a rotation scheme of four or more years (Annex III (2) of Regulation (EU) 2019/1700).

    EU-SILC collects:

    • annual variables,
    • three-yearly modules,
    • six-yearly modules,
    • ad-hoc new policy needs modules,
    • optional variables.

    The variables collected are grouped by topic and detailed topic and transmitted to Eurostat in four main files (D-File, H-File, R-File and P-file).

    The domain ‘Income and Living Conditions’ covers the following topics: persons at risk of poverty or social exclusion, income inequality, income distribution and monetary poverty, living conditions, material deprivation, and EU-SILC ad-hoc modules, which are structured into collections of indicators on specific topics.

    In 2023, in addition to the annual data, EU-SILC collected the three-yearly module on labour market and housing; the six-yearly module on intergenerational transmission of advantages and disadvantages and on housing difficulties; and the ad hoc subject on household energy efficiency.

    From 2021 onwards, the EU quality reports follow the Single Integrated Metadata Structure (SIMS).

    ([1]) The European Semester is the European Union’s framework for the coordination and surveillance of economic and social policies.

  18. European Commission - Directorate-General for International Partnerships -...

    • iatiregistry.org
    iati-xml
    Updated Nov 20, 2025
    Cite
    European Commission - International Partnerships (2025). European Commission - Directorate-General for International Partnerships - Activity File - Argentina [Dataset]. https://iatiregistry.org/dataset/ec-intpa-ar
    Explore at:
    iati-xml (2251916) (available download formats)
    Dataset updated
    Nov 20, 2025
    Dataset provided by
    Directorate-General for International Partnerships
    European Commission (http://ec.europa.eu/)
    Description

    European Commission - Directorate-General for International Partnerships - Activity File - Argentina

  19. Geographic Information System of the European Commission (GISCO) - full...

    • sdi.eea.europa.eu
    www:url
    Updated May 23, 2018
    + more versions
    Cite
    (2018). Geographic Information System of the European Commission (GISCO) - full database, Jul. 2018 [Dataset]. https://sdi.eea.europa.eu/catalogue/srv/api/records/799f353c-d074-47c3-9783-7e246c036a1b
    Explore at:
    www:url (available download formats)
    Dataset updated
    May 23, 2018
    Time period covered
    Jan 1, 2010 - Dec 31, 2010
    Area covered
    Description

    GISCO (Geographic Information System of the COmmission) is responsible for meeting the European Commission's geographical information needs at three levels: the European Union, its member countries, and its regions.

    In addition to creating statistical and other thematic maps, GISCO manages a database of geographical information, and provides related services to the Commission. Its database contains core geographical data covering the whole of Europe, such as administrative boundaries, and thematic geospatial information, such as population grid data. Some data are available for download by the general public and may be used for non-commercial purposes. For further details and information about any forthcoming new or updated datasets, see http://ec.europa.eu/eurostat/web/gisco/geodata.

    This metadata refers to the whole content of GISCO reference database extracted in July 2018, which contains both public datasets and datasets to be used only internally by the EEA. The document GISCO-ConditionsOfUse.pdf provided with the dataset gives information on the copyrighted data sources, the mandatory acknowledgement clauses and re-dissemination rights. The license conditions for EuroGeographic datasets in GISCO are provided in a standalone document "LicenseConditions_EuroGeographics.pdf".

    The database is provided in GDB and in SQLITE, with datasets at scales from 1:60M to 1:100K, with reference years spanning until 2016. The database manual, a file with the content of the database, and a document with the naming conventions are also provided with the database. For particular datasets extracted from this database (NUTS 2016 and COUNTRIES 2016) please refer to the associated resources in the EEA SDI catalogue.

    NOTE: This metadata file is only for internal EEA purposes and in no case replaces the official metadata provided by Eurostat.

  20. CORDIS - EU funded projects under FP1 (1984–1987)

    • data.europa.eu
    csv, excel xlsx, json +1
    Updated Dec 3, 2021
    Cite
    Publications Office of the European Union (2021). CORDIS - EU funded projects under FP1 (1984–1987) [Dataset]. https://data.europa.eu/88u/dataset/fp1-cordis
    Explore at:
    json, xml, csv, excel xlsx (available download formats)
    Dataset updated
    Dec 3, 2021
    Dataset provided by
    Publications Office of the European Union (http://op.europa.eu/)
    European Union
    Authors
    Publications Office of the European Union
    License

    http://data.europa.eu/eli/dec/2011/833/oj

    Description

    This dataset contains projects funded by the European Union under the first framework programme for research and technological development (FP1) from 1984 to 1987.

    The file 'FP1 Projects' contains the public grant information for each project: Record Control Number (RCN), project ID (grant agreement number), project acronym, project status, funding programme, topic, project title, project start date, project end date, project objective, project total cost, EC max contribution (commitment), call ID, funding scheme (type of action), coordinator, coordinator country, participants (as a semicolon-separated list), participant countries (as a semicolon-separated list).

    The participating organisations are listed in the file 'FP1 Organisations' which includes: project Record Control Number (RCN), project ID, project acronym, organisation role, organisation ID, organisation name, organisation short name, organisation type, participation ended (true/false), EC contribution, organisation country.
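
    As a minimal sketch of how the semicolon-separated participant fields in 'FP1 Projects' might be handled, the example below counts projects per country using only the Python standard library. The column names (id, acronym, coordinatorCountry, participantCountries), the comma delimiter and all sample values are assumptions made for illustration; the real file headers should be checked before reuse.

```python
import csv
import io

# Hypothetical sample: column names and values are invented for
# illustration and are NOT taken from the real 'FP1 Projects' file.
SAMPLE = """id,acronym,coordinatorCountry,participantCountries
1001,ALPHA,DE,FR;IT
1002,BETA,FR,
1003,GAMMA,IT,DE;FR;DE
"""


def split_list(field):
    """Split a semicolon-separated field, dropping empty entries."""
    return [part.strip() for part in field.split(";") if part.strip()]


def projects_per_country(handle):
    """Count projects per country, as coordinator or participant."""
    counts = {}
    for row in csv.DictReader(handle):
        # A set ensures each country is counted once per project,
        # even if it appears several times in the participant list.
        countries = {row["coordinatorCountry"]}
        countries.update(split_list(row["participantCountries"]))
        for country in countries:
            counts[country] = counts.get(country, 0) + 1
    return counts


counts = projects_per_country(io.StringIO(SAMPLE))
print(counts)
```

    Collecting the countries of each project in a set before counting avoids double counting when one country contributes several participants to the same project.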

    The dataset has been updated to match the structure of more recent datasets, so some fields may not be populated.

    Reference data (countries, funding schemes/types of action, subjects (SIC codes)) can be found in this dataset: https://data.europa.eu/euodp/en/data/dataset/cordisref-data

Cite
Beshoy Hakeem (2025). EurLex DataSet [Dataset]. https://www.kaggle.com/datasets/puskas78/eurlex-dataset
EurLex DataSet

Explore at:
zip (992021193 bytes) (available download formats)
Dataset updated
May 23, 2025
Authors
Beshoy Hakeem
Description

The CEPS EurLex dataset

The dataset contains 142.036 EU laws, almost the entire corpus of the EU's digitally available legal acts passed between 1952 and 2019. It encompasses the three types of legally binding acts passed by the EU institutions: 102.304 regulations, 4.070 directives and 35.798 decisions, all in English. The dataset was scraped from the official EU legal database (Eur-lex.eu) and transformed into machine-readable CSV format with the programming languages R and Python. The dataset was collected by the Centre for European Policy Studies (CEPS) for the TRIGGER project (https://trigger-project.eu/). We hope that it will facilitate future quantitative and computational research on the EU.

Brief description:

• The dataset is organised in tabular format, with each law representing one row and the columns representing 23 variables.
• The full text of 134.633 laws is included (column "act_raw_text"). For newer laws, the text was scraped from Eur-lex.eu via the HTML pages, while for older laws, the text was extracted from (scanned) PDF documents (if available in English).
• 22 additional variables are included, such as 'Act_name', 'Act_type', 'Subject_matter', 'Authors', 'Date_document', 'ELI_link', 'CELEX' (a unique identifier for every law). Please see the "CEPS_EurLex_codebook.pdf" file for an explanation of all variables.
• Given its size, the dataset was uploaded in different batches to facilitate usage. Some Excel files are provided for non-technical users. We recommend, however, the use of the CSV files, since Excel does not save large amounts of data properly. EurLex_all.csv is the master file containing all data.

Caveats:

• The Eur-lex.eu website does not consistently provide data for all the variables. In addition, the HTML documents were not always cleanly formatted and text extraction from scanned PDFs is not entirely clean. Some data points are therefore missing for some laws and some laws were excluded entirely.
• Not all (older) laws were available in English, especially since Ireland and the UK only joined the European Communities in 1973. Non-English laws are excluded from the dataset.
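
Given the tabular layout and the missing-data caveats described above, a first pass over the CSV might look like the sketch below. It uses only the Python standard library and an in-memory sample. The column names CELEX, Act_type and act_raw_text come from the description, while the sample rows, the CELEX values and the Act_type spellings are hypothetical.

```python
import csv
import io

# Hypothetical three-row sample that mimics the described layout.
# Only the column names come from the dataset description; every
# value below is invented for illustration.
SAMPLE = """CELEX,Act_type,act_raw_text
31990R0001,Regulation,Text of a regulation
31991L0002,Directive,
31992D0003,Decision,Text of a decision
"""


def summarise(handle):
    """Count laws per act type and list laws without full text."""
    per_type = {}
    missing_text = []
    for row in csv.DictReader(handle):
        act_type = row["Act_type"]
        per_type[act_type] = per_type.get(act_type, 0) + 1
        # Some data points are missing for some laws, so an empty or
        # whitespace-only text field is treated as "no full text".
        if not (row.get("act_raw_text") or "").strip():
            missing_text.append(row["CELEX"])
    return per_type, missing_text


per_type, missing_text = summarise(io.StringIO(SAMPLE))
print(per_type)
print(missing_text)
```

To try this on the real data, the in-memory sample could be replaced with a handle from open("EurLex_all.csv", newline="", encoding="utf-8"). The encoding is an assumption, and because the full-text column can be very long, csv.field_size_limit may need to be raised first.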

Other:

• For details on the types of EU legal acts: https://ec.europa.eu/info/law/law-making-process/types-eu-law_en
• An example for an experimental analysis with this dataset: https://trigger-project.eu/2019/10/28/a-data-science-approach-to-eu-differentiated-integration/
• The TRIGGER project is funded by the EU's Horizon 2020 programme, grant number 822735 (2020-02-16)
