15 datasets found
  1. Large Scale International Boundaries

    • catalog.data.gov
    • geodata.state.gov
    Updated May 23, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    U.S. Department of State (Point of Contact) (2025). Large Scale International Boundaries [Dataset]. https://catalog.data.gov/dataset/large-scale-international-boundaries
    Explore at:
    Dataset updated
    May 23, 2025
    Dataset provided by
    United States Department of Statehttp://state.gov/
    Description

    Overview The Office of the Geographer and Global Issues at the U.S. Department of State produces the Large Scale International Boundaries (LSIB) dataset. The current edition is version 11.4 (published 24 February 2025). The 11.4 release contains updated boundary lines and data refinements designed to extend the functionality of the dataset. These data and generalized derivatives are the only international boundary lines approved for U.S. Government use. The contents of this dataset reflect U.S. Government policy on international boundary alignment, political recognition, and dispute status. They do not necessarily reflect de facto limits of control. National Geospatial Data Asset This dataset is a National Geospatial Data Asset (NGDAID 194) managed by the Department of State. It is a part of the International Boundaries Theme created by the Federal Geographic Data Committee. Dataset Source Details Sources for these data include treaties, relevant maps, and data from boundary commissions, as well as national mapping agencies. Where available and applicable, the dataset incorporates information from courts, tribunals, and international arbitrations. The research and recovery process includes analysis of satellite imagery and elevation data. Due to the limitations of source materials and processing techniques, most lines are within 100 meters of their true position on the ground. Cartographic Visualization The LSIB is a geospatial dataset that, when used for cartographic purposes, requires additional styling. The LSIB download package contains example style files for commonly used software applications. The attribute table also contains embedded information to guide the cartographic representation. Additional discussion of these considerations can be found in the Use of Core Attributes in Cartographic Visualization section below. Additional cartographic information pertaining to the depiction and description of international boundaries or areas of special sovereignty can be found in Guidance Bulletins published by the Office of the Geographer and Global Issues: https://data.geodata.state.gov/guidance/index.html Contact Direct inquiries to internationalboundaries@state.gov. Direct download: https://data.geodata.state.gov/LSIB.zip Attribute Structure The dataset uses the following attributes divided into two categories: ATTRIBUTE NAME | ATTRIBUTE STATUS CC1 | Core CC1_GENC3 | Extension CC1_WPID | Extension COUNTRY1 | Core CC2 | Core CC2_GENC3 | Extension CC2_WPID | Extension COUNTRY2 | Core RANK | Core LABEL | Core STATUS | Core NOTES | Core LSIB_ID | Extension ANTECIDS | Extension PREVIDS | Extension PARENTID | Extension PARENTSEG | Extension These attributes have external data sources that update separately from the LSIB: ATTRIBUTE NAME | ATTRIBUTE STATUS CC1 | GENC CC1_GENC3 | GENC CC1_WPID | World Polygons COUNTRY1 | DoS Lists CC2 | GENC CC2_GENC3 | GENC CC2_WPID | World Polygons COUNTRY2 | DoS Lists LSIB_ID | BASE ANTECIDS | BASE PREVIDS | BASE PARENTID | BASE PARENTSEG | BASE The core attributes listed above describe the boundary lines contained within the LSIB dataset. Removal of core attributes from the dataset will change the meaning of the lines. An attribute status of “Extension” represents a field containing data interoperability information. Other attributes not listed above include “FID”, “Shape_length” and “Shape.” These are components of the shapefile format and do not form an intrinsic part of the LSIB. Core Attributes The eight core attributes listed above contain unique information which, when combined with the line geometry, comprise the LSIB dataset. These Core Attributes are further divided into Country Code and Name Fields and Descriptive Fields. County Code and Country Name Fields “CC1” and “CC2” fields are machine readable fields that contain political entity codes. These are two-character codes derived from the Geopolitical Entities, Names, and Codes Standard (GENC), Edition 3 Update 18. “CC1_GENC3” and “CC2_GENC3” fields contain the corresponding three-character GENC codes and are extension attributes discussed below. The codes “Q2” or “QX2” denote a line in the LSIB representing a boundary associated with areas not contained within the GENC standard. The “COUNTRY1” and “COUNTRY2” fields contain the names of corresponding political entities. These fields contain names approved by the U.S. Board on Geographic Names (BGN) as incorporated in the ‘"Independent States in the World" and "Dependencies and Areas of Special Sovereignty" lists maintained by the Department of State. To ensure maximum compatibility, names are presented without diacritics and certain names are rendered using common cartographic abbreviations. Names for lines associated with the code "Q2" are descriptive and not necessarily BGN-approved. Names rendered in all CAPITAL LETTERS denote independent states. Names rendered in normal text represent dependencies, areas of special sovereignty, or are otherwise presented for the convenience of the user. Descriptive Fields The following text fields are a part of the core attributes of the LSIB dataset and do not update from external sources. They provide additional information about each of the lines and are as follows: ATTRIBUTE NAME | CONTAINS NULLS RANK | No STATUS | No LABEL | Yes NOTES | Yes Neither the "RANK" nor "STATUS" fields contain null values; the "LABEL" and "NOTES" fields do. The "RANK" field is a numeric expression of the "STATUS" field. Combined with the line geometry, these fields encode the views of the United States Government on the political status of the boundary line. ATTRIBUTE NAME | | VALUE | RANK | 1 | 2 | 3 STATUS | International Boundary | Other Line of International Separation | Special Line A value of “1” in the “RANK” field corresponds to an "International Boundary" value in the “STATUS” field. Values of ”2” and “3” correspond to “Other Line of International Separation” and “Special Line,” respectively. The “LABEL” field contains required text to describe the line segment on all finished cartographic products, including but not limited to print and interactive maps. The “NOTES” field contains an explanation of special circumstances modifying the lines. This information can pertain to the origins of the boundary lines, limitations regarding the purpose of the lines, or the original source of the line. Use of Core Attributes in Cartographic Visualization Several of the Core Attributes provide information required for the proper cartographic representation of the LSIB dataset. The cartographic usage of the LSIB requires a visual differentiation between the three categories of boundary lines. Specifically, this differentiation must be between: International Boundaries (Rank 1); Other Lines of International Separation (Rank 2); and Special Lines (Rank 3). Rank 1 lines must be the most visually prominent. Rank 2 lines must be less visually prominent than Rank 1 lines. Rank 3 lines must be shown in a manner visually subordinate to Ranks 1 and 2. Where scale permits, Rank 2 and 3 lines must be labeled in accordance with the “Label” field. Data marked with a Rank 2 or 3 designation does not necessarily correspond to a disputed boundary. Please consult the style files in the download package for examples of this depiction. The requirement to incorporate the contents of the "LABEL" field on cartographic products is scale dependent. If a label is legible at the scale of a given static product, a proper use of this dataset would encourage the application of that label. Using the contents of the "COUNTRY1" and "COUNTRY2" fields in the generation of a line segment label is not required. The "STATUS" field contains the preferred description for the three LSIB line types when they are incorporated into a map legend but is otherwise not to be used for labeling. Use of the “CC1,” “CC1_GENC3,” “CC2,” “CC2_GENC3,” “RANK,” or “NOTES” fields for cartographic labeling purposes is prohibited. Extension Attributes Certain elements of the attributes within the LSIB dataset extend data functionality to make the data more interoperable or to provide clearer linkages to other datasets. The fields “CC1_GENC3” and “CC2_GENC” contain the corresponding three-character GENC code to the “CC1” and “CC2” attributes. The code “QX2” is the three-character counterpart of the code “Q2,” which denotes a line in the LSIB representing a boundary associated with a geographic area not contained within the GENC standard. To allow for linkage between individual lines in the LSIB and World Polygons dataset, the “CC1_WPID” and “CC2_WPID” fields contain a Universally Unique Identifier (UUID), version 4, which provides a stable description of each geographic entity in a boundary pair relationship. Each UUID corresponds to a geographic entity listed in the World Polygons dataset. These fields allow for linkage between individual lines in the LSIB and the overall World Polygons dataset. Five additional fields in the LSIB expand on the UUID concept and either describe features that have changed across space and time or indicate relationships between previous versions of the feature. The “LSIB_ID” attribute is a UUID value that defines a specific instance of a feature. Any change to the feature in a lineset requires a new “LSIB_ID.” The “ANTECIDS,” or antecedent ID, is a UUID that references line geometries from which a given line is descended in time. It is used when there is a feature that is entirely new, not when there is a new version of a previous feature. This is generally used to reference countries that have dissolved. The “PREVIDS,” or Previous ID, is a UUID field that contains old versions of a line. This is an additive field, that houses all Previous IDs. A new version of a feature is defined by any change to the

  2. o

    Country Codes

    • public.opendatasoft.com
    • data.smartidf.services
    • +6more
    csv, excel, geojson +1
    Updated Aug 25, 2015
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2015). Country Codes [Dataset]. https://public.opendatasoft.com/explore/dataset/countries-codes/
    Explore at:
    geojson, json, excel, csvAvailable download formats
    Dataset updated
    Aug 25, 2015
    License

    https://en.wikipedia.org/wiki/Public_domainhttps://en.wikipedia.org/wiki/Public_domain

    Description

    Country codes: ISO 2ISO 3UNLANGLABEL (EN, FR, SP)

  3. o

    Geonames - All Cities with a population > 1000

    • public.opendatasoft.com
    • data.smartidf.services
    • +2more
    csv, excel, geojson +1
    Updated Mar 10, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2024). Geonames - All Cities with a population > 1000 [Dataset]. https://public.opendatasoft.com/explore/dataset/geonames-all-cities-with-a-population-1000/
    Explore at:
    csv, json, geojson, excelAvailable download formats
    Dataset updated
    Mar 10, 2024
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    All cities with a population > 1000 or seats of adm div (ca 80.000)Sources and ContributionsSources : GeoNames is aggregating over hundred different data sources. Ambassadors : GeoNames Ambassadors help in many countries. Wiki : A wiki allows to view the data and quickly fix error and add missing places. Donations and Sponsoring : Costs for running GeoNames are covered by donations and sponsoring.Enrichment:add country name

  4. Data from: Caravan - A global community dataset for large-sample hydrology

    • biorxiv.org
    • explore.openaire.eu
    • +2more
    bin, png, zip
    Updated Jul 12, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Frederik Kratzert; Frederik Kratzert; Grey Nearing; Grey Nearing; Nans Addor; Nans Addor; Tyler Erickson; Martin Gauch; Martin Gauch; Oren Gilon; Lukas Gudmundsson; Lukas Gudmundsson; Avinatan Hassidim; Daniel Klotz; Daniel Klotz; Sella Nevo; Guy Shalev; Yossi Matias; Tyler Erickson; Oren Gilon; Avinatan Hassidim; Sella Nevo; Guy Shalev; Yossi Matias (2024). Caravan - A global community dataset for large-sample hydrology [Dataset]. http://doi.org/10.5281/zenodo.7540792
    Explore at:
    png, zip, binAvailable download formats
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Zenodohttp://zenodo.org/
    Authors
    Frederik Kratzert; Frederik Kratzert; Grey Nearing; Grey Nearing; Nans Addor; Nans Addor; Tyler Erickson; Martin Gauch; Martin Gauch; Oren Gilon; Lukas Gudmundsson; Lukas Gudmundsson; Avinatan Hassidim; Daniel Klotz; Daniel Klotz; Sella Nevo; Guy Shalev; Yossi Matias; Tyler Erickson; Oren Gilon; Avinatan Hassidim; Sella Nevo; Guy Shalev; Yossi Matias
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    THIS IS A PRE-RELEASE, WHILE THE CARAVAN IS UNDER REVISION.

    Check out the preprint at: https://eartharxiv.org/repository/view/3345/ (accepted for publication at Nature Scientific Data).

    Caravan is an open community dataset of meteorological forcing data, catchment attributes, and discharge daat for catchments around the world. Additionally, Caravan provides code to derive meteorological forcing data and catchment attributes from the same data sources in the cloud, making it easy for anyone to extend Caravan to new catchments. The vision of Caravan is to provide the foundation for a truly global open source community resource that will grow over time.

    Channel Log:

    • 23 May 2022: Version 0.2 - Resolved a bug when renaming the LamaH gauge ids from the LamaH ids to the official gauge ids provided as "govnr" in the LamaH dataset attribute files.
    • 24 May 2022: Version 0.3 - Fixed gaps in forcing data in some "camels" (US) basins.
    • 15 June 2022: Version 0.4 - Fixed replacing negative CAMELS US values with NaN (-999 in CAMELS indicates missing observation).
    • 1 December 2022: Version 0.4 - Added 4298 basins in the US, Canada and Mexico (part of HYSETS), now totalling to 6830 basins. Fixed a bug in the computation of catchment attributes that are defined as pour point properties, where sometimes the wrong HydroATLAS polygon was picked. Restructured the attribute files and added some more meta data (station name and country).
    • 16 January 2023: Version 1.0 - Version of the official paper release. No changes in the data but added a static copy of the accompanying code of the paper. For the most up to date version, please check https://github.com/kratzert/Caravan
  5. Caravan - A global community dataset for large-sample hydrology (csv...

    • zenodo.org
    application/gzip, zip
    Updated May 27, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Frederik Kratzert; Frederik Kratzert; Grey Nearing; Grey Nearing; Nans Addor; Nans Addor; Tyler Erickson; Martin Gauch; Martin Gauch; Oren Gilon; Lukas Gudmundsson; Lukas Gudmundsson; Avinatan Hassidim; Daniel Klotz; Daniel Klotz; Sella Nevo; Guy Shalev; Yossi Matias; Tyler Erickson; Oren Gilon; Avinatan Hassidim; Sella Nevo; Guy Shalev; Yossi Matias (2025). Caravan - A global community dataset for large-sample hydrology (csv version) [Dataset]. http://doi.org/10.5281/zenodo.15530022
    Explore at:
    zip, application/gzipAvailable download formats
    Dataset updated
    May 27, 2025
    Dataset provided by
    Zenodohttp://zenodo.org/
    Authors
    Frederik Kratzert; Frederik Kratzert; Grey Nearing; Grey Nearing; Nans Addor; Nans Addor; Tyler Erickson; Martin Gauch; Martin Gauch; Oren Gilon; Lukas Gudmundsson; Lukas Gudmundsson; Avinatan Hassidim; Daniel Klotz; Daniel Klotz; Sella Nevo; Guy Shalev; Yossi Matias; Tyler Erickson; Oren Gilon; Avinatan Hassidim; Sella Nevo; Guy Shalev; Yossi Matias
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is the accompanying dataset to the following paper https://www.nature.com/articles/s41597-023-01975-w

    Caravan is an open community dataset of meteorological forcing data, catchment attributes, and discharge daat for catchments around the world. Additionally, Caravan provides code to derive meteorological forcing data and catchment attributes from the same data sources in the cloud, making it easy for anyone to extend Caravan to new catchments. The vision of Caravan is to provide the foundation for a truly global open source community resource that will grow over time.

    If you use Caravan in your research, it would be appreciated to not only cite Caravan itself, but also the source datasets, to pay respect to the amount of work that was put into the creation of these datasets and that made Caravan possible in the first place.

    All current development and additional community extensions can be found at https://github.com/kratzert/Caravan

    IMPORTANT: Due to size limitations for individual repositories, the netCDF version and the CSV version of Caravan (since Version 1.6) are split into two different repositories. You can find the netCDF version at https://zenodo.org/records/14673536

    Channel Log:

    • 23 May 2022: Version 0.2 - Resolved a bug when renaming the LamaH gauge ids from the LamaH ids to the official gauge ids provided as "govnr" in the LamaH dataset attribute files.
    • 24 May 2022: Version 0.3 - Fixed gaps in forcing data in some "camels" (US) basins.
    • 15 June 2022: Version 0.4 - Fixed replacing negative CAMELS US values with NaN (-999 in CAMELS indicates missing observation).
    • 1 December 2022: Version 0.4 - Added 4298 basins in the US, Canada and Mexico (part of HYSETS), now totalling to 6830 basins. Fixed a bug in the computation of catchment attributes that are defined as pour point properties, where sometimes the wrong HydroATLAS polygon was picked. Restructured the attribute files and added some more meta data (station name and country).
    • 16 January 2023: Version 1.0 - Version of the official paper release. No changes in the data but added a static copy of the accompanying code of the paper. For the most up to date version, please check https://github.com/kratzert/Caravan
    • 10 May 2023: Version 1.1 - No data change, just update data description.
    • 17 May 2023: Version 1.2 - Updated a handful of attribute values that were affected by a bug in their derivation. See https://github.com/kratzert/Caravan/issues/22 for details.
    • 16 April 2024: Version 1.4 - Added 9130 gauges from the original source dataset that were initially not included because of the area thresholds (i.e. basins smaller than 100sqkm or larger than 2000sqkm). Also extended the forcing period for all gauges (including the original ones) to 1950-2023. Added two different download options that include timeseries data only as either csv files (Caravan-csv.tar.xz) or netcdf files (Caravan-nc.tar.xz). Including the large basins also required an update in the earth engine code
    • 16 Jan 2025: Version 1.5 - Added FAO Penman-Monteith PET (potential_evaporation_sum_FAO_PENMAN_MONTEITH) and renamed the ERA5-LAND potential_evaporation band to potential_evaporation_sum_ERA5_LAND. Also added all PET-related climated indices derived with the Penman-Monteith PET band (suffix "_FAO_PM") and renamed the old PET-related indices accordingly (suffix "_ERA5_LAND").
    • 27 May 2025: Version 1.6
      • Updated the CAMELS-AUS data to source from CAMELS-AUS v2. This means more basins (561 compared to 222) and more recent streamflow data (2022 compared to 2014). Note that the gauge id for four basins changed between the original CAMELS-AUS version and v2. Those gauges are ['camelsaus_224213A', 'camelsaus_224214A', 'camelsaus_227225A', 'camelsaus_403213A'] that all lost their trailing "A". To stay synced with CAMELS-AUS (v2), we also adapted the new naming.
      • Added VERSION file to the root directory that contains the current version number.
      • Updated the code to the most recent GitHub snapshot (commit 6eab036).
      • Due to the 50GB repository limit, we had to split the netCDF version and the CSV version into two separate repositories. The CSV version can be found under https://zenodo.org/records/15530021
  6. The big dataset of ultra-marathon running

    • kaggle.com
    Updated Jul 12, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    David (2023). The big dataset of ultra-marathon running [Dataset]. https://www.kaggle.com/datasets/aiaiaidavid/the-big-dataset-of-ultra-marathon-running
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jul 12, 2023
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    David
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

    According to the Wikipedia, an ultramarathon, also called ultra distance or ultra running, is any footrace longer than the traditional marathon length of 42.195 kilometres (26 mi 385 yd). Various distances are raced competitively, from the shortest common ultramarathon of 31 miles (50 km) to over 200 miles (320 km). 50k and 100k are both World Athletics record distances, but some 100 miles (160 km) races are among the oldest and most prestigious events, especially in North America.}

    The data in this file is a large collection of ultra-marathon race records registered between 1798 and 2022 (a period of well over two centuries) being therefore a formidable long term sample. All data was obtained from public websites.

    Despite the original data being of public domain, the race records, which originally contained the athlete´s names, have been anonymized to comply with data protection laws and to preserve the athlete´s privacy. However, a column Athlete ID has been created with a numerical ID representing each unique runner (so if Antonio Fernández participated in 5 races over different years, then the corresponding race records now hold his unique Athlete ID instead of his name). This way I have preserved valuable information.

    The dataset contains 7,461,226 ultra-marathon race records from 1,641,168 unique athletes.

    The following columns (with data types) are included:

    • Year of event (int64)
    • Event dates (object)
    • Event name (object)
    • Event distance/length (object)
    • Event number of finishers (int64)
    • Athlete performance (object)
    • Athlete club (object)
    • Athlete country (object)
    • Athlete year of birth (float64)
    • Athlete gender (object)
    • Athlete age category (object)
    • Athlete average speed (object)
    • Athlete ID (int64)

    The Event name column include country location information that can be derived to a new column, and similarly seasonal information can be found in the Event dates column beyond the Year of event (these can be extracted with a bit of processing).

    The Event distance/length column describes the type of race, covering the most popular UM race distances and lengths, and some other specific modalities (multi-day, etc.):

    • Distances: 50km, 100km, 50mi, 100mi
    • Lengths: 6h, 12h, 24h, 48h, 72h, 6d, 10d

    Additionally, there is information of age, gender and speed (in km/h) in other columns.

  7. Data from: Large Landing Trajectory Data Set for Go-Around Analysis

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip, bin +1
    Updated Dec 16, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Raphael Monstein; Raphael Monstein; Benoit Figuet; Benoit Figuet; Timothé Krauth; Timothé Krauth; Manuel Waltert; Manuel Waltert; Marcel Dettling; Marcel Dettling (2022). Large Landing Trajectory Data Set for Go-Around Analysis [Dataset]. http://doi.org/10.5281/zenodo.7148117
    Explore at:
    application/gzip, bin, zipAvailable download formats
    Dataset updated
    Dec 16, 2022
    Dataset provided by
    Zenodohttp://zenodo.org/
    Authors
    Raphael Monstein; Raphael Monstein; Benoit Figuet; Benoit Figuet; Timothé Krauth; Timothé Krauth; Manuel Waltert; Manuel Waltert; Marcel Dettling; Marcel Dettling
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Large go-around, also referred to as missed approach, data set. The data set is in support of the paper presented at the OpenSky Symposium on November the 10th.

    If you use this data for a scientific publication, please consider citing our paper.

    The data set contains landings from 176 (mostly) large airports from 44 different countries. The landings are labelled as performing a go-around (GA) or not. In total, the data set contains almost 9 million landings with more than 33000 GAs. The data was collected from OpenSky Network's historical data base for the year 2019. The published data set contains multiple files:

    go_arounds_minimal.csv.gz

    Compressed CSV containing the minimal data set. It contains a row for each landing and a minimal amount of information about the landing, and if it was a GA. The data is structured in the following way:

    Column nameTypeDescription
    timedate timeUTC time of landing or first GA attempt
    icao24stringUnique 24-bit (hexadecimal number) ICAO identifier of the aircraft concerned
    callsignstringAircraft identifier in air-ground communications
    airportstringICAO airport code where the aircraft is landing
    runwaystringRunway designator on which the aircraft landed
    has_gastring"True" if at least one GA was performed, otherwise "False"
    n_approachesintegerNumber of approaches identified for this flight
    n_rwy_approachedintegerNumber of unique runways approached by this flight

    The last two columns, n_approaches and n_rwy_approached, are useful to filter out training and calibration flight. These have usually a large number of n_approaches, so an easy way to exclude them is to filter by n_approaches > 2.

    go_arounds_augmented.csv.gz

    Compressed CSV containing the augmented data set. It contains a row for each landing and additional information about the landing, and if it was a GA. The data is structured in the following way:

    Column nameTypeDescription
    timedate timeUTC time of landing or first GA attempt
    icao24stringUnique 24-bit (hexadecimal number) ICAO identifier of the aircraft concerned
    callsignstringAircraft identifier in air-ground communications
    airportstringICAO airport code where the aircraft is landing
    runwaystringRunway designator on which the aircraft landed
    has_gastring"True" if at least one GA was performed, otherwise "False"
    n_approachesintegerNumber of approaches identified for this flight
    n_rwy_approachedintegerNumber of unique runways approached by this flight
    registrationstringAircraft registration
    typecodestringAircraft ICAO typecode
    icaoaircrafttypestringICAO aircraft type
    wtcstringICAO wake turbulence category
    glide_slope_anglefloatAngle of the ILS glide slope in degrees
    has_intersection

    string

    Boolean that is true if the runway has an other runway intersecting it, otherwise false
    rwy_lengthfloatLength of the runway in kilometre
    airport_countrystringISO Alpha-3 country code of the airport
    airport_regionstringGeographical region of the airport (either Europe, North America, South America, Asia, Africa, or Oceania)
    operator_countrystringISO Alpha-3 country code of the operator
    operator_regionstringGeographical region of the operator of the aircraft (either Europe, North America, South America, Asia, Africa, or Oceania)
    wind_speed_kntsintegerMETAR, surface wind speed in knots
    wind_dir_degintegerMETAR, surface wind direction in degrees
    wind_gust_kntsintegerMETAR, surface wind gust speed in knots
    visibility_mfloatMETAR, visibility in m
    temperature_degintegerMETAR, temperature in degrees Celsius
    press_sea_level_pfloatMETAR, sea level pressure in hPa
    press_pfloatMETAR, QNH in hPA
    weather_intensitylistMETAR, list of present weather codes: qualifier - intensity
    weather_precipitationlistMETAR, list of present weather codes: weather phenomena - precipitation
    weather_desclistMETAR, list of present weather codes: qualifier - descriptor
    weather_obscurationlistMETAR, list of present weather codes: weather phenomena - obscuration
    weather_otherlistMETAR, list of present weather codes: weather phenomena - other

    This data set is augmented with data from various public data sources. Aircraft related data is mostly from the OpenSky Network's aircraft data base, the METAR information is from the Iowa State University, and the rest is mostly scraped from different web sites. If you need help with the METAR information, you can consult the WMO's Aerodrom Reports and Forecasts handbook.

    go_arounds_agg.csv.gz

    Compressed CSV containing the aggregated data set. It contains a row for each airport-runway, i.e. every runway at every airport for which data is available. The data is structured in the following way:

    Column nameTypeDescription
    airportstringICAO airport code where the aircraft is landing
    runwaystringRunway designator on which the aircraft landed
    n_landingsintegerTotal number of landings observed on this runway in 2019
    ga_ratefloatGo-around rate, per 1000 landings
    glide_slope_anglefloatAngle of the ILS glide slope in degrees
    has_intersectionstringBoolean that is true if the runway has an other runway intersecting it, otherwise false
    rwy_lengthfloatLength of the runway in kilometres
    airport_countrystringISO Alpha-3 country code of the airport
    airport_regionstringGeographical region of the airport (either Europe, North America, South America, Asia, Africa, or Oceania)

    This aggregated data set is used in the paper for the generalized linear regression model.

    Downloading the trajectories

    Users of this data set with access to OpenSky Network's Impala shell can download the historical trajectories from the historical data base with a few lines of Python code. For example, you want to get all the go-arounds of the 4th of January 2019 at London City Airport (EGLC). You can use the Traffic library for easy access to the database:

    import datetime
    from tqdm.auto import tqdm
    import pandas as pd
    from traffic.data import opensky
    from traffic.core import Traffic

    load minimum data set

    df = pd.read_csv("go_arounds_minimal.csv.gz", low_memory=False) df["time"] = pd.to_datetime(df["time"])

    select London City Airport, go-arounds, and 2019-01-04

    airport = "EGLC" start = datetime.datetime(year=2019, month=1, day=4).replace( tzinfo=datetime.timezone.utc ) stop = datetime.datetime(year=2019, month=1, day=5).replace( tzinfo=datetime.timezone.utc )

    df_selection = df.query("airport==@airport & has_ga

  8. Dataset of 'Mapping 10-m Industrial Lands across 1000+ Global Large Cities,...

    • zenodo.org
    bin, zip
    Updated Feb 18, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Cheolhee Yoo; Cheolhee Yoo; Yuhan Zhou; Yuhan Zhou; Qihao Weng; Qihao Weng (2025). Dataset of 'Mapping 10-m Industrial Lands across 1000+ Global Large Cities, 2017-2023' [Dataset]. http://doi.org/10.5281/zenodo.14832219
    Explore at:
    zip, binAvailable download formats
    Dataset updated
    Feb 18, 2025
    Dataset provided by
    Zenodohttp://zenodo.org/
    Authors
    Cheolhee Yoo; Cheolhee Yoo; Yuhan Zhou; Yuhan Zhou; Qihao Weng; Qihao Weng
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset provides high-resolution (10 m) industrial land maps for 1,093 global cities from 2017 to 2023.

    The dataset includes:

    • 10 m resolution industrial land maps for each year (GeoTIFF, Zip)
    • Detailed information for each city, including industrial land area per capita indicators (1093_city_information.xlsx)
    • Validation Package (Validation_Package.zip)

    File Naming Convention

    • GeoTIFF files: Industrial_land_XXX_YYY_YEAR.tif
      • XXX: Country code
      • YYY: City ID
      • YEAR: Year of data
      • Example: Industrial_land_USA_634_2017.tif represents the industrial land map for Chicago, USA, in 2017.

    Each TIF file has a 10 m spatial resolution with the GCS_WGS_1984 spatial projection. The maps include three classes:

    • Class 1: Industrial land in built-up areas
    • Class 2: Non-industrial land in built-up areas
    • Class 0: Non-built-up areas

    City Information

    A detailed summary of city-specific information, including the annual total industrial land area, is provided in 1093_city_information.xlsx. This file includes:

    • ID_HDC_G0: Unique city ID (Urban Centre)
    • CTR_MN_NM: Main country name
    • CTR_MN_ISO: ISO-3 country codes
    • UC_NM_MN: Main city (Urban Centre) name
    • UC_NM_LST: Full list of assigned city (Urban Centre) names
    • CLUSTER: Assigned cluster number in industrial land modeling
    • URB_ECOREGION: Assigned urban ecoregion
    • CLUSTER: Assigned cluster number in industrial land modeling
    • IND_YEAR: Total industrial land area for each year (in m²).

    Validation Package

    • This package includes validation samples used for industrial land mapping validation.
    • Validation_shapefile/: Contains the validation shapefiles used for assessing the accuracy of industrial land maps.
    • Validation of Industrial Land Map Using CO₂ Emissions Data.xlsx: Includes validation results comparing industrial land maps with CO₂ emissions data.
    • Association between Proposed Industrial Land and the Official Data.xlsx: Contains data assessing the relationship between the proposed industrial land and official datasets.
  9. Data from: Login Data Set for Risk-Based Authentication

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Jun 30, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Stephan Wiefling; Stephan Wiefling; Paul René Jørgensen; Paul René Jørgensen; Sigurd Thunem; Sigurd Thunem; Luigi Lo Iacono; Luigi Lo Iacono (2022). Login Data Set for Risk-Based Authentication [Dataset]. http://doi.org/10.5281/zenodo.6782156
    Explore at:
    zipAvailable download formats
    Dataset updated
    Jun 30, 2022
    Dataset provided by
    Zenodohttp://zenodo.org/
    Authors
    Stephan Wiefling; Stephan Wiefling; Paul René Jørgensen; Paul René Jørgensen; Sigurd Thunem; Sigurd Thunem; Luigi Lo Iacono; Luigi Lo Iacono
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Login Data Set for Risk-Based Authentication

    Synthesized login feature data of >33M login attempts and >3.3M users on a large-scale online service in Norway. Original data collected between February 2020 and February 2021.

    This data sets aims to foster research and development for Risk-Based Authentication (RBA) systems. The data was synthesized from the real-world login behavior of more than 3.3M users at a large-scale single sign-on (SSO) online service in Norway.

    The users used this SSO to access sensitive data provided by the online service, e.g., a cloud storage and billing information. We used this data set to study how the Freeman et al. (2016) RBA model behaves on a large-scale online service in the real world (see Publication). The synthesized data set can reproduce these results made on the original data set (see Study Reproduction). Beyond that, you can use this data set to evaluate and improve RBA algorithms under real-world conditions.

    WARNING: The feature values are plausible, but still totally artificial. Therefore, you should NOT use this data set in productive systems, e.g., intrusion detection systems.

    Overview

    The data set contains the following features related to each login attempt on the SSO:

    FeatureData TypeDescriptionRange or Example
    IP AddressStringIP address belonging to the login attempt0.0.0.0 - 255.255.255.255
    CountryStringCountry derived from the IP addressUS
    RegionStringRegion derived from the IP addressNew York
    CityStringCity derived from the IP addressRochester
    ASNIntegerAutonomous system number derived from the IP address0 - 600000
    User Agent StringStringUser agent string submitted by the clientMozilla/5.0 (Windows NT 10.0; Win64; ...
    OS Name and VersionStringOperating system name and version derived from the user agent stringWindows 10
    Browser Name and VersionStringBrowser name and version derived from the user agent stringChrome 70.0.3538
    Device TypeStringDevice type derived from the user agent string(mobile, desktop, tablet, bot, unknown)1
    User IDIntegerIdenfication number related to the affected user account[Random pseudonym]
    Login TimestampIntegerTimestamp related to the login attempt[64 Bit timestamp]
    Round-Trip Time (RTT) [ms]IntegerServer-side measured latency between client and server1 - 8600000
    Login SuccessfulBooleanTrue: Login was successful, False: Login failed(true, false)
    Is Attack IPBooleanIP address was found in known attacker data set(true, false)
    Is Account TakeoverBooleanLogin attempt was identified as account takeover by incident response team of the online service(true, false)

    Data Creation

    As the data set targets RBA systems, especially the Freeman et al. (2016) model, the statistical feature probabilities between all users, globally and locally, are identical for the categorical data. All the other data was randomly generated while maintaining logical relations and timely order between the features.

    The timestamps, however, are not identical and contain randomness. The feature values related to IP address and user agent string were randomly generated by publicly available data, so they were very likely not present in the real data set. The RTTs resemble real values but were randomly assigned among users per geolocation. Therefore, the RTT entries were probably in other positions in the original data set.

    • The country was randomly assigned per unique feature value. Based on that, we randomly assigned an ASN related to the country, and generated the IP addresses for this ASN. The cities and regions were derived from the generated IP addresses for privacy reasons and do not reflect the real logical relations from the original data set.

    • The device types are identical to the real data set. Based on that, we randomly assigned the OS, and based on the OS the browser information. From this information, we randomly generated the user agent string. Therefore, all the logical relations regarding the user agent are identical as in the real data set.

    • The RTT was randomly drawn from the login success status and synthesized geolocation data. We did this to ensure that the RTTs are realistic ones.

    Regarding the Data Values

    Due to unresolvable conflicts during the data creation, we had to assign some unrealistic IP addresses and ASNs that are not present in the real world. Nevertheless, these do not have any effects on the risk scores generated by the Freeman et al. (2016) model.

    You can recognize them by the following values:

    • ASNs with values >= 500.000

    • IP addresses in the range 10.0.0.0 - 10.255.255.255 (10.0.0.0/8 CIDR range)

    Study Reproduction

    Based on our evaluation, this data set can reproduce our study results regarding the RBA behavior of an RBA model using the IP address (IP address, country, and ASN) and user agent string (Full string, OS name and version, browser name and version, device type) as features.

    The calculated RTT significances for countries and regions inside Norway are not identical using this data set, but have similar tendencies. The same is true for the Median RTTs per country. This is due to the fact that the available number of entries per country, region, and city changed with the data creation procedure. However, the RTTs still reflect the real-world distributions of different geolocations by city.

    See RESULTS.md for more details.

    Ethics

    By using the SSO service, the users agreed in the data collection and evaluation for research purposes. For study reproduction and fostering RBA research, we agreed with the data owner to create a synthesized data set that does not allow re-identification of customers.

    The synthesized data set does not contain any sensitive data values, as the IP addresses, browser identifiers, login timestamps, and RTTs were randomly generated and assigned.

    Publication

    You can find more details on our conducted study in the following journal article:

    Pump Up Password Security! Evaluating and Enhancing Risk-Based Authentication on a Real-World Large-Scale Online Service (2022)
    Stephan Wiefling, Paul René Jørgensen, Sigurd Thunem, and Luigi Lo Iacono.
    ACM Transactions on Privacy and Security

    Bibtex

    @article{Wiefling_Pump_2022,
     author = {Wiefling, Stephan and Jørgensen, Paul René and Thunem, Sigurd and Lo Iacono, Luigi},
     title = {Pump {Up} {Password} {Security}! {Evaluating} and {Enhancing} {Risk}-{Based} {Authentication} on a {Real}-{World} {Large}-{Scale} {Online} {Service}},
     journal = {{ACM} {Transactions} on {Privacy} and {Security}},
     doi = {10.1145/3546069},
     publisher = {ACM},
     year  = {2022}
    }

    License

    This data set and the contents of this repository are licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. See the LICENSE file for details. If the data set is used within a publication, the following journal article has to be cited as the source of the data set:

    Stephan Wiefling, Paul René Jørgensen, Sigurd Thunem, and Luigi Lo Iacono: Pump Up Password Security! Evaluating and Enhancing Risk-Based Authentication on a Real-World Large-Scale Online Service. In: ACM Transactions on Privacy and Security (2022). doi: 10.1145/3546069

    1. Few (invalid) user agents strings from the original data set could not be parsed, so their device type is empty. Perhaps this parse error is useful information for your studies, so we kept these 1526 entries.↩︎

  10. G

    LSIB 2017: Large Scale International Boundary Polygons, Simplified

    • developers.google.com
    Updated Mar 30, 2017
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    United States Department of State, Office of the Geographer (2017). LSIB 2017: Large Scale International Boundary Polygons, Simplified [Dataset]. https://developers.google.com/earth-engine/datasets/catalog/USDOS_LSIB_SIMPLE_2017
    Explore at:
    Dataset updated
    Mar 30, 2017
    Dataset provided by
    United States Department of State, Office of the Geographer
    Time period covered
    Mar 30, 2017
    Area covered
    Earth
    Description

    The United States Office of the Geographer provides the Large Scale International Boundary (LSIB) dataset. The detailed version (2013) is derived from two other datasets: a LSIB line vector file and the World Vector Shorelines (WVS) from the National Geospatial-Intelligence Agency (NGA). The interior boundaries reflect U.S. government policies on boundaries, boundary disputes, and sovereignty. The exterior boundaries are derived from the WVS; however, the WVS coastline data is outdated and generally shifted from between several hundred meters to over a kilometer. Each feature is the polygonal area enclosed by interior boundaries and exterior coastlines where applicable, and many countries consist of multiple features, one per disjoint region. Compared with the detailed LSIB, in this simplified dataset some disjointed regions of each country have been reduced to a single feature. Furthermore, it excludes medium and smaller islands. The resulting simplified boundary lines are rarely shifted by more than 100 meters from the detailed LSIB lines. Each of the 312 features is a part of the geometry of one of the 284 countries described in this dataset.

  11. Global Air Quality Dataset 🌍🌫️

    • kaggle.com
    Updated Jul 28, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Waqar Ali (2024). Global Air Quality Dataset 🌍🌫️ [Dataset]. https://www.kaggle.com/datasets/waqi786/global-air-quality-dataset
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jul 28, 2024
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Waqar Ali
    License

    Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    The Global Air Quality Data dataset provides an extensive compilation of air quality measurements from various prominent cities worldwide. This dataset includes crucial environmental indicators such as particulate matter (PM2.5 and PM10), nitrogen dioxide (NO2), sulfur dioxide (SO2), carbon monoxide (CO), and ozone (O3), along with meteorological data like temperature, humidity, and wind speed. With 10,000 records, this dataset is ideal for researchers, data scientists, and policy makers looking to analyze air quality trends, understand the impact of pollution on health, and develop strategies for environmental improvement.

    The dataset is composed of the following columns:

    City: The name of the city where the air quality measurement was taken. Country: The country in which the city is located. Date: The date when the measurement was recorded. PM2.5: The concentration of fine particulate matter with a diameter of less than 2.5 micrometers (µg/m³). PM10: The concentration of particulate matter with a diameter of less than 10 micrometers (µg/m³). NO2: The concentration of nitrogen dioxide (µg/m³). SO2: The concentration of sulfur dioxide (µg/m³). CO: The concentration of carbon monoxide (mg/m³). O3: The concentration of ozone (µg/m³). Temperature: The temperature at the time of measurement (°C). Humidity: The humidity level at the time of measurement (%). Wind Speed: The wind speed at the time of measurement (m/s).

  12. Student Academic Performance and Probation Dataset

    • kaggle.com
    Updated Nov 29, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Rathnakumarw (2024). Student Academic Performance and Probation Dataset [Dataset]. https://www.kaggle.com/datasets/rathnakumarw/student-academic-performance-and-probation-dataset
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Nov 29, 2024
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Rathnakumarw
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset Description This dataset consists of academic and demographic information about 300 students from a university, which can be used for predicting academic outcomes, such as probation status. The dataset was simulated to represent a variety of student attributes across multiple categories like personal data, academic history, and other related information. The primary goal of this dataset is to analyze factors contributing to academic performance and identify students at risk of probation.

    Column Descriptions Student No.: (Numeric) A unique identifier for each student. In this dataset, each student has a different ID number, making it a 100% unique column. Cohort: (Numeric) The year a student enrolled in the university. No missing values and consistent across the dataset. College: (Nominal) The name of the college the student belongs to. Examples include "Engineering," "Science," etc. No missing values. College Code: (Nominal) A numerical or alphanumerical code representing the college. This is an alternative representation of the "College" column. Major: (Nominal) The major field of study of the student. Some missing values (23%) represent students who haven’t declared a major or are in an undeclared status. Major Code: (Nominal) A code representing the major subject. Similar to the "Major" column, this has 23% missing values due to undeclared majors. Minor: (Nominal) The minor subject, if any, chosen by the student. This column has a high percentage of missing data (91%) since most students do not have minors. Spec: (Nominal) Specialization within the major field of study. Like the "Minor" column, this has 93% missing data as most students do not declare a specialization. Degree: (Numeric) The type of degree the student is pursuing (e.g., Bachelor's). In this dataset, all students are pursuing the same degree, so there are no missing values. Status: (Nominal) The current academic standing of the student (e.g., "Active," "Inactive"). No missing values. Load Status: (Nominal) The academic load status (e.g., "Full-time," "Part-time"). This column has very few missing values (1%). Gender: (Nominal) The gender of the student (e.g., "Male," "Female"). No missing values. Country: (Nominal) The country of origin of the student. Only 2 missing values, making it nearly complete. Governorate: (Nominal) The administrative region (governorate) the student comes from. This column has a small percentage of missing values (1%). Wellayah: (Nominal) The district or locality within the governorate. Around 1% of the data is missing. CGPA: (Numeric) The cumulative grade point average (CGPA) of the student. This field has 145 missing values, representing students without available CGPA records. Estimated Graduation Year: (Numeric) The expected year in which the student will graduate. No missing values. From HEAC: (Nominal) Indicates whether the student was admitted through the Higher Education Admission Center (HEAC). This column has 4% missing values. Admission Category: (Nominal) The category of admission (e.g., scholarship, self-funded). This column has a significant amount of missing data (98%), indicating that admission category data is either unavailable or irrelevant for most students. Birth Date: (Nominal) The birth date of the student. The dataset includes very few missing values (0%) and has been replaced by the derived feature "Age." Actual Graduation Date: (Nominal) The actual date on which a student graduates. More than half of the values are missing (54%), representing students who haven’t graduated yet. Withdrawal: (Nominal) Indicates whether the student has withdrawn from the university. This column has 89% missing data since the majority of students haven’t withdrawn. Marital Status: (Nominal) The marital status of the student (e.g., "Single," "Married"). No missing values. SQU Hostel: (Nominal) Indicates whether the student lives in the university hostel. No missing values. Percentage (Secondary School Score): (Nominal) The student’s percentage score from secondary school. No missing values. Probation Student: (Nominal) Indicates whether the student is under academic probation. This is the target variable for classification, with no missing values.

    Record Details Total Records: 300 Total Attributes: 26 Missing Values: Some columns have a significant proportion of missing data (e.g., Minor, Spec, Major Code), while others have very few or no missing values (e.g., Gender, Cohort, College). Missing values were handled using a placeholder for clarity in certain columns.

  13. Construction Data | Building Materials & Construction Industry Leaders in...

    • datarade.ai
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Success.ai, Construction Data | Building Materials & Construction Industry Leaders in Europe | Verified Global Profiles from 700M+ Dataset | Best Price Guarantee [Dataset]. https://datarade.ai/data-products/construction-data-building-materials-construction-industr-success-ai
    Explore at:
    .bin, .json, .xml, .csv, .xls, .sql, .txtAvailable download formats
    Dataset provided by
    Area covered
    Monaco, Serbia, Slovakia, Gibraltar, Moldova (Republic of), Croatia, Estonia, Bosnia and Herzegovina, Åland Islands, Malta
    Description

    Success.ai’s Construction Data for Building Materials & Construction Industry Leaders in Europe provides a reliable dataset tailored for businesses seeking to connect with leaders in the European construction and building materials sectors. Covering contractors, suppliers, architects, and project managers, this dataset offers verified profiles, firmographic insights, and decision-maker contacts.

    With access to over 700 million verified global profiles and data from 70 million businesses, Success.ai ensures that your outreach, market analysis, and strategic partnerships are powered by accurate, continuously updated, and AI-validated information. Backed by our Best Price Guarantee, this solution empowers you to engage effectively with the construction industry across Europe.

    Why Choose Success.ai’s Construction Data?

    1. Verified Contact Data for Industry Leaders

      • Access verified work emails, phone numbers, and LinkedIn profiles of construction executives, project managers, and building material suppliers.
      • AI-driven validation ensures 99% accuracy, optimizing outreach and minimizing inefficiencies in communication.
    2. Comprehensive Coverage Across Europe’s Construction Sector

      • Includes profiles from major construction hubs such as Germany, France, the UK, Italy, and Spain, covering a diverse range of projects and organizations.
      • Gain insights into regional construction trends, material sourcing strategies, and large-scale project developments.
    3. Continuously Updated Datasets

      • Real-time updates reflect changes in leadership, market expansions, material innovations, and project announcements.
      • Stay ahead of market trends to align your strategies with evolving industry needs.
    4. Ethical and Compliant

      • Adheres to GDPR, CCPA, and other global privacy regulations, ensuring responsible use of data and compliance with legal standards.

    Data Highlights:

    • 700M+ Verified Global Profiles: Engage with decision-makers, contractors, architects, and engineers in Europe’s construction sector.
    • 70M Business Profiles: Access firmographic data, including company sizes, revenue ranges, and geographic locations.
    • Decision-Maker Contacts: Connect directly with CEOs, procurement officers, and project leads driving construction projects and material procurement.
    • Industry Insights: Gain visibility into supply chain networks, sustainable building initiatives, and innovative construction techniques.

    Key Features of the Dataset:

    1. Leadership Profiles in Construction

      • Identify and connect with leaders responsible for major construction projects, material sourcing, and architectural planning.
      • Target professionals making decisions on vendor selection, project timelines, and compliance.
    2. Advanced Filters for Precision Campaigns

      • Filter companies by industry segment (commercial construction, residential, infrastructure), geographic location, or revenue size.
      • Tailor campaigns to align with regional construction challenges, such as sustainability, cost management, or urbanization.
    3. Firmographic Insights and Project Data

      • Access data on company structures, project scopes, and market positioning to refine your targeting strategy.
      • Use these insights to identify high-value prospects and uncover new business opportunities.
    4. AI-Driven Enrichment

      • Profiles enriched with actionable data enable personalized messaging, highlight unique value propositions, and improve engagement outcomes with construction stakeholders.

    Strategic Use Cases:

    1. Sales and Vendor Development

      • Offer construction materials, tools, or technology solutions to procurement teams and project managers in the construction industry.
      • Build relationships with vendors and contractors seeking innovative solutions to streamline operations and reduce costs.
    2. Market Research and Competitive Analysis

      • Analyze trends in material usage, construction technologies, and sustainable practices to guide product development and marketing strategies.
      • Benchmark against competitors to identify market gaps, emerging needs, and high-growth opportunities.
    3. Partnership Development and Supply Chain Optimization

      • Engage with companies seeking partnerships for large-scale projects, material sourcing, or technology integration.
      • Foster alliances that drive efficiency, quality, and innovation in construction projects.
    4. Recruitment and Workforce Solutions

      • Target HR professionals and hiring managers recruiting skilled workers, architects, or engineers for ongoing and upcoming projects.
      • Provide staffing services, training platforms, or workforce optimization tools tailored to the construction sector.

    Why Choose Success.ai?

    1. Best Price Guarantee
      • Access premium-quality construction data at competitive prices, ensuring strong ROI for your marketing, sales, and bu...
  14. World Important Events - Ancient to Modern

    • kaggle.com
    Updated Feb 3, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Saket Kumar (2024). World Important Events - Ancient to Modern [Dataset]. http://doi.org/10.34740/kaggle/dsv/7545558
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Feb 3, 2024
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Saket Kumar
    License

    http://opendatacommons.org/licenses/dbcl/1.0/http://opendatacommons.org/licenses/dbcl/1.0/

    Area covered
    World
    Description

    This dataset, "World Important Events - Ancient to Modern," spans significant historical milestones from ancient times to the modern era, covering diverse global incidents. It provides a comprehensive timeline of events that have shaped the world, offering insights into wars, cultural shifts, technological advancements, and social movements.

    Column Descriptions:

    Sl. No: Serial number. Name of Incident: Title of the event. Date, Month, Year: When the event occurred. Country: Where it happened. Type of Event: Nature of the event (e.g., War, Revolution). Place Name: Specific location of the event. Impact: Brief description of the event's impact. Affected Population: Who was impacted. Important Person/Group Responsible: Key figures or groups involved. Outcome: Result of the event (Positive, Negative, Mixed).

    Leverage this dataset for data analytics, cleaning, and visualization to uncover insights. Ideal for historical analysis and educational exploration, it offers a rich foundation for research, storytelling, and understanding global events' impact.

    Image attribute : Image By freepik

  15. England and Wales Census 2021 - RM010: Country of birth by ethnic group

    • statistics.ukdataservice.ac.uk
    csv, json, xlsx
    Updated May 9, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Office for National Statistics; National Records of Scotland; Northern Ireland Statistics and Research Agency; UK Data Service. (2023). England and Wales Census 2021 - RM010: Country of birth by ethnic group [Dataset]. https://statistics.ukdataservice.ac.uk/dataset/england-and-wales-census-2021-rm010-country-of-birth-by-ethnic-group
    Explore at:
    xlsx, json, csvAvailable download formats
    Dataset updated
    May 9, 2023
    Dataset provided by
    Northern Ireland Statistics and Research Agency
    Office for National Statisticshttp://www.ons.gov.uk/
    UK Data Servicehttps://ukdataservice.ac.uk/
    Authors
    Office for National Statistics; National Records of Scotland; Northern Ireland Statistics and Research Agency; UK Data Service.
    License

    Open Government Licence 3.0http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Area covered
    Wales, England
    Description

    This dataset provides Census 2021 estimates that classify usual residents in England and Wales by country of birth and by ethnic group. The estimates are as at Census Day, 21 March 2021.

    Area type

    Census 2021 statistics are published for a number of different geographies. These can be large, for example the whole of England, or small, for example an output area (OA), the lowest level of geography for which statistics are produced.

    For higher levels of geography, more detailed statistics can be produced. When a lower level of geography is used, such as output areas (which have a minimum of 100 persons), the statistics produced have less detail. This is to protect the confidentiality of people and ensure that individuals or their characteristics cannot be identified.

    Coverage

    Census 2021 statistics are published for the whole of England and Wales. Data are also available in these geographic types:

    • country - for example, Wales
    • region - for example, London
    • local authority - for example, Cornwall
    • health area – for example, Clinical Commissioning Group
    • statistical area - for example, MSOA or LSOA

    Country of birth

    The country in which a person was born.

    For people not born in one of in the four parts of the UK, there was an option to select "elsewhere".

    People who selected "elsewhere" were asked to write in the current name for their country of birth.

    Ethnic group

    The ethnic group that the person completing the census feels they belong to. This could be based on their culture, family background, identity or physical appearance.

    Respondents could choose one out of 19 tick-box response categories, including write-in response options.

  16. Not seeing a result you expected?
    Learn how you can add new datasets to our index.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
U.S. Department of State (Point of Contact) (2025). Large Scale International Boundaries [Dataset]. https://catalog.data.gov/dataset/large-scale-international-boundaries
Organization logo

Large Scale International Boundaries

Explore at:
31 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
May 23, 2025
Dataset provided by
United States Department of Statehttp://state.gov/
Description

Overview The Office of the Geographer and Global Issues at the U.S. Department of State produces the Large Scale International Boundaries (LSIB) dataset. The current edition is version 11.4 (published 24 February 2025). The 11.4 release contains updated boundary lines and data refinements designed to extend the functionality of the dataset. These data and generalized derivatives are the only international boundary lines approved for U.S. Government use. The contents of this dataset reflect U.S. Government policy on international boundary alignment, political recognition, and dispute status. They do not necessarily reflect de facto limits of control. National Geospatial Data Asset This dataset is a National Geospatial Data Asset (NGDAID 194) managed by the Department of State. It is a part of the International Boundaries Theme created by the Federal Geographic Data Committee. Dataset Source Details Sources for these data include treaties, relevant maps, and data from boundary commissions, as well as national mapping agencies. Where available and applicable, the dataset incorporates information from courts, tribunals, and international arbitrations. The research and recovery process includes analysis of satellite imagery and elevation data. Due to the limitations of source materials and processing techniques, most lines are within 100 meters of their true position on the ground. Cartographic Visualization The LSIB is a geospatial dataset that, when used for cartographic purposes, requires additional styling. The LSIB download package contains example style files for commonly used software applications. The attribute table also contains embedded information to guide the cartographic representation. Additional discussion of these considerations can be found in the Use of Core Attributes in Cartographic Visualization section below. Additional cartographic information pertaining to the depiction and description of international boundaries or areas of special sovereignty can be found in Guidance Bulletins published by the Office of the Geographer and Global Issues: https://data.geodata.state.gov/guidance/index.html Contact Direct inquiries to internationalboundaries@state.gov. Direct download: https://data.geodata.state.gov/LSIB.zip Attribute Structure The dataset uses the following attributes divided into two categories: ATTRIBUTE NAME | ATTRIBUTE STATUS CC1 | Core CC1_GENC3 | Extension CC1_WPID | Extension COUNTRY1 | Core CC2 | Core CC2_GENC3 | Extension CC2_WPID | Extension COUNTRY2 | Core RANK | Core LABEL | Core STATUS | Core NOTES | Core LSIB_ID | Extension ANTECIDS | Extension PREVIDS | Extension PARENTID | Extension PARENTSEG | Extension These attributes have external data sources that update separately from the LSIB: ATTRIBUTE NAME | ATTRIBUTE STATUS CC1 | GENC CC1_GENC3 | GENC CC1_WPID | World Polygons COUNTRY1 | DoS Lists CC2 | GENC CC2_GENC3 | GENC CC2_WPID | World Polygons COUNTRY2 | DoS Lists LSIB_ID | BASE ANTECIDS | BASE PREVIDS | BASE PARENTID | BASE PARENTSEG | BASE The core attributes listed above describe the boundary lines contained within the LSIB dataset. Removal of core attributes from the dataset will change the meaning of the lines. An attribute status of “Extension” represents a field containing data interoperability information. Other attributes not listed above include “FID”, “Shape_length” and “Shape.” These are components of the shapefile format and do not form an intrinsic part of the LSIB. Core Attributes The eight core attributes listed above contain unique information which, when combined with the line geometry, comprise the LSIB dataset. These Core Attributes are further divided into Country Code and Name Fields and Descriptive Fields. County Code and Country Name Fields “CC1” and “CC2” fields are machine readable fields that contain political entity codes. These are two-character codes derived from the Geopolitical Entities, Names, and Codes Standard (GENC), Edition 3 Update 18. “CC1_GENC3” and “CC2_GENC3” fields contain the corresponding three-character GENC codes and are extension attributes discussed below. The codes “Q2” or “QX2” denote a line in the LSIB representing a boundary associated with areas not contained within the GENC standard. The “COUNTRY1” and “COUNTRY2” fields contain the names of corresponding political entities. These fields contain names approved by the U.S. Board on Geographic Names (BGN) as incorporated in the ‘"Independent States in the World" and "Dependencies and Areas of Special Sovereignty" lists maintained by the Department of State. To ensure maximum compatibility, names are presented without diacritics and certain names are rendered using common cartographic abbreviations. Names for lines associated with the code "Q2" are descriptive and not necessarily BGN-approved. Names rendered in all CAPITAL LETTERS denote independent states. Names rendered in normal text represent dependencies, areas of special sovereignty, or are otherwise presented for the convenience of the user. Descriptive Fields The following text fields are a part of the core attributes of the LSIB dataset and do not update from external sources. They provide additional information about each of the lines and are as follows: ATTRIBUTE NAME | CONTAINS NULLS RANK | No STATUS | No LABEL | Yes NOTES | Yes Neither the "RANK" nor "STATUS" fields contain null values; the "LABEL" and "NOTES" fields do. The "RANK" field is a numeric expression of the "STATUS" field. Combined with the line geometry, these fields encode the views of the United States Government on the political status of the boundary line. ATTRIBUTE NAME | | VALUE | RANK | 1 | 2 | 3 STATUS | International Boundary | Other Line of International Separation | Special Line A value of “1” in the “RANK” field corresponds to an "International Boundary" value in the “STATUS” field. Values of ”2” and “3” correspond to “Other Line of International Separation” and “Special Line,” respectively. The “LABEL” field contains required text to describe the line segment on all finished cartographic products, including but not limited to print and interactive maps. The “NOTES” field contains an explanation of special circumstances modifying the lines. This information can pertain to the origins of the boundary lines, limitations regarding the purpose of the lines, or the original source of the line. Use of Core Attributes in Cartographic Visualization Several of the Core Attributes provide information required for the proper cartographic representation of the LSIB dataset. The cartographic usage of the LSIB requires a visual differentiation between the three categories of boundary lines. Specifically, this differentiation must be between: International Boundaries (Rank 1); Other Lines of International Separation (Rank 2); and Special Lines (Rank 3). Rank 1 lines must be the most visually prominent. Rank 2 lines must be less visually prominent than Rank 1 lines. Rank 3 lines must be shown in a manner visually subordinate to Ranks 1 and 2. Where scale permits, Rank 2 and 3 lines must be labeled in accordance with the “Label” field. Data marked with a Rank 2 or 3 designation does not necessarily correspond to a disputed boundary. Please consult the style files in the download package for examples of this depiction. The requirement to incorporate the contents of the "LABEL" field on cartographic products is scale dependent. If a label is legible at the scale of a given static product, a proper use of this dataset would encourage the application of that label. Using the contents of the "COUNTRY1" and "COUNTRY2" fields in the generation of a line segment label is not required. The "STATUS" field contains the preferred description for the three LSIB line types when they are incorporated into a map legend but is otherwise not to be used for labeling. Use of the “CC1,” “CC1_GENC3,” “CC2,” “CC2_GENC3,” “RANK,” or “NOTES” fields for cartographic labeling purposes is prohibited. Extension Attributes Certain elements of the attributes within the LSIB dataset extend data functionality to make the data more interoperable or to provide clearer linkages to other datasets. The fields “CC1_GENC3” and “CC2_GENC” contain the corresponding three-character GENC code to the “CC1” and “CC2” attributes. The code “QX2” is the three-character counterpart of the code “Q2,” which denotes a line in the LSIB representing a boundary associated with a geographic area not contained within the GENC standard. To allow for linkage between individual lines in the LSIB and World Polygons dataset, the “CC1_WPID” and “CC2_WPID” fields contain a Universally Unique Identifier (UUID), version 4, which provides a stable description of each geographic entity in a boundary pair relationship. Each UUID corresponds to a geographic entity listed in the World Polygons dataset. These fields allow for linkage between individual lines in the LSIB and the overall World Polygons dataset. Five additional fields in the LSIB expand on the UUID concept and either describe features that have changed across space and time or indicate relationships between previous versions of the feature. The “LSIB_ID” attribute is a UUID value that defines a specific instance of a feature. Any change to the feature in a lineset requires a new “LSIB_ID.” The “ANTECIDS,” or antecedent ID, is a UUID that references line geometries from which a given line is descended in time. It is used when there is a feature that is entirely new, not when there is a new version of a previous feature. This is generally used to reference countries that have dissolved. The “PREVIDS,” or Previous ID, is a UUID field that contains old versions of a line. This is an additive field, that houses all Previous IDs. A new version of a feature is defined by any change to the

Search
Clear search
Close search
Google apps
Main menu