27 datasets found
  1. Datasheet1_FLAP: a framework for linking free-text addresses to the Ordnance...

    • frontiersin.figshare.com
    pdf
    Updated Nov 28, 2023
    Cite
    Huayu Zhang; Arlene Casey; Imane Guellil; Víctor Suárez-Paniagua; Clare MacRae; Charis Marwick; Honghan Wu; Bruce Guthrie; Beatrice Alex (2023). Datasheet1_FLAP: a framework for linking free-text addresses to the Ordnance Survey Unique Property Reference Number database.pdf [Dataset]. http://doi.org/10.3389/fdgth.2023.1186208.s001
    Available download formats: pdf
    Dataset updated
    Nov 28, 2023
    Dataset provided by
    Frontiers
    Authors
    Huayu Zhang; Arlene Casey; Imane Guellil; Víctor Suárez-Paniagua; Clare MacRae; Charis Marwick; Honghan Wu; Bruce Guthrie; Beatrice Alex
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction: Linking free-text addresses to unique identifiers in a structural address database [the Ordnance Survey unique property reference number (UPRN) in the United Kingdom (UK)] is a necessary step for downstream geospatial analysis in many digital health systems, e.g., for identification of care home residents, understanding housing transitions in later life, and informing decision making on geographical health and social care resource distribution. However, there is a lack of open-source tools for this task with performance validated in a test data set.

    Methods: In this article, we propose a generalisable solution (A Framework for Linking free-text Addresses to Ordnance Survey UPRN database, FLAP) based on a machine-learning-based matching classifier coupled with a fuzzy aligning algorithm for feature generation, with better performance than existing tools. The framework is implemented in Python as an Open Source tool (available at Link). We tested the framework in a real-world scenario of linking individuals' addresses (n=771,588) recorded as free text in the Community Health Index (CHI) of National Health Service (NHS) Tayside and NHS Fife to the Unique Property Reference Number database (UPRN DB).

    Results: We achieved an adjusted matching accuracy of 0.992 in a test data set randomly sampled (n=3,876) from NHS Tayside and NHS Fife CHI addresses. FLAP showed robustness against input variations including typographical errors, alternative formats, and partially incorrect information. It also has improved usability compared to existing solutions, allowing the use of a customised threshold of matching confidence and selection of the top n candidate records. The use of machine learning also provides better adaptability of the tool to new data and enables continuous improvement.

    Discussion: In conclusion, we have developed a framework, FLAP, for linking free-text UK addresses to the UPRN DB with good performance and usability in a real-world task.
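
    The idea of a confidence threshold combined with a top-n candidate list can be illustrated with a minimal sketch. This is not FLAP's method (FLAP uses a trained matching classifier); it swaps in plain string similarity from Python's standard library, and the function names, reference records and identifiers below are hypothetical.

```python
from difflib import SequenceMatcher


def normalise(address):
    """Lower-case the text and collapse punctuation and whitespace (illustrative only)."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in address.lower())
    return " ".join(cleaned.split())


def top_candidates(query, reference, threshold=0.8, top_n=3):
    """Return up to top_n (record_id, score) pairs scoring at or above threshold."""
    q = normalise(query)
    scored = [
        (ref_id, SequenceMatcher(None, q, normalise(text)).ratio())
        for ref_id, text in reference.items()
    ]
    kept = [pair for pair in scored if pair[1] >= threshold]
    # Highest similarity first, then keep only the requested number of candidates.
    return sorted(kept, key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical reference records keyed by made-up identifiers (not real UPRNs).
reference = {
    "ID-001": "12 High Street, Dundee, DD1 1AA",
    "ID-002": "14 High Street, Dundee, DD1 1AA",
    "ID-003": "3 Mill Lane, Cupar, KY15 4XX",
}

# The query contains a typo ("hgh") and no punctuation.
matches = top_candidates("12 hgh street dundee dd1 1aa", reference, threshold=0.8, top_n=2)
```

    Raising the threshold trades recall for precision, while top_n lets a reviewer inspect near-miss candidates instead of a single forced match.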

  2. Addresses RÚIAN data distributed by municipalities in the CSV format

    • data.gov.cz
    + more versions
    Cite
    Český úřad zeměměřický a katastrální, Addresses RÚIAN data distributed by municipalities in the CSV format [Dataset]. https://data.gov.cz/dataset?iri=https%3A%2F%2Fdata.gov.cz%2Fzdroj%2Fdatov%C3%A9-sady%2F00025712%2F59ddccd404a617f695e3a1ef8e65b81e
    Dataset authored and provided by
    Český úřad zeměměřický a katastrální
    Description

    The dataset contains a list of address points for individual municipalities in CSV format. For each address point the following attributes are specified: address point code, municipality code and name, name of municipality district/municipality part (for territorially structured statutory cities only), code and name of Prague city district (for Prague only), municipality part code and name, street code and name (if specified), type of building object (with description/registration house number), house number, orientation number (if specified), character of orientation number (if specified), postal code, Y and X coordinates of the pointer of the address point (in the JTSK coordinate system) and the date of validity. The dataset is provided as Open Data (licence CC-BY 4.0). Data is based on RÚIAN (Register of Territorial Identification, Addresses and Real Estates) and covers the whole territory of the Czech Republic. Data is provided for individual municipalities in a compressed form (ZIP). Files are created on the first day of each month with data valid as of the last day of the previous month. More information is in Act No. 111/2009 Coll., on the Basic Registers, and in Decree No. 359/2011 Coll., on the Basic Register of Territorial Identification, Addresses and Real Estates.
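
    Reading per-municipality CSV files out of a ZIP archive could be sketched as follows. The member file names, the English column headers, the semicolon delimiter and the UTF-8 encoding are all assumptions made for illustration; check the actual archive for its real header names, delimiter and encoding. The sample archive is built in memory so the sketch runs without downloading anything.

```python
import csv
import io
import zipfile


def build_sample_archive():
    """Create a tiny in-memory ZIP with two hypothetical municipality CSV files."""
    header = "address_point_code;municipality_name;postal_code\n"
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("municipality_a.csv", header + "1;Alfa;10000\n2;Alfa;10000\n")
        zf.writestr("municipality_b.csv", header + "3;Beta;20000\n")
    return buf.getvalue()


def read_address_points(archive_bytes, delimiter=";", encoding="utf-8"):
    """Collect the rows of every CSV member in the archive as dictionaries."""
    rows = []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        for member in sorted(zf.namelist()):
            if not member.lower().endswith(".csv"):
                continue  # skip anything that is not a CSV file
            with zf.open(member) as raw:
                text = io.TextIOWrapper(raw, encoding=encoding, newline="")
                rows.extend(csv.DictReader(text, delimiter=delimiter))
    return rows


points = read_address_points(build_sample_archive())
```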

  3. Addresses RÚIAN data distributed by the country in the CSV format

    • data.gov.cz
    Updated Feb 23, 2024
    + more versions
    Cite
    Český úřad zeměměřický a katastrální (2024). Addresses RÚIAN data distributed by the country in the CSV format [Dataset]. https://data.gov.cz/dataset?iri=https%3A%2F%2Fdata.gov.cz%2Fzdroj%2Fdatov%C3%A9-sady%2F00025712%2F3eac0278ad025b9a9015465571fdb907
    Dataset updated
    Feb 23, 2024
    Dataset authored and provided by
    Český úřad zeměměřický a katastrální
    Description

    The dataset contains a list of address points for the whole Czech Republic in CSV format. For each address point the following attributes are specified: address point code, municipality code and name, code and name of town district (for territorially structured statutory cities only), code and name of Prague city district (for Prague only), municipality part code and name, street code and name (if specified), type of building object (with description/registration house number), house number, orientation number (if specified), character of orientation number (if specified), postal code, Y and X coordinates of the pointer of the address point (in the JTSK coordinate system) and the date of validity. The dataset is provided as Open Data (licence CC-BY 4.0). Data is based on RÚIAN (Register of Territorial Identification, Addresses and Real Estates) and covers the whole territory of the Czech Republic. Data is provided in a compressed form (ZIP archive). The file is created on the first day of each month with data valid as of the last day of the previous month. More information is in Act No. 111/2009 Coll., on the Basic Registers, and in Decree No. 359/2011 Coll., on the Basic Register of Territorial Identification, Addresses and Real Estates.

  4. TIGER/Line Shapefile, 2023, County, Real County, TX, Address Ranges...

    • catalog.data.gov
    Updated Dec 14, 2023
    Cite
    U.S. Department of Commerce, U.S. Census Bureau, Geography Division, Geospatial Products Branch (Point of Contact) (2023). TIGER/Line Shapefile, 2023, County, Real County, TX, Address Ranges Relationship File [Dataset]. https://catalog.data.gov/dataset/tiger-line-shapefile-2023-county-real-county-tx-address-ranges-relationship-file
    Dataset updated
    Dec 14, 2023
    Dataset provided by
    United States Census Bureau http://census.gov/
    Area covered
    Real County, Texas
    Description

    The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts; however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The Address Ranges Relationship File (ADDR.dbf) contains the attributes of each address range. Each address range applies to a single edge and has a unique address range identifier (ARID) value. The edge to which an address range applies can be determined by linking the address range to the All Lines Shapefile (EDGES.shp) using the permanent topological edge identifier (TLID) attribute. Multiple address ranges can apply to the same edge since an edge can have multiple address ranges. Note that the most inclusive address range associated with each side of a street edge already appears in the All Lines Shapefile (EDGES.shp). The TIGER/Line Files contain potential address ranges, not individual addresses. The term "address range" refers to the collection of all possible structure numbers from the first structure number to the last structure number and all numbers of a specified parity in between along an edge side relative to the direction in which the edge is coded. The address ranges in the TIGER/Line Files are potential ranges that include the full range of possible structure numbers even though the actual structures may not exist.
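
    Two ideas in this description lend themselves to a small sketch: expanding a potential address range into its structure numbers, and linking address ranges to edges through TLID. ARID and TLID come from the description; every other field name ("first", "last", "parity", "name") and all sample values are hypothetical, and the records are plain dictionaries rather than rows read from the real .dbf and .shp files.

```python
from collections import defaultdict


def potential_numbers(first, last, parity):
    """List structure numbers of the given parity between first and last, in coded order."""
    lo, hi = min(first, last), max(first, last)
    wanted = 0 if parity == "even" else 1
    numbers = [n for n in range(lo, hi + 1) if n % 2 == wanted]
    # Keep the direction in which the range is coded (first -> last).
    return numbers if first <= last else numbers[::-1]


# Hypothetical address-range records (stand-ins for ADDR.dbf rows).
addr_rows = [
    {"ARID": "A1", "TLID": 10, "first": 101, "last": 109, "parity": "odd"},
    {"ARID": "A2", "TLID": 10, "first": 100, "last": 108, "parity": "even"},
    {"ARID": "A3", "TLID": 11, "first": 208, "last": 200, "parity": "even"},
]
# Hypothetical edge records (stand-ins for EDGES.shp attributes).
edge_rows = [{"TLID": 10, "name": "Main St"}, {"TLID": 11, "name": "Oak Ave"}]

# One edge can carry several address ranges, so group ARIDs by TLID.
ranges_by_edge = defaultdict(list)
for row in addr_rows:
    ranges_by_edge[row["TLID"]].append(row["ARID"])

edges_with_ranges = {e["TLID"]: ranges_by_edge.get(e["TLID"], []) for e in edge_rows}
```

    The expanded numbers are only potential addresses: as the description notes, a structure need not exist at every number in a range.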

  5. Hierarchy of addresses RÚIAN data distributed by the country in the CSV...

    • data.gov.cz
    • gimi9.com
    • +1more
    Updated Sep 7, 2020
    Cite
    Český úřad zeměměřický a katastrální (2020). Hierarchy of addresses RÚIAN data distributed by the country in the CSV format [Dataset]. https://data.gov.cz/dataset?iri=https%3A%2F%2Fdata.gov.cz%2Fzdroj%2Fdatov%C3%A9-sady%2F00025712%2F52c6e525fda1b5f842e169420a4d8d29
    Dataset updated
    Sep 7, 2020
    Dataset authored and provided by
    Český úřad zeměměřický a katastrální
    Description

    The dataset contains information on the relationships between selected territorial elements and units of territorial registration. Data is specified in seven CSV files for the whole Czech Republic. File adresni-mista-vazby-cr.csv contains links of address points to the following elements: street, municipality part, town district (MOMC), Prague city district (MOP), town district of Prague (SPRAVOBV), municipality, municipality with an authorized municipal office (POU), municipality with extended competence (ORP), higher territorial self-governing entity (VÚSC) and election district (VO). File vazby-cr.csv contains links between the elements municipality part, municipality, POU, ORP, VÚSC, cohesion region (REGSOUDR) up to the element of state. File vazby-hlm-praha.csv contains modularity of elements in the city of Prague: MOMC, SPRAVOBV, municipality, POU, ORP, VÚSC, REGSOUDR and state. File vazby-katastr-uzemi-cr.csv contains modularity of basic urban units (ZSJ) into cadastral units (KATUZ) and municipalities. File vazby-momc-statutarni-mesta.csv contains modularity of territorial elements in territorially structured statutory cities: MOMC, MOP, municipality, POU, ORP, VÚSC, REGSOUDR and state. File vazby-okresy-cr.csv contains links between the elements municipality part, municipality, county, region (old, as defined in 1960) and state. File vazby-ulice-obce-s-ulicni-siti.csv contains links of streets to the municipality. The dataset is provided as Open Data (licence CC-BY 4.0). Data is based on RÚIAN (Register of Territorial Identification, Addresses and Real Estates). Files are created on the first day of each month with data valid as of the last day of the previous month. The whole dataset is compressed (ZIP) for downloading. More information is in Act No. 111/2009 Coll., on the Basic Registers, and in Decree No. 359/2011 Coll., on the Basic Register of Territorial Identification, Addresses and Real Estates.

  6. TIGER/Line Shapefile, 2023, County, Real County, TX, Address Range-Feature

    • catalog.data.gov
    Updated Dec 15, 2023
    Cite
    U.S. Department of Commerce, U.S. Census Bureau, Geography Division, Geospatial Products Branch (Point of Contact) (2023). TIGER/Line Shapefile, 2023, County, Real County, TX, Address Range-Feature [Dataset]. https://catalog.data.gov/dataset/tiger-line-shapefile-2023-county-real-county-tx-address-range-feature
    Dataset updated
    Dec 15, 2023
    Dataset provided by
    United States Census Bureau http://census.gov/
    Area covered
    Real County, Texas
    Description

    The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts; however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The Address Ranges Feature Shapefile (ADDRFEAT.dbf) contains the geospatial edge geometry and attributes of all unsuppressed address ranges for a county or county equivalent area. The term "address range" refers to the collection of all possible structure numbers from the first structure number to the last structure number and all numbers of a specified parity in between along an edge side relative to the direction in which the edge is coded. Single-address address ranges have been suppressed to maintain the confidentiality of the addresses they describe. Multiple coincident address range feature edge records are represented in the shapefile if more than one left or right address range is associated with the edge. The ADDRFEAT shapefile contains a record for each combination of address range and street name. Address ranges associated with more than one street name are also represented by multiple coincident address range feature edge records. Note that the ADDRFEAT shapefile includes all unsuppressed address ranges, whereas the All Lines Shapefile (EDGES.shp) includes only the most inclusive address range associated with each side of a street edge. The TIGER/Line shapefiles contain potential address ranges, not individual addresses. The address ranges in the TIGER/Line Files are potential ranges that include the full range of possible structure numbers even though the actual structures may not exist.

  7. TIGER/Line Shapefile, 2022, County, Real County, TX, Address Range-Feature

    • s.cnmilf.com
    Updated Jan 28, 2024
    Cite
    U.S. Department of Commerce, U.S. Census Bureau, Geography Division, Spatial Data Collection and Products Branch (Point of Contact) (2024). TIGER/Line Shapefile, 2022, County, Real County, TX, Address Range-Feature [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/tiger-line-shapefile-2022-county-real-county-tx-address-range-feature
    Dataset updated
    Jan 28, 2024
    Dataset provided by
    United States Department of Commerce http://www.commerce.gov/
    United States Census Bureau http://census.gov/
    Area covered
    Real County, Texas
    Description

    The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts; however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The Address Ranges Feature Shapefile (ADDRFEAT.dbf) contains the geospatial edge geometry and attributes of all unsuppressed address ranges for a county or county equivalent area. The term "address range" refers to the collection of all possible structure numbers from the first structure number to the last structure number and all numbers of a specified parity in between along an edge side relative to the direction in which the edge is coded. Single-address address ranges have been suppressed to maintain the confidentiality of the addresses they describe. Multiple coincident address range feature edge records are represented in the shapefile if more than one left or right address range is associated with the edge. The ADDRFEAT shapefile contains a record for each combination of address range and street name. Address ranges associated with more than one street name are also represented by multiple coincident address range feature edge records. Note that the ADDRFEAT shapefile includes all unsuppressed address ranges, whereas the All Lines Shapefile (EDGES.shp) includes only the most inclusive address range associated with each side of a street edge. The TIGER/Line shapefiles contain potential address ranges, not individual addresses. The address ranges in the TIGER/Line Files are potential ranges that include the full range of possible structure numbers even though the actual structures may not exist.

  8. I-BiDaaS - CAIXA - IP addresses - Synthetic Dataset

    • data.niaid.nih.gov
    Updated Jul 19, 2024
    Cite
    Ramon Martin de Pozuelo Genis (2024). I-BiDaaS - CAIXA - IP addresses - Synthetic Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4091025
    Dataset updated
    Jul 19, 2024
    Dataset provided by
    Ramon Martin de Pozuelo Genis
    Omer Boehm
    Mario Maawad Marcos
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The generated dataset provides data on the relationships between customers in order to build part of the social graph of the bank. The data was synthetically generated based on real data coming from a set of restricted tables (relational database), with information related to the customers and their IP address when connecting online.

    CAIXA and IBM generated the data recipe for the data fabrication using IBM TDF. Through an iterative analysis of obtained results, the rules were improved in order to obtain the fabricated dataset used for testing the MVP, with more than 1 million entries.

  9. TIGER/Line Shapefile, 2023, County, Real County, TX, Address Range-Feature...

    • catalog.data.gov
    Updated Dec 15, 2023
    Cite
    U.S. Department of Commerce, U.S. Census Bureau, Geography Division, Geospatial Products Branch (Point of Contact) (2023). TIGER/Line Shapefile, 2023, County, Real County, TX, Address Range-Feature Name Relationship File [Dataset]. https://catalog.data.gov/dataset/tiger-line-shapefile-2023-county-real-county-tx-address-range-feature-name-relationship-file
    Dataset updated
    Dec 15, 2023
    Dataset provided by
    United States Census Bureau http://census.gov/
    Area covered
    Real County, Texas
    Description

    The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts; however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The Address Range / Feature Name Relationship File (ADDRFN.dbf) contains a record for each address range / linear feature name relationship. The purpose of this relationship file is to identify all street names associated with each address range. An edge can have several feature names; an address range located on an edge can be associated with one or any combination of the available feature names (an address range can be linked to multiple feature names). The address range is identified by the address range identifier (ARID) attribute that can be used to link to the Address Ranges Relationship File (ADDR.dbf). The linear feature name is identified by the linear feature identifier (LINEARID) attribute that can be used to link to the Feature Names Relationship File (FEATNAMES.dbf).
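
    The two links described here (ARID to LINEARID, then LINEARID to a feature name) can be sketched with plain dictionaries. ARID and LINEARID come from the description; the "name" field, the sample values and the assumption of one name row per LINEARID are hypothetical simplifications, and the real files would be read from .dbf rather than typed in.

```python
# Hypothetical ADDRFN.dbf rows: one record per address range / feature name relationship.
addrfn_rows = [
    {"ARID": "A1", "LINEARID": "L1"},
    {"ARID": "A1", "LINEARID": "L2"},
    {"ARID": "A2", "LINEARID": "L1"},
]
# Hypothetical FEATNAMES.dbf rows, simplified to one name per LINEARID.
featname_rows = [
    {"LINEARID": "L1", "name": "State Hwy 55"},
    {"LINEARID": "L2", "name": "Main St"},
]

# Index feature names by LINEARID for constant-time lookup.
name_by_linearid = {row["LINEARID"]: row["name"] for row in featname_rows}

# Collect every street name associated with each address range.
names_by_arid = {}
for rel in addrfn_rows:
    names_by_arid.setdefault(rel["ARID"], []).append(name_by_linearid[rel["LINEARID"]])
```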

  10. TIGER/Line Shapefile, 2022, County, Real County, TX, Address Range-Feature...

    • catalog.data.gov
    Updated Jan 27, 2024
    Cite
    U.S. Department of Commerce, U.S. Census Bureau, Geography Division, Spatial Data Collection and Products Branch (Point of Contact) (2024). TIGER/Line Shapefile, 2022, County, Real County, TX, Address Range-Feature Name Relationship File [Dataset]. https://catalog.data.gov/dataset/tiger-line-shapefile-2022-county-real-county-tx-address-range-feature-name-relationship-file
    Dataset updated
    Jan 27, 2024
    Dataset provided by
    United States Census Bureau http://census.gov/
    Area covered
    Real County, Texas
    Description

    The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts; however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The Address Range / Feature Name Relationship File (ADDRFN.dbf) contains a record for each address range / linear feature name relationship. The purpose of this relationship file is to identify all street names associated with each address range. An edge can have several feature names; an address range located on an edge can be associated with one or any combination of the available feature names (an address range can be linked to multiple feature names). The address range is identified by the address range identifier (ARID) attribute that can be used to link to the Address Ranges Relationship File (ADDR.dbf). The linear feature name is identified by the linear feature identifier (LINEARID) attribute that can be used to link to the Feature Names Relationship File (FEATNAMES.dbf).

  11. Addresses extracted from the cadastre

    • gimi9.com
    Updated Mar 15, 2024
    Cite
    (2024). Addresses extracted from the cadastre | gimi9.com [Dataset]. https://gimi9.com/dataset/eu_5bd837f2634f41112d338d46/
    Dataset updated
    Mar 15, 2024
    Description

    This dataset contains all the addresses extracted from the cadastre files (plan and file of built plots). It is produced by the Etalab mission, by reprocessing the extracted data, as part of the preparation of the National Address Base (BAN), of which it is a primary source.

    Presentation
    The DGFiP built parcel file contains the list of premises listed for local tax purposes, which are associated with cadastral parcels, and possibly real or fictitious addresses. The computerised cadastral plan contains the geometry of the cadastral plots, and possibly the positions of track numbers, attached to these plots. It also contains place names, graphically represented. This dataset is the result of complex processing of these files. It is presented in the form of unit addresses, qualified thanks to several source data sets explained below. Please note that addresses associated exclusively with places that are not part of the local tax base are not present in this file (in particular certain public places).

    A few numbers
    - 24.1 million assumed real or fictitious addresses
    - of which 24.0 million geo-referenced addresses
    - of which 22.7 million useful addresses (house, trade, tourism, industry...)
    - of which 20.2 million addresses are assumed to be real
    If we wish to combine the above criteria, we arrive at 19.5 million addresses.

    Update frequency
    This dataset is produced at every major update of one of its sources or algorithms.
    - Each (quarterly) update of the cadastral plan allows for improved path names and address positions.
    - Each (annual) update of the MAJIC file increases the number of addresses (new construction, new addresses known to the tax services).
    - Each (quarterly) update of the FANTOIR file allows for improved path names and identifiers.

    Coverage
    This dataset covers metropolitan France, as well as the overseas departments and regions, Saint-Martin and Saint-Barthelemy. It does not aim to be exhaustive. Only municipalities with a cadastral plan have geo-positioned addresses. The few missing municipalities will be added later. The overseas communities of Saint-Martin and Saint-Barthelemy are present and historically integrated into the department of Guadeloupe (971).

    Lexicon
    - Fictitious or pseudo-address: address that has been arbitrarily attributed by Public Finance services for technical reasons
    - Main destination of an address: main use of the premises that make up this address (among housing, commerce, industry, tourism, equipment, dependencies, construction site, unknown)
    - Useful address: address associated with relevant premises (housing, commerce, industry, tourism)

    Available distributions
    These files are available in 3 formats:
    - CSV (only useful addresses, BAL 1.1 compatible)
    - GeoJSON (only useful addresses, and with a position)
    - NDJSON (full data, including intermediate results, mainly for advanced use)

    Data schema
    External link

    Source code
    The source code for producing the dataset is available at GitHub. Although under the MIT license, it cannot be used without the MAJIC file, which is not open. We are thinking about introducing an intermediate step that will allow everyone to replay the essentials of the script and thus contribute to it.

    Sources
    - cadastre (Etalab)
    - Computerised Cadastral Plan
    - Official Geographical Code
    - FANTOIR file of lanes and places
    - MAJIC file (cadastral matrix, tax secret)

  12. TIGER/Line Shapefile, 2016, county, Real County, TX, Address Ranges...

    • catalog.data.gov
    Updated Dec 3, 2020
    Cite
    (2020). TIGER/Line Shapefile, 2016, county, Real County, TX, Address Ranges County-based Relationship File [Dataset]. https://catalog.data.gov/dataset/tiger-line-shapefile-2016-county-real-county-tx-address-ranges-county-based-relationship-file
    Dataset updated
    Dec 3, 2020
    Area covered
    Real County, Texas
    Description

    The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts; however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The Address Ranges Relationship File (ADDR.dbf) contains the attributes of each address range. Each address range applies to a single edge and has a unique address range identifier (ARID) value. The edge to which an address range applies can be determined by linking the address range to the All Lines Shapefile (EDGES.shp) using the permanent topological edge identifier (TLID) attribute. Multiple address ranges can apply to the same edge since an edge can have multiple address ranges. Note that the most inclusive address range associated with each side of a street edge already appears in the All Lines Shapefile (EDGES.shp). The TIGER/Line Files contain potential address ranges, not individual addresses. The term "address range" refers to the collection of all possible structure numbers from the first structure number to the last structure number and all numbers of a specified parity in between along an edge side relative to the direction in which the edge is coded. The address ranges in the TIGER/Line Files are potential ranges that include the full range of possible structure numbers even though the actual structures may not exist.

  13. INSPIRE theme Addresses (AD)

    • gimi9.com
    Updated Sep 16, 2022
    Cite
    (2022). INSPIRE theme Addresses (AD) [Dataset]. https://gimi9.com/dataset/eu_cz-00025712-cuzk_series-md_ad
    Dataset updated
    Sep 16, 2022
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Dataset for the theme Addresses (AD) harmonised according to the INSPIRE Directive and ELF Data Specification version 1.0. The harmonised INSPIRE dataset contains address points and their components, which are country, municipality, municipality part, Prague district (MOP), town district (MOMČ), street and post. Data also contains 2D geometry (pointers of addresses). The dataset covers the whole area of the Czech Republic. It is provided as Open Data (licence CC-BY 4.0). Data is based on RÚIAN (Register of Territorial Identification, Addresses and Real Estates). As of 2022-09-12, 0.85% of addresses (25,344) do not contain geometry (coordinates of the pointer) and are therefore not included in the dataset. Data is created daily for individual municipalities (if any change within the municipality occurs), is provided in the GML 3.2.1 format and is valid against the XML Definition Schema for the theme Addresses in version 4.0 and the spatial data scheme for ELF in version 1.0. The dataset is compressed (ZIP) for downloading. More information is in Act No. 111/2009 Coll., on the Basic Registers, in Decree No. 359/2011 Coll., on the Basic Register of Territorial Identification, Addresses and Real Estates, in the current versions, and in the INSPIRE Data Specification on Addresses v. 3.0.1 from 2010-04-26.

  14. Audience Targeting Data | 330M+ Global Devices | Audience Data & Advertising...

    • datarade.ai
    .json, .csv
    Updated Feb 4, 2025
    + more versions
    Cite
    DRAKO (2025). Audience Targeting Data | 330M+ Global Devices | Audience Data & Advertising | API Delivery [Dataset]. https://datarade.ai/data-products/audience-targeting-data-330m-global-devices-audience-dat-drako
    Available download formats: .json, .csv
    Dataset updated
    Feb 4, 2025
    Dataset authored and provided by
    DRAKO
    Area covered
    Czech Republic, Armenia, Russian Federation, Curaçao, Equatorial Guinea, Serbia, Namibia, Suriname, Eritrea, San Marino
    Description

    DRAKO is a Mobile Location Audience Targeting provider with a programmatic trading desk specialising in geolocation analytics and programmatic advertising. Through our customised approach, we offer business and consumer insights as well as addressable audiences for advertising.

    Mobile Location Data can be meaningfully transformed into Audience Targeting when used in conjunction with other datasets. Our expansive POI Data allows us to segment users by visitation to major brands and retailers as well as categorize them into syndicated segments. Beyond POI visits, our proprietary Home Location Model determines residents of geographic areas such as Designated Market Areas, Counties, or States. Relatedly, our Home Location Model also fuels our Geodemographic Census Data segments as we are able to determine residents of the smallest census units. Additionally, we also have audiences of: ticketed event and venue visitors; survey data; and retail data.

    All of our Audience Targeting is 100% deterministic in that it only includes high-quality, real visits to locations as defined by a POI's building contour in satellite imagery. We never use a radius when building an audience unless requested. We have a horizontal accuracy of 5 m.

    Additionally, we can always cross reference your audience targeting with our syndicated segments:

    Overview of our Syndicated Audience Data Segments:
    - Brand/POI segments (specific named stores and locations)
    - Categories (behavioural segments - revealed habits)
    - Census demographic segments (HH income, race, religion, age, family structure, language, etc.)
    - Events segments (ticketed live events, conferences, and seminars)
    - Resident segments (State/province, CMAs, DMAs, city, county, sub-county)
    - Political segments (Canadian Federal and Provincial, US Congressional Upper and Lower House, US States, City elections, etc.)
    - Survey Data (Psychosocial/Demographic survey data)
    - Retail Data (Receipt/transaction data)

    All of our syndicated segments are customizable. That means you can limit them to people within a certain geography, remove employees, include only the most frequent visitors, define your own custom lookback, or extend our audiences using our Home, Work, and Social Extensions.

    In addition to our syndicated segments, we are also able to run custom queries that return all the Mobile Ad IDs (MAIDs) seen at a specific location (address; latitude and longitude; or WKT84 Polygon) or in your defined geographic area of interest (political districts, DMAs, Zip Codes, etc.).

    Beyond just returning all the MAIDs seen within a geofence, we are also able to offer additional customizable advantages:
    - Average precision between 5 and 15 meters
    - CRM list activation + extension
    - Extend beyond Mobile Location Data (MAIDs) with our device graph
    - Filter by frequency of visitations
    - Home and Work targeting (retrieve only employees or residents of an address)
    - Home extensions (devices that reside in the same dwelling from your seed geofence)
    - Rooftop level address geofencing precision (no radius used EVER unless user specified)
    - Social extensions (devices in the same social circle as users in your seed geofence)
    - Turn analytics into addressable audiences
    - Work extensions (coworkers of users in your seed geofence)

    Data Compliance: All of our Audience Targeting Data is fully CCPA compliant and 100% sourced from SDKs (Software Development Kits), the most reliable and consistent mobile data stream with end user consent, available with only a 4-5 day delay. Our location and device ID data comes from partnerships with more than 1,500 mobile apps. Each record comes with an associated location, which is how we are able to segment using geofences.

    Data Quality: In addition to partnering with trusted SDKs, DRAKO has additional screening methods to ensure that our mobile location data is consistent and reliable. This includes data harmonization and quality scoring from all of our partners in order to disregard MAIDs with a low quality score.

  15. d

    Real Property Tax - 2019

    • catalog.data.gov
    • data.montgomerycountymd.gov
    • +2more
    Updated Jun 29, 2025
    + more versions
    Cite
    data.montgomerycountymd.gov (2025). Real Property Tax - 2019 [Dataset]. https://catalog.data.gov/dataset/real-property-tax-2019
    Explore at:
    Dataset updated
    Jun 29, 2025
    Dataset provided by
    data.montgomerycountymd.gov
    Description

    This data represents all of the County’s residential real estate properties and all of the associated tax charges and credits with that property processed at the annual billing in July of each year, excluding any subsequent billing additions and/or revisions throughout the year. This dataset excludes the names of the property owners. The addresses in this database represent the address of the property. For more information about the individual taxes and credits, please go to http://www.montgomerycountymd.gov/finance/taxes/faqs.html#credit. Update Frequency: Updated Annually in July

  16. N

    real manhattan 2016 by address

    • data.cityofnewyork.us
    • data.wu.ac.at
    application/rdfxml +5
    Updated Jul 2, 2025
    + more versions
    Cite
    311 (2025). real manhattan 2016 by address [Dataset]. https://data.cityofnewyork.us/w/s8pi-nvru/25te-f2tw?cur=u2xJTiZqQWD&from=_KJ1x41BQKO
    Explore at:
    csv, application/rdfxml, xml, json, tsv, application/rssxml (available download formats)
    Dataset updated
    Jul 2, 2025
    Authors
    311
    Description

    All 311 Service Requests from 2010 to present. This information is automatically updated daily.

    Click here to download data from 2011 - https://data.cityofnewyork.us/dataset/311-Service-Requests-From-2011/fpz8-jqf4

    Click here to download data from 2012 - https://data.cityofnewyork.us/dataset/311-Service-Requests-From-2012/as38-8eb5

    Click here to download data from 2013 - https://data.cityofnewyork.us/dataset/311-Service-Requests-From-2013/hybb-af8n

    Click here to download data from 2014 - https://data.cityofnewyork.us/dataset/311-Service-Requests-From-2014/vtzg-7562

    Click here to download data from 2015 - https://data.cityofnewyork.us/dataset/311-Service-Requests-From-2015/57g5-etyj

  17. BitcoinTemporalGraph

    • figshare.com
    bin
    Updated Feb 5, 2025
    Cite
    Hugo Schnoering; Michalis Vazirgiannis (2025). BitcoinTemporalGraph [Dataset]. http://doi.org/10.6084/m9.figshare.26305093.v3
    Explore at:
    bin (available download formats)
    Dataset updated
    Feb 5, 2025
    Dataset provided by
    figshare
    Authors
    Hugo Schnoering; Michalis Vazirgiannis
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains several files:
    - dataset.tar.gz: A compressed PostgreSQL database representing a graph.
    - addresses.csv: A list of approximately 100,000 labeled Bitcoin addresses.

    BitcoinTemporalGraph (dataset.tar.gz)

    This dataset represents a graph of value transfers between Bitcoin users. The nodes represent entities/users, and the edges represent value transfers or transactions between these entities. The graph is temporal and directed.

    Usage:
    - Decompress the archive: "pigz -p 10 -dc dataset.tar.gz | tar -xvf -"
    - Restore the tables into an existing PostgreSQL database using the pg_restore utility: "pg_restore -j number_jobs -Fd -O -U database_username -d database_name dataset"
    - Ensure substantial storage for the database: 40GB for node_features and 80GB for transaction_edges (including indexes)

    Dataset Description

    The database contains two tables: node_features (approximately 252 million rows) and transaction_edges (approximately 785 million rows).

    Columns for node_features table:
    - alias: Identifier of the node
    - degree: Degree of the node
    - degree_in: Number of incoming edges to the node
    - degree_out: Number of outgoing edges from the node
    - total_transaction_in: Total count of value transfers received by the node
    - total_transaction_out: Total count of value transfers initiated by the node

    Amounts are expressed in satoshis (1 satoshi = 10^-8 Bitcoin):
    - min_sent: Minimum amount sent by the node during a transaction
    - max_sent: Maximum amount sent by the node during a transaction
    - total_sent: Total amount sent by the node during all transactions
    - min_received: Minimum amount received by the node during a transaction
    - max_received: Maximum amount received by the node during a transaction
    - total_received: Total amount received by the node during all transactions
    - label: Label describing the type of entity represented by the node

    Transactions on the Bitcoin network are stored in the public ledger named the "Bitcoin Blockchain". Each transaction is recorded in a block, with the block index indicating the transaction's position in the blockchain.
    - first_transaction_in: Block index of the first transaction received by the node
    - last_transaction_in: Block index of the last transaction received by the node
    - first_transaction_out: Block index of the first transaction sent by the node
    - last_transaction_out: Block index of the last transaction sent by the node

    Nodes can represent one or more Bitcoin addresses (pseudonyms used by Bitcoin users). A real entity often uses multiple addresses. The dataset contains only transactions between nodes (outer transactions), but provides information about inner transactions (transactions between addresses controlled by the same node).
    - cluster_size: Number of addresses represented by the node
    - cluster_num_edges: Number of transactions between the addresses represented by the node
    - cluster_num_cc: Number of connected components in the transaction graph of the addresses represented by the node
    - cluster_num_nodes_in_cc: Number of non-isolated addresses in the cluster

    Columns in the transaction_edges table:
    - a: Node alias of the sender
    - b: Node alias of the recipient
    - reveal: Block index of the first transaction from a to b
    - last_seen: Block index of the last transaction from a to b
    - total: Total number of transactions from a to b
    - min_sent: Minimum amount sent (in satoshis) in a transaction from a to b
    - max_sent: Maximum amount sent (in satoshis) in a transaction from a to b
    - total_sent: Total amount sent (in satoshis) in all transactions from a to b

    Dataset of Bitcoin Labeled Addresses (addresses.csv)

    This file contains 103,812 labeled Bitcoin addresses with the following columns:
    - address: Bitcoin address
    - entity: Name of the entity
    - category: Type of the entity (e.g., individual, bet, ransomware, gambling, exchange, mining, ponzi, marketplace, faucet, bridge, mixer)
    - source: Source used to label the address
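
    The satoshi amounts and block indices in the transaction_edges columns can be converted into more readable quantities. The following is a minimal sketch, not part of the dataset's tooling: it assumes one row has already been fetched into a Python dict keyed by the column names listed in the description, with integer values. The helper names and the example row are hypothetical.

```python
from decimal import Decimal

# 1 satoshi = 10^-8 Bitcoin, as stated in the dataset description.
SATOSHIS_PER_BTC = 10**8


def satoshis_to_btc(amount: int) -> Decimal:
    """Convert an integer satoshi amount to Bitcoin without float rounding."""
    return Decimal(amount) / Decimal(SATOSHIS_PER_BTC)


def summarize_edge(edge: dict) -> dict:
    """Summarize one transaction_edges row (hypothetical helper).

    Assumes the keys a, b, reveal, last_seen, total and total_sent hold integers.
    """
    total_btc = satoshis_to_btc(edge["total_sent"])
    return {
        "sender": edge["a"],
        "recipient": edge["b"],
        # Number of block indices from the first to the last transfer, inclusive.
        "block_span": edge["last_seen"] - edge["reveal"] + 1,
        "total_btc": total_btc,
        "mean_btc": total_btc / edge["total"],
    }


# Made-up row for illustration only; these values are not taken from the dataset.
example_edge = {
    "a": 1, "b": 2, "reveal": 100, "last_seen": 109, "total": 4,
    "min_sent": 50_000_000, "max_sent": 150_000_000, "total_sent": 400_000_000,
}
summary = summarize_edge(example_edge)
```

    Decimal is used here rather than float so that sums over many rows do not accumulate rounding error; whether that matters depends on the analysis.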

  18. TIGER/Line Shapefile, 2022, County, Real County, TX, Address Ranges...

    • catalog.data.gov
    Updated Jan 27, 2024
    Cite
    U.S. Department of Commerce, U.S. Census Bureau, Geography Division, Spatial Data Collection and Products Branch (Point of Contact) (2024). TIGER/Line Shapefile, 2022, County, Real County, TX, Address Ranges Relationship File [Dataset]. https://catalog.data.gov/dataset/tiger-line-shapefile-2022-county-real-county-tx-address-ranges-relationship-file
    Explore at:
    Dataset updated
    Jan 27, 2024
    Dataset provided by
    United States Census Bureau http://census.gov/
    Area covered
    Real County, Texas
    Description

    The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts, however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The Address Ranges Relationship File (ADDR.dbf) contains the attributes of each address range. Each address range applies to a single edge and has a unique address range identifier (ARID) value. The edge to which an address range applies can be determined by linking the address range to the All Lines Shapefile (EDGES.shp) using the permanent topological edge identifier (TLID) attribute. Multiple address ranges can apply to the same edge since an edge can have multiple address ranges. Note that the most inclusive address range associated with each side of a street edge already appears in the All Lines Shapefile (EDGES.shp). The TIGER/Line Files contain potential address ranges, not individual addresses. The term "address range" refers to the collection of all possible structure numbers from the first structure number to the last structure number and all numbers of a specified parity in between along an edge side relative to the direction in which the edge is coded. The address ranges in the TIGER/Line Files are potential ranges that include the full range of possible structure numbers even though the actual structures may not exist.
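
    The definition of an address range given above (every possible structure number of one parity, from the first to the last, in the direction the edge is coded) can be illustrated with a short sketch. The function and parameter names are hypothetical and do not correspond to fields of ADDR.dbf; the sketch assumes purely numeric structure numbers whose endpoints share the same parity.

```python
def potential_structure_numbers(first_number: int, last_number: int) -> list[int]:
    """List the potential structure numbers in an address range.

    Numbers run from first_number to last_number (inclusive) in steps of two,
    so every number has the same parity as the endpoints. A descending range
    is returned in descending order, following the direction of the edge.
    """
    if first_number % 2 != last_number % 2:
        raise ValueError("both ends of an address range must share one parity")
    step = 2 if last_number >= first_number else -2
    # Extend the stop value by one unit in the step direction to include last_number.
    stop = last_number + (1 if step > 0 else -1)
    return list(range(first_number, stop, step))


odd_side = potential_structure_numbers(101, 109)
even_side_descending = potential_structure_numbers(198, 190)
```

    As the description notes, these are potential numbers only: a value in the returned list does not imply that a structure with that number exists.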

  19. a

    Complete List of Pulp Juice and Smoothie Bar Locations

    • aggdata.com
    csv
    Updated Apr 28, 2025
    Cite
    AggData (2025). Complete List of Pulp Juice and Smoothie Bar Locations [Dataset]. https://www.aggdata.com/aggdata/complete-list-pulp-juice-and-smoothie-bar-locations
    Explore at:
    csv (available download formats)
    Dataset updated
    Apr 28, 2025
    Dataset authored and provided by
    AggData
    Description

    This is a complete list of Pulp Juice and Smoothie Bar locations along with their geographical coordinates. Pulp Juice and Smoothie Bars use real fruit, real fruit juice and real vegetable juice in their menu of over 30 smoothies. This data includes addresses and phone numbers for each location.

  20. 🔍 Diverse CSV Dataset Samples

    • kaggle.com
    Updated Nov 6, 2023
    Cite
    Samy Baladram (2023). 🔍 Diverse CSV Dataset Samples [Dataset]. https://www.kaggle.com/datasets/samybaladram/multidisciplinary-csv-datasets-collection/code
    Explore at:
    Dataset updated
    Nov 6, 2023
    Dataset provided by
    Kaggle http://kaggle.com/
    Authors
    Samy Baladram
    License

    http://www.gnu.org/licenses/lgpl-3.0.html

    Description


    Overview

    The dataset provided here is a rich compilation of various data files gathered to support diverse analytical challenges and education in data science. It is especially curated to provide researchers, data enthusiasts, and students with real-world data across different domains, including biostatistics, travel, real estate, sports, media viewership, and more.

    Files

    Below is a brief overview of what each CSV file contains:
    - Addresses: Practical examples of string manipulation and address data formatting in CSV.
    - Air Travel: Historical dataset suitable for analyzing trends in air travel over a period of three years.
    - Biostats: A dataset of office workers' biometrics, ideal for introductory statistics and biology.
    - Cities: Geographic and administrative data for urban analysis or socio-demographic studies.
    - Car Crashes in Catalonia: Weekly traffic accident data from Catalonia, providing a base for public policy research.
    - De Niro's Film Ratings: Analyze trends in film ratings over time with this entertainment-focused dataset.
    - Ford Escort Sales: Pre-owned vehicle sales data, perfect for regression analysis or price prediction models.
    - Old Faithful Geyser: Geological data for pattern recognition and prediction in natural phenomena.
    - Freshman Year Weights and BMIs: Dataset depicting weight and BMI changes for health and lifestyle studies.
    - Grades: Education performance data which can be correlated with demographics or study patterns.
    - Home Sales: A dataset reflecting the housing market dynamics, useful for economic analysis or real estate appraisal.
    - Hooke's Law Demonstration: Physics data illustrating the classic principle of elasticity in springs.
    - Hurricanes and Storm Data: Climate data on hurricane and storm frequency for environmental risk assessments.
    - Height and Weight Measurements: Public health research dataset on anthropometric data.
    - Lead Shot Specs: Detailed engineering data for material sciences and manufacturing studies.
    - Alphabet Letter Frequency: Text analysis dataset for frequency distribution studies in large text samples.
    - MLB Player Statistics: Comprehensive athletic data set for analysis of performance metrics in sports.
    - MLB Teams' Seasonal Performance: A dataset combining financial and sports performance data from the 2012 MLB season.
    - TV News Viewership: Media consumption data which can be used to analyze viewing patterns and trends.
    - Historical Nile Flood Data: A unique environmental dataset for historical trend analysis in flood levels.
    - Oscar Winner Ages: A dataset to explore age trends among Oscar-winning actors and actresses.
    - Snakes and Ladders Statistics: Data from the game outcomes useful in studying probability and game theory.
    - Tallahassee Cab Fares: Price modeling data from the real-world pricing of taxi services.
    - Taxable Goods Data: A snapshot of economic data concerning taxation impact on prices.
    - Tree Measurements: Ecological and environmental science data related to tree growth and forest management.
    - Real Estate Prices from Zillow: Market analysis dataset for those interested in housing price determinants.

    Format

    The enclosed data respect the comma-separated values (CSV) file format standards, ensuring compatibility with most data processing libraries in Python, R, and other languages. The datasets are ready for import into Jupyter notebooks, RStudio, or any other integrated development environment (IDE) used for data science.

    Quality Assurance

    The data is pre-checked for common issues such as missing values, duplicate records, and inconsistent entries, offering a clean and reliable dataset for various analytical exercises. With initial header lines in some CSV files, users can easily identify dataset fields and start their analysis without additional data cleaning for headers.

    Acknowledgements

    The dataset adheres to the GNU LGPL license, making it freely available for modification and distribution, provided that the original source is cited. This opens up possibilities for educators to integrate real-world data into curricula, researchers to validate models against diverse datasets, and practitioners to refine their analytical skills with hands-on data.

    This dataset has been compiled from https://people.sc.fsu.edu/~jburkardt/data/csv/csv.html, with gratitude to the authors and maintainers for their dedication to providing open data resources for educational and research purposes.
