ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
Database of IPv4 address networks with their respective geographical location.
Based on GeoLite2 Country Free Downloadable Databases as of Apr 21, 2015 http://dev.maxmind.com/geoip/geoip2/geolite2/...
This IP2Location IP Geolocation LITE database in BIN format provides a solution to determine the country of origin for any IP address for IPv4.
This dataset is a database of IPv4 address networks with their respective geographical location.
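As a rough illustration of how a network-to-country table of this kind is typically queried, the sketch below uses Python's standard `ipaddress` module. The sample rows, the country codes attached to them, and the function names `build_index` and `lookup_country` are assumptions made for the example; they are not taken from the dataset, whose file layout is not described here.

```python
import ipaddress

# Hypothetical sample rows (CIDR network, ISO country code). The networks are
# reserved documentation ranges and the country codes are made up.
SAMPLE_NETWORKS = [
    ("192.0.2.0/24", "US"),
    ("198.51.100.0/24", "DE"),
    ("203.0.113.0/24", "AU"),
]


def build_index(rows):
    """Parse CIDR strings into network objects once, up front."""
    return [(ipaddress.ip_network(cidr), country) for cidr, country in rows]


def lookup_country(index, address):
    """Return the country of the most specific matching network, or None."""
    ip = ipaddress.ip_address(address)
    best = None
    for network, country in index:
        if ip in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, country)
    return best[1] if best else None


index = build_index(SAMPLE_NETWORKS)
print(lookup_country(index, "198.51.100.17"))
print(lookup_country(index, "10.1.2.3"))
```

A linear scan is fine for a handful of rows; for a full database, a sorted list with binary search or a prefix trie would likely be a better fit.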
Start.io's mobile IP database is one of the largest and most comprehensive out there. Used by some of the largest location and device-graph companies in the world, this data is linked with MAIDs and timestamps, offering insights into billions of devices and events.
Use cases:
- Device graph enrichment
- Fraud detection
- Geolocation services
- Customer journey mapping
- Ad targeting
https://www.caida.org/about/legal/aua/public_aua/
https://www.caida.org/about/legal/aua/
A collection of router interface IP addresses geolocated to the city level: 11,857 IP addresses geolocated based on DNS names and 4,838 IP addresses geolocated based on RTT proximity to RIPE Atlas probes. The DNS-based data was created on May 15, 2016. The RTT-proximity data was created from measurements collected on May 25, 2016. The total number of addresses in the dataset is 16,586 (109 addresses were found to be common between the two sources of data, with very similar locations). Data supplement for the paper M. Gharaibeh, A. Shah, B. Huffaker, H. Zhang, R. Ensafi, and C. Papadopoulos, A Look at Router Geolocation in Public and Commercial Databases, Proc. Internet Measurement Conference (IMC), Nov 2017.
Comprehensive dataset of Telegram users' geolocations with IP addresses, fully consented, comprising 50,000 records. Ideal for AI, ML, DL, and LLM training, this dataset provides detailed geospatial insights across various regions, enhancing geofencing, localization, and behavioral analysis models.
IP Geolocation dataset contains the location information of IP addresses of 9 /8s (1 2 12 14 24 31 121 192 196) and also the raw probing data.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Login Data Set for Risk-Based Authentication
Synthesized login feature data of >33M login attempts and >3.3M users on a large-scale online service in Norway. Original data collected between February 2020 and February 2021.
This data set aims to foster research and development for Risk-Based Authentication (RBA) systems. The data was synthesized from the real-world login behavior of more than 3.3M users at a large-scale single sign-on (SSO) online service in Norway.
The users used this SSO to access sensitive data provided by the online service, e.g., cloud storage and billing information. We used this data set to study how the Freeman et al. (2016) RBA model behaves on a large-scale online service in the real world (see Publication). The synthesized data set can reproduce the results obtained on the original data set (see Study Reproduction). Beyond that, you can use this data set to evaluate and improve RBA algorithms under real-world conditions.
WARNING: The feature values are plausible, but still totally artificial. Therefore, you should NOT use this data set in productive systems, e.g., intrusion detection systems.
Overview
The data set contains the following features related to each login attempt on the SSO:
Feature | Data Type | Description | Range or Example |
---|---|---|---|
IP Address | String | IP address belonging to the login attempt | 0.0.0.0 - 255.255.255.255 |
Country | String | Country derived from the IP address | US |
Region | String | Region derived from the IP address | New York |
City | String | City derived from the IP address | Rochester |
ASN | Integer | Autonomous system number derived from the IP address | 0 - 600000 |
User Agent String | String | User agent string submitted by the client | Mozilla/5.0 (Windows NT 10.0; Win64; ... |
OS Name and Version | String | Operating system name and version derived from the user agent string | Windows 10 |
Browser Name and Version | String | Browser name and version derived from the user agent string | Chrome 70.0.3538 |
Device Type | String | Device type derived from the user agent string | (mobile, desktop, tablet, bot, unknown)¹ |
User ID | Integer | Identification number related to the affected user account | [Random pseudonym] |
Login Timestamp | Integer | Timestamp related to the login attempt | [64 Bit timestamp] |
Round-Trip Time (RTT) [ms] | Integer | Server-side measured latency between client and server | 1 - 8600000 |
Login Successful | Boolean | True: login was successful; False: login failed | (true, false) |
Is Attack IP | Boolean | IP address was found in known attacker data set | (true, false) |
Is Account Takeover | Boolean | Login attempt was identified as account takeover by incident response team of the online service | (true, false) |
Data Creation
As the data set targets RBA systems, especially the Freeman et al. (2016) model, the statistical feature probabilities between all users, globally and locally, are identical to those in the original data set for the categorical data. All other data was randomly generated while maintaining the logical relations and temporal order between the features.
The timestamps, however, are not identical and contain randomness. The feature values related to IP address and user agent string were randomly generated from publicly available data, so they were very likely not present in the real data set. The RTTs resemble real values but were randomly assigned among users per geolocation. Therefore, the RTT entries were probably in other positions in the original data set.
The country was randomly assigned per unique feature value. Based on that, we randomly assigned an ASN related to the country and generated the IP addresses for this ASN. For privacy reasons, the cities and regions were derived from the generated IP addresses and do not reflect the real logical relations from the original data set.
The device types are identical to those in the real data set. Based on the device type, we randomly assigned the OS, and based on the OS, the browser information. From this information, we randomly generated the user agent string. Therefore, all the logical relations regarding the user agent are identical to those in the real data set.
The RTT was randomly drawn based on the login success status and the synthesized geolocation data. We did this to ensure that the RTTs are realistic.
Regarding the Data Values
Due to unresolvable conflicts during the data creation, we had to assign some unrealistic IP addresses and ASNs that do not exist in the real world. Nevertheless, these do not have any effect on the risk scores generated by the Freeman et al. (2016) model.
You can recognize them by the following values:
ASNs with values >= 500000
IP addresses in the range 10.0.0.0 - 10.255.255.255 (10.0.0.0/8 CIDR range)
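The two criteria above can be checked with a few lines of Python. This is a minimal sketch, not part of the data set's tooling; the function name `is_placeholder` and the constant names are assumptions chosen for the example.

```python
import ipaddress

# Thresholds taken from the two criteria listed above.
PLACEHOLDER_NETWORK = ipaddress.ip_network("10.0.0.0/8")
PLACEHOLDER_ASN_MIN = 500_000


def is_placeholder(ip_address: str, asn: int) -> bool:
    """Return True if the IP address or the ASN is one of the unrealistic values."""
    in_placeholder_range = ipaddress.ip_address(ip_address) in PLACEHOLDER_NETWORK
    return in_placeholder_range or asn >= PLACEHOLDER_ASN_MIN


print(is_placeholder("10.20.30.40", 29695))
print(is_placeholder("81.167.144.58", 29695))
print(is_placeholder("81.167.144.58", 512345))
```

Rows flagged this way can be filtered out or kept, depending on whether the analysis relies on real-world IP or ASN metadata.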
Study Reproduction
Based on our evaluation, this data set can reproduce our study results regarding the RBA behavior of an RBA model using the IP address (IP address, country, and ASN) and user agent string (Full string, OS name and version, browser name and version, device type) as features.
The calculated RTT significances for countries and regions inside Norway are not identical using this data set, but have similar tendencies. The same is true for the Median RTTs per country. This is due to the fact that the available number of entries per country, region, and city changed with the data creation procedure. However, the RTTs still reflect the real-world distributions of different geolocations by city.
See RESULTS.md for more details.
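To compare median RTTs per country on your own copy of the data, a grouping step along the following lines may be enough. This is only a sketch: the input is assumed to be (country, RTT in ms) pairs already extracted from the data set, and the sample values are invented.

```python
from collections import defaultdict
from statistics import median


def median_rtt_by_country(rows):
    """Group (country, rtt_ms) pairs by country and return the median RTT of each group."""
    grouped = defaultdict(list)
    for country, rtt_ms in rows:
        grouped[country].append(rtt_ms)
    return {country: median(values) for country, values in grouped.items()}


# Invented sample values, for illustration only.
sample_rows = [("NO", 20), ("NO", 40), ("NO", 30), ("US", 150), ("US", 170)]
medians = median_rtt_by_country(sample_rows)
print(medians)
```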
Ethics
By using the SSO service, the users agreed to the data collection and evaluation for research purposes. For study reproduction and fostering RBA research, we agreed with the data owner to create a synthesized data set that does not allow re-identification of customers.
The synthesized data set does not contain any sensitive data values, as the IP addresses, browser identifiers, login timestamps, and RTTs were randomly generated and assigned.
Publication
You can find more details on our conducted study in the following journal article:
Pump Up Password Security! Evaluating and Enhancing Risk-Based Authentication on a Real-World Large-Scale Online Service (2022)
Stephan Wiefling, Paul René Jørgensen, Sigurd Thunem, and Luigi Lo Iacono.
ACM Transactions on Privacy and Security
Bibtex
@article{Wiefling_Pump_2022,
  author = {Wiefling, Stephan and Jørgensen, Paul René and Thunem, Sigurd and Lo Iacono, Luigi},
  title = {Pump {Up} {Password} {Security}! {Evaluating} and {Enhancing} {Risk}-{Based} {Authentication} on a {Real}-{World} {Large}-{Scale} {Online} {Service}},
  journal = {{ACM} {Transactions} on {Privacy} and {Security}},
  doi = {10.1145/3546069},
  publisher = {ACM},
  year = {2022}
}
License
This data set and the contents of this repository are licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. See the LICENSE file for details. If the data set is used within a publication, the following journal article has to be cited as the source of the data set:
Stephan Wiefling, Paul René Jørgensen, Sigurd Thunem, and Luigi Lo Iacono: Pump Up Password Security! Evaluating and Enhancing Risk-Based Authentication on a Real-World Large-Scale Online Service. In: ACM Transactions on Privacy and Security (2022). doi: 10.1145/3546069
¹ A few (invalid) user agent strings from the original data set could not be parsed, so their device type is empty. Perhaps this parse error is useful information for your studies, so we kept these 1526 entries.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains all data collected by the CyberLab honeynet experiment, from May 2019 to February 2020.
The experiment was based on the Cowrie honeypot (https://github.com/cowrie/cowrie, versions 1.6.0 and 2.0.2, see below for the timeline) deployed on approximately 50 nodes at different EU and US universities and companies. This number has varied throughout the duration of the experiment due to scaling efforts and the target node availability. All public IP addresses in the dataset are pseudonymized to protect the identity of the destination nodes.
Each file in the dataset is a daily compilation of all connections starting at midnight on that date (date in filename, midnight in UTC), grouped into "attack sessions". Each event in such a session includes all the data reported by the honeypot software (https://github.com/cowrie/cowrie). The honeypot operated in its default (low-interaction) mode using version 1.6.0 from the start of the experiment until November 8, 2019; after that date, we upgraded to Cowrie version 2.0.2, which allowed us to back it with a pool of real Linux instances to provide a more convincing high-interaction mode. Results from high-interaction mode are tagged with "sensor:ubuntu_basic_pool".
Geolocation data was added to Cowrie output messages based on the source IP address.
Field | Description |
---|---|
session_id | Unique ID of the session |
dst_ip_identifier | Pseudonymized dst public IPv4 of the honeypot node |
dst_host_identifier | Obfuscated (pseudonymized) name of the honeypot node |
src_ip_identifier | Obfuscated (pseudonymized) IP address of the attacker |
eventid | Event id of the session in the cowrie honeypot |
timestamp | UTC time of the event |
message | Message of the Cowrie honeypot |
protocol | Protocol used in the cowrie honeypot; either ssh or telnet |
geolocation_data/postal_code | Source IP postal code (as determined by logstash) |
geolocation_data/continent_code | Source IP continent code (as determined by logstash) |
geolocation_data/country_code3 | Source IP country code3 (as determined by logstash) |
geolocation_data/region_name | Source IP region name (as determined by logstash) |
geolocation_data/latitude | Source IP latitude (as determined by logstash) |
geolocation_data/longitude | Source IP longitude (as determined by logstash) |
geolocation_data/country_name | Source IP full country name (as determined by logstash) |
geolocation_data/timezone | Source IP timezone |
geolocation_data/country_code2 | Source IP country code2 |
geolocation_data/region_code | Source IP region code |
geolocation_data/city_name | Source IP city name |
src_port | Source TCP port |
sensor | Sensor name; serves to identify our experiment config |
arch | Represents the CPU/OS architecture emulated by cowrie |
duration | Session duration in seconds |
ssh_client_version | Attacker's SSH client version |
username | Login username; only used for login events |
password | Password; only used for login events |
macCS | HMAC algorithms supported by the client |
encCS | Encryption algorithms supported by the client |
kexAlgs | Key exchange algorithms supported by the client |
keyAlgs | Public key algorithms supported by the client |
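A sketch of how the fields above might be used to regroup events by session and to separate high-interaction sessions. It assumes the events have already been parsed into Python dictionaries keyed by the field names listed above, and that the `sensor` field of high-interaction events holds exactly the string `ubuntu_basic_pool`; the on-disk file format and the sample values are not taken from the dataset.

```python
from collections import defaultdict

# Assumed value of the "sensor" field for high-interaction events.
HIGH_INTERACTION_SENSOR = "ubuntu_basic_pool"


def group_sessions(events):
    """Group already-parsed event dicts by their session_id."""
    sessions = defaultdict(list)
    for event in events:
        sessions[event["session_id"]].append(event)
    return dict(sessions)


def is_high_interaction(session_events):
    """A session counts as high-interaction if any of its events carries the sensor tag."""
    return any(e.get("sensor") == HIGH_INTERACTION_SENSOR for e in session_events)


# Invented sample events, for illustration only.
sample_events = [
    {"session_id": "a1", "eventid": "cowrie.session.connect", "sensor": "ubuntu_basic_pool"},
    {"session_id": "a1", "eventid": "cowrie.login.failed", "sensor": "ubuntu_basic_pool"},
    {"session_id": "b2", "eventid": "cowrie.session.connect", "sensor": "default"},
]

sessions = group_sessions(sample_events)
high = sorted(sid for sid, evs in sessions.items() if is_high_interaction(evs))
print(high)
```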
A more detailed description of the fields (with examples) and all subsequent data (after February 2020) can be found at cyber.ltfe.org.
Irys specializes in collecting and curating high-quality GPS signals from millions of connected devices worldwide. Our Mobile Location Data insights are sourced through partnerships with tier-1 app developers and a unique data collection method. The low-latency delivery ensures real-time insights, setting us apart and providing unparalleled benefits and use cases for Location Data, Places Data, Mobility Data, and IP Address Data.
Our commitment to privacy compliance is unwavering. Clear and compliant privacy notices accompany our data collection process. Opt-in/out management empowers users over data distribution.
Discover the precision of our Mobile Location Data insights with Irys – where quality meets innovation.
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
The GNSS (Global Navigation Satellite System), or satellite positioning system, includes all satellite navigation systems. It allows you to determine your location anywhere in the country. Theoretical GNSS specifications estimate the accuracy of a position obtained from a single receiver at approximately 15 meters in planimetry and 25 meters in altimetry. By combining the data with that of another receiver placed on a known geodetic point, the accuracy of the obtained position can range from a few centimeters to a few meters, depending on the type of receiver used.
To increase accuracy, the Government of Quebec records data continuously through a network of 18 GNSS stations. These stations are located on geodetic points that are free of any obstacles and capture data from the GPS and GLONASS constellations. Some of these stations also receive signals from the Galileo constellation. This data is available in the standard exchange format Receiver Independent Exchange Format (RINEX), version 2.11, which is recognized by the majority of GNSS data processing software. The data is accessible on the FTP server of the MRNF or through the interactive map of the geodetic network. Note that only data from the last 366 days is kept. The structure of the directories and files on the FTP server, as well as the coordinates of the stations, are presented in the document GNSS sensor stations.
Status of GNSS stations
You can consult the status of the stations in the document Status of GNSS stations. It indicates whether a station is in service or out of service, and whether equipment maintenance is planned.
GNSS in real time by cellular telephony
The government also offers GNSS data by cellular telephony, which allows centimeter-level positioning work to be carried out in real time. With a single multi-frequency GNSS receiver equipped with a cellular modem, users of georeferenced data can survey or stake out any physical detail with an accuracy of a few centimeters in the NAD 83 (CSRS) reference system (epoch 1997.0). The signal that contains this data is available to everyone. The range depends on cellular coverage, ionospheric conditions, and especially on the instruments used. For more information on using GNSS in real time, see the document Guidelines for GNSS RTK/RTN Surveys in Canada.
Technical details
The GNSS data and the station's NAD 83 (CSRS) coordinates (epoch 1997.0) are transmitted by cellular telephony from an IP address on the Internet. Each station transmits its data in one of the following two formats: CMR+ or RTCM V3.2. The document GNSS capture stations lists, for each city, the IP address for the CMR+ or RTCM V3.2 format as well as the antenna model. Note that the data is not broadcast using the Networked Transport of RTCM via Internet Protocol (NTRIP).
This third party metadata element was translated using an automated translation tool (Amazon Translate).
IP Geolocation dataset contains the location information of IP addresses of 13 /8s (166 167 168 169 172 179 210 212 213 220 221 222 223) and also the raw probing data.
This view redacts sensitive information including respondent IP Address, Contact and Company/Business Name, Address, Phone, Email and Geolocation. Results from the City of Mesa's COVID-19 Business Impact Survey administered by Economic Development Department March 30 – May 1, 2020. This was a voluntary survey.
Quadrant provides insightful, accurate, and reliable mobile location data.
Our privacy-first mobile location data unveils hidden patterns and opportunities, provides actionable insights, and fuels data-driven decision-making at the world's biggest companies.
These companies rely on our privacy-first Mobile Location and Points-of-Interest Data to unveil hidden patterns and opportunities, provide actionable insights, and fuel data-driven decision-making. They build better AI models, uncover business insights, and enable location-based services using our robust and reliable real-world data.
We conduct stringent evaluations of data providers to ensure authenticity and quality. Our proprietary algorithms detect and cleanse corrupted and duplicated data points, allowing you to leverage our datasets rapidly with minimal processing or cleaning. During ingestion, our proprietary Data Filtering Algorithms remove events based on a number of qualitative factors, as well as latency and other integrity variables, to provide more efficient data delivery. The deduplicating algorithm focuses on a combination of four attributes: Device ID, Latitude, Longitude, and Timestamp. It identifies rows that contain the same combination of these four attributes, retains a single copy, and eliminates the duplicates so that customers receive only complete and unique datasets.
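The deduplication rule described above (one row kept per unique combination of Device ID, Latitude, Longitude, and Timestamp) can be sketched as follows. This is not Quadrant's implementation, which is proprietary; the column names and sample rows are assumptions, and the sketch simply keeps the first occurrence of each key.

```python
def deduplicate(rows):
    """Keep the first row for each (device_id, latitude, longitude, timestamp) key."""
    seen = set()
    unique = []
    for row in rows:
        # Hypothetical column names; adapt to the actual schema.
        key = (row["device_id"], row["latitude"], row["longitude"], row["timestamp"])
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Invented sample rows, for illustration only.
sample_rows = [
    {"device_id": "d1", "latitude": 1.30, "longitude": 103.80, "timestamp": 1700000000, "ip": "192.0.2.1"},
    {"device_id": "d1", "latitude": 1.30, "longitude": 103.80, "timestamp": 1700000000, "ip": "192.0.2.9"},
    {"device_id": "d1", "latitude": 1.30, "longitude": 103.80, "timestamp": 1700000060, "ip": "192.0.2.1"},
]

unique_rows = deduplicate(sample_rows)
print(len(unique_rows))
```

Note that attributes outside the key (here, the IP address) do not prevent two rows from being treated as duplicates.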
We actively identify overlapping values at the provider level to determine the value each offers. Our data science team has developed a sophisticated overlap analysis model that helps us maintain a high-quality data feed by qualifying providers based on unique data values rather than volumes alone – measures that provide significant benefit to our end-use partners.
Quadrant mobility data contains all standard attributes such as Device ID, Latitude, Longitude, Timestamp, Horizontal Accuracy, and IP Address, and non-standard attributes such as Geohash and H3. In addition, we have historical data available back through 2022.
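For readers unfamiliar with the Geohash attribute, the sketch below shows the standard public Geohash encoding, which interleaves longitude and latitude bisection bits and maps each group of five bits to a base-32 character. It illustrates the general scheme only; the precision Quadrant uses is not stated here, and the function name `geohash_encode` is chosen for the example.

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 9) -> str:
    """Encode a latitude/longitude pair as a Geohash string of the given length."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


print(geohash_encode(57.64911, 10.40744, precision=5))  # prints "u4pru"
```

Nearby points usually share a common prefix, which is why the attribute is convenient for coarse spatial grouping.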
Through our in-house data science team, we offer sophisticated technical documentation, location data algorithms, and queries that help data buyers get a head start on their analyses. Our goal is to provide you with data that is “fit for purpose”.
https://www.wiseguyreports.com/pages/privacy-policy
BASE YEAR | 2024 |
HISTORICAL DATA | 2019 - 2024 |
REPORT COVERAGE | Revenue Forecast, Competitive Landscape, Growth Factors, and Trends |
MARKET SIZE 2023 | 50.35 (USD Billion) |
MARKET SIZE 2024 | 56.57 (USD Billion) |
MARKET SIZE 2032 | 143.6 (USD Billion) |
SEGMENTS COVERED | Ad Format, Location Data Source, Targeting Options, Device Type, Industry Vertical, Regional |
COUNTRIES COVERED | North America, Europe, APAC, South America, MEA |
KEY MARKET DYNAMICS | Increased smartphone penetration; growth of mobile commerce; advancement in location-based technologies; rising consumer demand for personalized advertising; increasing adoption of AI/ML in ad targeting |
MARKET FORECAST UNITS | USD Billion |
KEY COMPANIES PROFILED | Waze, Foursquare, Amazon, Snapchat, LinkedIn, Pinterest, Twitter, Google, Uber, Lyft, Facebook, Microsoft, Yelp, Baidu, Apple |
MARKET FORECAST PERIOD | 2025 - 2032 |
KEY MARKET OPPORTUNITIES | Increased smartphone penetration; growing popularity of location-based services; advancements in artificial intelligence and machine learning; increased adoption of mobile payments; rise of personalized marketing |
COMPOUND ANNUAL GROWTH RATE (CAGR) | 12.35% (2025 - 2032) |
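The stated growth rate can be cross-checked against the market sizes in the table. The sketch below assumes the 2024 market size is the starting value and that growth compounds over the eight years to 2032; the report does not spell out its method, so this is only a consistency check.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1 / years) - 1


# Market sizes in USD billion, taken from the table above.
rate = cagr(start_value=56.57, end_value=143.6, years=2032 - 2024)
print(f"{rate * 100:.2f}%")
```

Under these assumptions the result agrees with the 12.35% figure to within rounding.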
Irys specializes in collecting and curating high-quality GPS signals from millions of connected devices worldwide. Our Mobility Data insights are sourced through partnerships with tier-1 app developers, providing unparalleled benefits and use cases for Transport and Logistic Data, Mobile Location Data, Mobility Data, and IP Address Data.
Our commitment to privacy compliance is unwavering. All data is collected with clear privacy notices, and our opt-in/out management ensures transparent control over data collection, use, and distribution.
Discover the precision of our Mobility Data insights with Irys – where accuracy meets innovation.
This dataset is aggregated from the unidirectional unsolicited IPv4 traffic reaching the UCSD Network Telescope. From the raw traffic data, we extract the backscatter (response) packets sent by victims of randomly and uniformly spoofed DoS attacks, summarize activity that relates to the same victim in an 'attack vector', and produce a single CSV file of attack vectors per day.
The attack vector consists of a target IP address, statistical information about the attack, and geolocation and BGP routing metadata for that IP address. Possible uses of this data include: studying and modeling DoS attacks and characterizing victim populations.
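The aggregation idea (many backscatter packets summarized into one record per victim) might look roughly like the sketch below. The input shape, the statistics kept, and the function name `summarize_attack_vectors` are assumptions for illustration; the actual CAIDA pipeline records additional statistics plus geolocation and BGP metadata, and its CSV schema is not reproduced here.

```python
from collections import defaultdict


def summarize_attack_vectors(packets):
    """Summarize (victim_ip, unix_timestamp) backscatter packets per victim."""
    vectors = defaultdict(lambda: {"packets": 0, "first_seen": None, "last_seen": None})
    for victim_ip, ts in packets:
        vector = vectors[victim_ip]
        vector["packets"] += 1
        vector["first_seen"] = ts if vector["first_seen"] is None else min(vector["first_seen"], ts)
        vector["last_seen"] = ts if vector["last_seen"] is None else max(vector["last_seen"], ts)
    return dict(vectors)


# Invented sample packets, for illustration only.
sample_packets = [
    ("198.51.100.7", 1000),
    ("203.0.113.5", 1005),
    ("198.51.100.7", 1030),
    ("198.51.100.7", 990),
]

vectors = summarize_attack_vectors(sample_packets)
print(vectors["198.51.100.7"])
```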
For more information see http://www.caida.org/data/passive/rsdos-targets/
Leverage the most reliable and compliant mobile device location/foot traffic dataset on the market!
Veraset Movement (GPS Mobility Data) offers unparalleled insights into footfall traffic patterns across nearly four dozen countries in Africa.
Covering 46+ countries, Veraset's Mobility Data draws on raw GPS data from tier-1 apps, SDKs, and aggregators of mobile devices to provide customers with accurate, up-to-the-minute information on human movement.
Ideal for ad tech, planning, retail, and transportation logistics, Veraset's Movement data (Mobility data) helps shape strategy and make impactful data-driven decisions.
Veraset’s Africa Movement Panel includes the following countries: Algeria (DZ), Angola (AO), Benin (BJ), Botswana (BW), Burkina Faso (BF), Burundi (BI), Cameroon (CM), Central African Republic (CF), Chad (TD), Comoros (KM), Congo-Brazzaville (CG), Congo-Kinshasa (CD), Djibouti (DJ), Egypt (EG), Eritrea (ER), Ethiopia (ET), Gabon (GA), Gambia (GM), Ghana (GH), Guinea-Bissau (GW), Kenya (KE), Lesotho (LS), Liberia (LR), Libya (LY), Madagascar (MG), Malawi (MW), Mali (ML), Mauritius (MU), Morocco (MA), Mozambique (MZ), Namibia (NA), Nigeria (NG), Rwanda (RW), Senegal (SN), Seychelles (SC), Sierra Leone (SL), Somalia (SO), South Africa (ZA), South Sudan (SS), Tanzania (TZ), Togo (TG), Tunisia (TN), Uganda (UG), Zambia (ZM), Zimbabwe (ZW).
Companies use Veraset's Mobility Data for:
- Advertising
- Ad Placement, Attribution, and Segmentation
- Audience Creation/Building
- Dynamic Ad Targeting
- Infrastructure Plans
- Route Optimization
- Public Transit Optimization
- Credit Card Loyalty
- Competitive Analysis
- Risk Assessment, Underwriting, and Policy Personalization
- Enrichment of Existing Datasets
- Trade Area Analysis
- Predictive Analytics and Trend Forecasting