Our People data is gathered and aggregated via surveys, digital services, and public data sources. We use powerful profiling algorithms to collect and ingest only fresh and reliable data points.
Our comprehensive data enrichment solution includes a variety of data sets that can help you address gaps in your customer data, gain a deeper understanding of your customers, and power superior client experiences.
People Data Schema & Reach: Our data reach represents the total counts available within various categories and comprises attributes such as country location, MAU, DAU & Monthly Location Pings.
Data Export Methodology: Since we collect data dynamically, we provide the most updated data and insights via the best-suited method at a suitable interval (daily/weekly/monthly).
People Data Use Cases:
360-Degree Customer View: Get a comprehensive picture of customers by aggregating internal and external data.
Data Enrichment: Leverage online-to-offline consumer profiles to build holistic audience segments and improve campaign targeting.
Fraud Detection: Use multiple digital (web and mobile) identities to verify real users and detect anomalies or fraudulent activity.
Advertising & Marketing: Understand audience demographics, interests, lifestyle, hobbies, and behaviors to build targeted marketing campaigns.
Here's the schema of People Data:
person_id
first_name
last_name
age
gender
linkedin_url
twitter_url
facebook_url
city
state
address
zip
zip4
country
delivery_point_bar_code
carrier_route
walk_sequence_code
fips_state_code
fips_country_code
country_name
latitude
longitude
address_type
metropolitan_statistical_area
core_based_statistical_area
census_tract
census_block_group
census_block
primary_address
pre_address
street
post_address
address_suffix
address_secondline
address_abrev
census_median_home_value
home_market_value
property_build_year
property_with_ac
property_with_pool
property_with_water
property_with_sewer
general_home_value
property_fuel_type
year
month
household_id
census_median_household_income
household_size
marital_status
length_of_residence
number_of_kids
pre_school_kids
single_parents
working_women_in_house_hold
homeowner
children
adults
generations
net_worth
education_level
occupation
education_history
credit_lines
credit_card_user
newly_issued_credit_card_user
credit_range_new
credit_cards
loan_to_value
mortgage_loan2_amount
mortgage_loan_type
mortgage_loan2_type
mortgage_lender_code
mortgage_loan2_lender_code
mortgage_lender
mortgage_loan2_lender
mortgage_loan2_ratetype
mortgage_rate
mortgage_loan2_rate
donor
investor
interest
buyer
hobby
personal_email
work_email
devices
phone
employee_title
employee_department
employee_job_function
skills
recent_job_change
company_id
company_name
company_description
technologies_used
office_address
office_city
office_country
office_state
office_zip5
office_zip4
office_carrier_route
office_latitude
office_longitude
office_cbsa_code
office_census_block_group
office_census_tract
office_county_code
company_phone
company_credit_score
company_csa_code
company_dpbc
company_franchiseflag
company_facebookurl
company_linkedinurl
company_twitterurl
company_website
company_fortune_rank
company_government_type
company_headquarters_branch
company_home_business
company_industry
company_num_pcs_used
company_num_employees
company_firm_individual
company_msa
company_msa_name
company_naics_code
company_naics_description
company_naics_code2
company_naics_description2
company_sic_code2
company_sic_code2_description
company_sic_code4
company_sic_code4_description
company_sic_code6
company_sic_code6_description
company_sic_code8
company_sic_code8_description
company_parent_company
company_parent_company_location
company_public_private
company_subsidiary_company
company_residential_business_code
company_revenue_at_side_code
company_revenue_range
company_revenue
company_sales_volume
company_small_business
company_stock_ticker
company_year_founded
company_minorityowned
company_female_owned_or_operated
company_franchise_code
company_dma
company_dma_name
company_hq_address
company_hq_city
company_hq_duns
company_hq_state
company_hq_zip5
company_hq_zip4
company_se...
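For illustration, a single exported People Data record might look like the hedged sketch below. Only a handful of the attributes listed above are shown, and all values are invented; this is not an actual export sample.

# Hypothetical People Data record (field names taken from the schema above, values invented).
sample_person = {
    "person_id": "p_000123",
    "first_name": "Jane",
    "last_name": "Doe",
    "age": 42,
    "gender": "F",
    "city": "Austin",
    "state": "TX",
    "zip": "78701",
    "country": "US",
    "personal_email": "jane.doe@example.com",
    "company_name": "Example Corp",
    "company_naics_code": "541511",
}

def has_work_profile(record: dict) -> bool:
    """Return True if the record carries any populated company-level attributes."""
    return any(key.startswith("company_") and record.get(key) for key in record)

print(has_work_profile(sample_person))  # True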
Mobility/Location data is gathered from location-aware mobile apps using an SDK-based implementation. All users explicitly consent to allow location data sharing using a clear opt-in process for our use cases and are given clear opt-out options. Factori ingests, cleans, validates, and exports all location data signals to ensure only the highest quality of data is made available for analysis.
Record Count: 90 Billion+
Capturing Frequency: Once per Event
Delivering Frequency: Once per Day
Updated: Daily
Mobility Data Reach: Our data reach represents the total number of counts available within various categories and comprises attributes such as country location, MAU, DAU & Monthly Location Pings.
Data Export Methodology: Since we collect data dynamically, we provide the most updated data and insights via a best-suited interval (daily/weekly/monthly/quarterly).
Use Cases:
Consumer Insight: Gain a comprehensive 360-degree perspective of the customer to spot behavioral changes, analyze trends, and predict business outcomes.
Market Intelligence: Study various market areas, the proximity of points of interest, and the competitive landscape.
Advertising: Create campaigns and customize your messaging depending on your target audience's online and offline activity.
Retail Analytics: Analyze footfall trends in various locations and gain an understanding of customer personas.
Here are the data attributes:
maid
latitude
longitude
horizontal_accuracy
timestamp
id_type
ipv4
ipv6
user_agent
country
state_hasc
city_hasc
postcode
geohash
hex8
hex9
carrier
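As a hedged illustration (attribute names from the list above, values invented), a single location ping could be represented and filtered by reported accuracy like this; the accuracy unit and timestamp format are assumptions, not documented specifications.

# Hypothetical location ping using the attribute names listed above (values invented).
ping = {
    "maid": "38400000-8cf0-11bd-b23e-10b96e40000d",
    "latitude": 40.7580,
    "longitude": -73.9855,
    "horizontal_accuracy": 12.0,   # metres (assumed unit)
    "timestamp": 1672531200,       # Unix epoch seconds (assumed format)
    "id_type": "aaid",
    "country": "US",
    "geohash": "dr5ru7",
    "carrier": "ExampleCell",
}

def keep_ping(p: dict, max_accuracy_m: float = 50.0) -> bool:
    """Keep only pings whose reported horizontal accuracy is within the threshold."""
    accuracy = p.get("horizontal_accuracy")
    return accuracy is not None and accuracy <= max_accuracy_m

print(keep_ping(ping))  # True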
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
In this project, we aim to annotate car images captured on highways. The annotated data will be used to train machine learning models for various computer vision tasks, such as object detection and classification.
For this project, we will be using Roboflow, a powerful platform for data annotation and preprocessing. Roboflow simplifies the annotation process and provides tools for data augmentation and transformation.
Roboflow offers data augmentation capabilities, such as rotation, flipping, and resizing. These augmentations can help improve the model's robustness.
Once the data is annotated and augmented, Roboflow allows us to export the dataset in various formats suitable for training machine learning models, such as YOLO, COCO, or TensorFlow Record.
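As a rough sketch of what a YOLO-format export contains (one text file per image; one object per line with a class index followed by normalized center x, center y, width, and height), a label file could be parsed as below. The file path and class list are hypothetical, not part of this dataset's documentation.

# Minimal parser for a YOLO-format label file (hypothetical path and class names).
CLASS_NAMES = ["car"]  # assumed single-class annotation for this project

def parse_yolo_labels(path: str):
    boxes = []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) != 5:
                continue  # skip blank or malformed lines
            class_idx, cx, cy, w, h = parts
            boxes.append({
                "class": CLASS_NAMES[int(class_idx)],
                "cx": float(cx), "cy": float(cy),  # normalized (0-1) box center
                "w": float(w), "h": float(h),      # normalized box width/height
            })
    return boxes

print(parse_yolo_labels("labels/highway_0001.txt"))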
By completing this project, we will have a well-annotated dataset ready for training machine learning models. This dataset can be used for a wide range of applications in computer vision, including car detection and tracking on highways.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This dataset contains Zenodo's published open access records and communities metadata, including entries marked by the Zenodo staff as spam and deleted.
The datasets are gzip-compressed JSON Lines files, where each line is a JSON object representing a Zenodo record or community.
Records dataset
Filename: zenodo_open_metadata_{ date of export }.jsonl.gz
Each object contains the terms: part_of, thesis, description, doi, meeting, imprint, references, recid, alternate_identifiers, resource_type, journal, related_identifiers, title, subjects, notes, creators, communities, access_right, keywords, contributors, publication_date
which correspond to the fields with the same name available in Zenodo's record JSON Schema at https://zenodo.org/schemas/records/record-v1.0.0.json.
In addition, some terms have been altered:
Communities dataset
Filename: zenodo_community_metadata_{ date of export }.jsonl.gz
Each object contains the terms: id, title, description, curation_policy, page
which correspond to the fields with the same name available in Zenodo's community creation form.
Notes for all datasets
For each object, the term spam contains a boolean value indicating whether a given record/community was marked as spam content by Zenodo staff.
Top-level terms that were missing in the metadata may contain a null value.
A smaller uncompressed random sample of 200 JSON lines is also included for each dataset to test and get familiar with the format without having to download the entire dataset.
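A minimal way to stream one of the gzipped JSON Lines files and drop entries marked as spam might look like the sketch below; the date in the file name is a placeholder following the pattern above, not an actual export date.

import gzip
import json

# Stream the records dataset and keep only entries not flagged as spam.
path = "zenodo_open_metadata_2023-01-01.jsonl.gz"  # placeholder date

with gzip.open(path, mode="rt", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)
        if record.get("spam"):
            continue  # skip records marked as spam by Zenodo staff
        # ... process the record, e.g. collect titles
        print(record.get("title"))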
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This study introduces Popnet, a deep learning model for forecasting 1 km-gridded populations, integrating U-Net, ConvLSTM, a Spatial Autocorrelation module, and deep ensemble methods. Using spatial variables and population data from 2000 to 2020, Popnet predicts South Korea's population trends by age group (under 14, 15-64, over 65) up to 2040. In validation, it outperforms traditional machine learning and state-of-the-art computer vision models. The model's output reveals significant polarization: population growth in urban areas, especially the capital region, and severe depopulation in rural areas. Popnet is a robust tool for offering significant insights to policymakers and related stakeholders about the detailed future population, allowing them to establish detailed, localised planning and resource allocations.

*Due to the export restrictions on grid data imposed by the National Geographic Information Institute of Korea, the training data has been replaced with data from Tennessee. However, the Korean version of the future prediction data remains unchanged. Please take this into consideration.
License: MIT License, https://opensource.org/licenses/MIT
This dataset was collected by me as a means for training machine/deep learning models on EEG motor imagery classification. This was one of my roles as a machine learning engineer for the graduation project at the Faculty of Artificial Intelligence, Kafrelsheikh University. I am profoundly grateful for all the technical support and advice provided by the project's supervisor, Dr. Mona AlNaggar.
The goal is to perform motor imagery classification (left, right, relaxed) and translate JUST thoughts into action.
I used the Muse 2 headband with 4 electrodes and the Muse Monitor Android app to run the recording sessions and export the CSVs.
This high-dimensional temporal data was collected in both subject-independent and subject-dependent contexts with the help of 19 healthy subjects (12 males, 7 females) in different states, aged between 19 and 68, as training material for various deterministic and non-deterministic machine learning models carrying out the motor imagery classification task. 20 columns, covering 5 power bands (alpha, beta, theta, delta, gamma) for each of the 4 sensor electrodes, were of significance to the motor imagery classification. I didn't use the raw data; however, Muse Monitor also exports the raw data (4 columns only: AF7, AF8, TP9, TP10), and more insights can be extracted from it, so you can use that instead. Features like the gyroscope and accelerometer weren't an area of interest for this EEG brain analysis or motor imagery classification. Feature engineering techniques like PCA and ICA can be beneficial, especially for the raw data scenario.
Motor imagery is one class of event-related potentials: the imagination of motion, without performing any actual movement.
For the elements column, there are instances of blink and Markers 1, 2, and 3. An important assumption to mention about the data markers is that:
Marker 1 -> left motor imagery
Marker 2 -> right motor imagery
Marker 3 -> Relaxed state (which is an intermediate phase between right and left in the conducted motor imagery experiments for 19 subjects)
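Given that mapping, a hedged sketch of turning the elements column of a Muse Monitor CSV export into class labels could look like this. The column name, marker string format, and file name are assumptions based on the description above, not verified against the actual export.

import pandas as pd

# Map the experiment markers to motor imagery class labels (per the assumptions above).
MARKER_TO_LABEL = {
    "/Marker/1": "left",     # left motor imagery
    "/Marker/2": "right",    # right motor imagery
    "/Marker/3": "relaxed",  # relaxed / intermediate state
}

df = pd.read_csv("session_01.csv")  # hypothetical session export
# The elements column carries markers and blink events; keep only rows with known markers.
# Adjust the column name and marker strings to match your export.
df["label"] = df["Elements"].map(MARKER_TO_LABEL)
labelled = df.dropna(subset=["label"])
print(labelled["label"].value_counts())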
Data was later split into training, testing, and validation portions. The experiment involves several steps: fitting the Muse 2 headband to the subject's head; the experiment conductor carefully monitoring the Mind Monitor Android app and starting a recording session to collect the data; and placing objects to the right and left of the experiment subject. Usually, a relaxed state is maintained first, then left imagery, then relaxation, then right imagery, and so on.
Before heading into the experiment, subjects were trained using the official Muse app's meditative sessions, so we could be sure of their ability to maintain focus, especially in the intermediate step between left and right imagery.
Experiment setup (in comfortable, stably lit places): The volunteer begins by sitting down on a comfortable chair, their arms parallel and resting on a table. Prior to this, they are trained in meditation using the Muse app to help them achieve a relaxed state. Two cups of water are placed on either side of the participant, each 5 cm away from a hand and within their line of sight. The volunteer then sits comfortably, raises their chin slightly, and keeps their head steady to avoid noise in the EEG data. Their eyes rotate to the left or right, looking over the cup without moving their head.
The experiment conductor then instructs the participant to engage in motor imagery, imagining picking up the cup on their left with their left hand and drinking, but without any actual hand movement. This state of focus is maintained for a duration of 0.5, 1, 2, or 3 minutes, depending on the subject’s attention span. The same process is repeated for the cup on the right with the right hand.
If the experiment conductor notices a decline in the concentration level, indicating a lower attention span, they ask the volunteer to concentrate more on the motor imagery and apply a visual stimulus to the cup. These steps are repeated 2 to 3 times to capture clean and accurate EEG data while the volunteer is engaged in motor imagery tasks. It's a careful balance of physical stillness and mental activity.
Potential applications for the motor imagery classification: Translating thoughts into actions for controlling a game, helping the handicapped control their surroundings especially in a smart home environment.
The natural performance metric for such a motor imagery task would be the F1 score, which is the harmonic mean of precision and recall.
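Concretely, F1 = 2PR / (P + R) for precision P and recall R. A quick way to compute it, assuming predicted and true labels are already available (the labels below are purely illustrative), is:

from sklearn.metrics import f1_score

y_true = ["left", "right", "relaxed", "left", "right"]  # illustrative ground truth
y_pred = ["left", "right", "left",    "left", "right"]  # illustrative predictions

# Macro-averaged F1 treats the three classes (left, right, relaxed) equally.
print(f1_score(y_true, y_pred, average="macro"))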
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This Zenodo record contains two test datasets (Birds and Littorina) used in the paper:
PhenoLearn: A user-friendly Toolkit for Image Annotation and Deep Learning-Based Phenotyping for Biological Datasets
Authors: Yichen He, Christopher R. Cooney, Steve Maddock, Gavin H. Thomas
PhenoLearn is a graphical and script-based toolkit designed to help biologists annotate and analyse biological images using deep learning. This dataset includes two test cases: one of bird specimen images for semantic segmentation, and another of marine snail (Littorina) images for landmark detection. These datasets are used to demonstrate the PhenoLearn workflow in the accompanying paper.
Download the dataset folders.
Use PhenoLearn to load seg_train.csv (segmentation) or pts_train.csv (landmark) to view and edit annotations.
Train segmentation or landmark prediction models directly via PhenoLearn's training module, or export data for external tools.
Use name_file_pred to match predictions with ground truth for evaluation (a minimal sketch is given below).
See the full tutorial and usage guide at https://github.com/EchanHe/PhenoLearn.
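A minimal sketch of that prediction-matching step, assuming both the ground-truth and prediction CSVs share a file-name column; the column and prediction file names here are hypothetical, not PhenoLearn's documented output schema.

import pandas as pd

# Hypothetical column and file names; adjust to the actual CSV schema.
ground_truth = pd.read_csv("pts_train.csv")        # hand-placed landmark annotations
predictions = pd.read_csv("pts_predictions.csv")   # model output (hypothetical file)

# Join predictions to ground truth on the shared image file-name column.
matched = ground_truth.merge(predictions, on="file", suffixes=("_true", "_pred"))
print(f"{len(matched)} images matched for evaluation")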
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This dataset contains all recorded and hand-annotated data, all synthetically generated data, and representative trained networks used for detection and tracking experiments in the "replicAnt - generating annotated images of animals in complex environments using Unreal Engine" manuscript. Unless stated otherwise, all 3D animal models used in the synthetically generated data have been generated with the open-source photogrammetry platform scAnt (peerj.com/articles/11155/). All synthetic data has been generated with the associated replicAnt project available from https://github.com/evo-biomech/replicAnt.
Abstract:
Deep learning-based computer vision methods are transforming animal behavioural research. Transfer learning has enabled work in non-model species, but still requires hand-annotation of example footage, and is only performant in well-defined conditions. To overcome these limitations, we created replicAnt, a configurable pipeline implemented in Unreal Engine 5 and Python, designed to generate large and variable training datasets on consumer-grade hardware instead. replicAnt places 3D animal models into complex, procedurally generated environments, from which automatically annotated images can be exported. We demonstrate that synthetic data generated with replicAnt can significantly reduce the hand-annotation required to achieve benchmark performance in common applications such as animal detection, tracking, pose-estimation, and semantic segmentation; and that it increases the subject-specificity and domain-invariance of the trained networks, so conferring robustness. In some applications, replicAnt may even remove the need for hand-annotation altogether. It thus represents a significant step towards porting deep learning-based computer vision tools to the field.
Benchmark data
Two video datasets were curated to quantify detection performance; one in laboratory and one in field conditions. The laboratory dataset consists of top-down recordings of foraging trails of Atta vollenweideri (Forel 1893) leaf-cutter ants. The colony was collected in Uruguay in 2014, and housed in a climate chamber at 25°C and 60% humidity. A recording box was built from clear acrylic, and placed between the colony nest and a box external to the climate chamber, which functioned as feeding site. Bramble leaves were placed in the feeding area prior to each recording session, and ants had access to the recording area at will. The recorded area was 104 mm wide and 200 mm long. An OAK-D camera (OpenCV AI Kit: OAK-D, Luxonis Holding Corporation) was positioned centrally 195 mm above the ground. While keeping the camera position constant, lighting, exposure, and background conditions were varied to create recordings with variable appearance: The “base” case is an evenly lit and well exposed scene with scattered leaf fragments on an otherwise plain white backdrop. A “bright” and “dark” case are characterised by systematic over- or underexposure, respectively, which introduces motion blur, colour-clipped appendages, and extensive flickering and compression artefacts. In a separate well exposed recording, the clear acrylic backdrop was substituted with a printout of a highly textured forest ground to create a “noisy” case. Last, we decreased the camera distance to 100 mm at constant focal distance, effectively doubling the magnification, and yielding a “close” case, distinguished by out-of-focus workers. All recordings were captured at 25 frames per second (fps).
The field dataset consists of video recordings of Gnathamitermes sp. desert termites, filmed close to the nest entrance in the desert of Maricopa County, Arizona, using a Nikon D850 and a Nikkor 18-105 mm lens on a tripod at camera distances between 20 cm and 40 cm. All video recordings were well exposed, and captured at 23.976 fps.
Each video was trimmed to the first 1000 frames, and contains between 36 and 103 individuals. In total, 5000 and 1000 frames were hand-annotated for the laboratory and field datasets, respectively: each visible individual was assigned a constant-size bounding box, with a centre coinciding approximately with the geometric centre of the thorax in top-down view. The size of the bounding boxes was chosen such that they were large enough to completely enclose the largest individuals, and was automatically adjusted near the image borders. A custom-written Blender Add-on aided hand-annotation: the Add-on is a semi-automated multi-animal tracker, which leverages Blender’s internal contrast-based motion tracker, but also includes track refinement options and CSV export functionality. Comprehensive documentation of this tool and Jupyter notebooks for track visualisation and benchmarking are provided on the replicAnt and BlenderMotionExport GitHub repositories.
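As a hedged illustration of the annotation convention described above (constant-size boxes centred on each individual, adjusted so they stay within the image borders), and not the authors' actual tooling, the geometry could be sketched as:

def centered_box(cx: int, cy: int, box_size: int, img_w: int, img_h: int):
    """Return a constant-size box (x0, y0, x1, y1) centred on (cx, cy),
    clipped so it never extends past the image borders (one plausible
    interpretation of the automatic border adjustment)."""
    half = box_size // 2
    x0 = max(0, cx - half)
    y0 = max(0, cy - half)
    x1 = min(img_w, cx + half)
    y1 = min(img_h, cy + half)
    return x0, y0, x1, y1

# Example: a 64 px box near the right edge of a 1024x1024 frame gets clipped.
print(centered_box(1010, 500, 64, 1024, 1024))  # (978, 468, 1024, 532)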
Synthetic data generation
Two synthetic datasets, each with a population size of 100, were generated from 3D models of Atta vollenweideri leaf-cutter ants. All 3D models were created with the scAnt photogrammetry workflow. A “group” population was based on three distinct 3D models of an ant minor (1.1 mg), a media (9.8 mg), and a major (50.1 mg) (see 10.5281/zenodo.7849059). To approximately simulate the size distribution of A. vollenweideri colonies, these models make up 20%, 60%, and 20% of the simulated population, respectively. A 33% within-class scale variation, with default hue, contrast, and brightness subject material variation, was used. A “single” population was generated using the major model only, with 90% scale variation, but equal material variation settings.
A Gnathamitermes sp. synthetic dataset was generated from two hand-sculpted models; a worker and a soldier made up 80% and 20% of the simulated population of 100 individuals, respectively with default hue, contrast, and brightness subject material variation. Both 3D models were created in Blender v3.1, using reference photographs.
Each of the three synthetic datasets contains 10,000 images, rendered at a resolution of 1024 by 1024 px, using the default generator settings as documented in the Generator_example level file (see documentation on GitHub). To assess how the training dataset size affects performance, we trained networks on 100 (“small”), 1,000 (“medium”), and 10,000 (“large”) subsets of the “group” dataset. Generating 10,000 samples at the specified resolution took approximately 10 hours per dataset on a consumer-grade laptop (6 Core 4 GHz CPU, 16 GB RAM, RTX 2070 Super).
Additionally, five datasets which contain both real and synthetic images were curated. These “mixed” datasets combine image samples from the synthetic “group” dataset with image samples from the real “base” case. The ratio between real and synthetic images across the five datasets varied from 10/1 to 1/100.
Funding
This study received funding from Imperial College’s President’s PhD Scholarship (to Fabian Plum), and is part of a project that has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (Grant agreement No. 851705, to David Labonte). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Welcome to the Universal Roblox Character Detection Dataset (URCDD). This dataset is a comprehensive collection of images extracted from various games on the Roblox platform. Our primary objective is to offer a diverse and extensive dataset that encompasses the wide array of characters found in Roblox games.
We have created a unique tag for each game that we have collected data from (see the Versions tab). Refer to the list below:
1. baseplate - https://www.roblox.com/games/4483381587
2. da-hood - https://www.roblox.com/games/2788229376
3. arsenal - https://www.roblox.com/games/286090429
4. aimblox - https://www.roblox.com/games/6808416928
5. hood-customs - https://www.roblox.com/games/9825515356
6. counter-blox - https://www.roblox.com/games/301549746/
7. hood-testing - https://www.roblox.com/games/12673840215
8. phantom-forces - https://www.roblox.com/games/292439477
9. entrenched - https://www.roblox.com/games/3678761576
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
The License Plates dataset is an object detection dataset of different vehicles (i.e. cars, vans, etc.) and their respective license plates. Annotations include the classes "vehicle" and "license-plate". This dataset has a train/validation/test split of 245/70/35 images, respectively.
(Dataset example image: https://i.imgur.com/JmRgjBq.png)
This dataset could be used to create a vehicle and license plate detection object detection model. Roboflow provides a great guide on creating a license plate and vehicle object detection model.
This dataset is a subset of the Open Images Dataset. The annotations are licensed by Google LLC under CC BY 4.0 license. Some annotations have been combined or removed using Roboflow's annotation management tools to better align the annotations with the purpose of the dataset. The images have a CC BY 2.0 license.
Roboflow creates tools that make computer vision easy to use for any developer, even if you're not a machine learning expert. You can use it to organize, label, inspect, convert, and export your image datasets. And even to train and deploy computer vision models with no code required.
(Roboflow: https://roboflow.com, image: https://i.imgur.com/WHFqYSJ.png)
We present QuasarNet, a novel research platform that enables deployment of data-driven modeling techniques for the investigation of the properties of super-massive black holes. Black hole data sets — observations and simulations — have grown rapidly in the last decade in both complexity and abundance. However, our computational environments and tool sets have not matured commensurately to exhaust opportunities for discovery with these observational and simulated data. Our pilot study presented here is motivated by one of the fundamental open questions, namely the nature of the quasar and host halo/galaxy connection. To explore this, we co-locate large, multi-wavelength observational data sets of the high-redshift luminous population of accreting black holes at z > 3 alongside simulated data spanning the same cosmic epochs in QuasarNet. We demonstrate that the properties of observed quasars as well as their putative dark matter host halos can be extracted for studying their association and correspondence. In this paper, we describe the design, implementation, and operation of the publicly queryable QuasarNet database and provide examples of query types and visualizations that can be used to explore the data. Starting with data collated in QuasarNet, which will serve as training sets, we plan to utilize machine learning algorithms to predict properties of the as yet undetected, less luminous quasar population. To that ultimate goal, here we present newly developed tools that permit extracting relevant quantities for future analysis. All codes and the data itself are available for downloading from this site.
(Figure: https://drive.google.com/uc?export=view&id=1F3PfdOueRqS3_ARt0RuAkIq4jUOLKyYE)
The database primarily contains the simulation datasets and the observational datasets.
PN gratefully acknowledges the invitation to Google's Science Festival SciFoo in 2018, where she first hatched this idea and thanks Sanjay Sarma and Brian Subirana at MIT for early discussions. She acknowledges Alphabet-X for technical support and computational resources for this project. KST thanks Frank Wang at Google for his help with the Google Cloud Platform, and Rick Ebert at the Infra-Red Processing and Analysis Center (IPAC) at the California Institute of Technology for his help with accessing the NED database. SK acknowledges use of the ARCHER UK National Super-computing Service (http://www.archer.ac.uk) for running the LEGACY simulation. BN acknowledges support from the Fermi National Accelerator Laboratory, managed and operated by Fermi Research Alliance, LLC under Contract No. DE-AC02-07CH11359 with the U.S. Department of Energy. The U.S. Government retains and the publisher, by accepting the article for publication, acknowledges that the U.S. Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for U.S. Government purposes. SS acknowledges the Aspen Center for Physics where parts of this work were done, which is supported by National Science Foundation grant PHY-1607611.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
The dataset contains more than 100K textual descriptions of cultural items from Cultura Italia (http://www.culturaitalia.it/opencms/index.jsp?language=en), the Italian National Cultural aggregator. Each description is labeled either HIGH or LOW quality, according to its adherence to the standard cataloguing guidelines provided by the Istituto Centrale per il Catalogo e la Documentazione (ICCD). More precisely, each description is labeled as HIGH quality if the object and subject of the item (for which the description is provided) are both described according to the ICCD guidelines, and as LOW quality in all other cases. Most of the dataset was manually annotated, with ~30K descriptions automatically labeled as LOW quality due to their length (less than 3 tokens) or their provenance from old (pre-2012), uncurated collections. The dataset was developed to support the training and testing of ML text classification approaches for automatically assessing the quality of textual descriptions in digital Cultural Heritage repositories.

The dataset is provided as a CSV file, where each row corresponds to an item from Cultura Italia and contains the textual description of the item, the domain of the item (OpereArteVisiva/RepertoArcheologico/Architettura), and the quality label (Low_Quality/High_Quality).

The textual descriptions in the dataset are provided by Cultura Italia with a "Public Domain" license (cf. http://www.culturaitalia.it/opencms/export/sites/culturaitalia/attachments/linked_open_data/Licenza_CulturaItalia_CC0.pdf). The whole dataset, including the annotation, is openly distributed according to the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) licence.
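A minimal sketch of loading the CSV and splitting it by quality label follows. The file name and column names are assumptions based on the description above (textual description, domain, quality label), not a documented schema.

import pandas as pd

# Column and file names are assumptions based on the dataset description.
df = pd.read_csv("cultura_italia_descriptions.csv")

high = df[df["quality"] == "High_Quality"]
low = df[df["quality"] == "Low_Quality"]
print(len(high), "high-quality and", len(low), "low-quality descriptions")

# Per-domain breakdown (OpereArteVisiva / RepertoArcheologico / Architettura).
print(df.groupby(["domain", "quality"]).size())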