100+ datasets found
  1. Web Scraping Data | Key Customers Domain Name Data | Scanning Logos found on...

    • datarade.ai
    .json
    Updated Jun 27, 2024
    Cite
    PredictLeads (2024). Web Scraping Data | Key Customers Domain Name Data | Scanning Logos found on Websites | 222M+ Records [Dataset]. https://datarade.ai/data-products/predictleads-web-scraping-data-domain-name-data-business-predictleads
    Available download formats: .json
    Dataset updated
    Jun 27, 2024
    Dataset authored and provided by
    PredictLeads
    Area covered
    Malaysia, Nigeria, Svalbard and Jan Mayen, Benin, Oman, Northern Mariana Islands, Colombia, Burkina Faso, Turkmenistan, Curaçao
    Description

    PredictLeads Key Customers Data provides essential business intelligence by analyzing company relationships, uncovering vendor partnerships, client connections, and strategic affiliations through advanced web scraping and logo recognition. This dataset captures business interactions directly from company websites, offering valuable insights into market positioning, competitive landscapes, and growth opportunities.

    Use Cases:

    ✅ Account Profiling – Gain a 360-degree customer view by mapping company relationships and partnerships.
    ✅ Competitive Intelligence – Track vendor-client connections and business affiliations to identify key industry players.
    ✅ B2B Lead Targeting – Prioritize leads based on their business relationships, improving sales and marketing efficiency.
    ✅ CRM Data Enrichment – Enhance company records with detailed key customer data, ensuring data accuracy.
    ✅ Market Research – Identify emerging trends and industry networks to optimize strategic planning.

    Key API Attributes:

    • id (string, UUID) – Unique identifier for the company connection.
    • category (string) – Type of relationship (e.g., vendor, client, partner).
    • source_category (string) – Where the connection was detected (e.g., partner page, case study).
    • source_url (string, URL) – Website where the relationship was found.
    • individual_source_url (string, URL) – Specific page confirming the connection.
    • context (string) – Extracted description of the business relationship (e.g., "Company X - partners with Company Y to enhance payment processing").
    • first_seen_at (ISO 8601 date-time) – Date the connection was first detected.
    • last_seen_at (ISO 8601 date-time) – Most recent confirmation of the relationship.
    • company1 & company2 (objects) – Details of the two connected companies, including:
      - domain (string) – Company website domain.
      - company_name (string) – Official company name.
      - ticker (string, nullable) – Stock ticker, if available.
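
The attribute list above can be read as a small record schema. The following is a minimal sketch of how such a record might be modelled in Python. It assumes a flat dictionary keyed by the attribute names listed above, which may not match the nesting of the real API response (the linked data model is authoritative). The `SAMPLE` record, the class names, and the helper functions are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Company:
    domain: str
    company_name: str
    ticker: Optional[str]  # nullable, per the attribute list


@dataclass
class Connection:
    id: str
    category: str
    source_category: str
    source_url: str
    individual_source_url: str
    context: str
    first_seen_at: datetime
    last_seen_at: datetime
    company1: Company
    company2: Company


def _company(raw: dict) -> Company:
    return Company(raw["domain"], raw["company_name"], raw.get("ticker"))


def _timestamp(value: str) -> datetime:
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def parse_connection(raw: dict) -> Connection:
    """Build a Connection from a flat dict (assumed layout, not the confirmed API shape)."""
    return Connection(
        id=raw["id"],
        category=raw["category"],
        source_category=raw["source_category"],
        source_url=raw["source_url"],
        individual_source_url=raw["individual_source_url"],
        context=raw["context"],
        first_seen_at=_timestamp(raw["first_seen_at"]),
        last_seen_at=_timestamp(raw["last_seen_at"]),
        company1=_company(raw["company1"]),
        company2=_company(raw["company2"]),
    )


# Hypothetical record, invented for illustration only.
SAMPLE = {
    "id": "00000000-0000-0000-0000-000000000001",
    "category": "partner",
    "source_category": "partner page",
    "source_url": "https://example.com",
    "individual_source_url": "https://example.com/partners",
    "context": "Company X - partners with Company Y to enhance payment processing",
    "first_seen_at": "2024-01-15T09:30:00Z",
    "last_seen_at": "2024-06-01T12:00:00Z",
    "company1": {"domain": "example.com", "company_name": "Company X", "ticker": "CMPX"},
    "company2": {"domain": "example.org", "company_name": "Company Y", "ticker": None},
}

connection = parse_connection(SAMPLE)
print(connection.company1.company_name, "->", connection.company2.company_name)
```

Parsing the two timestamps into `datetime` objects makes it straightforward to compare `first_seen_at` and `last_seen_at`, for example to find relationships that have not been reconfirmed recently.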

    📌 PredictLeads Key Customers Data is an indispensable tool for B2B sales, marketing, and market intelligence teams, providing actionable relationship insights to drive targeted outreach, competitor tracking, and strategic decision-making.

    API Example: https://docs.predictleads.com/v3/guide/connections_dataset/data_model

  2. Web Designer Express | Graphics Multimedia & Web Design | Technology Data

    • datastore.forage.ai
    Updated Sep 24, 2024
    Cite
    (2024). Web Designer Express | Graphics Multimedia & Web Design | Technology Data [Dataset]. https://datastore.forage.ai/searchresults/?resource_keyword=web
    Dataset updated
    Sep 24, 2024
    Description

    Web Designer Express is a reputable Miami-based company that has been in business for 20 years. With a team of experienced web designers and developers, they offer a wide range of services, including web design, e-commerce development, web development, and more. Their portfolio showcases over 10,000 websites designed, with a focus on creating custom, unique solutions for each client. With a presence in Miami, Florida, they cater to businesses and individuals seeking to establish a strong online presence. As a company, Web Designer Express is dedicated to building long-lasting relationships with their clients, providing personalized service, and exceeding expectations.

  3. Websites using data-urls

    • webtechsurvey.com
    csv
    Updated Feb 10, 2025
    + more versions
    Cite
    WebTechSurvey (2025). Websites using data-urls [Dataset]. https://webtechsurvey.com/technology/data-urls
    Available download formats: csv
    Dataset updated
    Feb 10, 2025
    Dataset authored and provided by
    WebTechSurvey
    License

    https://webtechsurvey.com/terms

    Time period covered
    2025
    Area covered
    Global
    Description

    A complete list of live websites using the data-urls technology, compiled through global website indexing conducted by WebTechSurvey.

  4. Altosight | AI Custom Web Scraping Data | 100% Global | Free Unlimited Data...

    • datarade.ai
    .json, .csv, .xls
    Updated Sep 7, 2024
    Cite
    Altosight (2024). Altosight | AI Custom Web Scraping Data | 100% Global | Free Unlimited Data Points | Bypassing All CAPTCHAs & Blocking Mechanisms | GDPR Compliant [Dataset]. https://datarade.ai/data-products/altosight-ai-custom-web-scraping-data-100-global-free-altosight
    Available download formats: .json, .csv, .xls
    Dataset updated
    Sep 7, 2024
    Dataset authored and provided by
    Altosight
    Area covered
    Tajikistan, Czech Republic, Svalbard and Jan Mayen, Wallis and Futuna, Chile, Singapore, Côte d'Ivoire, Greenland, Guatemala, Paraguay
    Description

    Altosight | AI Custom Web Scraping Data

    ✦ Altosight provides global web scraping data services with AI-powered technology that bypasses CAPTCHAs, blocking mechanisms, and handles dynamic content.

    We extract data from marketplaces like Amazon, aggregators, e-commerce, and real estate websites, ensuring comprehensive and accurate results.

    ✦ Our solution offers free unlimited data points across any project, with no additional setup costs.

    We deliver data through flexible methods such as API, CSV, JSON, and FTP, all at no extra charge.

    ― Key Use Cases ―

    ➤ Price Monitoring & Repricing Solutions

    🔹 Automatic repricing, AI-driven repricing, and custom repricing rules
    🔹 Receive price suggestions via API or CSV to stay competitive
    🔹 Track competitors in real-time or at scheduled intervals

    ➤ E-commerce Optimization

    🔹 Extract product prices, reviews, ratings, images, and trends
    🔹 Identify trending products and enhance your e-commerce strategy
    🔹 Build dropshipping tools or marketplace optimization platforms with our data

    ➤ Product Assortment Analysis

    🔹 Extract the entire product catalog from competitor websites
    🔹 Analyze product assortment to refine your own offerings and identify gaps
    🔹 Understand competitor strategies and optimize your product lineup

    ➤ Marketplaces & Aggregators

    🔹 Crawl entire product categories and track best-sellers
    🔹 Monitor position changes across categories
    🔹 Identify which eRetailers sell specific brands and which SKUs for better market analysis

    ➤ Business Website Data

    🔹 Extract detailed company profiles, including financial statements, key personnel, industry reports, and market trends, enabling in-depth competitor and market analysis

    🔹 Collect customer reviews and ratings from business websites to analyze brand sentiment and product performance, helping businesses refine their strategies

    ➤ Domain Name Data

    🔹 Access comprehensive data, including domain registration details, ownership information, expiration dates, and contact information. Ideal for market research, brand monitoring, lead generation, and cybersecurity efforts

    ➤ Real Estate Data

    🔹 Access property listings, prices, and availability
    🔹 Analyze trends and opportunities for investment or sales strategies

    ― Data Collection & Quality ―

    ► Publicly Sourced Data: Altosight collects web scraping data from publicly available websites, online platforms, and industry-specific aggregators

    ► AI-Powered Scraping: Our technology handles dynamic content, JavaScript-heavy sites, and pagination, ensuring complete data extraction

    ► High Data Quality: We clean and structure unstructured data, ensuring it is reliable, accurate, and delivered in formats such as API, CSV, JSON, and more

    ► Industry Coverage: We serve industries including e-commerce, real estate, travel, finance, and more. Our solution supports use cases like market research, competitive analysis, and business intelligence

    ► Bulk Data Extraction: We support large-scale data extraction from multiple websites, allowing you to gather millions of data points across industries in a single project

    ► Scalable Infrastructure: Our platform is built to scale with your needs, allowing seamless extraction for projects of any size, from small pilot projects to ongoing, large-scale data extraction

    ― Why Choose Altosight? ―

    ✔ Unlimited Data Points: Altosight offers unlimited free attributes, meaning you can extract as many data points from a page as you need without extra charges

    ✔ Proprietary Anti-Blocking Technology: Altosight utilizes proprietary techniques to bypass blocking mechanisms, including CAPTCHAs, Cloudflare, and other obstacles. This ensures uninterrupted access to data, no matter how complex the target websites are

    ✔ Flexible Across Industries: Our crawlers easily adapt across industries, including e-commerce, real estate, finance, and more. We offer customized data solutions tailored to specific needs

    ✔ GDPR & CCPA Compliance: Your data is handled securely and ethically, ensuring compliance with GDPR, CCPA and other regulations

    ✔ No Setup or Infrastructure Costs: Start scraping without worrying about additional costs. We provide a hassle-free experience with fast project deployment

    ✔ Free Data Delivery Methods: Receive your data via API, CSV, JSON, or FTP at no extra charge. We ensure seamless integration with your systems

    ✔ Fast Support: Our team is always available via phone and email, resolving over 90% of support tickets within the same day

    ― Custom Projects & Real-Time Data ―

    ✦ Tailored Solutions: Every business has unique needs, which is why Altosight offers custom data projects. Contact us for a feasibility analysis, and we’ll design a solution that fits your goals

    ✦ Real-Time Data: Whether you need real-time data delivery or scheduled updates, we provide the flexibility to receive data when you need it. Track price changes, monitor product trends, or gather...

  5. Business Software Alliance | Web Hosting & Domain Names | Technology Data

    • datastore.forage.ai
    Updated Sep 24, 2024
    Cite
    (2024). Business Software Alliance | Web Hosting & Domain Names | Technology Data [Dataset]. https://datastore.forage.ai/searchresults/?resource_keyword=web
    Dataset updated
    Sep 24, 2024
    Description

    Business Software Alliance is a trade association that represents the world's leading software companies, including Autodesk, IBM, and Symantec. The organization's members are committed to promoting the use of legitimate software and ensuring the integrity of their intellectual property.

    As a result, the data housed on BSA's website is rich in information related to the software industry, including software licensing, anti-piracy efforts, and digital piracy statistics. The data includes information on software usage, software development, and the impact of piracy on the technology industry. With its focus on promoting legitimate software use, the data on BSA's website provides valuable insights into the global software industry.

  6. 1950 Census: Official 1950 Census Website

    • catalog.data.gov
    • datasets.ai
    Updated Mar 11, 2023
    + more versions
    Cite
    Office of Innovation (2023). 1950 Census: Official 1950 Census Website [Dataset]. https://catalog.data.gov/dataset/1950-census-official-1950-census-website
    Dataset updated
    Mar 11, 2023
    Dataset provided by
    Office of Innovation
    Description

    Website allows the public full access to the 1950 Census images, census maps and descriptions.

  7. NYC Open Data Plan: Website Data

    • data.cityofnewyork.us
    application/rdfxml +5
    Updated Aug 27, 2024
    Cite
    Office of Technology and Innovation (OTI) (2024). NYC Open Data Plan: Website Data [Dataset]. https://data.cityofnewyork.us/City-Government/NYC-Open-Data-Plan-Website-Data/duz4-2gn9
    Available download formats: application/rdfxml, csv, application/rssxml, tsv, xml, json
    Dataset updated
    Aug 27, 2024
    Dataset provided by
    New York City Office of Technology and Innovation (https://www.nyc.gov/content/oti/pages/)
    Authors
    Office of Technology and Innovation (OTI)
    Description

    NOTE: To review the latest plan, make sure to filter the "Report Year" column to the latest year.

    Data on public websites maintained by or on behalf of the city agencies.
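
The note about filtering the "Report Year" column to the latest year can be sketched with the standard `csv` module. This is only an illustration under assumptions: the `Report Year` column name comes from the note above, while the `Agency` and `Website` columns and all sample rows are hypothetical, and the year values are assumed to be plain integers.

```python
import csv
import io


def latest_report_year_rows(csv_text: str, year_column: str = "Report Year") -> list:
    """Return only the rows whose year column equals the largest year present."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return []
    latest = max(int(row[year_column]) for row in rows)
    return [row for row in rows if int(row[year_column]) == latest]


# Hypothetical extract; the real export has its own columns.
SAMPLE_CSV = """Agency,Website,Report Year
DOT,https://example.org/dot,2023
DOT,https://example.org/dot,2024
Parks,https://example.org/parks,2024
"""

latest_rows = latest_report_year_rows(SAMPLE_CSV)
for row in latest_rows:
    print(row["Agency"], row["Report Year"])
```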

  8. Network Traffic Analysis: Data and Code

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jun 12, 2024
    + more versions
    Cite
    Honig, Joshua (2024). Network Traffic Analysis: Data and Code [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11479410
    Dataset updated
    Jun 12, 2024
    Dataset provided by
    Homan, Sophia
    Honig, Joshua
    Moran, Madeline
    Soni, Shreena
    Chan-Tin, Eric
    Ferrell, Nathan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Code:

    Packet_Features_Generator.py & Features.py

    To run this code:

    pkt_features.py [-h] -i TXTFILE [-x X] [-y Y] [-z Z] [-ml] [-s S] -j

    -h, --help  show this help message and exit
    -i TXTFILE  input text file
    -x X        Add first X number of total packets as features.
    -y Y        Add first Y number of negative packets as features.
    -z Z        Add first Z number of positive packets as features.
    -ml         Output to text file all websites in the format of websiteNumber1,feature1,feature2,...
    -s S        Generate samples using size s.
    -j
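
The usage line and option list above could be mirrored with `argparse` roughly as follows. This is a sketch, not the authors' script: the integer types for `-x`, `-y`, `-z`, and `-s` are assumptions, and because the listing gives no description for `-j`, it is modelled here as an optional on/off flag purely for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Recreate the documented options; types and the -j behaviour are assumed."""
    parser = argparse.ArgumentParser(prog="pkt_features.py")
    parser.add_argument("-i", metavar="TXTFILE", required=True, help="input text file")
    parser.add_argument("-x", type=int, help="Add first X number of total packets as features.")
    parser.add_argument("-y", type=int, help="Add first Y number of negative packets as features.")
    parser.add_argument("-z", type=int, help="Add first Z number of positive packets as features.")
    parser.add_argument(
        "-ml",
        action="store_true",
        help="Output to text file all websites in the format of "
        "websiteNumber1,feature1,feature2,...",
    )
    parser.add_argument("-s", type=int, help="Generate samples using size s.")
    # Undocumented in the listing; treated as a boolean flag by assumption.
    parser.add_argument("-j", action="store_true", help="(no description given)")
    return parser


# Example invocation with a hypothetical input file name.
args = build_parser().parse_args(["-i", "capture.txt", "-x", "10", "-ml", "-j"])
print(args.i, args.x, args.ml, args.j)
```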

    Purpose:

    Turns a text file containing lists of incoming and outgoing network packet sizes into separate website objects with associative features.

    Uses Features.py to calculate the features.

    startMachineLearning.sh & machineLearning.py

    To run this code:

    bash startMachineLearning.sh

    This code then runs machineLearning.py in a tmux session with the necessary file paths and flags.

    Options (to be edited within this file):

    --evaluate-only to test 5 fold cross validation accuracy

    --test-scaling-normalization to test 6 different combinations of scalers and normalizers

    Note: once the best combination is determined, it should be added to the data_preprocessing function in machineLearning.py for future use

    --grid-search to test the best grid search hyperparameters - note: the possible hyperparameters must be added to train_model under 'if not evaluateOnly:' - once best hyperparameters are determined, add them to train_model under 'if evaluateOnly:'

    Purpose:

    Using the .ml file generated by Packet_Features_Generator.py & Features.py, this program trains a RandomForest Classifier on the provided data and provides results using cross validation. These results include the best scaling and normalization options for each data set as well as the best grid search hyperparameters based on the provided ranges.
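
To make the 5-fold cross validation mentioned above concrete, here is a minimal pure-Python sketch of how sample indices can be partitioned into folds. It is not taken from machineLearning.py and does not train a classifier; the function name is hypothetical, and it assumes contiguous folds with no shuffling or stratification, which the original program may handle differently.

```python
def k_fold_indices(n_samples: int, k: int = 5):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


folds = list(k_fold_indices(12, k=5))
for train, test in folds:
    print(len(train), "training samples,", len(test), "held out:", test)
```

Each sample is held out exactly once, so averaging the per-fold accuracy gives one cross-validated estimate for a given set of scalers, normalizers, or hyperparameters.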

    Data

    Encrypted network traffic was collected on an isolated computer visiting different Wikipedia and New York Times articles, different Google search queries (collected in the form of their autocomplete results and their results page), and different actions taken on a Virtual Reality headset.

    Data for this experiment was stored and analyzed in the form of a txt file for each experiment which contains:

    First number is a classification number to denote what website, query, or vr action is taking place.

    The remaining numbers in each line denote:

    The size of a packet,

    and the direction it is traveling.

    negative numbers denote incoming packets

    positive numbers denote outgoing packets
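
The line format described above (a classification number followed by signed packet sizes) could be read as in the following sketch. The delimiter is not stated in the description, so this assumes whitespace- or comma-separated integers; the function name and the sample line are hypothetical.

```python
def parse_capture_line(line: str):
    """Split one line into (label, incoming_sizes, outgoing_sizes).

    Negative values are incoming packets and positive values are outgoing
    packets, as described above. Sizes are returned as absolute values.
    """
    values = [int(token) for token in line.replace(",", " ").split()]
    label, packets = values[0], values[1:]
    incoming = [-size for size in packets if size < 0]
    outgoing = [size for size in packets if size > 0]
    return label, incoming, outgoing


# Hypothetical line: class 3, then four packets.
label, incoming, outgoing = parse_capture_line("3 -1500 60 -40 1200")
print(label, incoming, outgoing)
```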

    Figure 4 Data

    This data uses specific lines from the Virtual Reality.txt file.

    The action 'LongText Search' refers to a user searching for "Saint Basils Cathedral" with text in the Wander app.

    The action 'ShortText Search' refers to a user searching for "Mexico" with text in the Wander app.

    The .xlsx and .csv files are identical.

    Each file includes (from right to left):

    The original packet data,

    each line of data organized from smallest to largest packet size in order to calculate the mean and standard deviation of each packet capture,

    and the final Cumulative Distribution Function (CDF) calculation that generated the Figure 4 Graph.
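
The steps listed above (sort the packet sizes, compute the mean and standard deviation, then compute a CDF) might look like the sketch below. The description does not say which CDF was used for Figure 4, so two common readings are shown: the empirical CDF of the sorted sizes, and a normal CDF fitted from the mean and standard deviation. The use of the sample standard deviation, the function name, and the packet sizes are all assumptions.

```python
import statistics


def summarize_capture(sizes):
    """Return (mean, std, empirical_cdf, fitted_normal_cdf) for one packet capture."""
    ordered = sorted(sizes)
    mean = statistics.mean(ordered)
    std = statistics.stdev(ordered)  # sample standard deviation (assumed)
    n = len(ordered)
    # Empirical CDF: fraction of packets with size <= each sorted value.
    empirical = [(value, (i + 1) / n) for i, value in enumerate(ordered)]
    # Alternative reading: CDF of a normal distribution with the same mean and std.
    normal = statistics.NormalDist(mean, std)
    fitted = [(value, normal.cdf(value)) for value in ordered]
    return mean, std, empirical, fitted


# Hypothetical packet sizes for one capture.
mean, std, empirical, fitted = summarize_capture([40, 60, 1500, 1200])
print("mean:", mean, "std:", round(std, 2))
for value, probability in empirical:
    print(value, probability)
```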

  9. Share of top U.S. websites ignoring user privacy preferences 2024

    • statista.com
    Updated Mar 4, 2025
    Cite
    Statista (2025). Share of top U.S. websites ignoring user privacy preferences 2024 [Dataset]. https://www.statista.com/statistics/1560221/us-privacy-preference-ignoring/
    Dataset updated
    Mar 4, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Sep 2024
    Area covered
    United States
    Description

    As of September 2024, 75 percent of the 100 most visited websites in the United States shared personal data with advertising 3rd parties, even when users opted out. Moreover, 70 percent of them drop advertising 3rd party cookies even when users opt out.

  10. Website Analytics

    • data.nola.gov
    • gimi9.com
    • +3more
    application/rdfxml +5
    Updated Feb 2, 2017
    Cite
    Information Technology and Innovation Web Team (2017). Website Analytics [Dataset]. https://data.nola.gov/City-Administration/Website-Analytics/62d3-pst8
    Available download formats: csv, tsv, xml, application/rssxml, application/rdfxml, json
    Dataset updated
    Feb 2, 2017
    Dataset authored and provided by
    Information Technology and Innovation Web Team
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This data about nola.gov provides a window into how people are interacting with the City of New Orleans online. The data comes from a unified Google Analytics account for New Orleans. We do not track individuals and we anonymize the IP addresses of all visitors.

  11. Crunchyroll Meta-Data

    • kaggle.com
    Updated Aug 15, 2023
    Cite
    BIT_Guber (2023). Crunchyroll Meta-Data [Dataset]. https://www.kaggle.com/datasets/bitguber/crunchyroll-meta-data
    Available format: Croissant, a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 15, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    BIT_Guber
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This is prepared data derived from web-scraped Crunchyroll data; the metadata was extracted from the Crunchyroll website using the author's code.

    Before using it, please visit Crunchyroll.

    Dataset contains 7 files

    popular.csv

    Each row represents a series on the Popular page. Note: some information is not up to date (I guess Crunchyroll does not update its Popular table in the database).

    series.csv

    It has similar features to popular.csv, but with updated data points.

    seasons.csv

    Each row represents a season of its corresponding series.

    episodes.csv

    Information about individual episodes of their corresponding series.

    series_music.csv

    Some series have a featured music collection.

    audio.json

    Mapping of episode audio (dub) versions to their full representations.

    categories.json

    Mapping of the series categories, as defined by Crunchyroll.

  12. Web Data | Global | Reach - 200 Million Records for Precise Audience...

    • factori.ai
    Updated Dec 24, 2024
    Cite
    (2024). Web Data | Global | Reach - 200 Million Records for Precise Audience Segments & Market Intelligence [Dataset]. https://www.factori.ai/datasets/web-data/
    Dataset updated
    Dec 24, 2024
    License

    https://www.factori.ai/privacy-policy

    Area covered
    Global
    Description

    We provide detailed web activity data from users browsing popular websites worldwide. This comprehensive data allows for in-depth analysis of web behavior, enabling the creation of precise audience segments based on web activity. These segments can be used to target ads effectively, focusing on users' interests and their search or browsing intent.

    Web Data Reach

    Our web data reach includes extensive counts across various categories, covering attributes such as country, anonymous ID, IP addresses, search queries, and more.

    • Record Count: 200 Million
    • Capturing Frequency: Once per Event
    • Delivering Frequency: Once per Day
    • Updated: Monthly
    • Historic Data: Past 6 Months

    Data Export Methodology

    We dynamically collect and update data, providing the latest insights through the most appropriate method at intervals that best suit your needs, whether daily, weekly, or monthly.

    Use Cases

    Our web activity data is instrumental for personalized targeting, data enrichment, market intelligence, and enhancing fraud and cybersecurity measures, helping businesses optimize their strategies and security efforts.

  13. Hilco Streambank | Web Hosting & Domain Names | Technology Data

    • datastore.forage.ai
    Updated Sep 24, 2024
    Cite
    (2024). Hilco Streambank | Web Hosting & Domain Names | Technology Data [Dataset]. https://datastore.forage.ai/searchresults/?resource_keyword=web
    Dataset updated
    Sep 24, 2024
    Description

    Hilco Streambank is a trusted marketplace leader dedicated to reliable and transparent service. As the world's largest IPv4 address broker, Hilco Streambank has successfully completed more transfers than any other organization worldwide, with over $0 billion generated for clients since 2014. The company's team has extensive experience in regional internet registry transfer regulations and provides buyers and sellers with expert advice to help reach a deal that meets even the most complex of needs.

    Hilco Streambank's online marketplace provides a streamlined and transparent process to transfer the rights to IPv4 assets, including buyer and seller checklists, private brokered solutions, and LEASE IPv4 options. The company also offers the IPv4 Analyzer widget and its ReView digital IP address audit tool, a free tool working with 6connect. With operating presence in all five internet registries, including ARIN, APNIC, RIPE, LACNIC, and AFRINIC, Hilco Streambank is well-positioned to facilitate IPv4 transactions worldwide.

  14. Data used for personalization on e-commerce websites U.S. and UK 2020

    • statista.com
    Updated Dec 10, 2024
    Cite
    Statista (2024). Data used for personalization on e-commerce websites U.S. and UK 2020 [Dataset]. https://www.statista.com/statistics/1211718/data-personalization-ecommerce-website-us-uk/
    Dataset updated
    Dec 10, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jul 6, 2020 - Jul 20, 2020
    Area covered
    United States, United Kingdom
    Description

    During a study conducted among e-commerce professionals in the UK and the U.S. in July 2020, respondents were asked about their use of personalization on their websites. According to the results, 76 percent of survey participants were already using real-time behavioral data to personalize user experience on their e-commerce websites.

  15. State of Oklahoma County Government Websites

    • catalog.data.gov
    • data.ok.gov
    • +1more
    Updated Nov 22, 2024
    + more versions
    Cite
    data.ok.gov (2024). State of Oklahoma County Government Websites [Dataset]. https://catalog.data.gov/dataset/state-of-oklahoma-county-government-websites-b5a3b
    Dataset updated
    Nov 22, 2024
    Dataset provided by
    data.ok.gov
    Area covered
    Oklahoma County, Oklahoma
    Description

    List of State of Oklahoma county government websites.

  16. TagX Data collection for AI/ ML training | LLM data | Data collection for AI...

    • datarade.ai
    .json, .csv, .xls
    Updated Apr 13, 2023
    Cite
    TagX (2023). TagX Data collection for AI/ ML training | LLM data | Data collection for AI development & model finetuning | Text, image, audio, and document data [Dataset]. https://datarade.ai/data-products/data-collection-and-capture-services-tagx
    Available download formats: .json, .csv, .xls
    Dataset updated
    Apr 13, 2023
    Dataset authored and provided by
    TagX
    Area covered
    Belize, Qatar, Antigua and Barbuda, Equatorial Guinea, Saudi Arabia, Russian Federation, Benin, Djibouti, Iceland, Colombia
    Description

    We offer comprehensive data collection services that cater to a wide range of industries and applications. Whether you require image, audio, or text data, we have the expertise and resources to collect and deliver high-quality data that meets your specific requirements. Our data collection methods include manual collection, web scraping, and other automated techniques that ensure accuracy and completeness of data.

    Our team of experienced data collectors and quality assurance professionals ensures that the data is collected and processed according to the highest standards of quality. We also take great care to ensure that the data we collect is relevant and applicable to your use case. This means that you can rely on us to provide you with clean and useful data that can be used to train machine learning models, improve business processes, or conduct research.

    We are committed to delivering data in the format that you require. Whether you need raw data or a processed dataset, we can deliver the data in your preferred format, including CSV, JSON, or XML. We understand that every project is unique, and we work closely with our clients to ensure that we deliver the data that meets their specific needs. So if you need reliable data collection services for your next project, look no further than us.

  17. Websites using Custom Searchable Data Entry System

    • webtechsurvey.com
    csv
    Updated Jan 13, 2025
    Cite
    WebTechSurvey (2025). Websites using Custom Searchable Data Entry System [Dataset]. https://webtechsurvey.com/technology/custom-searchable-data-entry-system
    Available download formats: csv
    Dataset updated
    Jan 13, 2025
    Dataset authored and provided by
    WebTechSurvey
    License

    https://webtechsurvey.com/terms

    Time period covered
    2025
    Area covered
    Global
    Description

    A complete list of live websites using the Custom Searchable Data Entry System technology, compiled through global website indexing conducted by WebTechSurvey.

  18. BIA BOGS Open Data Site

    • catalog.data.gov
    Updated Jan 20, 2024
    + more versions
    Cite
    Bureau of Indian Affairs (2024). BIA BOGS Open Data Site [Dataset]. https://catalog.data.gov/dataset/opendata-1-bia-geospatial-hub-arcgis-com
    Dataset updated
    Jan 20, 2024
    Dataset provided by
    Bureau of Indian Affairs (http://www.bia.gov/)
    Description

    This site provides National level geospatial data within the open public domain that can be useful to support tribal community resiliency, research, and more. The data is available for download as CSV, KML, Shapefile, and accessible via web services to support application development and data visualization. This site contains data created and maintained by the Branch of Geospatial Support.

  19. Web Data Commons - RDFa, Microdata, and Microformat Data Sets

    • webdatacommons.org
    n-quads
    Updated Oct 15, 2016
    + more versions
    Cite
    Christian Bizer; Robert Meusel; Anna Primpeli (2016). Web Data Commons - RDFa, Microdata, and Microformat Data Sets [Dataset]. http://webdatacommons.org/structureddata/2016-10/stats/stats.html
    Available download formats: n-quads
    Dataset updated
    Oct 15, 2016
    Authors
    Christian Bizer; Robert Meusel; Anna Primpeli
    Description

    Microformat, Microdata and RDFa data from the October 2016 Common Crawl web corpus. We found structured data within 1.24 billion HTML pages out of the 3.2 billion pages contained in the crawl (38%). These pages originate from 5.63 million different pay-level-domains out of the 34 million pay-level-domains covered by the crawl (16.5%). Altogether, the extracted data sets consist of 44.2 billion RDF quads.

  20. Website updates

    • data.nasa.gov
    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    • +1more
    Updated Mar 31, 2025
    Cite
    data.nasa.gov (2025). Website updates [Dataset]. https://data.nasa.gov/dataset/website-updates
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    Updates to Website: (Please add new items at the top of this description with the date of the website change)
    May 9, 2012: Uploaded experimental data in MATLAB format for HIRENASD
    November 8, 2011: New grids, experimental data for HIRENASD configuration, new FEM for HIRENASD configuration. (JHeeg)
    Oct 13: Uploaded BSCW grids (VGRID) (PChwalowski)
    Oct 5: Added HIRENASD experimental data for test points #159 and #132 (JHeeg, PChwalowski)
