100+ datasets found
  1. Websites using data-urls

    • webtechsurvey.com
    csv
    Updated Feb 10, 2025
    Cite
    WebTechSurvey (2025). Websites using data-urls [Dataset]. https://webtechsurvey.com/technology/data-urls
    Explore at:
    Available download formats: csv
    Dataset updated
    Feb 10, 2025
    Dataset authored and provided by
    WebTechSurvey
    License

    https://webtechsurvey.com/terms

    Time period covered
    2025
    Area covered
    Global
    Description

    A complete list of live websites using the data-urls technology, compiled through global website indexing conducted by WebTechSurvey.

  2. Ecommerce Product Dataset | Amazon Best Seller Products | Pricing Database -...

    • datarade.ai
    .json, .xml, .csv
    Updated Dec 5, 2023
    Cite
    PromptCloud (2023). Ecommerce Product Dataset | Amazon Best Seller Products | Pricing Database - Global Coverage, with Custom Datasets as per Requirement | PromptCloud [Dataset]. https://datarade.ai/data-products/ecommerce-product-dataset-amazon-best-seller-products-datas-promptcloud
    Explore at:
    Available download formats: .json, .xml, .csv
    Dataset updated
    Dec 5, 2023
    Dataset authored and provided by
    PromptCloud
    Area covered
    Guinea, Greenland, United States of America, Morocco, Uzbekistan, Spain, Côte d'Ivoire, Anguilla, Brunei Darussalam, Austria
    Description

    PromptCloud offers cutting-edge data extraction services that empower businesses with real-time, actionable intelligence from the vast expanses of the online marketplace. We are committed to putting data at the heart of your business. Reach out for a no-frills PromptCloud experience: professional, technologically ahead, and reliable.

    Our Amazon Best Seller Products Dataset is a key tool for businesses looking to understand and capitalize on market trends. It allows you to identify top-selling products and sellers, and track their performance across various categories and subcategories. This dataset is invaluable for competitive intelligence, monitoring trending products, and understanding customer sentiment. It also plays a crucial role in monitoring competitor prices and enhancing product inventory, ensuring that your business stays relevant and competitive.

    Beyond Amazon, PromptCloud offers access to a diverse range of Ecommerce Product Data from various e-commerce websites. PromptCloud is a leading provider of advanced web scraping services, uniquely tailored to meet the dynamic needs of modern businesses. Our services are fully customizable, allowing clients to specify source websites, data collection frequencies, data points, and delivery mechanisms to fit their unique requirements. The data aggregation feature of our web crawler enables the extraction of data from multiple sources in a single stream, catering to a diverse range of clients, from ecommerce businesses to news aggregators and job boards.

    With over a decade of experience in extracting web data from any e-commerce website, PromptCloud stands as a seasoned veteran in the field. This extensive experience translates into high-quality, reliable data extraction, making PromptCloud your ideal product web data extraction partner. The reliability of our data is uncompromised, with a 100% verification process that ensures accuracy and trustworthiness.

  3. Concerns over the protection of personal data by websites in Sweden 2018

    • statista.com
    Updated Jul 7, 2022
    Cite
    Concerns over the protection of personal data by websites in Sweden 2018 [Dataset]. https://www.statista.com/statistics/498171/concerns-over-the-protection-of-personal-data-by-websites-in-sweden/
    Explore at:
    Dataset updated
    Jul 7, 2022
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Oct 2019
    Area covered
    Sweden
    Description

    The majority of Swedes who took part in a survey conducted in 2019 stated that they were concerned that their online information was not kept secure by websites (67 percent). 31 percent of respondents disagreed with that statement.

  4. Metalloprotein Site Database

    • neuinfo.org
    • scicrunch.org
    • +2more
    Updated Mar 12, 2025
    Cite
    (2025). Metalloprotein Site Database [Dataset]. http://identifiers.org/RRID:SCR_007780/resolver?q=&i=rrid
    Explore at:
    Dataset updated
    Mar 12, 2025
    Description

    THIS RESOURCE IS NO LONGER IN SERVICE, documented on June 24, 2013. Database and browser containing quantitative information on all the metal-containing sites available from structures in the PDB distribution. The database contains geometrical and molecular information that allows the classification and search of particular combinations of site characteristics. It can answer questions such as "How many mononuclear zinc-containing sites are five coordinate with X-ray resolution better than 1.8 Angstroms?" and then let the user visualize and manipulate the matching sites. The database also includes enough information to answer questions involving the type and number of ligands (e.g. "at least 2 His") and to apply distance cutoff criteria (e.g. a metal-ligand distance of no more than 3.0 Angstroms and no less than 2.2 Angstroms). This database is being developed as part of a project whose ultimate goal is metalloprotein design, allowing the interactive visualization of geometrical and functional information garnered from the MDB. The database is created by automatic recognition and extraction of metal-binding sites from metal-containing proteins. Quantitative information is extracted and organized into a searchable form by iterating through all the entries in the latest PDB release (at the moment: September 2001). This is a comprehensive quantitative database, which exists in SQL format and contains information on about 5,500 proteins.
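    The kind of question quoted in the description (mononuclear zinc sites, five coordinate, resolution better than 1.8 Angstroms) maps naturally onto a SQL filter. The sketch below is only an illustration: the table name `metal_site`, its columns, and the demo rows are all hypothetical, since the listing does not publish the actual schema, and SQLite is used purely so the example is self-contained.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a HYPOTHETICAL metal-site table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE metal_site (
            site_id             INTEGER PRIMARY KEY,
            structure_id        TEXT    NOT NULL,  -- placeholder, not a real PDB ID
            metal               TEXT    NOT NULL,  -- element symbol, e.g. 'ZN'
            nuclearity          INTEGER NOT NULL,  -- 1 = mononuclear
            coordination_number INTEGER NOT NULL,
            resolution_angstrom REAL    NOT NULL
        )
        """
    )
    # Made-up rows for illustration only; none come from the real database.
    rows = [
        (1, "DEMO1", "ZN", 1, 5, 1.60),  # matches the example question
        (2, "DEMO2", "ZN", 1, 4, 1.50),  # four coordinate
        (3, "DEMO3", "ZN", 1, 5, 2.10),  # resolution worse than 1.8
        (4, "DEMO4", "FE", 1, 5, 1.20),  # different metal
        (5, "DEMO5", "ZN", 2, 5, 1.70),  # binuclear
        (6, "DEMO6", "ZN", 1, 5, 1.79),  # matches the example question
    ]
    conn.executemany("INSERT INTO metal_site VALUES (?, ?, ?, ?, ?, ?)", rows)
    return conn


def count_sites(conn: sqlite3.Connection, metal: str,
                coordination_number: int, max_resolution: float) -> int:
    """Count mononuclear sites of one metal with a given coordination number.

    "Better" resolution means a numerically smaller value in Angstroms,
    so the filter is a strict less-than comparison.
    """
    cursor = conn.execute(
        """
        SELECT COUNT(*) FROM metal_site
        WHERE metal = ?
          AND nuclearity = 1
          AND coordination_number = ?
          AND resolution_angstrom < ?
        """,
        (metal, coordination_number, max_resolution),
    )
    return cursor.fetchone()[0]


if __name__ == "__main__":
    demo = build_demo_db()
    print(count_sites(demo, "ZN", 5, 1.8))
```

    Ligand-type and distance-cutoff questions would need a second table of metal-ligand contacts joined to the sites; that table is left out here because its layout would be equally speculative.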

  5. Data used for personalization on e-commerce websites U.S. and UK 2020

    • statista.com
    Updated Dec 10, 2024
    Cite
    Data used for personalization on e-commerce websites U.S. and UK 2020 [Dataset]. https://www.statista.com/statistics/1211718/data-personalization-ecommerce-website-us-uk/
    Explore at:
    Dataset updated
    Dec 10, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jul 6, 2020 - Jul 20, 2020
    Area covered
    United States, United Kingdom
    Description

    During a study conducted among e-commerce professionals in the UK and the U.S. in July 2020, respondents were asked about their use of personalization on their websites. According to the results, 76 percent of survey participants were already using real-time behavioral data to personalize user experience on their e-commerce websites.

  6. Altosight | AI Custom Web Scraping Data | 100% Global | Free Unlimited Data...

    • datarade.ai
    .json, .csv, .xls
    Updated Sep 7, 2024
    Cite
    Altosight (2024). Altosight | AI Custom Web Scraping Data | 100% Global | Free Unlimited Data Points | Bypassing All CAPTCHAs & Blocking Mechanisms | GDPR Compliant [Dataset]. https://datarade.ai/data-products/altosight-ai-custom-web-scraping-data-100-global-free-altosight
    Explore at:
    Available download formats: .json, .csv, .xls
    Dataset updated
    Sep 7, 2024
    Dataset authored and provided by
    Altosight
    Area covered
    Guatemala, Chile, Svalbard and Jan Mayen, Paraguay, Singapore, Côte d'Ivoire, Greenland, Czech Republic, Wallis and Futuna, Tajikistan
    Description

    Altosight | AI Custom Web Scraping Data

    ✦ Altosight provides global web scraping data services with AI-powered technology that bypasses CAPTCHAs, blocking mechanisms, and handles dynamic content.

    We extract data from marketplaces like Amazon, aggregators, e-commerce, and real estate websites, ensuring comprehensive and accurate results.

    ✦ Our solution offers free unlimited data points across any project, with no additional setup costs.

    We deliver data through flexible methods such as API, CSV, JSON, and FTP, all at no extra charge.

    ― Key Use Cases ―

    ➤ Price Monitoring & Repricing Solutions

    🔹 Automatic repricing, AI-driven repricing, and custom repricing rules 🔹 Receive price suggestions via API or CSV to stay competitive 🔹 Track competitors in real-time or at scheduled intervals

    ➤ E-commerce Optimization

    🔹 Extract product prices, reviews, ratings, images, and trends 🔹 Identify trending products and enhance your e-commerce strategy 🔹 Build dropshipping tools or marketplace optimization platforms with our data

    ➤ Product Assortment Analysis

    🔹 Extract the entire product catalog from competitor websites 🔹 Analyze product assortment to refine your own offerings and identify gaps 🔹 Understand competitor strategies and optimize your product lineup

    ➤ Marketplaces & Aggregators

    🔹 Crawl entire product categories and track best-sellers 🔹 Monitor position changes across categories 🔹 Identify which eRetailers sell specific brands and which SKUs for better market analysis

    ➤ Business Website Data

    🔹 Extract detailed company profiles, including financial statements, key personnel, industry reports, and market trends, enabling in-depth competitor and market analysis

    🔹 Collect customer reviews and ratings from business websites to analyze brand sentiment and product performance, helping businesses refine their strategies

    ➤ Domain Name Data

    🔹 Access comprehensive data, including domain registration details, ownership information, expiration dates, and contact information. Ideal for market research, brand monitoring, lead generation, and cybersecurity efforts

    ➤ Real Estate Data

    🔹 Access property listings, prices, and availability 🔹 Analyze trends and opportunities for investment or sales strategies

    ― Data Collection & Quality ―

    ► Publicly Sourced Data: Altosight collects web scraping data from publicly available websites, online platforms, and industry-specific aggregators

    ► AI-Powered Scraping: Our technology handles dynamic content, JavaScript-heavy sites, and pagination, ensuring complete data extraction

    ► High Data Quality: We clean and structure unstructured data, ensuring it is reliable, accurate, and delivered in formats such as API, CSV, JSON, and more

    ► Industry Coverage: We serve industries including e-commerce, real estate, travel, finance, and more. Our solution supports use cases like market research, competitive analysis, and business intelligence

    ► Bulk Data Extraction: We support large-scale data extraction from multiple websites, allowing you to gather millions of data points across industries in a single project

    ► Scalable Infrastructure: Our platform is built to scale with your needs, allowing seamless extraction for projects of any size, from small pilot projects to ongoing, large-scale data extraction

    ― Why Choose Altosight? ―

    ✔ Unlimited Data Points: Altosight offers unlimited free attributes, meaning you can extract as many data points from a page as you need without extra charges

    ✔ Proprietary Anti-Blocking Technology: Altosight utilizes proprietary techniques to bypass blocking mechanisms, including CAPTCHAs, Cloudflare, and other obstacles. This ensures uninterrupted access to data, no matter how complex the target websites are

    ✔ Flexible Across Industries: Our crawlers easily adapt across industries, including e-commerce, real estate, finance, and more. We offer customized data solutions tailored to specific needs

    ✔ GDPR & CCPA Compliance: Your data is handled securely and ethically, ensuring compliance with GDPR, CCPA and other regulations

    ✔ No Setup or Infrastructure Costs: Start scraping without worrying about additional costs. We provide a hassle-free experience with fast project deployment

    ✔ Free Data Delivery Methods: Receive your data via API, CSV, JSON, or FTP at no extra charge. We ensure seamless integration with your systems

    ✔ Fast Support: Our team is always available via phone and email, resolving over 90% of support tickets within the same day

    ― Custom Projects & Real-Time Data ―

    ✦ Tailored Solutions: Every business has unique needs, which is why Altosight offers custom data projects. Contact us for a feasibility analysis, and we’ll design a solution that fits your goals

    ✦ Real-Time Data: Whether you need real-time data delivery or scheduled updates, we provide the flexibility to receive data when you need it. Track price changes, monitor product trends, or gather...

  7. Web Designer Express | Graphics Multimedia & Web Design | Technology Data

    • datastore.forage.ai
    Updated Sep 19, 2024
    Cite
    (2024). Web Designer Express | Graphics Multimedia & Web Design | Technology Data [Dataset]. https://datastore.forage.ai/searchresults/?resource_keyword=web
    Explore at:
    Dataset updated
    Sep 19, 2024
    Description

    Web Designer Express is a reputable Miami-based company that has been in business for 20 years. With a team of experienced web designers and developers, they offer a wide range of services, including web design, e-commerce development, web development, and more. Their portfolio showcases over 10,000 websites designed, with a focus on creating custom, unique solutions for each client. With a presence in Miami, Florida, they cater to businesses and individuals seeking to establish a strong online presence. As a company, Web Designer Express is dedicated to building long-lasting relationships with their clients, providing personalized service, and exceeding expectations.

  8. Business Software Alliance | Web Hosting & Domain Names | Technology Data

    • datastore.forage.ai
    Updated Sep 19, 2024
    Cite
    (2024). Business Software Alliance | Web Hosting & Domain Names | Technology Data [Dataset]. https://datastore.forage.ai/searchresults/?resource_keyword=web
    Explore at:
    Dataset updated
    Sep 19, 2024
    Description

    Business Software Alliance is a trade association that represents the world's leading software companies, including Autodesk, IBM, and Symantec. The organization's members are committed to promoting the use of legitimate software and ensuring the integrity of their intellectual property.

    As a result, the data housed on BSA's website is rich in information related to the software industry, including software licensing, anti-piracy efforts, and digital piracy statistics. The data includes information on software usage, software development, and the impact of piracy on the technology industry. With its focus on promoting legitimate software use, the data on BSA's website provides valuable insights into the global software industry.

  9. Data from: CottonGen: Cotton Database Resources

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    • +1more
    Updated Mar 30, 2024
    Cite
    Agricultural Research Service (2024). CottonGen: Cotton Database Resources [Dataset]. https://catalog.data.gov/dataset/cottongen-cotton-database-resources-151bf
    Explore at:
    Dataset updated
    Mar 30, 2024
    Dataset provided by
    Agricultural Research Service (https://www.ars.usda.gov/)
    Description

    CottonGen (https://www.cottongen.org) is a curated and integrated web-based relational database providing access to publicly available genomic, genetic and breeding data to enable basic, translational and applied research in cotton. Built using the open-source Tripal database infrastructure, CottonGen supersedes CottonDB and the Cotton Marker Database, which includes sequences, genetic and physical maps, genotypic and phenotypic markers and polymorphisms, quantitative trait loci (QTLs), pathogens, germplasm collections and trait evaluations, pedigrees, and relevant bibliographic citations, with enhanced tools for easier data sharing, mining, visualization, and data retrieval of cotton research data. CottonGen contains annotated whole genome sequences, unigenes from expressed sequence tags (ESTs), markers, trait loci, genetic maps, genes, taxonomy, germplasm, publications and communication resources for the cotton community. Annotated whole genome sequences of Gossypium raimondii are available with aligned genetic markers and transcripts. These whole genome data can be accessed through genome pages, search tools and GBrowse, a popular genome browser. Most of the published cotton genetic maps can be viewed and compared using CMap, a comparative map viewer, and are searchable via map search tools. Search tools also exist for markers, quantitative trait loci (QTLs), germplasm, publications and trait evaluation data. CottonGen also provides online analysis tools such as NCBI BLAST and Batch BLAST. This project is funded/supported by Cotton Incorporated, the USDA-ARS Crop Germplasm Research Unit at College Station, TX, the Southern Association of Agricultural Experiment Station Directors, Bayer CropScience, Corteva/Agriscience, Dow/Phytogen, Monsanto, Washington State University, and NRSP10.

    Resources in this dataset:
    Resource Title: Website Pointer for CottonGen. File Name: Web Page, url: https://www.cottongen.org/ Genomic, Genetic and Breeding Resources for Cotton Research Discovery and Crop Improvement organized by: Species (Gossypium arboreum, barbadense, herbaceum, hirsutum, raimondii, others), Data (Contributors, Download, Submission, Community Projects, Archives, Cotton Trait Ontology, Nomenclatures, and links to Variety Testing Data and NCBISRA Datasets), Search options (Colleague, Genes and Transcripts, Genotype, Germplasm, Map, Markers, Publications, QTLs, Sequences, Trait Evaluation, MegaSearch), Tools (BIMS, BLAST+, CottonCyc, JBrowse, Map Viewer, Primer3, Sequence Retrieval, Synteny Viewer), International Cotton Genome Initiative (ICGI), and Help sources (User manual, FAQs). Also provides Quick Start links for Major Species and Tools.

  10. Share of top U.S. websites ignoring user privacy preferences 2024

    • statista.com
    Updated Mar 4, 2025
    Cite
    Statista (2025). Share of top U.S. websites ignoring user privacy preferences 2024 [Dataset]. https://www.statista.com/statistics/1560221/us-privacy-preference-ignoring/
    Explore at:
    Dataset updated
    Mar 4, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Sep 2024
    Area covered
    United States
    Description

    As of September 2024, 75 percent of the 100 most visited websites in the United States shared personal data with third-party advertisers even when users opted out. Moreover, 70 percent of them dropped third-party advertising cookies even when users opted out.

  11. Magnetotelluric (MT) Sites database

    • researchdata.edu.au
    • ecat.ga.gov.au
    Updated Sep 12, 2023
    Cite
    Ravindranath, P.; Sedgmen, A.; Jiang, W.; Sexton, M.; Duan, J.; Hawkins, S.; Kyi, D. (2023). Magnetotelluric (MT) Sites database [Dataset]. http://doi.org/10.26186/148701
    Explore at:
    Dataset updated
    Sep 12, 2023
    Dataset provided by
    Geoscience Australia (http://ga.gov.au/)
    Authors
    Ravindranath, P.; Sedgmen, A.; Jiang, W.; Sexton, M.; Duan, J.; Hawkins, S.; Kyi, D.
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Description
    The Magnetotelluric (MT) Sites database contains the location of sites where magnetotelluric (MT) data have been acquired by surveys. These surveys have been undertaken by Geoscience Australia and its predecessor organisations and collaborative partners including, but not limited to, the Geological Survey of New South Wales, the Northern Territory Geological Survey, the Geological Survey of Queensland, the Geological Survey of South Australia, Mineral Resources Tasmania, the Geological Survey of Victoria and the Geological Survey of Western Australia and their parent government departments, AuScope, the University of Adelaide, Curtin University and the University of Tasmania. Database development was completed as part of Exploring for the Future (EFTF), and the database will be utilised for ongoing storage of site information from future MT acquisition projects beyond EFTF. Location, elevation, data acquisition date and instrument information are provided with each site. The MT Sites database is a subset of tables within the larger Geophysical Surveys and Datasets Database.

    The resource is accessible via the Geoscience Australia Portal (https://portal.ga.gov.au/); use "Magnetotelluric" as your search term to find the relevant data.
  12. USDA Nematode Collection Database

    • agdatacommons.nal.usda.gov
    • catalog.data.gov
    • +2more
    bin
    Updated Nov 30, 2023
    Cite
    U.S. Department of Agriculture, Agricultural Research Service (2023). USDA Nematode Collection Database [Dataset]. http://doi.org/10.15482/USDA.ADC/1326824
    Explore at:
    Available download formats: bin
    Dataset updated
    Nov 30, 2023
    Dataset provided by
    United States Department of Agriculture (http://usda.gov/)
    Agricultural Research Service (https://www.ars.usda.gov/)
    Authors
    U.S. Department of Agriculture, Agricultural Research Service
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Description

    The USDA Nematode Collection is one of the largest and most valuable nematode collections in existence. It contains over 49,000 permanent slides and vials, with a total repository of nematode specimens reaching several million, including Cobb-Steiner, Thorne, and other valuable collections. Nematodes contained in this collection originate from world-wide sources. The USDA Nematode Collection Database contains over 38,000 species entries. A broad range of data is stored for each specimen, including species, host, origin, collector, date collected and date received. All records are searchable and available to the public through the online database. The physical collection is housed at the USDA Nematology Laboratory in Beltsville, MD. Specimens are available for loan to scientists who cannot personally visit the collection. Please see the Policy for Loaning USDANC Specimens for more information on this process. Scientists and other workers are always welcomed and encouraged to deposit material into the collection.

    Resources in this dataset:
    Resource Title: USDA Nematode Collection Database. File Name: Web Page, url: https://nt.ars-grin.gov/nematodes/search.cfm The database portal for this collection.

  13. Data from: Afromoths, online database of Afrotropical moth species...

    • gbif.org
    Updated Oct 21, 2024
    Cite
    Jurate De Prins; Willy De Prins (2024). Afromoths, online database of Afrotropical moth species (Lepidoptera) [Dataset]. http://doi.org/10.15468/s1kwuw
    Explore at:
    Dataset updated
    Oct 21, 2024
    Dataset provided by
    Global Biodiversity Information Facility (https://www.gbif.org/)
    Belgian Biodiversity Platform
    Authors
    Jurate De Prins; Willy De Prins
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Description

    This dataset covers all relevant information on every Afrotropical moth species. The zoogeographic area covered can be defined as the African continent south of the Sahara (i.e. excl. Morocco, Algeria, Tunisia, Libya and Egypt), the islands in the Atlantic Ocean: Amsterdam Island, Ascension, Cape Verde Archipelago, Inaccessible Island, St. Helena, São Tomé and Principe, Tristan da Cunha, and the islands in the Indian Ocean: Comores (Anjouan, Grande Comore, Mayotte, Mohéli), Madagascar, Mascarene Islands (La Réunion, Mauritius, Rodrigues), Seychelles (Félicité, Mahé, Praslin, Silhouette, a.o.). Furthermore, moth species occurring in the transition zone to the Palaearctic fauna have also been included, namely those from most of the Arabian Peninsula (Kuwait, Oman, Saudi Arabia, United Arab Emirates, Yemen with Socotra) but not Iraq, Jordan and further north. Some Saharan species have also been included (e.g. Hoggar Mts. in Algeria, Tibesti Mts. in South Libya). Utmost care was taken that the data incorporated in the database are correct. We decline any responsibility in case of damage to soft- or hardware based on information used in this website. Persons retrieving information from this website for their own research or for applied aspects such as pest control programmes should acknowledge the usage of data from this website in the following format: De Prins, J. & De Prins, W. 2011. Afromoths, online database of Afrotropical moth species (Lepidoptera). World Wide Web electronic publication (www.afromoths.net)

  14. Credibility Corpus with several datasets (Twitter, Web database) in French...

    • data.gouv.fr
    • data.europa.eu
    • +1more
    application/rar
    Updated Dec 1, 2016
    Cite
    nicolas turenne (2016). Credibility Corpus with several datasets (Twitter, Web database) in French and English [Dataset]. https://www.data.gouv.fr/en/datasets/credibility-corpus-with-several-datasets-twitter-web-database-in-french-and-english/
    Explore at:
    Available download formats: application/rar (33261), application/rar (680351), application/rar (102374), application/rar (40693), application/rar (77120), application/rar (212274)
    Dataset updated
    Dec 1, 2016
    Authors
    nicolas turenne
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    French
    Description

    Description of the corpora

    These datasets were built to analyze information credibility in general (rumor and disinformation in English and French documents), particularly as it occurs on the social web. Target databases about rumors, hoaxes and disinformation helped to collect clear cases of misinformation. Topics (with keywords) were used to build corpora from the microblogging platform Twitter, a major source of rumors and disinformation.

    • 1 corpus of texts from the web database about rumors and disinformation.
    • 4 corpora from Twitter about specific rumors (2 in English, 2 in French).
    • 4 corpora from Twitter randomly built (2 in English, 2 in French).

    Size of the different corpora:
    • Social Web Rumorous corpus: 1,612
    • French Hollande Rumorous corpus (Twitter): 371
    • French Lemon Rumorous corpus (Twitter): 270
    • English Pin Rumorous corpus (Twitter): 679
    • English Swine Rumorous corpus (Twitter): 1024
    • French 1st Random corpus (Twitter): 1000
    • French 2nd Random corpus (Twitter): 1000
    • English 3rd Random corpus (Twitter): 1000
    • English 4th Random corpus (Twitter): 1000
    • French Rihanna Event corpus (Twitter): 543
    • English Rihanna Event corpus (Twitter): 1000
    • French Euro2016 Event corpus (Twitter): 1000
    • English Euro2016 Event corpus (Twitter): 1000

    A matrix links tweets with the 50 most frequent words.

    Text data:
    • _id: message id
    • body text: string text data

    Matrix data: 52 columns (the first column is the id, the second column is the rumor indicator, 1 or -1, and the other columns are words, with value 1 if the message contains the word or 0 if it does not); 11,102 lines (each line is a message).
    • Hidalgo corpus: lines range 1:75
    • Lemon corpus: lines range 76:467
    • Pin rumor: lines range 468:656
    • swine: lines range 657:1311
    • random messages: lines range 1312:11103

    Sample contains: French Pin Rumorous corpus (Twitter): 679. Matrix data: 52 columns (same layout as the full matrix); 189 lines (each line is a message).
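    The 52-column matrix layout described for this corpus (message id, rumor indicator of 1 or -1, then 50 word-presence flags) can be read with a small row parser. This is a minimal sketch under stated assumptions: the description does not give the file's delimiter or say which indicator value marks a rumor, so the comma delimiter, the reading of 1 as "rumor", and the function name `parse_matrix_row` are all hypothetical.

```python
from typing import List, Tuple

N_WORD_COLUMNS = 50  # 52 columns total: id + rumor indicator + 50 word flags


def parse_matrix_row(line: str, delimiter: str = ",") -> Tuple[str, bool, List[int]]:
    """Split one matrix line into (message_id, is_rumor, word_flags).

    ASSUMPTIONS (not stated in the dataset description): fields are
    separated by `delimiter`, and an indicator of 1 means "rumor" while
    -1 means "not a rumor".
    """
    fields = line.strip().split(delimiter)
    if len(fields) != 2 + N_WORD_COLUMNS:
        raise ValueError(f"expected 52 columns, got {len(fields)}")

    message_id = fields[0]

    indicator = int(fields[1])
    if indicator not in (1, -1):
        raise ValueError(f"rumor indicator must be 1 or -1, got {indicator}")

    word_flags = [int(value) for value in fields[2:]]
    if any(flag not in (0, 1) for flag in word_flags):
        raise ValueError("word columns must contain only 0 or 1")

    return message_id, indicator == 1, word_flags


if __name__ == "__main__":
    # Made-up row for illustration: alternating word flags, 25 of them set.
    demo_row = "msg-001,1," + ",".join(["1", "0"] * 25)
    message_id, is_rumor, flags = parse_matrix_row(demo_row)
    print(message_id, is_rumor, sum(flags))
```

    Validating the column count on every row is a cheap way to catch a wrong delimiter guess early, which matters here because the real file format is not documented in the listing.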

  15. NOAA/WDS Paleoclimatology - Baker, R.G., Brayton Site (BRAYTON) North...

    • catalog.data.gov
    • s.cnmilf.com
    Updated Oct 1, 2023
    Cite
    NOAA National Centers for Environmental Information (Point of Contact); NOAA World Data Service for Paleoclimatology (Point of Contact) (2023). NOAA/WDS Paleoclimatology - Baker, R.G., Brayton Site (BRAYTON) North American Plant Macrofossil Database [Dataset]. https://catalog.data.gov/dataset/noaa-wds-paleoclimatology-baker-r-g-brayton-site-brayton-north-american-plant-macrofossil-datab1
    Explore at:
    Dataset updated
    Oct 1, 2023
    Dataset provided by
    National Oceanic and Atmospheric Administration (http://www.noaa.gov/)
    National Centers for Environmental Information (https://www.ncei.noaa.gov/)
    Description

    This archived Paleoclimatology Study is available from the NOAA National Centers for Environmental Information (NCEI), under the World Data Service (WDS) for Paleoclimatology. The associated NCEI study type is Plant Macrofossil. The data include parameters of plant macrofossil (population abundance) with a geographic location of Iowa, United States Of America. The time period coverage is from 14473 to 14447 in calendar years before present (BP). See metadata information for parameter and study location details. Please cite this study when using the data.

  16. PIBASE.ligands database mysql dump

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 24, 2020
    Cite
    Davis, Fred (2020). PIBASE.ligands database mysql dump [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_29588
    Dataset updated
    Jan 24, 2020
    Dataset authored and provided by
    Davis, Fred
    License

    https://www.gnu.org/licenses/old-licenses/gpl-2.0-standalone.html

    Description

    This dataset is a MySQL dump of the PIBASE.ligands database, which describes the overlap of small molecule and protein binding sites (http://fredpdavis.com/pibase.ligands), as described in:

    The overlap of small molecule and protein binding sites within families of protein structures. Davis FP, Sali A. PLoS Comput Biology 2010 6(2): e1000668. doi:10.1371/journal.pcbi.1000668

  17. Hilco Streambank | Web Hosting & Domain Names | Technology Data

    • datastore.forage.ai
    Updated Sep 19, 2024
    (2024). Hilco Streambank | Web Hosting & Domain Names | Technology Data [Dataset]. https://datastore.forage.ai/searchresults/?resource_keyword=web
    Dataset updated
    Sep 19, 2024
    Description

    Hilco Streambank is a trusted marketplace leader dedicated to reliable and transparent service. As the world's largest IPv4 address broker, Hilco Streambank has successfully completed more transfers than any other organization, worldwide, with over $0 billion generated for clients since 2014. The company's team has extensive experience in regional internet registry transfer regulations and provides buyers and sellers with expert advice to help reach a deal that meets even the most complex of needs.

    Hilco Streambank's online marketplace provides a streamlined and transparent process to transfer the rights to IPv4 assets, including buyer and seller checklists, private brokered solutions, and LEASE IPv4 options. The company also offers the IPv4 Analyzer widget and its ReView digital IP address audit tool, a free tool working with 6connect. With operating presence in all five internet registries, including ARIN, APNIC, RIPE, LACNIC, and AFRINIC, Hilco Streambank is well-positioned to facilitate IPv4 transactions worldwide.

  18. SQL Injection Attack Netflow

    • data.niaid.nih.gov
    • zenodo.org
    Updated Sep 28, 2022
    Adrián Campazas (2022). SQL Injection Attack Netflow [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6907251
    Dataset updated
    Sep 28, 2022
    Dataset provided by
    Adrián Campazas
    Ignacio Crespo
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Introduction

    These datasets contain SQL injection attacks (SQLIA) as malicious NetFlow data. The attacks carried out are Union Query SQL injection and Blind SQL injection. The SQLMAP tool was used to perform the attacks.

    The NetFlow traffic was generated using DOROTHEA (DOcker-based fRamework fOr gaTHering nEtflow trAffic). NetFlow is a network protocol developed by Cisco for collecting and monitoring network traffic flow data. A flow is defined as a unidirectional sequence of packets with some common properties that pass through a network device.

    Datasets

    The first dataset (D1) was collected to train the detection models. The other (D2) was collected using different attacks than those used in training, in order to test the models and ensure their generalization.

    The datasets contain both benign and malicious traffic. All collected datasets are balanced.

    The version of NetFlow used to build the datasets is 5.

    Dataset | Aim | Samples | Benign-malicious traffic ratio
    D1 | Training | 400,003 | 50%
    D2 | Test | 57,239 | 50%

    Infrastructure and implementation

    Two sets of flow data were collected with DOROTHEA. DOROTHEA is a Docker-based framework for NetFlow data collection. It allows you to build interconnected virtual networks to generate and collect flow data using the NetFlow protocol. In DOROTHEA, network traffic packets are sent to a NetFlow generator that has a sensor ipt_netflow installed. The sensor consists of a module for the Linux kernel using Iptables, which processes the packets and converts them to NetFlow flows.

    DOROTHEA is configured to use NetFlow v5 and to export a flow after it has been inactive for 15 seconds or after it has been active for 1800 seconds (30 minutes).
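
    The two export conditions above can be illustrated with a small sketch. It only models the rule as stated (inactive for 15 seconds, or active for 1800 seconds); the function name `should_export`, the timestamp arguments, and the use of `>=` at the boundaries are assumptions for illustration and do not describe how the ipt_netflow sensor is implemented.

```python
INACTIVE_TIMEOUT_S = 15    # export once the flow has been idle this long
ACTIVE_TIMEOUT_S = 1800    # export once the flow has been open this long (30 minutes)


def should_export(first_seen: float, last_seen: float, now: float) -> bool:
    """Return True when a flow must be exported under either timeout."""
    idle_for = now - last_seen      # seconds since the last packet of the flow
    active_for = now - first_seen   # seconds since the first packet of the flow
    return idle_for >= INACTIVE_TIMEOUT_S or active_for >= ACTIVE_TIMEOUT_S


print(should_export(first_seen=0.0, last_seen=10.0, now=20.0))      # False: idle 10 s, active 20 s
print(should_export(first_seen=0.0, last_seen=10.0, now=25.0))      # True: idle for 15 s
print(should_export(first_seen=0.0, last_seen=1799.0, now=1800.0))  # True: active for 1800 s
```

    One consequence of the active timeout is that a single long-lived connection can appear as several flow records in the collected data.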

    Benign traffic generation nodes simulate network traffic generated by real users, performing tasks such as searching in web browsers, sending emails, or establishing Secure Shell (SSH) connections. Such tasks run as Python scripts. Users may customize them or even incorporate their own. The network traffic is managed by a gateway that performs two main tasks. On the one hand, it routes packets to the Internet. On the other hand, it sends them to a NetFlow data generation node (this process is carried out similarly for packets received from the Internet).

    The malicious traffic collected (SQLI attacks) was generated using SQLMAP. SQLMAP is a penetration testing tool used to automate the process of detecting and exploiting SQL injection vulnerabilities.

    The attacks were executed on 16 nodes, which launched SQLMAP with the parameters listed in the following table.

    Parameters | Description
    '--banner','--current-user','--current-db','--hostname','--is-dba','--users','--passwords','--privileges','--roles','--dbs','--tables','--columns','--schema','--count','--dump','--comments','--schema' | Enumerate users, password hashes, privileges, roles, databases, tables and columns
    --level=5 | Increase the probability of a false positive identification
    --risk=3 | Increase the probability of extracting data
    --random-agent | Select the User-Agent randomly
    --batch | Never ask for user input, use the default behavior
    --answers="follow=Y" | Predefined answers to yes
    

    Every node executed SQLIA on 200 victim nodes. The victim nodes had deployed a web form vulnerable to Union-type injection attacks, which was connected to the MySQL or SQLServer database engines (50% of the victim nodes deployed MySQL and the other 50% deployed SQLServer).

    The web service was accessible from ports 443 and 80, which are the ports typically used to deploy web services. The IP address space was 182.168.1.1/24 for the benign and malicious traffic-generating nodes. For victim nodes, the address space was 126.52.30.0/24. The malicious traffic in the test sets was collected under different conditions. For D1, SQLIA was performed using Union attacks on the MySQL and SQLServer databases.

    However, for D2, BlindSQL SQLIAs were performed against the web form connected to a PostgreSQL database. The IP address spaces of the networks were also different from those of D1. In D2, the IP address space was 152.148.48.1/24 for benign and malicious traffic generating nodes and 140.30.20.1/24 for victim nodes.
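
    Because the address spaces differ between D1 and D2, an address from a flow record can be mapped back to its dataset and role. The sketch below uses Python's standard `ipaddress` module with the prefixes exactly as written above; the names `NETWORKS` and `classify` and the "generator"/"victim" labels are assumptions made for illustration. Note that `strict=False` is required, since prefixes such as 182.168.1.1/24 have host bits set and would otherwise be rejected.

```python
import ipaddress

# Address spaces as written in the description. strict=False normalizes a
# prefix such as "182.168.1.1/24" (host bits set) to the network 182.168.1.0/24.
NETWORKS = {
    ("D1", "generator"): ipaddress.ip_network("182.168.1.1/24", strict=False),
    ("D1", "victim"): ipaddress.ip_network("126.52.30.0/24", strict=False),
    ("D2", "generator"): ipaddress.ip_network("152.148.48.1/24", strict=False),
    ("D2", "victim"): ipaddress.ip_network("140.30.20.1/24", strict=False),
}


def classify(address: str):
    """Return (dataset, role) for an address, or None if it is in none of the networks."""
    ip = ipaddress.ip_address(address)
    for label, network in NETWORKS.items():
        if ip in network:
            return label
    return None


print(classify("126.52.30.17"))    # ('D1', 'victim')
print(classify("152.148.48.200"))  # ('D2', 'generator')
print(classify("10.0.0.1"))        # None
```

    The "generator" role covers both benign and malicious traffic-generating nodes, since the description places them in the same address space.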

    The MySQL server ran MariaDB version 10.4.12. Microsoft SQL Server 2017 Express and PostgreSQL version 13 were also used.

  19. CBP Customs Rulings Online Search System (CROSS)

    • catalog.data.gov
    • gimi9.com
    • +2more
    Updated Oct 19, 2022
    BEMSD (2022). CBP Customs Rulings Online Search System (CROSS) [Dataset]. https://catalog.data.gov/dataset/cbp-customs-rulings-online-search-system-cross
    Dataset updated
    Oct 19, 2022
    Dataset provided by
    BEMSD
    Description

    CROSS is a searchable database of CBP rulings that can be retrieved based on simple or complex search characteristics using keywords and Boolean operators. CROSS has the added functionality of cross-referencing rulings from the initial search result set with their modified, revoked or referenced counterparts. Rulings collections are separated into Headquarters and New York.

  20. World Sites

    • ecaidata.org
    Updated Oct 4, 2014
    ECAI Clearinghouse (2014). World Sites [Dataset]. https://ecaidata.org/dataset/ecaiclearinghouse-id-269
    Dataset updated
    Oct 4, 2014
    Dataset provided by
    ECAI Clearinghouse
    Area covered
    World
    Description

    The initial data source was the UNESCO web site, supplemented by individual work on different countries/regions. A database of cultural heritage sites assembled by volunteers at the Archaeological Computing Laboratory, University of Sydney. The database is now available online through ECAI and can be updated through a password-controlled web browser interface.

