100+ datasets found
  1. 365 Data Science Web site statistics

    • kaggle.com
    zip
    Updated Aug 9, 2024
    Cite
    yasser messahli (2024). 365 Data Science Web site statistics [Dataset]. https://www.kaggle.com/yassermessahli/365-data-science-web-site-statistics
    Explore at:
    Available download formats: zip (3895191 bytes)
    Dataset updated
    Aug 9, 2024
    Authors
    yasser messahli
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    365 Data Science Database

    365 Data Science is a website that provides online courses and resources for learning data science, machine learning, and data analysis.

    It is common for websites that offer online courses to have databases to store information about their courses, students, and progress. It is also possible that they use databases for storing and organizing the data used in their courses and examples.

    If you're looking for specific information about the database used by 365 Data Science, I recommend reaching out to them directly through their website or support channels.

  2. Websites using Participants Database

    • webtechsurvey.com
    csv
    Updated Jul 2, 2025
    + more versions
    Cite
    WebTechSurvey (2025). Websites using Participants Database [Dataset]. https://webtechsurvey.com/technology/participants-database
    Explore at:
    Available download formats: csv
    Dataset updated
    Jul 2, 2025
    Dataset authored and provided by
    WebTechSurvey
    License

    https://webtechsurvey.com/terms

    Time period covered
    2025
    Area covered
    Global
    Description

    A complete list of live websites using the Participants Database technology, compiled through global website indexing conducted by WebTechSurvey.

  3. PhishingWebsites

    • openml.org
    Updated Feb 16, 2016
    Cite
    Rami Mustafa A Mohammad ( University of Huddersfield; rami.mohammad '@' hud.ac.uk; rami.mustafa.a '@' gmail.com) Lee McCluskey (University of Huddersfield; t.l.mccluskey '@' hud.ac.uk ) Fadi Thabtah (Canadian University of Dubai; fadi '@' cud.ac.ae) (2016). PhishingWebsites [Dataset]. https://www.openml.org/d/4534
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Feb 16, 2016
    Authors
    Rami Mustafa A Mohammad ( University of Huddersfield; rami.mohammad '@' hud.ac.uk; rami.mustafa.a '@' gmail.com) Lee McCluskey (University of Huddersfield; t.l.mccluskey '@' hud.ac.uk ) Fadi Thabtah (Canadian University of Dubai; fadi '@' cud.ac.ae)
    Description

    Author: Rami Mustafa A Mohammad (University of Huddersfield; rami.mohammad '@' hud.ac.uk; rami.mustafa.a '@' gmail.com), Lee McCluskey (University of Huddersfield; t.l.mccluskey '@' hud.ac.uk), Fadi Thabtah (Canadian University of Dubai; fadi '@' cud.ac.ae)
    Source: UCI
    Please cite: Please refer to the Machine Learning Repository's citation policy

    Data Set Information:

    One of the challenges faced by our research was the unavailability of reliable training datasets. In fact, this challenge faces any researcher in the field. Although plenty of articles about predicting phishing websites have been disseminated these days, no reliable training dataset has been published publicly, perhaps because there is no agreement in the literature on the definitive features that characterize phishing webpages, which makes it difficult to shape a dataset that covers all possible features. In this dataset, we shed light on the important features that have proved to be sound and effective in predicting phishing websites. In addition, we propose some new features.

    Attribute Information:

    For further information about the features, see the features file in the data folder of UCI.

    Relevant Papers:

    Mohammad, Rami, McCluskey, T.L. and Thabtah, Fadi (2012) An Assessment of Features Related to Phishing Websites using an Automated Technique. In: International Conference For Internet Technology And Secured Transactions. ICITST 2012. IEEE, London, UK, pp. 492-497. ISBN 978-1-4673-5325-0

    Mohammad, Rami, Thabtah, Fadi Abdeljaber and McCluskey, T.L. (2014) Predicting phishing websites based on self-structuring neural network. Neural Computing and Applications, 25 (2). pp. 443-458. ISSN 0941-0643

    Mohammad, Rami, McCluskey, T.L. and Thabtah, Fadi Abdeljaber (2014) Intelligent Rule based Phishing Websites Classification. IET Information Security, 8 (3). pp. 153-160. ISSN 1751-8709

    Citation Request:

    Please refer to the Machine Learning Repository's citation policy

  4. Websites using Advanced Database Cleaner

    • webtechsurvey.com
    csv
    Cite
    WebTechSurvey, Websites using Advanced Database Cleaner [Dataset]. https://webtechsurvey.com/technology/advanced-database-cleaner
    Explore at:
    Available download formats: csv
    Dataset authored and provided by
    WebTechSurvey
    License

    https://webtechsurvey.com/terms

    Time period covered
    2025
    Area covered
    Global
    Description

    A complete list of live websites using the Advanced Database Cleaner technology, compiled through global website indexing conducted by WebTechSurvey.

  5. Database of a Domain Store

    • kaggle.com
    zip
    Updated Sep 2, 2023
    Cite
    Artin Mohammadi (2023). Database of a Domain Store [Dataset]. https://www.kaggle.com/datasets/sheikhartin/a-fake-dataset-from-a-domain-store
    Explore at:
    Available download formats: zip (493020 bytes)
    Dataset updated
    Sep 2, 2023
    Authors
    Artin Mohammadi
    Description

    This work was for a YouTube video in which we wanted to learn how to create fake data and how to query the database...

  6. Websites using Wordpress Database Reset

    • webtechsurvey.com
    csv
    Updated Oct 9, 2025
    + more versions
    Cite
    WebTechSurvey (2025). Websites using Wordpress Database Reset [Dataset]. https://webtechsurvey.com/technology/wordpress-database-reset
    Explore at:
    Available download formats: csv
    Dataset updated
    Oct 9, 2025
    Dataset authored and provided by
    WebTechSurvey
    License

    https://webtechsurvey.com/terms

    Time period covered
    2025
    Area covered
    Global
    Description

    A complete list of live websites using the Wordpress Database Reset technology, compiled through global website indexing conducted by WebTechSurvey.

  7. Websites using Wp Database Error Manager

    • webtechsurvey.com
    csv
    Updated Oct 10, 2025
    Cite
    WebTechSurvey (2025). Websites using Wp Database Error Manager [Dataset]. https://webtechsurvey.com/technology/wp-database-error-manager
    Explore at:
    Available download formats: csv
    Dataset updated
    Oct 10, 2025
    Dataset authored and provided by
    WebTechSurvey
    License

    https://webtechsurvey.com/terms

    Time period covered
    2025
    Area covered
    Global
    Description

    A complete list of live websites using the Wp Database Error Manager technology, compiled through global website indexing conducted by WebTechSurvey.

  8. CS Track database - Dataset

    • zenodo.org
    • data.europa.eu
    csv
    Updated Nov 28, 2022
    + more versions
    Cite
    TIDE-UPF (2022). CS Track database - Dataset [Dataset]. http://doi.org/10.5281/zenodo.7356627
    Explore at:
    Available download formats: csv
    Dataset updated
    Nov 28, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    TIDE-UPF
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This is the main dataset, which consists of a list of all relevant details of the CS Track database. The database contains information about 4949 Citizen Science (CS) projects extracted from more than 59 websites. This dataset contains the following information from the CS Track database:

    • the title of each CS project
    • the date the data were extracted
    • the language of the CS project information
    • the URL(s) of the website(s) from which the CS project information was extracted. This data might be useful to consult for other studies developed in the CS Track consortium
    • the full list of assignments to research areas and SDGs for each CS project.
  9. CompuCrawl: Full database and code

    • dataverse.nl
    Updated Sep 23, 2025
    Cite
    Richard Haans (2025). CompuCrawl: Full database and code [Dataset]. http://doi.org/10.34894/OBVAOY
    Explore at:
    Dataset updated
    Sep 23, 2025
    Dataset provided by
    DataverseNL
    Authors
    Richard Haans
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This folder contains the full set of code and data for the CompuCrawl database. The database contains the archived websites of publicly traded North American firms listed in the Compustat database between 1996 and 2020, representing 11,277 firms with 86,303 firm/year observations and 1,617,675 webpages in the final cleaned and selected set.

    The files are ordered by moment of use in the work flow. For example, the first file in the list is the input file for code files 01 and 02, which create and update the two tracking files "scrapedURLs.csv" and "URLs_1_deeper.csv" and which write HTML files to its folder. "HTML.zip" is the resultant folder, converted to .zip for ease of sharing. Code file 03 then reads this .zip file and is therefore below it in the ordering.

    The full set of files, in order of use, is as follows:

    • Compustat_2021.xlsx: The input file containing the URLs to be scraped and their date range.
    • 01 Collect frontpages.py: Python script scraping the front pages of the list of URLs and generating a list of URLs one page deeper in the domains.
    • URLs_1_deeper.csv: List of URLs one page deeper on the main domains.
    • 02 Collect further pages.py: Python script scraping the list of URLs one page deeper in the domains.
    • scrapedURLs.csv: Tracking file containing all URLs that were accessed and their scraping status.
    • HTML.zip: Archived version of the set of individual HTML files.
    • 03 Convert HTML to plaintext.py: Python script converting the individual HTML pages to plaintext.
    • TXT_uncleaned.zip: Archived version of the converted yet uncleaned plaintext files.
    • input_categorization_allpages.csv: Input file for classification of pages using GPT according to their HTML title and URL.
    • 04 GPT application.py: Python script using OpenAI's API to classify selected pages according to their HTML title and URL.
    • categorization_applied.csv: Output file containing classification of selected pages.
    • exclusion_list.xlsx: File containing three sheets: 'gvkeys' containing the GVKEYs of duplicate observations (that need to be excluded), 'pages' containing page IDs for pages that should be removed, and 'sentences' containing (sub-)sentences to be removed.
    • 05 Clean and select.py: Python script applying data selection and cleaning (including selection based on page category), with setting and decisions described at the top of the script. This script also combined individual pages into one combined observation per GVKEY/year.
    • metadata.csv: Metadata containing information on all processed HTML pages, including those not selected.
    • TXT_cleaned.zip: Archived version of the selected and cleaned plaintext page files. This file serves as input for the word embeddings application.
    • TXT_combined.zip: Archived version of the combined plaintext files at the GVKEY/year level. This file serves as input for the data description using topic modeling.
    • 06 Topic model.R: R script that loads up the combined text data from the folder stored in "TXT_combined.zip", applies further cleaning, and estimates a 125-topic model.
    • TM_125.RData: RData file containing the results of the 125-topic model.
    • loadings125.csv: CSV file containing the loadings for all 125 topics for all GVKEY/year observations that were included in the topic model.
    • 125_topprob.xlsx: Overview of top-loading terms for the 125 topic model.
    • 07 Word2Vec train and align.py: Python script that loads the plaintext files in the "TXT_cleaned.zip" archive to train a series of Word2Vec models and subsequently align them in order to compare word embeddings across time periods.
    • Word2Vec_models.zip: Archived version of the saved Word2Vec models, both unaligned and aligned.
    • 08 Word2Vec work with aligned models.py: Python script which loads the trained Word2Vec models to trace the development of the embeddings for the terms "sustainability" and "profitability" over time.
    • 99 Scrape further levels down.py: Python script that can be used to generate a list of unscraped URLs from the pages that themselves were one level deeper than the front page.
    • URLs_2_deeper.csv: CSV file containing unscraped URLs from the pages that themselves were one level deeper than the front page.

    For those only interested in downloading the final database of texts, the files "HTML.zip", "TXT_uncleaned.zip", "TXT_cleaned.zip", and "TXT_combined.zip" contain the full set of HTML pages, the processed but uncleaned texts, the selected and cleaned texts, and combined and cleaned texts at the GVKEY/year level, respectively.

    The following webpage contains answers to frequently asked questions: https://haans-mertens.github.io/faq/. More information on the database and the underlying project can be found here: https://haans-mertens.github.io/ and the following article: "The Internet Never Forgets: A Four-Step Scraping Tutorial, Codebase, and Database for Longitudinal Organizational Website Data", by Richard F.J. Haans and Marc J. Mertens in Organizational Research Methods. The full paper can be accessed here.
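    To give a feel for the HTML-to-plaintext step in the file list (script 03), here is a minimal sketch using only Python's standard library. It is not the authors' script: the class name TextExtractor and the function html_to_plaintext are hypothetical, and the choice to skip script and style content is an assumption about what "plaintext" should exclude.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> content."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return "\n".join(self._chunks)


def html_to_plaintext(html):
    """Return the visible text of an HTML string, one text run per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

    A real pipeline over archived pages would probably also need encoding detection and more careful handling of malformed markup, which this sketch leaves out.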

  10. Service Provider Database

    • catalog.data.gov
    Updated Sep 30, 2025
    Cite
    International Trade Administration (2025). Service Provider Database [Dataset]. https://catalog.data.gov/dataset/service-provider-database-46d19
    Explore at:
    Dataset updated
    Sep 30, 2025
    Dataset provided by
    International Trade Administration (http://trade.gov/)
    Description

    Database of Service Provider Names, Websites, Mission, Location by Country, and Service Type who participated in the SelectUSA 2017 and 2018 Investment Summits

  11. detecting phishing websites

    • kaggle.com
    zip
    Updated May 12, 2024
    Cite
    gov (2024). detecting phishing websites [Dataset]. https://www.kaggle.com/datasets/mohammedsemry/detecting-phishing-websites
    Explore at:
    Available download formats: zip (83149 bytes)
    Dataset updated
    May 12, 2024
    Authors
    gov
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Dataset

    This dataset was created by gov

    Released under Database: Open Database, Contents: Database Contents

    Contents

  12. IPv6 readiness - websites having a AAAA coverage in DNS records (as % of...

    • data.europa.eu
    csv, rdf n-triples +2
    Updated May 21, 2014
    + more versions
    Cite
    Directorate-General for Communications Networks, Content and Technology (2014). IPv6 readiness - websites having a AAAA coverage in DNS records (as % of most visited websites) [Dataset]. https://data.europa.eu/data/datasets/cgmsrtpos2a3jdaenxv5ra?locale=en
    Explore at:
    Available download formats: unknown, rdf xml, rdf n-triples, csv
    Dataset updated
    May 21, 2014
    Dataset authored and provided by
    Directorate-General for Communications Networks, Content and Technology
    License

    http://data.europa.eu/eli/dec/2011/833/oj

    Description

    IPv6-ready websites are those having at least one AAAA record in their DNS records (meaning the website is visible to, and can reply to, users with IPv6 connectivity). Tests are done every trimester through a script run by the IPv6 Observatory study on the list of the 1 million most visited websites provided by Alexa. Websites are attributed to countries on the basis of their main operation location as provided by the MaxMind GeoIP database.
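    The readiness criterion above (at least one AAAA record) can be approximated with a short sketch. This is not the IPv6 Observatory's script: the function names has_aaaa and ipv6_ready_share are hypothetical, and socket.getaddrinfo goes through the system resolver (which may also consult local hosts files), so a strict DNS AAAA query would need a dedicated DNS library.

```python
import socket


def has_aaaa(hostname, resolver=socket.getaddrinfo):
    """Return True if the resolver yields at least one IPv6 address."""
    try:
        # Restrict the lookup to the IPv6 address family.
        results = resolver(hostname, None, socket.AF_INET6)
    except socket.gaierror:
        # Resolution failed, or no IPv6 address is known for this name.
        return False
    return any(family == socket.AF_INET6 for family, *_ in results)


def ipv6_ready_share(hostnames, resolver=socket.getaddrinfo):
    """Percentage of hostnames with at least one IPv6 address."""
    hostnames = list(hostnames)
    if not hostnames:
        return 0.0
    ready = sum(has_aaaa(name, resolver) for name in hostnames)
    return 100.0 * ready / len(hostnames)
```

    The resolver parameter exists only so the lookup can be swapped out (for example in tests or for a different DNS backend); by default the functions perform live lookups.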

    Original source

    IPv6 Observatory, Study for the EC realized by inno:

    http://www.ipv6observatory.eu/the-study/

    Parent dataset

    This dataset is part of another dataset:

    http://digital-agenda-data.eu/datasets/digital_agenda_scoreboard_key_indicators

  13. Common Screens Project - Free Website Screenshots

    • kaggle.com
    zip
    Updated Nov 4, 2022
    Cite
    AKS (2022). Common Screens Project - Free Website Screenshots [Dataset]. https://www.kaggle.com/datasets/bpmtips/commonscreens/suggestions
    Explore at:
    Available download formats: zip (1958628844 bytes)
    Dataset updated
    Nov 4, 2022
    Authors
    AKS
    Description

    I would like to introduce you to the Common Screens project (https://commonscreens.com), which is based on data derived from Common Crawl. It supplements the Common Crawl project with screenshots of more than 55 million websites and domains, with a target of 150 million. The screenshots can be used for OCR, machine learning, domain profiling, categorization, and similar tasks. You can hotlink the AWS CloudFront CDN screenshot images directly into your project or website.

    Check out the metadata section, which provides a CSV-format database of all major domain information.

    The uploads are in process.

    The dataset below is metadata for the screenshot images; you can link the S3 bucket directly to do image processing on 55+ million screenshots of websites.

  14. Business Website Data | 50 Countries Coverage | GDPR Compliant | 7,838,729...

    • datarade.ai
    .json, .csv
    Updated Aug 22, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    HitHorizons (2025). Business Website Data | 50 Countries Coverage | GDPR Compliant | 7,838,729 Websites [Dataset]. https://datarade.ai/data-products/business-website-data-48-countries-coverage-gdpr-complian-hithorizons
    Explore at:
    Available download formats: .json, .csv
    Dataset updated
    Aug 22, 2025
    Dataset authored and provided by
    HitHorizons
    Area covered
    Estonia, Luxembourg, Belgium, United Kingdom, Denmark, Italy, Germany, France
    Description

    The Business Websites Database of European Companies serves as an invaluable and comprehensive resource, meticulously curated to include an extensive and diverse collection of links directing users to the official websites of prominent and influential companies headquartered or operating within Europe. This database spans a wide array of industries and sectors, ranging from technology and finance to manufacturing, healthcare, retail, and beyond, ensuring that users have access to a broad spectrum of business information. By offering direct access to these companies' online platforms, the database not only facilitates seamless navigation to their digital presence but also provides users with the opportunity to explore detailed insights about their products, services, corporate values, and market activities, making it an essential tool for researchers, professionals, and anyone seeking to engage with the European business landscape.

  15. Data from: Inventory of online public databases and repositories holding...

    • catalog.data.gov
    • s.cnmilf.com
    • +2more
    Updated Apr 21, 2025
    + more versions
    Cite
    Agricultural Research Service (2025). Inventory of online public databases and repositories holding agricultural data in 2017 [Dataset]. https://catalog.data.gov/dataset/inventory-of-online-public-databases-and-repositories-holding-agricultural-data-in-2017-d4c81
    Explore at:
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Service (https://www.ars.usda.gov/)
    Description

    United States agricultural researchers have many options for making their data available online. This dataset aggregates the primary sources of ag-related data and determines where researchers are likely to deposit their agricultural data. These data serve as both a current landscape analysis and also as a baseline for future studies of ag research data.

    Purpose

    As sources of agricultural data become more numerous and disparate, and collaboration and open data become more expected if not required, this research provides a landscape inventory of online sources of open agricultural data. An inventory of current agricultural data sharing options will help assess how the Ag Data Commons, a platform for USDA-funded data cataloging and publication, can best support data-intensive and multi-disciplinary research. It will also help agricultural librarians assist their researchers in data management and publication. The goals of this study were to:

    • establish where agricultural researchers in the United States (land grant and USDA researchers, primarily ARS, NRCS, USFS and other agencies) currently publish their data, including general research data repositories, domain-specific databases, and the top journals
    • compare how much data is in institutional vs. domain-specific vs. federal platforms
    • determine which repositories are recommended by top journals that require or recommend the publication of supporting data
    • ascertain where researchers not affiliated with funding or initiatives possessing a designated open data repository can publish data

    Approach

    The National Agricultural Library team focused on Agricultural Research Service (ARS), Natural Resources Conservation Service (NRCS), and United States Forest Service (USFS) style research data, rather than ag economics, statistics, and social sciences data. To find domain-specific, general, institutional, and federal agency repositories and databases that are open to US research submissions and have some amount of ag data, resources including re3data, libguides, and ARS lists were analysed. Primarily environmental or public health databases were not included, but places where ag grantees would publish data were considered.

    Search methods

    We first compiled a list of known domain specific USDA / ARS datasets / databases that are represented in the Ag Data Commons, including ARS Image Gallery, ARS Nutrition Databases (sub-components), SoyBase, PeanutBase, National Fungus Collection, i5K Workspace @ NAL, and GRIN. We then searched using search engines such as Bing and Google for non-USDA / federal ag databases, using Boolean variations of “agricultural data” / “ag data” / “scientific data” + NOT + USDA (to filter out the federal / USDA results). Most of these results were domain specific, though some contained a mix of data subjects.

    We then used search engines such as Bing and Google to find top agricultural university repositories using variations of “agriculture”, “ag data” and “university” to find schools with agriculture programs. Using that list of universities, we searched each university web site to see if their institution had a repository for their unique, independent research data if not apparent in the initial web browser search. We found both ag specific university repositories and general university repositories that housed a portion of agricultural data. Ag specific university repositories are included in the list of domain-specific repositories. Results included Columbia University – International Research Institute for Climate and Society, UC Davis – Cover Crops Database, etc. If a general university repository existed, we determined whether that repository could filter to include only data results after our chosen ag search terms were applied. General university databases that contain ag data included Colorado State University Digital Collections, University of Michigan ICPSR (Inter-university Consortium for Political and Social Research), and University of Minnesota DRUM (Digital Repository of the University of Minnesota). We then split out NCBI (National Center for Biotechnology Information) repositories.

    Next we searched the internet for open general data repositories using a variety of search engines, and repositories containing a mix of data, journals, books, and other types of records were tested to determine whether that repository could filter for data results after search terms were applied. General subject data repositories include Figshare, Open Science Framework, PANGEA, Protein Data Bank, and Zenodo.

    Finally, we compared scholarly journal suggestions for data repositories against our list to fill in any missing repositories that might contain agricultural data. Extensive lists of journals were compiled, in which USDA published in 2012 and 2016, combining search results in ARIS, Scopus, and the Forest Service's TreeSearch, plus the USDA web sites Economic Research Service (ERS), National Agricultural Statistics Service (NASS), Natural Resources and Conservation Service (NRCS), Food and Nutrition Service (FNS), Rural Development (RD), and Agricultural Marketing Service (AMS). The top 50 journals' author instructions were consulted to see if they (a) ask or require submitters to provide supplemental data, or (b) require submitters to submit data to open repositories. Data are provided for Journals based on a 2012 and 2016 study of where USDA employees publish their research studies, ranked by number of articles, including 2015/2016 Impact Factor, Author guidelines, Supplemental Data?, Supplemental Data reviewed?, Open Data (Supplemental or in Repository) Required? and Recommended data repositories, as provided in the online author guidelines for each of the top 50 journals.

    Evaluation

    We ran a series of searches on all resulting general subject databases with the designated search terms. From the results, we noted the total number of datasets in the repository, type of resource searched (datasets, data, images, components, etc.), percentage of the total database that each term comprised, any dataset with a search term that comprised at least 1% and 5% of the total collection, and any search term that returned greater than 100 and greater than 500 results. We compared domain-specific databases and repositories based on parent organization, type of institution, and whether data submissions were dependent on conditions such as funding or affiliation of some kind.

    Results

    A summary of the major findings from our data review:

    • Over half of the top 50 ag-related journals from our profile require or encourage open data for their published authors.
    • There are few general repositories that are both large AND contain a significant portion of ag data in their collection. GBIF (Global Biodiversity Information Facility), ICPSR, and ORNL DAAC were among those that had over 500 datasets returned with at least one ag search term and had that result comprise at least 5% of the total collection.
    • Not even one quarter of the domain-specific repositories and datasets reviewed allow open submission by any researcher regardless of funding or affiliation.

    See included README file for descriptions of each individual data file in this dataset.

    Resources in this dataset:

    • Resource Title: Journals. File Name: Journals.csv
    • Resource Title: Journals - Recommended repositories. File Name: Repos_from_journals.csv
    • Resource Title: TDWG presentation. File Name: TDWG_Presentation.pptx
    • Resource Title: Domain Specific ag data sources. File Name: domain_specific_ag_databases.csv
    • Resource Title: Data Dictionary for Ag Data Repository Inventory. File Name: Ag_Data_Repo_DD.csv
    • Resource Title: General repositories containing ag data. File Name: general_repos_1.csv
    • Resource Title: README and file inventory. File Name: README_InventoryPublicDBandREepAgData.txt
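    The evaluation described above records, for each search term, its share of the repository and whether it crosses the 1% / 5% and 100 / 500 result thresholds. A minimal sketch of that bookkeeping follows; the function name summarize_term and the dictionary keys are hypothetical, not taken from the dataset's files, and any numbers passed in are purely illustrative.

```python
def summarize_term(term, hits, total):
    """Share of a repository matched by one search term, plus threshold flags.

    hits  -- number of results returned for the term
    total -- total number of datasets in the repository
    """
    if total <= 0:
        raise ValueError("total must be positive")
    pct = 100.0 * hits / total
    return {
        "term": term,
        "hits": hits,
        "pct_of_total": pct,
        "at_least_1pct": pct >= 1.0,
        "at_least_5pct": pct >= 5.0,
        "over_100_hits": hits > 100,
        "over_500_hits": hits > 500,
    }
```

    For instance, a term returning 600 results in a repository of 10,000 datasets would be flagged on all four thresholds, since it makes up 6% of the collection.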

  16. The websites of seven public databases used in this work.

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Hai-Lu Wu; Zhao-Tao Duan; Zong-Dan Jiang; Wei-Jun Cao; Zhi-Bing Wang; Ke-Wei Hu; Xin Gao; Shu-Kui Wang; Bang-Shun He; Zhen-Yu Zhang; Hong-Guang Xie (2023). The websites of seven public databases used in this work. [Dataset]. http://doi.org/10.1371/journal.pone.0074381.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Hai-Lu Wu; Zhao-Tao Duan; Zong-Dan Jiang; Wei-Jun Cao; Zhi-Bing Wang; Ke-Wei Hu; Xin Gao; Shu-Kui Wang; Bang-Shun He; Zhen-Yu Zhang; Hong-Guang Xie
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    NCBI, National Center for Biotechnology Information; KEGG, Kyoto Encyclopedia of Genes and Genomes.

  17. Website movie data

    • data.mendeley.com
    Updated Jan 17, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Junang Gan (2025). Website movie data [Dataset]. http://doi.org/10.17632/7s4ttfpg88.1
    Explore at:
    Dataset updated
    Jan 17, 2025
    Authors
    Junang Gan
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Image and Text Introduction Data of Movies in Eleven Sites

  18. Websites using Cf7 Database

    • webtechsurvey.com
    csv
    Updated Oct 13, 2025
    + more versions
    Cite
    WebTechSurvey (2025). Websites using Cf7 Database [Dataset]. https://webtechsurvey.com/technology/cf7-database
    Explore at:
    Available download formats: csv
    Dataset updated
    Oct 13, 2025
    Dataset authored and provided by
    WebTechSurvey
    License

    https://webtechsurvey.com/terms

    Time period covered
    2025
    Area covered
    Global
    Description

    A complete list of live websites using the Cf7 Database technology, compiled through global website indexing conducted by WebTechSurvey.

  19. Websites of the database Sites & Organisations in the territory of Rennes...

    • gimi9.com
    Cite
    Websites of the database Sites & Organisations in the territory of Rennes Métropole [Dataset]. https://gimi9.com/dataset/eu_https-data-rennesmetropole-fr-explore-dataset-sites_organismes_sites-
    Explore at:
    License

    Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
    License information was derived automatically

    Description

    This data layer locates the sites contained in the Sites & Organisations database managed by Rennes Métropole services. In the long term, other actors (territorial authorities, facility managers, associations, etc.) will be able to contribute to updating this database. The layer results from merging three data layers that were previously managed by three different services. To streamline their management and pool updating efforts, they were grouped into a single database (the Sites & Organisations database) with an appropriate management tool. The database aims to bring together public, semi-public and private facilities open to the public, as well as the many associations that use them, at the scale of Rennes Métropole. The sites and organisations described may also carry business data specific to the profile of the various contributors.

    A “site” is a geographically located “container” (with an address and geographical coordinates) to which an “activity” described by an organisation (the content) must be attached. When a site is not limited to a single building but covers a large area (e.g. a sports complex, university or hospital), the notion of a “father site” (parent site) is used. An “organisation” represents an “activity” specified by a description, a nomenclature, keywords, timetables and, where applicable, business data. Its location results from its association with a site. An organisation may have no linked site (and therefore no location) when it consists only of, for example, a phone number or a website.

    Description of the fields:
    id_site: Site identifier
    name_site: Site name
    name_pvci: Name of the site used on the Plan de Ville Communal et Intercommunal de Rennes Métropole
    etat_site: Site status (active/inactive/project)
    id_level_site: Level (0 by default)
    id_site_pere: Father site identifier
    name_site_pere: Father site name
    id_org_main: Main organisation identifier
    name_org_main: Name of the main organisation
    id_theme_main: Nomenclature (identifier of the attached theme)
    name_theme_main: Nomenclature (name of the attached theme)
    id_activite_main: Nomenclature (identifier of the attached activity)
    name_activite_main: Nomenclature (name of the attached activity)
    id_specialite_main: Nomenclature (identifier of the attached speciality)
    name_specialite_main: Nomenclature (name of the attached speciality)
    name_org_secondary: Name of the organisation(s) associated with the site

    Access to the nomenclature is available below in the metadata.

  20. the-joi-database.com Website Traffic, Ranking, Analytics [October 2025]

    • semrush.ebundletools.com
    Updated Nov 11, 2025
    Cite
    Semrush (2025). the-joi-database.com Website Traffic, Ranking, Analytics [October 2025] [Dataset]. https://semrush.ebundletools.com/website/the-joi-database.com/overview/
    Explore at:
    Dataset updated
    Nov 11, 2025
    Dataset authored and provided by
    Semrush (https://fr.semrush.com/)
    License

    https://semrush.ebundletools.com/company/legal/terms-of-service/

    Time period covered
    Nov 11, 2025
    Area covered
    Worldwide
    Variables measured
    visits, backlinks, bounceRate, pagesPerVisit, authorityScore, organicKeywords, avgVisitDuration, referringDomains, trafficByCountry, paidSearchTraffic, and 3 more
    Measurement technique
    Semrush Traffic Analytics; Click-stream data
    Description

    the-joi-database.com is ranked #7672 in US with 5.09M Traffic. Categories: Online Services. Learn more about website traffic, market share, and more!

yasser messahli (2024). 365 Data Science Web site statistics [Dataset]. https://www.kaggle.com/yassermessahli/365-data-science-web-site-statistics

365 Data Science Web site statistics

This is a database containing some statistics from the 365 Data Science website.

