19 datasets found
  1. Book Genome Dataset

    • kaggle.com
    Updated May 30, 2023
    Cite
    Daniel Young (2023). Book Genome Dataset [Dataset]. https://www.kaggle.com/datasets/youngdaniel/book-genome-dataset
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 30, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Daniel Young
    License

    Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0): https://creativecommons.org/licenses/by-nc-sa/3.0/
    License information was derived automatically

    Description

    I uploaded GroupLens' Book Genome dataset to Kaggle. GroupLens doesn't seem to be active here any more, and I wanted to use the dataset here for some exploratory learning work.

    Official link here: https://grouplens.org/datasets/book-genome/

    Tag Genome is a data structure containing scores indicating the degree to which tags apply to items, such as movies or books. This dataset contains a Tag Genome generated for a set of books along with the data used for its generation (raw data). Raw data consists of a subset of the Goodreads dataset [Wan and McAuley, 2018, Wan et al., 2019] and book-tag ratings. The Goodreads subset includes information on popular books, such as titles, authors, release years, user ratings, reviews and shelves. Shelves are lists that users use to organize books in Goodreads (https://www.goodreads.com/). In these instructions, we refer to adding books to shelves as attaching tags (shelf names) to books. To collect book-tag ratings, we conducted a survey on Amazon Mechanical Turk, where we asked users to indicate the degree to which tags apply to books from this subset. To generate book-tag scores, we used two state-of-the-art algorithms: Glmer [Vig et al., 2012] and TagDL [Kotkov et al., 2021]. The code is available in the following GitHub repository: https://github.com/Bionic1251/Revisiting-the-Tag-Relevance-Prediction-Problem
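
As a rough illustration of the data structure described above, a Tag Genome can be thought of as a mapping from (item, tag) pairs to relevance scores. The sketch below is an assumption for illustration only: the book names, tags, score values, and the 0-to-1 score range are hypothetical and are not taken from the dataset's files.

```python
from collections import defaultdict

# Hypothetical (book, tag, score) triples; the values are made up for illustration.
SCORES = [
    ("Book A", "fantasy", 0.92),
    ("Book A", "romance", 0.10),
    ("Book B", "fantasy", 0.05),
    ("Book B", "romance", 0.81),
]


def build_genome(triples):
    """Group (item, tag, score) triples into {item: {tag: score}}."""
    genome = defaultdict(dict)
    for item, tag, score in triples:
        genome[item][tag] = score
    return dict(genome)


def top_tags(genome, item, n=1):
    """Return the n tags with the highest score for one item."""
    ranked = sorted(genome[item].items(), key=lambda pair: pair[1], reverse=True)
    return [tag for tag, _ in ranked[:n]]


genome = build_genome(SCORES)
print(top_tags(genome, "Book A"))
```

A nested dictionary keeps per-book lookups simple; a dense book-by-tag matrix may be a better fit when every tag has a score for every book.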

  2. BL Labs Flickr Data: Book data and tag history (Dec 2013 - Dec 2014)

    • figshare.com
    zip
    Updated May 30, 2023
    Cite
    Ben O'steen; James Baker (2023). BL Labs Flickr Data: Book data and tag history (Dec 2013 - Dec 2014) [Dataset]. http://doi.org/10.6084/m9.figshare.1269249.v2
    Explore at:
    Available download formats: zip
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Ben O'steen; James Baker
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Contains the tag information for the 1 million+ images uploaded to the British Library Flickr Commons account.

    taghistory.zip - contains a single .tsv file that lists the tags that were added to (or removed from) images on the Flickr Commons account during the first year. NB: There was an issue getting information from Flickr for the first few months, so early information is not available.

    book_data.zip - contains a .json file that holds a list of records, one for each digitised work. Each record holds information on the work's title, authors and so on, the images on Flickr that correspond to it, and the identifier required to download PDF version(s) of the entire work.

  3. Public tags added to resources in Trove, 2008 to 2024

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Jun 6, 2024
    Cite
    Tim Sherratt (2024). Public tags added to resources in Trove, 2008 to 2024 [Dataset]. http://doi.org/10.5281/zenodo.11496377
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 6, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Tim Sherratt
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    This dataset contains details of 2,495,958 unique public tags added to 10,403,650 resources in Trove between August 2008 and June 2024. I harvested the data using the Trove API and saved it as a CSV file with the following columns:

    • `tag` – lower-cased text tag
    • `date` – date the tag was added
    • `zone` – API zone containing the tagged resource
    • `record_id` – the identifier of the tagged resource

    I've documented the method used to harvest the tags in this notebook.

    Using the `zone` and `record_id` you can find more information about a tagged item. To create urls to the resources in Trove:

    Notes:

    • Works (such as books) in Trove can have tags attached at either work or version level. This dataset aggregates all tags at the work level, removing any duplicates.
    • A single resource in Trove can appear in multiple zones – for example, a book that includes maps and illustrations might appear in the 'book', 'picture', and 'map' zones. This means that some of the tags will essentially be duplicates – harvested from different zones, but relating to the same resource. Depending on your needs, you might want to remove these duplicates.
    • While most of the tags were added by Trove users, more than 500,000 tags were added by Trove itself in November 2009. I think these tags were automatically generated from related Wikipedia pages. Depending on your needs, you might want to exclude these by limiting the date range or zones.
    • User content added to Trove, including tags, is available for reuse under a CC-BY-NC licence.
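
The two filtering suggestions in the notes above could be sketched as follows. This is a minimal example using only Python's standard library. The sample rows are invented, and it assumes that `date` values begin with a `YYYY-MM` prefix and that the same `record_id` is shared across zones, neither of which the description confirms.

```python
import csv
import io

# Invented sample rows using the four documented columns.
SAMPLE_CSV = """tag,date,zone,record_id
maps,2012-03-01,book,100
maps,2012-03-01,map,100
history,2009-11-15,book,200
poetry,2015-07-09,book,300
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def drop_cross_zone_duplicates(rows):
    """Keep one row per (tag, record_id), ignoring the zone it came from."""
    seen = set()
    unique = []
    for row in rows:
        key = (row["tag"], row["record_id"])
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def exclude_month(rows, month="2009-11"):
    """Drop rows whose date starts with the given YYYY-MM prefix (assumed format)."""
    return [row for row in rows if not row["date"].startswith(month)]


rows = load_rows(SAMPLE_CSV)
cleaned = exclude_month(drop_cross_zone_duplicates(rows))
print([(r["tag"], r["record_id"]) for r in cleaned])
```

Excluding a whole month also drops any tags that users added in that month, so a zone-based filter may be preferable depending on the analysis.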

    See this notebook for some examples of how you can manipulate, analyse, and visualise the tag data.

  4. Dataset of authors, books and publication dates of book subjects where books...

    • workwithdata.com
    Updated Nov 7, 2024
    Cite
    Work With Data (2024). Dataset of authors, books and publication dates of book subjects where books equals Tag [Dataset]. https://www.workwithdata.com/datasets/book-subjects?col=book_subject%2Cj0-author%2Cj0-book%2Cj0-publication_date&f=1&fcol0=j0-book&fop0=%3D&fval0=Tag&j=1&j0=books
    Explore at:
    Dataset updated
    Nov 7, 2024
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about book subjects. It has 13 rows and is filtered to rows where the book is Tag. It features 4 columns: book subjects, authors, books, and publication dates.

  5. Booklet Label Market Size & Share Analysis - Industry Research Report -...

    • mordorintelligence.com
    pdf,excel,csv,ppt
    Updated Mar 20, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Mordor Intelligence (2025). Booklet Label Market Size & Share Analysis - Industry Research Report - Growth Trends [Dataset]. https://www.mordorintelligence.com/industry-reports/booklet-label-market
    Explore at:
    Available download formats: pdf, excel, csv, ppt
    Dataset updated
    Mar 20, 2025
    Dataset authored and provided by
    Mordor Intelligence
    License

    https://www.mordorintelligence.com/privacy-policy

    Time period covered
    2019 - 2030
    Area covered
    Global
    Description

    The Booklet Label Market report segments the industry into By Product Type (Multi-Panel Labels, Peel-and-Reveal Labels), By Material (Paper, Film), By Printing Technology (Flexographic, Digital, Offset, Other Printing Technology), By End-Use Industry (Pharmaceuticals, Food and Beverage, Chemicals, Cosmetics and Personal Care, Other End-use Industries), and By Geography (North America, Europe, Asia, and more).

  6. Data from: Development and application of a novel approach to scoring ear...

    • datadryad.org
    • data.niaid.nih.gov
    • +1 more
    zip
    Updated Apr 7, 2023
    Cite
    Megan Lynn Harmon; Blair Caitlin Downey; Alycia Marie Drwencke; Cassandra Blaine Tucker (2023). Development and application of a novel approach to scoring ear tag wounds in dairy calves [Dataset]. http://doi.org/10.25338/B8BS8J
    Explore at:
    Available download formats: zip
    Dataset updated
    Apr 7, 2023
    Dataset provided by
    Dryad
    Authors
    Megan Lynn Harmon; Blair Caitlin Downey; Alycia Marie Drwencke; Cassandra Blaine Tucker
    Time period covered
    Apr 4, 2023
    Description

    Application of ear tags in cattle is a common husbandry practice for identification purposes. While it is known that ear tag application causes damage, little is known about the duration and process of wound healing associated with this procedure. Our objective was to quantify the wound healing progression in dairy calves with plastic identification tags. Calves (n=33) were ear tagged at 2 d of age and wound photos were taken weekly until 9–22 wk of age. This approach generated 10–22 observations per calf that were analyzed using a novel wound-scoring system. We developed this system to score the presence or absence of 8 different tissue types related to piercing trauma or mechanical irritation along the top of the tag (impressions, crust, and desquamation) and around the piercing (exudate, crust, tissue growth, and desquamation). Ears were scored as undamaged when tissue was intact. We found that wound tissue types associated with damage were still seen in many calves for at least 12 w...

  7. Dataset of authors, books and publication dates of book series where books...

    • workwithdata.com
    Updated Nov 25, 2024
    Cite
    Work With Data (2024). Dataset of authors, books and publication dates of book series where books equals Cards & tags [Dataset]. https://www.workwithdata.com/datasets/book-series?col=book_series%2Cj0-author%2Cj0-book%2Cj0-publication_date&f=1&fcol0=j0-book&fop0=%3D&fval0=Cards+%26+tags&j=1&j0=books
    Explore at:
    Dataset updated
    Nov 25, 2024
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about book series. It has 1 row and is filtered to rows where the book is Cards & tags. It features 4 columns: book series, authors, books, and publication dates.

  8. Producer Price Index by Industry: Commercial Printing, Except Screen and...

    • fred.stlouisfed.org
    json
    Updated Jul 16, 2025
    + more versions
    Cite
    (2025). Producer Price Index by Industry: Commercial Printing, Except Screen and Books: Label and Wrapper Printing (Lithographic) [Dataset]. https://fred.stlouisfed.org/series/PCU32311K32311K03
    Explore at:
    Available download formats: json
    Dataset updated
    Jul 16, 2025
    License

    https://fred.stlouisfed.org/legal/#copyright-public-domain

    Description

    Graph and download economic data for Producer Price Index by Industry: Commercial Printing, Except Screen and Books: Label and Wrapper Printing (Lithographic) (PCU32311K32311K03) from Jun 1982 to Jun 2025 about book, printing, commercial, PPI, industry, inflation, price index, indexes, price, and USA.

  9. Dataset of books about Warning labels-Humor

    • workwithdata.com
    Updated Apr 17, 2025
    Cite
    Work With Data (2025). Dataset of books about Warning labels-Humor [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=j0-book_subject&fop0=%3D&fval0=Warning+labels-Humor&j=1&j0=book_subjects
    Explore at:
    Dataset updated
    Apr 17, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about books. It has 3 rows and is filtered to rows where the book subject is Warning labels-Humor. It features 9 columns including author, publication date, language, and book publisher.

  10. Dataset of author, BNB id, book publisher, and publication date of Cards,...

    • workwithdata.com
    Updated Apr 17, 2025
    Cite
    Work With Data (2025). Dataset of author, BNB id, book publisher, and publication date of Cards, wrap and tags [Dataset]. https://www.workwithdata.com/datasets/books?col=author%2Cbnb_id%2Cbook%2Cbook%2Cbook_publisher%2Cpublication_date&f=1&fcol0=book&fop0=%3D&fval0=Cards%2C+wrap+and+tags
    Explore at:
    Dataset updated
    Apr 17, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about books. It has 2 rows and is filtered to rows where the book is Cards, wrap and tags. It features 5 columns: book, author, publication date, book publisher, and BNB id.

  11. Data from: People versus Books

    • explore.openaire.eu
    • zenodo.org
    Updated Jul 6, 2021
    Cite
    Sarah Bowen Savant; Masoumeh Seydi (2021). People versus Books [Dataset]. http://doi.org/10.5281/zenodo.5074632
    Explore at:
    Dataset updated
    Jul 6, 2021
    Authors
    Sarah Bowen Savant; Masoumeh Seydi
    Description

    This explanation pertains to the data prepared for Non sola scriptura: Essays on the Qur’an and Islam in Honour of William A. Graham (Routledge), chapter by Sarah Bowen Savant, “People versus Books.” We are releasing the data that was used to create Graphs 1 and 2 and Tables 1-3 for the chapter. Note: All the data files (except the text in number 3) are in TSV (Tab Separated Values) format, which any text editor or tabular data editor, such as Excel, can open.

    • “IsnadFractions_PeopleversusBooks”. This file represents a filtered version of an output from Ryan Muther’s isnād classifier algorithm. Muther ran the algorithm in July 2020, based on the Version 2020.1.2 release of the corpus, available at: http://doi.org/10.5281/zenodo.3891466. The data file includes: author: the name of the author. died: death date of the author (NB: especially the early dates cannot be relied on). title: the title of the author’s book, from the OpenITI Corpus. length: length of the book, measured in word-tokens. isnad_fraction: the percentage of the book’s word-tokens that are made up of isnāds.

    • “GALTags_PeopleversusBooks”. Books in the OpenITI were mapped by Walid A. Akef in 2018 to: Brockelmann, Carl, History of the Arabic Written Traditions, trans. Joep Lameer, 2 vols and 3 supplements, Leiden: Brill, 2016-2018. The file includes the following columns: id: book id, from the OpenITI Corpus. gal_tags: the GAL tags, also used in the OpenITI Corpus.

    • “0571IbnCasakir.TarikhDimashq.JK000916-ara1.mARkdown”. The Ibn ʿAsākir text file, from the Version 2020.1.2 release of the OpenITI Corpus.

    • “NamedEntities_PeopleversusBooks”. This is a very first effort at working on named entities in Ibn ʿAsākir’s Taʾrīkh Madīnat Dimashq and represents only a tiny fraction of the surface forms of names. Most of the names pertain to persons who transmitted from Ibn Saʿd. There may be some duplicate surface forms (which does not affect the method). We use this list to replace the surface forms with transliterated values. The columns are as follows: name: the normalized name. ar_name: the Arabic name, i.e. the surface form. status: true (T)/false (F) values to include/exclude the cases in the replacement process. We have used true values.

    • “SplittingTerms_PeopleversusBooks”. We started with a list of transmissive terms that R. Kevin Jaques originated and then added more terms, which include the various normalized forms of the same term. We used this list to split isnāds into names.

    • “IbnSadIsnads_PeopleversusBooks”. This file includes the pieces of text that the algorithm tags as isnāds in the text. We extracted the tagged pieces and made a list of isnāds. Almost all of the isnāds start with a transmissive term. We use this file to extract the names and clean some rows to generate a data table that we can use for clustering. The columns are as follows: text_ID: the book id from the OpenITI Corpus. This column can be ignored as we are using it for one text in this project. However, it is required in the collection of isnāds from multiple texts. id: a unique identifier assigned to each isnād. The isnād classifier algorithm assigns this id, and it can be used to identify each isnād in the text when required. isnad_text: the isnād that we extract from the text. length: length of the extracted isnād in tokens.

    • “IsnadNames_PeopleversusBooks”. This file is the isnāds list (number 5 on this list) split by the transmissive terms (number 4 on this list) in order to extract the names in the isnāds. The columns are as follows: text_ID: the book id from the OpenITI Corpus. This column can be ignored as we are using it for one text in this project. However, it is required in the collection of isnāds from multiple texts. isnad_text: the isnād that we extract from the text. ibnSad_cnt: number of times that the name Ibn Saʿd is mentioned in the corresponding isnād. name_at_position_X: the rest of the columns in this table include the pieces of the isnād that we get after splitting the isnāds with a list of terms. Each column contains a name or any string that appears between two transmissive terms. Some cells are empty, probably because we missed some transmissive terms.

    • “IbnSadClusters_PeopleversusBooks”. This file includes clusters of isnāds of length six (i.e. isnāds that include six names). We have used the affinity propagation (AP) clustering algorithm based on the Levenshtein similarity score of the names. The columns are as follows: frequency: the frequency of the isnād in the data. cluster_id: the id of the cluster to which the isnād belongs. nameX: columns C to H include the names in the isnād at positions 1 to 6, running back to Muhammad b. Saʿd at position 6.

    • “JK000916-ara1.mARkdown_Shamela0001686-ara1.completed”. This is the passim output from the February 2020 run (which used the same version of the corpus; Version 2020.1.2). For definition of fields in this file, please see...
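
The clustering step above relies on a Levenshtein similarity score between names. A minimal sketch of one way to compute such a score is shown below. The normalization by the longer string's length is an assumption, since the description does not say how edit distance was turned into a similarity, and the example names are illustrative only.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute cost 1)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def similarity(a, b):
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


print(levenshtein("kitten", "sitting"))
print(similarity("Ibn Sad", "Ibn Saad"))
```

A pairwise matrix of such scores is the kind of input a similarity-based clustering algorithm would take; the clustering itself is not sketched here.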

  12. Producer Price Index by Industry: Commercial Printing, Except Screen and...

    • fred.stlouisfed.org
    json
    Updated Jul 16, 2025
    + more versions
    Cite
    (2025). Producer Price Index by Industry: Commercial Printing, Except Screen and Books: Label and Wrapper Printing (Flexographic) [Dataset]. https://fred.stlouisfed.org/series/PCU32311K32311K21
    Explore at:
    Available download formats: json
    Dataset updated
    Jul 16, 2025
    License

    https://fred.stlouisfed.org/legal/#copyright-public-domain

    Description

    Graph and download economic data for Producer Price Index by Industry: Commercial Printing, Except Screen and Books: Label and Wrapper Printing (Flexographic) (PCU32311K32311K21) from Dec 2001 to Jun 2025 about book, printing, commercial, PPI, industry, inflation, price index, indexes, price, and USA.

  13. Dataset of books about Matchbox labels, British-Collectors and collecting

    • workwithdata.com
    Updated Apr 17, 2025
    Cite
    Work With Data (2025). Dataset of books about Matchbox labels, British-Collectors and collecting [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=j0-book_subject&fop0=%3D&fval0=Matchbox+labels%2C+British-Collectors+and+collecting&j=1&j0=book_subjects
    Explore at:
    Dataset updated
    Apr 17, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about books. It has 4 rows and is filtered to rows where the book subject is Matchbox labels, British-Collectors and collecting. It features 9 columns including author, publication date, language, and book publisher.

  14. Market Analysis for RULE BOOK 4.0

    • nsc.onl
    Updated Aug 8, 2025
    Cite
    (2025). Market Analysis for RULE BOOK 4.0 [Dataset]. https://nsc.onl/cards/tag/665619/rule-book-4-0
    Explore at:
    Dataset updated
    Aug 8, 2025
    Variables measured
    Countries, Price Range, Median Price, Average Price, Sold Listings, Total Listings, Active Listings, Unsold Listings, Number of Sellers, Sell-Through Rate
    Description

    Comprehensive market data and analytics for RULE BOOK 4.0 including pricing distribution, seller metrics, and market trends.

  15. Dataset of books called The Parlophone red label popular series, E5000 -...

    • workwithdata.com
    Updated Apr 17, 2025
    Cite
    Work With Data (2025). Dataset of books called The Parlophone red label popular series, E5000 - E6428 [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=The+Parlophone+red+label+popular+series%2C+E5000+-+E6428
    Explore at:
    Dataset updated
    Apr 17, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about books. It has 1 row and is filtered to rows where the book is The Parlophone red label popular series, E5000 - E6428. It features 7 columns including author, publication date, language, and book publisher.

  16. Massive units emplaced by bedload transport in sheet flow mode

    • search.dataone.org
    Updated Feb 15, 2017
    Cite
    Ricardo Hernandez Moreira (2017). Massive units emplaced by bedload transport in sheet flow mode [Dataset]. https://search.dataone.org/view/seadva-RicardoHernandezMoreira-f77ec8c0-2a42-4c1a-bfe6-992db22b6239
    Explore at:
    Dataset updated
    Feb 15, 2017
    Dataset provided by
    SEAD Virtual Archive
    Authors
    Ricardo Hernandez Moreira
    Time period covered
    Feb 15, 2017
    Area covered
    Earth
    Description

    Herein we present data collected during experiments on massive deposition in upper regime. (Refer to http://sedexp.net/experiment/experiments-massive-deposits-upper-regime for more information on the experimental setup).

    The data are separated as follows: 00-profiles: Water surface and bed elevation profiles. 01-sonar data: Instantaneous realizations of bed elevation fluctuations captured by JSR ultrasonic probes. 02-media: collection of pictures, time-lapses and movies corresponding to the experiments.

    Data are divided by flow rate (i.e., 20 l/s, 30 l/s), by feed rate (e.g., 1.5 kg/min, 8 kg/min, 16 kg/min) and by experiment type (i.e., equilibrium or aggradational runs), wherever appropriate.

  17. TMTpro: Design and initial evaluation of a novel Proline-based isobaric...

    • ebi.ac.uk
    Updated Nov 25, 2019
    + more versions
    Cite
    Michael Bremang (2019). TMTpro: Design and initial evaluation of a novel Proline-based isobaric 16-plex Tandem Mass Tag reagent set [Dataset]. https://www.ebi.ac.uk/pride/archive/projects/PXD014750
    Explore at:
    Dataset updated
    Nov 25, 2019
    Authors
    Michael Bremang
    Variables measured
    Proteomics
    Description

    The design and synthesis of a novel proline-reporter-based isobaric Tandem Mass Tag 16 tag set (TMTpro) was carried out and the data uploaded here is a comparison of the performance of the new TMTpro tags with the current commercially available dimethylpiperidine-reporter-based TMT10/11 reagents. Data from 2 experiments are provided.

  18. Data used in Fig 3B.

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xlsx
    Updated Jun 8, 2023
    Cite
    Michael Francis; Changwei Li; Yitang Sun; Jingqi Zhou; Xiang Li; J. Thomas Brenna; Kaixiong Ye (2023). Data used in Fig 3B. [Dataset]. http://doi.org/10.1371/journal.pgen.1009431.s011
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jun 8, 2023
    Dataset provided by
    PLOS Genetics
    Authors
    Michael Francis; Changwei Li; Yitang Sun; Jingqi Zhou; Xiang Li; J. Thomas Brenna; Kaixiong Ye
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Fish oil status, number of G alleles at rs112803755, mean triglycerides, sample size, standard deviation of triglycerides, and 95% confidence interval for combined participants from Stage 1 and Stage 2. (XLSX)

  19. Math equations for tag generation.

    • plos.figshare.com
    • figshare.com
    xls
    Updated Jun 4, 2023
    Cite
    Reem Almarwani; Ning Zhang; James Garside (2023). Math equations for tag generation. [Dataset]. http://doi.org/10.1371/journal.pone.0244731.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Reem Almarwani; Ning Zhang; James Garside
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Math equations for tag generation.

Book Genome Dataset

GroupLens's Tag Genome dataset for Books

Explore at:
315 scholarly articles cite this dataset (View in Google Scholar)