    ClueWeb-Reco

    • huggingface.co
    Updated May 11, 2025
    Cite
    Chenyan Xiong Research Group at CMU (2025). ClueWeb-Reco [Dataset]. https://huggingface.co/datasets/cx-cmu/ClueWeb-Reco
    Dataset updated
    May 11, 2025
    Dataset authored and provided by
    Chenyan Xiong Research Group at CMU
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    ClueWeb-Reco:

      Source Files
    

    -- cwid_to_id.tsv: mapping between official ClueWeb22 docids and our internal docids
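
A mapping file like this is typically loaded once into a lookup table so that internal docids can be translated back to official ClueWeb22 docids. The sketch below shows one way that could look. It assumes a two-column, tab-separated layout (ClueWeb22 docid first, integer internal docid second) with no header row; none of this is confirmed by the listing, so check the actual file before relying on it. The function name `load_docid_mapping` and the sample rows are hypothetical.

```python
import csv
import io


def load_docid_mapping(tsv_file, has_header=False):
    """Read a two-column TSV into a dict: ClueWeb22 docid -> internal docid.

    Assumed layout (not confirmed by the dataset listing):
    column 0 = official ClueWeb22 docid, column 1 = integer internal docid.
    """
    reader = csv.reader(tsv_file, delimiter="\t")
    if has_header:
        next(reader, None)  # skip a header row if the real file has one
    mapping = {}
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        cwid, internal_id = row[0], row[1]
        mapping[cwid] = int(internal_id)
    return mapping


# Made-up rows for illustration only; they are not taken from the dataset.
sample = io.StringIO(
    "clueweb22-en0000-00-00000\t0\n"
    "clueweb22-en0000-00-00001\t1\n"
)
cwid_to_id = load_docid_mapping(sample)

# Inverse lookup: internal docid -> ClueWeb22 docid.
id_to_cwid = {internal: cwid for cwid, internal in cwid_to_id.items()}
```

With the real file, the same function could be called as `load_docid_mapping(open("cwid_to_id.tsv", newline=""))`, assuming the file has been downloaded locally under that name.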

      Splits in pure interaction format
    

    interaction_splits:
    -- valid_inter_input.tsv: input for validation dataset
    -- valid_inter_target.tsv: validation dataset ground truth
    -- test_inter_input.tsv: input for testing dataset (ground truth hidden)

      Splits in ordered cw id list format
    

    ordered_id_splits:
    -- valid_input.tsv: input for validation dataset…

    See the full description on the dataset page: https://huggingface.co/datasets/cx-cmu/ClueWeb-Reco.
