3 datasets found
  1. covid-bing-query-gpt4-avs_triplets

    • huggingface.co
    Updated Aug 30, 2024
    Cite
    Aivin Solatorio (2024). covid-bing-query-gpt4-avs_triplets [Dataset]. https://huggingface.co/datasets/avsolatorio/covid-bing-query-gpt4-avs_triplets
    Dataset updated
    Aug 30, 2024
    Authors
    Aivin Solatorio
    Description

    COVq dataset

    This dataset was used in the paper GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning. Refer to https://arxiv.org/abs/2402.16829 for details. The code for generating the data is available at https://github.com/avsolatorio/GISTEmbed.

      Citation
    

    @article{solatorio2024gistembed, title={GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning}, author={Aivin V. Solatorio}… See the full description on the dataset page: https://huggingface.co/datasets/avsolatorio/covid-bing-query-gpt4-avs_triplets.

  2. covid-bing-query-gpt4

    • huggingface.co
    Updated Dec 15, 2019
    Cite
    Aivin Solatorio (2019). covid-bing-query-gpt4 [Dataset]. https://huggingface.co/datasets/avsolatorio/covid-bing-query-gpt4
    Dataset updated
    Dec 15, 2019
    Authors
    Aivin Solatorio
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Bing x GPT-4 Synthetic Query Dataset

    This dataset was used in the paper GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning. Refer to https://arxiv.org/abs/2402.16829 for details. The code for generating the data is available at https://github.com/avsolatorio/GISTEmbed.

      Citation
    

    @article{solatorio2024gistembed, title={GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning}… See the full description on the dataset page: https://huggingface.co/datasets/avsolatorio/covid-bing-query-gpt4.

  3. medi-data-mteb-covid-bing-query-gpt4-avs_triplets

    • huggingface.co
    Updated Dec 15, 2010
    Cite
    Aivin Solatorio (2010). medi-data-mteb-covid-bing-query-gpt4-avs_triplets [Dataset]. https://huggingface.co/datasets/avsolatorio/medi-data-mteb-covid-bing-query-gpt4-avs_triplets
    Dataset updated
    Dec 15, 2010
    Authors
    Aivin Solatorio
    Description

    MEDI+MTEBcls+COVq dataset

    This dataset was used in the paper GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning. Refer to https://arxiv.org/abs/2402.16829 for details. The code for generating the data is available at https://github.com/avsolatorio/GISTEmbed.

      Citation
    

    @article{solatorio2024gistembed, title={GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning}, author={Aivin V. Solatorio}… See the full description on the dataset page: https://huggingface.co/datasets/avsolatorio/medi-data-mteb-covid-bing-query-gpt4-avs_triplets.


