5 datasets found
  1. Eurovoc_en

    • huggingface.co
    Updated May 16, 2024
    Cite
    Dominik Weckmüller (2024). Eurovoc_en [Dataset]. https://huggingface.co/datasets/do-me/Eurovoc_en
    Dataset updated
    May 16, 2024
    Authors
    Dominik Weckmüller
    License

    https://choosealicense.com/licenses/eupl-1.1/

    Description

    European legislation from CELLAR/EUROVOC, English entries only of https://huggingface.co/datasets/EuropeanParliament/Eurovoc. This data is enriched with embeddings, ready for semantic search. Last update 16.05.2024: 352011 entries.

      Usage

      With Pandas / Polars

    Simply download the parquet file and read it with pandas or polars.

    import pandas as pd  # or: import polars as pd
    df = pd.read_parquet("CELLAR_EN_16_05_2024.parquet")
    df

      With HF datasets

    from datasets import… See the full description on the dataset page: https://huggingface.co/datasets/do-me/Eurovoc_en.
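
    The description says the entries come with embeddings ready for semantic search. A minimal sketch of what such a search could look like is given here, assuming cosine similarity as the ranking measure. The helper name top_k_cosine and the toy vectors are illustrative assumptions; the dataset's actual column names, embedding model, and vector dimension are not stated in this listing.

```python
import numpy as np


def top_k_cosine(query, embeddings, k=3):
    """Return indices and cosine scores of the k rows most similar to query."""
    query = np.asarray(query, dtype=float)
    embeddings = np.asarray(embeddings, dtype=float)
    # Normalise both sides so that the dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    scores = e @ q
    # Sort by descending score and keep the first k indices.
    order = np.argsort(-scores)[:k]
    return order, scores[order]


# Toy 3-dimensional vectors standing in for real document embeddings (made up).
toy_embeddings = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
])
toy_query = np.array([1.0, 0.1, 0.0])

indices, scores = top_k_cosine(toy_query, toy_embeddings, k=2)
print(indices, scores)
```

    With real data, the query vector would have to come from the same embedding model that produced the stored vectors; which model that is would need to be checked on the dataset page.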

  2. CHAMP and Swarm solar activity- and height-scaled polar cap plasma density...

    • zenodo.org
    • data.niaid.nih.gov
    hdf
    Updated May 13, 2020
    Cite
    Spencer Mark Hatch; Stein Haaland; Karl Magnus Laundal; Therese Moretto Jørgensen; Andrew Yau; Lindis Merete Bjoland; Jone Peter Reistad; Anders Ohma; Kjellmar Oksavik (2020). CHAMP and Swarm solar activity- and height-scaled polar cap plasma density measurements [Dataset]. http://doi.org/10.5281/zenodo.3813146
    Available download formats: hdf
    Dataset updated
    May 13, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Spencer Mark Hatch; Stein Haaland; Karl Magnus Laundal; Therese Moretto Jørgensen; Andrew Yau; Lindis Merete Bjoland; Jone Peter Reistad; Anders Ohma; Kjellmar Oksavik
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Solar activity- and height-adjusted plasma density measurements in the polar cap (i.e., above 80° latitude in Modified Apex110 coordinates) from the Swarm and CHAMP satellites, covering the entire CHAMP mission period (2002–2009) and the Swarm mission period from launch through February 2020.

    Plasma density measurements are scaled to a nominal solar activity level of <F10.7>27 = 80 sfu and an altitude of 500 km, as described in Hatch et al. (submitted to JGR: Space Physics; ESSOAr pre-print).

    This dataset was prepared as a part of the "Swarm+ Coupling High-Low Atmosphere Interactions: Ion Outflow" project (project website) (ESA website), and is funded by European Space Agency Contract #4000126731.

    Data are stored in HDF5 format as a Python Pandas dataframe. They can be loaded into Python via the following.

    import pandas as pd
    
    df = pd.read_hdf('CHAMP_Swarm_polarcap_adjDensity.hdf',key='df')

    The data columns are

    • 'NeAdj' : Solar activity- and height-adjusted plasma density (cm^-3)
    • 'a110lat' : Modified Apex110 latitude (deg)
    • 'a110lon' : Modified Apex110 longitude (deg)
    • 'mlt' : Modified Apex110 magnetic local time
    • 'h_km' : satellite altitude (km)
    • 'gclat' : geocentric latitude (deg)
    • 'gclon' : geocentric longitude (deg)
    • 'sat' : satellite identifier (string, one of 'A', 'B', 'C', or 'CHAMP')
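
    As a hedged illustration of working with the columns listed above, the sketch below builds a small synthetic dataframe (the values are made up and are not taken from the dataset), keeps the rows above 80° Modified Apex110 latitude, and computes the median adjusted density per satellite. Restricting the selection to the northern hemisphere is an assumption made only for this example.

```python
import pandas as pd

# Synthetic rows that only mimic the documented columns; the values are made up.
df = pd.DataFrame({
    "NeAdj": [1.2e5, 0.8e5, 1.0e5, 2.0e5],
    "a110lat": [82.0, 85.5, -81.0, 88.0],
    "mlt": [1.5, 12.0, 23.0, 6.0],
    "h_km": [460.0, 510.0, 505.0, 350.0],
    "sat": ["A", "B", "A", "CHAMP"],
})

# Keep northern polar cap samples, i.e. above 80 degrees Modified Apex110 latitude.
north = df[df["a110lat"] > 80.0]

# Median adjusted plasma density per satellite identifier.
median_by_sat = north.groupby("sat")["NeAdj"].median()
print(median_by_sat)
```

    The same selection and grouping should apply unchanged to the dataframe returned by pd.read_hdf, provided the file's columns match the list above.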
  3. PlaygroundS4E06|OriginalData

    • kaggle.com
    Updated Jun 1, 2024
    Cite
    Ravi Ramakrishnan (2024). PlaygroundS4E06|OriginalData [Dataset]. https://www.kaggle.com/datasets/ravi20076/playgrounds4e06originaldata
    Dataset updated
    Jun 1, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ravi Ramakrishnan
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This data was downloaded from the link shared on the data page of the PlaygroundS4E06 episode. We added an id column for consistency with the competition data and uploaded the result here.
    Please feel free to use this dataset as part of your pipeline.

    Key links:
    1. Competition: https://www.kaggle.com/competitions/playground-series-s4e6
    2. Data page: https://www.kaggle.com/competitions/playground-series-s4e6/data
    3. Original dataset link: https://archive.ics.uci.edu/dataset/697/predict+students+dropout+and+academic+success

    This is a .csv file. Please use pandas.read_csv() or polars.scan_csv() to read in the file.
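
    A minimal sketch of reading such a file with pandas.read_csv() and adding a sequential id column follows. The in-memory CSV text and its column names are hypothetical stand-ins, and the uploader's actual procedure for adding the id column is not described here.

```python
import io

import pandas as pd

# Tiny in-memory stand-in for the .csv file; the column names are hypothetical.
csv_text = (
    "feature_a,feature_b,Target\n"
    "1,0.5,Graduate\n"
    "2,0.7,Dropout\n"
    "3,0.9,Enrolled\n"
)

raw = pd.read_csv(io.StringIO(csv_text))

# Add a sequential id as the first column, analogous to the id column described above.
with_id = raw.copy()
with_id.insert(0, "id", range(len(with_id)))
print(with_id)
```

    For the real file, the io.StringIO object would be replaced by the path to the downloaded .csv.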

    Best regards!

  4. Data from: clinvar

    • huggingface.co
    Updated Feb 1, 2025
    Cite
    Song Lab @ Cal (2025). clinvar [Dataset]. https://huggingface.co/datasets/songlab/clinvar
    Dataset updated
    Feb 1, 2025
    Dataset authored and provided by
    Song Lab @ Cal
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    ClinVar variants

    For more information check out our paper and repository.

      Usage

    Pandas

    import pandas as pd
    df = pd.read_parquet("hf://datasets/songlab/clinvar/test.parquet")

    Polars

    import polars as pl
    df = pl.read_parquet("https://huggingface.co/datasets/songlab/clinvar/resolve/main/test.parquet")

    Datasets

    from datasets import load_dataset
    dataset = load_dataset("songlab/clinvar", split="test")

  5. cosmic

    • huggingface.co
    Updated Feb 1, 2025
    Cite
    Song Lab @ Cal (2025). cosmic [Dataset]. https://huggingface.co/datasets/songlab/cosmic
    Dataset updated
    Feb 1, 2025
    Dataset authored and provided by
    Song Lab @ Cal
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    COSMIC variants

    For more information check out our paper and repository.

      Usage

    Pandas

    import pandas as pd
    df = pd.read_parquet("hf://datasets/songlab/cosmic/test.parquet")

    Polars

    import polars as pl
    df = pl.read_parquet("https://huggingface.co/datasets/songlab/cosmic/resolve/main/test.parquet")

    Datasets

    from datasets import load_dataset
    dataset = load_dataset("songlab/cosmic", split="test")
