babylm-nso
huggingface.co
Updated Oct 29, 2025
Cite
BabyLM Challenge (2025). babylm-nso [Dataset]. https://huggingface.co/datasets/BabyLM-community/babylm-nso
Dataset updated: Oct 29, 2025
Dataset authored and provided by: BabyLM Challenge
License: https://choosealicense.com/licenses/unknown/
Description
babylm-nso

Dataset Description

This dataset is part of the BabyLM multilingual collection.

Dataset Summary

Language: nso
Script: Latin
Number of Documents: 26772
Total Tokens: 1067761

Tokens Per Category

child-books: 122083 tokens
child-news: 130 tokens
educational: 92589 tokens
padding-mt: 206703 tokens
padding-news: 150960 tokens
padding-wikipedia: 495296 tokens
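
To make the category breakdown easier to read, the sketch below turns the counts listed above into fractions of the total and checks that they add up to the reported Total Tokens figure. The names `TOKENS_PER_CATEGORY`, `TOTAL_TOKENS`, and `category_shares` are illustrative and do not come from the dataset card.

```python
# Token counts copied from the "Tokens Per Category" figures above.
TOKENS_PER_CATEGORY = {
    "child-books": 122083,
    "child-news": 130,
    "educational": 92589,
    "padding-mt": 206703,
    "padding-news": 150960,
    "padding-wikipedia": 495296,
}
TOTAL_TOKENS = 1067761  # "Total Tokens" from the Dataset Summary


def category_shares(counts):
    """Return each category's fraction of the summed token count."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


if __name__ == "__main__":
    # The per-category counts should account for every token in the summary.
    assert sum(TOKENS_PER_CATEGORY.values()) == TOTAL_TOKENS
    shares = category_shares(TOKENS_PER_CATEGORY)
    # Print categories from largest to smallest share.
    for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
        print(f"{name:18s} {share:6.1%}")
```

The three padding-* categories together hold 852959 tokens, roughly 80% of the total, so the child-directed and educational material makes up only about a fifth of the corpus.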

Data Fields

text: The document text
category: Type of content (e.g.…

See the full description on the dataset page: https://huggingface.co/datasets/BabyLM-community/babylm-nso.
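
As a rough illustration of how the `text` and `category` fields might be used, the sketch below loads the dataset with the Hugging Face `datasets` library and counts documents per category. The split name "train" is an assumption that the card excerpt does not confirm, and `load_nso` and `documents_per_category` are hypothetical helper names.

```python
from collections import Counter


def load_nso(split="train"):
    """Load the dataset from the Hugging Face Hub (needs network access).

    The default split name "train" is an assumption; check the dataset page.
    """
    from datasets import load_dataset  # third-party: pip install datasets

    return load_dataset("BabyLM-community/babylm-nso", split=split)


def documents_per_category(rows):
    """Count documents for each value of the `category` field."""
    return Counter(row["category"] for row in rows)


if __name__ == "__main__":
    ds = load_nso()
    print(documents_per_category(ds))
```

`documents_per_category` accepts any iterable of dicts with a `category` key, so it can be tried on a few hand-written rows before downloading anything.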