14 datasets found
  1. Nepali Handwritten Images for text detection

    • kaggle.com
    zip
    Updated Sep 23, 2023
    Cite
    Sweekar Dahal (2023). Nepali Handwritten Images for text detection [Dataset]. https://www.kaggle.com/datasets/sweekardahal/nepali-handwritten-images-for-text-detection
    Available download formats: zip (1304832759 bytes)
    Dataset updated
    Sep 23, 2023
    Authors
    Sweekar Dahal
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Implementations:

    1. https://github.com/R4j4n/Nepali-Text-Detection-DBnet
    2. https://github.com/dahalsweekar/TextBoxes---Compatible-with-python-3.10 (outdated)

    Description:

    We present the Nepali Handwriting Dataset (NHD), which is a collection of camera-captured images of Nepali handwritten text from various regions in Nepal. The dataset aims to provide a benchmark for researchers to explore new techniques in handwriting detection and recognition. We also present benchmark results for text localization and recognition using well-established deep-learning frameworks. The dataset and benchmark results are available here.

    Key Features:

    Data collection and preprocessing are central to research on handwritten text detection, since they determine how comprehensive and diverse the resulting dataset is. To this end, the researchers personally collected 1,000 mobile phone-captured samples from various sources, including schools, government offices, universities, and student councils.

    The dataset was carefully curated to encompass three distinct categories based on age groups, namely kids, youth, and adults, with 599, 152, and 249 samples, respectively. Each of the 1,000 pages was meticulously annotated by the researchers to ensure accurate labeling and create a reliable dataset. The data collection process focused on capturing a wide range of handwriting styles and variations prevalent among different age groups and settings.

    The collected dataset served as a valuable resource for training and evaluating the handwritten text detection models in the research. It provided a rich and diverse set of data that enabled the researchers to develop robust models capable of accurately detecting handwritten text across different age groups and settings.

    Use Cases:

    1. Real-Time Text Detection
    2. Text Recognition

    Results:

    You can find its implementation here: https://github.com/R4j4n/Nepali-Text-Detection-DBnet

    Recall: 0.9069154470416869

    Precision: 0.9178659178659179

    HMean: 0.9123578206927347
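
    The HMean reported above is consistent with the harmonic mean of precision and recall (the F1 score). The following is a minimal sketch that recomputes it from the two reported values; the function name harmonic_mean is illustrative and is not taken from the linked repository.

```python
def harmonic_mean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the Results section above.
recall = 0.9069154470416869
precision = 0.9178659178659179

hmean = harmonic_mean(precision, recall)
print(f"HMean: {hmean:.6f}")
```

    The recomputed value agrees with the reported HMean to six decimal places.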

    Test Image:

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F4786384%2Ff8d9aa282a42848b359aeeb021b97937%2Foutput.png?generation=1695433752833462&alt=media

    If you find this dataset useful, your support through an upvote would be greatly appreciated ❤️🙂

    Thank you

  2. NaBI

    • huggingface.co
    Updated Apr 11, 2025
    Cite
    Utkarsha Khanal (2025). NaBI [Dataset]. https://huggingface.co/datasets/Utkarsha666/NaBI
    Dataset updated
    Apr 11, 2025
    Authors
    Utkarsha Khanal
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    🧠 NaBI: Nepali Bias & Information Dataset

    The NaBI (Nepali Bias & Information) Dataset is a curated and annotated dataset developed for the classification of Nepali language content into four critical categories related to information integrity and harmful content. It is designed to aid in the development of NLP models for moderation, content filtering, and sociopolitical analysis.

    💡 Dataset Description

    The dataset consists of Nepali text samples sourced from public… See the full description on the dataset page: https://huggingface.co/datasets/Utkarsha666/NaBI.

  3. New Events Data in Nepal

    • kaggle.com
    zip
    Updated Sep 14, 2024
    Cite
    Techsalerator (2024). New Events Data in Nepal [Dataset]. https://www.kaggle.com/datasets/techsalerator/new-events-data-in-nepal
    Available download formats: zip (4948 bytes)
    Dataset updated
    Sep 14, 2024
    Authors
    Techsalerator
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Area covered
    Nepal
    Description

    Techsalerator's News Events Data for Nepal: A Comprehensive Overview

    Techsalerator's News Events Data for Nepal offers a valuable resource for businesses, researchers, and media organizations. This dataset aggregates information on key news events throughout Nepal, sourcing data from various media outlets, including news channels, online publications, and social media platforms. It provides essential insights for those interested in tracking trends, analyzing public sentiment, or observing industry-specific developments.

    Key Data Fields

    • Event Date: Captures the exact date of the news event, crucial for analysts monitoring trends over time or businesses responding to market changes.
    • Event Title: A brief headline describing the event, allowing users to quickly categorize and assess news content based on their interests.
    • Source: Identifies the news outlet or platform where the event was reported, helping users track credible sources and evaluate the reach and influence of the event.
    • Location: Provides geographic information on where the event occurred within Nepal, valuable for regional analysis or localized marketing efforts.
    • Event Description: A detailed summary of the event, outlining key developments, participants, and potential impact, aiding researchers and businesses in understanding the context and implications.

    Top 5 News Categories in Nepal

    • Politics: Major news on government decisions, political movements, elections, and policy changes affecting the national landscape.
    • Economy: Covers Nepal’s economic indicators, inflation rates, international trade, and corporate activities influencing business and finance sectors.
    • Social Issues: News on protests, public health, education, and other societal concerns driving public discourse.
    • Sports: Highlights events in popular sports such as football and cricket, drawing widespread attention and engagement.
    • Technology and Innovation: Reports on tech developments, startups, and innovations within Nepal’s growing tech ecosystem, featuring emerging companies and advancements.

    Top 5 News Sources in Nepal

    • The Kathmandu Post: A leading news outlet providing extensive coverage of national politics, economy, and social issues.
    • Republica: A major newspaper known for its timely updates on breaking news, politics, and current affairs.
    • Nagarik News: A widely-read source offering insights into local politics, economic developments, and societal trends.
    • My Republica: Covers a broad spectrum of topics, including politics, economy, and social issues.
    • Khabarhub: The national news agency delivering updates on significant events, public health, and sports across Nepal.

    Accessing Techsalerator’s News Events Data for Nepal

    To access Techsalerator’s News Events Data for Nepal, please contact info@techsalerator.com with your specific needs. We will provide a customized quote based on the data fields and records you require, with delivery available within 24 hours. Ongoing access options can also be discussed.

    Included Data Fields

    • Event Date
    • Event Title
    • Source
    • Location
    • Event Description
    • Event Category (Politics, Economy, Sports, etc.)
    • Participants (if applicable)
    • Event Impact (Social, Economic, etc.)
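
    The included data fields listed above can be modelled as a simple record type. The sketch below is illustrative only: it assumes the data arrives as CSV, that the column headers match the field names exactly, and that dates are in ISO format (YYYY-MM-DD). None of this is confirmed by the dataset description, and the sample row is made up.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date


@dataclass
class NewsEvent:
    event_date: date
    title: str
    source: str
    location: str
    description: str
    category: str      # e.g. Politics, Economy, Sports
    participants: str  # may be empty ("if applicable")
    impact: str        # e.g. Social, Economic


def parse_events(csv_text: str) -> list[NewsEvent]:
    """Parse CSV text whose headers are assumed to match the listed field names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        NewsEvent(
            event_date=date.fromisoformat(row["Event Date"]),
            title=row["Event Title"],
            source=row["Source"],
            location=row["Location"],
            description=row["Event Description"],
            category=row["Event Category"],
            participants=row["Participants"],
            impact=row["Event Impact"],
        )
        for row in reader
    ]


# Hypothetical sample row, for illustration only; not taken from the dataset.
SAMPLE = (
    "Event Date,Event Title,Source,Location,Event Description,"
    "Event Category,Participants,Event Impact\n"
    "2024-01-15,Example headline,Example Outlet,Kathmandu,"
    "Placeholder description.,Politics,,Social\n"
)

events = parse_events(SAMPLE)
print(events[0].event_date, events[0].category)
```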

    Techsalerator’s dataset is an essential tool for tracking significant events in Nepal, supporting informed decisions whether for business strategy, market analysis, or academic research, and offering a clear view of the country’s news landscape.

  4. Population and Economic Data of Nepal 2011 - Municipal-Level Data from...

    • data-staging.niaid.nih.gov
    Updated Feb 11, 2025
    Cite
    Maharjan, Roisha; Bhochhibhoya, Sanish (2025). Population and Economic Data of Nepal 2011 - Municipal-Level Data from different sources, including the National Census [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_14010806
    Dataset updated
    Feb 11, 2025
    Dataset provided by
    Pulchowk Campus
    Authors
    Maharjan, Roisha; Bhochhibhoya, Sanish
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Nepal
    Description

    This dataset comprises compiled data utilized for the integrated seismic risk assessment presented in the following study:

    Bhochhibhoya, S., & Maharjan, R. (2022). Integrated seismic risk assessment in Nepal. Natural Hazards and Earth System Sciences, 22(10), 3211-3230. https://doi.org/10.5194/nhess-22-3211-2022

    Dataset Contents:

    data_used_paper.csv: Municipality-level data used directly in the paper. Note that some entries are available only at the district level; please refer to the study for specific details.

    opendata.xlsx: A comprehensive Excel file compiling relevant district-level data obtained from the OpenData website.

    additional_survey_district.csv: Census data at the district level that was not included in the analysis.

    Data Sources: The data were compiled from publicly available sources and were not originally collected by the authors. Key sources include:

    CBS – Central Bureau of Statistics: National Population and Housing Census 2011 (National Report), https://unstats.un.org/unsd/demographic-social/census/documents/Nepal/Nepal-Census-2011-Vol1.pdf (last access: 20 November 2021), 2012.

    CBS – Central Bureau of Statistics: Population Monograph of Nepal, Vol. I (Population Dynamics), https://nepal.unfpa.org/sites/default/files/pub-pdf/PopulationMonograph2014Volume1.pdf (last access: 20 November 2021), 2014a.

    CBS – Central Bureau of Statistics: Population Monograph of Nepal, Vol. III (Economical Demography), https://nepal.unfpa.org/sites/default/files/pub-pdf/PopulationMonographV02.pdf (last access: 20 November 2021), 2014b.

    Sharma, P., Guha-Khasnobis, B., and Khanal, D. R.: Nepal human development report 2014, https://www.npc.gov.np/images/category/NHDR_Report_2014.pdf (last access: 20 November 2021), 2014.

    Department of Health Services (2013).

    Budget report for year 2070–2071 BS (Bikram Sambat, based on the Nepali calendar) (2013–2014 CE).

    Department of Education (2013–2014).

    Opendata Website.
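
    The budget report in the source list above gives its year in both calendars: 2070–2071 BS corresponds to 2013–2014 CE. A Bikram Sambat year N begins in mid-April of CE year N - 57 and ends in mid-April of CE year N - 56, so each BS year overlaps two CE years. The sketch below shows only this year-level arithmetic; the function name is illustrative, and exact day-level conversion needs a calendar table that is not attempted here.

```python
def bs_year_to_ce_years(bs_year: int) -> tuple[int, int]:
    """Return the two CE years that a Bikram Sambat year overlaps.

    Year-level approximation only: a BS year starts in mid-April,
    so it begins in CE year (bs_year - 57) and ends in (bs_year - 56).
    """
    return bs_year - 57, bs_year - 56


start_ce, end_ce = bs_year_to_ce_years(2070)
print(f"2070 BS overlaps {start_ce} and {end_ce} CE")
```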

    If the dataset is used, please cite both the dataset and the paper (below).

    Bhochhibhoya, S., & Maharjan, R. (2022). Integrated seismic risk assessment in Nepal. Natural Hazards and Earth System Sciences, 22(10), 3211-3230. https://doi.org/10.5194/nhess-22-3211-2022

    Roisha, M. & Bhochhibhoya, S. (2024). Population and Economic Data of Nepal 2011 - Municipal-Level Data from different sources, including the National Census (Version v1) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.14010807

    If files are not working, or any other queries, contact sonicewrites@gmail.com.

  5. Public Holidays in Nepal 2081 BS

    • kaggle.com
    zip
    Updated May 22, 2024
    Cite
    Saleep Shrestha (2024). Public Holidays in Nepal 2081 BS [Dataset]. https://www.kaggle.com/datasets/saleepshrestha/public-holidays-in-nepal-2081-bs
    Available download formats: zip (1404 bytes)
    Dataset updated
    May 22, 2024
    Authors
    Saleep Shrestha
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Area covered
    Nepal
    Description

    Dataset

    This dataset was created by Saleep Shrestha

    Released under Apache 2.0

    Contents

  6. LinCE (Linguistic Code-switching Evaluation)

    • kaggle.com
    zip
    Updated Dec 1, 2022
    Cite
    The Devastator (2022). LinCE (Linguistic Code-switching Evaluation) [Dataset]. https://www.kaggle.com/datasets/thedevastator/unlock-universal-language-with-the-lince-dataset
    Available download formats: zip (11808965 bytes)
    Dataset updated
    Dec 1, 2022
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    LinCE (Linguistic Code-switching Evaluation)

    Data for training and evaluating NLP systems on code-switching tasks

    By Huggingface Hub [source]

    About this dataset

    Do you want to uncover the power of language through analysis? The Lince Dataset is the answer! An expansive collection of language technologies and data, this dataset can be utilized for a multitude of purposes. With six different languages to explore (Spanish, Hindi, Nepali, Spanish-English, Hindi-English and Modern Standard Arabic-Egyptian Arabic (MSAEA)), you are granted access to an enormous selection of language identification (LID), part-of-speech (POS) tagging, Named-Entity Recognition (NER), sentiment analysis (SA) and much more. Train your models efficiently with the help of ML in order to automatically detect and classify tasks such as POS or NER from each variation. Or even build cross-linguistic models between multiple languages if preferred! Push the boundaries with Lince Dataset's unparalleled diversity. Dive into exploratory research within this feast for NLP connoisseurs and unlock hidden opportunities today!

    How to use the dataset

    Are you looking to unlock the potential of multilingual natural language processing (NLP) with the Lince Dataset? If so, you’re in the right place! With six languages and training data for language identification (LID), part-of-speech (POS) tagging, Named-Entity Recognition (NER), sentiment analysis (SA) and more, this is one of the most comprehensive datasets for NLP today.

    Understand what is included in this dataset

    This dataset includes language technology data from six different languages: Spanish, Hindi, Nepali, Spanish-English, Hindi-English and Modern Standard Arabic-Egyptian Arabic (MSAEA). Each file is labelled according to its content, e.g. lid_msaea_test.csv, which contains test data for language identification (LID) with 5 columns containing words, part-of-speech tags and sentiment analysis labels. A brief summary of each file's contents is shown when you open the dataset on Kaggle, or can be obtained by running a function such as “head()” or “describe()”, depending on your software.

    Decide What Kind Of Analysis You Want To Do

    Once you are familiar with the data provided, decide which kind of model or analysis you want before coding any algorithms for that task. For example, to build a cross-lingual model for POS tagging, it would be ideal to have training and validation sets from 3 different languages, so that the model can take advantage of knowledge shared between them during training; files such as pos_spaeng_train and pos_hineng_validation then come into play. When designing the model architecture, make sure the task-specific hyperparameters complement each other. Choosing an appropriate feature vector representation also helps improve performance.
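
    The file names mentioned in this description (lid_msaea_test.csv, pos_spaeng_train, pos_hineng_validation) suggest a task_languagepair_split naming pattern. The sketch below splits a name into those three parts so that files can be selected by task. The pattern is inferred from these three names only and is an assumption; other files in the archive may not follow it, and the helper name is hypothetical.

```python
from typing import NamedTuple


class LinceFile(NamedTuple):
    task: str           # e.g. "lid", "pos"
    language_pair: str  # e.g. "msaea", "spaeng", "hineng"
    split: str          # e.g. "train", "validation", "test"


def parse_lince_filename(name: str) -> LinceFile:
    """Split '<task>_<languagepair>_<split>[.csv]' into its parts."""
    stem = name[:-4] if name.endswith(".csv") else name
    parts = stem.split("_")
    if len(parts) != 3:
        raise ValueError(f"unexpected file name: {name!r}")
    task, language_pair, split = parts
    return LinceFile(task, language_pair, split)


names = ["lid_msaea_test.csv", "pos_spaeng_train.csv", "pos_hineng_validation.csv"]
pos_files = [n for n in names if parse_lince_filename(n).task == "pos"]
print(pos_files)
```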

    Run Appropriate Algorithms On The Data Provided In The Dataset

    With all of these elements understood, you can start running appropriate algorithms, irrespective of the tools used, while tuning your models with metrics such as accuracy and F1 score. Once the model is tuned, check that the system works reliably by testing it on an unseen test set and confirming the desired results. During optimization, hyperparameter tuning plays a significant role, depending on the algorithm chosen.

    Research Ideas

    • Developing a multilingual sentiment analysis system that can analyze sentiment in any of the six languages.
    • Training a model to identify and classify named entities across multiple languages, such as identifying certain words for proper nouns or locations regardless of language or coding scheme.
    • Developing an AI-powered cross-lingual translator that is able to effectively translate text from one language to another with minimal errors and maximum accuracy

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: lid_msaea_test.csv...

  7. Data from: Nepal Heritage Documentation Project, NHDP

    • heidata.uni-heidelberg.de
    json
    Updated May 26, 2020
    Cite
    Christiane Brosius; Axel Michaels (2020). Nepal Heritage Documentation Project, NHDP [Dataset]. http://doi.org/10.11588/DATA/A9DCZA
    Available download formats: json (21606916), json (22639652)
    Dataset updated
    May 26, 2020
    Dataset provided by
    heiDATA
    Authors
    Christiane Brosius; Axel Michaels
    License

    https://heidata.uni-heidelberg.de/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.11588/DATA/A9DCZA

    Area covered
    Nepal
    Dataset funded by
    Arcadia Foundation
    Description

    DANAM is the Digital Archive for Nepalese Art and Monuments and the heart of the Nepal Heritage Documentation Project (NHDP), located at the Heidelberg Centre of Transcultural Studies (HCTS) and the Academy of Sciences (AdW) and operated in cooperation with Saraf Foundation and the Department of Archaeology, Nepal. The database offers visual and textual documentation of heritage monuments, which are threatened by urbanisation and natural disasters. Data sets contain structured information on the monuments, i.e. details of their location, history, architectural structure, and religious and social activities. It presents photographs, maps, plans and drawings, transcriptions of inscriptions, and historical and anthropological reports. Descriptions are available in both English and Nepali. References to scholarly documentation and resources enable further research interests to be explored. DANAM is based on Arches (v.4), an open-source, geospatially-enabled software platform for cultural heritage inventory and management, developed jointly by the Getty Conservation Institute and World Monuments Fund. Funded by the British Arcadia Foundation, the entire content of DANAM is openly accessible to the general public. A large part of the data is also stored in heidICON and heiDATA.

  8. Nepali-luxury-hotel-reviews-2024

    • kaggle.com
    zip
    Updated Feb 12, 2025
    Cite
    Suprabal Pandey (2025). Nepali-luxury-hotel-reviews-2024 [Dataset]. https://www.kaggle.com/datasets/suprapandey/nepali-luxury-hotel-reviews-2024
    Available download formats: zip (3473974 bytes)
    Dataset updated
    Feb 12, 2025
    Authors
    Suprabal Pandey
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Nepal
    Description

    Dataset

    This dataset was created by Suprabal Pandey

    Released under CC0: Public Domain

    Contents

  9. Handwritten Labeled Dataset for Text Recognition

    • kaggle.com
    zip
    Updated May 21, 2024
    Cite
    Sweekar Dahal (2024). Handwritten Labeled Dataset for Text Recognition [Dataset]. https://www.kaggle.com/datasets/sweekardahal/handwritten-labeled-dataset-for-text-recognition/versions/1
    Available download formats: zip (5761156 bytes)
    Dataset updated
    May 21, 2024
    Authors
    Sweekar Dahal
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This is a labeled handwritten Nepali dataset for text recognition. The dataset was created for an Optical Character Recognition tool. Because the dataset is small, large classification models overfit the training data. To work around this constraint, HuggingFace TransformerOCR was used. The model achieved good results but was slow during inference. The motivation for making this dataset public is to enable development of an optimal model that can recognize the text.

    Results

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F4786384%2F46e658c372c33874e22a5c8cda2713aa%2FScreenshot%20from%202024-05-21%2019-49-38.png?generation=1716301271869838&alt=media

  10. Air Pollution Image Dataset from India and Nepal

    • kaggle.com
    zip
    Updated May 7, 2024
    Cite
    Adarsh Rouniyar (2024). Air Pollution Image Dataset from India and Nepal [Dataset]. https://www.kaggle.com/datasets/adarshrouniyar/air-pollution-image-dataset-from-india-and-nepal/code
    Available download formats: zip (667113768 bytes)
    Dataset updated
    May 7, 2024
    Authors
    Adarsh Rouniyar
    License

    Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0): https://creativecommons.org/licenses/by-nc-sa/3.0/
    License information was derived automatically

    Area covered
    India, Nepal
    Description

    Air-Pollution-Image-Dataset-From-India-and-Nepal

    Please Fork and Star our work by visiting our GitHub Repository before using or downloading our dataset

    1. Forking our repository allows you to create your own copy of our repository, which you can modify and use as you wish.

    2. Starring our repository is a way for people to show their support and appreciation for our work.

    https://github.com/ICCC-Platform/Air-Pollution-Image-Dataset-From-India-and-Nepal

    Introduction:

    This dataset contains images of Air Pollution for different cities in India and Nepal. The dataset is divided into two folders: Combined_Dataset and Country_wise_Dataset.

    Total number of images in the dataset: 12,240

    Image size: 224*224

    Air Quality Index (AQI) classes and their definitions as used in the dataset:

    There are a total of six classes of Air Pollution, which we represent in our dataset as follows:

    1. Good (0-50): Air quality is considered satisfactory and air pollution poses little or no risk.

    2. Moderate (51-100): Air quality is acceptable; however, for some pollutants, there may be a moderate health concern for a very small number of people who are unusually sensitive to air pollution.

    3. Unhealthy for Sensitive Groups (101-150): Members of sensitive groups may experience health effects, but the general public is unlikely to be affected.

    4. Unhealthy (151-200): Some members of the general public may experience health effects; members of sensitive groups may experience more serious health effects.

    5. Very Unhealthy (201-300): Health alert: The risk of health effects is increased for everyone.

    6. Hazardous/Severe (301-500): Health warning of emergency conditions: Everyone is more likely to be affected.
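
    The six AQI ranges listed above amount to a lookup from an AQI value to a class. The sketch below is one way to express that mapping. The label strings follow the list above, but whether they match the exact spelling used in the dataset's AQI_Class column is an assumption, as is the handling of non-integer values (assigned to the first range whose upper bound they do not exceed).

```python
# (upper bound, label) pairs taken from the class list above.
AQI_CLASSES = [
    (50, "Good"),
    (100, "Moderate"),
    (150, "Unhealthy for Sensitive Groups"),
    (200, "Unhealthy"),
    (300, "Very Unhealthy"),
    (500, "Hazardous/Severe"),
]


def aqi_class(aqi: float) -> str:
    """Map an AQI value in [0, 500] to one of the six classes."""
    if not 0 <= aqi <= 500:
        raise ValueError(f"AQI out of range: {aqi}")
    for upper, label in AQI_CLASSES:
        if aqi <= upper:
            return label
    raise AssertionError("unreachable: range was checked above")


for value in (42, 51, 175, 301):
    print(value, aqi_class(value))
```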

    Reference:

    https://airtw.epa.gov.tw/ENG/Information/Standard/AirQualityIndicator.aspx

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F11024368%2F3865850ad0720dc148c71b79946f4196%2FAQI%20reference.JPG?generation=1681898437013999&alt=media

    Cities of India

    1. ITO, Delhi
    2. Dimapur, Nagaland
    3. Spice Garden, Bengaluru
    4. Knowledge Park III, Greater Noida
    5. New Ind Town, Faridabad
    6. Borivali East, Mumbai
    7. Oragadam, Tamil Nadu

    City of Nepal

    1. Biratnagar

    Combined dataset:

    The combined dataset folder contains two subfolders.

    1. All_img: This subfolder contains all the collected images from all AQI classes.
    2. IND_and_NEP: This subfolder contains six different subfolders representing six different classes of AQI.

    The csv file in this folder contains all the data and its parameters. It is labeled as

    Location, Filename, Year, Month, Day, Hour, AQI, PM2.5, PM10, O3, CO, SO2, NO2, and AQI_Class
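
    The column list above can be read with Python's standard csv module. The sketch below is a hedged illustration: the column names are taken from the list above, but the sample row is made up, and the value formats (integer Year, Month, Day and Hour; numeric pollutant readings) are assumptions that should be checked against the real csv file.

```python
import csv
import io
from datetime import datetime

COLUMNS = [
    "Location", "Filename", "Year", "Month", "Day", "Hour", "AQI",
    "PM2.5", "PM10", "O3", "CO", "SO2", "NO2", "AQI_Class",
]
NUMERIC = ["AQI", "PM2.5", "PM10", "O3", "CO", "SO2", "NO2"]

# Made-up row for illustration only; not taken from the dataset.
SAMPLE = (
    ",".join(COLUMNS) + "\n"
    "Biratnagar,example_0001.jpg,2023,2,19,8,165,85.0,120.0,30.0,0.5,10.0,20.0,Unhealthy\n"
)


def load_rows(csv_text: str) -> list[dict]:
    """Parse rows, converting readings to float and building a timestamp."""
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        row = dict(raw)
        for name in NUMERIC:
            row[name] = float(raw[name])
        # Assumes Hour is a whole number from 0 to 23.
        row["timestamp"] = datetime(
            int(raw["Year"]), int(raw["Month"]), int(raw["Day"]), int(raw["Hour"])
        )
        rows.append(row)
    return rows


rows = load_rows(SAMPLE)
print(rows[0]["timestamp"], rows[0]["AQI"], rows[0]["AQI_Class"])
```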

    Country_wise_Dataset:

    This folder contains two subfolders representing the countries from which the dataset was collected.

    1. India: This subfolder contains one subfolder for each city where data were collected. Each city subfolder contains folders holding the data collected for each AQI class, as well as a csv file with the details of each image, using the same columns as above:

    Location, Filename, Year, Month, Day, Hour, AQI, PM2.5, PM10, O3, CO, SO2, NO2, and AQI_Class

    2. Nepal: We managed to collect the image dataset from Nepal. This subfolder contains a subfolder named after the city where data were collected. That city subfolder contains folders holding the data collected for each AQI class, as well as a csv file with the details of each image, using the same columns as above:

    Location, Filename, Year, Month, Day, Hour, AQI, PM2.5, PM10, O3, CO, SO2, NO2, and AQI_Class

    Dataset Collection Process:

    1. Visit the site: The first step in collecting the air pollution data was to personally visit the site. This involved physically going to the location and capturing images and videos of the area.

    2. Note current parameters: While visiting the site, various parameters related to air pollution were noted. These included measurements of PM2.5, PM10, NO2, SO2, CO, etc. These parameters were noted by referring to publicly available data sources such as the Central Pollution Control Board (CPCB) website. For India we used https://app.cpcbccr.com/AQI_India/ and for Nepal we used: https://www.tomorrow.io/weather/NP/4/Biratnagar/079711/hourly/

    3. Preprocess images: Once the images and videos were captured, they were preprocessed to remove any images that were blurry, overexposed, or had other quality issues. Only the images that met the desired quality criteria were selected for further analysis.

    4. Extract frames from videos: In addition to the images, videos were also capture...

  11. Quarterly profits of Commercial Banks of Nepal

    • kaggle.com
    zip
    Updated Sep 15, 2021
    Cite
    Dimanjan Dahal (2021). Quarterly profits of Commercial Banks of Nepal [Dataset]. https://www.kaggle.com/dimanjung/quarterly-profits-of-commercial-banks-of-nepal
    Available download formats: zip (7902 bytes)
    Dataset updated
    Sep 15, 2021
    Authors
    Dimanjan Dahal
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Nepal
    Description

    Dataset

    This dataset was created by Dimanjan Dahal

    Released under CC0: Public Domain

    Contents

    Note: GBIME and JBNL merged in 2019.

  12. NEPSE Floorsheet

    • kaggle.com
    zip
    Updated May 18, 2021
    Cite
    Shivahari Subedi (2021). NEPSE Floorsheet [Dataset]. https://www.kaggle.com/shivaharisubedi/nepse-floorsheet
    Available download formats: zip (4807923 bytes)
    Dataset updated
    May 18, 2021
    Authors
    Shivahari Subedi
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Shivahari Subedi

    Released under CC0: Public Domain

    Contents

  13. Synthetic Nepali Words Dataset

    • kaggle.com
    zip
    Updated Jul 29, 2025
    Cite
    Allah Hitler (2025). Synthetic Nepali Words Dataset [Dataset]. https://www.kaggle.com/datasets/allahhitler/synth-nepali-words-dataset
    Available download formats: zip (456495482 bytes)
    Dataset updated
    Jul 29, 2025
    Authors
    Allah Hitler
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Allah Hitler

    Released under CC0: Public Domain

    Contents

  14. nepali_unigrams_cleaned

    • kaggle.com
    zip
    Updated Jan 20, 2025
    Cite
    Sundeep Dawadi (2025). nepali_unigrams_cleaned [Dataset]. https://www.kaggle.com/datasets/thenepaliguy/nepali-unigrams-cleaned
    Available download formats: zip (38667038 bytes)
    Dataset updated
    Jan 20, 2025
    Authors
    Sundeep Dawadi
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Sundeep Dawadi

    Released under CC0: Public Domain

    Contents


Nepali Handwritten Images for text detection

Exhaustive text dataset collection of varying ages
