9 datasets found
  1. South Asian Multi-Year Facial Image Dataset

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). South Asian Multi-Year Facial Image Dataset [Dataset]. https://www.futurebeeai.com/dataset/image-dataset/facial-images-historical-south-asian
    Available download formats: wav
    Dataset updated: Aug 1, 2022
    Dataset provided by: FutureBeeAI
    Authors: FutureBee AI
    License: https://www.futurebeeai.com/policies/ai-data-license-agreement
    Area covered: South Asia
    Dataset funded by: FutureBeeAI
    Description

    Introduction

    Welcome to the South Asian Multi-Year Facial Image Dataset, thoughtfully curated to support the development of advanced facial recognition systems, biometric identification models, KYC verification tools, and other computer vision applications. This dataset is ideal for training AI models to recognize individuals over time, track facial changes, and enhance age progression capabilities.

    Facial Image Data

    This dataset includes more than 10,000 high-quality facial images, organized into individual participant sets, each containing:

    Historical Images: 22 facial images per participant captured across a span of 10 years
    Enrollment Image: One recent high-resolution facial image for reference or ground truth

    Diversity & Representation

    Geographic Coverage: Participants from India, Pakistan, Bangladesh, Nepal, Sri Lanka, Bhutan, Maldives, and other South Asian regions
    Demographics: Individuals aged 18 to 70 years, with a gender distribution of 60% male and 40% female
    File Formats: All images are available in JPEG and HEIC formats

    Image Quality & Capture Conditions

    To ensure model generalization and practical usability, images in this dataset reflect real-world diversity:

    Lighting Conditions: Images captured under various natural and artificial lighting setups
    Backgrounds: A wide range of indoor and outdoor backgrounds
    Device Quality: Captured using modern, high-resolution mobile devices for consistency and clarity

    Metadata

    Each participant’s dataset is accompanied by rich metadata to support advanced model training and analysis, including:

    Unique participant ID
    File name
    Age at the time of image capture
    Gender
    Country of origin
    Demographic profile
    File format

    Use Cases & Applications

    This dataset is highly valuable for a wide range of AI and computer vision applications:

    Facial Recognition Systems: Train models for high-accuracy face matching across time
    KYC & Identity Verification: Improve time-spanning verification for banks, insurance, and government services
    Biometric Security Solutions: Build reliable identity authentication models
    Age Progression & Estimation Models: Train AI to predict aging patterns or estimate age from facial features
    Generative AI: Support creation and validation of synthetic age progression or longitudinal face generation

    Secure & Ethical Collection

    Platform: All data was securely collected and processed through FutureBeeAI’s proprietary systems
    Ethical Compliance: Full participant consent obtained with transparent communication of use cases
    Privacy-Protected: No personally identifiable information is included; all data is anonymized and handled with care

    Dataset Updates & Customization

    To keep pace with evolving AI needs, this dataset is regularly updated and customizable. Custom data collection options include:


  2. Urdu TTS Speech Dataset for Speech Synthesis

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Urdu TTS Speech Dataset for Speech Synthesis [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/tts-monolgue-urdu-pakistan
    Available download formats: wav
    Dataset updated: Aug 1, 2022
    Dataset provided by: FutureBeeAI
    Authors: FutureBee AI
    License: https://www.futurebeeai.com/policies/ai-data-license-agreement
    Dataset funded by: FutureBeeAI
    Description

    The Urdu TTS Monologue Speech Dataset is a professionally curated resource built to train realistic, expressive, and production-grade text-to-speech (TTS) systems. It contains studio-recorded long-form speech by trained native Urdu voice artists, each contributing 1 to 2 hours of clean, uninterrupted monologue audio.

    Unlike typical prompt-based datasets with short, isolated phrases, this collection features long-form, topic-driven monologues that mirror natural human narration. It includes content types that are directly useful for real-world applications, like audiobook-style storytelling, educational lectures, health advisories, product explainers, digital how-tos, formal announcements, and more.

    All recordings are captured in professional studios using high-end equipment and under the guidance of experienced voice directors.

    Recording & Audio Quality

    Audio Format: WAV, 48 kHz, available in 16-bit, 24-bit, and 32-bit depth
    SNR: Minimum 30 dB
    Channel: Mono
    Recording Duration: 20-30 minutes
    Recording Environment: Studio-controlled, acoustically treated rooms
    Per Speaker Volume: 1–2 hours of speech per artist
    Quality Control: Each file is reviewed and cleaned for common acoustic issues, including: reverberation, lip smacks, mouth clicks, thumping, hissing, plosives, sibilance, background noise, static interference, clipping, and other artifacts.

    Only clean, production-grade audio makes it into the final dataset.
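
    The format figures listed above (48 kHz, mono, 16/24/32-bit) can be checked against a file header. The sketch below is illustrative only: the function name and constants are hypothetical, and it uses Python's standard-library wave module, which reads uncompressed PCM only, so a file stored as 32-bit float would raise wave.Error instead of passing.

```python
import wave

EXPECTED_RATE_HZ = 48_000          # "48 kHz" from the listing
EXPECTED_CHANNELS = 1              # "Mono"
ALLOWED_SAMPLE_WIDTHS = {2, 3, 4}  # bytes per sample for 16-, 24-, 32-bit


def check_wav(path):
    """Return a list of mismatches between a WAV header and the listed spec."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sample rate {wav.getframerate()} Hz")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"{wav.getnchannels()} channels")
        if wav.getsampwidth() not in ALLOWED_SAMPLE_WIDTHS:
            problems.append(f"{wav.getsampwidth() * 8}-bit samples")
    return problems
```

    An empty list means the header matches the listed format; it says nothing about SNR or the acoustic issues covered by the quality-control step.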

    Voice Artist Selection

    All voice artists are native Urdu speakers with professional training or prior experience in narration. We ensure a diverse pool in terms of age, gender, and region to bring a balanced and rich vocal dataset.

    Artist Profile:
    Gender: Male and Female
    Age Range: 20–60 years
    Regions: Native Urdu-speaking regions of Pakistan
    Selection Process: All artists are screened, onboarded, and sample-approved using FutureBeeAI’s proprietary Yugo platform.

    Script Quality & Coverage

    Scripts are neither generic nor repetitive: they are professionally authored by domain experts to reflect real-world use cases. They avoid redundancy and include modern vocabulary, emotional range, and phonetically rich sentence structures.

    Word Count per Script: 3,000–5,000 words per 30-minute session
    Content Types:
    Storytelling
    Script and book reading
    Informational explainers
    Government service instructions
    E-commerce tutorials
    Motivational content
    Health & wellness guides
    Education & career advice
    Linguistic Design: Balanced punctuation, emotional range, modern syntax, and vocabulary diversity

    Transcripts & Alignment

    While the script is used during the recording, we also provide post-recording updates to ensure the transcript reflects the final spoken audio. Minor edits are made to adjust for skipped or rephrased words.

    Segmentation: Time-stamped at the sentence level, aligned to actual spoken delivery
    Format: Available in plain text and JSON
    Post-processing:
    Corrected for disfluencies

  3. Pakistan Foreign Exchange Reserves

    • tradingeconomics.com
    • pl.tradingeconomics.com
    • +13more
    csv, excel, json, xml
    Updated Jun 23, 2016
    Cite
    TRADING ECONOMICS (2016). Pakistan Foreign Exchange Reserves [Dataset]. https://tradingeconomics.com/pakistan/foreign-exchange-reserves
    Available download formats: excel, csv, json, xml
    Dataset updated: Jun 23, 2016
    Dataset authored and provided by: TRADING ECONOMICS
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)
    Time period covered: Dec 31, 1998 - Jul 31, 2025
    Area covered: Pakistan
    Description

    Foreign Exchange Reserves in Pakistan increased to USD 19,607 million in July 2025 from USD 19,269.80 million in June 2025. This dataset provides the latest reported value for Pakistan Foreign Exchange Reserves, plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
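
    As a quick worked example, the month-over-month change implied by the two figures quoted above can be computed directly. The helper name is hypothetical, and the only inputs are the June and July 2025 values from the description.

```python
def change(previous, current):
    """Return the absolute change and the percentage change between two values."""
    absolute = current - previous
    percent = absolute / previous * 100
    return absolute, percent


# Values in USD million, as quoted in the description (June and July 2025).
absolute, percent = change(previous=19269.80, current=19607.0)
print(f"{absolute:+.2f} USD million ({percent:+.2f}%)")
# prints "+337.20 USD million (+1.75%)"
```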

  4. Pakistan Total External Debt

    • tradingeconomics.com
    • tr.tradingeconomics.com
    • +13more
    csv, excel, json, xml
    Cite
    TRADING ECONOMICS, Pakistan Total External Debt [Dataset]. https://tradingeconomics.com/pakistan/external-debt
    Available download formats: xml, excel, json, csv
    Dataset authored and provided by: TRADING ECONOMICS
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)
    Time period covered: Jun 30, 2002 - Mar 31, 2025
    Area covered: Pakistan
    Description

    External Debt in Pakistan decreased to USD 130,310 million in the first quarter of 2025 from USD 130,921 million in the fourth quarter of 2024. This dataset provides actual values, historical data, forecast, chart, statistics, economic calendar and news for Pakistan External Debt.

  5. Pakistan - Internally displaced persons - IDPs

    • data.amerigeoss.org
    • data.wu.ac.at
    csv
    Updated Jul 15, 2021
    Cite
    UN Humanitarian Data Exchange (2021). Pakistan - Internally displaced persons - IDPs [Dataset]. https://data.amerigeoss.org/sl/dataset/idmc-idp-data-for-pakistan
    Available download formats: csv(8362), csv(824)
    Dataset updated: Jul 15, 2021
    Dataset provided by: United Nations (http://un.org/)
    Area covered: Pakistan
    Description

    Internally displaced persons are defined according to the 1998 Guiding Principles (http://www.internal-displacement.org/publications/1998/ocha-guiding-principles-on-internal-displacement) as people or groups of people who have been forced or obliged to flee or to leave their homes or places of habitual residence, in particular as a result of armed conflict, or to avoid the effects of armed conflict, situations of generalized violence, violations of human rights, or natural or human-made disasters and who have not crossed an international border.

    "People Displaced" refers to the number of people living in displacement as of the end of each year.

    "New Displacement" refers to the number of new cases or incidents of displacement recorded, rather than the number of people displaced. This is done because people may have been displaced more than once.

    Contains data from IDMC's Global Internal Displacement Database.

  6. Pakistan - Internal Displacements (New Displacements) – IDPs

    • data.humdata.org
    csv
    Updated Jul 21, 2025
    Cite
    Internal Displacement Monitoring Centre (IDMC) (2025). Pakistan - Internal Displacements (New Displacements) – IDPs [Dataset]. https://data.humdata.org/dataset/idmc-idp-data-pak
    Available download formats: csv(38268), csv(917)
    Dataset updated: Jul 21, 2025
    Dataset provided by: Internal Displacement Monitoring Centre (http://internal-displacement.org/)
    License: Attribution 3.0 (CC BY 3.0), https://creativecommons.org/licenses/by/3.0/ (license information was derived automatically)
    Area covered: Pakistan
    Description

    The Global Internal Displacement Database (GIDD), maintained by the Internal Displacement Monitoring Centre (IDMC), provides comprehensive, validated annual estimates of internal displacement worldwide. It defines internally displaced persons (IDPs) in line with the 1998 Guiding Principles, as people or groups of people who have been forced or obliged to flee or to leave their homes or places of habitual residence, in particular as a result of armed conflict, or to avoid the effects of armed conflict, situations of generalized violence, violations of human rights, or natural or human-made disasters and who have not crossed an international border.

    The GIDD tracks two primary metrics: "People Displaced" or population "Stock" figures, which represent the total number of people living in displacement at year-end, and "New Displacement," which counts new displacement incidents (population Flows) rather than individual people, accounting for potential multiple displacements by the same person. This dataset serves as a crucial resource for understanding long-term trends and validated displacement figures globally. For further detailed information and complete API specifications, users are encouraged to consult the official documentation at https://www.internal-displacement.org/database/api-documentation/.

    "Internally displaced persons - IDPs" refers to the number of people living in displacement as of the end of each year.

    "Internal displacements (New Displacements)" refers to the number of new cases or incidents of displacement recorded, rather than the number of people displaced. This is done because people may have been displaced more than once.
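
    The difference between the stock figure and the flow figure can be illustrated with a toy example. Everything below is an assumption made for illustration: the records are invented, the record layout and function names are hypothetical, and this is not IDMC's methodology. It only shows why a person displaced twice adds two to the flow count but at most one to the year-end stock.

```python
from collections import Counter

# Made-up records for illustration only: (person, year, event).
EVENTS = [
    ("A", 2020, "displaced"),
    ("A", 2020, "returned"),
    ("A", 2020, "displaced"),   # same person displaced a second time
    ("B", 2020, "displaced"),
    ("B", 2021, "returned"),
    ("C", 2021, "displaced"),
]


def new_displacements(events):
    """Flow: count displacement incidents per year, not distinct people."""
    return Counter(year for _, year, event in events if event == "displaced")


def people_displaced(events, year):
    """Stock: people whose most recent event up to year-end is a displacement."""
    latest = {}
    for person, event_year, event in events:  # assumes chronological order
        if event_year <= year:
            latest[person] = event
    return sum(1 for event in latest.values() if event == "displaced")
```

    With these records, 2020 has three new displacements but only two people living in displacement at year-end.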

  7. Pakistan Gold Reserves

    • tradingeconomics.com
    • it.tradingeconomics.com
    • +12more
    csv, excel, json, xml
    Cite
    TRADING ECONOMICS, Pakistan Gold Reserves [Dataset]. https://tradingeconomics.com/pakistan/gold-reserves
    Available download formats: csv, xml, excel, json
    Dataset authored and provided by: TRADING ECONOMICS
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)
    Time period covered: Mar 31, 2000 - Jun 30, 2025
    Area covered: Pakistan
    Description

    Gold Reserves in Pakistan remained unchanged at 64.75 tonnes in the second quarter of 2025, the same level as in the first quarter of 2025. This dataset provides actual values, historical data, forecast, chart, statistics, economic calendar and news for Pakistan Gold Reserves.

  8. Urdu General Conversation Speech Dataset for ASR

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Urdu General Conversation Speech Dataset for ASR [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/general-conversation-urdu-pakistan
    Available download formats: wav
    Dataset updated: Aug 1, 2022
    Dataset provided by: FutureBeeAI
    Authors: FutureBee AI
    License: https://www.futurebeeai.com/policies/ai-data-license-agreement
    Dataset funded by: FutureBeeAI
    Description

    Introduction

    Welcome to the Urdu General Conversation Speech Dataset — a rich, linguistically diverse corpus purpose-built to accelerate the development of Urdu speech technologies. This dataset is designed to train and fine-tune ASR systems, spoken language understanding models, and generative voice AI tailored to real-world Urdu communication.

    Curated by FutureBeeAI, this 30-hour dataset offers unscripted, spontaneous two-speaker conversations across a wide array of real-life topics. It enables researchers, AI developers, and voice-first product teams to build robust, production-grade Urdu speech models that understand and respond to authentic Urdu accents and dialects.

    Speech Data

    The dataset comprises 30 hours of high-quality audio, featuring natural, free-flowing dialogue between native speakers of Urdu. These sessions range from informal daily talks to deeper, topic-specific discussions, ensuring variability and context richness for diverse use cases.

    Participant Diversity:
    Speakers: 60 verified native Urdu speakers from FutureBeeAI’s contributor community.
    Regions: Representing various provinces of Pakistan to ensure dialectal diversity and demographic balance.
    Demographics: A gender ratio of 60% male to 40% female, with participant ages ranging from 18 to 70 years.
    Recording Details:
    Conversation Style: Unscripted, spontaneous peer-to-peer dialogues.
    Duration: Each conversation ranges from 15 to 60 minutes.
    Audio Format: Stereo WAV files, 16-bit depth, recorded at a 16 kHz sample rate.
    Environment: Quiet, echo-free settings with no background noise.

    Topic Diversity

    The dataset spans a wide variety of everyday and domain-relevant themes. This topic diversity ensures the resulting models are adaptable to broad speech contexts.

    Sample Topics Include:
    Family & Relationships
    Food & Recipes
    Education & Career
    Healthcare Discussions
    Social Issues
    Technology & Gadgets
    Travel & Local Culture
    Shopping & Marketplace Experiences, and many more.

    Transcription

    Each audio file is paired with a human-verified, verbatim transcription available in JSON format.

    Transcription Highlights:
    Speaker-segmented dialogues
    Time-coded utterances
    Non-speech elements (pauses, laughter, etc.)
    High transcription accuracy, achieved through a double QA pass, with an average WER below 5%

    These transcriptions are production-ready, enabling seamless integration into ASR model pipelines or conversational AI workflows.
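
    The listing does not say how its WER figure was measured. For reference, word error rate is conventionally defined as (substitutions + deletions + insertions) divided by the number of reference words, computed from a word-level edit distance. The sketch below follows that conventional definition under the simplifying assumption that words are split on whitespace; the function name is hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j] = edit distance between the ref words seen so far and hyp[:j]
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)
```

    For example, word_error_rate("a b c d", "a x c") is 0.5: one substitution and one deletion against four reference words.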

    Metadata

    The dataset comes with granular metadata for both speakers and recordings:

    Speaker Metadata: Age, gender, accent, dialect, state/province, and participant ID.
    Recording Metadata: Topic, duration, audio format, device type, and sample rate.

    Such metadata helps developers fine-tune model training and supports use-case-specific filtering or demographic analysis.

    Usage and Applications

    This dataset is a versatile resource for multiple Urdu speech and language AI applications:

    ASR Development: Train accurate speech-to-text systems for Urdu.
    Voice Assistants: Build smart assistants capable of understanding natural Urdu conversations.

  9. Pakistan - Learning and Educational Achievement in Punjab Schools (LEAPS) -...

    • waterdata3.staging.derilinx.com
    Updated Mar 16, 2020
    Cite
    (2020). Pakistan - Learning and Educational Achievement in Punjab Schools (LEAPS) - 2003 - Dataset - waterdata [Dataset]. https://waterdata3.staging.derilinx.com/dataset/pakistan-learning-and-educational-achievement-punjab-schools-leaps-2003
    Dataset updated: Mar 16, 2020
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)
    Area covered: Punjab, Pakistan
    Description

    Whether one is in favor of private education or not, it is here to stay and there is a critical need to understand this new environment. Unfortunately, little is known about the private sector and what its growth implies for the provision of education. There are important questions we need to answer before engaging in productive debate about how education can be best provided in the Pakistani context. For instance:

    a. Where are private schools setting up? Are they only being established in urban areas and only for the elite?
    b. What is the quality of education in private sector schools? How does it compare to public schools?
    c. Are the poor being left out? Is the private sector creating two classes of people in Pakistan: those who can afford private education and those who cannot?
    d. What is the effect of private schools on government schools?

