9 datasets found

South Asian Multi-Year Facial Image Dataset
futurebeeai.com
wav
Updated Aug 1, 2022
Cite
FutureBee AI (2022). South Asian Multi-Year Facial Image Dataset [Dataset]. https://www.futurebeeai.com/dataset/image-dataset/facial-images-historical-south-asian
Explore at:
Available download formats: wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Area covered
South Asia
Dataset funded by
FutureBeeAI
Description
Introduction
Welcome to the South Asian Multi-Year Facial Image Dataset, thoughtfully curated to support the development of advanced facial recognition systems, biometric identification models, KYC verification tools, and other computer vision applications. This dataset is ideal for training AI models to recognize individuals over time, track facial changes, and enhance age progression capabilities.
Facial Image Data
This dataset includes over 10,000 high-quality facial images, organized into individual participant sets, each containing:
•Historical Images: 22 facial images per participant captured across a span of 10 years
•Enrollment Image: One recent high-resolution facial image for reference or ground truth
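As a rough sanity check on the figures above, each participant set holds 22 historical images plus one enrollment image. The sketch below assumes that every set is complete and that 10,000 is a lower bound on the image count; neither assumption is confirmed by the provider, and the function names are illustrative only.

```python
# Figures taken from the dataset description.
HISTORICAL_PER_PARTICIPANT = 22
ENROLLMENT_PER_PARTICIPANT = 1
TOTAL_IMAGES_LOWER_BOUND = 10_000  # "over 10,000" images


def images_per_participant() -> int:
    """Images in one complete participant set."""
    return HISTORICAL_PER_PARTICIPANT + ENROLLMENT_PER_PARTICIPANT


def min_participants(total_images: int = TOTAL_IMAGES_LOWER_BOUND) -> int:
    """Smallest number of complete sets needed to reach total_images.

    Assumption: every participant contributes a full set.
    """
    per_set = images_per_participant()
    return -(-total_images // per_set)  # ceiling division


if __name__ == "__main__":
    print(images_per_participant(), min_participants())
```

Under these assumptions a set has 23 images, so the collection would cover at least 435 participants.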

Diversity & Representation
•Geographic Coverage: Participants from India, Pakistan, Bangladesh, Nepal, Sri Lanka, Bhutan, Maldives, and other South Asian regions
•Demographics: Individuals aged 18 to 70 years, with a gender distribution of 60% male and 40% female
•File Formats: All images are available in JPEG and HEIC formats

Image Quality & Capture Conditions
To ensure model generalization and practical usability, images in this dataset reflect real-world diversity:
•Lighting Conditions: Images captured under various natural and artificial lighting setups
•Backgrounds: A wide range of indoor and outdoor backgrounds
•Device Quality: Captured using modern, high-resolution mobile devices for consistency and clarity

Metadata
Each participant’s dataset is accompanied by rich metadata to support advanced model training and analysis, including:
•Unique participant ID
•File name
•Age at the time of image capture
•Gender
•Country of origin
•Demographic profile
•File format
Use Cases & Applications
This dataset is highly valuable for a wide range of AI and computer vision applications:
•Facial Recognition Systems: Train models for high-accuracy face matching across time
•KYC & Identity Verification: Improve time-spanning verification for banks, insurance, and government services
•Biometric Security Solutions: Build reliable identity authentication models
•Age Progression & Estimation Models: Train AI to predict aging patterns or estimate age from facial features
•Generative AI: Support creation and validation of synthetic age progression or longitudinal face generation

Secure & Ethical Collection
•Platform: All data was securely collected and processed through FutureBeeAI’s proprietary systems
•Ethical Compliance: Full participant consent obtained with transparent communication of use cases
•Privacy-Protected: No personally identifiable information is included; all data is anonymized and handled with care

Dataset Updates & Customization
To keep pace with evolving AI needs, this dataset is regularly updated and customizable. Custom data collection options include:
Urdu TTS Speech Dataset for Speech Synthesis
futurebeeai.com
wav
Updated Aug 1, 2022
Cite
FutureBee AI (2022). Urdu TTS Speech Dataset for Speech Synthesis [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/tts-monolgue-urdu-pakistan
Explore at:
Available download formats: wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Dataset funded by
FutureBeeAI
Description
The Urdu TTS Monologue Speech Dataset is a professionally curated resource built to train realistic, expressive, and production-grade text-to-speech (TTS) systems. It contains studio-recorded long-form speech by trained native Urdu voice artists, each contributing 1 to 2 hours of clean, uninterrupted monologue audio.
Unlike typical prompt-based datasets with short, isolated phrases, this collection features long-form, topic-driven monologues that mirror natural human narration. It includes content types that are directly useful for real-world applications, like audiobook-style storytelling, educational lectures, health advisories, product explainers, digital how-tos, formal announcements, and more.
All recordings are captured in professional studios using high-end equipment and under the guidance of experienced voice directors.
Recording & Audio Quality
•Audio Format: WAV, 48 kHz, available in 16-bit, 24-bit, and 32-bit depth
•SNR: Minimum 30 dB
•Channel: Mono
•Recording Duration: 20-30 minutes
•Recording Environment: Studio-controlled, acoustically treated rooms
•Per Speaker Volume: 1–2 hours of speech per artist
•Quality Control: Each file is reviewed and cleaned for common acoustic issues, including: reverberation, lip smacks, mouth clicks, thumping, hissing, plosives, sibilance, background noise, static interference, clipping, and other artifacts.

Only clean, production-grade audio makes it into the final dataset.
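The stated audio format (48 kHz, mono, 16/24/32-bit) can be verified on a downloaded file. The following is a minimal sketch using Python's standard `wave` module; the function name `check_wav` and the constants are illustrative, not part of any FutureBeeAI tooling. It assumes integer PCM files, since `wave` may reject other encodings such as 32-bit float.

```python
import wave

# Expected properties, taken from the dataset description.
EXPECTED_RATE_HZ = 48_000
EXPECTED_CHANNELS = 1              # mono
ALLOWED_SAMPLE_WIDTHS = {2, 3, 4}  # bytes per sample: 16-, 24-, 32-bit


def check_wav(path: str) -> list[str]:
    """Return a list of mismatches between a WAV header and the stated format.

    Assumption: the file is integer PCM; the wave module raises wave.Error
    for encodings it does not support.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channel count is {wav.getnchannels()}")
        if wav.getsampwidth() not in ALLOWED_SAMPLE_WIDTHS:
            problems.append(f"sample width is {wav.getsampwidth()} bytes")
    return problems
```

An empty list means the header matches the description. This checks header fields only; it says nothing about SNR or the acoustic issues listed above.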
Voice Artist Selection
All voice artists are native Urdu speakers with professional training or prior experience in narration. We ensure a diverse pool in terms of age, gender, and region to bring a balanced and rich vocal dataset.
•Artist Profile:
•Gender: Male and Female
•Age Range: 20–60 years
•Regions: Native Urdu-speaking states from Pakistan
•Selection Process: All artists are screened, onboarded, and sample-approved using FutureBeeAI’s proprietary Yugo platform.

Script Quality & Coverage
Scripts are professionally authored by domain experts to reflect real-world use cases rather than generic or repetitive prompts. They avoid redundancy and include modern vocabulary, emotional range, and phonetically rich sentence structures.
•Word Count per Script: 3,000–5,000 words per 30-minute session
•Content Types:
•Storytelling
•Script and book reading
•Informational explainers
•Government service instructions
•E-commerce tutorials
•Motivational content
•Health & wellness guides
•Education & career advice
•Linguistic Design: Balanced punctuation, emotional range, modern syntax, and vocabulary diversity
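The word-count figure implies an average speaking rate, which can be worked out directly. This is a back-of-the-envelope sketch, assuming each session lasts the full 30 minutes; the function name is illustrative.

```python
# Figures taken from the dataset description.
SESSION_MINUTES = 30
WORDS_PER_SCRIPT_RANGE = (3_000, 5_000)


def words_per_minute(words: int, minutes: float = SESSION_MINUTES) -> float:
    """Average speaking rate implied by a script length and session duration."""
    return words / minutes


if __name__ == "__main__":
    low, high = (words_per_minute(w) for w in WORDS_PER_SCRIPT_RANGE)
    print(f"{low:.0f} to {high:.0f} words per minute")
```

Under that assumption the scripts correspond to roughly 100 to 167 words per minute.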

Transcripts & Alignment
While the script is used during the recording, we also provide post-recording updates to ensure the transcript reflects the final spoken audio. Minor edits are made to adjust for skipped or rephrased words.
•Segmentation: Time-stamped at the sentence level, aligned to actual spoken delivery
•Format: Available in plain text and JSON

•Post-processing:
•Corrected for disfluencies
Pakistan Foreign Exchange Reserves
tradingeconomics.com
pl.tradingeconomics.com
csv, excel, json, xml
Updated Jun 23, 2016
Cite
TRADING ECONOMICS (2016). Pakistan Foreign Exchange Reserves [Dataset]. https://tradingeconomics.com/pakistan/foreign-exchange-reserves
Explore at:
Available download formats: excel, csv, json, xml
Dataset updated
Jun 23, 2016
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 31, 1998 - Jul 31, 2025
Area covered
Pakistan
Description
Foreign Exchange Reserves in Pakistan increased to 19607 USD Million in July from 19269.80 USD Million in June of 2025. This dataset provides the latest reported value for - Pakistan Foreign Exchange Reserves - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
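The month-over-month movement quoted above can be expressed as an absolute and a percentage change. The sketch below uses only the two values given in the description; the helper name `pct_change` is illustrative.

```python
# Values in USD million, as quoted in the description.
JUNE_2025 = 19_269.80
JULY_2025 = 19_607.00


def pct_change(previous: float, current: float) -> float:
    """Percentage change from previous to current."""
    return (current - previous) / previous * 100.0


if __name__ == "__main__":
    delta = JULY_2025 - JUNE_2025
    print(f"change: {delta:.2f} USD million "
          f"({pct_change(JUNE_2025, JULY_2025):.2f}%)")
```

This corresponds to an increase of about 337.2 USD million, or roughly 1.75%.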
Pakistan Total External Debt
tradingeconomics.com
tr.tradingeconomics.com
csv, excel, json, xml
Cite
TRADING ECONOMICS, Pakistan Total External Debt [Dataset]. https://tradingeconomics.com/pakistan/external-debt
Explore at:
Available download formats: xml, excel, json, csv
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jun 30, 2002 - Mar 31, 2025
Area covered
Pakistan
Description
External Debt in Pakistan decreased to 130310 USD Million in the first quarter of 2025 from 130921 USD Million in the fourth quarter of 2024. This dataset provides - Pakistan External Debt - actual values, historical data, forecast, chart, statistics, economic calendar and news.
Pakistan - Internally displaced persons - IDPs
data.amerigeoss.org
data.wu.ac.at
csv
Updated Jul 15, 2021
Cite
UN Humanitarian Data Exchange (2021). Pakistan - Internally displaced persons - IDPs [Dataset]. https://data.amerigeoss.org/sl/dataset/idmc-idp-data-for-pakistan
Explore at:
Available download formats: csv(8362), csv(824)
Dataset updated
Jul 15, 2021
Dataset provided by
United Nations http://un.org/
Area covered
Pakistan
Description
Internally displaced persons are defined according to the 1998 Guiding Principles (http://www.internal-displacement.org/publications/1998/ocha-guiding-principles-on-internal-displacement) as people or groups of people who have been forced or obliged to flee or to leave their homes or places of habitual residence, in particular as a result of armed conflict, or to avoid the effects of armed conflict, situations of generalized violence, violations of human rights, or natural or human-made disasters and who have not crossed an international border.

"People Displaced" refers to the number of people living in displacement as of the end of each year.

"New Displacement" refers to the number of new cases or incidents of displacement recorded, rather than the number of people displaced. This is done because people may have been displaced more than once.

Contains data from IDMC's Global Internal Displacement Database.
Pakistan - Internal Displacements (New Displacements) – IDPs
data.humdata.org
csv
Updated Jul 21, 2025
Cite
Internal Displacement Monitoring Centre (IDMC) (2025). Pakistan - Internal Displacements (New Displacements) – IDPs [Dataset]. https://data.humdata.org/dataset/idmc-idp-data-pak
Explore at:
Available download formats: csv(38268), csv(917)
Dataset updated
Jul 21, 2025
Dataset provided by
Internal Displacement Monitoring Centre http://internal-displacement.org/
License
Attribution 3.0 (CC BY 3.0) https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
Area covered
Pakistan
Description
The Global Internal Displacement Database (GIDD), maintained by the Internal Displacement Monitoring Centre (IDMC), provides comprehensive, validated annual estimates of internal displacement worldwide. It defines internally displaced persons (IDPs) in line with the 1998 Guiding Principles, as people or groups of people who have been forced or obliged to flee or to leave their homes or places of habitual residence, in particular as a result of armed conflict, or to avoid the effects of armed conflict, situations of generalized violence, violations of human rights, or natural or human-made disasters and who have not crossed an international border.

The GIDD tracks two primary metrics: "People Displaced" or population "Stock" figures, which represent the total number of people living in displacement at year-end, and "New Displacement," which counts new displacement incidents (population Flows) rather than individual people, accounting for potential multiple displacements by the same person. This dataset serves as a crucial resource for understanding long-term trends and validated displacement figures globally. For further detailed information and complete API specifications, users are encouraged to consult the official documentation at https://www.internal-displacement.org/database/api-documentation/.

"Internally displaced persons - IDPs" refers to the number of people living in displacement as of the end of each year.

"Internal displacements (New Displacements)" refers to the number of new cases or incidents of displacement recorded, rather than the number of people displaced. This is done because people may have been displaced more than once.
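The difference between counting displacement incidents and counting people can be illustrated with a toy example. The event log below is entirely hypothetical and does not follow the GIDD schema; it only shows why the flow figure can exceed the number of individuals affected. The year-end stock figure also depends on returns and resettlement, which this sketch does not model.

```python
# Hypothetical event log: (person_id, year of displacement incident).
# Person "A" is displaced twice in 2020.
EVENTS = [("A", 2020), ("A", 2020), ("B", 2020), ("C", 2021)]


def new_displacements(events, year: int) -> int:
    """Flow: number of displacement incidents recorded in a year."""
    return sum(1 for _, event_year in events if event_year == year)


def distinct_people_displaced(events, year: int) -> int:
    """Number of distinct individuals with at least one incident in a year."""
    return len({person for person, event_year in events if event_year == year})


if __name__ == "__main__":
    print(new_displacements(EVENTS, 2020))          # incidents
    print(distinct_people_displaced(EVENTS, 2020))  # individuals
```

For 2020 the log contains three incidents but only two individuals, because one person was displaced more than once.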
Pakistan Gold Reserves
tradingeconomics.com
it.tradingeconomics.com
csv, excel, json, xml
Cite
TRADING ECONOMICS, Pakistan Gold Reserves [Dataset]. https://tradingeconomics.com/pakistan/gold-reserves
Explore at:
Available download formats: csv, xml, excel, json
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Mar 31, 2000 - Jun 30, 2025
Area covered
Pakistan
Description
Gold Reserves in Pakistan remained unchanged at 64.75 Tonnes in the second quarter of 2025 from 64.75 Tonnes in the first quarter of 2025. This dataset provides - Pakistan Gold Reserves - actual values, historical data, forecast, chart, statistics, economic calendar and news.
Urdu General Conversation Speech Dataset for ASR
futurebeeai.com
wav
Updated Aug 1, 2022
Cite
FutureBee AI (2022). Urdu General Conversation Speech Dataset for ASR [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/general-conversation-urdu-pakistan
Explore at:
Available download formats: wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Dataset funded by
FutureBeeAI
Description
Introduction
Welcome to the Urdu General Conversation Speech Dataset, a rich, linguistically diverse corpus purpose-built to accelerate the development of Urdu speech technologies. This dataset is designed to train and fine-tune ASR systems, spoken language understanding models, and generative voice AI tailored to real-world Urdu communication.
Curated by FutureBeeAI, this 30-hour dataset offers unscripted, spontaneous two-speaker conversations across a wide array of real-life topics. It enables researchers, AI developers, and voice-first product teams to build robust, production-grade Urdu speech models that understand and respond to authentic Urdu accents and dialects.
Speech Data
The dataset comprises 30 hours of high-quality audio, featuring natural, free-flowing dialogue between native speakers of Urdu. These sessions range from informal daily talks to deeper, topic-specific discussions, ensuring variability and context richness for diverse use cases.
•Participant Diversity:
•Speakers: 60 verified native Urdu speakers from FutureBeeAI’s contributor community.
•Regions: Representing various provinces of Pakistan to ensure dialectal diversity and demographic balance.
•Demographics: A gender ratio of 60% male to 40% female, with participant ages ranging from 18 to 70 years.
•Recording Details:
•Conversation Style: Unscripted, spontaneous peer-to-peer dialogues.
•Duration: Each conversation ranges from 15 to 60 minutes.
•Audio Format: Stereo WAV files, 16-bit depth, recorded at 16 kHz sample rate.
•Environment: Quiet, echo-free settings with no background noise.
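The recording details above are enough to estimate the raw audio size and the plausible number of conversations. The sketch below assumes uncompressed PCM with no header overhead and exactly 30 hours of audio; the function names are illustrative.

```python
# Figures taken from the dataset description.
TOTAL_HOURS = 30
SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2          # 16-bit depth
CHANNELS = 2                  # stereo
CONVERSATION_MINUTES = (15, 60)


def raw_pcm_bytes(hours: float = TOTAL_HOURS) -> int:
    """Uncompressed audio payload size, ignoring WAV headers."""
    bytes_per_second = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS
    return int(hours * 3600 * bytes_per_second)


def conversation_count_bounds(hours: float = TOTAL_HOURS) -> tuple[int, int]:
    """(fewest, most) conversations if all ran at the longest or shortest length."""
    total_minutes = hours * 60
    shortest, longest = CONVERSATION_MINUTES
    return int(total_minutes // longest), int(total_minutes // shortest)


if __name__ == "__main__":
    print(raw_pcm_bytes() / 1e9, "GB")
    print(conversation_count_bounds())
```

Under these assumptions the audio occupies about 6.9 GB, spread over somewhere between 30 and 120 conversations.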

Topic Diversity
The dataset spans a wide variety of everyday and domain-relevant themes. This topic diversity ensures the resulting models are adaptable to broad speech contexts.
•Sample Topics Include:
•Family & Relationships
•Food & Recipes
•Education & Career
•Healthcare Discussions
•Social Issues
•Technology & Gadgets
•Travel & Local Culture
•Shopping & Marketplace Experiences, and many more.
Transcription
Each audio file is paired with a human-verified, verbatim transcription available in JSON format.
•Transcription Highlights:
•Speaker-segmented dialogues
•Time-coded utterances
•Non-speech elements (pauses, laughter, etc.)
•High transcription accuracy, achieved through double QA pass, average WER < 5%
These transcriptions are production-ready, enabling seamless integration into ASR model pipelines or conversational AI workflows.
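Word error rate (WER) is conventionally computed as the word-level edit distance between a reference and a hypothesis transcript, divided by the number of reference words. The sketch below shows that standard definition; it is not FutureBeeAI's QA procedure, and it assumes whitespace tokenization, which may be too simple for Urdu text with inconsistent spacing.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / reference length.

    Assumption: words are separated by whitespace.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1] / len(ref_words)


if __name__ == "__main__":
    print(wer("the cat sat on the mat", "the cat sit on mat"))
```

In the usage example, one substitution and one deletion against six reference words give a WER of 2/6, well above the 5% average quoted for this dataset.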
Metadata
The dataset comes with granular metadata for both speakers and recordings:
•Speaker Metadata: Age, gender, accent, dialect, state/province, and participant ID.
•Recording Metadata: Topic, duration, audio format, device type, and sample rate.

Such metadata helps developers fine-tune model training and supports use-case-specific filtering or demographic analysis.
Usage and Applications
This dataset is a versatile resource for multiple Urdu speech and language AI applications:
•ASR Development: Train accurate speech-to-text systems for Urdu.
•Voice Assistants: Build smart assistants capable of understanding natural Urdu conversations.

Pakistan - Learning and Educational Achievement in Punjab Schools (LEAPS) -...
waterdata3.staging.derilinx.com
Updated Mar 16, 2020
Cite
(2020). Pakistan - Learning and Educational Achievement in Punjab Schools (LEAPS) - 2003 - Dataset - waterdata [Dataset]. https://waterdata3.staging.derilinx.com/dataset/pakistan-learning-and-educational-achievement-punjab-schools-leaps-2003
Explore at:
Dataset updated
Mar 16, 2020
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Punjab, Pakistan
Description
Whether one is in favor of private education or not, it is here to stay and there is a critical need to understand this new environment. Unfortunately, little is known about the private sector and what its growth implies for the provision of education. There are important questions we need to answer before engaging in productive debate about how education can be best provided in the Pakistani context. For instance:
a. Where are private schools setting up? Are they only being established in urban areas and only for the elite?
b. What is the quality of education in private sector schools? How does it compare to public schools?
c. Are the poor being left out? Is the private sector creating two classes of people in Pakistan, those who can afford private education and those who cannot?
d. What is the effect of private schools on government schools?