23 datasets found
  1. Speech Synthesis Data | 400 Hours | TTS Data | Audio Data | AI Training...

    • datarade.ai
    Updated Dec 10, 2023
    Cite
    Nexdata (2023). Speech Synthesis Data | 400 Hours | TTS Data | Audio Data | AI Training Data| AI Datasets [Dataset]. https://datarade.ai/data-products/nexdata-multilingual-speech-synthesis-data-400-hours-a-nexdata
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Dec 10, 2023
    Dataset authored and provided by
    Nexdata
    Area covered
    Canada, Finland, Hong Kong, Colombia, Austria, Sweden, Belgium, Malaysia, Philippines, Singapore
    Description
    1. Specifications

    Format : 44.1 kHz/48 kHz, 16-bit/24-bit, uncompressed WAV, mono channel.

    Recording environment : professional recording studio.

    Recording content : general narrative sentences, interrogative sentences, etc.

    Speaker : native speaker

    Annotation Feature : word transcription, part-of-speech, phoneme boundary, four-level accents, four-level prosodic boundary.

    Device : Microphone

    Language : American English, British English, Japanese, French, Dutch, Cantonese, Canadian French, Australian English, Italian, New Zealand English, Spanish, Mexican Spanish

    Application scenarios : speech synthesis

    Accuracy rate :
    • Word transcription: the sentence accuracy rate is not less than 99%.
    • Part-of-speech annotation: the sentence accuracy rate is not less than 98%.
    • Phoneme annotation: the sentence accuracy rate is not less than 98% (the error rate of voiced and swallowed phonemes is not included, because the labelling is more subjective).
    • Accent annotation: the word accuracy rate is not less than 95%.
    • Prosodic boundary annotation: the sentence accuracy rate is not less than 97%.
    • Phoneme boundary annotation: the phoneme accuracy rate is not less than 95% (the error range of boundary is within 5%).

    2. About Nexdata

    Nexdata owns off-the-shelf PB-level Large Language Model (LLM) Data, 1 million hours of Audio Data and 800 TB of Annotated Imagery Data. These ready-to-go AI & ML Training Data support instant delivery and quickly improve the accuracy of AI models. For more details, please visit us at https://www.nexdata.ai/datasets/tts?source=Datarade
  2. Gesture Recognition Data |10,000 ID | Computer Vision Data| AI Training Data...

    • data.nexdata.ai
    Updated Aug 16, 2024
    Cite
    Nexdata (2024). Gesture Recognition Data |10,000 ID | Computer Vision Data| AI Training Data | Machine Learning (ML) Data [Dataset]. https://data.nexdata.ai/products/nexdata-gesture-recognition-data-10-000-id-image-ai-m-nexdata
    Dataset updated
    Aug 16, 2024
    Dataset authored and provided by
    Nexdata
    Area covered
    India, Afghanistan, Saudi Arabia, Iraq, Luxembourg, United States, Singapore, Russian Federation, Cambodia, Canada
    Description

    Off-the-shelf gesture recognition data covers multiple scenes, such as conference, in-car and home. All the machine learning (ML) data is collected with signed authorization agreements.

  3. Artificial Intelligence in Small and Medium Business Market Report | Global...

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 23, 2024
    Cite
    Dataintelo (2024). Artificial Intelligence in Small and Medium Business Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-artificial-intelligence-in-small-and-medium-business-market
    Available download formats: pptx, pdf, csv
    Dataset updated
    Sep 23, 2024
    Authors
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Artificial Intelligence in Small and Medium Business Market Outlook



    The global market for Artificial Intelligence (AI) in Small and Medium Businesses (SMBs) is projected to grow from USD 15 billion in 2023 to USD 58 billion by 2032, reflecting a compound annual growth rate (CAGR) of 16%. A key driver of this expansion is the rising adoption of AI technologies to enhance operational efficiency and customer engagement among SMBs.
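
The growth figures quoted above can be sanity-checked with the standard CAGR formula, (end / start) ** (1 / years) - 1. The following is a minimal sketch, not part of the report: the function names `cagr` and `project` are hypothetical, and it assumes nine compounding periods (2023 to 2032), which the report does not state explicitly.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate as a fraction (0.16 means 16%)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, rate: float, years: int) -> float:
    """Grow start_value at a fixed annual rate for the given number of years."""
    return start_value * (1 + rate) ** years


# Assumption: 2023 is the base year and 2032 the end year, i.e. 9 periods.
periods = 2032 - 2023

implied = cagr(15.0, 58.0, periods)      # rate implied by USD 15B -> USD 58B
projected = project(15.0, 0.16, periods)  # value reached at exactly 16% per year

print(f"Implied CAGR over {periods} years: {implied:.1%}")
print(f"USD 15 billion at 16% per year for {periods} years: {projected:.1f} billion")
```

Under these assumptions the implied rate comes out slightly above 16%, so the quoted start value, end value, and CAGR are consistent with each other up to rounding.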



    One of the critical growth factors driving the AI in SMB market is the rapid advancement and increased accessibility of AI technologies. The cost of deploying AI solutions has dramatically decreased over the years, making it feasible for smaller enterprises to leverage these technologies without substantial financial burdens. This democratization of AI is enabling SMBs to compete with larger corporations by automating routine tasks, gaining insights from big data, and enhancing customer service through intelligent chatbots and personalized marketing strategies. The proliferation of AI-powered tools tailored specifically for SMBs further propels market growth, allowing businesses to optimize their operations and derive maximum value from their investments.



    Another significant growth factor is the growing awareness and understanding of AI's potential among SMB owners and decision-makers. Many SMBs are recognizing that adopting AI technologies is not merely a trend but a necessity to stay competitive in a rapidly evolving market landscape. By integrating AI into various business functions, SMBs can streamline their processes, reduce operational costs, and improve decision-making capabilities. The increasing availability of educational resources, training programs, and consulting services focused on AI is empowering SMBs to embark on their AI journey with confidence, thereby contributing to market expansion.



    The surge in digital transformation initiatives among SMBs is also a crucial driver of market growth. In the wake of the COVID-19 pandemic, businesses worldwide have accelerated their digital transformation efforts to adapt to new market dynamics and customer behaviors. AI plays a pivotal role in this transformation by enabling SMBs to digitize their operations, enhance customer experiences, and create new business models. The integration of AI with other emerging technologies, such as the Internet of Things (IoT) and blockchain, is further expanding the horizons for SMBs, opening up new avenues for innovation and growth.



    Regionally, North America holds a dominant position in the AI in SMB market, driven by the early adoption of advanced technologies and robust support infrastructure. The presence of key market players and a strong ecosystem of AI startups and research institutions further bolster the region's growth. Europe follows closely, with significant investments in AI research and development, particularly in the healthcare and manufacturing sectors. The Asia Pacific region is expected to witness the highest growth rate during the forecast period, fueled by the rapid digitalization of economies and increasing government initiatives to promote AI adoption among SMBs. Latin America and the Middle East & Africa, though at nascent stages, are also showing promising signs of growth, driven by increasing awareness and investment in AI technologies.



    Component Analysis



    The components segment of the AI in SMB market comprises software, hardware, and services, each playing a vital role in the deployment and functioning of AI solutions. The software segment holds a significant share of the market, driven by the increasing adoption of AI-powered applications and platforms. These software solutions range from AI algorithms and models to complex systems that integrate machine learning, natural language processing, and computer vision. SMBs are increasingly leveraging AI software to automate their operations, enhance customer experiences, and gain actionable insights from data. The availability of off-the-shelf AI software tailored for various business functions has made it easier for SMBs to implement AI without requiring extensive technical expertise.



    The hardware segment, though smaller compared to software, is crucial for the effective deployment of AI solutions. This segment includes AI accelerators, GPUs, and other specialized hardware components designed to support the high computational demands of AI applications. As AI models become more complex and data-intensive, the need for advanced hardware solutions becomes imperative. SMBs are increasingly investing in AI ha

  4. In-Cabin Speech Data | 15,000 Hours | AI Training Data | Speech Recognition...

    • datarade.ai
    Updated Dec 14, 2023
    Cite
    Nexdata (2023). In-Cabin Speech Data | 15,000 Hours | AI Training Data | Speech Recognition Data | Audio Data |Natural Language Processing (NLP) Data [Dataset]. https://datarade.ai/data-products/nexdata-in-car-speech-data-15-000-hours-audio-ai-ml-t-nexdata
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Dec 14, 2023
    Dataset authored and provided by
    Nexdata
    Area covered
    Turkey, Egypt, Austria, Romania, Switzerland, Russian Federation, Argentina, Poland, Netherlands, Germany
    Description
    1. Specifications

    Format : Audio format: 48 kHz, 16-bit, uncompressed WAV, mono channel; Video format: MP4

    Recording Environment : In-car; 1 quiet scene, 1 low noise scene, 3 medium noise scenes and 2 high noise scenes

    Recording Content : It covers 5 fields: navigation, multimedia, telephone, car control, and question and answer; 500 sentences per person

    Speaker : Speakers are evenly distributed across all age groups, covering children, teenagers, middle-aged, elderly, etc.

    Device : High fidelity microphone; Binocular camera

    Language : 20 languages

    Transcription content : text

    Accuracy rate : 98%

    Application scenarios : speech recognition; human-computer interaction; natural language processing and text analysis; visual content understanding, etc.

    2. About Nexdata

    Nexdata owns off-the-shelf PB-level Large Language Model (LLM) Data, 1 million hours of Audio Data and 800 TB of Annotated Imagery Data. These ready-to-go Natural Language Processing (NLP) Data support instant delivery and quickly improve the accuracy of AI models. For more details, please visit us at https://www.nexdata.ai/datasets/speechrecog?source=Datarade
  5. Driver & Passenger Behavior Data | 100,000 ID | DMS & OMS Data| Image AI...

    • datarade.ai
    Updated Dec 8, 2023
    Cite
    Nexdata (2023). Driver & Passenger Behavior Data | 100,000 ID | DMS & OMS Data| Image AI Training Data | Annotated Imagery Data| AI Datasets [Dataset]. https://datarade.ai/data-products/nexdata-driver-passenger-behavior-data-100-000-id-im-nexdata
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Dec 8, 2023
    Dataset authored and provided by
    Nexdata
    Area covered
    Serbia, Trinidad and Tobago, Denmark, Korea (Republic of), Cuba, Latvia, Japan, Malta, Armenia, Greece
    Description
    1. Specifications

    Data size : 100,000 ID

    Population distribution : gender distribution: balanced; race distribution: Caucasian, Black, Indian, Asian; age distribution: aged from 18 to 60

    Collection environment : In-car Cameras

    Collection diversity : multiple races, multiple age groups, multiple time periods and behaviors (dangerous behavior, fatigue behavior, visual movement behavior)

    Device : binocular camera with RGB and infrared channels; the resolution is 640x480

    Collection time : day, evening and night

    Image parameter : the video format is .avi

    Accuracy : the accuracy of each person's actions is greater than 95%; the accuracy of label annotation is not less than 95%

    2. About Nexdata

    Nexdata owns off-the-shelf PB-level Large Language Model (LLM) Data, 1 million hours of Audio Data and 800 TB of Annotated Imagery Data. These ready-to-go Annotated Imagery Data support instant delivery and quickly improve the accuracy of AI models. For more details, please visit us at https://www.nexdata.ai/datasets/computervision?source=Datarade
  6. Re-ID Data | 600,000 ID | CCTV Data |Computer Vision Data| Identity Data| AI...

    • data.nexdata.ai
    Updated Aug 3, 2024
    Cite
    Nexdata (2024). Re-ID Data | 600,000 ID | CCTV Data |Computer Vision Data| Identity Data| AI Datasets [Dataset]. https://data.nexdata.ai/products/nexdata-re-id-data-60-000-id-image-video-ai-ml-train-nexdata
    Dataset updated
    Aug 3, 2024
    Dataset authored and provided by
    Nexdata
    Area covered
    Singapore, Thailand, Iran, Tajikistan, Bahrain, Panama, Italy, Ecuador, Hungary, Brazil
    Description

    Off-the-shelf Re-ID data is collected from real surveillance scenes. The Identity Data diversity includes different age groups, different time periods, different shooting angles, different human body orientations and postures, clothing for different seasons.

  7. Gesture Recognition Data |10,000 ID | Computer Vision Data| AI Training Data...

    • datarade.ai
    Updated Dec 22, 2023
    Cite
    Nexdata (2023). Gesture Recognition Data |10,000 ID | Computer Vision Data| AI Training Data | Machine Learning (ML) Data [Dataset]. https://datarade.ai/data-products/nexdata-gesture-recognition-data-10-000-id-image-ai-m-nexdata
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Dec 22, 2023
    Dataset authored and provided by
    Nexdata
    Area covered
    Colombia, Tajikistan, Cyprus, Sri Lanka, Chile, Iceland, Mongolia, Bosnia and Herzegovina, Nicaragua, Belarus
    Description
    1. Specifications

    Data size : 10,000 ID

    Race distribution : Asian, Caucasian, Black, Brown

    Gender distribution : male, female

    Age distribution : from teenagers to the elderly, mainly young and middle-aged

    Collection environment : indoor office scenes, in-car, conference, etc.

    Collection diversity : different gestures, different races, different age groups, different scenes

    Collection equipment : cellphone, laptop camera, in-car camera

    Data format : .mp4, .mov, .jpg

    Accuracy rate : the accuracy of the actions exceeds 97%; the accuracy of action naming is more than 97%

    2. About Nexdata

    Nexdata owns off-the-shelf PB-level Large Language Model (LLM) Data, 1 million hours of Audio Data and 800 TB of Annotated Imagery Data. These ready-to-go machine learning (ML) data support instant delivery and quickly improve the accuracy of AI models. For more details, please visit us at https://www.nexdata.ai/datasets/computervision?source=Datarade
  8. Driver & Passenger Behavior Data | 100,000 ID | DMS & OMS Data| Image AI...

    • data.nexdata.ai
    Updated Aug 16, 2024
    Cite
    Nexdata (2024). Driver & Passenger Behavior Data | 100,000 ID | DMS & OMS Data| Image AI Training Data | Annotated Imagery Data| AI Datasets [Dataset]. https://data.nexdata.ai/products/nexdata-driver-passenger-behavior-data-100-000-id-im-nexdata
    Dataset updated
    Aug 16, 2024
    Dataset authored and provided by
    Nexdata
    Area covered
    United States, Switzerland, Romania, Egypt, Portugal, Bolivia, Ecuador, Pakistan, Kyrgyzstan, Bahamas
    Description

    Off-the-shelf driver & passenger behavior data is Annotated Imagery Data that includes multiple ages, multiple time periods and multiple races (Caucasian, Black, Indian). The driver behaviors include dangerous behavior, fatigue behavior and visual movement behavior.

  9. Dataset for "Forecasting Coastal Hypoxia Using a Blend of Numeric and...

    • hydroshare.org
    zip
    Updated Jul 9, 2024
    Cite
    Yanda Ou (2024). Dataset for "Forecasting Coastal Hypoxia Using a Blend of Numeric and Artificial Intelligence Models" [Dataset]. https://www.hydroshare.org/resource/0cc2093cc67543fd82b06d6b5b9c79e4
    Available download formats: zip (1.4 GB)
    Dataset updated
    Jul 9, 2024
    Dataset provided by
    HydroShare
    Authors
    Yanda Ou
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    May 1, 2007 - Aug 26, 2020
    Description

    Dataset (split into 14 files due to file size limitations) for training, validating, and evaluating an AI model developed for daily hypoxia prediction on the Louisiana-Texas shelf. The dataset is derived from a coupled hydrodynamic-biogeochemical model embedded in the Regional Ocean Modeling System. External datasets come from three independent hydrodynamic forecast products: HYCOM, NEMO, and FVCOM.

  10. Dataset - Enhancing Brick-and-Mortar Shopping Experience Through Explainable...

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Apr 28, 2021
    Cite
    Robert Zimmermann; Daniel Mora; Douglas Cirqueira (2021). Dataset - Enhancing Brick-and-Mortar Shopping Experience Through Explainable Artificial Intelligence in a Smartphone-based Augmented Reality Shopping Assistant Application [Dataset]. http://doi.org/10.5281/zenodo.4723468
    Available download formats: bin
    Dataset updated
    Apr 28, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Robert Zimmermann; Daniel Mora; Douglas Cirqueira
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a dataset obtained from an online survey conducted in August 2020.

    In the survey, participants were introduced to the concept of a smartphone-based shopping assistant application with the help of pictures and videos when shopping with and without the application. Participants were presented with three different shopping scenarios. In each scenario, we showed products on a shelf (groceries, luxury chocolate, shoes, books). The first shopping scenario was a regular shopping scenario (RSS), the second was an augmented reality shopping scenario (ARSS), and the third was an augmented reality shopping scenario with explainable AI features (XARSS). For each scenario participants had to answer questions about how they perceived the scenario and how it influenced their overall purchase intention.

    The present work was conducted within the Innovative Training Network project PERFORM funded by the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No. 765395. The EU Research Executive Agency is not responsible for any use that may be made of the information it contains.

  11. Multi-race Human Face Data | 200,000 ID | Face Recognition Data| Image/Video...

    • data.nexdata.ai
    Updated Aug 3, 2024
    Cite
    Nexdata (2024). Multi-race Human Face Data | 200,000 ID | Face Recognition Data| Image/Video AI Training Data | Biometric AI Datasets [Dataset]. https://data.nexdata.ai/products/nexdata-multi-race-human-face-data-200-000-id-image-vi-nexdata
    Dataset updated
    Aug 3, 2024
    Dataset authored and provided by
    Nexdata
    Area covered
    Austria, Romania, India, Afghanistan, Uzbekistan, Saudi Arabia, Montenegro, Brazil, Hong Kong, Turkmenistan
    Description

    Off-the-shelf biometric data (human face) covers 3D depth, segmentation (facial organs and accessories), key points, facial expressions, alpha matte, a variety of ages, etc. All the Biometric Data are collected with signed authorization agreements.

  12. Natural Scene and Handwriting OCR Data | 500,000 Images| Computer Vision...

    • data.nexdata.ai
    Updated Aug 3, 2024
    Cite
    Nexdata (2024). Natural Scene and Handwriting OCR Data | 500,000 Images| Computer Vision Data| AI Datasets [Dataset]. https://data.nexdata.ai/products/nexdata-ocr-data-500-000-images-ai-ml-training-data-nexdata
    Dataset updated
    Aug 3, 2024
    Dataset authored and provided by
    Nexdata
    Area covered
    Austria, Japan, Australia, Norway, Spain, Hong Kong, Mexico, Pakistan, Peru, Egypt
    Description

    Off-the-shelf OCR data includes natural-scene and handwriting images, covering 20 languages, multiple natural scenes, and multiple photographic angles.

  13. egyptian-arabic-general-conversation-customer-speech-dataset

    • huggingface.co
    Updated Jul 17, 2025
    Cite
    Macgence (2025). egyptian-arabic-general-conversation-customer-speech-dataset [Dataset]. https://huggingface.co/datasets/Macgence/egyptian-arabic-general-conversation-customer-speech-dataset
    Dataset updated
    Jul 17, 2025
    Authors
    Macgence
    License

    https://choosealicense.com/licenses/other/

    Description

    🧠 About This OTS Dataset

    With an extensive 250-hour collection of high-quality general conversation audio recordings, this off-the-shelf (OTS) dataset empowers researchers and developers to advance Natural Language Processing (NLP), Conversational AI, and Generative Voice AI models across diverse sectors. Whether in finance, healthcare, retail, or any other domain, this dataset provides a rich training and evaluation resource for AI systems.

      📊 Metadata Availability:… See the full description on the dataset page: https://huggingface.co/datasets/Macgence/egyptian-arabic-general-conversation-customer-speech-dataset.
    
  14. Re-ID Data | 600,000 ID | CCTV Data |Computer Vision Data| Identity Data| AI...

    • datarade.ai
    Updated Dec 8, 2023
    Cite
    Nexdata (2023). Re-ID Data | 600,000 ID | CCTV Data |Computer Vision Data| Identity Data| AI Datasets [Dataset]. https://datarade.ai/data-products/nexdata-re-id-data-60-000-id-image-video-ai-ml-train-nexdata
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Dec 8, 2023
    Dataset authored and provided by
    Nexdata
    Area covered
    United Arab Emirates, Trinidad and Tobago, Portugal, Turkmenistan, Luxembourg, Ecuador, Russian Federation, Sri Lanka, Cuba, Bolivia (Plurinational State of)
    Description
    1. Specifications

    Data size : 60,000 ID

    Population distribution : race distribution: Asian, Caucasian and Black; gender distribution: male and female; age distribution: from children to the elderly

    Collecting environment : indoor and outdoor scenes (such as supermarkets, malls and residential areas)

    Data diversity : different ages, different time periods, different cameras, different human body orientations and postures, different collecting environments

    Device : surveillance cameras; the image resolution is not less than 1,920 x 1,080

    Data format : the image data format is .jpg, the annotation file format is .json

    Annotation content : human body rectangular bounding boxes, 15 human body attributes

    Quality Requirements : A rectangular bounding box of a human body is qualified when the deviation is not more than 3 pixels, and the qualified rate of the bounding boxes shall not be lower than 97%; annotation accuracy of attributes is over 97%

    2. About Nexdata

    Nexdata owns off-the-shelf PB-level Large Language Model (LLM) Data, 1 million hours of Audio Data and 800 TB of Annotated Imagery Data. These ready-to-go Identity Data support instant delivery and quickly improve the accuracy of AI models. For more details, please visit us at https://www.nexdata.ai/datasets/computervision?source=Datarade
  15. Unsupervised Speech Data |1 Million Hours | Spontaneous Speech | LLM |...

    • data.nexdata.ai
    Updated Feb 13, 2025
    Cite
    Nexdata (2025). Unsupervised Speech Data |1 Million Hours | Spontaneous Speech | LLM | Pre-training |Large Language Model(LLM) Data [Dataset]. https://data.nexdata.ai/products/nexdata-multilingual-unsupervised-speech-data-1-million-ho-nexdata
    Dataset updated
    Feb 13, 2025
    Dataset authored and provided by
    Nexdata
    Area covered
    France
    Description

    Off-the-shelf unsupervised speech dataset of 1 million hours, covering 10+ languages (English, French, German, Japanese, Arabic, Mandarin, etc.; 100,000 hours each). The content covers dialogues or monologues in 28 common domains, such as daily vlogs, travel, podcast, technology, beauty, etc.

  16. Data from: Recommendations for quantifying and reducing uncertainty in...

    • explore.openaire.eu
    • search.dataone.org
    • +2 more
    Updated Sep 9, 2022
    Cite
    Stephanie Brodie (2022). Recommendations for quantifying and reducing uncertainty in climate projections of species distributions [Dataset]. http://doi.org/10.7291/d1jq2k
    Dataset updated
    Sep 9, 2022
    Authors
    Stephanie Brodie
    Description

    Summary

    We used a combination of regional ocean climate projections and simulated species distributions (Leroy et al., 2016) to quantify sources of uncertainty in projections of spatially-explicit biomass for three species archetypes in the CCS (1985-2100; Fig. 1). Species archetypes were simplified representations of three general groups of marine finfish found in the CCS that comprise ecologically and/or economically important fisheries and that might be expected to show variable patterns of redistribution under climate change based on their habitat preferences, population dynamics, and mobility characteristics: 1) a highly migratory species (HMS) that was designed to resemble north Pacific albacore; 2) a coastal pelagic species (CPS) that was designed to resemble northern anchovy (CPS); and 3) a groundfish species (GFS) that was designed to resemble sablefish.

    SDMs (n=15; Figure 1) were then fitted to simulated biomass data for each archetype (training period 1985-2010) and projected from 2011-2100 using each of the three regional ocean climate models. Our framework resulted in 252 SDMs (15 SDM types, three species archetypes, three ESMs, and two environmental parameter simulations; Figure 1). To address our study goal of assessing SDM performance and understanding sources of uncertainty in species distribution projections, we compared the output of SDM projections against simulated “observations” for 2011-2100 and quantified the uncertainty introduced by the climate projection (ESM uncertainty) versus the uncertainty introduced by the SDM structure (SDM uncertainty).
    Environmental Covariates from Regional Ocean Projections

    Environmental covariates used in species distribution simulations were obtained from regional ocean projections (Pozo Buil et al., 2021) forced by three ESMs from phase 5 of the Coupled Model Intercomparison Project (CMIP5) archive: Geophysical Fluid Dynamics Laboratory (GFDL) ESM2M, Hadley Center HadGEM2-ES (HAD), and Institut Pierre Simon Laplace (IPSL) CM5A-MR. These ESMs, hereafter referred to as GFDL, HAD, and IPSL, span the approximate range of potential changes in physical and biogeochemical conditions across all CMIP5 models (Pozo Buil et al., 2021). ESMs were downscaled using the Regional Ocean Modelling System (ROMS) coupled with a biogeochemical model (NEMUCSC) (Fiechter et al., 2018, 2021) based on the North Pacific Ecosystem Model for Understanding Regional Oceanography (NEMURO) (Kishi et al., 2007). The ROMS domain spans the CCS from 30-48°N and from the coast to 134°W at 0.1° horizontal resolution with 42 terrain-following vertical layers (Figure 2).

    Each downscaled ESM used the Representative Concentration Pathway (RCP) 8.5 climate change scenario. While we only examined RCP 8.5, it should be noted that using RCPs 2.6 and 4.5 would result in only minor differences in the spread of future environmental change for the variables and ESMs examined here. Specifically, uncertainty in biogeochemical change among the chosen ESMs in RCP8.5 envelops the uncertainty among RCPs 2.6 and 4.5; while for temperature GFDL and HAD represent opposite ends of the spectrum for the projected magnitude of warming in the CMIP5 ensemble (Drenkard et al., 2021; Pozo Buil et al., 2021). As such, we do not explore scenario uncertainty.
    Environmental covariates used in species distribution simulations were sea surface temperature (SST; °C), bottom temperature (BT; °C), bottom oxygen (BO; mmol m-3), mixed layer depth (MLD; m), surface chlorophyll-a (Chl-a; mg m-3), and zooplankton concentration integrated over 50 m (zoo_50; mmol N m-2) and 200 m (zoo_200; mmol N m-2). These environmental covariates were averaged over spring months (March-May) annually (1985-2100) to encompass the seasonal period when ocean productivity is most influential on the long-term population dynamics of most marine fishes in the CCS.

    Operating Models: Simulated Species Biomass

    Biomass distributions for three species archetypes were simulated on the ROMS grid for each year and each ESM from 1985-2100. Simulations were run using the ‘virtualspecies’ R package (Leroy et al., 2016) that is specifically designed to reflect real-world ecological properties and species-environment relationships (Meynard et al., 2019). We refer to these simulated species distributions as ‘operating models’. Species simulations used a two-step process. First, habitat suitability was calculated based on environmental data and specified species’ habitat preferences (Table S1). Environmental preferences used to force species distributions varied among species archetypes based on representative life histories (see Supplementary Material). The domain for the HMS archetype was set to the entire CCS, whereas the CPS and GFS archetypes were reduced to inshore waters to reflect the CPS archetype’s preference for pelagic waters over the continental shelf and slope, and the GFS archetype’s p...

  17. Multi-race Human Face Data | 200,000 ID | Face Recognition Data| Image/Video...

    • datarade.ai
    Updated Dec 22, 2023
    Cite
    Nexdata (2023). Multi-race Human Face Data | 200,000 ID | Face Recognition Data| Image/Video AI Training Data | Biometric AI Datasets [Dataset]. https://datarade.ai/data-products/nexdata-multi-race-human-face-data-200-000-id-image-vi-nexdata
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Dec 22, 2023
    Dataset authored and provided by
    Nexdata
    Area covered
    Iran (Islamic Republic of), Chile, Bulgaria, Bosnia and Herzegovina, Canada, Lao People's Democratic Republic, Cambodia, Belarus, Mexico, Germany
    Description
    1. Specifications

    Product : Biometric Data

    Data size : 200,000 ID

    Race distribution : Black, Caucasian, brown (Mexican), Indian and Asian people

    Gender distribution : gender balance

    Age distribution : young, midlife and senior

    Collecting environment : including indoor and outdoor scenes

    Data diversity : different face poses, races, ages, light conditions and scenes

    Device : cellphone

    Data format : .jpg/png

    Accuracy : label accuracy for face pose, race, gender and age is more than 97%

    2. About Nexdata

    Nexdata owns off-the-shelf PB-level Large Language Model (LLM) data, 1 million hours of audio data and 800 TB of annotated imagery data. This ready-to-go biometric data supports instant delivery and can quickly improve the accuracy of AI models. For more details, please visit https://www.nexdata.ai/datasets/computervision?source=Datarade
  18. Civil Engineering and Architectural Software Report

    • marketresearchforecast.com
    doc, pdf, ppt
    Updated Mar 4, 2025
    Cite
    Market Research Forecast (2025). Civil Engineering and Architectural Software Report [Dataset]. https://www.marketresearchforecast.com/reports/civil-engineering-and-architectural-software-27315
    Available download formats: doc, pdf, ppt
    Dataset updated
    Mar 4, 2025
    Dataset authored and provided by
    Market Research Forecast
    License

    https://www.marketresearchforecast.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The global Civil Engineering and Architectural Software market is experiencing robust growth, driven by increasing infrastructure development globally, the adoption of Building Information Modeling (BIM) methodologies, and a rising demand for efficient project management solutions. The market, estimated at $15 billion in 2025, is projected to exhibit a Compound Annual Growth Rate (CAGR) of 8% from 2025 to 2033, reaching approximately $28 billion by 2033. This growth is fueled by several key factors. Firstly, governments worldwide are investing heavily in infrastructure projects, necessitating sophisticated software for design, analysis, and project management. Secondly, the increasing adoption of BIM, which facilitates collaborative design and minimizes errors, is driving demand for advanced software capabilities. Furthermore, the construction industry's ongoing digital transformation necessitates software that integrates various aspects of project delivery, including cost budgeting, materials acquisition, and resource allocation, leading to greater efficiency and reduced project timelines. The off-the-shelf software segment currently holds the largest market share, but tailored solutions are witnessing significant growth due to the increasing need for customized functionalities to meet specific project requirements. Key regional markets include North America and Europe, which together account for a significant portion of the global market share. However, the Asia-Pacific region is expected to experience the fastest growth in the coming years, driven by rapid urbanization and significant infrastructure development in countries like China and India. While the market faces some restraints, such as the high initial investment costs of software and the need for skilled professionals to operate them, these challenges are being addressed by vendors offering flexible licensing models and robust training programs. 
The competitive landscape is characterized by a mix of established players and emerging innovative companies, fostering a dynamic and innovative market. The ongoing development of Artificial Intelligence (AI) and Machine Learning (ML) technologies within the software is poised to further revolutionize the industry by enabling predictive analytics and automated design processes.
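
The growth figures quoted above can be checked with a short compound-growth calculation. The sketch below is illustrative only: the function name `project_market_size` is hypothetical, and it assumes the 8% CAGR compounds once per year over the eight years from 2025 to 2033, starting from the stated $15 billion.

```python
def project_market_size(base_value: float, cagr: float, years: int) -> float:
    """Compound base_value forward by `years` at a constant annual rate `cagr`."""
    return base_value * (1.0 + cagr) ** years


# Figures quoted in the description: $15 billion in 2025, 8% CAGR, through 2033.
# Assumption: annual compounding over 2033 - 2025 = 8 periods.
projected_2033 = project_market_size(15.0, 0.08, 2033 - 2025)
print(f"Projected 2033 market size: ${projected_2033:.1f} billion")
```

Under these assumptions the result is about $27.8 billion, consistent with the "approximately $28 billion" stated in the description.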

  19. Smart Eye Dataset

    • universe.roboflow.com
    zip
    Updated Apr 20, 2025
    Cite
    smarteye (2025). Smart Eye Dataset [Dataset]. https://universe.roboflow.com/smarteye-bls2k/smart-eye-syxsc/dataset/1
    Available download formats: zip
    Dataset updated
    Apr 20, 2025
    Dataset authored and provided by
    smarteye
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Variables measured
    Objects Bounding Boxes
    Description

    Smart-Eye: AI Model for Empty Shelf Detection

    Smart-Eye is an AI-powered computer vision system developed to monitor and detect empty or partially empty shelves in retail environments such as shopping malls and supermarkets. The primary objective of Smart-Eye is to automate shelf inventory checks, reduce stock-out situations, and improve overall customer experience by ensuring timely restocking.

    Smart-Eye utilizes advanced deep learning algorithms, particularly convolutional neural networks (CNNs), trained on thousands of annotated images representing various shelf conditions — from fully stocked to completely empty. The system processes real-time footage from in-store cameras to identify shelf areas that are understocked or vacant.

    Key features of Smart-Eye include:

    Real-Time Monitoring: Smart-Eye continuously analyzes live video feeds from installed surveillance cameras across the store, allowing for immediate identification of empty or low-stock shelves.

    High Accuracy Detection: Through rigorous training and validation, Smart-Eye achieves high precision in distinguishing between full, partially filled, and empty shelf segments, even under varying lighting conditions and camera angles.

    Scalable Architecture: The system is built to support scalability, allowing it to be deployed in large retail chains with hundreds of cameras and diverse store layouts.

    Integration Capabilities: Smart-Eye can be integrated with existing inventory management systems. Upon detecting an empty shelf, it can trigger automatic alerts to store staff or initiate restocking workflows.

    Continuous Learning: The model is designed to improve over time. It supports feedback loops where human validations of shelf conditions can be fed back into the training data to enhance accuracy.

    By leveraging Smart-Eye, retailers can minimize the risk of product unavailability, optimize inventory control, and enhance operational efficiency. In turn, this leads to improved customer satisfaction and increased sales opportunities.

  20. Hispanic Facial Images Dataset | Selfie & ID Card Images

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Hispanic Facial Images Dataset | Selfie & ID Card Images [Dataset]. https://www.futurebeeai.com/dataset/image-dataset/facial-images-selfie-id-hispanic
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the Hispanic Human Facial Images Dataset, meticulously curated to enhance face recognition models and support the development of advanced biometric identification systems, KYC models, and other facial recognition technologies.

    Facial Image Data

    This dataset comprises over 1500 Hispanic individual facial image sets, with each set including:

    Selfie Images: 5 different high-quality selfie images per individual.
    ID Card Images: 2 high-quality images of the individual’s face from different ID cards.

    Diversity and Representation

    The dataset includes contributions from a diverse network of individuals across Hispanic countries.

    Geographical Representation: Participants from Hispanic countries, including Argentina, Brazil, Costa Rica, Ecuador, Colombia, Peru, and more.
    Participant Profile: Participants range from 18 to 70 years old, representing both males and females in a 60:40 ratio, respectively.
    File Format: The dataset contains images in JPEG and HEIC file formats.

    Quality and Conditions

    To ensure high utility and robustness, all images are captured under varying conditions:

    Lighting Conditions: Images are taken in different lighting environments to ensure variability and realism.
    Backgrounds: A variety of backgrounds are available to enhance model generalization.
    Device Quality: Photos are taken using the latest mobile devices to ensure high resolution and clarity.

    Metadata

    Each facial image set is accompanied by detailed metadata for each participant, including:

    Unique Identifier
    File Name
    Age
    Gender
    Country
    Demographic Information
    File Format

    This metadata is essential for training models that can accurately recognize and identify faces across different demographics and conditions.

    Usage and Applications

    This facial image dataset is ideal for various applications in the field of computer vision, including but not limited to:

    Facial Recognition Models: Improving the accuracy and reliability of facial recognition systems.
    KYC Models: Streamlining the identity verification processes for financial and other services.
    Biometric Identity Systems: Developing robust facial biometric identification solutions.
    Age Prediction Models: Training models to accurately predict the age of individuals based on facial features.
    Generative AI Models: Training generative AI models to create realistic and diverse synthetic facial images.

    Secure and Ethical Collection

    Data was securely stored and processed within our platform, ensuring data security and confidentiality.
    The biometric data collection process adhered to strict ethical guidelines, ensuring the privacy and consent of all participants.
    All participants were informed of the purpose of collection and the potential use of the data, and agreed through written consent. Applicable demographic-related regulations were also taken into account.

    Updates and Customization

    We understand the evolving nature of AI and machine learning requirements. Therefore, we continuously add more assets with diverse conditions to this off-the-shelf facial image dataset.
