Recording environment : professional recording studio.
Recording content : general narrative sentences, interrogative sentences, etc.
Speaker : native speaker
Annotation Feature : word transcription, part-of-speech, phoneme boundary, four-level accents, four-level prosodic boundary.
Device : Microphone
Language : American English, British English, Japanese, French, Dutch, Cantonese, Canadian French, Australian English, Italian, New Zealand English, Spanish, Mexican Spanish
Application scenarios : speech synthesis
Accuracy rate: Word transcription: the sentence accuracy rate is not less than 99%. Part-of-speech annotation: the sentence accuracy rate is not less than 98%. Phoneme annotation: the sentence accuracy rate is not less than 98% (the error rate of voiced and swallowed phonemes is not included, because the labelling is more subjective). Accent annotation: the word accuracy rate is not less than 95%. Prosodic boundary annotation: the sentence accuracy rate is not less than 97%. Phoneme boundary annotation: the phoneme accuracy rate is not less than 95% (the error range of boundary is within 5%).
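The thresholds above mix sentence-level and word-level rates. The sketch below shows one way the two could be computed. It assumes (this is not stated in the listing) that a sentence counts as correct only when every label in it matches a reference annotation, and that annotated and reference sequences are already aligned item by item. The function names and sample data are hypothetical.

```python
from typing import Sequence


def sentence_accuracy(annotated: Sequence[Sequence[str]],
                      reference: Sequence[Sequence[str]]) -> float:
    """Fraction of sentences whose labels all match the reference (assumed definition)."""
    if len(annotated) != len(reference) or not reference:
        raise ValueError("need the same, non-zero number of sentences")
    correct = sum(list(a) == list(r) for a, r in zip(annotated, reference))
    return correct / len(reference)


def item_accuracy(annotated: Sequence[Sequence[str]],
                  reference: Sequence[Sequence[str]]) -> float:
    """Fraction of individual items (e.g. words) that match the reference."""
    total = 0
    correct = 0
    for a, r in zip(annotated, reference):
        if len(a) != len(r):
            raise ValueError("sequences must be aligned item by item")
        total += len(r)
        correct += sum(x == y for x, y in zip(a, r))
    if total == 0:
        raise ValueError("no items to score")
    return correct / total


# Hypothetical sample: two sentences, one wrong word in the second.
annotated = [["the", "cat", "sat"], ["a", "dog", "ran"]]
reference = [["the", "cat", "sat"], ["a", "dog", "run"]]

sent_acc = sentence_accuracy(annotated, reference)
word_acc = item_accuracy(annotated, reference)
print(f"sentence accuracy: {sent_acc:.2f}, word accuracy: {word_acc:.2f}")
```

Under this reading, a single wrong word fails the whole sentence, so a sentence-level threshold is stricter than a word-level threshold of the same value.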
Off-the-shelf gesture recognition data covers multiple scenes, such as conference, in-car and home. All the machine learning (ML) data is collected with signed authorization agreement.
https://dataintelo.com/privacy-and-policy
The global market size of Artificial Intelligence (AI) in Small and Medium Businesses (SMBs) is increasingly gaining momentum and is projected to grow from USD 15 billion in 2023 to USD 58 billion by 2032, reflecting a compound annual growth rate (CAGR) of 16%. A significant growth factor fueling this expansion is the rising adoption of AI technologies to enhance operational efficiency and customer engagement among SMBs.
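As a quick consistency check on the figures quoted above, the growth rate implied by the start and end values can be recomputed. This is a minimal sketch using the standard compound-growth formula; the function name `cagr` is our own, and the nine-year span (2023 to 2032) is taken from the text.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over `years` periods."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the report: USD 15 billion (2023) to USD 58 billion (2032).
rate = cagr(15.0, 58.0, 2032 - 2023)
print(f"Implied CAGR: {rate:.1%}")
```

The result is roughly 16%, which agrees with the rate stated in the report.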
One of the critical growth factors driving the AI in SMB market is the rapid advancement and increased accessibility of AI technologies. The cost of deploying AI solutions has dramatically decreased over the years, making it feasible for smaller enterprises to leverage these technologies without substantial financial burdens. This democratization of AI is enabling SMBs to compete with larger corporations by automating routine tasks, gaining insights from big data, and enhancing customer service through intelligent chatbots and personalized marketing strategies. The proliferation of AI-powered tools tailored specifically for SMBs further propels market growth, allowing businesses to optimize their operations and derive maximum value from their investments.
Another significant growth factor is the growing awareness and understanding of AI's potential among SMB owners and decision-makers. Many SMBs are recognizing that adopting AI technologies is not merely a trend but a necessity to stay competitive in a rapidly evolving market landscape. By integrating AI into various business functions, SMBs can streamline their processes, reduce operational costs, and improve decision-making capabilities. The increasing availability of educational resources, training programs, and consulting services focused on AI is empowering SMBs to embark on their AI journey with confidence, thereby contributing to market expansion.
The surge in digital transformation initiatives among SMBs is also a crucial driver of market growth. In the wake of the COVID-19 pandemic, businesses worldwide have accelerated their digital transformation efforts to adapt to new market dynamics and customer behaviors. AI plays a pivotal role in this transformation by enabling SMBs to digitize their operations, enhance customer experiences, and create new business models. The integration of AI with other emerging technologies, such as the Internet of Things (IoT) and blockchain, is further expanding the horizons for SMBs, opening up new avenues for innovation and growth.
Regionally, North America holds a dominant position in the AI in SMB market, driven by the early adoption of advanced technologies and robust support infrastructure. The presence of key market players and a strong ecosystem of AI startups and research institutions further bolster the region's growth. Europe follows closely, with significant investments in AI research and development, particularly in the healthcare and manufacturing sectors. The Asia Pacific region is expected to witness the highest growth rate during the forecast period, fueled by the rapid digitalization of economies and increasing government initiatives to promote AI adoption among SMBs. Latin America and the Middle East & Africa, though at nascent stages, are also showing promising signs of growth, driven by increasing awareness and investment in AI technologies.
The components segment of the AI in SMB market comprises software, hardware, and services, each playing a vital role in the deployment and functioning of AI solutions. The software segment holds a significant share of the market, driven by the increasing adoption of AI-powered applications and platforms. These software solutions range from AI algorithms and models to complex systems that integrate machine learning, natural language processing, and computer vision. SMBs are increasingly leveraging AI software to automate their operations, enhance customer experiences, and gain actionable insights from data. The availability of off-the-shelf AI software tailored for various business functions has made it easier for SMBs to implement AI without requiring extensive technical expertise.
The hardware segment, though smaller compared to software, is crucial for the effective deployment of AI solutions. This segment includes AI accelerators, GPUs, and other specialized hardware components designed to support the high computational demands of AI applications. As AI models become more complex and data-intensive, the need for advanced hardware solutions becomes imperative. SMBs are increasingly investing in AI ha
Recording Environment : In-car; 1 quiet scene, 1 low noise scene, 3 medium noise scenes and 2 high noise scenes
Recording Content : It covers 5 fields: navigation, multimedia, telephone, car control, and question and answer; 500 sentences per person
Speaker : Speakers are evenly distributed across all age groups, covering children, teenagers, middle-aged, elderly, etc.
Device : High fidelity microphone; Binocular camera
Language : 20 languages
Transcription content : text
Accuracy rate : 98%
Application scenarios : speech recognition, human-computer interaction, natural language processing and text analysis, visual content understanding, etc.
Population distribution : gender distribution: balanced; race distribution: Caucasians, blacks, Indians, Asians; age distribution: aged from 18 to 60
Collection environment : In-car Cameras
Collection diversity : multiple races, multiple age periods, multiple time periods and behaviors (Dangerous behavior, Fatigue behavior, Visual movement behavior)
Device : binocular camera with RGB and infrared channels; the resolution is 640x480
Collection time : day, evening and night
Image parameter : the video format is .avi
Accuracy : according to the accuracy of each person's action, the accuracy is greater than 95%; the accuracy of label annotation is not less than 95%
Off-the-shelf Re-ID data is collected from real surveillance scenes. The Identity Data diversity includes different age groups, different time periods, different shooting angles, different human body orientations and postures, clothing for different seasons.
Race distribution : Asian, Caucasian, Black, Brown
Gender distribution : male, female
Age distribution : from teenagers to the elderly, mainly young and middle-aged
Collection environment : indoor office scenes, in-car, conference, etc.
Collection diversity : different gestures data, different races, different age groups, different scenes
Collection equipment : cellphone, laptop camera, in-car camera
Data format : .mp4, .mov, .jpg
Accuracy rate : the accuracy exceeds 97% based on the accuracy of the actions; the accuracy of action naming is more than 97%
Off-the-shelf driver & passenger behavior data is Annotated Imagery Data that includes multiple ages, multiple time periods and multiple races (Caucasian, Black, Indian). The driver behaviors include dangerous behavior, fatigue behavior and visual movement behavior.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset (split into 14 files due to file size limitations) for training, validating, and evaluating an AI model developed for daily hypoxia prediction on the Louisiana-Texas shelf. The dataset is derived from a coupled hydrodynamic-biogeochemical model embedded in the Regional Ocean Modeling System. External datasets come from three independent hydrodynamic forecast products: HYCOM, NEMO, and FVCOM.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is a dataset obtained from an online survey conducted in August 2020.
In the survey, participants were introduced to the concept of a smartphone-based shopping assistant application with the help of pictures and videos showing shopping with and without the application. Participants were presented with three different shopping scenarios. In each scenario, we showed products on a shelf (groceries, luxury chocolate, shoes, books). The first shopping scenario was a regular shopping scenario (RSS), the second was an augmented reality shopping scenario (ARSS), and the third was an augmented reality shopping scenario with explainable AI features (XARSS). For each scenario, participants had to answer questions about how they perceived the scenario and how it influenced their overall purchase intention.
The present work was conducted within the Innovative Training Network project PERFORM funded by the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No. 765395. The EU Research Executive Agency is not responsible for any use that may be made of the information it contains.
Off-the-shelf biometric data (human face) covers 3D depth, segmentation (facial organs and accessories), key points, facial expressions, alpha matte, a variety of ages, etc. All the Biometric Data is collected with signed authorization agreements.
Off-the-shelf OCR data covers natural-scene images and handwriting images, spanning 20 languages, multiple natural scenes, and multiple photographic angles.
https://choosealicense.com/licenses/other/
🧠 About This OTS Dataset
With an extensive 250-hour collection of high-quality general conversation audio recordings, this off-the-shelf (OTS) dataset empowers researchers and developers to advance Natural Language Processing (NLP), Conversational AI, and Generative Voice AI models across diverse sectors. Whether in finance, healthcare, retail, or any other domain, this dataset provides a rich training and evaluation resource for AI systems.
📊 Metadata Availability:… See the full description on the dataset page: https://huggingface.co/datasets/Macgence/egyptian-arabic-general-conversation-customer-speech-dataset.
Population distribution : the race distribution is Asians, Caucasians and black people, the gender distribution is male and female, the age distribution is from children to the elderly
Collecting environment : including indoor and outdoor scenes (such as supermarket, mall and residential area, etc.)
Data diversity : different ages, different time periods, different cameras, different human body orientations and postures, different collecting environments
Device : surveillance cameras, the image resolution is not less than 1,920 x 1,080
Data format : the image data format is .jpg, the annotation file format is .json
Annotation content : human body rectangular bounding boxes, 15 human body attributes
Quality Requirements : A rectangular bounding box of a human body is qualified when the deviation is not more than 3 pixels, and the qualified rate of the bounding boxes shall not be lower than 97%; annotation accuracy of attributes is over 97%
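The bounding-box rule above can be expressed as a simple acceptance check. The sketch below is one possible reading, not the vendor's procedure: it assumes that "deviation" means the per-edge pixel difference between an annotated box and a trusted reference box, and that boxes are stored as (x_min, y_min, x_max, y_max). All names and sample values are hypothetical.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def box_is_qualified(annotated: Box, reference: Box, max_dev_px: float = 3.0) -> bool:
    """True when every edge lies within max_dev_px of the reference (assumed rule)."""
    return all(abs(a - r) <= max_dev_px for a, r in zip(annotated, reference))


def qualified_rate(annotated: Sequence[Box], reference: Sequence[Box],
                   max_dev_px: float = 3.0) -> float:
    """Share of annotated boxes that qualify against their reference boxes."""
    if len(annotated) != len(reference) or not reference:
        raise ValueError("need the same, non-zero number of boxes")
    ok = sum(box_is_qualified(a, r, max_dev_px) for a, r in zip(annotated, reference))
    return ok / len(reference)


def batch_passes(annotated: Sequence[Box], reference: Sequence[Box],
                 min_rate: float = 0.97) -> bool:
    """Apply the 97% qualified-rate threshold from the quality requirements."""
    return qualified_rate(annotated, reference) >= min_rate


# Hypothetical sample: exact match, 3 px off (still qualified), 5 px off (rejected).
reference = [(10, 20, 110, 220)] * 3
annotated = [(10, 20, 110, 220), (13, 20, 110, 220), (10, 20, 115, 220)]

rate = qualified_rate(annotated, reference)
print(f"qualified rate: {rate:.2%}, batch passes: {batch_passes(annotated, reference)}")
```

With only three boxes, a single rejection drops the batch far below 97%; in practice the check would run over a large audit sample.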
Off-the-shelf 1 million hours of unsupervised speech data, covering 10+ languages (English, French, German, Japanese, Arabic, Mandarin, etc., 100,000 hours each). The content covers dialogues or monologues in 28 common domains, such as daily vlogs, travel, podcast, technology, beauty, etc.
Summary
We used a combination of regional ocean climate projections and simulated species distributions (Leroy et al., 2016) to quantify sources of uncertainty in projections of spatially-explicit biomass for three species archetypes in the CCS (1985-2100; Fig. 1). Species archetypes were simplified representations of three general groups of marine finfish found in the CCS that comprise ecologically and/or economically important fisheries and that might be expected to show variable patterns of redistribution under climate change based on their habitat preferences, population dynamics, and mobility characteristics: 1) a highly migratory species (HMS) that was designed to resemble north Pacific albacore; 2) a coastal pelagic species (CPS) that was designed to resemble northern anchovy; and 3) a groundfish species (GFS) that was designed to resemble sablefish. SDMs (n=15; Figure 1) were then fitted to simulated biomass data for each archetype (training period 1985-2010) and projected from 2011-2100 using each of the three regional ocean climate models. Our framework resulted in 252 SDMs (15 SDM types, three species archetypes, three ESMs, and two environmental parameter simulations; Figure 1). To address our study goal of assessing SDM performance and understanding sources of uncertainty in species distribution projections, we compared the output of SDM projections against simulated “observations” for 2011-2100 and quantified the uncertainty introduced by the climate projection (ESM uncertainty) versus the uncertainty introduced by the SDM structure (SDM uncertainty).
Environmental Covariates from Regional Ocean Projections
Environmental covariates used in species distribution simulations were obtained from regional ocean projections (Pozo Buil et al., 2021) forced by three ESMs from phase 5 of the Coupled Model Intercomparison Project (CMIP5) archive: Geophysical Fluid Dynamics Laboratory (GFDL) ESM2M, Hadley Center HadGEM2-ES (HAD), and Institut Pierre Simon Laplace (IPSL) CM5A-MR. These ESMs, hereafter referred to as GFDL, HAD, and IPSL, span the approximate range of potential changes in physical and biogeochemical conditions across all CMIP5 models (Pozo Buil et al., 2021). ESMs were downscaled using the Regional Ocean Modelling System (ROMS) coupled with a biogeochemical model (NEMUCSC) (Fiechter et al., 2018, 2021) based on the North Pacific Ecosystem Model for Understanding Regional Oceanography (NEMURO) (Kishi et al., 2007). The ROMS domain spans the CCS from 30-48°N and from the coast to 134°W at 0.1° horizontal resolution with 42 terrain-following vertical layers (Figure 2). Each downscaled ESM used the Representative Concentration Pathway (RCP) 8.5 climate change scenario. While we only examined RCP 8.5, it should be noted that using RCPs 2.6 and 4.5 would result in only minor differences in the spread of future environmental change for the variables and ESMs examined here. Specifically, uncertainty in biogeochemical change among the chosen ESMs in RCP 8.5 envelops the uncertainty among RCPs 2.6 and 4.5, while for temperature GFDL and HAD represent opposite ends of the spectrum for the projected magnitude of warming in the CMIP5 ensemble (Drenkard et al., 2021; Pozo Buil et al., 2021). As such, we do not explore scenario uncertainty.
Environmental covariates used in species distribution simulations were sea surface temperature (SST; °C), bottom temperature (BT; °C), bottom oxygen (BO; mmol m⁻³), mixed layer depth (MLD; m), surface chlorophyll-a (Chl-a; mg m⁻³), and zooplankton concentration integrated over 50 m (zoo_50; mmol N m⁻²) and 200 m (zoo_200; mmol N m⁻²). These environmental covariates were averaged over spring months (March-May) annually (1985-2100) to encompass the seasonal period when ocean productivity is most influential on the long-term population dynamics of most marine fishes in the CCS.
Operating Models: Simulated Species Biomass
Biomass distributions for three species archetypes were simulated on the ROMS grid for each year and each ESM from 1985-2100. Simulations were run using the ‘virtualspecies’ R package (Leroy et al., 2016) that is specifically designed to reflect real-world ecological properties and species-environment relationships (Meynard et al., 2019). We refer to these simulated species distributions as ‘operating models’. Species simulations used a two-step process. First, habitat suitability was calculated based on environmental data and specified species’ habitat preferences (Table S1). Environmental preferences used to force species distributions varied among species archetypes based on representative life histories (see Supplementary Material). The domain for the HMS archetype was set to the entire CCS, whereas the CPS and GFS archetypes were reduced to inshore waters to reflect the CPS archetype’s preference for pelagic waters over the continental shelf and slope, and the GFS archetype’s p...
Data size : 200,000 ID
Race distribution : black people, Caucasian people, brown (Mexican) people, Indian people and Asian people
Gender distribution : gender balance
Age distribution : young, midlife and senior
Collecting environment : including indoor and outdoor scenes
Data diversity : different face poses, races, ages, light conditions and scenes
Device : cellphone
Data format : .jpg, .png
Accuracy : the accuracy of the labels for face pose, race, gender and age is more than 97%
https://www.marketresearchforecast.com/privacy-policy
The global Civil Engineering and Architectural Software market is experiencing robust growth, driven by increasing infrastructure development globally, the adoption of Building Information Modeling (BIM) methodologies, and a rising demand for efficient project management solutions. The market, estimated at $15 billion in 2025, is projected to exhibit a Compound Annual Growth Rate (CAGR) of 8% from 2025 to 2033, reaching approximately $28 billion by 2033. This growth is fueled by several key factors. Firstly, governments worldwide are investing heavily in infrastructure projects, necessitating sophisticated software for design, analysis, and project management. Secondly, the increasing adoption of BIM, which facilitates collaborative design and minimizes errors, is driving demand for advanced software capabilities. Furthermore, the construction industry's ongoing digital transformation necessitates software that integrates various aspects of project delivery, including cost budgeting, materials acquisition, and resource allocation, leading to greater efficiency and reduced project timelines. The off-the-shelf software segment currently holds the largest market share, but tailored solutions are witnessing significant growth due to the increasing need for customized functionalities to meet specific project requirements. Key regional markets include North America and Europe, which together account for a significant portion of the global market share. However, the Asia-Pacific region is expected to experience the fastest growth in the coming years, driven by rapid urbanization and significant infrastructure development in countries like China and India. While the market faces some restraints, such as the high initial investment costs of software and the need for skilled professionals to operate them, these challenges are being addressed by vendors offering flexible licensing models and robust training programs. 
The competitive landscape is characterized by a mix of established players and emerging innovative companies, fostering a dynamic and innovative market. The ongoing development of Artificial Intelligence (AI) and Machine Learning (ML) technologies within the software is poised to further revolutionize the industry by enabling predictive analytics and automated design processes.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Smart-Eye: AI Model for Empty Shelf Detection
Smart-Eye is an AI-powered computer vision system developed to monitor and detect empty or partially empty shelves in retail environments such as shopping malls and supermarkets. The primary objective of Smart-Eye is to automate shelf inventory checks, reduce stock-out situations, and improve overall customer experience by ensuring timely restocking.
Smart-Eye utilizes advanced deep learning algorithms, particularly convolutional neural networks (CNNs), trained on thousands of annotated images representing various shelf conditions — from fully stocked to completely empty. The system processes real-time footage from in-store cameras to identify shelf areas that are understocked or vacant.
Key features of Smart-Eye include:
Real-Time Monitoring: Smart-Eye continuously analyzes live video feeds from installed surveillance cameras across the store, allowing for immediate identification of empty or low-stock shelves.
High Accuracy Detection: Through rigorous training and validation, Smart-Eye achieves high precision in distinguishing between full, partially filled, and empty shelf segments, even under varying lighting conditions and camera angles.
Scalable Architecture: The system is built to support scalability, allowing it to be deployed in large retail chains with hundreds of cameras and diverse store layouts.
Integration Capabilities: Smart-Eye can be integrated with existing inventory management systems. Upon detecting an empty shelf, it can trigger automatic alerts to store staff or initiate restocking workflows.
Continuous Learning: The model is designed to improve over time. It supports feedback loops where human validations of shelf conditions can be fed back into the training data to enhance accuracy.
By leveraging Smart-Eye, retailers can minimize the risk of product unavailability, optimize inventory control, and enhance operational efficiency. In turn, this leads to improved customer satisfaction and increased sales opportunities.
https://www.futurebeeai.com/policies/ai-data-license-agreement
Welcome to the Hispanic Human Facial Images Dataset, meticulously curated to enhance face recognition models and support the development of advanced biometric identification systems, KYC models, and other facial recognition technologies.
This dataset comprises over 1500 Hispanic individual facial image sets, with each set including:
The dataset includes contributions from a diverse network of individuals across Hispanic countries.
To ensure high utility and robustness, all images are captured under varying conditions:
Each facial image set is accompanied by detailed metadata for each participant, including:
This metadata is essential for training models that can accurately recognize and identify faces across different demographics and conditions.
This facial image dataset is ideal for various applications in the field of computer vision, including but not limited to:
We understand the evolving nature of AI and machine learning requirements. Therefore, we continuously add more assets with diverse conditions to this off-the-shelf facial image dataset.