17 datasets found
  1. smol

    • huggingface.co
    Updated Mar 28, 2025
    Cite
    Google (2025). smol [Dataset]. https://huggingface.co/datasets/google/smol
    Dataset updated
    Mar 28, 2025
    Dataset authored and provided by
    Google (http://google.com/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    SMOL

    SMOL (Set for Maximal Overall Leverage) is a collection of professional translations into 221 Low-Resource Languages, for the purpose of training translation models, and otherwise increasing the representations of said languages in NLP and technology. Please read the SMOL Paper and the GATITOS Paper for a much more thorough description! There are four resources in this directory:

    • SmolDoc: document-level translations into 100 languages
    • SmolSent: sentence-level translations into…

    See the full description on the dataset page: https://huggingface.co/datasets/google/smol.

  2. Benchmark dataset for small and narrow rectangular object detection from...

    • ieee-dataport.org
    Updated May 18, 2022
    Cite
    zhonghua hong (2022). Benchmark dataset for small and narrow rectangular object detection from Google Earth imagery [Dataset]. https://ieee-dataport.org/documents/benchmark-dataset-small-and-narrow-rectangular-object-detection-google-earth-imagery
    Dataset updated
    May 18, 2022
    Authors
    zhonghua hong
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The benchmark dataset consists of 2…

  3. google_mt5-small-details

    • huggingface.co
    Updated Jul 30, 2025
    + more versions
    Cite
    Open LLM Leaderboard (2025). google_mt5-small-details [Dataset]. https://huggingface.co/datasets/open-llm-leaderboard/google_mt5-small-details
    Dataset updated
    Jul 30, 2025
    Dataset authored and provided by
    Open LLM Leaderboard
    Description

    Dataset Card for Evaluation run of google/mt5-small

    Dataset automatically created during the evaluation run of model google/mt5-small. The dataset is composed of 38 configuration(s), each one corresponding to one of the evaluated tasks. The dataset has been created from 1 run(s). Each run can be found as a specific split in each configuration, the split being named using the timestamp of the run. The "train" split always points to the latest results. An additional configuration… See the full description on the dataset page: https://huggingface.co/datasets/open-llm-leaderboard/google_mt5-small-details.

  4. Google Small Business

    • library.nwosu.edu
    Updated Apr 24, 2025
    Cite
    null (2025). Google Small Business [Dataset]. https://library.nwosu.edu/business/market
    Dataset updated
    Apr 24, 2025
    Authors
    null
    License

    https://www.youtube.com/t/terms

    Description

    Tips from Google about marketing a small business online.

  5. Small towns in Italy with the most Google searches per month 2023

    • statista.com
    Updated Jul 23, 2025
    Cite
    Statista (2025). Small towns in Italy with the most Google searches per month 2023 [Dataset]. https://www.statista.com/statistics/1262452/most-popular-small-towns-italy/
    Dataset updated
    Jul 23, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2023
    Area covered
    Italy
    Description

    A May 2024 study analyzed the small towns in Italy with a population of under **** thousand with the highest average monthly number of Google searches in 2023. Based on the analysis, *** Sicilian destinations, Favignana and San Vito Lo Capo, recorded the highest figure, each with an average of ****** monthly Google searches in 2023. Portofino in Liguria followed in the ranking, with ****** monthly Google searches on average that year.

  6. Small Object Dataset

    • academictorrents.com
    bittorrent
    Updated Jun 6, 2017
    Cite
    Zheng Ma and Lei Yu and Antoni B. Chan (2017). Small Object Dataset [Dataset]. https://academictorrents.com/details/8e751c111cf90123374b5f0cf61e6af9f5e5231e
    Available download formats: bittorrent (5858609)
    Dataset updated
    Jun 6, 2017
    Dataset authored and provided by
    Zheng Ma and Lei Yu and Antoni B. Chan
    License

    https://academictorrents.com/nolicensespecified

    Description

    Images of small objects for small instance detections. Currently four object types are available. We collect four datasets of small objects from images/videos on the Internet (e.g., YouTube or Google).

    • Fly Dataset: contains 600 video frames with an average of 86 ± 39 flies per frame (648×72 @ 30 fps). 32 images are used for training (1:6:187) and 50 images for testing (301:6:600).
    • Honeybee Dataset: contains 118 images with an average of 28 ± 6 honeybees per image (640×480). The dataset is divided evenly for training and test sets. Only the first 32 images are used for training.
    • Fish Dataset: contains 387 frames of video with an average of 56 ± 9 fish per frame (300×410 @ 30 fps). 32 images are used for training (1:3:94) and 65 for testing (193:3:387).
    • Seagull Dataset: contains three high-resolution images (624×964) with an average of 866 ± 107 seagulls per image. The first image is used for training, and the res
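
    The bracketed triples in the description, such as (1:6:187), look like MATLAB-style start:step:end frame ranges. That reading is an assumption, not something the dataset page states, but it can be checked against the image counts quoted for the fly and fish splits. A minimal Python sketch, with hypothetical split names:

```python
def stepped_indices(start: int, step: int, end: int) -> list[int]:
    """Expand a start:step:end range, including `end` when the step lands on it."""
    return list(range(start, end + 1, step))


# Ranges copied from the description; the dictionary keys are illustrative names.
splits = {
    "fly_train": (1, 6, 187),
    "fly_test": (301, 6, 600),
    "fish_train": (1, 3, 94),
    "fish_test": (193, 3, 387),
}

# Number of frames selected by each range.
counts = {name: len(stepped_indices(*spec)) for name, spec in splits.items()}
print(counts)
```

    Under this interpretation the counts come out to 32 and 50 frames for the fly splits and 32 and 65 for the fish splits, matching the figures in the description.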

  7. google_flan-t5-small-details

    • huggingface.co
    Updated Jul 30, 2025
    + more versions
    Cite
    Open LLM Leaderboard (2025). google_flan-t5-small-details [Dataset]. https://huggingface.co/datasets/open-llm-leaderboard/google_flan-t5-small-details
    Dataset updated
    Jul 30, 2025
    Dataset authored and provided by
    Open LLM Leaderboard
    Description

    Dataset Card for Evaluation run of google/flan-t5-small

    Dataset automatically created during the evaluation run of model google/flan-t5-small. The dataset is composed of 38 configuration(s), each one corresponding to one of the evaluated tasks. The dataset has been created from 1 run(s). Each run can be found as a specific split in each configuration, the split being named using the timestamp of the run. The "train" split always points to the latest results. An additional… See the full description on the dataset page: https://huggingface.co/datasets/open-llm-leaderboard/google_flan-t5-small-details.

  8. Company Datasets for Business Profiling

    • datarade.ai
    Updated Feb 23, 2017
    Cite
    Oxylabs (2017). Company Datasets for Business Profiling [Dataset]. https://datarade.ai/data-products/company-datasets-for-business-profiling-oxylabs
    Available download formats: .json, .xml, .csv, .xls
    Dataset updated
    Feb 23, 2017
    Dataset authored and provided by
    Oxylabs
    Area covered
    Moldova (Republic of), Bangladesh, Canada, Isle of Man, British Indian Ocean Territory, Andorra, Taiwan, Northern Mariana Islands, Nepal, Tunisia
    Description

    Company Datasets for valuable business insights!

    Discover new business prospects, identify investment opportunities, track competitor performance, and streamline your sales efforts with comprehensive Company Datasets.

    These datasets are sourced from top industry providers, ensuring you have access to high-quality information:

    • Owler: Gain valuable business insights and competitive intelligence.
    • AngelList: Receive fresh startup data transformed into actionable insights.
    • CrunchBase: Access clean, parsed, and ready-to-use business data from private and public companies.
    • Craft.co: Make data-informed business decisions with Craft.co's company datasets.
    • Product Hunt: Harness the Product Hunt dataset, a leader in curating the best new products.

    We provide fresh and ready-to-use company data, eliminating the need for complex scraping and parsing. Our data includes crucial details such as:

    • Company name;
    • Size;
    • Founding date;
    • Location;
    • Industry;
    • Revenue;
    • Employee count;
    • Competitors.

    You can choose your preferred data delivery method, including various storage options, delivery frequency, and input/output formats.

    Receive datasets in CSV, JSON, and other formats, with storage options like AWS S3 and Google Cloud Storage. Opt for one-time, monthly, quarterly, or bi-annual data delivery.

    With Oxylabs Datasets, you can count on:

    • Fresh and accurate data collected and parsed by our expert web scraping team.
    • Time and resource savings, allowing you to focus on data analysis and achieving your business goals.
    • A customized approach tailored to your specific business needs.
    • Legal compliance in line with GDPR and CCPA standards, thanks to our membership in the Ethical Web Data Collection Initiative.

    Pricing Options:

    Standard Datasets: choose from various ready-to-use datasets with standardized data schemas, priced from $1,000/month.

    Custom Datasets: Tailor datasets from any public web domain to your unique business needs. Contact our sales team for custom pricing.

    Experience a seamless journey with Oxylabs:

    • Understanding your data needs: We work closely to understand your business nature and daily operations, defining your unique data requirements.
    • Developing a customized solution: Our experts create a custom framework to extract public data using our in-house web scraping infrastructure.
    • Delivering data sample: We provide a sample for your feedback on data quality and the entire delivery process.
    • Continuous data delivery: We continuously collect public data and deliver custom datasets per the agreed frequency.

    Unlock the power of data with Oxylabs' Company Datasets and supercharge your business insights today!

  9. Table_1_Does “Dr. Google” improve discussion and decisions in small animal...

    • figshare.com
    docx
    Updated Jun 19, 2024
    + more versions
    Cite
    Svenja Springer; Thomas Bøker Lund; Sandra A. Corr; Peter Sandøe (2024). Table_1_Does “Dr. Google” improve discussion and decisions in small animal practice? Dog and cat owners use of internet resources to find medical information about their pets in three European countries.docx [Dataset]. http://doi.org/10.3389/fvets.2024.1417927.s002
    Available download formats: docx
    Dataset updated
    Jun 19, 2024
    Dataset provided by
    Frontiers
    Authors
    Svenja Springer; Thomas Bøker Lund; Sandra A. Corr; Peter Sandøe
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Modern dog and cat owners increasingly use internet resources to obtain information on pet health issues. While access to online information can improve owners’ knowledge of patient care and inform conversations with their veterinarian during consultations, there is also a risk that owners will misinterpret online information or gain a false impression of current standards in veterinary medicine. This in turn can cause problems or tensions, for example if the owner delays consulting their veterinarian about necessary treatment, or questions the veterinarian’s medical advice. Based on an online questionnaire aimed at dog and cat owners in Austria, Denmark and the United Kingdom (N = 2117) we investigated the use of internet resources to find veterinary medical information, the type of internet resources that were used, and whether owner beliefs explain how often they used the internet to find medical information about their pet. Approximately one in three owners reported that they never used internet resources prior to (31.7%) or after (37.0%) a consultation with their veterinarian. However, when owners do make use of the internet, our results show that they were more likely to use it before than after the consultation. The most common internet resources used by owners were practice websites (35.0%), veterinary association websites (24.0%), or ‘other’ websites providing veterinary information (55.2%). Owners who believe that the use of internet resources enables them to have a more informed discussion with their veterinarians more often use internet resources prior to a consultation, whereas owners who believed that internet resources help them to make the right decision for their animal more often use internet resources after a consultation. The results suggest that veterinarians should actively ask pet owners if they use internet resources, and what resources they use, in order to facilitate open discussion about information obtained from the internet. 
Given that more than a third of pet owners use practice websites, the findings also suggest that veterinarians should actively curate their own websites where they can post information that they consider accurate and trustworthy.

  10. Google: global corporate demography 2014-2024, by gender

    • statista.com
    Updated Oct 28, 2024
    Cite
    Statista (2024). Google: global corporate demography 2014-2024, by gender [Dataset]. https://www.statista.com/statistics/311800/google-employee-gender-global/
    Dataset updated
    Oct 28, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    As of January 2024, the majority of Google employees worldwide, almost 66 percent, were male. The distribution of male and female employees at Google has not changed much in recent years. In 2014, the share of female employees at Google was 30.6 percent. By 2021, this number had increased by only 3 percent. Considering that the total number of Google employees increased greatly between 2007 and 2020, the share of female employees saw only a small increase.

    Google as a company

    Google is a diverse internet company that provides a wide range of digital products and services. In 2022, the company’s global revenue was over 279 billion U.S. dollars. Most of its revenue, around 305 billion U.S. dollars, was from advertising. Among its services, the most popular ones are YouTube and Google Play.

    Male and female employees at tech companies

    Google is not the only tech company with a lower number of female employees. This pattern can be seen in other big tech companies too. In 2019, in a ranking of 20 leading tech companies worldwide, only 23andMe had more than a 50 percent share of female employees. The majority of tech companies in the ranking have far more male than female employees.

  11. Google Workspace Business Tool Report

    • marketresearchforecast.com
    doc, pdf, ppt
    Updated Mar 5, 2025
    Cite
    Market Research Forecast (2025). Google Workspace Business Tool Report [Dataset]. https://www.marketresearchforecast.com/reports/google-workspace-business-tool-27470
    Available download formats: pdf, ppt, doc
    Dataset updated
    Mar 5, 2025
    Dataset authored and provided by
    Market Research Forecast
    License

    https://www.marketresearchforecast.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Google Workspace Business Tools market, encompassing applications like Gmail, Docs, Sheets, and Drive, is experiencing robust growth fueled by the increasing adoption of cloud-based solutions and the rising demand for collaborative work environments. The market's expansion is driven by several key factors, including the enhanced productivity and efficiency offered by integrated tools, the accessibility provided by mobile and web interfaces, and the growing need for secure data storage and sharing. While precise market sizing data is not provided, considering the extensive market penetration of Google Workspace and the overall growth in the SaaS (Software as a Service) market, a reasonable estimate for the 2025 market value would be in the range of $10 billion to $15 billion, potentially reaching $20 billion by 2030. This estimate considers factors such as the robust growth in cloud computing, the increasing number of businesses adopting digital workspaces, and the global expansion of internet connectivity. Growth is primarily driven by adoption among small and medium-sized enterprises (SMEs), given Google Workspace's competitive pricing and ease of use compared to more complex enterprise solutions. However, large enterprises contribute significantly to the overall market value due to their higher purchasing power and complex business needs that Google Workspace addresses with its advanced features and integrations. The market faces some challenges, including competition from established players like Microsoft 365 and Salesforce, as well as security concerns related to data breaches and privacy. However, Google's continuous innovation, ongoing improvements to security protocols, and strategic partnerships are mitigating these risks. Future growth will likely be driven by further integration with other Google services, the expansion of AI-powered features, and increasing demand for tailored solutions for specific industries. 
This will solidify Google Workspace's position as a leading provider of collaborative business tools and further expand its market share. Regionally, North America and Europe will continue to dominate the market, owing to high levels of digitalization and adoption of cloud technologies. However, rapid growth is anticipated in the Asia-Pacific region driven by increasing internet penetration and economic growth in emerging markets.

  12. Global Google Business View Market Research Report: By Service Type...

    • wiseguyreports.com
    Updated Aug 10, 2024
    Cite
    Wiseguy Research Consultants Pvt Ltd (2024). Global Google Business View Market Research Report: By Service Type (Photography, Virtual Tours, 360-Degree Images, Floor Plans), By Business Size (Small Businesses, Medium-Sized Businesses, Large Enterprises), By Industry (Hospitality, Retail, Healthcare, Education, Real Estate) and By Regional (North America, Europe, South America, Asia Pacific, Middle East and Africa) - Forecast to 2032. [Dataset]. https://www.wiseguyreports.com/reports/google-business-view-market
    Dataset updated
    Aug 10, 2024
    Dataset authored and provided by
    Wiseguy Research Consultants Pvt Ltd
    License

    https://www.wiseguyreports.com/pages/privacy-policy

    Time period covered
    Jan 8, 2024
    Area covered
    Global
    Description
    BASE YEAR: 2024
    HISTORICAL DATA: 2019 - 2024
    REPORT COVERAGE: Revenue Forecast, Competitive Landscape, Growth Factors, and Trends
    MARKET SIZE 2023: 4.62 (USD Billion)
    MARKET SIZE 2024: 5.14 (USD Billion)
    MARKET SIZE 2032: 12.2 (USD Billion)
    SEGMENTS COVERED: Service Type, Business Size, Industry, Regional
    COUNTRIES COVERED: North America, Europe, APAC, South America, MEA
    KEY MARKET DYNAMICS: Rising Adoption of Digital Marketing; Technological Advancements; Virtual Reality Integration; Growing Popularity of 3D Virtual Tours; Increased Focus on Customer Engagement
    MARKET FORECAST UNITS: USD Billion
    KEY COMPANIES PROFILED: Yuneec, 3D Robotics, Sony, Matterport, Capture3D, Autel Robotics, Skyline Multimedia, FlyCAM, DJI, Parrot, Aeryon Labs, DroneDeploy, Pix4D, GoPro
    MARKET FORECAST PERIOD: 2025 - 2032
    KEY MARKET OPPORTUNITIES: 1. Expanding ecommerce industry; 2. Growing demand for virtual tours; 3. VR/AR integration opportunities; 4. Personalized customer experiences
    COMPOUND ANNUAL GROWTH RATE (CAGR): 11.39% (2025 - 2032)

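
    The quoted growth rate can be sanity-checked against the market-size figures with the standard compound annual growth rate formula, (end / start)^(1 / years) - 1. The sketch below assumes the rate compounds from the 2024 figure to the 2032 figure over eight years; the report does not say which base year it uses, so that choice is an assumption.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


# Market sizes in USD billion, taken from the listing above.
# Assumption: 2024 is the base year, so the span to 2032 is 8 years.
rate = cagr(start_value=5.14, end_value=12.2, years=2032 - 2024)
print(f"{rate:.2%}")
```

    Under that assumption the result is roughly 11.4%, in line with the 11.39% stated in the listing.
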
  13. Leading search engine providers used by SMEs in the U.S. 2016

    • statista.com
    Updated Nov 25, 2016
    Cite
    Statista (2016). Leading search engine providers used by SMEs in the U.S. 2016 [Dataset]. https://www.statista.com/statistics/642282/us-search-engine-providers-used-by-small-to-medium-sized-enterprises/
    Dataset updated
    Nov 25, 2016
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Nov 16, 2016 - Nov 21, 2016
    Area covered
    United States
    Description

    This statistic shows the leading search engine providers used by small to medium-sized enterprise (SME) owners in the United States in order to be found more quickly as of *************. During the Statista survey conducted in *************, ** percent of responding SME owners said that they had paid or were considering paying Google in order to be found more quickly in their search engine.

  14. flan-t5-small-embed-refinedweb

    • huggingface.co
    Updated Jun 5, 2023
    + more versions
    Cite
    maxine (2023). flan-t5-small-embed-refinedweb [Dataset]. https://huggingface.co/datasets/crumb/flan-t5-small-embed-refinedweb
    Dataset updated
    Jun 5, 2023
    Authors
    maxine
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    All of the data together is around 41GB. It's the last hidden states of 131,072 samples from refinedweb padded/truncated to 512 tokens on the left, fed through google/flan-t5-small.

    Structure:
    {
      "encoding": List, shaped (512, 512) aka (tokens, d_model),
      "text": String, the original text that was encoded,
      "attention_mask": List, binary mask to pass to your model with encoding to not attend to pad tokens
    }
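
    One way to use the "attention_mask" field is to skip pad positions when collapsing the per-token hidden states into a single vector. The sketch below does masked mean pooling in plain Python on a toy record with 4 tokens and d_model = 2 rather than the real (512, 512) shape. Mean pooling is an illustrative choice, not something the dataset author prescribes, and the record contents are made up.

```python
def masked_mean(encoding: list[list[float]], attention_mask: list[int]) -> list[float]:
    """Average the per-token vectors whose mask entry is 1, skipping pad positions."""
    kept = [vec for vec, keep in zip(encoding, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    d_model = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(d_model)]


# Hypothetical record with the same three fields as the dataset.
# The first two positions are left padding and are masked out.
record = {
    "encoding": [[9.0, 9.0], [9.0, 9.0], [1.0, 2.0], [3.0, 4.0]],
    "text": "hypothetical example",
    "attention_mask": [0, 0, 1, 1],
}

pooled = masked_mean(record["encoding"], record["attention_mask"])
print(pooled)  # [2.0, 3.0]
```

    The pad vectors (the 9.0 entries) do not affect the result, which is the point of passing the mask alongside the encoding.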

    just a tip, you cannot load this with the RAM in the free ver of google colab, not… See the full description on the dataset page: https://huggingface.co/datasets/crumb/flan-t5-small-embed-refinedweb.

  15. speech_commands

    • huggingface.co
    • tensorflow.org
    • +1more
    + more versions
    Cite
    Google, speech_commands [Dataset]. https://huggingface.co/datasets/google/speech_commands
    Dataset authored and provided by
    Google (http://google.com/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a set of one-second .wav audio files, each containing a single spoken English word or background noise. These words are from a small set of commands, and are spoken by a variety of different speakers. This data set is designed to help train simple machine learning models. This dataset is covered in more detail at https://arxiv.org/abs/1804.03209.

    Version 0.01 of the data set (configuration "v0.01") was released on August 3rd 2017 and contains 64,727 audio files.

    In version 0.01 thirty different words were recorded: "Yes", "No", "Up", "Down", "Left", "Right", "On", "Off", "Stop", "Go", "Zero", "One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight", "Nine", "Bed", "Bird", "Cat", "Dog", "Happy", "House", "Marvin", "Sheila", "Tree", "Wow".

    In version 0.02 more words were added: "Backward", "Forward", "Follow", "Learn", "Visual".

    In both versions, ten of the words are used as commands by convention: "Yes", "No", "Up", "Down", "Left", "Right", "On", "Off", "Stop", "Go". The other words are considered auxiliary (in the current implementation this is marked by a True value of the "is_unknown" feature). Their function is to teach a model to distinguish core words from unrecognized ones.

    The _silence_ class contains a set of longer audio clips that are either recordings or a mathematical simulation of noise.

  16. kokborok

    • huggingface.co
    Updated Jun 2, 2025
    Cite
    dbr (2025). kokborok [Dataset]. https://huggingface.co/datasets/sdmy/kokborok
    Dataset updated
    Jun 2, 2025
    Authors
    dbr
    Description

    Kokborok Digitalisation Project

    The Kokborok Digitalisation Project is an initiative to curate and enhance parallel data for the Kokborok-English language pair. This project builds upon the SMOL dataset by Google, available on Hugging Face, and involves modifying and correcting it to better reflect the nuances of the local Kokborok dialect.

    From the Author

    "Language is a living, breathing entity—constantly evolving, shaping cultures, and connecting generations. When we… See the full description on the dataset page: https://huggingface.co/datasets/sdmy/kokborok.

  17. Global Generative Ai For Business Market Research Report: By Application...

    • wiseguyreports.com
    Updated Aug 10, 2024
    Cite
    Wiseguy Research Consultants Pvt Ltd (2024). Global Generative Ai For Business Market Research Report: By Application (Content and media generation, Product and prototype design, Marketing and advertising, Data analysis and insights, Customer service and engagement), By Type (Text-based, Image-based, Audio-based, Video-based, Multi-modal), By Industry (Healthcare, Financial services, Manufacturing, Retail, Technology), By Deployment Model (Cloud-based, On-premise, Hybrid), By End User (Large enterprises, Small and medium-sized businesses (SMBs), Independent professionals) and By Regional (North America, Europe, South America, Asia Pacific, Middle East and Africa) - Forecast to 2032. [Dataset]. https://www.wiseguyreports.com/reports/generative-ai-for-business-market
    Dataset updated
    Aug 10, 2024
    Dataset authored and provided by
    Wiseguy Research Consultants Pvt Ltd
    License

    https://www.wiseguyreports.com/pages/privacy-policy

    Time period covered
    Jan 8, 2024
    Area covered
    Global
    Description
    BASE YEAR: 2024
    HISTORICAL DATA: 2019 - 2024
    REPORT COVERAGE: Revenue Forecast, Competitive Landscape, Growth Factors, and Trends
    MARKET SIZE 2023: 34.07 (USD Billion)
    MARKET SIZE 2024: 39.85 (USD Billion)
    MARKET SIZE 2032: 139.6 (USD Billion)
    SEGMENTS COVERED: Application, Type, Industry, Deployment Model, End User, Regional
    COUNTRIES COVERED: North America, Europe, APAC, South America, MEA
    KEY MARKET DYNAMICS: Growing demand for personalized content; Increasing use of AI-powered tools in businesses; Advancements in generative AI technology; Government initiatives to promote AI adoption; Partnerships and collaborations between tech companies
    MARKET FORECAST UNITS: USD Billion
    KEY COMPANIES PROFILED: Microsoft, Google, OpenAI, Meta Platforms, BigScience, Teradata, Adobe, Tencent, IBM, Alibaba, C3.ai, Baidu, Salesforce, Amazon, NVIDIA
    MARKET FORECAST PERIOD: 2025 - 2032
    KEY MARKET OPPORTUNITIES: Content Creation; Marketing Automation; Sales Optimization; Product Development; Customer Service
    COMPOUND ANNUAL GROWTH RATE (CAGR): 16.97% (2025 - 2032)