Attribution-ShareAlike 3.0 (CC BY-SA 3.0)https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
Summary
databricks-dolly-15k is an open source dataset of instruction-following records generated by thousands of Databricks employees in several of the behavioral categories outlined in the InstructGPT paper, including brainstorming, classification, closed QA, generation, information extraction, open QA, and summarization. This dataset can be used for any purpose, whether academic or commercial, under the terms of the Creative Commons Attribution-ShareAlike 3.0 Unported… See the full description on the dataset page: https://huggingface.co/datasets/databricks/databricks-dolly-15k.
Attribution-ShareAlike 3.0 (CC BY-SA 3.0)https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
databricks-dolly-15k-ja
This repository provides an instruction tuning dataset developed by LLM-jp, a collaborative project launched in Japan. This dataset is a Japanese translation of databricks-dolly-15k using DeepL.
Send Questions to
llm-jp(at)nii.ac.jp
Model Card Authors
The names are listed in alphabetical order. Hirokazu Kiyomaru, Hiroshi Matsuda, Jun Suzuki, Namgi Han, Saku Sugawara, Shota Sasaki, Shuhei Kurita, Taishi Nakamura, Takashi Kodama, Takumi… See the full description on the dataset page: https://huggingface.co/datasets/llm-jp/databricks-dolly-15k-ja.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Dolly is a dataset for object detection tasks - it contains Dolly annotations for 335 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/CC BY 4.0).
Attribution-ShareAlike 3.0 (CC BY-SA 3.0)https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
databricks-dolly-15k
This dataset was not originally created by AI Squared. This dataset was curated and created by Databricks. The below text comes from the original release of the dataset's README file in GitHub (available at https://github.com/databrickslabs/dolly/tree/master/data):
Summary
databricks-dolly-15k is an open source dataset of instruction-following records generated by thousands of Databricks employees in several of the behavioral categories outlined in… See the full description on the dataset page: https://huggingface.co/datasets/aisquared/databricks-dolly-15k.
Attribution-ShareAlike 3.0 (CC BY-SA 3.0)https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
This dataset contains 14,934 instructions, contexts and responses, in several natural language categories such as classification, closed QA, generation, etc. The English original dataset was created by @databricks, who crowd-sourced the data creation via its employees. The current dataset is a translation of that dataset through ChatGPT (gpt-3.5-turbo).
Data Instances
{ "id": 14963, "instruction": "Wat zijn de duurste steden ter wereld?", "context": "", "response": "Dit is een uitgebreide lijst van de duurste steden: Singapore, Tel Aviv, New York, Hong Kong, Los Angeles, Zurich, Genève, San Francisco, Parijs en Sydney.", "category": "brainstorming" }
Data Fields
id: the ID of the item. The following 77 IDs are not included because they could not be translated (or were too long): [1502, 1812, 1868, 4179, 4541, 6347, 8851, 9321, 10588, 10835, 11257, 12082, 12319, 12471, 12701, 12988, 13066, 13074, 13076, 13181, 13253, 13279, 13313, 13346, 13369, 13446, 13475, 13528, 13546, 13548, 13549, 13558, 13566, 13600, 13603, 13657, 13668, 13733, 13765, 13775, 13801, 13831, 13906, 13922, 13923, 13957, 13967, 13976, 14028, 14031, 14045, 14050, 14082, 14083, 14089, 14110, 14155, 14162, 14181, 14187, 14200, 14221, 14222, 14281, 14473, 14475, 14476, 14587, 14590, 14667, 14685, 14764, 14780, 14808, 14836, 14891, 1 4966]
instruction: the instruction (question)
context: additional context that the AI can use to answer the question
response: the AI's expected response
category: the category of this type of question (see Dolly for more info)
Dataset Creation
Both the translations and the topics were translated with OpenAI's API for gpt-3.5-turbo. max_tokens=1024, temperature=0 as parameters.
The prompt template to translate the input is (where src_lang was English and tgt_lang Dutch):
CONVERSATION_TRANSLATION_PROMPT = """You are asked to translate a task's instruction, optional context to the task, and the response to the task, from {src_lang} to {tgt_lang}.
Here are the requirements that you should adhere to:
1. maintain the format: the task consists of a task instruction (marked instruction:
), optional context to the task (marked context:
) and response for the task marked with response:
;
2. do not translate the identifiers instruction:
, context:
, and response:
but instead copy them to your output;
3. make sure that text is fluent to read and does not contain grammatical errors. Use standard {tgt_lang} without regional bias;
4. translate the instruction and context text using informal, but standard, language;
5. make sure to avoid biases (such as gender bias, grammatical bias, social bias);
6. if the instruction is to correct grammar mistakes or spelling mistakes then you have to generate a similar mistake in the context in {tgt_lang}, and then also generate a corrected output version in the output in {tgt_lang};
7. if the instruction is to translate text from one language to another, then you do not translate the text that needs to be translated in the instruction or the context, nor the translation in the response (just copy them as-is);
8. do not translate code fragments but copy them to your output. If there are English examples, variable names or definitions in code fragments, keep them in English.
Now translate the following task with the requirements set out above. Do not provide an explanation and do not add anything else.
"""
The system message was:
You are a helpful assistant that translates English to Dutch according to the requirements that are given to you.
Note that 77 items (0.5%) were not successfully translated. This can either mean that the prompt was too long for the given limit (max_tokens=1024) or that the generated translation could not be parsed into instruction, context and response fields. The missing IDs are [1502, 1812, 1868, 4179, 4541, 6347, 8851, 9321, 10588, 10835, 11257, 12082, 12319, 12471, 12701, 12988, 13066, 13074, 13076, 13181, 13253, 13279, 13313, 13346, 13369, 13446, 13475, 13528, 13546, 13548, 13549, 13558, 13566, 13600, 13603, 13657, 13668, 13733, 13765, 13775, 13801, 13831, 13906, 13922, 13923, 13957, 13967, 13976, 14028, 14031, 14045, 14050, 14082, 14083, 14089, 14110, 14155, 14162, 14181, 14187, 14200, 14221, 14222, 14281, 14473, 14475, 14476, 14587, 14590, 14667, 14685, 14764, 14780, 14808, 14836, 14891, 1 4966].
Initial Data Collection and Normalization
Initial data collection by databricks. See their repository for more information about this dataset.
Considerations for Using the Data
Note that the translations in this new dataset have not been verified by humans! Use at your own risk, both in terms of quality and biases.
Discussion of Biases
As with any machine-generated texts, users should be aware of potential biases that are included in this dataset. Although the prompt specifically includes make sure to avoid biases (such as gender bias, grammatical bias, social bias), of course the impact of such command is not known. It is likely that biases remain in the dataset so use with caution.
Other Known Limitations
The translation quality has not been verified. Use at your own risk!
Licensing Information
This repository follows the original databricks license, which is CC BY-SA 3.0 but see below for a specific restriction.
This text was generated (either in part or in full) with GPT-3 (gpt-3.5-turbo), OpenAI’s large-scale language-generation model. Upon generating draft language, the author reviewed, edited, and revised the language to their own liking and takes ultimate responsibility for the content of this publication.
If you use this dataset, you must also follow the Sharing and Usage policies.
As clearly stated in their Terms of Use, specifically 2c.iii, "[you may not] use output from the Services to develop models that compete with OpenAI". That means that you cannot use this dataset to build models that are intended to commercially compete with OpenAI. As far as I am aware, that is a specific restriction that should serve as an addendum to the current license.
This dataset is also available on the Hugging Face hub, its canonical repository.
Attribution-ShareAlike 3.0 (CC BY-SA 3.0)https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
Databricks-dolly-15k is a corpus of over 15,000 records generated by thousands of Databricks employees, enabling large language models to demonstrate the amazing interactivity of ChatGPT. Databricks employees were invited to create prompt/response pairs in each of eight different instruction categories, including the seven outlined in the InstructGPT paper, as well as an open-ended, free-form category. Instruct contributors to refrain from using information from any source on the web, except Wikipedia (for a specific subset of command categories), and explicitly instruct contributors to avoid using generative AI in formulating commands or responses. Examples of each behavior are provided to motivate the question types and instructions appropriate to each category.
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dolly Dataset
Original version: aisquared/databricks-dolly-15k. This dataset is used to train MiniLLM.
https://www.archivemarketresearch.com/privacy-policyhttps://www.archivemarketresearch.com/privacy-policy
The furniture dolly market is experiencing robust growth, driven by the expanding e-commerce sector, increasing demand for efficient furniture handling in commercial settings (offices, warehouses, retail), and the rising popularity of DIY home improvement projects. The market size in 2025 is estimated at $850 million, demonstrating significant potential for growth. Considering a projected Compound Annual Growth Rate (CAGR) of 6% between 2025 and 2033, the market is poised to reach approximately $1.3 billion by 2033. This growth is further fueled by ongoing innovation in dolly design, including the introduction of lightweight, maneuverable models, and improved features like enhanced load capacity and ergonomic handles. The market is segmented by type (hand trucks, platform dollies, appliance dollies, etc.), material (steel, aluminum, wood), and application (residential, commercial). Major players like Milwaukee, Unitran, and others are driving market expansion through product diversification and strategic partnerships. However, the market faces certain restraints, primarily the fluctuating cost of raw materials (steel, aluminum) and potential supply chain disruptions. Furthermore, the increasing availability of affordable alternative furniture moving solutions, such as professional moving services, could partially constrain market growth. Nevertheless, the overall growth trajectory remains positive, with a significant opportunity for companies to innovate and expand their market share by focusing on environmentally friendly materials, ergonomic designs, and enhanced durability. Increased adoption of online sales channels presents a strong opportunity for direct-to-consumer sales of furniture dollies, further expanding the market.
Attribution-ShareAlike 3.0 (CC BY-SA 3.0)https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
Dataset Card for "databricks-dolly-15k-curated-multilingual"
A curated and multilingual version of the Databricks Dolly instructions dataset. It includes a programmatically and manually corrected version of the original en dataset. See below. STATUS: Currently, the original Dolly v2 English version has been curated combining automatic processing and collaborative human curation using Argilla (~400 records have been manually edited and fixed). The following graph shows a summary… See the full description on the dataset page: https://huggingface.co/datasets/argilla/databricks-dolly-15k-curated-multilingual.
https://dataintelo.com/privacy-and-policyhttps://dataintelo.com/privacy-and-policy
The global camera dolly market size was estimated to be approximately USD 850 million in 2023 and is projected to reach USD 1.35 billion by 2032, registering a compound annual growth rate (CAGR) of 5.1% over the forecast period. The consistent demand for enhanced cinematographic quality in film production and broadcasting has been a significant growth factor in this market. The ability of camera dollies to deliver superior tracking shots and smooth camera movements has driven their adoption across multiple media production sectors. The burgeoning interest in high-quality video content for online streaming platforms has also contributed to the market's robust growth trajectory.
One of the primary growth factors for the camera dolly market is the increasing demand for high-definition video content. As the consumption of video content continues to rise, there is a parallel demand for advanced filmmaking equipment to ensure superior visual quality. Camera dollies, which facilitate fluid and dynamic movement, are becoming indispensable in creating engaging and professional-grade videos. This is particularly evident in the film and television industries, where directors and producers prioritize technical precision and cinematic flair. Additionally, the proliferation of streaming services and their need for original content have amplified the demand for camera dollies, as production companies strive to meet viewer expectations for quality and creativity.
Technological advancements and innovations in camera dolly systems are also fueling market growth. The transition from manual to motorized and hybrid dollies has enhanced the versatility and functionality of these devices, enabling filmmakers to execute complex shots with ease. Innovations such as remote-controlled dollies and programmable movement sequences have expanded the creative possibilities for camera operators, allowing them to achieve previously unattainable shots. Furthermore, the integration of digital technologies has led to improved user interfaces and enhanced compatibility with modern camera systems, making camera dollies more accessible and user-friendly for a broader range of users, from independent filmmakers to large production studios.
The rise of independent filmmaking and the democratization of video production have further propelled the camera dolly market. With the advent of affordable digital cameras and editing software, there has been a surge in independent content creators and filmmakers who seek professional-grade equipment to produce high-quality videos on a budget. Camera dollies are increasingly being adopted by this segment due to their ability to add professional polish to video productions without necessitating a significant financial investment. Educational institutions, recognizing the value of practical, hands-on learning, are also investing in camera dollies to equip students with the skills needed for careers in media production.
Regionally, North America and Europe are expected to maintain their dominance in the camera dolly market, driven by the presence of major production houses and a high concentration of film and television studios. The increasing number of film festivals and media events in these regions further supports market growth. Meanwhile, the Asia Pacific region is anticipated to witness substantial growth due to the rising popularity of regional cinema and the expansion of media production capabilities. The growth of digital platforms and the increasing penetration of internet services in countries like India and China are creating opportunities for content creation that relies on advanced cinematographic equipment like camera dollies.
The camera dolly market is segmented by product type into manual dolly, motorized dolly, and hybrid dolly. Manual dollies, which rely on physical push or pull mechanisms, have been a staple in the industry for many years. These dollies are favored for their simplicity and reliability, requiring minimal technical expertise to operate. They are a preferred choice for specific shooting scenarios where precise control over movement speed is needed without the interference of electronic components. The manual dolly segment continues to hold significant market share, particularly among traditionalists in the film industry who appreciate the tactile feedback and control these devices offer.
Motorized dollies represent a significant advancement in camera mobility technology, offering automated control over movement paths and speeds. These devices are particularly popular in professional
Attribution-ShareAlike 3.0 (CC BY-SA 3.0)https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
Korean translation of databricks-dolly-15k via the DeepL API Note: There are cases where multilingual data has been converted to monolingual data during batch translation to Korean using the API. Below is databricks-dolly-15k's README.
Summary
databricks-dolly-15k is an open source dataset of instruction-following records generated by thousands of Databricks employees in several of the behavioral categories outlined in the InstructGPT paper, including brainstorming, classification… See the full description on the dataset page: https://huggingface.co/datasets/nlpai-lab/databricks-dolly-15k-ko.
https://www.marketreportanalytics.com/privacy-policyhttps://www.marketreportanalytics.com/privacy-policy
The global push cart dolly market, valued at $710 million in 2025, is projected to experience steady growth, driven by increasing e-commerce activities and the subsequent rise in last-mile delivery services. The 2.8% CAGR indicates a consistent, albeit moderate, expansion over the forecast period (2025-2033). Key growth drivers include the rising demand for efficient material handling solutions across diverse sectors like retail, logistics, and manufacturing. Lightweight models are gaining popularity due to ease of maneuverability and reduced strain on workers, while heavy-duty options cater to industries requiring robust load-bearing capabilities. The market segmentation by application (personal and commercial use) and type (lightweight and heavy-duty) reflects the diverse needs of various end-users. The commercial segment is expected to dominate, propelled by the aforementioned surge in e-commerce and the need for efficient warehouse operations. Geographic distribution shows a relatively balanced market share across North America, Europe, and Asia Pacific, with potential for further expansion in emerging economies. Competition is intense, with both established players like Harper Trucks and Magliner alongside several regional manufacturers vying for market share. Future growth will likely depend on technological advancements (e.g., improved wheel designs, ergonomic enhancements), sustainable material usage, and the ability to cater to the evolving demands of a rapidly changing logistics landscape. Pricing strategies and effective distribution channels also play a crucial role in market success. The competitive landscape is characterized by a mix of established international brands and regional manufacturers. Innovation in materials, design, and manufacturing processes will be critical for companies to maintain a competitive edge. Increasing labor costs and a focus on workplace safety are likely to fuel demand for ergonomically designed push cart dollies. Further market penetration in under-served regions, coupled with targeted marketing efforts, will be essential for sustained growth. The market will continue to evolve based on the needs of various industries and consumer preferences; factors such as sustainability initiatives and evolving regulations will influence both product development and market dynamics. Companies will need to adapt quickly to meet these demands and remain competitive within the push cart dolly market.
Attribution-ShareAlike 3.0 (CC BY-SA 3.0)https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
Dataset Card for "dolly_hhrlhf"
This dataset is a combination of Databrick's dolly-15k dataset and a filtered subset of Anthropic's HH-RLHF. It also includes a test split, which was missing in the original dolly set. That test set is composed of 200 randomly selected samples from dolly + 4,929 of the test set samples from HH-RLHF which made it through the filtering process. The train set contains 59,310 samples; 15,014 - 200 = 14,814 from Dolly, and the remaining 44,496 from… See the full description on the dataset page: https://huggingface.co/datasets/mosaicml/dolly_hhrlhf.
https://www.marketreportanalytics.com/privacy-policyhttps://www.marketreportanalytics.com/privacy-policy
The global rail dolly market is experiencing robust growth, driven by increasing investments in railway infrastructure modernization and expansion across various regions. The market's expansion is fueled by the rising demand for efficient and safe material handling solutions within railway yards and maintenance facilities. Technological advancements, such as the integration of automated guidance systems and improved load-bearing capacities in rail dollies, are further contributing to market expansion. While precise market sizing data isn't provided, considering the involvement of numerous players and the continuous growth in railway infrastructure projects globally, a conservative estimate places the 2025 market size at approximately $500 million. Assuming a moderate CAGR of 5% (a reasonable estimate given the steady but not explosive growth likely in this niche market), the market is projected to reach approximately $650 million by 2033. Key segments within the market include heavy-duty rail dollies for freight transport, lighter-duty models for passenger rail maintenance, and specialized dollies catering to unique needs like those found in high-speed rail networks. Competition among established players like The Nolan Company, EGRIPMENT BV, and others, alongside emerging regional manufacturers, is fostering innovation and driving down costs, making rail dollies increasingly accessible to a wider range of clients. Geographic expansion presents significant opportunities, particularly in developing economies with burgeoning railway networks. However, factors like the high initial investment costs associated with adopting new technologies and the cyclical nature of railway infrastructure projects pose potential restraints to market growth. Nevertheless, the long-term outlook for the rail dolly market remains positive, driven by sustained investments in railway infrastructure and ongoing technological advancements aimed at enhancing efficiency and safety within the railway sector. This makes it a promising sector for continued investment and innovation.
This dataset provides information about the number of properties, residents, and average property values for Dolly Street cross streets in South Dennis, MA.
This dataset provides information about the number of properties, residents, and average property values for Dolly Drive cross streets in Manteno, IL.
This dataset provides information about the number of properties, residents, and average property values for Dolly Drive cross streets in La Plata, MD.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about book subjects. It has 2 rows and is filtered where the books is Dolly : close up/up close. It features 10 columns including number of authors, number of books, earliest publication date, and latest publication date.
This dataset provides information about the number of properties, residents, and average property values for Dolly Drive cross streets in Beacon Falls, CT.
Region(s) of distribution of Dolly Varden (Salvelinus malma) (Walbaum, 1792) in the Arctic as digitized for U.S. Geological Survey Scientific Investigations Report 2016-5038. For details on the project and purpose, see the report at https://doi.org/10.3133/sir20165038. Complete metadata for the collection of species datasets is in the metadata document "Dataset_for_Alaska_Marine_Fish_Ecology_Catalog.xml" at https://doi.org/10.5066/F7M61HD7. Source(s) for this digitized data layer are listed in the metadata Process Steps section. Note that the original source may show an extended area; some datasets were limited to the published map boundary. Distributions of marine fishes are shown in adjacent Arctic seas where reliable data are available. The data were clipped to show only the marine distribution areas although some species also may have an inland presence.
Attribution-ShareAlike 3.0 (CC BY-SA 3.0)https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
Summary
databricks-dolly-15k is an open source dataset of instruction-following records generated by thousands of Databricks employees in several of the behavioral categories outlined in the InstructGPT paper, including brainstorming, classification, closed QA, generation, information extraction, open QA, and summarization. This dataset can be used for any purpose, whether academic or commercial, under the terms of the Creative Commons Attribution-ShareAlike 3.0 Unported… See the full description on the dataset page: https://huggingface.co/datasets/databricks/databricks-dolly-15k.