To get high-quality singers:
First, create a Google Sheet and name it Project 3. Create 23 sheets in it, named 1992 through 2014. Go to the website, copy the link, and use the IMPORTHTML function to import the data into each of the 23 sheets. Create a sheet named merged data, then copy the data from the second row onward in all 23 sheets and paste it there. Name the columns Rank, Artist, Title, and Year; this gives 2,300 rows. Now create a new Google Sheet named prolific-1. Use the UNIQUE function to list the unique artists and the COUNTIF function to get each artist's frequency, sort the result in descending order, and plot a bar chart. The first ranking uses frequency; the second uses score. Add a Score column to merged data and compute each score as 101 - Rank. Create a Google Sheet named prolific-2 from the Artist and Score columns. Use UNIQUE again to list the artists and ARRAYFORMULA to compute each artist's score. Sort the data and plot a bar chart.
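The two rankings described above (by frequency, then by score) can be sketched outside the spreadsheet as well. This is a minimal illustration, not the original workflow: the sample rows and artist names are hypothetical, and it assumes each artist's score is the sum of 101 - Rank over that artist's entries.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows in the merged-data layout: (Rank, Artist, Title, Year).
rows = [
    (1, "Artist A", "Song 1", 1992),
    (2, "Artist B", "Song 2", 1992),
    (100, "Artist A", "Song 3", 1993),
    (50, "Artist C", "Song 4", 1993),
]

# prolific-1: how often each artist appears (the UNIQUE + COUNTIF step).
frequency = Counter(artist for _, artist, _, _ in rows)

# prolific-2: total score per artist, where each entry scores 101 - Rank.
# Summing per artist is an assumption about what the score column aggregates.
scores = defaultdict(int)
for rank, artist, _, _ in rows:
    scores[artist] += 101 - rank

# Sort both rankings in descending order, ready for a bar chart.
by_frequency = frequency.most_common()
by_score = sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

With 101 - Rank, a first-place entry contributes 100 points and a hundredth-place entry contributes 1, so the score ranking rewards chart position as well as the number of appearances.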
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Motivated by the recent increase in bank mergers, this paper examines the performance of German cooperative banks that merged between 2014 and 2019. We are particularly interested in whether elevated merger rates are due to bank inefficiencies or to challenging policy measures such as low-for-long interest rates. The results indicate that banks that perform relatively worse before and during the low interest environment exhibit a greater probability of becoming a target during this period. Consolidation generally occurs among low performing banks where large and well-capitalized banks merge with their small and inefficient peers. Ultimately, our results attribute the increased number of mergers to inefficiencies in the banking industry, as banks that exited the market were inefficient prior to the adverse low interest rate environment.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Merging Raw, Empty, 0.5 Synthetic 2 is a dataset for instance segmentation tasks - it contains Stickers annotations for 1,964 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains simulations of a model of two human drivers interacting in a top-down-view merging scenario. This merging scenario is a simplified version of highway merging. In this scenario, two vehicles approach a pre-defined merge point. In a previous experiment, we asked two participants to stick to their initial velocity yet avoid collisions. The data in this dataset contains model simulations that describe this human interactive behaviour during driving.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We present the data used in "DeepMerge II: Building Robust Deep Learning Algorithms for Merging Galaxy Identification Across Domains". In this paper, we test domain adaptation techniques, such as Maximum Mean Discrepancy (MMD) and adversarial training with Domain Adversarial Neural Networks (DANNs) for cross-domain studies of merging galaxies. Domain adaptation is performed between two simulated datasets of various levels of observational realism (simulation-to-simulation experiments), and between simulated data and observed telescope images (simulation-to-real experiments). For more details about the datasets please see the paper mentioned above.
Simulation-to-Simulation Experiments
Data used to study distant merging galaxies using simulated images from the Illustris-1 cosmological simulation at redshift z=2. The images are 75x75 pixels with three filters applied that mimic Hubble Space Telescope (HST) observations (ACS F814W, NC F356W, WFC3 F160W), with an added point-spread function (PSF) and with or without observational noise.
Source Domain
Images: SimSim_SOURCE_X_Illustris2_pristine.npy
Labels: SimSim_SOURCE_y_Illustris2_pristine.npy
Target Domain
Images: SimSim_TARGET_X_Illustris2_noisy.npy
Labels: SimSim_TARGET_y_Illustris2_noisy.npy
Simulation-to-Real Experiments
Data used to study nearby merging galaxies using simulated Illustris-1 images at redshift z=0 and observed Sloan Digital Sky Survey (SDSS) images from the Galaxy Zoo project. All images have three filters. SDSS images have (g,r,i) filters, while simulated Illustris images also mimic the same three SDSS filters with added effects of dust, PSF and observational noise.
Source Domain
Images: SimReal_SOURCE_X_Illustris0.npy
Labels: SimReal_SOURCE_y_Illustris0.npy
Target Domain
Images: SimReal_TARGET_X_postmergers_SDSS.npy
Labels: SimReal_TARGET_y_postmergers_SDSS.npy
This dataset contains preprocessed audio samples from three major voice spoof detection benchmarks (ASVSpoof2021, Fake-or-Real, DEEP-VOICE), standardized for machine learning applications. All files have been converted to a uniform format and segmented for direct use in AI model training.
This dataset combines preprocessed samples from ASVSpoof2021 (non-commercial), Fake-or-Real (CC-BY-NC-SA), and DEEP-VOICE (Apache 2.0). The derivative work is licensed under CC-BY-NC-SA 4.0 for research use only.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Merge V1 is a dataset for object detection tasks - it contains Merging annotations for 5,174 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This file contains Table S1-Table S3. Table S1, Diagnosis code classification for influenza-like-illness syndrome group analysis. Table S2, Diagnosis code classification for gastrointestinal syndrome group analysis. Table S3, Counts of Core-Based Statistical Areas (CBSAs) for Veterans Affairs (VA) and Department of Defense (DoD) medical facilities for three population scales. A. Distribution of CBSAs with VA and DoD facilities by population density. B. Comparison of number of patient visits between VA and DoD in each of the population scales by CBSAs which have both systems. (DOCX)
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The sample SAS and Stata code provided here is intended for use with certain datasets in the National Neighborhood Data Archive (NaNDA). NaNDA (https://www.openicpsr.org/openicpsr/nanda) contains some datasets that measure neighborhood context at the ZIP Code Tabulation Area (ZCTA) level. They are intended for use with survey or other individual-level data containing ZIP codes. Because ZIP codes do not exactly match ZIP code tabulation areas, a crosswalk is required to use ZIP-code-level geocoded datasets with ZCTA-level datasets from NaNDA. A ZIP-code-to-ZCTA crosswalk was previously available on the UDS Mapper website, which is no longer active. An archived copy of the ZIP-code-to-ZCTA crosswalk file has been included here. Sample SAS and Stata code are provided for merging the UDS mapper crosswalk with NaNDA datasets.
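The crosswalk step described above amounts to a two-stage lookup: ZIP code to ZCTA, then ZCTA to neighborhood measures. The sketch below illustrates that idea in Python rather than the SAS or Stata the archive provides. All field names (`zip`, `zcta`, `parks_count`) and all values are hypothetical placeholders, not the actual NaNDA or UDS Mapper variable names.

```python
def attach_zcta_context(respondents, zip_to_zcta, zcta_context):
    """Attach ZCTA-level measures to individual records keyed by ZIP code.

    Records whose ZIP code is missing from the crosswalk keep zcta=None
    and receive no context columns (a left join).
    """
    merged = []
    for person in respondents:
        zcta = zip_to_zcta.get(person["zip"])
        context = zcta_context.get(zcta, {})
        merged.append({**person, "zcta": zcta, **context})
    return merged


# Hypothetical crosswalk: two ZIP codes that map to the same ZCTA.
zip_to_zcta = {"10001": "10001", "10002": "10001"}

# Hypothetical ZCTA-level dataset with one made-up measure.
zcta_context = {"10001": {"parks_count": 3}}

# Hypothetical individual-level survey records containing ZIP codes.
respondents = [
    {"id": 1, "zip": "10002"},
    {"id": 2, "zip": "99999"},  # not in the crosswalk
]

merged = attach_zcta_context(respondents, zip_to_zcta, zcta_context)
```

Keeping unmatched records with an empty ZCTA, rather than dropping them, makes it easy to count how many ZIP codes the crosswalk failed to cover.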
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This resource introduces a merged dataset integrating the Convective Triggering Potential (CTP) and Humidity Index (HI) from three established reanalysis products: the Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA2), the Climate Forecast System Reanalysis (CFSR), and the European Centre for Medium-Range Weather Forecasts Reanalysis v5 (ERA5). The dataset, built with the Triple Collocation (TC) method, addresses the challenges posed by using single-source reanalysis data and offers a more reliable representation of atmospheric conditions. It mitigates biases associated with individual datasets and compensates for the shortcomings of satellite-derived estimates, such as missing observations and lower vertical resolution. Verification against in-situ measurements from the Integrated Global Radiosonde Archive version 2 (IGRA2) and satellite observations from the Atmospheric Infrared Sounder version 7 (AIRSv7) supports its reliability for meteorological research. The dataset provides a tool for analyzing atmospheric stability and humidity, with potential implications for weather prediction and climate research.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was recorded in a top-down-view driving simulator where two participants solved a merging conflict. This merging scenario is a simplified version of highway merging. In this scenario, two vehicles approach a pre-defined merge point. Two participants were asked to stick to their initial velocity, yet avoid collisions. The data was (and can be) used to analyze human interaction behavior during driving.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is for the reproduction of ConGra, which is designed to evaluate code-merging tools across a diverse range of merging scenarios and assess their ability to resolve conflicts of varying complexities.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The Lane Merging and Fork Area Segmentation Dataset specifically addresses the complexities of lane merging and forking, critical scenarios in autonomous driving. This dataset, consisting of driving recorder images, is annotated for binary segmentation, focusing on areas where lanes merge or branch off. It includes detailed labels for lane merging areas, lane fork areas (marked by triangular inverted lines), and potential obstructions such as vehicles, trees, road signs, and pedestrians. This dataset is a vital tool for training AI models to navigate these challenging road situations, ensuring smoother and safer autonomous driving experiences.
https://www.gesis.org/en/institute/data-usage-terms
We collected data on almost the complete population of merger control decisions by the European Commission’s Directorate-General for Competition (DG COMP). We started the data collection with the first year of common European merger control, 1990, and included all years up to 2014. This amounts to 25 years of data on European merger control. With regard to the scope of the decisions, we collected data in all cases where a legal decision document exists. This includes all cases settled in the first phase of an investigation (Art. 6(1)(a), 6(1)(b), 6(1)(c) and 6(2)) and all cases decided in the second phase of an investigation (Art. 8(1), 8(2), and 8(3)). Note that this also includes all cases settled under a ‘simplified procedure’, provided that a legal decision document exists. Furthermore, we also intended to collect data on cases that were either referred back to member states by DG COMP or aborted by the merging parties. While we have collected some data on such cases, data on these cases is not always available. Therefore, we cannot guarantee that the final dataset covers all of these cases. The level of observation is not a particular merger case but a particular product/geographic market combination concerned by a merger. In total, the final dataset contains 5,196 DG COMP merger decisions. For each of these decisions, we record a number of observations equal to the number of product/geographic markets identified in the specific transaction. Hence, the total dataset contains 31,451 observations.
Mosaic format for combining all datasets to train a Malaysian LLM
This repository stores dataset shards in mosaic format.
Prepared at https://github.com/malaysia-ai/dedup-text-dataset/blob/main/pretrain-llm/combine-all.ipynb using the tokenizer https://huggingface.co/malaysia-ai/bpe-tokenizer with a 4096 context length.
how-to
git clone,
git lfs clone https://huggingface.co/datasets/malaysia-ai/mosaic-combine-all
load it,
from streaming import LocalDataset import numpy… See the full description on the dataset page: https://huggingface.co/datasets/malaysia-ai/mosaic-combine-all.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data associated with the paper: G. Roncoroni, E. Forte, I. Santin, M. Pipan (2023), Deep Learning based multi-frequency GPR data merging, Geophysics, DOI: Data are associated with the code presented in https://github.com/Giacomo-Roncoroni/merging_GPR/
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data used by scripts applied for analyzing merging tools.
https://www.datainsightsmarket.com/privacy-policy
The global PDF merge software market is experiencing robust growth, driven by the increasing adoption of digital documents across personal and enterprise sectors. The market's expansion is fueled by the rising need for efficient document management, collaboration tools, and streamlined workflows. Businesses of all sizes rely on PDF merging for tasks ranging from creating comprehensive reports and presentations to consolidating contracts and legal documents. The seamless integration of PDF merge functionality within broader productivity suites and cloud-based platforms further enhances market appeal. While the precise market size in 2025 is unavailable, considering a conservative estimate for a market with a projected CAGR (let's assume 10% for illustration purposes, though this would need verification against actual data), and given the prevalence of PDF usage, a reasonable estimate for the 2025 market size could be in the range of $500 million USD. This figure accounts for both consumer and enterprise segments, with the latter likely commanding a larger share due to higher software spending. The market is segmented by operating system (iOS and Android) and application type (personal and enterprise). The enterprise segment is projected to grow faster due to increased demand for advanced features like security and integration with enterprise resource planning (ERP) systems. The prevalence of mobile devices and cloud-based services is pushing the adoption of mobile-friendly PDF merge solutions. Key restraints include the availability of free or low-cost alternatives, along with the learning curve associated with some advanced software. However, the overall market trajectory indicates sustained growth, with the increasing complexity of document management and the growing preference for digital workflows fueling demand for sophisticated PDF merging tools. 
Competition is fierce, with established players like Adobe and newer entrants constantly innovating to capture market share. The continued rise in remote work and digital transformation initiatives across industries will significantly impact future market growth.
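The description above pairs an illustrative 10% CAGR with an estimated 2025 market size of about $500 million. Purely to show what those two figures imply together, the sketch below compounds the base value year by year. Both inputs are the description's own illustrative assumptions, not verified market data, and the function name is hypothetical.

```python
def project_market_size(base_musd, cagr, years):
    """Return the compounded size for year 0 through `years`, in million USD.

    Year n is base * (1 + cagr) ** n, the standard compound-growth formula.
    """
    return [base_musd * (1 + cagr) ** n for n in range(years + 1)]


# Illustrative inputs taken from the description: $500M base, 10% CAGR.
projection = project_market_size(500.0, 0.10, 3)

for offset, size in enumerate(projection):
    print(f"2025+{offset}: {size:.1f} million USD")
```

Because the growth compounds, each year's increase is larger in absolute terms than the previous one even though the rate stays fixed.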
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
CTIMG MERGE is a dataset for object detection tasks - it contains Sj annotations for 598 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The Lane Merging and Fork Area Segmentation Dataset specifically addresses lane merging and forking, which are critical scenarios in autonomous driving. The dataset, consisting of driving recorder images, is annotated for binary segmentation, focusing on areas where lanes merge or branch off. It includes detailed labels for lane merging areas, lane fork areas (marked by triangular lines), and potential obstructions such as vehicles, trees, road signs, and pedestrians. The dataset is an important tool for training AI models to navigate these difficult road situations, supporting smooth and safe driving experiences.