Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Semantic Artist Similarity dataset consists of two datasets of artist entities with their corresponding biography texts, and the list of top-10 most similar artists within the datasets used as ground truth. The dataset is composed of a corpus of 268 artists and a larger one of 2,336 artists, both gathered from Last.fm in March 2015. The former is mapped to the MIREX Audio and Music Similarity evaluation dataset, so that its similarity judgments can be used as ground truth. For the latter corpus we use the similarity between artists as provided by the Last.fm API. For every artist there is a list with the top-10 most related artists. In the MIREX dataset there are 188 artists with at least 10 similar artists; the other 80 artists have fewer than 10 similar artists. In the Last.fm API dataset all artists have a list of 10 similar artists.

There are 4 files in the dataset.

mirex_gold_top10.txt and lastfmapi_gold_top10.txt have the top-10 lists of artists for every artist of both datasets. Artists are identified by MusicBrainz ID. The format of the file is one line per artist, with the artist mbid separated by a tab from the list of top-10 related artists, which are identified by their mbid and separated by spaces.

artist_mbid \t artist_mbid_top10_list_separated_by_spaces

mb2uri_mirex and mb2uri_lastfmapi.txt have the list of artists. In each line there are three fields separated by tabs. The first field is the MusicBrainz ID, the second field is the Last.fm name of the artist, and the third field is the DBpedia URI.

artist_mbid \t lastfm_name \t dbpedia_uri

There are also 2 folders in the dataset with the biography texts of each dataset. Each .txt file in the biography folders is named with the MusicBrainz ID of the artist the biography describes. Biographies were gathered from the Last.fm wiki page of every artist.

Using this dataset

We would highly appreciate it if scientific publications of works partly based on the Semantic Artist Similarity dataset cite the following publication:

Oramas, S., Sordo M., Espinosa-Anke L., & Serra X. (In Press). A Semantic-based Approach for Artist Similarity. 16th International Society for Music Information Retrieval Conference.

We are interested in knowing if you find our datasets useful! If you use our dataset please email us at mtg-info@upf.edu and tell us about your research. https://www.upf.edu/web/mtg/semantic-similarity
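The two file layouts described above (tab-separated fields, with a space-separated list of related artists in the gold files) can be read with a few lines of Python. This is a minimal sketch, not code shipped with the dataset: the function names `parse_gold_top10` and `parse_mb2uri` and the demo identifiers are made up for illustration, and UTF-8 encoding is an assumption.

```python
import tempfile
from pathlib import Path


def parse_gold_top10(path):
    """Read a *_gold_top10.txt file into {artist_mbid: [related_mbid, ...]}."""
    similar = {}
    with open(path, encoding="utf-8") as handle:  # encoding is an assumption
        for line in handle:
            line = line.rstrip("\n")
            if not line:
                continue
            # artist_mbid <TAB> space-separated list of related artist mbids
            artist_mbid, _, related = line.partition("\t")
            similar[artist_mbid] = related.split()
    return similar


def parse_mb2uri(path):
    """Read a mb2uri file into {artist_mbid: (lastfm_name, dbpedia_uri)}."""
    artists = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) != 3:
                continue  # skip malformed lines rather than guess their meaning
            mbid, name, uri = fields
            artists[mbid] = (name, uri)
    return artists


# Demo with made-up identifiers (real files use MusicBrainz IDs).
demo_dir = Path(tempfile.mkdtemp())
gold_file = demo_dir / "demo_gold_top10.txt"
gold_file.write_text("mbid-a\tmbid-b mbid-c\nmbid-b\tmbid-a\n", encoding="utf-8")
uri_file = demo_dir / "demo_mb2uri.txt"
uri_file.write_text("mbid-a\tArtist A\thttp://example.org/Artist_A\n", encoding="utf-8")

top10 = parse_gold_top10(gold_file)
names = parse_mb2uri(uri_file)
print(top10["mbid-a"], names["mbid-a"][0])
```

Artists in the MIREX file with fewer than 10 similar artists simply yield shorter lists, so downstream evaluation code should not assume a fixed list length.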
In the 2024 financial year, the airline SAS Scandinavian Airlines generated an operating loss of **** billion Swedish kronor. This was a smaller loss than the previous year's figure of *** billion Swedish kronor.
This database is the Third Small Astronomy Satellite (SAS-3) Y-Axis Pointed Observation Log. It identifies possible pointed observations of celestial X-ray sources which were performed with the y-axis detectors of the SAS-3 X-Ray Observatory. This log was compiled (by R. Kelley, P. Goetz and L. Petro) from notes made at the time of the observations and it is expected that it is neither complete nor fully accurate. Possible errors in the log are (i) the misclassification of an observation as a pointed observation when it was either a spinning or dither observation and (ii) inaccuracy of the dates and times of the start and end of an observation. In addition, as described in the HEASARC_Updates section, the HEASARC added some additional information when creating this database. Further information about the SAS-3 detectors and their fields of view can be found at: http://heasarc.gsfc.nasa.gov/docs/sas3/sas3_about.html Disclaimer: The HEASARC is aware of certain inconsistencies between the Start_date, End_date, and Duration fields for a number of rows in this database table. They appear to be errors present in the original table. Except for one entry where the HEASARC corrected an error where there was a near-certainty which parameter was incorrect (as noted in the 'HEASARC_Updates' section of this documentation), these inconsistencies have been left as they were in the original table. This database table was released by the HEASARC in June 2000, based on the SAS-3 Y-Axis pointed Observation Log (available from the NSSDC as dataset ID 75-037A-02B), together with some additional information provided by the HEASARC itself. This is a service provided by NASA HEASARC.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The credit report of Last Mile Sas contains unique and detailed export-import market intelligence, including its phone, email, and LinkedIn contact details, and details of each import and export shipment such as product, quantity, price, buyer and supplier names, country, and date of shipment.
This SAS code extracts data from EU-SILC User Database (UDB) longitudinal files and edits it to produce a file that can be used for differential mortality analyses. Information from the original D, R, H and P files is merged per person and possibly pooled over several longitudinal data releases. Vital status information is extracted from target variables DB110 and RB110, and time at risk between the first interview and either death or censoring is estimated based on quarterly date information. Apart from path specifications, the SAS code consists of several SAS macros. Two of them require parameter specification from the user. The others are executed as they are. The code was written in Base SAS, Version 9.4. By default, the output file contains several variables which are necessary for differential mortality analyses, such as sex, age, country, year of first interview, and vital status information. In addition, the user may specify the analytical variables by which mortality risk should be compared later, for example educational level or occupational class. These analytical variables may be measured either at the first interview (the baseline) or at the last interview of a respondent. The output file is available in SAS format and by default also in CSV format.
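To illustrate how time at risk might be estimated from quarterly date information, here is a minimal Python sketch. It is not a translation of the SAS macros: the mid-quarter convention, the function names, and the example dates are all assumptions made for illustration.

```python
def quarter_midpoint(year, quarter):
    """Decimal year at the midpoint of a calendar quarter (1 to 4).

    Placing events at mid-quarter is an assumption of this sketch; the
    original SAS code may use a different convention.
    """
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3 or 4")
    return year + (quarter - 0.5) / 4.0


def time_at_risk(first_year, first_quarter, end_year, end_quarter):
    """Years between the first interview and death or censoring."""
    start = quarter_midpoint(first_year, first_quarter)
    end = quarter_midpoint(end_year, end_quarter)
    return max(end - start, 0.0)  # guard against inconsistent date pairs


# Hypothetical respondent: first interview in Q2 2010, censored in Q1 2013.
years = time_at_risk(2010, 2, 2013, 1)
print(years)
```

With quarterly precision, any single exposure time can be off by up to a quarter at each end, which is why the source describes the result as an estimate.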
| BASE YEAR | 2024 |
| HISTORICAL DATA | 2019 - 2023 |
| REGIONS COVERED | North America, Europe, APAC, South America, MEA |
| REPORT COVERAGE | Revenue Forecast, Competitive Landscape, Growth Factors, and Trends |
| MARKET SIZE 2024 | 3.46 (USD Billion) |
| MARKET SIZE 2025 | 3.64 (USD Billion) |
| MARKET SIZE 2035 | 6.0 (USD Billion) |
| SEGMENTS COVERED | Application, Technology, End Use, Circuit Type, Regional |
| COUNTRIES COVERED | US, Canada, Germany, UK, France, Russia, Italy, Spain, Rest of Europe, China, India, Japan, South Korea, Malaysia, Thailand, Indonesia, Rest of APAC, Brazil, Mexico, Argentina, Rest of South America, GCC, South Africa, Rest of MEA |
| KEY MARKET DYNAMICS | Technological advancements, Increasing data center demand, Rising storage solutions market, Growing adoption of cloud computing, Enhanced data transfer speeds |
| MARKET FORECAST UNITS | USD Billion |
| KEY COMPANIES PROFILED | Broadcom, Infineon Technologies, STMicroelectronics, NXP Semiconductors, Skyworks Solutions, Nordic Semiconductor, Renesas Electronics, Analog Devices, Texas Instruments, ON Semiconductor, Maxim Integrated, Microchip Technology, Cypress Semiconductor, Marvell Technology, Semtech |
| MARKET FORECAST PERIOD | 2025 - 2035 |
| KEY MARKET OPPORTUNITIES | Growing data center investments, Increasing demand for high-speed storage, Adoption of cloud computing solutions, Rising need for data redundancy systems, Advancements in storage technology. |
| COMPOUND ANNUAL GROWTH RATE (CAGR) | 5.1% (2025 - 2035) |
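
The growth rate in the table can be cross-checked against the 2025 and 2035 market sizes with the usual compound annual growth rate formula. This is a small sketch; treating 2025 to 2035 as ten compounding periods is an assumption, since the report does not state its convention.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over `years` periods."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Values from the table: USD 3.64 billion (2025) and USD 6.0 billion (2035).
rate = cagr(3.64, 6.0, 10)
print(f"{rate:.1%}")
```

Under that assumption the implied rate rounds to the 5.1% stated in the table.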
The SAS2RAW database is a log of the 28 SAS-2 observation intervals and contains target names, sky coordinates, start times and other information for all 13056 photons detected by SAS-2. The original data came from 2 sources. The photon information was obtained from the Event Encyclopedia, and the exposures were derived from the original "Orbit Attitude Live Time" (OALT) tapes stored at NASA/GSFC. These data sets were combined into FITS format images at HEASARC. The images were formed by making the center pixel of a 512 x 512 pixel image correspond to the RA and DEC given in the event file. Each photon's RA and DEC was converted to a relative pixel in the image using Aitoff projections. All the raw data from the original SAS-2 binary data files are now stored in 28 FITS files. These images can be accessed and plotted using XIMAGE, and other columns of the FITS file extensions can be plotted with the FTOOL FPLOT. This is a service provided by NASA HEASARC.
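For readers unfamiliar with the Aitoff projection mentioned above, the following Python sketch shows the standard forward formula for a longitude and latitude measured relative to the projection center. It is illustrative only: the function name is made up, and the pixel scale and the exact way HEASARC re-centered each image on the event file's RA and DEC are not given in the description, so the mapping to pixels is left out.

```python
import math


def aitoff(lon_deg, lat_deg):
    """Standard Aitoff projection.

    lon_deg is the longitude offset from the projection center in
    [-180, 180] degrees; lat_deg is the latitude in [-90, 90] degrees.
    Returns (x, y) in radians on the projection plane.
    """
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    alpha = math.acos(math.cos(phi) * math.cos(lam / 2.0))
    if alpha == 0.0:
        return 0.0, 0.0  # the projection center maps to the origin
    scale = alpha / math.sin(alpha)  # reciprocal of the unnormalized sinc
    x = 2.0 * math.cos(phi) * math.sin(lam / 2.0) * scale
    y = math.sin(phi) * scale
    return x, y


# A point on the equator, 180 degrees from the center, lands at x = pi.
edge_x, edge_y = aitoff(180.0, 0.0)
print(edge_x, edge_y)
```

Converting `(x, y)` to a pixel in a 512 x 512 image would additionally require a pixel scale and an orientation convention, neither of which is stated in the source.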
According to our latest research, the global SAS Controller market size in 2024 is valued at USD 2.68 billion, driven by the escalating demand for high-performance data storage solutions across diverse sectors. The market is set to witness robust expansion at a CAGR of 6.7% from 2025 to 2033. By the end of 2033, the SAS Controller market is forecasted to reach a valuation of USD 4.88 billion. This growth trajectory is primarily attributed to the increasing adoption of cloud computing, big data analytics, and the proliferation of enterprise applications that require reliable and scalable storage infrastructures.
The growth of the SAS Controller market is significantly influenced by the rising demand for advanced data storage technologies in enterprise environments. As organizations continue to generate and process massive volumes of data, the need for robust storage management solutions becomes paramount. SAS controllers, with their ability to offer high-speed data transfer, enhanced scalability, and superior reliability, are becoming the preferred choice over traditional storage interfaces. The rapid adoption of virtualization and cloud-based services further amplifies the need for efficient storage architectures, thereby fueling the demand for SAS controllers across various industry verticals. Moreover, the evolution of data center infrastructure and the shift towards hyper-converged systems are expected to drive sustained investments in SAS controller solutions over the coming years.
Another key growth factor for the SAS Controller market is the increasing deployment of servers and storage systems in sectors such as BFSI, healthcare, and manufacturing. These industries require seamless data access, secure storage, and high availability to support mission-critical applications. SAS controllers play a vital role in ensuring data integrity and optimizing storage performance, especially in environments where downtime can result in significant financial losses or compromise sensitive information. The growing digital transformation initiatives across both public and private sectors are creating new opportunities for SAS controller vendors to offer innovative products that cater to evolving storage requirements, including support for higher data rates and integration with hybrid storage architectures.
Technological advancements in SAS controller design, such as the integration of RAID functionalities, enhanced error correction capabilities, and support for next-generation SAS protocols, are also contributing to market growth. Vendors are focusing on developing controllers that can handle increasing data workloads while maintaining energy efficiency and minimizing latency. The emergence of NVMe and SSD-based storage solutions is prompting SAS controller manufacturers to innovate and offer products that provide seamless interoperability and future-proofing for enterprise storage environments. Additionally, the trend towards distributed and edge computing is expected to create further demand for SAS controllers that can deliver high performance in decentralized storage architectures.
From a regional perspective, North America remains the dominant market for SAS controllers, owing to the presence of major technology companies, advanced IT infrastructure, and the early adoption of innovative storage solutions. However, the Asia Pacific region is witnessing the fastest growth, driven by rapid industrialization, increasing investments in data centers, and the expansion of cloud services. Europe and Latin America are also showing steady growth, supported by digitalization initiatives in various industries. The Middle East & Africa region, although still emerging, presents significant potential as enterprises in the region ramp up their investments in IT modernization and storage infrastructure.
In the context of technological advancements, the integration of RAID-on-Chip technology within SAS controllers is gaining traction. This innovation allows for the consolidation of RAID functionalities directly onto the controller chip, enhancing performance and reducing latency. RAID-on-Chip solutions offer improved data protection and reliability, which are critical in environments that demand high availability and fault tolerance. As enterprises continue to seek ways
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
analyze the survey of consumer finances (scf) with r

the survey of consumer finances (scf) tracks the wealth of american families. every three years, more than five thousand households answer a battery of questions about income, net worth, credit card debt, pensions, mortgages, even the lease on their cars. plenty of surveys collect annual income, but only the survey of consumer finances captures such detailed asset data. responses are at the primary economic unit-level (peu) - the economically dominant, financially interdependent family members within a sampled household. norc at the university of chicago administers the data collection, but the board of governors of the federal reserve pay the bills and therefore call the shots.

if you were so brazen as to open up the microdata and run a simple weighted median, you'd get the wrong answer. the five to six thousand respondents actually gobble up twenty-five to thirty thousand records in the final public use files. why oh why? well, those tables contain not one, not two, but five records for each peu. wherever missing, these data are multiply-imputed, meaning answers to the same question for the same household might vary across implicates. each analysis must account for all that, lest your confidence intervals be too tight. to calculate the correct statistics, you'll need to break the single file into five, necessarily complicating your life. this can be accomplished with the meanit sas macro buried in the 2004 scf codebook (search for meanit - you'll need the sas iml add-on). or you might blow the dust off this website referred to in the 2010 codebook as the home of an alternative multiple imputation technique, but all i found were broken links. perhaps it's time for plan c, and by c, i mean free. read the imputation section of the latest codebook (search for imputation), then give these scripts a whirl. they've got that new r smell.

the lion's share of the respondents in the survey of consumer finances get drawn from a pretty standard sample of american dwellings - no nursing homes, no active-duty military. then there's this secondary sample of richer households to even out the statistical noise at the higher end of the income and assets spectrum. you can read more if you like, but at the end of the day the weights just generalize to civilian, non-institutional american households. one last thing before you start your engine: read everything you always wanted to know about the scf. my favorite part of that title is the word always.

this new github repository contains three scripts:

1989-2010 download all microdata.R
- initiate a function to download and import any survey of consumer finances zipped stata file (.dta)
- loop through each year specified by the user (starting at the 1989 re-vamp) to download the main, extract, and replicate weight files, then import each into r
- break the main file into five implicates (each containing one record per peu) and merge the appropriate extract data onto each implicate
- save the five implicates and replicate weights to an r data file (.rda) for rapid future loading

2010 analysis examples.R
- prepare two survey of consumer finances-flavored multiply-imputed survey analysis functions
- load the r data files (.rda) necessary to create a multiply-imputed, replicate-weighted survey design
- demonstrate how to access the properties of a multiply-imputed survey design object
- cook up some descriptive statistics and export examples, calculated with scf-centric variance quirks
- run a quick t-test and regression, but only because you asked nicely

replicate FRB SAS output.R
- reproduce each and every statistic provided by the friendly folks at the federal reserve
- create a multiply-imputed, replicate-weighted survey design object
- re-reproduce (and yes, i said/meant what i meant/said) each of those statistics, now using the multiply-imputed survey design object to highlight the statistically-theoretically-irrelevant differences

click here to view these three scripts

for more detail about the survey of consumer finances (scf), visit:
- the federal reserve board of governors' survey of consumer finances homepage
- the latest scf chartbook, to browse what's possible. (spoiler alert: everything.)
- the survey of consumer finances wikipedia entry
- the official frequently asked questions

notes:

nationally-representative statistics on the financial health, wealth, and assets of american households might not be monopolized by the survey of consumer finances, but there isn't much competition aside from the assets topical module of the survey of income and program participation (sipp). on one hand, the scf interview questions contain more detail than sipp. on the other hand, scf's smaller sample precludes analyses of acute subpopulations. and for any three-handed martians in the audience, there's also a few biases between these two data sources that you ought to consider. the survey methodologists at the federal reserve take their job...
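
the point about five implicates and too-tight confidence intervals can be made concrete with the textbook combining rules for multiply-imputed data (often called rubin's rules). the sketch below is in python rather than r, uses made-up numbers, and is not taken from the scripts described above, which may apply scf-specific variance conventions that differ from the plain formula shown here.

```python
import statistics


def combine_implicates(estimates, variances):
    """Combine per-implicate results with the standard multiple-imputation rules.

    estimates: the same statistic computed separately on each implicate
    variances: the sampling variance of that statistic on each implicate
    Returns (pooled point estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists with at least two implicates")
    point = statistics.fmean(estimates)        # average of the estimates
    within = statistics.fmean(variances)       # average sampling variance
    between = statistics.variance(estimates)   # spread across implicates (m - 1 denominator)
    total = within + (1.0 + 1.0 / m) * between
    return point, total ** 0.5


# Hypothetical statistic computed on each of five implicates.
demo_estimates = [100.0, 102.0, 98.0, 101.0, 99.0]
demo_variances = [4.0, 4.0, 4.0, 4.0, 4.0]
pooled, pooled_se = combine_implicates(demo_estimates, demo_variances)
print(pooled, pooled_se)
```

treating all implicates as one big file would ignore the between-implicate term, which is exactly how the confidence intervals end up too tight.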
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Charlie839242/SAS dataset hosted on Hugging Face and contributed by the HF Datasets community
This statistic shows the results of a survey conducted by Cint on the distribution of airlines most frequently used in the last 12 months in Turkey in 2017 and 2018. In 2017, **** percent of respondents stated that they had flown with Scandinavian Airlines (SAS) during the last 12 months.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
## Overview
Sas is a dataset for object detection tasks - it contains Sasas annotations for 2,737 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [Public Domain license](https://creativecommons.org/licenses/Public Domain).
Pregnancy is a condition of broad interest across many medical and health services research domains, but one not easily identified in healthcare claims data. Our objective was to establish an algorithm to identify pregnant women and their pregnancies in claims data. We identified pregnancy-related diagnosis, procedure, and diagnosis-related group codes, accounting for the transition to International Statistical Classification of Diseases, 10th Revision, Clinical Modification (ICD-10-CM) diagnosis and procedure codes, in health encounter reporting on 10/1/2015. We selected women in Merative MarketScan commercial databases aged 15–49 years with pregnancy-related claims, and their infants, during 2008–2019. Pregnancies, pregnancy outcomes, and gestational ages were assigned using the constellation of service dates, code types, pregnancy outcomes, and linkage to infant records. We describe pregnancy outcomes and gestational ages, as well as maternal age, census region, and health plan type. In a sensitivity analysis, we compared our algorithm-assigned date of last menstrual period (LMP) to fertility procedure-based LMP (date of procedure + 14 days) among women with embryo transfer or insemination procedures. Among 5,812,699 identified pregnancies, most (77.9%) were livebirths, followed by spontaneous abortions (16.2%); 3,274,353 (72.2%) livebirths could be linked to infants. Most pregnancies were among women 25–34 years (59.1%), living in the South (39.1%) and Midwest (22.4%), with large employer-sponsored insurance (52.0%). Outcome distributions were similar across ICD-9 and ICD-10 eras, with some variation in gestational age distribution observed. Sensitivity analyses supported our algorithm’s framework; algorithm- and fertility procedure-derived LMP estimates were within a week of each other (mean difference: -4 days [IQR: -13 to 6 days]; n = 107,870).
We have developed an algorithm to identify pregnancies, their gestational age, and outcomes, across ICD-9 and ICD-10 eras using administrative data. This algorithm may be useful to reproductive health researchers investigating a broad range of pregnancy and infant outcomes.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This SAS macro generates childhood mortality estimates (neonatal, post-neonatal, infant (1q0), child (4q1) and under-five (5q0) mortality) and standard errors based on birth histories reported by women during a household survey. We have made the SAS macro flexible enough to accommodate a range of calculation specifications including multi-stage sampling frames, and simple random samples or censuses. Childhood mortality rates are the component death probabilities of dying before a specific age. This SAS macro is based on a macro built by Keith Purvis at MeasureDHS. His method is described in Estimating Sampling Errors of Means, Total Fertility, and Childhood Mortality Rates Using SAS (www.measuredhs.com/pubs/pdf/OD17/OD17.pdf, section 4). More information about Childhood Mortality Estimation can also be found in the Guide to DHS Statistics (www.measuredhs.com/pubs/pdf/DHSG1/Guide_DHS_Statistics.pdf, page 93). We allow the user to specify whether childhood mortality calculations should be based on 5 or 10 years of birth histories, when the birth history window ends, and how to handle age at death when it is reported in whole months (rather than days). The user can also calculate mortality rates within sub-populations, and take account of a complex survey design (unequal probability and cluster samples). Finally, this SAS program is designed to read data in a number of different formats.
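The idea of building mortality rates from component death probabilities can be sketched in a few lines of Python. This is an illustration, not a port of the SAS macro: the age segments follow the synthetic cohort approach as commonly described for DHS surveys (an assumption here, to be checked against the Guide to DHS Statistics), the function names are made up, and the probabilities are invented.

```python
from math import prod

# Age segments in completed months, assumed from the DHS synthetic cohort method.
SEGMENTS = ["0", "1-2", "3-5", "6-11", "12-23", "24-35", "36-47", "48-59"]


def combine(probabilities):
    """Probability of dying across consecutive age segments.

    Survival through all segments is the product of the per-segment
    survival probabilities, so mortality is one minus that product.
    """
    return 1.0 - prod(1.0 - q for q in probabilities)


def mortality_rates(q):
    """Derive the five summary rates from eight component probabilities."""
    if len(q) != len(SEGMENTS):
        raise ValueError("expected one probability per age segment")
    neonatal = q[0]
    infant = combine(q[:4])   # 1q0: birth to 12 months
    child = combine(q[4:])    # 4q1: 12 to 60 months, given survival to 12 months
    under5 = combine(q)       # 5q0: birth to 60 months
    return {
        "neonatal": neonatal,
        "post_neonatal": infant - neonatal,
        "infant_1q0": infant,
        "child_4q1": child,
        "under5_5q0": under5,
    }


# Invented component probabilities, for illustration only.
demo_q = [0.030, 0.008, 0.006, 0.010, 0.012, 0.006, 0.004, 0.003]
rates = mortality_rates(demo_q)
print({name: round(value * 1000, 1) for name, value in rates.items()})
```

Standard errors, survey weights, and the handling of ages reported in whole months are where most of the macro's real work lies, and none of that is covered by this sketch.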
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
One of four datasets to replicate numbers for tables and figures in the article "Mammography screening: eliciting the voices of informed citizens" by Manja D. Jensen, Kasper M. Hansen, Volkert Siersma, and John Brodersen
According to our latest research, the global SAS Switch market size reached USD 1.42 billion in 2024, reflecting a robust industry presence. The market is projected to expand at a CAGR of 7.1% from 2025 to 2033, reaching a forecasted value of USD 2.66 billion by 2033. This growth is primarily driven by the escalating demand for high-performance storage solutions across data-intensive sectors such as cloud computing, enterprise storage, and industrial automation. As organizations continue to transition toward digital transformation and data-centric operations, the adoption of SAS Switches is witnessing significant momentum worldwide.
The primary growth factor for the SAS Switch market is the exponential surge in data generation and storage requirements across enterprises. With the proliferation of big data analytics, artificial intelligence, and machine learning applications, businesses are increasingly relying on robust storage area networks (SANs) to ensure fast, reliable, and secure data access. SAS Switches, known for their high-speed connectivity and scalability, are becoming indispensable in modern data center architectures. Moreover, the growing adoption of hybrid and multi-cloud environments is compelling organizations to invest in advanced storage solutions that can seamlessly integrate with diverse IT infrastructures, further propelling the demand for SAS Switches.
Another critical driver is the increasing focus on business continuity and disaster recovery strategies. Enterprises are prioritizing data protection, backup, and recovery capabilities to safeguard against potential cyber threats, hardware failures, and natural disasters. SAS Switches play a pivotal role in enabling efficient data replication, redundancy, and failover mechanisms, which are essential for ensuring uninterrupted business operations. Additionally, the rise of IoT-enabled industrial automation and smart manufacturing is creating new avenues for SAS Switch deployment, as these environments require high-speed, low-latency data transfer between connected devices and storage systems.
Technological advancements in SAS Switch design, including the integration of intelligent management features and enhanced interoperability, are further contributing to market expansion. Vendors are focusing on developing multi-protocol switches that support both SAS and SATA devices, offering greater flexibility and investment protection for end-users. The emergence of NVMe over Fabrics and the shift towards all-flash storage arrays are also influencing the evolution of SAS Switches, as enterprises seek to maximize performance and minimize latency in their storage networks. These innovations are expected to drive sustained growth and competitive differentiation in the global SAS Switch market over the forecast period.
From a regional perspective, North America continues to dominate the SAS Switch market, accounting for the largest revenue share in 2024, followed closely by Asia Pacific and Europe. The presence of major technology vendors, advanced IT infrastructure, and a strong focus on digital transformation initiatives are key factors supporting market growth in these regions. Meanwhile, emerging economies in Asia Pacific and Latin America are witnessing accelerated adoption of SAS Switches, driven by rapid industrialization, expanding data center investments, and increasing enterprise IT spending. As organizations across all regions prioritize data accessibility, security, and scalability, the global outlook for the SAS Switch market remains highly promising.
The SAS Switch market by product type is primarily segmented into Single Port SAS Switches and Multi Port SAS Switches. Single Port SAS Switches are predominantly used in applications requiring dedicated, point-to-point connectivity, such as small-scale storage networks and direct-attached storage (DAS) environments. These switches are valued for their simplicity, cost-effectiveness, and ease of deployment, especi
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
One of three datasets to replicate numbers for tables and figures in the article "Using a Deliberative Poll on breast cancer screening to assess and improve the decision quality of laypeople" by Manja D. Jensen, Kasper M. Hansen, Volkert Siersma, and John Brodersen
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Blockchain data query: L1s Last 3months Perf
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Blockchain data query: L1 Last 7 Days Performance
A response model can provide a significant boost to the efficiency of a marketing campaign by increasing responses or reducing expenses. The objective is to predict who will respond to an offer for a product or service.

AcceptedCmp1 - 1 if customer accepted the offer in the 1st campaign, 0 otherwise
AcceptedCmp2 - 1 if customer accepted the offer in the 2nd campaign, 0 otherwise
AcceptedCmp3 - 1 if customer accepted the offer in the 3rd campaign, 0 otherwise
AcceptedCmp4 - 1 if customer accepted the offer in the 4th campaign, 0 otherwise
AcceptedCmp5 - 1 if customer accepted the offer in the 5th campaign, 0 otherwise
Response (target) - 1 if customer accepted the offer in the last campaign, 0 otherwise
Complain - 1 if customer complained in the last 2 years
DtCustomer - date of customer’s enrolment with the company
Education - customer’s level of education
Marital - customer’s marital status
Kidhome - number of small children in customer’s household
Teenhome - number of teenagers in customer’s household
Income - customer’s yearly household income
MntFishProducts - amount spent on fish products in the last 2 years
MntMeatProducts - amount spent on meat products in the last 2 years
MntFruits - amount spent on fruits products in the last 2 years
MntSweetProducts - amount spent on sweet products in the last 2 years
MntWines - amount spent on wine products in the last 2 years
MntGoldProds - amount spent on gold products in the last 2 years
NumDealsPurchases - number of purchases made with discount
NumCatalogPurchases - number of purchases made using catalogue
NumStorePurchases - number of purchases made directly in stores
NumWebPurchases - number of purchases made through company’s web site
NumWebVisitsMonth - number of visits to company’s web site in the last month
Recency - number of days since the last purchase
O. Parr-Rud. Business Analytics Using SAS Enterprise Guide and SAS Enterprise Miner. SAS Institute, 2014.
The main objective is to train a predictive model which allows the company to maximize the profit of the next marketing campaign.
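One way to connect a response model to campaign profit is a simple expected-value rule: contact a customer only when the predicted response probability times the revenue per response exceeds the cost of the contact. The Python sketch below assumes predicted probabilities are already available from some model; the function name, customer identifiers, revenue, and cost figures are all hypothetical and do not come from the dataset.

```python
def select_contacts(probabilities, revenue_per_response, cost_per_contact):
    """Pick customers whose expected profit from a contact is positive.

    probabilities: {customer_id: predicted probability of accepting the offer}
    Returns (list of selected customer ids, total expected profit).
    """
    if revenue_per_response <= 0:
        raise ValueError("revenue_per_response must be positive")
    # Break-even probability: p * revenue - cost > 0  <=>  p > cost / revenue
    threshold = cost_per_contact / revenue_per_response
    chosen = [cid for cid, p in probabilities.items() if p > threshold]
    expected_profit = sum(
        probabilities[cid] * revenue_per_response - cost_per_contact
        for cid in chosen
    )
    return chosen, expected_profit


# Hypothetical scores for three customers and hypothetical campaign economics.
scores = {"c1": 0.40, "c2": 0.05, "c3": 0.30}
chosen, profit = select_contacts(scores, revenue_per_response=11.0, cost_per_contact=3.0)
print(chosen, round(profit, 2))
```

The rule is only as good as the calibration of the predicted probabilities, so a model tuned purely for ranking accuracy may need recalibration before its scores are used this way.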