Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Semantic Artist Similarity dataset consists of two datasets of artist entities with their corresponding biography texts, and the list of the top-10 most similar artists within each dataset, used as ground truth. The dataset comprises a corpus of 268 artists and a larger one of 2,336 artists, both gathered from Last.fm in March 2015. The former is mapped to the MIREX Audio and Music Similarity evaluation dataset, so that its similarity judgments can be used as ground truth. For the latter corpus we use the similarity between artists as provided by the Last.fm API. For every artist there is a list of the top-10 most related artists. In the MIREX dataset there are 188 artists with at least 10 similar artists; the other 80 artists have fewer than 10 similar artists. In the Last.fm API dataset all artists have a list of 10 similar artists.
There are 4 files in the dataset. mirex_gold_top10.txt and lastfmapi_gold_top10.txt contain the top-10 lists of artists for every artist of both datasets. Artists are identified by MusicBrainz ID. The format of these files is one line per artist, with the artist mbid separated by a tab from the list of top-10 related artists, which are identified by their mbids and separated by spaces:
artist_mbid \t artist_mbid_top10_list_separated_by_spaces
mb2uri_mirex and mb2uri_lastfmapi.txt contain the list of artists. In each line there are three fields separated by tabs. The first field is the MusicBrainz ID, the second field is the Last.fm name of the artist, and the third field is the DBpedia URI:
artist_mbid \t lastfm_name \t dbpedia_uri
There are also 2 folders in the dataset with the biography texts of each dataset. Each .txt file in the biography folders is named with the MusicBrainz ID of the artist it describes. Biographies were gathered from the Last.fm wiki page of every artist.
Using this dataset
We would highly appreciate it if scientific publications of works partly based on the Semantic Artist Similarity dataset cite the following publication:
Oramas, S., Sordo M., Espinosa-Anke L., & Serra X. (In Press). A Semantic-based Approach for Artist Similarity. 16th International Society for Music Information Retrieval Conference.
We are interested in knowing if you find our datasets useful! If you use our dataset please email us at mtg-info@upf.edu and tell us about your research. https://www.upf.edu/web/mtg/semantic-similarity
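The gold-standard file layout described above (one artist per line, a tab, then space-separated MusicBrainz IDs) could be read along the following lines. This is only a minimal sketch: the function name and the sample identifiers are hypothetical, and the real files may need additional handling that is not documented here.

```python
from typing import Dict, Iterable, List


def parse_top10_lines(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each artist mbid to its list of similar-artist mbids."""
    similar: Dict[str, List[str]] = {}
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # The artist mbid comes before the first tab; the rest is the list.
        mbid, _, rest = line.partition("\t")
        # Related artists are separated by spaces; MIREX lists may be shorter than 10.
        similar[mbid] = rest.split()
    return similar


# Made-up identifiers standing in for real MusicBrainz IDs.
sample = [
    "mbid-a\tmbid-b mbid-c\n",
    "mbid-b\tmbid-a\n",
]
gold = parse_top10_lines(sample)
```

With the actual data, the same function could be fed an open file handle for mirex_gold_top10.txt or lastfmapi_gold_top10.txt, since iterating over a file yields its lines.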
This database is the Third Small Astronomy Satellite (SAS-3) Y-Axis Pointed Observation Log. It identifies possible pointed observations of celestial X-ray sources which were performed with the y-axis detectors of the SAS-3 X-Ray Observatory. This log was compiled (by R. Kelley, P. Goetz and L. Petro) from notes made at the time of the observations and it is expected that it is neither complete nor fully accurate. Possible errors in the log are (i) the misclassification of an observation as a pointed observation when it was either a spinning or dither observation and (ii) inaccuracy of the dates and times of the start and end of an observation. In addition, as described in the HEASARC_Updates section, the HEASARC added some additional information when creating this database. Further information about the SAS-3 detectors and their fields of view can be found at: http://heasarc.gsfc.nasa.gov/docs/sas3/sas3_about.html
Disclaimer: The HEASARC is aware of certain inconsistencies between the Start_date, End_date, and Duration fields for a number of rows in this database table. They appear to be errors present in the original table. Except for one entry where the HEASARC corrected an error where there was a near-certainty which parameter was incorrect (as noted in the 'HEASARC_Updates' section of this documentation), these inconsistencies have been left as they were in the original table.
This database table was released by the HEASARC in June 2000, based on the SAS-3 Y-Axis pointed Observation Log (available from the NSSDC as dataset ID 75-037A-02B), together with some additional information provided by the HEASARC itself. This is a service provided by NASA HEASARC.
Eximpedia export-import trade data lets you search trade data and active exporters, importers, buyers, suppliers, and manufacturers from over 209 countries.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The rising volume of scientific research has drawn more attention to it, but the phrase "scientific research" alone does not capture its full nature: like most things, it is divided into many realms. The various fields of scientific research have already been discussed in many scholarly articles and evaluated in previous censuses and studies. However, the ultimate question remains unanswered: which field of scientific research is the most popular, and which one will become the focus in the future? Although the specific fields are too numerous to count, they can be grouped into major fields such as astronomy, engineering, computer science, medicine, biology and chemistry. Several main factors are related to popularity, such as the number of articles relating to each field, the number of posts on social media and the number of views on professional sites. A program was developed to analyze the relationship between the subjects of scientific research and their future trends, based on the number of mentions of each field of research and of scholarly articles and quotations about them. The program uses data from Altmetric, an authoritative data source. SAS is used to analyze the data and plot it on several graphs that represent the value of each factor. Finally, suggestions for future scientific research are summarized and inferred from the results, with the aim of informing future research directions.
Fig 1 - The functions used in this research.
Fig 2 - The main Python program used in this research.
Fig 3 - The structure of output.
Fig 4 - Factor 1: Number of articles relating to each field.
Fig 5 - Factor 2: Number of views on Mendeley, Connotea, and Citeulike.
Fig 6 - Factor 3: Number of posts on Facebook and Twitter.
Fig 7 - The correlation between individual factors.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about books. It has 2 rows and is filtered where the book publisher is SAS Institute. It features 7 columns including author, publication date, language, and book publisher.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Identifies fields and units in the SAS dataset, VERMONT.SD2
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about book subjects. It has 3 rows and is filtered where the book is SAS combat handbook. It features 10 columns including number of authors, number of books, earliest publication date, and latest publication date.
Attribution 3.0 (CC BY 3.0) https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
Surface active substances (SAS) in the water column were measured by voltammetry using the electrochemical probe o-nitrophenol (ONP) during EIFEX, a mesoscale open ocean iron enrichment experiment in the Southern Ocean. SAS levels were low throughout the experiment (0.02 mg/L Triton X-100 equivalents) were found at the end of the bloom particularly at density discontinuities where organic material may accumulate. Exudates from diatoms appeared to be the major source of SAS during EIFEX, either from direct extracellular release or in the action of being grazed upon by zooplankton.
These data are part of NACJD's Fast Track Release and are distributed as they were received from the data depositor. The files have been zipped by NACJD for release, but not checked or processed except for the removal of direct identifiers. Users should refer to the accompanying readme file for a brief description of the files available with this collection and consult the investigator(s) if further information is needed.
This study sought to apply current and advanced Y-STR DNA technology in forensic laboratories to a large in vivo population of proxy-couples, to provide groundwork for future inquiry about the conditions affecting DNA recovery in the living patient, to determine timing for evidence collection, and to attempt to identify variables influencing DNA recovery. The objective of this research was to create the evidence base supporting or limiting the expansion of the 72-hour period for evidence collection. Another objective was to identify conditions that might influence the recovery of DNA, and therefore influence policies related to sample collection from the complex post-coital environment.
The collection includes 6 SPSS data files:
AlleleRecovery Jun 2014 Allrec.sav (n=70; 34 variables)
AlleleRecovery Jun 2014 Used for descriptve analysis.sav (n=66; 58 variables)
Condom_collections-baseline-d9-Jun2014 Allrec without open-ended-ICPSR.sav (n=70; 66 variables)
DNADemogFemalesJun2014- without open-ended AllRec-ICPSR.sav (n=73; 67 variables)
DNADemogFemalesJun2014- without open-ended -For analysis with group variables-ICPSR.sav (n=66; 73 variables)
DNADemogMalesJun2014- without open-ended AllRec-ICPSR.sav (n=73; 46 variables)
and 1 SAS data file: dnalong.sas7bdat (n=264; 7 variables).
Data from a focus group of subject matter experts which convened to identify themes from their practice are not included with this collection.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Integrated Postsecondary Education Data System (IPEDS) Complete Data Files from 1980 to 2023. Includes data file, STATA data file, SPSS program, SAS program, STATA program, and dictionary. All years compressed into one .zip file due to storage limitations.
From the IPEDS Complete Data File Help Page (https://nces.ed.gov/Ipeds/help/complete-data-files):
Choose the file to download by reading the description in the available titles. Then, click on the link in that row corresponding to the column header of the type of file/information desired to download.
To download and view the survey files in basic CSV format use the main download link in the Data File column.
For files compatible with the Stata statistical software package, use the alternate download link in the Stata Data File column.
To download files with the SPSS, SAS, or STATA (.do) file extension for use with statistical software packages, use the download link in the Programs column.
To download the data Dictionary for the selected file, click on the corresponding link in the far right column of the screen. The data dictionary serves as a reference for using and interpreting the data within a particular survey file. This includes the names, definitions, and formatting conventions for each table, field, and data element within the file, important business rules, and information on any relationships to other IPEDS data.
For statistical read programs to work properly, both the data file and the corresponding read program file must be downloaded to the same subdirectory on the computer's hard drive. Download the data file first; then click on the corresponding link in the Programs column to download the desired read program file to the same subdirectory.
When viewing downloaded survey files, categorical variables are identified using codes instead of labels. Labels for these variables are available in both the data read program files and data dictionary for each file; however, for files that automatically incorporate this information you will need to select the Custom Data Files option.
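Because the downloaded CSV files carry codes rather than labels, a user working outside SAS, SPSS, or Stata may want to attach labels manually. The following is a minimal sketch of that idea only; the variable name SECTOR_CODE, the codes, and the labels are all invented for illustration and are not taken from any IPEDS file or dictionary.

```python
import csv
import io

# Hypothetical survey extract: SECTOR_CODE is stored as a code, not a label.
data_csv = io.StringIO("UNITID,SECTOR_CODE\n1001,1\n1002,2\n1003,9\n")

# Hypothetical code-to-label table, as one might transcribe it from a data dictionary.
labels = {"SECTOR_CODE": {"1": "Label A", "2": "Label B"}}


def apply_labels(rows, labels):
    """Return copies of the rows with a <variable>_LABEL column added."""
    out = []
    for row in rows:
        labeled = dict(row)
        for var, mapping in labels.items():
            if var in labeled:
                code = labeled[var]
                # Keep unknown codes visible instead of silently dropping them.
                labeled[var + "_LABEL"] = mapping.get(code, "(unlabeled code " + code + ")")
        out.append(labeled)
    return out


labeled_rows = apply_labels(csv.DictReader(data_csv), labels)
```

The original code column is kept alongside the label so that results can still be checked against the data dictionary.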
This layer contains census tract level 2020 Decennial Census redistricting data as reported by the U.S. Census Bureau for all states plus DC and Puerto Rico. The attributes come from the 2020 Public Law 94-171 (P.L. 94-171) tables.
Data download date: August 12, 2021
Census tables: P1, P2, P3, P4, H1, P5, Header
Downloaded from: Census FTP site
Processing Notes:
Data was downloaded from the U.S. Census Bureau FTP site, imported into SAS format and joined to the 2020 TIGER boundaries. Boundaries are sourced from the 2020 TIGER/Line Geodatabases. Boundaries have been projected into Web Mercator and each attribute has been given a clear descriptive alias name. No alterations have been made to the vertices of the data.
Each attribute maintains its specified name from Census, but also has a descriptive alias name and long description derived from the technical documentation provided by the Census. For a detailed list of the attributes contained in this layer, view the Data tab and select "Fields".
The following alterations have been made to the tabular data:
Joined all tables to create one wide attribute table:
P1 - Race
P2 - Hispanic or Latino, and not Hispanic or Latino by Race
P3 - Race for the Population 18 Years and Over
P4 - Hispanic or Latino, and not Hispanic or Latino by Race for the Population 18 Years and Over
H1 - Occupancy Status (Housing)
P5 - Group Quarters Population by Group Quarters Type (correctional institutions, juvenile facilities, nursing facilities/skilled nursing, college/university student housing, military quarters, etc.)
Header
After joining, dropped fields: FILEID, STUSAB, CHARITER, CIFSN, LOGRECNO, GEOVAR, GEOCOMP, LSADC, BLOCK, BLKGRP, and TBLKGRP.
GEOCOMP was renamed to GEOID and moved to be the first column in the table; the original GEOID was dropped.
Placeholder fields for future legislative districts have been dropped: CD118, CD119, CD120, CD121, SLDU22, SLDU24, SLDU26, SLDU28, SLDL22, SLDL24, SLDL26, SLDL28.
P0020001 was dropped, as it is duplicative of P0010001. Similarly, P0040001 was dropped, as it is duplicative of P0030001.
In addition to calculated fields, County_Name and State_Name were added.
The following calculated fields have been added (see long field descriptions in the Data tab for formulas used):
PCT_P0030001: Percent of Population 18 Years and Over
PCT_P0020002: Percent Hispanic or Latino
PCT_P0020005: Percent White alone, not Hispanic or Latino
PCT_P0020006: Percent Black or African American alone, not Hispanic or Latino
PCT_P0020007: Percent American Indian and Alaska Native alone, not Hispanic or Latino
PCT_P0020008: Percent Asian alone, not Hispanic or Latino
PCT_P0020009: Percent Native Hawaiian and Other Pacific Islander alone, not Hispanic or Latino
PCT_P0020010: Percent Some Other Race alone, not Hispanic or Latino
PCT_P0020011: Percent Population of Two or More Races, not Hispanic or Latino
PCT_H0010002: Percent of Housing Units that are Occupied
PCT_H0010003: Percent of Housing Units that are Vacant
Please note these percentages might look strange at the individual tract level, since this data has been protected using differential privacy.*
*To protect the privacy and confidentiality of respondents, data has been protected using differential privacy techniques by the U.S. Census Bureau. This means that some individual tracts will have values that are inconsistent or improbable. However, when aggregated up, these issues become minimized. The pop-up on this layer uses Arcade to display aggregated values for the surrounding area rather than values for the tract itself.
Download Census redistricting data in this layer as a file geodatabase.
Additional links: U.S. Census Bureau; U.S. Census Bureau Decennial Census; About the 2020 Census; 2020 Census; 2020 Census data quality; Decennial Census P.L. 94-171 Redistricting Data Program
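The exact formulas for the calculated percentage fields are given only in the layer's long field descriptions, which are not reproduced here. As an illustration, the sketch below assumes the most likely denominators: total population (P0010001) for the population percentages and total housing units (H0010001) for the occupancy percentages. Both denominators and all of the counts are assumptions made for this example.

```python
def pct(numerator, denominator):
    """Return numerator as a percentage of denominator, or None if undefined."""
    if not denominator:
        return None  # a tract can report zero population or zero housing units
    return 100.0 * numerator / denominator


# Hypothetical tract record using the layer's field names; the counts are invented.
tract = {
    "P0010001": 4000,  # total population
    "P0020002": 1000,  # Hispanic or Latino
    "P0030001": 3000,  # population 18 years and over
    "H0010001": 1600,  # total housing units
    "H0010002": 1400,  # occupied housing units
    "H0010003": 200,   # vacant housing units
}

# Assumed denominators: P0010001 for population fields, H0010001 for housing fields.
calculated = {
    "PCT_P0030001": pct(tract["P0030001"], tract["P0010001"]),
    "PCT_P0020002": pct(tract["P0020002"], tract["P0010001"]),
    "PCT_H0010002": pct(tract["H0010002"], tract["H0010001"]),
    "PCT_H0010003": pct(tract["H0010003"], tract["H0010001"]),
}
```

Returning None for an empty denominator avoids a division error on tracts with no population or housing, which can occur in data protected by differential privacy.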
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about books. It has 1 row and is filtered where the book is Silent heroes : the story of the SAS. It features 7 columns including author, publication date, language, and book publisher.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Matching is frequently used in observational studies, especially in medical research. However, only a small number of articles with matching programs for the SAS software (SAS Institute Inc., Cary, NC, USA) are available, and even fewer are usable by inexperienced users of SAS software. This article presents a matching program for the SAS software and links to an online repository for examples and test data. The program enables matching on several variables and includes an in-depth explanation of the expressions used and how to customize the program. The selection of controls is randomized and automated, minimizing the risk of selection bias. The program also provides a means for the researcher to test for incomplete matching.
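The general idea of randomized, automated matching on several variables with a check for incomplete matching can be illustrated as follows. This is not the article's SAS program: it is a small Python sketch of exact matching without replacement, and the function name, field names, and sample records are all hypothetical.

```python
import random
from collections import defaultdict


def match_controls(cases, controls, keys, n_controls=1, seed=0):
    """Randomly pick n_controls per case among controls with identical key values."""
    rng = random.Random(seed)  # fixed seed so a run can be reproduced
    pool = defaultdict(list)
    for control in controls:
        pool[tuple(control[k] for k in keys)].append(control)
    for group in pool.values():
        rng.shuffle(group)  # randomize the order in which controls are drawn
    matched, incomplete = {}, []
    for case in cases:
        group = pool[tuple(case[k] for k in keys)]
        # Each control is used at most once (matching without replacement).
        picked = [group.pop() for _ in range(min(n_controls, len(group)))]
        matched[case["id"]] = [p["id"] for p in picked]
        if len(picked) < n_controls:
            incomplete.append(case["id"])  # flag cases that lack enough controls
    return matched, incomplete


cases = [
    {"id": "case1", "sex": "F", "age_group": "40-49"},
    {"id": "case2", "sex": "M", "age_group": "50-59"},
]
controls = [
    {"id": "ctrl1", "sex": "F", "age_group": "40-49"},
    {"id": "ctrl2", "sex": "F", "age_group": "40-49"},
    {"id": "ctrl3", "sex": "M", "age_group": "60-69"},
]
matched, incomplete = match_controls(cases, controls, keys=("sex", "age_group"))
```

In this toy data, case2 has no control with the same sex and age group, so it appears in the incomplete list rather than being matched to a dissimilar control.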
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about book subjects. It has 2 rows and is filtered where the book is The SAS escape, evasion and survival manual. It features 10 columns including number of authors, number of books, earliest publication date, and latest publication date.
This dataset contains the scrubbed chat logs from the Southeast Atmosphere Study (SAS) project, including NOMADSS (Nitrogen, Oxidants, Mercury and Aerosol Distributions, Sources and Sinks), from May 30 - July 17, 2013. The chat logs contain conversations between scientists and other field project participants regarding data collection within the SAS-NOMADSS project.
Exercise data set for the SAS book by Uehlinger. Sample of individual variables and cases from the data set of ZA Study 0757 (political ideology).
Topics: most important political problems of the country; political interest; party inclination; behavior at the polls in the Federal Parliament election 1972; political participation and willingness to participate in political protests.
Demography: age; sex; marital status; religious denomination; school education; interest in politics; party preference.
analyze the current population survey (cps) annual social and economic supplement (asec) with r
the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no. despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population. the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show.
this new github repository contains three scripts:
2005-2012 asec - download all microdata.R
  download the fixed-width file containing household, family, and person records
  import by separating this file into three tables, then merge 'em together at the person-level
  download the fixed-width file containing the person-level replicate weights
  merge the rectangular person-level file with the replicate weights, then store it in a sql database
  create a new variable - one - in the data table
2012 asec - analysis examples.R
  connect to the sql database created by the 'download all microdata' program
  create the complex sample survey object, using the replicate weights
  perform a boatload of analysis examples
replicate census estimates - 2011.R
  connect to the sql database created by the 'download all microdata' program
  create the complex sample survey object, using the replicate weights
  match the sas output shown in the png file below
2011 asec replicate weight sas output.png: statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document.
click here to view these three scripts
for more detail about the current population survey - annual social and economic supplement (cps-asec), visit:
  the census bureau's current population survey page
  the bureau of labor statistics' current population survey page
  the current population survey's wikipedia article
notes: interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name. as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.
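the scripts above do their work in r, but the arithmetic behind a replicate-weighted standard error is small enough to sketch in plain python. this is a rough illustration only, not the repository's code: the function names and toy numbers are made up, and the scale factor is an assumption. the census bureau's replicate weight instructions are understood to use 4/160 with 160 replicate weights, so check that document before relying on it.

```python
import math


def weighted_total(values, weights):
    """Weighted sum of a variable, e.g. a 0/1 indicator times person weights."""
    return sum(v * w for v, w in zip(values, weights))


def replicate_se(values, full_weights, replicate_weights, scale):
    """Estimate a total and its standard error from replicate weights.

    variance = scale * sum over replicates of (replicate estimate - full estimate)^2
    """
    full = weighted_total(values, full_weights)
    squared = sum(
        (weighted_total(values, rw) - full) ** 2 for rw in replicate_weights
    )
    return full, math.sqrt(scale * squared)


# toy data: three people, four made-up replicate weight columns.
values = [1, 0, 1]
full_weights = [10.0, 20.0, 30.0]
replicate_weights = [
    [11.0, 20.0, 29.0],
    [9.0, 20.0, 33.0],
    [10.0, 20.0, 30.0],
    [12.0, 20.0, 30.0],
]

# assumption: scale = 4 / number of replicates (4/160 for the real asec files).
estimate, se = replicate_se(
    values, full_weights, replicate_weights, scale=4.0 / len(replicate_weights)
)
```

with real asec data the same calculation would run over all replicate weight columns merged onto the person-level file, which is what the complex sample survey object handles in the r scripts.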
confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about book subjects. It has 5 rows and is filtered where the book is A monk in the SAS. It features 10 columns including number of authors, number of books, earliest publication date, and latest publication date.
PLOSsyph: This is a space-delimited ASCII file that was created in SAS. It has the variables that were used in the published paper. The readme.sas file is a .sas file that reads the data. You will need to change the infile statement to reflect the path to where you put the data.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about books. It has 1 row and is filtered where the book is Soldier R: SAS : death on Gibraltar. It features 7 columns including author, publication date, language, and book publisher.