Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A compilation of different molecule databases ready to be used in HERMES. We have compiled different open-access DBs and adapted their format to the columns required by HERMES. Since all databases share the "Name" and "MolecularFormula" columns, merges between databases can be easily generated (see the sketch after the file list below).
More databases and merges will be added in the future. If you have any suggestions or want to contribute, feel free to contact us!
All rights reserved to the original authors of the databases.
Description of the files:
ECMDB.csv: Entries from the E. coli Metabolome Database (ECMDB). 3760 compounds.
Merge_KEGG_ECMDB.csv: a merge between all metabolites from KEGG pathways associated with E. coli K12 and the ECMDB.csv from above. 6107 compounds.
Merge_LipidMaps_LipidBlast.csv: a merge between lipid entities from LipidMaps LMSD and the metadata (just Name and Molecular Formula) of LipidBlast entries. 163453 compounds.
norman.xls: Entries from NORMAN SusDat, containing common and emerging drugs, pollutants, etc. 52019 compounds.
PubChemLite_31Oct2020.csv: Column names adapted from https://zenodo.org/record/4183801. 371663 compounds related to exposomics.
MS1_2ID.csv: Merge of HMDB, ChEBI and NORMAN compounds. 183911 compounds related to human metabolism, drugs, etc.
COCONUT_NP.csv: parsed collection of entries from the COlleCtion of Open Natural ProdUcTs (COCONUT). 406752 compounds.
DiTriPeptides.csv: a list of all theoretically possible dipeptides (400) and tripeptides (8000) and their associated molecular formulas. 8400 compounds.
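As a rough sketch of how such a merge can be generated in R (the two file names are just examples taken from the list above; overlapping non-key columns would come through with .x/.y suffixes):

# read two of the tables and full-join them on the shared key columns
ecmdb <- read.csv("ECMDB.csv", stringsAsFactors = FALSE)
pcl   <- read.csv("PubChemLite_31Oct2020.csv", stringsAsFactors = FALSE)

merged <- merge(ecmdb, pcl,
                by  = c("Name", "MolecularFormula"),
                all = TRUE)                     # keep rows from both tables

# drop exact duplicates on the key columns before writing the merged table
merged <- merged[ !duplicated(merged[, c("Name", "MolecularFormula")]), ]
write.csv(merged, "Merge_ECMDB_PubChemLite.csv", row.names = FALSE)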
The merged source tables contain the mean positions, magnitudes, and uncertainties for sources detected multiple times in each of the 2MASS data sets. The merging was carried out using an autocorrelation of the respective databases to identify groups of extractions that are positionally associated with each other, all lying within a 1.5" radius circular region. A number of confirmation statistics are also provided in the tables that can be used to test for source motion and/or variability, and the general quality of the merge. To access this resource via TAP, issue ADQL queries on the table named wsdb_info.
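For example, a synchronous TAP query against that table can be issued over plain HTTP and read straight into R; the base URL below is only a placeholder and should be replaced with the TAP endpoint published for this resource.

tap_base <- "https://example.org/tap"            # placeholder TAP endpoint
adql     <- "SELECT TOP 10 * FROM wsdb_info"     # small preview query

# standard TAP synchronous interface: /sync with doQuery + ADQL,
# here requesting the response as csv so read.csv can parse it directly
url <- paste0(tap_base, "/sync",
              "?REQUEST=doQuery&LANG=ADQL&FORMAT=csv",
              "&QUERY=", URLencode(adql, reserved = TRUE))

preview <- read.csv(url)
head(preview)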
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The WP5 team of the DIAMAS project designed a follow-up survey to investigate the funding practices of IPSPs more deeply. We examined the proportion of Diamond publishing within the same IPSP by output type, and the capability to plan for the future. We also enquired about spending priorities, reasons for fundraising and the amount of work required, and asked about views on institutional publishing funding.
The project sent the follow-up survey to respondents of the DIAMAS survey (metadata and aggregated data available here) who agreed to be contacted. Emails used unique identifiers, enabling us to merge databases and easily recover information gathered from the first survey for more advanced cross-analysis.
This follow-up survey was open during the last two months of 2023 and successfully garnered 469 answers. After cleaning (mainly deleting blank surveys and duplicates), we retained 383 relevant answers, a response rate of 56%.
analyze the health and retirement study (hrs) with r

the hrs is the one and only longitudinal survey of american seniors. with a panel starting its third decade, the current pool of respondents includes older folks who have been interviewed every two years as far back as 1992. unlike cross-sectional or shorter panel surveys, respondents keep responding until, well, death do us part. paid for by the national institute on aging and administered by the university of michigan's institute for social research, if you apply for an interviewer job with them, i hope you like werther's original.

figuring out how to analyze this data set might trigger your fight-or-flight synapses if you just start clicking around on michigan's website. instead, read pages numbered 10-17 (pdf pages 12-19) of this introduction pdf and don't touch the data until you understand figure a-3 on that last page. if you start enjoying yourself, here's the whole book. after that, it's time to register for access to the (free) data. keep your username and password handy, you'll need it for the top of the download automation r script. next, look at this data flowchart to get an idea of why the data download page is such a righteous jungle.

but wait, good news: umich recently farmed out its data management to the rand corporation, who promptly constructed a giant consolidated file with one record per respondent across the whole panel. oh so beautiful. the rand hrs files make much of the older data and syntax examples obsolete, so when you come across stuff like instructions on how to merge years, you can happily ignore them - rand has done it for you.

the health and retirement study only includes noninstitutionalized adults when new respondents get added to the panel (as they were in 1992, 1993, 1998, 2004, and 2010) but once they're in, they're in - respondents have a weight of zero for interview waves when they were nursing home residents, but they're still responding and will continue to contribute to your statistics so long as you're generalizing about a population from a previous wave (for example: it's possible to compute "among all americans who were 50+ years old in 1998, x% lived in nursing homes by 2010"). my source for that 411? page 13 of the design doc. wicked.
this new github repository contains five scripts:

1992 - 2010 download HRS microdata.R
- loop through every year and every file, download, then unzip everything in one big party

import longitudinal RAND contributed files.R
- create a SQLite database (.db) on the local disk
- load the rand, rand-cams, and both rand-family files into the database (.db) in chunks (to prevent overloading ram)

longitudinal RAND - analysis examples.R
- connect to the sql database created by the 'import longitudinal RAND contributed files' program
- create two database-backed complex sample survey objects, using a taylor-series linearization design (a rough sketch of this step appears at the end of this entry)
- perform a mountain of analysis examples with wave weights from two different points in the panel

import example HRS file.R
- load a fixed-width file using only the sas importation script directly into ram with SAScii (http://blog.revolutionanalytics.com/2012/07/importing-public-data-with-sas-instructions-into-r.html)
- parse through the IF block at the bottom of the sas importation script, blank out a number of variables
- save the file as an R data file (.rda) for fast loading later

replicate 2002 regression.R
- connect to the sql database created by the 'import longitudinal RAND contributed files' program
- create a database-backed complex sample survey object, using a taylor-series linearization design
- exactly match the final regression shown in this document provided by analysts at RAND as an update of the regression on pdf page B76 of this document

click here to view these five scripts

for more detail about the health and retirement study (hrs), visit:
- michigan's hrs homepage
- rand's hrs homepage
- the hrs wikipedia page
- a running list of publications using hrs

notes: exemplary work making it this far. as a reward, here's the detailed codebook for the main rand hrs file. note that rand also creates 'flat files' for every survey wave, but really, most every analysis you can think of is possible using just the four files imported with the rand importation script above. if you must work with the non-rand files, there's an example of how to import a single hrs (umich-created) file, but if you wish to import more than one, you'll have to write some for loops yourself.

confidential to sas, spss, stata, and sudaan users: a tidal wave is coming. you can get water up your nose and be dragged out to sea, or you can grab a surf board. time to transition to r. :D
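a rough sketch of that database-backed survey design step, assuming the rand consolidated file has already been loaded into a local sqlite database as a table named rand; the weight, stratum, and cluster variable names below (r5wtresp, raestrat, raehsamp) follow rand's naming convention but are illustrative - check them against the codebook before trusting any estimate:

library(survey)      # complex-sample analysis functions
library(RSQLite)     # sqlite driver used by the import script

# taylor-series linearization design backed by the sqlite database,
# so the full consolidated table never has to sit in ram
hrs.design <-
    svydesign(
        id = ~ raehsamp ,          # sampling cluster
        strata = ~ raestrat ,      # stratum
        weights = ~ r5wtresp ,     # wave 5 respondent-level weight
        nest = TRUE ,
        dbtype = "SQLite" ,
        dbname = "hrs.db" ,        # database built by the import script
        data = "rand"              # table name inside that database
    )

# weighted mean of an illustrative wave 5 variable
svymean( ~ r5shlt , hrs.design , na.rm = TRUE )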
Title: Cotality Loan-Level Market Analytics (LLMA)
Cotality Loan-Level Market Analytics (LLMA) for primary mortgages contains detailed loan data, including origination, events, performance, forbearance and inferred modification data. This dataset may not be linked or merged with any of the other datasets we have from Cotality.
Formerly known as CoreLogic Loan-Level Market Analytics (LLMA).
Cotality sources the Loan-Level Market Analytics data directly from loan servicers. Cotality cleans and augments the contributed records with modeled data. The Data Dictionary indicates which fields are contributed and which are inferred.
The Loan-Level Market Analytics data is aimed at providing lenders, servicers, investors, and advisory firms with the insights they need to make trustworthy assessments and accurate decisions. Stanford Libraries has purchased the Loan-Level Market Analytics data for researchers interested in housing, economics, finance and other topics related to prime and subprime first lien data.
Cotality provided the data to Stanford Libraries as pipe-delimited text files, which we have uploaded to Data Farm (Redivis) for preview, extraction and analysis.
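As a minimal illustration only (the file name below is a placeholder rather than an actual file in the delivery, and a header row is assumed), a pipe-delimited extract of this kind could be read in R as follows:

llma <- read.table("llma_extract.txt",     # placeholder file name
                   sep = "|", header = TRUE, quote = "",
                   comment.char = "", stringsAsFactors = FALSE)
str(llma)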
For more information about how the data was prepared for Redivis, please see Cotality 2024 GitLab.
Per the End User License Agreement, the LLMA Data cannot be commingled (i.e. merged, mixed or combined) with Tax and Deed Data that Stanford University has licensed from Cotality, or other data which includes the same or similar data elements or that can otherwise be used to identify individual persons or loan servicers.
The 2015 major release of Cotality Loan-Level Market Analytics (for primary mortgages) was intended to enhance the Cotality servicing consortium through data quality improvements and integrated analytics. See Cotality_LLMA_ReleaseNotes.pdf for more information about these changes.
For more information about included variables, please see Cotality_LLMA_Data_Dictionary.pdf.
For more information about how the database was set up, please see LLMA_Download_Guide.pdf.