5 datasets found
  1. Data from: A collection of molecular formula databases for HERMES

    • data.niaid.nih.gov
    Updated Jul 16, 2021
    Cite
    Roger Giné Bertomeu (2021). A collection of molecular formula databases for HERMES [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5025559
    Dataset updated
    Jul 16, 2021
    Dataset provided by
    Maria Vinaixa Crevillent
    Roger Giné Bertomeu
    Òscar Yanes Torrado
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A compilation of molecule databases ready to be used in HERMES. We have compiled several open-access DBs and adapted their format to the columns HERMES requires. Since all databases share the "Name" and "MolecularFormula" columns, merges between databases are easy to generate.
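Because every database carries the same "Name" and "MolecularFormula" columns, a merged database like the Merge_* files can be built as a union plus deduplication. A minimal sketch with made-up compounds, not necessarily the authors' exact procedure:

```python
import pandas as pd

# Two toy HERMES-formatted databases; both carry the shared
# "Name" and "MolecularFormula" columns.
ecmdb = pd.DataFrame({
    "Name": ["Glucose", "Alanine"],
    "MolecularFormula": ["C6H12O6", "C3H7NO2"],
})
kegg = pd.DataFrame({
    "Name": ["Alanine", "Pyruvate"],
    "MolecularFormula": ["C3H7NO2", "C3H4O3"],
})

# Stack both sources, then drop compounds present in more than one DB.
merged = (
    pd.concat([ecmdb, kegg], ignore_index=True)
    .drop_duplicates(subset=["Name", "MolecularFormula"])
    .reset_index(drop=True)
)
print(len(merged))  # → 3
```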

    More databases and merges will be added in the future. If you have any suggestions or want to contribute, feel free to contact us!

    All rights reserved to the original authors of the databases.

    Description of the files:

    ECMDB.csv: Entries from the E. coli Metabolome Database. 3,760 compounds.

    Merge_KEGG_ECMDB.csv: a merge of all metabolites from KEGG pathways associated with E. coli K12 with ECMDB.csv above. 6,107 compounds.

    Merge_LipidMaps_LipidBlast.csv: a merge of lipid entities from LipidMaps LMSD with the metadata (just Name and Molecular Formula) of LipidBlast entries. 163,453 compounds.

    norman.xls: Entries from NORMAN SusDat, containing common and emerging drugs, pollutants, etc. 52,019 compounds.

    PubChemLite_31Oct2020.csv: Adapted column names from https://zenodo.org/record/4183801. 371,663 compounds related to exposomics.

    MS1_2ID.csv: Merge of HMDB, ChEBI, and NORMAN compounds. 183,911 compounds related to human metabolism, drugs, etc.

    COCONUT_NP.csv: parsed collection of entries from the COlleCtion of Open Natural ProdUcTs (COCONUT). 406,752 compounds.

    DiTriPeptides.csv: a list of all theoretically possible dipeptides (400) and tripeptides (8,000) with their associated molecular formulas. 8,400 compounds.

  2. 2MASS Survey Merged Point Source Information Table - Dataset - B2FIND

    • b2find.eudat.eu
    Updated Oct 23, 2023
    + more versions
    Cite
    (2023). 2MASS Survey Merged Point Source Information Table - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/86d072cc-cd82-57ee-baab-60cd64fadf3a
    Dataset updated
    Oct 23, 2023
    Description

    The merged source tables contain the mean positions, magnitudes, and uncertainties for sources detected multiple times in each of the 2MASS data sets. The merging was carried out using an autocorrelation of the respective databases to identify groups of extractions that are positionally associated with each other, all lying within a 1.5" radius circular region. A number of confirmation statistics are also provided in the tables that can be used to test for source motion and/or variability, and to assess the general quality of the merge. To access this resource via TAP, issue ADQL queries on the table named wsdb_info.
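As a sketch of the TAP access mentioned above, the following builds an ADQL cone search against the wsdb_info table. The ra/dec column names and the reuse of the 1.5" match radius are assumptions for illustration; take the actual endpoint and schema from the resource itself:

```python
# Build an ADQL cone-search query string for a TAP service. The table
# name wsdb_info comes from the description; the ra/dec column names
# are assumed, not confirmed.
def cone_search_adql(ra_deg: float, dec_deg: float,
                     radius_arcsec: float = 1.5) -> str:
    radius_deg = radius_arcsec / 3600.0  # ADQL circles take degrees
    return (
        "SELECT * FROM wsdb_info "
        "WHERE 1 = CONTAINS(POINT('ICRS', ra, dec), "
        f"CIRCLE('ICRS', {ra_deg}, {dec_deg}, {radius_deg:.7f}))"
    )

query = cone_search_adql(280.0, -30.0)
print(query)
```

The resulting string can be submitted with any TAP client, for example pyvo's `TAPService.search`.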

  3. Data from DIAMAS follow-up survey about funding practices

    • zenodo.org
    • data.niaid.nih.gov
    • +1 more
    Updated Jul 6, 2024
    Cite
    Victoria Brun; David Pontille; Didier Torny (2024). Data from DIAMAS follow-up survey about funding practices [Dataset]. http://doi.org/10.5281/zenodo.10879080
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Victoria Brun; David Pontille; Didier Torny
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The WP5 team of the DIAMAS project designed a follow-up survey to investigate the funding
    practices of IPSPs more deeply. We examined the proportion of Diamond publishing within the
    same IPSP by output type, and the capacity to plan for the future. We also enquired about
    spending priorities, the reasons for fundraising and the amount of work it requires, and asked
    about views on institutional publishing funding.


    The project sent the follow-up survey to respondents of the DIAMAS survey (metadata and
    aggregated data available here) who agreed to be contacted. Emails used unique identifiers,
    enabling us to merge databases and easily recover information gathered from the first survey
    for more advanced cross-analysis.
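The identifier-based merge described above can be sketched with pandas. The column names here are hypothetical (the real DIAMAS variables are not given in this summary); `validate=` is a useful guard when recovering first-survey answers for follow-up respondents:

```python
import pandas as pd

# Hypothetical extracts: identifiers and columns are invented for
# illustration, not the real DIAMAS variables.
first_survey = pd.DataFrame({
    "respondent_id": ["a1", "a2", "a3"],
    "ipsp_type": ["journal publisher", "press", "journal publisher"],
})
follow_up = pd.DataFrame({
    "respondent_id": ["a2", "a3"],
    "diamond_share": [0.8, 0.5],
})

# how="inner" keeps only follow-up respondents; validate= fails fast if
# an identifier is accidentally duplicated in either survey.
merged = follow_up.merge(
    first_survey, on="respondent_id",
    how="inner", validate="one_to_one",
)
print(len(merged))  # → 2
```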


    This follow-up survey was open during the last two months of 2023 and successfully garnered
    469 answers. After cleaning (mainly deleting blank surveys and duplicates), we retained 383
    relevant answers, a response rate of 56%.

  4. Health and Retirement Study (HRS)

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 21, 2023
    Cite
    Damico, Anthony (2023). Health and Retirement Study (HRS) [Dataset]. http://doi.org/10.7910/DVN/ELEKOY
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Damico, Anthony
    Description

    analyze the health and retirement study (hrs) with r. the hrs is the one and only longitudinal survey of american seniors. with a panel starting its third decade, the current pool of respondents includes older folks who have been interviewed every two years as far back as 1992. unlike cross-sectional or shorter panel surveys, respondents keep responding until, well, death do us part. paid for by the national institute on aging and administered by the university of michigan's institute for social research. if you apply for an interviewer job with them, i hope you like werther's original. figuring out how to analyze this data set might trigger your fight-or-flight synapses if you just start clicking around on michigan's website. instead, read pages numbered 10-17 (pdf pages 12-19) of this introduction pdf and don't touch the data until you understand figure a-3 on that last page. if you start enjoying yourself, here's the whole book. after that, it's time to register for access to the (free) data. keep your username and password handy, you'll need it for the top of the download automation r script. next, look at this data flowchart to get an idea of why the data download page is such a righteous jungle. but wait, good news: umich recently farmed out its data management to the rand corporation, who promptly constructed a giant consolidated file with one record per respondent across the whole panel. oh so beautiful. the rand hrs files make much of the older data and syntax examples obsolete, so when you come across stuff like instructions on how to merge years, you can happily ignore them - rand has done it for you.

    the health and retirement study only includes noninstitutionalized adults when new respondents get added to the panel (as they were in 1992, 1993, 1998, 2004, and 2010) but once they're in, they're in - respondents have a weight of zero for interview waves when they were nursing home residents; but they're still responding and will continue to contribute to your statistics so long as you're generalizing about a population from a previous wave (for example: it's possible to compute "among all americans who were 50+ years old in 1998, x% lived in nursing homes by 2010"). my source for that 411? page 13 of the design doc. wicked.

    this new github repository contains five scripts:

    1992 - 2010 download HRS microdata.R: loop through every year and every file, download, then unzip everything in one big party

    import longitudinal RAND contributed files.R: create a SQLite database (.db) on the local disk, then load the rand, rand-cams, and both rand-family files into the database (.db) in chunks (to prevent overloading ram)

    longitudinal RAND - analysis examples.R: connect to the sql database created by the 'import longitudinal RAND contributed files' program, create two database-backed complex sample survey objects using a taylor-series linearization design, then perform a mountain of analysis examples with wave weights from two different points in the panel

    import example HRS file.R: load a fixed-width file using only the sas importation script directly into ram with SAScii (http://blog.revolutionanalytics.com/2012/07/importing-public-data-with-sas-instructions-into-r.html), parse through the IF block at the bottom of the sas importation script, blank out a number of variables, then save the file as an R data file (.rda) for fast loading later

    replicate 2002 regression.R: connect to the sql database created by the 'import longitudinal RAND contributed files' program, create a database-backed complex sample survey object using a taylor-series linearization design, and exactly match the final regression shown in this document provided by analysts at RAND as an update of the regression on pdf page B76 of this document

    click here to view these five scripts. for more detail about the health and retirement study (hrs), visit: michigan's hrs homepage, rand's hrs homepage, the hrs wikipedia page, and a running list of publications using hrs. notes: exemplary work making it this far. as a reward, here's the detailed codebook for the main rand hrs file. note that rand also creates 'flat files' for every survey wave, but really, most every analysis you can think of is possible using just the four files imported with the rand importation script above. if you must work with the non-rand files, there's an example of how to import a single hrs (umich-created) file, but if you wish to import more than one, you'll have to write some for loops yourself. confidential to sas, spss, stata, and sudaan users: a tidal wave is coming. you can get water up your nose and be dragged out to sea, or you can grab a surf board. time to transition to r. :D
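The chunked-loading step in 'import longitudinal RAND contributed files.R' has a direct analogue outside R as well; here is a python sketch of the same idea (file path, table name, and chunk size are placeholders):

```python
import sqlite3
import pandas as pd

# Stream a large delimited file into a SQLite table without holding it
# all in ram, mirroring the chunked approach described above.
def load_in_chunks(csv_path: str, db_path: str, table: str,
                   chunksize: int = 100_000) -> None:
    con = sqlite3.connect(db_path)
    try:
        # read_csv with chunksize= yields DataFrames of at most
        # `chunksize` rows, so peak memory stays bounded.
        for chunk in pd.read_csv(csv_path, chunksize=chunksize):
            chunk.to_sql(table, con, if_exists="append", index=False)
        con.commit()
    finally:
        con.close()
```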

  5. Cotality Loan-Level Market Analytics

    • redivis.com
    • stanford.redivis.com
    Updated Aug 15, 2024
    Cite
    Stanford University Libraries (2024). Cotality Loan-Level Market Analytics [Dataset]. http://doi.org/10.57761/a96q-1j33
    Available download formats: avro, sas, spss, stata, arrow, parquet, csv, application/jsonl
    Dataset updated
    Aug 15, 2024
    Dataset provided by
    Redivis Inc.
    Authors
    Stanford University Libraries
    Description

    Abstract

    Title: Cotality Loan-Level Market Analytics (LLMA)

    Cotality Loan-Level Market Analytics (LLMA) for primary mortgages contains detailed loan data, including origination, events, performance, forbearance and inferred modification data. This dataset may not be linked or merged with any of the other datasets we have from Cotality.

    Formerly known as CoreLogic Loan-Level Market Analytics (LLMA).

    Methodology

    Cotality sources the Loan-Level Market Analytics data directly from loan servicers. Cotality cleans and augments the contributed records with modeled data. The Data Dictionary indicates which fields are contributed and which are inferred.

    The Loan-Level Market Analytics data is aimed at providing lenders, servicers, investors, and advisory firms with the insights they need to make trustworthy assessments and accurate decisions. Stanford Libraries has purchased the Loan-Level Market Analytics data for researchers interested in housing, economics, finance and other topics related to prime and subprime first lien data.

    Cotality provided the data to Stanford Libraries as pipe-delimited text files, which we have uploaded to Data Farm (Redivis) for preview, extraction and analysis.

    For more information about how the data was prepared for Redivis, please see Cotality 2024 GitLab.
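Reading pipe-delimited files like these is straightforward in most tools; a pandas sketch (the column names are invented for illustration; consult Cotality_LLMA_Data_Dictionary.pdf for the real layout):

```python
import io
import pandas as pd

# A pipe-delimited sample in the shape Cotality delivers. Column names
# here are invented; see the data dictionary for the real fields.
sample = io.StringIO(
    "LOAN_ID|ORIG_DATE|ORIG_BALANCE\n"
    "000001|2015-03-01|250000\n"
    "000002|2015-04-01|410000\n"
)
# dtype=str for the id column keeps leading zeros intact.
loans = pd.read_csv(sample, sep="|", dtype={"LOAN_ID": str})
print(loans.shape)  # → (2, 3)
```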

    Usage

    Per the End User License Agreement, the LLMA Data cannot be commingled (i.e. merged, mixed or combined) with Tax and Deed Data that Stanford University has licensed from Cotality, or other data which includes the same or similar data elements or that can otherwise be used to identify individual persons or loan servicers.

    The 2015 major release of Cotality Loan-Level Market Analytics (for primary mortgages) was intended to enhance the Cotality servicing consortium through data quality improvements and integrated analytics. See Cotality_LLMA_ReleaseNotes.pdf for more information about these changes.

    For more information about included variables, please see Cotality_LLMA_Data_Dictionary.pdf.


    For more information about how the database was set up, please see LLMA_Download_Guide.pdf.

    Bulk Data Access

    Data access is required to view this section.

