100+ datasets found
  1. Data for Development of the Enrichment Mentality Questionnaire

    • scidb.cn
    Updated Apr 24, 2024
    Cite
    he jian kang; Zhang Guohua (2024). Data for Development of the Enrichment Mentality Questionnaire [Dataset]. http://doi.org/10.57760/sciencedb.18124
    Dataset updated
    Apr 24, 2024
    Dataset provided by
    Science Data Bank
    Authors
    he jian kang; Zhang Guohua
    Description

    Study 1
    Text preparation (specific questionnaire questions can be found in the paper)
    Vocabulary 1: 845 words related to material wealth and spiritual wealth, as well as their relationship, in The Contemporary Chinese Dictionary (7th edition).
    Vocabulary 2: Further screening, deleting irrelevant words, merging synonyms, and organizing a total of 69 sets of vocabulary.
    Test (detailed information can be found in the paper, variable labels and meanings can be found in SPSS data)
    Data 1: In August 2021, questionnaires were distributed through online platforms with IP addresses limited to Zhejiang Province. A total of 503 responses were received, and invalid responses such as short answer times and regular responses were deleted, resulting in 462 valid responses (91.85%).
    Data 2: In September 2021, questionnaires were distributed through online platforms with IP addresses limited to Zhejiang Province. A total of 208 responses were received, and invalid responses such as short response times and regular responses were deleted, resulting in 201 valid responses (96.63%).

    Study 2
    Test (detailed information can be found in the paper, variable labels and meanings can be found in SPSS data)
    Data 3: From July to August 2023, questionnaires were distributed through online platforms with IP addresses limited to Zhejiang Province. A total of 1045 answer sheets were collected. Deleting invalid answers such as short answer times and regular responses resulted in 937 valid responses (89.67%).
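
    The valid-response percentages quoted in this description are the number of valid responses divided by the number received. A minimal sketch of that arithmetic follows; the function name `valid_rate` and the `SAMPLES` mapping are illustrative and not part of the dataset, and only the counts come from the description.

```python
def valid_rate(received: int, valid: int) -> float:
    """Return valid responses as a percentage of received ones, rounded to 2 places."""
    if received <= 0:
        raise ValueError("received must be positive")
    if not 0 <= valid <= received:
        raise ValueError("valid must be between 0 and received")
    return round(100 * valid / received, 2)


# (received, valid) counts as stated in the dataset description.
SAMPLES = {
    "Data 1": (503, 462),
    "Data 2": (208, 201),
    "Data 3": (1045, 937),
}

for label, (received, valid) in SAMPLES.items():
    print(f"{label}: {valid}/{received} = {valid_rate(received, valid)}%")
```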

  2. Success.ai | US Company Data | Enrichment APIs | 28M+ Full Company Profiles...

    • data.success.ai
    Updated Nov 20, 2024
    Cite
    Success.ai (2024). Success.ai | US Company Data | Enrichment APIs | 28M+ Full Company Profiles & Contact Data – Best Price & Quality Guarantee [Dataset]. https://data.success.ai/products/success-ai-us-company-data-enrichment-apis-28m-full-co-success-ai
    Dataset updated
    Nov 20, 2024
    Area covered
    United States
    Description

    Dive into Success.ai's extensive database featuring 28M+ full US company profiles and contact data. With AI-validated accuracy and comprehensive coverage, we offer tailored data solutions equipped with enrichment APIs that comply with regulatory standards, all at the best price.

  3. Data from: Investigating shared and distinct mechanisms in semantic and...

    • tandf.figshare.com
    docx
    Updated May 31, 2023
    Cite
    Aine Ito; E. Matthew Husband (2023). Investigating shared and distinct mechanisms in semantic and syntactic enrichment: a priming study [Dataset]. http://doi.org/10.6084/m9.figshare.19161954.v1
    Available download formats: docx
    Dataset updated
    May 31, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Aine Ito; E. Matthew Husband
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Aspectual verbs (e.g. begin) and intensional verbs (e.g. want) can both take entity-denoting NPs as a complement (begin/want the book) and acquire an implicit meaning (e.g. reading). Linguistic theory posits that such enriched implicit meanings can be acquired either by semantic enrichment with aspectual verbs or by syntactic enrichment with intensional verbs. To investigate whether semantic and syntactic enrichment share enrichment operations, we conducted a structural priming study. Experiment 1 repeated the verb on prime and target trials and found evidence for enrichment priming for both verb types. Experiment 2 crossed the verb type and found no evidence for priming. These results suggest that enrichment operations are distinct for aspectual and intensional verbs. However, Experiment 3 repeated Experiment 1 without lexical boost and found no enrichment priming within the verb type. Thus, producing an enriched structure may not robustly activate enrichment structures, leaving open questions concerning shared mechanisms.

  4. Factori USA People Data | socio-demographic, location, interest and intent...

    • datarade.ai
    .json, .csv
    Updated Jul 23, 2022
    + more versions
    Cite
    Factori (2022). Factori USA People Data | socio-demographic, location, interest and intent data | E-Commere |Mobile Apps | Online Services [Dataset]. https://datarade.ai/data-products/factori-usa-consumer-graph-data-socio-demographic-location-factori
    Available download formats: .json, .csv
    Dataset updated
    Jul 23, 2022
    Dataset authored and provided by
    Factori
    Area covered
    United States of America
    Description

    Our People data is gathered and aggregated via surveys, digital services, and public data sources. We use powerful profiling algorithms to collect and ingest only fresh and reliable data points.

    Our comprehensive data enrichment solution includes a variety of data sets that can help you address gaps in your customer data, gain a deeper understanding of your customers, and power superior client experiences.

    1. Geography - City, State, ZIP, County, CBSA, Census Tract, etc.
    2. Demographics - Gender, Age Group, Marital Status, Language, etc.
    3. Financial - Income Range, Credit Rating Range, Credit Type, Net Worth Range, etc.
    4. Persona - Consumer Type, Communication Preferences, Family Type, etc.
    5. Interests - Content, Brands, Shopping, Hobbies, Lifestyle, etc.
    6. Household - Number of Children, Number of Adults, IP Address, etc.
    7. Behaviours - Brand Affinity, App Usage, Web Browsing, etc.
    8. Firmographics - Industry, Company, Occupation, Revenue, etc.
    9. Retail Purchase - Store, Category, Brand, SKU, Quantity, Price, etc.
    10. Auto - Car Make, Model, Type, Year, etc.
    11. Housing - Home Type, Home Value, Renter/Owner, Year Built, etc.

    People Data Schema & Reach: Our data reach represents the total number of counts available within various categories and comprises attributes such as country location, MAU, DAU & Monthly Location Pings:

    Data Export Methodology: Since we collect data dynamically, we provide the most updated data and insights via a best-suited method on a suitable interval (daily/weekly/monthly).

    People Data Use Cases:

    360-Degree Customer View: Get a comprehensive picture of customers by means of internal and external data aggregation.

    Data Enrichment: Leverage online-to-offline consumer profiles to build holistic audience segments and improve campaign targeting through user data enrichment.

    Fraud Detection: Use multiple digital (web and mobile) identities to verify real users and detect anomalies or fraudulent activity.

    Advertising & Marketing: Understand audience demographics, interests, lifestyle, hobbies, and behaviors to build targeted marketing campaigns.

    Using Factori People Data you can solve use cases like:

    Acquisition Marketing: Expand your reach to new users and customers using lookalike modeling with your first-party audiences to extend to other potential consumers with similar traits and attributes.

    Lookalike Modeling: Build lookalike audience segments using your first-party audiences as a seed to extend your reach for running marketing campaigns to acquire new users or customers.

    Other use cases include CRM Data Enrichment, Consumer Data Enrichment, B2B Data Enrichment, B2C Data Enrichment, Customer Acquisition, Audience Segmentation, 360-Degree Customer View, Consumer Profiling, and Consumer Behaviour Data.

    Here's the schema of People Data:

    person_id, first_name, last_name, age, gender, linkedin_url, twitter_url, facebook_url, city, state, address, zip, zip4, country, delivery_point_bar_code, carrier_route, walk_seuqence_code, fips_state_code, fips_country_code, country_name, latitude, longtiude, address_type, metropolitan_statistical_area, core_based+statistical_area, census_tract, census_block_group, census_block, primary_address, pre_address, streer, post_address, address_suffix, address_secondline, address_abrev, census_median_home_value, home_market_value, property_build+year, property_with_ac, property_with_pool, property_with_water, property_with_sewer, general_home_value, property_fuel_type, year, month, household_id, Census_median_household_income, household_size, marital_status, length+of_residence, number_of_kids, pre_school_kids, single_parents, working_women_in_house_hold, homeowner, children, adults, generations, net_worth, education_level, occupation, education_history, credit_lines, credit_card_user, newly_issued_credit_card_user, credit_range_new, credit_cards, loan_to_value, mortgage_loan2_amount, mortgage_loan_type, mortgage_loan2_type, mortgage_lender_code, mortgage_loan2_render_code, mortgage_lender, mortgage_loan2_lender, mortgage_loan2_ratetype, mortgage_rate, mortgage_loan2_rate, donor, investor, interest, buyer, hobby, personal_email, work_email, devices, phone, employee_title, employee_department, employee_job_function, skills, recent_job_change, company_id, company_name, company_description, technologies_used, office_address, office_city, office_country, office_state, office_zip5, office_zip4, office_carrier_route, office_latitude, office_longitude, office_cbsa_code, office_census_block_group, office_census_tract, office_county_code, company_phone, company_credit_score, company_csa_code, company_dpbc, company_franchiseflag, company_facebookurl, company_linkedinurl, company_twitterurl, company_website, company_fortune_rank, company_government_type, company_headquarters_branch, company_home_business, company_industry, company_num_pcs_used, company_num_employees, company_firm_individual, company_msa, company_msa_name, company_naics_code, company_naics_description, company_naics_code2, company_naics_description2, company_sic_code2, company_sic_code2_description, company_sic...
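
    The CRM data enrichment use case named in this listing can be pictured as a left join of in-house records onto rows that follow the People Data schema. The sketch below is only an illustration under stated assumptions: it assumes records can be matched on `work_email`, it uses a few field names taken from the schema listing (`city`, `state`, `occupation`), and the `enrich` function and sample rows are hypothetical. It does not represent Factori's delivery format or any vendor API.

```python
def enrich(crm_rows, people_rows, key="work_email",
           fields=("city", "state", "occupation")):
    """Left-join CRM rows onto people rows, filling only missing fields.

    Matching on `key` is an assumption made for this illustration.
    """
    # Index the reference rows by a normalised key; the first row wins on duplicates.
    index = {}
    for row in people_rows:
        k = (row.get(key) or "").strip().lower()
        if k:
            index.setdefault(k, row)

    enriched = []
    for row in crm_rows:
        k = (row.get(key) or "").strip().lower()
        match = index.get(k, {})
        out = dict(row)  # never mutate the caller's record
        for field in fields:
            if not out.get(field):  # keep values the CRM already has
                out[field] = match.get(field)
        enriched.append(out)
    return enriched


# Hypothetical sample rows, invented for illustration only.
crm = [
    {"work_email": "Ana@Example.com", "city": ""},
    {"work_email": "nobody@example.com", "city": "Reno"},
]
people = [
    {"work_email": "ana@example.com", "city": "Austin",
     "state": "TX", "occupation": "Engineer"},
]

for record in enrich(crm, people):
    print(record)
```

    Unmatched CRM rows are kept with the missing fields set to None, which is what makes this a left join rather than an inner join.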

  5. Additional file 11: of MGSEA – a multivariate Gene set enrichment analysis...

    • springernature.figshare.com
    xlsx
    Updated Jun 4, 2023
    Cite
    Khong-Loon Tiong; Chen-Hsiang Yeang (2023). Additional file 11: of MGSEA – a multivariate Gene set enrichment analysis [Dataset]. http://doi.org/10.6084/m9.figshare.7861190.v1
    Available download formats: xlsx
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Khong-Loon Tiong; Chen-Hsiang Yeang
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Table S7. Subtype-specific CNV and their values in TCGA and external datasets. The table shows the previously reported subtype-specific CNV and their values (mean and standard deviation) for (A) breast cancer and (B) GBM TCGA and external datasets. The values were CDFs (ranging from 0 to 1) for TCGA data, log2 of estimated copy numbers (centered at 0) for METABRIC, and estimated copy numbers (centered at 2) for REMBRANDT, respectively. (XLSX 11 kb)

  6. Automated Indicator Enrichment Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 30, 2025
    Cite
    Dataintelo (2025). Automated Indicator Enrichment Market Research Report 2033 [Dataset]. https://dataintelo.com/report/automated-indicator-enrichment-market
    Available download formats: csv, pdf, pptx
    Dataset updated
    Sep 30, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Automated Indicator Enrichment Market Outlook



    According to our latest research, the global Automated Indicator Enrichment market size reached USD 1.26 billion in 2024, reflecting robust adoption across sectors. The market is expected to grow at a CAGR of 16.7% during the forecast period, with the market size projected to reach USD 4.01 billion by 2033. This growth is primarily driven by the increasing sophistication of cyber threats and the urgent need for organizations to automate threat intelligence processes, enabling faster and more accurate response to security incidents. The convergence of AI, machine learning, and security automation technologies is further accelerating the adoption of automated indicator enrichment solutions globally.
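
    The base size, CAGR, and projected size quoted above are related by the standard compound-growth formula, end = start × (1 + rate)^years. The sketch below assumes annual compounding and assumes the forecast runs over the nine years from 2024 to 2033; the listing does not state which base year or period count the publisher used, so the printed values need not reproduce the quoted figures exactly. The function names are illustrative.

```python
def project(start: float, cagr: float, years: int) -> float:
    """Project a value forward with annual compounding."""
    return start * (1 + cagr) ** years


def implied_cagr(start: float, end: float, years: int) -> float:
    """Return the annual growth rate that takes start to end over the given years."""
    if start <= 0 or end <= 0 or years <= 0:
        raise ValueError("start, end and years must all be positive")
    return (end / start) ** (1 / years) - 1


# Figures as quoted in the listing (USD billion).
BASE_2024 = 1.26
TARGET_2033 = 4.01
STATED_CAGR = 0.167
YEARS = 2033 - 2024  # assumption: nine annual compounding periods

print(f"Projected 2033 size at the stated CAGR: {project(BASE_2024, STATED_CAGR, YEARS):.2f}")
print(f"CAGR implied by the two quoted sizes: {implied_cagr(BASE_2024, TARGET_2033, YEARS):.1%}")
```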




    One of the key growth factors for the Automated Indicator Enrichment market is the escalating volume and complexity of cyber threats targeting organizations of all sizes. With threat actors employing advanced tactics, techniques, and procedures (TTPs), traditional manual threat analysis processes are proving inadequate. Automated indicator enrichment enables security teams to automatically contextualize, validate, and prioritize threat indicators, significantly reducing the mean time to detect (MTTD) and respond (MTTR) to incidents. The proliferation of endpoints, cloud workloads, and interconnected digital assets has necessitated a scalable approach to threat intelligence, further fueling demand for automated solutions that can process vast amounts of threat data in real time.




    Another significant driver is the increasing regulatory pressure on organizations to maintain robust cybersecurity postures and ensure compliance with international standards such as GDPR, HIPAA, and PCI DSS. Automated indicator enrichment solutions facilitate compliance management by providing auditable, consistent, and timely threat intelligence workflows. This not only helps organizations avoid costly penalties but also enhances their overall security posture. The market is also benefitting from the growing awareness among enterprises regarding the benefits of automation in reducing human error, improving operational efficiency, and enabling proactive security measures. As a result, both large enterprises and small and medium enterprises (SMEs) are investing in advanced automated indicator enrichment platforms to stay ahead of evolving cyber threats.




    The rapid advancements in artificial intelligence (AI) and machine learning (ML) technologies have also played a pivotal role in shaping the Automated Indicator Enrichment market. Modern solutions leverage AI and ML algorithms to enrich threat indicators with contextual data from multiple sources, including threat intelligence feeds, internal logs, and external databases. This automated enrichment process enhances the accuracy of threat detection and enables security analysts to focus on high-priority incidents. Additionally, the integration of automated indicator enrichment tools with Security Information and Event Management (SIEM) and Security Orchestration, Automation, and Response (SOAR) platforms is creating new opportunities for seamless, end-to-end security automation, further driving market growth.




    From a regional perspective, North America currently dominates the Automated Indicator Enrichment market, accounting for the largest share in 2024, followed closely by Europe and the Asia Pacific. The presence of major cybersecurity vendors, high adoption rates of advanced security solutions, and stringent regulatory frameworks are key factors contributing to North America's leadership. Meanwhile, Asia Pacific is expected to witness the fastest growth over the forecast period, driven by increasing digital transformation initiatives, rising cybercrime rates, and growing investments in cybersecurity infrastructure across emerging economies such as India, China, and Southeast Asia. Europe continues to show strong growth potential, particularly in sectors like BFSI, healthcare, and government, where data protection and compliance are top priorities.



    Component Analysis



    The Automated Indicator Enrichment market is segmented by component into software and services, each playing a vital role in the ecosystem. The software segment currently holds the largest market share, owing to the increasing deployment of advanced enrichment platforms that leverage AI, ML, and big data analytics to automate the enrichment of threat indicators. These platforms

  7. Factori AI & ML Training Data | People Data | USA | Machine Learning Data

    • datarade.ai
    .json, .csv
    Updated Jul 23, 2022
    Cite
    Factori (2022). Factori AI & ML Training Data | People Data | USA | Machine Learning Data [Dataset]. https://datarade.ai/data-products/factori-ai-ml-training-data-consumer-data-usa-machine-factori
    Available download formats: .json, .csv
    Dataset updated
    Jul 23, 2022
    Dataset authored and provided by
    Factori
    Area covered
    United States of America
    Description

    Our People data is gathered and aggregated via surveys, digital services, and public data sources. We use powerful profiling algorithms to collect and ingest only fresh and reliable data points.

    Our comprehensive data enrichment solution includes a variety of data sets that can help you address gaps in your customer data, gain a deeper understanding of your customers, and power superior client experiences.

    1. Geography - City, State, ZIP, County, CBSA, Census Tract, etc.
    2. Demographics - Gender, Age Group, Marital Status, Language, etc.
    3. Financial - Income Range, Credit Rating Range, Credit Type, Net Worth Range, etc.
    4. Persona - Consumer Type, Communication Preferences, Family Type, etc.
    5. Interests - Content, Brands, Shopping, Hobbies, Lifestyle, etc.
    6. Household - Number of Children, Number of Adults, IP Address, etc.
    7. Behaviours - Brand Affinity, App Usage, Web Browsing, etc.
    8. Firmographics - Industry, Company, Occupation, Revenue, etc.
    9. Retail Purchase - Store, Category, Brand, SKU, Quantity, Price, etc.
    10. Auto - Car Make, Model, Type, Year, etc.
    11. Housing - Home Type, Home Value, Renter/Owner, Year Built, etc.

    People Data Schema & Reach: Our data reach represents the total number of counts available within various categories and comprises attributes such as country location, MAU, DAU & Monthly Location Pings:

    Data Export Methodology: Since we collect data dynamically, we provide the most updated data and insights via a best-suited method on a suitable interval (daily/weekly/monthly).

    People Data Use Cases:

    360-Degree Customer View: Get a comprehensive picture of customers by means of internal and external data aggregation.

    Data Enrichment: Leverage online-to-offline consumer profiles to build holistic audience segments and improve campaign targeting through user data enrichment.

    Fraud Detection: Use multiple digital (web and mobile) identities to verify real users and detect anomalies or fraudulent activity.

    Advertising & Marketing: Understand audience demographics, interests, lifestyle, hobbies, and behaviors to build targeted marketing campaigns.

    Here's the schema of People Data:

    person_id, first_name, last_name, age, gender, linkedin_url, twitter_url, facebook_url, city, state, address, zip, zip4, country, delivery_point_bar_code, carrier_route, walk_seuqence_code, fips_state_code, fips_country_code, country_name, latitude, longtiude, address_type, metropolitan_statistical_area, core_based+statistical_area, census_tract, census_block_group, census_block, primary_address, pre_address, streer, post_address, address_suffix, address_secondline, address_abrev, census_median_home_value, home_market_value, property_build+year, property_with_ac, property_with_pool, property_with_water, property_with_sewer, general_home_value, property_fuel_type, year, month, household_id, Census_median_household_income, household_size, marital_status, length+of_residence, number_of_kids, pre_school_kids, single_parents, working_women_in_house_hold, homeowner, children, adults, generations, net_worth, education_level, occupation, education_history, credit_lines, credit_card_user, newly_issued_credit_card_user, credit_range_new, credit_cards, loan_to_value, mortgage_loan2_amount, mortgage_loan_type, mortgage_loan2_type, mortgage_lender_code, mortgage_loan2_render_code, mortgage_lender, mortgage_loan2_lender, mortgage_loan2_ratetype, mortgage_rate, mortgage_loan2_rate, donor, investor, interest, buyer, hobby, personal_email, work_email, devices, phone, employee_title, employee_department, employee_job_function, skills, recent_job_change, company_id, company_name, company_description, technologies_used, office_address, office_city, office_country, office_state, office_zip5, office_zip4, office_carrier_route, office_latitude, office_longitude, office_cbsa_code, office_census_block_group, office_census_tract, office_county_code, company_phone, company_credit_score, company_csa_code, company_dpbc, company_franchiseflag, company_facebookurl, company_linkedinurl, company_twitterurl, company_website, company_fortune_rank, company_government_type, company_headquarters_branch, company_home_business, company_industry, company_num_pcs_used, company_num_employees, company_firm_individual, company_msa, company_msa_name, company_naics_code, company_naics_description, company_naics_code2, company_naics_description2, company_sic_code2, company_sic_code2_description, company_sic_code4, company_sic_code4_description, company_sic_code6, company_sic_code6_description, company_sic_code8, company_sic_code8_description, company_parent_company, company_parent_company_location, company_public_private, company_subsidiary_company, company_residential_business_code, company_revenue_at_side_code, company_revenue_range, company_revenue, company_sales_volume, company_small_business, company_stock_ticker, company_year_founded, company_minorityowned, company_female_owned_or_operated, company_franchise_code, company_dma, company_dma_name, company_hq_address, company_hq_city, company_hq_duns, company_hq_state, company_hq_zip5, company_hq_zip4, company_se...

  8. Supplementary data: Comparison of target enrichment strategies for ancient...

    • datasetcatalog.nlm.nih.gov
    • tandf.figshare.com
    Updated Nov 3, 2020
    Cite
    Knauf, Sascha; Furtwängler, Anja; Schuenemann, Verena J.; Cole, Stewart T.; Calvignac-Spencer, Sébastien; Reiter, Ella; Singh, Pushpendra; Arora, Natasha; Böhme, Lisa; Vollstedt, Melanie; Krause-Kyora, Ben; Neukamm, Judith; Krause, Johannes; Herbig, Alexander (2020). Supplementary data: Comparison of target enrichment strategies for ancient pathogen DNA [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000454150
    Dataset updated
    Nov 3, 2020
    Authors
    Knauf, Sascha; Furtwängler, Anja; Schuenemann, Verena J.; Cole, Stewart T.; Calvignac-Spencer, Sébastien; Reiter, Ella; Singh, Pushpendra; Arora, Natasha; Böhme, Lisa; Vollstedt, Melanie; Krause-Kyora, Ben; Neukamm, Judith; Krause, Johannes; Herbig, Alexander
    Description

    Supplementary Note 1 – Laboratory workflow
    Supplementary Note 2 – Bioinformatics and Statistical Analysis
    Supplementary Note 3 – Results of the Bioinformatics and Statistical Analysis
    Supplementary Figure 1: Comparison of (A) mean coverage, (B) standard deviation of the mean coverage, (C) enrichment factor, (D) the percentage of the genome covered 5-fold, (E) distribution of the fragment length, and (F) frequency of the aDNA damage for the ancient and modern strains of M. leprae. Three independent replicates were performed for each method. Labels of the ancient samples are in black and those of the modern samples in red. Boxplots for the array are blue, for the DNA bait capture red, and for the RNA bait capture green and grey for the first and second rounds, respectively.
    Supplementary Figure 2: Comparison of (A) mean coverage, (B) standard deviation of the mean coverage, (C) enrichment factor, (D) the percentage of the genome covered 5-fold, (E) distribution of the fragment length, and (F) frequency of the aDNA damage for the ancient and modern strains of T. pallidum. Three independent replicates were performed for each method. Labels of the ancient samples are in black and those of the modern samples in red. Boxplots for the array are blue, for the DNA bait capture red, and for the RNA bait capture green and grey for the first and second rounds, respectively.
    Supplementary Figure 3: Number of unique reads for the three replicate batches of the three tested methods. The number of unique reads in the second round of hybridization with the RNA baits does not strongly increase compared to the first round.
    Supplementary Table 1: List of all samples used in this study, grouped according to organism and age, together with the original publications.
    Supplementary Table 4: Comparison of the specific reads of the three tested protocols.
    Supplementary Table 6: Comparison of the variance within each method tested.
    Supplementary Table 7: Comparison of the costs per reaction.

  9. Data Catalog Market Analysis, Size, and Forecast 2025-2029: North America...

    • technavio.com
    pdf
    Updated Aug 15, 2025
    Cite
    Technavio (2025). Data Catalog Market Analysis, Size, and Forecast 2025-2029: North America (US and Canada), Europe (France, Germany, Italy, Russia, and UK), APAC (China, India, and Japan), and Rest of World (ROW) [Dataset]. https://www.technavio.com/report/data-catalog-market-industry-analysis
    Available download formats: pdf
    Dataset updated
    Aug 15, 2025
    Dataset provided by
    TechNavio
    Authors
    Technavio
    License

    https://www.technavio.com/content/privacy-notice

    Time period covered
    2025 - 2029
    Area covered
    United States
    Description

    Data Catalog Market Size 2025-2029

    The data catalog market size is forecast to increase by USD 5.03 billion, at a CAGR of 29.5%, from 2024 to 2029. Rising demand for self-service analytics will drive the data catalog market.

    Major Market Trends & Insights

    North America dominated the market and is estimated to account for 39% of market growth during the forecast period.
    By Component - Solutions segment was valued at USD 822.80 billion in 2023
    By Deployment - Cloud segment accounted for the largest market revenue share in 2023
    

    Market Size & Forecast

    Market Opportunities: USD 554.30 million
    Market Future Opportunities: USD 5031.50 million
    CAGR: 29.5%
    North America: Largest market in 2023
    

    Market Summary

    The market is a dynamic and evolving landscape, driven by the increasing demand for self-service analytics and the rise of data mesh architecture. Core technologies, such as metadata management and data discovery, play a crucial role in enabling organizations to effectively manage and utilize their data assets. Applications, including data governance and data integration, are also seeing significant growth as businesses seek to optimize their data management processes.
    However, maintaining catalog accuracy over time poses a challenge, with concerns surrounding data lineage, data quality, and data security. According to recent estimates, the market is expected to account for over 30% of the overall data management market share by 2025, underscoring its growing importance in the digital transformation era.
    

    What will be the Size of the Data Catalog Market during the forecast period?

    How is the Data Catalog Market Segmented and what are the key trends of market segmentation?

    The data catalog industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.

    Component
      Solutions
      Services
    Deployment
      Cloud
      On-premises
    Type
      Technical metadata
      Business metadata
      Operational metadata
    Geography
      North America
        US
        Canada
      Europe
        France
        Germany
        Italy
        Russia
        UK
      APAC
        China
        India
        Japan
      Rest of World (ROW)
    

    By Component Insights

    The solutions segment is estimated to witness significant growth during the forecast period.

    Data catalog solutions have gained significant traction in today's data-driven business landscape, addressing complexities in data discovery, governance, collaboration, and data lifecycle management. These solutions enable users to search and discover relevant datasets for analytical or reporting purposes, thereby reducing the time spent locating data, promoting data reuse, and ensuring the usage of appropriate datasets for specific tasks. Centralized metadata storage is a key feature of data catalog solutions, offering detailed information about datasets, including source, schema, data quality, lineage, and other essential attributes. This metadata-centric approach enhances understanding of data assets, supports data governance initiatives, and provides users with the necessary context for effective data utilization.

    Data catalog solutions also facilitate semantic enrichment, data versioning, data security protocols, data access control, and data model design. Semantic enrichment adds meaning and context to data, making it easier to understand and use. Data versioning ensures that different versions of datasets are managed effectively, while data access control restricts access to sensitive data. Data model design helps create an accurate representation of data structures and relationships. Moreover, data catalog solutions offer data discovery tools, data lineage tracking, data governance policies, schema management, data lake management, ETL process optimization, and data quality monitoring. Data discovery tools help users locate relevant data quickly and efficiently.

    Data lineage tracking enables users to trace the origin and movement of data throughout its lifecycle. Data governance policies ensure compliance with regulatory requirements and organizational standards. Schema management maintains the structure and consistency of data, while data lake management simplifies the management of large volumes of data. ETL process optimization improves the efficiency of data integration, and data quality monitoring ensures that data is accurate and reliable. Businesses across various sectors, including healthcare, finance, retail, and manufacturing, are increasingly adopting data catalog solutions to streamline their data management and analytics processes. According to recent studies, the adoption of data catalog solutions has grown by approximately 25%, with an estimated 30% of organizations planning to implement t

  10. Data from: Data cleaning and enrichment through data integration: networking...

    • data.niaid.nih.gov
    • search.dataone.org
    • +1more
    zip
    Updated Jan 29, 2025
    Cite
    Irene Finocchi; Alessio Martino; Blerina Sinaimeri; Fariba Ranjbar (2025). Data cleaning and enrichment through data integration: networking the Italian academia [Dataset]. http://doi.org/10.5061/dryad.wpzgmsbwj
    Available download formats: zip
    Dataset updated
    Jan 29, 2025
    Dataset provided by
    Libera Università Internazionale degli Studi Sociali Guido Carli
    Authors
    Irene Finocchi; Alessio Martino; Blerina Sinaimeri; Fariba Ranjbar
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    We describe a bibliometric network characterizing co-authorship collaborations in the entire Italian academic community. The network, consisting of 38,220 nodes and 507,050 edges, is built upon two distinct data sources: faculty information provided by the Italian Ministry of University and Research and publications available in Semantic Scholar. Both nodes and edges are associated with a large variety of semantic data, including gender, bibliometric indexes, authors' and publications' research fields, and temporal information. While linking data between the two original sources posed many challenges, the network has been carefully validated to assess its reliability and to understand its graph-theoretic characteristics. By resembling several features of social networks, our dataset can be profitably leveraged in experimental studies in the wide social network analytics domain as well as in more specific bibliometric contexts.

    Methods

    The proposed network is built starting from two distinct data sources:

    • the entire dataset dump from Semantic Scholar (with particular emphasis on the authors and papers datasets)
    • the entire list of Italian faculty members as maintained by Cineca (under appointment by the Italian Ministry of University and Research).

    By means of a custom name-identity recognition algorithm (details are available in the accompanying paper published in Scientific Data), the names of the authors in the Semantic Scholar dataset have been mapped against the names contained in the Cineca dataset and authors with no match (e.g., because of not being part of an Italian university) have been discarded. The remaining authors will compose the nodes of the network, which have been enriched with node-related (i.e., author-related) attributes. In order to build the network edges, we leveraged the papers dataset from Semantic Scholar: specifically, any two authors are said to be connected if there is at least one paper co-authored by said authors. Then, the edges have been enriched with edge-related (i.e., collaboration-related) attributes.
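    The edge-building rule described above (two authors are connected if they share at least one paper) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name, the toy identifiers, and the choice to store the number of shared papers as an edge attribute are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def build_coauthor_edges(papers, matched_authors):
    """Return {(author_a, author_b): number_of_shared_papers}.

    papers: hypothetical mapping of paper id -> list of author ids.
    matched_authors: ids that survived the name-matching step.
    """
    edges = defaultdict(int)
    for author_ids in papers.values():
        # Discard authors with no match in the faculty list.
        kept = sorted(set(author_ids) & matched_authors)
        # Every unordered pair of remaining co-authors gets an edge.
        for a, b in combinations(kept, 2):
            edges[(a, b)] += 1
    return dict(edges)


# Toy data, invented for illustration only.
papers = {
    "p1": ["a1", "a2", "a3"],
    "p2": ["a1", "a2"],
    "p3": ["a3", "x9"],  # x9 has no match, so p3 yields no edge
}
matched = {"a1", "a2", "a3"}
edges = build_coauthor_edges(papers, matched)
```

    Sorting the author ids before pairing keeps each undirected edge under a single key, so (a1, a2) and (a2, a1) are never counted separately.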

  11. Recurrent functional misinterpretation of RNA-seq data caused by...

    • plos.figshare.com
    pdf
    Updated Jun 1, 2023
    Cite
    Shir Mandelboum; Zohar Manber; Orna Elroy-Stein; Ran Elkon (2023). Recurrent functional misinterpretation of RNA-seq data caused by sample-specific gene length bias [Dataset]. http://doi.org/10.1371/journal.pbio.3000481
    Available download formats: pdf
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Shir Mandelboum; Zohar Manber; Orna Elroy-Stein; Ran Elkon
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data normalization is a critical step in RNA sequencing (RNA-seq) analysis, aiming to remove systematic effects from the data to ensure that technical biases have minimal impact on the results. Analyzing numerous RNA-seq datasets, we detected a prevalent sample-specific length effect that leads to a strong association between gene length and fold-change estimates between samples. This stochastic sample-specific effect is not corrected by common normalization methods, including reads per kilobase of transcript length per million reads (RPKM), Trimmed Mean of M values (TMM), relative log expression (RLE), and quantile and upper-quartile normalization. Importantly, we demonstrate that this bias causes recurrent false positive calls by gene-set enrichment analysis (GSEA) methods, thereby leading to frequent functional misinterpretation of the data. Gene sets characterized by markedly short genes (e.g., ribosomal protein genes) or long genes (e.g., extracellular matrix genes) are particularly prone to such false calls. This sample-specific length bias is effectively removed by the conditional quantile normalization (cqn) and EDASeq methods, which allow the integration of gene length as a sample-specific covariate. Consequently, using these normalization methods led to substantial reduction in GSEA false results while retaining true ones. In addition, we found that application of gene-set tests that take into account gene–gene correlations attenuates false positive rates caused by the length bias, but statistical power is reduced as well. Our results advocate the inspection and correction of sample-specific length biases as default steps in RNA-seq analysis pipelines and reiterate the need to account for intergene correlations when performing gene-set enrichment tests to lessen false interpretation of transcriptomic data.
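    A quick diagnostic for the association described above is a rank correlation between gene length and fold-change estimates for one sample pair. The sketch below is only that diagnostic, written with the standard library; it is not the cqn or EDASeq correction, and the gene lengths and fold changes are invented toy values.

```python
import math


def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Toy values, invented for illustration only.
gene_lengths = [500, 1200, 3000, 8000, 20000]
log2_fold_changes = [-0.8, -0.3, 0.1, 0.4, 0.9]
rho = spearman(gene_lengths, log2_fold_changes)
```

    In this toy case the fold changes rise monotonically with length, so rho comes out close to 1; a value near 0 would suggest no length effect for that sample pair.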

  12. Data Sheet 1_Data enrichment for semantic segmentation of point clouds for...

    • frontiersin.figshare.com
    pdf
    Updated Jun 11, 2025
    Cite
    David Crampen; Joerg Blankenbach (2025). Data Sheet 1_Data enrichment for semantic segmentation of point clouds for the generation of geometric-semantic road models.pdf [Dataset]. http://doi.org/10.3389/fbuil.2025.1607375.s001
    Available download formats: pdf
    Dataset updated
    Jun 11, 2025
    Dataset provided by
    Frontiers
    Authors
    David Crampen; Joerg Blankenbach
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Digitalizing highway infrastructure is gaining interest in Germany and other countries due to the need for greater efficiency and sustainability. The maintenance of the built infrastructure accounts for nearly 30% of greenhouse gas emissions in Germany. To address this, Digital Twins are emerging as tools to optimize road systems. A Digital Twin of a built asset relies on a geometric-semantic as-is model of the area of interest, where an essential step for automated model generation is the semantic segmentation of reality capture data. While most approaches handle data without considering real-world context, our approach leverages existing geospatial data to enrich the data foundation through an adaptive feature extraction workflow. This workflow is adaptable to various model architectures, from deep learning methods like PointNet++ and PointNeXt to traditional machine learning models such as Random Forest. Our four-step workflow significantly boosts performance, improving overall accuracy by 20% and unweighted mean Intersection over Union (mIoU) by up to 43.47%. The target application is the semantic segmentation of point clouds in road environments. Additionally, the proposed modular workflow can be easily customized to fit diverse data sources and enhance semantic segmentation performance in a model-agnostic way.
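    For readers unfamiliar with the metric reported above, unweighted mIoU averages the per-class Intersection over Union without weighting classes by how often they occur. The following is a minimal sketch of that computation on per-point labels; the function name and toy labels are invented, and skipping classes that appear in neither the ground truth nor the prediction is an assumption of this example rather than something the description specifies.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Unweighted mean of per-class IoU over per-point labels."""
    ious = []
    for c in range(num_classes):
        pairs = list(zip(y_true, y_pred))
        intersection = sum(1 for t, p in pairs if t == c and p == c)
        union = sum(1 for t, p in pairs if t == c or p == c)
        if union > 0:  # assumption: ignore classes absent from both
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy labels for six points and three classes, for illustration only.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(y_true, y_pred, num_classes=3)
```

    Here the per-class IoU values are 1/3, 2/3, and 1/2, so the unweighted mean is 0.5, while overall accuracy on the same points would be 4/6. The two metrics can diverge noticeably when classes are imbalanced.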

  13. Data from: Laboratory mouse housing conditions can be improved using common...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Apr 16, 2018
    Cite
    de Angelis, Martin Hrabé; Ollert, Markus; Fuchs, Helmut; Rathkolb, Birgit; Racz, Ildiko; Wurst, Wolfgang; Scheideler, Angelika; Bekeredjian, Raffi; Becker, Lore; Garrett, Lillian; Wolf, Eckhard; Hans, Wolfgang; Klingenspor, Martin; Graw, Jochen; Hölter, Sabine M.; Rozman, Jan; Janik, Dirk; Aguilar-Pimentel, Juan A.; Neff, Frauke; Östereicher, Manuela; André, Viola; Klopstock, Thomas; Schmidt-Weber, Carsten; Gau, Christine; Moreth, Kristin; Amarie, Oana V.; Brielmeier, Markus; Gailus-Durner, Valérie (2018). Laboratory mouse housing conditions can be improved using common environmental enrichment without compromising data [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000628132
    Dataset updated
    Apr 16, 2018
    Authors
    de Angelis, Martin Hrabé; Ollert, Markus; Fuchs, Helmut; Rathkolb, Birgit; Racz, Ildiko; Wurst, Wolfgang; Scheideler, Angelika; Bekeredjian, Raffi; Becker, Lore; Garrett, Lillian; Wolf, Eckhard; Hans, Wolfgang; Klingenspor, Martin; Graw, Jochen; Hölter, Sabine M.; Rozman, Jan; Janik, Dirk; Aguilar-Pimentel, Juan A.; Neff, Frauke; Östereicher, Manuela; André, Viola; Klopstock, Thomas; Schmidt-Weber, Carsten; Gau, Christine; Moreth, Kristin; Amarie, Oana V.; Brielmeier, Markus; Gailus-Durner, Valérie
    Description

    Animal welfare requires the adequate housing of animals to ensure health and well-being. The application of environmental enrichment is a way to improve the well-being of laboratory animals. However, it is important to know whether these enrichment items can be incorporated in experimental mouse husbandry without creating a divide between past and future experimental results. Previous small-scale studies have been inconsistent throughout the literature, and it is not yet completely understood whether and how enrichment might endanger comparability of results of scientific experiments. Here, we measured the effect on means and variability of 164 physiological parameters in 3 conditions: with nesting material with or without a shelter, comparing these 2 conditions to a “barren” regime without any enrichments. We studied a total of 360 mice from each of 2 mouse strains (C57BL/6NTac and DBA/2NCrl) and both sexes for each of the 3 conditions. Our study indicates that enrichment affects the mean values of some of the 164 parameters with no consistent effects on variability. However, the influence of enrichment appears negligible compared to the effects of other influencing factors. Therefore, nesting material and shelters may be used to improve animal welfare without impairment of experimental outcome or loss of comparability to previous data collected under barren housing conditions.

  14. Data from: Sodium enrichment of Mercury's subsurface through diffusion

    • zenodo.org
    bin
    Updated Mar 28, 2024
    Cite
    Sébastien Verkercke; Sébastien Verkercke (2024). Sodium enrichment of Mercury's subsurface through diffusion [Dataset]. http://doi.org/10.5281/zenodo.10890104
    Available download formats: bin
    Dataset updated
    Mar 28, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sébastien Verkercke; Sébastien Verkercke
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The data contain the temporal evolution of the sodium density in Mercury's subsurface. The four files correspond to the sodium density evolution at the cold and hot longitudes, and for two different surface binding energy distributions.

    Description of the data and file structure

    Each file is structured as follows:

    Line 1 : Thickness of each layer of regolith starting from the surface

    Line 2 : Porosity of each layer starting from the surface

    Each following group of 5 lines corresponds to the evolution over 3000 seconds, with:

    Line 3 + (5 n) : Time

    Line 4 + (5 n) : Mean surface binding energy

    Line 5 + (5 n) : Density of adsorbates per layer starting from the surface

    Line 6 + (5 n) : Density of gaseous sodium per layer starting from the surface

    Line 7 + (5 n) : Density of sodium (adsorbate and gas) per layer starting from the surface

    The figures in the letter use only the density of adsorbates
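    The layout above (two header lines, then repeating groups of five lines) could be read with a short parser along these lines. This is a sketch under stated assumptions: the description does not give the delimiter, so whitespace-separated numbers are assumed, blank lines are skipped, and the function name, dictionary keys, and sample values are all invented for illustration.

```python
def parse_density_file(lines):
    """Parse the described layout into header arrays and time snapshots.

    Assumes whitespace-separated numeric values (not stated in the source).
    """
    rows = [[float(x) for x in line.split()] for line in lines if line.strip()]
    thickness, porosity = rows[0], rows[1]  # Line 1 and Line 2
    body = rows[2:]
    snapshots = []
    # Walk the body in complete groups of five lines (Lines 3..7 + 5n).
    for start in range(0, len(body) - len(body) % 5, 5):
        time, energy, adsorbed, gas, total = body[start:start + 5]
        snapshots.append({
            "time": time[0],
            "mean_binding_energy": energy[0],
            "adsorbed": adsorbed,  # per layer, starting from the surface
            "gas": gas,
            "total": total,
        })
    return thickness, porosity, snapshots


# Invented sample with two layers and two snapshots, for illustration only.
sample = """\
0.1 0.2
0.8 0.7
0.0
1.5
10 5
1 2
11 7
3000.0
1.6
9 6
2 1
11 7
"""
thickness, porosity, snapshots = parse_density_file(sample.splitlines())
```

    For a real file, the same function could be called on an open file handle, since it only iterates over lines; if the files turn out to use commas or another delimiter, the split call would need adjusting.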

    Sharing/Access information

    The data and the software can also be requested directly from the author of the letter.

    Code/Software

    The software is detailed in the letter.

  15. Molecular dataset on Denitrifying Anaerobic Methane Oxidation (DAMO)...

    • catalog.data.gov
    • gimi9.com
    Updated Jan 20, 2025
    Cite
    U.S. EPA Office of Research and Development (ORD) (2025). Molecular dataset on Denitrifying Anaerobic Methane Oxidation (DAMO) Enrichment [Dataset]. https://catalog.data.gov/dataset/molecular-dataset-on-denitrifying-anaerobic-methane-oxidation-damo-enrichment
    Dataset updated
    Jan 20, 2025
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    Raw sequencing data from Denitrifying Anaerobic Methane Oxidation (DAMO) experiments and the relevant statistical data generated by various bioinformatics tools. This dataset is not publicly accessible because: All the experiments for this study were not performed in EPA but in co-authors' institution which has managed the project and prepared a manuscript for peer-reviewed journal submission. It can be accessed through the following means: The raw data will be made available by the authors on request (Dr. Yaohuan Gao, gaoyaohuan@xjtu.edu.cn). Format: Not available because the raw data was not generated in EPA. This dataset is associated with the following publication: Xia, L., Y. Wang, P. Yao, H. Ryu, Z. Dong, C. Tan, S. Deng, H. Liao, and Y. Gao. The effects of model insoluble copper compounds in anoxic sedimentary environment on denitrifying anaerobic methane oxidation (DAMO) activity. Microorganisms. MDPI, Basel, SWITZERLAND, 12(11): 2259, (2024).

  16. Data from: Hierarchical Hybrid Enrichment: multi-tiered genomic data...

    • datadryad.org
    zip
    Updated Nov 19, 2019
    Cite
    Sarah Banker; Alan Lemmon; Alyssa Hassinger; Mysia Dye; Sean Holland; Michelle Kortyna; Oscar Ospina; Hannah Ralicki; Emily Lemmon (2019). Hierarchical Hybrid Enrichment: multi-tiered genomic data collection across evolutionary scales, with application to chorus frogs (Pseudacris) [Dataset]. http://doi.org/10.5061/dryad.0sf2hm8
    Available download formats: zip
    Dataset updated
    Nov 19, 2019
    Dataset provided by
    Dryad
    Authors
    Sarah Banker; Alan Lemmon; Alyssa Hassinger; Mysia Dye; Sean Holland; Michelle Kortyna; Oscar Ospina; Hannah Ralicki; Emily Lemmon
    Time period covered
    May 17, 2019
    Description

    Determining the optimal targets of genomic sub-sampling for phylogenomics, phylogeography, and population genomics remains a challenge for evolutionary biologists. Of the available methods for sub-sampling the genome, hybrid enrichment (sequence capture) has become one of the primary means of data collection for systematics, due to the flexibility and cost efficiency of this approach. Despite the utility of this method, information is lacking as to what genomic targets are most appropriate for addressing questions at different evolutionary scales. In this study, first we compare the benefits of target loci developed for deep- and shallow-scales by comparing these loci at each of three taxonomic levels: within a genus (phylogenetics), within a species (phylogeography) and within a hybrid zone (population genomics). Specifically, we target evolutionary conserved loci that are appropriate for deep phylogenetic scales and more rapidly evolving loci that are informative for phylogeographic a...

  17. Data from: Affordable de novo generation of fish mitogenomes using...

    • data.niaid.nih.gov
    • search.dataone.org
    • +2more
    zip
    Updated Jun 28, 2024
    Cite
    Ana Ramon-Laca (2024). Affordable de novo generation of fish mitogenomes using amplification-free enrichment of mitochondrial DNA and deep sequencing of long fragments [Dataset]. http://doi.org/10.5061/dryad.jm63xsjdj
    Available download formats: zip
    Dataset updated
    Jun 28, 2024
    Dataset provided by
    NOAA National Marine Fisheries Service
    Authors
    Ana Ramon-Laca
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Biomonitoring surveys from environmental DNA make use of metabarcoding tools to describe the community composition. These studies match their sequencing results against public genomic databases to identify the species. However, mitochondrial genomic reference data are as yet incomplete, only a few genes may be available, or the suitability of existing sequence data is suboptimal for species-level resolution. Here we present a dedicated and cost-effective workflow with no DNA amplification for generating complete fish mitogenomes for the purpose of strengthening fish mitochondrial databases. Two different long-fragment sequencing approaches using Oxford Nanopore sequencing coupled with mitochondrial DNA enrichment were used. In the first, enrichment is achieved by preferential isolation of mitochondria followed by DNA extraction and nuclear DNA depletion (‘mitoenrichment’). The second enrichment approach takes advantage of the CRISPR-Cas9 targeted scission on previously dephosphorylated DNA (‘targeted mitosequencing’). The sequencing results varied between tissue, species, and integrity of the DNA. The mitoenrichment method yielded 0.17-12.33 % of sequences on target and a mean coverage ranging from 74.9 to 805-fold. The targeted mitosequencing experiment from native genomic DNA yielded 1.83-55 % of sequences on target and a 38 to 2123-fold mean coverage. This produced the complete mitogenomes of species with homopolymeric regions, tandem repeats, and gene rearrangements. We demonstrate that deep sequencing of long fragments of native fish DNA is possible and can be achieved with low computational resources in a cost-effective manner, opening the discovery of mitogenomes of non-model or understudied fish taxa to a broad range of laboratories worldwide.

  18. Data from: Enriching the ant tree of life: enhanced UCE bait set for...

    • data.niaid.nih.gov
    • datadryad.org
    • +1more
    zip
    Updated Feb 14, 2017
    Cite
    Michael G. Branstetter; John T. Longino; Philip S. Ward; Brant C. Faircloth (2017). Enriching the ant tree of life: enhanced UCE bait set for genome-scale phylogenetics of ants and other Hymenoptera [Dataset]. http://doi.org/10.5061/dryad.89n87
    Available download formats: zip
    Dataset updated
    Feb 14, 2017
    Dataset provided by
    Louisiana State University of Alexandria
    University of California, Davis
    University of Utah
    Authors
    Michael G. Branstetter; John T. Longino; Philip S. Ward; Brant C. Faircloth
    License

    https://spdx.org/licenses/CC0-1.0.html

    Area covered
    Global
    Description
    1. Targeted enrichment of conserved genomic regions (e.g., ultraconserved elements or UCEs) has emerged as a promising tool for inferring evolutionary history in many organismal groups. Because the UCE approach is still relatively new, much remains to be learned about how best to identify UCE loci and design baits to enrich them.

    2. We test an updated UCE identification and bait design workflow for the insect order Hymenoptera, with a particular focus on ants. The new strategy augments a previous bait design for Hymenoptera by (a) changing the parameters by which conserved genomic regions are identified and retained, and (b) increasing the number of genomes used for locus identification and bait design. We perform in vitro validation of the approach in ants by synthesizing an ant-specific bait set that targets UCE loci and a set of “legacy” phylogenetic markers. Using this bait set, we generate new data for 84 taxa (16/17 ant subfamilies) and extract loci from an additional 17 genome-enabled taxa. We then use these data to examine UCE capture success and phylogenetic performance across ants. We also test the workability of extracting legacy markers from enriched samples and combining the data with published data sets.

    3. The updated bait design (hym-v2) contained a total of 2,590-targeted UCE loci for Hymenoptera, significantly increasing the number of loci relative to the original bait set (hym-v1; 1,510 loci). Across 38 genome-enabled Hymenoptera and 84 enriched samples, experiments demonstrated a high and unbiased capture success rate, with the mean locus enrichment rate being 2,214 loci per sample. Phylogenomic analyses of ants produced a robust tree that included strong support for previously uncertain relationships. Complementing the UCE results, we successfully enriched legacy markers, combined the data with published Sanger data sets, and generated a comprehensive ant phylogeny containing 1,060 terminals.

    4. Overall, the new UCE bait design strategy resulted in an enhanced bait set for genome-scale phylogenetics in ants and likely all of Hymenoptera. Our in vitro tests demonstrate the utility of the updated design workflow, providing evidence that this approach could be applied to any organismal group with available genomic information.

  19. Plant Functional trait data from N & P experiments

    • search.dataone.org
    • knb.ecoinformatics.org
    Updated Jan 6, 2015
    Cite
    Justin Nowakowski; Bryan Dewsbury; Danielle Ogurcak (2015). Plant Functional trait data from N & P experiments [Dataset]. http://doi.org/10.5063/AA/Dews.5.1
    Dataset updated
    Jan 6, 2015
    Dataset provided by
    Knowledge Network for Biocomplexity
    Authors
    Justin Nowakowski; Bryan Dewsbury; Danielle Ogurcak
    Time period covered
    Jan 1, 1993 - Jan 1, 2008
    Description

    The search terms 'phosphorus', 'nitrogen', 'nutrient-enrichment' and 'limitation' were used in BIOSIS. From this search a collection of peer-reviewed literature spanning 1992-2008 was compiled. Specifically, we were looking for nutrient enrichment experiment data for coastal and wetland sites. From the literature we extracted nutrient dosages, significant functional trait changes, the associated 'new' means, and the associated standard errors/variances and p-values.

  20. SLEPR: A Sample-Level Enrichment-Based Pathway Ranking Method — Seeking...

    • plos.figshare.com
    tiff
    Updated Jun 1, 2023
    Cite
    Ming Yi; Robert M. Stephens (2023). SLEPR: A Sample-Level Enrichment-Based Pathway Ranking Method — Seeking Biological Themes through Pathway-Level Consistency [Dataset]. http://doi.org/10.1371/journal.pone.0003288
    Available download formats: tiff
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Ming Yi; Robert M. Stephens
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of microarray and other high throughput data often involves identification of genes consistently up or down-regulated across samples as the first step in extraction of biological meaning. This gene-level paradigm can be limited as a result of valid sample fluctuations and biological complexities. In this report, we describe a novel method, SLEPR, which eliminates this limitation by relying on pathway-level consistencies. Our method first selects the sample-level differentiated genes from each individual sample, capturing genes missed by other analysis methods, ascertains the enrichment levels of associated pathways from each of those lists, and then ranks annotated pathways based on the consistency of enrichment levels of individual samples from both sample classes. As a proof of concept, we have used this method to analyze three public microarray datasets with a direct comparison with the GSEA method, one of the most popular pathway-level analysis methods in the field. We found that our method was able to reproduce the earlier observations with significant improvements in depth of coverage for validated or expected biological themes, and that it also produced additional insights that make biological sense. This new method extends existing analysis approaches and facilitates integration of different types of HTP data.

Cite
he jian kang; Zhang Guohua (2024). Data for Development of the Enrichment Mentality Questionnaire [Dataset]. http://doi.org/10.57760/sciencedb.18124

Data for Development of the Enrichment Mentality Questionnaire

Explore at:
315 scholarly articles cite this dataset (View in Google Scholar)
Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Apr 24, 2024
Dataset provided by
Science Data Bank
Authors
he jian kang; Zhang Guohua
Description

Study 1 Text preparation (specific questionnaire questions can be found in the paper) Vocabulary 1: 845 words related to material wealth and spiritual wealth, as well as their relationship, in The Contemporary Chinese Dictionary (7th edition); Vocabulary 2: Further screening, deleting irrelevant words, merging synonyms, and organizing a total of 69 sets of vocabulary. Test (detailed information can be found in the paper, variable labels and meanings can be found in SPSS data) Data 1: In August 2021, questionnaires were distributed through online platforms with an IP address limited to Zhejiang Province. A total of 503 responses were received, and invalid responses such as short answer times and regular responses were deleted, resulting in 462 valid responses (91.85%). Data 2: In September 2021, questionnaires were distributed through online platforms with IP addresses limited to Zhejiang Province. A total of 208 responses were received, and invalid responses such as short response times and regular responses were deleted, resulting in 201 valid responses (96.63%). Study 2 Test (detailed information can be found in the paper, variable labels and meanings can be found in SPSS data) Data 3: From July to August 2023, questionnaires were distributed through online platforms with IP addresses limited to Zhejiang Province. A total of 1045 answer sheets were collected. Deleting invalid answers such as short answer times and regular responses resulted in 937 valid responses (89.67%).
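The valid-response percentages quoted in the description can be checked from the counts it gives. The short sketch below simply recomputes them; the function and variable names are invented for this example.

```python
def valid_rate(valid, received):
    """Percentage of received responses kept, rounded to two decimals."""
    return round(100 * valid / received, 2)


# (valid, received) counts as stated in the description.
samples = {"Data 1": (462, 503), "Data 2": (201, 208), "Data 3": (937, 1045)}
rates = {name: valid_rate(v, r) for name, (v, r) in samples.items()}
```

The recomputed values are 91.85, 96.63, and 89.67, which agree with the percentages reported for the three samples.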
