Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data archive accompanies our work, in which we analyze a pseudo-relevance retrieval method based on the results of web search engines. By enriching topics with text data from web search engine result pages and linked contents, we train topic-specific and cost-efficient classifiers that can be used to search test collections for relevant documents. Building on attempts initially made at TREC Common Core 2018 by Grossman and Cormack, we address the question of system performance over time across different search engines, queries, and test collections. Our experimental results show how and to what extent the considered components affect retrieval performance. Overall, the analyzed method is robust in terms of average retrieval performance and a promising way to use web content for the data enrichment of relevance feedback methods.
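The idea of training a topic-specific classifier on web text and then using it to rank a test collection can be sketched as follows. This is a minimal illustration only: the description above does not specify the classifier, so a smoothed log-odds term weighting is used here as a stand-in, and all function names and example texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def train_topic_scorer(web_texts, background_texts):
    """Learn per-term log-odds weights.

    web_texts: text scraped for the topic (treated as pseudo-relevant).
    background_texts: unrelated text (treated as pseudo-non-relevant).
    Both inputs are assumptions of this sketch.
    """
    pos = Counter(t for doc in web_texts for t in tokenize(doc))
    neg = Counter(t for doc in background_texts for t in tokenize(doc))
    vocab = set(pos) | set(neg)
    # Add-one smoothing so unseen terms do not produce log(0).
    pos_total = sum(pos.values()) + len(vocab)
    neg_total = sum(neg.values()) + len(vocab)
    return {
        t: math.log((pos[t] + 1) / pos_total) - math.log((neg[t] + 1) / neg_total)
        for t in vocab
    }


def score(weights, document):
    """Average term weight of a document; unknown terms contribute 0."""
    tokens = tokenize(document)
    if not tokens:
        return 0.0
    return sum(weights.get(t, 0.0) for t in tokens) / len(tokens)


def rank(weights, collection):
    """Return document ids sorted from most to least topic-like."""
    return sorted(collection, key=lambda d: score(weights, collection[d]), reverse=True)


weights = train_topic_scorer(
    ["solar panels convert sunlight into energy", "cost of solar energy installations"],
    ["the football match ended in a draw", "stock markets fell on monday"],
)
collection = {"d1": "solar panels produce energy", "d2": "the football match ended"}
ranking = rank(weights, collection)
```

In a real setting the pseudo-relevant texts would come from search engine result pages and linked pages for the topic, and the ranked collection would be evaluated against relevance judgments.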
Our People data is gathered and aggregated via surveys, digital services, and public data sources. We use powerful profiling algorithms to collect and ingest only fresh and reliable data points.
Our comprehensive data enrichment solution includes a variety of data sets that can help you address gaps in your customer data, gain a deeper understanding of your customers, and power superior client experiences.
People Data Schema & Reach: Our data reach represents the total number of counts available within various categories and comprises attributes such as country location, MAU, DAU & Monthly Location Pings:
Data Export Methodology: Since we collect data dynamically, we provide the most updated data and insights via a best-suited method on a suitable interval (daily/weekly/monthly).
People Data Use Cases:
360-Degree Customer View: Get a comprehensive image of customers by means of internal and external data aggregation.
Data Enrichment: Leverage online-to-offline consumer profiles to build holistic audience segments and improve campaign targeting through user data enrichment.
Fraud Detection: Use multiple digital (web and mobile) identities to verify real users and detect anomalies or fraudulent activity.
Advertising & Marketing: Understand audience demographics, interests, lifestyle, hobbies, and behaviors to build targeted marketing campaigns.
Using Factori People Data you can solve use cases like:
Acquisition Marketing: Expand your reach to new users and customers by using lookalike modeling with your first-party audiences to reach other potential consumers with similar traits and attributes.
Lookalike Modeling: Build lookalike audience segments using your first-party audiences as a seed to extend your reach when running marketing campaigns to acquire new users or customers.
Other use cases include CRM Data Enrichment, Consumer Data Enrichment, B2B Data Enrichment, B2C Data Enrichment, Customer Acquisition, Audience Segmentation, 360-Degree Customer View, Consumer Profiling, and Consumer Behaviour Data.
Here's the schema of People Data:
person_id
first_name
last_name
age
gender
linkedin_url
twitter_url
facebook_url
city
state
address
zip
zip4
country
delivery_point_bar_code
carrier_route
walk_seuqence_code
fips_state_code
fips_country_code
country_name
latitude
longtiude
address_type
metropolitan_statistical_area
core_based+statistical_area
census_tract
census_block_group
census_block
primary_address
pre_address
streer
post_address
address_suffix
address_secondline
address_abrev
census_median_home_value
home_market_value
property_build+year
property_with_ac
property_with_pool
property_with_water
property_with_sewer
general_home_value
property_fuel_type
year
month
household_id
Census_median_household_income
household_size
marital_status
length+of_residence
number_of_kids
pre_school_kids
single_parents
working_women_in_house_hold
homeowner
children
adults
generations
net_worth
education_level
occupation
education_history
credit_lines
credit_card_user
newly_issued_credit_card_user
credit_range_new
credit_cards
loan_to_value
mortgage_loan2_amount
mortgage_loan_type
mortgage_loan2_type
mortgage_lender_code
mortgage_loan2_render_code
mortgage_lender
mortgage_loan2_lender
mortgage_loan2_ratetype
mortgage_rate
mortgage_loan2_rate
donor
investor
interest
buyer
hobby
personal_email
work_email
devices
phone
employee_title
employee_department
employee_job_function
skills
recent_job_change
company_id
company_name
company_description
technologies_used
office_address
office_city
office_country
office_state
office_zip5
office_zip4
office_carrier_route
office_latitude
office_longitude
office_cbsa_code
office_census_block_group
office_census_tract
office_county_code
company_phone
company_credit_score
company_csa_code
company_dpbc
company_franchiseflag
company_facebookurl
company_linkedinurl
company_twitterurl
company_website
company_fortune_rank
company_government_type
company_headquarters_branch
company_home_business
company_industry
company_num_pcs_used
company_num_employees
company_firm_individual
company_msa
company_msa_name
company_naics_code
company_naics_description
company_naics_code2
company_naics_description2
company_sic_code2
company_sic_code2_description
company_sic...
Our consumer data is gathered and aggregated via surveys, digital services, and public data sources. We use powerful profiling algorithms to collect and ingest only fresh and reliable data points.
Our comprehensive data enrichment solution includes a variety of data sets that can help you address gaps in your customer data, gain a deeper understanding of your customers, and power superior client experiences.
1. Geography - City, State, ZIP, County, CBSA, Census Tract, etc.
2. Demographics - Gender, Age Group, Marital Status, Language, etc.
3. Financial - Income Range, Credit Rating Range, Credit Type, Net Worth Range, etc.
4. Persona - Consumer Type, Communication Preferences, Family Type, etc.
5. Interests - Content, Brands, Shopping, Hobbies, Lifestyle, etc.
6. Household - Number of Children, Number of Adults, IP Address, etc.
7. Behaviours - Brand Affinity, App Usage, Web Browsing, etc.
8. Firmographics - Industry, Company, Occupation, Revenue, etc.
9. Retail Purchase - Store, Category, Brand, SKU, Quantity, Price, etc.
10. Auto - Car Make, Model, Type, Year, etc.
11. Housing - Home Type, Home Value, Renter/Owner, Year Built, etc.
Consumer Graph Schema & Reach: Our data reach represents the total number of counts available within various categories and comprises attributes such as country location, MAU, DAU & Monthly Location Pings:
Data Export Methodology: Since we collect data dynamically, we provide the most updated data and insights via a best-suited method on a suitable interval (daily/weekly/monthly).
Consumer Graph Use Cases:
360-Degree Customer View: Get a comprehensive image of customers by means of internal and external data aggregation.
Data Enrichment: Leverage online-to-offline consumer profiles to build holistic audience segments and improve campaign targeting through user data enrichment.
Fraud Detection: Use multiple digital (web and mobile) identities to verify real users and detect anomalies or fraudulent activity.
Advertising & Marketing: Understand audience demographics, interests, lifestyle, hobbies, and behaviors to build targeted marketing campaigns.
Here's the schema of Consumer Data:
person_id
first_name
last_name
age
gender
linkedin_url
twitter_url
facebook_url
city
state
address
zip
zip4
country
delivery_point_bar_code
carrier_route
walk_seuqence_code
fips_state_code
fips_country_code
country_name
latitude
longtiude
address_type
metropolitan_statistical_area
core_based+statistical_area
census_tract
census_block_group
census_block
primary_address
pre_address
streer
post_address
address_suffix
address_secondline
address_abrev
census_median_home_value
home_market_value
property_build+year
property_with_ac
property_with_pool
property_with_water
property_with_sewer
general_home_value
property_fuel_type
year
month
household_id
Census_median_household_income
household_size
marital_status
length+of_residence
number_of_kids
pre_school_kids
single_parents
working_women_in_house_hold
homeowner
children
adults
generations
net_worth
education_level
occupation
education_history
credit_lines
credit_card_user
newly_issued_credit_card_user
credit_range_new
credit_cards
loan_to_value
mortgage_loan2_amount
mortgage_loan_type
mortgage_loan2_type
mortgage_lender_code
mortgage_loan2_render_code
mortgage_lender
mortgage_loan2_lender
mortgage_loan2_ratetype
mortgage_rate
mortgage_loan2_rate
donor
investor
interest
buyer
hobby
personal_email
work_email
devices
phone
employee_title
employee_department
employee_job_function
skills
recent_job_change
company_id
company_name
company_description
technologies_used
office_address
office_city
office_country
office_state
office_zip5
office_zip4
office_carrier_route
office_latitude
office_longitude
office_cbsa_code
office_census_block_group
office_census_tract
office_county_code
company_phone
company_credit_score
company_csa_code
company_dpbc
company_franchiseflag
company_facebookurl
company_linkedinurl
company_twitterurl
company_website
company_fortune_rank
company_government_type
company_headquarters_branch
company_home_business
company_industry
company_num_pcs_used
company_num_employees
company_firm_individual
company_msa
company_msa_name
company_naics_code
company_naics_description
company_naics_code2
company_naics_description2
company_sic_code2
company_sic_code2_description
company_sic_code4
company_sic_code4_description
company_sic_code6
company_sic_code6_description
company_sic_code8
company_sic_code8_description
company_parent_company
company_parent_company_location
company_public_private
company_subsidiary_company
company_residential_business_code
company_revenue_at_side_code
company_revenue_range
company_revenue
company_sales_volume
company_small_business
company_stock_ticker
company_year_founded
company_minorityowned
company_female_owned_or_operated
company_franchise_code
company_dma
company_dma_name
company_hq_address
company_hq_city
company_hq_duns
company_hq_state
company_hq_zip5
company_hq_zip4
co...
Access over 5 million verified company email addresses through our advanced API or intuitive web screener platform. Our comprehensive European B2B email database goes beyond basic contact information, providing phone numbers and website details that empower your outreach efforts. We refresh this data monthly to ensure maximum accuracy and reliability, giving you confidence in every connection you make.
The monthly data refresh cycle ensures you're always working with current information, reducing bounce rates and improving deliverability. This commitment to accuracy means better ROI on your marketing spend and more productive sales conversations. Our verification process eliminates outdated contacts, so you can focus your energy on prospects who are ready to engage.
With both API integration and web-based access options, you can seamlessly incorporate our data into your existing workflows or use our standalone platform to research and build targeted lists. This flexibility makes it easy for teams of any size to harness the power of verified contact data for their specific needs.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Supplementary Note 1 – Laboratory workflow
Supplementary Note 2 - Bioinformatics and Statistical Analysis
Supplementary Note 3 – Results of the Bioinformatics and Statistical Analysis
Supplementary Figure 1: Comparison of (A) mean coverage, (B) standard deviation of the mean coverage, (C) enrichment factor, (D) percentage of the genome covered 5-fold, (E) distribution of the fragment length, and (F) frequency of aDNA damage for the ancient and modern strains of M. leprae. Three independent replicates were performed for each method. Labels of the ancient samples are in black and those of the modern samples in red. Boxplots for the array are blue, for the DNA bait capture red, and for the RNA bait capture green and grey for the first and second round, respectively.
Supplementary Figure 2: Comparison of (A) mean coverage, (B) standard deviation of the mean coverage, (C) enrichment factor, (D) percentage of the genome covered 5-fold, (E) distribution of the fragment length, and (F) frequency of aDNA damage for the ancient and modern strains of T. pallidum. Three independent replicates were performed for each method. Labels of the ancient samples are in black and those of the modern samples in red. Boxplots for the array are blue, for the DNA bait capture red, and for the RNA bait capture green and grey for the first and second round, respectively.
Supplementary Figure 3: Number of unique reads for the three replicate batches of the three tested methods. The number of unique reads in the second round of hybridization with the RNA baits does not strongly increase compared to the first round.
Supplementary Table 1: List of all samples used in this study, grouped according to organism and age, together with the original publications.
Supplementary Table 4: Comparison of the specific reads of the three tested protocols.
Supplementary Table 6: Comparison of the variance within each method tested.
Supplementary Table 7: Comparison of the costs per reaction.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Digitalizing highway infrastructure is gaining interest in Germany and other countries due to the need for greater efficiency and sustainability. The maintenance of the built infrastructure accounts for nearly 30% of greenhouse gas emissions in Germany. To address this, Digital Twins are emerging as tools to optimize road systems. A Digital Twin of a built asset relies on a geometric-semantic as-is model of the area of interest, where an essential step for automated model generation is the semantic segmentation of reality capture data. While most approaches handle data without considering real-world context, our approach leverages existing geospatial data to enrich the data foundation through an adaptive feature extraction workflow. This workflow is adaptable to various model architectures, from deep learning methods like PointNet++ and PointNeXt to traditional machine learning models such as Random Forest. Our four-step workflow significantly boosts performance, improving overall accuracy by 20% and unweighted mean Intersection over Union (mIoU) by up to 43.47%. The target application is the semantic segmentation of point clouds in road environments. Additionally, the proposed modular workflow can be easily customized to fit diverse data sources and enhance semantic segmentation performance in a model-agnostic way.
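The two reported metrics, overall accuracy and unweighted mean Intersection over Union (mIoU), can both be derived from a per-class confusion matrix. The sketch below shows the standard definitions; the example matrix is made up for illustration and does not come from the study, and skipping classes that never occur is a convention assumed here.

```python
def overall_accuracy(confusion):
    """Fraction of points whose predicted class equals the true class.

    confusion[i][j] = number of points of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def mean_iou(confusion):
    """Unweighted mean of per-class IoU = TP / (TP + FP + FN)."""
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # true class c, predicted otherwise
        fp = sum(confusion[r][c] for r in range(n)) - tp   # predicted c, true class differs
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both labels and predictions
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Hypothetical two-class example (e.g. "road" vs. "vegetation").
example = [
    [8, 2],
    [1, 9],
]
accuracy = overall_accuracy(example)  # 17 correct out of 20 points
miou = mean_iou(example)              # mean of 8/11 and 9/12
```

Because mIoU averages over classes without weighting by class frequency, rare classes influence it as much as frequent ones, which is why it can move much more than overall accuracy.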
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Animal welfare requires the adequate housing of animals to ensure health and well-being. The application of environmental enrichment is a way to improve the well-being of laboratory animals. However, it is important to know whether these enrichment items can be incorporated in experimental mouse husbandry without creating a divide between past and future experimental results. Previous small-scale studies have been inconsistent throughout the literature, and it is not yet completely understood whether and how enrichment might endanger comparability of results of scientific experiments. Here, we measured the effect on means and variability of 164 physiological parameters in 3 conditions: with nesting material with or without a shelter, comparing these 2 conditions to a “barren” regime without any enrichments. We studied a total of 360 mice from each of 2 mouse strains (C57BL/6NTac and DBA/2NCrl) and both sexes for each of the 3 conditions. Our study indicates that enrichment affects the mean values of some of the 164 parameters with no consistent effects on variability. However, the influence of enrichment appears negligible compared to the effects of other influencing factors. Therefore, nesting material and shelters may be used to improve animal welfare without impairment of experimental outcome or loss of comparability to previous data collected under barren housing conditions.
https://www.datainsightsmarket.com/privacy-policy
The size of the Data as a Service market was valued at USD XXX Million in 2023 and is projected to reach USD XXX Million by 2032, with an expected CAGR of 20.00% during the forecast period. Data as a Service (DaaS), in its simplest form, is an on-demand, cloud-based service model for data and analytics. The model helps businesses use the power of data without large upfront investments in data storage, processing, and analysis infrastructure. Delivering data and insights as a service therefore makes data simple to manage, reduces operational costs, and accelerates time-to-value. DaaS suppliers deliver a collection of data services that may include data integration, data cleansing, data enrichment, and data analytics. These services ensure that businesses can access and use hundreds and thousands of internal and external data sources for valuable insight and informed decisions. DaaS primarily helps organizations that lack the internal resources, expertise, or means to gather, handle, and process significant volumes of data. By outsourcing these tasks, businesses can focus on their core competencies. Recent developments include: September 2022: Asigra Inc., an ultra-secure backup and recovery pioneer, declared the general availability of Tigris Data Protection software with Content Disarm & Reconstruction (CDR). The addition of CDR makes Asigra the most security-forward backup and recovery software platform available, adding to its extensive suite of security features. June 2022: IMAT Solutions, a real-time healthcare data management and population health reporting solutions provider, announced the launch of a new Data-as-a-Service (DaaS) offering for health payers. The new DaaS solution meets the new Centers for Medicare & Medicaid Services (CMS) effort to transition all quality measures used in its reporting programs to digital quality measures (dQMs).
Key drivers for this market are: Growing Penetration of Data-based Decisions Among Enterprises, Transformation of Enterprises Leading to Real-time Analytics Demand. Potential restraints include: Concerns Regarding Privacy and Security. Notable trends are: BFSI Sector to Witness High Growth.
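To make the stated 20.00% CAGR concrete, the arithmetic of compound annual growth can be sketched as below. The market values themselves are redacted in the text (USD XXX Million), so only the growth multiple is computed; treating 2023 to 2032 as nine annual compounding periods is an assumption of this sketch.

```python
def growth_multiple(cagr, years):
    """Factor by which a value grows after `years` periods at rate `cagr`."""
    return (1.0 + cagr) ** years


def implied_cagr(start_value, end_value, years):
    """Compound annual growth rate implied by a start and end value."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Assumption: 2023 is the base year and 2032 the final year, i.e. 9 periods.
multiple = growth_multiple(0.20, 9)

# Round trip with a hypothetical base value of 100 (not a figure from the report).
recovered_rate = implied_cagr(100.0, 100.0 * multiple, 9)
```

Under these assumptions, a 20% CAGR over nine periods corresponds to a market roughly five times its starting size.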
Our People data is gathered and aggregated via surveys, digital services, and public data sources. We use powerful profiling algorithms to collect and ingest only fresh and reliable data points.
Our comprehensive data enrichment solution includes a variety of data sets that can help you address gaps in your customer data, gain a deeper understanding of your customers, and power superior client experiences.
1. Geography - City, State, ZIP, County, CBSA, Census Tract, etc.
2. Demographics - Gender, Age Group, Marital Status, Language, etc.
3. Financial - Income Range, Credit Rating Range, Credit Type, Net Worth Range, etc.
4. Persona - Consumer Type, Communication Preferences, Family Type, etc.
5. Interests - Content, Brands, Shopping, Hobbies, Lifestyle, etc.
6. Household - Number of Children, Number of Adults, IP Address, etc.
7. Behaviours - Brand Affinity, App Usage, Web Browsing, etc.
8. Firmographics - Industry, Company, Occupation, Revenue, etc.
9. Retail Purchase - Store, Category, Brand, SKU, Quantity, Price, etc.
10. Auto - Car Make, Model, Type, Year, etc.
11. Housing - Home Type, Home Value, Renter/Owner, Year Built, etc.
People Data Schema & Reach: Our data reach represents the total number of counts available within various categories and comprises attributes such as country location, MAU, DAU & Monthly Location Pings:
Data Export Methodology: Since we collect data dynamically, we provide the most updated data and insights via a best-suited method on a suitable interval (daily/weekly/monthly).
People Data Use Cases:
360-Degree Customer View: Get a comprehensive image of customers by means of internal and external data aggregation.
Data Enrichment: Leverage online-to-offline consumer profiles to build holistic audience segments and improve campaign targeting through user data enrichment.
Fraud Detection: Use multiple digital (web and mobile) identities to verify real users and detect anomalies or fraudulent activity.
Advertising & Marketing: Understand audience demographics, interests, lifestyle, hobbies, and behaviors to build targeted marketing campaigns.
Here's the schema of People Data:
person_id
first_name
last_name
age
gender
linkedin_url
twitter_url
facebook_url
city
state
address
zip
zip4
country
delivery_point_bar_code
carrier_route
walk_seuqence_code
fips_state_code
fips_country_code
country_name
latitude
longtiude
address_type
metropolitan_statistical_area
core_based+statistical_area
census_tract
census_block_group
census_block
primary_address
pre_address
streer
post_address
address_suffix
address_secondline
address_abrev
census_median_home_value
home_market_value
property_build+year
property_with_ac
property_with_pool
property_with_water
property_with_sewer
general_home_value
property_fuel_type
year
month
household_id
Census_median_household_income
household_size
marital_status
length+of_residence
number_of_kids
pre_school_kids
single_parents
working_women_in_house_hold
homeowner
children
adults
generations
net_worth
education_level
occupation
education_history
credit_lines
credit_card_user
newly_issued_credit_card_user
credit_range_new
credit_cards
loan_to_value
mortgage_loan2_amount
mortgage_loan_type
mortgage_loan2_type
mortgage_lender_code
mortgage_loan2_render_code
mortgage_lender
mortgage_loan2_lender
mortgage_loan2_ratetype
mortgage_rate
mortgage_loan2_rate
donor
investor
interest
buyer
hobby
personal_email
work_email
devices
phone
employee_title
employee_department
employee_job_function
skills
recent_job_change
company_id
company_name
company_description
technologies_used
office_address
office_city
office_country
office_state
office_zip5
office_zip4
office_carrier_route
office_latitude
office_longitude
office_cbsa_code
office_census_block_group
office_census_tract
office_county_code
company_phone
company_credit_score
company_csa_code
company_dpbc
company_franchiseflag
company_facebookurl
company_linkedinurl
company_twitterurl
company_website
company_fortune_rank
company_government_type
company_headquarters_branch
company_home_business
company_industry
company_num_pcs_used
company_num_employees
company_firm_individual
company_msa
company_msa_name
company_naics_code
company_naics_description
company_naics_code2
company_naics_description2
company_sic_code2
company_sic_code2_description
company_sic_code4
company_sic_code4_description
company_sic_code6
company_sic_code6_description
company_sic_code8
company_sic_code8_description
company_parent_company
company_parent_company_location
company_public_private
company_subsidiary_company
company_residential_business_code
company_revenue_at_side_code
company_revenue_range
company_revenue
company_sales_volume
company_small_business
company_stock_ticker
company_year_founded
company_minorityowned
company_female_owned_or_operated
company_franchise_code
company_dma
company_dma_name
company_hq_address
company_hq_city
company_hq_duns
company_hq_state
company_hq_zip5
company_hq_zip4
company_sect...
No description is available. Visit https://dataone.org/datasets/010fdb736ce0bdc2a2d1aea5a5973971 for complete metadata about this dataset.
We describe a bibliometric network characterizing co-authorship collaborations in the entire Italian academic community. The network, consisting of 38,220 nodes and 507,050 edges, is built upon two distinct data sources: faculty information provided by the Italian Ministry of University and Research and publications available in Semantic Scholar. Both nodes and edges are associated with a large variety of semantic data, including gender, bibliometric indexes, authors' and publications' research fields, and temporal information. While linking data between the two original sources posed many challenges, the network has been carefully validated to assess its reliability and to understand its graph-theoretic characteristics. By resembling several features of social networks, our dataset can be profitably leveraged in experimental studies in the wide social network analytics domain as well as in more specific bibliometric contexts.
The proposed network is built starting from two distinct data sources:
the entire dataset dump from Semantic Scholar (with particular emphasis on the authors and papers datasets);
the entire list of Italian faculty members as maintained by Cineca (under appointment by the Italian Ministry of University and Research).
By means of a custom name-identity recognition algorithm (details are available in the accompanying paper published in Scientific Data), the names of the authors in the Semantic Scholar dataset have been mapped against the names contained in the Cineca dataset, and authors with no match (e.g., because of not being part of an Italian university) have been discarded. The remaining authors compose the nodes of the network, which have been enriched with node-related (i.e., author-related) attributes. In order to build the network edges, we leveraged the papers dataset from Semantic Scholar: specifically, any two authors are said to be connected if there is at least one pap...
# Data cleaning and enrichment through data integration: networking the Italian academia
https://doi.org/10.5061/dryad.wpzgmsbwj
Manuscript published in Scientific Data with DOI .
This repository contains two main data files:
edge_data_AGG.csv, the full network in comma-separated edge list format (this file contains mainly temporal co-authorship information);
Coauthorship_Network_AGG.graphml, the full network in GraphML format;
along with several supplementary data files, listed below, useful only to build the network (i.e., for reproducibility only):
University-City-match.xlsx, an Excel file that maps the name of a university against the city where its headquarters is located;
Areas-SS-CINECA-match.xlsx, an Excel file that maps the research areas in Cineca against the research areas in Semantic Scholar.
The `Coauthorship_Networ...
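The edge rule described earlier in this entry (two authors are connected if they share at least one paper) can be illustrated with a short sketch. This is not the authors' pipeline: the input format (a list of author-id lists, one per paper) and the function name are assumptions made for illustration, and the real files carry many more attributes.

```python
from collections import Counter
from itertools import combinations


def coauthorship_edges(papers):
    """Build an undirected, weighted co-authorship edge list.

    papers: iterable of author-id lists, one list per paper (hypothetical format).
    Returns a Counter mapping a sorted (author_a, author_b) pair to the
    number of papers the two authors share.
    """
    edges = Counter()
    for authors in papers:
        # set() drops repeated author ids; sorting makes each pair canonical.
        for a, b in combinations(sorted(set(authors)), 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical toy input: three papers, the last one single-authored.
toy_papers = [["a", "b", "c"], ["b", "a"], ["d"]]
toy_edges = coauthorship_edges(toy_papers)
```

Single-author papers contribute no edges, and the edge weight records how many papers a pair has co-authored, which is the kind of information an edge list file could carry alongside temporal attributes.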
https://www.technavio.com/content/privacy-notice
Data Catalog Market Size 2025-2029
The data catalog market size is forecast to increase by USD 5.03 billion, at a CAGR of 29.5%, from 2024 to 2029. Rising demand for self-service analytics will drive the data catalog market.
Major Market Trends & Insights
North America dominated the market and accounted for 39% of the growth during the forecast period.
By Component - Solutions segment was valued at USD 822.80 billion in 2023
By Deployment - Cloud segment accounted for the largest market revenue share in 2023
Market Size & Forecast
Market Opportunities: USD 554.30 million
Market Future Opportunities: USD 5031.50 million
CAGR : 29.5%
North America: Largest market in 2023
Market Summary
The market is a dynamic and evolving landscape, driven by the increasing demand for self-service analytics and the rise of data mesh architecture. Core technologies, such as metadata management and data discovery, play a crucial role in enabling organizations to effectively manage and utilize their data assets. Applications, including data governance and data integration, are also seeing significant growth as businesses seek to optimize their data management processes.
However, maintaining catalog accuracy over time poses a challenge, with concerns surrounding data lineage, data quality, and data security. According to recent estimates, the market is expected to account for over 30% of the overall data management market share by 2025, underscoring its growing importance in the digital transformation era.
What will be the Size of the Data Catalog Market during the forecast period?
How is the Data Catalog Market Segmented and what are the key trends of market segmentation?
The data catalog industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.
Component: Solutions, Services
Deployment: Cloud, On-premises
Type: Technical metadata, Business metadata, Operational metadata
Geography:
North America: US, Canada
Europe: France, Germany, Italy, Russia, UK
APAC: China, India, Japan
Rest of World (ROW)
By Component Insights
The solutions segment is estimated to witness significant growth during the forecast period.
Data catalog solutions have gained significant traction in today's data-driven business landscape, addressing complexities in data discovery, governance, collaboration, and data lifecycle management. These solutions enable users to search and discover relevant datasets for analytical or reporting purposes, thereby reducing the time spent locating data, promoting data reuse, and ensuring the usage of appropriate datasets for specific tasks. Centralized metadata storage is a key feature of data catalog solutions, offering detailed information about datasets, including source, schema, data quality, lineage, and other essential attributes. This metadata-centric approach enhances understanding of data assets, supports data governance initiatives, and provides users with the necessary context for effective data utilization.
Data catalog solutions also facilitate semantic enrichment, data versioning, data security protocols, data access control, and data model design. Semantic enrichment adds meaning and context to data, making it easier to understand and use. Data versioning ensures that different versions of datasets are managed effectively, while data access control restricts access to sensitive data. Data model design helps create an accurate representation of data structures and relationships. Moreover, data catalog solutions offer data discovery tools, data lineage tracking, data governance policies, schema management, data lake management, ETL process optimization, and data quality monitoring. Data discovery tools help users locate relevant data quickly and efficiently.
Data lineage tracking enables users to trace the origin and movement of data throughout its lifecycle. Data governance policies ensure compliance with regulatory requirements and organizational standards. Schema management maintains the structure and consistency of data, while data lake management simplifies the management of large volumes of data. ETL process optimization improves the efficiency of data integration, and data quality monitoring ensures that data is accurate and reliable. Businesses across various sectors, including healthcare, finance, retail, and manufacturing, are increasingly adopting data catalog solutions to streamline their data management and analytics processes. According to recent studies, the adoption of data catalog solutions has grown by approximately 25%, with an estimated 30% of organizations planning to implement t
https://spdx.org/licenses/CC0-1.0.html
Ecosystems are connected by flows of nutrients and organisms. Changes to connectivity and nutrient enrichment may destabilise ecosystem dynamics far from the nutrient source. We used gradostats to examine the effects of trophic connectivity (movement of consumers and producers) versus nutrient-only connectivity on the dynamics of Daphnia pulex (consumers) and algae (resources) in two metaecosystem configurations (linear vs. dendritic). We found that Daphnia peak population size and instability (coefficient of variation; CV) increased as distance from the nutrient input increased, but these effects were lower in metaecosystems connected by all trophic levels compared to nutrient-only connected systems and/or in dendritic compared to linear systems. We examined the effects of trophic connectivity (i.e. both trophic levels move rather than one or the other) using a generic model to qualitatively assess whether the expectations align with the ecosystem dynamics we observed. Analysis of our model shows that increased Daphnia population sizes and fluctuations in consumer-resource dynamics are expected with nutrient connectivity, with this pattern being more pronounced in linear rather than dendritic systems. These results confirm that connectivity may propagate and even amplify instability over a metaecosystem to communities distant from the source disturbance, and suggest a direction for future experiments, that recreate conditions closer to those found in natural systems.
Methods
Our gradostat flasks contained simple communities of the water flea Daphnia pulex consuming a mix of three algal species (Pseudokirchneriella subcapitata, Scenedesmus quadricauda, Ankistrodesmus falcatus). This experiment employed a 2 × 2 × 2 factorial design to test the importance of ecosystem trophic connectivity (a treatment considering movement of medium only vs. movement of medium, phytoplankton and Daphnia between flasks) and metaecosystem configuration (linear or dendritic) on the stability of Daphnia populations and algal communities with two levels of enriched medium input (regular and phosphorus-enriched). Four replicates of this whole design were established, for a total of 32 metaecosystems, run in 9 blocks due to time and space constraints. Each metaecosystem consisted of four “nodes” of 500 mL Erlenmeyer flasks with a foam stopper to allow for gas exchange (128 flasks total), seeded initially with 100 mL algal mix (total average algal density of 2.22 × 10^6 ± 1.3 × 10^4 cells/mL), to which 50 adult Daphnia with eggs (which produce broods of about 15 individuals each week in good conditions; Schwartz 1984) were added before topping off the flask to 500 mL with FLAMES media (Celis-Salgado et al. 2008). Configuration was controlled by unidirectionally connecting flasks in either a linear configuration (in →1→2→3→4→out) or a dendritic configuration (in →1, in→2, 1→3, 2→3, 3→4→out). We chose this as the simplest possible design in which a linear network could be compared to a branched network, with four nodes being the smallest possible number of nodes to create a dendritic configuration, and the two nodes branching into a third, similar to headwaters in a river. Flasks were then connected by Tygon tubing to an inflow reservoir of FLAMES medium (10 μg P/L) or P-enriched medium (70 μg P/L), which was pumped through the array of flasks using peristaltic pumps (Watson-Marlow 503S/RL and Rainin Dynamix RP-1). 
Pumps were set on automatic timers to run for one hour each day at a speed adjusted to move a specific volume of media over that hour. The dilution rate was 10% of the total volume per day (50 mL) for all flasks in the linear configurations and for the “hub” (3) and “terminal” (4) nodes of the dendritic configurations, and 5% per day (25 mL) for the “upstream” nodes in the dendritic configurations (Figure 1). We also controlled functional connectivity, contrasting metaecosystem dynamics when only nutrients moved versus the case when nutrients, resources and consumers moved. To block the flow of organisms in the nutrient-only connectivity treatment, outflow tubing was placed inside an 80-µm nylon mesh held in place with the stopper. Due to colony formation of the phytoplankton and clogging of the mesh, this proved to be an effective retention mechanism for the algal resources as well; thus, we believe the flow of algae was significantly reduced in these treatments compared to the trophic connectivity treatments. Though it is possible that a small portion of single cells was able to pass through, Scenedesmus is known to form four-cell colonies in the presence of consumers (which we also observed in our algal counts), and these are too large to pass through the mesh. As D. pulex were unable to fit through the tubing or survive moving through the peristaltic pumps, in the trophic connectivity treatment D. pulex were manually moved using a 2 mL transfer pipette at a rate of 10% of the population per day (20% were moved after each sampling count, as sampling was only done every two days) in all linear nodes and the hub and terminal dendritic nodes, and 5% per day (10% moved after sampling) in the upstream dendritic nodes, in the same downstream direction as media. This type of passive movement at the flow rate of the system would be typical of planktonic animals in rivers that cannot swim upstream. Inflow stock solutions were prepared using FLAMES media (10 μg P/L). 
Finally, we modified our inflow reservoirs to contain either additionally P-enriched (high P) or regular (low P) FLAMES media. To increase P in the additionally nutrient-enriched treatment without changing pH, 132 μg/L of H2KPO4 and 168 μg/L H2KPO4 were added to our increased P treatment inflow stock solution. For the low-P treatment, no additional phosphorus was added, but 218 μg/L KCl was added to control for the K added to the high-P medium. See Figure S1 for a photograph of the experimental setup.
Experimental Sampling
The gradostats were sampled every other day for 30 days. In each node, the concentration of each algal species was measured using a haemocytometer. To estimate Daphnia population size, a 2 mL plastic transfer pipette was used to gently agitate and then sample each node. The number of individuals in each of two age classes (adult or juvenile) in the pipette was determined, and the animals were then returned to the experimental flask. This process was repeated five times, and the average D. pulex count of the five samples was used to estimate Daphnia density per 2 mL (total number estimated per flask = sampled count average × 250). A pilot experiment testing this method showed an average error of 17.41%, equating to 2.5 Daphnia more or less than the expected count at known densities; there is no reason to believe this error was systematic in one direction or the other, or systematically biased among our treatments. On Day 30 of the experiment, 40 mL samples were taken from each flask to be analysed for total phosphorus concentration (TP). Phosphorus samples were analysed using a standard protocol (Wetzel and Likens 2013) at the GRIL-Université du Québec à Montréal analytical laboratory. 
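The scaling from pipette counts to a whole-flask estimate can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the example counts are hypothetical, and only the 500 mL flask volume, the 2 mL pipette volume and the resulting factor of 250 come from the description above.

```python
from statistics import mean

FLASK_VOLUME_ML = 500    # flask volume stated in the methods
PIPETTE_VOLUME_ML = 2    # transfer pipette volume stated in the methods


def estimate_flask_population(pipette_counts):
    """Scale the mean count per 2 mL subsample up to the whole flask.

    `pipette_counts` holds the Daphnia counts from the repeated pipette
    samples (five per node in the described protocol).
    """
    scale = FLASK_VOLUME_ML / PIPETTE_VOLUME_ML  # 500 / 2 = 250
    return mean(pipette_counts) * scale


# Hypothetical counts from five pipette samples of one node.
example_counts = [1, 0, 2, 1, 1]
example_estimate = estimate_flask_population(example_counts)
```

Because every individual counted in a subsample stands for 250 individuals in the flask, small counting differences translate into large differences in the estimate, which is consistent with the pilot error reported above.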
Statistical Analysis
To quantify the instability of Daphnia populations in experimental gradostats, we determined the peak total Daphnia population size (as estimated by our density samples) and the coefficient of variation (CV) of Daphnia population size over the course of the experiment. These variables were calculated for each node within each gradostat, as well as in aggregate, summed across all nodes, for additive Daphnia metapopulation peak and CV. Similarly, population CV and peak density were calculated for each species of alga, but we analyse here values based on total algal community density (sum of all species present), as Pseudokirchneriella and Ankistrodesmus were undetectable in most flasks for most of the experiment. Scenedesmus was mostly observed in 4-cell colonies, which is common in the presence of consumers, but we counted the total number of cells, not colonies. All analyses of experimental gradostat data were conducted in R version 4 (Team 2020). Hypothesis tests were two-sided with a significance level of α = 0.05. To determine whether metaecosystem connectivity, configuration and nutrient enrichment, as well as node position (1 upstream to 4 terminal), influenced node Daphnia population instability downstream of the nutrient enrichment source, we analysed the effects of these factors on mean Daphnia population and algal community peak values, on mean log-transformed (natural logarithm) Daphnia population and algal community CV values, and on mean final TP concentration values, using linear mixed-effects models with the four factors as fixed effects. The mixed model included a random effect for ‘system’, which allowed us to account for possible clustering in the response variables, since the four nodes were connected as metaecosystems. For each of these models, pairwise interactions between factors were tested, and terms for non-significant interactions were removed from the final models we report. 
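The two instability metrics described above (peak size and CV, per node and summed across nodes) can be sketched as below. The original analysis was run in R; this Python version is only an illustration. The function names and example numbers are hypothetical, and the use of the sample standard deviation for the CV is an assumption, since the description does not say which estimator was used.

```python
import math
from statistics import mean, stdev


def peak_and_cv(series):
    """Return (peak, CV) for one time series of population estimates.

    CV is computed as sample standard deviation / mean (assumed estimator).
    """
    return max(series), stdev(series) / mean(series)


def metapopulation_series(node_series):
    """Sum the per-node series time point by time point (all nodes combined)."""
    return [sum(values) for values in zip(*node_series)]


# Hypothetical estimates for the four nodes of one metaecosystem, three dates.
nodes = [
    [100, 200, 300],
    [150, 150, 150],
    [50, 250, 100],
    [200, 100, 50],
]

node_peak, node_cv = peak_and_cv(nodes[0])
log_cv = math.log(node_cv)  # natural-log transform applied to CV before modelling

meta_peak, meta_cv = peak_and_cv(metapopulation_series(nodes))
```

Note that the metapopulation CV is computed from the summed series, so it is not the average of the node-level CVs: asynchronous fluctuations among nodes partly cancel in the sum.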
Assumptions on the model errors (randomness, normality, and homoscedasticity) and the presence of possible influential observations or outliers were assessed with diagnostic plots of the model residuals. Robust standard errors (Huang and Li 2022) were used to adjust for heteroscedasticity. We also measured Daphnia metapopulation and algal metacommunity instability at the scale of the entire metaecosystem. To determine whether metaecosystem connectivity, configuration and nutrient enrichment influenced Daphnia metapopulation and algal metacommunity instability, we analysed the effects of these factors on mean Daphnia metapopulation and algal metacommunity peak values, and on mean CV values, using linear mixed-effects models with the three factors as fixed effects, using the block in which a metaecosystem was run
Attribution 3.0 (CC BY 3.0)https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
This dataset is about: (Table S1) Enrichment factors of all element/Al ratios compared to average shale of IODP Exp302 holes. Please consult parent dataset @ https://doi.org/10.1594/PANGAEA.831371 for more information. DEPTH, sediment/rock [m] is given in mcd.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was supported by the Key Research and Development Program of China (Grant No. 2016YFC0500902) and the National Natural Science Foundation of China (Grant No. 41601009). It includes two parts: the first contains the SOC contents of aeolian sediment and parent soils in '.csv' format; the second is raster data, which includes the SOC enrichment rate, the average annual dust emission rate, and the average annual SOC loss induced by dust emission.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Aspectual verbs (e.g. begin) and intensional verbs (e.g. want) can both take entity-denoting NPs as a complement (begin/want the book) and acquire an implicit meaning (e.g. reading). Linguistic theory posits that such enriched implicit meanings can be acquired either by semantic enrichment with aspectual verbs or by syntactic enrichment with intensional verbs. To investigate whether semantic and syntactic enrichment share enrichment operations, we conducted a structural priming study. Experiment 1 repeated the verb on prime and target trials and found evidence for enrichment priming for both verb types. Experiment 2 crossed the verb type and found no evidence for priming. These results suggest that enrichment operations are distinct for aspectual and intensional verbs. However, Experiment 3 repeated Experiment 1 without lexical boost and found no enrichment priming within the verb type. Thus, producing an enriched structure may not robustly activate enrichment structures, leaving open questions concerning shared mechanisms.
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Many ecosystems are now co-invaded by alien plant and herbivore species. The evolutionary naivety of native plants to alien herbivores can make the plants more susceptible to detrimental effects of herbivory than co-occurring invasive plants, in accordance with the apparent competition hypothesis. Moreover, the invasional meltdown hypothesis predicts that in multiply invaded ecosystems, invasive species can facilitate each other's impacts on native communities. Although there is growing empirical support for these hypotheses, facilitative interactions between invasive plants and herbivores remain underexplored in aquatic ecosystems. Many freshwater ecosystems are co-invaded by aquatic macrophytes and mollusks and simultaneously experience nutrient enrichment. However, the interactive effects of these ecological processes on native macrophyte communities remain an underexplored area. To test these effects, we performed a freshwater mesocosm experiment in which we grew a synthetic native community of three macrophyte species under two levels of invasion by an alien macrophyte Myriophyllum aquaticum (invasion vs. no-invasion) and fully crossed with two levels of nutrient enrichment (enrichment vs. no-enrichment) and herbivory by an invasive snail Pomacea canaliculata (herbivory vs. no-herbivory). In line with the invasional meltdown and apparent competition hypotheses, we found that the proportional above-ground biomass yield of the invasive macrophyte, relative to that of the native macrophyte community, was significantly greater in the presence of the invasive herbivore. Evidence of a reciprocal facilitative effect of the invasive macrophyte on the invasive herbivore is provided by the results showing that the herbivore produced greater egg biomass in the presence than in the absence of M. aquaticum. However, nutrient enrichment reduced the mean proportional above-ground biomass yield of the invasive macrophyte. Our results suggested that herbivory by invader P. 
canaliculata may enhance invasiveness of M. aquaticum. However, nutrient enrichment of habitats that already harbor M. aquaticum may slow down invasive spread of the macrophyte. Broadly, our study underscores the significance of considering several factors and their interactions when assessing the impact of invasive species, especially considering that many habitats experience co-invasion by plants and herbivores and simultaneously undergo various other disturbances, including nutrient enrichment.
https://spdx.org/licenses/etalab-2.0.html
A P2M2 (IGEPP's Metabolic Profiling and Metabolomic Platform) workflow was built based on GC-MS raw data containing multiple ions from each TMS-derivative of interest to: i) correct for natural abundance, ii) calculate carbon isotopologue distributions and mean 13C enrichment, iii) calculate some positional 13C-enrichments. This workflow is specifically dedicated to the analysis of multiple mass fragments from the same metabolite (TMS-derivative) in order to calculate positional 13C enrichments. The corrective method is publicly available and can be run directly from GC-MS source files using the following workflow:
Conversion GCMS PostRun Analysis to IsoCor
Isotopic studies > Isotope Correction for mass spectrometry labeling experiments
Calculations of positional 13C-enrichments
The sources are available on the GitHub repository of the P2M2 platform. The workflow is composed of three items: "Conversion GCMS postrun analysis to IsoCor", "Isotope Correction for mass spectrometry labeling experiments" and "Calculations of positional 13C-enrichments". The first item (gcms2isocor) converts GCMS raw data into a table suitable for IsoCor. For this purpose, the input files must contain a column "Name" filled with each carbon isotopologue of each fragment considered and a column "Area" filled with the area of the integrated peak. The name of each mass fragment analysed in the method file must be written exactly as depicted in the GCMS_example_file. This ensures that the combination of the chemical formula of the metabolite backbone and the chemical formula of the TMS backbone is correct and clearly identified by IsoCor. Additionally, it prevents IsoCor from producing similar "metabolites" arising from different m/z. The identification of chemical structures for each fragment reported in the dataset has been manually curated with respect to known fragmentation patterns and 13C positional studies (see the reference publication). 
As an example, the names "ProlineC2C5m142_TMS_m0" and "ProlineC2C5m216_2TMSMINUSCH3_m0" both refer to GC-MS fragments containing the C2-C3-C4-C5 carbon skeleton of the proline_2TMS derivative, but they differ in the m/z monitored (142 or 216) and, consequently, in the derivative backbone (TMS or 2TMSMINUSCH3). If this first step of the workflow is bypassed, the users are solely responsible for the scientific credibility of the results generated. The second item corrects for the abundance of naturally occurring isotopes using the IsoCor software. Please run IsoCor with the three files “Isotopes.dat”, “Metabolites.dat” and “Derivatives.dat” from the dataset. The third item calculates all positional 13C-enrichments that can be derived from the fragments contained in the original data. It is possible to edit the calculation rules if some fragments are preferred; otherwise, all combinations of calculations will be performed. The results can be inspected visually and downloaded in .txt format.
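To make the naming convention and the mean 13C enrichment concrete, here is a small sketch. It is not part of the P2M2 workflow and does not use the IsoCor API. The three-part split of the name is inferred only from the two example names above, the helper names are hypothetical, and the areas are hypothetical values assumed to be already corrected for natural abundance. The mean enrichment uses the usual definition, sum of i × fraction(m_i) divided by the number of carbons in the fragment.

```python
def parse_fragment_name(name):
    """Split a name such as 'ProlineC2C5m142_TMS_m0' into its three parts.

    Assumes the pattern <fragment>_<derivative>_m<k>, as in the two examples.
    """
    fragment, derivative, isotopologue = name.split("_")
    if not isotopologue.startswith("m"):
        raise ValueError(f"unexpected isotopologue label: {isotopologue}")
    return fragment, derivative, int(isotopologue[1:])


def mean_13c_enrichment(corrected_areas):
    """Mean 13C enrichment of a fragment with n carbons.

    `corrected_areas` lists the corrected areas of m0..mn, in that order.
    """
    n_carbons = len(corrected_areas) - 1
    total = sum(corrected_areas)
    fractions = [area / total for area in corrected_areas]
    return sum(i * f for i, f in enumerate(fractions)) / n_carbons


# Hypothetical corrected areas (m0..m4) for a four-carbon fragment (C2-C5).
areas = [0.5, 0.25, 0.25, 0.0, 0.0]
enrichment = mean_13c_enrichment(areas)  # (1*0.25 + 2*0.25) / 4

parsed = parse_fragment_name("ProlineC2C5m216_2TMSMINUSCH3_m0")
```

Keeping the m/z inside the fragment part of the name is what lets two fragments with the same carbon skeleton, such as the m142 and m216 proline fragments, stay distinct downstream.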
Biomarker Image Cytometry. The cell-level frequency of NK2 Homeobox 1 (NKX2-1), Keratin 7 (KRT7), and Thyroglobulin (TG) protein staining were quantitatively evaluated by high-content imaging across huThyrEC cell line variants (1-4) in two medium formulations (huThyrEC and h7H) for verification of thyroid follicular epithelial cell enrichment. Data are the % positive expression frequency (mean ± SD) of two replicates.
This dataset is associated with the following publication: Hopperstad, K., T. Truschel, T. Wahlicht, W. Stewart, A. Eicher, T. May, and C. Deisenroth. Characterization of Novel Human Immortalized Thyroid Follicular Epithelial Cell Lines. Applied In Vitro Toxicology. Mary Ann Liebert, Inc., Larchmont, NY, USA, 7(2): 39-49, (2021).
Sampling enrichment toward a target state, an analogue of the improvement of sampling efficiency (SE), is critical in both the refinement of protein structures and the generation of near-native structure ensembles for the exploration of structure-function relationships. We developed a hybrid molecular dynamics (MD)-Monte Carlo (MC) approach to enrich the sampling toward the target structures. In this approach, the higher SE is achieved by perturbing the conventional MD simulations with an MC structure-acceptance judgment, which is based on the degree of agreement between the small-angle X-ray scattering (SAXS) intensity profiles of the simulation structures and the target structure. We found that the hybrid simulations could significantly improve SE by making the top-ranked models much closer to the target structures in both secondary and tertiary structure. Specifically, for the 20 mono-residue peptides, when the initial structures had a root-mean-squared deviation (RMSD) from the target structure smaller than 7 Å, the hybrid MD-MC simulations yielded models that were, on average, 0.83 Å and 1.73 Å closer in RMSD to the target than those from the parallel MD simulations at 310 K and 370 K, respectively. Meanwhile, the average SE values also increased by 13.2% and 15.7%. The enrichment of sampling becomes more significant when the target states are gradually detectable in the MD-MC simulations in comparison with the parallel MD simulations, and provides a >200% improvement in SE. We also tested the hybrid MD-MC approach on real protein systems; the results showed that the SE improved for 3 out of 5 real proteins. Overall, this work presents an efficient way of utilizing solution SAXS to improve protein structure prediction and refinement, as well as the generation of near-native structures for function annotation.
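The idea of an MC acceptance judgment driven by SAXS profile agreement can be sketched as below. The abstract does not state the scoring function or the acceptance rule, so this is only one plausible form, labeled as an assumption: a mean squared difference between intensity profiles as the score, and a Metropolis-style criterion for acceptance. All names and parameters are hypothetical.

```python
import math
import random


def saxs_discrepancy(profile, target):
    """Mean squared difference between two intensity profiles.

    Both profiles are assumed to be sampled on the same q grid.
    """
    if len(profile) != len(target):
        raise ValueError("profiles must share the same q grid")
    return sum((a - b) ** 2 for a, b in zip(profile, target)) / len(target)


def metropolis_accept(old_score, new_score, temperature, rng):
    """Assumed Metropolis-style rule on the SAXS discrepancy score.

    A structure that matches the target profile at least as well is always
    accepted; a worse one is accepted with probability
    exp(-(new_score - old_score) / temperature).
    """
    if new_score <= old_score:
        return True
    return rng.random() < math.exp(-(new_score - old_score) / temperature)


# Hypothetical intensity profiles on a shared three-point q grid.
target = [1.00, 0.60, 0.20]
current = [0.90, 0.70, 0.30]
candidate = [0.98, 0.62, 0.21]

rng = random.Random(0)  # seeded for reproducibility
accepted = metropolis_accept(
    saxs_discrepancy(current, target),
    saxs_discrepancy(candidate, target),
    temperature=0.01,
    rng=rng,
)
```

In a hybrid scheme of this kind, each short MD segment would propose a new structure, and rejected proposals would restart from the previously accepted structure, which biases sampling toward the target profile.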