Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset presents data collected during research for my Ph.D. dissertation in Industrial and Systems Engineering at the University of Rhode Island in 2017-2018.
Research Purpose
Cost and schedule overruns have become increasingly common in large defense programs that attempt to build systems with improved performance and lifecycle characteristics, often using novel, untested, and complex product architectures. Based on the well-documented relationship between product architecture and the structure of the product development organization, the research examined the effectiveness of different organizational networks at designing complex engineered systems, comparing the performance of real-world organizations to ideal ones.
Method and Research Questions
Phase 1 examined information exchange models and implemented the model of information exchange proposed by Dodds, Watts, and Sabel to confirm that the model can be implemented successfully using agent-based models (ABM). Phase 2 examined artifact models and extended the information exchange model to include the processing of artifacts. Phase 3 examined smart team models, and Phase 4 applied the information exchange and artifact models to a real-world organization. A minimal illustrative sketch of this style of message-passing ABM follows the research questions below. Research questions:
1) How do random, multi-scale, military staff and matrix organizational networks perform in the information exchange and artifact task environments and how does increasing the degree of complexity affect performance?
2) How do military staff and matrix organizational networks (real organizations) perform compared to one another and to random and multi-scale networks (ideal organizations)? How does increasing the degree of complexity affect performance, and which structure is preferred for organizations that design complex engineered systems?
3) How can organizational networks be modified to improve performance?
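The dissertation's experiments were run in MATLAB and NetLogo, and the exact Dodds-Watts-Sabel model is not reproduced here. The following is only a minimal, self-contained sketch of the general style of simulation described above: messages are passed between random node pairs on a random network, intermediate nodes accumulate load, and a node that reaches an assumed capacity threshold causes congestion failure. Network size, edge probability, message count, and capacity are all illustrative assumptions.

```python
import random
from collections import deque, defaultdict

def random_network(n, p, seed=0):
    """Build an Erdos-Renyi style random graph as an adjacency dict."""
    rng = random.Random(seed)
    adj = defaultdict(set)
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def shortest_path(adj, src, dst):
    """Breadth-first search; returns the node sequence from src to dst, or None."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = prev[u]
            return path[::-1]
        for v in adj[u]:
            if v not in prev:
                prev[v] = u
                queue.append(v)
    return None

def simulate(adj, n_messages=500, capacity=40, seed=1):
    """Route random source->target messages and report the fraction delivered.

    Every intermediate node on a delivered message's path accrues one unit of
    load; once a node reaches `capacity` it is congested and later messages
    that would pass through it are dropped (a crude congestion-failure proxy).
    """
    rng = random.Random(seed)
    nodes = list(adj)
    load = defaultdict(int)
    delivered = 0
    for _ in range(n_messages):
        src, dst = rng.sample(nodes, 2)
        path = shortest_path(adj, src, dst)
        if path is None:
            continue
        if any(load[u] >= capacity for u in path[1:-1]):
            continue  # an intermediate node is saturated
        for u in path[1:-1]:
            load[u] += 1
        delivered += 1
    return delivered / n_messages

if __name__ == "__main__":
    net = random_network(n=100, p=0.05)
    print("fraction of messages delivered:", simulate(net))
```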
Data Interpretation
Excel spreadsheets summarize and analyze data collected from MATLAB and NetLogo ABM experiments for each phase. In general, raw data were collected in a 'data' worksheet, and additional worksheets and graphs were then created to analyze the data. The dataset includes a link to the associated dissertation, which provides further detail.
Notable Findings
1) All organizational networks perform well in the information exchange environment and in the artifact environment when complexity is low to moderate.
2) Military staff networks consistently outperform matrix networks.
3) At high complexity, all networks are susceptible to congestion failure.
4) Military staff organizational networks exhibit performance comparable to multi-scale networks over a range of situations.
https://dataintelo.com/privacy-and-policy
The global big data technology market size was valued at approximately $162 billion in 2023 and is projected to reach around $471 billion by 2032, growing at a Compound Annual Growth Rate (CAGR) of 12.6% during the forecast period. The growth of this market is primarily driven by the increasing demand for data analytics and insights to enhance business operations, coupled with advancements in AI and machine learning technologies.
One of the principal growth factors of the big data technology market is the rapid digital transformation across various industries. Businesses are increasingly recognizing the value of data-driven decision-making processes, leading to the widespread adoption of big data analytics. Additionally, the proliferation of smart devices and the Internet of Things (IoT) has led to an exponential increase in data generation, necessitating robust big data solutions to analyze and extract meaningful insights. Organizations are leveraging big data to streamline operations, improve customer engagement, and gain a competitive edge.
Another significant growth driver is the advent of advanced technologies like artificial intelligence (AI) and machine learning (ML). These technologies are being integrated into big data platforms to enhance predictive analytics and real-time decision-making capabilities. AI and ML algorithms excel at identifying patterns within large datasets, which can be invaluable for predictive maintenance in manufacturing, fraud detection in banking, and personalized marketing in retail. The combination of big data with AI and ML is enabling organizations to unlock new revenue streams, optimize resource utilization, and improve operational efficiency.
Moreover, regulatory requirements and data privacy concerns are pushing organizations to adopt big data technologies. Governments worldwide are implementing stringent data protection regulations, like the General Data Protection Regulation (GDPR) in Europe and the California Consumer Privacy Act (CCPA) in the United States. These regulations necessitate robust data management and analytics solutions to ensure compliance and avoid hefty fines. As a result, organizations are investing heavily in big data platforms that offer secure and compliant data handling capabilities.
As organizations continue to navigate the complexities of data management, the role of Big Data Professional Services becomes increasingly critical. These services offer specialized expertise in implementing and managing big data solutions, ensuring that businesses can effectively harness the power of their data. Professional services encompass a range of offerings, including consulting, system integration, and managed services, tailored to meet the unique needs of each organization. By leveraging the knowledge and experience of big data professionals, companies can optimize their data strategies, streamline operations, and achieve their business objectives more efficiently. The demand for these services is driven by the growing complexity of big data ecosystems and the need for seamless integration with existing IT infrastructure.
Regionally, North America holds a dominant position in the big data technology market, primarily due to the early adoption of advanced technologies and the presence of key market players. The Asia Pacific region is expected to witness the highest growth rate during the forecast period, driven by increasing digitalization, the rapid growth of industries such as e-commerce and telecommunications, and supportive government initiatives aimed at fostering technological innovation.
The big data technology market is segmented into software, hardware, and services. The software segment encompasses data management software, analytics software, and data visualization tools, among others. This segment is expected to witness substantial growth due to the increasing demand for data analytics solutions that can handle vast amounts of data. Advanced analytics software, in particular, is gaining traction as organizations seek to gain deeper insights and make data-driven decisions. Companies are increasingly adopting sophisticated data visualization tools to present complex data in an easily understandable format, thereby enhancing decision-making processes.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This file contains the raw data for all Figures in the manuscript entitled "Adiponectin-expressing Treg facilitate T lymphocyte development in thymic nurse cell complexes" to be published by the journal Communications Biology (COMMSBIO-20-1888B). Data has been organized in different worksheets according to the order of Figures.
https://www.datainsightsmarket.com/privacy-policy
The semantic knowledge graph market is experiencing robust growth, driven by the increasing need for organizations to derive actionable insights from complex, unstructured data. The market, estimated at $5 billion in 2025, is projected to exhibit a Compound Annual Growth Rate (CAGR) of 25% from 2025 to 2033, reaching approximately $25 billion by 2033. This expansion is fueled by several key factors. Firstly, the proliferation of big data necessitates efficient data management and knowledge extraction tools; semantic knowledge graphs excel in this arena by organizing information into easily understandable and interlinked structures. Secondly, advancements in artificial intelligence (AI) and machine learning (ML) are enhancing the capabilities of semantic knowledge graphs, improving their ability to process and analyze ever-increasing volumes of data. Thirdly, the growing adoption of cloud-based solutions is simplifying deployment and accessibility, further driving market growth. Key players like Microsoft, Google, and Yandex are heavily investing in this technology, creating a competitive yet innovative landscape.

However, challenges remain, including the complexity of implementing these systems, high initial investment costs, and the need for skilled professionals to manage and interpret the resulting knowledge graphs. Despite these restraints, the long-term prospects for the semantic knowledge graph market are incredibly positive. The increasing demand for improved data governance, enhanced business intelligence, and personalized customer experiences will continue to fuel adoption across various sectors, including finance, healthcare, and manufacturing.

The market segmentation is expected to evolve, with increasing specialization in specific industry verticals and the development of more sophisticated analytics tools built on top of semantic knowledge graph technologies. The focus will likely shift towards the integration of semantic knowledge graphs with other emerging technologies such as blockchain and the Internet of Things (IoT) to unlock even greater value from data. This convergence will lead to the emergence of smarter and more autonomous systems capable of decision-making based on comprehensive, contextualized knowledge. Regions like North America and Europe are anticipated to maintain significant market shares, though Asia-Pacific is projected to witness substantial growth driven by increasing digitalization and technological advancements.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Spreadsheets targeted at the analysis of GHS safety fingerprints.

Abstract: Over a 20-year period, the UN developed the Globally Harmonized System (GHS) to address international variation in chemical safety information standards. By 2014, the GHS had become widely accepted internationally and is now the cornerstone of OSHA's Hazard Communication Standard. Despite this progress, today we observe inconsistent results when different sources apply the GHS to specific chemicals, in terms of the GHS pictograms, hazard statements, precautionary statements, and signal words assigned to those chemicals. To assess the magnitude of this problem, this research extends the "chemical fingerprints" used in 2D chemical structure similarity analysis to GHS classifications. By generating a chemical safety fingerprint, the consistency of the GHS information for specific chemicals can be assessed. The problem is that sources of GHS information can differ. For example, the SDS for sodium hydroxide pellets found on Fisher Scientific's website displays two pictograms, while the GHS information for sodium hydroxide pellets on Sigma Aldrich's website has only one pictogram. A chemical information tool that identifies such discrepancies within a specific chemical inventory can assist in maintaining the quality of the safety information needed to support safe work in the laboratory. The tools for this analysis will be scaled to the size of a moderately large research lab or a small chemistry department as a whole (between 1000 and 3000 chemical entities) so that labelling expectations within these universes can be established as consistently as possible.

Most chemists are familiar with spreadsheet programs such as Excel and Google Sheets, which many chemists use daily. Through a monadal programming approach with these tools, the analysis of GHS information can be made accessible to non-programmers. This monadal approach employs single spreadsheet functions to analyze the data collected rather than long programs, which can be difficult to debug and maintain. Another advantage of this approach is that the single monadal functions can be mixed and matched to meet new goals as information needs about the chemical inventory evolve over time. These monadal functions are used to convert GHS information into binary strings of data called "bitstrings"; the same approach is used when comparing chemical structures. The binary approach makes data analysis more manageable, as GHS information comes in a variety of formats, such as pictures or alphanumeric strings, which are difficult to compare directly. Bitstrings generated from the GHS information can be compared using an operator such as the Tanimoto coefficient, yielding values from 0 for strings that have no similarity to 1 for strings that are identical. Once a particular set of information has been analyzed, the hope is that the same techniques can be extended to more information. For example, if GHS hazard statements are analyzed through a spreadsheet approach, the same techniques with minor modifications could be used to tackle more GHS information such as pictograms.

Intellectual Merit: This research indicates that the cheminformatic technique of structural fingerprints can be used to create safety fingerprints. Structural fingerprints are binary bit strings obtained from the non-numeric entity of 2D structure. A structural fingerprint allows comparison of 2D structures through the use of the Tanimoto coefficient. The same idea can be extended to safety fingerprints, which are created by converting a non-numeric entity such as GHS information into a binary bit string and comparing the data with the Tanimoto coefficient.

Broader Impact: Extensions of this research can be applied to many aspects of GHS information. This research focused on comparing GHS hazard statements, but could be further applied to other pieces of GHS information such as pictograms and GHS precautionary statements. Another facet of this research is allowing the chemist who uses the data to compare large datasets using spreadsheet programs such as Excel without needing an extensive programming background. Development of this technique will also benefit the Chemical Health and Safety and Chemical Information communities by better defining the quality of GHS information available and providing a scalable and transferable tool to manipulate this information to meet a variety of other organizational needs.
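The bitstring comparison described above reduces to a single set-overlap calculation. Below is a minimal sketch in Python rather than the spreadsheet formulas the authors describe; the two hazard-statement bitstrings are hypothetical examples, not values from the dataset.

```python
def tanimoto(a: str, b: str) -> float:
    """Tanimoto coefficient of two equal-length bitstrings.

    Counts positions where both strings have a 1 (intersection) and positions
    where either string has a 1 (union); returns intersection / union.
    """
    if len(a) != len(b):
        raise ValueError("bitstrings must be the same length")
    both = sum(x == "1" and y == "1" for x, y in zip(a, b))
    either = sum(x == "1" or y == "1" for x, y in zip(a, b))
    return both / either if either else 1.0  # two all-zero strings count as identical

# Hypothetical GHS hazard-statement fingerprints for one chemical as reported
# by two suppliers (each bit flags the presence of one hazard statement).
supplier_a = "1101000010"
supplier_b = "1001000011"

print(f"Tanimoto similarity: {tanimoto(supplier_a, supplier_b):.2f}")  # 0.60
```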
The Ontario government generates and maintains thousands of datasets. Since 2012, we have shared data with Ontarians via a data catalogue. Open data is data that is shared with the public. Ontario's Open Data Directive states that all data must be open, unless there is good reason for it to remain confidential. Ontario's Chief Digital and Data Officer also has the authority to make certain datasets available publicly. Datasets listed in the catalogue that are not open will have one of the following labels:

If you want to use data you find in the catalogue, that data must have a licence – a set of rules that describes how you can use it. A licence: Most of the data available in the catalogue is released under Ontario's Open Government Licence. However, each dataset may be shared with the public under other kinds of licences or no licence at all. If a dataset doesn't have a licence, you don't have the right to use the data. If you have questions about how you can use a specific dataset, please contact us.

The Ontario Data Catalogue endeavors to publish open data in a machine-readable format. For machine-readable datasets, you can simply retrieve the file you need using the file URL. The Ontario Data Catalogue is built on CKAN, which means the catalogue has the following features you can use when building applications. APIs (application programming interfaces) let software applications communicate directly with each other. If you are using the catalogue in a software application, you might want to extract data from the catalogue through the catalogue API. Note: All Datastore API requests to the Ontario Data Catalogue must be made server-side. The catalogue's collection of dataset metadata (and dataset files) is searchable through the CKAN API. The Ontario Data Catalogue has more than just CKAN's documented search fields; you can also search these custom fields. You can also use the CKAN API to retrieve metadata about a particular dataset and check for updated files. Read the complete documentation for CKAN's API. Some of the open data in the Ontario Data Catalogue is available through the Datastore API, so you can also search and access the machine-readable open data that is available in the catalogue. How to use the API feature: Read the complete documentation for CKAN's Datastore API.

The Ontario Data Catalogue contains a record for each dataset that the Government of Ontario possesses. Some of these datasets will be available to you as open data; others will not, because the Government of Ontario is unable to share data that would break the law or put someone's safety at risk. You can search for a dataset with a word that might describe a dataset or topic. Use words like "taxes" or "hospital locations" to discover what datasets the catalogue contains. You can search for a dataset from 3 spots on the catalogue: the homepage, the dataset search page, or the menu bar available across the catalogue. On the dataset search page, you can also filter your search results. You can select filters on the left-hand side of the page to limit your search to datasets with your favourite file format, datasets that are updated weekly, datasets released by a particular organization, or datasets that are released under a specific licence. Go to the dataset search page to see the filters that are available to make your search easier.
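Since the catalogue is a standard CKAN deployment, dataset metadata can be queried programmatically through CKAN's action API, as mentioned above. The sketch below is an illustration under stated assumptions: the base URL (https://data.ontario.ca) and the search term are assumptions, while package_search and the general shape of its response come from CKAN's documented API.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL of the Ontario Data Catalogue (a CKAN deployment).
BASE_URL = "https://data.ontario.ca"

def package_search(query: str, rows: int = 5) -> dict:
    """Search dataset metadata through CKAN's package_search action."""
    params = urllib.parse.urlencode({"q": query, "rows": rows})
    url = f"{BASE_URL}/api/3/action/package_search?{params}"
    with urllib.request.urlopen(url) as resp:
        payload = json.load(resp)
    if not payload.get("success"):
        raise RuntimeError("CKAN API call failed")
    return payload["result"]

if __name__ == "__main__":
    result = package_search("hospital locations")
    print(f"{result['count']} matching datasets; first few:")
    for pkg in result["results"]:
        # Each package lists its downloadable files under 'resources'.
        file_urls = [res.get("url") for res in pkg.get("resources", [])]
        print("-", pkg.get("title"), f"({len(file_urls)} files)")
```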
You can also do a quick search by selecting one of the catalogue's categories on the homepage. These categories can help you see the types of data we have on key topic areas. When you find the dataset you are looking for, click on it to go to the dataset record. Each dataset record will tell you whether the data is available and, if so, tell you about the data available. An open dataset might contain several data files. These files might represent different periods of time, different sub-sets of the dataset, different regions, language translations, or other breakdowns. You can select a file and either download it or preview it. Make sure to read the licence agreement to make sure you have permission to use it the way you want. Read more about previewing data. A non-open dataset may not be available for many reasons. Read more about non-open data. Read more about restricted data. Data that is non-open may still be subject to freedom of information requests.

The catalogue has tools that enable all users to visualize the data in the catalogue without leaving the catalogue – no additional software needed. Have a look at our walk-through of how to make a chart in the catalogue. Get automatic notifications when datasets are updated. You can choose to get notifications for individual datasets, an organization's datasets or the full catalogue. You don't have to provide any personal information – just subscribe to our feeds using any feed reader you like, using the corresponding notification web addresses. Copy those addresses and paste them into your reader. Your feed reader will let you know when the catalogue has been updated.

The catalogue provides open data in several file formats (e.g., spreadsheets, geospatial data, etc.). Learn about each format and how you can access and use the data each file contains.

A file that has a list of items and values separated by commas without formatting (e.g. colours, italics, etc.) or extra visual features. This format provides just the data that you would display in a table. XLSX (Excel) files may be converted to CSV so they can be opened in a text editor. How to access the data: Open with any spreadsheet software application (e.g., Open Office Calc, Microsoft Excel) or text editor. Note: This format is considered machine-readable; it can be easily processed and used by a computer. Files that have visual formatting (e.g. bolded headers and colour-coded rows) can be hard for machines to understand; these elements make a file more human-readable and less machine-readable.

A file that provides information without formatted text or extra visual features and that may not follow a pattern of separated values like a CSV. How to access the data: Open with any word processor or text editor available on your device (e.g., Microsoft Word, Notepad).

A spreadsheet file that may also include charts, graphs, and formatting. How to access the data: Open with a spreadsheet software application that supports this format (e.g., Open Office Calc, Microsoft Excel). Data can be converted to a CSV for a non-proprietary format of the same data without formatted text or extra visual features.

A shapefile provides geographic information that can be used to create a map or perform geospatial analysis based on location, points/lines and other data about the shape and features of the area. It includes required files (.shp, .shx, .dbf) and might include corresponding files (e.g., .prj). How to access the data: Open with a geographic information system (GIS) software program (e.g., QGIS).
A package of files and folders. The package can contain any number of different file types. How to access the data: Open with an unzipping software application (e.g., WinZIP, 7Zip). Note: If a ZIP file contains .shp, .shx, and .dbf file types, it is an ArcGIS ZIP: a package of shapefiles which provide information to create maps or perform geospatial analysis that can be opened with ArcGIS (a geographic information system software program).

A file that provides information related to a geographic area (e.g., phone number, address, average rainfall, number of owl sightings in 2011, etc.) and its geospatial location (i.e., points/lines). How to access the data: Open using a GIS software application to create a map or do geospatial analysis. It can also be opened with a text editor to view raw information. Note: This format is machine-readable, and it can be easily processed and used by a computer. Human-readable data (including visual formatting) is easy for users to read and understand.

A text-based format for sharing data in a machine-readable way that can store data with more unconventional structures such as complex lists. How to access the data: Open with any text editor (e.g., Notepad) or access through a browser. Note: This format is machine-readable, and it can be easily processed and used by a computer. Human-readable data (including visual formatting) is easy for users to read and understand.

A text-based format to store and organize data in a machine-readable way that can store data with more unconventional structures (not just data organized in tables). How to access the data: Open with any text editor (e.g., Notepad). Note: This format is machine-readable, and it can be easily processed and used by a computer. Human-readable data (including visual formatting) is easy for users to read and understand.

A file that provides information related to an area (e.g., phone number, address, average rainfall, number of owl sightings in 2011, etc.) and its geospatial location (i.e., points/lines). How to access the data: Open with a geospatial software application that supports the KML format (e.g., Google Earth). Note: This format is machine-readable, and it can be easily processed and used by a computer. Human-readable data (including visual formatting) is easy for users to read and understand.

This format contains files with data from tables used for statistical analysis and data visualization of Statistics Canada census data. How to access the data: Open with the Beyond 20/20 application.

A database which links and combines data from different files or applications (including HTML, XML, Excel, etc.). The database file can be converted to a CSV/TXT to make the data machine-readable, but human-readable formatting will be lost. How to access the data: Open with Microsoft Office Access (a database management system used to develop application software).

A file that keeps the original layout and
Increasing evidence emphasizes that the effects of human impacts on ecosystems must be investigated using designs that incorporate the responses across levels of biological organization as well as the effects of multiple stressors. Here we implemented a mesocosm experiment to investigate how the individual and interactive effects of CO2 enrichment and eutrophication scale-up from changes in primary producers at the individual (biochemistry) or population level (production, reproduction, and/or abundance) to higher levels of community (macroalgae abundance, herbivory, and global metabolism), and ecosystem organization (detritus release and carbon sink capacity). The responses of Zostera noltii seagrass meadows growing in low- and high-nutrient field conditions were compared. In both meadows, the expected CO2 benefits on Z. noltii leaf production were suppressed by epiphyte overgrowth, with no direct CO2 effect on plant biochemistry or population-level traits. Multi-level meadow response to nutrients was faster and stronger than to CO2. Nutrient enrichment promoted the nutritional quality of Z. noltii (high N, low CVN and phenolics), the growth of epiphytic pennate diatoms and purple bacteria, and shoot mortality. In the low-nutrient meadow, individual effects of CO2 and nutrients separately resulted in reduced carbon storage in the sediment, probably due to enhanced microbial degradation of more labile organic matter. These changes, however, had no effect on herbivory or on community metabolism. Interestingly, individual effects of CO2 or nutrient addition on epiphytes, shoot mortality, and carbon storage were attenuated when nutrients and CO2 acted simultaneously. This suggests CO2-induced benefits on eutrophic meadows. In the high-nutrient meadow, a striking shoot decline caused by amphipod overgrazing masked the response to CO2 and nutrient additions. Our results reveal that under future scenarios of CO2, the responses of seagrass ecosystems will be complex and context-dependent, being mediated by epiphyte overgrowth rather than by direct effects on plant biochemistry. Overall, we found that the responses of seagrass meadows to individual and interactive effects of CO2 and nutrient enrichment varied depending on interactions among species and connections between organization levels.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data used for each figure is included on a separate tab, organized by figure number. (XLSX)
The 2003 Agriculture Sample Census was designed to meet the data needs of a wide range of users down to the district level, including policy makers at local, regional and national levels, rural development agencies, funding institutions, researchers, NGOs, farmer organisations, etc. As a result, the dataset is larger in sample size and more detailed in scope than previous censuses and surveys. To date, this is the most detailed Agricultural Census carried out in Africa.
The census was carried out in order to:
· Identify structural changes, if any, in the size of farm household holdings, crop and livestock production, and farm input and implement use, and determine whether there are any improvements in rural infrastructure and in the level of agricultural household living conditions;
· Provide benchmark data on productivity, production and agricultural practices in relation to policies and interventions promoted by the Ministry of Agriculture and Food Security and other stakeholders;
· Establish baseline data for the measurement of the impact of high-level objectives of the Agriculture Sector Development Programme (ASDP), the National Strategy for Growth and Reduction of Poverty (NSGRP) and other rural development programmes and projects;
· Obtain benchmark data that will be used to address specific issues such as food security, rural poverty, gender, agro-processing, marketing, service delivery, etc.
Tanzania Mainland and Zanzibar
Large scale, small scale and community farms.
Census/enumeration data [cen]
The Mainland sample consisted of 3,221 villages. These villages were drawn from the National Master Sample (NMS) developed by the National Bureau of Statistics (NBS) to serve as a national framework for the conduct of household based surveys in the country. The National Master Sample was developed from the 2002 Population and Housing Census. The total Mainland sample was 48,315 agricultural households. In Zanzibar a total of 317 enumeration areas (EAs) were selected and 4,755 agriculture households were covered. Nationwide, all regions and districts were sampled with the exception of three urban districts (two from Mainland and one from Zanzibar).
In both Mainland and Zanzibar, a stratified two-stage sample was used. The number of villages/EAs selected in the first stage was based on probability proportional to the number of villages in each district. In the second stage, 15 households were selected from a list of farming households in each selected village/EA using systematic random sampling, with the village chairpersons assisting in locating the selected households.
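The second-stage selection described above lends itself to a compact illustration. The sketch below shows fractional-interval systematic random sampling of 15 households from a village list; the frame contents, list size, and seed are illustrative and not taken from the census.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Systematic random sampling: pick a random start, then take every k-th unit.

    `frame` is the ordered list of farming households in a village/EA and `n`
    the number of households to select (15 in the census design). Assumes the
    frame has at least n units.
    """
    rng = random.Random(seed)
    k = len(frame) / n                 # sampling interval
    start = rng.random() * k           # random start within the first interval
    picks = [int(start + i * k) for i in range(n)]
    return [frame[p] for p in picks]

# Illustrative village list of 230 farming households.
village_frame = [f"household_{i:03d}" for i in range(1, 231)]
print(systematic_sample(village_frame, n=15, seed=42))
```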
Face-to-face [f2f]
The census covered agriculture in detail as well as many other aspects of rural development and was conducted using three different questionnaires:
• Small scale questionnaire
• Community level questionnaire
• Large scale farm questionnaire
The small scale farm questionnaire was the main census instrument and it includes questions related to crop and livestock production and practices; population demographics; access to services, resources and infrastructure; and issues on poverty, gender and subsistence versus profit-making production units.
The community level questionnaire was designed to collect village level data such as access and use of common resources, community tree plantation and seasonal farm gate prices.
The large scale farm questionnaire was administered to large farms either privately or corporately managed.
Questionnaire Design
The questionnaires were designed following user meetings to ensure that the questions asked were in line with users' data needs. Several features were incorporated into the design of the questionnaires to increase the accuracy of the data:
• Where feasible, all variables were extensively coded to reduce post-enumeration coding error.
• The definitions for each section were printed on the opposite page so that the enumerator could easily refer to the instructions whilst interviewing the farmer.
• The responses to all questions were placed in boxes printed on the questionnaire, with one box per character. This feature made it possible to use scanning and Intelligent Character Recognition (ICR) technologies for data entry.
• Skip patterns were used to reduce unnecessary and incorrect coding of sections which do not apply to the respondent.
• Each section was clearly numbered, which facilitated the use of skip patterns and provided a reference for data type coding for the programming of CSPro, SPSS and the dissemination applications.
Data processing consisted of the following processes:
· Data entry
· Data structure formatting
· Batch validation
· Tabulation
Data Entry
Scanning and ICR data capture technology were used for the smallholder questionnaire on the Mainland. This not only increased the speed of data entry but also increased accuracy by reducing keystroke errors. Interactive validation routines were incorporated into the ICR software to track errors during the verification process. The scanning operation was so successful that it is highly recommended for adoption in future censuses/surveys. In Zanzibar, all data were entered manually using CSPro.
Prior to scanning, all questionnaires underwent a manual cleaning exercise. This involved checking that the questionnaire had a full set of pages, correct identification and good handwriting. A score was given to each questionnaire based on the legibility and the completeness of enumeration. This score will be used to assess the quality of enumeration and supervision in order to select the best field staff for future censuses/surveys.
CSPro was used for data entry of all Large Scale Farm and community-based questionnaires due to the relatively small number of questionnaires. It was also used to enter data from the 2,880 smallholder questionnaires that were rejected by the ICR extraction application.
Data Structure Formatting
A program was developed in Visual Basic to automatically alter the structure of the output from the scanning/extraction process in order to harmonise it with the manually entered data. The program automatically checked and changed the number of digits for each variable, the record type code, the number of questionnaires in the village, and the consistency of the Village ID Code, and saved the data of one village in a file named after the village code.
Batch Validation
A batch validation program was developed in order to identify inconsistencies within a questionnaire, in addition to the interactive validation during the ICR extraction process. The procedures varied from simple range checking within each variable to more complex checks between variables. It took six months to screen, edit and validate the data from the smallholder questionnaires. After the long process of data cleaning, tabulations were prepared based on a pre-designed tabulation plan.
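As an illustration of the two kinds of edits mentioned above, the sketch below applies a simple per-variable range check and one cross-variable consistency check. The field names, plausible ranges, and rule are hypothetical; they are not the census's actual edit specifications.

```python
# Hypothetical smallholder records; field names and plausible ranges are
# illustrative, not the census edit specifications.
records = [
    {"village_id": "0101", "hh_size": 6, "area_planted_ha": 2.5, "area_harvested_ha": 2.0},
    {"village_id": "0101", "hh_size": 0, "area_planted_ha": 1.0, "area_harvested_ha": 1.4},
]

RANGE_CHECKS = {
    "hh_size": (1, 50),               # household size
    "area_planted_ha": (0.0, 500.0),  # hectares planted
}

def validate(record):
    """Return a list of error messages for one questionnaire record."""
    errors = []
    # Simple range checks within each variable.
    for field, (lo, hi) in RANGE_CHECKS.items():
        value = record[field]
        if not lo <= value <= hi:
            errors.append(f"{field}={value} outside plausible range [{lo}, {hi}]")
    # Cross-variable check: harvested area cannot exceed planted area.
    if record["area_harvested_ha"] > record["area_planted_ha"]:
        errors.append("area_harvested_ha exceeds area_planted_ha")
    return errors

for i, rec in enumerate(records):
    for err in validate(rec):
        print(f"record {i} (village {rec['village_id']}): {err}")
```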
Tabulations
Statistical Package for Social Sciences (SPSS) was used to produce the Census tabulations and Microsoft Excel was used to organize the tables and compute additional indicators. Excel was also used to produce charts while ArcView and Freehand were used for the maps.
Analysis and Report Preparation
The analysis in this report focuses on regional comparisons, time series and national production estimates. Microsoft Excel was used to produce charts; ArcView and Freehand were used for maps, whereas Microsoft Word was used to compile the report.
Data Quality
A great deal of emphasis was placed on data quality throughout the whole exercise, from planning, questionnaire design, training, supervision and data entry to validation and cleaning/editing. As a result, it is believed that the census is highly accurate and representative of what was experienced at field level during the Census year. With very few exceptions, the variables in the questionnaire are within the norms for Tanzania and they follow expected time series trends when compared to historical data. Standard Errors and Coefficients of Variation for the main variables are presented in the Technical Report (Volume I).
Sampling errors are presented on pages 21-22 of the Technical Report for the Agriculture Sample Census Survey 2002-2003.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
41 chromatin and chromatin-regulating factors (CRFs) identified as significant regulators of AC invasion. For each RNAi clone listed, the corresponding genetic sequence name, public name, and human homolog are listed. AC invasion scoring data are provided for each clone at the P6.p 4-cell stage. Genes were determined to be significant AC invasion regulators if RNAi targeting resulted in ≥ 20% loss of invasion at the P6.p 4-cell stage (n ≥ 30 animals). Genes in bold are components of the SWI/SNF complex. Asterisks denote genes previously published to regulate C. elegans AC invasion. N.A. denotes genes for which no human ortholog exists. The list is organized alphabetically by genetic sequence name. (XLSX)
Digital clinical decision support algorithms (CDSAs) that guide healthcare workers during consultations can enhance adherence to guidelines and the resulting quality of care. However, this improvement depends on the accuracy of inputs (symptoms and signs) entered by healthcare workers into the digital tool, which relies mainly on their clinical skills, that are often limited, especially in resource-constrained primary care settings. This study aimed to identify and characterize potential clinical skill gaps based on CDSA data patterns and clinical observations. We retrospectively analyzed data from 20,085 pediatric consultations conducted using an IMCI-based CDSA in 16 primary health centers in Rwanda. We focused on clinical signs with numerical values: temperature, mid-upper arm circumference (MUAC), weight, height, z-scores (MUAC for age, weight for age, and weight for height), heart rate, respiratory rate and blood oxygen saturation. Statistical summary measures (frequency of skipped measurements, frequent plausible and implausible values) and their variation in individual health centers compared to the overall average were used to identify 10 health centers with irregular data patterns signaling potential clinical skill gaps. We subsequently observed 188 consultations in these health centers and interviewed healthcare workers to understand potential error causes. Observations indicated basic measurements not being assessed correctly in most children; weight (70%), MUAC (69%), temperature (67%), height (54%). These measures were predominantly conducted by minimally trained non-clinical staff in the registration area. More complex measures, done mostly by healthcare workers in the consultation room, were often skipped: respiratory rate (43%), heart rate (37%), blood oxygen saturation (33%). This was linked to underestimating the importance of these signs in child management, especially in the context of high patient loads typical at primary care level. Addressing clinical skill gaps through in-person training, eLearning and regular personalized mentoring tailored to specific health center needs is imperative to improve quality of care and enhance the benefits of CDSAs.
16 primary healthcare centers (HCs) of Rusizi and Nyamasheke districts in Rwanda.
The first dataset was collected directly by the ePOCT+ CDSA during 20,085 pediatric consultations across 16 primary health centers in Rwanda. It includes anonymized patient, health facility and consultation data with key clinical measurements (temperature, mid-upper arm circumference (MUAC), weight, height, MUAC-for-age z-score, weight-for-age z-score, weight-for-height z-score, heart rate, respiratory rate and blood oxygen saturation (SpO2)). The second dataset results from structured observations of 188 routine pediatric consultations at a subset of 10 health facilities. Clinicians used a standardized evaluation form to record clinical measurements, mirroring variables in the first dataset. This dataset is used to deepen the analysis of the primary dataset by explaining the patterns that emerged from the quantitative analysis of the first dataset.
Children aged 1 day to 14 years with an acute condition, in the 16 HCs where the intervention was deployed.
Clinical data [cli]
First dataset: ePOCT+ stores all the information (date of consultation, anthropometric measures, vitals, presence/absence of specific symptoms and signs prompted by the algorithm, diagnoses, medicines, managements, etc.) entered by the HW in the tablet during consultations. We retrospectively analyzed data from 20,085 outpatient consultations conducted between November 2021 and October 2022 with children aged 1 day to 14 years with an acute condition, in the 16 HCs where the intervention was deployed. Data cleaning, management, and analyses were conducted using R software (version 4.2.1).

Second dataset: Based on the results of the retrospective analysis, we observed 188 routine consultations in a subset of 10 of the 16 HCs (approximately 19 observations per HC) from 20 December 2022 to 09 March 2023. The selection of HCs was guided by the retrospective analysis, ensuring that the 10 HCs chosen were those showing the most critical results. The observing study clinician obtained oral consent from the HWs and was instructed not to interfere with the consultation, to avoid introducing any bias beyond the observer effect. To ensure a standardized and consistent evaluation, a digital evaluation form (Google Sheets) was used. These observations were conducted over 3 days per HC, with efforts made to separate them by a few days in order to have a better chance of observing several different HWs and to minimize potential bias. At the end of each day of observation in a HC (and not after each consultation, to avoid any influence on subsequent consultations), the observing study clinician conducted an interview with the HW to understand why the assessment of some signs was skipped. Data were exported to Microsoft Excel (Version 16.77.1) for further simple descriptive analysis.
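The retrospective screening step described above (comparing each health center's frequency of skipped measurements against the overall average) can be illustrated compactly. The study's analysis was done in R; the Python sketch below only shows the idea, with hypothetical consultation records and an arbitrary one-standard-deviation flagging threshold.

```python
import statistics
from collections import defaultdict

# Hypothetical per-consultation records: health-center code and whether the
# respiratory-rate measurement was skipped. Values are illustrative only.
consultations = [
    {"hc": "HC01", "rr_skipped": False},
    {"hc": "HC01", "rr_skipped": True},
    {"hc": "HC02", "rr_skipped": True},
    {"hc": "HC02", "rr_skipped": True},
    {"hc": "HC03", "rr_skipped": False},
    {"hc": "HC03", "rr_skipped": False},
]

# Frequency of skipped measurements per health center.
by_hc = defaultdict(list)
for c in consultations:
    by_hc[c["hc"]].append(c["rr_skipped"])
skip_rate = {hc: sum(v) / len(v) for hc, v in by_hc.items()}

# Flag centers whose skip rate deviates strongly from the overall average
# (a one-standard-deviation threshold is an arbitrary illustrative choice).
overall = statistics.mean(skip_rate.values())
spread = statistics.pstdev(skip_rate.values())
flagged = [hc for hc, r in skip_rate.items() if abs(r - overall) > spread]
print("overall skip rate:", round(overall, 2), "| flagged:", flagged)
```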
Second dataset: Most of the time, there was only one HW attending to children in the HC on a given day. On the rare occasions when two HWs were present, each was observed by one of the two study clinicians.
Other [oth]
The second dataset for this study was derived from structured observations of 188 routine pediatric consultations conducted across a subset of 10 health facilities. Clinicians utilized a standardized evaluation form that included variables aligning with those in the first dataset. This secondary dataset was designed to provide deeper insights into patterns observed in the primary dataset through the quantitative analysis.
The data collection focused on various clinical measurements and observations, categorized as follows:
General Information:
• Date of the consultation.
• Health facility (coded for anonymity).
• Clinical measurements taken at the reception and during the consultation.
• Presence of a conducting line.
• Additional remarks related to the consultation.
Clinical Measurements: For each of the following, the dataset records whether the measurement was assessed or skipped, the quality of assessment (sufficient/insufficient), reasons for skipping or insufficient assessments, and any extra remarks:
• Temperature (T°).
• MUAC (Mid-Upper Arm Circumference).
• Weight.
• Height.
• Respiratory Rate (RR).
• Blood Oxygen Saturation (Sat).
• Heart Rate (HR).
Additional Observations: Remarks on other signs and symptoms assessed during the consultation. The structured nature of this dataset ensures consistency in evaluating the reasons behind clinical decisions and the quality of care provided in routine pediatric consultations.
Data editing was conducted as follows:
First dataset:
• Data Extraction: The dataset was extracted from the larger ePOCT+ storage system, which records all consultation-related information entered by healthcare workers (HWs) in tablets during consultations. This includes details such as the date of consultation, anthropometric measures, vital signs, the presence or absence of specific symptoms and signs prompted by the algorithm, diagnoses, medicines, and managements.
• Data Cleaning:
The extracted data were systematically cleaned to focus solely on the variables of interest for this analysis. Irrelevant variables and incomplete records were excluded to ensure a streamlined and accurate dataset.
• Anonymization:
To protect patient and health facilities confidentiality, the data were anonymized prior to analysis. All personal identifiers were removed, and only aggregated or coded information was retained.
• Analysis Preparation:
After cleaning and anonymization, the dataset was reviewed for consistency and coherence. Specific patterns of data were analyzed for the selected variables of interest, ensuring alignment with the study objectives.
• Software Used: Data cleaning, management, and analyses were conducted using R software (version 4.2.1). All processes, including extraction, cleaning, and anonymization, were documented to maintain transparency and reproducibility.
Second dataset:
• Data Collection: Data were collected directly from respondents through a Google Forms questionnaire. The structured format ensured standardized responses across all participants, facilitating subsequent data processing and analysis.
• Data Export:
Upon completion of data collection, the dataset was exported from Google Forms to Microsoft Excel (Version 16.77.1). This provided a structured and organized format for further data handling.
• Anonymization:
All personally identifiable information was removed during the data processing phase to protect participant confidentiality. Anonymization measures included replacing personal identifiers with unique codes and omitting any information that could reveal the identity of respondents.
• Data Cleaning and Descriptive Analysis:
The dataset was reviewed in Microsoft Excel to ensure consistency and completeness. Responses were screened for missing or inconsistent data, and necessary corrections were made where appropriate. Simple descriptive analyses were conducted within Excel to summarize key variables and identify initial patterns in the data.
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0): https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
Geo-referenced point database on dams in the Middle East.
Supplemental Information:
This dataset is described extensively on the website https://www.fao.org/aquastat/en/databases/dams. On this website, the dataset is also published in Excel to facilitate the publication of information on dams without geographical co-ordinates. It is accompanied by an explanatory document that provides specific information about the references used, and brief notes on the more complicated dams. The shapefile consists of the following information: a) GIS-generated codes (FID); b) coordinates in decimal degrees (DDLONG, DDLAT); c) coordinates broken down into eight codes (LATDIR with an N or an S for North or South, and LATDEG, LATMIN and LATSEC for degrees, minutes and seconds latitude; LONGDIR with a W or E for West or East, and LONGDEG, LONGMIN and LONGSEC for degrees, minutes and seconds longitude); d) items described in detail on the website, such as river basin and administrative unit; e) completion date; f) height; g) surface area; h) main purpose.
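For users working from the attribute table rather than the geometry, the DMS fields listed above can be converted back to decimal degrees and checked against DDLAT and DDLONG. The sketch below is independent of any particular shapefile-reading library; the record values are illustrative, not an actual dam from the database.

```python
def dms_to_decimal(direction: str, degrees: float, minutes: float, seconds: float) -> float:
    """Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees."""
    value = degrees + minutes / 60.0 + seconds / 3600.0
    return -value if direction.upper() in ("S", "W") else value

# Hypothetical attribute record using the field names listed above; the
# numbers are illustrative, not taken from the dams database.
record = {
    "LATDIR": "N", "LATDEG": 33, "LATMIN": 30, "LATSEC": 0,
    "LONGDIR": "E", "LONGDEG": 36, "LONGMIN": 15, "LONGSEC": 30,
}

lat = dms_to_decimal(record["LATDIR"], record["LATDEG"], record["LATMIN"], record["LATSEC"])
lon = dms_to_decimal(record["LONGDIR"], record["LONGDEG"], record["LONGMIN"], record["LONGSEC"])
print(round(lat, 5), round(lon, 5))  # should match DDLAT and DDLONG for the same dam
```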
Contact points:
Metadata contact: AQUASTAT FAO-UN Land and Water Division
Online resources:
Download - Database of dams in the Middle East (Shapefile)
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0): https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
Geo-referenced point database on dams in Africa.
Supplemental Information:
This dataset is described extensively on the website https://www.fao.org/aquastat/en/databases/dams. On this website, the dataset is also published in Excel to facilitate the publication of information on dams without geographical co-ordinates. It is accompanied by an explanatory document that provides specific information about the references used, and brief notes on the more complicated dams. The shapefile consists of the following information: a) GIS-generated codes (FID); b) coordinates in decimal degrees (DDLONG, DDLAT); c) coordinates broken down into eight codes (LATDIR with an N or an S for North or South, and LATDEG, LATMIN and LATSEC for degrees, minutes and seconds latitude; LONGDIR with a W or E for West or East, and LONGDEG, LONGMIN and LONGSEC for degrees, minutes and seconds longitude); d) items described in detail on the website, such as river basin and administrative unit; e) completion date; f) height; g) surface area; h) main purpose.
This dataset also served as a basis for the Global Reservoirs and Dams (GRanD) database, which resulted in the article: Lehner, B., Reidy Liermann, C., Revenga, C., Vörösmarty, C., Fekete, B., Crouzet, P., Döll, P., Endejan, M., Frenken, K., Magome, J., Nilsson, C., Robertson, J., Rödel, R., Sindorf, N., Wisser, D. 2011. High-resolution mapping of the world's reservoirs and dams for sustainable river flow management, published in the journal Frontiers in Ecology and the Environment.
For wider distribution and to support other projects at FAO, this map is also distributed on a DVD as part of a publication entitled: Jenness, J., Dooley, J., Aguilar-Manjarrez, J., Riva, C. African Water Resource Database. GIS-based tools for inland aquatic resource management. 2. Technical manual and workbook. CIFA Technical Paper. No. 33, Part 2. Rome, FAO. 2007. 308 p.
Contact points:
Metadata contact: AQUASTAT FAO-UN Land and Water Division
Online resources: