Working Excel spreadsheet compilation of recently published GMarc normalized datasets mapped onto granular segments of canonical Luke and related statistical findings. There are now over 56,400 word tokens mapped.
(A) Data. Full data set, organized by transfected plate number. Shown is the average of duplicate wells in each transfection, normalized as F/R/Ba (see S10B File). Samples used in the figures are highlighted in bold, red font. (B) Example. Illustration of the data processing. Raw firefly counts (F counts) are normalized to the renilla control (R counts) for each well to give "F/R". The two Ba710 control samples in each plate are averaged to give "Ba". Each F/R value is then normalized to the Ba value to give "F/R/Ba". The duplicate F/R/Ba values are averaged to give the activity of each sample for that transfection. This number is used in further analysis as an "n" of 1. (XLSX)
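As a concrete illustration of this pipeline, here is a minimal Python sketch; the sample names, counts, and plate layout below are invented for illustration and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical plate data: one row per well, duplicate wells per sample.
plate = pd.DataFrame({
    "sample":  ["Ba710", "Ba710", "mutA", "mutA", "mutB", "mutB"],
    "firefly": [1000.0, 1100.0, 2500.0, 2400.0, 800.0, 900.0],
    "renilla": [500.0, 520.0, 510.0, 490.0, 505.0, 515.0],
})

plate["F_over_R"] = plate["firefly"] / plate["renilla"]   # F/R per well

# Average the two Ba710 control wells to get the plate-level "Ba" value.
ba = plate.loc[plate["sample"] == "Ba710", "F_over_R"].mean()

plate["F_R_Ba"] = plate["F_over_R"] / ba                  # F/R/Ba per well

# Average duplicate wells: one activity value per sample per transfection (n = 1).
activity = plate.groupby("sample")["F_R_Ba"].mean()
print(activity)
```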
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Finding a good data source is the first step toward creating a database. Cardiovascular diseases (CVDs) are the leading cause of death worldwide. CVDs include coronary heart disease, cerebrovascular disease, rheumatic heart disease, and other heart and blood vessel disorders. According to the World Health Organization, 17.9 million people die from CVDs each year. Heart attacks and strokes account for more than four out of every five CVD deaths, with one third of these deaths occurring before the age of 70.

A comprehensive database of factors that contribute to a heart attack has been constructed. The main purpose here is to collect characteristics of heart attacks and the factors that contribute to them. A form was created in Microsoft Excel to accomplish this. Figure 1 depicts the form, which has nine fields: eight input fields and one output field. Age, gender, heart rate, systolic blood pressure, diastolic blood pressure, blood sugar, CK-MB, and troponin are the input fields, while the output field records the presence of a heart attack, divided into two categories: negative (no heart attack) and positive (heart attack). Table 1 shows detailed information, including the minimum and maximum values of each attribute, for the 1,319 cases in the database. To confirm the validity of the data, we examined the patient files in the hospital archive and compared them with the data stored in the laboratory system; we also interviewed the patients and specialized doctors. Table 2 shows a sample of 44 cases from the database and the factors that lead to a heart attack.

After collecting the data, we checked it for null (invalid) values and for errors introduced during data collection. A value is null if it is unknown, and null values require special treatment: they indicate that the target is not a valid data element. Arithmetic operations on a numeric column containing one or more null values yield null. An example of null-value processing is shown in Figure 2.

The data used in this investigation were scaled between 0 and 1 to guarantee that all inputs and outputs receive equal attention and to eliminate their dimensionality. Normalizing data before applying AI models has two major advantages: it prevents attributes in larger numeric ranges from overshadowing those in smaller ranges, and it avoids numerical problems during computation. After normalization, we split the data set into two parts, using 1,060 cases for training and 259 for testing. Modeling was then implemented using the input and output variables.
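A minimal sketch of this preprocessing in Python; the file and column names are assumptions, since the original field labels are only paraphrased above.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

df = pd.read_excel("heart_attack.xlsx")          # hypothetical file name

inputs = ["age", "gender", "heart_rate", "systolic_bp",
          "diastolic_bp", "blood_sugar", "ck_mb", "troponin"]

# Scale all eight inputs to [0, 1], as described, then split 1060/259.
X = MinMaxScaler().fit_transform(df[inputs])
y = df["result"]                                 # negative / positive

X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=1060, test_size=259, random_state=0)
```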
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
In this study, the blood proteome in face transplantation was characterized with the SOMAscan platform using longitudinal serum samples from six face transplant patients. Overall, 24 serum samples from 13 no-rejection, 5 nonsevere-rejection, and 6 severe-rejection episodes were analyzed.

Files attached:
- HMS-16-007.20160218.adat - raw SomaScan dataset in adat format.
- HMS-16-007_SQS_20160218.pdf - technical validation report on the dataset.
- HMS-16-007.HybNorm.20160218.adat - SomaScan dataset after hybridization control normalization, in adat format.
- HMS-16-007.HybNorm.MedNorm.20160218.adat - SomaScan dataset after hybridization control normalization and median signal normalization, in adat format.
- HMS-16-007.HybNorm.MedNorm.Cal.20160218.adat - SomaScan dataset after hybridization control normalization, median signal normalization, and calibration, in adat format.
- HMS-16-007.HybNorm.MedNorm.Cal.20160218.xls - SomaScan dataset after hybridization control normalization, median signal normalization, and calibration, in Microsoft Excel spreadsheet format.
- Patients_metadata.txt - metadata file containing patients' demographic and clinical information, in tab-delimited text format. Metadata is linked to records in the SomaScan dataset via the 'SampleType' column.
- SciData_R_script.R - an example script for downstream statistical analysis of the HMS-16-007.HybNorm.MedNorm.Cal.20160218.adat dataset.
- SciData_R_script_SessionInfo - session information for the SciData_R_script.R script.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Additional file 1: 994 Prodigal ortholog sets with inconsistent start sites. The Excel file provides information about the 994 ortholog sets with inconsistent start sites, including the genes within each set and the gene start site revisions required to achieve consistency within each set. (XLS 1 MB)
Dataset Title: Data and Code for: "Universal Adaptive Normalization Scale (AMIS): Integration of Heterogeneous Metrics into a Unified System"

Description: This dataset contains source data and processing results for validating the Adaptive Multi-Interval Scale (AMIS) normalization method. It includes educational performance data (student grades), economic statistics (World Bank GDP), and a Python implementation of the AMIS algorithm with a graphical interface.

Contents:
- Source data: educational grades and GDP statistics
- AMIS normalization results (3-, 5-, 9-, and 17-point models)
- Comparative analysis with linear normalization
- Ready-to-use Python code for data processing

Applications:
- Educational data normalization and analysis
- Economic indicators comparison
- Development of unified metric systems
- Methodology research in data scaling

Technical info: Python code with pandas, numpy, scipy, and matplotlib dependencies. Data in Excel format.
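The AMIS algorithm itself is not reproduced here; the sketch below shows only the linear (min-max) baseline that the comparative analysis refers to, mapping raw values onto a k-point scale. The example values are invented.

```python
import numpy as np

def linear_scale(values, k):
    """Linearly map values onto a 1..k point scale (the baseline, not AMIS)."""
    v = np.asarray(values, dtype=float)
    unit = (v - v.min()) / (v.max() - v.min())   # min-max to [0, 1]
    return 1 + unit * (k - 1)                    # stretch to [1, k]

gdp_per_capita = [1200.0, 4500.0, 23000.0, 61000.0]   # invented example values
print(linear_scale(gdp_per_capita, k=5))              # e.g. a 5-point model
```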
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
The dataset contains results from NanoString Digital Spatial Profiling (DSP; trade name now GeoMx) experiments using colonic punch biopsy FFPE thin sections from IBD and IBS patients. The multiplex probe panel includes barcode-linked antibodies against 26 immuno-oncology-relevant proteins and 4 reference/normalization proteins.
The IF labeling strategy included pan-cytokeratin, tryptase, and DAPI staining for epithelia, mast cells, and sub-mucosal tissue, respectively. 21 FFPE sections were used, representing 19 individuals. The 14 pediatric samples comprised 8 IBD, 5 IBS, and 1 recurrent abdominal pain diagnoses. 7 adult samples were studied: 2 normal tissue biopsies from a single healthy control, 3 X-linked severe combined immunodeficiency (XSCID) samples from 2 individuals, 1 graft-versus-host disease sample, and 1 eosinophilic gastroenteritis sample. 8 representative ROIs per slide were selected, with a 9th ROI representing a lymphoid aggregate selected where present. Each ROI contained the three masks (PanCK/epithelia, tryptase/mast cells, DAPI/submucosa), and therefore each slide generated 24 individual 30-plex protein expression profiles, plus a 25th profile from the lymphoid ROI (when present).
The data include: 1) Matrix of metadata with sample identifiers and clinical diagnoses (Excel file). 2) A PowerPoint for each sample showing an image of the full slide, images of each selected ROI and QC expression data. 3) An Excel file for each sample containing raw and normalized protein counts. Three normalization methods are reported: a) Normalization by nuclei count, b) Normalization by tissue area, c) Normalization by housekeeping proteins (Histone H3, Ribosomal protein S6).
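As a rough illustration of method (c), the sketch below normalizes a counts table by the geometric mean of the housekeeping targets; the file name, row labels, and the use of a geometric mean are assumptions, not a statement of the platform's exact procedure.

```python
import numpy as np
import pandas as pd

# Hypothetical counts table: rows = protein targets, columns = ROI/mask segments.
counts = pd.read_excel("sample_counts.xlsx", index_col=0)

housekeepers = ["Histone H3", "S6"]   # row labels are assumptions

# Per-segment factor: geometric mean of the housekeeping counts in that segment.
factors = np.exp(np.log(counts.loc[housekeepers]).mean(axis=0))

# Rescale factors so the average segment is unchanged, then divide through.
normalized = counts / (factors / factors.mean())
```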
Analyses derived from these data have been published in two conference proceedings (see references below).
We provide data in an Excel file, with the absolute differences in beta values between replicate samples for each probe given in separate tabs for the raw data and for each normalization method.
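A short sketch of how such per-probe differences could be computed; the file name, column layout, and method names below are placeholders, not the dataset's actual labels.

```python
import pandas as pd

# Hypothetical table: one row per probe, replicate beta values per method.
betas = pd.read_csv("replicate_betas.csv", index_col="probe_id")

# Absolute difference in beta value between the two replicates, per probe,
# computed separately for raw data and for each normalization method.
for method in ["raw", "methodA", "methodB"]:   # method names are placeholders
    diff = (betas[f"{method}_rep1"] - betas[f"{method}_rep2"]).abs()
    print(method, diff.describe())
```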
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
We provide the data used for this research in both Excel (one file with one matrix per sheet, 'Allmatrices.xlsx') and CSV (one file per matrix) formats.
Patent applications (Patent_applications.csv): Patent applications from residents and non-residents per million inhabitants. Data obtained from the World Development Indicators database (World Bank, 2020). Normalization by the number of inhabitants was performed by the authors.
High-tech exports (High-tech_exports.csv): The proportion of high-technology manufactured exports in total exports, by technology intensity, obtained from the Trade Structure by Partner, Product or Service-Category database (Lall, 2000; UNCTAD, 2019).
Expenditure on education (Expenditure_on_education.csv): Per capita government expenditure on education, total (constant 2010 US$). The data were obtained from the government expenditure on education (total % of GDP), GDP (constant 2010 US$), and population indicators of the World Development Indicators database (World Bank, 2020). Normalization by the number of inhabitants was performed by the authors.
Scientific publications (Scientific_publications.csv): Scientific and technical journal articles per million inhabitants. The data were obtained from the scientific and technical journal articles and population indicators of the World Development Indicators database (World Bank, 2020). Normalization by the number of inhabitants was performed by the authors.
Expenditure on R&D (Expenditure_on_R&D.csv): Per capita expenditure on research and development. Data obtained from the research and development expenditure (% of GDP), GDP (constant 2010 US$), and population indicators of the World Development Indicators database (World Bank, 2020). Normalization by the number of inhabitants was performed by the authors.
Two centuries of GDP (GDP_two_centuries.csv): Inflation-adjusted GDP per capita. Data obtained from the Maddison Project Database, version 2018 (Inklaar et al., 2018), available from the Open Numbers community (open-numbers.github.io).
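The per-inhabitant normalization described above amounts to dividing each indicator by the population in millions. A minimal sketch, assuming hypothetical raw-input files aligned country by year:

```python
import pandas as pd

# Hypothetical raw inputs: country-by-year matrices of counts and population.
patents = pd.read_csv("patent_applications_raw.csv", index_col=0)
population = pd.read_csv("population.csv", index_col=0)

# Patent applications per million inhabitants, as in Patent_applications.csv.
patents_per_million = patents / (population / 1_000_000)
```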
Inklaar, R., de Jong, H., Bolt, J., & van Zanden, J. (2018). Rebasing “Maddison”: new income comparisons and the shape of long-run economic development (GD-174; GGDC Research Memorandum). https://www.rug.nl/research/portal/files/53088705/gd174.pdf
Lall, S. (2000). The Technological Structure and Performance of Developing Country Manufactured Exports, 1985‐98. Oxford Development Studies, 28(3), 337–369. https://doi.org/10.1080/713688318
UNCTAD. (2019). Trade Structure by Partner, Product or Service-Category. https://unctadstat.unctad.org/EN/
World Bank. (2020). World Development Indicators. https://databank.worldbank.org/source/world-development-indicators
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Additional file 2: Data set (Excel file). The Excel data file data_set_of_extracted_data_Buchka_et_al.xlsx contains the data from our bibliographical survey.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Full dataset for the siderite + R. palustris experiment. Includes:
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Netflix Dashboard using Power BI, MySQL, and Excel
This project visualizes Netflix’s catalog of Movies and TV Shows to uncover trends by release year, genre, country, and rating.
Dataset Used: Netflix Movies and TV Shows Dataset https://www.kaggle.com/datasets/shivamb/netflix-shows
Steps Followed:
- Cleaned and transformed the data in Excel (Text to Columns for cast, director, listed_in, country).
- Split the dataset into normalized Excel sheets (titles, cast, directors, genres, countries, descriptions).
- Imported the sheets into MySQL Workbench and replaced blanks with NULL.
- Used UNION queries to flatten the country and genre fields (a pandas equivalent is sketched below).
- Connected Power BI to MySQL for live data visualization.
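Since the flattening step is central to the data model, here is a pandas equivalent of what the UNION queries produce; netflix_titles.csv is the file name used in the linked Kaggle dataset, and the column names follow that file.

```python
import pandas as pd

titles = pd.read_csv("netflix_titles.csv")

# Split the comma-separated `country` field and emit one row per
# title-country pair, mirroring what the UNION queries produce in MySQL.
countries = (
    titles.assign(country=titles["country"].str.split(", "))
          .explode("country")
          .dropna(subset=["country"])
          [["show_id", "country"]]
)
print(countries.head())
```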
Dashboard Pages:
1. **Overview Page**: KPIs, ratings, genres, global availability.
2. **Single Title Overview**: Cast, director, description, map view by country.
Tools:
Excel | MySQL | Power BI | SQL | DAX
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
ERS annually calculates "normalized prices," which smooth out the effects of short-run seasonal or cyclical variation, for key agricultural inputs and outputs. They are used to evaluate the benefits of projects affecting agriculture. This record was taken from the USDA Enterprise Data Inventory that feeds into the https://data.gov catalog. Data for this record includes the following resource: a web page with links to Excel files. For complete information, please visit https://data.gov.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
In order to compare strength testing results of ceramic specimens obtained through different testing methods, knowledge of the effective surface or effective volume is essential.
This repository provides data to determine the maximum tensile stress, the effective surface, and the effective volume for the "Notched Roller Test" described in [https://doi.org/10.1016/j.jeurceramsoc.2014.02.009]. The relevant geometrical and material parameters for determining the effective surface or effective volume are:
- Roller diameter D
- Roller length H
- Roller chamfering radius rf
- Notch length l
- Notch width w
- Notch root radius rn
- Poisson's ratio v
- Weibull modulus m
The data is available within:
1 <= H/D <= 3
0 <= rf/D <= 0.05
0.74 <= l/D <= 0.9
0.05 <= w/D <= 0.2
0 <= rn/w <= 0.5
0.1 <= v <= 0.4
1 <= m <= 50
Based on the data for stress interpolation, the maximum tensile stress can be determined from an interpolation of "finter" and the relevant geometrical properties (see equation 1 in the paper cited above). The normalized effective surface or effective volume can be determined through interpolation of the Seff and Veff data of this repository in the same way. The normalization volume Vnorm and normalization surface Snorm are given through the volume (= Pi*H*(D/2)^2) and surface (= Pi*H*D + 2*Pi*(D/2)^2) of the roller, respectively. To aid evaluation, interpolation files in Python, Excel and Mathematica are also provided in this repository.
Additional information:
- Data files (.csv, .tsv, .xlsx)
The structure of the data in each file for stress evaluation is as follows:
H/D || rf/D || l/D || w/D || rn/w || v || finter
All files provided follow this convention, and the permutation follows v -> rn/w -> w/D -> l/D -> rf/D -> H/D
The structure of the data in each file for the evaluation of Veff and Seff is as follows:
H/D || rf/D || l/D || w/D || rn/w || v || m || Veff/Vnorm || Seff/Snorm
All files provided follow this convention, and the permutation follows m -> v -> rn/w -> w/D -> l/D -> rf/D -> H/D
- Interpolation files (.xlsx, .py, .nb)
The interpolation implemented in the Excel file is linear, while the others are cubic; the results from the Python and Mathematica files may vary slightly.
Excel file:
Entering the specimen geometry and material parameters automatically updates the values for the maximum tensile stress and all effective quantities.
Python file:
The .csv files must be in the same directory as the script. Running the script opens command-line prompts for entering the specimen geometry and material parameters; results for the maximum tensile stress and all effective quantities are then given. An illustrative sketch of this kind of grid interpolation follows.
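This sketch shows how such a cubic grid interpolation can be set up in Python with SciPy; it is not the repository's actual script, and the file name, column order, and row-ordering assumption are mine.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import RegularGridInterpolator

# Hypothetical file and column names; the repository's actual files may differ.
data = pd.read_csv("finter.csv",
                   names=["H_D", "rf_D", "l_D", "w_D", "rn_w", "v", "finter"])

# Grid axes; the reshape assumes rows are ordered with H/D varying slowest
# and v fastest, one reading of the permutation order stated above.
cols = ["H_D", "rf_D", "l_D", "w_D", "rn_w", "v"]
axes = [np.unique(data[c]) for c in cols]
grid = data["finter"].to_numpy().reshape([len(a) for a in axes])

# Cubic interpolation (requires SciPy >= 1.9 and >= 4 points per axis).
interp = RegularGridInterpolator(axes, grid, method="cubic")
print(interp([[2.0, 0.02, 0.8, 0.1, 0.25, 0.3]]))  # example query point
```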
Mathematica file:
The .csv files must be in the same directory as the script. The rows marked in red are the input lines for the specimen geometry and material parameters; results for the maximum tensile stress and all effective quantities are then given in the lines highlighted in green.
This dataset contains Normalized Difference Vegetation Index (NDVI), Leaf Area Index (LAI), and phytomass data collected at the Ivotuk field site during the growing season of 1999. The worksheets within this Excel file contain mean NDVI and LAI data, raw NDVI and LAI data, seasonal mean phytomass, peak phytomass data, and raw phytomass data separated by sampling period.
This dataset contains changes in Normalized Difference Vegetation Index (NDVI) data from the International Tundra Experiment (ITEX), 1999. The dataset is in Excel format. For more information, please see the readme file.
Business licenses issued by the Department of Business Affairs and Consumer Protection in the City of Chicago from 2006 to the present. This dataset contains a large number of records/rows of data and may not be viewable in full in Microsoft Excel. Therefore, when downloading the file, select CSV from the Export menu, and open the file in an ASCII text editor, such as Notepad or WordPad, to view and search.
Data fields requiring description are detailed below.
APPLICATION TYPE: ‘ISSUE’ is the record associated with the initial license application. ‘RENEW’ is a subsequent renewal record. All renewal records are created with a term start date and term expiration date. ‘C_LOC’ is a change of location record. It means the business moved. ‘C_CAPA’ is a change of capacity record. Only a few license types may file this type of application. ‘C_EXPA’ only applies to businesses that have liquor licenses. It means the business location expanded.
LICENSE STATUS: ‘AAI’ means the license was issued. ‘AAC’ means the license was cancelled during its term. ‘REV’ means the license was revoked. 'REA' means the license revocation has been appealed.
LICENSE STATUS CHANGE DATE: This date corresponds to the date a license was cancelled (AAC), revoked (REV) or appealed (REA).
Business License Owner information may be accessed at: https://data.cityofchicago.org/dataset/Business-Owners/ezma-pppn. To identify the owner of a business, you will need the account number or legal name, which may be obtained from this Business Licenses dataset.
Data Owner: Business Affairs and Consumer Protection. Time Period: January 1, 2006 to present. Frequency: Data is updated daily.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
Amazon Financial Dataset: R&D, Marketing, Campaigns, and Profit
This dataset provides fictional yet insightful financial data of Amazon's business activities across all 50 states of the USA. It is specifically designed to help students, researchers, and practitioners perform various data analysis tasks such as log normalization, Gaussian distribution visualization, and financial performance comparisons.
Each row represents a state and contains the following columns:
- R&D Amount (in $): The investment made in research and development.
- Marketing Amount (in $): The expenditure on marketing activities.
- Campaign Amount (in $): The costs associated with promotional campaigns.
- State: The state in which the data is recorded.
- Profit (in $): The net profit generated from the state.
Additional features include log-normalized and Z-score transformations for advanced analysis.
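A short sketch of those two transformations in Python; the file name is hypothetical, and the column names follow the list above.

```python
import numpy as np
import pandas as pd

df = pd.read_csv("amazon_financials.csv")   # hypothetical file name

money_cols = ["R&D Amount (in $)", "Marketing Amount (in $)",
              "Campaign Amount (in $)", "Profit (in $)"]
for col in money_cols:
    df[col + " (log)"] = np.log1p(df[col])                          # log normalization
    df[col + " (z)"] = (df[col] - df[col].mean()) / df[col].std()   # Z-score
```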
This dataset is ideal for practicing:
1. Log Transformation: Normalize skewed data for better modeling and analysis.
2. Statistical Analysis: Explore relationships between financial investments and profit.
3. Visualization: Create compelling graphs such as Gaussian distributions and standard normal distributions.
4. Machine Learning Projects: Build regression models to predict profits based on R&D and marketing spend.
This dataset is synthetically generated and is not based on actual Amazon financial records. It is created solely for educational and practice purposes.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This dataset from the VIRTA Publication Information Service consists of the metadata of 241,575 publications of Finnish universities (publication years 2016–2021) merged from yearly datasets downloaded from https://wiki.eduuni.fi/display/cscvirtajtp/Vuositasoiset+Excel-tiedostot.
The dataset contains the following information:
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This spreadsheet implements the FA normalization technique for analyzing a set of male Drosophila cuticular hydrocarbons. It is intended for GC-FID output. Sample data is included. New data can be copied into the file to apply the normalization. (0.07 MB DOC)