100+ datasets found
  1. Escape Excel: A tool for preventing gene symbol and accession conversion...

    • plos.figshare.com
    xlsx
    Updated Jun 5, 2023
    Cite
    Eric A. Welsh; Paul A. Stewart; Brent M. Kuenzi; James A. Eschrich (2023). Escape Excel: A tool for preventing gene symbol and accession conversion errors [Dataset]. http://doi.org/10.1371/journal.pone.0185207
    Available download formats: xlsx
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Eric A. Welsh; Paul A. Stewart; Brent M. Kuenzi; James A. Eschrich
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background

    Microsoft Excel automatically converts certain gene symbols, database accessions, and other alphanumeric text into dates, scientific notation, and other numerical representations. These conversions lead to subsequent, irreversible corruption of the imported text. A recent survey of popular genomic literature estimates that one-fifth of all papers with supplementary gene lists suffer from this issue.

    Results

    Here, we present an open-source tool, Escape Excel, which prevents these erroneous conversions by generating an escaped text file that can be safely imported into Excel. Escape Excel is implemented in a variety of formats (http://www.github.com/pstew/escape_excel), including a command-line Perl script, a Windows-only Excel Add-In, an OS X drag-and-drop application, a simple web server, and a Galaxy web environment interface. Test server implementations are accessible as a Galaxy interface (http://apostl.moffitt.org) and a simple non-Galaxy web server (http://apostl.moffitt.org:8000/).

    Conclusions

    Escape Excel detects and escapes a wide variety of problematic text strings so that they are not erroneously converted into other representations upon importation into Excel. Examples of problematic strings include date-like strings, time-like strings, leading zeroes in front of numbers, and long numeric and alphanumeric identifiers that should not be automatically converted into scientific notation. It is hoped that greater awareness of these potential data corruption issues, together with diligent escaping of text files prior to importation into Excel, will help to reduce the amount of Excel-corrupted data in scientific analyses and publications.
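
    The kinds of strings listed in this description (date-like, time-like, leading zeroes, long or scientific-notation-like identifiers) can be illustrated with a minimal Python sketch. It is not the Escape Excel implementation: the patterns are hypothetical and non-exhaustive, and wrapping a field as `="value"` is an assumed escaping convention, since the description does not state which form the tool uses.

```python
import re

# Hypothetical month-name pattern; real tools may recognise more variants.
MONTH = (r"(?:jan(?:uary)?|feb(?:ruary)?|mar(?:ch)?|apr(?:il)?|may|june?|july?|"
         r"aug(?:ust)?|sep(?:t(?:ember)?)?|oct(?:ober)?|nov(?:ember)?|dec(?:ember)?)")

# Non-exhaustive patterns for text a spreadsheet may silently convert.
RISKY_PATTERNS = [
    re.compile(r"^" + MONTH + r"[-/ ]?\d{1,2}$", re.IGNORECASE),  # SEPT2, MARCH1, Dec-1
    re.compile(r"^\d{1,2}[-/ ]" + MONTH + r"$", re.IGNORECASE),   # 1-Dec
    re.compile(r"^\d{1,2}:\d{2}(?::\d{2})?$"),                    # time-like, e.g. 12:30
    re.compile(r"^0\d+$"),                                        # leading zeroes, e.g. 00123
    re.compile(r"^\d+E\d+$", re.IGNORECASE),                      # e.g. 2310009E13
    re.compile(r"^\d{12,}$"),                                     # long numeric identifiers
]


def is_risky(field):
    """Return True if the field matches any of the illustrative patterns."""
    return any(p.match(field) for p in RISKY_PATTERNS)


def escape_field(field):
    """Wrap a risky field as ="..." (assumed convention); leave others unchanged."""
    if is_risky(field):
        return '="' + field.replace('"', '""') + '"'
    return field


def escape_line(line, sep="\t"):
    """Escape every field of one delimited line."""
    return sep.join(escape_field(f) for f in line.split(sep))
```

    For example, `escape_line("MARCH1\tTP53\t00123")` escapes the first and third fields and leaves `TP53` untouched.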

  2. Raw data (see Excel spreadsheet).

    • datasetcatalog.nlm.nih.gov
    Updated Apr 29, 2024
    Cite
    Kocha, Katrinka M.; Ahuja, Suchit; Labit, Elodie; Rosin, Nicole; Li, Qing; Huang, Peng; Long, Quan; Narang, Ankita; Biernaskie, Jeff; Sinha, Sarthak; Adjekukor, Cynthia; Childs, Sarah J. (2024). Raw data (see Excel spreadsheet). [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001416008
    Dataset updated
    Apr 29, 2024
    Authors
    Kocha, Katrinka M.; Ahuja, Suchit; Labit, Elodie; Rosin, Nicole; Li, Qing; Huang, Peng; Long, Quan; Narang, Ankita; Biernaskie, Jeff; Sinha, Sarthak; Adjekukor, Cynthia; Childs, Sarah J.
    Description

    Brain pericytes are one of the critical cell types that regulate endothelial barrier function and activity, thus ensuring adequate blood flow to the brain. The genetic pathways guiding undifferentiated cells into mature pericytes are not well understood. We show here that pericyte precursor populations from both neural crest and head mesoderm of zebrafish that express the transcription factor nkx3.1 develop into brain pericytes. We identify the gene signature of these precursors and show that an nkx3.1-, foxf2a-, and cxcl12b-expressing pericyte precursor population is present around the basilar artery prior to artery formation and pericyte recruitment. The precursors later spread throughout the brain and differentiate to express canonical pericyte markers. Cxcl12b-Cxcr4 signaling is required for pericyte attachment and differentiation. Further, both nkx3.1 and cxcl12b are necessary and sufficient in regulating pericyte number, as loss inhibits and gain increases pericyte number. Through genetic experiments, we have defined a precursor population for brain pericytes and identified genes critical for their differentiation.

  3. Yield to the Data: Some Perspective on Crop Productivity and Pesticides -...

    • borealisdata.ca
    • search.dataone.org
    Updated Dec 3, 2024
    Cite
    Nicole Washuck; Mark Hanson; Ryan Prosser (2024). Yield to the Data: Some Perspective on Crop Productivity and Pesticides - Excel user form [Dataset]. http://doi.org/10.5683/SP3/RDQWIK
    Dataset updated
    Dec 3, 2024
    Dataset provided by
    Borealis
    Authors
    Nicole Washuck; Mark Hanson; Ryan Prosser
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 2021 - Dec 2021
    Area covered
    North America
    Dataset funded by
    Natural Sciences and Engineering Research Council of Canada
    Description

    The hectares of habitat protected and the number of adults and children fed in one year were calculated for each of the six crop types for Canada and the United States. The calculations were based on the 50th centile of the cumulative frequency distributions of change in crop yield due to pesticide treatment for each crop type. An editable interactive table was created using Microsoft Excel that allows individuals to determine how pesticide treatment in their selected jurisdiction (province in Canada or state in the United States) and crop translates into habitat saved, calories produced, and mouths fed. This table allows the user to choose the country (Canada or United States), whether to include the organic agriculture correction factor, their state or province of interest, the crop, and whether a young child, adolescent child, adult woman, or adult man is being fed. The table then calculates the hectares of habitat saved, the added number of calories produced (kcal), the number of individuals fed in one day, and the number of individuals fed in one year. Due to the variability in yield results between crops and studies, the Excel user form allows individuals to set whichever yield increase they anticipate observing or to use the 50th centile of yield increase from the cumulative frequency distribution for each crop.

  4. Coffee Shop Sales Analysis

    • kaggle.com
    Updated Apr 25, 2024
    Cite
    Monis Amir (2024). Coffee Shop Sales Analysis [Dataset]. https://www.kaggle.com/datasets/monisamir/coffee-shop-sales-analysis
    Dataset updated
    Apr 25, 2024
    Dataset provided by
    Kaggle
    Authors
    Monis Amir
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Analyzing Coffee Shop Sales: Excel Insights 📈

    In my first data analytics project, I explore the secrets of a fictional coffee shop's success through data-driven analysis. By analyzing a 5-sheet Excel dataset, I've uncovered valuable sales trends, customer preferences, and insights that can guide future business decisions. 📊☕

    DATA CLEANING 🧹

    • REMOVED DUPLICATES OR IRRELEVANT ENTRIES: Thoroughly eliminated duplicate records and irrelevant data to refine the dataset for analysis.

    • FIXED STRUCTURAL ERRORS: Rectified any inconsistencies or structural issues within the data to ensure uniformity and accuracy.

    • CHECKED FOR DATA CONSISTENCY: Verified the integrity and coherence of the dataset by identifying and resolving any inconsistencies or discrepancies.

    DATA MANIPULATION 🛠️

    • UTILIZED LOOKUPS: Used Excel's lookup functions for efficient data retrieval and analysis.

    • IMPLEMENTED INDEX MATCH: Leveraged the Index Match function to perform advanced data searches and matches.

    • APPLIED SUMIFS FUNCTIONS: Utilized SumIFs to calculate totals based on specified criteria.

    • CALCULATED PROFITS: Used relevant formulas and techniques to determine profit margins and insights from the data.

    PIVOTING THE DATA

    • CREATED PIVOT TABLES: Utilized Excel's PivotTable feature to pivot the data for in-depth analysis.

    • FILTERED DATA: Utilized pivot tables to filter and analyze specific subsets of data, enabling focused insights. Used especially in the "PEAK HOURS" and "TOP 3 PRODUCTS" charts.

    VISUALIZATION 📊

    • KEY INSIGHTS: Unveiled the grand total sales revenue while also analyzing the average bill per person, offering comprehensive insights into the coffee shop's performance and customer spending habits.

    • SALES TREND ANALYSIS: Used Line chart to compute total sales across various time intervals, revealing valuable insights into evolving sales trends.

    • PEAK HOUR ANALYSIS: Leveraged Clustered Column chart to identify peak sales hours, shedding light on optimal operating times and potential staffing needs.

    • TOP 3 PRODUCTS IDENTIFICATION: Utilized Clustered Bar chart to determine the top three coffee types, facilitating strategic decisions regarding inventory management and marketing focus.

    *I also used a Timeline to visualize chronological data trends and identify key patterns over specific times.

    While it's a significant milestone for me, I recognize that there's always room for growth and improvement. Your feedback and insights are invaluable to me as I continue to refine my skills and tackle future projects. I'm eager to hear your thoughts and suggestions on how I can make my next endeavor even more impactful and insightful.

    THANKS TO: WsCube Tech, Mo Chen, Alex Freberg

    TOOLS USED: Microsoft Excel

    #DataAnalytics #DataAnalyst #ExcelProject #DataVisualization #BusinessIntelligence #SalesAnalysis #DataAnalysis #DataDrivenDecisions

  5. Excel, AL Age Group Population Dataset: A Complete Breakdown of Excel Age...

    • neilsberg.com
    csv, json
    Updated Jul 24, 2024
    Cite
    Neilsberg Research (2024). Excel, AL Age Group Population Dataset: A Complete Breakdown of Excel Age Demographics from 0 to 85 Years and Over, Distributed Across 18 Age Groups // 2024 Edition [Dataset]. https://www.neilsberg.com/research/datasets/aa8c95e0-4983-11ef-ae5d-3860777c1fe6/
    Available download formats: csv, json
    Dataset updated
    Jul 24, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Excel, Alabama
    Variables measured
    Population Under 5 Years, Population over 85 years, Population Between 5 and 9 years, Population Between 10 and 14 years, Population Between 15 and 19 years, Population Between 20 and 24 years, Population Between 25 and 29 years, Population Between 30 and 34 years, Population Between 35 and 39 years, Population Between 40 and 44 years, and 9 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we first analyzed and categorized the data for each of the age groups. We divided ages between 0 and 85 into roughly 5-year buckets. For ages over 85, we aggregated the data into a single group. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the Excel population distribution across 18 age groups. It lists the population in each age group along with its percentage of the total population of Excel. The dataset can be used to understand the population distribution of Excel by age. For example, using this dataset, we can identify the largest age group in Excel.

    Key observations

    The largest age group in Excel, AL was the 45 to 49 years group, with a population of 74 (15.64%), according to the ACS 2018-2022 5-Year Estimates. The smallest age group in Excel, AL was the 85 years and over group, with a population of 2 (0.42%). Source: U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Variables / Data Columns

    • Age Group: This column displays the age group in consideration
    • Population: This column shows the population of the specific age group in Excel.
    • % of Total Population: This column displays the population of each age group as a proportion of Excel's total population. Please note that the sum of all percentages may not equal one due to rounding of values.
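
    As a rough illustration of how the "% of Total Population" column relates to the "Population" column, the sketch below recomputes the two percentages quoted under "Key observations". The total of 473 is an assumption: the listing does not state it, and it is used only because it is consistent with both reported figures. The function name is hypothetical.

```python
def percent_of_total(count, total):
    """Share of the total population, as a percentage rounded to two decimals."""
    return round(100 * count / total, 2)


# Assumed total population (not stated in the listing).
ASSUMED_TOTAL = 473

largest = percent_of_total(74, ASSUMED_TOTAL)   # 45 to 49 years group
smallest = percent_of_total(2, ASSUMED_TOTAL)   # 85 years and over group
```

    Because each value is rounded independently, the rounded percentages across all 18 age groups need not sum exactly to 100, which is the caveat noted in the column description.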

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Excel Population by Age.

  6. Random Test Data

    • figshare.com
    bin
    Updated Jan 19, 2016
    Cite
    Stefan Proell (2016). Random Test Data [Dataset]. http://doi.org/10.6084/m9.figshare.1096255.v2
    Available download formats: bin
    Dataset updated
    Jan 19, 2016
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Stefan Proell
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a test

  7. Sales and workload in retail industry

    • kaggle.com
    zip
    Updated Dec 11, 2019
    Cite
    Dennis Gluesenkamp (2019). Sales and workload in retail industry [Dataset]. https://www.kaggle.com/dgluesen/sales-and-workload-data-from-retail-industry
    Available download formats: zip (454426 bytes)
    Dataset updated
    Dec 11, 2019
    Authors
    Dennis Gluesenkamp
    Description

    Context

    Raw data of real analytical use cases in a number of industries and companies is frequently provided in an Excel-based form. These files usually cannot be processed directly in machine learning models, but must first be cleaned and preprocessed. In this procedure, many different types of pitfalls may occur. This makes data preprocessing an essential time factor in the daily work of a data scientist.

    Here, an Excel spreadsheet will be presented which in this form is closely oriented to a real case but contains only simulated figures for reasons of data and business results protection. The form and structure of the file correspond to a real case and could be encountered by a data scientist in a company in this way. Such a file can be the result of a download from a financial controlling system, e.g. SAP.

    Content

    The data includes information about goods sold (product units), the associated turnover, and hours worked. This information is grouped by month, store, and department of the retailer. Moreover, information about the sales area in a specific department as well as about the opening hours of the store is provided.

    Possible objectives

    The following goals of data cleansing might be addressed:

    • Import the Excel file
    • Inspect the dataset
    • Check data types and make meaningful modifications
    • Handle missing values/data gaps
    • Find and solve data inconsistencies
    • Rename columns for improved usage
    • Join tables to a single one

    Furthermore, the data can be investigated with regard to correlations between different features and/or a regression model.

    License

    GNU General Public License v3.0 - https://www.gnu.org/licenses/gpl-3.0.en.html

  8. Utah FORGE: Milford Triaxial Test Data and Summary from EGI labs

    • gdr.openei.org
    • data.openei.org
    • +2more
    data
    Updated Mar 1, 2016
    Cite
    Joe Moore; Joe Moore (2016). Utah FORGE: Milford Triaxial Test Data and Summary from EGI labs [Dataset]. http://doi.org/10.15121/1406605
    Available download formats: data
    Dataset updated
    Mar 1, 2016
    Dataset provided by
    Energy and Geoscience Institute at the University of Utah
    USDOE Office of Energy Efficiency and Renewable Energy (EERE), Renewable Power Office. Geothermal Technologies Program (EE-4G)
    Geothermal Data Repository
    Authors
    Joe Moore; Joe Moore
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Six samples were evaluated in unconfined and triaxial compression; their data are included in separate Excel spreadsheets and summarized in the Word document. Three samples were plugged along the axis of the core (presumed to be nominally vertical) and three samples were plugged perpendicular to the axis of the core. A designation of "V" indicates vertical, meaning the long axis of the plugged sample is aligned with the axis of the core. Similarly, "H" indicates a sample that is nominally horizontal and cut orthogonal to the axis of the core. Stress-strain curves were made before and after the testing, and are included in the Word document. The confining pressure for this test was 2800 psi. A series of tests is being carried out to define a failure envelope, to provide representative hydraulic fracture design parameters, and for future geomechanical assessments. The samples are from well 52-21, which reaches a maximum depth of 3581 ft +/- 2 ft into a gneiss complex.

  9. test data csv

    • kaggle.com
    zip
    Updated Dec 26, 2021
    Cite
    PB&J (2021). test data csv [Dataset]. https://www.kaggle.com/datasets/pbatch21/test-data-csv
    Available download formats: zip (842 bytes)
    Dataset updated
    Dec 26, 2021
    Authors
    PB&J
    Description

    Dataset

    This dataset was created by PB&J

    Contents

  10. Relaxed Naïve Bayes Data

    • dataverse.harvard.edu
    Updated Aug 7, 2023
    Cite
    Relaxed Naïve Bayes Team (2023). Relaxed Naïve Bayes Data [Dataset]. http://doi.org/10.7910/DVN/7KNKLL
    Dataset updated
    Aug 7, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Relaxed Naïve Bayes Team
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    NaiveBayes_R.xlsx: This Excel file includes information as to how probabilities of observed features are calculated given recidivism (P(x_ij|R)) in the training data. Each cell is embedded with an Excel function to render appropriate figures. P(Xi|R): This tab contains probabilities of feature attributes among recidivated offenders. NIJ_Recoded: This tab contains re-coded NIJ recidivism challenge data following our coding schema described in Table 1. Recidivated_Train: This tab contains re-coded features of recidivated offenders. Tabs from [Gender] through [Condition_Other]: Each tab contains probabilities of feature attributes given recidivism. We use these conditional probabilities to replace the raw values of each feature in the P(Xi|R) tab.

    NaiveBayes_NR.xlsx: This Excel file includes information as to how probabilities of observed features are calculated given non-recidivism (P(x_ij|N)) in the training data. Each cell is embedded with an Excel function to render appropriate figures. P(Xi|N): This tab contains probabilities of feature attributes among non-recidivated offenders. NIJ_Recoded: This tab contains re-coded NIJ recidivism challenge data following our coding schema described in Table 1. NonRecidivated_Train: This tab contains re-coded features of non-recidivated offenders. Tabs from [Gender] through [Condition_Other]: Each tab contains probabilities of feature attributes given non-recidivism. We use these conditional probabilities to replace the raw values of each feature in the P(Xi|N) tab.

    Training_LnTransformed.xlsx: Figures in each cell are log-transformed ratios of the probabilities in NaiveBayes_R.xlsx (P(Xi|R)) to the probabilities in NaiveBayes_NR.xlsx (P(Xi|N)).

    TestData.xlsx: This Excel file includes the following tabs based on the test data: P(Xi|R), P(Xi|N), NIJ_Recoded, and Test_LnTransformed (log-transformed P(Xi|R)/P(Xi|N)).

    Training_LnTransformed.dta: We transform Training_LnTransformed.xlsx to a Stata data set. We use the Stat/Transfer 13 software package to transfer the file format.

    StataLog.smcl: This file includes the results of the logistic regression analysis. Both the estimated intercept and the coefficient estimates in this Stata log correspond to the raw weights and standardized weights in Figure 1.

    Brier Score_Re-Check.xlsx: This Excel file recalculates the Brier scores of the Relaxed Naïve Bayes Classifier in Table 3, showing evidence that the results displayed in Table 3 are correct.

    Full List

    • NaiveBayes_R.xlsx
    • NaiveBayes_NR.xlsx
    • Training_LnTransformed.xlsx
    • TestData.xlsx
    • Training_LnTransformed.dta
    • StataLog.smcl
    • Brier Score_Re-Check.xlsx
    • Data for Weka (Training Set): Bayes_2022_NoID
    • Data for Weka (Test Set): BayesTest_2022_NoID
    • Weka output for machine learning models (Conventional naïve Bayes, AdaBoost, Multilayer Perceptron, Logistic Regression, and Random Forest)
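
    The log-transformed ratio described for Training_LnTransformed.xlsx, ln(P(Xi|R) / P(Xi|N)), can be sketched in a few lines of Python. This only illustrates the transformation itself; the function names and the example probabilities are hypothetical and are not taken from the dataset's files.

```python
import math


def log_ratio(p_given_r, p_given_n):
    """Natural log of P(Xi|R) / P(Xi|N) for one feature value."""
    return math.log(p_given_r / p_given_n)


def transform_row(probs_r, probs_n):
    """Apply the log-ratio transform feature by feature to one record."""
    return [log_ratio(r, n) for r, n in zip(probs_r, probs_n)]


# Hypothetical conditional probabilities for three features of one record.
example = transform_row([0.6, 0.2, 0.5], [0.3, 0.4, 0.5])
```

    A positive value means the feature value is more probable among recidivated offenders, a negative value means it is more probable among non-recidivated offenders, and zero means the two probabilities are equal.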

  11. Thingvellir Site Excel Data [Magnusson]

    • arcticdata.io
    • search.dataone.org
    Updated Sep 28, 2020
    Cite
    Borgthor Magnusson (2020). Thingvellir Site Excel Data [Magnusson] [Dataset]. https://arcticdata.io/catalog/view/urn%3Auuid%3A52acd80e-4ec4-40b1-bb87-50b69637f512
    Dataset updated
    Sep 28, 2020
    Dataset provided by
    Arctic Data Center
    Authors
    Borgthor Magnusson
    Time period covered
    Jun 28, 1995 - Jun 22, 2000
    Description

    The ITEX experiment at Thingvellir was established in 1995, when control and OTC plots 1-10 were set up. Sampling of plots was then repeated in 1996, 1998 and 2000. The sampling was limited to recording of species. This dataset is in Excel format. For more information, please see the readme file.

  12. US Toolik Site 2, Cover Community Excel Data [Bret-Harte]

    • data.ucar.edu
    • search.dataone.org
    • +1more
    excel
    Updated Oct 7, 2025
    Cite
    Marion Syndonia Bret-Harte (2025). US Toolik Site 2, Cover Community Excel Data [Bret-Harte] [Dataset]. http://doi.org/10.5065/D6F18WX6
    Available download formats: excel
    Dataset updated
    Oct 7, 2025
    Authors
    Marion Syndonia Bret-Harte
    Time period covered
    Jan 1, 1995 - Dec 31, 1995
    Description

    This dataset contains cover community data from the US TOOL2 site, Alaska, in 1995. This dataset is in Excel format. For more information, please see the readme file.

  13. Navigating Stats Can Data & Scrubbing Data Clean with Excel Workshop

    • search.dataone.org
    • borealisdata.ca
    Updated Jul 31, 2024
    Cite
    Costanzo, Lucia; Jadon, Vivek (2024). Navigating Stats Can Data & Scrubbing Data Clean with Excel Workshop [Dataset]. http://doi.org/10.5683/SP3/FF6AI9
    Dataset updated
    Jul 31, 2024
    Dataset provided by
    Borealis
    Authors
    Costanzo, Lucia; Jadon, Vivek
    Description

    Ahoy, data enthusiasts! Join us for a hands-on workshop where you will hoist your sails and navigate through the Statistics Canada website, uncovering hidden treasures in the form of data tables. With the wind at your back, you’ll master the art of downloading these invaluable Stats Can datasets while braving the occasional squall of data cleaning challenges using Excel with your trusty captains Vivek and Lucia at the helm.

  14. US Toolik Site 2, Species Excel Data [Bret-Harte]

    • data.ucar.edu
    • arcticdata.io
    • +2more
    excel
    Updated Oct 7, 2025
    Cite
    Marion Syndonia Bret-Harte (2025). US Toolik Site 2, Species Excel Data [Bret-Harte] [Dataset]. http://doi.org/10.5065/D6V40SDZ
    Available download formats: excel
    Dataset updated
    Oct 7, 2025
    Authors
    Marion Syndonia Bret-Harte
    Time period covered
    Jan 1, 1995 - Dec 31, 1995
    Description

    This dataset contains the species data from the US TOOL2 site, Alaska, in 1995. This dataset is in Excel format. For more information, please see the readme file.

  15. Data from: Pilot Testing of SHRP 2 Reliability Data and Analytical Products:...

    • catalog.data.gov
    • data.bts.gov
    • +2more
    Updated Dec 7, 2023
    Cite
    Federal Highway Administration (2023). Pilot Testing of SHRP 2 Reliability Data and Analytical Products: Minnesota [supporting datasets] [Dataset]. https://catalog.data.gov/dataset/pilot-testing-of-shrp-2-reliability-data-and-analytical-products-minnesota-supporting-data
    Dataset updated
    Dec 7, 2023
    Dataset provided by
    Federal Highway Administration (https://highways.dot.gov/)
    Description

    The objective of this project was to develop system designs for programs to monitor travel time reliability and to prepare a guidebook that practitioners and others can use to design, build, operate, and maintain such systems. Generally, such travel time reliability monitoring systems will be built on top of existing traffic monitoring systems. The focus of this project was on travel time reliability. The data from the monitoring systems developed in this project, from both public and private sources, included, wherever cost-effective, information on the seven sources of non-recurring congestion. This data was used to construct performance measures or to perform various types of analyses useful for operations management as well as performance measurement, planning, and programming. The datasets in the attached ZIP file support SHRP 2 reliability project L38B, "Pilot testing of SHRP 2 reliability data and analytical products: Minnesota." This report can be accessed via the following URL: https://rosap.ntl.bts.gov/view/dot/3608. This ZIP file package, which is 22.1 MB in size, contains 6 Microsoft Excel spreadsheet files (XLSX) and 3 Comma Separated Value files (CSV). The XLSX and CSV files can be opened using Microsoft Excel 2010 and 2016. The CSV files can also be opened using most available text editing programs.

  16. NDVI data (Excel) [Oberbauer]

    • search.dataone.org
    • data.ucar.edu
    Updated Oct 22, 2016
    Cite
    Arctic Data Center (2016). NDVI data (Excel) [Oberbauer] [Dataset]. https://search.dataone.org/view/urn%3Auuid%3Aa4247f32-efd9-484f-9947-e885b5bb28b0
    Dataset updated
    Oct 22, 2016
    Dataset provided by
    Arctic Data Center
    Time period covered
    Jul 8, 1999 - Aug 15, 1999
    Description

    This dataset contains Normalized Difference Vegetation Index (NDVI) images from the 1999 growing season at the Toolik Lake Field Station, documenting differences between control and treatment plots at the study site. For more information, please see the readme file. NOTE: This dataset contains the data in Excel format.

  17. Excel spreadsheet containing detailed data matrices supporting all figures...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Jun 30, 2025
    Cite
    Zhuang, Kai; Wang, Shuzhong; Li, Huifang; Can, Dan; Zhang, Jie; Chen, Shaokun; Lin, Raozhou; Chen, Erqu; Li, Jing; Zhou, Jiechao; Liang, Chensi (2025). Excel spreadsheet containing detailed data matrices supporting all figures in the study. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0002093886
    Dataset updated
    Jun 30, 2025
    Authors
    Zhuang, Kai; Wang, Shuzhong; Li, Huifang; Can, Dan; Zhang, Jie; Chen, Shaokun; Lin, Raozhou; Chen, Erqu; Li, Jing; Zhou, Jiechao; Liang, Chensi
    Description

    Complete annotations for the tabular data are presented below.
    Tab Fig 1: (A) The heatmap data of G protein family members in the hippocampal tissue of 6-month-old Wildtype (n = 6) and 5xFAD (n = 6) mice; (B) The heatmap data of G protein family members in the cortical tissue of 6-month-old Wildtype (n = 6) and 5xFAD (n = 6) mice; (C) The data in the overlapping part of the Venn diagram (132 elements); (D) Data used to create the volcano plot; (E) Data used to create the heatmap of GPCR-related DEGs; (F) Expression of Gnb5 in the large sample dataset GSE44772; Control, n = 303; AD, n = 387; (H) Statistical analysis of Gnb5 protein levels from panel G; Wildtype, n = 4; 5xFAD, n = 4; (J) Statistical analysis of Gnb5 protein levels from panel I; Wildtype, n = 4; 5xFAD, n = 4; (L) Quantitative analysis of Gnb5 fluorescence intensity in 5xFAD and Wildtype groups; Wildtype, n = 4; 5xFAD, n = 4.
    Tab Fig 2: (D) qPCR data of Gnb5 knockout in hippocampal tissue; Gnb5F/F, n = 6; Gnb5-CCKO, n = 6; (E–I, L–N) Animal behavioral tests in mice, Gnb5F/F, n = 22; Gnb5-CCKO, n = 16; (E) Total distance traveled in the open field experiment; (F) Training curve in the Morris water maze (MWM); (F-day6) Data from the sixth day of MWM training; (G) Percentage of time spent by the mouse in the target quadrant in the MWM; (H) Statistical analysis of the number of times the mouse traverses the target quadrant in the MWM; (I) Latency to first reach the target quadrant in the MWM; (L) Baseline freezing percentage of mice in an identical testing context; (M) Percentage of freezing time of mice during the Context phase; (N) Percentage of freezing time of mice during the Cue phase.
    Tab Fig 3: (D–F, H) MWM tests in mice; Wildtype+AAV-GFP, n = 20; Wildtype+AAV-Gnb5-GFP, n = 23; 5xFAD + AAV-GFP, n = 23; 5xFAD + AAV-Gnb5-GFP, n = 26; (D) Training curve in the MWM; (D-day6) Data from the sixth day of MWM training; (E) Percentage of time spent in the target quadrant in the MWM; (F) Statistical analysis of the number of entries in the target quadrant in the MWM; (H) Movement speed of mice in the MWM; (I–K) The contextual fear conditioning test in mice; 5xFAD + AAV-GFP, n = 23; 5xFAD + AAV-Gnb5-GFP, n = 26; (I) Baseline freezing percentage of mice in an identical testing context; (J) Percentage of freezing time of mice during the Context phase; (K) Percentage of freezing time of mice during the Cue phase; (L) Total distance traveled in the open field test; (M) Percentage of time spent in the center area during the open field test.
    Tab Fig 4: (B, C) Quantification of Aβ plaques in the hippocampus sections from Wildtype and 5xFAD mice injected with either AAV-Gnb5 or AAV-GFP; Wildtype+AAV-GFP, n = 4; Wildtype+AAV-Gnb5-GFP, n = 4; 5xFAD + AAV-GFP, n = 4; 5xFAD + AAV-Gnb5-GFP, n = 4; (B) Quantification of Aβ plaque number; (C) Quantification of Aβ plaque size; (F, G) Quantification of Aβ plaques from the indicated mouse lines; WT&Gnb5F/F&CamKIIa-CreERT+Vehicle, n = 4; 5xFAD&Gnb5F/F&CamKIIa-CreERT+Vehicle, n = 4; 5xFAD&Gnb5F/F&CamKIIa-CreERT+Tamoxifen, n = 4; (F) Quantification of Aβ plaque size; (G) Quantification of Aβ plaque number.
    Tab Fig 5: (B) Overexpression of Gnb5-AAV in 5xFAD mice affects the expression of proteins related to APP cleavage (BACE1, β-CTF, Nicastrin and APP); Statistical analysis of protein levels; n = 4, respectively; (D) Tamoxifen-induced Gnb5 knockdown in 5xFAD mice affects APP-cleaving proteins; Statistical analysis of protein levels; n = 4, respectively; (F) Gnb5-CCKO mice show altered expression of APP-cleaving proteins; Statistical analysis of protein levels; n = 6, respectively.
    Tab Fig 7: (C, D) Quantification of Aβ plaques in the overexpressed full-length Gnb5, truncated fragments, and mutant truncated fragment AAV in 5xFAD mice; n = 4, respectively; (C) Quantification of Aβ plaque size; (D) Quantification of Aβ plaque number; (F) Effect of overexpressing full-length Gnb5, truncated fragments, and mutant truncated fragment viruses on the expression of proteins related to the APP cleavage process in 5xFAD; Statistical analysis of protein levels; n = 3, respectively. (XLSX)

  18. Government Equalities Office spend data July 2012 (Excel format)

    • gov.uk
    Updated Oct 4, 2012
    Cite
    Government Equalities Office (2012). Government Equalities Office spend data July 2012 (Excel format) [Dataset]. https://www.gov.uk/government/publications/government-equalities-office-spend-data-july-2012-excel-format
    Dataset updated
    Oct 4, 2012
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Government Equalities Office
    Description

    Government Equalities Office spend data July 2012 (Excel format)

    Date: Thu Oct 04 14:17:24 BST 2012


  19. Multimode Passive BMS Data Analysis

    • figshare.com
    application/x-rar
    Updated Jun 3, 2023
    Cite
    Andrej Brandis (2023). Multimode Passive BMS Data Analysis [Dataset]. http://doi.org/10.6084/m9.figshare.20059844.v1
    Available download formats: application/x-rar
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Andrej Brandis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data analysis of the project Multimode Capable Passive BMS

  20. Raw data outputs 1-18

    • bridges.monash.edu
    • researchdata.edu.au
    xlsx
    Updated May 30, 2023
    Cite
    Abbas Salavaty Hosein Abadi; Sara Alaei; Mirana Ramialison; Peter Currie (2023). Raw data outputs 1-18 [Dataset]. http://doi.org/10.26180/21259491.v1
    Available download formats: xlsx
    Dataset updated
    May 30, 2023
    Dataset provided by
    Monash University
    Authors
    Abbas Salavaty Hosein Abadi; Sara Alaei; Mirana Ramialison; Peter Currie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Raw data outputs 1-18
    Raw data output 1. Differentially expressed genes in AML CSCs compared with GTCs as well as in TCGA AML cancer samples compared with normal ones. This data was generated based on the results of AML microarray and TCGA data analysis.
    Raw data output 2. Commonly and uniquely differentially expressed genes in AML CSC/GTC microarray and TCGA bulk RNA-seq datasets. This data was generated based on the results of AML microarray and TCGA data analysis.
    Raw data output 3. Common differentially expressed genes between training and test set samples of the microarray dataset. This data was generated based on the results of AML microarray data analysis.
    Raw data output 4. Detailed information on the samples of the breast cancer microarray dataset (GSE52327) used in this study.
    Raw data output 5. Differentially expressed genes in breast CSCs compared with GTCs as well as in TCGA BRCA cancer samples compared with normal ones.
    Raw data output 6. Commonly and uniquely differentially expressed genes in breast cancer CSC/GTC microarray and TCGA BRCA bulk RNA-seq datasets. This data was generated based on the results of breast cancer microarray and TCGA BRCA data analysis. CSC and GTC are abbreviations of cancer stem cell and general tumor cell, respectively.
    Raw data output 7. Differential and common co-expression and protein-protein interaction of genes between CSC and GTC samples. This data was generated based on the results of AML microarray and STRING database-based protein-protein interaction data analysis. CSC and GTC are abbreviations of cancer stem cell and general tumor cell, respectively.
    Raw data output 8. Differentially expressed genes between AML dormant and active CSCs. This data was generated based on the results of AML scRNA-seq data analysis.
    Raw data output 9. Uniquely expressed genes in dormant or active AML CSCs. This data was generated based on the results of AML scRNA-seq data analysis.
    Raw data output 10. Intersections between the targeting transcription factors of AML key CSC genes and differentially expressed genes between AML CSCs vs GTCs and between dormant and active AML CSCs or the uniquely expressed genes in either class of CSCs.
    Raw data output 11. Targeting desirableness score of AML key CSC genes and their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section.
    Raw data output 12. CSC-specific targeting desirableness score of AML key CSC genes and their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section.
    Raw data output 13. The protein-protein interactions between AML key CSC genes with themselves and their targeting transcription factors. This data was generated based on the results of AML microarray and STRING database-based protein-protein interaction data analysis.
    Raw data output 14. The previously confirmed associations of genes having the highest targeting desirableness and CSC-specific targeting desirableness scores with AML or other cancers’ (stem) cells as well as hematopoietic stem cells. These data were generated based on a PubMed database-based literature mining.
    Raw data output 15. Drug score of available drugs and bioactive small molecules targeting AML key CSC genes and/or their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section.
    Raw data output 16. CSC-specific drug score of available drugs and bioactive small molecules targeting AML key CSC genes and/or their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section.
    Raw data output 17. Candidate drugs for experimental validation. These drugs were selected based on their respective (CSC-specific) drug scores. CSC is the abbreviation of cancer stem cell.
    Raw data output 18. Detailed information on the samples of the AML microarray dataset GSE30375 used in this study.

Cite
Eric A. Welsh; Paul A. Stewart; Brent M. Kuenzi; James A. Eschrich (2023). Escape Excel: A tool for preventing gene symbol and accession conversion errors [Dataset]. http://doi.org/10.1371/journal.pone.0185207

Escape Excel: A tool for preventing gene symbol and accession conversion errors

4 scholarly articles cite this dataset
Available download formats: xlsx
Dataset updated
Jun 5, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Eric A. Welsh; Paul A. Stewart; Brent M. Kuenzi; James A. Eschrich
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Background: Microsoft Excel automatically converts certain gene symbols, database accessions, and other alphanumeric text into dates, scientific notation, and other numerical representations. These conversions lead to subsequent, irreversible corruption of the imported text. A recent survey of popular genomic literature estimates that one-fifth of all papers with supplementary gene lists suffer from this issue.

Results: Here, we present an open-source tool, Escape Excel, which prevents these erroneous conversions by generating an escaped text file that can be safely imported into Excel. Escape Excel is implemented in a variety of formats (http://www.github.com/pstew/escape_excel), including a command line based Perl script, a Windows-only Excel Add-In, an OS X drag-and-drop application, a simple web-server, and as a Galaxy web environment interface. Test server implementations are accessible as a Galaxy interface (http://apostl.moffitt.org) and simple non-Galaxy web server (http://apostl.moffitt.org:8000/).

Conclusions: Escape Excel detects and escapes a wide variety of problematic text strings so that they are not erroneously converted into other representations upon importation into Excel. Examples of problematic strings include date-like strings, time-like strings, leading zeroes in front of numbers, and long numeric and alphanumeric identifiers that should not be automatically converted into scientific notation. It is hoped that greater awareness of these potential data corruption issues, together with diligent escaping of text files prior to importation into Excel, will help to reduce the amount of Excel-corrupted data in scientific analyses and publications.
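
To make the idea of escaping concrete, the following is a minimal Python sketch, not the Escape Excel implementation itself (which the description says is a Perl script, among other formats). The regular expressions, the names `RISKY_PATTERNS`, `escape_field` and `escape_line`, and the choice to wrap risky values as `="value"` (a formula that Excel evaluates to literal text) are all assumptions made for illustration; the actual tool's rules and output format may differ.

```python
import re

# Hypothetical patterns covering the categories named in the description.
# They are illustrative only and far from exhaustive.
_MONTHS = "JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|SEPT|OCT|NOV|DEC|MARCH"
RISKY_PATTERNS = [
    re.compile(rf"^(?:{_MONTHS})[-/ ]?\d{{1,2}}$", re.IGNORECASE),  # date-like: SEPT1, MARCH1
    re.compile(r"^\d{1,2}[-/]\d{1,2}(?:[-/]\d{2,4})?$"),            # date-like: 1/2, 2-3-2010
    re.compile(r"^\d{1,2}:\d{2}(?::\d{2})?$"),                      # time-like: 12:30
    re.compile(r"^0\d+$"),                                          # leading zeroes: 00123
    re.compile(r"^\d{12,}$"),                                       # long numeric identifiers
    re.compile(r"^\d+E\d+$", re.IGNORECASE),                        # scientific-like: 2310009E13
]


def escape_field(value: str) -> str:
    """Wrap a risky value as ="value" so Excel keeps it as text (assumed convention)."""
    if any(pattern.match(value) for pattern in RISKY_PATTERNS):
        # Double any embedded quotes, as Excel string literals require.
        return '="' + value.replace('"', '""') + '"'
    return value


def escape_line(line: str, sep: str = "\t") -> str:
    """Escape every field of one delimited line, leaving safe fields untouched."""
    return sep.join(escape_field(field) for field in line.split(sep))


if __name__ == "__main__":
    print(escape_line("SEPT1\tTP53\t00123\t2310009E13"))
```

Ordinary symbols such as `TP53` pass through unchanged, so only the fields that would otherwise be reinterpreted are rewritten.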
