Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Categorical scatterplots with R for biologists: a step-by-step guide
Benjamin Petre1, Aurore Coince2, Sophien Kamoun1
1 The Sainsbury Laboratory, Norwich, UK; 2 Earlham Institute, Norwich, UK
Weissgerber and colleagues (2015) recently stated that ‘as scientists, we urgently need to change our practices for presenting continuous data in small sample size studies’. They called for more scatterplot and boxplot representations in scientific papers, which ‘allow readers to critically evaluate continuous data’ (Weissgerber et al., 2015). In the Kamoun Lab at The Sainsbury Laboratory, we recently implemented a protocol to generate categorical scatterplots (Petre et al., 2016; Dagdas et al., 2016). Here we describe the three steps of this protocol: 1) formatting of the data set in a .csv file, 2) execution of the R script to generate the graph, and 3) export of the graph as a .pdf file.
Protocol
• Step 1: format the data set as a .csv file. Store the data in a three-column Excel file as shown in the PowerPoint slide. The first column ‘Replicate’ indicates the biological replicates; in the example, each replicate is identified by the month and year in which it was performed. The second column ‘Condition’ indicates the conditions of the experiment (in the example, a wild type and two mutants called A and B). The third column ‘Value’ contains the continuous values. Save the Excel file as a .csv file (File -> Save as -> in ‘File Format’, select .csv). This .csv file is the input file to import into R.
• Step 2: execute the R script (see Notes 1 and 2). Copy the script shown in the PowerPoint slide and paste it into the R console. Execute the script. In the dialog box, select the input .csv file from Step 1. The categorical scatterplot will appear in a separate window. Dots represent the values for each sample; colors indicate replicates. Boxplots are superimposed; black dots indicate outliers.
• Step 3: save the graph as a .pdf file. Shape the window at your convenience and save the graph as a .pdf file (File -> Save as). See the PowerPoint slide for an example.
Notes
• Note 1: install the ggplot2 package. The R script requires the package ‘ggplot2’ to be installed. To install it, Packages & Data -> Package Installer -> enter ‘ggplot2’ in the Package Search space and click on ‘Get List’. Select ‘ggplot2’ in the Package column and click on ‘Install Selected’. Install all dependencies as well.
• Note 2: use a log scale for the y-axis. To use a log scale for the y-axis of the graph, use the command line below in place of command line #7 in the script.
graph + geom_boxplot(outlier.colour='black', colour='black') + geom_jitter(aes(col=Replicate)) + scale_y_log10() + theme_bw()
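For orientation, here is a minimal sketch of what the full script might look like, assuming the column names from Step 1; the authoritative version is the one shown in the PowerPoint slide.

# Minimal sketch of the full script; illustrative, not the original
library(ggplot2)                       # see Note 1
replicates <- read.csv(file.choose())  # dialog box to select the .csv file from Step 1
graph <- ggplot(replicates, aes(x=Condition, y=Value))
graph + geom_boxplot(outlier.colour='black', colour='black') + geom_jitter(aes(col=Replicate)) + theme_bw()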
References
Dagdas YF, Belhaj K, Maqbool A, Chaparro-Garcia A, Pandey P, Petre B, et al. (2016) An effector of the Irish potato famine pathogen antagonizes a host autophagy cargo receptor. eLife 5:e10856.
Petre B, Saunders DGO, Sklenar J, Lorrain C, Krasileva KV, Win J, et al. (2016) Heterologous Expression Screens in Nicotiana benthamiana Identify a Candidate Effector of the Wheat Yellow Rust Pathogen that Associates with Processing Bodies. PLoS ONE 11(2):e0149035.
Weissgerber TL, Milic NM, Winham SJ, Garovic VD (2015) Beyond Bar and Line Graphs: Time for a New Data Presentation Paradigm. PLoS Biol 13(4):e1002128.
The Florida Flood Hub for Applied Research and Innovation and the U.S. Geological Survey have developed projected future change factors for precipitation depth-duration-frequency (DDF) curves at 242 National Oceanic and Atmospheric Administration (NOAA) Atlas 14 stations in Florida. The change factors were computed as the ratio of projected future to historical extreme-precipitation depths fitted to extreme-precipitation data from downscaled climate datasets using a constrained maximum likelihood (CML) approach as described in https://doi.org/10.3133/sir20225093. The change factors correspond to the periods 2020-59 (centered in the year 2040) and 2050-89 (centered in the year 2070) as compared to the 1966-2005 historical period.

An R script (create_boxplot.R) is provided which generates boxplots of change factors for a NOAA Atlas 14 station, or for all NOAA Atlas 14 stations in a Florida HUC-8 basin or county, for durations of interest (1, 3, and 7 days, or combinations thereof) and return periods of interest (5, 10, 25, 50, 100, 200, and 500 years, or combinations thereof). The user also has the option of requesting that the script save the raw change factor data used to generate the boxplots, as well as the processed quantile and outlier data shown in the figure. The script allows the user to modify the percentiles used in generating the boxplots. A Microsoft Word file documenting code usage and available options is also provided within this data release (Documentation_R_script_create_boxplot.docx). As described in the documentation, the R script relies on some of the Microsoft Excel spreadsheets published as part of this data release.

The script uses basins defined in the "Florida Hydrologic Unit Code (HUC) Basins (areas)" layer from the Florida Department of Environmental Protection (FDEP; https://geodata.dep.state.fl.us/datasets/FDEP::florida-hydrologic-unit-code-huc-basins-areas/explore), and their names are listed in the file basins_list.txt provided with the script. County names are listed in the file counties_list.txt provided with the script. NOAA Atlas 14 stations located in each Florida HUC-8 basin or county are defined in the Microsoft Excel spreadsheet Datasets_station_information.xlsx, which is part of this data release. Instructions are provided in the code documentation (see highlighted text on page 7 of Documentation_R_script_create_boxplot.docx) so that users can modify the script to generate boxplots for basins different from the FDEP "Florida Hydrologic Unit Code (HUC) Basins (areas)".
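To illustrate the kind of figure the script produces, here is a hedged R sketch that reads a change-factor table and draws boxplots by return period and duration; the file name and column names are assumptions for illustration, not the actual interface of create_boxplot.R.

# Illustrative sketch only: not the interface of create_boxplot.R.
# Assumed file and columns: duration_days, return_period_yr, change_factor.
library(ggplot2)
cf <- read.csv("change_factors_2040.csv")  # hypothetical raw change-factor export
ggplot(cf, aes(x = factor(return_period_yr), y = change_factor)) +
  geom_boxplot() +
  facet_wrap(~ duration_days, labeller = label_both) +
  labs(x = "Return period (years)",
       y = "Change factor (projected future / historical depth)") +
  theme_bw()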
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This repository contains the replication package for the article "Hazardous Times: Animal Spirits and U.S. Recession Probabilities." It includes all necessary R code, raw data, and processed data in the start-stop (counting process) format required to reproduce the empirical results, tables, and figures from the study.

Project Description: The study assembles monthly U.S. macroeconomic time series from the Federal Reserve Economic Data (FRED) and related sources—covering labor market conditions, consumer sentiment, term spreads, and credit spreads—and implements a novel "high water mark" methodology to measure the lead times with which these indicators signal NBER-dated recessions.

Contents:
- Code: R scripts for data cleaning, multiple imputation, survival analysis, and figure/table generation. A top-level master script (run_all.R) executes the entire analytical pipeline end-to-end.
- Data:
  - Raw/: original data pulls from primary sources.
  - Analysis_Ready/: cleaned series, constructed cycle-specific extremes (high water marks), lead time variables, and the final start-stop dataset for survival analysis, plus the final curated Excel workbooks used as direct inputs for the replication code. (Note: these Excel sheets must be saved as separate .xlsx files in the designated directory before running the R code.)
- Documentation: this README file and detailed comments within the code.

Key Details:
- Software Requirements: the replication code is written in R. A list of required R packages (with versions) is provided in the reference list of the article.
- Missing Data: addressed via Multiple Imputation by Chained Equations (MICE).
- License: the original raw data from FRED is subject to its own terms of use, which require citation. The R code is released under the MIT License. All processed data, constructed variables, and analysis-ready datasets created by the author are dedicated to the public domain under the CC0 1.0 Universal Public Domain Dedication.

Instructions:
1. Download the entire repository.
2. Install the required R packages.
3. Save the Excel sheets from the workbook "Hazardous_Times_Data.xlsx" as separate .xlsx files in the designated directory.
4. Run the master script run_all.R to fully replicate the study's analysis from the provided Analysis_Ready data. This script will regenerate all tables and figures.

Users should consult the main publication for full context, theoretical motivation, and series-specific citations.
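To illustrate the start-stop (counting process) format the dataset uses, here is a hedged R sketch of a Cox model fit on such data; the data frame and variable names are placeholders, not the study's actual specification.

# Illustrative sketch of survival analysis on start-stop data.
# Columns here are placeholders: start/stop bound each interval,
# and 'recession' flags whether the event occurs in that interval.
library(survival)
startstop_df <- data.frame(
  start       = c(0, 1, 2, 0, 1),            # interval start (months into spell)
  stop        = c(1, 2, 3, 1, 2),            # interval end
  recession   = c(0, 0, 1, 0, 1),            # event indicator
  term_spread = c(1.5, 1.1, -0.2, 2.0, 0.3)  # example covariate
)
fit <- coxph(Surv(start, stop, recession) ~ term_spread, data = startstop_df)
summary(fit)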
This repository provides access to five pre-computed reconstruction files as well as the static polygons and rotation files used to generate them. This set of palaeogeographic reconstruction files provide palaeocoordinates for three global grids at H3 resolutions 2, 3, and 4, which have an average cell spacing of ~316 km, ~119 km, and ~45 km, respectively. Grids were reconstructed at a temporal resolution of one million years throughout the entire Phanerozoic (540–0 Ma). The reconstruction files are stored as comma-separated-value (CSV) files which can be easily read by almost any spreadsheet program (e.g. Microsoft Excel and Google Sheets) or programming language (e.g. Python, Julia, and R). In addition, R Data Serialization (RDS) files—a common format for saving R objects—are also provided as lighter (and compressed) alternatives to the CSV files. The structure of the reconstruction files follows a wide-form data frame structure to ease indexing. Each file consists of three initial index columns relating to the H3 cell index (i.e. the 'H3 address'), present-day longitude of the cell centroid, and the present-day latitude of the cell centroid. The subsequent columns provide the reconstructed longitudinal and latitudinal coordinate pairs for their respective age of reconstruction in ascending order, indicated by a numerical suffix. Each row contains a unique spatial point on the Earth's continental surface reconstructed through time. NA values within the reconstruction files indicate points which are not defined in deeper time (i.e. either the static polygon does not exist at that time, or it is outside the temporal coverage as defined by the rotation file).
The following five Global Plate Models are provided (abbreviation, temporal coverage, reference) within the GPMs folder:
WR13, 0–550 Ma, (Wright et al., 2013)
MA16, 0–410 Ma, (Matthews et al., 2016)
TC16, 0–540 Ma, (Torsvik and Cocks, 2017)
SC16, 0–1100 Ma, (Scotese, 2016)
ME21, 0–1000 Ma, (Merdith et al., 2021)
In addition, the H3 grids for resolutions 2, 3, and 4 are provided within the grids folder. Finally, we also provide two scripts (Python and R) within the code folder which can be used to generate reconstructed coordinates for user data from the reconstruction files; a hedged sketch of querying the files directly follows below.
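As a hedged R sketch of how the wide-form files can be queried (the provided scripts are the supported route; the file path and the lng_/lat_ column prefixes are assumptions based on the description above):

# Illustrative only: pull reconstructed coordinates for one cell at 100 Ma.
# Assumed columns: h3_address, lng, lat, then lng_1 ... / lat_1 ... by age suffix.
recon <- readRDS("GPMs/ME21/reconstruction_res3.rds")  # hypothetical file name
age <- 100                            # target reconstruction age (Ma)
point <- recon[1, ]                   # one spatial point (one H3 cell centroid)
c(lng = point[[paste0("lng_", age)]], # reconstructed longitude (NA if undefined)
  lat = point[[paste0("lat_", age)]]) # reconstructed latitude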
For access to the code used to generate these files:
https://github.com/LewisAJones/PhanGrids
For more information, please refer to the article describing the data:
Jones, L.A. and Domeier, M.M. (2024). A Phanerozoic gridded dataset for palaeogeographic reconstructions.
For any additional queries, contact:
Lewis A. Jones (lewisa.jones@outlook.com) or Mathew M. Domeier (mathewd@uio.no)
If you use these files, please cite:
Jones, L.A. and Domeier, M.M. 2024. A Phanerozoic gridded dataset for palaeogeographic reconstructions. DOI: 10.5281/zenodo.10069221
References
Matthews, K. J., Maloney, K. T., Zahirovic, S., Williams, S. E., Seton, M., & Müller, R. D. (2016). Global plate boundary evolution and kinematics since the late Paleozoic. Global and Planetary Change, 146, 226–250. https://doi.org/10.1016/j.gloplacha.2016.10.002.
Merdith, A. S., Williams, S. E., Collins, A. S., Tetley, M. G., Mulder, J. A., Blades, M. L., Young, A., Armistead, S. E., Cannon, J., Zahirovic, S., & Müller, R. D. (2021). Extending full-plate tectonic models into deep time: Linking the Neoproterozoic and the Phanerozoic. Earth-Science Reviews, 214, 103477. https://doi.org/10.1016/j.earscirev.2020.103477.
Scotese, C. R. (2016). Tutorial: PALEOMAP paleoAtlas for GPlates and the paleoData plotter program: PALEOMAP Project, Technical Report.
Torsvik, T. H., & Cocks, L. R. M. (2017). Earth history and palaeogeography. Cambridge University Press. https://doi.org/10.1017/9781316225523.
Wright, N., Zahirovic, S., Müller, R. D., & Seton, M. (2013). Towards community-driven paleogeographic reconstructions: Integrating open-access paleogeographic and paleobiology data with plate tectonics. Biogeosciences, 10, 1529–1541. https://doi.org/10.5194/bg-10-1529-2013.
Archery is my favorite sport; I enjoy it as it keeps me calm and focused. I have played archery for over 4 years: 2 years in the UK at my local club, Epping Archers, and 2.5 years in Hong Kong with Target X.
Recently I decided to take the opportunity to enter the open competition at 18 meters with a six-ring 80 cm target face. To qualify for further competitions, you need to pass a minimum score requirement. For example, from the beginners' competition at 18 meters, I would have to score a minimum of 600 points to qualify for the 30-meter tournaments.
The type of bow I use is a recurve bow, the kind most recognisable from the Olympics. The target face is 80 cm with six scoring rings worth 10, 9, 8, 7, 6, and 5 points, with 10 being the highest score per arrow. The lowest score is 5, and an arrow that hits outside the rings scores zero points, recorded as 'M' for a miss. The competition consists of 72 arrows in total, split into 2 rounds of 36 arrows; each round has 6 ends, and one end consists of 6 arrows. At a maximum of 10 points per arrow, 72 arrows give a maximum possible score of 720 points. The minimum score required at the 18-meter shoot to qualify is 600 points.
I am using this opportunity to complete an analysis calculating the average, maximum, and minimum points scored, and then to use R, Tableau, and an Excel spreadsheet to produce graphs, charts, and a dashboard.
The data collected are my own personal scores, recorded on a handwritten sheet and then transferred to a spreadsheet saved in CSV file format. The time period is from 13th Dec 2020 to 28th Sep 2021, during which a total of 13 games were played. With Covid-19 social distancing, the games were not played consistently (i.e., not one game every Sunday).
The dataset has an index column, which is my primary key, along with the day and date. A game comprises all 72 arrows; rounds are recorded as 1 and 2, and ends as 1 to 12.
The columns shots_1, shot_2, and so on hold the actual scores per arrow, with 'M' recorded as the number 0. The total column gives the points scored in each end. The hits column gives the number of arrows that hit the six-ring target face; any arrow that does not land on the target face is not counted, as it represents a miss. The tens column gives the number of times I hit a 10.
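A hedged R sketch of the planned summary statistics, assuming the file name and a 'game' grouping column alongside the per-end 'total' column described above:

# Illustrative sketch: per-game totals and overall summary from the scoring CSV.
scores <- read.csv("archery_scores.csv")  # hypothetical file name
game_totals <- aggregate(total ~ game, data = scores, FUN = sum)  # 72-arrow game scores
mean(game_totals$total)  # average points per game across the 13 games
max(game_totals$total)   # best game
min(game_totals$total)   # lowest game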
Thank you to Target X for maintaining social distance in order to keep the club running.
This dataset reflects reported incidents of crime (with the exception of murders, where data exists for each victim) that occurred in the City of Chicago from 2001 to present, minus the most recent seven days. Data is extracted from the Chicago Police Department's CLEAR (Citizen Law Enforcement Analysis and Reporting) system. In order to protect the privacy of crime victims, addresses are shown at the block level only and specific locations are not identified. Should you have questions about this dataset, you may contact the Research & Development Division of the Chicago Police Department at 312.745.6071 or RandD@chicagopolice.org.

Disclaimer: These crimes may be based upon preliminary information supplied to the Police Department by the reporting parties that has not been verified. The preliminary crime classifications may be changed at a later date based upon additional investigation, and there is always the possibility of mechanical or human error. Therefore, the Chicago Police Department does not guarantee (either expressed or implied) the accuracy, completeness, timeliness, or correct sequencing of the information, and the information should not be used for comparison purposes over time. The Chicago Police Department will not be responsible for any error or omission, or for the use of, or the results obtained from the use of, this information. All data visualizations on maps should be considered approximate, and attempts to derive specific addresses are strictly prohibited. The Chicago Police Department is not responsible for the content of any off-site pages that are referenced by or that reference this web page other than an official City of Chicago or Chicago Police Department web page. The user specifically acknowledges that the Chicago Police Department is not responsible for any defamatory, offensive, misleading, or illegal conduct of other users, links, or third parties and that the risk of injury from the foregoing rests entirely with the user. The unauthorized use of the words "Chicago Police Department," "Chicago Police," or any colorable imitation of these words or the unauthorized use of the Chicago Police Department logo is unlawful. This web page does not, in any way, authorize such use.

Data is updated daily Tuesday through Sunday. The dataset contains more than 65,000 records/rows of data and cannot be viewed in full in Microsoft Excel. Therefore, when downloading the file, select CSV from the Export menu. Open the file in an ASCII text editor, such as WordPad, to view and search. To access a list of Chicago Police Department - Illinois Uniform Crime Reporting (IUCR) codes, go to http://data.cityofchicago.org/Public-Safety/Chicago-Police-Department-Illinois-Uniform-Crime-R/c7ck-438e
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This is the supplementary material accompanying the manuscript "Daily life in the Open Biologist’s second job, as a Data Curator", published in Wellcome Open Research.
It contains:
- Python_scripts.zip: Python scripts used for data cleaning and organization:
  - add_headers.py: adds specified headers automatically to a list of CSV files, creating new output files with a "_with_headers" suffix.
  - count_NaN_values.py: counts the total number of rows containing null values in a CSV file and prints the location of null values in (row, column) format.
  - remove_rowsNaN_file.py: removes rows containing null values from a single CSV file and saves the modified file with a "_dropNaN" suffix.
  - remove_rowsNaN_list.py: removes rows containing null values from a list of CSV files and saves the modified files with a "_dropNaN" suffix.
- README_template.txt: a template for a README file to be used to describe and accompany a dataset.
- template_for_source_data_information.xlsx: a spreadsheet to help manuscript authors keep track of the data used for each figure (e.g., information about data location and links to dataset descriptions).
- Supplementary_Figure_1.tif: an example of a dataset shared by us on Zenodo. The elements that make the dataset FAIR are indicated by the respective letters. Findability (F) is achieved by the dataset's unique and persistent identifier (DOI), as well as by the related identifiers for the publication and dataset on GitHub. Additionally, the dataset is described with rich metadata (e.g., keywords). Accessibility (A) is achieved by the ease of visualization and downloading using a standardised communications protocol (https). Also, the metadata are publicly accessible and licensed under the public domain. Interoperability (I) is achieved by the open formats used (CSV; R), and metadata are harvestable using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), a low-barrier mechanism for repository interoperability. Reusability (R) is achieved by the complete description of the data with metadata in README files and links to the related publication (which contains more detailed information, as well as links to protocols on protocols.io). The dataset has a clear and accessible data usage license (CC-BY 4.0).
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
The 2014-15 Budget is officially available at budget.gov.au as the authoritative source of Budget Papers (BPs) and Portfolio Budget Statement (PBS) documents. This dataset is a collection of data sources from the 2014-15 Budget, including:

* Selected Tables from the Budget Papers – available after Budget Papers published on budget.gov.au (~7.30pm Budget night).
* Machine Readable Tables from Budget Papers – available after Budget Papers published on budget.gov.au (~7.30pm Budget night).
* The Portfolio Budget Statement Excel spreadsheets – available after PBSs tabled in the Senate (~8.30pm Budget night).
* A Machine Readable CSV of all PBS Excel Spreadsheet Line Items – available after PBSs tabled in the Senate and translated (~8.30pm Budget night). First version published with 10 portfolios ~8:50pm Budget night. Work due to be completed by midday 14 May 2014.

Data from the 2014-15 Budget are provided to assist those who wish to analyse, visualise and programmatically access the 2014-15 Budget. It is the first time this has been done, as per our announcement blog post. We intend to move further down the digital-by-default route to make the 2015-16 Budget more accessible and reusable in data form. We welcome your feedback and comments below. Data users should refer to footnotes and memoranda in the original files, as these are not usually captured in machine readable CSVs.

This dataset was prepared by the Department of Finance and the Department of the Treasury.

Information about the PBS Excel files and CSV
--------------------------------------------------------
The PBS Excel files published should include the following financial tables with headings and footnotes. Only the line item data (Table 2.2) is available in CSV at this stage, as we thought this would be the most useful PBS data to extract. Much of the other data is also available in Budget Papers 1 and 4 in aggregate form:

* Table 1.1: Agency Resource Statement;
* Table 1.2: Agency 2014-15 Budget Measures;
* Table 2.1: Budgeted Expenses for Outcome X;
* Table 2.2: Programme Expenses and Programme Components;
* Table 3.1.1: Movement of Administered Funds Between Years;
* Table 3.1.2: Estimates of Special Account Flows and Balances;
* Table 3.1.3: Australian Government Indigenous Expenditure (AGIE);
* Tables 3.2.1 to 3.2.6: Departmental Budgeted Financial Statements; and
* Tables 3.2.7 to 3.2.11: Administered Budgeted Financial Statements.

Please note, the total expenses reported in the CSV file ‘2014-15 PBS line items dataset’ were compiled from individual agency programme expense tables. Totalling these figures does not produce the total expense figure in ‘Table 1: Estimates of general government expenses’ (Statement 6, Budget Paper 1). Differences relate to:

1. Intra-agency charging for services, which is eliminated for the reporting of general government financial statements;
2. Agency expenses that involve revaluation of assets and liabilities, which are reported as other economic flows in general government financial statements; and
3. Additional agencies' expenses that are included in general government sector expenses (e.g. Australian Strategic Policy Institute Limited and other entities), noting that only agencies that are directly government funded are required to prepare a PBS.

At this stage, the following Portfolios have contributed their PBS Excel files and are included in the line item CSV: 1.1 Agriculture Portfolio; 1.2 Attorney-General’s Portfolio; 1.3 Communications Portfolio; 1.4A Defence Portfolio; 1.4B Defence Portfolio (Department of Veterans’ Affairs); 1.5 Education Portfolio; 1.6 Employment Portfolio; 1.7 Environment Portfolio; 1.8 Finance Portfolio; 1.9 Foreign Affairs and Trade Portfolio; 1.10 Health Portfolio; 1.11 Immigration and Border Protection Portfolio; 1.12 Industry Portfolio; 1.13 Infrastructure and Regional Development Portfolio; 1.14 Prime Minister and Cabinet Portfolio; 1.15A Social Services Portfolio; 1.15B Social Services Portfolio (Department of Human Services); 1.16 Treasury Portfolio; 1.17A Department of the House of Representatives; 1.17B Department of the Senate; 1.17C Department of Parliamentary Services; and 1.17D Department of the Parliamentary Budget Office.

The original PBS Excel files and published documents include sub-totals and totals by agency and appropriation type, which are not included in the line item CSV as these can be calculated programmatically. Where modifications are identified they will be updated as required. If a corrigendum to an agency's PBS is issued after Budget night, tables will be updated as necessary.

Below is the CSV structure of the line item CSV. The data transformation is expected to be complete by midday 14 May, so we have put up the incomplete CSV, which will be updated as additional PBSs are transformed into data form. Please keep refreshing for now.

Portfolio, Department/Agency, Outcome, Program, Expense type, Appropriation type, Description, 2012-13, 2013-14, 2014-15, 2015-16, 2016-17, Source document, Source table, URL

Budget Paper Tables
--------------------------
We have made a number of data tables from Budget Papers 1 and 4 available in their original format as Excel or XML files. We have transformed a number of these into machine readable format (as prioritised by several users of budget data), which will be published here as they are ready. Below is the list of the tables published and whether we’ve translated them into CSV form this year:

* Budget Paper 1 Overview Appendix C Major Initiatives (XLSX, CSV)
* Budget Paper 1 Overview Appendix D Major Savings (XLSX, CSV)
* Budget Paper 1 Statement 3 Table 5 Reconciliation of underlying cash balance estimates (XLSX, no CSV due to complexity)
* Budget Paper 1 Statement 5 Table 1 Australian Government general government receipts (XLSX, CSV)
* Budget Paper 1 Statement 5 Table 8 Australian Government general government (cash) receipts (XLSX, CSV)
* Budget Paper 1 Statement 5 (online tables) Australian Government (accrual) revenue (XLSX, CSV)
* Budget Paper 1 Statement 5 Table 11 Reconciliation of 2014-15 general government (accrual) revenue (XLSX, CSV)
* Budget Paper 1 Statement 6 Appendix A Table A1 Estimates of expenses by function and sub-function (XLSX, CSV)
* Budget Paper 1 Statement 10 Table 1 Australian Government general government sector receipts, payments, net Future Fund earnings and underlying cash balance (XLSX, CSV)
* Budget Paper 1 Statement 10 Table 4 Australian Government general government sector taxation receipts, non-taxation receipts and total receipts (XLSX, CSV)
* Budget Paper 1 Statement 10 Table 5 Australian Government general government sector net debt and net interest payments (XLSX, CSV)
* Budget Paper 4 Table 1.1 Agency Resourcing (XLS, CSV coming)
* Budget Paper 4 Table 1.2 Special Appropriations (XLSX, CSV coming)
* Budget Paper 4 Table 1.3 Special Accounts (XLSX, CSV)
* Budget Paper 4 Table 2.2 Average Staffing Table (XLSX, CSV coming)
* Budget Paper 4 Table 3.1 Departmental Expenses (XLSX, CSV coming)
* Budget Paper 4 Table 3.2 Net Capital Investment (XLSX, CSV coming)
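For instance, a hedged R sketch of reading the line item CSV and totalling 2014-15 expenses by portfolio; the file name is an assumption, and the column names follow the CSV structure listed above.

# Illustrative sketch: aggregate 2014-15 PBS line item expenses by portfolio.
# check.names = FALSE preserves the '2014-15' column name as-is.
pbs <- read.csv("2014-15_PBS_line_items.csv", check.names = FALSE)  # hypothetical file name
totals <- aggregate(`2014-15` ~ Portfolio, data = pbs, FUN = sum)
totals[order(-totals$`2014-15`), ]  # portfolios ranked by total 2014-15 expenses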
Infaunal marine invertebrates were collected from inside and outside of patches of white bacterial mats from several sites in the Windmill Islands, Antarctica, around Casey station during the 2006-07 summer. Samples were collected from McGrady Cove inner and outer, the tide gauge near the Casey wharf, Stevenson's Cove and Brown Bay inner. Sediment cores of 10cm depth and 5cm diameter were collected by divers using a PVC corer from inside (4 cores) and outside (4 cores) each bacterial patch. The size of each patch varied from site to site. Cores were sieved at 500 microns and the extracted fauna preserved in 4 percent neutral buffered formalin. All fauna were counted and identified to species where possible or assigned to morphospecies based on previous infaunal sampling around Casey.
An Excel spreadsheet is available for download at the URL given below. The spreadsheet does not represent the complete dataset; it is only the bacterial mat infauna data.
Regarding the infauna dataset:
This work was completed as part of ASAC 2201 (ASAC_2201).