Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Categorical scatterplots with R for biologists: a step-by-step guide
Benjamin Petre1, Aurore Coince2, Sophien Kamoun1
1 The Sainsbury Laboratory, Norwich, UK; 2 Earlham Institute, Norwich, UK
Weissgerber and colleagues (2015) recently stated that ‘as scientists, we urgently need to change our practices for presenting continuous data in small sample size studies’. They called for more scatterplot and boxplot representations in scientific papers, which ‘allow readers to critically evaluate continuous data’ (Weissgerber et al., 2015). In the Kamoun Lab at The Sainsbury Laboratory, we recently implemented a protocol to generate categorical scatterplots (Petre et al., 2016; Dagdas et al., 2016). Here we describe the three steps of this protocol: 1) formatting of the data set in a .csv file, 2) execution of the R script to generate the graph, and 3) export of the graph as a .pdf file.
Protocol
• Step 1: format the data set as a .csv file. Store the data in a three-column Excel file as shown in the PowerPoint slide. The first column ‘Replicate’ indicates the biological replicates; in the example, each replicate is identified by the month and year in which it was performed. The second column ‘Condition’ indicates the conditions of the experiment (in the example, a wild type and two mutants called A and B). The third column ‘Value’ contains the continuous values. Save the Excel file as a .csv file (File -> Save as -> in ‘File Format’, select .csv). This .csv file is the input file to import into R.
• Step 2: execute the R script (see Notes 1 and 2). Copy the script shown in the PowerPoint slide and paste it into the R console. Execute the script. In the dialog box, select the input .csv file from Step 1. The categorical scatterplot will appear in a separate window. Dots represent the values for each sample; colors indicate replicates. Boxplots are superimposed; black dots indicate outliers.
• Step 3: save the graph as a .pdf file. Shape the window at your convenience and save the graph as a .pdf file (File -> Save as). See the PowerPoint slide for an example.
Notes
• Note 1: install the ggplot2 package. The R script requires the package ‘ggplot2’ to be installed. To install it, Packages & Data -> Package Installer -> enter ‘ggplot2’ in the Package Search space and click on ‘Get List’. Select ‘ggplot2’ in the Package column and click on ‘Install Selected’. Install all dependencies as well.
• Note 2: use a log scale for the y-axis. To use a log scale for the y-axis of the graph, use the command line below in place of command line #7 in the script.
graph + geom_boxplot(outlier.colour='black', colour='black') + geom_jitter(aes(col=Replicate)) + scale_y_log10() + theme_bw()
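For orientation, here is a minimal sketch of what the full script might look like, assuming the column names from Step 1; the authoritative version is the one shown in the PowerPoint slide.

# Minimal sketch of the full script; illustrative, not the original
library(ggplot2)                       # see Note 1
replicates <- read.csv(file.choose())  # dialog box to select the .csv file from Step 1
graph <- ggplot(replicates, aes(x=Condition, y=Value))
graph + geom_boxplot(outlier.colour='black', colour='black') + geom_jitter(aes(col=Replicate)) + theme_bw()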
References
Dagdas YF, Belhaj K, Maqbool A, Chaparro-Garcia A, Pandey P, Petre B, et al. (2016) An effector of the Irish potato famine pathogen antagonizes a host autophagy cargo receptor. eLife 5:e10856.
Petre B, Saunders DGO, Sklenar J, Lorrain C, Krasileva KV, Win J, et al. (2016) Heterologous Expression Screens in Nicotiana benthamiana Identify a Candidate Effector of the Wheat Yellow Rust Pathogen that Associates with Processing Bodies. PLoS ONE 11(2):e0149035.
Weissgerber TL, Milic NM, Winham SJ, Garovic VD (2015) Beyond Bar and Line Graphs: Time for a New Data Presentation Paradigm. PLoS Biol 13(4):e1002128.
The Florida Flood Hub for Applied Research and Innovation and the U.S. Geological Survey have developed projected future change factors for precipitation depth-duration-frequency (DDF) curves at 242 National Oceanic and Atmospheric Administration (NOAA) Atlas 14 stations in Florida. The change factors were computed as the ratio of projected future to historical extreme-precipitation depths fitted to extreme-precipitation data from downscaled climate datasets using a constrained maximum likelihood (CML) approach as described in https://doi.org/10.3133/sir20225093. The change factors correspond to the periods 2020-59 (centered in the year 2040) and 2050-89 (centered in the year 2070) as compared to the 1966-2005 historical period.

An R script (create_boxplot.R) is provided which generates boxplots of change factors for a NOAA Atlas 14 station, or for all NOAA Atlas 14 stations in a Florida HUC-8 basin or county, for durations of interest (1, 3, and 7 days, or combinations thereof) and return periods of interest (5, 10, 25, 50, 100, 200, and 500 years, or combinations thereof). The user also has the option of requesting that the script save the raw change factor data used to generate the boxplots, as well as the processed quantile and outlier data shown in the figure. The script allows the user to modify the percentiles used in generating the boxplots. A Microsoft Word file documenting code usage and available options is also provided within this data release (Documentation_R_script_create_boxplot.docx). As described in the documentation, the R script relies on some of the Microsoft Excel spreadsheets published as part of this data release.

The script uses basins defined in the "Florida Hydrologic Unit Code (HUC) Basins (areas)" layer from the Florida Department of Environmental Protection (FDEP; https://geodata.dep.state.fl.us/datasets/FDEP::florida-hydrologic-unit-code-huc-basins-areas/explore), and their names are listed in the file basins_list.txt provided with the script. County names are listed in the file counties_list.txt provided with the script. NOAA Atlas 14 stations located in each Florida HUC-8 basin or county are defined in the Microsoft Excel spreadsheet Datasets_station_information.xlsx, which is part of this data release. Instructions are provided in the code documentation (see highlighted text on page 7 of Documentation_R_script_create_boxplot.docx) so that users can modify the script to generate boxplots for basins different from the FDEP "Florida Hydrologic Unit Code (HUC) Basins (areas)".
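To illustrate the kind of figure the script produces, here is a hedged R sketch that reads a change-factor table and draws boxplots by return period and duration; the file name and column names are assumptions for illustration, not the actual interface of create_boxplot.R.

# Illustrative sketch only: not the interface of create_boxplot.R.
# Assumed file and columns: duration_days, return_period_yr, change_factor.
library(ggplot2)
cf <- read.csv("change_factors_2040.csv")  # hypothetical raw change-factor export
ggplot(cf, aes(x = factor(return_period_yr), y = change_factor)) +
  geom_boxplot() +
  facet_wrap(~ duration_days, labeller = label_both) +
  labs(x = "Return period (years)",
       y = "Change factor (projected future / historical depth)") +
  theme_bw()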
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This repository contains the replication package for the article "Hazardous Times: Animal Spirits and U.S. Recession Probabilities." It includes all necessary R code, raw data, and processed data in the start-stop (counting process) format required to reproduce the empirical results, tables, and figures from the study.

Project Description: The study assembles monthly U.S. macroeconomic time series from the Federal Reserve Economic Data (FRED) and related sources—covering labor market conditions, consumer sentiment, term spreads, and credit spreads—and implements a novel "high water mark" methodology to measure the lead times with which these indicators signal NBER-dated recessions.

Contents:
- Code: R scripts for data cleaning, multiple imputation, survival analysis, and figure/table generation. A top-level master script (run_all.R) executes the entire analytical pipeline end-to-end.
- Data:
  - Raw/: original data pulls from primary sources.
  - Analysis_Ready/: cleaned series, constructed cycle-specific extremes (high water marks), lead time variables, and the final start-stop dataset for survival analysis, plus the final curated Excel workbooks used as direct inputs for the replication code. (Note: these Excel sheets must be saved as separate .xlsx files in the designated directory before running the R code.)
- Documentation: this README file and detailed comments within the code.

Key Details:
- Software Requirements: the replication code is written in R. A list of required R packages (with versions) is provided in the reference list of the article.
- Missing Data: addressed via Multiple Imputation by Chained Equations (MICE).
- License: the original raw data from FRED is subject to its own terms of use, which require citation. The R code is released under the MIT License. All processed data, constructed variables, and analysis-ready datasets created by the author are dedicated to the public domain under the CC0 1.0 Universal Public Domain Dedication.

Instructions:
1. Download the entire repository.
2. Install the required R packages.
3. Save the Excel sheets from the workbook "Hazardous_Times_Data.xlsx" as separate .xlsx files in the designated directory.
4. Run the master script run_all.R to fully replicate the study's analysis from the provided Analysis_Ready data. This script will regenerate all tables and figures.

Users should consult the main publication for full context, theoretical motivation, and series-specific citations.
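To illustrate the start-stop (counting process) format the dataset uses, here is a hedged R sketch of a Cox model fit on such data; the data frame and variable names are placeholders, not the study's actual specification.

# Illustrative sketch of survival analysis on start-stop data.
# Columns here are placeholders: start/stop bound each interval,
# and 'recession' flags whether the event occurs in that interval.
library(survival)
startstop_df <- data.frame(
  start       = c(0, 1, 2, 0, 1),            # interval start (months into spell)
  stop        = c(1, 2, 3, 1, 2),            # interval end
  recession   = c(0, 0, 1, 0, 1),            # event indicator
  term_spread = c(1.5, 1.1, -0.2, 2.0, 0.3)  # example covariate
)
fit <- coxph(Surv(start, stop, recession) ~ term_spread, data = startstop_df)
summary(fit)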
This repository provides access to five pre-computed reconstruction files as well as the static polygons and rotation files used to generate them. This set of palaeogeographic reconstruction files provide palaeocoordinates for three global grids at H3 resolutions 2, 3, and 4, which have an average cell spacing of ~316 km, ~119 km, and ~45 km, respectively. Grids were reconstructed at a temporal resolution of one million years throughout the entire Phanerozoic (540–0 Ma). The reconstruction files are stored as comma-separated-value (CSV) files which can be easily read by almost any spreadsheet program (e.g. Microsoft Excel and Google Sheets) or programming language (e.g. Python, Julia, and R). In addition, R Data Serialization (RDS) files—a common format for saving R objects—are also provided as lighter (and compressed) alternatives to the CSV files. The structure of the reconstruction files follows a wide-form data frame structure to ease indexing. Each file consists of three initial index columns relating to the H3 cell index (i.e. the 'H3 address'), present-day longitude of the cell centroid, and the present-day latitude of the cell centroid. The subsequent columns provide the reconstructed longitudinal and latitudinal coordinate pairs for their respective age of reconstruction in ascending order, indicated by a numerical suffix. Each row contains a unique spatial point on the Earth's continental surface reconstructed through time. NA values within the reconstruction files indicate points which are not defined in deeper time (i.e. either the static polygon does not exist at that time, or it is outside the temporal coverage as defined by the rotation file).
The following five Global Plate Models are provided (abbreviation, temporal coverage, reference) within the GPMs folder:
WR13, 0–550 Ma, (Wright et al., 2013)
MA16, 0–410 Ma, (Matthews et al., 2016)
TC16, 0–540 Ma, (Torsvik and Cocks, 2017)
SC16, 0–1100 Ma, (Scotese, 2016)
ME21, 0–1000 Ma, (Merdith et al., 2021)
In addition, the H3 grids for resolutions 2, 3, and 4 are provided within the grids folder. Finally, we also provide two scripts (Python and R) within the code folder which can be used to generate reconstructed coordinates for user data from the reconstruction files; a hedged sketch of querying the files directly follows below.
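As a hedged R sketch of how the wide-form files can be queried (the provided scripts are the supported route; the file path and the lng_/lat_ column prefixes are assumptions based on the description above):

# Illustrative only: pull reconstructed coordinates for one cell at 100 Ma.
# Assumed columns: h3_address, lng, lat, then lng_1 ... / lat_1 ... by age suffix.
recon <- readRDS("GPMs/ME21/reconstruction_res3.rds")  # hypothetical file name
age <- 100                            # target reconstruction age (Ma)
point <- recon[1, ]                   # one spatial point (one H3 cell centroid)
c(lng = point[[paste0("lng_", age)]], # reconstructed longitude (NA if undefined)
  lat = point[[paste0("lat_", age)]]) # reconstructed latitude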
For access to the code used to generate these files:
https://github.com/LewisAJones/PhanGrids
For more information, please refer to the article describing the data:
Jones, L.A. and Domeier, M.M. (2024). A Phanerozoic gridded dataset for palaeogeographic reconstructions.
For any additional queries, contact:
Lewis A. Jones (lewisa.jones@outlook.com) or Mathew M. Domeier (mathewd@uio.no)
If you use these files, please cite:
Jones, L.A. and Domeier, M.M. 2024. A Phanerozoic gridded dataset for palaeogeographic reconstructions. DOI: 10.5281/zenodo.10069221
References
Matthews, K. J., Maloney, K. T., Zahirovic, S., Williams, S. E., Seton, M., & Müller, R. D. (2016). Global plate boundary evolution and kinematics since the late Paleozoic. Global and Planetary Change, 146, 226–250. https://doi.org/10.1016/j.gloplacha.2016.10.002.
Merdith, A. S., Williams, S. E., Collins, A. S., Tetley, M. G., Mulder, J. A., Blades, M. L., Young, A., Armistead, S. E., Cannon, J., Zahirovic, S., & Müller, R. D. (2021). Extending full-plate tectonic models into deep time: Linking the Neoproterozoic and the Phanerozoic. Earth-Science Reviews, 214, 103477. https://doi.org/10.1016/j.earscirev.2020.103477.
Scotese, C. R. (2016). Tutorial: PALEOMAP paleoAtlas for GPlates and the paleoData plotter program: PALEOMAP Project, Technical Report.
Torsvik, T. H., & Cocks, L. R. M. (2017). Earth history and palaeogeography. Cambridge University Press. https://doi.org/10.1017/9781316225523.
Wright, N., Zahirovic, S., Müller, R. D., & Seton, M. (2013). Towards community-driven paleogeographic reconstructions: Integrating open-access paleogeographic and paleobiology data with plate tectonics. Biogeosciences, 10, 1529–1541. https://doi.org/10.5194/bg-10-1529-2013.
Archery is my favorite sport; I enjoy it as it keeps me calm and focused. I have played archery for over 4 years: 2 years in the UK at my local club, Epping Archers, and 2.5 years in Hong Kong with Target X.
Recently I decided to take the opportunity to enter the open competition at 18 meters with a six-ring 80 cm target face. To qualify for further competitions, you need to pass a minimum score requirement. For example, from the beginners' competition at 18 meters, I would have to score a minimum of 600 points to qualify for the 30-meter tournaments.
The type of bow I use is a recurve bow, the kind most recognisable from the Olympics. The target face is 80 cm with six scoring rings worth 10, 9, 8, 7, 6, and 5 points, with 10 being the highest score per arrow. The lowest score is 5, and an arrow that hits outside the rings scores zero points, recorded as 'M' for a miss. The competition consists of 72 arrows in total, split into 2 rounds of 36 arrows; each round has 6 ends, and one end consists of 6 arrows. At a maximum of 10 points per arrow, 72 arrows give a maximum possible score of 720 points. The minimum score required at the 18-meter shoot to qualify is 600 points.
I am using this opportunity to complete an analysis calculating the average, maximum, and minimum points scored, and then to use R, Tableau, and an Excel spreadsheet to produce graphs, charts, and a dashboard.
The data collected are my own personal scores, recorded on a handwritten sheet and then transferred to a spreadsheet saved in CSV file format. The time period is from 13th Dec 2020 to 28th Sep 2021, during which a total of 13 games were played. With Covid-19 social distancing, the games were not played consistently (i.e., not one game every Sunday).
The dataset has an index column, which is my primary key, along with the day and date. A game comprises all 72 arrows; rounds are recorded as 1 and 2, and ends as 1 to 12.
The columns shots_1, shot_2, and so on hold the actual scores per arrow, with 'M' recorded as the number 0. The total column gives the points scored in each end. The hits column gives the number of arrows that hit the six-ring target face; any arrow that does not land on the target face is not counted, as it represents a miss. The tens column gives the number of times I hit a 10.
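A hedged R sketch of the planned summary statistics, assuming the file name and a 'game' grouping column alongside the per-end 'total' column described above:

# Illustrative sketch: per-game totals and overall summary from the scoring CSV.
scores <- read.csv("archery_scores.csv")  # hypothetical file name
game_totals <- aggregate(total ~ game, data = scores, FUN = sum)  # 72-arrow game scores
mean(game_totals$total)  # average points per game across the 13 games
max(game_totals$total)   # best game
min(game_totals$total)   # lowest game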
Thank you to Target X for maintaining social distance in order to keep the club running.
This dataset reflects reported incidents of crime (with the exception of murders, where data exists for each victim) that occurred in the City of Chicago from 2001 to present, minus the most recent seven days. Data is extracted from the Chicago Police Department's CLEAR (Citizen Law Enforcement Analysis and Reporting) system. In order to protect the privacy of crime victims, addresses are shown at the block level only and specific locations are not identified. Should you have questions about this dataset, you may contact the Research & Development Division of the Chicago Police Department at 312.745.6071 or RandD@chicagopolice.org.

Disclaimer: These crimes may be based upon preliminary information supplied to the Police Department by the reporting parties that has not been verified. The preliminary crime classifications may be changed at a later date based upon additional investigation, and there is always the possibility of mechanical or human error. Therefore, the Chicago Police Department does not guarantee (either expressed or implied) the accuracy, completeness, timeliness, or correct sequencing of the information, and the information should not be used for comparison purposes over time. The Chicago Police Department will not be responsible for any error or omission, or for the use of, or the results obtained from the use of, this information. All data visualizations on maps should be considered approximate, and attempts to derive specific addresses are strictly prohibited. The Chicago Police Department is not responsible for the content of any off-site pages that are referenced by or that reference this web page other than an official City of Chicago or Chicago Police Department web page. The user specifically acknowledges that the Chicago Police Department is not responsible for any defamatory, offensive, misleading, or illegal conduct of other users, links, or third parties and that the risk of injury from the foregoing rests entirely with the user. The unauthorized use of the words "Chicago Police Department," "Chicago Police," or any colorable imitation of these words or the unauthorized use of the Chicago Police Department logo is unlawful. This web page does not, in any way, authorize such use.

Data is updated daily Tuesday through Sunday. The dataset contains more than 65,000 records/rows of data and cannot be viewed in full in Microsoft Excel. Therefore, when downloading the file, select CSV from the Export menu. Open the file in an ASCII text editor, such as WordPad, to view and search. To access a list of Chicago Police Department - Illinois Uniform Crime Reporting (IUCR) codes, go to http://data.cityofchicago.org/Public-Safety/Chicago-Police-Department-Illinois-Uniform-Crime-R/c7ck-438e
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This is the supplementary material accompanying the manuscript "Daily life in the Open Biologist’s second job, as a Data Curator", published in Wellcome Open Research.
It contains:
- Python_scripts.zip: Python scripts used for data cleaning and organization:
  - add_headers.py: adds specified headers automatically to a list of CSV files, creating new output files with a "_with_headers" suffix.
  - count_NaN_values.py: counts the total number of rows containing null values in a CSV file and prints the location of null values in (row, column) format.
  - remove_rowsNaN_file.py: removes rows containing null values from a single CSV file and saves the modified file with a "_dropNaN" suffix.
  - remove_rowsNaN_list.py: removes rows containing null values from a list of CSV files and saves the modified files with a "_dropNaN" suffix.
- README_template.txt: a template for a README file to be used to describe and accompany a dataset.
- template_for_source_data_information.xlsx: a spreadsheet to help manuscript authors keep track of the data used for each figure (e.g., information about data location and links to dataset descriptions).
- Supplementary_Figure_1.tif: an example of a dataset shared by us on Zenodo. The elements that make the dataset FAIR are indicated by the respective letters. Findability (F) is achieved by the dataset's unique and persistent identifier (DOI), as well as by the related identifiers for the publication and dataset on GitHub. Additionally, the dataset is described with rich metadata (e.g., keywords). Accessibility (A) is achieved by the ease of visualization and downloading using a standardised communications protocol (https). Also, the metadata are publicly accessible and licensed under the public domain. Interoperability (I) is achieved by the open formats used (CSV; R), and metadata are harvestable using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), a low-barrier mechanism for repository interoperability. Reusability (R) is achieved by the complete description of the data with metadata in README files and links to the related publication (which contains more detailed information, as well as links to protocols on protocols.io). The dataset has a clear and accessible data usage license (CC-BY 4.0).
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
The 2014-15 Budget is officially available at budget.gov.au as the authoritative source of Budget Papers (BPs) and Portfolio Budget Statement (PBS) documents. This dataset is a collection of data sources from the 2014-15 Budget, including:

* Selected Tables from the Budget Papers – available after Budget Papers published on budget.gov.au (~7.30pm Budget night).
* Machine Readable Tables from Budget Papers – available after Budget Papers published on budget.gov.au (~7.30pm Budget night).
* The Portfolio Budget Statement Excel spreadsheets – available after PBSs tabled in the Senate (~8.30pm Budget night).
* A Machine Readable CSV of all PBS Excel Spreadsheet Line Items – available after PBSs tabled in the Senate and translated (~8.30pm Budget night). First version published with 10 portfolios ~8:50pm Budget night. Work due to be completed by midday 14 May 2014.

Data from the 2014-15 Budget are provided to assist those who wish to analyse, visualise and programmatically access the 2014-15 Budget. It is the first time this has been done, as per our announcement blog post. We intend to move further down the digital-by-default route to make the 2015-16 Budget more accessible and reusable in data form. We welcome your feedback and comments below. Data users should refer to footnotes and memoranda in the original files, as these are not usually captured in machine readable CSVs.

This dataset was prepared by the Department of Finance and the Department of the Treasury.

Information about the PBS Excel files and CSV
--------------------------------------------------------
The PBS Excel files published should include the following financial tables with headings and footnotes. Only the line item data (Table 2.2) is available in CSV at this stage, as we thought this would be the most useful PBS data to extract. Much of the other data is also available in Budget Papers 1 and 4 in aggregate form:

* Table 1.1: Agency Resource Statement;
* Table 1.2: Agency 2014-15 Budget Measures;
* Table 2.1: Budgeted Expenses for Outcome X;
* Table 2.2: Programme Expenses and Programme Components;
* Table 3.1.1: Movement of Administered Funds Between Years;
* Table 3.1.2: Estimates of Special Account Flows and Balances;
* Table 3.1.3: Australian Government Indigenous Expenditure (AGIE);
* Tables 3.2.1 to 3.2.6: Departmental Budgeted Financial Statements; and
* Tables 3.2.7 to 3.2.11: Administered Budgeted Financial Statements.

Please note, the total expenses reported in the CSV file ‘2014-15 PBS line items dataset’ were compiled from individual agency programme expense tables. Totalling these figures does not produce the total expense figure in ‘Table 1: Estimates of general government expenses’ (Statement 6, Budget Paper 1). Differences relate to:

1. Intra-agency charging for services, which is eliminated for the reporting of general government financial statements;
2. Agency expenses that involve revaluation of assets and liabilities, which are reported as other economic flows in general government financial statements; and
3. Additional agencies' expenses that are included in general government sector expenses (e.g. Australian Strategic Policy Institute Limited and other entities), noting that only agencies that are directly government funded are required to prepare a PBS.

At this stage, the following Portfolios have contributed their PBS Excel files and are included in the line item CSV: 1.1 Agriculture Portfolio; 1.2 Attorney-General’s Portfolio; 1.3 Communications Portfolio; 1.4A Defence Portfolio; 1.4B Defence Portfolio (Department of Veterans’ Affairs); 1.5 Education Portfolio; 1.6 Employment Portfolio; 1.7 Environment Portfolio; 1.8 Finance Portfolio; 1.9 Foreign Affairs and Trade Portfolio; 1.10 Health Portfolio; 1.11 Immigration and Border Protection Portfolio; 1.12 Industry Portfolio; 1.13 Infrastructure and Regional Development Portfolio; 1.14 Prime Minister and Cabinet Portfolio; 1.15A Social Services Portfolio; 1.15B Social Services Portfolio (Department of Human Services); 1.16 Treasury Portfolio; 1.17A Department of the House of Representatives; 1.17B Department of the Senate; 1.17C Department of Parliamentary Services; and 1.17D Department of the Parliamentary Budget Office.

The original PBS Excel files and published documents include sub-totals and totals by agency and appropriation type, which are not included in the line item CSV as these can be calculated programmatically. Where modifications are identified they will be updated as required. If a corrigendum to an agency's PBS is issued after Budget night, tables will be updated as necessary.

Below is the CSV structure of the line item CSV. The data transformation is expected to be complete by midday 14 May, so we have put up the incomplete CSV, which will be updated as additional PBSs are transformed into data form. Please keep refreshing for now.

Portfolio, Department/Agency, Outcome, Program, Expense type, Appropriation type, Description, 2012-13, 2013-14, 2014-15, 2015-16, 2016-17, Source document, Source table, URL

Budget Paper Tables
--------------------------
We have made a number of data tables from Budget Papers 1 and 4 available in their original format as Excel or XML files. We have transformed a number of these into machine readable format (as prioritised by several users of budget data), which will be published here as they are ready. Below is the list of the tables published and whether we’ve translated them into CSV form this year:

* Budget Paper 1 Overview Appendix C Major Initiatives (XLSX, CSV)
* Budget Paper 1 Overview Appendix D Major Savings (XLSX, CSV)
* Budget Paper 1 Statement 3 Table 5 Reconciliation of underlying cash balance estimates (XLSX, no CSV due to complexity)
* Budget Paper 1 Statement 5 Table 1 Australian Government general government receipts (XLSX, CSV)
* Budget Paper 1 Statement 5 Table 8 Australian Government general government (cash) receipts (XLSX, CSV)
* Budget Paper 1 Statement 5 (online tables) Australian Government (accrual) revenue (XLSX, CSV)
* Budget Paper 1 Statement 5 Table 11 Reconciliation of 2014-15 general government (accrual) revenue (XLSX, CSV)
* Budget Paper 1 Statement 6 Appendix A Table A1 Estimates of expenses by function and sub-function (XLSX, CSV)
* Budget Paper 1 Statement 10 Table 1 Australian Government general government sector receipts, payments, net Future Fund earnings and underlying cash balance (XLSX, CSV)
* Budget Paper 1 Statement 10 Table 4 Australian Government general government sector taxation receipts, non-taxation receipts and total receipts (XLSX, CSV)
* Budget Paper 1 Statement 10 Table 5 Australian Government general government sector net debt and net interest payments (XLSX, CSV)
* Budget Paper 4 Table 1.1 Agency Resourcing (XLS, CSV coming)
* Budget Paper 4 Table 1.2 Special Appropriations (XLSX, CSV coming)
* Budget Paper 4 Table 1.3 Special Accounts (XLSX, CSV)
* Budget Paper 4 Table 2.2 Average Staffing Table (XLSX, CSV coming)
* Budget Paper 4 Table 3.1 Departmental Expenses (XLSX, CSV coming)
* Budget Paper 4 Table 3.2 Net Capital Investment (XLSX, CSV coming)
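For instance, a hedged R sketch of reading the line item CSV and totalling 2014-15 expenses by portfolio; the file name is an assumption, and the column names follow the CSV structure listed above.

# Illustrative sketch: aggregate 2014-15 PBS line item expenses by portfolio.
# check.names = FALSE preserves the '2014-15' column name as-is.
pbs <- read.csv("2014-15_PBS_line_items.csv", check.names = FALSE)  # hypothetical file name
totals <- aggregate(`2014-15` ~ Portfolio, data = pbs, FUN = sum)
totals[order(-totals$`2014-15`), ]  # portfolios ranked by total 2014-15 expenses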
Infaunal marine invertebrates were collected from inside and outside of patches of white bacterial mats from several sites in the Windmill Islands, Antarctica, around Casey station during the 2006-07 summer. Samples were collected from McGrady Cove inner and outer, the tide gauge near the Casey wharf, Stevenson's Cove and Brown Bay inner. Sediment cores of 10cm depth and 5cm diameter were collected by divers using a PVC corer from inside (4 cores) and outside (4 cores) each bacterial patch. The size of each patch varied from site to site. Cores were sieved at 500 microns and the extracted fauna preserved in 4 percent neutral buffered formalin. All fauna were counted and identified to species where possible or assigned to morphospecies based on previous infaunal sampling around Casey.
An Excel spreadsheet is available for download at the URL given below. The spreadsheet does not represent the complete dataset; it is only the bacterial mat infauna data.
Regarding the infauna dataset:
This work was completed as part of ASAC 2201 (ASAC_2201).