This dataset contains the geographic data used to create maps for the San Diego County Regional Equity Indicators Report led by the Office of Equity and Racial Justice (OERJ). The full report can be found here: https://data.sandiegocounty.gov/stories/s/7its-kgpt
Demographic data from the report can be found here: https://data.sandiegocounty.gov/dataset/Equity-Report-Data-Demographics/q9ix-kfws
Filter by the Indicator column to select data for a particular indicator map.
Export notes: The dataset may not automatically open correctly in Excel because it contains geospatial data. To export the data for geospatial analysis, select Shapefile or GeoJSON as the file type. To view the data in Excel, export it as a CSV but do not open the file. Then open a blank Excel workbook, go to the Data tab, select "From Text/CSV," and follow the prompts to import the CSV file into Excel. Alternatively, use the exploration options in "View Data" to hide the geographic column before exporting the data.
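The filtering and column-hiding steps described above can also be done on the exported CSV in code. The following is a minimal Python sketch, not an official workflow: only the "Indicator" column name comes from the description above, while "Geography", "Percent" and "the_geom" are hypothetical column names and the rows are made-up sample values.

```python
import csv
import io

# Tiny stand-in for the exported CSV. Only "Indicator" is a documented column;
# "Geography", "Percent" and "the_geom" are hypothetical names, and the values
# are invented for illustration.
SAMPLE_CSV = """Indicator,Geography,Percent,the_geom
Poverty,ZCTA,12.5,"MULTIPOLYGON (((-117.1 32.7, -117.0 32.7, -117.0 32.8, -117.1 32.7)))"
Employment,PUMA,94.1,"MULTIPOLYGON (((-117.3 33.1, -117.2 33.1, -117.2 33.2, -117.3 33.1)))"
Poverty,ZCTA,8.3,"MULTIPOLYGON (((-117.2 32.9, -117.1 32.9, -117.1 33.0, -117.2 32.9)))"
"""


def filter_indicator(csv_text, indicator, drop_columns=("the_geom",)):
    """Return the rows for one indicator, without the listed columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        if row["Indicator"] == indicator:
            # Drop the (assumed) geographic column so the result is spreadsheet-friendly.
            rows.append({k: v for k, v in row.items() if k not in drop_columns})
    return rows


poverty_rows = filter_indicator(SAMPLE_CSV, "Poverty")
for row in poverty_rows:
    print(row)
```

For a real export, replace the in-memory sample with the downloaded file and check the header row for the actual name of the geographic column.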
USER NOTES: 4/7/2025 - The maps and data have been removed for the Health Professional Shortage Areas indicator due to inconsistencies with the data source leading to some missing health professional shortage areas. We are working to fix this issue, including exploring possible alternative data sources.
5/21/2025 - The following changes were made to the 2023 report data (Equity Report Year = 2023). Self-Sufficiency Wage - a typo in the indicator name was fixed (changed sufficienct to sufficient) and the percent for one PUMA corrected from 56.9 to 59.9 (PUMA = San Diego County (Northwest)--Oceanside City & Camp Pendleton). Notes were made consistent for all rows where geography = ZCTA. A note was added to all rows where geography = PUMA. Voter registration - label "92054, 92051" was renamed to be in numerical order and is now "92051, 92054". Removed data from the percentile column because the categories are not true percentiles. Employment - Data was corrected to show the percent of the labor force that are employed (ages 16 and older). Previously, the data was the percent of the population 16 years and older that are in the labor force. 3- and 4-Year-Olds Enrolled in School - percents are now rounded to one decimal place. Poverty - the last two categories/percentiles changed because the 80th percentile cutoff was corrected by 0.01 and one ZCTA was reassigned to a different percentile as a result. Low Birthweight - the 33th percentile label was corrected to be written as the 33rd percentile. Life Expectancy - Corrected the category and percentile assignment for SRA CENTRAL SAN DIEGO. Parks and Community Spaces - corrected the category assignment for six SRAs.
5/21/2025 - Data was uploaded for Equity Report Year 2025. The following changes were made relative to the 2023 report year. Adverse Childhood Experiences - added geographic data for 2025 report. No calculation of bins nor corresponding percentiles due to small number of geographic areas. Low Birthweight - no calculation of bins nor corresponding percentiles due to small number of geographic areas.
Prepared by: Office of Evaluation, Performance, and Analytics and the Office of Equity and Racial Justice, County of San Diego, in collaboration with the San Diego Regional Policy & Innovation Center (https://www.sdrpic.org).
The data sets provide an overview of selected data on waterworks registered with the Norwegian Food Safety Authority. The information has been reported by the waterworks through application processing or other reporting to the Norwegian Food Safety Authority. Drinking water regulations require, among other things, annual reporting. The Norwegian Food Safety Authority has created a separate form service for such reporting. The data sets include public or private waterworks that supply 50 people or more. In addition, all municipally owned businesses with their own water supply are included regardless of size. The data sets also contain decommissioned facilities, for those who wish to view historical data, i.e. data for previous years.
There are data sets for the following supervisory objects:
1. Water supply system. It also includes analysis of drinking water.
2. Transport system
3. Treatment facility
4. Entry point. It also includes analysis of the water source.
Below you will find datasets for: 4.
In addition, there is a file (information.txt) that provides an overview of when the extracts were produced and how many lines there are in the individual files. The extracts are produced weekly.
Furthermore, for the data sets water supply system, transport system and intake point, it is possible to see historical data on what is included in the annual reporting. To make use of that information, the file must be linked to the "moder" file to get names and other static information. These files have the _reporting ending in the file name. Descriptions of the data fields (i.e. metadata) in the individual data sets appear in separate files. These are available in PDF format.
If you double-click the CSV file and it opens directly in Excel, the Norwegian characters æ, ø and å will not be displayed correctly. To see the character set correctly in Excel, you must:
- start Excel and open a new spreadsheet
- select Data and then From Text, and press Import
- select delimited data and file origin 65001: Unicode (UTF-8), tick "My data has headers", and press Next
- remove tab as separator, select semicolon as separator, and press Next
- complete the remaining steps of the import
The data sets can be imported into a separate database and compiled as desired. There are link keys in the files that make it possible to link the files together. The waterworks are responsible for the quality of the datasets.
Purpose: Make information on the supply of drinking water available to the public.
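The import settings and link keys described above (UTF-8 encoding, semicolon separator, linking a _reporting file to its parent file) can also be handled outside Excel. The following is a minimal Python sketch under stated assumptions: the column names "id", "name", "system_id", "year" and "persons_supplied" are hypothetical, as are the sample rows; the real field names are documented in the PDF metadata files.

```python
import csv
import io

# Stand-ins for two extract files. All column names and values here are
# hypothetical and only illustrate the semicolon-separated UTF-8 layout.
PARENT_CSV = "id;name\n1;Østre vannverk\n2;Åsen vannverk\n"
REPORTING_CSV = (
    "system_id;year;persons_supplied\n"
    "1;2022;52000\n"
    "1;2023;52500\n"
    "2;2023;41000\n"
)


def read_semicolon_csv(text):
    """Parse semicolon-separated text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text), delimiter=";"))


def link_reporting(parent_rows, reporting_rows, parent_key="id", link_key="system_id"):
    """Attach the parent file's static fields to each reporting row."""
    parents = {row[parent_key]: row for row in parent_rows}
    linked = []
    for report in reporting_rows:
        parent = parents.get(report[link_key])
        if parent is not None:  # skip reports without a matching parent record
            linked.append({**parent, **report})
    return linked


linked = link_reporting(read_semicolon_csv(PARENT_CSV), read_semicolon_csv(REPORTING_CSV))
for row in linked:
    print(row["name"], row["year"], row["persons_supplied"])
```

When reading the actual files from disk, opening them with `open(path, encoding="utf-8", newline="")` keeps æ, ø and å intact.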
This page provides data for the 3rd Grade Reading Level Proficiency performance measure.
The dataset includes student performance results on the English/Language Arts section of the AzMERIT from Fall 2017 and Spring 2018. The data is representative of third-grade students in public elementary schools in Tempe. This includes schools from both the Tempe Elementary and Kyrene districts. Results are by school and provide the total number of students tested, the total percentage passing, and the percentage of students scoring at each of the four levels of proficiency.
The performance measure dashboard is available at 3.07 3rd Grade Reading Level Proficiency.
Additional Information
Source: Arizona Department of Education
Contact: Ann Lynn DiDomenico
Contact E-Mail: Ann_DiDomenico@tempe.gov
Data Source Type: Excel/ CSV
Preparation Method: Filters on original dataset: within "Schools" Tab, School District [select Tempe School District and Kyrene School District]; School Name [deselect Kyrene SD not in Tempe city limits]; Content Area [select English Language Arts]; Test Level [select Grade 3]; Subgroup/Ethnicity [select All Students]. Remove irrelevant fields; Add Fiscal Year
Publish Frequency: Annually as data becomes available
Publish Method: Manual
Data Dictionary
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This article describes a free, open-source collection of templates for the popular Excel (2013 and later versions) spreadsheet program. These templates are spreadsheet files that allow easy and intuitive learning and the implementation of practical examples concerning descriptive statistics, random variables, confidence intervals, and hypothesis testing. Although they are designed to be used with Excel, they can also be employed with other free spreadsheet programs (after changing some particular formulas). Moreover, we exploit some possibilities of the ActiveX controls of the Excel Developer Menu to produce interactive Gaussian density charts. Finally, it is important to note that they can often be embedded in a web page, so it is not necessary to employ Excel software for their use. These templates have been designed as a useful tool to teach basic statistics and to carry out data analysis even when the students are not familiar with Excel. Additionally, they can be used as a complement to other analytical software packages. They aim to assist students in learning statistics within an intuitive working environment. Supplementary materials with the Excel templates are available online.
https://data.norge.no/nlod/en/2.0/
The data sets provide an overview of selected data on waterworks registered with the Norwegian Food Safety Authority. The information has been reported by the waterworks through application processing or other reporting to the Norwegian Food Safety Authority. Drinking water regulations require, among other things, annual reporting. The Norwegian Food Safety Authority has created a separate form service for such reporting. The data sets include public or private waterworks that supply 50 people or more. In addition, all municipally owned businesses with their own water supply are included regardless of size. The data sets also contain decommissioned facilities, for those who wish to view historical data, i.e. data for previous years.
There are data sets for the following supervisory objects:
1. Water supply system. It also includes analysis of drinking water.
2. Transport system
3. Treatment facility
4. Entry point. It also includes analysis of the water source.
Below you will find datasets for: 1. Water supply system_reporting
In addition, there is a file (information.txt) that provides an overview of when the extracts were produced and how many lines there are in the individual files. The extracts are produced weekly.
Furthermore, for the data sets water supply system, transport system and intake point, it is possible to see historical data on what is included in the annual reporting. To make use of that information, the file must be linked to the "moder" file to get names and other static information. These files have the _reporting ending in the file name. Descriptions of the data fields (i.e. metadata) in the individual data sets appear in separate files. These are available in PDF format.
If you double-click the CSV file and it opens directly in Excel, the Norwegian characters æ, ø and å will not be displayed correctly. To see the character set correctly in Excel, you must:
- start Excel and open a new spreadsheet
- select Data and then From Text, and press Import
- select delimited data and file origin 65001: Unicode (UTF-8), tick "My data has headers", and press Next
- remove tab as separator, select semicolon as separator, and press Next
- complete the remaining steps of the import
The data sets can be imported into a separate database and compiled as desired. There are link keys in the files that make it possible to link the files together. The waterworks are responsible for the quality of the datasets.
Purpose: Make data for drinking water supply available to the public.
This dataset contains all current and active business licenses issued by the Department of Business Affairs and Consumer Protection. It contains a large number of records (rows of data) and may not be viewable in full in Microsoft Excel. Therefore, when downloading the file, select CSV from the Export menu. Open the file in an ASCII text editor, such as Notepad or WordPad, to view and search.
Data fields requiring description are detailed below.
APPLICATION TYPE: 'ISSUE' is the record associated with the initial license application. 'RENEW' is a subsequent renewal record. All renewal records are created with a term start date and term expiration date. 'C_LOC' is a change of location record. It means the business moved. 'C_CAPA' is a change of capacity record. Only a few license types may file this type of application. 'C_EXPA' only applies to businesses that have liquor licenses. It means the business location expanded.
LICENSE STATUS: 'AAI' means the license was issued.
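Because the full file may be too large for Excel, the field codes described above can also be used to filter the CSV export row by row. The following is a minimal Python sketch, not an official tool: "APPLICATION TYPE" and "LICENSE STATUS" follow the field labels above, while the exact header spellings, the "ACCOUNT NUMBER" and "LEGAL NAME" headers, and all sample rows are assumptions made for illustration.

```python
import csv
import io

# Made-up sample rows. The header spellings are assumed to match the field
# labels in the description; check the real export before relying on them.
SAMPLE = """ACCOUNT NUMBER,LEGAL NAME,APPLICATION TYPE,LICENSE STATUS
101,EXAMPLE BAKERY LLC,ISSUE,AAI
101,EXAMPLE BAKERY LLC,RENEW,AAI
202,SAMPLE TAVERN INC,C_EXPA,AAI
303,DEMO CAFE CORP,C_LOC,AAI
"""


def iter_matching(handle, application_type):
    """Yield rows of one application type without loading the whole file."""
    for row in csv.DictReader(handle):
        if row["APPLICATION TYPE"] == application_type:
            yield row


renewals = list(iter_matching(io.StringIO(SAMPLE), "RENEW"))
for row in renewals:
    print(row["ACCOUNT NUMBER"], row["LEGAL NAME"], row["LICENSE STATUS"])
```

For the real download, pass an open file handle (for example `open(path, newline="", encoding="utf-8")`) instead of the in-memory sample; rows are then processed one at a time.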
Business license owners may be accessed at: http://data.cityofchicago.org/Community-Economic-Development/Business-Owners/ezma-pppn To identify the owner of a business, you will need the account number or legal name.
Data Owner: Business Affairs and Consumer Protection
Time Period: Current
Frequency: Data is updated daily
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains the datasets and data sources, analysis code, and workflow associated with the manuscript "Comparing the Effects of Euclidean Distance Matching and Dynamic Time Warping in the Clustering of COVID-19 Evolution". The following resources are provided:
Data Files:
time_series_data.csv: A curated time series dataset with dates as rows and NUTS 2 regions as columns. Each column is labeled using a 4-letter abbreviation format "CC.RR", where "CC" represents the country code and "RR" represents the region code. This same abbreviation is also included in the accompanying GeoJSON file.
geometry_data.geojson: A GeoJSON file representing the spatial boundaries of the NUTS 2 regions, with the same 4-letter abbreviations used in the CSV file. EPSG:4326.
COVID19_data_sources.xlsx: This Excel file contains important metadata regarding the sources of COVID-19 data used in this study. It includes:
Code:
analysis.py: A Python script used to process and analyze the data. This code can be run using Python 3.x. The libraries required to run this script are listed in the first lines of the code. The code is organized in different numbered sections (1), (2), ... and sub-sections (1a), (1b) ... Make sure to run the script one (sub-)section at a time, so that everything stays overviewable and you don't get all the output at once.
Workflow:
workflow.png: A detailed workflow according to the Knowledge Discovery in Databases (KDD) process, outlining the steps involved in processing and analyzing the data, including the methods used. This workflow provides a comprehensive guide to reproducing the analysis presented in the paper.
Origin of glass and its relationships with phlogopite in mantle xenoliths from central Sardinia (Italy)
https://creativecommons.org/publicdomain/zero/1.0/
If you have any queries or requirements, please feel free to share them with me!
Context:
The Indian Premier League (IPL) has carved out a special place for itself in the hearts of cricket lovers from the very first season itself in 2008. It is a professional Twenty20 cricket league in India, organized by the Board of Control for Cricket in India (BCCI). Founded in 2007, the league features ten state or city-based franchise teams. The IPL is the most popular and richest cricket league in the world and is held between March and May.
The current defending champions are the Kolkata Knight Riders, who won the 2024 season after defeating Sunrisers Hyderabad in the final. IPL 2025 is the 18th edition of the tournament. This edition of the prestigious tournament commenced on March 22, 2025 and the Final is expected to be played on May 25, 2025.
The 74 matches of the season will be played across 13 venues and will include 12 double-headers. While the afternoon games will begin at 03.30 PM IST, the evening games will begin from 07.30 PM IST.
Content:
Primary file -> matches.csv: Contains detailed information for each match played.
Secondary files->
deliveries: Ball by ball data
orange_cap: Top batting performances
purple_cap: Top bowling performances
Acknowledgements:
The data source is Google and the ESPN official website.
Inspiration:
You can use this data to analyze each team's performance, create visualizations to explore tournament results, and predict the outcomes of future IPL matches (e.g., for fantasy prediction).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of "2014-2015 Diversity Report - K-8 & Grades 9-12 District, Schools, Special Programs, Diversity Efforts" provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/e7dc14b8-c671-4c2f-b501-44f13ec6f1d5 on 26 January 2022.
--- Dataset description provided by original source is as follows ---
Enrollment counts are based on the October 31st Audited Register for 2014.
Data on students with disabilities, English language learners and students' poverty status are as of February 2nd, 2015. Due to missing demographic information in rare cases, demographic categories do not always add up to citywide totals. To view all data, open the attached Excel file.
--- Original source retains full ownership of the source dataset ---
Load, wind and solar, prices in hourly resolution. This data package contains different kinds of timeseries data relevant for power system modelling, namely electricity prices, electricity consumption (load) as well as wind and solar power generation and capacities. The data is aggregated either by country, control area or bidding zone. Geographical coverage includes the EU and some neighbouring countries. All variables are provided in hourly resolution. Where original data is available in higher resolution (half-hourly or quarter-hourly), it is provided in separate files. This package version only contains data provided by TSOs and power exchanges via ENTSO-E Transparency, covering the period 2015-mid 2020. See previous versions for historical data from a broader range of sources. All data processing is conducted in Python/pandas and has been documented in the Jupyter notebooks linked below.
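To illustrate the relationship between the higher-resolution files and the hourly files, the sketch below averages quarter-hourly values into hourly values in plain Python. This is only an illustration under stated assumptions: the package's actual processing is documented in its Jupyter notebooks and may differ, averaging is an assumed aggregation rule, and the timestamps and load values are made up.

```python
from collections import defaultdict
from datetime import datetime

# Made-up quarter-hourly load values (assumed unit: MW) for two hours.
quarter_hourly = [
    ("2020-01-01T00:00:00", 40.0),
    ("2020-01-01T00:15:00", 42.0),
    ("2020-01-01T00:30:00", 44.0),
    ("2020-01-01T00:45:00", 46.0),
    ("2020-01-01T01:00:00", 50.0),
    ("2020-01-01T01:15:00", 50.0),
    ("2020-01-01T01:30:00", 52.0),
    ("2020-01-01T01:45:00", 52.0),
]


def to_hourly_mean(samples):
    """Group samples by the hour they fall in and average each group."""
    buckets = defaultdict(list)
    for stamp, value in samples:
        # Truncate the timestamp to the start of its hour.
        hour = datetime.fromisoformat(stamp).replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}


hourly = to_hourly_mean(quarter_hourly)
for hour, value in hourly.items():
    print(hour.isoformat(), value)
```

Averaging suits power-like quantities such as load or generation in MW; energy totals would be summed instead, which is one reason to check the notebooks before mixing resolutions.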
The Bureau of the Census has released Census 2000 Summary File 1 (SF1) 100-Percent data. The file includes the following population items: sex, age, race, Hispanic or Latino origin, household relationship, and household and family characteristics. Housing items include occupancy status and tenure (whether the unit is owner or renter occupied). SF1 does not include information on incomes, poverty status, overcrowded housing or age of housing. These topics will be covered in Summary File 3. Data are available for states, counties, county subdivisions, places, census tracts, block groups, and, where applicable, American Indian and Alaska Native Areas and Hawaiian Home Lands. The SF1 data are available on the Bureau's web site and may be retrieved from American FactFinder as tables, lists, or maps. Users may also download a set of compressed ASCII files for each state via the Bureau's FTP server. There are over 8000 data items available for each geographic area. The full listing of these data items is available here as a downloadable compressed database file named TABLES.ZIP. The uncompressed file is in FoxPro database file (dbf) format and may be imported into ACCESS, EXCEL, and other software formats. While all of this information is useful, the Office of Community Planning and Development has downloaded selected information for all states and areas and is making this information available on the CPD web pages. The tables and data items selected are those items used in the CDBG and HOME allocation formulas plus topics most pertinent to the Comprehensive Housing Affordability Strategy (CHAS), the Consolidated Plan, and similar overall economic and community development plans. The information is contained in five compressed (zipped) dbf tables for each state. When uncompressed, the tables are ready for use with FoxPro and can be imported into ACCESS, EXCEL, and other spreadsheet, GIS and database software. The data are at the block group summary level.
The first two characters of the file name are the state abbreviation. The next two letters are BG for block group. The last part of the file name describes the contents. Each record is labeled with the code and name of the city and county in which it is located so that the data can be summarized to higher-level geography. The GEO file contains standard Census Bureau geographic identifiers for each block group, such as the metropolitan area code and congressional district code. The only data included in this table are total population and total housing units. POP1 and POP2 contain selected population variables, and selected housing items are in the HU file. The MA05 table data is only for use by State CDBG grantees for the reporting of the racial composition of beneficiaries of Area Benefit activities. The complete package for a state consists of the dictionary file named TABLES and the five data files for the state. The logical record number (LOGRECNO) links the records across tables.
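The file naming convention and the LOGRECNO link described above can be sketched in Python as follows. This is an illustration under stated assumptions: the file name "CABGPOP1.dbf" is a hypothetical example built from the naming rule, the field names other than LOGRECNO are invented, and the rows are assumed to have already been exported from dbf into dictionaries (the standard library does not read dbf files).

```python
def parse_table_name(file_name):
    """Split a name like 'CABGPOP1.dbf' into state, summary level and contents."""
    stem = file_name.rsplit(".", 1)[0].upper()
    if len(stem) <= 4 or stem[2:4] != "BG":
        raise ValueError(f"unexpected file name: {file_name}")
    return {"state": stem[:2], "level": stem[2:4], "contents": stem[4:]}


def join_on_logrecno(*tables):
    """Merge rows from several tables that share the same LOGRECNO."""
    merged = {}
    for table in tables:
        for row in table:
            merged.setdefault(row["LOGRECNO"], {}).update(row)
    return merged


# Made-up rows; only the LOGRECNO field name comes from the description.
geo_rows = [
    {"LOGRECNO": "0000101", "TOTAL_POP": "1200"},
    {"LOGRECNO": "0000102", "TOTAL_POP": "950"},
]
pop1_rows = [
    {"LOGRECNO": "0000101", "POP_UNDER_18": "300"},
    {"LOGRECNO": "0000102", "POP_UNDER_18": "210"},
]

table_info = parse_table_name("CABGPOP1.dbf")
joined = join_on_logrecno(geo_rows, pop1_rows)
print(table_info)
print(joined["0000101"])
```

Keying the merge on LOGRECNO keeps one combined record per block group, which can then be summarized to city or county level using the labels carried on each record.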
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
By Department of Energy [source]
The Building Energy Data Book (2011) is a resource for gaining insight into the state of energy consumption in the buildings sector. This dataset provides comprehensive data on residential, commercial and industrial building energy consumption, construction techniques, building technologies and characteristics. With this resource, you can get an in-depth understanding of how energy is used in various types of buildings, from single-family homes to large office complexes, as well as its impact on the environment. The BTO within the U.S. Department of Energy's Office of Energy Efficiency and Renewable Energy developed this dataset for researchers, policy makers, engineers and everyday observers who are interested in learning more about the built environment and its energy usage patterns.
This dataset provides comprehensive information regarding energy consumption in the buildings sector of the United States. It contains a number of key variables which can be used to analyze and explore the relations between energy consumption and building characteristics, technologies, and construction. The data is provided in both CSV format as well as tabular format which can make it helpful for those who prefer to use programs like Excel or other statistical modeling software.
In order to get started with this dataset we've developed a guide outlining how to effectively use it for your research or project needs.
Understand what's included: Before you start analyzing the data, you should read through the provided documentation so that you fully understand what is included in the datasets. You'll want to be aware of any potential limitations or requirements associated with each type of data point so that your results are valid and reliable when drawing conclusions from them.
Clean up any outliers: You may need to spend some time up front investigating suspicious outliers in the dataset before using it in further analyses; otherwise, they can skew results later. They can also make complex statistical modeling more difficult, since they artificially inflate values depending on their magnitude (for example, a single outlier could affect an entire model's prior distributions). Missing values should also be accounted for, since they are not always obvious at first glance when reviewing a table or graphical representation, and accurate statistics must still be obtained no matter how messy the data seem.
Exploratory data analysis: After cleaning up your dataset, explore it by visualizing summaries such as boxplots, histograms and scatter plots. This will give you an initial insight into what trends might exist within certain demographic, geographic or other regions and variables, which can then help inform future predictive models. This step will also highlight any clear discontinuous changes over time due to over-generalization (if applicable), helping to ensure that predictors contribute meaningful signal rather than noise to the accuracy of overall effect predictions.
Analyze key metrics & observations: Once exploratory analyses have been carried out on raw samples, post-processing steps are next, such as analyzing correlations among explanatory variables, performing significance testing on regression models, imputing missing or outlier values, and more, depending on the specific needs of the project. Additionally, interpretation efforts based
- Creating an energy efficiency rating system for buildings - Using the dataset, an organization can develop a metric to rate the energy efficiency of commercial and residential buildings in a standardized way.
- Developing targeted campaigns to raise awareness about energy conservation - Analyzing data from this dataset can help organizations identify areas of high energy consumption and create targeted campaigns and incentives to encourage people to conserve energy in those areas.
- Estimating costs associated with upgrading building technologies - By evaluating various trends in building technologies and their associated costs, decision-makers can determine the most cost-effective option when it comes time to upgrade their structures' energy efficiency...
This dataset contains information about India's Sales of Motor Vehicles for 2007-2019. Data from Ministry of Road Transport and Highways.
The Emissions & Generation Resource Integrated Database (eGRID) is a comprehensive source of data on the environmental characteristics of almost all electric power generated in the United States. These environmental characteristics include air emissions for nitrogen oxides, sulfur dioxide, carbon dioxide, methane, and nitrous oxide; emissions rates; net generation; resource mix; and many other attributes.
eGRID2010 contains the complete release of year 2007 data, as well as years 2005 and 2004 data. Excel spreadsheets, full documentation, summary data, eGRID subregion and NERC region representational maps, and GHG emission factors are included in this data set. The Archived data in eGRID2002 contain years 1996 through 2000 data.
For year 2007 data, the first Microsoft Excel workbook, Plant, contains boiler, generator, and plant spreadsheets. The second Microsoft Excel workbook, Aggregation, contains data aggregated at the state, electric generating company, parent company, power control area, eGRID subregion, NERC region, and U.S. total levels. The third Microsoft Excel workbook, ImportExport, contains state import-export data, as well as U.S. generation and consumption data for years 2007, 2005, and 2004. For eGRID data for years 2005 and 2004, a user-friendly web application, eGRIDweb, is available to select, view, print, and export specified data.
Statistics on grants for courses in reading, writing, accounting, Norwegian, and digital and oral skills through Kompetanseplus's work. The tables cover:
- Enterprises in Kompetanseplus's work per county
- Projects in Kompetanseplus's work broken down by skills
- Project applications and projects granted, by skills, year and call for proposals
- Businesses that have applied for funding, distributed by industry, year and call for proposals
- Businesses that have applied for funding, distributed by the company's county, year and call for proposals
- Requested and granted amounts, by year and call for proposals
- Unique participants, by gender
- Participants, by age, type of course, year, and calls for proposals and initiatives
- Participants, by educational background, course type, year, and calls for proposals and initiatives
- Participants and participations
- Participants and women's share, by course type, year, and calls for proposals and initiatives
- Participants and the proportion of minority-language speakers, by course type, year, and calls for proposals and initiatives
Explanatory information can be found via the menu bar above each table. In the menu bar, you can also choose whether to retrieve the source, view the statistics as charts, export the data to Excel or PDF, and copy the link to the specific view of a table or chart. See how to do this on our help page. If you have any questions about the tables, please contact Kristine Bettum at kristine.bettum@kompetansenorge.no.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data and code archive provides all the files that are necessary to replicate the empirical analyses that are presented in the paper "Climate impacts and adaptation in US dairy systems 1981-2018" authored by Maria Gisbert-Queral, Arne Henningsen, Bo Markussen, Meredith T. Niles, Ermias Kebreab, Angela J. Rigden, and Nathaniel D. Mueller and published in 'Nature Food' (2021, DOI: 10.1038/s43016-021-00372-z). The empirical analyses are entirely conducted with the "R" statistical software using the add-on packages "car", "data.table", "dplyr", "ggplot2", "grid", "gridExtra", "lmtest", "lubridate", "magrittr", "nlme", "OneR", "plyr", "pracma", "quadprog", "readxl", "sandwich", "tidyr", "usfertilizer", and "usmap". The R code was written by Maria Gisbert-Queral and Arne Henningsen with assistance from Bo Markussen. Some parts of the data preparation and the analyses require substantial amounts of memory (RAM) and computational power (CPU). Running the entire analysis (all R scripts consecutively) on a laptop computer with 32 GB physical memory (RAM), 16 GB swap memory, an 8-core Intel Xeon CPU E3-1505M @ 3.00 GHz, and a GNU/Linux/Ubuntu operating system takes around 11 hours. Running some parts in parallel can speed up the computations but bears the risk that the computations terminate when two or more memory-demanding computations are executed at the same time.
This data and code archive contains the following files and folders:
* README
Description: text file with this description
* flowchart.pdf
Description: a PDF file with a flow chart that illustrates how R scripts transform the raw data files to files that contain generated data sets and intermediate results and, finally, to the tables and figures that are presented in the paper.
* runAll.sh
Description: a (bash) shell script that runs all R scripts in this data and code archive sequentially and in a suitable order (on computers with a "bash" shell such as most computers with macOS, GNU/Linux, or Unix operating systems)
* Folder "DataRaw"
Description: folder for raw data files
This folder contains the following files:
- DataRaw/COWS.xlsx
Description: MS-Excel file with the number of cows per county
Source: USDA NASS Quickstats
Observations: All available counties and years from 2002 to 2012
- DataRaw/milk_state.xlsx
Description: MS-Excel file with average monthly milk yields per cow
Source: USDA NASS Quickstats
Observations: All available states from 1981 to 2018
- DataRaw/TMAX.csv
Description: CSV file with daily maximum temperatures
Source: PRISM Climate Group (spatially averaged)
Observations: All counties from 1981 to 2018
- DataRaw/VPD.csv
Description: CSV file with daily maximum vapor pressure deficits
Source: PRISM Climate Group (spatially averaged)
Observations: All counties from 1981 to 2018
- DataRaw/countynamesandID.csv
Description: CSV file with county names, state FIPS codes, and county FIPS codes
Source: US Census Bureau
Observations: All counties
- DataRaw/statecentroids.csv
Description: CSV file with latitudes and longitudes of state centroids
Source: Generated by Nathan Mueller from Matlab state shapefiles using the Matlab "centroid" function
Observations: All states
* Folder "DataGenerated"
Description: folder for data sets that are generated by the R scripts in this data and code archive. In order to reproduce our entire analysis 'from scratch', the files in this folder should be deleted. We provide these generated data files so that parts of the analysis can be replicated (e.g., on computers with insufficient memory to run all parts of the analysis).
* Folder "Results"
Description: folder for intermediate results that are generated by the R scripts in this data and code archive. In order to reproduce our entire analysis 'from scratch', the files in this folder should be deleted. We provide these intermediate results so that parts of the analysis can be replicated (e.g., on computers with insufficient memory to run all parts of the analysis).
* Folder "Figures"
Description: folder for the figures that are generated by the R scripts in this data and code archive and that are presented in our paper. In order to reproduce our entire analysis 'from scratch', the files in this folder should be deleted. We provide these figures so that people who replicate our analysis can more easily compare the figures that they get with the figures that are presented in our paper. Additionally, this folder contains CSV files with the data that are required to reproduce the figures.
* Folder "Tables"
Description: folder for the tables that are generated by the R scripts in this data and code archive and that are presented in our paper. In order to reproduce our entire analysis 'from scratch', the files in this folder should be deleted. We provide these tables so that people who replicate our analysis can more easily compare the tables that they get with the tables that are presented in our paper.
* Folder "logFiles"
Description: the shell script runAll.sh writes the output of each R script that it runs into this folder. We provide these log files so that people who replicate our analysis can more easily compare the R output that they get with the R output that we got.
* PrepareCowsData.R
Description: R script that imports the raw data set COWS.xlsx and prepares it for the further analyses
* PrepareWeatherData.R
Description: R script that imports the raw data sets TMAX.csv, VPD.csv, and countynamesandID.csv, merges these three data sets, and prepares the data for the further analyses
* PrepareMilkData.R
Description: R script that imports the raw data set milk_state.xlsx and prepares it for the further analyses
* CalcFrequenciesTHI_Temp.R
Description: R script that calculates the frequencies of days with the different THI bins and the different temperature bins in each month for each state
* CalcAvgTHI.R
Description: R script that calculates the average THI in each state
* PreparePanelTHI.R
Description: R script that creates a state-month panel/longitudinal data set with exposure to the different THI bins
* PreparePanelTemp.R
Description: R script that creates a state-month panel/longitudinal data set with exposure to the different temperature bins
* PreparePanelFinal.R
Description: R script that creates the state-month panel/longitudinal data set with all variables (e.g., THI bins, temperature bins, milk yield) that are used in our statistical analyses
* EstimateTrendsTHI.R
Description: R script that estimates the trends of the frequencies of the different THI bins within our sampling period for each state in our data set
* EstimateModels.R
Description: R script that estimates all model specifications that are used for generating results that are presented in the paper or for comparing or testing different model specifications
* CalcCoefStateYear.R
Description: R script that calculates the effects of each THI bin on the milk yield for all combinations of states and years based on our 'final' model specification
* SearchWeightMonths.R
Description: R script that estimates our 'final' model specification with different values of the weight of the temporal component relative to the weight of the spatial component in the temporally and spatially correlated error term
* TestModelSpec.R
Description: R script that applies Wald tests and Likelihood-Ratio tests to compare different model specifications and creates Table S10
* CreateFigure1a.R
Description: R script that creates subfigure a of Figure 1
* CreateFigure1b.R
Description: R script that creates subfigure b of Figure 1
* CreateFigure2a.R
Description: R script that creates subfigure a of Figure 2
* CreateFigure2b.R
Description: R script that creates subfigure b of Figure 2
* CreateFigure2c.R
Description: R script that creates subfigure c of Figure 2
* CreateFigure3.R
Description: R script that creates the subfigures of Figure 3
* CreateFigure4.R
Description: R script that creates the subfigures of Figure 4
* CreateFigure5_TableS6.R
Description: R script that creates the subfigures of Figure 5 and Table S6
* CreateFigureS1.R
Description: R script that creates Figure S1
* CreateFigureS2.R
Description: R script that creates Figure S2
* CreateTableS2_S3_S7.R
Description: R script that creates Tables S2, S3, and S7
* CreateTableS4_S5.R
Description: R script that creates Tables S4 and S5
* CreateTableS8.R
Description: R script that creates Table S8
* CreateTableS9.R
Description: R script that creates Table S9
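The sequential run that runAll.sh performs, with one log file per R script in "logFiles", could be approximated as follows. This is a minimal Python sketch and not the contents of runAll.sh: the script order is assumed to follow the order of the list above (the actual dependencies are documented in flowchart.pdf), the list is shortened to the data preparation and estimation scripts, the "Rscript" command is assumed to be on the PATH, and the log file names are hypothetical.

```python
import subprocess
from pathlib import Path

# Assumed order, copied from the file list above (not from runAll.sh itself).
# The figure and table scripts would follow in the same way.
SCRIPTS = [
    "PrepareCowsData.R",
    "PrepareWeatherData.R",
    "PrepareMilkData.R",
    "CalcFrequenciesTHI_Temp.R",
    "CalcAvgTHI.R",
    "PreparePanelTHI.R",
    "PreparePanelTemp.R",
    "PreparePanelFinal.R",
    "EstimateTrendsTHI.R",
    "EstimateModels.R",
]


def build_command(script):
    """Command line used to run one R script (assumes Rscript is installed)."""
    return ["Rscript", script]


def log_path(script, log_dir="logFiles"):
    """Hypothetical naming scheme: logFiles/<script name without .R>.log"""
    return Path(log_dir) / (Path(script).stem + ".log")


def run_all(scripts=SCRIPTS, log_dir="logFiles", runner=subprocess.run):
    """Run the scripts one after another and stop at the first failure.

    Running strictly sequentially avoids having two memory-demanding
    computations active at the same time.
    """
    Path(log_dir).mkdir(exist_ok=True)
    for script in scripts:
        target = log_path(script, log_dir)
        with open(target, "w") as log:
            result = runner(build_command(script), stdout=log,
                            stderr=subprocess.STDOUT)
        if result.returncode != 0:
            raise RuntimeError(f"{script} failed; see {target}")
```

Calling run_all() from the archive's root folder would start the run. The runner parameter exists only so that the loop can be tried out without R installed.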
This dataset contains the geographic data used to create maps for the San Diego County Regional Equity Indicators Report led by the Office of Equity and Racial Justice (OERJ). The full report can be found here: https://data.sandiegocounty.gov/stories/s/7its-kgpt
Demographic data from the report can be found here: https://data.sandiegocounty.gov/dataset/Equity-Report-Data-Demographics/q9ix-kfws
Filter by the Indicator column to select data for a particular indicator map.
Export notes: The dataset may not open correctly in Excel automatically because it contains geospatial data. To export the data for geospatial analysis, select Shapefile or GeoJSON as the file type. To view the data in Excel, export it as a CSV but do not open the file directly. Instead, open a blank Excel workbook, go to the Data tab, select "From Text/CSV," and follow the prompts to import the CSV file into Excel. Alternatively, use the exploration options in "View Data" to hide the geographic column prior to exporting the data.
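For users who work with the CSV export outside Excel, the two steps described above (filtering by the Indicator column and leaving out the geographic column) could be sketched as follows. This is an illustration under assumptions, not an official procedure: the column name "Indicator" comes from the description above, while the geographic column name "the_geom", the file names, and the indicator value in the usage note are hypothetical and should be checked against the header of the actual export.

```python
import csv


def filter_indicator(src_path, dst_path, indicator, geo_column="the_geom"):
    """Copy the rows for one indicator to a new CSV, without the geometry.

    geo_column is a hypothetical name for the geographic column.
    Returns the number of data rows written.
    """
    # Geometry cells can be longer than the csv module's default field limit.
    csv.field_size_limit(2**31 - 1)
    kept = 0
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        fields = [name for name in reader.fieldnames if name != geo_column]
        # extrasaction="ignore" drops the geometry value from each row.
        writer = csv.DictWriter(dst, fieldnames=fields, extrasaction="ignore")
        writer.writeheader()
        for row in reader:
            if row.get("Indicator") == indicator:
                writer.writerow(row)
                kept += 1
    return kept
```

For example, filter_indicator("equity_geo.csv", "poverty.csv", "Poverty") would write only the rows whose Indicator value is exactly "Poverty". The resulting file has no geometry column and should open in Excel without the import steps.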
USER NOTES: 4/7/2025 - The maps and data have been removed for the Health Professional Shortage Areas indicator due to inconsistencies with the data source leading to some missing health professional shortage areas. We are working to fix this issue, including exploring possible alternative data sources.
5/21/2025 - The following changes were made to the 2023 report data (Equity Report Year = 2023).
* Self-Sufficiency Wage - a typo in the indicator name was fixed (changed sufficienct to sufficient) and the percent for one PUMA was corrected from 56.9 to 59.9 (PUMA = San Diego County (Northwest)--Oceanside City & Camp Pendleton). Notes were made consistent for all rows where geography = ZCTA. A note was added to all rows where geography = PUMA.
* Voter registration - label "92054, 92051" was renamed to be in numerical order and is now "92051, 92054". Removed data from the percentile column because the categories are not true percentiles.
* Employment - Data was corrected to show the percent of the labor force that are employed (ages 16 and older). Previously, the data was the percent of the population 16 years and older that are in the labor force.
* 3- and 4-Year-Olds Enrolled in School - percents are now rounded to one decimal place.
* Poverty - the last two categories/percentiles changed because the 80th percentile cutoff was corrected by 0.01 and one ZCTA was reassigned to a different percentile as a result.
* Low Birthweight - the 33th percentile label was corrected to be written as the 33rd percentile.
* Life Expectancy - Corrected the category and percentile assignment for SRA CENTRAL SAN DIEGO.
* Parks and Community Spaces - corrected the category assignment for six SRAs.
5/21/2025 - Data was uploaded for Equity Report Year 2025. The following changes were made relative to the 2023 report year. Adverse Childhood Experiences - added geographic data for 2025 report. No calculation of bins nor corresponding percentiles due to small number of geographic areas. Low Birthweight - no calculation of bins nor corresponding percentiles due to small number of geographic areas.
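The Employment correction in the 5/21/2025 note on the 2023 report data rests on the difference between two measures: the percent of the labor force that is employed, and the percent of the population ages 16 and older that is in the labor force. The short sketch below shows the two calculations side by side. The function names and the counts are hypothetical and are not taken from the dataset.

```python
def percent_employed(employed, labor_force):
    """Corrected measure: percent of the labor force that is employed."""
    return round(100 * employed / labor_force, 1)


def percent_in_labor_force(labor_force, population_16_plus):
    """Previous measure: percent of the 16+ population in the labor force."""
    return round(100 * labor_force / population_16_plus, 1)


# Hypothetical area: 1,000 residents ages 16 and older, 650 of them in the
# labor force, 611 of those employed.
print(percent_employed(611, 650))          # 94.0
print(percent_in_labor_force(650, 1000))   # 65.0
```

The two measures use different denominators, so the same area can show very different values before and after the correction.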
Prepared by: Office of Evaluation, Performance, and Analytics and the Office of Equity and Racial Justice, County of San Diego, in collaboration with the San Diego Regional Policy & Innovation Center (https://www.sdrpic.org).