https://github.com/nytimes/covid-19-data/blob/master/LICENSE
The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.
Since the first reported coronavirus case in Washington State on Jan. 21, 2020, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.
We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.
The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.
Notice of data discontinuation: Since the start of the pandemic, AP has reported case and death counts from data provided by Johns Hopkins University. Johns Hopkins University has announced that they will stop their daily data collection efforts after March 10. As Johns Hopkins stops providing data, the AP will also stop collecting daily numbers for COVID cases and deaths. The HHS and CDC now collect and visualize key metrics for the pandemic. AP advises using those resources when reporting on the pandemic going forward.
April 9, 2020
April 20, 2020
April 29, 2020
September 1st, 2020
February 12, 2021
new_deaths column.
February 16, 2021
The AP is using data collected by the Johns Hopkins University Center for Systems Science and Engineering as our source for outbreak caseloads and death counts for the United States and globally.
The Hopkins data is available at the county level in the United States. The AP has paired this data with population figures and county rural/urban designations, and has calculated caseload and death rates per 100,000 people. Be aware that caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.
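The per-100,000 rates mentioned above follow a standard normalization. The sketch below is only an illustration: the function name and the example county figures are hypothetical and do not come from the AP dataset.

```python
def rate_per_100k(count: int, population: int) -> float:
    """Return a count expressed per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count / population * 100_000


# Hypothetical county: 250 cumulative cases among 50,000 residents.
example_rate = rate_per_100k(250, 50_000)
print(round(example_rate, 1))
```

A rate computed this way inherits the caveat in the text: it reflects reported cases, which depend on test availability.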
This data is from the Hopkins dashboard that is updated regularly throughout the day. Like all organizations dealing with data, Hopkins is constantly refining and cleaning up their feed, so there may be brief moments where data does not appear correctly. At this link, you’ll find the Hopkins daily data reports, and a clean version of their feed.
The AP is updating this dataset hourly at 45 minutes past the hour.
To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.
Use AP's queries to filter the data or to join to other datasets we've made available to help cover the coronavirus pandemic
Filter cases by state here
Rank states by their status as current hotspots. Calculates the 7-day rolling average of new cases per capita in each state: https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker/workspace/query?queryid=481e82a4-1b2f-41c2-9ea1-d91aa4b3b1ac
Find recent hotspots within your state by running a query to calculate the 7-day rolling average of new cases per capita in each county: https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker/workspace/query?queryid=b566f1db-3231-40fe-8099-311909b7b687&showTemplatePreview=true
Join county-level case data to an earlier dataset released by AP on local hospital capacity here. To find out more about the hospital capacity dataset, see the full details.
Pull the 100 counties with the highest per-capita confirmed cases here
Rank all the counties by the highest per-capita rate of new cases in the past 7 days here. Be aware that because this ranks per-capita caseloads, very small counties may rise to the very top, so take into account raw caseload figures as well.
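The hotspot queries above rely on a 7-day rolling average of new cases per capita. The following is a minimal sketch of that calculation, assuming a trailing window and a per-100,000 scale; the AP queries themselves may differ in detail. The function names and the sample figures are hypothetical.

```python
from typing import List


def new_cases(cumulative: List[int]) -> List[int]:
    """Daily new cases derived from consecutive cumulative counts."""
    return [b - a for a, b in zip(cumulative, cumulative[1:])]


def rolling_mean(values: List[float], window: int = 7) -> List[float]:
    """Trailing mean; only positions with a full window are returned."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def rolling_new_cases_per_100k(
    cumulative: List[int], population: int, window: int = 7
) -> List[float]:
    """Rolling average of daily new cases, scaled per 100,000 residents."""
    daily = new_cases(cumulative)
    return [avg / population * 100_000 for avg in rolling_mean(daily, window)]


# Hypothetical county: nine days of cumulative counts, 20,000 residents.
cumulative = [100, 110, 120, 130, 140, 150, 160, 170, 190]
rates = rolling_new_cases_per_100k(cumulative, 20_000)
print([round(r, 2) for r in rates])
```

Because the result is per capita, a handful of cases in a very small county can produce a large rate, which is why the raw caseload should be read alongside it.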
The AP has designed an interactive map to track COVID-19 cases reported by Johns Hopkins.
Map: https://datawrapper.dwcdn.net/nRyaf/15/
Johns Hopkins timeseries data
- Johns Hopkins pulls data regularly to update their dashboard. Once a day, around 8pm EDT, Johns Hopkins adds the counts for all areas they cover to the timeseries file. These counts are snapshots of the latest cumulative counts provided by the source on that day. This can lead to inconsistencies if a source updates their historical data for accuracy, either increasing or decreasing the latest cumulative count.
- Johns Hopkins periodically edits their historical timeseries data for accuracy. They provide a file documenting all errors in their timeseries files that they have identified and fixed here
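One practical consequence of the snapshot behavior described above is that a cumulative series can occasionally go down. A minimal sketch for spotting such days follows; the function name and the sample series are hypothetical, and this is not part of the Johns Hopkins or AP tooling.

```python
from typing import List, Tuple


def flag_revisions(cumulative: List[int]) -> List[Tuple[int, int]]:
    """Return (index, change) for each day whose cumulative count fell."""
    return [
        (i, cur - prev)
        for i, (prev, cur) in enumerate(zip(cumulative, cumulative[1:]), start=1)
        if cur < prev
    ]


# Hypothetical series in which day 3 was revised downward by the source.
series = [10, 15, 22, 20, 25]
flags = flag_revisions(series)
print(flags)
```

Days flagged this way would yield negative "new cases" if the series were differenced naively, so they may need special handling before computing rolling averages.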
This data should be credited to the Johns Hopkins University COVID-19 tracking project.
The dataset summarizes counts and rates of cumulative COVID-19 cases by cities in Santa Clara County. Source: California Reportable Disease Information Exchange
This dataset is updated every Thursday.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset name: asppl_dataset_v2.csv
Version: 2.0
Dataset period: 06/07/2018 - 01/14/2022
Dataset Characteristics: Multivalued
Number of Instances: 8118
Number of Attributes: 9
Missing Values: Yes
Area(s): Health and education
Sources:
Virtual Learning Environment of the Brazilian Health System (AVASUS) (Brasil, 2022a);
Brazilian Occupational Classification (CBO) (Brasil, 2022b);
National Registry of Health Establishments (CNES) (Brasil, 2022c);
Brazilian Institute of Geography and Statistics (IBGE) (Brasil, 2022e).
Description: The data contained in the asppl_dataset_v2.csv dataset (see Table 1) originates from participants of the technology-based educational course “Health Care for People Deprived of Freedom.” The course is available on the AVASUS (Brasil, 2022a). This dataset provides elementary data for analyzing the course’s impact and reach and the profile of its participants. In addition, it brings an update of the data presented in work by Valentim et al. (2021).
Table 1: Description of AVASUS dataset features.
| Attribute | Description | Data type | Value |
| --- | --- | --- | --- |
| gender | Gender of the course participant. | Categorical. | Feminino / Masculino / Não Informado. (In English, Female, Male or Uninformed) |
| course_progress | Percentage of completion of the course. | Numerical. | Range from 0 to 100. |
| course_evaluation | A score given to the course by the participant. | Numerical. | 0, 1, 2, 3, 4, 5 or NaN. |
| evaluation_commentary | Comment made by the participant about the course. | Categorical. | Free text or NaN. |
| region | Brazilian region in which the participant resides. | Categorical. | Brazilian region according to IBGE: Norte, Nordeste, Centro-Oeste, Sudeste or Sul (In English North, Northeast, Midwest, Southeast or South). |
| CNES | The CNES code refers to the health establishment where the participant works. | Numerical. | CNES Code or NaN. |
| health_care_level | Identification of the health care network level for which the course participant works. | Categorical. | “ATENCAO PRIMARIA”, “MEDIA COMPLEXIDADE”, “ALTA COMPLEXIDADE”, and their possible combinations. |
| year_enrollment | Year in which the course participant registered. | Numerical. | Year (YYYY). |
| CBO | Participant occupation. | Categorical. | Text coded according to the Brazilian Classification of Occupations or “Indivíduo sem afiliação formal.” (In English “Individual without formal affiliation.”) |
Dataset name: prison_syphilis_and_population_brazil.csv
Dataset period: 2017 - 2020
Dataset Characteristics: Multivalued
Number of Instances: 6
Number of Attributes: 13
Missing Values: No
Source:
National Penitentiary Department (DEPEN) (Brasil, 2022d);
Description: The data contained in the prison_syphilis_and_population_brazil.csv dataset (see Table 2) originate from the National Penitentiary Department Information System (SISDEPEN) (Brasil, 2022d). This dataset provides data on the population and prevalence of syphilis in the Brazilian prison system. In addition, it brings a rate that represents the normalized data for purposes of comparison between the populations of each region and Brazil.
Table 2: Description of DEPEN dataset Features.
| Attribute | Description | Data type | Value |
| --- | --- | --- | --- |
| Region | Brazilian region in which the participant resides. In addition, the sum of the regions, which refers to Brazil. | Categorical. | Brazil and Brazilian region according to IBGE: North, Northeast, Midwest, Southeast or South. |
| syphilis_2017 | Number of syphilis cases in the prison system in 2017. | Numerical. | Number of syphilis cases. |
| syphilis_rate_2017 | Normalized rate of syphilis cases in 2017. | Numerical. | Syphilis case rate. |
| syphilis_2018 | Number of syphilis cases in the prison system in 2018. | Numerical. | Number of syphilis cases. |
| syphilis_rate_2018 | Normalized rate of syphilis cases in 2018. | Numerical. | Syphilis case rate. |
| syphilis_2019 | Number of syphilis cases in the prison system in 2019. | Numerical. | Number of syphilis cases. |
| syphilis_rate_2019 | Normalized rate of syphilis cases in 2019. | Numerical. | Syphilis case rate. |
| syphilis_2020 | Number of syphilis cases in the prison system in 2020. | Numerical. | Number of syphilis cases. |
| syphilis_rate_2020 | Normalized rate of syphilis cases in 2020. | Numerical. | Syphilis case rate. |
| pop_2017 | Prison population in 2017. | Numerical. | Population number. |
| pop_2018 | Prison population in 2018. | Numerical. | Population number. |
| pop_2019 | Prison population in 2019. | Numerical. | Population number. |
| pop_2020 | Prison population in 2020. | Numerical. | Population number. |
Dataset name: students_cumulative_sum.csv
Dataset period: 2018 - 2020
Dataset Characteristics: Multivalued
Number of Instances: 6
Number of Attributes: 7
Missing Values: No
Source:
Virtual Learning Environment of the Brazilian Health System (AVASUS) (Brasil, 2022a);
Brazilian Institute of Geography and Statistics (IBGE) (Brasil, 2022e).
Description: The data contained in the students_cumulative_sum.csv dataset (see Table 3) originate mainly from AVASUS (Brasil, 2022a). This dataset provides data on the number of students by region and year. In addition, it brings a rate that represents the normalized data for purposes of comparison between the populations of each region and Brazil. We used population data estimated by the IBGE (Brasil, 2022e) to calculate the rate.
Table 3: Description of Students dataset Features.
This dataset includes all the data and R code needed to reproduce the analyses in a forthcoming manuscript: Copes, W. E., Q. D. Read, and B. J. Smith. Environmental influences on drying rate of spray applied disinfestants from horticultural production services. PhytoFrontiers, DOI pending.

Study description: Instructions for disinfestants typically specify a dose and a contact time to kill plant pathogens on production surfaces. A problem occurs when disinfestants are applied to large production areas where the evaporation rate is affected by weather conditions. The common contact time recommendation of 10 min may not be achieved under hot, sunny conditions that promote fast drying. This study is an investigation into how the evaporation rates of six commercial disinfestants vary when applied to six types of substrate materials under cool to hot and cloudy to sunny weather conditions. Initially, disinfestants with low surface tension spread out to provide 100% coverage and disinfestants with high surface tension beaded up to provide about 60% coverage when applied to hard smooth surfaces. Disinfestants applied to porous materials, such as wood and concrete, were quickly absorbed into the body of the material. Even though disinfestants evaporated faster under hot sunny conditions than under cool cloudy conditions, coverage was reduced considerably in the first 2.5 min under most weather conditions and reduced to less than or equal to 50% coverage by 5 min.

Dataset contents: This dataset includes R code to import the data and fit Bayesian statistical models using the model fitting software CmdStan, interfaced with R using the packages brms and cmdstanr. The models (one for 2022 and one for 2023) compare how quickly different spray-applied disinfestants dry, depending on what chemical was sprayed, what surface material it was sprayed onto, and what the weather conditions were at the time. Next, the statistical models are used to generate predictions and compare mean drying rates between the disinfestants, surface materials, and weather conditions. Finally, tables and figures are created.

These files are included:
- Drying2022.csv: drying rate data for the 2022 experimental run
- Weather2022.csv: weather data for the 2022 experimental run
- Drying2023.csv: drying rate data for the 2023 experimental run
- Weather2023.csv: weather data for the 2023 experimental run
- disinfestant_drying_analysis.Rmd: RMarkdown notebook with all data processing, analysis, and table creation code
- disinfestant_drying_analysis.html: rendered output of notebook
- MS_figures.R: additional R code to create figures formatted for journal requirements
- fit2022_discretetime_weather_solar.rds: fitted brms model object for 2022. This will allow users to reproduce the model prediction results without having to refit the model, which was originally fit on a high-performance computing cluster
- fit2023_discretetime_weather_solar.rds: fitted brms model object for 2023
- data_dictionary.xlsx: descriptions of each column in the CSV data files
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Rhodium price data, historical values, forecasts, and news provided by Money Metals Exchange. Rhodium prices and trends updated regularly to provide accurate market insights.
Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
Concept: Average cost of credit operations that make up the portfolio of loans, financing and leasing operations of financial institutions belonging to the National Financial System. It includes the totality of outstanding operations classified as current assets, regardless of the date of the credit lending. Source: Central Bank of Brazil, Statistics Department. 27713-average-cost-of-outstanding-loans---earmarked---households---real-estate-financing---market-r
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is part of the Monash, UEA & UCR time series regression repository. http://timeseriesregression.org/
The goal of this dataset is to estimate heart rate using PPG and ECG data. This dataset contains 7949 time series obtained from the Physionet's BIDMC PPG and Respiration dataset, which was extracted from the much larger MIMIC II waveform database.
Please refer to https://physionet.org/content/bidmc/1.0.0/ for more details
Relevant papers: Pimentel, M.A.F. et al. Towards a Robust Estimation of Respiratory Rate from Pulse Oximeters. IEEE Transactions on Biomedical Engineering, 64(8), pp.1914-1923, 2016. DOI: 10.1109/TBME.2016.2613124.
Citation request:
- Pimentel, M.A.F. et al. Towards a Robust Estimation of Respiratory Rate from Pulse Oximeters. IEEE Transactions on Biomedical Engineering, 64(8), pp.1914-1923, 2016. DOI: 10.1109/TBME.2016.2613124.
- Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: The current propagation models of COVID-19 are poorly consistent with existing epidemiological data and with evidence that the SARS-CoV-2 genome is mutating, for potential aggressive evolution of the disease.

Objectives: We looked for fundamental variables that were missing from current analyses. Among them were regional climate heterogeneity, viral evolution processes versus founder effects, and large-scale virus containment measures.

Methods: We challenged regional versus genetic evolution models of COVID-19 at a whole-population level, over 168,089 laboratory-confirmed SARS-CoV-2 infection cases in Italy, Spain, and Scandinavia at early time-points of the pandemic. Diffusion data in Germany, France, and the United Kingdom provided a validation dataset of 210,239 additional cases.

Results: Mean doubling time of COVID-19 cases was 6.63 days in Northern versus 5.38 days in Southern Italy. Spain extended this trend of faster diffusion in Southern Europe, with a doubling time of 4.2 days. Slower doubling times were observed in Sweden (9.4 days), Finland (10.8 days), and Norway (12.95 days). COVID-19 doubling time in Germany (7.0 days), France (7.5 days), and the United Kingdom (7.2 days) supported the North/South gradient model. Clusters of SARS-CoV-2 mutations upon sequential diffusion were not found to clearly correlate with regional distribution dynamics.

Conclusion: Acquisition of mutations upon SARS-CoV-2 spreading failed to explain regional diffusion heterogeneity at early pandemic times. Our findings indicate that COVID-19 transmission rates are rather associated with a sharp North/South climate gradient, with faster spreading in Southern regions. Thus, warmer climate conditions may not limit SARS-CoV-2 infectivity. Very cold regions may be better spared by recurrent courses of SARS-CoV-2 infection.
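The abstract does not state how doubling times were estimated. As a rough illustration only, the sketch below assumes pure exponential growth between two cumulative counts, in which case the doubling time is the elapsed days multiplied by ln 2 / ln(N_end / N_start). The function name and the sample counts are hypothetical.

```python
import math


def doubling_time(n_start: float, n_end: float, days: float) -> float:
    """Doubling time in days, assuming exponential growth between two counts."""
    if n_start <= 0 or days <= 0:
        raise ValueError("n_start and days must be positive")
    if n_end <= n_start:
        raise ValueError("n_end must exceed n_start for growth to be defined")
    return days * math.log(2) / math.log(n_end / n_start)


# Hypothetical region: cases grow from 1,000 to 4,000 in 10 days,
# i.e. two doublings over the interval.
example_days = doubling_time(1_000, 4_000, 10)
print(round(example_days, 2))
```

Under this assumption a shorter doubling time means faster spread, which is how the Southern versus Northern figures in the abstract are meant to be compared.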
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
UK Gas decreased 26.27 GBp/Thm or 20.95% since the beginning of 2025, according to trading on a contract for difference (CFD) that tracks the benchmark market for this commodity. UK Natural Gas - values, historical data, forecasts and news - updated in March 2025.
This data package includes the underlying data and files to replicate the calculations, charts, and tables presented in Estimates of Fundamental Equilibrium Exchange Rates, May 2017, PIIE Policy Brief 17-19. If you use the data, please cite as: Cline, William R. (2017). Estimates of Fundamental Equilibrium Exchange Rates, May 2017. PIIE Policy Brief 17-19. Peterson Institute for International Economics.
This information covers fires, false alarms and other incidents attended by fire crews, and the statistics include the numbers of incidents, fires, fatalities and casualties as well as information on response times to fires. The Home Office also collect information on the workforce, fire prevention work, health and safety and firefighter pensions. All data tables on fire statistics are below.
The Home Office has responsibility for fire services in England. The vast majority of data tables produced by the Home Office are for England, but some tables (0101, 0103, 0201, 0501, 1401) are for Great Britain split by nation. In the past the Department for Communities and Local Government (who previously had responsibility for fire services in England) produced data tables for Great Britain and at times the UK. Similar information for devolved administrations is available at Scotland: Fire and Rescue Statistics (https://www.firescotland.gov.uk/about/statistics/), Wales: Community safety (https://statswales.gov.wales/Catalogue/Community-Safety-and-Social-Inclusion/Community-Safety) and Northern Ireland: Fire and Rescue Statistics (http://www.nifrs.org/).
If you use assistive technology (for example, a screen reader) and need a version of any of these documents in a more accessible format, please email alternativeformats@homeoffice.gov.uk. Please tell us what format you need. It will help us if you say what assistive technology you use.
Fire statistics guidance
Fire statistics incident level datasets
FIRE0101: Incidents attended by fire and rescue services by nation and population (MS Excel Spreadsheet, 94 KB): https://assets.publishing.service.gov.uk/media/6787aa6c2cca34bdaf58a257/fire-statistics-data-tables-fire0101-230125.xlsx (see also: Previous FIRE0101 tables)
FIRE0102: Incidents attended by fire and rescue services in England, by incident type and fire and rescue authority (MS Excel Spreadsheet, 1.51 MB): https://assets.publishing.service.gov.uk/media/6787ace93f1182a1e258a25c/fire-statistics-data-tables-fire0102-230125.xlsx (see also: Previous FIRE0102 tables)
FIRE0103: Fires attended by fire and rescue services by nation and population (MS Excel Spreadsheet, 123 KB): https://assets.publishing.service.gov.uk/media/6787b036868b2b1923b64648/fire-statistics-data-tables-fire0103-230125.xlsx (see also: Previous FIRE0103 tables)
FIRE0104: Fire false alarms by reason for false alarm, England (MS Excel Spreadsheet, 295 KB): https://assets.publishing.service.gov.uk/media/6787b3ac868b2b1923b6464d/fire-statistics-data-tables-fire0104-230125.xlsx (see also: Previous FIRE0104 tables)
FIRE0201: Dwelling fires attended by fire and rescue services by motive, population and nation (MS Excel Spreadsheet, 111 KB): https://assets.publishing.service.gov.uk/media/6787b4323f1182a1e258a26a/fire-statistics-data-tables-fire0201-230125.xlsx
Previous FIRE0201 tables: https://www.gov.uk/government/statistical-data-sets/fire0201-previous-data-t
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The USDRUB decreased 0.6990 or 0.83% to 83.9215 on Wednesday March 26 from 84.6205 in the previous trading session. Russian Ruble - values, historical data, forecasts and news - updated in March 2025.
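The absolute and percentage changes quoted above are related by simple arithmetic. The sketch below checks the figures from the text; the function name is hypothetical, and the percentage is assumed to be measured against the previous session's value.

```python
from typing import Tuple


def session_change(previous: float, current: float) -> Tuple[float, float]:
    """Return (absolute change, percent change relative to previous)."""
    if previous == 0:
        raise ValueError("previous value must be non-zero")
    diff = current - previous
    return diff, diff / previous * 100


# Values quoted in the text for USDRUB.
diff, pct = session_change(84.6205, 83.9215)
print(round(diff, 4), round(pct, 2))
```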
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Historical dataset of the daily level of the federal funds rate back to 1954. The fed funds rate is the interest rate at which depository institutions (banks and credit unions) lend reserve balances to other depository institutions overnight, on an uncollateralized basis. The Federal Open Market Committee (FOMC) meets eight times a year to determine the federal funds target rate.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Inflation Rate in Russia increased to 10.10 percent in February from 9.90 percent in January of 2025. This dataset provides - Russia Inflation Rate - actual values, historical data, forecast, chart, statistics, economic calendar and news.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Iron Ore decreased 1.36 USD/MT or 1.31% since the beginning of 2025, according to trading on a contract for difference (CFD) that tracks the benchmark market for this commodity. Iron Ore - values, historical data, forecasts and news - updated in March 2025.
This repository is the second updated version of the attribute-linked residential property price dataset in UK Data Service ReShare 854240 (https://reshare.ukdataservice.ac.uk/854240/). As with the first updated version (ReShare 855033 https://reshare.ukdataservice.ac.uk/855033/) in 2021, this updated dataset contains individual property transactions and associated variables from both Land Registry Price Paid Dataset (LR PPD) and the Ministry for Housing, Communities and Local Government (MHCLG) Domestic Energy Performance Certificate (EPC) data. This is a linked result by address matching between LR-PPD data (1/1/1995-27/6/2022) and Domestic EPCs data (the twelfth version: ending with 30/6/2022). It is the whole of the 2022 update house price per square metre dataset published in the Greater London Authority (GLA) London Datastore (https://data.london.gov.uk/dataset/house-price-per-square-metre-in-england-and-wales).
The linked dataset in this repository is the uncorrected version, recording almost 20 million transactions with 106 variables in England and Wales between 1/1/1995 and 27/6/2022. We have offered technical validation and data cleaning code in UKDA ReShare 854240 to help users evaluate the representation and clean up the data. There is no unique way to clean this raw linked dataset, so we suggest users develop their own clean-up process based on their research requirements. In addition, this repository covers the original LR PPD and Domestic EPCs for the linked data (house price per square metre dataset). Similar to the first updated version, a field header has been added in LR PPD. Six variables (individual lodgement identifier, address, address 1, address 2, address 3, postcode) in Domestic EPCs are removed. A unique identifier (id) is added in Domestic EPCs; this id is newly created for Version 12 Domestic EPCs and is not the same id as in the Domestic EPCs from UK Data Service ReShare 854240 and ReShare 855033. Since November 2021 DLUHC has published Domestic EPCs with the Unique Property Reference Number (UPRN), hence the dataset in this repository contains the UPRN information from the Domestic EPCs.
https://creativecommons.org/publicdomain/zero/1.0/
The “Tesla Stock Price Data (Last One Year)” dataset is a comprehensive collection of historical stock market information, focusing on Tesla Inc. (TSLA) for the past year. This dataset serves as a valuable resource for financial analysts, investors, researchers, and data enthusiasts who are interested in studying the trends, patterns, and performance of Tesla’s stock in the financial markets.

It consists of 9 columns referring to date, high and low prices, open and closing value, volume, cumulative open interest and the change in price.

As a first step, to better understand the data, we plot the time series of each attribute. The cumulative Open Interest (OI) is the total number of open contracts being held in a particular Future, Call or Put contract on the Exchange. We can see that the biggest drop of the stock happened in January 2023, and that after 5 to 6 months, around the summer of the same year, it regained its value, with opening and closing prices around 300.

As a next step, we produce further plots to better understand the relation between our target column (price change) and every other attribute. To interpret the results:
Linear Regression:
- Mean Absolute Error (MAE): 6.28. This model, on average, predicts the “Price Change” within approximately 6.28 units of the true value.
- Mean Squared Error (MSE): 52.97. MSE measures the average of squared differences, and this value suggests some variability in prediction errors.
- Root Mean Squared Error (RMSE): 7.28. RMSE is the square root of MSE and is in the same units as the target variable. An RMSE of 7.28 indicates the typical prediction error.
- R-squared (R2): 0.0868. R-squared represents the proportion of the variance in the target variable explained by the model. An R2 of 0.0868 suggests that the model explains only a small portion of the variance, indicating limited predictive power.

Decision Tree Regression:
- Mean Absolute Error (MAE): 9.21. This model, on average, predicts the “Price Change” within approximately 9.21 units of the true value, which is higher than the Linear Regression model.
- Mean Squared Error (MSE): 150.69. The MSE is relatively high, indicating larger prediction errors and more variability.
- Root Mean Squared Error (RMSE): 12.28. An RMSE of 12.28 is notably higher, suggesting that this model has larger prediction errors.
- R-squared (R2): -1.598. The negative R-squared value indicates that the model performs worse than a horizontal line as a predictor, indicating a poor fit.

Random Forest Regression:
- Mean Absolute Error (MAE): 6.99. This model, on average, predicts the “Price Change” within approximately 6.99 units of the true value, similar to Linear Regression.
- Mean Squared Error (MSE): 62.79. MSE is lower than the Decision Tree model but higher than Linear Regression, suggesting intermediate prediction accuracy.
- Root Mean Squared Error (RMSE): 7.92. RMSE is also intermediate, indicating moderate prediction errors.
- R-squared (R2): -0.0824. The negative R-squared suggests that the Random Forest model does not perform well and has limited predictive power.
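For readers who want to see how the four metrics relate, here is a minimal sketch using their standard definitions. It does not use the Tesla data or the author's models; the function name and the toy values are hypothetical, chosen so that the predictions are worse than predicting the mean and R-squared comes out negative.

```python
import math
from typing import Dict, Sequence


def regression_metrics(
    y_true: Sequence[float], y_pred: Sequence[float]
) -> Dict[str, float]:
    """Compute MAE, MSE, RMSE and R-squared from paired values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    ss_res = sum(e * e for e in errors)
    return {
        "mae": mae,
        "mse": mse,
        "rmse": math.sqrt(mse),  # same units as the target
        "r2": 1 - ss_res / ss_tot,  # negative when worse than the mean
    }


# Toy example: predictions run opposite to the true values.
metrics = regression_metrics([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
print(metrics)
```

The same relationship lets the reported figures be cross-checked: each RMSE above is the square root of the corresponding MSE (for example, the square root of 52.97 is about 7.28).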
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
General overview The following datasets are described by this metadata record, and are available for download from the provided URL.
####
Physical parameters raw log files
Raw log files
1) DATE=
2) Time= UTC+11
3) PROG=Automated program to control sensors and collect data
4) BAT=Amount of battery remaining
5) STEP=check aquation manual
6) SPIES=check aquation manual
7) PAR=Photoactive radiation
8) Levels=check aquation manual
9) Pumps= program for pumps
10) WQM=check aquation manual
####
Respiration/PAM chamber raw excel spreadsheets
Abbreviations in headers of datasets. Note: Two data sets are provided in different formats: raw and cleaned (adj). These are the same data with the PAR column moved over to PAR.all for analysis. All headers are the same. The cleaned (adj) dataframe will work with the R syntax below; alternatively, add code to do the cleaning in R.
Date: ISO 1986 - Check
Time: UTC+11 unless otherwise stated
DATETIME: UTC+11 unless otherwise stated
ID (of instrument in respiration chambers):
- ID43=Pulse amplitude fluorescence measurement of control
- ID44=Pulse amplitude fluorescence measurement of acidified chamber
- ID=1 Dissolved oxygen
- ID=2 Dissolved oxygen
- ID3= PAR
- ID4= PAR
PAR=Photo active radiation (umols)
F0=Minimal fluorescence from PAM
Fm=Maximum fluorescence from PAM
Yield=(F0 - Fm)/Fm
rChl=An estimate of chlorophyll (Note: this is uncalibrated and is an estimate only)
Temp=Temperature (degrees C)
PAR=Photo active radiation
PAR2=Photo active radiation 2
DO=Dissolved oxygen
%Sat=Saturation of dissolved oxygen
Notes=This is the program of the underwater submersible logger, with the following abbreviations:
- Notes-1) PAM=
- Notes-2) PAM=Gain level set (see aquation manual for more detail)
- Notes-3) Acclimatisation=Program of slowly introducing treatment water into chamber
- Notes-4) Shutter start up 2 sensors+sample…=Shutter PAMs automatic set up procedure (see aquation manual)
- Notes-5) Yield step 2=PAM yield measurement and calculation of control
- Notes-6) Yield step 5=PAM yield measurement and calculation of acidified
- Notes-7) Abatus respiration DO and PAR step 1=Program to measure dissolved oxygen and PAR (see aquation manual). Steps 1-4 are different stages of this program including pump cycles, DO and PAR measurements.

8) Rapid light curve data
- Pre LC: A yield measurement prior to the following measurement
- After 10.0 sec at 0.5% to 8%: Level of each of the 8 steps of the rapid light curve
- Odessey PAR (only in some deployments): An extra measure of PAR (umols) using an Odessey data logger
- Dataflow PAR: An extra measure of PAR (umols) using a Dataflow sensor
- PAM PAR: This is copied from the PAR or PAR2 column
- PAR all: This is the complete PAR file and should be used
- Deployment: Identifying which deployment the data came from
####
Respiration chamber biomass data
The data are chlorophyll a biomass from cores taken from the respiration chambers. The headers are: Depth (mm); Treat (Acidified or control); Chl a (pigment and indicator of biomass); Core (5 cores were collected from each chamber, of which three were analysed for chl a). The cores are pseudoreplicates/subsamples from the chambers and should not be treated as replicates.
####
Associated R script file for pump cycles of respiration chambers
Associated respiration chamber data to determine the times when respiration chamber pumps delivered treatment water to chambers. Determined from Aquation log files (see associated files). Use the chamber cut times to determine net production rates. Note: Users need to avoid the times when the respiration chambers are delivering water as this will give incorrect results. The headers that get used in the attached/associated R file are start regression and end regression. The remaining headers are not used unless called for in the associated R script. The last columns of these datasets (intercept, ElapsedTimeMincoef) are determined from the linear regressions described below.
To determine the rate of change of net production, coefficients of the regression of oxygen consumption in discrete 180 minute data blocks were determined. R squared values for fitted regressions of these coefficients were consistently high (greater than 0.9). We make two assumptions with calculation of net production rates: the first is that heterotrophic community members do not change their metabolism under OA; and the second is that the heterotrophic communities are similar between treatments.
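The block-wise regression described above can be sketched as follows. This is an illustration in Python rather than the associated R script: it fits an ordinary least-squares line to dissolved oxygen against elapsed minutes within each 180-minute block, and it does not exclude pump-delivery periods, which the real analysis must do. All function names and sample values are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def linear_fit(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def block_fits(
    minutes: Sequence[float], oxygen: Sequence[float], block_minutes: int = 180
) -> Dict[int, Tuple[float, float]]:
    """Fit one regression per discrete block of elapsed time."""
    blocks: Dict[int, List[Tuple[float, float]]] = {}
    for t, o in zip(minutes, oxygen):
        blocks.setdefault(int(t // block_minutes), []).append((t, o))
    fits = {}
    for key, points in sorted(blocks.items()):
        if len(points) >= 2:  # a line needs at least two observations
            xs, ys = zip(*points)
            fits[key] = linear_fit(xs, ys)
    return fits


# Hypothetical readings every 30 minutes: oxygen falls in the first
# block (net consumption) and rises in the second (net production).
minutes = list(range(0, 360, 30))
oxygen = [100 - 0.05 * t for t in minutes[:6]] + [
    91 + 0.02 * (t - 180) for t in minutes[6:]
]
fits = block_fits(minutes, oxygen)
print(fits)
```

The slope of each block plays the role of the elapsed-time coefficient mentioned above, and its sign indicates whether oxygen was being consumed or produced during that block.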
####
Combined dataset pH, temperature, oxygen, salinity, velocity for experiment
This data is rapid light curve data generat...
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The benchmark interest rate in the Euro Area was last recorded at 2.65 percent. This dataset provides - Euro Area Interest Rate - actual values, historical data, forecast, chart, statistics, economic calendar and news.