https://creativecommons.org/publicdomain/zero/1.0/
This is a dataset of the most highly populated city (if applicable) in a form easy to join with the COVID19 Global Forecasting (Week 1) dataset. You can see how to use it in this kernel
There are four columns. The first two correspond to columns in the original COVID19 Global Forecasting (Week 1) dataset. The other two give the highest population density, at city level, for the given country/state. Note that some countries are very small, and in those cases the population density reflects the entire country. Since the original dataset also includes a few cruise ships, I've added them as well.
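A join on the two shared columns could look roughly like the sketch below. It uses only the Python standard library and toy rows. The key column names `Province/State` and `Country/Region` and the density column name `max_density` are assumptions for illustration, so check them against the actual CSV headers before relying on this.

```python
# Toy rows standing in for the forecasting data (column names are assumed).
cases = [
    {"Province/State": "New York", "Country/Region": "US", "ConfirmedCases": 100},
    {"Province/State": "", "Country/Region": "Italy", "ConfirmedCases": 200},
    {"Province/State": "", "Country/Region": "Atlantis", "ConfirmedCases": 1},
]

# Toy rows standing in for this dataset. "max_density" is a hypothetical
# column name and the values are placeholders, not real densities.
density = [
    {"Province/State": "New York", "Country/Region": "US", "max_density": 1.0},
    {"Province/State": "", "Country/Region": "Italy", "max_density": 2.0},
]


def join_key(row):
    """Build the (state, country) key shared by both tables."""
    return (row["Province/State"] or "", row["Country/Region"])


def left_join(left_rows, right_rows):
    """Left-join densities onto the case rows; unmatched rows get None."""
    lookup = {join_key(r): r["max_density"] for r in right_rows}
    return [{**row, "max_density": lookup.get(join_key(row))} for row in left_rows]


joined = left_join(cases, density)
```

A left join keeps every row of the forecasting data, so places missing from the density table show up with `None` rather than silently disappearing.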
Thanks a lot to Kaggle for this competition that gave me the opportunity to look closely at some data and understand this problem better.
Summary: I believe that the square root of the population density should relate to the logistic growth factor of the SIR model. I think the SEIR model isn't applicable due to any intervention being too late for a fast-spreading virus like this, especially in places with dense populations.
After playing with the data provided in COVID19 Global Forecasting (Week 1) (and everything else online or media) a bit, one thing becomes clear. They have nothing to do with epidemiology. They reflect sociopolitical characteristics of a country/state and, more specifically, the reactivity and attitude towards testing.
The testing method used (PCR tests) means that what we measure could potentially be a proxy for the number of people infected during the last 3 weeks, i.e., the growth (with lag). It's not how many people have been infected and recovered. Antibody or serology tests would measure that, and by using them, we could go back to normality faster... but those will arrive too late. Way earlier, China will have experimentally shown that it's safe to go back to normal as soon as your number of newly infected per day is close to zero.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F197482%2F429e0fdd7f1ce86eba882857ac7a735e%2Fcovid-summary.png?generation=1585072438685236&alt=media
My view, as a person living in NYC, about this virus, is that by the time governments react to media pressure, to lock down or even test, it's too late. In dense areas, everyone susceptible has already had ample opportunities to be infected. Especially for a virus with a 5-14 day lag between infection and symptoms, a period during which hosts spread it all over the subway, the conditions are hopeless. Active populations have already been exposed, mostly asymptomatic and recovered. Sensitive/older populations are more self-isolated/careful in affluent societies (maybe this isn't the case in North Italy). As the virus finishes exploring the active population, it starts penetrating the more isolated ones. At this point in time, the first fatalities happen. Then testing starts. Then the media and the lockdown. Lockdown seems overly effective because it coincides with the tail of the disease spread. It helps slow down the virus exploring the long tail of the sensitive population, and we should all contribute by doing it, but it doesn't cause the end of the disease. If it did, then as soon as people were back in the streets (see China), there would be repeated outbreaks.
Smart politicians will test a lot because it will make their condition look worse. It helps them demand more resources. At the same time, they will have a low rate of fatalities due to the large denominator. They can take credit for managing a disproportionately major crisis well, in contrast to people who didn't test.
We were lucky this time. We, Westerners, have woken up to the potential of a pandemic. I'm sure we will give further resources for prevention. Additionally, we will be more open-minded, helping politicians to have more direct responses. We will also require them to be more responsible in their messages and reactions.
https://www.usa.gov/government-works
Note: Reporting of new COVID-19 Case Surveillance data will be discontinued July 1, 2024, to align with the process of removing SARS-CoV-2 infections (COVID-19 cases) from the list of nationally notifiable diseases. Although these data will continue to be publicly available, the dataset will no longer be updated.
Authorizations to collect certain public health data expired at the end of the U.S. public health emergency declaration on May 11, 2023. The following jurisdictions discontinued COVID-19 case notifications to CDC: Iowa (11/8/21), Kansas (5/12/23), Kentucky (1/1/24), Louisiana (10/31/23), New Hampshire (5/23/23), and Oklahoma (5/2/23). Please note that these jurisdictions will not routinely send new case data after the dates indicated. As of 7/13/23, case notifications from Oregon will only include pediatric cases resulting in death.
This case surveillance public use dataset has 12 elements for all COVID-19 cases shared with CDC and includes demographics, any exposure history, disease severity indicators and outcomes, presence of any underlying medical conditions and risk behaviors, and no geographic data.
The COVID-19 case surveillance database includes individual-level data reported to U.S. states and autonomous reporting entities, including New York City and the District of Columbia (D.C.), as well as U.S. territories and affiliates. On April 5, 2020, COVID-19 was added to the Nationally Notifiable Condition List and classified as “immediately notifiable, urgent (within 24 hours)” by a Council of State and Territorial Epidemiologists (CSTE) Interim Position Statement (Interim-20-ID-01). CSTE updated the position statement on August 5, 2020, to clarify the interpretation of antigen detection tests and serologic test results within the case classification (Interim-20-ID-02). The statement also recommended that all states and territories enact laws to make COVID-19 reportable in their jurisdiction, and that jurisdictions conducting surveillance should submit case notifications to CDC. COVID-19 case surveillance data are collected by jurisdictions and reported voluntarily to CDC.
For more information:
NNDSS Supports the COVID-19 Response | CDC.
The deidentified data in the “COVID-19 Case Surveillance Public Use Data” include demographic characteristics, any exposure history, disease severity indicators and outcomes, clinical data, laboratory diagnostic test results, and presence of any underlying medical conditions and risk behaviors. All data elements can be found on the COVID-19 case report form located at www.cdc.gov/coronavirus/2019-ncov/downloads/pui-form.pdf.
COVID-19 case reports have been routinely submitted using nationally standardized case reporting forms. On April 5, 2020, CSTE released an Interim Position Statement with national surveillance case definitions for COVID-19 included. Current versions of these case definitions are available here: https://ndc.services.cdc.gov/case-definitions/coronavirus-disease-2019-2021/.
All cases reported on or after were requested to be shared by public health departments to CDC using the standardized case definitions for laboratory-confirmed or probable cases. On May 5, 2020, the standardized case reporting form was revised. Case reporting using this new form is ongoing among U.S. states and territories.
To learn more about the limitations in using case surveillance data, visit FAQ: COVID-19 Data and Surveillance.
CDC’s Case Surveillance Section routinely performs data quality assurance procedures (i.e., ongoing corrections and logic checks to address data errors). To date, the following data cleaning steps have been implemented:
To prevent release of data that could be used to identify people, data cells are suppressed for low frequency (<5) records and indirect identifiers (e.g., date of first positive specimen). Suppression includes rare combinations of demographic characteristics (sex, age group, race/ethnicity). Suppressed values are re-coded to the NA answer option; records with data suppression are never removed.
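The suppression rule for rare demographic combinations can be illustrated with a small sketch. This is not CDC's actual procedure: the field names, the choice to recode all three demographic fields together, and the toy records are assumptions. Only the threshold of 5, the recoding to NA, and the fact that records are never removed come from the description above.

```python
from collections import Counter

DEMOGRAPHIC_FIELDS = ("sex", "age_group", "race_ethnicity")  # assumed field names
THRESHOLD = 5  # combinations seen fewer than 5 times are suppressed


def combination(record):
    """Return the demographic combination of one record as a tuple."""
    return tuple(record[field] for field in DEMOGRAPHIC_FIELDS)


def suppress_rare_combinations(records):
    """Recode rare demographic combinations to "NA" without dropping records."""
    counts = Counter(combination(r) for r in records)
    result = []
    for record in records:
        cleaned = dict(record)  # copy so the input is left untouched
        if counts[combination(record)] < THRESHOLD:
            for field in DEMOGRAPHIC_FIELDS:
                cleaned[field] = "NA"
        result.append(cleaned)
    return result


# Toy data: six records share one combination, one record is unique.
common = {"sex": "Female", "age_group": "20 - 29 Years", "race_ethnicity": "Unknown"}
rare = {"sex": "Male", "age_group": "80+ Years", "race_ethnicity": "Unknown"}
toy_records = [dict(common) for _ in range(6)] + [dict(rare)]
released = suppress_rare_combinations(toy_records)
```

The output has the same number of records as the input. Only the demographic fields of the unique record are replaced by `"NA"`.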
For questions, please contact Ask SRRG (eocevent394@cdc.gov).
COVID-19 data are available to the public as summary or aggregate count files, including total counts of cases and deaths by state and by county. These
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the Credit River township household income by gender. The dataset can be utilized to understand the gender-based income distribution of Credit River township income.
The dataset will have the following datasets when applicable
Please note: The 2020 1-Year ACS estimates data was not reported by the Census Bureau due to the impact on survey collection and analysis caused by COVID-19. Consequently, median household income data for 2020 is unavailable for large cities (population 65,000 and above).
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for any of your research projects, reports or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
Explore our comprehensive data analysis and visual representations for a deeper understanding of Credit River township income distribution by gender. You can refer the same here
The G.19 Statistical Release, Consumer Credit, reports outstanding credit extended to individuals for household, family, and other personal expenditures, excluding loans secured by real estate. Total consumer credit comprises two major types: revolving and nonrevolving. Revolving credit plans may be unsecured or secured by collateral and allow a consumer to borrow up to a prearranged limit and repay the debt in one or more installments. Credit card loans comprise most of revolving consumer credit measured in the G.19, but other types, such as prearranged overdraft plans, are also included. Nonrevolving credit is closed-end credit extended to consumers that is repaid on a prearranged repayment schedule and may be secured or unsecured. To borrow additional funds, the consumer must enter into an additional contract with the lender. Consumer motor vehicle and education loans comprise the majority of nonrevolving credit, but other loan types, such as boat loans, recreational vehicle loans, and personal loans, are also included. This statistical release is designated by OMB as a Principal Federal Economic Indicator (PFEI).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Costa Rica CR: Internet Users: Individuals: % of Population data was reported at 82.598 % in 2022. This records a decrease from the previous number of 82.749 % for 2021. Costa Rica CR: Internet Users: Individuals: % of Population data is updated yearly, averaging 25.100 % (median) from Dec 1990 to 2022, with 33 observations. The data reached an all-time high of 82.749 % in 2021 and a record low of 0.000 % in 1991. Costa Rica CR: Internet Users: Individuals: % of Population data remains in active status in CEIC and is reported by World Bank. The data is categorized under Global Database’s Costa Rica – Table CR.World Bank.WDI: Telecommunication. Internet users are individuals who have used the Internet (from any location) in the last 3 months. The Internet can be used via a computer, mobile phone, personal digital assistant, games machine, digital TV, etc. Source: International Telecommunication Union (ITU) World Telecommunication/ICT Indicators Database. Aggregation method: weighted average. Please cite the International Telecommunication Union for third-party use of these data.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains 1,000 financial records with five key features and one target variable, Loan Default Risk. It is designed for credit risk analysis, helping to predict whether a customer is likely to default on a loan based on financial attributes.
Income: The individual's annual income.
Credit Score: A credit rating score ranging from 300 to 850, where higher values indicate better creditworthiness.
Spending Score: A normalized score between 0 and 100, representing the individual's spending habits.
Transaction Count: The number of transactions made by the individual in a given period.
Savings Ratio: The ratio of savings to income, ranging from 0 to 1.
Loan Default Risk (Target): 0 = low risk (likely to repay the loan); 1 = high risk (likely to default on the loan).
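The documented value ranges lend themselves to a quick sanity check before modelling. The sketch below is one way to do that in plain Python. It assumes the CSV headers match the feature names listed above exactly, which should be verified against the file, and the two sample records are made up.

```python
# Ranges taken from the feature descriptions; the keys are assumed header names.
RANGES = {
    "Credit Score": (300, 850),
    "Spending Score": (0, 100),
    "Savings Ratio": (0, 1),
}


def validate_record(record):
    """Return a list of human-readable problems found in one record."""
    problems = []
    for column, (low, high) in RANGES.items():
        value = record[column]
        if not low <= value <= high:
            problems.append(f"{column}={value} outside [{low}, {high}]")
    if record["Loan Default Risk"] not in (0, 1):
        problems.append("Loan Default Risk must be 0 or 1")
    return problems


# Made-up records for illustration only.
good = {"Income": 52000, "Credit Score": 710, "Spending Score": 40,
        "Transaction Count": 25, "Savings Ratio": 0.2, "Loan Default Risk": 0}
bad = {"Income": 52000, "Credit Score": 900, "Spending Score": 40,
       "Transaction Count": 25, "Savings Ratio": 0.2, "Loan Default Risk": 2}

good_problems = validate_record(good)
bad_problems = validate_record(bad)
```

Income and Transaction Count are not checked because the description gives no bounds for them.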
Feel free to use this dataset for research, projects, or educational purposes. If you use it in a publication, kindly provide attribution.
This dataset was synthetically generated. The features were adjusted to resemble real-world financial data, but they do not represent actual individuals or real financial records.
List of the data tables as part of the Immigration System Statistics Home Office release. Summary and detailed data tables covering the immigration system, including out-of-country and in-country visas, asylum, detention, and returns.
If you have any feedback, please email MigrationStatsEnquiries@homeoffice.gov.uk.
The Microsoft Excel .xlsx files may not be suitable for users of assistive technology.
If you use assistive technology (such as a screen reader) and need a version of these documents in a more accessible format, please email MigrationStatsEnquiries@homeoffice.gov.uk
Please tell us what format you need. It will help us if you say what assistive technology you use.
Immigration system statistics, year ending March 2025
Immigration system statistics quarterly release
Immigration system statistics user guide
Publishing detailed data tables in migration statistics
Policy and legislative changes affecting migration to the UK: timeline
Immigration statistics data archives
Passenger arrivals summary tables, year ending March 2025 (MS Excel Spreadsheet, 66.5 KB): https://assets.publishing.service.gov.uk/media/68258d71aa3556876875ec80/passenger-arrivals-summary-mar-2025-tables.xlsx
‘Passengers refused entry at the border summary tables’ and ‘Passengers refused entry at the border detailed datasets’ have been discontinued. The latest published versions of these tables are from February 2025 and are available in the ‘Passenger refusals – release discontinued’ section. A similar data series, ‘Refused entry at port and subsequently departed’, is available within the Returns detailed and summary tables.
Electronic travel authorisation detailed datasets, year ending March 2025 (MS Excel Spreadsheet, 56.7 KB): https://assets.publishing.service.gov.uk/media/681e406753add7d476d8187f/electronic-travel-authorisation-datasets-mar-2025.xlsx
ETA_D01: Applications for electronic travel authorisations, by nationality
ETA_D02: Outcomes of applications for electronic travel authorisations, by nationality
Entry clearance visas summary tables, year ending March 2025 (MS Excel Spreadsheet, 113 KB): https://assets.publishing.service.gov.uk/media/68247953b296b83ad5262ed7/visas-summary-mar-2025-tables.xlsx
Entry clearance visa applications and outcomes detailed datasets, year ending March 2025 (MS Excel Spreadsheet, 29.1 MB): https://assets.publishing.service.gov.uk/media/682c4241010c5c28d1c7e820/entry-clearance-visa-outcomes-datasets-mar-2025.xlsx
Vis_D01: Entry clearance visa applications, by nationality and visa type
Vis_D02: Outcomes of entry clearance visa applications, by nationality, visa type, and outcome
Additional dat
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
People who have been granted permanent resident status in Canada. Please note that in these datasets, the figures have been suppressed or rounded to prevent the identification of individuals when the datasets are compiled and compared with other publicly available statistics. Values between 0 and 5 are shown as “--” and all other values are rounded to the nearest multiple of 5. As a result, the sum of the figures may not equal the totals indicated.
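To see why the published figures may not add up to the published totals, the suppression and rounding rule can be sketched as follows. The description does not say whether the bounds 0 and 5 are inclusive. This sketch assumes they are exclusive, so 0 stays 0 and 5 stays 5, and the cell values are toy numbers.

```python
SUPPRESSED = "--"


def publish_value(count):
    """Apply the suppression and rounding rule to one non-negative integer count."""
    if 0 < count < 5:  # assumption: bounds are exclusive
        return SUPPRESSED
    # Integer arithmetic that rounds to the nearest multiple of 5.
    return (count + 2) // 5 * 5


cells = [7, 7, 7]                                     # toy underlying counts
published_cells = [publish_value(c) for c in cells]   # [5, 5, 5]
published_total = publish_value(sum(cells))           # 21 is published as 20
```

Here the published cells sum to 15 while the published total is 20, which is the kind of discrepancy the note above warns about.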
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents a detailed breakdown of the count of individuals within distinct income brackets, categorized by gender (men and women) and employment type, full-time (FT) and part-time (PT), within Credit River township. It can be used to gain insights into gender-based income distribution within the Credit River township population, aiding in data analysis and decision-making.
Key observations
Figure: Credit River Township, Minnesota gender and employment-based income distribution analysis (Ages 15+): https://i.neilsberg.com/ch/credit-river-township-mn-income-distribution-by-gender-and-employment-type.jpeg
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
Income brackets:
Variables / Data Columns
Employment type classifications include:
This dataset is a part of the main dataset for Credit River township median household income by gender. You can refer the same here
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Recommended citation
Gütschow, J.; Busch, D.; Pflüger, M. (2024): The PRIMAP-hist national historical emissions time series v2.6.1 (1750-2023). zenodo. doi:10.5281/zenodo.15016289.
Gütschow, J.; Jeffery, L.; Gieseke, R.; Gebel, R.; Stevens, D.; Krapp, M.; Rocha, M. (2016): The PRIMAP-hist national historical emissions time series, Earth Syst. Sci. Data, 8, 571-603, doi:10.5194/essd-8-571-2016
Content
Use of the dataset and full description
Abstract
Support
Sources
Files included in the dataset
Notes
Data format description (columns)
References
Changelog
Abstract
The PRIMAP-hist dataset combines several published datasets to create a comprehensive set of greenhouse gas emission pathways for every country and Kyoto gas, covering the years 1750 to 2023, and almost all UNFCCC (United Nations Framework Convention on Climate Change) member states as well as most non-UNFCCC territories. The data resolves the main IPCC (Intergovernmental Panel on Climate Change) 2006 categories. For CO2, CH4, and N2O, subsector data for Energy, Industrial Processes and Product Use (IPPU), and Agriculture are available. The "country reported data priority" (CR) scenario of the PRIMAP-hist dataset prioritizes data that individual countries report to the UNFCCC.
For developed countries, Annex I in terms of the UNFCCC, this is the data submitted annually in the "National Inventory Submissions". Until 2023, data was submitted in the "Common Reporting Format" (CRF). Since 2024, the new "Common Reporting Tables" (CRT) are used. For developing countries, non-Annex I in terms of the UNFCCC, we use the "Biennial Transparency Reports" (BTR), which mostly come with data also using the "Common Reporting Tables". We also use older data available through the UNFCCC DI portal (di.unfccc.int) and additional country submissions from "Biennial Update Reports" (BUR), "National Communications" (NC), and "National Inventory Reports" (NIR) read from pdf and, where available, xls(x) or csv files. For a list of these submissions please see below. For South Korea, the 2023 official GHG inventory has not yet been submitted to the UNFCCC but is included in PRIMAP-hist. PRIMAP-hist also includes official data for Taiwan, which is not recognized as a party to the UNFCCC. We have mostly replaced the official data that has not been submitted to the UNFCCC used in v2.6, as countries have now submitted their data in CRT format, but had to make some exceptions as the CRT data was not usable for all countries.
Gaps in the country reported data are filled using third party data such as CDIAC, EI (fossil CO2), Andrew cement emissions data (cement), FAOSTAT (agriculture), and EDGAR 2024 (all sectors for CO2, CH4, N2O, HFCs, PFCs, SF6, NF3, except energy CO2). Lower priority data are harmonized to higher priority data in the gap-filling process.
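The idea of filling gaps in a higher priority series with harmonized lower priority data can be sketched in a few lines. This is a deliberately simplified illustration and not the PRIMAP-hist implementation, which is described in the methodology article. The single scaling factor (mean ratio over overlapping years), the function name and all numbers are assumptions made for this example.

```python
def harmonize_and_fill(primary, secondary):
    """Fill missing years in `primary` with `secondary` scaled to match it.

    Both arguments map year -> emissions; missing years are simply absent.
    The scaling factor is the mean primary/secondary ratio over the years
    both series cover (a simplification chosen for this sketch).
    """
    overlap = [y for y in primary if y in secondary and secondary[y] != 0]
    if not overlap:
        raise ValueError("series need at least one overlapping year")
    factor = sum(primary[y] / secondary[y] for y in overlap) / len(overlap)
    filled = dict(primary)  # higher priority values are never overwritten
    for year, value in secondary.items():
        if year not in filled:
            filled[year] = value * factor
    return dict(sorted(filled.items()))


country_reported = {2001: 110.0, 2002: 121.0}        # toy higher priority series
third_party = {2000: 50.0, 2001: 55.0, 2002: 60.5}   # toy lower priority series
combined = harmonize_and_fill(country_reported, third_party)
```

In this toy case the third party series is scaled by a factor of 2 before it fills the missing year 2000, so the filled value joins the country reported data without a jump.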
For the third party priority time series, gaps in the third party data are filled from country reported data sources.
Data for earlier years which are not available in the above mentioned sources are sourced from EDGAR-HYDE, CEDS, and RCP (N2O only) historical emissions.
The v2.4 release of PRIMAP-hist reduced the time-lag from 2 to 1 years for the October release. Thus the present version 2.6.1 includes data for 2023. For energy CO2 growth rates from the EI Statistical Review of World Energy are used to extend the country reported (CR) or CDIAC (TP) data to 2023. For CO2 from cement production Andrew cement data are used. For other gases and sectors we use EDGAR 2024 data. In a few cases we have to rely on numerical methods to estimate emissions for 2023.
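Extending a series by one year with the growth rate of another source amounts to simple arithmetic, sketched here. The function name and the numbers are hypothetical, and the actual extension in PRIMAP-hist may involve further processing that is not shown.

```python
def extend_with_growth(last_value, proxy_previous, proxy_latest):
    """Extend a series by one year using the growth rate of a proxy series."""
    if proxy_previous == 0:
        raise ValueError("proxy value for the previous year must be non-zero")
    growth = proxy_latest / proxy_previous
    return last_value * growth


# Toy numbers: the reported series ends at 100.0 in 2022 and the proxy
# series grows from 200.0 to 210.0 (5 %) between 2022 and 2023.
estimate_2023 = extend_with_growth(
    last_value=100.0, proxy_previous=200.0, proxy_latest=210.0
)
```

Only the relative change of the proxy series is used, so differences in absolute level between the two sources do not carry over into the extended value.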
Version 2.6.1 of the PRIMAP-hist dataset does not include emissions from Land Use, Land-Use Change, and Forestry (LULUCF) in the main file. LULUCF data are included in the file with increased number of significant digits and have to be used with care as they are constructed from different sources using different methodologies and are not harmonized.
The PRIMAP-hist v2.6.1 dataset is an updated version of
Gütschow, J.; Pflüger, M.; Busch, D. (2024): The PRIMAP-hist national historical emissions time series v2.6 (1750-2023). zenodo. doi:10.5281/zenodo.13752654.
The Changelog indicates the most important changes. You can also check the issue tracker on github.com/JGuetschow/PRIMAP-hist for additional information on issues found after the release of the dataset. Detailed per country information is available from the detailed changelog which is available on the primap.org website and on zenodo.
Use of the dataset and full description
Before using the dataset, please read this document and the article describing the methodology, especially the section on uncertainties and the section on limitations of the method and use of the dataset.
Gütschow, J.; Jeffery, L.; Gieseke, R.; Gebel, R.; Stevens, D.; Krapp, M.; Rocha, M. (2016): The PRIMAP-hist national historical emissions time series, Earth Syst. Sci. Data, 8, 571-603, doi:10.5194/essd-8-571-2016
Please notify us (johannes.guetschow@climate-resource.com) if you use the dataset so that we can keep track of how it is used and take that into consideration when updating and improving the dataset.
When using this dataset or one of its updates, please cite the DOI of the precise version of the dataset used and also the data description article which this dataset is supplement to (see above). Please consider also citing the relevant original sources when using the PRIMAP-hist dataset. See the full citations in the References section further below.
Since version 2.3 we use the data formats developed for the PRIMAP2 climate policy analysis suite: PRIMAP2 on GitHub. The data are published both in the interchange format which consists of a csv file with the data and a yaml file with additional metadata and the native NetCDF based format. For a detailed description of the data format we refer to the PRIMAP2 documentation.
We have also included files with more than three significant digits. These files are mainly aimed at people doing policy analysis using the country reported data scenario (HISTCR). Using the high precision data they can avoid questions on discrepancies with the reported data. The uncertainties of emissions data do not justify the additional significant digits and they might give a false sense of accuracy, so please use this version of the dataset with extra care.
Support
If you encounter possible errors or other things that should be noted, please check our issue tracker at github.com/JGuetschow/PRIMAP-hist and report your findings there. Please use the tag "v2.6.1" in any issue you create regarding this dataset.
If you need support in using the dataset or have any other questions regarding the dataset, please contact johannes.guetschow@climate-resource.com.
Climate Resource makes this data available under the CC BY 4.0 licence. Free support is limited to simple questions and non-commercial users. We also provide additional data and data support services to clients wanting more frequent updates, additional metadata or to integrate these datasets into their workflows. Get in touch at contact@climate-resource.com if you are interested.
Sources
Global CO2 emissions from cement production v250226 data, paper: Andrew (2025), Andrew (2019)
EI Statistical Review of World Energy website: Energy Institute (2024)
CDIAC data: Hefner and Marland (2023), data: Hefner (2024), paper: Gilfillan and Marland (2021)
CEDS: data: Hoesly et al. (2020), paper: Hoesly et al. (2018)
EDGAR 2024: data/website: European Commission, European Commision, JRC (2024), report: European Commission. Joint Research Centre & IEA. (2024)
EDGAR-HYDE 1.4 data: Van Aardenne et al. (2001), Olivier and Berdowski (2001)
FAOSTAT database data: Food and Agriculture Organization of the United Nations (2024)
RCP historical data data, paper: Meinshausen et al. (2011)
UNFCCC National Communications and National Inventory Reports for developing countries available from the UNFCCC DI portal website, data: UNFCCC (2024e), Pflüger and Gütschow (2024), github
UNFCCC Biennial Update Reports, National Communications, and National Inventory Reports for developing countries website-BURs, website-NCs, data: UNFCCC (2024d), UNFCCC (2024b).
Notes:
Not all BUR and NC submissions are included as reading the data is time consuming and not all submissions contain sufficient data to be used in PRIMAP-hist.
Not all submissions included in PRIMAP-hist are available in the github repository as we do not (yet) have code that we can publish for all submissions.
No submissions have been added for PRIMAP-hist v2.6.1
UNFCCC First Biennial Transparency Reports website, data: UNFCCC (2025)
Notes:
For a list of added submissions see section "Data source updates (v2.6.1)" in the changelog in the pdf data description.
UNFCCC Common Reporting Format (CRF) website, paper, data (24-01-08): UNFCCC (2024c) (processed as described in Jeffery et al. (2018))
Official country repositories (non-UNFCCC)
Belarus: Greenhouse gas statistics (1990-2022) website: National Statistical Committee of the Republic of Belarus (2024)
EU, Iceland, Norway, Switzerland: National emissions reported to the UNFCCC and to the EU Greenhouse Gas Monitoring Mechanism, April 2024 website: European Environment Agency (2024)
South Korea: 2023 Inventory website, data: Republic of Korea (2023)
Taiwan / Republic of China: 2023 Inventory website, data: Republic of China - Environmental Protection Administration (2023)
For the pre-1990 LULUCF time-series we use the following additional data sources:
Houghton land use CO2 website: Houghton (2008)
HYDE land cover data website: Klein Goldewijk et al. (2010), Klein Goldewijk et al. (2011)
SAGE Global Potential Vegetation Dataset website: Ramankutty and Foley (1999)
FAO Country Boundaries website: Food and Agriculture Organization of the United Nations (2015)
Files included in the dataset
For each dataset we have three files:
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
NLUCat is a dataset of NLU in Catalan. It consists of nearly 12,000 instructions annotated with the most relevant intents and spans. Each instruction is accompanied, in addition, by the instructions received by the annotator who wrote it.
The intents taken into account are the usual ones for a virtual home assistant (activity calendar, IoT, list management, leisure, etc.), but specific ones have also been added to cover the social and healthcare needs of vulnerable people (information on administrative procedures, menu and medication reminders, etc.).
The spans have been annotated with a tag describing the type of information they contain. They are fine-grained, but can be easily grouped to use them in robust systems.
The examples are not only written in Catalan, but they also take into account the geographical and cultural reality of the speakers of this language (geographic points, cultural references, etc.)
This dataset can be used to train models for intent classification, span identification and example generation.
This is the complete version of the dataset. A version prepared to train and evaluate intent classifiers has been published in HuggingFace.
In this repository you'll find the following items:
This dataset can be used for any purpose, whether academic or commercial, under the terms of the CC BY 4.0. Give appropriate credit, provide a link to the license, and indicate if changes were made.
Intent classification, spans identification and examples generation.
The dataset is in Catalan (ca-ES).
Three JSON files, one for each split.
Example
An example looks as follows:
{
"example": "Demana una ambulància; la meva dona està de part.",
"annotation": {
"intent": "call_emergency",
"slots": [
{
"Tag": "service",
"Text": "ambulància",
"Start_char": 11,
"End_char": 21
},
{
"Tag": "situation",
"Text": "la meva dona està de part",
"Start_char": 23,
"End_char": 48
}
]
}
},
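The character offsets in the example above behave like Python slice indices (start inclusive, end exclusive). A minimal sketch that checks this against the record shown; the helper name `check_slots` is hypothetical and not part of the dataset's tooling:

```python
import json

# The record shown above, copied without the trailing comma.
record = json.loads("""
{
  "example": "Demana una ambulància; la meva dona està de part.",
  "annotation": {
    "intent": "call_emergency",
    "slots": [
      {"Tag": "service", "Text": "ambulància",
       "Start_char": 11, "End_char": 21},
      {"Tag": "situation", "Text": "la meva dona està de part",
       "Start_char": 23, "End_char": 48}
    ]
  }
}
""")


def check_slots(rec):
    """Return (tag, ok) pairs; ok is True when the offsets reproduce the slot text."""
    text = rec["example"]
    results = []
    for slot in rec["annotation"]["slots"]:
        # Slice with Start_char inclusive and End_char exclusive.
        span = text[slot["Start_char"]:slot["End_char"]]
        results.append((slot["Tag"], span == slot["Text"]))
    return results


print(check_slots(record))
```

The same check can be run over a whole split to catch offset drift after any text normalization.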
We created this dataset to contribute to the development of language models in Catalan, a low-resource language.
When creating this dataset, we took into account not only the language but the entire socio-cultural reality of the Catalan-speaking population. Special consideration was also given to the needs of the vulnerable population.
Initial Data Collection and Normalization
We commissioned a company to create fictitious examples for the creation of this dataset.
Who are the source language producers?
We commissioned the writing of the examples to the company m47 labs.
Annotation process
This dataset was built in three steps, modeled on the process followed by the NLU-Evaluation-Data dataset, as explained in the paper.
* First step: translation or elaboration of the instructions given to the annotators to write the examples.
* Second step: writing the examples. This step also includes the grammatical correction and normalization of the texts.
* Third step: annotating the intents and the slots of each example. In this step, some modifications were made to the annotation guides to adjust them to the real situations.
Who are the annotators?
The drafting of the examples and their annotation was entrusted to the company m47 labs through a public tender process.
No personal or sensitive information included.
The examples used for the preparation of this dataset are fictitious and, therefore, the information shown is not real.
We hope that this dataset will help the development of virtual assistants in Catalan, a language that is often not taken into account, and that it will especially help to improve the quality of life of people with special needs.
When writing the examples, the annotators were asked to take into account the socio-cultural reality (geographic points, artists and cultural references, etc.) of the Catalan-speaking population.
Likewise, they were asked to be careful to avoid examples that reinforce the stereotypes that exist in this society. For example: be careful with the gender or origin of personal names that are associated with certain activities.
[N/A]
Language Technologies Unit at the Barcelona Supercomputing Center (langtech@bsc.es)
This work has been promoted and financed by the Generalitat de Catalunya through the Aina project.
This dataset can be used for any purpose, whether academic or commercial, under the terms of the CC BY 4.0.
Give appropriate credit, provide a link to the license, and indicate if changes were made.
Data for a Kaggle competition
Banks play a crucial role in market economies. They decide who can get finance and on what terms and can make or break investment decisions. For markets and society to function, individuals and companies need access to credit.
Credit scoring algorithms, which make a guess at the probability of default, are the method banks use to determine whether or not a loan should be granted. This competition requires participants to improve on the state of the art in credit scoring, by predicting the probability that somebody will experience financial distress in the next two years.
The goal of this competition is to build a model that borrowers can use to help make the best financial decisions.
Historical data are provided on 250,000 borrowers and the prize pool is $5,000 ($3,000 for first, $1,500 for second and $500 for third).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Costa Rica CR: Proportion of People Living Below 50 Percent Of Median Income: % data was reported at 18.700 % in 2022. This records a decrease from the previous number of 19.600 % for 2021. The data is updated yearly, averaging 20.000 % (median) from Dec 1981 to 2022, with 36 observations. The data reached an all-time high of 25.000 % in 1981 and a record low of 18.300 % in 2010. The data remains in active status in CEIC and is reported by World Bank. The data is categorized under Global Database’s Costa Rica – Table CR.World Bank.WDI: Social: Poverty and Inequality.
The percentage of people in the population who live in households whose per capita income or consumption is below half of the median income or consumption per capita. The median is measured at 2017 Purchasing Power Parity (PPP) using the Poverty and Inequality Platform (http://www.pip.worldbank.org). For some countries, medians are not reported due to grouped and/or confidential data. The reference year is the year in which the underlying household survey data was collected. In cases for which the data collection period bridged two calendar years, the first year in which data were collected is reported.
Source: World Bank, Poverty and Inequality Platform. Data are based on primary household survey data obtained from government statistical agencies and World Bank country departments. Data for high-income economies are mostly from the Luxembourg Income Study database. For more information and methodology, please see http://pip.worldbank.org.
The World Bank’s internationally comparable poverty monitoring database now draws on income or detailed consumption data from more than 2000 household surveys across 169 countries. See the Poverty and Inequality Platform (PIP) for details (www.pip.worldbank.org).
These family food datasets contain more detailed information than the ‘Family Food’ report and mainly provide statistics from 2001 onwards. The UK household purchases and the UK household expenditure spreadsheets include statistics from 1974 onwards. These spreadsheets are updated annually when a new edition of the ‘Family Food’ report is published.
The ‘purchases’ spreadsheets give the average quantity of food and drink purchased per person per week for each food and drink category. The ‘nutrient intake’ spreadsheets give the average nutrient intake (eg energy, carbohydrates, protein, fat, fibre, minerals and vitamins) from food and drink per person per day. The ‘expenditure’ spreadsheets give the average amount spent in pence per person per week on each type of food and drink. Several different breakdowns are provided in addition to the UK averages including figures by region, income, household composition and characteristics of the household reference person.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Popular Website Traffic Over Time’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/yamqwe/popular-website-traffice on 13 February 2022.
--- Dataset description provided by original source is as follows ---
Background
Have you ever been in a conversation where the question comes up: who uses Bing? The question comes up occasionally because people wonder whether these sites get any views. For this research study, we explore traffic for many popular websites.
Methodology
The data collected originates from SimilarWeb.com.
Source
For the analysis and study, go to The Concept Center
This dataset was created by Chase Willden and contains around 0 samples along with 1/1/2017, Social Media, technical information and other features such as: - 12/1/2016 - 3/1/2017 - and more.
- Analyze 11/1/2016 in relation to 2/1/2017
- Study the influence of 4/1/2017 on 1/1/2017
- More datasets
If you use this dataset in your research, please credit Chase Willden
--- Original source retains full ownership of the source dataset ---
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Some studies have found that constituents do not evaluate legislators more favorably for claiming credit for delivering large grants than for claiming credit for delivering tiny ones. It remains unclear, however, whether the lack of sensitivity to the amount of money claimed reflects innumeracy or the difficulty that many people have understanding the size of a government expenditure in the abstract. We perform a survey experiment in which we give respondents information about both the absolute and relative size of projects. We find that subjects evaluate legislators significantly more favorably for claiming credit for relatively large projects. Our results suggest that subjects are responsive to the magnitudes in claims of accomplishment, but only when provided a benchmark. We also find evidence of an asymmetric effect; subjects are more inclined to punish legislators for delivering grants of below average size than to reward them for delivering grants of above average size
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
We propose the Safe Human dataset, referred to as the SH17 dataset, which covers 17 different object classes. We scraped images from the Pexels website, which offers clear usage rights for all its images, showcasing a range of human activities across diverse industrial operations.
To extract relevant images, we used multiple queries such as manufacturing worker, industrial worker, human worker, labor, etc. The tags associated with Pexels images proved reasonably accurate. After removing duplicate samples, we obtained a dataset of 8,099 images. The dataset exhibits significant diversity, representing manufacturing environments globally, thus minimizing potential regional or racial biases. Samples of the dataset are shown below.
Key features
Collected from diverse industrial environments globally
High quality images (max resolution 8192x5462, min 1920x1002)
Average of 9.38 instances per image
Includes small objects like ears and earmuffs (39,764 annotations < 1% image area, 59,025 annotations < 5% area)
Classes
Person
Head
Face
Glasses
Face-mask-medical
Face-guard
Ear
Earmuffs
Hands
Gloves
Foot
Shoes
Safety-vest
Tools
Helmet
Medical-suit
Safety-suit
The data consists of three folders and two file lists:
images contains all images
labels contains labels in YOLO format for all images
voc_labels contains labels in VOC format for all images
train_files.txt contains list of all images we used for training
val_files.txt contains list of all images we used for validation
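As a rough illustration of how the YOLO-format labels can be read: each line of a YOLO label file holds a class id followed by a box center, width and height, all normalized to the image size. The sketch below converts one such line to pixel coordinates and computes the fraction of the image the box covers, the quantity behind the small-object counts above. Treating the position in the class list as the class id is an assumption, the function name `parse_yolo_line` is hypothetical, and the sample line is made up.

```python
# Class names in the order listed above. Using list position as the YOLO
# class id is an assumption; check the dataset's own config before relying on it.
CLASSES = [
    "Person", "Head", "Face", "Glasses", "Face-mask-medical", "Face-guard",
    "Ear", "Earmuffs", "Hands", "Gloves", "Foot", "Shoes", "Safety-vest",
    "Tools", "Helmet", "Medical-suit", "Safety-suit",
]


def parse_yolo_line(line, img_w, img_h):
    """Convert one 'class xc yc w h' line (normalized values) to a pixel box."""
    parts = line.split()
    class_id = int(parts[0])
    xc, yc, w, h = (float(v) for v in parts[1:5])
    box = (
        (xc - w / 2) * img_w,  # x_min
        (yc - h / 2) * img_h,  # y_min
        (xc + w / 2) * img_w,  # x_max
        (yc + h / 2) * img_h,  # y_max
    )
    # Normalized width times height is the share of the image area covered.
    return {"class_id": class_id, "box": box, "area_fraction": w * h}


# Made-up label line for a hypothetical 1920x1080 image.
ann = parse_yolo_line("7 0.5 0.25 0.05 0.1", 1920, 1080)
is_small = ann["area_fraction"] < 0.01  # "small object" at the 1% threshold
print(CLASSES[ann["class_id"]], ann["box"], is_small)
```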
Disclaimer and Responsible Use:
This dataset, scraped from the Pexels website, is intended for educational, research, and analysis purposes only. You may use the data only for training machine learning models. Users are urged to use this data responsibly, ethically, and within the bounds of legal stipulations.
Users should adhere to the Pexels copyright notice when using this dataset.
Legal Simplicity: All photos and videos on Pexels can be downloaded and used for free.
Allowed 👌
All photos and videos on Pexels are free to use.
Attribution is not required. Giving credit to the photographer or Pexels is not necessary but always appreciated.
You can modify the photos and videos from Pexels. Be creative and edit them as you like.
Not allowed 👎
Identifiable people may not appear in a bad light or in a way that is offensive.
Don't sell unaltered copies of a photo or video, e.g. as a poster, print or on a physical product without modifying it first.
Don't imply endorsement of your product by people or brands on the imagery.
Don't redistribute or sell the photos and videos on other stock photo or wallpaper platforms.
Don't use the photos or videos as part of your trade-mark, design-mark, trade-name, business name or service mark.
No Warranty Disclaimer:
The dataset is provided "as is," without warranty, and the creator disclaims any legal liability for its use by others.
Ethical Use:
Users are encouraged to consider the ethical implications of their analyses and the potential impact on the broader community.
GitHub Page:
The Fish predator/prey database (stomach database) contains data covering 11 countries. A total of 217 399 stomachs are reported in the North Sea data. Eight predator species were analysed and 854 NODC prey codes have been reported AccConstrDescription=This license lets others distribute, remix, tweak, and build upon your work, even commercially, as long as they credit you for the original creation. This is the most accommodating of licenses offered. Recommended for maximum dissemination and use of licensed materials. AccConstrDisplay=This dataset is licensed under a Creative Commons Attribution 4.0 International License. AccConstrEN=Attribution (CC BY) AccessConstraint=Attribution (CC BY) AccessConstraints=ICES Data Policy: https://www.ices.dk/data/guidelines-and-policy/Pages/ICES-data-policy.aspx Acronym=STOMACH BrackishFlag=0 CDate=2009-10-19 cdm_data_type=Other CheckedFlag=0 Citation=ICES Fish stomach contents dataset (STOMACH). The International Council for the Exploration of the Sea, Copenhagen. 2010. Online source: http://ecosystemdata.ices.dk Comments=None ContactEmail=None Conventions=COARDS, CF-1.6, ACDD-1.3 CurrencyDate=None DasID=2144 DasOrigin=Data collection DasType=Data DasTypeID=1 DescrCompFlag=1 DescrTransFlag=0 Easternmost_Easting=12.5 EmbargoDate=None EngAbstract=The Fish predator/prey database (stomach database) contains data covering 11 countries. A total of 217 399 stomachs are reported in the North Sea data. Eight predator species were analysed and 854 NODC prey codes have been reported EngDescr=Year of the Stomach content data The first year of the stomach ran in 1981 and covered a handful of species in the North Sea. Follow up data collection was performed in 1985-1986, and a decade later in 1991 the 2nd year of the stomach ran. The data were collected as part of an ICES initiative and the results and analysis were presented in two cooperative reports (CRR 164 and CRR 219).
Geographical Coverage There are 2 datasets related to the year of the stomach, North Sea and Baltic respectively. The map to the right shows the distribution of the North Sea records. Currently, only the North Sea dataset is available online.
About this dataset While ICES has made every effort to ensure that the dataset presented here is the most complete set of information, it should be noted that there are a number of versions of this dataset held by institutions and individuals which may differ from this set. FreshFlag=0 geospatial_lat_max=63.25 geospatial_lat_min=51.25 geospatial_lat_units=degrees_north geospatial_lon_max=12.5 geospatial_lon_min=-5.5 geospatial_lon_units=degrees_east infoUrl=None InputNotes=None institution=ICES License=https://creativecommons.org/licenses/by/4.0/ Lineage=None MarineFlag=1 Northernmost_Northing=63.25 OrigAbstract=None OrigDescr=None OrigDescrLang=English OrigDescrLangNL=Engels OrigLangCode=en OrigTitle=None OrigTitleLang=None OrigTitleLangCode=None OrigTitleLangNL=None Progress=Completed PublicFlag=1 ReleaseDate=None ReleaseDate0=None RevisionDate=None SizeReference=1,149,608 Records sourceUrl=(local files) Southernmost_Northing=51.25 standard_name_vocabulary=CF Standard Name Table v70 StandardTitle=ICES Fish stomach contents dataset subsetVariables=ScientificName,BasisOfRecord,YearCollected,MonthCollected,DayCollected,aphia_id TerrestrialFlag=0 time_coverage_start=1980-01-01T01:00:00Z UDate=2022-06-16 VersionDate=None VersionDay=None VersionMonth=None VersionName=None VersionYear=None VlizCoreFlag=1 Westernmost_Easting=-5.5
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Authors: Baldwin M. Way (Principal Investigator), Christopher R. Browning, Dylan D. Wagner, Jodi L. Ford, Bethany Boettner, Ping Bai.
Study Overview These data were collected as part of a longitudinal study of adolescent health and well-being collected in Columbus, Ohio (Adolescent Health and Development in Context Study). The larger goals of the project (R01DA042080) were to understand how geospatial exposures predicted substance use. More specifically, Specific Aims 1a & 1b were to longitudinally and cross-sectionally determine how neural function and structure is reshaped by EtV in the community (1a) and substance use (1b). Specific Aim 2 was to use baseline as well as longitudinal neural changes to predict subsequent substance use and identify neural mediators. Specific Aim 3 was to identify risk and resilience factors that alter the effects of community EtV on the neural embedding of EtV as well as the neural prediction of substance use outcomes. The participants in this longitudinal neuroimaging study were recruited from the Adolescent Health and Development in Context study (Boettner, B., Browning, C. R., & Calder, C. A. (2019). Feasibility and validity of geographically explicit ecological momentary assessment with recall‐aided space‐time budgets. Journal of Research on Adolescence, 29(3), 627-645.).
Inquiries about this dataset should be directed to: way.37@osu.edu This dataset is licensed under the Creative Commons Zero (CC0) v1.0 License.
Study Area The study area is a contiguous space within the Interstate 270 loop outerbelt freeway, encompassing a majority of the city of Columbus as well as several suburban municipalities.
Sampling The sampling frame was based on a combination of a vendor-provided list of households in the study area with high probability of meeting eligibility criteria and directory data from public school districts represented in the study area. Households were mailed a letter or postcard describing the study, followed by interviewer calls to the household to solicit participation in the study. Once eligibility was confirmed with the household, one randomly selected youth aged 11-17 and one primary caregiver (English speaking) were recruited to participate in the study.
The first wave of the AHDC study comprised 1,405 youth, of whom 47% identified as white, 38% as Black, 5% as Hispanic/Latino, 8% as multiracial, and 2% as Asian. The sample closely approximates the population in the study area with respect to household income of families with children and youth racial/ethnic composition, with the exception that the AHDC sample has a somewhat higher percentage of youth who identified as Black compared with the 2009-2013 American Community Survey (ACS) estimates of the area.
For Wave 3 (the first neuroimaging wave), a refresher sample was recruited in addition to the participants from the original sample. These participants were recruited from within the families of the original sample (i.e. siblings), through the same methods used to recruit the initial wave, applied to low-income census tracts, and by tabling at schools in these tracts.
Study Design The study employs a prospective cohort design in which the data on youth and caregivers were collected at multiple time points. The Wave 1 field period began in spring 2014 and was completed in summer 2016. Wave 2 was conducted between January and December 2016. Wave 3 (the first imaging wave) was conducted between July of 2018 and March of 2020, concluding with the cessation of in-person activities due to COVID-19 related restrictions. Wave 4 was run between July and October of 2020. Wave 5 was run between March and July of 2021. Wave 6 (the second imaging wave) was run between May of 2022 and January of 2024.
Study Procedures Within each wave, participant data were collected over a weeklong period. An Entrance Survey with both a focal youth and his or her caregiver was followed by a seven-day smartphone-based Global Positioning System (GPS) tracking and Ecological Momentary Assessment (EMA) data collection period (EMA Week), and either a final Exit Survey at the end of the week (Waves 1 and 2) or a session at the Center for Cognitive and Behavioral Brain Imaging at the Ohio State University (Waves 3 and 6). Waves 4 and 5 were slightly different due to restrictions on in-person activity. Wave 4 consisted of a phone interview and an online survey completed remotely, with participants downloading an app on their phone for EMA responses and GPS tracking. Wave 5 consisted only of an online survey and EMA responses with GPS tracking.
The Entrance Survey was collected at the initial in-home visit with adolescent participants and their caregivers. It included a wide range of measures across social, economic, psychological, health, and behavioral domains. Both adolescent and caregiver participants reported on geographic location of and experiences at routine activities (e.g. school, work, church, stores, relative’s house).
The real-time Ecological Momentary Assessment (EMA) surveys were collected via self-administered survey on project-provided smartphones. The study phones also passively collected GPS spatial coordinates during the seven-day EMA collection period. Youth respondents were prompted up to five times a day, and asked to report on their location, network partner presence, risk behaviors such as substance use, mood, surrounding social climate, and sleep patterns.
Waves 1 and 2: A second visit, the Exit Survey, gathered follow-up information about the EMA week. The youth completed an interactive Space-Time Budget with the interviewer to collect detailed activity data on five days – the three most recent weekdays and two weekend days. The processed GPS data results in summarized stationary and travel periods during those five days, along with activity types and network partner presence. Concurrently, caregivers completed a self-administered survey about perceptions of social climate and safety in their neighborhood and at other routine locations.
Waves 3 and 6: The second visit, at the conclusion of the week of GPS tracking and EMA sampling, was conducted at the Ohio State Center for Cognitive and Behavioral Brain Imaging. Participants completed an initial battery of questionnaires before scanning and had the option of providing a hair sample for cortisol or substance use measurement and a blood sample for measurement of immune-related markers. Participants also completed questionnaires after the scan.
Participants 309 youths participated in the initial home interviews in Wave 3. 290 of these youths came to the imaging center and 271 adolescents were successfully scanned. Of these 271, 158 were in Wave 1 of the AHDC study, while 113 were part of the refresher sample and were thus new to Wave 3.
For Wave 6, 144 individuals came to the imaging center and 120 were successfully scanned. Of these, 110 were also scanned at Wave 3, while 10 were scanned for the first time.
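The participant counts above can be cross-checked with a few lines of arithmetic. This is only a bookkeeping sketch; the dictionary keys and the `pct` helper are hypothetical names, and the numbers are the ones reported in the two paragraphs above.

```python
# Counts as reported above for the two imaging waves.
wave3 = {"home_interview": 309, "came_to_center": 290, "scanned": 271}
wave3_scanned_by_origin = {"wave1_sample": 158, "refresher_sample": 113}
wave6 = {"came_to_center": 144, "scanned": 120}
wave6_scanned_by_history = {"also_scanned_wave3": 110, "first_scan": 10}


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


# The subgroup counts should add up to the number of successful scans.
wave3_consistent = sum(wave3_scanned_by_origin.values()) == wave3["scanned"]
wave6_consistent = sum(wave6_scanned_by_history.values()) == wave6["scanned"]

# Share of enrolled or attending youth who ended up with a usable scan.
wave3_scan_rate = pct(wave3["scanned"], wave3["home_interview"])
wave6_scan_rate = pct(wave6["scanned"], wave6["came_to_center"])

print(wave3_consistent, wave6_consistent, wave3_scan_rate, wave6_scan_rate)
```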
MRI Tasks In the first wave of imaging data (2018 to 2020; Wave 3 of the AHDC parent study), the task sequence was the same for all youths. The duration of each run is listed after its name, followed in parentheses by the number of subjects remaining after quality control checks (e.g. motion).
1. MPRAGE: 6:58 min (n = 249)
2. T2: 3:36 min
3. Resting State Scan (eyes open, rest): 5 min
4. Emotional Faces Task (Surprise, Angry, Fear, Neutral): 4:30 min x 2 runs (n = 214)
5. Cue Reactivity Task (Food, Marijuana, Flavored E-Cigs, Alcohol, and Outdoor images): 5:40 min x 2 runs (n = 215)
6. DTI: 6:55 min
7. Resting State Scan (eyes open, rest): 5 min
8. Field Map: 1:33 min
9. Monetary Incentive Delay Task: 5:23 min x 2 runs (n = 207)
10. Working Memory Task: 4:51 min x 2 runs (n = 183)
These latter two tasks used the same Eprime script as used in the ABCD study.
In the second imaging wave run between 2022 and 2024 (Wave 6 overall), there was a slight change to the task order for all participants in order to reduce the probability of youths falling asleep during the first resting state scan. The scan order for the second wave of imaging data was T1, T2, Emotional Faces Task, Cue Reactivity, Resting State 1, MID, Resting State 2, Field Map, Nback task.
Caution This dataset is for research purposes only. The data have been anonymized, and users must not perform analyses aimed at re-identifying individual subjects.
Acknowledgements. We are grateful to all of the youth and their caregivers who participated in the study. The Adolescent Health and Development in Context study (Waves 1 and 2) was funded by the National Institute on Drug Abuse (R01DA032371; Browning, PI) as well as the Eunice Kennedy Shriver National Institute of Child Health and Human Development (Boettner, R03HD096182; Calder, R01HD088545; Hayford, the Ohio State University Institute for Population Research, 2P2CHD058484) and the William T. Grant Foundation. Participants for the imaging data (Waves 3 and 6) were recruited from this sample, which was generously supported by a grant from the National Institute on Drug Abuse (R01DA042080; Way, PI). There were two waves of data collected during COVID (Waves 4 and 5) that were funded by a supplemental grant from the National Institute on Drug Abuse (DA042080-03S1; Way, PI). Assay of head hair samples for cortisol during the imaging waves (Waves 3 and 6) was funded by a grant from the John Templeton Foundation (ID: 61803; Way, PI). Head hair cortisol and salivary cortisol collection and assays for Waves 1 and 2 were funded by R21DA034960 (Ford, PI).
https://creativecommons.org/publicdomain/zero/1.0/
This is a dataset of the most highly populated city (if applicable) in a form easy to join with the COVID19 Global Forecasting (Week 1) dataset. You can see how to use it in this kernel
There are four columns. The first two correspond to columns of the original COVID19 Global Forecasting (Week 1) dataset. The other two give the highest population density, at city level, for the given country/state. Note that some countries are very small, and in those cases the population density reflects the entire country. Since the original dataset has a few cruise ships as well, I've added them too.
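A minimal sketch of such a join, using only the standard library. The column names (`Province/State`, `Country/Region`, `City`, `Density`) and every value below are placeholders for illustration, not taken from the real files; check the actual CSV headers before adapting it.

```python
import csv
import io

# Tiny stand-ins for the two files. All names and numbers are placeholders.
forecast_csv = """Province/State,Country/Region,Date,ConfirmedCases
State A,Country X,2020-03-20,10
,Country Y,2020-03-20,20
,Country Z,2020-03-20,30
"""
density_csv = """Province/State,Country/Region,City,Density
State A,Country X,City A,100
,Country Y,City B,200
"""


def key(row):
    """Join key: the two columns shared by both files."""
    return (row["Country/Region"], row["Province/State"])


# Index the density rows by the shared key.
density_by_key = {key(r): r for r in csv.DictReader(io.StringIO(density_csv))}

# Left join: keep every forecast row, attach density fields when available.
joined = []
for row in csv.DictReader(io.StringIO(forecast_csv)):
    match = density_by_key.get(key(row))  # None when no density row exists
    row["City"] = match["City"] if match else None
    row["Density"] = float(match["Density"]) if match else None
    joined.append(row)

for row in joined:
    print(row["Country/Region"], row["Province/State"], row["City"], row["Density"])
```

A left join keeps forecast rows that have no density entry, which makes gaps in the density file easy to spot.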
Thanks a lot to Kaggle for this competition that gave me the opportunity to look closely at some data and understand this problem better.
Summary: I believe that the square root of the population density should relate to the logistic growth factor of the SIR model. I think the SEIR model isn't applicable due to any intervention being too late for a fast-spreading virus like this, especially in places with dense populations.
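One way to read this hypothesis, purely as an illustration and not as the model actually fitted in the kernel: take a logistic curve for cumulative cases and let its growth rate scale with the square root of the population density. The proportionality constant `c`, the plateau `K`, the midpoint `t0`, and both function names are hypothetical.

```python
import math


def growth_rate(density, c=0.01):
    """Hypothetical link: logistic growth rate proportional to sqrt(density)."""
    return c * math.sqrt(density)


def logistic_cases(t, r, K=1000.0, t0=30.0):
    """Logistic curve K / (1 + exp(-r * (t - t0))) for cumulative cases at day t."""
    return K / (1.0 + math.exp(-r * (t - t0)))


# Under this assumption, quadrupling the density doubles the growth rate,
# so the denser place moves from few cases to the plateau much faster.
for density in (100.0, 400.0):
    r = growth_rate(density)
    print(density, r, logistic_cases(20.0, r), logistic_cases(40.0, r))
```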
After playing a bit with the data provided in COVID19 Global Forecasting (Week 1) (and everything else online or in the media), one thing becomes clear: the numbers have nothing to do with epidemiology. They reflect sociopolitical characteristics of a country/state and, more specifically, its reactivity and attitude towards testing.
The testing method used (PCR tests) means that what we measure could potentially be a proxy for the number of people infected during the last 3 weeks, i.e., the growth (with lag). It's not how many people have been infected and recovered. Antibody or serology tests would measure that, and by using them, we could go back to normality faster... but those will arrive too late. Well before then, China will have experimentally shown that it's safe to go back to normal as soon as your number of newly infected per day is close to zero.
Figure: https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F197482%2F429e0fdd7f1ce86eba882857ac7a735e%2Fcovid-summary.png?generation=1585072438685236&alt=media
My view of this virus, as a person living in NYC, is that by the time governments react to media pressure and lock down, or even test, it's too late. In dense areas, everyone susceptible has already had ample opportunity to be infected. Especially for a virus with a 5-14 day lag between infection and symptoms, a period during which hosts spread it all over the subway, the conditions are hopeless. Active populations have already been exposed, were mostly asymptomatic, and have recovered. Sensitive/older populations are more self-isolated/careful in affluent societies (maybe this isn't the case in North Italy). As the virus finishes exploring the active population, it starts penetrating the more isolated ones. At this point in time, the first fatalities happen. Then testing starts. Then come the media and the lockdown. Lockdown seems overly effective because it coincides with the tail of the disease spread. It helps slow down the virus exploring the long tail of the sensitive population, and we should all contribute by doing it, but it doesn't cause the end of the disease. If it did, then as soon as people were back in the streets (see China), there would be repeated outbreaks.
Smart politicians will test a lot because it will make their condition look worse. It helps them demand more resources. At the same time, they will have a low rate of fatalities due to the large denominator. They can take credit for managing a disproportionately major crisis well, in contrast to people who didn't test.
We were lucky this time. We, Westerners, have woken up to the potential of a pandemic. I'm sure we will give further resources for prevention. Additionally, we will be more open-minded, helping politicians to have more direct responses. We will also require them to be more responsible in their messages and reactions.