49 datasets found

Number of missing persons files in the U.S. 2022, by race
statista.com
Updated Jul 5, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Statista (2024). Number of missing persons files in the U.S. 2022, by race [Dataset]. https://www.statista.com/statistics/240396/number-of-missing-persons-files-in-the-us-by-race/
Explore at:
Dataset updated
Jul 5, 2024
Dataset authored and provided by
Statistahttp://statista.com/
Time period covered
2022
Area covered
United States
Description
In 2022, there were 313,017 cases filed by the NCIC where the race of the reported missing was White. In the same year, 18,928 people were missing whose race was unknown.

What is the NCIC?

The National Crime Information Center (NCIC) is a digital database that stores crime data for the United States, so criminal justice agencies can access it. As a part of the FBI, it helps criminal justice professionals find criminals, missing people, stolen property, and terrorists. The NCIC database is broken down into 21 files. Seven files belong to stolen property and items, and 14 belong to persons, including the National Sex Offender Register, Missing Person, and Identify Theft. It works alongside federal, tribal, state, and local agencies. The NCIC’s goal is to maintain a centralized information system between local branches and offices, so information is easily accessible nationwide.

Missing people in the United States

A person is considered missing when they have disappeared and their location is unknown. A person who is considered missing might have left voluntarily, but that is not always the case. The number of the NCIC unidentified person files in the United States has fluctuated since 1990, and in 2022, there were slightly more NCIC missing person files for males as compared to females. Fortunately, the number of NCIC missing person files has been mostly decreasing since 1998.
n
Coronavirus (Covid-19) Data in the United States
nytimes.com
openicpsr.org
+4more
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
New York Times, Coronavirus (Covid-19) Data in the United States [Dataset]. https://www.nytimes.com/interactive/2020/us/coronavirus-us-cases.html
Explore at:
Dataset provided by
New York Times
Description
The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.
Since late January, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.
We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.
The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.
N
Lost Creek, WV Population Dataset: Yearly Figures, Population Change, and...
neilsberg.com
csv, json
Updated Sep 18, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Neilsberg Research (2023). Lost Creek, WV Population Dataset: Yearly Figures, Population Change, and Percent Change Analysis [Dataset]. https://www.neilsberg.com/research/datasets/6ed01482-3d85-11ee-9abe-0aa64bf2eeb2/
Explore at:
json, csvAvailable download formats
Dataset updated
Sep 18, 2023
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
West Virginia, Lost Creek
Variables measured
Annual Population Growth Rate, Population Between 2000 and 2022, Annual Population Growth Rate Percent
Measurement technique
The data presented in this dataset is derived from the 20 years data of U.S. Census Bureau Population Estimates Program (PEP) 2000 - 2022. To measure the variables, namely (a) population and (b) population change in ( absolute and as a percentage ), we initially analyzed and tabulated the data for each of the years between 2000 and 2022. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset tabulates the Lost Creek population over the last 20 plus years. It lists the population for each year, along with the year on year change in population, as well as the change in percentage terms for each year. The dataset can be utilized to understand the population change of Lost Creek across the last two decades. For example, using this dataset, we can identify if the population is declining or increasing. If there is a change, when the population peaked, or if it is still growing and has not reached its peak. We can also compare the trend with the overall trend of United States population over the same period of time.

Key observations

In 2022, the population of Lost Creek was 353, a 0.84% decrease year-by-year from 2021. Previously, in 2021, Lost Creek population was 356, a decline of 0.84% compared to a population of 359 in 2020. Over the last 20 plus years, between 2000 and 2022, population of Lost Creek decreased by 112. In this period, the peak population was 503 in the year 2006. The numbers suggest that the population has already reached its peak and is showing a trend of decline. Source: U.S. Census Bureau Population Estimates Program (PEP).

Content

When available, the data consists of estimates from the U.S. Census Bureau Population Estimates Program (PEP).

Data Coverage:

From 2000 to 2022

Variables / Data Columns

Year: This column displays the data year (Measured annually and for years 2000 to 2022)

Population: The population for the specific year for the Lost Creek is shown in this column.

Year on Year Change: This column displays the change in Lost Creek population for each year compared to the previous year.

Change in Percent: This column displays the year on year change as a percentage. Please note that the sum of all percentages may not equal one due to rounding of values.

Good to know

Margin of Error

Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.

Custom data

If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for Lost Creek Population by Year. You can refer the same here
Z
Counts of Dysentery reported in UNITED STATES OF AMERICA: 1942-1948
data.niaid.nih.gov
Updated Jun 3, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Van Panhuis, Willem (2024). Counts of Dysentery reported in UNITED STATES OF AMERICA: 1942-1948 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11452249
Explore at:
Dataset updated
Jun 3, 2024
Dataset provided by
Van Panhuis, Willem
Cross, Anne
Burke, Donald
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United States
Description
Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretabilty. We also formatted the data into a standard data format. Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and for a specific country (e.g. The United States). Case counts are reported per time interval. In addition to case counts, datsets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of aquisition, and the source from which we extracted case counts. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", or "US measles cases reported by WHO", or "US measles cases that originated abroad", etc. Depending on the intended use of a dataset, we recommend a few data processing steps before analysis:

Analyze missing data: Project Tycho datasets do not inlcude time intervals for which no case count was reported (for many datasets, time series of case counts are incomplete, due to incompleteness of source documents) and users will need to add time intervals for which no count value is available. Project Tycho datasets do include time intervals for which a case count value of zero was reported. Separate cumulative from non-cumulative time interval series. Case count time series in Project Tycho datasets can be "cumulative" or "fixed-intervals". Cumulative case count time series consist of overlapping case count intervals starting on the same date, but ending on different dates. For example, each interval in a cumulative count time series can start on January 1st, but end on January 7th, 14th, 21st, etc. It is common practice among public health agencies to report cases for cumulative time intervals. Case count series with fixed time intervals consist of mutually exxclusive time intervals that all start and end on different dates and all have identical length (day, week, month, year). Given the different nature of these two types of case count data, we indicated this with an attribute for each count value, named "PartOfCumulativeCountSeries".
N
Lost Nation, IA Age Group Population Dataset: A complete breakdown of Lost...
neilsberg.com
csv, json
Updated Sep 16, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Neilsberg Research (2023). Lost Nation, IA Age Group Population Dataset: A complete breakdown of Lost Nation age demographics from 0 to 85 years, distributed across 18 age groups [Dataset]. https://www.neilsberg.com/research/datasets/70a91071-3d85-11ee-9abe-0aa64bf2eeb2/
Explore at:
csv, jsonAvailable download formats
Dataset updated
Sep 16, 2023
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Iowa, Lost Nation
Variables measured
Population Under 5 Years, Population over 85 years, Population Between 5 and 9 years, Population Between 10 and 14 years, Population Between 15 and 19 years, Population Between 20 and 24 years, Population Between 25 and 29 years, Population Between 30 and 34 years, Population Between 35 and 39 years, Population Between 40 and 44 years, and 9 more
Measurement technique
The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the age groups. For age groups we divided it into roughly a 5 year bucket for ages between 0 and 85. For over 85, we aggregated data into a single group for all ages. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset tabulates the Lost Nation population distribution across 18 age groups. It lists the population in each age group along with the percentage population relative of the total population for Lost Nation. The dataset can be utilized to understand the population distribution of Lost Nation by age. For example, using this dataset, we can identify the largest age group in Lost Nation.

Key observations

The largest age group in Lost Nation, IA was for the group of age 85+ years with a population of 44 (10.48%), according to the 2021 American Community Survey. At the same time, the smallest age group in Lost Nation, IA was the 15-19 years with a population of 7 (1.67%). Source: U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

Age groups:

Under 5 years

5 to 9 years

10 to 14 years

15 to 19 years

20 to 24 years

25 to 29 years

30 to 34 years

35 to 39 years

40 to 44 years

45 to 49 years

50 to 54 years

55 to 59 years

60 to 64 years

65 to 69 years

70 to 74 years

75 to 79 years

80 to 84 years

85 years and over

Variables / Data Columns

Age Group: This column displays the age group in consideration

Population: The population for the specific age group in the Lost Nation is shown in this column.

% of Total Population: This column displays the population of each age group as a proportion of Lost Nation total population. Please note that the sum of all percentages may not equal one due to rounding of values.

Good to know

Margin of Error

Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.

Custom data

If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for Lost Nation Population by Age. You can refer the same here
p
Counts of Murine typhus reported in UNITED STATES OF AMERICA: 1945-1961
tycho.pitt.edu
Updated Apr 1, 2018
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Willem G Van Panhuis; Anne L Cross; Donald S Burke (2018). Counts of Murine typhus reported in UNITED STATES OF AMERICA: 1945-1961 [Dataset]. https://www.tycho.pitt.edu/dataset/US.25668000
Explore at:
Dataset updated
Apr 1, 2018
Dataset provided by
Project Tycho, University of Pittsburgh
Authors
Willem G Van Panhuis; Anne L Cross; Donald S Burke
Time period covered
1945 - 1961
Area covered
United States
Description
Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretability. We also formatted the data into a standard data format.

Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and for a specific country (e.g. The United States). Case counts are reported per time interval. In addition to case counts, datasets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of acquisition, and the source from which we extracted case counts. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", or "US measles cases reported by WHO", or "US measles cases that originated abroad", etc.

Depending on the intended use of a dataset, we recommend a few data processing steps before analysis: - Analyze missing data: Project Tycho datasets do not include time intervals for which no case count was reported (for many datasets, time series of case counts are incomplete, due to incompleteness of source documents) and users will need to add time intervals for which no count value is available. Project Tycho datasets do include time intervals for which a case count value of zero was reported. - Separate cumulative from non-cumulative time interval series. Case count time series in Project Tycho datasets can be "cumulative" or "fixed-intervals". Cumulative case count time series consist of overlapping case count intervals starting on the same date, but ending on different dates. For example, each interval in a cumulative count time series can start on January 1st, but end on January 7th, 14th, 21st, etc. It is common practice among public health agencies to report cases for cumulative time intervals. Case count series with fixed time intervals consist of mutually exclusive time intervals that all start and end on different dates and all have identical length (day, week, month, year). Given the different nature of these two types of case count data, we indicated this with an attribute for each count value, named "PartOfCumulativeCountSeries".
z
Counts of Cholera reported in UNITED STATES OF AMERICA: 1895-1905
zenodo.org
json, xml, zip
Updated Jun 3, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Willem Van Panhuis; Willem Van Panhuis; Anne Cross; Anne Cross; Donald Burke; Donald Burke (2024). Counts of Cholera reported in UNITED STATES OF AMERICA: 1895-1905 [Dataset]. http://doi.org/10.25337/t7/ptycho.v2.0/us.63650001
Explore at:
json, zip, xmlAvailable download formats
Unique identifier
https://doi.org/10.25337/t7/ptycho.v2.0/us.63650001
Dataset updated
Jun 3, 2024
Dataset provided by
Project Tycho
Authors
Willem Van Panhuis; Willem Van Panhuis; Anne Cross; Anne Cross; Donald Burke; Donald Burke
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 29, 1895 - Nov 25, 1905
Area covered
United States
Description
Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretabilty. We also formatted the data into a standard data format.
Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and for a specific country (e.g. The United States). Case counts are reported per time interval. In addition to case counts, datsets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of aquisition, and the source from which we extracted case counts. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", or "US measles cases reported by WHO", or "US measles cases that originated abroad", etc.
Depending on the intended use of a dataset, we recommend a few data processing steps before analysis:
Analyze missing data: Project Tycho datasets do not inlcude time intervals for which no case count was reported (for many datasets, time series of case counts are incomplete, due to incompleteness of source documents) and users will need to add time intervals for which no count value is available. Project Tycho datasets do include time intervals for which a case count value of zero was reported.
Separate cumulative from non-cumulative time interval series. Case count time series in Project Tycho datasets can be "cumulative" or "fixed-intervals". Cumulative case count time series consist of overlapping case count intervals starting on the same date, but ending on different dates. For example, each interval in a cumulative count time series can start on January 1st, but end on January 7th, 14th, 21st, etc. It is common practice among public health agencies to report cases for cumulative time intervals. Case count series with fixed time intervals consist of mutually exxclusive time intervals that all start and end on different dates and all have identical length (day, week, month, year). Given the different nature of these two types of case count data, we indicated this with an attribute for each count value, named "PartOfCumulativeCountSeries".
Z
Counts of Rocky Mountain spotted fever reported in UNITED STATES OF AMERICA:...
data.niaid.nih.gov
Updated Jun 3, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Van Panhuis, Willem (2024). Counts of Rocky Mountain spotted fever reported in UNITED STATES OF AMERICA: 1942-2016 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11452295
Explore at:
Dataset updated
Jun 3, 2024
Dataset provided by
Van Panhuis, Willem
Cross, Anne
Burke, Donald
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Rocky Mountains, United States
Description
Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretabilty. We also formatted the data into a standard data format. Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and for a specific country (e.g. The United States). Case counts are reported per time interval. In addition to case counts, datsets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of aquisition, and the source from which we extracted case counts. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", or "US measles cases reported by WHO", or "US measles cases that originated abroad", etc. Depending on the intended use of a dataset, we recommend a few data processing steps before analysis:

Analyze missing data: Project Tycho datasets do not inlcude time intervals for which no case count was reported (for many datasets, time series of case counts are incomplete, due to incompleteness of source documents) and users will need to add time intervals for which no count value is available. Project Tycho datasets do include time intervals for which a case count value of zero was reported. Separate cumulative from non-cumulative time interval series. Case count time series in Project Tycho datasets can be "cumulative" or "fixed-intervals". Cumulative case count time series consist of overlapping case count intervals starting on the same date, but ending on different dates. For example, each interval in a cumulative count time series can start on January 1st, but end on January 7th, 14th, 21st, etc. It is common practice among public health agencies to report cases for cumulative time intervals. Case count series with fixed time intervals consist of mutually exxclusive time intervals that all start and end on different dates and all have identical length (day, week, month, year). Given the different nature of these two types of case count data, we indicated this with an attribute for each count value, named "PartOfCumulativeCountSeries".
T
United States Unemployment Rate
tradingeconomics.com
pt.tradingeconomics.com
+15more
csv, excel, json, xml
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
TRADING ECONOMICS, United States Unemployment Rate [Dataset]. https://tradingeconomics.com/united-states/unemployment-rate
Explore at:
excel, xml, csv, jsonAvailable download formats
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 31, 1948 - Feb 28, 2025
Area covered
United States
Description
Unemployment Rate in the United States increased to 4.10 percent in February from 4 percent in January of 2025. This dataset provides the latest reported value for - United States Unemployment Rate - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
z
Counts of Measles reported in UNITED STATES OF AMERICA: 1888-2002
zenodo.org
data.niaid.nih.gov
json, xml, zip
Updated Jun 3, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Willem Van Panhuis; Willem Van Panhuis; Anne Cross; Anne Cross; Donald Burke; Donald Burke (2024). Counts of Measles reported in UNITED STATES OF AMERICA: 1888-2002 [Dataset]. http://doi.org/10.25337/t7/ptycho.v2.0/us.14189004
Explore at:
xml, json, zipAvailable download formats
Unique identifier
https://doi.org/10.25337/t7/ptycho.v2.0/us.14189004
Dataset updated
Jun 3, 2024
Dataset provided by
Project Tycho
Authors
Willem Van Panhuis; Willem Van Panhuis; Anne Cross; Anne Cross; Donald Burke; Donald Burke
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jul 15, 1888 - Dec 28, 2002
Area covered
United States
Description
Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretabilty. We also formatted the data into a standard data format.
Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and for a specific country (e.g. The United States). Case counts are reported per time interval. In addition to case counts, datsets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of aquisition, and the source from which we extracted case counts. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", or "US measles cases reported by WHO", or "US measles cases that originated abroad", etc.
Depending on the intended use of a dataset, we recommend a few data processing steps before analysis:
Analyze missing data: Project Tycho datasets do not inlcude time intervals for which no case count was reported (for many datasets, time series of case counts are incomplete, due to incompleteness of source documents) and users will need to add time intervals for which no count value is available. Project Tycho datasets do include time intervals for which a case count value of zero was reported.
Separate cumulative from non-cumulative time interval series. Case count time series in Project Tycho datasets can be "cumulative" or "fixed-intervals". Cumulative case count time series consist of overlapping case count intervals starting on the same date, but ending on different dates. For example, each interval in a cumulative count time series can start on January 1st, but end on January 7th, 14th, 21st, etc. It is common practice among public health agencies to report cases for cumulative time intervals. Case count series with fixed time intervals consist of mutually exxclusive time intervals that all start and end on different dates and all have identical length (day, week, month, year). Given the different nature of these two types of case count data, we indicated this with an attribute for each count value, named "PartOfCumulativeCountSeries".
Energy Consumption of United States Over Time
kaggle.com
Updated Dec 14, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
The Devastator (2022). Energy Consumption of United States Over Time [Dataset]. https://www.kaggle.com/datasets/thedevastator/unlocking-the-energy-consumption-of-united-state
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Dec 14, 2022
Dataset provided by
Kagglehttp://kaggle.com/
Authors
The Devastator
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Energy Consumption of United States Over Time

Building Energy Data Book

By Department of Energy [source]

About this dataset

The Building Energy Data Book (2011) is an invaluable resource for gaining insight into the current state of energy consumption in the buildings sector. This dataset provides comprehensive data on residential, commercial and industrial building energy consumption, construction techniques, building technologies and characteristics. With this resource, you can get an in-depth understanding of how energy is used in various types of buildings - from single family homes to large office complexes - as well as its impact on the environment. The BTO within the U.S Department of Energy's Office of Energy Efficiency and Renewable Energy developed this dataset to provide a wealth of knowledge for researchers, policy makers, engineers and even everyday observers who are interested in learning more about our built environment and its energy usage patterns

More Datasets

For more datasets, click here.

Featured Notebooks

🚨 Your notebook can be here! 🚨!

How to use the dataset

This dataset provides comprehensive information regarding energy consumption in the buildings sector of the United States. It contains a number of key variables which can be used to analyze and explore the relations between energy consumption and building characteristics, technologies, and construction. The data is provided in both CSV format as well as tabular format which can make it helpful for those who prefer to use programs like Excel or other statistical modeling software.

In order to get started with this dataset we've developed a guide outlining how to effectively use it for your research or project needs.

Understand what's included: Before you start analyzing the data, you should read through the provided documentation so that you fully understand what is included in the datasets. You'll want to be aware of any potential limitations or requirements associated with each type of data point so that your results are valid and reliable when drawing conclusions from them.

Clean up any outliers: You may need to take some time upfront investigating suspicious outliers within your dataset before using it in any further analyses — otherwise, they can skew results down the road if not dealt with first-hand! Furthermore, they could also make complex statistical modeling more difficult as well since they artificially inflate values depending on their magnitude within each example data point (i.e., one outlier could affect an entire model’s prior distributions). Missing values should also be accounted for too since these may not always appear obvious at first glance when reviewing a table or graphical representation - but accurate statistics must still be obtained either way no matter how messy things seem!

Exploratory data analysis: After cleaning up your dataset you'll want to do some basic exploring by visualizing different types of summaries like boxplots, histograms and scatter plots etc.. This will give you an initial case into what trends might exist within certain demographic/geographic/etc.. regions & variables which can then help inform future predictive models when needed! Additionally this step will highlight any clear discontinuous changes over time due over-generalization (if applicable), making sure predictors themselves don’t become part noise instead contributing meaningful signals towards overall effect predictions accuracy etc…

Analyze key metrics & observations: Once exploratory analyses have been carried out on rawsamples post-processing steps are next such as analyzing metrics such ascorrelations amongst explanatory functions; performing significance testing regression models; imputing missing/outlier values and much more depending upon specific project needs at hand… Additionally – interpretation efforts based

Research Ideas

Creating an energy efficiency rating system for buildings - Using the dataset, an organization can develop a metric to rate the energy efficiency of commercial and residential buildings in a standardized way.

Developing targeted campaigns to raise awareness about energy conservation - Analyzing data from this dataset can help organizations identify areas of high energy consumption and create targeted campaigns and incentives to encourage people to conserve energy in those areas.

Estimating costs associated with upgrading building technologies - By evaluating various trends in building technologies and their associated costs, decision-makers can determine the most cost-effective option when it comes time to upgrade their structures' energy efficiency...
z
Counts of Scarlet fever reported in UNITED STATES OF AMERICA: 1888-1969
zenodo.org
data.niaid.nih.gov
json, xml, zip
Updated Jun 3, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Willem Van Panhuis; Willem Van Panhuis; Anne Cross; Anne Cross; Donald Burke; Donald Burke (2024). Counts of Scarlet fever reported in UNITED STATES OF AMERICA: 1888-1969 [Dataset]. http://doi.org/10.25337/t7/ptycho.v2.0/us.30242009
Explore at:
json, zip, xmlAvailable download formats
Unique identifier
https://doi.org/10.25337/t7/ptycho.v2.0/us.30242009
Dataset updated
Jun 3, 2024
Dataset provided by
Project Tycho
Authors
Willem Van Panhuis; Willem Van Panhuis; Anne Cross; Anne Cross; Donald Burke; Donald Burke
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jun 10, 1888 - Dec 27, 1969
Area covered
United States
Description
Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretabilty. We also formatted the data into a standard data format.
Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and for a specific country (e.g. The United States). Case counts are reported per time interval. In addition to case counts, datsets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of aquisition, and the source from which we extracted case counts. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", or "US measles cases reported by WHO", or "US measles cases that originated abroad", etc.
Depending on the intended use of a dataset, we recommend a few data processing steps before analysis:
Analyze missing data: Project Tycho datasets do not inlcude time intervals for which no case count was reported (for many datasets, time series of case counts are incomplete, due to incompleteness of source documents) and users will need to add time intervals for which no count value is available. Project Tycho datasets do include time intervals for which a case count value of zero was reported.
Separate cumulative from non-cumulative time interval series. Case count time series in Project Tycho datasets can be "cumulative" or "fixed-intervals". Cumulative case count time series consist of overlapping case count intervals starting on the same date, but ending on different dates. For example, each interval in a cumulative count time series can start on January 1st, but end on January 7th, 14th, 21st, etc. It is common practice among public health agencies to report cases for cumulative time intervals. Case count series with fixed time intervals consist of mutually exxclusive time intervals that all start and end on different dates and all have identical length (day, week, month, year). Given the different nature of these two types of case count data, we indicated this with an attribute for each count value, named "PartOfCumulativeCountSeries".
Z
Counts of Pertussis reported in UNITED STATES OF AMERICA: 1888-2017
data.niaid.nih.gov
Updated Jun 3, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Van Panhuis, Willem (2024). Counts of Pertussis reported in UNITED STATES OF AMERICA: 1888-2017 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11452359
Explore at:
Dataset updated
Jun 3, 2024
Dataset provided by
Van Panhuis, Willem
Cross, Anne
Burke, Donald
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United States
Description
Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretabilty. We also formatted the data into a standard data format. Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and for a specific country (e.g. The United States). Case counts are reported per time interval. In addition to case counts, datsets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of aquisition, and the source from which we extracted case counts. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", or "US measles cases reported by WHO", or "US measles cases that originated abroad", etc. Depending on the intended use of a dataset, we recommend a few data processing steps before analysis:

Analyze missing data: Project Tycho datasets do not inlcude time intervals for which no case count was reported (for many datasets, time series of case counts are incomplete, due to incompleteness of source documents) and users will need to add time intervals for which no count value is available. Project Tycho datasets do include time intervals for which a case count value of zero was reported. Separate cumulative from non-cumulative time interval series. Case count time series in Project Tycho datasets can be "cumulative" or "fixed-intervals". Cumulative case count time series consist of overlapping case count intervals starting on the same date, but ending on different dates. For example, each interval in a cumulative count time series can start on January 1st, but end on January 7th, 14th, 21st, etc. It is common practice among public health agencies to report cases for cumulative time intervals. Case count series with fixed time intervals consist of mutually exxclusive time intervals that all start and end on different dates and all have identical length (day, week, month, year). Given the different nature of these two types of case count data, we indicated this with an attribute for each count value, named "PartOfCumulativeCountSeries".
Z
Counts of Syphilis reported in UNITED STATES OF AMERICA: 1978-1978
data.niaid.nih.gov
zenodo.org
Updated Jun 3, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Burke, Donald (2024). Counts of Syphilis reported in UNITED STATES OF AMERICA: 1978-1978 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11452560
Explore at:
Dataset updated
Jun 3, 2024
Dataset provided by
Van Panhuis, Willem
Cross, Anne
Burke, Donald
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United States
Description
Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretabilty. We also formatted the data into a standard data format. Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and for a specific country (e.g. The United States). Case counts are reported per time interval. In addition to case counts, datsets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of aquisition, and the source from which we extracted case counts. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", or "US measles cases reported by WHO", or "US measles cases that originated abroad", etc. Depending on the intended use of a dataset, we recommend a few data processing steps before analysis:

Analyze missing data: Project Tycho datasets do not inlcude time intervals for which no case count was reported (for many datasets, time series of case counts are incomplete, due to incompleteness of source documents) and users will need to add time intervals for which no count value is available. Project Tycho datasets do include time intervals for which a case count value of zero was reported. Separate cumulative from non-cumulative time interval series. Case count time series in Project Tycho datasets can be "cumulative" or "fixed-intervals". Cumulative case count time series consist of overlapping case count intervals starting on the same date, but ending on different dates. For example, each interval in a cumulative count time series can start on January 1st, but end on January 7th, 14th, 21st, etc. It is common practice among public health agencies to report cases for cumulative time intervals. Case count series with fixed time intervals consist of mutually exxclusive time intervals that all start and end on different dates and all have identical length (day, week, month, year). Given the different nature of these two types of case count data, we indicated this with an attribute for each count value, named "PartOfCumulativeCountSeries".
Z
Counts of Meningococcal meningitis reported in UNITED STATES OF AMERICA:...
data.niaid.nih.gov
Updated Jun 3, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Burke, Donald (2024). Counts of Meningococcal meningitis reported in UNITED STATES OF AMERICA: 1926-1964 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11452299
Explore at:
Dataset updated
Jun 3, 2024
Dataset provided by
Van Panhuis, Willem
Cross, Anne
Burke, Donald
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United States
Description
Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretabilty. We also formatted the data into a standard data format. Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and for a specific country (e.g. The United States). Case counts are reported per time interval. In addition to case counts, datsets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of aquisition, and the source from which we extracted case counts. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", or "US measles cases reported by WHO", or "US measles cases that originated abroad", etc. Depending on the intended use of a dataset, we recommend a few data processing steps before analysis:

Analyze missing data: Project Tycho datasets do not inlcude time intervals for which no case count was reported (for many datasets, time series of case counts are incomplete, due to incompleteness of source documents) and users will need to add time intervals for which no count value is available. Project Tycho datasets do include time intervals for which a case count value of zero was reported. Separate cumulative from non-cumulative time interval series. Case count time series in Project Tycho datasets can be "cumulative" or "fixed-intervals". Cumulative case count time series consist of overlapping case count intervals starting on the same date, but ending on different dates. For example, each interval in a cumulative count time series can start on January 1st, but end on January 7th, 14th, 21st, etc. It is common practice among public health agencies to report cases for cumulative time intervals. Case count series with fixed time intervals consist of mutually exxclusive time intervals that all start and end on different dates and all have identical length (day, week, month, year). Given the different nature of these two types of case count data, we indicated this with an attribute for each count value, named "PartOfCumulativeCountSeries".
p
Counts of Active tuberculosis reported in UNITED STATES OF AMERICA:...
tycho.pitt.edu
Updated Apr 1, 2018
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Willem G Van Panhuis; Anne L Cross; Donald S Burke (2018). Counts of Active tuberculosis reported in UNITED STATES OF AMERICA: 1972-1974 [Dataset]. https://www.tycho.pitt.edu/dataset/US.427099000
Explore at:
Dataset updated
Apr 1, 2018
Dataset provided by
Project Tycho, University of Pittsburgh
Authors
Willem G Van Panhuis; Anne L Cross; Donald S Burke
Time period covered
1972 - 1974
Area covered
United States
Description
Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretability. We also formatted the data into a standard data format.

Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and for a specific country (e.g. The United States). Case counts are reported per time interval. In addition to case counts, datasets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of acquisition, and the source from which we extracted case counts. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", or "US measles cases reported by WHO", or "US measles cases that originated abroad", etc.

Depending on the intended use of a dataset, we recommend a few data processing steps before analysis: - Analyze missing data: Project Tycho datasets do not include time intervals for which no case count was reported (for many datasets, time series of case counts are incomplete, due to incompleteness of source documents) and users will need to add time intervals for which no count value is available. Project Tycho datasets do include time intervals for which a case count value of zero was reported. - Separate cumulative from non-cumulative time interval series. Case count time series in Project Tycho datasets can be "cumulative" or "fixed-intervals". Cumulative case count time series consist of overlapping case count intervals starting on the same date, but ending on different dates. For example, each interval in a cumulative count time series can start on January 1st, but end on January 7th, 14th, 21st, etc. It is common practice among public health agencies to report cases for cumulative time intervals. Case count series with fixed time intervals consist of mutually exclusive time intervals that all start and end on different dates and all have identical length (day, week, month, year). Given the different nature of these two types of case count data, we indicated this with an attribute for each count value, named "PartOfCumulativeCountSeries".
Z
Counts of Brucellosis reported in UNITED STATES OF AMERICA: 1945-1983
data.niaid.nih.gov
Updated Jun 3, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Burke, Donald (2024). Counts of Brucellosis reported in UNITED STATES OF AMERICA: 1945-1983 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11452558
Explore at:
Dataset updated
Jun 3, 2024
Dataset provided by
Van Panhuis, Willem
Cross, Anne
Burke, Donald
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United States
Description
Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretabilty. We also formatted the data into a standard data format. Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and for a specific country (e.g. The United States). Case counts are reported per time interval. In addition to case counts, datsets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of aquisition, and the source from which we extracted case counts. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", or "US measles cases reported by WHO", or "US measles cases that originated abroad", etc. Depending on the intended use of a dataset, we recommend a few data processing steps before analysis:

Analyze missing data: Project Tycho datasets do not inlcude time intervals for which no case count was reported (for many datasets, time series of case counts are incomplete, due to incompleteness of source documents) and users will need to add time intervals for which no count value is available. Project Tycho datasets do include time intervals for which a case count value of zero was reported. Separate cumulative from non-cumulative time interval series. Case count time series in Project Tycho datasets can be "cumulative" or "fixed-intervals". Cumulative case count time series consist of overlapping case count intervals starting on the same date, but ending on different dates. For example, each interval in a cumulative count time series can start on January 1st, but end on January 7th, 14th, 21st, etc. It is common practice among public health agencies to report cases for cumulative time intervals. Case count series with fixed time intervals consist of mutually exxclusive time intervals that all start and end on different dates and all have identical length (day, week, month, year). Given the different nature of these two types of case count data, we indicated this with an attribute for each count value, named "PartOfCumulativeCountSeries".
p
Counts of Diphtheria reported in UNITED STATES OF AMERICA: 1888-1981
tycho.pitt.edu
Updated Apr 1, 2018
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Willem G Van Panhuis; Anne L Cross; Donald S Burke (2018). Counts of Diphtheria reported in UNITED STATES OF AMERICA: 1888-1981 [Dataset]. https://www.tycho.pitt.edu/dataset/US.397428000
Explore at:
Dataset updated
Apr 1, 2018
Dataset provided by
Project Tycho, University of Pittsburgh
Authors
Willem G Van Panhuis; Anne L Cross; Donald S Burke
Time period covered
1888 - 1981
Area covered
United States
Description
Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretability. We also formatted the data into a standard data format.

Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and for a specific country (e.g. The United States). Case counts are reported per time interval. In addition to case counts, datasets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of acquisition, and the source from which we extracted case counts. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", or "US measles cases reported by WHO", or "US measles cases that originated abroad", etc.

Depending on the intended use of a dataset, we recommend a few data processing steps before analysis: - Analyze missing data: Project Tycho datasets do not include time intervals for which no case count was reported (for many datasets, time series of case counts are incomplete, due to incompleteness of source documents) and users will need to add time intervals for which no count value is available. Project Tycho datasets do include time intervals for which a case count value of zero was reported. - Separate cumulative from non-cumulative time interval series. Case count time series in Project Tycho datasets can be "cumulative" or "fixed-intervals". Cumulative case count time series consist of overlapping case count intervals starting on the same date, but ending on different dates. For example, each interval in a cumulative count time series can start on January 1st, but end on January 7th, 14th, 21st, etc. It is common practice among public health agencies to report cases for cumulative time intervals. Case count series with fixed time intervals consist of mutually exclusive time intervals that all start and end on different dates and all have identical length (day, week, month, year). Given the different nature of these two types of case count data, we indicated this with an attribute for each count value, named "PartOfCumulativeCountSeries".
d
Counts of Haemophilus influenzae type b infection reported in UNITED STATES...
data.datacite.org
json, zip
Updated Jun 3, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Willem Van Panhuis; Willem Van Panhuis; Anne Cross; Anne Cross; Donald Burke; Donald Burke (2024). Counts of Haemophilus influenzae type b infection reported in UNITED STATES OF AMERICA: 2001-2005 [Dataset]. http://doi.org/10.25337/t7/ptycho.v2.0/us.709410003
Explore at:
zip, jsonAvailable download formats
Unique identifier
https://doi.org/10.25337/t7/ptycho.v2.0/us.709410003
Dataset updated
Jun 3, 2024
Dataset provided by
Project Tycho
Authors
Willem Van Panhuis; Willem Van Panhuis; Anne Cross; Anne Cross; Donald Burke; Donald Burke
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 30, 2001 - Dec 31, 2005
Area covered
United States
Description
Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretabilty. We also formatted the data into a standard data format.
Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and for a specific country (e.g. The United States). Case counts are reported per time interval. In addition to case counts, datsets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of aquisition, and the source from which we extracted case counts. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", or "US measles cases reported by WHO", or "US measles cases that originated abroad", etc.
Depending on the intended use of a dataset, we recommend a few data processing steps before analysis:
Analyze missing data: Project Tycho datasets do not inlcude time intervals for which no case count was reported (for many datasets, time series of case counts are incomplete, due to incompleteness of source documents) and users will need to add time intervals for which no count value is available. Project Tycho datasets do include time intervals for which a case count value of zero was reported.
Separate cumulative from non-cumulative time interval series. Case count time series in Project Tycho datasets can be "cumulative" or "fixed-intervals". Cumulative case count time series consist of overlapping case count intervals starting on the same date, but ending on different dates. For example, each interval in a cumulative count time series can start on January 1st, but end on January 7th, 14th, 21st, etc. It is common practice among public health agencies to report cases for cumulative time intervals. Case count series with fixed time intervals consist of mutually exxclusive time intervals that all start and end on different dates and all have identical length (day, week, month, year). Given the different nature of these two types of case count data, we indicated this with an attribute for each count value, named "PartOfCumulativeCountSeries".
c
California Overlapping Cities and Counties and Identifiers
gis.data.ca.gov
hub.arcgis.com
Updated Sep 16, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
California Department of Technology (2024). California Overlapping Cities and Counties and Identifiers [Dataset]. https://gis.data.ca.gov/datasets/california-overlapping-cities-and-counties-and-identifiers/about
Explore at:
Dataset updated
Sep 16, 2024
Dataset authored and provided by
California Department of Technology
License
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Description
WARNING: This is a pre-release dataset and its fields names and data structures are subject to change. It should be considered pre-release until the end of 2024. Expected changes:Metadata is missing or incomplete for some layers at this time and will be continuously improved.We expect to update this layer roughly in line with CDTFA at some point, but will increase the update cadence over time as we are able to automate the final pieces of the process.This dataset is continuously updated as the source data from CDTFA is updated, as often as many times a month. If you require unchanging point-in-time data, export a copy for your own use rather than using the service directly in your applications.PurposeCounty and incorporated place (city) boundaries along with third party identifiers used to join in external data. Boundaries are from the authoritative source the California Department of Tax and Fee Administration (CDTFA), altered to show the counties as one polygon. This layer displays the city polygons on top of the County polygons so the area isn"t interrupted. The GEOID attribute information is added from the US Census. GEOID is based on merged State and County FIPS codes for the Counties. Abbreviations for Counties and Cities were added from Caltrans Division of Local Assistance (DLA) data. Place Type was populated with information extracted from the Census. Names and IDs from the US Board on Geographic Names (BGN), the authoritative source of place names as published in the Geographic Name Information System (GNIS), are attached as well. Finally, coastal buffers are removed, leaving the land-based portions of jurisdictions. This feature layer is for public use.Related LayersThis dataset is part of a grouping of many datasets:Cities: Only the city boundaries and attributes, without any unincorporated areasWith Coastal BuffersWithout Coastal BuffersCounties: Full county boundaries and attributes, including all cities within as a single polygonWith Coastal BuffersWithout Coastal BuffersCities and Full Counties: A merge of the other two layers, so polygons overlap within city boundaries. Some customers require this behavior, so we provide it as a separate service.With Coastal BuffersWithout Coastal Buffers (this dataset)Place AbbreviationsUnincorporated Areas (Coming Soon)Census Designated Places (Coming Soon)Cartographic CoastlinePolygonLine source (Coming Soon)Working with Coastal BuffersThe dataset you are currently viewing includes the coastal buffers for cities and counties that have them in the authoritative source data from CDTFA. In the versions where they are included, they remain as a second polygon on cities or counties that have them, with all the same identifiers, and a value in the COASTAL field indicating if it"s an ocean or a bay buffer. If you wish to have a single polygon per jurisdiction that includes the coastal buffers, you can run a Dissolve on the version that has the coastal buffers on all the fields except COASTAL, Area_SqMi, Shape_Area, and Shape_Length to get a version with the correct identifiers.Point of ContactCalifornia Department of Technology, Office of Digital Services, odsdataservices@state.ca.govField and Abbreviation DefinitionsCOPRI: county number followed by the 3-digit city primary number used in the Board of Equalization"s 6-digit tax rate area numbering systemPlace Name: CDTFA incorporated (city) or county nameCounty: CDTFA county name. For counties, this will be the name of the polygon itself. For cities, it is the name of the county the city polygon is within.Legal Place Name: Board on Geographic Names authorized nomenclature for area names published in the Geographic Name Information SystemGNIS_ID: The numeric identifier from the Board on Geographic Names that can be used to join these boundaries to other datasets utilizing this identifier.GEOID: numeric geographic identifiers from the US Census Bureau Place Type: Board on Geographic Names authorized nomenclature for boundary type published in the Geographic Name Information SystemPlace Abbr: CalTrans Division of Local Assistance abbreviations of incorporated area namesCNTY Abbr: CalTrans Division of Local Assistance abbreviations of county namesArea_SqMi: The area of the administrative unit (city or county) in square miles, calculated in EPSG 3310 California Teale Albers.COASTAL: Indicates if the polygon is a coastal buffer. Null for land polygons. Additional values include "ocean" and "bay".GlobalID: While all of the layers we provide in this dataset include a GlobalID field with unique values, we do not recommend you make any use of it. The GlobalID field exists to support offline sync, but is not persistent, so data keyed to it will be orphaned at our next update. Use one of the other persistent identifiers, such as GNIS_ID or GEOID instead.AccuracyCDTFA"s source data notes the following about accuracy:City boundary changes and county boundary line adjustments filed with the Board of Equalization per Government Code 54900. This GIS layer contains the boundaries of the unincorporated county and incorporated cities within the state of California. The initial dataset was created in March of 2015 and was based on the State Board of Equalization tax rate area boundaries. As of April 1, 2024, the maintenance of this dataset is provided by the California Department of Tax and Fee Administration for the purpose of determining sales and use tax rates. The boundaries are continuously being revised to align with aerial imagery when areas of conflict are discovered between the original boundary provided by the California State Board of Equalization and the boundary made publicly available by local, state, and federal government. Some differences may occur between actual recorded boundaries and the boundaries used for sales and use tax purposes. The boundaries in this map are representations of taxing jurisdictions for the purpose of determining sales and use tax rates and should not be used to determine precise city or county boundary line locations. COUNTY = county name; CITY = city name or unincorporated territory; COPRI = county number followed by the 3-digit city primary number used in the California State Board of Equalization"s 6-digit tax rate area numbering system (for the purpose of this map, unincorporated areas are assigned 000 to indicate that the area is not within a city).Boundary ProcessingThese data make a structural change from the source data. While the full boundaries provided by CDTFA include coastal buffers of varying sizes, many users need boundaries to end at the shoreline of the ocean or a bay. As a result, after examining existing city and county boundary layers, these datasets provide a coastline cut generally along the ocean facing coastline. For county boundaries in northern California, the cut runs near the Golden Gate Bridge, while for cities, we cut along the bay shoreline and into the edge of the Delta at the boundaries of Solano, Contra Costa, and Sacramento counties.In the services linked above, the versions that include the coastal buffers contain them as a second (or third) polygon for the city or county, with the value in the COASTAL field set to whether it"s a bay or ocean polygon. These can be processed back into a single polygon by dissolving on all the fields you wish to keep, since the attributes, other than the COASTAL field and geometry attributes (like areas) remain the same between the polygons for this purpose.SliversIn cases where a city or county"s boundary ends near a coastline, our coastline data may cross back and forth many times while roughly paralleling the jurisdiction"s boundary, resulting in many polygon slivers. We post-process the data to remove these slivers using a city/county boundary priority algorithm. That is, when the data run parallel to each other, we discard the coastline cut and keep the CDTFA-provided boundary, even if it extends into the ocean a small amount. This processing supports consistent boundaries for Fort Bragg, Point Arena, San Francisco, Pacifica, Half Moon Bay, and Capitola, in addition to others. More information on this algorithm will be provided soon.Coastline CaveatsSome cities have buffers extending into water bodies that we do not cut at the shoreline. These include South Lake Tahoe and Folsom, which extend into neighboring lakes, and San Diego and surrounding cities that extend into San Diego Bay, which our shoreline encloses. If you have feedback on the exclusion of these items, or others, from the shoreline cuts, please reach out using the contact information above.Offline UseThis service is fully enabled for sync and export using Esri Field Maps or other similar tools. Importantly, the GlobalID field exists only to support that use case and should not be used for any other purpose (see note in field descriptions).Updates and Date of ProcessingConcurrent with CDTFA updates, approximately every two weeks, Last Processed: 12/17/2024 by Nick Santos using code path at https://github.com/CDT-ODS-DevSecOps/cdt-ods-gis-city-county/ at commit 0bf269d24464c14c9cf4f7dea876aa562984db63. It incorporates updates from CDTFA as of 12/12/2024. Future updates will include improvements to metadata and update frequency.

Facebook

Twitter

Click to copy link

Link copied

Cite

Statista (2024). Number of missing persons files in the U.S. 2022, by race [Dataset]. https://www.statista.com/statistics/240396/number-of-missing-persons-files-in-the-us-by-race/

Number of missing persons files in the U.S. 2022, by race

Explore at:

Dataset updated

Jul 5, 2024

Dataset authored and provided by

Statistahttp://statista.com/

Time period covered

2022

Area covered

United States

Description

In 2022, there were 313,017 cases filed by the NCIC where the race of the reported missing was White. In the same year, 18,928 people were missing whose race was unknown.

What is the NCIC?

The National Crime Information Center (NCIC) is a digital database that stores crime data for the United States, so criminal justice agencies can access it. As a part of the FBI, it helps criminal justice professionals find criminals, missing people, stolen property, and terrorists. The NCIC database is broken down into 21 files. Seven files belong to stolen property and items, and 14 belong to persons, including the National Sex Offender Register, Missing Person, and Identify Theft. It works alongside federal, tribal, state, and local agencies. The NCIC’s goal is to maintain a centralized information system between local branches and offices, so information is easily accessible nationwide.

Missing people in the United States

A person is considered missing when they have disappeared and their location is unknown. A person who is considered missing might have left voluntarily, but that is not always the case. The number of the NCIC unidentified person files in the United States has fluctuated since 1990, and in 2022, there were slightly more NCIC missing person files for males as compared to females. Fortunately, the number of NCIC missing person files has been mostly decreasing since 1998.

Clear search

Close search

Google apps

Main menu

Number of missing persons files in the U.S. 2022, by race

Coronavirus (Covid-19) Data in the United States

Lost Creek, WV Population Dataset: Yearly Figures, Population Change, and...

About this dataset

Content

Inspiration

Recommended for further research

Counts of Dysentery reported in UNITED STATES OF AMERICA: 1942-1948

Lost Nation, IA Age Group Population Dataset: A complete breakdown of Lost...

About this dataset

Content

Inspiration

Recommended for further research

Counts of Murine typhus reported in UNITED STATES OF AMERICA: 1945-1961

Counts of Cholera reported in UNITED STATES OF AMERICA: 1895-1905

Counts of Rocky Mountain spotted fever reported in UNITED STATES OF AMERICA:...

United States Unemployment Rate

Counts of Measles reported in UNITED STATES OF AMERICA: 1888-2002

Energy Consumption of United States Over Time

Energy Consumption of United States Over Time

Building Energy Data Book

About this dataset

More Datasets

Featured Notebooks

How to use the dataset

Research Ideas

Counts of Scarlet fever reported in UNITED STATES OF AMERICA: 1888-1969

Counts of Pertussis reported in UNITED STATES OF AMERICA: 1888-2017

Counts of Syphilis reported in UNITED STATES OF AMERICA: 1978-1978

Counts of Meningococcal meningitis reported in UNITED STATES OF AMERICA:...

Counts of Active tuberculosis reported in UNITED STATES OF AMERICA:...

Counts of Brucellosis reported in UNITED STATES OF AMERICA: 1945-1983

Counts of Diphtheria reported in UNITED STATES OF AMERICA: 1888-1981

Counts of Haemophilus influenzae type b infection reported in UNITED STATES...

California Overlapping Cities and Counties and Identifiers

Number of missing persons files in the U.S. 2022, by race