Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the United States population distribution across 18 age groups. It lists the population in each age group along with that group's percentage of the total United States population. The dataset can be used to understand the population distribution of the United States by age. For example, using this dataset, we can identify the largest age group in the United States.
Key observations
The largest age group in the United States was the 30 to 34 years group, with a population of 23.06 million (6.94%), according to the ACS 2019-2023 5-Year Estimates. At the same time, the smallest age group in the United States was the 80 to 84 years group, with a population of 6.34 million (1.91%). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Age groups:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for United States Population by Age.
In 2023, Washington, D.C. had the highest population density in the United States, with 11,130.69 people per square mile. As a whole, there were about 94.83 residents per square mile in the U.S., and Alaska was the state with the lowest population density, with 1.29 residents per square mile.
The problem of population density
Simply put, population density is the population of a country divided by the area of the country. While this can be an interesting measure of how many people live in a country and how large the country is, it does not account for the degree of urbanization, or the share of people who live in urban centers. For example, Russia is the largest country in the world and has a comparatively low population, so its population density is very low. However, much of the country is uninhabited, so cities in Russia are much more densely populated than the rest of the country.
Urbanization in the United States
While the United States is not very densely populated compared to other countries, its population density has increased significantly over the past few decades. The degree of urbanization has also increased, and well over half of the population lives in urban centers.
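The definition of population density given above (population divided by area) can be written as a one-line calculation. The following is a minimal sketch; the function name and the input figures are hypothetical and chosen only for illustration, not taken from the statistics quoted here.

```python
def population_density(population: float, area_sq_mi: float) -> float:
    """Return the number of residents per square mile."""
    if area_sq_mi <= 0:
        raise ValueError("area must be positive")
    return population / area_sq_mi


# Hypothetical round figures for illustration only (not real state data).
example_population = 1_000_000
example_area_sq_mi = 50_000

print(population_density(example_population, example_area_sq_mi))  # 20.0
```

Because the measure averages over the whole area, a region with large uninhabited stretches can show a low density even when its cities are crowded, which is the limitation the text describes.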
In the past four centuries, the population of the United States has grown from a recorded 350 people around the Jamestown colony of Virginia in 1610, to an estimated 331 million people in 2020. The pre-colonization populations of the indigenous peoples of the Americas have proven difficult for historians to estimate, as their numbers decreased rapidly following the introduction of European diseases (namely smallpox, plague and influenza). Native Americans were also omitted from most censuses conducted before the twentieth century; therefore, the actual population of what we now know as the United States would have been much higher than the official census data from before 1800, but it is unclear by how much. Population growth in the colonies throughout the eighteenth century has primarily been attributed to migration from the British Isles and the Transatlantic slave trade; however, it is also difficult to ascertain the ethnic makeup of the population in these years, as accurate migration records were not kept until after the 1820s, at which point the importation of slaves had also been made illegal.
Nineteenth century
In the year 1800, it is estimated that the population across the present-day United States was around six million people, with the population in the 16 admitted states numbering 5.3 million. Migration to the United States began to happen on a large scale in the mid-nineteenth century, with the first major waves coming from Ireland, Britain and Germany. In some respects, this wave of mass migration balanced out the demographic impacts of the American Civil War, which was the deadliest war in U.S. history with approximately 620 thousand fatalities between 1861 and 1865.
The civil war also resulted in the emancipation of around four million slaves across the south, many of whose descendants would take part in the Great Northern Migration in the early 1900s, which saw around six million black Americans migrate away from the south in one of the largest demographic shifts in U.S. history. By the end of the nineteenth century, improvements in transport technology and increasing economic opportunities saw migration to the United States increase further, particularly from Southern and Eastern Europe, and in the first decade of the 1900s the number of migrants to the U.S. exceeded one million people in some years.
Twentieth and twenty-first century
The U.S. population has grown steadily throughout the past 120 years, reaching one hundred million in the 1910s, two hundred million in the 1960s, and three hundred million in 2007. In the past century, the U.S. established itself as a global superpower, with the world's largest economy (by nominal GDP) and most powerful military. Involvement in foreign wars has resulted in over 620,000 further U.S. fatalities since the Civil War, and migration fell drastically during the World Wars and Great Depression; however, the population grew continuously in these years as the total fertility rate remained above two births per woman, and life expectancy increased (except during the Spanish Flu pandemic of 1918).
Since the Second World War, Latin America has replaced Europe as the most common point of origin for migrants, with Hispanic populations growing rapidly across the south and border states. Because of this, the proportion of non-Hispanic whites, which has been the most dominant ethnicity in the U.S. since records began, has dropped more rapidly in recent decades. Ethnic minorities also have a much higher birth rate than non-Hispanic whites, further contributing to this decline, and the share of non-Hispanic whites is expected to fall below fifty percent of the U.S. population by the middle of the twenty-first century. In 2020, the United States had the third-largest population in the world (after China and India), and the population is expected to reach four hundred million in the 2050s.
The number of Twitter users in the United States was forecast to increase continuously between 2024 and 2028 by a total of 4.3 million users (+5.32 percent). After the ninth consecutive year of increase, the Twitter user base is estimated to reach 85.08 million users, and therefore a new peak, in 2028. Notably, the number of Twitter users had been increasing continuously over the preceding years.
User figures, shown here for the platform Twitter, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period.
The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).
Find more key insights for the number of Twitter users in countries like Canada and Mexico.
https://spdx.org/licenses/CC0-1.0.html
Sustainable cities depend on urban forests. City trees, a pillar of urban forests, improve our health, clean the air, store CO2, and cool local temperatures. Comparatively less is known about urban forests as ecosystems, particularly their spatial composition, nativity statuses, biodiversity, and tree health. Here, we assembled and standardized a new dataset of N=5,660,237 trees from 63 of the largest US cities. The data come from tree inventories conducted at the level of cities and/or neighborhoods. Each data sheet includes detailed information on tree location, species, nativity status (whether a tree species is naturally occurring or introduced), health, size, whether it is in a park or urban area, and more (comprising 28 standardized columns per datasheet). This dataset could be analyzed in combination with citizen-science datasets on bird, insect, or plant biodiversity; social and demographic data; or data on the physical environment. Urban forests offer a rare opportunity to intentionally design biodiverse, heterogeneous, rich ecosystems.
Methods
See eLife manuscript for full details. Below, we provide a summary of how the dataset was collected and processed.
Data Acquisition
We limited our search to the 150 largest cities in the USA (by census population). To acquire raw data on street tree communities, we used a search protocol on both Google and Google Datasets Search (https://datasetsearch.research.google.com/). We first searched the city name plus each of the following: street trees, city trees, tree inventory, urban forest, and urban canopy (all combinations totaled 20 searches per city, 10 each in Google and Google Datasets Search). We then read the first page of Google results and the top 20 results from Google Datasets Search. If the same named city in the wrong state appeared in the results, we redid the 20 searches adding the state name. If no data were found, we contacted a relevant state official via email or phone with an inquiry about their street tree inventory. Datasheets were received and transformed to .csv format (if they were not already in that format). We received data on street trees from 64 cities. One city, El Paso, had data only in summary format and was therefore excluded from analyses.
Data Cleaning
All code used is in the zipped folder Data S5 in the eLife publication. Before cleaning the data, we ensured that all reported trees for each city were located within the greater metropolitan area of the city (for certain inventories, many suburbs were reported, some within the greater metropolitan area, others not). First, we renamed all columns in the received .csv sheets, referring to the metadata and according to our standardized definitions (Table S4). To harmonize tree health and condition data across different cities, we inspected metadata from the tree inventories and converted all numeric scores to a descriptive scale including "excellent", "good", "fair", "poor", "dead", and "dead/dying". Some cities included only three points on this scale (e.g., "good", "poor", "dead/dying") while others included five (e.g., "excellent", "good", "fair", "poor", "dead"). Second, we used pandas in Python (W. McKinney & Others, 2011) to correct typos, non-ASCII characters, variable spellings, date format, units used (we converted all units to metric), address issues, and common name format. In some cases, units were not specified for tree diameter at breast height (DBH) and tree height; we determined the units based on typical sizes for trees of a particular species. Wherever diameter was reported, we assumed it was DBH. We standardized health and condition data across cities, preserving the highest granularity available for each city. For our analysis, we converted this variable to a binary (see section Condition and Health). We created a column called "location_type" to label whether a given tree was growing in the built environment or in green space. All of the changes we made, and decision points, are preserved in Data S9. Third, we checked the scientific names reported using gnr_resolve in the R library taxize (Chamberlain & Szöcs, 2013), with the option Best_match_only set to TRUE (Data S9).
Through an iterative process, we manually checked the results and corrected typos in the scientific names until all names were either a perfect match (n=1771 species) or partial match with threshold greater than 0.75 (n=453 species). BGS manually reviewed all partial matches to ensure that they were the correct species name, and then we programmatically corrected these partial matches (for example, Magnolia grandifolia, which is not a species name of a known tree, was corrected to Magnolia grandiflora, and Pheonix canariensus was corrected to its proper spelling of Phoenix canariensis). Because many of these tree inventories were crowd-sourced or generated in part through citizen science, such typos and misspellings are to be expected. Some tree inventories reported species by common names only. Therefore, our fourth step in data cleaning was to convert common names to scientific names. We generated a lookup table by summarizing all pairings of common and scientific names in the inventories for which both were reported. We manually reviewed the common to scientific name pairings, confirming that all were correct. Then we programmatically assigned scientific names to all common names (Data S9). Fifth, we assigned native status to each tree through reference to the Biota of North America Project (Kartesz, 2018), which has collected data on all native and non-native species occurrences throughout the US states. Specifically, we determined whether each tree species in a given city was native to that state, not native to that state, or that we did not have enough information to determine nativity (for cases where only the genus was known). Sixth, some cities reported only the street address but not latitude and longitude. For these cities, we used the OpenCageGeocoder (https://opencagedata.com/) to convert addresses to latitude and longitude coordinates (Data S9).
OpenCageGeocoder leverages open data and is used by many academic institutions (see https://opencagedata.com/solutions/academia). Seventh, we trimmed each city dataset to include only the standardized columns we identified in Table S4. After each stage of data cleaning, we performed manual spot checking to identify any issues.
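Three of the cleaning steps described above (harmonizing condition scores, converting units to metric, and filling scientific names from a common-name lookup table) can be sketched as follows. This is an illustrative sketch only, not the authors' code, which is in Data S5 of the eLife publication: it uses plain Python rather than pandas, and the numeric condition scale, the field names `common_name` and `scientific_name`, and all function names are assumptions made for the example.

```python
INCH_TO_CM = 2.54

# Hypothetical numeric-to-descriptive scale for one city's inventory.
# Each real inventory defines its own scale in its metadata.
CONDITION_SCALE = {1: "excellent", 2: "good", 3: "fair", 4: "poor", 5: "dead"}


def harmonize_condition(score):
    """Map a numeric condition score to a descriptive label (None if unknown)."""
    return CONDITION_SCALE.get(score)


def dbh_to_cm(value, unit):
    """Convert a diameter at breast height to centimeters."""
    if unit == "cm":
        return float(value)
    if unit == "in":
        return float(value) * INCH_TO_CM
    raise ValueError(f"unsupported unit: {unit}")


def build_name_lookup(records):
    """Collect common -> scientific name pairs from records that report both."""
    lookup = {}
    for rec in records:
        common = rec.get("common_name")
        scientific = rec.get("scientific_name")
        if common and scientific:
            # Keep the first pairing seen for each normalized common name.
            lookup.setdefault(common.strip().lower(), scientific.strip())
    return lookup


def fill_scientific_names(records, lookup):
    """Return copies of the records with missing scientific names filled in."""
    filled = []
    for rec in records:
        rec = dict(rec)
        if not rec.get("scientific_name") and rec.get("common_name"):
            key = rec["common_name"].strip().lower()
            rec["scientific_name"] = lookup.get(key)
        filled.append(rec)
    return filled


inventory = [
    {"common_name": "Southern Magnolia", "scientific_name": "Magnolia grandiflora"},
    {"common_name": "southern magnolia", "scientific_name": ""},
]
names = build_name_lookup(inventory)
print(fill_scientific_names(inventory, names)[1]["scientific_name"])
```

In the described workflow the lookup pairings were reviewed manually before being applied; the sketch skips that review and simply keeps the first pairing it encounters.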
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This layer represents USDA Food Access Research Atlas data at the census tract geography. Low Income is defined as tracts with a poverty rate of 20% or higher, or tracts with median family income less than 80% of median family income of the state or metropolitan area. Low Access is defined as tracts where a significant number or share of residents is more than 1 mile (urban) or 10 miles (rural) from the nearest supermarket.
http://www.ers.usda.gov/data-products/food-access-research-atlas/go-to-the-atlas.aspx
Food access
Limited access to supermarkets, supercenters, grocery stores, or other sources of healthy and affordable food may make it harder for some Americans to eat a healthy diet. There are many ways to measure food store access for individuals and for neighborhoods, and many ways to define which areas are food deserts (neighborhoods that lack healthy food sources). Most measures and definitions take into account at least some of the following indicators of access:
Accessibility to sources of healthy food, as measured by distance to a store or by the number of stores in an area.
Individual-level resources that may affect accessibility, such as family income or vehicle availability.
Neighborhood-level indicators of resources, such as the average income of the neighborhood and the availability of public transportation.
In the Food Access Research Atlas, several indicators are available to measure food access along these dimensions. For example, users can choose alternative distance markers to measure low access in a neighborhood, such as the number and share of people more than half a mile to a supermarket or 1 mile to a supermarket.
Users can also view other census-tract-level characteristics that provide context on food access in neighborhoods, such as whether the tract has a high percentage of households far from supermarkets and without vehicles, individuals with low income, or people residing in group quarters.
Low-income neighborhoods
The criteria for identifying a census tract as low income are from the Department of Treasury's New Markets Tax Credit (NMTC) program. This program defines a low-income census tract as any tract where:
The tract's poverty rate is 20 percent or greater; or
The tract's median family income is less than or equal to 80 percent of the State-wide median family income; or
The tract is in a metropolitan area and has a median family income less than or equal to 80 percent of the metropolitan area's median family income.
Low-access census tracts
In the Food Access Research Atlas, low access to healthy food is defined as being far from a supermarket, supercenter, or large grocery store ("supermarket" for short). A census tract is considered to have low access if a significant number or share of individuals in the tract is far from a supermarket.
In the original Food Desert Locator, low access was measured as living far from a supermarket, where 1 mile was used in urban areas and 10 miles was used in rural areas to demarcate those who are far from a supermarket. In urban areas, about 70 percent of the population was within 1 mile of a supermarket, while in rural areas over 90 percent of the population was within 10 miles (see Access to Affordable and Nutritious Food: Updated Estimates of Distance to Supermarkets Using 2010 Data).
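The three NMTC low-income criteria listed above combine with a logical "or". The following is a minimal sketch of that rule; the function and parameter names are hypothetical, and the dollar amounts in the usage lines are invented for illustration.

```python
def is_low_income_tract(poverty_rate_pct, tract_mfi, state_mfi, metro_mfi=None):
    """Apply the three NMTC criteria; any one of them is sufficient.

    poverty_rate_pct: tract poverty rate in percent.
    tract_mfi, state_mfi: median family income of the tract and the state.
    metro_mfi: metropolitan-area median family income, or None when the
        tract is not in a metropolitan area.
    """
    if poverty_rate_pct >= 20.0:
        return True
    if tract_mfi <= 0.8 * state_mfi:
        return True
    if metro_mfi is not None and tract_mfi <= 0.8 * metro_mfi:
        return True
    return False


# Hypothetical tracts for illustration only.
print(is_low_income_tract(25.0, 90_000, 80_000))                    # True (poverty rate)
print(is_low_income_tract(10.0, 70_000, 80_000))                    # False
print(is_low_income_tract(10.0, 70_000, 80_000, metro_mfi=90_000))  # True (metro income)
```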
Updating the original 1- and 10-mile low-access measure shows that an estimated 18.3 million people in these low-income and low-access census tracts were far from a supermarket in 2010.
Three additional measures of food access based on distance to a supermarket are provided in the Atlas:
One additional measure applies a 0.5-mile demarcation in urban areas and a 10-mile distance in rural areas. Using this measure, an estimated 52.5 million people, or 17 percent of the U.S. population, have low access to a supermarket;
A second measure applies a 1.0-mile demarcation in urban areas and a 20-mile distance in rural areas. Under this measure, an estimated 16.5 million people, or 5.3 percent of the U.S. population, have low access to a supermarket; and
A slightly more complex measure incorporates vehicle access directly into the measure, delineating low-income tracts in which a significant number of households are located far from a supermarket and do not have access to a vehicle. This measure also includes census tracts with populations that are so remote that, even with a vehicle, driving to a supermarket may be considered a burden due to the great distance. Using this measure, an estimated 2.1 million households, or 1.8 percent of all households, in low-income census tracts are far from a supermarket and do not have a vehicle. An additional 0.3 million people are more than 20 miles from a supermarket.
For each of the first three measures that are based solely on distance, a tract is designated as low access if the aggregate number of people in the census tract with low access is at least 500 or the percentage of people in the census tract with low access is at least 33 percent.
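The designation rule for the distance-only measures (at least 500 people, or at least 33 percent of the tract population, with low access) can be sketched as below. The function and parameter names are hypothetical, and the tract figures in the usage lines are invented for illustration.

```python
def is_low_access_tract(low_access_pop, total_pop):
    """Flag a tract as low access under a distance-only measure.

    low_access_pop: number of people beyond the distance threshold
        (for example, 1 mile in urban tracts or 10 miles in rural tracts).
    total_pop: total tract population.
    """
    if total_pop <= 0:
        return False
    share_pct = 100.0 * low_access_pop / total_pop
    return low_access_pop >= 500 or share_pct >= 33.0


# Hypothetical tracts for illustration only.
print(is_low_access_tract(600, 5_000))  # True: at least 500 people
print(is_low_access_tract(300, 600))    # True: 50 percent of the tract
print(is_low_access_tract(300, 4_000))  # False: 7.5 percent and under 500
```

The count criterion matters for populous tracts where 500 people are a small share, and the share criterion matters for small tracts that could never reach 500.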
For the final measure using vehicle availability, a tract is designated as having low vehicle access if at least one of the following is true:
at least 100 households are more than ½ mile from the nearest supermarket and have no access to a vehicle; or
at least 500 people or 33 percent of the population live more than 20 miles from the nearest supermarket, regardless of vehicle access.
Methods used to assess distance to the nearest supermarket are the same for each of these measures. First, the entire country is divided into ½-km square grids, and data on the population are aerially allocated to these grids (see Access to Affordable and Nutritious Food: Updated Estimates of Distance to Supermarkets Using 2010 Data). Then, distance to the nearest supermarket is measured for each grid cell by calculating the distance between the geographic center of the ½-km square grid that contains estimates of the population (number of people and other subgroup characteristics) and the center of the grid with the nearest supermarket.
Once the distance to the nearest supermarket is calculated for each grid cell, the estimated number of people or housing units that are more than 1 mile from a supermarket in urban tracts, or 10 miles in rural census tracts, is aggregated at the census-tract level (and similarly for the alternative distance markers). A census tract is considered rural if the population-weighted centroid of that tract is located in an area with a population of less than 2,500; all other tracts are considered urban tracts.
Food deserts
The Food Access Research Atlas maps census tracts that are both low income (li) and low access (la), as measured by the different distance demarcations.
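The vehicle-based designation and the combination of the low-income (li) and low-access (la) flags described above can be sketched as follows. The function and parameter names are hypothetical, the figures in the usage lines are invented, and the household threshold follows the "at least 100 households" wording of the rule stated here.

```python
def has_low_vehicle_access(households_no_vehicle_far, pop_beyond_20_mi, total_pop):
    """Flag a tract under the vehicle-availability measure.

    households_no_vehicle_far: households without a vehicle that are more
        than 1/2 mile from the nearest supermarket.
    pop_beyond_20_mi: people living more than 20 miles from the nearest
        supermarket, regardless of vehicle access.
    """
    far_share_pct = 100.0 * pop_beyond_20_mi / total_pop if total_pop > 0 else 0.0
    return (
        households_no_vehicle_far >= 100
        or pop_beyond_20_mi >= 500
        or far_share_pct >= 33.0
    )


def is_mapped_food_desert(low_income, low_access):
    """A tract is mapped only when it is both low income and low access."""
    return low_income and low_access


# Hypothetical tracts for illustration only.
print(has_low_vehicle_access(120, 0, 4_000))  # True: 120 households qualify
print(has_low_vehicle_access(40, 100, 200))   # True: 50 percent live beyond 20 miles
print(has_low_vehicle_access(40, 100, 4_000)) # False
print(is_mapped_food_desert(True, False))     # False
```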
This tool provides researchers and other users multiple ways to understand the characteristics that can contribute to food deserts, including income level, distance to supermarkets, and vehicle access.
Additional tract-level indicators of access
Vehicle availability
A tract is identified as having low vehicle availability if more than 100 households in the tract report having no vehicle available and are more than 0.5 miles from the nearest supermarket. This corresponds closely to the 80th percentile of the distribution of the number of housing units in a census tract without vehicles at least 0.5 miles from a supermarket (the 80th percentile value was 106 housing units). This means that about 20 percent of all census tracts had more than 100 housing units that were 0.5 miles from a supermarket and without a vehicle. This indicator was applied to both urban and rural census tracts.
Overall, 8.8 percent of all housing units in the United States do not have a vehicle, and 4.2 percent of all housing units are at least 0.5 mile from a store and without a vehicle. Vehicle availability is defined in the American Community Survey as the number of passenger cars, vans, or trucks with a capacity of 1 ton or less kept at the home and available for use by household members. The number of available vehicles includes those vehicles leased or rented for at least 1 month, as well as company, police, or government vehicles that are kept at home and available for non-business use.
Whether a vehicle is available to a household for private use is an important additional indicator of access to healthy and affordable food. For households living far from a supermarket or large grocery store, access to a private vehicle may make accessing these retailers easier than relying on public or alternative means of transportation.
Group quarters population
Users may be interested in highlighting tracts with large shares of people living in group quarters.
Group quarters are residential arrangements where an entity or organization owns and provides housing (and often services) for individuals residing in these buildings. This includes college dormitories, military quarters, correctional facilities, homeless shelters, residential treatment centers, and assisted living or skilled nursing facilities. These living arrangements frequently provide dining and food retail solely for their residents. While individuals living in these areas may appear to be far from a supermarket or grocery store, they may not truly experience difficulty accessing healthy and affordable food. Tracts in which 67 percent of individuals or more live in group quarters are highlighted.
General tract characteristics
Population, tract total
Geographic level: census tract
Year of data: 2010
Definition: Total number of individuals residing in a tract.
Data sources: Data are from the 2012 report, Access to Affordable and Nutritious Food: Updated Estimates of Distances to Supermarkets Using 2010 Data. Population data are reported at the block level from the 2010 Census of Population and Housing. These data were aerially allocated down to ½-kilometer-square grids across the United States.
Low-income tract
Geographic level: census tract
Year of data: 2010
Definition: A tract with either a poverty rate of 20
This shapefile represents habitat suitability categories (High, Moderate, Low, and Non-Habitat) derived from a composite, continuous surface of sage-grouse habitat suitability index (HSI) values for Nevada and northeastern California during spring, which is a surrogate for habitat conditions during the sage-grouse breeding and nesting period. Summary of steps to create Habitat Categories:
HABITAT SUITABILITY INDEX: The HSI was derived from a generalized linear mixed model (specified by binomial distribution) that contrasted data from multiple environmental factors at used sites (telemetry locations) and available sites (random locations). Predictor variables for the model represented vegetation communities at multiple spatial scales, water resources, habitat configuration, urbanization, roads, elevation, ruggedness, and slope. Vegetation data was derived from various mapping products, which included NV SynthMap (Peterson 2008), SageStitch (Comer et al. 2002), LANDFIRE (LANDFIRE 2010), and the CA Fire and Resource Assessment Program (CFRAP 2006). The analysis was updated to include high resolution percent cover within 30 x 30 m pixels for sagebrush, non-sagebrush, herbaceous vegetation, and bare ground (C. Homer, unpublished; based on the methods of Homer et al. 2014, Xian et al. 2015) and conifer (primarily pinyon-juniper, P. Coates, unpublished). The pool of telemetry data included the same data from 1998 - 2013 used by Coates et al. (2014); additional telemetry location data from field sites in 2014 were added to the dataset. The dataset was then split according to calendar date into three seasons (spring, summer, winter). Spring included telemetry locations (n = 14,058) from mid-March to June, and is a surrogate for habitat conditions during the sage-grouse breeding and nesting period. All age and sex classes of marked grouse were used in the analysis.
Sufficient data (i.e., a minimum of 100 locations from at least 20 marked Sage-grouse) for modeling existed in 10 subregions for spring and summer, and seven subregions in winter, using all age and sex classes of marked grouse. It is important to note that although this map is composed of HSI values derived from the seasonal data, it does not explicitly represent habitat suitability for reproductive females (i.e., nesting). Insufficient data were available to allow for estimation of this habitat type for all seasons throughout the study area extent. A Resource Selection Function (RSF) was calculated for each subregion using generalized linear models to derive model-averaged parameter estimates for each covariate across a set of additive models. Subregional RSFs were transformed into Habitat Suitability Indices, and averaged together to produce an overall statewide HSI whereby a relative probability of occurrence was calculated for each raster cell during the spring season. In order to account for discrepancies in HSI values caused by varying ecoregions within Nevada, the HSI was divided into north and south extents using a slightly modified flood region boundary (Mason 1999) that was designed to represent respective mesic and xeric regions of the state. North and south HSI rasters were each relativized according to their maximum value to rescale between zero and one, then mosaicked once more into a state-wide extent.
HABITAT CATEGORIZATION: Using the same ecoregion boundaries described above, the habitat classification dataset (an independent data set comprising 10% of the total telemetry location sample) was split into locations falling within respective north and south regions. HSI values from the composite and relativized statewide HSI surface were then extracted to each classification dataset location within the north and south region.
The distribution of these values was used to identify class break values corresponding to 0.5 (high), 1.0 (moderate), and 1.5 (low) standard deviations (SD) from the mean HSI. These class breaks were used to classify the HSI surface into four discrete categories of habitat suitability: High, Moderate, Low, and Non-Habitat. In terms of percentiles, High habitat comprised greater than 30.9% of the HSI values, Moderate comprised 15 – 30.9%, Low comprised 6.7 – 15%, and Non-Habitat comprised less than 6.7%. The classified north and south regions were then clipped by the boundary layer and mosaicked to create a statewide categorical surface for habitat selection. Each habitat suitability category was converted to a vector output where gaps within polygons less than 1.2 million square meters were eliminated, polygons within 500 meters of each other were connected to create corridors, and polygons less than 1.2 million square meters in one category were incorporated into the adjacent category. The final step was to mask major roads that were buffered by 50 m (Census, 2014), lakes (Peterson, 2008) and urban areas, and place those masked areas into the non-habitat category. The existing urban layer (Census 2010) was not sufficient for our needs because it excluded towns with a population lower than 1,500. Hence, we masked smaller towns (populations of 100 to 1,500) and development with Census Block polygons (Census 2015) that had at least 50% urban development within their boundaries when viewed with reference imagery (ArcGIS World Imagery Service Layer).
REFERENCES:
California Forest and Resource Assessment Program (CFRAP). 2006. Statewide Land Use / Land Cover Mosaic. [Geospatial data.] California Department of Forestry and Fire Protection, http://frap.cdf.ca.gov/data/frapgisdata-sw-rangeland-assessment_data.php
Census 2010. TIGER/Line Shapefiles. Urban Areas [Geospatial data.] U.S. Census Bureau, Washington D.C., https://www.census.gov/geo/maps-data/data/tiger-line.html
Census 2014. TIGER/Line Shapefiles. Roads [Geospatial data.] U.S. Census Bureau, Washington D.C., https://www.census.gov/geo/maps-data/data/tiger-line.html
Census 2015. TIGER/Line Shapefiles. Blocks [Geospatial data.] U.S. Census Bureau, Washington D.C., https://www.census.gov/geo/maps-data/data/tiger-line.html
Coates, P.S., Casazza, M.L., Brussee, B.E., Ricca, M.A., Gustafson, K.B., Overton, C.T., Sanchez-Chopitea, E., Kroger, T., Mauch, K., Niell, L., Howe, K., Gardner, S., Espinosa, S., and Delehanty, D.J. 2014. Spatially explicit modeling of greater sage-grouse (Centrocercus urophasianus) habitat in Nevada and northeastern California: A decision-support tool for management. U.S. Geological Survey Open-File Report 2014-1163, 83 p., http://dx.doi.org/10.3133/ofr20141163. ISSN 2331-1258 (online)
Comer, P., Kagen, J., Heiner, M., and Tobalske, C. 2002. Current distribution of sagebrush and associated vegetation in the western United States (excluding NM). [Geospatial data.] Interagency Sagebrush Working Group, http://sagemap.wr.usgs.gov
Homer, C.G., Aldridge, C.L., Meyer, D.K., and Schell, S.J. 2014. Multi-Scale Remote Sensing Sagebrush Characterization with Regression Trees over Wyoming, USA; Laying a Foundation for Monitoring. International Journal of Applied Earth Observation and Geoinformation 14, Elsevier, US.
LANDFIRE. 2010. 1.2.0 Existing Vegetation Type Layer. [Geospatial data.] U.S. Department of the Interior, Geological Survey, http://landfire.cr.usgs.gov/viewer/
Mason, R.R. 1999. The National Flood-Frequency Program: Methods For Estimating Flood Magnitude And Frequency In Rural Areas In Nevada. U.S. Geological Survey Fact Sheet 123-98, September 1999. Prepared by Robert R. Mason, Jr. and Kernell G. Ries III, of the U.S. Geological Survey; and Jeffrey N. King and Wilbert O. Thomas, Jr., of Michael Baker, Jr., Inc. http://pubs.usgs.gov/fs/fs-123-98/
Peterson, E. B. 2008. A Synthesis of Vegetation Maps for Nevada (Initiating a 'Living' Vegetation Map). Documentation and geospatial data, Nevada Natural Heritage Program, Carson City, Nevada, http://www.heritage.nv.gov/gis
Xian, G., Homer, C., Rigge, M., Shi, H., and Meyer, D. 2015. Characterization of shrubland ecosystem components as continuous fields in the northwest United States. Remote Sensing of Environment 168:286-300.
NOTE: This file does not include habitat areas for the Bi-State management area and the spatial extent is modified in comparison to Coates et al. 2014.
Updates are delayed due to technical difficulties. How many people are staying at home? How far are people traveling when they don’t stay home? Which states and counties have more people taking trips? The Bureau of Transportation Statistics (BTS) now provides answers to those questions through our new mobility statistics. The Trips by Distance data and number of people staying home and not staying home are estimated for the Bureau of Transportation Statistics by the Maryland Transportation Institute and Center for Advanced Transportation Technology Laboratory at the University of Maryland. The travel statistics are produced from an anonymized national panel of mobile device data from multiple sources. All data sources used in the creation of the metrics contain no personal information. Data analysis is conducted at the aggregate national, state, and county levels. A weighting procedure expands the sample of millions of mobile devices, so the results are representative of the entire population in a nation, state, or county. To assure confidentiality and support data quality, no data are reported for a county if it has fewer than 50 devices in the sample on any given day. Trips are defined as movements that include a stay of longer than 10 minutes at an anonymized location away from home. Home locations are imputed on a weekly basis. A movement with multiple stays of longer than 10 minutes before returning home is counted as multiple trips. Trips capture travel by all modes of transportation, including driving, rail, transit, and air. The daily travel estimates are from a mobile device data panel from merged multiple data sources that address the geographic and temporal sample variation issues often observed in a single data source. The merged data panel only includes mobile devices whose anonymized location data meet a set of data quality standards, which further ensures the overall data quality and consistency. 
The data quality standards consider both temporal frequency and spatial accuracy of anonymized location point observations, temporal coverage and representativeness at the device level, spatial representativeness at the sample and county level, etc. A multi-level weighting method that employs both device and trip-level weights expands the sample to the underlying population at the county and state levels, before travel statistics are computed. These data are experimental and may not meet all of our quality standards. Experimental data products are created using new data sources or methodologies that benefit data users in the absence of other relevant products. We are seeking feedback from data users and stakeholders on the quality and usefulness of these new products. Experimental data products that meet our quality standards and demonstrate sufficient user demand may enter regular production if resources permit.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of the United States by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for the United States. The dataset can be used to understand the population distribution of the United States by gender and age. For example, using this dataset, we can identify the largest age group for both men and women in the United States. Additionally, it can be used to see how the gender ratio changes from birth to the oldest age group, and to compare the male-to-female ratio across age groups.
Key observations
Largest age group (population): Male # 25-29 years (11.57 million) | Female # 30-34 years (11.18 million). Source: U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates.
Age groups:
Scope of gender:
Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for United States Population by Gender.
This shapefile represents habitat suitability categories (High, Moderate, Low, and Non-Habitat) derived from a composite, continuous surface of sage-grouse habitat suitability index (HSI) values for Nevada and northeastern California formed from the multiplicative product of the spring, summer, and winter HSI surfaces. Summary of steps to create Habitat Categories: HABITAT SUITABILITY INDEX: The HSI was derived from a generalized linear mixed model (specified by binomial distribution and created using ArcGIS 10.2.2) that contrasted data from multiple environmental factors at used sites (telemetry locations) and available sites (random locations). Predictor variables for the model represented vegetation communities at multiple spatial scales, water resources, habitat configuration, urbanization, roads, elevation, ruggedness, and slope. Vegetation data were derived from various mapping products, which included NV SynthMap (Peterson 2008), SageStitch (Comer et al. 2002), LANDFIRE (LANDFIRE 2010), and the CA Fire and Resource Assessment Program (CFRAP 2006). The analysis was updated to include high resolution percent cover within 30 x 30 m pixels for sagebrush, non-sagebrush, herbaceous vegetation, and bare ground (C. Homer, unpublished; based on the methods of Homer et al. 2014, Xian et al. 2015) and conifer (primarily pinyon-juniper, P. Coates, unpublished). The pool of telemetry data included the same data from 1998 - 2013 used by Coates et al. (2014) as well as additional telemetry location data from field sites in 2014. The dataset was then split according to calendar date into three seasons. Spring included telemetry locations (n = 14,058) from mid-March to June; summer included locations (n = 11,743) from July to mid-October; winter included locations (n = 4,862) from November to March. All age and sex classes of marked grouse were used in the analysis. 
Sufficient data (i.e., a minimum of 100 locations from at least 20 marked Sage-grouse) for modeling existed in 10 subregions for spring and summer, and seven subregions in winter, using all age and sex classes of marked grouse. It is important to note that although this map is composed of HSI values derived from the seasonal data, it does not explicitly represent habitat suitability for reproductive females (i.e., nesting and with broods). Insufficient data were available to allow for estimation of this habitat type for all seasons throughout the study area extent. A Resource Selection Function (RSF) was calculated for each subregion using R software (v 3.13) and season using generalized linear models to derive model-averaged parameter estimates for each covariate across a set of additive models. For each season, subregional RSFs were transformed into Habitat Suitability Indices, and averaged together to produce an overall statewide HSI whereby a relative probability of occurrence was calculated for each raster cell. The three seasonal HSI rasters were then multiplied to create a composite HSI. In order to account for discrepancies in HSI values caused by varying ecoregions within Nevada, the HSI was divided into north and south extents using a slightly modified flood region boundary (Mason 1999) that was designed to represent respective mesic and xeric regions of the state. North and south HSI rasters were each relativized according to their maximum value to rescale between zero and one, then mosaicked once more into a state-wide extent. HABITAT CATEGORIZATION: Using the same ecoregion boundaries described above, the habitat classification dataset (an independent data set comprising 10% of the total telemetry location sample) was split into locations falling within respective north and south regions. HSI values from the composite and relativized statewide HSI surface were then extracted to each classification dataset location within the north and south region. 
The distribution of these values was used to identify class break values corresponding to 0.5 (high), 1.0 (moderate), and 1.5 (low) standard deviations (SD) from the mean HSI. These class breaks were used to classify the HSI surface into four discrete categories of habitat suitability: High, Moderate, Low, and Non-Habitat. In terms of percentiles, High habitat comprised greater than 30.9% of the HSI values, Moderate comprised 15 – 30.9%, Low comprised 6.7 – 15%, and Non-Habitat comprised less than 6.7%. The classified north and south regions were then clipped by the boundary layer and mosaicked to create a statewide categorical surface for habitat selection. Each habitat suitability category was converted to a vector output where gaps within polygons less than 1.2 million square meters were eliminated, polygons within 500 meters of each other were connected to create corridors, and polygons less than 1.2 million square meters in one category were incorporated into the adjacent category. The final step was to mask major roads that were buffered by 50 m (Census 2014), lakes (Peterson 2008), and urban areas, and place those masked areas into the non-habitat category. The existing urban layer (Census 2010) was not sufficient for our needs because it excluded towns with a population lower than 1,500. Hence, we masked smaller towns (populations of 100 to 1,500) and development with Census Block polygons (Census 2015) that had at least 50% urban development within their boundaries when viewed with reference imagery (ArcGIS World Imagery Service Layer). REFERENCES: California Forest and Resource Assessment Program (CFRAP). 2006. Statewide Land Use / Land Cover Mosaic. [Geospatial data.] California Department of Forestry and Fire Protection, http://frap.cdf.ca.gov/data/frapgisdata-sw-rangeland-assessment_data.php Census 2010. TIGER/Line Shapefiles. Urban Areas [Geospatial data.] U.S. 
Census Bureau, Washington D.C., https://www.census.gov/geo/maps-data/data/tiger-line.html Census 2014. TIGER/Line Shapefiles. Roads [Geospatial data.] U.S. Census Bureau, Washington D.C., https://www.census.gov/geo/maps-data/data/tiger-line.html Census 2015. TIGER/Line Shapefiles. Blocks [Geospatial data.] U.S. Census Bureau, Washington D.C., https://www.census.gov/geo/maps-data/data/tiger-line.html Coates, P.S., Casazza, M.L., Brussee, B.E., Ricca, M.A., Gustafson, K.B., Overton, C.T., Sanchez-Chopitea, E., Kroger, T., Mauch, K., Niell, L., Howe, K., Gardner, S., Espinosa, S., and Delehanty, D.J. 2014, Spatially explicit modeling of greater sage-grouse (Centrocercus urophasianus) habitat in Nevada and northeastern California—A decision-support tool for management: U.S. Geological Survey Open-File Report 2014-1163, 83 p., http://dx.doi.org/10.3133/ofr20141163. ISSN 2331-1258 (online) Comer, P., Kagen, J., Heiner, M., and Tobalske, C. 2002. Current distribution of sagebrush and associated vegetation in the western United States (excluding NM). [Geospatial data.] Interagency Sagebrush Working Group, http://sagemap.wr.usgs.gov Homer, C.G., Aldridge, C.L., Meyer, D.K., and Schell, S.J. 2014. Multi-Scale Remote Sensing Sagebrush Characterization with Regression Trees over Wyoming, USA; Laying a Foundation for Monitoring. International Journal of Applied Earth Observation and Geoinformation 14, Elsevier, US. LANDFIRE. 2010. 1.2.0 Existing Vegetation Type Layer. [Geospatial data.] U.S. Department of the Interior, Geological Survey, http://landfire.cr.usgs.gov/viewer/ Mason, R.R. 1999. The National Flood-Frequency Program—Methods For Estimating Flood Magnitude And Frequency In Rural Areas In Nevada U.S. Geological Survey Fact Sheet 123-98 September, 1999, Prepared by Robert R. Mason, Jr. and Kernell G. Ries III, of the U.S. Geological Survey; and Jeffrey N. King and Wilbert O. Thomas, Jr., of Michael Baker, Jr., Inc. http://pubs.usgs.gov/fs/fs-123-98/ Peterson, E. B. 2008. 
A Synthesis of Vegetation Maps for Nevada (Initiating a 'Living' Vegetation Map). Documentation and geospatial data, Nevada Natural Heritage Program, Carson City, Nevada, http://www.heritage.nv.gov/gis Xian, G., Homer, C., Rigge, M., Shi, H., and Meyer, D. 2015. Characterization of shrubland ecosystem components as continuous fields in the northwest United States. Remote Sensing of Environment 168:286-300. NOTE: This file does not include habitat areas for the Bi-State management area and the spatial extent is modified in comparison to Coates et al. 2014
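The composite and rescaling steps described above can be illustrated with a small sketch. It is not the authors' ArcGIS/R workflow: the grids are hypothetical 2 x 2 examples, the function names `composite_hsi` and `relativize` are invented for illustration, and a single region is rescaled here, whereas the source rescales the north and south extents separately before mosaicking.

```python
def composite_hsi(spring, summer, winter):
    """Cell-by-cell product of three seasonal HSI grids (lists of rows)."""
    return [
        [sp * su * wi for sp, su, wi in zip(row_sp, row_su, row_wi)]
        for row_sp, row_su, row_wi in zip(spring, summer, winter)
    ]


def relativize(grid):
    """Rescale a grid to the range 0..1 by dividing by its maximum value."""
    peak = max(max(row) for row in grid)
    if peak == 0:
        # Nothing to rescale; return a copy unchanged.
        return [row[:] for row in grid]
    return [[value / peak for value in row] for row in grid]


# Hypothetical seasonal HSI values, for illustration only.
spring = [[0.8, 0.5], [0.2, 0.0]]
summer = [[0.5, 0.5], [1.0, 0.9]]
winter = [[1.0, 0.4], [0.5, 0.7]]

composite = composite_hsi(spring, summer, winter)
rescaled = relativize(composite)
```

Because the composite is a product, a cell scoring zero in any one season scores zero overall, which is why the bottom-right cell stays at 0 after rescaling.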
This shapefile represents habitat suitability categories (High, Moderate, Low, and Non-Habitat) derived from a composite, continuous surface of sage-grouse habitat suitability index (HSI) values for Nevada and northeastern California during the winter season, and is a surrogate for habitat conditions during periods of cold and snow. Summary of steps to create Habitat Categories: HABITAT SUITABILITY INDEX: The HSI was derived from a generalized linear mixed model (specified by binomial distribution and created using ArcGIS 10.2.2) that contrasted data from multiple environmental factors at used sites (telemetry locations) and available sites (random locations). Predictor variables for the model represented vegetation communities at multiple spatial scales, water resources, habitat configuration, urbanization, roads, elevation, ruggedness, and slope. Vegetation data were derived from various mapping products, which included NV SynthMap (Peterson 2008), SageStitch (Comer et al. 2002), LANDFIRE (LANDFIRE 2010), and the CA Fire and Resource Assessment Program (CFRAP 2006). The analysis was updated to include high resolution percent cover within 30 x 30 m pixels for sagebrush, non-sagebrush, herbaceous vegetation, and bare ground (C. Homer, unpublished; based on the methods of Homer et al. 2014, Xian et al. 2015) and conifer (primarily pinyon-juniper, P. Coates, unpublished). The pool of telemetry data included the same data from 1998 - 2013 used by Coates et al. (2014); additional telemetry location data from field sites in 2014 were added to the dataset. The dataset was then split according to calendar date into three seasons (spring, summer, winter). Winter included telemetry locations (n = 4,862) from November to March. All age and sex classes of marked grouse were used in the analysis. 
Sufficient data (i.e., a minimum of 100 locations from at least 20 marked Sage-grouse) for modeling existed in 10 subregions for spring and summer, and seven subregions in winter, using all age and sex classes of marked grouse. It is important to note that although this map is composed of HSI values derived from the seasonal data, it does not explicitly represent habitat suitability for reproductive females (i.e., nesting and with broods). Insufficient data were available to allow for estimation of this habitat type for all seasons throughout the study area extent. A Resource Selection Function (RSF) was calculated for each subregion using R software (v 3.13) and using generalized linear models to derive model-averaged parameter estimates for each covariate across a set of additive models. Subregional RSFs were transformed into Habitat Suitability Indices, and averaged together to produce an overall statewide HSI whereby a relative probability of occurrence was calculated for each raster cell during the spring season. In order to account for discrepancies in HSI values caused by varying ecoregions within Nevada, the HSI was divided into north and south extents using a slightly modified flood region boundary (Mason 1999) that was designed to represent respective mesic and xeric regions of the state. North and south HSI rasters were each relativized according to their maximum value to rescale between zero and one, then mosaicked once more into a state-wide extent. HABITAT CATEGORIZATION: Using the same ecoregion boundaries described above, the habitat classification dataset (an independent data set comprising 10% of the total telemetry location sample) was split into locations falling within respective north and south regions. HSI values from the composite and relativized statewide HSI surface were then extracted to each classification dataset location within the north and south region. 
The distribution of these values was used to identify class break values corresponding to 0.5 (high), 1.0 (moderate), and 1.5 (low) standard deviations (SD) from the mean HSI. These class breaks were used to classify the HSI surface into four discrete categories of habitat suitability: High, Moderate, Low, and Non-Habitat. In terms of percentiles, High habitat comprised greater than 30.9% of the HSI values, Moderate comprised 15 – 30.9%, Low comprised 6.7 – 15%, and Non-Habitat comprised less than 6.7%. The classified north and south regions were then clipped by the boundary layer and mosaicked to create a statewide categorical surface for habitat selection. Each habitat suitability category was converted to a vector output where gaps within polygons less than 1.2 million square meters were eliminated, polygons within 500 meters of each other were connected to create corridors, and polygons less than 1.2 million square meters in one category were incorporated into the adjacent category. The final step was to mask major roads that were buffered by 50 m (Census 2014), lakes (Peterson 2008), and urban areas, and place those masked areas into the non-habitat category. The existing urban layer (Census 2010) was not sufficient for our needs because it excluded towns with a population lower than 1,500. Hence, we masked smaller towns (populations of 100 to 1,500) and development with Census Block polygons (Census 2015) that had at least 50% urban development within their boundaries when viewed with reference imagery (ArcGIS World Imagery Service Layer). REFERENCES: California Forest and Resource Assessment Program (CFRAP). 2006. Statewide Land Use / Land Cover Mosaic. [Geospatial data.] California Department of Forestry and Fire Protection, http://frap.cdf.ca.gov/data/frapgisdata-sw-rangeland-assessment_data.php Census 2010. TIGER/Line Shapefiles. Urban Areas [Geospatial data.] U.S. 
Census Bureau, Washington D.C., https://www.census.gov/geo/maps-data/data/tiger-line.html Census 2014. TIGER/Line Shapefiles. Roads [Geospatial data.] U.S. Census Bureau, Washington D.C., https://www.census.gov/geo/maps-data/data/tiger-line.html Census 2015. TIGER/Line Shapefiles. Blocks [Geospatial data.] U.S. Census Bureau, Washington D.C., https://www.census.gov/geo/maps-data/data/tiger-line.html Coates, P.S., Casazza, M.L., Brussee, B.E., Ricca, M.A., Gustafson, K.B., Overton, C.T., Sanchez-Chopitea, E., Kroger, T., Mauch, K., Niell, L., Howe, K., Gardner, S., Espinosa, S., and Delehanty, D.J. 2014, Spatially explicit modeling of greater sage-grouse (Centrocercus urophasianus) habitat in Nevada and northeastern California—A decision-support tool for management: U.S. Geological Survey Open-File Report 2014-1163, 83 p., http://dx.doi.org/10.3133/ofr20141163. ISSN 2331-1258 (online) Comer, P., Kagen, J., Heiner, M., and Tobalske, C. 2002. Current distribution of sagebrush and associated vegetation in the western United States (excluding NM). [Geospatial data.] Interagency Sagebrush Working Group, http://sagemap.wr.usgs.gov Homer, C.G., Aldridge, C.L., Meyer, D.K., and Schell, S.J. 2014. Multi-Scale Remote Sensing Sagebrush Characterization with Regression Trees over Wyoming, USA; Laying a Foundation for Monitoring. International Journal of Applied Earth Observation and Geoinformation 14, Elsevier, US. LANDFIRE. 2010. 1.2.0 Existing Vegetation Type Layer. [Geospatial data.] U.S. Department of the Interior, Geological Survey, http://landfire.cr.usgs.gov/viewer/ Mason, R.R. 1999. The National Flood-Frequency Program—Methods For Estimating Flood Magnitude And Frequency In Rural Areas In Nevada U.S. Geological Survey Fact Sheet 123-98 September, 1999, Prepared by Robert R. Mason, Jr. and Kernell G. Ries III, of the U.S. Geological Survey; and Jeffrey N. King and Wilbert O. Thomas, Jr., of Michael Baker, Jr., Inc. http://pubs.usgs.gov/fs/fs-123-98/ Peterson, E. B. 2008. 
A Synthesis of Vegetation Maps for Nevada (Initiating a 'Living' Vegetation Map). Documentation and geospatial data, Nevada Natural Heritage Program, Carson City, Nevada, http://www.heritage.nv.gov/gis Xian, G., Homer, C., Rigge, M., Shi, H., and Meyer, D. 2015. Characterization of shrubland ecosystem components as continuous fields in the northwest United States. Remote Sensing of Environment 168:286-300. NOTE: This file does not include habitat areas for the Bi-State management area and the spatial extent is modified in comparison to Coates et al. 2014
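The habitat categorization step can be sketched as follows. This is an illustration under stated assumptions, not the published procedure: the breaks are taken to lie 0.5, 1.0, and 1.5 SD below the mean (an interpretation that is consistent with the quoted percentiles under a normal distribution), the population standard deviation is used because the source does not say which estimator was applied, and the sample values and function names are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def class_breaks(values):
    """Break values at 0.5, 1.0, and 1.5 SD below the mean (assumed direction)."""
    m = mean(values)
    sd = pstdev(values)  # assumption: population SD
    return {"high": m - 0.5 * sd, "moderate": m - 1.0 * sd, "low": m - 1.5 * sd}


def classify(hsi, breaks):
    """Assign one of the four habitat categories to a single HSI value."""
    if hsi >= breaks["high"]:
        return "High"
    if hsi >= breaks["moderate"]:
        return "Moderate"
    if hsi >= breaks["low"]:
        return "Low"
    return "Non-Habitat"


# Hypothetical HSI values extracted at classification locations.
sample = [0.2, 0.4, 0.6, 0.8]
breaks = class_breaks(sample)
labels = [classify(v, breaks) for v in (0.5, 0.3, 0.2, 0.1)]

# Percentile of each break under a standard normal distribution.
percentiles = [round(100 * NormalDist().cdf(-k), 1) for k in (0.5, 1.0, 1.5)]
```

Under a normal distribution the three breaks fall at roughly the 30.9th, 15.9th, and 6.7th percentiles, which is close to the 30.9%, 15%, and 6.7% figures quoted in the description.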
https://www.usa.gov/government-works
This dataset is sourced from the U.S. Department of Transportation Bureau of Transportation Statistics. All data and metadata are sourced from the page linked below. Metadata is not updated automatically; data updates weekly.
Source Data Link: https://data.bts.gov/Research-and-Statistics/Trips-by-Distance/w96p-f2qv
How many people are staying at home? How far are people traveling when they don’t stay home? Which states and counties have more people taking trips? The Bureau of Transportation Statistics (BTS) now provides answers to those questions through our new mobility statistics.
The Trips by Distance data and number of people staying home and not staying home are estimated for the Bureau of Transportation Statistics by the Maryland Transportation Institute and Center for Advanced Transportation Technology Laboratory at the University of Maryland. The travel statistics are produced from an anonymized national panel of mobile device data from multiple sources. All data sources used in the creation of the metrics contain no personal information. Data analysis is conducted at the aggregate national, state, and county levels. A weighting procedure expands the sample of millions of mobile devices, so the results are representative of the entire population in a nation, state, or county. To assure confidentiality and support data quality, no data are reported for a county if it has fewer than 50 devices in the sample on any given day.
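The county-level suppression rule can be pictured with a minimal sketch. The record layout (`county`, `date`, `devices`) and the function name are hypothetical, and the sketch assumes the rule is applied per county per day; the source does not describe how BTS implements it.

```python
def suppress_small_samples(rows, min_devices=50):
    """Keep only county-day rows whose sample has at least `min_devices` devices."""
    return [row for row in rows if row["devices"] >= min_devices]


# Hypothetical county-day records, for illustration only.
rows = [
    {"county": "A", "date": "2020-04-01", "devices": 1200},
    {"county": "B", "date": "2020-04-01", "devices": 49},
    {"county": "C", "date": "2020-04-01", "devices": 50},
]

reported = suppress_small_samples(rows)
reported_counties = [row["county"] for row in reported]
```

County B is dropped because its sample has fewer than 50 devices, while county C is kept because the threshold in the description is "fewer than 50".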
Trips are defined as movements that include a stay of longer than 10 minutes at an anonymized location away from home. Home locations are imputed on a weekly basis. A movement with multiple stays of longer than 10 minutes before returning home is counted as multiple trips. Trips capture travel by all modes of transportation, including driving, rail, transit, and air.
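The trip definition can be illustrated with a short sketch. It follows the wording above literally: each stay of longer than 10 minutes away from home counts as one trip. The input format and the function name are hypothetical, and whether the return leg home is counted separately is not stated in the description, so it is not counted here.

```python
def count_trips(stays, min_stay_minutes=10):
    """Count stays away from home that last longer than `min_stay_minutes`.

    `stays` is a chronological list of (at_home, minutes) tuples.
    """
    return sum(
        1
        for at_home, minutes in stays
        if not at_home and minutes > min_stay_minutes
    )


# Hypothetical day: home, store (25 min), fuel stop (5 min), office (480 min), home.
day = [
    (True, 420),
    (False, 25),
    (False, 5),
    (False, 480),
    (True, 300),
]

trips = count_trips(day)
```

The five-minute fuel stop does not qualify, so this movement yields two trips rather than three.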
The daily travel estimates are from a mobile device data panel from merged multiple data sources that address the geographic and temporal sample variation issues often observed in a single data source. The merged data panel only includes mobile devices whose anonymized location data meet a set of data quality standards, which further ensures the overall data quality and consistency. The data quality standards consider both temporal frequency and spatial accuracy of anonymized location point observations, temporal coverage and representativeness at the device level, spatial representativeness at the sample and county level, etc. A multi-level weighting method that employs both device and trip-level weights expands the sample to the underlying population at the county and state levels, before travel statistics are computed.
These data are experimental and may not meet all of our quality standards. Experimental data products are created using new data sources or methodologies that benefit data users in the absence of other relevant products. We are seeking feedback from data users and stakeholders on the quality and usefulness of these new products. Experimental data products that meet our quality standards and demonstrate sufficient user demand may enter regular production if resources permit.
The Consumer Expenditure Survey (CE) program provides a continuous and comprehensive flow of data on the buying habits of American consumers. These data are used widely in economic research and analysis, and in support of revisions of the Consumer Price Index. To meet the needs of users, the Bureau of Labor Statistics (BLS) produces population estimates for consumer units (CUs) of average expenditures in news releases, reports, issues, and articles in the Monthly Labor Review. Tabulated CE data are also available on the Internet and by facsimile transmission (See Section XV. APPENDIX 4). The microdata are available online at http://www.bls.gov/cex/pumdhome.htm. These microdata files present detailed expenditure and income data for the Diary component of the CE for 2002. They include weekly expenditure (EXPD) and annual income (DTBD) files. The data in EXPD and DTBD files are categorized by a Universal Classification Code (UCC). The advantage of the EXPD and DTBD files is that with the data classified in a standardized format, the user may perform comparative expenditure (income) analysis with relative ease. The FMLD and MEMD files present data on the characteristics and demographics of CUs and CU members. The summary level expenditure and income information on the FMLD files permits the data user to link consumer spending, by general expenditure category, and household characteristics and demographics on one set of files. Estimates of average expenditures in 2002 from the Diary survey, integrated with data from the Interview survey, are published in Consumer Expenditures in 2002. A list of recent publications containing data from the CE appears at the end of this documentation. The microdata files are in the public domain and, with appropriate credit, may be reproduced without permission. A suggested citation is: "U.S. Department of Labor, Bureau of Labor Statistics, Consumer Expenditure Survey, Diary Survey, 2002".
Consumer Units
Sample survey data [ssd]
Samples for the CE are national probability samples of households designed to be representative of the total U.S. civilian population. Eligible population includes all civilian noninstitutional persons. The first step in sampling is the selection of primary sampling units (PSUs), which consist of counties (or parts thereof) or groups of counties. The set of sample PSUs used for the 2002 sample is composed of 105 areas. The design classifies the PSUs into four categories:
• 31 "A" certainty PSUs are Metropolitan Statistical Areas (MSA's) with a population greater than 1.5 million.
• 46 "B" PSUs are medium-sized MSA's.
• 10 "C" PSUs are nonmetropolitan areas that are included in the CPI.
• 18 "D" PSUs are nonmetropolitan areas where only the urban population data will be included in the CPI.
The sampling frame (that is, the list from which housing units were chosen) for the 2002 survey is generated from the 1990 Population Census 100-percent-detail file. The sampling frame is augmented by new construction permits and by techniques used to eliminate recognized deficiencies in census coverage. All Enumeration Districts (ED's) from the Census that fail to meet the criterion for good addresses for new construction, and all ED's in nonpermit-issuing areas are grouped into the area segment frame. To the extent possible, an unclustered sample of units is selected within each PSU. This lack of clustering is desirable because the sample size of the Diary Survey is small relative to other surveys, while the intraclass correlations for expenditure characteristics are relatively large. This suggests that any clustering of the sample units could result in an unacceptable increase in the within-PSU variance and, as a result, the total variance. Each selected sample unit is requested to keep two 1-week diaries of expenditures over consecutive weeks. The earliest possible day for placing a diary with a household is predesignated with each day of the week having an equal chance to be the first of the reference week. The diaries are evenly spaced throughout the year. During the last 6 weeks of the year, however, the Diary Survey sample is supplemented to twice its normal size to increase the reporting of types of expenditures unique to the holidays.
STATE IDENTIFIER Since the CE is not designed to produce state-level estimates, summing the consumer unit weights by state will not yield state population totals. A CU's basic weight reflects its probability of selection among a group of primary sampling units of similar characteristics. For example, sample units in an urban nonmetropolitan area in California may represent similar areas in Wyoming and Nevada. Among other adjustments, CUs are post-stratified nationally by sex-age-race. For example, the weights of consumer units containing a black male, age 16-24 in Alabama, Colorado, or New York, are all adjusted equivalently. Therefore, weighted population state totals will not match population totals calculated from other surveys that are designed to represent state data. To summarize, the CE sample was not designed to produce precise estimates for individual states. Although state-level estimates that are unbiased in a repeated sampling sense can be calculated for various statistical measures, such as means and aggregates, their estimates will generally be subject to large variances. Additionally, a particular state-population estimate from the CE sample may be far from the true state-population estimate.
INTERPRETING THE DATA Several factors should be considered when interpreting the expenditure data. The average expenditure for an item may be considerably lower than the expenditure by those CUs that purchased the item. The less frequently an item is purchased, the greater the difference between the average for all consumer units and the average of those purchasing. (See Section V.B. for ESTIMATION OF TOTAL AND MEAN EXPENDITURES). Also, an individual CU may spend more or less than the average, depending on its particular characteristics. Factors such as income, age of family members, geographic location, taste and personal preference also influence expenditures. Furthermore, even within groups with similar characteristics, the distribution of expenditures varies substantially. Expenditures reported are the direct out-of-pocket expenditures. Indirect expenditures, which may be significant, may be reflected elsewhere. For example, rental contracts often include utilities. Renters with such contracts would record no direct expense for utilities, and therefore, appear to have no utility expenses. Employers or insurance companies frequently pay other costs. CUs with members whose employers pay for all or part of their health insurance or life insurance would have lower direct expenses for these items than those who pay the entire amount themselves. These points should be considered when relating reported averages to individual circumstances.
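The gap between the all-CU average and the average among purchasers can be shown with a small worked example. The figures are hypothetical and the calculation is unweighted for simplicity; published CE estimates apply consumer unit weights, which this sketch leaves out.

```python
def mean_all_and_purchasers(expenditures):
    """Return (mean over all CUs, mean over CUs with a nonzero expenditure)."""
    purchasers = [x for x in expenditures if x > 0]
    mean_all = sum(expenditures) / len(expenditures)
    mean_purchasers = sum(purchasers) / len(purchasers) if purchasers else 0.0
    return mean_all, mean_purchasers


# Hypothetical sample: 10 consumer units, 2 of which spent $50 on the item.
spend = [50.0, 50.0] + [0.0] * 8

mean_all, mean_purchasers = mean_all_and_purchasers(spend)
```

Here the average over all ten consumer units is $10, while the two purchasing units averaged $50. The fewer units that buy an item, the wider this gap becomes.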
Computer Assisted Personal Interview [capi]
The Employment and Unemployment surveys of the National Sample Survey (NSS) are primary sources of data on various indicators of the labour force at national and state levels. These are used for planning, policy formulation, decision support and as input for further statistical exercises by various government organizations, academicians, researchers and scholars. NSS surveys on employment and unemployment with large household sample sizes have been conducted quinquennially from the 27th round (October 1972 - September 1973) onwards. Continuing in this series, the fourth such all-India survey on the situation of employment and unemployment in India was carried out during the period July 1987 - June 1988.
The Working Group set up for planning the entire scheme of the survey, among other things, also examined in detail some of the key results generated from the 38th round data and recommended some streamlining of the 38th round schedule for use in the 43rd round. Further, it felt no need to change the existing conceptual framework. However, some additional items were recommended for inclusion in the schedule to obtain the information needed to generate results showing the effects on participation rates in view of the ILO suggestions. The NSSO Governing Council approved the recommendations of the Working Group and also the schedule of enquiry in its 44th meeting held on 16 January, 1987. In this survey, a nation-wide enquiry was conducted to provide estimates on various characteristics pertaining to employment and unemployment in India and some characteristics associated with them at the national and state levels. Information on various facets of employment and unemployment in India was collected through a schedule of enquiry (schedule 10).
The survey covered the whole of Indian Union excepting i) Ladakh and Kargil districts of Jammu & Kashmir ii) Rural areas of Nagaland
Randomly selected households based on sampling procedure and members of the household
Sample survey data [ssd]
It may be mentioned here that, in order to net more households of the upper income bracket in the sample, significant changes have been made in the sample design in this round (compared to the design of the 38th round).
SAMPLE DESIGN AND SAMPLE SIZE The survey had a two-stage stratified design. The first stage units (f.s.u.'s) are villages in the rural sector and urban blocks in the urban sector. The second stage units are households in both sectors. Sampling frame for f.s.u.'s: The lists of 1981 census villages constituted the sampling frame for the rural sector in most districts. The 1981 census frame could not be used for a few districts, however, because either the 1981 census was not held there, the list of 1981 census villages could not be obtained, or the lists obtained from the census authorities were found to be grossly incomplete. In such cases the 1971 census frame was used. In the urban sector, the Urban Frame Survey (U.F.S.) blocks constituted the sampling frame. STRATIFICATION: States were first divided into agro-economic regions, which are groups of contiguous districts similar with respect to population density and crop pattern. In Gujarat, however, some districts have been split for the purpose of region formation in consideration of the location of dry areas and the distribution of the tribal population in the state. The composition of the regions is given in the Appendix. RURAL SECTOR: In the rural sector, within each region, each district with a 1981 census rural population of less than 1.8 million formed a single stratum. Districts with larger populations were divided into two or more strata, depending on population, by grouping contiguous tehsils similar, as far as possible, in respect of rural population density and crop pattern. (In Gujarat, however, in the case of districts extending over more than one region, even if the rural population was less than 1.8 million, the portion of a district falling in each region constituted a separate stratum. Further, in Assam the old "basic strata", formed on the basis of 1971 census rural population exactly in the above manner but with a cut-off population of 1.5 million, have been retained as the strata for rural sampling.)
URBAN SECTOR: In the urban sector, strata were formed, again within NSS region, on the basis of the population size class of towns. Each city with population 10 lakhs or more is self-representative, as in the earlier rounds. For the purpose of stratification, in towns with 1981 census population 4 lakhs or more, the blocks have been divided into two categories, viz.: one consisting of blocks in areas inhabited by the relatively affluent section of the population and the other consisting of the remaining blocks. The strata within each region were constituted as follows:
Stratum   Population class of town
1         all towns with population less than 50,000
2         -do- 50,000 - 199,999
3         -do- 200,000 - 399,999
4         -do- 400,000 - 999,999 (affluent area)
5         " (other area)
6         a single city with population 1 million and above (affluent area)
7         " (other area)
8         another city with population 1 million and above
Note: There is no region with more than one city with population 1 million and above. The stratum numbers have been retained as above even if in some regions some of the strata are empty.
Allocation for first stage units: The total all-India sample size was allocated to the states/U.T.'s in proportion to the strength of central field staff. Each allocation was then divided between the rural and urban sectors considering the relative size of the rural and urban population. The rural samples were allocated to the rural strata in proportion to rural population. The urban samples were allocated to the urban strata in proportion to urban population, with double weightage given to those strata of towns with population 4 lakhs or more which lie in areas inhabited by the relatively affluent section. All allocations have been adjusted such that the sample size for a stratum was at least a multiple of 4 (preferably a multiple of 8) and the total sample size of a region is a multiple of 8 for the rural and urban sectors separately.
Selection of f.s.u.'s: The sample villages have been selected circular systematically with probability proportional to population in the form of two independent interpenetrating sub-samples (IPNS). The sample blocks have been selected circular systematically with equal probability, also in the form of two IPNS's.
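The selection rule above can be illustrated with a minimal sketch of circular systematic sampling with probability proportional to size. The function name, the use of a fractional sampling interval, and the toy population figures are assumptions made for illustration; the survey documentation does not give the exact computational procedure that was used.

```python
import bisect
import random


def circular_systematic_pps(sizes, n, rng=None):
    """Return indices of n units drawn circular-systematically with
    probability proportional to size (hypothetical helper)."""
    rng = rng or random.Random()
    total = sum(sizes)
    if n <= 0 or total <= 0:
        raise ValueError("need n > 0 and a positive total size")

    # Cumulative upper bounds: unit i covers [bounds[i-1], bounds[i]).
    bounds = []
    running = 0
    for size in sizes:
        running += size
        bounds.append(running)

    interval = total / n            # sampling interval on the size scale
    start = rng.uniform(0, total)   # random start anywhere on the circle

    selected = []
    for k in range(n):
        point = (start + k * interval) % total  # wrap around the list
        idx = min(bisect.bisect_right(bounds, point), len(sizes) - 1)
        selected.append(idx)
    return selected


# Toy example: four villages with illustrative populations.
villages = [1200, 800, 3000, 5000]
print(circular_systematic_pps(villages, 2, random.Random(1)))
```

Passing equal sizes (for example a list of ones) gives the equal-probability variant described for urban blocks. Two interpenetrating sub-samples could be drawn by calling the function twice with independent random starts.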
As regards the rural areas of Arunachal Pradesh, the procedure of 'cluster sampling' was as follows: the field staff were supplied with a list of the nucleus villages of each cluster and selected the remaining villages of the cluster according to the procedure described in Section Two. The nucleus villages were selected circular systematically with equal probability, in the form of two IPNS's.
Hamlet-groups and sub-blocks: Large villages and blocks were sub-divided into a suitable number of hamlet-groups and sub-blocks respectively, having equal population content, and one of them was selected at random for survey.
Selection of households: rural: In order to have an adequate number of sample households from the affluent section of the society, some new procedures were introduced for selection of sample households, both in the rural and urban sectors. In the rural sector, while listing households, the investigator identified the households in the village/selected hamlet-group which may be considered to be relatively more affluent than the rest. This was done largely on the basis of his own judgment, but while exercising his judgment he considered factors generally associated with rich people in the locality, such as: living in a large pucca house in a well-maintained state, ownership/possession of cultivated/irrigated land in excess of certain norms (e.g. 20 acres of cultivated land or 10 acres of irrigated land), ownership of motor vehicles and costly consumer durables like T.V., VCR, VCP and refrigerator, ownership of a large business establishment, etc. Now these "rich" households will form sub-stratum 1. (If the total number of households listed is 80 or more, the 10 relatively most affluent households will form sub-stratum 1. If it is below 80, 8 such households will form sub-stratum 1.) The remaining households will constitute sub-stratum 2. At the time of listing, information relating to each household's major sources of income will be collected, on the basis of which its means of livelihood will be identified as one of the following: "self-employed in non-agriculture", "rural labour" and "others" (see Section Two for definition of these terms). Also the area of land possessed as on the date of survey will be ascertained from all households while listing. Now the households of sub-stratum 2 will be arranged in the order: (1) self-employed in non-agriculture, (2) rural labour, other households, with land possessed (acres): (3) less than 1.00, (4) 1.00-2.49, (5) 2.50-4.99, (6)
Note: DPH is updating and streamlining the COVID-19 cases, deaths, and testing data. As of 6/27/2022, the data will be published in four tables instead of twelve. The COVID-19 Cases, Deaths, and Tests by Day dataset contains cases and test data by date of sample submission. The death data are by date of death. This dataset is updated daily and contains information back to the beginning of the pandemic. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-Cases-Deaths-and-Tests-by-Day/g9vi-2ahj. The COVID-19 State Metrics dataset contains over 93 columns of data. This dataset is updated daily and currently contains information starting June 21, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-State-Level-Data/qmgw-5kp6. The COVID-19 County Metrics dataset contains 25 columns of data. This dataset is updated daily and currently contains information starting June 16, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-County-Level-Data/ujiq-dy22. The COVID-19 Town Metrics dataset contains 16 columns of data. This dataset is updated daily and currently contains information starting June 16, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-Town-Level-Data/icxw-cada. To protect confidentiality, if a town has fewer than 5 cases or positive NAAT tests over the past 7 days, those data will be suppressed. This dataset contains COVID-19 cases and associated deaths that have been reported among Connecticut residents, broken down by race and ethnicity. All data in this report are preliminary; data for previous dates will be updated as new reports are received and data errors are corrected. Deaths reported to either the Office of the Chief Medical Examiner (OCME) or the Department of Public Health (DPH) are included in the COVID-19 update.
The following data show the number of COVID-19 cases and associated deaths per 100,000 population by race and ethnicity. Crude rates represent the total cases or deaths per 100,000 people. Age-adjusted rates consider the age of the person at diagnosis or death when estimating the rate and use a standardized population to provide a fair comparison between population groups with different age distributions. Age-adjustment is important in Connecticut as the median age among the non-Hispanic white population is 47 years, whereas it is 34 years among non-Hispanic blacks, and 29 years among Hispanics. Because most non-Hispanic white residents who died were over 75 years of age, the age-adjusted rates are lower than the unadjusted rates. In contrast, Hispanic residents who died tend to be younger than 75 years of age, which results in higher age-adjusted rates. The population data used to calculate rates are based on the CT DPH population statistics for 2019, which are available online here: https://portal.ct.gov/DPH/Health-Information-Systems--Reporting/Population/Population-Statistics. Prior to 5/10/2021, the population estimates from 2018 were used. Rates are standardized to the 2000 US Millions Standard population (data available here: https://seer.cancer.gov/stdpopulations/). Standardization was done using 19 age groups (0, 1-4, 5-9, 10-14, ..., 80-84, 85 years and older). More information about direct standardization for age adjustment is available here: https://www.cdc.gov/nchs/data/statnt/statnt06rv.pdf Categories are mutually exclusive. The category “multiracial” includes people who answered ‘yes’ to more than one race category. Counts may not add up to total case counts as data on race and ethnicity may be missing. Age-adjusted rates are calculated only for groups with more than 20 deaths. Abbreviation: NH=Non-Hispanic. Data on Connecticut deaths were obtained from the Connecticut Deaths Registry maintained by the DPH Office of Vital Records.
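The difference between crude and age-adjusted rates described above can be illustrated with a minimal sketch of direct standardization. The function names and the two-group toy numbers are assumptions made for illustration only; they are not DPH figures, and the actual calculation uses 19 age groups and the 2000 US standard population.

```python
def crude_rate(events, population, per=100_000):
    """Total events divided by total population, per `per` people."""
    return sum(events) / sum(population) * per


def age_adjusted_rate(events, population, std_population, per=100_000):
    """Direct standardization: weight each age-specific rate by the
    share of the standard population in that age group."""
    if not (len(events) == len(population) == len(std_population)):
        raise ValueError("all inputs need one entry per age group")
    std_total = sum(std_population)
    rate = 0.0
    for d, p, s in zip(events, population, std_population):
        rate += (d / p) * (s / std_total)
    return rate * per


# Toy example with two age groups (younger, older); values are invented.
deaths = [10, 20]
people = [1000, 1000]
standard = [3000, 1000]  # standard population skewed to the younger group

print(crude_rate(deaths, people))                   # 1500 per 100,000
print(age_adjusted_rate(deaths, people, standard))  # about 1250 per 100,000
```

In this toy case the adjusted rate is lower than the crude rate because the standard population gives less weight to the older group, where the age-specific rate is higher.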
Cause of death was determined by a death certifier (e.g., physician, APRN, medical
https://creativecommons.org/publicdomain/zero/1.0/
The purpose of this data set is to allow exploration between various types of data that is commonly collected by the US government across the states and the USA as a whole. The data set consists of three different types of data:
When creating the data set, I combined data from many different types of sources, all of which are cited below. I have also provided the fields included in the data set and what they represent below. I have not performed any research on the data yet, but am going to dive in soon. I am particularly interested in the relationships between various types of data (e.g. GDP or birth rate) in prediction algorithms. Given that I have compiled 5 years’ worth of data, this data set was primarily constructed with predictive algorithms in mind.
An additional note before you delve into the fields: There could have been many more variables added across many different fields of metrics. I have stopped here, but it could potentially be beneficial to observe the interaction of these variables with others (e.g. the GDP of certain industries, the average age in a state, the male/female gender ratio, etc.) to attempt to find additional trends.
As noted from the census:
Net international migration for the United States includes the international migration of both native and foreign-born populations. Specifically, it includes: (a) the net international migration of the foreign born, (b) the net migration between the United States and Puerto Rico, (c) the net migration of natives to and from the United States, and (d) the net movement of the Armed Forces population between the United States and overseas. Net international migration for Puerto Rico includes the migration of native and foreign-born populations between the United States and Puerto Rico.
Codes for most of the data, information about the geographic terms and conditions, and more information about the methodology behind the population estimates can be found on the US Census website.
As coronavirus cases have exploded across the country, states have struggled to obtain sufficient personal protective equipment such as masks, face shields, gloves and ventilators to meet the needs of healthcare workers. FEMA began distributing PPE from the national stockpile as well as PPE obtained from private manufacturers to states in March.
Initially, FEMA distributed materials based primarily on population. By late March, its methods changed to send more PPE to hotspot locations, and FEMA claimed these decisions were data-driven and need-based. By late spring, the agency was considering requests from states as well.
Although all U.S. states and territories have received some amount of PPE from FEMA, the amounts of PPE states have per capita and per positive COVID-19 case vary widely.
The AP used this data in a story that ran July 7.
These numbers include material distributed by FEMA and also those sold by private distributors under direction from FEMA. They include materials both delivered to and en route to states.
States have purchased PPE directly in addition to receiving PPE from FEMA or directed there by the agency, and this data only includes the latter categories.
FEMA also distributed and directed the distribution of gear to U.S. territories in addition to states. The territories are included in FEMA’s release linked below, but are not included in this data.
FEMA has publicly distributed its breakdown of PPE delivery by state for May and June. FEMA did not provide comprehensive numbers for each state before May.
These numbers are cumulative, meaning that the numbers for May include items of PPE distributed prior to May 14, dating to when the agency began allocations on March 1. The June numbers include the May numbers and any new PPE distributions since then.
The population column, which was used to calculate the numbers of PPE items per capita in each state, came from data from the U.S. Census Bureau. Since the Census releases annual population data, population data from 2019 was used for each state.
The numbers of coronavirus cases were pulled from the data released daily by Johns Hopkins University as of the dates that FEMA released its distribution numbers — May 14 and June 10.
The data includes amounts of gear that had been delivered to the states or were en route as of the reporting dates.
All PPE item numbers above 1 million were rounded to the nearest hundred thousand by FEMA, but numbers lower than that were not rounded.
In some cases, gear headed to a state was rerouted because it was needed more somewhere else or a state decided it did not need it. In some instances, that resulted in states having higher numbers for certain supplies in May than in June.
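The per-capita and per-case comparisons described above amount to two simple ratios. The following is a minimal sketch; the function name and the example figures are hypothetical and are not taken from the FEMA, Census, or Johns Hopkins data.

```python
def ppe_rates(items, population, cases):
    """Return PPE items per resident and per confirmed case.

    `items` is a cumulative count of PPE delivered or en route,
    `population` a 2019 state population, `cases` a cumulative case
    count on the FEMA reporting date (all hypothetical inputs).
    """
    per_capita = items / population
    # Guard against a zero case count rather than dividing by zero.
    per_case = items / cases if cases else None
    return {"per_capita": per_capita, "per_case": per_case}


# Invented example: 1.2 million items, 3 million residents, 6,000 cases.
print(ppe_rates(1_200_000, 3_000_000, 6_000))
```

Because FEMA rounded item counts above 1 million to the nearest hundred thousand, ratios computed from such counts carry that rounding with them.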
This shapefile represents proposed management categories (Core, Priority, General, and Non-Habitat) derived from the intersection of habitat suitability categories and lek space use. Habitat suitability categories were derived from a composite, continuous surface of sage-grouse habitat suitability index (HSI) values for Nevada and northeastern California formed from the multiplicative product of the spring, summer, and winter HSI surfaces. Summary of steps to create Management Categories: HABITAT SUITABILITY INDEX: The HSI was derived from a generalized linear mixed model (specified by binomial distribution and created using ArcGIS 10.2.2) that contrasted data from multiple environmental factors at used sites (telemetry locations) and available sites (random locations). Predictor variables for the model represented vegetation communities at multiple spatial scales, water resources, habitat configuration, urbanization, roads, elevation, ruggedness, and slope. Vegetation data was derived from various mapping products, which included NV SynthMap (Petersen 2008), SageStitch (Comer et al. 2002), LANDFIRE (Landfire 2010), and the CA Fire and Resource Assessment Program (CFRAP 2006). The analysis was updated to include high resolution percent cover within 30 x 30 m pixels for sagebrush, non-sagebrush, herbaceous vegetation, and bare ground (C. Homer, unpublished; based on the methods of Homer et al. 2014, Xian et al. 2015) and conifer (primarily pinyon-juniper, P. Coates, unpublished). The pool of telemetry data included the same data from 1998 - 2013 used by Coates et al. (2014) as well as additional telemetry location data from field sites in 2014. The dataset was then split according to calendar date into three seasons. Spring included telemetry locations (n = 14,058) from mid-March to June; summer included locations (n = 11,743) from July to mid-October; winter included locations (n = 4,862) from November to March.
All age and sex classes of marked grouse were used in the analysis. Sufficient data (i.e., a minimum of 100 locations from at least 20 marked sage-grouse) for modeling existed in 10 subregions for spring and summer, and seven subregions in winter. It is important to note that although this map is composed of HSI values derived from the seasonal data, it does not explicitly represent habitat suitability for reproductive females (i.e., nesting and with broods). Insufficient data were available to allow for estimation of this habitat type for all seasons throughout the study area extent. A Resource Selection Function (RSF) was calculated for each subregion and season in R software (v 3.13), using generalized linear models to derive model-averaged parameter estimates for each covariate across a set of additive models. For each season, subregional RSFs were transformed into Habitat Suitability Indices and averaged together to produce an overall statewide HSI whereby a relative probability of occurrence was calculated for each raster cell. The three seasonal HSI rasters were then multiplied to create a composite annual HSI. In order to account for discrepancies in HSI values caused by varying ecoregions within Nevada, the HSI was divided into north and south extents using a slightly modified flood region boundary (Mason 1999) that was designed to represent respective mesic and xeric regions of the state. North and south HSI rasters were each relativized according to their maximum value to rescale between zero and one, then mosaicked once more into a state-wide extent. HABITAT CATEGORIZATION: Using the same ecoregion boundaries described above, the habitat classification dataset (an independent data set comprising 10% of the total telemetry location sample) was split into locations falling within respective north and south regions.
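The composite-HSI step described above (cell-wise multiplication of the three seasonal rasters, then rescaling by the maximum value) can be sketched as follows. This is an illustration under stated assumptions: rasters are represented as small nested lists, the function names are hypothetical, and the original work was carried out in GIS and R software rather than with this code.

```python
def multiply_rasters(*rasters):
    """Cell-wise product of equally shaped rasters (lists of rows)."""
    first = rasters[0]
    product = [[1.0 for _ in row] for row in first]
    for raster in rasters:
        for r, row in enumerate(raster):
            for c, value in enumerate(row):
                product[r][c] *= value
    return product


def relativize(raster):
    """Rescale a raster to the range 0..1 by dividing by its maximum."""
    max_value = max(max(row) for row in raster)
    if max_value == 0:
        raise ValueError("cannot relativize an all-zero raster")
    return [[value / max_value for value in row] for row in raster]


# Invented 1 x 2 seasonal HSI rasters for illustration.
spring = [[0.2, 1.0]]
summer = [[0.5, 0.5]]
winter = [[1.0, 1.0]]

composite = multiply_rasters(spring, summer, winter)  # about [[0.1, 0.5]]
print(relativize(composite))                          # about [[0.2, 1.0]]
```

In the workflow described above, the rescaling would be applied separately to the north and south extents before they are mosaicked back together.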
HSI values from the composite and relativized statewide HSI surface were then extracted to each classification dataset location within the north and south region. The distribution of these values was used to identify class break values corresponding to 0.5 (high), 1.0 (moderate), and 1.5 (low) standard deviations (SD) from the mean HSI. These class breaks were used to classify the HSI surface into four discrete categories of habitat suitability: High, Moderate, Low, and Non-Habitat. In terms of percentiles, High habitat comprised greater than 30.9% of the HSI values, Moderate comprised 15 – 30.9%, Low comprised 6.7 – 15%, and Non-Habitat comprised less than 6.7%. The classified north and south regions were then clipped by the boundary layer and mosaicked to create a statewide categorical surface for habitat selection. Each habitat suitability category was converted to a vector output where gaps within polygons less than 1.2 million square meters were eliminated, polygons within 500 meters of each other were connected to create corridors, and polygons less than 1.2 million square meters in one category were incorporated into the adjacent category. The final step was to mask major roads that were buffered by 50 m (Census, 2014), lakes (Peterson, 2008) and urban areas, and place those masked areas into the non-habitat category. The existing urban layer (Census 2010) was not sufficient for our needs because it excluded towns with a population lower than 1,500. Hence, we masked smaller towns (populations of 100 to 1,500) and development with Census Block polygons (Census 2015) that had at least 50% urban development within their boundaries when viewed with reference imagery (ArcGIS World Imagery Service Layer). SPACE USE INDEX CALCULATION: Updated lek coordinates and associated trend count data were obtained from the 2015 Nevada Sage-grouse Lek Database compiled by the Nevada Department of Wildlife (NDOW, S. Espinosa, 9/20/2015).
Lek count data from the California side of the Buffalo-Skedaddle and Modoc PMUs that contributed to the overall space-use model were obtained from the Western Association of Fish and Wildlife Agencies (WAFWA), and included count data up to 2014. We used NDOW data for border leks (n = 12), and WAFWA data for those fully in California and not consistently surveyed by NDOW. We queried the database for leks with a ‘LEKSTATUS’ field classified as ‘Active’ or ‘Pending’. Active leks comprised leks with breeding males observed within the last 5 years (through the 2014 breeding season). Pending leks comprised leks without consistent breeding activity during the prior 3 - 5 surveys or that had not been surveyed during the past 5 years (these leks typically trended towards ‘inactive’), or newly discovered leks with at least 2 males. A sage-grouse management area (SGMA) was calculated by buffering Population Management Units developed by NDOW by 10 km. This included leks from the Buffalo-Skedaddle PMU that straddles the northeastern California – Nevada border, but excluded leks for the Bi-State Distinct Population Segment. The 5-year average (2011 - 2015) for the number of male grouse (or NDOW classified 'pseudo-males' if males were not clearly identified but likely) attending each lek was calculated. Compared to the 2014 input lek dataset, 36 leks switched from pending to inactive, and 74 new leks were added for 2015 (which included pending ‘new’ leks with one year of counts). A total of 917 leks were used for space use index calculation in 2015 compared to 878 leks in 2014. Utilization distributions (UDs) describing the probability of lek occurrence were calculated using fixed kernel density estimators (Silverman 1986) with bandwidths estimated from likelihood based cross-validation (CVh) (Horne and Garton 2006). UDs were weighted by the 5-year average (2011 - 2015) for the number of male grouse (or grouse of unknown gender if males were not identified) attending leks.
UDs and bandwidths were calculated using Geospatial Modelling Environment (Beyer 2012) and the ‘ks’ package (Duong 2012) in Program R. Grid cell size was 30 m. The resulting raster was re-scaled between zero and one by dividing by the maximum pixel value. The non-linear effect of distance to lek on the probability of grouse spatial use was estimated using the inverse of the utilization distribution curves described by Coates et al. (2013), where essentially the highest probability of grouse spatial use occurs near leks and then declines precipitously as a non-linear function. Euclidean distance was first calculated in ArcGIS, reclassified into 30-m distance bins (ranging from 0 - 30,000 m), and bins reclassified according to the non-linear curve in Coates et al. (2013). The resulting raster was re-scaled between zero and one by dividing by the maximum cell value. A Spatial Use Index (SUI) was calculated by taking the average of the lek utilization distribution and non-linear distance-to-lek rasters in ArcGIS, and re-scaled between zero and one by dividing by the maximum cell value. The cumulative volume of the SUI at specific isopleths was extracted in Geospatial Modelling Environment (Beyer 2012) with the command ‘isopleth’. Interior polygons (i.e., ‘donuts’ > 1.2 km²) representing no probability of use within a larger polygon of use were erased from each isopleth. The 85% isopleth, which provided greater spatial connectivity and consistency with previously used agency standards (e.g., Doherty et al. 2010), was ultimately recommended by the Sagebrush Ecosystem Technical Team. The 85% SUI isopleth was clipped by the Nevada state boundary. MANAGEMENT CATEGORIES: The process for category determination was directed by the Nevada Sagebrush Ecosystem Technical Team. Sage-grouse habitat was categorized into 4 classes: High, Moderate, Low, and Non-Habitat as described above, and intersected with the space use index to form the following management categories.
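The class-break step under HABITAT CATEGORIZATION can be sketched as follows. Two points here are assumptions rather than statements from the source: that the breaks lie below the mean (mean minus 0.5, 1.0, and 1.5 SD), which is inferred from the quoted percentiles, and that the sample standard deviation is used. The function names and HSI values are hypothetical.

```python
from statistics import mean, stdev


def class_breaks(hsi_values):
    """Break values at 0.5, 1.0 and 1.5 SD below the mean HSI
    (direction below the mean is an assumption)."""
    m = mean(hsi_values)
    sd = stdev(hsi_values)  # sample SD; the source does not say which SD
    return {
        "high": m - 0.5 * sd,
        "moderate": m - 1.0 * sd,
        "low": m - 1.5 * sd,
    }


def classify(hsi, breaks):
    """Assign one HSI value to a habitat suitability category."""
    if hsi >= breaks["high"]:
        return "High"
    if hsi >= breaks["moderate"]:
        return "Moderate"
    if hsi >= breaks["low"]:
        return "Low"
    return "Non-Habitat"


# Invented HSI values extracted at classification locations.
breaks = class_breaks([0.2, 0.4, 0.6, 0.8])
print([classify(v, breaks) for v in (0.4, 0.3, 0.2, 0.05)])
```

Under these assumptions, breaks would be computed separately for the north and south regions before the classified surfaces are mosaicked.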
1) Core habitat: Defined as the intersection between all suitable habitat (High, Moderate, and Low) and the 85% Space Use Index (SUI). 2) Priority habitat: Defined as all high quality
The National Sample Survey (NSS), set up by the Government of India in 1950 to collect socio-economic data employing scientific sampling methods, completed its forty-ninth round as a six-month survey during the period January to June 1993. The housing condition of the people is one of the very important indicators of the socio-economic development of the country. Statistical data on housing condition in qualitative and quantitative terms are needed periodically for an assessment of housing stock and formulation of housing policies and programmes. The NSS 49th round was devoted mainly to the survey on housing condition and migration, with special emphasis on slum dwellers. An integrated schedule was designed for collecting data on 'housing condition' as well as 'migration'. Also, households living in the slums were adequately represented in the sample of households where the integrated schedule was canvassed. The present study was different from the earlier study in the sense that the coverage in the present round was much wider. Detailed information on migration was collected with a view to providing data on different facets of migration. For this reason we find separate migration data for males and females, migrant households, return migrants, the structure of the residence of the migrants' households before and after migration, status of the migrants before and after migration, and other details on migration. It is to be noted that comprehensive data on out-migrants and return-migrants were collected for the first time in the 49th round.
The survey covered the whole of the Indian Union excepting (i) Ladakh and Kargil districts of Jammu & Kashmir, (ii) 768 interior villages of Nagaland (out of a total of 1119 villages) located beyond 5 kms. of a bus route, and (iii) 172 villages in Andaman & Nicobar Islands (out of a total of 520 villages) which are inaccessible throughout the year.
The survey used the interview method of data collection from a sample of randomly selected households and members of the household.
Sample survey data [ssd]
A two-stage stratified design was adopted for the 49th round survey. The first-stage units(fsu) were census villages in the rural sector and U.F.S. (Urban Frame Survey) blocks in the urban sector (However, for some of the newly declared towns of 1991 census for which UFS frames were not available, census EBs were first-stage units). The second-stage units were households in both the sectors. In the central sample altogether 5072 sample villages and 2928 urban sample blocks at all-India level were selected. Sixteen households were selected per sample village/block in each of which the schedule of enquiry was canvassed. The number of sample households actually surveyed for the enquiry was 119403.
Sample frame for fsus : Mostly the 1981 census lists of villages constituted the sampling frame for rural sector. For Nagaland, the villages located within 5 kms. of a bus route constituted the sampling frame. For Andaman and Nicobar Islands, the list of accessible villages was used as the sampling frame. For the Urban sector, the lists of NSS Urban Frame Survey (UFS) blocks have been considered as the sampling frame in most cases. However, 1991 house listing EBs (Enumeration blocks) were considered as the sampling frame for some of the new towns of 1991 census, for which UFS frames were not available.
Stratification for rural sector: States have been divided into NSS regions by grouping contiguous districts similar in respect of population density and crop pattern. In Gujarat, however, some districts have been split for the purpose of region formation, considering the location of dry areas and distribution of tribal population in the state. In the rural sector, each district with 1981 / 1991 census rural population less than 1.8 million / 2 million formed a separate stratum. Districts with larger population were divided into two or more strata, by grouping contiguous tehsils.
Stratification for urban sector : In the urban sector, strata were formed, within the NSS region, according to census population size classes of towns. Each city with population 10 lakhs or more formed a separate stratum. Further, within each region, the different towns were grouped to form three different strata on the basis of their respective census population as follows : all towns with population less than 50,000 as stratum 1, those with population 50,000 to 1,99,999 as stratum-2 and those with population 2,00,000 to 9,99,999 as stratum-3.
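The urban stratification rule above maps a town's census population to a stratum. The following is a minimal sketch; the function name and the label used for cities of 10 lakhs or more are assumptions for illustration.

```python
def urban_stratum(population):
    """Return the urban stratum for a town of the given census
    population within an NSS region (hypothetical helper)."""
    if population < 0:
        raise ValueError("population cannot be negative")
    if population >= 1_000_000:      # 10 lakhs or more
        return "separate stratum"    # each such city is its own stratum
    if population >= 200_000:        # 2,00,000 to 9,99,999
        return 3
    if population >= 50_000:         # 50,000 to 1,99,999
        return 2
    return 1                         # less than 50,000


for pop in (12_000, 75_000, 450_000, 2_500_000):
    print(pop, urban_stratum(pop))
```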
Sample size for first-stage units: The central sample comprised 5072 villages and 2928 blocks. Selection of first-stage units: The sample villages were selected with probability proportional to population, with replacement, and the sample blocks by simple random sampling without replacement. In both sectors, selection was done in the form of two independent sub-samples.
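The two selection schemes named above can be illustrated with Python's standard library. This is a sketch under stated assumptions, not the survey's actual procedure: the function names, the village populations, and the seeds are hypothetical, and each sub-sample is simply drawn with its own random generator because the source does not describe how the sub-samples were formed.

```python
import random


def select_villages_pps_wr(populations, n, rng):
    """Draw n village indices with probability proportional to population,
    with replacement (the same village may be drawn more than once)."""
    return rng.choices(range(len(populations)), weights=populations, k=n)


def select_blocks_srswor(num_blocks, n, rng):
    """Draw n distinct block indices by simple random sampling without replacement."""
    return rng.sample(range(num_blocks), n)


# Hypothetical frame: census populations of five villages in one stratum.
village_populations = [1200, 300, 4500, 800, 2500]

# Two sub-samples, each drawn with a separate generator (assumed seeds).
rng_sub1, rng_sub2 = random.Random(1), random.Random(2)
villages_sub1 = select_villages_pps_wr(village_populations, 2, rng_sub1)
villages_sub2 = select_villages_pps_wr(village_populations, 2, rng_sub2)
blocks_sub1 = select_blocks_srswor(10, 3, rng_sub1)
blocks_sub2 = select_blocks_srswor(10, 3, rng_sub2)
print(villages_sub1, villages_sub2, blocks_sub1, blocks_sub2)
```

Under probability-proportional-to-size selection, larger villages are more likely to enter the sample, which is why estimates from such a design are weighted by the inverse of the selection probability.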
There was no deviation from the original sample.
Face-to-face [f2f]
The questionnaire consisted of 13 blocks as given below:
Block 0: Descriptive Identification of Sample Household
Block 1: Identification of Sample Household
Block 2: Particulars of Field Operations
Block 3: Household Characteristics
Block 4: Demographic and Migration Particulars of Members of Household
Block 5: Building and Environment Particulars
Block 6: Particulars of the Dwelling
Block 7: Particulars of Living Facilities
Block 8: Particulars of Building Construction for Residential Purpose
Block 9: Particulars of Dwelling/Land Owned Elsewhere
Block 10: Use of Public Distribution System (PDS)
Block 11: Some General Particulars of Slum Dwellers
Block 12: Remarks by Investigator
Block 13: Comments by Supervisory Officer(s)
https://spdx.org/licenses/CC0-1.0.html
Sea level rise (SLR) will cause coastal groundwater to rise in many coastal urban environments. Inundation of contaminated soils by groundwater rise (GWR) will alter the physical, biological, and geochemical conditions that influence the fate and transport of existing contaminants. The transformed contaminants can be more toxic and/or more mobile under future conditions driven by SLR and GWR. We reviewed the vulnerability of contaminated sites to GWR in a US national database and in a case comparison with the San Francisco Bay region to estimate the risk of rising groundwater to human and ecosystem health. The results show that 326 sites in the US Superfund program may be vulnerable to changes in groundwater depth or flow direction as a result of SLR, representing 18.1 million hectares of contaminated land. In the San Francisco Bay Area, we found that GWR is predicted to impact twice as much coastal land area as inundation from SLR alone, and 5,297 state-managed sites of contamination may be vulnerable to inundation from GWR in a 1-meter SLR scenario. Increases of only a few centimeters in groundwater elevation can mobilize soil contaminants, alter flow directions in a heterogeneous urban environment with underground pipes and utility trenches, and result in new exposure pathways. Pumping for flood protection will elevate the saltwater interface, changing groundwater salinity and mobilizing metals in soil. Socially vulnerable communities are more exposed to this risk at both the national scale and in a regional comparison with the San Francisco Bay Area.
Methods
Data Dryad: This data set includes data from the California State Water Resources Control Board (WRCB), the California Department of Toxic Substances Control (DTSC), the USGS, the US EPA, and the US Census.
National Assessment Data Processing: For this portion of the project, the ArcGIS Pro and RStudio software applications were used.
Data processing for Superfund site contaminants in the text and supplementary materials was done in RStudio using the R programming language. RStudio and R were also used to clean population data from the American Community Survey (ACS). The packages used to clean and organize data from the EPA and ACS include dplyr, data.table, and tidyverse. ArcGIS Pro was used to compute spatial data on sites in the risk zone and on vulnerable populations. The DEM data for each state were processed to remove elevations above 10 m, keeping cells at 10 m and below. The Intersect tool was used to identify Superfund sites within the 10 m sea level rise risk zone. The Calculate Geometry tool was used to calculate the area within each coastal state occupied by the 10 m SLR zone, and again to calculate the area of each Superfund site. Summary Statistics were used to generate, for each state, the ratio of total Superfund site surface area to 10 m SLR zone area. To generate population estimates of socially vulnerable households in proximity to Superfund sites, we followed methods similar to those of Carter and Kalman (2020). First, we generated buffers at distances of 1 km, 3 km, and 5 km from Superfund sites. Then, using Tabulate Intersection, we calculated the estimated population of each census block group within each buffer zone. Summary Statistics were used to generate totals for each state.
Bay Area Data Processing: In this regional study, we compared the groundwater elevation projections by Befus et al. (2020) to a combined dataset of contaminated sites that we built from two separate databases (Envirostor and GeoTracker) maintained by two independent agencies of the State of California (DTSC and WRCB). We used ArcGIS to manage both the groundwater surfaces from Befus et al. (2020), as raster files, and the State's point datasets of street addresses for contaminated sites.
We used SF BCDC (2020) as the source of social vulnerability rankings for census blocks, using block shapefiles from the US Census (ACS) dataset. In addition, we generated isolines that represent the magnitude of change in groundwater elevation in specific sea level rise scenarios. We compared these isolines to the USGS geological map of the San Francisco Bay region and noted that groundwater is predicted to rise farther inland where Holocene paleochannels meet artificial fill near the shoreline. We also used maps of historic baylands (altered by dikes and fill) from the San Francisco Estuary Institute (SFEI) to identify the number of contaminated sites over rising groundwater that are located on former mudflats and tidal marshes. The contaminated sites data from the California State Water Resources Control Board (WRCB) and the Department of Toxic Substances Control (DTSC) were clipped to our study area of nine Bay Area counties. The study area does not include the ocean shorelines or the north bay delta area because water system dynamics differ in deltas. Duplicates within each dataset were removed using the Find Identical and Delete Identical tools. Duplicates between the two datasets were then removed by running the Intersect tool on the DTSC and WRCB point data. We chose this method over searching for duplicates by name because some sites change names when management is transferred from DTSC to WRCB. Lastly, the datasets were sorted into open and closed sites based on the DTSC and WRCB classifications, which are shown in a table in the paper's supplemental material. To calculate areas of rising groundwater, we used data from the USGS publication "Projected groundwater head for coastal California using present-day and future sea-level rise scenarios" by Befus, K. M., Barnard, P., Hoover, D. J., & Erikson, L. (2020). We used the model results for the hydraulic conductivity of 1 condition (Kh1) to calculate areas of rising groundwater.
We used the Raster Calculator to subtract the existing groundwater head from the groundwater head under a 1-meter sea level rise scenario to find the areas where groundwater is rising. Using the Reclass Raster tool, we reclassified the data to give every cell with a value of 0.1016 meters (4") or greater a value of 1. We chose 0.1016 meters because even that small a rise can allow groundwater to enter pipes and infrastructure. We then used the Raster to Polygon tool to generate polygons of the areas of groundwater rise.
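The subtract-then-reclassify step described above can be sketched outside ArcGIS with plain Python lists standing in for rasters. This is an illustration only: the function name, the grid values, the use of None for NoData cells, and the choice to code cells below the threshold as 0 are all assumptions, since the source states only that cells at or above 0.1016 meters were given a value of 1.

```python
RISE_THRESHOLD_M = 0.1016  # 4 inches, the threshold stated in the text


def groundwater_rise_mask(present_head, future_head, threshold=RISE_THRESHOLD_M):
    """Subtract present head from future head cell by cell and reclassify.

    Cells with a rise of at least `threshold` become 1, other valid cells
    become 0 (assumed), and cells that are None (NoData) in either input stay None.
    """
    mask = []
    for present_row, future_row in zip(present_head, future_head):
        mask_row = []
        for present, future in zip(present_row, future_row):
            if present is None or future is None:
                mask_row.append(None)
            else:
                mask_row.append(1 if future - present >= threshold else 0)
        mask.append(mask_row)
    return mask


# Hypothetical 2 x 2 groundwater head grids in meters.
present_head = [[1.00, 2.00], [0.50, None]]
future_head = [[1.05, 2.50], [0.75, 1.00]]
print(groundwater_rise_mask(present_head, future_head))
```

The cells marked 1 correspond to the areas that would then be converted to polygons in the final step.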