https://creativecommons.org/publicdomain/zero/1.0/
This dataset was obtained from the Google Jobs API through serpAPI and contains information about job offers for data scientists in companies based in the United States of America (USA). The data may include details such as job title, company name, location, job description, salary range, and other relevant information. The dataset is likely to be valuable for individuals seeking to understand the job market for data scientists in the USA and for companies looking to recruit data scientists. It may also be useful for researchers who are interested in exploring trends and patterns in the job market for data scientists. The data should be used with caution, as the API source may not cover all job offers in the USA and the information provided by the companies may not always be accurate or up-to-date.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
United States US: Total Researchers: Full-Time Equivalent data was reported at 1,639,258.000 FTE in 2021. This records an increase from the previous number of 1,513,964.000 FTE for 2020. United States US: Total Researchers: Full-Time Equivalent data is updated yearly, with a median of 998,340.036 FTE from Dec 1981 to 2021 across 41 observations. The data reached an all-time high of 1,639,258.000 FTE in 2021 and a record low of 531,938.478 FTE in 1981. United States US: Total Researchers: Full-Time Equivalent data remains in active status in CEIC and is reported by the Organisation for Economic Co-operation and Development. The data is categorized under Global Database’s United States – Table US.OECD.MSTI: Number of Researchers and Personnel on Research and Development: OECD Member: Annual.
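The year-over-year increase quoted above can be expressed as a relative change. The following is a minimal sketch using only the two FTE figures from the text; the helper name `percent_change` is hypothetical and not part of any CEIC or OECD tooling.

```python
def percent_change(previous: float, current: float) -> float:
    """Return the relative change from previous to current, in percent."""
    return (current - previous) / previous * 100.0


# Figures taken from the description above (total researchers, FTE).
fte_2020 = 1_513_964.0
fte_2021 = 1_639_258.0

change = percent_change(fte_2020, fte_2021)
print(f"Change in total researchers, 2020 to 2021: {change:.1f}%")
```

Because U.S. official figures resumed in 2020, a change computed across the 2019 to 2020 boundary would mix an OECD estimate with official data and should be read with more caution than this one.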
For the United States, from 2021 onwards, changes to the US BERD survey questionnaire allowed for more exhaustive identification of acquisition costs for ‘identifiable intangible assets’ used for R&D. This has resulted in a substantial increase in reported R&D capital expenditure within BERD. In the business sector, the funds from the rest of the world that were previously included in business-financed BERD are available separately from 2008. From 2006 onwards, GOVERD includes state government intramural performance (most of which is financed by the federal government and state government own funds). From 2016 onwards, PNPERD data are based on a new R&D performer survey. In the higher education sector all fields of SSH are included from 2003 onwards.
Following a survey of federally-funded research and development centers (FFRDCs) in 2005, it was concluded that FFRDC R&D belongs in the government sector - rather than the sector of the FFRDC administrator, as had been reported in the past. R&D expenditures by FFRDCs were reclassified from the other three R&D performing sectors to the Government sector; previously published data were revised accordingly. Between 2003 and 2004, the method used to classify data by industry was revised. This particularly affects the ISIC category “wholesale trade” and consequently the BERD for total services.
U.S. R&D data are generally comparable, but there are some areas of underestimation:
Breakdown by type of R&D (basic research, applied research, etc.) was also revised back to 1998 in the business enterprise and higher education sectors due to improved estimation procedures.
The methodology for estimating researchers was changed as of 1985. In the Government, Higher Education and PNP sectors the data since then refer to employed doctoral scientists and engineers who report their primary work activity as research, development or the management of R&D, plus, for the Higher Education sector, the number of full-time equivalent graduate students with research assistantships averaging an estimated 50 % of their time engaged in R&D activities. As of 1985 researchers in the Government sector exclude military personnel. As of 1987, Higher education R&D personnel also include those who report their primary work activity as design.
Due to lack of official data for the different employment sectors, the total researchers figure is an OECD estimate up to 2019. Comprehensive reporting of R&D personnel statistics by the United States has resumed with records available since 2020, reflecting the addition of official figures for the number of researchers and total R&D personnel for the higher education sector and the Private non-profit sector; as well as the number of researchers for the government sector. The new data revise downwards previous OECD estimates as the OECD extrapolation methods drawing on historical US data, required to produce a consistent OECD aggregate, appear to have previously overestimated the growth in the number of researchers in the higher education sector.
Pre-production development is excluded from Defence GBARD (in accordance with the Frascati Manual) as of 2000. 2009 GBARD data also include the one-time incremental R&D funding legislated in the American Recovery and Reinvestment Act of 2009. Beginning with the 2000 GBARD data, budgets for capital expenditure – “R&D plant” in national terminology - are included. GBARD data for earlier years relate to budgets for current costs only.
https://github.com/nytimes/covid-19-data/blob/master/LICENSE
The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.
Since the first reported coronavirus case in Washington State on Jan. 21, 2020, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.
We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.
The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.
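Because the files hold cumulative counts, a common first step is converting them to daily new cases per state. The sketch below assumes a state-level CSV with the columns `date,state,fips,cases,deaths`; the sample rows are made up for illustration, and clamping negative differences to zero is one possible way to handle downward revisions, not a rule taken from the data's documentation.

```python
import csv
import io
from collections import defaultdict

# Made-up rows for illustration only; the column layout is an assumption.
SAMPLE = """date,state,fips,cases,deaths
2020-01-21,Washington,53,1,0
2020-01-22,Washington,53,1,0
2020-01-23,Washington,53,3,0
2020-01-23,Illinois,17,1,0
2020-01-24,Washington,53,2,0
"""


def daily_new_cases(csv_text):
    """Turn cumulative case counts into (date, state, new_cases) tuples."""
    rows = sorted(
        csv.DictReader(io.StringIO(csv_text)),
        key=lambda r: (r["state"], r["date"]),  # ISO dates sort correctly as text
    )
    previous = defaultdict(int)  # last cumulative count seen for each state
    result = []
    for row in rows:
        state = row["state"]
        cumulative = int(row["cases"])
        # Cumulative totals can be revised downward; clamp so a revision
        # does not show up as a negative number of new cases.
        new_cases = max(cumulative - previous[state], 0)
        previous[state] = cumulative
        result.append((row["date"], state, new_cases))
    return result


for date, state, new_cases in daily_new_cases(SAMPLE):
    print(date, state, new_cases)
```

The output is ordered by state and then by date, which keeps each state's series contiguous for plotting or smoothing.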
https://creativecommons.org/publicdomain/zero/1.0/
The Bureau of Labor Statistics (BLS) is a unit of the United States Department of Labor. It is the principal fact-finding agency for the U.S. government in the broad field of labor economics and statistics and serves as a principal agency of the U.S. Federal Statistical System. The BLS is a governmental statistical agency that collects, processes, analyzes, and disseminates essential statistical data to the American public, the U.S. Congress, other Federal agencies, State and local governments, business, and labor representatives. Source: https://en.wikipedia.org/wiki/Bureau_of_Labor_Statistics
Bureau of Labor Statistics including CPI (inflation), employment, unemployment, and wage data.
Update Frequency: Monthly
Fork this kernel to get started.
https://bigquery.cloud.google.com/dataset/bigquery-public-data:bls
https://cloud.google.com/bigquery/public-data/bureau-of-labor-statistics
Dataset Source: http://www.bls.gov/data/
This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://www.data.gov/privacy-policy#data_policy - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
What is the average annual inflation across all US Cities? What was the monthly unemployment rate (U3) in 2016? What are the top 10 hourly-waged types of work in Pittsburgh, PA for 2016?
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the Science Hill population by age. The dataset can be utilized to understand the age distribution and demographics of Science Hill.
The dataset comprises the following three datasets
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to check the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
The U.S. Geological Survey (USGS), Woods Hole Science Center (WHSC) has been an active member of the Woods Hole research community for over 40 years. In that time there have been many sediment collection projects conducted by USGS scientists and technicians for the research and study of seabed environments and processes. These samples are collected at sea or near shore and then brought back to the WHSC for study. While at the Center, samples are stored in ambient temperature, cold or freezing conditions, depending on the best mode of preparation for the study being conducted or the duration of storage planned for the samples. Recently, storage methods and available storage space have become a major concern at the WHSC. The shapefile sed_archive.shp gives a geographical view of the samples in the WHSC's collections and where they were collected, along with images and hyperlinks to useful resources.
For those who are actively looking for data scientist jobs in the U.S., the best news this month is the LinkedIn Workforce Report August 2018. According to the report, there is a shortage of 151,717 people with data science skills, with particularly acute shortages in New York City, San Francisco Bay Area and Los Angeles.
To help job hunters (including me) better understand the job market, I scraped the Indeed website and collected information on 7,000 data scientist jobs around the U.S. on August 3rd. The information that I collected is: Company Name, Position Name, Location, Job Description, and Number of Reviews of the Company.
Special thanks to Indeed for not blocking me : )
Possible Questions:
https://www.myvisajobs.com/terms-of-service/
A dataset that explores Green Card sponsorship trends, salary data, and employer insights for data scientists in the U.S.
HabibAhmed/Data-Science-Instruct-Dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
The 5-year goal of the “Model America” concept was to generate a model of every building in the United States. This data repository delivers on that goal with "Model America v1".
Oak Ridge National Laboratory (ORNL) has developed the Automatic Building Energy Modeling (AutoBEM) software suite to process multiple types of data, extract building-specific descriptors, generate building energy models, and simulate them on High Performance Computing (HPC) resources. For more information, see AutoBEM-related publications (bit.ly/AutoBEM).
There were 125,715,609 buildings detected in the United States. Of this number, 122,146,671 (97.2%) buildings resulted in a successful generation and simulation of a building energy model. This dataset includes the full 125 million buildings. Future updates may include additional buildings, data improvements, or other algorithmic model enhancements in "Model America v2".
This dataset contains OSM and IDF zip files for every U.S. county. Each zip file contains the generated buildings from that county.
The .csv input data contains the following data fields:
1. ID - unique building ID
2. Centroid - building center location in latitude/longitude (from Footprint2D)
3. Footprint2D - building polygon of 2D footprint (lat1/lon1_lat2/lon2_...)
4. State_abbr - state name
5. Area - estimate of total conditioned floor area (ft2)
6. Area2D - footprint area (ft2)
7. Height - building height (ft)
8. NumFloors - number of floors (above-grade)
9. WWR_surfaces - percent of each facade (pair of points from Footprint2D) covered by fenestration/windows (average 14.5% for residential, 40% for commercial buildings)
10. CZ - ASHRAE Climate Zone designation
11. BuildingType - DOE prototype building designation (IECC=residential) as implemented by OpenStudio-standards
12. Standard - building vintage
This data is made free and openly available in hopes of stimulating any simulation-informed use case.
Data is provided as-is with no warranties, express or implied, regarding fitness for a particular purpose. We wish to thank our sponsors which include Oak Ridge National Laboratory (ORNL) Laboratory Directed Research and Development (LDRD), U.S. Dept. of Energy’s (DOE) Building Technologies Office (BTO), Office of Electricity (OE), Biological and Environmental Research (BER), and National Nuclear Security Administration (NNSA).
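The Footprint2D field described above packs a polygon into a single string of the form lat1/lon1_lat2/lon2_... . A minimal sketch of unpacking it is shown here; the function name `parse_footprint` and the example coordinates are made up for illustration, and the sketch does not cover any edge cases (such as closed rings or empty fields) that the real files may contain.

```python
from typing import List, Tuple


def parse_footprint(footprint: str) -> List[Tuple[float, float]]:
    """Split a 'lat1/lon1_lat2/lon2_...' string into (lat, lon) tuples."""
    points = []
    for pair in footprint.strip().split("_"):
        if not pair:
            continue  # tolerate a stray leading or trailing underscore
        lat_text, lon_text = pair.split("/")
        points.append((float(lat_text), float(lon_text)))
    return points


# Hypothetical footprint string; not taken from the dataset.
example = "35.93/-84.31_35.93/-84.30_35.94/-84.30_35.94/-84.31"
vertices = parse_footprint(example)
print(len(vertices), "vertices:", vertices)

# Consistency check on the simulation success rate quoted in the description.
detected = 125_715_609
simulated = 122_146_671
success_rate = simulated / detected * 100.0
print(f"Success rate: {success_rate:.1f}%")
```

Each consecutive pair of vertices corresponds to one facade, which is the unit the WWR_surfaces field refers to.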
This dataset includes Level 1B (L1B) and Level 2 (L2) data products from the MODIS/ASTER Airborne Simulator (MASTER) instrument. The spectral data were collected during five flights aboard a NASA ER-2 aircraft over the southwestern U.S., from 2011-05-15 to 2011-05-23. This deployment was coordinated by NASA's Dryden Flight Research Center (DFRC), renamed Armstrong Flight Research Center in 2014, located in Edwards, California. Data products include L1B georeferenced multispectral imagery of calibrated radiance in 50 bands covering wavelengths of 0.460 to 12.879 micrometers at approximately 50-meter spatial resolution. Derived L2 data products are emissivity in 5 bands in the thermal infrared range (8.58 to 12.13 micrometers) and land surface temperature. The L1B file format is HDF-4, and L2 products are provided in ENVI and KMZ formats. In addition, the dataset includes the flight path, spectral band information, instrument configuration, ancillary notes, and summary information for each flight, and browse images derived from each L1B data file.
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
The United States Geological Survey (USGS) - Science Analytics and Synthesis (SAS) - Gap Analysis Project (GAP) manages the Protected Areas Database of the United States (PAD-US), an Arc10x geodatabase, that includes a full inventory of areas dedicated to the preservation of biological diversity and to other natural, recreation, historic, and cultural uses, managed for these purposes through legal or other effective means (www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/science/protected-areas). The PAD-US is developed in partnership with many organizations, including coordination groups at the [U.S.] Federal level, lead organizations for each State, and a number of national and other non-governmental organizations whose work is closely related to the PAD-US. Learn more about the USGS PAD-US partners program here: www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/science/pad-us-data-stewards. The United Nations Environmental Program - Worl ...
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Introduced (non-native) species that become established may eventually become invasive, so tracking introduced species provides a baseline for effective modeling of species trends and interactions, geospatially and temporally.
The umbrella dataset, called United States Register of Introduced and Invasive Species (US-RIIS), is comprised of three lists, one each for Alaska (AK, with 545 records, this dataset), Hawaii (HI, with 5,628 records), and the conterminous (or lower 48) United States (L48, with 8,527 records). Each list includes introduced (non-native), established (reproducing) taxa that: are, or may become, invasive (harmful) in the locality; are not known to be harmful there; and/or have been used for biological control in the locality.
To be included in the GRIIS-AK, a taxon must be non-native everywhere in the locality and established (reproducing) anywhere in the locality. Native pest species are not included.
Each record has information on taxonomy, a vernacular name, establishment means designation (introduced unintentionally, or assisted colonization), degree of establishment (established, invasive, or widespread invasive), hybrid status, pathway of introduction (where available), habitat (where available), whether a biocontrol species, dates of introduction (where available; currently 77% of the records for Alaska), associated taxa (where applicable), native and introduced distributions (where available), and citations for the authoritative source(s) from which this information is drawn. The umbrella dataset US-RIIS builds on a previous dataset, A Comprehensive List of Non-Native Species Established in Three Major Regions of the U.S.: Version 3.0 (Simpson et al., 2020, https://doi.org/10.5066/p9e5k160).
There are 14,700 records in the master list (USRIISv2_MasterList) and 12,571 unique scientific names. The list is derived from more than 5,800 authoritative sources (USRIISv2_AuthorityReferences) and was reviewed by (or based on input from) more than 30 taxonomic experts and invasive species scientists.
Many thanks to these reviewers and contributors: Coauthors Pam Fuller (USGS Emeritus), Kevin Faccenda (University of Hawaii), Neal Evenhuis (Bishop Museum), Janis Matsunaga (Hawaii Department of Agriculture), and Matt Bowser (US-Fish and Wildlife Service); contributors Rachael Blake (data science), National Socio-Environmental Synthesis Center (SESYNC); M. Lourdes Chamorro (Curculionidae), USDA-ARS Entomology; Meghan C. Eyler (data reviewer), US Fish & Wildlife Service; Danielle Froelich (Hawaiian botany), SWCA Environmental Consultants; Thomas Henry (Heteroptera), USDA-ARS Entomology; Sam James (Annelida), Maharishi University; Nancy Khan (Hawaiian botany), Smithsonian Institution; Alex Konstantinov (Chrysomelidae), USDA-ARS Entomology; Andrew P. Landsman (Arachnida), National Park Service, C&O Canal National Historical Park; Christopher Lepczyk (Vertebrata), Auburn University; Sandy Liebhold (Coleoptera), USDA-FS; Steven Lingafelter (Cerambycidae), USDA-APHIS; Walter Meshaka (Herpetology), State Museum of Pennsylvania; Gary L. Miller (Aphididae), USDA-ARS Entomology; Allen Norrbom (Tephritidae), USDA-ARS Entomology; Shyama Pagad (global invasive species), IUCN SSC Invasive Species Specialists' Group; John Reynolds (Annelida), Oligochaetology Laboratory; Alexander Salazar (Lycosidae), Miami University, Ohio; Elizabeth A. Sellers (data manager), USGS; Derek Sikes (Alaskan invertebrates), University of Alaska; Bruce A. Snyder (Annelida), Georgia College and State University; Alma Solis (Pyralid moths), USDS-ARS at the Smithsonian Institution; Rebecca Turner (data manager), Scion Inc., New Zealand; Darrell Ubick (Arachnida), Cal Academy; Warren Wagner (Hawaiian botany), Smithsonian Institution; Mark Wetzel (Annelida), Illinois Natural History Survey; and James D. Young (Lepidoptera), USDA-APHIS-PPQ-PHP. Our apologies to the many contributing experts we may have inadvertently omitted.
Our dataset consists of transcripts and codebooks for a focus group study. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. EPA cannot release CBI, or data protected by copyright, patent, or otherwise subject to trade secret restrictions. Requests for access to CBI data may be directed to the dataset owner by an authorized person by contacting the party listed. It can be accessed through the following means: Contact Katie Williams, williams.kathleen@epa.gov. Format: The data are transcripts and protected by IRB approvals. This dataset is associated with the following publication: Eisenhauer, E., K. Williams, K. Margeson, S. Paczuski, K. Mulvaney, and M.C. Hano. Advancing translational research in environmental science: The role and impact of social science. Environmental Science & Policy. Elsevier Science Ltd, New York, NY, USA, 120: 165-172, (2021).
This report is a PDF file of the best practices determined by participants from the Indigenous Sentinels Network, a community-driven network coordinated by the Tribal government of St. Paul Island, Aleut Community of St. Paul Island Ecosystem Conservation Office and Axiom Data Science. The intended audience is any team responsible for data governance in environmental or social data work, in particular: funding agencies, government agencies, non-governmental organizations, individual researchers and technology companies. Recommendations cover enhancing responsiveness and organizational capacity for funders and technology companies (Axiom Data Science is named), as well as specific technical strategies and enhancements for cyberinfrastructure (again, Axiom Data Science is named). Resources and a template data sharing agreement are included.
This is a tiled collection of the 3D Elevation Program (3DEP) and is 1 arc-second (approximately 30 m) resolution. The elevations in this Digital Elevation Model (DEM) represent the topographic bare-earth surface. The 3DEP data holdings serve as the elevation layer of The National Map, and provide foundational elevation information for earth science studies and mapping applications in the United States. Scientists and resource managers use 3DEP data for hydrologic modeling, resource monitoring, mapping and visualization, and many other applications. The seamless 1 arc-second DEM layers are derived from diverse source data that are processed to a common coordinate system and unit of vertical measure. These data are distributed in geographic coordinates in units of decimal degrees, and in conformance with the North American Datum of 1983 (NAD 83). All elevation values are in meters and, over the continental United States, are referenced to the North American Vertical Datum of 1988 (NAVD88). The seamless 1 arc-second DEM layer provides coverage of the conterminous United States, Hawaii, Puerto Rico, other territorial islands, and much of Alaska and Canada. The seamless 1 arc-second DEM is available as pre-staged current and historical products tiled in GeoTIFF format. The seamless 1 arc-second DEM layer is updated continually as new data become available in the current folder. Previously created 1 degree blocks are retained in the historical folder with an appended date suffix (YYYYMMDD) when they were produced. Other 3DEP products are nationally seamless DEMs in resolutions of 1 and 1/3 arc-second. These seamless DEMs were referred to as the National Elevation Dataset (NED) from about 2000 through 2015 at which time they became the seamless DEM layers under the 3DEP program and the NED name and system were retired. 
Other 3DEP products include one-meter DEMs produced exclusively from high resolution light detection and ranging (lidar) source data and five-meter DEMs in Alaska as well as various source datasets including the lidar point cloud and interferometric synthetic aperture radar (Ifsar) digital surface models and intensity images. All 3DEP products are public domain.
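The "approximately 30 m" figure for a 1 arc-second cell follows from simple geometry. The sketch below assumes a spherical Earth with a mean radius of 6,371 km, which is an approximation chosen for illustration and not the NAD 83 ellipsoid the data are referenced to; the function name is hypothetical.

```python
import math

# Assumption: spherical Earth with mean radius in meters (not the NAD 83 ellipsoid).
EARTH_RADIUS_M = 6_371_000.0


def arcsecond_ground_size(latitude_deg: float, arcseconds: float = 1.0):
    """Return the approximate (north_south_m, east_west_m) size of a cell."""
    angle_rad = math.radians(arcseconds / 3600.0)
    north_south = EARTH_RADIUS_M * angle_rad
    # Meridians converge toward the poles, so the east-west extent
    # shrinks with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


for latitude in (0.0, 40.0, 65.0):
    ns, ew = arcsecond_ground_size(latitude)
    print(f"latitude {latitude:4.1f}: {ns:.1f} m north-south, {ew:.1f} m east-west")
```

Since the data are distributed in geographic coordinates, cells are close to 30 m on a side only near the equator; at mid and high latitudes they are noticeably narrower east-west than north-south.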
https://crawlfeeds.com/privacy_policy
Explore the Redfin USA Properties Dataset, available in CSV format. This extensive dataset provides valuable insights into the U.S. real estate market, including detailed property listings, prices, property types, and more across various states and cities. Perfect for those looking to conduct in-depth market analysis, real estate investment research, or financial forecasting.
Key Features:
Who Can Benefit From This Dataset:
Download the Redfin USA Properties Dataset to access essential information on the U.S. housing market, ideal for professionals in real estate, finance, and data analytics. Unlock key insights to make informed decisions in a dynamic market environment.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents median household incomes for various household sizes in Science Hill, KY, as reported by the U.S. Census Bureau. The dataset highlights the variation in median household income with the size of the family unit, offering valuable insights into economic trends and disparities within different household sizes, aiding in data analysis and decision-making.
Key observations
Figure: Science Hill, KY median household income, by household size (in 2022 inflation-adjusted dollars). Image: https://i.neilsberg.com/ch/science-hill-ky-median-household-income-by-household-size.jpeg
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
Household Sizes:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to check the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for Science Hill median household income.
MATLAB led the global advanced analytics and data science software industry in 2025 with a market share of ***** percent. First launched in 1984, MATLAB is developed by the U.S. firm MathWorks.
The oceanographic time series data collected by U.S. Geological Survey scientists and collaborators are served in an online database at http://stellwagen.er.usgs.gov/index.html. These data were collected as part of research experiments investigating circulation and sediment transport in the coastal ocean. The experiments (projects, research programs) are typically one month to several years long and have been carried out since 1975. New experiments will be conducted, and the data from them will be added to the collection. As of 2016, all but one of the experiments were conducted in waters abutting the U.S. coast; the exception was conducted in the Adriatic Sea. Measurements acquired vary by site and experiment; they usually include current velocity, wave statistics, water temperature, salinity, pressure, turbidity, and light transmission from one or more depths over a time period. The measurements are concentrated near the sea floor but may also include data from the water column. The user interface provides an interactive map, a tabular summary of the experiments, and a separate page for each experiment. Each experiment page has documentation and maps that provide details of what data were collected at each site. Links to related publications with additional information about the research are also provided. The data are stored in Network Common Data Format (netCDF) files using the Equatorial Pacific Information Collection (EPIC) conventions defined by the National Oceanic and Atmospheric Administration (NOAA) Pacific Marine Environmental Laboratory. NetCDF is a general, self-documenting, machine-independent, open source data format created and supported by the University Corporation for Atmospheric Research (UCAR). EPIC is an early set of standards designed to allow researchers from different organizations to share oceanographic data. The files may be downloaded or accessed online using the Open-source Project for a Network Data Access Protocol (OPeNDAP). 
The OPeNDAP framework allows users to access data from anywhere on the Internet using a variety of Web services including Thematic Realtime Environmental Distributed Data Services (THREDDS). A subset of the data compliant with the Climate and Forecast convention (CF, currently version 1.6) is also available.