62 datasets found
  1. 2023 Data Scientists Jobs Descriptions

    • kaggle.com
    Updated Feb 1, 2023
    Cite
    Diego Silva França (2023). 2023 Data Scientists Jobs Descriptions [Dataset]. https://www.kaggle.com/datasets/diegosilvadefrana/2023-data-scientists-jobs-descriptions
    Dataset updated
    Feb 1, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Diego Silva França
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset was obtained from the Google Jobs API through serpAPI and contains information about job offers for data scientists in companies based in the United States of America (USA). The data may include details such as job title, company name, location, job description, salary range, and other relevant information. The dataset is likely to be valuable for individuals seeking to understand the job market for data scientists in the USA and for companies looking to recruit data scientists. It may also be useful for researchers who are interested in exploring trends and patterns in the job market for data scientists. The data should be used with caution, as the API source may not cover all job offers in the USA and the information provided by the companies may not always be accurate or up-to-date.

  2. United States US: Total Researchers: Full-Time Equivalent

    • ceicdata.com
    Updated Mar 15, 2023
    Cite
    CEICdata.com (2023). United States US: Total Researchers: Full-Time Equivalent [Dataset]. https://www.ceicdata.com/en/united-states/number-of-researchers-and-personnel-on-research-and-development-oecd-member-annual/us-total-researchers-fulltime-equivalent
    Dataset updated
    Mar 15, 2023
    Dataset provided by
    CEIC Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 1, 2010 - Dec 1, 2021
    Area covered
    United States
    Description

    United States US: Total Researchers: Full-Time Equivalent data was reported at 1,639,258.000 FTE in 2021. This records an increase from the previous number of 1,513,964.000 FTE for 2020. United States US: Total Researchers: Full-Time Equivalent data is updated yearly, averaging 998,340.036 FTE from Dec 1981 (Median) to 2021, with 41 observations. The data reached an all-time high of 1,639,258.000 FTE in 2021 and a record low of 531,938.478 FTE in 1981. United States US: Total Researchers: Full-Time Equivalent data remains in active status in CEIC and is reported by the Organisation for Economic Co-operation and Development. The data is categorized under Global Database’s United States – Table US.OECD.MSTI: Number of Researchers and Personnel on Research and Development: OECD Member: Annual.

    For the United States, from 2021 onwards, changes to the US BERD survey questionnaire allowed for more exhaustive identification of acquisition costs for ‘identifiable intangible assets’ used for R&D. This has resulted in a substantial increase in reported R&D capital expenditure within BERD. In the business sector, the funds from the rest of the world previously included in the business-financed BERD, are available separately from 2008. From 2006 onwards, GOVERD includes state government intramural performance (most of which being financed by the federal government and state government own funds). From 2016 onwards, PNPERD data are based on a new R&D performer survey. In the higher education sector all fields of SSH are included from 2003 onwards.

    Following a survey of federally-funded research and development centers (FFRDCs) in 2005, it was concluded that FFRDC R&D belongs in the government sector - rather than the sector of the FFRDC administrator, as had been reported in the past. R&D expenditures by FFRDCs were reclassified from the other three R&D performing sectors to the Government sector; previously published data were revised accordingly. Between 2003 and 2004, the method used to classify data by industry has been revised. This particularly affects the ISIC category “wholesale trade” and consequently the BERD for total services.

    U.S. R&D data are generally comparable, but there are some areas of underestimation:

    i) Up to 2008, Government sector R&D performance covers only federal government activities. That by State and local government establishments is excluded;
    ii) Except for the Government and the Business Enterprise sectors, the R&D data exclude most capital expenditures. For the Business Enterprise sector, depreciation is reported in place of gross capital expenditures up to 2014. Higher education (and national total) data were revised back to 1998 due to an improved methodology that corrects for double-counting of R&D funds passed between institutions.

    Breakdown by type of R&D (basic research, applied research, etc.) was also revised back to 1998 in the business enterprise and higher education sectors due to improved estimation procedures.

    The methodology for estimating researchers was changed as of 1985. In the Government, Higher Education and PNP sectors the data since then refer to employed doctoral scientists and engineers who report their primary work activity as research, development or the management of R&D, plus, for the Higher Education sector, the number of full-time equivalent graduate students with research assistantships averaging an estimated 50 % of their time engaged in R&D activities. As of 1985 researchers in the Government sector exclude military personnel. As of 1987, Higher education R&D personnel also include those who report their primary work activity as design.

    Due to lack of official data for the different employment sectors, the total researchers figure is an OECD estimate up to 2019. Comprehensive reporting of R&D personnel statistics by the United States has resumed with records available since 2020, reflecting the addition of official figures for the number of researchers and total R&D personnel for the higher education sector and the Private non-profit sector; as well as the number of researchers for the government sector. The new data revise downwards previous OECD estimates as the OECD extrapolation methods drawing on historical US data, required to produce a consistent OECD aggregate, appear to have previously overestimated the growth in the number of researchers in the higher education sector.

    Pre-production development is excluded from Defence GBARD (in accordance with the Frascati Manual) as of 2000. 2009 GBARD data also includes the one time incremental R&D funding legislated in the American Recovery and Reinvestment Act of 2009. Beginning with the 2000 GBARD data, budgets for capital expenditure – “R&D plant” in national terminology - are included. GBARD data for earlier years relate to budgets for current costs only.

  3. Coronavirus (Covid-19) Data in the United States

    • github.com
    • openicpsr.org
    • +2 more
    csv
    Cite
    New York Times, Coronavirus (Covid-19) Data in the United States [Dataset]. https://github.com/nytimes/covid-19-data
    Available download formats: csv
    Dataset provided by
    New York Times
    License

    https://github.com/nytimes/covid-19-data/blob/master/LICENSE

    Description

    The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.

    Since the first reported coronavirus case in Washington State on Jan. 21, 2020, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.

    We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.

    The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.
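
    Because the files described above hold cumulative counts, per-day figures have to be derived by differencing consecutive values. The sketch below illustrates that step on a plain list of integers; the function name and the sample numbers are made up for illustration and are not taken from the repository.

```python
from typing import List


def daily_from_cumulative(cumulative: List[int]) -> List[int]:
    """Turn a cumulative series into per-period increments.

    The first increment is the first cumulative value itself,
    since there is no earlier observation to subtract.
    """
    if not cumulative:
        return []
    increments = [cumulative[0]]
    # Pair each value with its successor and take the difference.
    for previous, current in zip(cumulative, cumulative[1:]):
        increments.append(current - previous)
    return increments


# Hypothetical cumulative case counts for six consecutive days.
example_cumulative = [1, 1, 2, 5, 5, 11]
print(daily_from_cumulative(example_cumulative))
```

    A negative increment would indicate that a cumulative figure was revised downward between two dates, so it may be worth checking for those before summing or plotting.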

  4. USA Bureau of Labor Statistics

    • kaggle.com
    zip
    Updated Aug 30, 2019
    Cite
    US Bureau of Labor Statistics (2019). USA Bureau of Labor Statistics [Dataset]. https://www.kaggle.com/bls/bls
    Available download formats: zip (0 bytes)
    Dataset updated
    Aug 30, 2019
    Dataset provided by
    Bureau of Labor Statistics (http://www.bls.gov/)
    Authors
    US Bureau of Labor Statistics
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    The Bureau of Labor Statistics (BLS) is a unit of the United States Department of Labor. It is the principal fact-finding agency for the U.S. government in the broad field of labor economics and statistics and serves as a principal agency of the U.S. Federal Statistical System. The BLS is a governmental statistical agency that collects, processes, analyzes, and disseminates essential statistical data to the American public, the U.S. Congress, other Federal agencies, State and local governments, business, and labor representatives. Source: https://en.wikipedia.org/wiki/Bureau_of_Labor_Statistics

    Content

    Bureau of Labor Statistics including CPI (inflation), employment, unemployment, and wage data.

    Update Frequency: Monthly

    Querying BigQuery Tables

    Fork this kernel to get started.

    Acknowledgements

    https://bigquery.cloud.google.com/dataset/bigquery-public-data:bls

    https://cloud.google.com/bigquery/public-data/bureau-of-labor-statistics

    Dataset Source: http://www.bls.gov/data/

    This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://www.data.gov/privacy-policy#data_policy - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

    Banner Photo by Clark Young from Unsplash.

    Inspiration

    What is the average annual inflation across all US Cities? What was the monthly unemployment rate (U3) in 2016? What are the top 10 hourly-waged types of work in Pittsburgh, PA for 2016?

  5. Dataset for Science Hill, KY Census Bureau Demographics and Population...

    • neilsberg.com
    Updated Jul 24, 2024
    + more versions
    Cite
    Neilsberg Research (2024). Dataset for Science Hill, KY Census Bureau Demographics and Population Distribution Across Age // 2024 Edition [Dataset]. https://www.neilsberg.com/research/datasets/b7b31599-5460-11ee-804b-3860777c1fe6/
    Dataset updated
    Jul 24, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Kentucky, Science Hill
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the Science Hill population by age. The dataset can be utilized to understand the age distribution and demographics of Science Hill.

    Content

    The dataset comprises the following three datasets:

    • Science Hill, KY Age Group Population Dataset: A complete breakdown of Science Hill age demographics from 0 to 85 years, distributed across 18 age groups
    • Science Hill, KY Age Cohorts Dataset: Children, Working Adults, and Seniors in Science Hill - Population and Percentage Analysis
    • Science Hill, KY Population Pyramid Dataset: Age Groups, Male and Female Population, and Total Population for Demographics Analysis

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

  6. Data from: Database for the U.S. Geological Survey Woods Hole Science...

    • data.usgs.gov
    • s.cnmilf.com
    • +1 more
    Updated Jan 7, 2025
    + more versions
    Cite
    Brian Buczkowski (2025). Database for the U.S. Geological Survey Woods Hole Science Center's marine sediment samples, including locations, sample data and collection information (SED_ARCHIVE) [Dataset]. https://data.usgs.gov/datacatalog/data/USGS:5359d475-defb-4a2c-9226-906a99616be0
    Dataset updated
    Jan 7, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Authors
    Brian Buczkowski
    License

    U.S. Government Works (https://www.usa.gov/government-works)
    License information was derived automatically

    Time period covered
    2006
    Area covered
    Woods Hole
    Description

    The U.S. Geological Survey (USGS), Woods Hole Science Center (WHSC) has been an active member of the Woods Hole research community for over 40 years. In that time there have been many sediment collection projects conducted by USGS scientists and technicians for the research and study of seabed environments and processes. These samples are collected at sea or near shore and then brought back to the WHSC for study. While at the Center, samples are stored in ambient temperature, cold or freezing conditions, depending on the best mode of preparation for the study being conducted or the duration of storage planned for the samples. Recently, storage methods and available storage space have become a major concern at the WHSC. The shapefile sed_archive.shp gives a geographical view of the samples in the WHSC's collections and where they were collected, along with images and hyperlinks to useful resources.

  7. Data Scientist Job Market in the U.S.

    • kaggle.com
    Updated Aug 31, 2018
    Cite
    Shanshan Lu (2018). Data Scientist Job Market in the U.S. [Dataset]. https://www.kaggle.com/sl6149/data-scientist-job-market-in-the-us/activity
    Dataset updated
    Aug 31, 2018
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Shanshan Lu
    Area covered
    United States
    Description

    Context

    For those who are actively looking for data scientist jobs in the U.S., the best news this month is the LinkedIn Workforce Report August 2018. According to the report, there is a shortage of 151,717 people with data science skills, with particularly acute shortages in New York City, San Francisco Bay Area and Los Angeles.

    To help job hunters (including me) better understand the job market, I scraped the Indeed website and collected information on 7,000 data scientist jobs around the U.S. on August 3rd. The information that I collected is: Company Name, Position Name, Location, Job Description, and Number of Reviews of the Company.

    Content

    • alldata.csv: If you want to explore the job market around the U.S., download this one because it aggregates all information and cleans the job description by removing the tags.
    • All other files: If you want to explore a specific city or region, you can download any of them.

    Acknowledgements

    Special thanks to Indeed for not blocking me : )

    Inspiration

    • If you have no clue of where to start, check my blog for inspiration.
    • Link to my PowerPoint Slides for Presentation.
    • Link to my GitHub Code.
    • Reach me at sl6149@nyu.edu

    Possible Questions:

    1. Who gets hired? What kind of talent do employers want when they are hiring a data scientist?
    2. Which location has the most opportunities?
    3. What skills, tools, degrees or majors do employers want the most for data scientists?
    4. What's the difference between data scientist, data engineer and data analyst?
    5. Can you develop an efficient classification algorithm to differentiate the three job types above?
  8. 2025 Green Card Report for Data Scientist

    • myvisajobs.com
    Updated Jan 16, 2025
    + more versions
    Cite
    MyVisaJobs (2025). 2025 Green Card Report for Data Scientist [Dataset]. https://www.myvisajobs.com/reports/green-card/job-title/data-scientist/
    Dataset updated
    Jan 16, 2025
    Dataset provided by
    MyVisaJobs.com
    Authors
    MyVisaJobs
    License

    https://www.myvisajobs.com/terms-of-service/

    Variables measured
    Salary, Job Title, Petitions Filed
    Description

    A dataset that explores Green Card sponsorship trends, salary data, and employer insights for data scientists in the U.S.

  9. Data-Science-Instruct-Dataset

    • huggingface.co
    Updated May 3, 2025
    + more versions
    Cite
    Mohammed Habib Ahmed (2025). Data-Science-Instruct-Dataset [Dataset]. https://huggingface.co/datasets/HabibAhmed/Data-Science-Instruct-Dataset
    Dataset updated
    May 3, 2025
    Authors
    Mohammed Habib Ahmed
    Description

    HabibAhmed/Data-Science-Instruct-Dataset dataset hosted on Hugging Face and contributed by the HF Datasets community

  10. Model America: Data and Models for every U.S. Building

    • osti.gov
    • search.dataone.org
    Updated Apr 14, 2021
    + more versions
    Cite
    Southwest Urban Corridor Integrated Field Laboratory (SW-IFL) (2021). Model America: Data and Models for every U.S. Building [Dataset]. http://doi.org/10.15485/2283980
    Dataset updated
    Apr 14, 2021
    Dataset provided by
    Office of Science (http://www.er.doe.gov/)
    United States Department of Energy (http://energy.gov/)
    Southwest Urban Corridor Integrated Field Laboratory (SW-IFL)
    Area covered
    United States
    Description

    The 5-year goal of the “Model America” concept was to generate a model of every building in the United States. This data repository delivers on that goal with "Model America v1".

    Oak Ridge National Laboratory (ORNL) has developed the Automatic Building Energy Modeling (AutoBEM) software suite to process multiple types of data, extract building-specific descriptors, generate building energy models, and simulate them on High Performance Computing (HPC) resources. For more information, see AutoBEM-related publications (bit.ly/AutoBEM).

    There were 125,715,609 buildings detected in the United States. Of this number, 122,146,671 (97.2%) buildings resulted in a successful generation and simulation of a building energy model. This dataset includes the full 125 million buildings. Future updates may include additional buildings, data improvements, or other algorithmic model enhancements in "Model America v2".

    This dataset contains OSM and IDF zip files for every U.S. county. Each zip file contains the generated buildings from that county.

    The .csv input data contains the following data fields:

    1. ID - unique building ID
    2. Centroid - building center location in latitude/longitude (from Footprint2D)
    3. Footprint2D - building polygon of 2D footprint (lat1/lon1_lat2/lon2_...)
    4. State_abbr - state name
    5. Area - estimate of total conditioned floor area (ft2)
    6. Area2D - footprint area (ft2)
    7. Height - building height (ft)
    8. NumFloors - number of floors (above-grade)
    9. WWR_surfaces - percent of each facade (pair of points from Footprint2D) covered by fenestration/windows (average 14.5% for residential, 40% for commercial buildings)
    10. CZ - ASHRAE Climate Zone designation
    11. BuildingType - DOE prototype building designation (IECC=residential) as implemented by OpenStudio-standards
    12. Standard - building vintage

    This data is made free and openly available in hopes of stimulating any simulation-informed use case. Data is provided as-is with no warranties, express or implied, regarding fitness for a particular purpose.

    We wish to thank our sponsors which include Oak Ridge National Laboratory (ORNL) Laboratory Directed Research and Development (LDRD), U.S. Dept. of Energy’s (DOE) Building Technologies Office (BTO), Office of Electricity (OE), Biological and Environmental Research (BER), and National Nuclear Security Administration (NNSA).
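
    The Footprint2D field described above packs a polygon into one string of the form lat1/lon1_lat2/lon2_... A minimal sketch of unpacking such a string into coordinate pairs follows. It assumes only the delimiter pattern given in the field description; the function name and the sample coordinates are hypothetical and do not come from the dataset.

```python
from typing import List, Tuple


def parse_footprint2d(value: str) -> List[Tuple[float, float]]:
    """Split a 'lat1/lon1_lat2/lon2_...' string into (lat, lon) tuples.

    Assumes vertices are separated by '_' and that latitude and
    longitude within a vertex are separated by '/'.
    """
    points: List[Tuple[float, float]] = []
    for pair in value.strip().split("_"):
        if not pair:
            # Skip empty pieces, e.g. from an empty string or a trailing '_'.
            continue
        lat_text, lon_text = pair.split("/")
        points.append((float(lat_text), float(lon_text)))
    return points


# Hypothetical four-vertex footprint, for illustration only.
sample = "35.93/-84.31_35.93/-84.30_35.94/-84.30_35.94/-84.31"
for lat, lon in parse_footprint2d(sample):
    print(lat, lon)
```

    The same pairs of consecutive points are what the WWR_surfaces description refers to as facades, so a parsed list like this could be zipped with its successor to enumerate them.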

  11. Data from: MASTER: Airborne Science, Southwest US, May, 2011

    • catalog.data.gov
    • datasets.ai
    • +6 more
    Updated Jun 28, 2025
    + more versions
    Cite
    ORNL_DAAC (2025). MASTER: Airborne Science, Southwest US, May, 2011 [Dataset]. https://catalog.data.gov/dataset/master-airborne-science-southwest-us-may-2011-7dd49
    Dataset updated
    Jun 28, 2025
    Dataset provided by
    Oak Ridge National Laboratory Distributed Active Archive Center
    Description

    This dataset includes Level 1B (L1B) and Level 2 (L2) data products from the MODIS/ASTER Airborne Simulator (MASTER) instrument. The spectral data were collected during five flights aboard a NASA ER-2 aircraft over southwestern U.S., from 2011-05-15 to 2011-05-23. This deployment was coordinated by NASA's Dryden Flight Research Center (DRFC), renamed Armstrong Flight Research Center in 2014, located in Edwards, California. Data products include L1B georeferenced multispectral imagery of calibrated radiance in 50 bands covering wavelengths of 0.460 to 12.879 micrometers at approximately 50-meter spatial resolution. Derived L2 data products are emissivity in 5 bands in thermal infrared range (8.58 to 12.13 micrometers) and land surface temperature. The L1B file format is HDF-4, and L2 products are provided in ENVI and KMZ formats. In addition, the dataset includes the flight path, spectral band information, instrument configuration, ancillary notes, and summary information for each flight, and browse images derived from each L1B data file.

  12. Protected Areas Database of the United States (PAD-US) 3.0 - World Database...

    • data.usgs.gov
    • catalog.data.gov
    Updated May 16, 2023
    + more versions
    Cite
    U.S. Geological Survey (USGS) Gap Analysis Project (GAP) (2023). Protected Areas Database of the United States (PAD-US) 3.0 - World Database on Protected Areas (WDPA) Submission [Dataset]. http://doi.org/10.5066/P9PSRGH4
    Dataset updated
    May 16, 2023
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Authors
    U.S. Geological Survey (USGS) Gap Analysis Project (GAP)
    License

    U.S. Government Works (https://www.usa.gov/government-works)
    License information was derived automatically

    Time period covered
    2008 - 2022
    Area covered
    United States
    Description

    The United States Geological Survey (USGS) - Science Analytics and Synthesis (SAS) - Gap Analysis Project (GAP) manages the Protected Areas Database of the United States (PAD-US), an Arc10x geodatabase, that includes a full inventory of areas dedicated to the preservation of biological diversity and to other natural, recreation, historic, and cultural uses, managed for these purposes through legal or other effective means (www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/science/protected-areas). The PAD-US is developed in partnership with many organizations, including coordination groups at the [U.S.] Federal level, lead organizations for each State, and a number of national and other non-governmental organizations whose work is closely related to the PAD-US. Learn more about the USGS PAD-US partners program here: www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/science/pad-us-data-stewards. The United Nations Environmental Program - Worl ...

  13. Global Register of Introduced and Invasive Species - Alaska, United States...

    • gbif.org
    Updated Mar 1, 2023
    + more versions
    Cite
    Annie Simpson; Elizabeth Sellers (2023). Global Register of Introduced and Invasive Species - Alaska, United States (ver.2.0, 2022) [Dataset]. http://doi.org/10.5066/p9kfftod
    Dataset updated
    Mar 1, 2023
    Dataset provided by
    Global Biodiversity Information Facility (https://www.gbif.org/)
    Invasive Species Specialist Group ISSG
    Authors
    Annie Simpson; Elizabeth Sellers
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Time period covered
    Jan 1, 1775 - Oct 23, 2022
    Area covered
    Description

    Introduced (non-native) species that become established may eventually become invasive, so tracking introduced species provides a baseline for effective modeling of species trends and interactions, geospatially and temporally.

    The umbrella dataset, called United States Register of Introduced and Invasive Species (US-RIIS), is comprised of three lists, one each for Alaska (AK, with 545 records, this dataset), Hawaii (HI, with 5,628 records), and the conterminous (or lower 48) United States (L48, with 8,527 records). Each list includes introduced (non-native), established (reproducing) taxa that: are, or may become, invasive (harmful) in the locality; are not known to be harmful there; and/or have been used for biological control in the locality.

    To be included in the GRIIS-AK, a taxon must be non-native everywhere in the locality and established (reproducing) anywhere in the locality. Native pest species are not included.

    Each record has information on taxonomy, a vernacular name, establishment means designation (introduced unintentionally, or assisted colonization), degree of establishment (established, invasive, or widespread invasive), hybrid status, pathway of introduction (where available), habitat (where available), whether a biocontrol species, dates of introduction (where available; currently 77% of the records for Alaska), associated taxa (where applicable), native and introduced distributions (where available), and citations for the authoritative source(s) from which this information is drawn. The umbrella dataset US-RIIS builds on a previous dataset, A Comprehensive List of Non-Native Species Established in Three Major Regions of the U.S.: Version 3.0 (Simpson et al., 2020, https://doi.org/10.5066/p9e5k160).

    There are 14,700 records in the master list (USRIISv2_MasterList) and 12,571 unique scientific names. The list is derived from more than 5,800 authoritative sources (USRIISv2_AuthorityReferences) and was reviewed by (or based on input from) more than 30 taxonomic experts and invasive species scientists.

    Many thanks to these reviewers and contributors: Coauthors Pam Fuller (USGS Emeritus), Kevin Faccenda (University of Hawaii), Neal Evenhuis (Bishop Museum), Janis Matsunaga (Hawaii Department of Agriculture), and Matt Bowser (US-Fish and Wildlife Service); contributors Rachael Blake (data science), National Socio-Environmental Synthesis Center (SESYNC); M. Lourdes Chamorro (Curculionidae), USDA-ARS Entomology; Meghan C. Eyler (data reviewer), US Fish & Wildlife Service; Danielle Froelich (Hawaiian botany), SWCA Environmental Consultants; Thomas Henry (Heteroptera), USDA-ARS Entomology; Sam James (Annelida), Maharishi University; Nancy Khan (Hawaiian botany), Smithsonian Institution; Alex Konstantinov (Chrysomelidae), USDA-ARS Entomology; Andrew P. Landsman (Arachnida), National Park Service, C&O Canal National Historical Park; Christopher Lepczyk (Vertebrata), Auburn University; Sandy Liebhold (Coleoptera), USDA-FS; Steven Lingafelter (Cerambycidae), USDA-APHIS; Walter Meshaka (Herpetology), State Museum of Pennsylvania; Gary L. Miller (Aphididae), USDA-ARS Entomology; Allen Norrbom (Tephritidae), USDA-ARS Entomology; Shyama Pagad (global invasive species), IUCN SSC Invasive Species Specialists' Group; John Reynolds (Annelida), Oligochaetology Laboratory; Alexander Salazar (Lycosidae), Miami University, Ohio; Elizabeth A. Sellers (data manager), USGS; Derek Sikes (Alaskan invertebrates), University of Alaska; Bruce A. Snyder (Annelida), Georgia College and State University; Alma Solis (Pyralid moths), USDS-ARS at the Smithsonian Institution; Rebecca Turner (data manager), Scion Inc., New Zealand; Darrell Ubick (Arachnida), Cal Academy; Warren Wagner (Hawaiian botany), Smithsonian Institution; Mark Wetzel (Annelida), Illinois Natural History Survey; and James D. Young (Lepidoptera), USDA-APHIS-PPQ-PHP. Our apologies to the many contributing experts we may have inadvertently omitted.

  14. Advancing translational research in environmental science: The role and...

    • catalog.data.gov
    • s.cnmilf.com
    Updated Apr 12, 2021
    Cite
    U.S. EPA Office of Research and Development (ORD) (2021). Advancing translational research in environmental science: The role and impact of social science [Dataset]. https://catalog.data.gov/dataset/advancing-translational-research-in-environmental-science-the-role-and-impact-of-social-sc
    Dataset updated
    Apr 12, 2021
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    Our dataset consists of transcripts and codebooks for a focus group study. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. EPA cannot release CBI, or data protected by copyright, patent, or otherwise subject to trade secret restrictions. Request for access to CBI data may be directed to the dataset owner by an authorized person by contacting the party listed. It can be accessed through the following means: Contact Katie Williams, williams.kathleen@epa.gov. Format: The data are transcripts and protected by IRB approvals. This dataset is associated with the following publication: Eisenhauer, E., K. Williams, K. Margeson, S. Paczuski, K. Mulvaney, and M.C. Hano. Advancing translational research in environmental science: The role and impact of social science. Environmental Science & Policy. Elsevier Science Ltd, New York, NY, USA, 120: 165-172, (2021).

  15. Best Practices for Upholding Indigenous Data Sovereignty: Insights and...

    • dataone.org
    • search.dataone.org
    Updated May 31, 2024
    Cite
    C. Turner; A. Canino; H. M. Garcia; L. M. Divine; B. R. Robson; N. Parlato (2024). Best Practices for Upholding Indigenous Data Sovereignty: Insights and recommendations for funding entities, government agencies, philantrhopic instituations and researchers [Dataset]. http://doi.org/10.24431/rw1k8e1
    Explore at:
    Dataset updated
    May 31, 2024
    Dataset provided by
    Research Workspace
    Authors
    C. Turner; A. Canino; H. M. Garcia; L. M. Divine; B. R. Robson; N. Parlato
    Description

    This report is a PDF file of the best practices determined by participants from the Indigenous Sentinels Network, a community-driven network coordinated by the Tribal government of St. Paul Island, Aleut Community of St. Paul Island Ecosystem Conservation Office and Axiom Data Science. The intended audience is any team responsible for data governance in environmental or social data work, in particular funding agencies, government agencies, non-governmental organizations, individual researchers, and technology companies. Recommendations cover enhancing responsiveness and organizational capacity for funders and technology companies (Axiom Data Science is named), as well as specific technical strategies and enhancements for cyberinfrastructure (again, Axiom Data Science is named). Resources and a template data sharing agreement are included.

  16. 1 Arc-second Digital Elevation Models (DEMs) - USGS National Map 3DEP...

    • catalog.data.gov
    • data.usgs.gov
    • +1 more
    Updated Mar 11, 2025
    Cite
    U.S. Geological Survey (2025). 1 Arc-second Digital Elevation Models (DEMs) - USGS National Map 3DEP Downloadable Data Collection [Dataset]. https://catalog.data.gov/dataset/1-arc-second-digital-elevation-models-dems-usgs-national-map-3dep-downloadable-data-collec
    Explore at:
    Dataset updated
    Mar 11, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Description

    This is a tiled collection of the 3D Elevation Program (3DEP) and is 1 arc-second (approximately 30 m) resolution. The elevations in this Digital Elevation Model (DEM) represent the topographic bare-earth surface. The 3DEP data holdings serve as the elevation layer of The National Map, and provide foundational elevation information for earth science studies and mapping applications in the United States. Scientists and resource managers use 3DEP data for hydrologic modeling, resource monitoring, mapping and visualization, and many other applications. The seamless 1 arc-second DEM layers are derived from diverse source data that are processed to a common coordinate system and unit of vertical measure. These data are distributed in geographic coordinates in units of decimal degrees, and in conformance with the North American Datum of 1983 (NAD 83). All elevation values are in meters and, over the continental United States, are referenced to the North American Vertical Datum of 1988 (NAVD88). The seamless 1 arc-second DEM layer provides coverage of the conterminous United States, Hawaii, Puerto Rico, other territorial islands, and much of Alaska and Canada. The seamless 1 arc-second DEM is available as pre-staged current and historical products tiled in GeoTIFF format. The seamless 1 arc-second DEM layer is updated continually as new data become available in the current folder. Previously created 1 degree blocks are retained in the historical folder with an appended date suffix (YYYYMMDD) when they were produced. Other 3DEP products are nationally seamless DEMs in resolutions of 1 and 1/3 arc-second. These seamless DEMs were referred to as the National Elevation Dataset (NED) from about 2000 through 2015 at which time they became the seamless DEM layers under the 3DEP program and the NED name and system were retired. 
    Other 3DEP products include one-meter DEMs produced exclusively from high resolution light detection and ranging (lidar) source data and five-meter DEMs in Alaska as well as various source datasets including the lidar point cloud and interferometric synthetic aperture radar (Ifsar) digital surface models and intensity images. All 3DEP products are public domain.
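
The "1 arc-second (approximately 30 m)" figure quoted in the description can be checked with a short calculation. The following is a minimal Python sketch and is not part of the USGS documentation: it assumes a spherical Earth with a mean radius of 6,371 km, and the function name `arcsecond_ground_size` is hypothetical.

```python
import math
from typing import Tuple

# Assumption: spherical Earth with the commonly used mean radius.
EARTH_RADIUS_M = 6_371_000.0


def arcsecond_ground_size(latitude_deg: float, arcseconds: float = 1.0) -> Tuple[float, float]:
    """Return (north_south_m, east_west_m) covered by the given angle.

    The north-south span is the arc length of the angle on the sphere.
    The east-west span shrinks with the cosine of the latitude because
    meridians converge toward the poles.
    """
    angle_rad = math.radians(arcseconds / 3600.0)  # 1 degree = 3600 arc-seconds
    north_south_m = EARTH_RADIUS_M * angle_rad
    east_west_m = north_south_m * math.cos(math.radians(latitude_deg))
    return north_south_m, east_west_m


if __name__ == "__main__":
    for lat in (0.0, 40.0, 60.0):
        ns, ew = arcsecond_ground_size(lat)
        print(f"lat {lat:>4.0f} deg: north-south {ns:.1f} m, east-west {ew:.1f} m")
```

Under this assumption, one arc-second spans roughly 30.9 m north-south at any latitude, while the east-west extent of a cell narrows toward the poles. This is one reason a grid distributed in decimal degrees does not have square cells on the ground.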

  17. Redfin usa properties dataset

    • crawlfeeds.com
    csv, zip
    Updated Jun 13, 2025
    Cite
    Crawl Feeds (2025). Redfin usa properties dataset [Dataset]. https://crawlfeeds.com/datasets/redfin-usa-properties-dataset
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Jun 13, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Area covered
    United States
    Description

    Explore the Redfin USA Properties Dataset, available in CSV format. This extensive dataset provides valuable insights into the U.S. real estate market, including detailed property listings, prices, property types, and more across various states and cities. Perfect for those looking to conduct in-depth market analysis, real estate investment research, or financial forecasting.

    Key Features:

    • Comprehensive Property Data: Includes essential details such as listing prices, property types, square footage, and the number of bedrooms and bathrooms.
    • Geographic Coverage: Encompasses a wide range of U.S. states and cities, providing a broad view of the national real estate market.
    • Historical Trends: Analyze past market data to understand price movements, regional differences, and market trends over time.
    • Geo-Location Details: Enables spatial analysis and mapping by including precise geographical coordinates of properties.

    Who Can Benefit From This Dataset:

    • Real Estate Investors: Identify lucrative opportunities by analyzing property values, market trends, and regional price variations.
    • Market Analysts: Gain a deeper understanding of the U.S. housing market dynamics to inform research and reporting.
    • Data Scientists and Researchers: Leverage detailed real estate data for modeling, urban studies, or economic analysis.
    • Financial Analysts: Utilize the dataset for financial modeling, helping to predict market behavior and assess investment risks.

    Download the Redfin USA Properties Dataset to access essential information on the U.S. housing market, ideal for professionals in real estate, finance, and data analytics. Unlock key insights to make informed decisions in a dynamic market environment.

    Looking for deeper insights or a custom data pull from Redfin?
    Send a request with just one click and explore detailed property listings, price trends, and housing data.
    🔗 Request Redfin Real Estate Data

  18. Median Household Income Variation by Family Size in Science Hill, KY:...

    • neilsberg.com
    csv, json
    Updated Jan 11, 2024
    Cite
    Neilsberg Research (2024). Median Household Income Variation by Family Size in Science Hill, KY: Comparative analysis across 7 household sizes [Dataset]. https://www.neilsberg.com/research/datasets/1b6b8902-73fd-11ee-949f-3860777c1fe6/
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Jan 11, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Kentucky, Science Hill
    Variables measured
    Household size, Median Household Income
    Measurement technique
    The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. It delineates income distributions across 7 household sizes (mentioned above) following an initial analysis and categorization. Using this dataset, you can find out how household income varies with the size of the family unit. For additional information about these estimations, please contact us via email at research@neilsberg.com
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset presents median household incomes for various household sizes in Science Hill, KY, as reported by the U.S. Census Bureau. The dataset highlights the variation in median household income with the size of the family unit, offering valuable insights into economic trends and disparities within different household sizes, aiding in data analysis and decision-making.

    Key observations

    • Of the 7 household sizes (1-person to 7-or-more-person households) reported by the Census Bureau, Science Hill did not include 1-, 5-, 6-, or 7-person households. Across the different household sizes in Science Hill, the mean income is $61,809 and the standard deviation is $19,517. The coefficient of variation (CV) is 31.58%. This high CV indicates high relative variability, suggesting that incomes vary considerably across household sizes.
    • In the most recent year, 2021, the smallest household size for which the Bureau reported a median household income was the 2-person household, with an income of $47,290. Median income rose to $83,996 for 4-person households, the largest household size for which the Bureau reported a median household income.
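
The coefficient of variation quoted in the key observations is the standard deviation divided by the mean, expressed as a percentage. The short Python sketch below reproduces that arithmetic from the two figures stated above; the function name `coefficient_of_variation` is illustrative and does not come from the dataset or from Neilsberg Research.

```python
def coefficient_of_variation(mean: float, std_dev: float) -> float:
    """Return the coefficient of variation as a percentage of the mean."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return std_dev / mean * 100.0


# Figures quoted in the key observations for Science Hill, KY.
mean_income = 61_809
std_dev_income = 19_517

cv_percent = coefficient_of_variation(mean_income, std_dev_income)
print(f"CV = {cv_percent:.2f}%")  # prints "CV = 31.58%"
```

Because the CV is a ratio, it is unit-free, which makes it convenient for comparing relative income spread between places with different income levels.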

    Figure: Science Hill, KY median household income, by household size (in 2022 inflation-adjusted dollars). https://i.neilsberg.com/ch/science-hill-ky-median-household-income-by-household-size.jpeg

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

    Household Sizes:

    • 1-person households
    • 2-person households
    • 3-person households
    • 4-person households
    • 5-person households
    • 6-person households
    • 7-or-more-person households

    Variables / Data Columns

    • Household Size: This column lists the 7 household sizes, ranging from 1-person households to 7-or-more-person households (as listed above).
    • Median Household Income: Median household income, in 2022 inflation-adjusted dollars, for the specific household size.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to ask about the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Science Hill median household income. You can refer to it here

  19. Global advanced analytics and data science software market share 2025

    • statista.com
    Updated Jun 30, 2025
    Cite
    Statista (2025). Global advanced analytics and data science software market share 2025 [Dataset]. https://www.statista.com/statistics/1258535/advanced-analytics-data-science-market-share-technology-worldwide/
    Explore at:
    Dataset updated
    Jun 30, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2025
    Area covered
    Worldwide
    Description

    MATLAB led the global advanced analytics and data science software industry in 2025 with a market share of ***** percent. First launched in 1984, MATLAB is developed by the U.S. firm MathWorks.

  20. U.S. Geological Survey Oceanographic Time Series Data Collection

    • data.amerigeoss.org
    • data.usgs.gov
    • +4 more
    xml
    Updated Aug 12, 2022
    Cite
    United States (2022). U.S. Geological Survey Oceanographic Time Series Data Collection [Dataset]. https://data.amerigeoss.org/dataset/u-s-geological-survey-oceanographic-time-series-data-collection-6fd4d
    Explore at:
    Available download formats: xml
    Dataset updated
    Aug 12, 2022
    Dataset provided by
    United States
    Description

    The oceanographic time series data collected by U.S. Geological Survey scientists and collaborators are served in an online database at http://stellwagen.er.usgs.gov/index.html. These data were collected as part of research experiments investigating circulation and sediment transport in the coastal ocean. The experiments (projects, research programs) are typically one month to several years long and have been carried out since 1975. New experiments will be conducted, and the data from them will be added to the collection. As of 2016, all but one of the experiments were conducted in waters abutting the U.S. coast; the exception was conducted in the Adriatic Sea. Measurements acquired vary by site and experiment; they usually include current velocity, wave statistics, water temperature, salinity, pressure, turbidity, and light transmission from one or more depths over a time period. The measurements are concentrated near the sea floor but may also include data from the water column. The user interface provides an interactive map, a tabular summary of the experiments, and a separate page for each experiment. Each experiment page has documentation and maps that provide details of what data were collected at each site. Links to related publications with additional information about the research are also provided. The data are stored in Network Common Data Format (netCDF) files using the Equatorial Pacific Information Collection (EPIC) conventions defined by the National Oceanic and Atmospheric Administration (NOAA) Pacific Marine Environmental Laboratory. NetCDF is a general, self-documenting, machine-independent, open source data format created and supported by the University Corporation for Atmospheric Research (UCAR). EPIC is an early set of standards designed to allow researchers from different organizations to share oceanographic data. The files may be downloaded or accessed online using the Open-source Project for a Network Data Access Protocol (OPeNDAP). 
    The OPeNDAP framework allows users to access data from anywhere on the Internet using a variety of Web services including Thematic Realtime Environmental Distributed Data Services (THREDDS). A subset of the data compliant with the Climate and Forecast convention (CF, currently version 1.6) is also available.

2023 Data Scientists Jobs Descriptions

An Insight into the Job Market for Data Scientist in the United States

