100+ datasets found
  1. Types of unique data points collection in selected iOS fitness apps 2024

    • statista.com
    Updated Feb 25, 2025
    Cite
    Statista (2025). Types of unique data points collection in selected iOS fitness apps 2024 [Dataset]. https://www.statista.com/statistics/1559485/collection-and-tracking-ios-fitness-apps/
    Explore at:
    Dataset updated
    Feb 25, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Dec 30, 2024
    Area covered
    Worldwide
    Description

    In 2024, the fitness app Strava had the largest number of collected data points linked to its users. Of the 21 collected data types, 20 were linked to users' identity, while two data points could potentially be used to track users. The Nike Training Club app was found to collect four data points that could potentially be used to track users. Fitbit, Future Personal Training, and Fitness by Apple did not list any data points that could potentially track users.

  2. Altosight | AI Custom Web Scraping Data | 100% Global | Free Unlimited Data...

    • datarade.ai
    .json, .csv, .xls
    Updated Sep 7, 2024
    Cite
    Altosight (2024). Altosight | AI Custom Web Scraping Data | 100% Global | Free Unlimited Data Points | Bypassing All CAPTCHAs & Blocking Mechanisms | GDPR Compliant [Dataset]. https://datarade.ai/data-products/altosight-ai-custom-web-scraping-data-100-global-free-altosight
    Explore at:
    Available download formats: .json, .csv, .xls
    Dataset updated
    Sep 7, 2024
    Dataset authored and provided by
    Altosight
    Area covered
    Chile, Svalbard and Jan Mayen, Tajikistan, Guatemala, Singapore, Côte d'Ivoire, Greenland, Wallis and Futuna, Paraguay, Czech Republic
    Description

    Altosight | AI Custom Web Scraping Data

    ✦ Altosight provides global web scraping data services with AI-powered technology that bypasses CAPTCHAs and blocking mechanisms and handles dynamic content.

    We extract data from marketplaces like Amazon, aggregators, e-commerce, and real estate websites, ensuring comprehensive and accurate results.

    ✦ Our solution offers free unlimited data points across any project, with no additional setup costs.

    We deliver data through flexible methods such as API, CSV, JSON, and FTP, all at no extra charge.

    ― Key Use Cases ―

    ➤ Price Monitoring & Repricing Solutions

    🔹 Automatic repricing, AI-driven repricing, and custom repricing rules
    🔹 Receive price suggestions via API or CSV to stay competitive
    🔹 Track competitors in real-time or at scheduled intervals

    ➤ E-commerce Optimization

    🔹 Extract product prices, reviews, ratings, images, and trends
    🔹 Identify trending products and enhance your e-commerce strategy
    🔹 Build dropshipping tools or marketplace optimization platforms with our data

    ➤ Product Assortment Analysis

    🔹 Extract the entire product catalog from competitor websites
    🔹 Analyze product assortment to refine your own offerings and identify gaps
    🔹 Understand competitor strategies and optimize your product lineup

    ➤ Marketplaces & Aggregators

    🔹 Crawl entire product categories and track best-sellers
    🔹 Monitor position changes across categories
    🔹 Identify which eRetailers sell specific brands and which SKUs for better market analysis

    ➤ Business Website Data

    🔹 Extract detailed company profiles, including financial statements, key personnel, industry reports, and market trends, enabling in-depth competitor and market analysis

    🔹 Collect customer reviews and ratings from business websites to analyze brand sentiment and product performance, helping businesses refine their strategies

    ➤ Domain Name Data

    🔹 Access comprehensive data, including domain registration details, ownership information, expiration dates, and contact information. Ideal for market research, brand monitoring, lead generation, and cybersecurity efforts

    ➤ Real Estate Data

    🔹 Access property listings, prices, and availability
    🔹 Analyze trends and opportunities for investment or sales strategies

    ― Data Collection & Quality ―

    ► Publicly Sourced Data: Altosight collects web scraping data from publicly available websites, online platforms, and industry-specific aggregators

    ► AI-Powered Scraping: Our technology handles dynamic content, JavaScript-heavy sites, and pagination, ensuring complete data extraction

    ► High Data Quality: We clean and structure unstructured data, ensuring it is reliable, accurate, and delivered in formats such as API, CSV, JSON, and more

    ► Industry Coverage: We serve industries including e-commerce, real estate, travel, finance, and more. Our solution supports use cases like market research, competitive analysis, and business intelligence

    ► Bulk Data Extraction: We support large-scale data extraction from multiple websites, allowing you to gather millions of data points across industries in a single project

    ► Scalable Infrastructure: Our platform is built to scale with your needs, allowing seamless extraction for projects of any size, from small pilot projects to ongoing, large-scale data extraction

    ― Why Choose Altosight? ―

    ✔ Unlimited Data Points: Altosight offers unlimited free attributes, meaning you can extract as many data points from a page as you need without extra charges

    ✔ Proprietary Anti-Blocking Technology: Altosight utilizes proprietary techniques to bypass blocking mechanisms, including CAPTCHAs, Cloudflare, and other obstacles. This ensures uninterrupted access to data, no matter how complex the target websites are

    ✔ Flexible Across Industries: Our crawlers easily adapt across industries, including e-commerce, real estate, finance, and more. We offer customized data solutions tailored to specific needs

    ✔ GDPR & CCPA Compliance: Your data is handled securely and ethically, ensuring compliance with GDPR, CCPA and other regulations

    ✔ No Setup or Infrastructure Costs: Start scraping without worrying about additional costs. We provide a hassle-free experience with fast project deployment

    ✔ Free Data Delivery Methods: Receive your data via API, CSV, JSON, or FTP at no extra charge. We ensure seamless integration with your systems

    ✔ Fast Support: Our team is always available via phone and email, resolving over 90% of support tickets within the same day

    ― Custom Projects & Real-Time Data ―

    ✦ Tailored Solutions: Every business has unique needs, which is why Altosight offers custom data projects. Contact us for a feasibility analysis, and we’ll design a solution that fits your goals

    ✦ Real-Time Data: Whether you need real-time data delivery or scheduled updates, we provide the flexibility to receive data when you need it. Track price changes, monitor product trends, or gather...

  3. Types of unique data points collection in selected iOS weight loss apps 2025...

    • statista.com
    Updated Feb 26, 2025
    Cite
    Statista (2025). Types of unique data points collection in selected iOS weight loss apps 2025 [Dataset]. https://www.statista.com/statistics/1559523/collection-and-tracking-ios-nutrition-apps/
    Explore at:
    Dataset updated
    Feb 26, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 8, 2025
    Area covered
    Worldwide
    Description

    In 2024, the Calorie Counter app had the largest number of collected data points possibly linked to user identity. Of the 22 collected data types, 20 were linked to users' identity, while seven data points could potentially be used to track users. The calorie-counting app Eato did not list any collected data types that could potentially be used to track users. The iOS mobile app for the Weight Watchers program collected seven data points that were not linked to users.

  4. Heidelberg Tributary Loading Program (HTLP) Dataset

    • zenodo.org
    • explore.openaire.eu
    • +1more
    bin, png
    Updated Jul 16, 2024
    Cite
    NCWQR; NCWQR (2024). Heidelberg Tributary Loading Program (HTLP) Dataset [Dataset]. http://doi.org/10.5281/zenodo.6606950
    Explore at:
    Available download formats: bin, png
    Dataset updated
    Jul 16, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    NCWQR; NCWQR
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is updated more frequently and can be visualized on NCWQR's data portal.

    If you have any questions, please contact Dr. Laura Johnson or Dr. Nathan Manning.

    The National Center for Water Quality Research (NCWQR) is a research laboratory at Heidelberg University in Tiffin, Ohio, USA. Our primary research program is the Heidelberg Tributary Loading Program (HTLP), where we currently monitor water quality at 22 river locations throughout Ohio and Michigan, effectively covering ~half of the land area of Ohio. The goal of the program is to accurately measure the total amounts (loads) of pollutants exported from watersheds by rivers and streams. Thus these data are used to assess different sources (nonpoint vs point), forms, and timing of pollutant export from watersheds. The HTLP officially began with high-frequency monitoring for sediment and nutrients from the Sandusky and Maumee rivers in 1974, and has continually expanded since then.
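    The "loads" the program measures are concentrations multiplied by river discharge and integrated over time. A minimal sketch of that arithmetic, using made-up numbers rather than actual HTLP data:

```python
# Illustrative load arithmetic (hypothetical values, not HTLP data):
# a load is concentration times discharge integrated over time.
# Since 1 mg/L == 1 g/m^3, conc * Q gives grams per second.

def load_kg(conc_mg_per_l, discharge_m3_per_s, seconds):
    """Pollutant load in kilograms over the given interval."""
    return conc_mg_per_l * discharge_m3_per_s * seconds / 1000.0

# Three samples per day at 8-hour intervals, as in the current HTLP design
samples = [(0.12, 35.0), (0.15, 42.0), (0.10, 33.0)]  # (TP mg/L, Q m^3/s)
eight_hours = 8 * 3600
daily_total = sum(load_kg(c, q, eight_hours) for c, q in samples)
print(round(daily_total, 2))
```

    Real load programs use more careful flow-weighting than this equal-interval sum, but the unit conversion is the core of the calculation.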

    Each station where samples are collected for water quality is paired with a US Geological Survey gage for quantifying discharge (http://waterdata.usgs.gov/usa/nwis/rt). Our stations cover a wide range of watershed areas upstream of the sampling point from 11.0 km2 for the unnamed tributary to Lost Creek to 19,215 km2 for the Muskingum River. These rivers also drain a variety of land uses, though a majority of the stations drain over 50% row-crop agriculture.

    At most sampling stations, submersible pumps located on the stream bottom continuously pump water into sampling wells inside heated buildings where automatic samplers collect discrete samples (4 unrefrigerated samples/d at 6-h intervals, 1974–1987; 3 refrigerated samples/d at 8-h intervals, 1988-current). At weekly intervals the samples are returned to the NCWQR laboratories for analysis. When samples either have high turbidity from suspended solids or are collected during high flow conditions, all samples for each day are analyzed. As stream flows and/or turbidity decreases, analysis frequency shifts to one sample per day. At the River Raisin and Muskingum River, a cooperator collects a grab sample from a bridge at or near the USGS station approximately daily and all samples are analyzed. Each sample bottle contains sufficient volume to support analyses of total phosphorus (TP), dissolved reactive phosphorus (DRP), suspended solids (SS), total Kjeldahl nitrogen (TKN), ammonium-N (NH4), nitrate-N and nitrite-N (NO2+3), chloride, fluoride, and sulfate. Nitrate and nitrite are commonly added together when presented; henceforth we refer to the sum as nitrate.

    Upon return to the laboratory, all water samples are analyzed within 72h for the nutrients listed below using standard EPA methods. For dissolved nutrients, samples are filtered through a 0.45 um membrane filter prior to analysis. We currently use a Seal AutoAnalyzer 3 for DRP, silica, NH4, TP, and TKN colorimetry, and a DIONEX Ion Chromatograph with AG18 and AS18 columns for anions. Prior to 2014, we used a Seal TRAACs for all colorimetry.

    2017 Ohio EPA Project Study Plan and Quality Assurance Plan

    Project Study Plan

    Quality Assurance Plan

    Data quality control and data screening

    The data provided in the River Data files have all been screened by NCWQR staff. The purpose of the screening is to remove outliers that staff deem likely to reflect sampling or analytical errors rather than outliers that reflect the real variability in stream chemistry. Often, in the screening process, the causes of the outlier values can be determined and appropriate corrective actions taken. These may involve correction of sample concentrations or deletion of those data points.

    This micro-site contains data for approximately 126,000 water samples collected beginning in 1974. We cannot guarantee that each data point is free from sampling bias/error, analytical errors, or transcription errors. However, since its beginnings, the NCWQR has operated a substantial internal quality control program and has participated in numerous external quality control reviews and sample exchange programs. These programs have consistently demonstrated that data produced by the NCWQR is of high quality.

    A note on detection limits and zero and negative concentrations

    It is routine practice in analytical chemistry to determine method detection limits and/or limits of quantitation, below which analytical results are considered less reliable or unreliable. This is something that we also do as part of our standard procedures. Many laboratories, especially those associated with agencies such as the U.S. EPA, do not report individual values that are less than the detection limit, even if the analytical equipment returns such values. This is in part because as individual measurements they may not be considered valid under litigation.

    The measured concentration consists of the true but unknown concentration plus random instrument error, which is usually small compared to the range of expected environmental values. In a sample for which the true concentration is very small, perhaps even essentially zero, it is possible to obtain an analytical result of 0 or even a small negative concentration. Results of this sort are often “censored” and replaced with the statement “

    Censoring these low values creates a number of problems for data analysis. How do you take an average? If you leave out these numbers, you get a biased result because you did not toss out any other (higher) values. Even if you replace negative concentrations with 0, a bias ensues, because you’ve chopped off some portion of the lower end of the distribution of random instrument error.

    For these reasons, we do not censor our data. Values of -9 and -1 are used as missing value codes, but all other negative and zero concentrations are actual, valid results. Negative concentrations make no physical sense, but they make analytical and statistical sense. Users should be aware of this, and if necessary make their own decisions about how to use these values. Particularly if log transformations are to be used, some decision on the part of the user will be required.
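    The averaging bias described above is easy to demonstrate with a short simulation (the noise level here is an assumption for illustration, not the NCWQR's actual instrument error):

```python
import random

# Simulate a sample whose true concentration is essentially zero,
# measured with symmetric instrument noise.
random.seed(42)
true_conc = 0.0
measured = [true_conc + random.gauss(0.0, 0.05) for _ in range(10_000)]

mean_raw = sum(measured) / len(measured)                      # negatives kept
mean_censored = sum(max(x, 0.0) for x in measured) / len(measured)

# The raw mean stays near the true value; replacing negatives with 0
# removes the lower tail of the error distribution and biases the mean high.
print(round(mean_raw, 4), round(mean_censored, 4))
```

    With this dataset, the only values to drop before averaging are the -9 and -1 missing-value codes; all other zero and negative concentrations should be kept.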

    Analyte Detection Limits

    https://ncwqr.files.wordpress.com/2021/12/mdl-june-2019-epa-methods.jpg?w=1024

    For more information, please visit https://ncwqr.org/

  5. [Superseded] Intellectual Property Government Open Data 2019

    • researchdata.edu.au
    • data.gov.au
    Updated Jun 6, 2019
    + more versions
    Cite
    IP Australia (2019). [Superseded] Intellectual Property Government Open Data 2019 [Dataset]. https://researchdata.edu.au/superseded-intellectual-property-data-2019/2994670
    Explore at:
    Dataset updated
    Jun 6, 2019
    Dataset provided by
    Data.gov (https://data.gov/)
    Authors
    IP Australia
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    What is IPGOD?

    The Intellectual Property Government Open Data (IPGOD) includes over 100 years of registry data on all intellectual property (IP) rights administered by IP Australia. It also has derived information about the applicants who filed these IP rights, to allow for research and analysis at the regional, business and individual level. This is the 2019 release of IPGOD.

    How do I use IPGOD?

    IPGOD is large, with millions of data points across up to 40 tables, making the files too large to open with Microsoft Excel. Furthermore, analysis often requires information from separate tables, which calls for specialised software for merging. We recommend that advanced users interact with the IPGOD data using the right tools with enough memory and compute power. This includes a wide range of programming and statistical software such as Tableau, Power BI, Stata, SAS, R, Python, and Scala.
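    Any of the environments above can handle the merging. As one lightweight, standard-library option in Python, sqlite3 can join tables far too large for Excel. The table and column names below are hypothetical, not the actual IPGOD schema; in practice you would bulk-load the CSVs into the database first.

```python
import sqlite3

# Hypothetical IPGOD-style tables, created in memory for illustration.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE applications (appl_id TEXT, filing_date TEXT)")
con.execute("CREATE TABLE applicants (appl_id TEXT, applicant_name TEXT)")
con.executemany("INSERT INTO applications VALUES (?, ?)",
                [("2019900001", "2019-01-15"), ("2019900002", "2019-02-01")])
con.executemany("INSERT INTO applicants VALUES (?, ?)",
                [("2019900001", "Acme Pty Ltd")])

# A LEFT JOIN keeps applications that have no matched applicant record.
rows = con.execute("""
    SELECT a.appl_id, a.filing_date, p.applicant_name
    FROM applications a
    LEFT JOIN applicants p USING (appl_id)
    ORDER BY a.appl_id
""").fetchall()
print(rows)
```

    SQLite processes the join on disk (or in memory) rather than loading whole tables into a spreadsheet, which is what makes multi-gigabyte table merges feasible on a modest machine.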

    IP Data Platform

    IP Australia is also providing free trials of a cloud-based analytics platform with the capabilities to enable working with large intellectual property datasets, such as IPGOD, through the web browser, without any installation of software. IP Data Platform

    References

    The following pages can help you gain an understanding of intellectual property administration and processes in Australia to support your analysis of the dataset.

    * Patents
    * Trade Marks
    * Designs
    * Plant Breeder's Rights

    Updates

    Tables and columns

    Due to the changes in our systems, some tables have been affected.

    * We have added IPGOD 225 and IPGOD 325 to the dataset!
    * The IPGOD 206 table is not available this year.
    * Many tables have been re-built, and as a result may have different columns or different possible values. Please check the data dictionary for each table before use.

    Data quality improvements

    Data quality has been improved across all tables.

    * Null values are simply empty rather than '31/12/9999'.
    * All date columns are now in ISO format 'yyyy-mm-dd'.
    * All indicator columns have been converted to Boolean data type (True/False) rather than Yes/No, Y/N, or 1/0.
    * All tables are encoded in UTF-8.
    * All tables use the backslash \ as the escape character.
    * The applicant name cleaning and matching algorithms have been updated. We believe that this year's method improves the accuracy of the matches. Please note that the "ipa_id" generated in IPGOD 2019 will not match with those in previous releases of IPGOD.
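    The conventions above can be handled directly with Python's standard csv module. A sketch using hypothetical column names (not the actual IPGOD data dictionary):

```python
import csv
import io
from datetime import date

# Inline sample following the stated IPGOD 2019 conventions: UTF-8,
# ISO 'yyyy-mm-dd' dates, True/False indicators, empty-string nulls,
# and backslash as the escape character. Column names are hypothetical.
sample = "appl_id,filing_date,is_pct\n2019900001,2019-01-15,True\n2019900002,,False\n"
reader = csv.DictReader(io.StringIO(sample), escapechar="\\")

rows = []
for r in reader:
    rows.append({
        "appl_id": r["appl_id"],
        # Nulls are empty strings, not sentinel dates like '31/12/9999'.
        "filing_date": date.fromisoformat(r["filing_date"]) if r["filing_date"] else None,
        # Indicators are the literal strings 'True'/'False'.
        "is_pct": r["is_pct"] == "True",
    })
print(rows)
```

    For real files, open them with `encoding="utf-8"` and pass the same `escapechar="\\"` to match the published escaping.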

  6. Delaware River and Upper Bay Sediment Data

    • catalog.data.gov
    • fisheries.noaa.gov
    Updated Oct 31, 2024
    + more versions
    Cite
    NOAA Office for Coastal Management (Point of Contact, Custodian) (2024). Delaware River and Upper Bay Sediment Data [Dataset]. https://catalog.data.gov/dataset/delaware-river-and-upper-bay-sediment-data1
    Explore at:
    Dataset updated
    Oct 31, 2024
    Dataset provided by
    National Oceanic and Atmospheric Administration (http://www.noaa.gov/)
    Area covered
    Delaware River
    Description

    The area of coverage consists of 192 square miles of benthic habitat mapped from 2005 to 2007 in the Delaware River and Upper Delaware Bay. The bottom sediment map was constructed through the use of a Roxann Seabed Classification System and extensive sediment grab samples. Data was collected in a gridded trackline configuration, with trackline spacing of 100 meters parallel to the shoreline and 200 meters perpendicular to the shoreline. This project is an extension of the work currently being performed in Delaware waters by DNREC's Delaware Coastal Program's Delaware Bay Benthic Mapping Project.

    The bottom sediment point data, which have been classified according to the existing benthic mapping Roxann box plot, are converted from a number that categorizes each point according to its corresponding box (in the Roxann) into a number that reflects the sediment properties of each box in relation to one another. A ranking scale is used to allow a statistical gridding scheme to interpolate between sediment data points, while minimizing erroneous sediment classifications and allowing gradational sediment deposits to be gridded. A ranking scale from 0 to 28 was used for this project, with 0 representing the finest grained classifications (fluidized clay) and 28 representing the coarsest grained classifications (dense shell material). Table 1 illustrates the distribution of sediment classifications along the ranking scale, which takes into account the relation of sediment types and grain sizes to one another using both the Wentworth Scale and Shepard's classification system. Finer grains, such as clays and silts, are more similar in their deposition environments because they reflect similar current regimes, sorting, and reworking patterns (Poppe et al., 2003). While coarse sediments are much more dissimilar to finer grains with respect to current velocities, sorting, and winnowing, the finer grains are much more closely related in their sediment diameters than the coarser grains as you increase in Phi size and/or diameter. This accounts for the close clustering of coarse-grained deposit descriptions at the upper end of the ranking scale, while the finer-grained sediments show a gradation as you move up the rating scale.

    The bottom sediment data is gridded in Surfer 8, a surface and terrain modeling program, using block kriging and a nugget effect. This statistical gridding technique estimates the average value of a variable within a prescribed local area (Isaaks and Srivastava, 1989). Block kriging uses the existing point data values, weighted according to their proximity to the point being estimated, to discretize the local area into an array of estimated data value points, and then averages those individual point estimates together to get an average estimated value over the area of interest (Isaaks and Srivastava, 1989). A variogram is constructed for the data, and the resultant spatial model developed from the variogram is used in the block kriging surface model to more accurately interpolate the sediment data. The fitted model was a nugget effect (with an error variance of 21.8%) and a linear model (with a slope of 0.00286 and an anisotropy of 1, which represents a complete lack of spatial correlation). The accuracy of the estimation depends upon the grid size of the area of interpolation, the size of each cell within the grid, and the number of discretized data points needed to estimate the cells within that grid spacing. The grid size used to interpolate the bottom sediment maps was 442 lines x 454 lines, with a cell size of 44.93 m2.

    The nugget effect is added to allow the gridding to assume there is very little, if any, lateral correlation or trend within the bottom sediment (Isaaks and Srivastava, 1989). The nugget effect model entails a complete lack of spatial correlation; the point data values at any particular location bear no similarity even to adjacent data values (Isaaks and Srivastava, 1989). Without the nugget effect, the gridding would assume that you could only have a linear progression of sediment types and would insert all the sediment types along the scale between two sediment types (i.e., silty fine to medium sands and fine to medium sand with varying amounts of pebbles would be inserted between fine sand and coarse sand), even though that is not what is occurring along the bottom. The sediment data is gridded with no drift for the data interpolation, also helping to minimize erroneous classifications.

    Sediment Classification Ranking

    Ranking: Sediment Description
    0-1, 1-2: Clay
    2-3, 3-4, 4-5, 5-6, 6-7: Silt
    7-8, 8-9: Sandy Silts
    9-10, 10-11: Fine Sand
    11-12, 12-13: Silty Fine to Medium Sands
    13-14: Silty Medium Sand
    14-15, 15-16: Fine to Medium Sand
    16-17, 17-18: Fine to Medium Sand with abundant shell material and/or pebbles
    18-19, 19-20: Coarse Sand with varying amounts of pebbles
    20-21, 21-22, 22-23: Moderate Shell Material/Sandy Pebbles
    23-24, 24-25, 25-26: Abundant Shell Material/Gravel
    26-27, 27-28: Dense Oyster Shell

    Original contact information: Contact Name: Bartholomew Wilson; Contact Org: Delaware DNREC Coastal Programs; Phone: 302-739-9283
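    For users mapping interpolated ranking values back to sediment descriptions, a minimal lookup sketch in Python (the band groupings are inferred from the ranking table in this description; bands without their own label are folded into the next labelled class, which is an assumption):

```python
import bisect

# Upper bound of each labelled band on the 0-28 ranking scale.
band_upper = [2, 7, 9, 11, 13, 14, 16, 18, 20, 23, 26, 28]
band_label = [
    "Clay", "Silt", "Sandy Silts", "Fine Sand",
    "Silty Fine to Medium Sands", "Silty Medium Sand",
    "Fine to Medium Sand",
    "Fine to Medium Sand with abundant shell material and/or pebbles",
    "Coarse Sand with varying amounts of pebbles",
    "Moderate Shell Material/Sandy Pebbles",
    "Abundant Shell Material/Gravel", "Dense Oyster Shell",
]

def classify(rank):
    """Map an interpolated ranking value (0-28) to a sediment description."""
    return band_label[bisect.bisect_left(band_upper, rank)]

print(classify(1.0), "|", classify(10.5), "|", classify(27.5))
```

    Exact band boundaries are assigned to the finer class here; choose the convention that matches your gridded output.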

  7. Success.ai | LinkedIn Data | 700M Public Profiles & 70M Companies – Best...

    • datarade.ai
    Updated Jan 1, 2022
    + more versions
    Cite
    Success.ai (2022). Success.ai | LinkedIn Data | 700M Public Profiles & 70M Companies – Best Price Guarantee [Dataset]. https://datarade.ai/data-products/success-ai-linkedin-data-700m-public-profiles-70m-compa-success-ai-294c
    Explore at:
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Jan 1, 2022
    Dataset provided by
    Area covered
    Austria, Luxembourg, Singapore, Montserrat, Greenland, Mauritius, Saudi Arabia, Estonia, Virgin Islands (British), Mayotte
    Description

    Success.ai’s LinkedIn Data Solutions offer unparalleled access to a vast dataset of 700 million public LinkedIn profiles and 70 million LinkedIn company records, making it one of the most comprehensive and reliable LinkedIn datasets available on the market today. Our employee data and LinkedIn data are ideal for businesses looking to streamline recruitment efforts, build highly targeted lead lists, or develop personalized B2B marketing campaigns.

    Whether you’re looking for recruiting data, conducting investment research, or seeking to enrich your CRM systems with accurate and up-to-date LinkedIn profile data, Success.ai provides everything you need with pinpoint precision. By tapping into LinkedIn company data, you’ll have access to over 40 critical data points per profile, including education, professional history, and skills.

    Key Benefits of Success.ai’s LinkedIn Data: Our LinkedIn data solution offers more than just a dataset. With GDPR-compliant data, AI-enhanced accuracy, and a price match guarantee, Success.ai ensures you receive the highest-quality data at the best price in the market. Our datasets are delivered in Parquet format for easy integration into your systems, and with millions of profiles updated daily, you can trust that you’re always working with fresh, relevant data.

    Global Reach and Industry Coverage: Our LinkedIn data covers professionals across all industries and sectors, providing you with detailed insights into businesses around the world. Our geographic coverage spans 259M profiles in the United States, 22M in the United Kingdom, 27M in India, and thousands of profiles in regions such as Europe, Latin America, and Asia Pacific. With LinkedIn company data, you can access profiles of top companies from the United States (6M+), United Kingdom (2M+), and beyond, helping you scale your outreach globally.

    Why Choose Success.ai’s LinkedIn Data: Success.ai stands out for its tailored approach and white-glove service, making it easy for businesses to receive exactly the data they need without managing complex data platforms. Our dedicated Success Managers will curate and deliver your dataset based on your specific requirements, so you can focus on what matters most—reaching the right audience. Whether you’re sourcing employee data, LinkedIn profile data, or recruiting data, our service ensures a seamless experience with 99% data accuracy.

    • Best Price Guarantee: We offer unbeatable pricing on LinkedIn data, and we’ll match any competitor.
    • Global Scale: Access 700 million LinkedIn profiles and 70 million company records globally.
    • AI-Verified Accuracy: Enjoy 99% data accuracy through our advanced AI and manual validation processes.
    • Real-Time Data: Profiles are updated daily, ensuring you always have the most relevant insights.
    • Tailored Solutions: Get custom-curated LinkedIn data delivered directly, without managing platforms.
    • Ethically Sourced Data: Compliant with global privacy laws, ensuring responsible data usage.
    • Comprehensive Profiles: Over 40 data points per profile, including job titles, skills, and company details.
    • Wide Industry Coverage: Covering sectors from tech to finance across regions like the US, UK, Europe, and Asia.

    Key Use Cases:

    • Sales Prospecting and Lead Generation: Build targeted lead lists using LinkedIn company data and professional profiles, helping sales teams engage decision-makers at high-value accounts.
    • Recruitment and Talent Sourcing: Use LinkedIn profile data to identify and reach top candidates globally. Our employee data includes work history, skills, and education, providing all the details you need for successful recruitment.
    • Account-Based Marketing (ABM): Use our LinkedIn company data to tailor marketing campaigns to key accounts, making your outreach efforts more personalized and effective.
    • Investment Research & Due Diligence: Identify companies with strong growth potential using LinkedIn company data. Access key data points such as funding history, employee count, and company trends to fuel investment decisions.
    • Competitor Analysis: Stay ahead of your competition by tracking hiring trends, employee movement, and company growth through LinkedIn data. Use these insights to adjust your market strategy and improve your competitive positioning.
    • CRM Data Enrichment: Enhance your CRM systems with real-time updates from Success.ai’s LinkedIn data, ensuring that your sales and marketing teams are always working with accurate and up-to-date information.
    • Comprehensive Data Points for LinkedIn Profiles: Our LinkedIn profile data includes over 40 key data points for every individual and company, ensuring a complete understanding of each contact:

    LinkedIn URL: Access direct links to LinkedIn profiles for immediate insights.
    Full Name: Verified first and last names.
    Job Title: Current job titles and prior experience.
    Company Information: Company name, LinkedIn URL, domain, and location.
    Work and Per...

  8. 2017 Countywide LiDAR Point Cloud

    • catalog.data.gov
    • datasets.ai
    • +2more
    Updated Sep 1, 2022
    + more versions
    Cite
    Lake County Illinois GIS (2022). 2017 Countywide LiDAR Point Cloud [Dataset]. https://catalog.data.gov/dataset/2017-countywide-lidar-point-cloud-638f8
    Explore at:
    Dataset updated
    Sep 1, 2022
    Dataset provided by
    Lake County Illinois GIS
    Description

    Click here to access the data directly from the Illinois State Geospatial Data Clearinghouse. These lidar data are processed Classified LAS 1.4 files, formatted to 2,117 individual 2500 ft x 2500 ft tiles; used to create Reflectance Images, 3D breaklines and hydro-flattened DEMs as necessary. Geographic Extent: Lake County, Illinois, covering approximately 466 square miles. Dataset Description: The WI Kenosha-Racine Counties and IL 4 County QL1 Lidar project called for the planning, acquisition, processing, and derivative products of lidar data to be collected at a derived nominal pulse spacing (NPS) of 1 point every 0.35 meters. Project specifications are based on the U.S. Geological Survey National Geospatial Program Base Lidar Specification, Version 1.2. The data was developed based on a horizontal projection/datum of NAD83 (2011), State Plane, U.S. Survey Feet and vertical datum of NAVD88 (GEOID12B), U.S. Survey Feet. Lidar data was delivered as processed Classified LAS 1.4 files, formatted to 2,117 individual 2500 ft x 2500 ft tiles, as tiled Reflectance Imagery, and as tiled bare earth DEMs; all tiled to the same 2500 ft x 2500 ft schema. Ground Conditions: Lidar was collected April-May 2017, while no snow was on the ground and rivers were at or below normal levels. In order to post-process the lidar data to meet task order specifications and meet ASPRS vertical accuracy guidelines, Ayers established a total of 66 ground control points that were used to calibrate the lidar to known ground locations established throughout the WI Kenosha-Racine Counties and IL 4 County QL1 project area. An additional 195 independent accuracy checkpoints, 116 in Bare Earth and Urban landcovers (116 NVA points) and 79 in Tall Grass and Brushland/Low Trees categories (79 VVA points), were used to assess the vertical accuracy of the data. These checkpoints were not used to calibrate or post-process the data.
Users should be aware that temporal changes may have occurred since this dataset was collected and that some parts of these data may no longer represent actual surface conditions. Users should not use these data for critical applications without a full awareness of their limitations. Acknowledgement of the U.S. Geological Survey would be appreciated for products derived from these data. These LAS data files include all data points collected. No points have been removed or excluded. A visual qualitative assessment was performed to ensure data completeness. No void areas or missing data exist. The raw point cloud is of good quality and the data pass Non-Vegetated Vertical Accuracy specifications. Link Source: Illinois Geospatial Data Clearinghouse
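    The NVA/VVA checkpoint assessment described above follows the ASPRS vertical accuracy convention: NVA is reported as RMSEz × 1.96 over non-vegetated checkpoints, while VVA is the 95th percentile of absolute errors in vegetated land cover. A minimal sketch of that computation (the function name, inputs, and return format are illustrative, not part of this dataset):

```python
import math

def vertical_accuracy(dz_nva, dz_vva):
    """Vertical accuracy summary in the style of the ASPRS (2014) standard.

    dz_nva: elevation errors (lidar minus survey) at non-vegetated checkpoints.
    dz_vva: elevation errors at vegetated checkpoints.
    """
    rmse_z = math.sqrt(sum(e * e for e in dz_nva) / len(dz_nva))
    nva_95 = 1.96 * rmse_z  # 95% confidence, assumes normally distributed errors
    # VVA uses the 95th percentile of |error|, with no normality assumption.
    abs_err = sorted(abs(e) for e in dz_vva)
    idx = max(0, math.ceil(0.95 * len(abs_err)) - 1)
    vva_95 = abs_err[idx]
    return {"rmse_z": rmse_z, "nva_95": nva_95, "vva_95": vva_95}
```

    Running the 116 NVA and 79 VVA checkpoint errors through such a function is how a reported "passes Non-Vegetated Vertical Accuracy specifications" claim would be checked.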

  9. United States COVID-19 Community Levels by County

    • data.cdc.gov
    • data.virginia.gov
    • +1more
    application/rdfxml +5
    Updated Nov 2, 2023
    + more versions
    Cite
    CDC COVID-19 Response (2023). United States COVID-19 Community Levels by County [Dataset]. https://data.cdc.gov/Public-Health-Surveillance/United-States-COVID-19-Community-Levels-by-County/3nnm-4jni
    Explore at:
    application/rdfxml, application/rssxml, csv, tsv, xml, json
    Available download formats
    Dataset updated
    Nov 2, 2023
    Dataset provided by
    Centers for Disease Control and Prevention (http://www.cdc.gov/)
    Authors
    CDC COVID-19 Response
    License

    https://www.usa.gov/government-works

    Area covered
    United States
    Description

    Reporting of Aggregate Case and Death Count data was discontinued May 11, 2023, with the expiration of the COVID-19 public health emergency declaration. Although these data will continue to be publicly available, this dataset will no longer be updated.

    This archived public use dataset has 11 data elements reflecting United States COVID-19 community levels for all available counties.

    The COVID-19 community levels were developed using a combination of three metrics — new COVID-19 admissions per 100,000 population in the past 7 days, the percent of staffed inpatient beds occupied by COVID-19 patients, and total new COVID-19 cases per 100,000 population in the past 7 days. The COVID-19 community level was determined by the higher of the new admissions and inpatient beds metrics, based on the current level of new cases per 100,000 population in the past 7 days. New COVID-19 admissions and the percent of staffed inpatient beds occupied represent the current potential for strain on the health system. Data on new cases acts as an early warning indicator of potential increases in health system strain in the event of a COVID-19 surge.

    Using these data, the COVID-19 community level was classified as low, medium, or high.
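    The classification logic described above can be sketched as follows. The numeric thresholds are a reconstruction of CDC's published cut-offs, not code from the CDC pipeline; the level is the higher of the admissions and bed-occupancy indicators, with the case rate selecting which threshold table applies:

```python
def community_level(cases_per_100k, admissions_per_100k, bed_utilization_pct):
    """Classify a county's COVID-19 community level as low/medium/high.

    Illustrative reconstruction of CDC's published threshold rules.
    """
    if cases_per_100k < 200:
        if admissions_per_100k >= 20 or bed_utilization_pct >= 15:
            return "high"
        if admissions_per_100k >= 10 or bed_utilization_pct >= 10:
            return "medium"
        return "low"
    # At 200+ new cases per 100k, the floor rises to medium.
    if admissions_per_100k >= 10 or bed_utilization_pct >= 10:
        return "high"
    return "medium"
```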

    COVID-19 Community Levels were used to help communities and individuals make decisions based on their local context and their unique needs. Community vaccination coverage and other local information, like early alerts from surveillance, such as through wastewater or the number of emergency department visits for COVID-19, when available, can also inform decision making for health officials and individuals.

    For the most accurate and up-to-date data for any county or state, visit the relevant health department website. COVID Data Tracker may display data that differ from state and local websites. This can be due to differences in how data were collected, how metrics were calculated, or the timing of web updates.

    Archived Data Notes:

    This dataset was renamed from "United States COVID-19 Community Levels by County as Originally Posted" to "United States COVID-19 Community Levels by County" on March 31, 2022.

    March 31, 2022: Column name for county population was changed to “county_population”. No change was made to the data points previously released.

    March 31, 2022: New column, “health_service_area_population”, was added to the dataset to denote the total population in the designated Health Service Area based on 2019 Census estimate.

    March 31, 2022: FIPS codes for territories American Samoa, Guam, Commonwealth of the Northern Mariana Islands, and United States Virgin Islands were re-formatted to 5-digit numeric for records released on 3/3/2022 to be consistent with other records in the dataset.

    March 31, 2022: Changes were made to the text fields in variables “county”, “state”, and “health_service_area” so the formats are consistent across releases.

    March 31, 2022: The “%” sign was removed from the text field in column “covid_inpatient_bed_utilization”. No change was made to the data. As indicated in the column description, values in this column represent the percentage of staffed inpatient beds occupied by COVID-19 patients (7-day average).

    March 31, 2022: Data values for columns “county_population”, “health_service_area_number”, and “health_service_area” were backfilled for records released on 2/24/2022. These columns were added beginning the week of 3/3/2022; thus, the values were previously missing for records released the week prior.

    April 7, 2022: Updates made to data released on 3/24/2022 for Guam, Commonwealth of the Northern Mariana Islands, and United States Virgin Islands to correct a data mapping error.

    April 21, 2022: COVID-19 Community Level (CCL) data released for counties in Nebraska for the week of April 21, 2022 have 3 counties identified in the high category and 37 in the medium category. CDC has been working with state officials to verify the data submitted, as other data systems are not providing alerts for substantial increases in disease transmission or severity in the state.

    May 26, 2022: COVID-19 Community Level (CCL) data released for McCracken County, KY for the week of May 5, 2022 have been updated to correct a data processing error. McCracken County, KY should have appeared in the low community level category during the week of May 5, 2022. This correction is reflected in this update.

    May 26, 2022: COVID-19 Community Level (CCL) data released for several Florida counties for the week of May 19th, 2022, have been corrected for a data processing error. Of note, Broward, Miami-Dade, Palm Beach Counties should have appeared in the high CCL category, and Osceola County should have appeared in the medium CCL category. These corrections are reflected in this update.

    May 26, 2022: COVID-19 Community Level (CCL) data released for Orange County, New York for the week of May 26, 2022 displayed an erroneous case rate of zero and a CCL category of low due to a data source error. This county should have appeared in the medium CCL category.

    June 2, 2022: COVID-19 Community Level (CCL) data released for Tolland County, CT for the week of May 26, 2022 have been updated to correct a data processing error. Tolland County, CT should have appeared in the medium community level category during the week of May 26, 2022. This correction is reflected in this update.

    June 9, 2022: COVID-19 Community Level (CCL) data released for Tolland County, CT for the week of May 26, 2022 have been updated to correct a misspelling. The medium community level category for Tolland County, CT on the week of May 26, 2022 was misspelled as “meduim” in the data set. This correction is reflected in this update.

    June 9, 2022: COVID-19 Community Level (CCL) data released for Mississippi counties for the week of June 9, 2022 should be interpreted with caution due to a reporting cadence change over the Memorial Day holiday that resulted in artificially inflated case rates in the state.

    July 7, 2022: COVID-19 Community Level (CCL) data released for Rock County, Minnesota for the week of July 7, 2022 displayed an artificially low case rate and CCL category due to a data source error. This county should have appeared in the high CCL category.

    July 14, 2022: COVID-19 Community Level (CCL) data released for Massachusetts counties for the week of July 14, 2022 should be interpreted with caution due to a reporting cadence change that resulted in lower than expected case rates and CCL categories in the state.

    July 28, 2022: COVID-19 Community Level (CCL) data released for all Montana counties for the week of July 21, 2022 had case rates of 0 due to a reporting issue. The case rates have been corrected in this update.

    July 28, 2022: COVID-19 Community Level (CCL) data released for Alaska for all weeks prior to July 21, 2022 included non-resident cases. The case rates for the time series have been corrected in this update.

    July 28, 2022: A laboratory in Nevada reported a backlog of historic COVID-19 cases. As a result, the 7-day case count and rate will be inflated in Clark County, NV for the week of July 28, 2022.

    August 4, 2022: COVID-19 Community Level (CCL) data was updated on August 2, 2022 in error during performance testing. Data for the week of July 28, 2022 was changed during this update due to additional case and hospital data as a result of late reporting between July 28, 2022 and August 2, 2022. Since the purpose of this data set is to provide point-in-time views of COVID-19 Community Levels on Thursdays, any changes made to the data set during the August 2, 2022 update have been reverted in this update.

    August 4, 2022: COVID-19 Community Level (CCL) data for the week of July 28, 2022 for 8 counties in Utah (Beaver County, Daggett County, Duchesne County, Garfield County, Iron County, Kane County, Uintah County, and Washington County) case data was missing due to data collection issues. CDC and its partners have resolved the issue and the correction is reflected in this update.

    August 4, 2022: Due to a reporting cadence change, case rates for all Alabama counties will be lower than expected. As a result, the CCL levels published on August 4, 2022 should be interpreted with caution.

    August 11, 2022: COVID-19 Community Level (CCL) data for the week of August 4, 2022 for South Carolina have been updated to correct a data collection error that resulted in incorrect case data. CDC and its partners have resolved the issue and the correction is reflected in this update.

    August 18, 2022: COVID-19 Community Level (CCL) data for the week of August 11, 2022 for Connecticut have been updated to correct a data ingestion error that inflated the CT case rates. CDC, in collaboration with CT, has resolved the issue and the correction is reflected in this update.

    August 25, 2022: A laboratory in Tennessee reported a backlog of historic COVID-19 cases. As a result, the 7-day case count and rate may be inflated in many counties and the CCLs published on August 25, 2022 should be interpreted with caution.

    August 25, 2022: Due to a data source error, the 7-day case rate for St. Louis County, Missouri, is reported as zero in the COVID-19 Community Level data released on August 25, 2022. Therefore, the COVID-19 Community Level for this county should be interpreted with caution.

    September 1, 2022: Due to a reporting issue, case rates for all Nebraska counties will include 6 days of data instead of 7 days in the COVID-19 Community Level (CCL) data released on September 1, 2022. Therefore, the CCLs for all Nebraska counties should be interpreted with caution.

    September 8, 2022: Due to a data processing error, the case rate for Philadelphia County, Pennsylvania,

  10. Data from: A two-dimensional interpolation function for irregularly-spaced...

    • hosted-metadata.bgs.ac.uk
    Updated Jan 1, 1968
    Cite
    British Geological Survey (1968). A two-dimensional interpolation function for irregularly-spaced data [Dataset]. https://hosted-metadata.bgs.ac.uk/geonetwork/srv/api/records/f56eb439-5bd9-4111-990b-820a94ad9092?language=all
    Explore at:
    Dataset updated
    Jan 1, 1968
    Dataset provided by
    British Geological Survey (https://www.bgs.ac.uk/)
    Harvard University
    Description

    This is a peer-reviewed publication looking at methods of data interpretation relevant to geochemical datasets. In many fields using empirical areal data there arises a need for interpolating from irregularly-spaced data to produce a continuous surface. These irregularly spaced locations, hence referred to as "data points," may have diverse meanings: in meteorology, weather observation stations; in geography, surveyed locations; in city and regional planning, centers of data-collection zones; in biology, observation locations. It is assumed that a unique number (such as rainfall in meteorology, or altitude in geography) is associated with each data point. In order to display these data in some type of contour map or perspective view, to compare them with data for the same region based on other data points, or to analyze them for extremes, gradients, or other purposes, it is extremely useful, if not essential, to define a continuous function fitting the given values exactly. Interpolated values over a fine grid may then be evaluated. In using such a function it is assumed that the original data are without error, or that compensation for error will be made after interpolation. In essence, an operational solution to the problem of two-dimensional interpolation from irregularly-spaced data points is desired. It is assumed that a finite number N of triplets (xi, yi, zi) are given, where xi, yi are the locational coordinates of the data point Di, and zi is the corresponding data value. Data point locations may not be coincident. An interpolation function z = f(x, y) to assign a value to any location P(x, y) in the plane is sought. This two-dimensional interpolation function is to be "smooth" (continuous and once differentiable), to pass through the specified points (i.e., f(xi, yi) = zi), and to meet the user's intuitive expectations about the phenomenon under investigation. Furthermore, the function should be suitable for computer application at reasonable cost.
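    The scheme the publication describes (exact fit at every data point, smooth elsewhere) is commonly realized by inverse-distance weighting. A minimal sketch in that spirit; the function name, data layout, and power parameter are our own choices, not the paper's notation:

```python
import math

def idw_interpolate(points, x, y, power=2.0):
    """Inverse-distance-weighted interpolation over irregularly spaced data.

    points: list of (xi, yi, zi) triplets. The function passes exactly
    through each data point, i.e. f(xi, yi) = zi.
    """
    weights, weighted_sum = 0.0, 0.0
    for xi, yi, zi in points:
        d = math.hypot(x - xi, y - yi)
        if d == 0.0:
            return zi  # exact fit at a data point
        w = d ** -power  # nearer data points dominate the estimate
        weights += w
        weighted_sum += w * zi
    return weighted_sum / weights
```

    Evaluating such a function over a fine grid yields the continuous surface needed for contour maps or perspective views.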

    Website:

    http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.154.6880&rep=rep1&type=pdf

  11. Data from: Medicare Spending per Beneficiary

    • kaggle.com
    Updated Jan 22, 2023
    Cite
    The Devastator (2023). Medicare Spending per Beneficiary [Dataset]. https://www.kaggle.com/datasets/thedevastator/medicare-spending-per-beneficiary
    Explore at:
    Croissant
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jan 22, 2023
    Dataset provided by
    Kaggle
    Authors
    The Devastator
    Description

    Medicare Spending per Beneficiary

    Detailed Hospital Expense Breakdown

    By Health [source]

    About this dataset

    This file allows healthcare executives and analysts to make informed decisions about whether continued improvements are being made over time, so that they can understand how efficiently they are fulfilling treatments while staying within budgetary constraints. Additionally, it will help them map out trends among different hospitals and spot anomalies that could indicate areas where decisions should be reassessed.



    How to use the dataset

    This dataset can provide valuable insights into how Medicare is spending per patient at specific hospitals in the United States. It can be used to gain a better understanding of the types of services covered under Medicare, and to what extent those services are being used. By comparing the average Medicare spending across different hospitals, users can also gain insight into potential disparities in care delivery or availability.

    To use this dataset, first identify which hospital you are interested in analyzing. Then locate the row for that hospital in the dataset and review its associated values: value, footnote (optional), and start/end dates (optional). The Value column refers to how much Medicare spends on each particular patient; this is a numerical value represented as a decimal number up to 6 decimal places. The Footnote (optional) provides more information about any special circumstances that may need attention when interpreting the value data points. Finally, if Start Date and End Date fields are present they will specify over what timeframe these values were aggregated over.

    Once all relevant data elements have been reviewed for all hospitals of interest, comparison analysis among them can be conducted based on Value, Footnote, or Start/End dates as necessary to answer specific research questions or to draw conclusions about how Medicare spends per patient at various hospitals nationwide.
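    The comparison workflow above can be sketched as follows, using hypothetical rows shaped like the file's column dictionary (a Value per hospital); the hospital identifiers and figures are invented for illustration:

```python
from statistics import mean

# Hypothetical rows: Medicare spending value per hospital.
rows = [
    {"hospital": "A", "value": 0.98},
    {"hospital": "B", "value": 1.12},
    {"hospital": "C", "value": 0.87},
]

# Flag hospitals spending above the group average, then rank by spending.
avg = mean(r["value"] for r in rows)
for r in rows:
    r["above_average"] = r["value"] > avg
ranked = sorted(rows, key=lambda r: r["value"], reverse=True)
```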

    Research Ideas

    • Developing a cost comparison tool for hospitals that allows patients to compare how much Medicare spends per patient across different hospitals.
    • Creating an algorithm to help predict Medicare spending at different facilities over time and build strategies on how best to manage those costs.
    • Identifying areas in which a hospital can save money by reducing unnecessary spending in order to reduce overall Medicare expenses

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: Dataset copyright by authors. You are free to:

    • Share - copy and redistribute the material in any medium or format for any purpose, even commercially.
    • Adapt - remix, transform, and build upon the material for any purpose, even commercially.

    You must:

    • Give appropriate credit - provide a link to the license, and indicate if changes were made.
    • ShareAlike - distribute your contributions under the same license as the original.
    • Keep intact all notices that refer to this license, including copyright notices.

    Columns

    File: Medicare_hospital_spending_per_patient_Medicare_Spending_per_Beneficiary_Additional_Decimal_Places.csv

    | Column name | Description |
    |:------------|:------------|
    | Value       | The amount of Medicare spending per patient for a given hospital or region. (Numeric) |
    | Footnote    | Any additional notes or information related to the value. (Text) |
    | Start_Date  | The start date of the period for which the value applies. (Date) |
    | End_Date    | The end date of the period for which the value applies. (Date) |

    Acknowledgements

    If you use this dataset in your research, please credit the original authors and Health.

  12. Whitefish Lake Institute Long-Term Monitoring Dataset (2007-2021)

    • hydroshare.org
    • beta.hydroshare.org
    • +1more
    zip
    Updated Feb 28, 2023
    Cite
    Meghan Robinson; W. Adam Sigler; Mike Koopal (2023). Whitefish Lake Institute Long-Term Monitoring Dataset (2007-2021) [Dataset]. http://doi.org/10.4211/hs.5ca7307fda8949299e6782885da95046
    Explore at:
    zip (219.0 MB)
    Available download formats
    Dataset updated
    Feb 28, 2023
    Dataset provided by
    HydroShare
    Authors
    Meghan Robinson; W. Adam Sigler; Mike Koopal
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    May 27, 2007 - Nov 3, 2021
    Area covered
    Description

    This resource contains data collected by the Whitefish Lake Institute (WLI) as well as R code used to compile and conduct quality assurance on the data. This resource reflects joint publication efforts between WLI and the Montana State University Extension Water Quality (MSUEWQ) program. All data included here was uploaded to the National Water Quality Portal (WQX) in 2022. It is the intention of WLI to upload all future data to WQX and this HydroShare resource may also be updated in the future with data for 2022 and forward.

    Data Purpose: The ‘Data’ folder of this resource holds the final data products for the extensive dataset collected by WLI between 2007 and 2021. This folder is likely of interest to users who want data for research and analysis purposes. This dataset contains physical water parameter field data collected by Hydrolab MS5 and DS5 loggers, including water temperature, specific conductance, dissolved oxygen concentration and saturation, barometric pressure, and turbidity. Additional field data that needs further quality assurance prior to use includes chlorophyll a, ORP, pH, and PAR. This dataset also contains water chemistry data analyzed at certified laboratories including total nitrogen, total phosphorus, nitrate, orthophosphate, total suspended solids, organic carbon, and chlorophyll a. The data folder includes R scripts with code for examples of data visualization. This dataset can provide insight to water quality trends in lakes and streams of northwestern Montana over time.

    Data Summary: During the time-period, WLI collected water quality data for 63 lake sites and 17 stream and river sites in northwestern Montana under two separate monitoring projects. The Northwest Montana Lakes Network (NMLN) project currently visits 41 lake sites in Northwestern Montana once per summer. Field data from Hydrolabs are collected at discrete depths throughout a lake's profile, and depth integrated water chemistry samples are collected as well. The Whitefish Water Quality Monitoring Project (WWQMP) currently visits two sites on Whitefish Lake, one site on Tally Lake, and 11 stream and river sites in the Whitefish Lake and Upper Whitefish River watersheds monthly between April and November. Field data is collected at one depth for streams and many depths throughout the lake profiles, and water chemistry samples are collected at discrete depths for Whitefish Lake and streams.
The final dataset for both programs includes over 112,000 datapoints of data passing quality assurance assessment and an additional 72,000 datapoints that would need further quality assurance before use.

    Workflow Purpose: The ‘Workflow’ folder of this resource contains the raw data, folder structure, and R code used during this data compilation and upload process. This folder is likely of interest to users who have similar datasets and are interested in code for automating data compilation or upload processes. The R scripts included here have code to stitch together many individual Hydrolab MS5 and DS5 logger files as well as lab electronic data deliverables (EDDs), which may be useful for users who are interested in compiling one or multiple seasons' worth of data into a single file. Reformatting scripts format data to match the multi-sheet excel workbook format required by the Montana Department of Environmental Quality for uploads to WQX, and may be useful to others hoping to automate database uploads.

    Workflow Summary: Compilation code in the workflow folder compiles data from its most original forms, including Hydrolab sonde export files and lab EDDs. This compilation process includes extracting dates and times from comment fields and producing a single file from many input files. Formatting code then reformats the data to match WQX upload requirements, which includes generating unique activity IDs for data collected at the same site, date, and time then linking these activity IDs with results across worksheets in an excel workbook. Code for generating all quality assurance figures used in the decision-making process outlined in the Quality Assurance Document and resulting data removal decisions are included here as well. Finally, this folder includes code for combining data from the separate program uploads for WQX to the more user-friendly structure for analysis provided in the 'Data' file for this HydroShare resource.
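    The compilation step described above (stitching many per-deployment export files into one table) can be sketched as follows. The resource's actual scripts are in R; this Python analogue uses a hypothetical CSV layout, not the real Hydrolab export or EDD format:

```python
import csv
import glob

def compile_logger_files(pattern, out_path):
    """Stitch many export CSVs sharing one header row into a single file."""
    rows, header = [], None
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as f:
            reader = csv.reader(f)
            file_header = next(reader)  # each file repeats the header
            if header is None:
                header = file_header
            rows.extend(reader)
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return len(rows)  # number of data rows compiled
```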

  13. US Crime Dataset

    • brightdata.com
    .json, .csv, .xlsx
    Updated May 21, 2024
    Cite
    Bright Data (2024). US Crime Dataset [Dataset]. https://brightdata.com/products/datasets/crime/us
    Explore at:
    .json, .csv, .xlsx
    Available download formats
    Dataset updated
    May 21, 2024
    Dataset authored and provided by
    Bright Data (https://brightdata.com/)
    License

    https://brightdata.com/license

    Area covered
    Worldwide, United States
    Description

    We will build you a custom US crime dataset based on your needs. Data points may include date, time, location, crime type, crime description, victim demographics, offender demographics, arrest records, charges filed, court outcomes, police department response time, incident outcome, weapon used, property stolen or damaged, crime location type, and other related data.

    Use our US crime datasets for a range of applications to enhance public safety and policy effectiveness. Analyzing these datasets can help organizations understand crime patterns and trends across different regions of the United States, enabling them to tailor their strategies and interventions accordingly. Depending on your needs, you may access the entire dataset or a customized subset.

    Popular use cases include: improving public safety measures, designing targeted crime prevention programs, resource allocation for law enforcement, and more.

  14. Modern China Geospatial Database - Main Dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Feb 28, 2025
    + more versions
    Cite
    Christian Henriot (2025). Modern China Geospatial Database - Main Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5735393
    Explore at:
    Dataset updated
    Feb 28, 2025
    Dataset authored and provided by
    Christian Henriot
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    China
    Description

    MCGD_Data_V2.2 contains all the data that we have collected on locations in modern China, plus a number of locations outside of China that we encounter frequently in historical sources on China. All further updates will appear under the name "MCGD_Data" with a time stamp (e.g., MCGD_Data2023-06-21)

    You can also have access to this dataset and all the datasets that the ENP-China makes available on GitLab: https://gitlab.com/enpchina/IndexesEnp

    Altogether there are 464,970 entries. The data include the name of locations and their variants in Chinese, pinyin, and any recorded transliteration; the name of the province in Chinese and in pinyin; Province ID; the latitude and longitude; the Name ID and Location ID, and NameID_Legacy. The Name IDs all start with H followed by seven digits. This is the internal ID system of MCGD (the NameID_Legacy column records the Name IDs in their original format depending on the source). Locations IDs that start with "DH" are data points extracted from China Historical GIS (Harvard University); those that start with "D" are locations extracted from the data points in Geonames; those that have only digits (8 digits) are data points we have added from various map sources.
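    The ID conventions above lend themselves to a small source classifier; a sketch, with return labels of our own choosing:

```python
import re

def id_source(identifier):
    """Infer the origin of an MCGD identifier from its prefix conventions."""
    if re.fullmatch(r"H\d{7}", identifier):
        return "MCGD Name ID"                     # H + seven digits
    if identifier.startswith("DH"):
        return "China Historical GIS (Harvard)"   # Location IDs starting DH
    if identifier.startswith("D"):
        return "Geonames"                         # Location IDs starting D
    if re.fullmatch(r"\d{8}", identifier):
        return "MCGD map sources"                 # eight digits only
    return "unknown"
```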

    One of the main features of the MCGD Main Dataset is the systematic collection and compilation of place names from non-Chinese language historical sources. Locations were designated in transliteration systems that are hardly comprehensible today, which makes it very difficult to find the actual locations they correspond to. This dataset allows for the conversion from these obsolete transliterations to the current names and geocoordinates.

    From June 2021 onward, we have adopted a different file naming system to keep track of versions. From MCGD_Data_V1 we have moved to MCGD_Data_V2. In June 2022, we introduced time stamps, which result in the following naming convention: MCGD_Data_YYYY.MM.DD.

    UPDATES

    MCGD_Data2025_02_28 includes a major change with the duplication of all the locations listed under Beijing, Shanghai, Tianjin, and Chongqing (北京, 上海, 天津, 重慶) and their listing under the name of the provinces to which they belonged originally, before the creation of the four special municipalities after 1949. This is meant to facilitate the matching of data from historical sources. Each location has a unique NameID. Altogether there are 472,818 entries.

    MCGD_Data2025_02_27 includes an update on locations extracted from Minguo zhengfu ge yuanhui keyuan yishang zhiyuanlu 國民政府各院部會科員以上職員錄 (Directory of staff members and above in the ministries and committees of the National Government). Nanjing: Guomin zhengfu wenguanchu yinzhuju 國民政府文官處印鑄局, 1944. We also made corrections in the Prov_Py and Prov_Zh columns, as there were some misalignments between the pinyin name and the name in Chinese characters. The file now includes 465,128 entries.

    MCGD_Data2024_03_23 includes an update on locations in Taiwan from the Asia Directories. Altogether there are 465,603 entries (of which 187 place names without geocoordinates, labelled in the Lat Long columns as "Unknown").

    MCGD_Data2023.12.22 contains all the data that we have collected on locations in China, whatever the period. Altogether there are 465,603 entries (of which 187 place names without geocoordinates, labelled in the Lat Long columns as "Unknown"). The dataset also includes locations outside of China for the purpose of matching such locations to the place names extracted from historical sources. For example, one may need to locate individuals born outside of China. Rather than maintaining two separate files, we made the decision to incorporate all the place names found in historical sources in the gazetteer. Such place names can easily be removed by selecting all the entries where the 'Province' data is missing.

  15. Human Resource Data Set (The Company)

    • kaggle.com
    Updated Jan 10, 2025
    Cite
    Koluit (2025). Human Resource Data Set (The Company) [Dataset]. https://www.kaggle.com/datasets/koluit/human-resource-data-set-the-company/versions/940
    Explore at:
    Croissant
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jan 10, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Koluit
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Context

    Similar to others who have created HR data sets, we felt that the lack of data out there for HR was limiting. It is very hard for someone to test new systems or learn People Analytics in the HR space. The only dataset most HR practitioners have is their real employee data, and there are a lot of reasons why you would not want to use that when experimenting. We hope that by providing this dataset with an ever-growing variation of data points, others can learn and grow their HR data analytics and systems knowledge.

    Some example test cases where someone might use this dataset:

    • HR technology testing and mock-ups: engagement survey tools, HCM tools, BI tools
    • Learning to code for People Analytics: Python/R/SQL
    • HR tech and People Analytics educational courses/tools

    Content

    The core file, CompanyData.txt, contains basic demographic data about each worker. We treat this as the core table that future data sets can be joined to.
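Joining a supplemental file onto the core table can be sketched with pandas. The column names below (EmployeeID, EngagementScore) are hypothetical illustrations, not taken from the actual files; consult the dataset's Readme.md for the real schema.

```python
import pandas as pd

# Stand-in for the core demographic table (CompanyData.txt); column names
# here are assumptions for illustration only.
core = pd.DataFrame({
    "EmployeeID": [101, 102, 103],
    "Name": ["A. Smith", "B. Jones", "C. Lee"],
    "Department": ["Sales", "Engineering", "HR"],
})

# A hypothetical supplemental data set, e.g. engagement survey results.
survey = pd.DataFrame({
    "EmployeeID": [101, 103],
    "EngagementScore": [4.2, 3.8],
})

# Left join keeps every worker; those without survey responses get a
# missing score rather than being dropped.
joined = core.merge(survey, on="EmployeeID", how="left")
```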

    Please read the Readme.md for additional information about this along with the Changelog for additional updates as they are made.

    Acknowledgements

    Initial names, addresses, and ages were generated using FakenameGenerator.com. All additional details including Job, compensation, and additional data sets were created by the Koluit team using random generation in Excel.

    Inspiration

    Our hope is this data is used in the HR or Research space to experiment and learn using HR data. Some examples that we hope this data will be used are listed above.

    Contact Us

    Have any suggestions for additions to the data? See any issues with our data? Want to use it for your project? Please reach out to us! https://koluit.com/ ryan@koluit.com

  16. LinkedIn Datasets

    • brightdata.com
    .json, .csv, .xlsx
    Updated Mar 27, 2025
    Cite
    Bright Data (2025). LinkedIn Datasets [Dataset]. https://brightdata.com/products/datasets/linkedin
    Explore at:
    Available download formats: .json, .csv, .xlsx
    Dataset updated
    Mar 27, 2025
    Dataset authored and provided by
    Bright Data (https://brightdata.com/)
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    Unlock the full potential of LinkedIn data with our extensive dataset that combines profiles, company information, and job listings into one powerful resource for business decision-making, strategic hiring, competitive analysis, and market trend insights. This all-encompassing dataset is ideal for professionals, recruiters, analysts, and marketers aiming to enhance their strategies and operations across various business functions. Dataset Features

    • Profiles: Dive into detailed public profiles featuring names, titles, positions, experience, education, skills, and more. Utilize this data for talent sourcing, lead generation, and investment signaling, with a refresh rate ensuring up to 30 million records per month.
    • Companies: Access comprehensive company data including ID, country, industry, size, number of followers, website details, subsidiaries, and posts. Tailored subsets by industry or region provide invaluable insights for CRM enrichment, competitive intelligence, and understanding the startup ecosystem, updated monthly with up to 40 million records.
    • Job Listings: Explore current job opportunities detailed with job titles, company names, locations, and employment specifics such as seniority levels and employment functions. This dataset includes direct application links and real-time application numbers, serving as a crucial tool for job seekers and analysts looking to understand industry trends and job market dynamics.

    Customizable Subsets for Specific Needs

    Our LinkedIn dataset offers the flexibility to tailor the dataset according to your specific business requirements. Whether you need comprehensive insights across all data points or are focused on specific segments like job listings, company profiles, or individual professional details, we can customize the dataset to match your needs. This modular approach ensures that you get only the data that is most relevant to your objectives, maximizing efficiency and relevance in your strategic applications.

    Popular Use Cases

    • Strategic Hiring and Recruiting: Track talent movement, identify growth opportunities, and enhance your recruiting efforts with targeted data.
    • Market Analysis and Competitive Intelligence: Gain a competitive edge by analyzing company growth, industry trends, and strategic opportunities.
    • Lead Generation and CRM Enrichment: Enrich your database with up-to-date company and professional data for targeted marketing and sales strategies.
    • Job Market Insights and Trends: Leverage detailed job listings for a nuanced understanding of employment trends and opportunities, facilitating effective job matching and market analysis.
    • AI-Driven Predictive Analytics: Utilize AI algorithms to analyze large datasets for predicting industry shifts, optimizing business operations, and enhancing decision-making processes based on actionable data insights.

    Whether you are mapping out competitive landscapes, sourcing new talent, or analyzing job market trends, our LinkedIn dataset provides the tools you need to succeed. Customize your access to fit specific needs, ensuring that you have the most relevant and timely data at your fingertips.

  17. Data from: 2019 Distribution System Upgrade Unit Cost Database Current Version

    • catalog.data.gov
    • data.openei.org
    • +2more
    Updated Jul 24, 2025
    Cite
    National Renewable Energy Laboratory (2025). 2019 Distribution System Upgrade Unit Cost Database Current Version [Dataset]. https://catalog.data.gov/dataset/2019-distribution-system-upgrade-unit-cost-database-current-version-24139
    Explore at:
    Dataset updated
    Jul 24, 2025
    Dataset provided by
    National Renewable Energy Laboratory
    Description

    IMPORTANT NOTE: This is the current version of NREL's Distribution System Upgrade Unit Cost Database and should be considered the most up-to-date. Compared to the previous version (https://data.nrel.gov/submissions/77), this database has additional data points and has been modified for improved usability. More information on the changes can be found in the attached file Unit_cost_database_guide_v2.docx. This guide also has important information about data sources and quality, as well as the intended use of the database. Please consult the guide before using this data for any purpose.

    This database contains unit cost information for different components that may be used to integrate distributed photovoltaic (DPV) systems onto distribution systems. Some of these upgrades and costs may also apply to the integration of other distributed energy resources (DER). Which components are required, and how many of each, is system-specific and should be determined by analyzing the effects of distributed PV at a given penetration level on the circuit of interest, in combination with engineering assessments of the efficacy of different solutions to increase the circuit's ability to host additional PV as desired. The current state of the distribution system should always be considered in these types of analysis.

    The data in this database was collected from a variety of utilities, PV developers, technology vendors, and published research reports. Where possible, we have included information on the source of each data point along with relevant notes. In some cases where the data provided is sensitive or proprietary, we were not able to specify the source, but we provide other information that may be useful to the user (e.g., year and location where the equipment was installed). NREL carefully reviewed these sources prior to inclusion in this database. - Originated 01/02/2019 by National Renewable Energy Laboratory

  18. UGround-V1-Data-Box

    • huggingface.co
    Updated May 4, 2025
    Cite
    OSU NLP Group (2025). UGround-V1-Data-Box [Dataset]. https://huggingface.co/datasets/osunlp/UGround-V1-Data-Box
    Explore at:
    Dataset updated
    May 4, 2025
    Dataset authored and provided by
    OSU NLP Group
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Updates

    [May 1, 2025] Bounding Box Data: We have added a bounding-box version of Web-Hybrid. For everyone's convenience, no conversation template is applied to this version of the data. All coordinates (x1, y1, x2, y2) are, as always, normalized to [0, 999]. The data has also been filtered (757k data points remain after content moderation).
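The stated coordinate convention (pixel-space box corners mapped to the integer range [0, 999]) can be sketched as below. This is an illustrative helper written from the description, not code from the dataset's release.

```python
def normalize_box(x1, y1, x2, y2, width, height):
    """Map pixel-space box corners (x1, y1, x2, y2) on a width x height
    screenshot to the [0, 999] integer range described for this dataset.
    A sketch of the stated convention, not official tooling."""
    return (
        round(x1 / width * 999),
        round(y1 / height * 999),
        round(x2 / width * 999),
        round(y2 / height * 999),
    )
```

For example, a box covering a full 1920x1080 screenshot maps to (0, 0, 999, 999).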

      Notes for Requests
    

    If you have applied for access to this dataset but have not received approval, please contact us via email (Boyu Gou)… See the full description on the dataset page: https://huggingface.co/datasets/osunlp/UGround-V1-Data-Box.

  19. West Oakland Lead Sampling (Scribe) Data Points, West Oakland CA, 2018, U.S....

    • catalog.data.gov
    • datasets.ai
    Updated Feb 25, 2025
    Cite
    U.S. Environmental Protection Agency, Region 9 (Publisher) (2025). West Oakland Lead Sampling (Scribe) Data Points, West Oakland CA, 2018, U.S. EPA Region 9 [Dataset]. https://catalog.data.gov/dataset/west-oakland-lead-sampling-scribe-data-points-west-oakland-ca-2018-u-s-epa-region-913
    Explore at:
    Dataset updated
    Feb 25, 2025
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Area covered
    California, Oakland, West Oakland, United States
    Description

    This feature class contains 436 points depicting lead sampling locations across West Oakland, California, taken during the 2018 Urban Metals Study. The U.S. Environmental Protection Agency (EPA) and the California Department of Toxic Substances Control (DTSC) have partnered on a project in West Oakland to study lead in soil. Lead is a heavy metal that is often found in urban soil. It usually comes from sources such as chipped paint on pre-1978 housing, historic pollution from leaded gasoline, or lead recycling (smelting). Children exposed to lead can have health problems, including impaired brain and physical development. In June 2018, EPA and DTSC took soil samples in city-owned property along streets, not private property, at nearly 200 randomly selected locations across West Oakland. This area was selected because of its mix of possible lead sources, including industry, older homes that may be painted with leaded paint, and nearby freeways.

  20. Introduction to Time Series Analysis for Hydrologic Data

    • beta.hydroshare.org
    • hydroshare.org
    • +1more
    zip
    Updated Jan 29, 2021
    Cite
    Gabriela Garcia; Kateri Salk (2021). Introduction to Time Series Analysis for Hydrologic Data [Dataset]. https://beta.hydroshare.org/resource/ee2a4c2151f24115a12e34d4d22d96fe/
    Explore at:
    Available download formats: zip (1.1 MB)
    Dataset updated
    Jan 29, 2021
    Dataset provided by
    HydroShare
    Authors
    Gabriela Garcia; Kateri Salk
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Oct 1, 1974 - Jan 27, 2021
    Area covered
    Description

    This lesson was adapted from educational material written by Dr. Kateri Salk for her Fall 2019 Hydrologic Data Analysis course at Duke University. This is the first part of a two-part exercise focusing on time series analysis.

    Introduction

    Time series are a special class of dataset in which a response variable is tracked over time. The frequency of measurement and the timespan of the dataset can vary widely. At its simplest, a time series model includes an explanatory time component and a response variable. Mixed models can include additional explanatory variables (check out the nlme and lme4 R packages). We will cover a few simple applications of time series analysis in these lessons.

    Opportunities

    Analysis of time series presents several opportunities. In aquatic sciences, some of the most common questions we can answer with time series modeling are:

    • Has there been an increasing or decreasing trend in the response variable over time?
    • Can we forecast conditions in the future?

      Challenges

    Time series datasets come with several caveats, which need to be addressed in order to effectively model the system. A few common challenges that arise (and can occur together within a single dataset) are:

    • Autocorrelation: Data points are not independent from one another (i.e., the measurement at a given time point is dependent on previous time point(s)).

    • Data gaps: Data are not collected at regular intervals, necessitating interpolation between measurements. There are often gaps between monitoring periods. For many time series analyses, we need equally spaced points.

    • Seasonality: Cyclic patterns in variables occur at regular intervals, impeding clear interpretation of a monotonic (unidirectional) trend. For example, we can expect summer water temperatures to be consistently higher than winter temperatures.

    • Heteroscedasticity: The variance of the time series is not constant over time.

    • Covariance: The covariance of the time series is not constant over time. Many time series models assume that both the variance and the covariance remain constant over time (compare heteroscedasticity above).
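The autocorrelation caveat above can be made concrete with a quick diagnostic on a synthetic series. This is a Python sketch with made-up data (the original lesson works in R with real hydrologic records): a series with trend and seasonality shows strong lag-1 autocorrelation, so its observations are not independent.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic "monthly" series: upward trend + annual seasonality + noise
# (stand-in data, not from the lesson's hydrologic records).
t = np.arange(120)
series = 0.05 * t + 2 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 0.5, t.size)

def lag_autocorr(x, lag=1):
    """Sample autocorrelation of x at the given lag."""
    x = x - x.mean()
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)

# A lag-1 autocorrelation well above zero signals that each observation
# depends on the previous one, violating ordinary regression assumptions.
r1 = lag_autocorr(series, lag=1)
```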

      Learning Objectives

    After successfully completing this notebook, you will be able to:

    1. Choose appropriate time series analyses for trend detection and forecasting

    2. Discuss the influence of seasonality on time series analysis

    3. Interpret and communicate results of time series analyses
