55 datasets found
  1. Factori Visit Data | Global | Location Intelligence | Geospatial Data | POI, ...

    • datarade.ai
    .csv
    Updated Jan 29, 2022
    Cite
    Factori (2022). Factori Visit Data | Global | Location Intelligence | Geospatial Data | POI, Foot Traffic, Store Visit [Dataset]. https://datarade.ai/data-products/factori-geospatial-data-global-location-intelligence-po-factori
    Explore at:
    .csv (available download formats)
    Dataset updated
    Jan 29, 2022
    Dataset authored and provided by
    Factori
    Area covered
    Myanmar, Pakistan, Madagascar, Germany, Saint Martin (French part), Chile, Luxembourg, Nicaragua, Guatemala, Ghana
    Description

    Our Geospatial Dataset connects people's movements to over 200M physical locations globally. These are aggregated and anonymized data that are only used to offer context for the volume and patterns of visits to certain locations. This data feed is compiled from different data sources around the world.

    It includes information such as the name, address, coordinates, and category of these locations, which can range from restaurants and hotels to parks and tourist attractions.

    Location Intelligence Data Reach: Location intelligence data brings POI/Place/OOH-level insights calculated from Factori's Mobility & People Graph data, aggregated from multiple data sources globally. To achieve foot-traffic attribution, specific attributes are combined to produce the desired reach data. For instance, to calculate the foot traffic for a specific location, a combination of location ID, day of the week, and part of the day can be used. There can be a maximum of 56 data records for one POI based on the combination of these attributes (i.e., seven days of the week times eight parts of the day).
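
    A rough illustration of this keying scheme (a sketch; the names and ranges below are invented, not Factori's actual schema):

    from itertools import product

    # Assumed granularity: 7 days of the week x 8 parts of the day = 56
    # possible records per POI, as described above.
    DAYS_OF_WEEK = range(1, 8)
    PARTS_OF_DAY = range(1, 9)

    def record_keys(location_id):
        """Yield every (location, day, part-of-day) key for one POI."""
        for day, part in product(DAYS_OF_WEEK, PARTS_OF_DAY):
            yield (location_id, day, part)

    keys = list(record_keys("POI-12345"))
    assert len(keys) == 56  # the stated maximum per POI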

    Data Export Methodology: Since we collect data dynamically, we provide the most updated data and insights via a best-suited method at a suitable interval (daily/weekly/monthly).

    Use Cases:
    • Credit Scoring: Financial services can use alternative data to score underbanked or unbanked customers by validating locations and personas.
    • Retail Analytics: Analyze footfall trends in various locations and gain an understanding of customer personas.
    • Market Intelligence: Study various market areas, the proximity of points of interest, and the competitive landscape.
    • Urban Planning: Build cases for urban development, public infrastructure needs, and transit planning based on fresh population data.
    • Marketing Campaign Strategy: By analyzing visitor demographics and behavior patterns around POIs, businesses can tailor their marketing strategies to effectively reach their target audience.
    • OOH/DOOH Campaign Planning: Identify high-traffic locations and understand consumer behavior in specific areas to execute targeted advertising strategies effectively.
    • Geofencing: Create virtual boundaries around physical locations, enabling businesses to trigger actions when users enter or exit these areas.

    Data Attributes Included: LocationID, name, website, BrandID, Phone, streetAddress, city, state, country_code, zip, lat, lng, poi_status, geoHash8, poi_id, category, category_id, full_address, address, additional_categories, url, domain, rating, price_level, rating_distribution, is_claimed, photo_url, attributes, brand_name, brand_id, status, total_photos, popular_times, places_topics, people_also_search, work_hours, local_business_links, contact_info, reviews_count, naics_code, naics_code_description, sic_code, sic_code_description, shape_polygon, building_id, building_type, building_name, geometry_location_type, geometry_viewport_northeast_lat, geometry_viewport_northeast_lng, geometry_viewport_southwest_lat, geometry_viewport_southwest_lng, geometry_location_lat, geometry_location_lng, calculated_geo_hash_8
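
    A minimal sketch of loading such an export with pandas and selecting a few of the attributes above (the filename is a placeholder; actual delivery names vary):

    import pandas as pd

    # Placeholder filename for a delivered .csv export.
    poi = pd.read_csv("factori_visit_data.csv")

    # Quick look at a few core place attributes from the list above.
    print(poi[["LocationID", "name", "category", "lat", "lng", "poi_status"]].head())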

  2. Data and tools for studying isograms

    • figshare.com
    Updated Jul 31, 2017
    Cite
    Florian Breit (2017). Data and tools for studying isograms [Dataset]. http://doi.org/10.6084/m9.figshare.5245810.v1
    Explore at:
    application/x-sqlite3 (available download formats)
    Dataset updated
    Jul 31, 2017
    Dataset provided by
    figshare
    Authors
    Florian Breit
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A collection of datasets and Python scripts for extraction and analysis of isograms (and some palindromes and tautonyms) from corpus-based word-lists, specifically Google Ngram and the British National Corpus (BNC). Below follows a brief description, first, of the included datasets and, second, of the included scripts.

    1. Datasets
    The data from English Google Ngrams and the BNC is available in two formats: as a plain text CSV file and as a SQLite3 database.

    1.1 CSV format
    The CSV files for each dataset actually come in two parts: one labelled ".csv" and one ".totals". The ".csv" file contains the actual extracted data, and the ".totals" file contains some basic summary statistics about the ".csv" dataset with the same name. The CSV files contain one row per data point, with the columns separated by a single tab stop. There are no labels at the top of the files. Each line has the following columns, in this order (the labels below are what I use in the database, which has an identical structure; see the section below):

    Label Data type Description

    isogramy int The order of isogramy, e.g. "2" is a second order isogram

    length int The length of the word in letters

    word text The actual word/isogram in ASCII

    source_pos text The Part of Speech tag from the original corpus

    count int Token count (total number of occurrences)

    vol_count int Volume count (number of different sources which contain the word)

    count_per_million int Token count per million words

    vol_count_as_percent int Volume count as percentage of the total number of volumes

    is_palindrome bool Whether the word is a palindrome (1) or not (0)

    is_tautonym bool Whether the word is a tautonym (1) or not (0)
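
    A minimal sketch of reading one of the ".csv" files in Python, assuming the tab-separated, header-less layout described above (the filename matches the naming used in section 2.4):

    import csv

    COLUMNS = ["isogramy", "length", "word", "source_pos", "count",
               "vol_count", "count_per_million", "vol_count_as_percent",
               "is_palindrome", "is_tautonym"]

    with open("bnc-isograms.csv", newline="") as f:
        for row in csv.reader(f, delimiter="\t"):
            record = dict(zip(COLUMNS, row))
            if record["is_palindrome"] == "1":  # palindromic isograms only
                print(record["word"])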

    The ".totals" files have a slightly different format, with one row per data point, where the first column is the label and the second column is the associated value. The ".totals" files contain the following data:

    Label Data type Description

    !total_1grams int The total number of words in the corpus

    !total_volumes int The total number of volumes (individual sources) in the corpus

    !total_isograms int The total number of isograms found in the corpus (before compacting)

    !total_palindromes int How many of the isograms found are palindromes

    !total_tautonyms int How many of the isograms found are tautonyms

    The CSV files are mainly useful for further automated data processing. For working with the data set directly (e.g. to do statistics or cross-check entries), I would recommend using the database format described below.

    1.2 SQLite database format
    On the other hand, the SQLite database combines the data from all four of the plain text files, and adds various useful combinations of the two datasets, namely:
    • Compacted versions of each dataset, where identical headwords are combined into a single entry.
    • A combined compacted dataset, combining and compacting the data from both Ngrams and the BNC.
    • An intersected dataset, which contains only those words which are found in both the Ngrams and the BNC dataset.
    The intersected dataset is by far the least noisy, but is missing some real isograms, too. The columns/layout of each of the tables in the database is identical to that described for the CSV/.totals files above. To get an idea of the various ways the database can be queried for various bits of data, see the R script described below, which computes statistics based on the SQLite database.

    2. Scripts
    There are three scripts: one for tidying Ngram and BNC word lists and extracting isograms, one to create a neat SQLite database from the output, and one to compute some basic statistics from the data. The first script can be run using Python 3, the second script can be run using SQLite 3 from the command line, and the third script can be run in R/RStudio (R version 3).

    2.1 Source data
    The scripts were written to work with word lists from Google Ngram and the BNC, which can be obtained from http://storage.googleapis.com/books/ngrams/books/datasetsv2.html and https://www.kilgarriff.co.uk/bnc-readme.html (download all.al.gz). For Ngram the script expects the path to the directory containing the various files, for BNC the direct path to the *.gz file.

    2.2 Data preparation
    Before processing proper, the word lists need to be tidied to exclude superfluous material and some of the most obvious noise. This will also bring them into a uniform format. Tidying and reformatting can be done by running one of the following commands:
    python isograms.py --ngrams --indir=INDIR --outfile=OUTFILE
    python isograms.py --bnc --indir=INFILE --outfile=OUTFILE
    Replace INDIR/INFILE with the input directory or filename and OUTFILE with the filename for the tidied and reformatted output.

    2.3 Isogram extraction
    After preparing the data as above, isograms can be extracted by running the following command on the reformatted and tidied files:
    python isograms.py --batch --infile=INFILE --outfile=OUTFILE
    Here INFILE should refer to the output from the previous data cleaning process. Please note that the script will actually write two output files: one named OUTFILE with a word list of all the isograms and their associated frequency data, and one named "OUTFILE.totals" with very basic summary statistics.

    2.4 Creating a SQLite3 database
    The output data from the above step can be easily collated into a SQLite3 database which allows for easy querying of the data directly for specific properties. The database can be created by following these steps:
    1. Make sure the files with the Ngrams and BNC data are named "ngrams-isograms.csv" and "bnc-isograms.csv" respectively. (The script assumes you have both of them; if you only want to load one, just create an empty file for the other one.)
    2. Copy the "create-database.sql" script into the same directory as the two data files.
    3. On the command line, go to the directory where the files and the SQL script are.
    4. Type: sqlite3 isograms.db < create-database.sql
    5. This will create a database called "isograms.db".
    See section 1 for a basic description of the output data and how to work with the database.

    2.5 Statistical processing
    The repository includes an R script (R version 3) named "statistics.r" that computes a number of statistics about the distribution of isograms by length, frequency, contextual diversity, etc. This can be used as a starting point for running your own stats. It uses RSQLite to access the SQLite database version of the data described above.
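
    A quick way to explore the resulting database from Python (a sketch; the table names inside the database are not listed here, so inspect the schema first):

    import sqlite3

    con = sqlite3.connect("isograms.db")

    # List the tables actually present before querying.
    tables = con.execute(
        "SELECT name FROM sqlite_master WHERE type='table'").fetchall()
    print(tables)

    # Example query once a table name is known (placeholder name):
    # for word, length in con.execute(
    #         "SELECT word, length FROM bnc_isograms "
    #         "WHERE is_palindrome = 1 ORDER BY length DESC LIMIT 10"):
    #     print(word, length)
    con.close()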

  3. Data from: Global data on crop nutrient concentration and harvest indices

    • data.niaid.nih.gov
    zip
    Updated Nov 27, 2023
    Cite
    Cameron I. Ludemann; Renske Hijbeek; Marloes van Loon; T. Scott Murrell; Achim Dobermann; Martin van Ittersum (2023). Global data on crop nutrient concentration and harvest indices [Dataset]. http://doi.org/10.5061/dryad.n2z34tn0x
    Explore at:
    zip (available download formats)
    Dataset updated
    Nov 27, 2023
    Dataset provided by
    International Fertilizer Association (http://www.fertilizer.org/)
    Wageningen University & Research
    African Plant Nutrition Institute
    Authors
    Cameron I. Ludemann; Renske Hijbeek; Marloes van Loon; T. Scott Murrell; Achim Dobermann; Martin van Ittersum
    License

    CC0 1.0, https://spdx.org/licenses/CC0-1.0.html

    Description

    Estimates of crop nutrient removal (as crop products and crop residues) are an important component of crop nutrient balances. Crop nutrient removal can be estimated through multiplication of the quantity of crop products or crop residues (removed) by the nutrient concentration of those crop products and crop residue components respectively. Data for quantities of crop products removed at a country level are available through FAOSTAT (https://www.fao.org/faostat/en/), but equivalent data for quantities of crop residues are not available at a global level. However, quantities of crop residues can be estimated if the relationship between quantity of crop residues and crop products is known. Harvest index (HI) provides one such indication of the relationship between quantity of crop products and crop residues. HI is the proportion of above-ground biomass as crop products and can be used to estimate quantity of crop residues based on quantity of crop products. Previously, meta-analyses or surveys have been performed to estimate nutrient concentrations of crop products and crop residues and harvest indices (collectively known as crop coefficients). The challenges for using these coefficients in global nutrient balances include the representativeness of world regions or countries. Moreover, it may be unclear which countries or crop types are actually represented in the analyses of data. In addition, units used among studies differ which makes comparisons challenging. To overcome these challenges, data from meta-analyses and surveys were collated in one dataset with standardised units and referrals to the original region and crop names used by the sources of data. Original region and crop names were converted into internationally recognised names, and crop coefficients were summarised into two Tiers of data, representing the world (Tier 1, with single coefficient values for the world) and specific regions or countries of the world (Tier 2, with single coefficient values for each country). This dataset will aid both global and regional analyses for crop nutrient balances.
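
    To make the HI relationship concrete (a worked sketch, not part of the dataset): since HI is product biomass divided by total above-ground biomass, residue quantity follows from product quantity as product x (1 - HI) / HI.

    def residue_from_product(product, harvest_index):
        """Estimate crop residue quantity from crop product quantity.

        HI = product / (product + residue), so
        residue = product * (1 - HI) / HI. Same units throughout.
        """
        return product * (1.0 - harvest_index) / harvest_index

    # Illustrative numbers only: 3 t of grain at HI = 0.47
    # implies roughly 3.38 t of residue.
    print(round(residue_from_product(3.0, 0.47), 2))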

    Methods

    Data acquisition
    Data were primarily collated from meta-analyses found in the scientific literature. Terms used in Ovid (https://ovidsp.ovid.com/), CAB Abstracts (https://www.cabdirect.org/) and Google Scholar (https://scholar.google.com/) were: (crop) AND ("nutrient concentration" OR "nutrient content" OR "harvest index") across any time. This search resulted in over 245,000 results. These results were refined to include studies that purported to represent crop nutrient concentration and/or harvest index of crops for geographic regions of the world, as opposed to site-specific field experiments. Given the range of different crops grown globally, preference was given to acquiring datasets that included multiple crops. In some cases, authors of meta-analyses were asked for raw data to aid the standardisation process. In addition, the International Fertilizer Association (IFA) and the Food and Agriculture Organization of the United Nations (UN FAO) provided data used for crop nutrient balances (FAOSTAT 2020). The request to UN FAO yielded phosphorus and potassium crop nutrient concentrations in addition to their publicly available nitrogen concentration values (FAOSTAT 2020). In total the refined search resulted in 26 different sources of data. Data files were converted to separate comma-delimited CSV files for each source of data, whereby a unique 'source' was a dataset from an article in the scientific literature or a dataset sent by the UN FAO or IFA. Crop nutrient concentrations were expressed as a percentage of dry matter and/or a percentage of fresh weight, depending on which units were reported and whether dry matter percentages of crop fresh weight were reported. Meta-data text files were written to accompany each standardised CSV file. The standardised CSV files for each source of data included information on the name of the original region and the crop coefficients it purported to represent, as well as the original names of the crops as categorised by the authors of the data. If the data related to a meta-analysis of multiple sources, information was included for the primary source of data when available. Data from the separate source files were collated into one file named 'Combined_crop_data.csv' using R Studio (version 4.1.0) (hereafter referred to as R) with the scripts available at https://github.com/ludemannc/Tier_1_2_crop_coefficients.git.

    Processing of data
    When transforming the combined data file ('Combined_crop_data.csv') into representative crop coefficients for different regions (available in 'Tier_1_and_2_crop_coefficients.csv'), crop coefficients that were duplicates from the same primary source of data were excluded from processing. For instance, Zhang et al. (2021) referred to multiple primary sources of data, and the data requested from the UN FAO and the IFA referred (in many cases) to crop coefficients from IPNI (2014). Duplicate crop coefficient data that came from the same primary source were therefore excluded from the summarised dataset of crop coefficients.

    Two tiers of data
    The data were sub-divided into two Tiers to help overcome the challenge of using these data in a global nutrient balance when data are not available for every country. This follows the approach taken by the Intergovernmental Panel on Climate Change (IPCC 2019). Data were assigned different 'Tiers' based on complexity and data requirements.
    · Tier 1: crop coefficients at the world level.
    · Tier 2: crop coefficients at more granular geographic regions of the world (e.g. at regional, country or sub-country levels).
    Crop coefficients were summarised as means for each crop item and crop component based on either 'Tier 1' or 'Tier 2'. One could also envision a more detailed, site-specific level (Tier 3). The data in this dataset did not meet the required level of complexity or data requirements for Tier 3, unlike, say, the site-specific data being collected as part of the Consortium for Precision Crop Nutrition (CPCN) (www.cropnutrientdata.net), which could be described as Tier 3. No data from the current dataset were therefore assigned to Tier 3. It is expected that in the future, site-specific data will be used to improve the crop coefficients further with a Tier 3 approach. The 'Tier_1_and_2_crop_coefficients.csv' file includes mean crop coefficients for the Tier 1 data and mean crop coefficients for the Tier 2 data. The Tier 1 estimates of crop coefficients were mean values across Tier 1 data that purported to represent the world. Crop coefficients found in the data sources represent quite different geographic areas or regions. To enable combining data with different spatial overlaps for Tier 2, data were disaggregated to the country level. First, each region was assigned a list of countries (which the regional averages were assumed to represent, as listed in the 'Original_region_names_and_assigned_countries.csv' file). Countries were assigned alpha-3 country codes following the ISO 3166 international standard (https://www.iso.org/publication/PUB500001.html). Second, for each country, mean crop coefficients were calculated based on coefficients from the regions listed for each country. For Australia, for example, the mean values for each crop coefficient were calculated from values that represented sub-country (e.g. Australia New South Wales South East), country (Australia), and multi-country (e.g. Oceania) regions. For instance, if there was a harvest index value of 0.5 for wheat for the original region 'Australia New South Wales South East', a value of 0.51 for the original region named 'Australia', and a value of 0.47 for the original region named 'Oceania', then the mean Tier 2 harvest index for wheat for the country Australia would be 0.493, the unweighted mean. Using our dataset, a user can assign different weights to each entry.
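
    The unweighted Tier 2 mean from the Australia example above can be reproduced directly (a sketch):

    # Harvest-index values for wheat from the example above.
    region_values = {
        "Australia New South Wales South East": 0.50,  # sub-country region
        "Australia": 0.51,                             # country region
        "Oceania": 0.47,                               # multi-country region
    }

    tier2_hi = sum(region_values.values()) / len(region_values)
    print(round(tier2_hi, 3))  # 0.493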
    To aid analysis, the names of the original categories of crop were converted into UN FAO crop 'item' categories, following UN FAO standards (FAOSTAT 2022) (available in the 'Original_crop_names_in_each_item_category.csv' file). These item categories were also assigned categorical numeric codes following UN FAO standards (FAOSTAT 2022). Data related to crop products (e.g. grain, beans, saleable tubers or fibre) were assigned the category "Crop_products" and crop residues (e.g. straw, stover) were assigned the category "Crop_residues".

    Dry and fresh matter weights
    In some cases nutrient concentration values from the original sources were available on a dry matter or a fresh weight basis, but not both. Gaps in either the nutrient concentration on a dry matter or fresh weight basis were given imputed values. If the data source mentioned the dry matter percentage of the crop component, then this was preferentially used to impute the missing nutrient concentration data. If dry matter percentage information was not available for a particular crop item or crop component, missing data were imputed using the mean dry matter percentage values across all Tier 1 and Tier 2 data.

    Global means for the UN FAO Cropland Nutrient Budget
    Data were also summarised as means for nitrogen (N), elemental phosphorus (P) and elemental potassium (K) nutrient concentrations of crop products, using data that represented the world (Tier 1), for the 2023 UN FAO Cropland Nutrient Budget. These data are available in the file named World_crop_coefficients_for_UN_FAO.csv.

  4. VLA Goulds Belt Survey Serpens Region Source Catalog - Dataset - NASA Open...

    • data.nasa.gov
    Updated Apr 1, 2025
    + more versions
    Cite
    nasa.gov (2025). VLA Goulds Belt Survey Serpens Region Source Catalog - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/vla-goulds-belt-survey-serpens-region-source-catalog
    Explore at:
    Dataset updated
    Apr 1, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    This table contains results from deep (~17 µJy) radio continuum observations of the Serpens molecular cloud, the Serpens South cluster, and the W40 region that were obtained using the Jansky Very Large Array (JVLA) in its A configuration. The authors detected a total of 146 sources, 29 of which are young stellar objects (YSOs), 2 of which are BV stars, and 5 more of which are associated with phenomena related to YSOs. Based on their radio variability and spectral index, the authors propose that about 16 of the remaining 110 unclassified sources are also YSOs. For approximately 65% of the known YSOs detected here as radio sources, the emission is most likely non-thermal and related to stellar coronal activity. As also recently observed in Ophiuchus, this sample of YSOs with X-ray counterparts lies below the fiducial Guedel & Benz (1993, ApJ, 405, L63) relation. In the reference paper, the authors analyze the proper motions of nine sources in the W40 region, thus allowing them to better constrain the membership of the radio sources in the region. The Serpens molecular cloud and the Serpens South cluster were observed in the same observing sessions on three different epochs (2011 June 17, July 19, and September 12 UT), using 25 and 4 pointings, respectively, with the JVLA at 4.5 and 7.5 GHz. The W40 region, on the other hand, was only observed on two epochs (2011 June 17 and July 16), using 13 pointings. The details of the observations are listed in Table 1 of the reference paper. The authors adopted the same criteria as Dzib et al. (2013, ApJ, 775, 63) to consider a detection as firm. For new sources, i.e., those without reported counterparts in the literature, they considered 5-sigma detections, where sigma is the rms noise of the area around the source. For known sources with counterparts in the literature, on the other hand, they included 4-sigma detections. According to these criteria, they detected 94 sources in the Serpens molecular cloud, 41 in the W40 region, and 8 in the Serpens South cluster, for a total of 143 detections. Out of the 143 sources, 69 are new detections (see Section 3.2 of the reference paper). GBS-VLA source positions were compared with source positions from X-ray, optical, near-IR, mid-IR, and radio catalogs. GBS-VLA sources were considered to have a counterpart at another wavelength when the positional coincidences were better than the combined uncertainties of the two data sets. These were about 1 arcsecond for the IR catalogs. For the X-ray and radio catalogs it depended on the instrument and its configuration. The search was done in SIMBAD and included all the major catalogs. The authors also accessed the lists of all YSOs in the c2d-GB clouds compiled by Dunham et al. (2013, AJ, 145, 94) and L.E. Allen et al. (2015, in preparation). In total, 354 c2d-GB sources lie inside the regions observed by the present survey. In order to find their radio counterparts, the authors imaged regions of 64 pixels in each dimension, centered on the c2d-GB positions, combining, as appropriate for each region, the three or two epochs. For this search, they only used the field whose phase center was closest to the source. Three additional radio sources were found in Serpens South in this pursuit, increasing the number of radio detections to 146.
    This table was created by the HEASARC in October 2015 based on electronic versions of Tables 2, 3 and 6 from the reference paper, which were obtained from the CDS (Catalog J/ApJ/805/9, files table2.dat, table3.dat and table6.dat). This is a service provided by NASA HEASARC.
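
    A sketch of the significance criteria described above (illustrative, not the authors' pipeline):

    def is_firm_detection(peak_flux, rms_noise, has_counterpart):
        """Apply the Dzib et al. (2013)-style cut: 5 sigma for new
        sources, 4 sigma for sources with a known counterpart."""
        threshold = 4.0 if has_counterpart else 5.0
        return peak_flux / rms_noise >= threshold

    # e.g. a 90 uJy peak over 17 uJy rms is a firm new detection (~5.3 sigma).
    print(is_firm_detection(90e-6, 17e-6, has_counterpart=False))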

  5. Data from: Historical produced water chemistry data compiled for the Orcutt...

    • catalog.data.gov
    • data.usgs.gov
    • +3more
    Updated Oct 8, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). Historical produced water chemistry data compiled for the Orcutt and Oxnard oil fields, Santa Barbara and Ventura Counties, southern California [Dataset]. https://catalog.data.gov/dataset/historical-produced-water-chemistry-data-compiled-for-the-orcutt-and-oxnard-oil-fields-san
    Explore at:
    Dataset updated
    Oct 8, 2025
    Dataset provided by
    U.S. Geological Survey
    Area covered
    Southern California, California, Ventura County, Santa Barbara, Orcutt
    Description

    This digital dataset represents historical geochemical and other information for 58 sample results of produced water from 56 sites in the Orcutt and Oxnard oil fields in Santa Barbara and Ventura Counties, respectively, in southern California. Produced water is a term used in the oil industry to describe water that is produced as a byproduct along with the oil and gas. The locations from which these historical samples were collected include 20 wells (12 in the Oxnard oil field and 8 in the Orcutt oil field). The top and bottom perforations are known for all except one (Dataset ID 33) of these wells. Additional sample sites include 13 storage tanks and 13 unidentifiable sources. Two of the storage tanks (Dataset IDs 8 and 54) are associated with one and two identifiable wells, respectively. Historical samples from other storage tanks and unidentifiable sample sources may also represent pre- or post-treated composite samples of produced water from single or multiple wells. Historical sample descriptions provide further insight about the site type associated with several of the samples. Eleven sites, including one well (Dataset ID 30), are classified as "injectate" based on the sample description combined with the designated well use at the time of sample collection (WD, water disposal). Two samples collected from wells in Orcutt (Dataset IDs 4 and 7), both oil wells with known perforation intervals, and one sample from an unidentified site (Dataset ID 56) are described as zone or formation samples. Three other samples were collected from two wells (Dataset IDs 46 and 49) in Oxnard identified as water source wells, which access groundwater for use in the production of oil.

    The numerical water chemistry data were compiled by the U.S. Geological Survey (USGS) from scanned laboratory analysis reports available from the California Geologic Energy Management Division (CalGEM). Sample site characteristics, such as well construction details, were attributed using a combination of information provided with the scanned laboratory analysis reports and well history files from CalGEM Well Finder. The compiled data are divided into two separate data files described as follows: 1) a summary data file identifying each site by name, the site location, basic construction information, and American Petroleum Institute (API) number (for wells), the number of chemistry samples, period of record, sample description, and the geologic formation associated with the origin of the sampled water, or its intended destination (the formation into which water was intended to be injected, for samples labeled as injectate); and 2) a data file of geochemistry analyses for selected water-quality indicators, major and minor ions, nutrients, and trace elements, with parameter code and (or) method, reporting level, reporting level type, and supplemental notes. A data dictionary was created to describe the geochemistry data file and is provided with this data release.

  6. U.S. Education Datasets: Unification Project

    • kaggle.com
    Updated Apr 13, 2020
    Cite
    Roy Garrard (2020). U.S. Education Datasets: Unification Project [Dataset]. https://www.kaggle.com/datasets/noriuk/us-education-datasets-unification-project/code
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 13, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Roy Garrard
    Area covered
    United States
    Description

    Author's Note 2019/04/20: Revisiting this project, I recently discovered the incredibly comprehensive API produced by the Urban Institute. It achieves all of the goals laid out for this dataset in wonderful detail. I recommend that interested users pay a visit to their site.

    Context

    This dataset is designed to bring together multiple facets of U.S. education data into one convenient CSV (states_all.csv).

    Contents

    • states_all.csv: The primary data file. Contains aggregates from all state-level sources in one CSV.

    • output_files/states_all_extended.csv: The contents of states_all.csv with additional data related to race and gender.

    Column Breakdown

    Identification

    • PRIMARY_KEY: A combination of the year and state name.
    • YEAR
    • STATE

    Enrollment

    A breakdown of students enrolled in schools by school year.

    • GRADES_PK: Number of students in Pre-Kindergarten education.

    • GRADES_4: Number of students in fourth grade.

    • GRADES_8: Number of students in eighth grade.

    • GRADES_12: Number of students in twelfth grade.

    • GRADES_1_8: Number of students in the first through eighth grades.

    • GRADES_9_12: Number of students in the ninth through twelfth grades.

    • GRADES_ALL: The count of all students in the state. Comparable to ENROLL in the financial data (which is the U.S. Census Bureau's estimate for students in the state).

    The extended version of states_all contains additional columns that break down enrollment by race and gender. For example:

    • G06_A_A: Total number of sixth grade students.

    • G06_AS_M: Number of sixth grade male students whose ethnicity was classified as "Asian".

    • G08_AS_A_READING: Average reading score of eighth grade students whose ethnicity was classified as "Asian".

    The represented races include AM (American Indian or Alaska Native), AS (Asian), HI (Hispanic/Latino), BL (Black or African American), WH (White), HP (Hawaiian Native/Pacific Islander), and TR (Two or More Races). The represented genders include M (Male) and F (Female).
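
    A sketch of decoding these column names (the grade/race/gender layout is inferred from the examples above; the "A" code for "all" is an assumption based on G06_A_A):

    RACES = {"AM": "American Indian or Alaska Native", "AS": "Asian",
             "HI": "Hispanic/Latino", "BL": "Black or African American",
             "WH": "White", "HP": "Hawaiian Native/Pacific Islander",
             "TR": "Two or More Races", "A": "All"}
    GENDERS = {"M": "Male", "F": "Female", "A": "All"}

    def decode(column):
        """Split e.g. 'G08_AS_A_READING' into its parts."""
        parts = column.split("_")
        grade = int(parts[0][1:])            # "G08" -> 8
        exam = parts[3] if len(parts) > 3 else None
        return grade, RACES[parts[1]], GENDERS[parts[2]], exam

    print(decode("G08_AS_A_READING"))  # (8, 'Asian', 'All', 'READING')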

    Financials

    A breakdown of states by revenue and expenditure.

    • ENROLL: The U.S. Census Bureau's count for students in the state. Should be comparable to GRADES_ALL (which is the NCES's estimate for students in the state).

    • TOTAL_REVENUE: The total amount of revenue for the state.

      • FEDERAL_REVENUE
      • STATE_REVENUE
      • LOCAL_REVENUE
    • TOTAL_EXPENDITURE: The total expenditure for the state.

      • INSTRUCTION_EXPENDITURE
      • SUPPORT_SERVICES_EXPENDITURE

      • CAPITAL_OUTLAY_EXPENDITURE

      • OTHER_EXPENDITURE

    Academic Achievement

    A breakdown of student performance as assessed by the corresponding exams (math and reading, grades 4 and 8).

    • AVG_MATH_4_SCORE: The state's average score for fourth graders taking the NAEP math exam.

    • AVG_MATH_8_SCORE: The state's average score for eighth graders taking the NAEP math exam.

    • AVG_READING_4_SCORE: The state's average score for fourth graders taking the NAEP reading exam.

    • AVG_READING_8_SCORE: The state's average score for eighth graders taking the NAEP reading exam.

    Data Processing

    The original sources can be found here:

    # Enrollment
    https://nces.ed.gov/ccd/stnfis.asp
    # Financials
    https://www.census.gov/programs-surveys/school-finances/data/tables.html
    # Academic Achievement
    https://www.nationsreportcard.gov/ndecore/xplore/NDE
    

    Data was aggregated using a Python program I wrote. The code (as well as additional project information) can be found [here][1].

    Methodology Notes

    • Spreadsheets for NCES enrollment data for 2014, 2011, 2010, and 2009 were modified to place key data on the same sheet, making scripting easier.

    • The column 'ENROLL' represents the U.S. Census Bureau data value (financial data), while the column 'GRADES_ALL' represents the NCES data value (demographic data). Though the two organizations correspond on this matter, these values (which are ostensibly the same) do vary. Their documentation chalks this up to differences in membership (i.e. what is and is not a fourth grade student).

    • Enrollment data from NCES has seen a number of changes across survey years. One of the more notable is that data on student gender does not appear to have been collected until 2009. The information in states_all_extended.csv reflects this.

    • NAEP test score data is only available for certain years.

    • The current version of this data is concerned with state-level patterns. It is the author's hope that future versions will allow for school district-level granularity.

    Acknowledgements

    Data is sourced from the U.S. Census Bureau and the National Center for Education Statistics (NCES).

    Licensing Notes

    The licensing of these datasets state that it must not be us...

  7. California City Boundaries and Identifiers

    • gis.data.ca.gov
    • data.ca.gov
    • +1more
    Updated Sep 16, 2024
    Cite
    California Department of Technology (2024). California City Boundaries and Identifiers [Dataset]. https://gis.data.ca.gov/maps/California::california-city-boundaries-and-identifiers
    Explore at:
    Dataset updated
    Sep 16, 2024
    Dataset authored and provided by
    California Department of Technology
    License

    CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Note: The schema changed in February 2025 - please see below. We will post a roadmap of upcoming changes, but service URLs and schema are now stable. For deployment status of new services beginning in February 2025, see https://gis.data.ca.gov/pages/city-and-county-boundary-data-status. Additional roadmap and status links are at the bottom of this metadata. This dataset is regularly updated as the source data from CDTFA is updated, as often as many times a month. If you require unchanging point-in-time data, export a copy for your own use rather than using the service directly in your applications.

    Purpose
    City boundaries along with third party identifiers used to join in external data. Boundaries are from the California Department of Tax and Fee Administration (CDTFA). These boundaries are the best available statewide data source in that CDTFA receives changes in incorporation and boundary lines from the Board of Equalization, who receives them from local jurisdictions for tax purposes. Boundary accuracy is not guaranteed, and though CDTFA works to align boundaries based on historical records and local changes, errors will exist. If you require a legal assessment of boundary location, contact a licensed surveyor. This dataset joins in multiple attributes and identifiers from the US Census Bureau and Board on Geographic Names to facilitate adding additional third party data sources. In addition, we attach attributes of our own to ease and reduce common processing needs and questions. Finally, coastal buffers are separated into separate polygons, leaving the land-based portions of jurisdictions and coastal buffers in adjacent polygons. This feature layer is for public use.

    Related Layers
    This dataset is part of a grouping of many datasets:
    • Cities: Only the city boundaries and attributes, without any unincorporated areas
      • With Coastal Buffers
      • Without Coastal Buffers (this dataset)
    • Counties: Full county boundaries and attributes, including all cities within as a single polygon
      • With Coastal Buffers
      • Without Coastal Buffers
    • Cities and Full Counties: A merge of the other two layers, so polygons overlap within city boundaries. Some customers require this behavior, so we provide it as a separate service.
      • With Coastal Buffers
      • Without Coastal Buffers
    • City and County Abbreviations
    • Unincorporated Areas (Coming Soon)
    • Census Designated Places
    • Cartographic Coastline
      • Polygon
      • Line source (Coming Soon)

    Working with Coastal Buffers
    The dataset you are currently viewing excludes the coastal buffers for cities and counties that have them in the source data from CDTFA. In the versions where they are included, they remain as a second polygon on cities or counties that have them, with all the same identifiers, and a value in the COASTAL field indicating if it's an ocean or a bay buffer. If you wish to have a single polygon per jurisdiction that includes the coastal buffers, you can run a Dissolve on the version that has the coastal buffers on all the fields except OFFSHORE and AREA_SQMI to get a version with the correct identifiers (see the sketch at the end of this entry).

    Point of Contact
    California Department of Technology, Office of Digital Services, odsdataservices@state.ca.gov

    Field and Abbreviation Definitions
    • CDTFA_CITY: CDTFA incorporated city name
    • CDTFA_COUNTY: CDTFA county name. For counties, this will be the name of the polygon itself. For cities, it is the name of the county the city polygon is within.
    • CDTFA_COPRI: county number followed by the 3-digit city primary number used in the Board of Equalization's 6-digit tax rate area numbering system. The boundary data originate with CDTFA's teams managing tax rate information, so this field is preserved and flows into this dataset.
    • CENSUS_GEOID: numeric geographic identifiers from the US Census Bureau
    • CENSUS_PLACE_TYPE: City, County, or Town, stripped off the census name for identification purposes.
    • GNIS_PLACE_NAME: Board on Geographic Names authorized nomenclature for area names published in the Geographic Name Information System
    • GNIS_ID: The numeric identifier from the Board on Geographic Names that can be used to join these boundaries to other datasets utilizing this identifier.
    • CDT_CITY_ABBR: Abbreviations of incorporated area names - originally derived from CalTrans Division of Local Assistance and now managed by CDT. Abbreviations are 4 characters. Not present in the county-specific layers.
    • CDT_COUNTY_ABBR: Abbreviations of county names - originally derived from CalTrans Division of Local Assistance and now managed by CDT. Abbreviations are 3 characters.
    • CDT_NAME_SHORT: The name of the jurisdiction (city or county) with the word "City" or "County" stripped off the end. Some changes may come to how we process this value to make it more consistent.
    • AREA_SQMI: The area of the administrative unit (city or county) in square miles, calculated in EPSG 3310 California Teale Albers.
    • OFFSHORE: Indicates if the polygon is a coastal buffer. Null for land polygons. Additional values include "ocean" and "bay".
    • PRIMARY_DOMAIN: Currently empty/null for all records. Placeholder field for the official URL of the city or county.
    • CENSUS_POPULATION: Currently null for all records. In the future, it will include the most recent US Census population estimate for the jurisdiction.
    • GlobalID: While all of the layers we provide in this dataset include a GlobalID field with unique values, we do not recommend you make any use of it. The GlobalID field exists to support offline sync, but is not persistent, so data keyed to it will be orphaned at our next update. Use one of the other persistent identifiers, such as GNIS_ID or GEOID, instead.

    Boundary Accuracy
    County boundaries were originally derived from a 1:24,000 accuracy dataset, with improvements made in some places to boundary alignments based on research into historical records and boundary changes as CDTFA learns of them. City boundary data are derived from pre-GIS tax maps, digitized at BOE and CDTFA, with adjustments made directly in GIS for new annexations, detachments, and corrections. Boundary accuracy within the dataset varies. While CDTFA strives to correctly include or exclude parcels from jurisdictions for accurate tax assessment, this dataset does not guarantee that a parcel is placed in the correct jurisdiction. When a parcel is in the correct jurisdiction, this dataset cannot guarantee accurate placement of boundary lines within or between parcels or rights of way. This dataset also provides no information on parcel boundaries. For exact jurisdictional or parcel boundary locations, please consult the county assessor's office and a licensed surveyor. CDTFA's data is used as the best available source because BOE and CDTFA receive information about changes in jurisdictions which otherwise would need to be collected independently by an agency or company to compile into usable map boundaries. CDTFA maintains the best available statewide boundary information. CDTFA's source data notes the following about accuracy: City boundary changes and county boundary line adjustments filed with the Board of Equalization per Government Code 54900. This GIS layer contains the boundaries of the unincorporated county and incorporated cities within the state of California. The initial dataset was created in March of 2015 and was based on the State Board of Equalization tax rate area boundaries. As of April 1, 2024, the maintenance of this dataset is provided by the California Department of Tax and Fee Administration for the purpose of determining sales and use tax rates. The boundaries are continuously being revised to align with aerial imagery when areas of conflict are discovered between the original boundary provided by the California State Board of Equalization and the boundary made publicly available by local, state, and federal government. Some differences may occur between actual recorded boundaries and the boundaries used for sales and use tax purposes. The boundaries in this map are representations of taxing jurisdictions for the purpose of determining sales and use tax rates and should not be used to determine precise city or county boundary line locations.

    Boundary Processing
    These data make a structural change from the source data. While the full boundaries provided by CDTFA include coastal buffers of varying sizes, many users need boundaries to end at the shoreline of the ocean or a bay. As a result, after examining existing city and county boundary layers, these datasets provide a coastline cut generally along the ocean-facing coastline. For county boundaries in northern California, the cut runs near the Golden Gate Bridge, while for cities, we cut along the bay shoreline and into the edge of the Delta at the boundaries of Solano, Contra Costa, and Sacramento counties. In the services linked above, the versions that include the coastal buffers contain them as a second (or third) polygon for the city or county, with the value in the COASTAL field set to whether it's a bay or ocean polygon. These can be processed back into a single polygon by dissolving on all the fields you wish to keep, since the attributes, other than the COASTAL field and geometry attributes (like areas), remain the same between the polygons for this purpose.

    Slivers
    In cases where a city or county's boundary ends near a coastline, our coastline data may cross back and forth many times while roughly paralleling the jurisdiction's boundary, resulting in many polygon slivers. We post-process the data to remove these slivers using a city/county boundary priority algorithm. That is, when the data run parallel to each other, we discard the coastline cut and keep the CDTFA-provided boundary, even if it extends into the ocean a small amount. This processing supports consistent boundaries for Fort Bragg, Point Arena, San Francisco, Pacifica, Half Moon Bay, and Capitola, in addition to others. More information on this algorithm will be provided soon.

    Coastline Caveats
    Some cities have buffers extending into water bodies that we do not cut at the shoreline. These
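
    Following the dissolve guidance under "Working with Coastal Buffers" above, a sketch with geopandas (assumes the with-buffers layer has been exported locally; the filename is a placeholder):

    import geopandas as gpd

    gdf = gpd.read_file("ca_cities_with_coastal_buffers.geojson")

    # Dissolve on every attribute except OFFSHORE and AREA_SQMI so a
    # jurisdiction's land polygon and coastal buffer merge into one.
    keep = [c for c in gdf.columns
            if c not in ("OFFSHORE", "AREA_SQMI", "geometry")]
    single_poly = gdf.dissolve(by=keep, as_index=False)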

  8. MAXI/GSC 7-Year High and Low Galactic Latitude Source Catalog (3MAXI)

    • s.cnmilf.com
    • catalog.data.gov
    Updated Sep 19, 2025
    + more versions
    Cite
    High Energy Astrophysics Science Archive Research Center (2025). MAXI/GSC 7-Year High and Low Galactic Latitude Source Catalog (3MAXI) [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/maxi-gsc-7-year-high-and-low-galactic-latitude-source-catalog-3maxi
    Explore at:
    Dataset updated
    Sep 19, 2025
    Dataset provided by
    High Energy Astrophysics Science Archive Research Center
    Description

    This table combines the published X-ray source catalogs of the high galactic latitude (|b| > 10°) sky, Kawamuro et al. 2018, and the low galactic latitude (|b| < 10°) sky, Hori et al. 2018, based on 7 years of MAXI Gas Slit Camera (GSC) data from 2009 August 13 to 2016 July 31. The low galactic latitude catalog contains 221 sources with a significance threshold > 6.5 sigma. The faintest low-latitude source has a flux of 5.2 x 10^-12 erg cm^-2 s^-1 (or an intensity of 0.43 mCrab) in the 4-10 keV band. The high galactic latitude catalog contains 686 sources detected at significances >= 6.5 sigma in the 4-10 keV band. The high-latitude 4-10 keV sensitivity reaches ~0.48 mCrab, or ~5.9 x 10^-12 erg cm^-2 s^-1, over half of the survey area. The same data-screening criteria were applied to obtain the low and high galactic catalogs. In their papers the authors describe the detection method, the statistical quantities derived for each source, and their variability. To derive a counterpart, each source was cross-matched with the Swift/BAT 105-month catalog (BAT105; Oh et al. 2018), the Uhuru fourth catalog (4U; Forman et al. 1978), the RXTE All-Sky Monitor long-term observed source table (XTEASMLONG16), the Meta-Catalog of X-Ray Detected Clusters of Galaxies (MCXC; Piffaretti et al. 2011), the XMM-Newton Slew Survey Catalog (XMMSL217), and the ROSAT All-Sky Survey Bright Source Catalog (1RXS; Voges et al. 1999). Seven of the sources in the low galactic latitudes were detected by binning the data differently (source numbers 215-221 in the catalog, named "73-day sources"), as were, similarly, four of the sources in the high galactic latitude catalog, named transient sources. The parameters in the combined table include the source name (3MAXI), the position and its error, the detection significances and fluxes in the 4-10 keV, 3-4 keV, and 10-20 keV bands, the hardness ratios (HR1: 3-4 keV vs. 4-10 keV; HR2: 4-10 keV vs. 10-20 keV), the excess variance in the 4-10 keV lightcurve, and information on the likely counterpart. The high galactic catalog also reports the flux in the 3-10 keV band, an additional hardness ratio (HR3: 3-10 keV vs. 10-20 keV), and an additional parameter representing variability. The hardness ratios are defined as (H-S)/(H+S), where S and H are the soft- and hard-band fluxes, respectively.
    This table was created by the HEASARC in April 2021. It is a combination of the 7-year low- and high-latitude MAXI source catalogs published in ApJS. The data for the low galactic latitude and the high galactic latitude were downloaded from the ApJS electronic versions of the Hori et al. 2018 (ApJS 235, 7) and Kawamuro et al. 2018 (ApJS 238, 33) papers, respectively. The low-latitude data included in this table are from tables 4, 5, 6, and 7 of the Hori paper, which report the X-ray sources detected (214 sources, table 4), their possible identifications (table 5), the transient sources discovered by binning the data on a 73-day period (7 sources, table 6), and their identifications (table 7). The high-latitude data included in this table are from tables 1, 2, and 3 of the Kawamuro paper, which report the X-ray sources detected (682 sources, table 1), their identifications (table 2), and the transient sources (4 sources, table 3). The low and high galactic latitude source catalogs provide for each individual source similar parameters for the X-ray properties, with the high-latitude catalog having three additional parameters: the flux in the 3-10 keV energy range, the 3-10/10-20 keV hardness ratio, and a time variability test. These parameters are kept in the HEASARC combined table and set to "blank" values for the low-latitude sources. The four sources in the high-latitude catalog named transient sources have only fluxes in the 4-10 keV band; no fluxes in the other energy bands or hardness ratios are reported. The HEASARC combined table includes a field to identify whether the source is from the low-latitude paper or the high-latitude paper and also maintains the source numbers that were published in the original catalogs. This is a service provided by NASA HEASARC.
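
    The hardness-ratio definition above translates directly to code (a sketch with illustrative flux values):

    def hardness_ratio(soft_flux, hard_flux):
        """HR = (H - S) / (H + S), with S and H the soft- and hard-band fluxes."""
        return (hard_flux - soft_flux) / (hard_flux + soft_flux)

    # HR1 pairs the 3-4 keV (soft) and 4-10 keV (hard) bands.
    print(round(hardness_ratio(soft_flux=1.2e-11, hard_flux=3.4e-11), 2))  # 0.48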

  9. Data from: ERA5 hourly data on single levels from 1940 to present

    • cds.climate.copernicus.eu
    • search-sandbox-2.test.dataone.org
    • +1more
    grib
    Updated Oct 18, 2025
    + more versions
    Cite
    ECMWF (2025). ERA5 hourly data on single levels from 1940 to present [Dataset]. http://doi.org/10.24381/cds.adbb2d47
    Explore at:
    grib (available download formats)
    Dataset updated
    Oct 18, 2025
    Dataset provided by
    European Centre for Medium-Range Weather Forecasts (http://ecmwf.int/)
    Authors
    ECMWF
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1940 - Oct 12, 2025
    Description

    ERA5 is the fifth generation ECMWF reanalysis for the global climate and weather for the past 8 decades. Data is available from 1940 onwards. ERA5 replaces the ERA-Interim reanalysis.

    Reanalysis combines model data with observations from across the world into a globally complete and consistent dataset using the laws of physics. This principle, called data assimilation, is based on the method used by numerical weather prediction centres, where every so many hours (12 hours at ECMWF) a previous forecast is combined with newly available observations in an optimal way to produce a new best estimate of the state of the atmosphere, called analysis, from which an updated, improved forecast is issued. Reanalysis works in the same way, but at reduced resolution to allow for the provision of a dataset spanning back several decades. Reanalysis does not have the constraint of issuing timely forecasts, so there is more time to collect observations, and when going further back in time, to allow for the ingestion of improved versions of the original observations, which all benefit the quality of the reanalysis product.

    ERA5 provides hourly estimates for a large number of atmospheric, ocean-wave and land-surface quantities. An uncertainty estimate is sampled by an underlying 10-member ensemble at three-hourly intervals. Ensemble mean and spread have been pre-computed for convenience. Such uncertainty estimates are closely related to the information content of the available observing system, which has evolved considerably over time. They also indicate flow-dependent sensitive areas. To facilitate many climate applications, monthly-mean averages have been pre-calculated too, though monthly means are not available for the ensemble mean and spread.

    ERA5 is updated daily with a latency of about 5 days. If serious flaws are detected in this early release (called ERA5T), the data could differ from the final release 2 to 3 months later; in such cases users are notified.

    The data set presented here is a regridded subset of the full ERA5 data set on native resolution. It is online on spinning disk, which should ensure fast and easy access. It should satisfy the requirements for most common applications. An overview of all ERA5 datasets can be found in this article. Information on access to ERA5 data on native resolution is provided in these guidelines. Data has been regridded to a regular lat-lon grid of 0.25 degrees for the reanalysis and 0.5 degrees for the uncertainty estimate (0.5 and 1 degree respectively for ocean waves). There are four main subsets: hourly and monthly products, both on pressure levels (upper air fields) and single levels (atmospheric, ocean-wave and land surface quantities). The present entry is "ERA5 hourly data on single levels from 1940 to present".
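
    A minimal retrieval sketch using the CDS API (assumes a registered CDS account with an API key in ~/.cdsapirc; the request keys mirror the CDS download form, and the variable/date choices are illustrative):

    import cdsapi

    client = cdsapi.Client()
    client.retrieve(
        "reanalysis-era5-single-levels",
        {
            "product_type": "reanalysis",
            "variable": "2m_temperature",
            "year": "2020",
            "month": "01",
            "day": "01",
            "time": "12:00",
            "format": "grib",
        },
        "era5_t2m_2020010112.grib",  # local output file
    )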

  10. Data from: HarDWR - Harmonized Water Rights Records

    • osti.gov
    Updated Oct 31, 2024
    Cite
    Caccese, Robert; Fisher-Vanden, Karen; Fowler, Lara; Grogan, Danielle; Lammers, Richard; Lisk, Matthew; Olmstead, Sheila; Peklak, Darrah; Zheng, Jiameng; Zuidema, Shan (2024). HarDWR - Harmonized Water Rights Records [Dataset]. https://www.osti.gov/dataexplorer/biblio/dataset/2475306
    Explore at:
    Dataset updated
    Oct 31, 2024
    Dataset provided by
    MultiSector Dynamics - Living, Intuitive, Value-adding, Environment
    USDOE Office of Science (SC), Biological and Environmental Research (BER)
    Authors
    Caccese, Robert; Fisher-Vanden, Karen; Fowler, Lara; Grogan, Danielle; Lammers, Richard; Lisk, Matthew; Olmstead, Sheila; Peklak, Darrah; Zheng, Jiameng; Zuidema, Shan
    Description

    A dataset within the Harmonized Database of Western U.S. Water Rights (HarDWR). For a detailed description of the database, please see the meta-record v2.0.

    Changelog
    v2.0
    - Recalculated based on data sourced from WestDAAT
    - Changed from using a Site ID column to identify unique records to using a combination of Site ID and Allocation ID
    - Removed the Water Management Area (WMA) column from the harmonized records. The replacement is a separate file which stores the relationship between allocations and WMAs. This allows for allocations to contribute water right amounts to multiple WMAs during the subsequent cumulative process.
    - Added a column describing a water right's legal status
    - Added "Unspecified" as a water source category
    - Added an acre-foot (AF) column
    - Added a column for the classification of the right's owner
    v1.02
    - Added a .RData file to the dataset as a convenience for anyone exploring our code. This is an internal file, and the one referenced in analysis scripts, as the data objects are already R data objects.
    v1.01
    - Updated the names of each file with an ID number less than 3 digits to include leading 0s
    v1.0
    - Initial public release

    Description
    Here we present an updated database of Western U.S. water right records. This database provides consistent unique identifiers for each water right record, and a consistent categorization scheme that puts each water right record into one of seven broad use categories. These data were instrumental in conducting a study of the multi-sector dynamics of inter-sectoral water allocation changes through water markets (Grogan et al., in review). Specifically, the data were formatted for use as input to a process-based hydrologic model, Water Balance Model (WBM), with a water rights module (Grogan et al., in review). While this specific study motivated the development of the database presented here, water management in the U.S. West is a rich area of study (e.g., Anderson and Woosly, 2005; Tidwell, 2014; Null and Prudencio, 2016; Carney et al., 2021), so releasing this database publicly with documentation and usage notes will enable other researchers to do further work on water management in the U.S. West.

    We produced the water rights database presented here in four main steps: (1) data collection, (2) data quality control, (3) data harmonization, and (4) generation of cumulative water rights curves. Each of steps (1)-(3) had to be completed in order to produce (4), the final product that was used in the modeling exercise in Grogan et al. (in review). All data in each step is associated with a spatial unit called a Water Management Area (WMA), which is the unit of water right administration utilized by the state from which the right came. Steps (2) and (3) required us to make assumptions and interpretations, and to remove records from the raw data collection. We describe each of these assumptions and interpretations below so that other researchers can choose to implement alternative assumptions and interpretations as fit their research aims.

    Motivation for Changing Data Sources
    The most significant change has been a switch from collecting the raw water rights directly from each state to using the water rights records presented in WestDAAT, a product of the Water Data Exchange (WaDE) Program under the Western States Water Council (WSWC). One of the main reasons for this is that each state of interest is a member of the WSWC, meaning that WaDE is partially funded by these states, as well as many universities.
As WestDAAT is also a database with consistent categorization, it has allowed us to spend less time on data collection and quality control and more time on answering research questions. This has included records from water right sources we had previously not known about when creating v1.0 of this database. The only major downside to utilizing the WestDAAT records as our raw data is that further updates are tied to when WestDAAT is updated, as some states update their public water right records daily. However, as our focus is on cumulative water amounts at the regional scale, it is unlikely most records updates would have a significant effect on our results. The structure of WestDAAT led to several important changes to how HarWR is formatted. The most significant change is that WaDE has calculated a field known as SiteUUID, which is a unique identifier for the Point of Diversion (POD), or where the water is drawn from. This separate from AllocationNativeID, which is the identifier for the allocation of water, or the amount of water associated with the water right. It should be noted that it is possible for a single site to have multiple allocations associated with it and for an allocation to be able to be extracted from multiple sites. The site-allocation structure has allowed us to adapt a more consistent, and hopefully more realistic, approach in organizing the water right records than we had with HarDWR v1.0. This was incredibly helpful as the raw data from many states had multiple water uses within a single field within a single row of their raw data, and it was not always clear if the first water use was the most important, or simply first alphabetically. WestDAAT has already addressed this data quality issue. Furthermore, with v1.0, when there were multiple records with the same water right ID, we selected the largest volume or flow amount and disregarded the rest. As WestDAAT was already a common structure for disparate data formats, we were better able to identify sites with multiple allocations and, perhaps more importantly, allocations with multiple sites. This is particularly helpful when an allocation has sites which cross WMA boundaries, instead of just assigning the full water amount to a single WMA we are now able to divide the amount of water between the number of relevant WMAs. As it is now possible to identify allocations with water used in multiple WMAs, it is no longer practical to store this information within a single column. Instead the stAllocationToWMATab.csv file was created, which is an allocation by WMA matrix containing the percent Place of Use area overlap with each WMA. We then use this percentage to divide the allocation's flow amount between the given WMAs during the cumulation process to hopefully provide more realistic totals of water use in each area. However, not every state provides areas of water use, so like HarDWR v1.0, a hierarchical decision tree was used to assign each allocation to a WMA. First, if a WMA could be identified based on the allocation ID, then that WMA was used; typically, when available, this applied to the entire state and no further steps were needed. Second was the spatial analysis of Place of Use to WMAs. Third was a spatial analysis of the POD locations to WMAs, with the assumption that allocation's POD is within the WMA it should belong to; if an allocation still had multiple WMAs based on its POD locations, then the allocation's flow amount would be divided equally between all WMAs. 
The fourth, and final, process was to include water allocations which spatially fell outside of the state WMA boundaries. This could be due to several reasons, such as coordinate errors / imprecision in the POD location, imprecision in the WMA boundaries, or rights attached with features, such as a reservoir, which crosses state boundaries. To include these records, we decided for any POD which was within one kilometer of the state's edge would be assigned to the nearest WMA. Other Changes WestDAAT has Allowed In addition to a more nuanced and consistent method of assigning water right's data to WMAs, there are other benefits gained from using the WestDAAT dataset. Among those is a consistent categorization of a water right's legal status. In HarDWR v1.0, legal status was effectively ignored, which led to many valid concerns about the quality of the database related to the amounts of water the rights allowed to be claimed. The main issue was that rights with legal status' such as "application withdrawn", "non-active", or "cancelled" were included within HarDWR v1.0. These, and other water rights status' which were deemed to not be in use have been removed from this version of the database. Another major change has been the addition of the "unspecified water source category. This is water that can come from either surface water or groundwater, or the source of which is unknown. The addition of this source category brings the total number of categories to three. Due to reviewer feedback, we decided to add the acre-foot (AF) column so that the data may be more applicable to a wider audience. We added the ownerClassification column so that the data may be more applicable to a wider audience. File Descriptions The dataset is a series of various files organized by state sub-directories. In addition, each file begins with the state's name, in case the file is separate from its sub-directory for some reason. After the state name is the text which describes the contents of the file. Here is each file described in detail. Note that st is a placeholder for the state's name. stFullRecords_HarmonizedRights.csv: A file of the complete water records for each state. The column headers for each of this type of file are: state - The name of the state to which the allocations belong to. FIPS - The two digit numeric state ID code. siteID - The site location ID for POD locations. A site may have multiple allocations, which are the actual amount of water which can be drawn. In a simplified hypothetical, a farm stead may have an allocation for "irrigation" and an allocation for "domestic" water use, but the water is drawn from the same pumping equipment. It should be noted that many of the site ID appear to have been added by WaDE, and therefore may not be recognized by a given state's water rights database. allocationID - The allocation ID for the water right. For most states this is the water right ID, and what is
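    The splitting step referenced above can be illustrated with a minimal Python/pandas sketch. This is not the project's actual code; the column names (allocationID, WMA, pct_overlap, flow_cfs) are hypothetical stand-ins for the long form of the stAllocationToWMATab.csv overlap matrix.

        import pandas as pd

        # One row per allocation: its ID and total flow amount (toy values).
        allocations = pd.DataFrame({
            "allocationID": ["A1", "A2"],
            "flow_cfs": [10.0, 4.0],
        })

        # Long form of an allocation-by-WMA overlap matrix: the share of each
        # allocation's Place of Use area falling inside each WMA.
        overlap = pd.DataFrame({
            "allocationID": ["A1", "A1", "A2"],
            "WMA": ["WMA-01", "WMA-02", "WMA-07"],
            "pct_overlap": [0.75, 0.25, 1.00],
        })

        # Divide each allocation's flow among its WMAs by overlap share, then
        # sum per WMA to get the cumulative amount credited to each area.
        split = overlap.merge(allocations, on="allocationID")
        split["flow_share"] = split["flow_cfs"] * split["pct_overlap"]
        per_wma = split.groupby("WMA", as_index=False)["flow_share"].sum()
        print(per_wma)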

  11. Factori AI & ML Training Data | Point of Interest Data (POI) | Global |...

    • datarade.ai
    .csv
    Updated May 1, 2024
    Cite
    Factori (2024). Factori AI & ML Training Data | Point of Interest Data (POI) | Global | Machine Learning Data [Dataset]. https://datarade.ai/data-products/factori-ai-ml-training-data-point-of-interest-data-poi-factori
    Explore at:
    .csvAvailable download formats
    Dataset updated
    May 1, 2024
    Dataset authored and provided by
    Factori
    Area covered
    Heard Island and McDonald Islands, Honduras, French Southern Territories, Haiti, Antigua and Barbuda, San Marino, Lithuania, Gambia, Svalbard and Jan Mayen, Mauritius
    Description

Our POI Data connects people's movements to over 200M physical locations globally. These are aggregated and anonymized data that are only used to offer context for the volume and patterns of visits to certain locations. This data feed is compiled from different data sources around the world.

Reach: Our POI/Place/OOH level insights are calculated based on Factori’s Mobility & People Graph data, aggregated from multiple data sources globally. To achieve the desired foot-traffic attribution, specific attributes are combined to produce the desired reach data. For instance, to calculate the foot traffic for a specific location, the location ID, day of the week, and part of the day can be combined to give specific location intelligence data. There can be a maximum of 40 data records possible for one POI based on the combination of these attributes.
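    As an illustration of that keying scheme, here is a minimal pandas sketch; the field names (location_id, day_of_week, day_part, visits) are hypothetical and not the feed's actual schema.

        import pandas as pd

        # Toy visit events for one POI.
        visits = pd.DataFrame({
            "location_id": ["poi_123"] * 4,
            "day_of_week": ["Mon", "Mon", "Tue", "Tue"],
            "day_part": ["morning", "evening", "morning", "evening"],
            "visits": [120, 340, 95, 310],
        })

        # One reach record per (location, day-of-week, part-of-day) combination --
        # the keying that caps the number of records possible per POI.
        reach = (
            visits.groupby(["location_id", "day_of_week", "day_part"],
                           as_index=False)["visits"].sum()
        )
        print(reach)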

    Data Export Methodology: Since we collect data dynamically, we provide the most updated data and insights via a best-suited method at a suitable interval (daily/weekly/monthly).

Use Cases: Geofencing: Geofencing involves creating virtual boundaries around physical locations, enabling businesses to trigger actions when users enter or exit these areas.

    Geo-Targeted Advertising: Utilizing location-based insights, businesses can deliver highly personalized advertisements to consumers based on their proximity to relevant POIs.

    Marketing Campaign Strategy: Analyzing visitor demographics and behavior patterns around POIs, businesses can tailor their marketing strategies to effectively reach their target audience.

    Site Selection: By assessing the proximity to relevant POIs such as competitors, customer demographics, and economic indicators, organizations can make informed decisions about opening new locations.

    OOH/DOOH Campaign Planning: Identify high-traffic locations and understand consumer behavior in specific areas, to execute targeted advertising strategies effectively.

Data Attributes Included: poi_id name category_id is_claimed photo_url brand_name brand_id places_topics people_also_search local_business_links naics_code naics_code_description sis_code sic_code_description shape_polygon building_id geometry_location_type geometry_viewport_northeast_lat geometry_viewport_northeast_lng geometry_viewport_southwest_lat geometry_viewport_southwest_lng geometry_location_lat geometry_location_lng calculated_geo_hash_8 building_type building_name shape_type reviews_count contact_info work_hours popular_times total_photos status attributes price_level rating domain url phone additional_categories longitude latitude country_code zip state city full_address description

  12. Lots

    • canadian-county-geographic-information-center-canadiancounty.hub.arcgis.com
    • hub.arcgis.com
    Updated Aug 8, 2023
    Cite
    CanadianCounty (2023). Lots [Dataset]. https://canadian-county-geographic-information-center-canadiancounty.hub.arcgis.com/datasets/7bbc6322290241a891f237dc43ed16bd
    Explore at:
    Dataset updated
    Aug 8, 2023
    Dataset authored and provided by
    CanadianCounty
    Area covered
    Description

The Canadian County Parcel Data Public View is a set of geospatial features representing the surface ownership of property in fee simple for property tax purposes as required by 68 O.S. § 2821, and other related data used to produce the parcels, such as subdivision boundaries and subdivision lots. The data is created from source documentation filed with the Canadian County Clerk's Office, including deeds, easements, and plats. Other data sources, such as Certified Corner Records filed with the State of Oklahoma or highway plans produced by the Department of Transportation, may be used to adjust parcel boundaries. Single legal descriptions may be split up into two or more parcels if the description crosses the boundaries of multiple taxing jurisdictions or crosses quarter section boundaries.

    Accuracy of parcel data can vary considerably due to a combination of factors. Most parcels and subdivision legal descriptions reference a quarter section or quarter section corner. The accuracy of the quarter section corners is discussed with Canadian County's Public Land Survey System Data. Accuracy is further enhanced or degraded by the quality of the legal description used to create the feature. Generally, legal descriptions created from surveys will have higher accuracy the newer they are, due to improvements in the field of surveying. However, it can be difficult to determine the age of a legal description, as descriptions are generally reused on subsequent deeds after the description was first created. Legal descriptions can occasionally contain updated bearings and distances and may denote the updates. The Assessor's Office uses the latest available legal description for creating parcels. Legal descriptions may lack specificity, such as the use of "North" instead of a measured bearing, or have missing parameters such as missing bearings for curved boundaries. In these cases, parcel data accuracy can be degraded. Further, if a legal description contains a specific landmark or boundary, sometimes called a "bound", the boundary is drawn to that point or landmark regardless of whether the bearing and/or distance accurately arrive at that point. For instance, if a legal description reads "...to the south line of the southeast quarter", the boundary is drawn to the south line of the quarter section even if the bearing and distance are short of or extend beyond that point. Because parcel data must be created for the entire county regardless of the accuracy of the descriptions used to create those parcels, parcels may need to be "stretched" or "squeezed" to make them fit together. When possible, the Assessor's Office relies on the most accurate legal descriptions to set the boundaries and then fits older boundaries to them. Due to the large number of variables, parcel data accuracy cannot be guaranteed, nor can the level of accuracy be described for the entire dataset. While Canadian County makes every reasonable effort to make sure parcel data is accurate, this data cannot be used in place of a survey performed by an Oklahoma Licensed Professional Land Surveyor.

    ParcelDataExternal - Polygons representing surface fee simple title. This parcel data is formatted and prepared for public use. Some fields may be blank to comply with 22 O.S. § 60.14 & 68 O.S. § 2899.1
    Attributes:
    Account (account): The unique identifier for parcel data generated by the appraisal software used by the Assessor's Office
    "A" Number (a_number): An integer assigned in approximate chronological order to represent each parcel divided per quarter section
    Parcel ID (parcel_id): Number used to identify parcels geographically; see Parcel Data Export Appendix A for an in-depth explanation. This identifier is not unique for all parcels
    Parcel Size (parcel_size): Size of the parcel; must be used in conjunction with the following units field
    Parcel Size Units (parcel_size_units): Units for the size of the parcel. Can be "Acres" or "Lots" for parcels within subdivisions that are valued per lot
    Owner's Name (owners_name): Name of the surface owner of the property in fee simple on record
    Mailing Information (mail_info): Extra space for the owner's name if needed, or trustee names
    Mailing Information 2 (mail_info2): Forwarded mail or "in care of" mailing information
    Mailing Address (mail_address): Mailing address for the owner or forwarding mailing address
    Mailing City (mail_city): Mailing or postal city
    Mailing State (mail_state): Mailing state abbreviated to standard United States Postal Service codes
    Mailing ZIP Code (mail_zip): Mailing ZIP code as determined by the United States Postal Service
    Tax Area Code (tax_area): Integer numeric code representing an area in which all the taxing jurisdictions are the same. See Parcel Data Appendix B for a more detailed description of each tax area
    Tax Area Description (tax_area_desc): Character string code representing the tax area. See Parcel Data Appendix B for a more detailed description of each tax area
    Property Class (prop_class): The Assessor's Office classification of each parcel by rural (no city taxes) or urban (subject to city taxes) and exempt, residential, commercial, or agriculture. This classification system is for property appraisal purposes and does not reflect zoning classifications in use by municipalities. See Parcel Data Appendix B for a more detailed description of each property classification
    Legal Description (legal): A highly abbreviated version of the legal description for each parcel. This legal description may not match the most recent legal description for any given property due to administrative divisions as described above, or changes made to the property by way of recorded instruments dividing smaller parcels from the original description. This description may NOT be used in place of a true legal description
    Subdivision Code (subdiv_code): A numeric code representing a recorded subdivision plat which contains the parcel. This value will be "0" for any parcel not part of a recorded subdivision plat
    Subdivision Name (subdiv_name): The name of the recorded subdivision plat, abbreviated as needed to adapt to appraisal software field limitations
    Subdivision Block Number (subdiv_block): Numeric field representing the block number of a parcel. This value will be "0" if the parcel is not in a recorded subdivision plat or if the plat did not contain block numbers
    Subdivision Lot Number (subdiv_lot): Numeric field representing the lot number of a parcel. This value will be "0" if the parcel is not in a recorded subdivision plat
    Township Number (township): Numeric field representing the Public Land Survey System tier or township the parcel is located in. All townships or tiers in Canadian County are north of the base line of the Indian Meridian
    Range Number (range): Numeric field representing the Public Land Survey System range the parcel is located in. All ranges in Canadian County are west of the Indian Meridian
    Section Number (section): Numeric field representing the Public Land Survey System section number the parcel is located in
    Quarter Section Code (quarter_sec): Numeric field with a code representing the quarter section a majority of the parcel is located in: 1 = Northeast Quarter, 2 = Northwest Quarter, 3 = Southwest Quarter, 4 = Southeast Quarter
    Situs Address (situs): Address of the property itself, if known
    Situs City (situs_city): Name of the city the parcel is actually located in (regardless of the postal city), or "Unincorporated" if the parcel is outside any incorporated city limits
    Situs ZIP Code (situs_zip): ZIP Code as determined by the United States Postal Service for the property itself, if known
    Land Value (land_val): Appraised value of the land encompassed by the parcel as determined by the Assessor's Office
    Improvement Value (impr_val): Appraised value of the improvements (house, commercial building, etc.) on the property as determined by the Assessor's Office
    Manufactured Home Value (mh_val): Appraised value of any manufactured homes on the property and owned by the same owner of the land, as determined by the Assessor's Office
    Total Value (total_val): Total appraised value for the property as determined by the Assessor's Office
    Total Capped Value (cap_val): The capped value as required by Article X, Section 8B of the Oklahoma Constitution
    Total Assessed Value (total_assess): The capped value multiplied by the assessment ratio of Canadian County, which is 12% of the capped value
    Homestead Exempt Amount (hs_ex_amount): The amount exempt from the assessed value if a homestead exemption is in place
    Other Exempt Value (other_ex_amount): The amount exempt from the assessed value if other exemptions are in place
    Taxable Value (taxable_val): The amount taxes are calculated on, which is the total assessed value minus all exemptions

    Subdivisions - Polygons representing a plat or subdivision filed with the County Clerk of Canadian County. Subdivision boundaries may be revised by vacations of the plat or subdivision or by replatting a portion or all of a subdivision. Therefore, subdivision boundaries may not match the boundaries as shown on the originally filed plat.
    Attributes:
    Subdivision Name (subdivision_name): The name of the plat or subdivision
    Subdivision Number (subdivision_number): An ID for each subdivision created as a portion of the parcel ID discussed in Parcel Data Export Appendix A
    Plat Book Number (book): The book number for the recorded document
    Plat Book Page Number (page): The page number for the recorded document
    Recorded Acres (acres): The number of acres within the subdivision, if known
    Recorded Date (recorded_date): The date the document creating the subdivision was recorded
    Document URL (clerk_url): URL to download a copy of the document recorded by the Canadian County Clerk's Office

    Blocks - Polygons derived from subdivision lots representing the blocks

  13. California County Boundaries and Identifiers with Coastal Buffers

    • data.ca.gov
    • catalog.data.gov
    • +1more
    Updated Mar 4, 2025
    Cite
    California Department of Technology (2025). California County Boundaries and Identifiers with Coastal Buffers [Dataset]. https://data.ca.gov/dataset/california-county-boundaries-and-identifiers-with-coastal-buffers
    Explore at:
    csv, zip, arcgis geoservices rest api, kml, gpkg, gdb, xlsx, html, txt, geojsonAvailable download formats
    Dataset updated
    Mar 4, 2025
    Dataset authored and provided by
    California Department of Technologyhttp://cdt.ca.gov/
    Area covered
    California
    Description

    Note: The schema changed in February 2025 - please see below. We will post a roadmap of upcoming changes, but service URLs and schema are now stable. For deployment status of new services beginning in February 2025, see https://gis.data.ca.gov/pages/city-and-county-boundary-data-status. Additional roadmap and status links at the bottom of this metadata.

    This dataset is regularly updated as the source data from CDTFA is updated, as often as many times a month. If you require unchanging point-in-time data, export a copy for your own use rather than using the service directly in your applications.


    Purpose
    County boundaries along with third party identifiers used to join in external data. Boundaries are from the California Department of Tax and Fee Administration (CDTFA). These boundaries are the best available statewide data source in that CDTFA receives changes in incorporation and boundary lines from the Board of Equalization, who receives them from local jurisdictions for tax purposes. Boundary accuracy is not guaranteed, and though CDTFA works to align boundaries based on historical records and local changes, errors will exist. If you require a legal assessment of boundary location, contact a licensed surveyor.

    This dataset joins in multiple attributes and identifiers from the US Census Bureau and Board on Geographic Names to facilitate adding additional third party data sources. In addition, we attach attributes of our own to ease and reduce common processing needs and questions. Finally, coastal buffers are separated into separate polygons, leaving the land-based portions of jurisdictions and coastal buffers in adjacent polygons. This feature layer is for public use.

    Related Layers

    This dataset is part of a grouping of many datasets:

    1. Cities: Only the city boundaries and attributes, without any unincorporated areas
    2. Counties: Full county boundaries and attributes, including all cities within as a single polygon
    3. Cities and Full Counties: A merge of the other two layers, so polygons overlap within city boundaries. Some customers require this behavior, so we provide it as a separate service.
    4. City and County Abbreviations
    5. Unincorporated Areas (Coming Soon)
    6. Census Designated Places
    7. Cartographic Coastline
    Working with Coastal Buffers
The dataset you are currently viewing includes the coastal buffers for cities and counties that have them in the source data from CDTFA. In the versions where they are included, they remain as a second polygon on cities or counties that have them, with all the same identifiers, and a value in the COASTAL field indicating if it's an ocean or a bay buffer. If you wish to have a single polygon per jurisdiction that includes the coastal buffers, you can run a Dissolve on the version that has the coastal buffers on all the fields except OFFSHORE and AREA_SQMI to get a version with the correct identifiers.
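    For example, here is a minimal GeoPandas sketch of that Dissolve, under the assumption that you have exported this layer to a local file (the path is hypothetical):

        import geopandas as gpd

        # Hypothetical local export of the with-buffers layer.
        counties = gpd.read_file("california_counties_with_buffers.gpkg")

        # Dissolve on every attribute except OFFSHORE, AREA_SQMI, and the
        # geometry, merging the land polygon and its coastal buffer back into
        # one polygon per jurisdiction while keeping the identifiers intact.
        key_fields = [c for c in counties.columns
                      if c not in ("OFFSHORE", "AREA_SQMI", "geometry")]
        dissolved = counties.dissolve(by=key_fields, as_index=False)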

    Point of Contact

    California Department of Technology, Office of Digital Services, gis@state.ca.gov

    Field and Abbreviation Definitions

    • CDTFA_COUNTY: CDTFA county name. For counties, this will be the name of the polygon itself. For cities, it is the name of the county the city polygon is within.
• CDTFA_COPRI: county number followed by the 3-digit city primary number used in the Board of Equalization's 6-digit tax rate area numbering system. The boundary data originate with CDTFA's teams managing tax rate information, so this field is preserved and flows into this dataset.
    • CENSUS_GEOID: numeric geographic identifiers from the US Census Bureau
• CENSUS_PLACE_TYPE: City, County, or Town, stripped off the census name for identification purposes.
    • GNIS_PLACE_NAME: Board on Geographic Names authorized nomenclature for area names published in the Geographic Name Information System
    • GNIS_ID: The numeric identifier from the Board on Geographic Names that can be used to join these boundaries to other datasets utilizing this identifier.
    • CDT_COUNTY_ABBR: Abbreviations of county names - originally derived from CalTrans Division of Local Assistance and now managed by CDT. Abbreviations are 3 characters.
    • CDT_NAME_SHORT: The name of the jurisdiction (city or county) with the word "City" or "County" stripped off the end. Some changes may come to how we process this value to make it more consistent.
    • AREA_SQMI: The area of the administrative unit (city or county) in square miles, calculated in EPSG 3310 California Teale Albers.
    • OFFSHORE: Indicates if the polygon is a coastal buffer. Null for land polygons. Additional values include "ocean" and "bay".
    • PRIMARY_DOMAIN: Currently empty/null for all records. Placeholder field for official URL of the city or county
    • CENSUS_POPULATION: Currently null for all records. In the future, it will include the most recent US Census population estimate for the jurisdiction.
    • GlobalID: While all of the layers we provide in this dataset include a GlobalID field with unique values, we do not recommend you make any use of it. The GlobalID field exists to support offline sync, but is not persistent, so data keyed to it will be orphaned at our next update. Use one of the other persistent identifiers, such as GNIS_ID or GEOID instead.

    Boundary Accuracy
    County boundaries were originally derived from a 1:24,000 accuracy dataset, with improvements made in some places to boundary alignments based on research into historical records and boundary changes as CDTFA learns of them. City boundary data are derived from pre-GIS tax maps, digitized at BOE and CDTFA, with adjustments made directly in GIS for new annexations, detachments, and corrections.

  14. Data from: A dataset to model Levantine landcover and land-use change...

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Dec 16, 2023
    Cite
    Michael Kempf; Michael Kempf (2023). A dataset to model Levantine landcover and land-use change connected to climate change, the Arab Spring and COVID-19 [Dataset]. http://doi.org/10.5281/zenodo.10396148
    Explore at:
    zipAvailable download formats
    Dataset updated
    Dec 16, 2023
    Dataset provided by
    Zenodohttp://zenodo.org/
    Authors
    Michael Kempf; Michael Kempf
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 16, 2023
    Area covered
    Levant
    Description

    Overview

    This dataset is the repository for the following paper submitted to Data in Brief:

    Kempf, M. A dataset to model Levantine landcover and land-use change connected to climate change, the Arab Spring and COVID-19. Data in Brief (submitted: December 2023).

    The Data in Brief article contains the supplement information and is the related data paper to:

    Kempf, M. Climate change, the Arab Spring, and COVID-19 - Impacts on landcover transformations in the Levant. Journal of Arid Environments (revision submitted: December 2023).

    Description/abstract

    The Levant region is highly vulnerable to climate change, experiencing prolonged heat waves that have led to societal crises and population displacement. Since 2010, the area has been marked by socio-political turmoil, including the Syrian civil war and currently the escalation of the so-called Israeli-Palestinian Conflict, which strained neighbouring countries like Jordan due to the influx of Syrian refugees and increases population vulnerability to governmental decision-making. Jordan, in particular, has seen rapid population growth and significant changes in land-use and infrastructure, leading to over-exploitation of the landscape through irrigation and construction. This dataset uses climate data, satellite imagery, and land cover information to illustrate the substantial increase in construction activity and highlights the intricate relationship between climate change predictions and current socio-political developments in the Levant.

    Folder structure

The main folder after download contains all data, in which the following subfolders are stored as zipped files:

“code” stores the 9 code chunks described below, used to read, extract, process, analyse, and visualize the data.

    “MODIS_merged” contains the 16-days, 250 m resolution NDVI imagery merged from three tiles (h20v05, h21v05, h21v06) and cropped to the study area, n=510, covering January 2001 to December 2022 and including January and February 2023.

    “mask” contains a single shapefile, which is the merged product of administrative boundaries, including Jordan, Lebanon, Israel, Syria, and Palestine (“MERGED_LEVANT.shp”).

    “yield_productivity” contains .csv files of yield information for all countries listed above.

    “population” contains two files with the same name but different format. The .csv file is for processing and plotting in R. The .ods file is for enhanced visualization of population dynamics in the Levant (Socio_cultural_political_development_database_FAO2023.ods).

    “GLDAS” stores the raw data of the NASA Global Land Data Assimilation System datasets that can be read, extracted (variable name), and processed using code “8_GLDAS_read_extract_trend” from the respective folder. One folder contains data from 1975-2022 and a second the additional January and February 2023 data.

“built_up” contains the landcover and built-up change data from 1975 to 2022. This folder is subdivided into two subfolders, which contain the raw data and the already processed data. “raw_data” contains the unprocessed datasets and “derived_data” stores the cropped built_up datasets at 5 year intervals, e.g., “Levant_built_up_1975.tif”.

    Code structure

    1_MODIS_NDVI_hdf_file_extraction.R


This is the first code chunk that refers to the extraction of MODIS data from .hdf file format. The following packages must be installed and the raw data must be downloaded using a simple mass downloader, e.g., from Google Chrome. Packages: terra. Download MODIS data after registration from: https://lpdaac.usgs.gov/products/mod13q1v061/ or https://search.earthdata.nasa.gov/search (MODIS/Terra Vegetation Indices 16-Day L3 Global 250m SIN Grid V061, last accessed 09th of October 2023). The code reads a list of files, extracts the NDVI, and saves each file to a single .tif-file with the indication “NDVI”. Because the study area is quite large, we have to load three different (spatially) time series and merge them later. Note that the time series are temporally consistent.


    2_MERGE_MODIS_tiles.R


In this code, we load and merge the three different stacks to produce a large and consistent time series of NDVI imagery across the study area. We further use the package gtools to load the files in order (1, 2, 3, 4, 5, 6, etc.). Here, we have three stacks, from which we merge the first two (stack 1, stack 2) and store them. We then merge this stack with stack 3. We produce single files named NDVI_final_*consecutivenumber*.tif. Before saving the final output of single merged files, create a folder called “merged” and set the working directory to this folder, e.g., setwd("your directory_MODIS/merged").


    3_CROP_MODIS_merged_tiles.R


Now we want to crop the derived MODIS tiles to our study area. We are using a mask, which is provided as a .shp file in the repository, named "MERGED_LEVANT.shp". We load the merged .tif files and crop the stack with the vector. Saving to individual files, we name them “NDVI_merged_clip_*consecutivenumber*.tif”. We have now produced single cropped NDVI time series data from MODIS.
    The repository provides the already clipped and merged NDVI datasets.


    4_TREND_analysis_NDVI.R


Now, we want to perform trend analysis on the derived data. The data we load are tricky, as they contain 16-day return periods across a year for a period of 22 years. Growing season sums contain MAM (March-May), JJA (June-August), and SON (September-November). December is represented as a single file, which means that the period DJF (December-February) is represented by 5 images instead of 6. For the last DJF period (December 2022), the data from January and February 2023 can be added. The code selects the respective images from the stack, depending on which period is under consideration. From these stacks, individual annually resolved growing season sums are generated and the slope is calculated. We can then extract the p-values of the trend and characterize all values with a high confidence level (0.05). Using the ggplot2 package and the melt function from the reshape2 package, we can create a plot of the reclassified NDVI trends together with a local smoother (LOESS) of value 0.3.
    To increase comparability and understand the amplitude of the trends, z-scores were calculated and plotted, which show the deviation of the values from the mean. This has been done for the NDVI values as well as the GLDAS climate variables as a normalization technique.


    5_BUILT_UP_change_raster.R


    Let us look at the landcover changes now. We are working with the terra package and get raster data from here: https://ghsl.jrc.ec.europa.eu/download.php?ds=bu (last accessed 03. March 2023, 100 m resolution, global coverage). Here, one can download the temporal coverage that is aimed for and reclassify it using the code after cropping to the individual study area. Here, I summed up different raster to characterize the built-up change in continuous values between 1975 and 2022.


    6_POPULATION_numbers_plot.R


    For this plot, one needs to load the .csv-file “Socio_cultural_political_development_database_FAO2023.csv” from the repository. The ggplot script provided produces the desired plot with all countries under consideration.


    7_YIELD_plot.R


In this section, we are using the country productivity data from the supplement in the repository “yield_productivity” (e.g., "Jordan_yield.csv"). Each of the single country yield datasets is plotted in a ggplot and combined using the patchwork package in R.


    8_GLDAS_read_extract_trend


The last code provides the basis for the trend analysis of the climate variables used in the paper. The raw data can be accessed at https://disc.gsfc.nasa.gov/datasets?keywords=GLDAS%20Noah%20Land%20Surface%20Model%20L4%20monthly&page=1 (last accessed 9th of October 2023). The raw data come in .nc file format, and various variables can be extracted using the [“^a variable name”] command from the spatraster collection. Each time you run the code, this variable name must be adjusted to meet the requirements for the variables (see this link for abbreviations: https://disc.gsfc.nasa.gov/datasets/GLDAS_CLSM025_D_2.0/summary, last accessed 09th of October 2023; or the respective code chunk when reading a .nc file with the ncdf4 package in R), or run print(nc) from the code, or use names() on the spatraster collection.
    Choosing one variable, the code uses the MERGED_LEVANT.shp mask from the repository to crop and mask the data to the outline of the study area.
    From the processed data, trend analyses are conducted and z-scores are calculated following the code described above. However, annual trends require the frequency of the time series analysis to be set to value = 12. Regarding, e.g., rainfall, which is measured as annual sums and not means, the chunk r.sum=r.sum/12 has to be removed or set to r.sum=r.sum/1 to avoid calculating annual mean values (see other variables). Seasonal subsets can be calculated as described in the code. Here, 3-month subsets were chosen for the growing seasons, e.g., March-May (MAM), June-August (JJA), September-November (SON), and DJF (December-February, including Jan/Feb of the consecutive year).
    From the data, mean values of 48 consecutive years are calculated and trend analyses are performed as described above. In the same way, p-values are extracted and 95% confidence level values are marked with dots on the raster plot. This analysis can be performed with a much longer time series, other variables, and different spatial extents across the globe due to the availability of the GLDAS variables.
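    The repository's scripts are in R, but the same seasonal-subset idea can be sketched in Python with xarray, assuming a local GLDAS NetCDF file and an illustrative variable name:

        import xarray as xr

        # Hypothetical local GLDAS file and variable; pick the variable name
        # to match the product you downloaded (e.g., via print(ds.data_vars)).
        ds = xr.open_dataset("GLDAS_example.nc")
        var = ds["Rainf_tavg"]

        # 3-month seasonal means starting in December (DJF, MAM, JJA, SON),
        # and annual means for trend analysis.
        seasonal = var.resample(time="QS-DEC").mean()
        annual = var.groupby("time.year").mean("time")

        # z-scores: deviation of each annual value from the long-term mean.
        z = (annual - annual.mean("year")) / annual.std("year")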

  15. WMAP Nine-Year CMB-Free QVW Point Source Catalog

    • catalog.data.gov
    • data.nasa.gov
    Updated Sep 19, 2025
    Cite
    High Energy Astrophysics Science Archive Research Center (2025). WMAP Nine-Year CMB-Free QVW Point Source Catalog [Dataset]. https://catalog.data.gov/dataset/wmap-nine-year-cmb-free-qvw-point-source-catalog
    Explore at:
    Dataset updated
    Sep 19, 2025
    Dataset provided by
    High Energy Astrophysics Science Archive Research Center
    Description

The Wilkinson Microwave Anisotropy Probe (WMAP) is designed to produce all-sky maps of the cosmic microwave background (CMB) anisotropy. The WMAP 9-Year CMB-Free Point Source Catalog contained herein has information on 502 point sources in three frequency bands (41, 61 and 94 GHz, also known as the Q, V, and W bands, respectively) based on data from the entire 9 years of the WMAP sky survey from 10 Aug 2001 0:00 UT to 10 Aug 2010 0:00 UT, inclusive. The CMB-free method of point source identification was originally applied to one-year and three-year V- and W-band maps by Chen & Wright (2008, ApJ, 681, 747) and to five-year V- and W-band maps by Wright et al. (2009, ApJS, 180, 283). The method used here is that applied to five-year Q-, V-, and W-band maps by Chen & Wright (2009, ApJ, 694, 222) and to seven-year Q-, V-, and W-band maps by Gold et al. (2011, ApJS, 192, 15).
    The V- and W-band maps are smoothed to Q-band resolution. An internal linear combination (ILC) map (see Section 5.3.3 of the reference paper) is then formed from the three maps using weights such that CMB fluctuations are removed, flat-spectrum point sources are retained with fluxes normalized to Q-band, and the variance of the ILC map is minimized. The ILC map is filtered to reduce noise and suppress large angular scale structure. Peaks in the filtered map that are > 5 sigma and outside of the nine-year point source catalog mask are identified as point sources, and source positions are obtained by fitting the beam profile plus a baseline to the filtered map for each source. For the nine-year analysis, the position of the brightest pixel is adopted instead of the fit position in rare instances where they differ by > 0.1 degrees. Source fluxes are estimated by integrating the Q, V, and W temperature maps within 1.25 degrees of each source position, with a weighting function to enhance the contrast of the point source relative to background fluctuations, and applying a correction for Eddington bias due to noise (sometimes called "deboosting"). The authors identify possible 5-GHz counterparts to the WMAP sources found by cross-correlating with the GB6 (Gregory et al. 1996, ApJS, 103, 427), PMN (Griffith et al. 1994, ApJS, 90, 179; Griffith et al. 1995, ApJS, 97, 347; Wright et al. 1994, ApJS, 94, 111; Wright et al. 1996, ApJS, 103, 145), Kuehr et al. (1981, A&AS, 45, 367), and Healey et al. (2009, AJ, 138, 1032) catalogs. A 5-GHz source is identified as a counterpart if it lies within 11 arcminutes of the WMAP source position (the mean WMAP source position uncertainty is 4 arcminutes). When two or more 5 GHz sources are within 11 arcminutes, the brightest is assumed to be the counterpart and a multiple identification flag is entered in the catalog.
    A separate 9-year Point Source Catalog (available in Browse as the WMAPPTSRC table) has information on 501 point sources in five frequency bands from 23 to 94 GHz that were found using an alternative method. The two catalogs have 387 sources in common. As noted by Gold et al. (2011, ApJS, 192, 15), differences in the source populations detected by the two search methods are largely caused by Eddington bias in the five-band source detections due to CMB fluctuations and noise. At low flux levels, the five-band method tends to detect point sources located on positive CMB fluctuations and to overestimate their fluxes, and it tends to miss sources located in negative CMB fluctuations.
Other point source detection methods have been applied to WMAP data and have identified sources not found by our methods (e.g., Scodeller et al. (2012, ApJ, 753, 27); Lanz (2012, ADASS 7); Ramos et al. (2011, A&A, 528, A75), and references therein). For more details of how the point source catalogs were constructed, see Section 5.2.2 of the reference paper. This table was last updated by the HEASARC in January 2013 based on an electronic version of Table 19 from the reference paper which was obtained from the LAMBDA web site, the file http://lambda.gsfc.nasa.gov/data/map/dr5/dfp/ptsrc/wmap_ptsrc_catalog_cmb_free_9yr_v5.txt. The source_flag values of 'M' in this file were changed to the 'a' values that were used in the printed version of this table. This is a service provided by NASA HEASARC.
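    As a schematic of the CMB-free weighting described above (not the WMAP team's actual pipeline): the band weights minimize the variance of the combined map subject to two linear constraints, zero response to the CMB spectrum and unit response to a flat-spectrum source normalized to Q-band. A minimal NumPy sketch with toy numbers:

        import numpy as np

        # Toy band-band covariance of the Q, V, W maps.
        C = np.array([[1.0, 0.6, 0.5],
                      [0.6, 1.2, 0.7],
                      [0.5, 0.7, 1.5]])

        cmb_response = np.array([1.0, 1.0, 1.0])  # CMB is flat in thermodynamic units
        src_response = np.array([1.0, 0.9, 0.8])  # flat-spectrum source, Q-normalized (toy)

        # Constraints: zero CMB response, unit source response.
        A = np.vstack([cmb_response, src_response])
        b = np.array([0.0, 1.0])

        # Minimize w^T C w subject to A w = b:
        #   w = C^-1 A^T (A C^-1 A^T)^-1 b
        Ci = np.linalg.inv(C)
        w = Ci @ A.T @ np.linalg.solve(A @ Ci @ A.T, b)
        assert np.allclose(A @ w, b)  # CMB removed, source flux preserved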

  16. Habitat condition time-series for the Cooper region

    • researchdata.edu.au
    Updated Sep 22, 2021
    Cite
    Bioregional Assessment Program (2021). Habitat condition time-series for the Cooper region [Dataset]. https://researchdata.edu.au/habitat-condition-time-cooper-region/2980693
    Explore at:
    Dataset updated
    Sep 22, 2021
    Dataset provided by
    Data.govhttps://data.gov/
    Authors
    Bioregional Assessment Program
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract

This dataset represents a historic annual time-series of estimated habitat condition across the study region, from 2001 to 2018 inclusive. These spatial grids were developed by combining two spatial data products: (1) a new analytical approach called Compere, which contrasts the vegetation cover of each location with a set of environmentally similar locations across the region, for each year; (2) a temporally static spatial layer on the total length of all linear anthropogenic disturbances, in each 500 m grid cell across the study region. The resultant habitat condition spatial time-series is therefore intended to combine information on both localised (e.g., roads, seismic surveys) and dispersed (e.g., grazing, fire) influences on the habitat condition for biodiversity of each location, ranging continuously from '0' (completely degraded) to '1' (pristine).

    Attribution

    Geological and Bioregional Assessment Program

    History

To derive a past-to-present time-series of habitat condition across the buffered study regions, we combined two spatial data products. The first set of spatial data comes from a new analytical approach called Compere, which contrasts the vegetation cover of each location with a set of environmentally similar locations across the region. For the present purpose, we applied the Absolute Range Ratio (ARR) metric from Compere, which for each location at a time-point is the observed vegetation cover divided by the maximum vegetation cover across all the environmentally similar locations at that time-point. The ARR spatial layers were available for each year from 2001 to 2018 inclusive. To translate the ARR spatial layers to better represent a habitat condition metric (h_ARR), we rescaled the ARR values as:

    h_ARR = ARR + (k × ((100 − ARR) / 100))

    with the scalar k specifying the minimum habitat condition value, set to 40% for the present analysis.

    The second spatial dataset used in deriving habitat condition for biodiversity was a spatial layer on the total length of all linear disturbances in each 500 m grid cell across the study regions. We converted this layer into a habitat condition metric (h_L) by assuming complete habitat loss for a width of 10 m for all linear disturbances, then taking the inverse of the proportion of each grid cell area that was disturbed.

    These two data sources on habitat condition provide complementary information. The h_ARR condition metric derived from the Compere analysis is useful in detecting the broadscale impacts of actions such as grazing and fire management across the region on habitat condition. In contrast, the h_L condition metric derived from the spatial data on linear disturbances is useful in identifying known disturbances, such as roads, fence-lines and seismic survey lines. We therefore combined these two condition metrics, using a conservative approach of taking the minimum condition value from each of these metrics, for each grid cell:

    h = min(h_ARR, h_L)

    assuming a constant level of linear disturbance over the time period of the Compere analysis (2001-2018). This provided an historic annual time series of habitat condition across each study region.
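    A small NumPy sketch of this combination, with illustrative values on a 0-100 scale (divide by 100 for the dataset's 0-1 range); the linear-disturbance metric is read here as the complement of the disturbed fraction of each 500 m cell, which is an interpretation of the description above:

        import numpy as np

        k = 40.0                                 # minimum habitat condition (per cent)
        ARR = np.array([100.0, 80.0, 20.0])      # Absolute Range Ratio per grid cell

        # Rescale ARR to the habitat condition metric h_ARR.
        h_ARR = ARR + k * ((100.0 - ARR) / 100.0)

        # Linear disturbances: complete loss over a 10 m width, so condition is
        # the undisturbed proportion of each 500 m x 500 m grid cell.
        length_m = np.array([0.0, 500.0, 2000.0])  # metres of linear features per cell
        h_L = 100.0 * (1.0 - (length_m * 10.0) / (500.0 * 500.0))

        # Conservative combination: the per-cell minimum of the two metrics.
        h = np.minimum(h_ARR, h_L)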

  17. California County Boundaries and Identifiers

    • gis.data.ca.gov
    • data.ca.gov
    • +1more
    Updated Sep 16, 2024
    Cite
    California Department of Technology (2024). California County Boundaries and Identifiers [Dataset]. https://gis.data.ca.gov/datasets/california-county-boundaries-and-identifiers
    Explore at:
    Dataset updated
    Sep 16, 2024
    Dataset authored and provided by
    California Department of Technology
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Description

Note: The schema changed in February 2025 - please see below. We will post a roadmap of upcoming changes, but service URLs and schema are now stable. For deployment status of new services beginning in February 2025, see https://gis.data.ca.gov/pages/city-and-county-boundary-data-status. Additional roadmap and status links at the bottom of this metadata.

    This dataset is regularly updated as the source data from CDTFA is updated, as often as many times a month. If you require unchanging point-in-time data, export a copy for your own use rather than using the service directly in your applications.

    Purpose
    County boundaries along with third party identifiers used to join in external data. Boundaries are from the California Department of Tax and Fee Administration (CDTFA). These boundaries are the best available statewide data source in that CDTFA receives changes in incorporation and boundary lines from the Board of Equalization, who receives them from local jurisdictions for tax purposes. Boundary accuracy is not guaranteed, and though CDTFA works to align boundaries based on historical records and local changes, errors will exist. If you require a legal assessment of boundary location, contact a licensed surveyor.

    This dataset joins in multiple attributes and identifiers from the US Census Bureau and Board on Geographic Names to facilitate adding additional third party data sources. In addition, we attach attributes of our own to ease and reduce common processing needs and questions. Finally, coastal buffers are separated into separate polygons, leaving the land-based portions of jurisdictions and coastal buffers in adjacent polygons. This layer removes the coastal buffer polygons. This feature layer is for public use.

    Related Layers

    This dataset is part of a grouping of many datasets:

    1. Cities: Only the city boundaries and attributes, without any unincorporated areas
       • With Coastal Buffers
       • Without Coastal Buffers
    2. Counties: Full county boundaries and attributes, including all cities within as a single polygon
       • With Coastal Buffers
       • Without Coastal Buffers (this dataset)
    3. Cities and Full Counties: A merge of the other two layers, so polygons overlap within city boundaries. Some customers require this behavior, so we provide it as a separate service.
       • With Coastal Buffers
       • Without Coastal Buffers
    4. City and County Abbreviations
    5. Unincorporated Areas (Coming Soon)
    6. Census Designated Places
    7. Cartographic Coastline
       • Polygon
       • Line source (Coming Soon)

    Working with Coastal Buffers
    The dataset you are currently viewing excludes the coastal buffers for cities and counties that have them in the source data from CDTFA. In the versions where they are included, they remain as a second polygon on cities or counties that have them, with all the same identifiers, and a value in the COASTAL field indicating if it's an ocean or a bay buffer. If you wish to have a single polygon per jurisdiction that includes the coastal buffers, you can run a Dissolve on the version that has the coastal buffers on all the fields except OFFSHORE and AREA_SQMI to get a version with the correct identifiers.

    Point of Contact

    California Department of Technology, Office of Digital Services, gis@state.ca.gov

    Field and Abbreviation Definitions

    • CDTFA_COUNTY: CDTFA county name. For counties, this will be the name of the polygon itself. For cities, it is the name of the county the city polygon is within.
    • CDTFA_COPRI: county number followed by the 3-digit city primary number used in the Board of Equalization's 6-digit tax rate area numbering system. The boundary data originate with CDTFA's teams managing tax rate information, so this field is preserved and flows into this dataset.
    • CENSUS_GEOID: numeric geographic identifiers from the US Census Bureau
    • CENSUS_PLACE_TYPE: City, County, or Town, stripped off the census name for identification purposes.
    • GNIS_PLACE_NAME: Board on Geographic Names authorized nomenclature for area names published in the Geographic Name Information System
    • GNIS_ID: The numeric identifier from the Board on Geographic Names that can be used to join these boundaries to other datasets utilizing this identifier.
    • CDT_COUNTY_ABBR: Abbreviations of county names - originally derived from CalTrans Division of Local Assistance and now managed by CDT. Abbreviations are 3 characters.
    • CDT_NAME_SHORT: The name of the jurisdiction (city or county) with the word "City" or "County" stripped off the end. Some changes may come to how we process this value to make it more consistent.
    • AREA_SQMI: The area of the administrative unit (city or county) in square miles, calculated in EPSG 3310 California Teale Albers.
    • OFFSHORE: Indicates if the polygon is a coastal buffer. Null for land polygons. Additional values include "ocean" and "bay".
    • PRIMARY_DOMAIN: Currently empty/null for all records. Placeholder field for official URL of the city or county
    • CENSUS_POPULATION: Currently null for all records. In the future, it will include the most recent US Census population estimate for the jurisdiction.
    • GlobalID: While all of the layers we provide in this dataset include a GlobalID field with unique values, we do not recommend you make any use of it. The GlobalID field exists to support offline sync, but is not persistent, so data keyed to it will be orphaned at our next update. Use one of the other persistent identifiers, such as GNIS_ID or GEOID instead.

    Boundary Accuracy
    County boundaries were originally derived from a 1:24,000 accuracy dataset, with improvements made in some places to boundary alignments based on research into historical records and boundary changes as CDTFA learns of them. City boundary data are derived from pre-GIS tax maps, digitized at BOE and CDTFA, with adjustments made directly in GIS for new annexations, detachments, and corrections. Boundary accuracy within the dataset varies. While CDTFA strives to correctly include or exclude parcels from jurisdictions for accurate tax assessment, this dataset does not guarantee that a parcel is placed in the correct jurisdiction. When a parcel is in the correct jurisdiction, this dataset cannot guarantee accurate placement of boundary lines within or between parcels or rights of way. This dataset also provides no information on parcel boundaries. For exact jurisdictional or parcel boundary locations, please consult the county assessor's office and a licensed surveyor. CDTFA's data is used as the best available source because BOE and CDTFA receive information about changes in jurisdictions which otherwise need to be collected independently by an agency or company to compile into usable map boundaries. CDTFA maintains the best available statewide boundary information.

    CDTFA's source data notes the following about accuracy: City boundary changes and county boundary line adjustments filed with the Board of Equalization per Government Code 54900. This GIS layer contains the boundaries of the unincorporated county and incorporated cities within the state of California. The initial dataset was created in March of 2015 and was based on the State Board of Equalization tax rate area boundaries. As of April 1, 2024, the maintenance of this dataset is provided by the California Department of Tax and Fee Administration for the purpose of determining sales and use tax rates. The boundaries are continuously being revised to align with aerial imagery when areas of conflict are discovered between the original boundary provided by the California State Board of Equalization and the boundary made publicly available by local, state, and federal government. Some differences may occur between actual recorded boundaries and the boundaries used for sales and use tax purposes. The boundaries in this map are representations of taxing jurisdictions for the purpose of determining sales and use tax rates and should not be used to determine precise city or county boundary line locations.

    Boundary Processing
    These data make a structural change from the source data. While the full boundaries provided by CDTFA include coastal buffers of varying sizes, many users need boundaries to end at the shoreline of the ocean or a bay. As a result, after examining existing city and county boundary layers, these datasets provide a coastline cut generally along the ocean facing coastline. For county boundaries in northern California, the cut runs near the Golden Gate Bridge, while for cities, we cut along the bay shoreline and into the edge of the Delta at the boundaries of Solano, Contra Costa, and Sacramento counties. In the services linked above, the versions that include the coastal buffers contain them as a second (or third) polygon for the city or county, with the value in the COASTAL field set to whether it's a bay or ocean polygon. These can be processed back into a single polygon by dissolving on all the fields you wish to keep, since the attributes, other than the COASTAL field and geometry attributes (like areas), remain the same between the polygons for this purpose.

    Slivers
    In cases where a city or county's boundary ends near a coastline, our coastline data may cross back and forth many times while roughly paralleling the jurisdiction's boundary, resulting in many polygon slivers. We post-process the data to remove these slivers using a city/county boundary priority algorithm. That is, when the data run parallel to each other, we discard the coastline cut and keep the CDTFA-provided boundary, even if it extends into the ocean a small amount. This processing supports consistent boundaries for Fort Bragg, Point Arena, San Francisco, Pacifica, Half Moon Bay, and Capitola, in addition to others. More information on this algorithm will be provided soon.

    Coastline Caveats
    Some cities have buffers extending into water bodies that we do not cut at the shoreline. These include South Lake Tahoe and Folsom, which extend into neighboring lakes, and San Diego and surrounding cities that extend into San Diego Bay, which our shoreline encloses. If you have feedback on the exclusion of these

  18. o

    HarDWR - Raw Water Rights Records

    • osti.gov
    Updated Oct 31, 2020
    Cite
    Caccese, Robert; Fisher-Vanden, Karen; Fowler, Lara; Grogan, Danielle; Lammers, Richard; Lisk, Matthew; Olmstead, Sheila; Peklak, Darrah; Zheng, Jiameng; Zuidema, Shan (2020). HarDWR - Raw Water Rights Records [Dataset]. https://www.osti.gov/dataexplorer/biblio/dataset/2475305
    Explore at:
    Dataset updated
    Oct 31, 2020
    Dataset provided by
    MultiSector Dynamics - Living, Intuitive, Value-adding, Environment
    USDOE Office of Science (SC), Biological and Environmental Research (BER)
    Authors
    Caccese, Robert; Fisher-Vanden, Karen; Fowler, Lara; Grogan, Danielle; Lammers, Richard; Lisk, Matthew; Olmstead, Sheila; Peklak, Darrah; Zheng, Jiameng; Zuidema, Shan
    Description

    A dataset within the Harmonized Database of Western U.S. Water Rights (HarDWR). For a detailed description of the database, please see the meta-record v2.0.
    Changelog
    v2.0 - Switched source data from collecting records from each state independently to using the WestDAAT dataset
    v1.0 - Initial public release
    Description
    In order to hold a water right in the western United States, an entity (e.g., an individual, corporation, municipality, sovereign government, or non-profit) must register a physical document with the state's water regulatory agency. State water agencies each maintain their own database containing all registered water right documents within the state, along with relevant metadata such as the point of diversion and place of use of the water. All western U.S. states have digitized their individual water rights databases, as well as geospatial data defining the areas in which water rights are managed. Each state maintains and provides its own water rights data in accordance with individual state regulations and standards. In addition, while all states make their water rights publicly available, each provides its records in unique formats, meaning that file types, field availability, and terms vary from state to state. This leads to additional challenges in managing resources which cross state lines, or in conducting consistent multi-state water analyses. For the first version of HarDWR, we collected the water rights databases from 11 western states of the United States. In order to perform regional analyses with the collected data, the raw records had to be harmonized into one single format. The Water Data Exchange (WaDE) is a program dedicated to the sharing of water-related data for the western U.S. in a single consistent format. Created by the Western States Water Council (WSWC) to facilitate the collection and dissemination of water data among WSWC's member states and the public, WaDE provides an important service for those interested in water resource planning and management in their focus region. Of the services which WaDE provides, one of the most interesting is the WestDAAT dataset, a collection of water rights data provided by the 18 WSWC member states that has been standardized into a single format, much like we had done on a more limited scale with HarDWR v1. For this version of HarDWR we decided to use WestDAAT, specifically a snapshot created in February 2024, as our water rights source data. A full explanation of the benefits gained from this switch can be found in the description of the updated Harmonized Water Rights Records v2.0, but in short it has allowed us to focus more of our efforts on answering research questions and gaining a more realistic understanding of how water rights are allocated. For more information on how the data for WestDAAT was collected, please see the WaDE data summary.
    Terms of Use
    While WaDE works directly with the state agencies to collect and standardize the water rights records, the ultimate authority for the water rights data remains the individual states. Each state, and their respective water right authorities, have made their water right records available for non-commercial reference uses. In addition, the states make no guarantees as to the completeness, accuracy, or timeliness of their respective databases, let alone the modifications which we, the authors of this paper, have made to the collected records. None of the states should be held liable for uses of this data outside of its intended use.
    As several of the states update their water rights databases daily, the information provided here is not the latest possible and should not be used for legal purposes; WestDAAT itself is updated irregularly. Additional questions about the data the source states provided should be directed to the respective state agencies (see the methods.csv and organization.csv files described below). In addition, although the data presented here was not collected directly from the states, several states requested specifically worded disclaimers when sharing their data. These disclaimers are included here as an acknowledgement of where the water rights data is primarily sourced.
    Colorado: "The data made available here has been modified for use from its original source, which is the State of Colorado. THE STATE OF COLORADO MAKES NO REPRESENTATIONS OR WARRANTY AS TO THE COMPLETENESS, ACCURACY, TIMELINESS, OR CONTENT OF ANY DATA MADE AVAILABLE THROUGH THIS SITE. THE STATE OF COLORADO EXPRESSLY DISCLAIMS ALL WARRANTIES, WHETHER EXPRESS OR IMPLIED, INCLUDING ANY IMPLIED WARRANTIES OF MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. The data is subject to change as modifications and updates are complete. It is understood that the information contained in the Web feed is being used at one's own risk."
    Montana: "The Montana State Library provides this product/service for informational purposes only. The Library did not produce it for, nor is it suitable for legal, engineering, or surveying purposes. Consumers of this information should review or consult the primary data and information sources to ascertain the viability of the information for their purposes. The Library provides these data in good faith but does not represent or warrant its accuracy, adequacy, or completeness. In no event shall the Library be liable for any incorrect results or analysis; any direct, indirect, special, or consequential damages to any party; or any lost profits arising out of or in connection with the use or the inability to use the data or the services provided. The Library makes these data and services available as a convenience to the public, and for no other purpose. The Library reserves the right to change or revise published data and/or services at any time."
    Oregon: "This product is for informational purposes and may not have been prepared for, or be suitable for legal, engineering, or surveying purposes. Users of this information should review or consult the primary data and information sources to ascertain the usability of the information."
    File Descriptions
    The unmodified February 2024 WestDAAT snapshot is composed of nine files. Below is a brief description of each file, as well as how it was utilized for HarDWR; a sketch of how the files join together follows the list.
    WaDEDataDictionaryTerms.xlsx: As the file's name implies, this is a data dictionary for all of the below-named files. It describes the column names for each of the following files, with the exception of citation.txt, which does not have any columns. The descriptions for each file are divided by tab, with the same name as the associated file, within this document.
    allocationamount.csv: The "main" file of the group; it contains the water right records for each state. Of particular note, each water right is broken down into one or more water allocations. Allocations may be withdrawn from one or more locations, and multiple allocations may be associated with a particular location. This is a more subtle and realistic representation of how water is used than what was available in the first version of HarDWR. For the records from some states, this can mean that multiple allocations listed under a single right will appear as rows within this file.
    citation.txt: A combination of contact information for WaDE personnel, a disclaimer about how the data should be used, and guidelines for citing WestDAAT.
    methods.csv: A file describing the source and method by which WaDE collected water rights data from each state.
    organization.csv: A file listing the authoritative water rights agencies for each state.
    sites.csv: This file provides the geographic and other descriptors of the physical locations of allocations, called 'sites'. To reiterate, it is possible for one allocation to be associated with multiple sites, as well as one site to be associated with multiple allocations. The two descriptors in which we were most interested were the site's coordinates and whether the site was classified as a Point of Diversion (POD) or a Place of Use (POU). As a general rule, PODs are geographic points, while POUs are areas typically represented as property boundaries or irregularly shaped polygons.
    sites_pouGeometry.csv: For those allocations with a POU site, this file contains the defining points for the associated polygons.
    variables.csv: A file describing the units in which an allocation's water amount is reported within WestDAAT. This information is essentially a repeat of the 'AllocationFlow_CFS' and 'AllocationVolume_AF' columns within allocationamount.csv, at least for our purposes.
    watersources: This file describes the source of water from which each site extracts. For our purposes, this table was used to determine whether the water came from Surface Water, Groundwater, or Unspecified Water.
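    As a rough sketch of how these files fit together, allocations can be joined to their sites and water sources with pandas. The join-key and grouping column names here are assumptions (the authoritative names are in WaDEDataDictionaryTerms.xlsx), and the watersources file is read as watersources.csv, also an assumption; only 'AllocationVolume_AF' is named in the description above.

    import pandas as pd

    # Load the snapshot files.
    alloc = pd.read_csv("allocationamount.csv")   # one row per allocation (per site, for some states)
    sites = pd.read_csv("sites.csv")              # POD/POU locations
    sources = pd.read_csv("watersources.csv")     # Surface Water / Groundwater / Unspecified

    # Hypothetical join keys; substitute the identifiers from the data dictionary.
    merged = (alloc
              .merge(sites, on="SiteUUID", how="left")
              .merge(sources, on="WaterSourceUUID", how="left"))

    # Total allocated volume (acre-feet) by water source type and POD/POU class;
    # the two grouping column names are assumptions.
    print(merged.groupby(["WaterSourceTypeCV", "PODorPOUSite"])["AllocationVolume_AF"].sum())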

  19. h

    M3C2-EP: Pushing the limits of 3D topographic point cloud change detection...

    • heidata.uni-heidelberg.de
    zip
    Updated Jun 29, 2021
    Cite
    Lukas Winiwarter; Katharina Anders; Vivien Zahs; Martin Hämmerle; Bernhard Höfle (2021). M3C2-EP: Pushing the limits of 3D topographic point cloud change detection by error propagation [Data and Source Code] [Dataset]. http://doi.org/10.11588/DATA/XHYB10
    Explore at:
    zip(20250), zip(6158980), zip(6518881256), zip(5189455640), zip(3841415388), zip(130143397)Available download formats
    Dataset updated
    Jun 29, 2021
    Dataset provided by
    heiDATA
    Authors
    Lukas Winiwarter; Katharina Anders; Vivien Zahs; Martin Hämmerle; Bernhard Höfle
    License

    https://heidata.uni-heidelberg.de/api/datasets/:persistentId/versions/1.1/customlicense?persistentId=doi:10.11588/DATA/XHYB10

    Time period covered
    Jul 19, 2017 - Jul 30, 2018
    Area covered
    Äußeres Hochebenkar, Austria, Tirol, Obergurgl
    Description

    The analysis of topographic time series is often based on bitemporal change detection and quantification. For 3D point clouds, acquired using laser scanning or photogrammetry, random and systematic noise has to be separated from the signal of surface change by determining the minimum detectable change. To analyse geomorphic change in point cloud data, the multiscale model-to-model cloud comparison (M3C2) approach is commonly applied, which provides a statistical significance test. This test assumes planar surfaces and a uniform registration error. For natural surfaces, the planarity assumption does not necessarily apply, in which cases the value of minimum detectable change (Level of Detection) is overestimated. To overcome these limitations, we quantify uncertainty for each 3D point by propagating the uncertainty of the measurements themselves, as well as the alignment uncertainty, to the 3D points. This allows the calculation of 3D covariance information for the point cloud, which we use in an extended statistical test for equality of multivariate means. Our method, called M3C2-EP, gives a less biased estimate of the Level of Detection, allowing a more appropriate significance threshold in typical cases. We verify our method in two simulated scenarios, and apply it to a time series of terrestrial laser scans of a rock glacier at two different timespans of three weeks and one year. Over the three-week period, we detect significant change at 12.5% fewer 3D locations, while quantifying an additional 25.2% of change volume, when compared to the reference method of M3C2. Compared with manual assessment, M3C2-EP achieves a specificity of 0.97, where M3C2 reaches 0.86, for the one-year timespan, while sensitivity drops from 0.72 for M3C2 to 0.60 for M3C2-EP. Lower Levels of Detection enable the analysis of high-frequency monitoring data, where usually less change has occurred between successive scans, and where change is small compared to local roughness. Our method further allows the combination of data from multiple scan positions or data sources with different levels of uncertainty. The combination using error propagation ensures that every dataset is used to its full potential. --- This dataset includes three point clouds acquired by terrestrial laser scanning in 2017 and 2018, as well as alignment information (ICPout) and the code used for processing the datasets. Unzipping all the folders to the same directory should allow you to run the Python script as-is. Point clouds have been pre-processed using the following workflow: 1) MSA coregistration within every epoch using RiScan Pro v2.7 2) MSA coregistration across epochs on stable areas using RiScan Pro v2.7 3) ICP coregistration in opalsICP v2.3.1, resulting in the data in ICPout 4) Point cloud filtering using PDAL v2.2.0 ("filters.pmf") 5) Tiling into 100m tiles (5m overlap) using lastile
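    A minimal sketch of the statistical idea, not the authors' released code: given two corresponding 3D points with propagated 3x3 covariance matrices (assumed here to be already computed), the change between them is significant when the squared Mahalanobis distance of their difference exceeds a chi-square threshold.

    import numpy as np
    from scipy.stats import chi2

    def change_is_significant(p1, cov1, p2, cov2, alpha=0.05):
        """Test for equality of multivariate means of two 3D points,
        given their propagated 3x3 covariance matrices."""
        d = np.asarray(p2, float) - np.asarray(p1, float)  # displacement vector
        cov = np.asarray(cov1) + np.asarray(cov2)          # combined uncertainty
        m2 = d @ np.linalg.solve(cov, d)                   # squared Mahalanobis distance
        return m2 > chi2.ppf(1.0 - alpha, df=3)            # beyond the Level of Detection?

    # Example: 2 cm displacement against 1 cm isotropic noise per epoch.
    p1, p2 = [0.0, 0.0, 0.0], [0.0, 0.0, 0.02]
    c = np.eye(3) * 0.01**2
    print(change_is_significant(p1, c, p2, c))  # False: change below detectability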

  20. u

    Ice cards - Catalogue - Canadian Urban Data Catalogue (CUDC)

    • data.urbandatacentre.ca
    Updated Oct 1, 2024
    Cite
    (2024). Ice cards - Catalogue - Canadian Urban Data Catalogue (CUDC) [Dataset]. https://data.urbandatacentre.ca/dataset/gov-canada-4a05016f-fa46-4b1d-94cf-ff45b4cb9391
    Explore at:
    Dataset updated
    Oct 1, 2024
    Area covered
    Canada
    Description

    Ice maps produced for the prevention of flooding caused by ice jams and for the monitoring of river ice during spring floods, winter thaws, or ice jam events. The maps are derived from radar satellite images from several different sources (and are therefore available regardless of cloud cover), using algorithms to classify pixels into types of ice cover. Data is only processed and displayed on the main rivers at risk. The date the image was taken and the approximate region covered by the data are shown in the layer name. Data is added several times a week, but the frequency of revisits to each river can vary between 2 days and 2 weeks.
    | Name | Period | Satellite | Resolution | Algorithm |
    | R2 | 2018 - 2022 | Radarsat-2 | 7 m | IceMap-R |
    | RCM | 2022 - | RCM | 7 m | IceMap-R (adapted to RCM; applied to 2024 data onward) |
    | S1 | 2018 - | Sentinel-1 | 12.5 m | Proprietary (owner: DGI) |
    The different classes in the legend make it possible to differentiate the following types of ice: * Water (dark blue): open water * Water/Smooth ice (blue): a combination of water on ice, or spaced rafts of frazil * Smooth ice (cyan): also called black ice; the exact term for this type of ice is "columnar ice", due to the vertical and elongated shape of the crystals that compose it. Black ice is generally transparent because it contains few or no air bubbles. It is formed by cooling in fairly calm water, which is why it is sometimes called "thermal ice". Its surface is very smooth. * Consolidated ice (light pink): includes frazil ice or snow ice. Frazil ice forms in turbulent and very cold water and is composed of fine rounded crystals. These grains accumulate and rise to the surface to form moving ice rafts. These rafts eventually pack closely enough to freeze together (agglomerated ice). It contains many air bubbles. Its surface is slightly to moderately rough. * Consolidated ice with accumulations (dark pink): ice cover formed by the stacking and freezing of various forms of moving ice: blocks that are superimposed, or pieces of ice that detach in one place and pile up in another. Its surface is moderately rough to very rough. The images from Radarsat-2 and RCM are obtained through a partnership between Public Safety Canada and the MSP. The IceMap-R algorithm, developed by INRS, makes it possible to identify the type of ice according to the internal roughness of the ice (presence of air bubbles) and the roughness of the surface of the ice cover (presence of blocks and accumulations). The initial version was designed for Radarsat-2. The 2022 and 2023 RCM ice maps are given as an indication only (a new algorithm is in development); only the 2024 data are processed with the IceMap-R algorithm adapted to RCM. Since 2018, the MSP has also used images from Sentinel-1, a radar satellite from the European Space Agency with a resolution of 10 m, resampled to 12.5 m for ice maps. Those images are processed by the firm Dromedaire Géo-Innovation (DGI), which uses a proprietary algorithm. The output of the various algorithms has been reclassified to obtain a comparable legend; a sketch of this reclassification step follows the description. Historical data may present an alternative classification; until 2022, the legend varied between winter and thaw periods.
    LIMITATIONS: the ice map is the result of an automated radar satellite image processing workflow. This process involves interpretation uncertainties that may be caused by the climatic conditions that prevailed when the image was acquired (melt, presence of water on the ice) or by physical characteristics of the watercourse (presence of shoals, islands, or rapids). They also depend on the resolution of the initial images. Thus, although the ice map created is representative of reality, there may be some errors in identifying ice conditions at the local level. The product is best used in combination with field observations. The web service also contains visible satellite images from Landsat (L8, L9) or Sentinel-2 (S2); in this case, colour composites (false colour, to take advantage of the infrared bands in particular) are used to best visualize the presence of ice. This third-party metadata element was translated using an automated translation tool (Amazon Translate).
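    Since each algorithm's raw output is reclassified to the common legend above, a minimal sketch of that reclassification step could look like the following; the raw class codes and their mapping are hypothetical, as each algorithm's actual output codes differ.

    import numpy as np

    # Common legend codes (as listed above).
    LEGEND = {0: "Water", 1: "Water/Smooth ice", 2: "Smooth ice",
              3: "Consolidated ice", 4: "Consolidated ice with accumulations"}

    def reclassify(raster, mapping):
        """Map algorithm-specific class codes to the shared legend codes."""
        out = np.full_like(raster, -1)  # -1 marks unmapped pixels
        for src, dst in mapping.items():
            out[raster == src] = dst
        return out

    # Hypothetical raw codes from one algorithm -> legend codes.
    raw_to_legend = {7: 0, 5: 1, 3: 2, 2: 3, 1: 4}
    common = reclassify(np.random.randint(0, 8, (512, 512)), raw_to_legend)

    # Pixel counts per legend class.
    for code, n in zip(*np.unique(common, return_counts=True)):
        print(LEGEND.get(int(code), "unmapped"), int(n))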
