This child item describes R code used to determine public supply consumptive use estimates. Consumptive use was estimated by scaling an assumed fraction of deliveries used for outdoor irrigation, derived from the estimated domestic and commercial, industrial, and institutional deliveries in the public supply delivery machine learning model child item, by spatially explicit estimates of evaporative demand. The method scales public supply water service area outdoor water use by the ratio of service-area gross reference evapotranspiration, provided by GridMET, to annual continental U.S. (CONUS) growing-season maximum evapotranspiration. Because this ratio is referenced to climate at the CONUS scale, consumptive use could be over- or under-estimated at public supply service areas where local climate variations differ from national variations. The method also assumes that 50% of total domestic and commercial, industrial, and institutional deliveries is used for outdoor purposes. This dataset is part of a larger data release using machine learning to predict public supply water use for 12-digit hydrologic units from 2000-2020. This page includes the following file: PS_ConsumptiveUse.zip - a zip file containing input datasets, scripts, and output datasets
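The scaling described above reduces to a simple product. The following is a minimal, hypothetical R sketch of that calculation; the column names, values, and object names are illustrative assumptions, not taken from the released scripts in PS_ConsumptiveUse.zip.
#Hypothetical sketch of the consumptive use scaling described above
library(dplyr)
outdoor_fraction <- 0.5  #assumed share of total deliveries used outdoors
wsa <- data.frame(
  wsa_id               = c("WSA-001", "WSA-002"),
  total_deliveries_mgd = c(12.0, 4.5),   #domestic + commercial/industrial/institutional deliveries (assumed)
  wsa_et0_mm           = c(950, 1200),   #service-area growing-season gross reference ET (GridMET, assumed values)
  conus_max_et_mm      = 1400            #CONUS growing-season maximum ET (assumed value)
)
wsa %>%
  mutate(
    outdoor_use_mgd     = outdoor_fraction * total_deliveries_mgd,
    consumptive_use_mgd = outdoor_use_mgd * (wsa_et0_mm / conus_max_et_mm)
  )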
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Publication: Primahadi Wijaya R., Gede. 2014. Visualisation of diachronic constructional change using Motion Chart. In Zane Goebel, J. Herudjati Purwoko, Suharno, M. Suryadi & Yusuf Al Aried (eds.). Proceedings: International Seminar on Language Maintenance and Shift IV (LAMAS IV), 267-270. Semarang: Universitas Diponegoro. doi: https://doi.org/10.4225/03/58f5c23dd8387
Description of R code and data files in the repository: This repository is imported from its GitHub repo. Versioning of this figshare repository is associated with the GitHub repo's Releases, so check the Releases page for updates (the next version is to include the unified version of the codes in the first release with the tidyverse).
The raw input data consist of two files (will_INF.txt and go_INF.txt). They represent the co-occurrence frequency of the top-200 infinitival collocates of will and be going to, respectively, across the twenty decades of the Corpus of Historical American English (from the 1810s to the 2000s).
These two input files are read by 1-script-create-input-data-raw.r, which preprocesses and combines them into a long-format data frame with the following columns: (i) decade, (ii) coll (for "collocate"), (iii) BE going to (frequency of the collocate with be going to) and (iv) will (frequency of the collocate with will); the result is saved as input_data_raw.txt. The script 2-script-create-motion-chart-input-data.R then processes input_data_raw.txt to normalise the co-occurrence frequencies of the collocates per million words (the COHA sizes used as the normalising base frequencies are in coha_size.txt). The output from the second script is input_data_futurate.txt.
Next, input_data_futurate.txt contains the relevant input data for generating (i) the static motion chart used as the image plot in the publication (script 3-script-create-motion-chart-plot.R) and (ii) the dynamic motion chart (script 4-script-motion-chart-dynamic.R).
The repository adopts the project-oriented workflow in RStudio; double-click the Future Constructions.Rproj file to open an RStudio session whose working directory is associated with the contents of this repository.
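As a rough illustration of what scripts 1 and 2 do, the preprocessing amounts to joining the two frequency tables and normalising per million words. This is a hedged sketch only: it assumes tab-separated files and hypothetical column names, and the scripts in the repository remain the authoritative versions.
#Hypothetical sketch of the preprocessing performed by scripts 1 and 2
library(dplyr)
library(readr)
will  <- read_tsv("will_INF.txt")     #assumed columns: decade, coll, freq
going <- read_tsv("go_INF.txt")       #assumed columns: decade, coll, freq
coha  <- read_tsv("coha_size.txt")    #assumed columns: decade, corpus_size
#Combine into one long-format table (one row per decade x collocate)
input_data_raw <- full_join(going, will, by = c("decade", "coll"),
                            suffix = c("_be_going_to", "_will"))
#Normalise the co-occurrence frequencies per million words
input_data_futurate <- input_data_raw %>%
  left_join(coha, by = "decade") %>%
  mutate(across(starts_with("freq"), ~ .x / corpus_size * 1e6))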
This child item describes R code used to determine water source fractions (groundwater (GW), surface water (SW), or spring (SP)) for public-supply water service areas, counties, and 12-digit hydrologic unit codes (HUC12) using information from a proprietary dataset from the U.S. Environmental Protection Agency. Water-use volumes per source were not available from public-supply systems, so water source fractions were calculated from the number of withdrawal sources of each type. For example, for a public-supply system with three SW intakes and one GW well, the fractions would be 0.75 SW and 0.25 GW. This dataset is part of a larger data release using machine learning to predict public supply water use for 12-digit hydrologic units from 2000-2020. Output from this code was used to calculate groundwater and surface water volumes by HUC12 for public supply. This page includes the following files:
FCL_Data_Water_Sources_Flagged_wHUC_DR.R - an R script used to determine water source fractions by public-supply water service areas, counties, and HUC12s
WaterSource_readme.txt - a README text file describing the script
County_SourceFrac.csv - a csv file with estimated water source fractions by county
HUC12_SourceFrac.csv - a csv file with estimated water source fractions by HUC12
WSA_AGIDF_SourceFrac.csv - a csv file with estimated water source fractions by public-supply water service area
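A minimal sketch of that counting logic, assuming hypothetical column names; the released script FCL_Data_Water_Sources_Flagged_wHUC_DR.R is the authoritative implementation.
#Hypothetical withdrawal-source inventory: one row per source per system
library(dplyr)
sources <- data.frame(
  wsa_id      = c("A", "A", "A", "A", "B", "B"),
  source_type = c("SW", "SW", "SW", "GW", "GW", "SP")
)
#Fraction of sources of each type per service area
#(system "A" works out to 0.75 SW and 0.25 GW, as in the example above)
sources %>%
  count(wsa_id, source_type) %>%
  group_by(wsa_id) %>%
  mutate(fraction = n / sum(n)) %>%
  ungroup()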
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
The files include SPSS datasets and R Markdown Notebooks (codes and outputs) for the paper titled "Understanding Regional Cultural-Psychological Variations and Their Sources: A Large-Scale Examination in China". Specifically, "Data_Main.zip" contains compressed SPSS datasets (scale-level scores) for the main analyses as reported in the main text. "Data_All.zip" contains compressed SPSS datasets (item-level scores) for all respondents who passed the quality check. And "Codes_Outputs.zip" contains compressed R Markdown documents for the main analyses.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Complete dataset of “Film Circulation on the International Film Festival Network and the Impact on Global Film Culture”
A peer-reviewed data paper for this dataset is under review for publication in NECSUS_European Journal of Media Studies, an open-access journal aiming to enhance data transparency and reusability, and will be available from https://necsus-ejms.org/ and https://mediarep.org
Please cite this when using the dataset.
Detailed description of the dataset:
1 Film Dataset: Festival Programs
The Film Dataset consists of a data scheme image file, a codebook and two dataset tables in csv format.
The codebook (csv file “1_codebook_film-dataset_festival-program”) offers a detailed description of all variables within the Film Dataset. Along with the definition of variables it lists explanations for the units of measurement, data sources, coding and information on missing data.
The csv file “1_film-dataset_festival-program_long” comprises a dataset of all films and the festivals, festival sections, and the year of the festival edition that they were sampled from. The dataset is structured in the long format, i.e. the same film can appear in several rows when it appeared in more than one sample festival. However, films are identifiable via their unique ID.
The csv file “1_film-dataset_festival-program_wide” consists of the dataset listing only unique films (n=9,348). The dataset is in the wide format, i.e. each row corresponds to a unique film, identifiable via its unique ID. For easy analysis, and since the overlap is only six percent, in this dataset the variable sample festival (fest) corresponds to the first sample festival where the film appeared. For instance, if a film was first shown at Berlinale (in February) and then at Frameline (in June of the same year), the sample festival will list “Berlinale”. This file includes information on unique and IMDb IDs, the film title, production year, length, categorization in length, production countries, regional attribution, director names, genre attribution, the festival, festival section and festival edition the film was sampled from, and information whether there is festival run information available through the IMDb data.
2 Survey Dataset
The Survey Dataset consists of a data scheme image file, a codebook and two dataset tables in csv format.
The codebook “2_codebook_survey-dataset” includes coding information for both survey datasets. It lists the definition of the variables or survey questions (corresponding to Samoilova/Loist 2019), units of measurement, data source, variable type, range and coding, and information on missing data.
The csv file “2_survey-dataset_long-festivals_shared-consent” consists of a subset (n=161) of the original survey dataset (n=454), where respondents provided festival run data for films (n=206) and gave consent to share their data for research purposes. This dataset consists of the festival data in a long format, so that each row corresponds to the festival appearance of a film.
The csv file “2_survey-dataset_wide-no-festivals_shared-consent” consists of a subset (n=372) of the original dataset (n=454) of survey responses corresponding to sample films. It includes data only for those films for which respondents provided consent to share their data for research purposes. This dataset is shown in wide format of the survey data, i.e. information for each response corresponding to a film is listed in one row. This includes data on film IDs, film title, survey questions regarding completeness and availability of provided information, information on number of festival screenings, screening fees, budgets, marketing costs, market screenings, and distribution. As the file name suggests, no data on festival screenings is included in the wide format dataset.
3 IMDb & Scripts
The IMDb dataset consists of a data scheme image file, one codebook and eight datasets, all in csv format. It also includes the R scripts that we used for scraping and matching.
The codebook “3_codebook_imdb-dataset” includes information for all IMDb datasets. This includes ID information and their data source, coding and value ranges, and information on missing data.
The csv file “3_imdb-dataset_aka-titles_long” contains film title data in different languages scraped from IMDb in a long format, i.e. each row corresponds to a title in a given language.
The csv file “3_imdb-dataset_awards_long” contains film award data in a long format, i.e. each row corresponds to an award of a given film.
The csv file “3_imdb-dataset_companies_long” contains data on production and distribution companies of films. The dataset is in a long format, so that each row corresponds to a particular company of a particular film.
The csv file “3_imdb-dataset_crew_long” contains data on names and roles of crew members in a long format, i.e. each row corresponds to one crew member of a given film. The file also contains binary gender assigned to directors based on their first names using the GenderizeR application.
The csv file “3_imdb-dataset_festival-runs_long” contains festival run data scraped from IMDb in a long format, i.e. each row corresponds to the festival appearance of a given film. The dataset does not include each film screening, but the first screening of a film at a festival within a given year. The data includes festival runs up to 2019.
The csv file “3_imdb-dataset_general-info_wide” contains general information about films such as genre as defined by IMDb, languages in which a film was shown, ratings, and budget. The dataset is in wide format, so that each row corresponds to a unique film.
The csv file “3_imdb-dataset_release-info_long” contains data about non-festival releases (e.g., theatrical, digital, tv, dvd/blu-ray). The dataset is in a long format, so that each row corresponds to a particular release of a particular film.
The csv file “3_imdb-dataset_websites_long” contains data on available websites (official websites, miscellaneous, photos, video clips). The dataset is in a long format, so that each row corresponds to a website of a particular film.
The dataset includes 8 text files containing the scripts for web scraping. They were written in R version 3.6.3 for Windows.
The R script “r_1_unite_data” demonstrates the structure of the dataset that we use in the following steps to identify, scrape, and match the film data.
The R script “r_2_scrape_matches” reads in the dataset with the film characteristics described in “r_1_unite_data” and uses various R packages to create a search URL for each film from the core dataset on the IMDb website. The script attempts to match each film from the core dataset to IMDb records by first conducting an advanced search based on the movie title and year, and then, if no matches are found, using an alternative title and a basic search. The script scrapes the title, release year, directors, running time, genre, and IMDb film URL from the first page of suggested records on the IMDb website. The script then defines a loop that matches (including matching scores) each film in the core dataset with the suggested films on the IMDb search page. Matching was done using data on directors, production year (+/- one year), and title, with a fuzzy-matching approach based on two methods, “cosine” and “osa”: cosine similarity is used to match titles with a high degree of overall similarity, while the OSA (optimal string alignment) algorithm handles titles that may have typos or minor variations.
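For illustration only, the two string-distance methods named above can be compared with the stringdist package; the titles below are invented examples, not drawn from the core dataset.
library(stringdist)
core_title <- "The Hours"                                    #invented example title
candidates <- c("The Hours", "The Hour", "Hours, The", "The House")
#Cosine similarity on character q-grams: high scores for titles that share
#most of their characters, even if word order differs
stringsim(core_title, candidates, method = "cosine", q = 2)
#Optimal string alignment (OSA): tolerant of typos and single transpositions
stringsim(core_title, candidates, method = "osa")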
The script “r_3_matching” creates a dataset with the matches for a manual check. Each pair of films (the original film from the core dataset and the suggested match from the IMDb website) was categorized into one of five categories: a) 100% match: perfect match on title, year, and director; b) likely good match; c) maybe match; d) unlikely match; and e) no match. The script also checks for possible duplicates in the dataset and flags them for a manual check.
The script “r_4_scraping_functions” creates a function for scraping the data from the identified matches (based on the scripts described above and manually checked). These functions are used for scraping the data in the next script.
The script “r_5a_extracting_info_sample” uses the functions defined in “r_4_scraping_functions” to scrape the IMDb data for the identified matches. It does this for the first 100 films only, as a check that everything works. Scraping the entire dataset took a few hours, so a test run on a subsample of 100 films is advisable.
The script “r_5b_extracting_info_all” extracts the data for the entire dataset of the identified matches.
The script “r_5c_extracting_info_skipped” checks the films with missing data (where data was not scraped) and tries to extract the data one more time, to make sure that the errors were not caused by disruptions in the internet connection or other technical issues.
The script “r_check_logs” is used for troubleshooting and for tracking the progress of all of the R scripts used. It reports the number of missing values and errors.
4 Festival Library Dataset
The Festival Library Dataset consists of a data scheme image file, one codebook and one dataset, all in csv format.
The codebook (csv file “4_codebook_festival-library_dataset”) offers a detailed description of all variables within the Library Dataset. It lists the definition of variables, such as location and festival name, and festival categories,
https://www.nist.gov/open/license
Supporting datasets and algorithms (R-based) for the manuscript entitled "Development of a Tool to Determine the Variability of Consensus Mass Spectra", including an R Markdown script to reproduce the manuscript's figures.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset brings to you Iris Dataset in several data formats (see more details in the next sections).
You can use it to test the ingestion of data in all these formats using Python or R libraries. We also prepared a Python Jupyter Notebook and an R Markdown report that read all these formats (a short illustrative R sketch also follows the format list below).
Iris Dataset was created by R. A. Fisher and donated by Michael Marshall.
Repository on UCI site: https://archive.ics.uci.edu/ml/datasets/iris
Data Source: https://archive.ics.uci.edu/ml/machine-learning-databases/iris/
The file downloaded is iris.data and is formatted as a comma delimited file.
This small data collection was created to help you test your skills with ingesting various data formats.
This file was processed to convert the data in the following formats:
* csv - comma separated values format
* tsv - tab separated values format
* parquet - parquet format
* feather - feather format
* parquet.gzip - compressed parquet format
* h5 - hdf5 format
* pickle - Python binary object file - pickle format
* xlsx - Excel format
* npy - Numpy (Python library) binary format
* npz - Numpy (Python library) binary compressed format
* rds - Rds (R specific data format) binary format
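A minimal R sketch (not the bundled R Markdown report) of reading a few of these formats; the file names below are assumptions about how the files are named in the collection.
library(readr)   #csv / tsv
library(arrow)   #parquet / feather
iris_csv     <- read_csv("Iris.csv")          #file names are assumed, not confirmed
iris_tsv     <- read_tsv("Iris.tsv")
iris_parquet <- read_parquet("Iris.parquet")
iris_feather <- read_feather("Iris.feather")
iris_rds     <- readRDS("Iris.rds")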
I would like to acknowledge the work of the creator of the dataset - R. A. Fisher and of the donor - Michael Marshall.
Use these data formats to test your skills in ingesting data in various formats.
AutoTrain Dataset for project: test
Dataset Description
This dataset has been automatically processed by AutoTrain for project test.
Languages
The BCP-47 code for the dataset's language is en.
Dataset Structure
Data Instances
A sample from this dataset looks as follows: [ { "feat_id": "13829542", "text": "Kasia: When are u coming back?\r Matt: Back where?\r Kasia: Oh come on\r Kasia: you know what i… See the full description on the dataset page: https://huggingface.co/datasets/j23349/autotrain-data-test.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Intellectual Property Government Open Data (IPGOD) includes over 100 years of registry data on all intellectual property (IP) rights administered by IP Australia. It also has derived information about the applicants who filed these IP rights, to allow for research and analysis at the regional, business and individual level. This is the 2019 release of IPGOD.
IPGOD is large, with millions of data points across up to 40 tables, making many of the tables too large to open with Microsoft Excel. Furthermore, analysis often requires information from separate tables, which would need specialised software for merging. We recommend that advanced users interact with the IPGOD data using the right tools, with enough memory and compute power. This includes a wide range of programming and statistical software such as Tableau, Power BI, Stata, SAS, R, Python, and Scala.
IP Australia is also providing free trials of a cloud-based analytics platform, the IP Data Platform, with the capability to work with large intellectual property datasets such as IPGOD through the web browser, without any installation of software.
The following pages can help you gain an understanding of intellectual property administration and processes in Australia to support your analysis of the dataset:
* Patents
* Trade Marks
* Designs
* Plant Breeder’s Rights
Due to the changes in our systems, some tables have been affected.
* We have added IPGOD 225 and IPGOD 325 to the dataset!
* The IPGOD 206 table is not available this year.
* Many tables have been re-built, and as a result may have different columns or different possible values. Please check the data dictionary for each table before use.
Data quality has been improved across all tables.
* Null values are simply empty rather than '31/12/9999'.
* All date columns are now in ISO format 'yyyy-mm-dd'.
* All indicator columns have been converted to Boolean data type (True/False) rather than Yes/No, Y/N, or 1/0.
* All tables are encoded in UTF-8.
* All tables use the backslash \ as the escape character.
* The applicant name cleaning and matching algorithms have been updated. We believe that this year's method improves the accuracy of the matches. Please note that the "ipa_id" generated in IPGOD 2019 will not match with those in previous releases of IPGOD.
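Given the encoding and escaping conventions listed above, a hypothetical example of reading one IPGOD table in R might look like the following; the file name is illustrative only, not a confirmed IPGOD table name.
library(readr)
#Read an IPGOD table as UTF-8 with backslash as the escape character
ipgod_table <- read_delim(
  "ipgod_table.csv",                      #placeholder file name
  delim            = ",",
  escape_backslash = TRUE,
  escape_double    = FALSE,
  locale           = locale(encoding = "UTF-8")
)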
https://data.gov.cz/zdroj/datové-sady/00216208/88a8c777bba57747498f6671020dedd2/distribuce/b7e73a5c2f2635b405e5be40dd17d362/podmínky-užití
https://data.gov.cz/zdroj/datové-sady/00216208/88a8c777bba57747498f6671020dedd2/distribuce/974dc2a3d80ee61893c30d710738b059/podmínky-užití
Inspection actions of the Czech Trade Inspection Authority, their focus, and the fines issued. Transformed to RDF from the datasets https://data.gov.cz/zdroj/datové-sady/00020869/98803, https://data.gov.cz/zdroj/datové-sady/00020869/99266 and https://data.gov.cz/zdroj/datové-sady/00020869/98719
This child item describes R code used to determine whether public-supply water systems buy water, sell water, both buy and sell water, or are neutral (meaning the system has only local water supplies) using water source information from a proprietary dataset from the U.S. Environmental Protection Agency. This information was needed to better understand public-supply water use and where water buying and selling were likely to occur. Buying or selling of water may result in per capita rates that are not representative of the population within the water service area. This dataset is part of a larger data release using machine learning to predict public supply water use for 12-digit hydrologic units from 2000-2020. Output from this code was used as an input feature variable in the public supply water use machine learning model. This page includes the following files:
ID_WSA_04062022_Buyers_Sellers_DR.R - an R script used to determine whether a public-supply water service area buys water, sells water, or is neutral
BuySell_readme.txt - a README text file describing the script
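A simplified, hypothetical sketch of that classification (column names invented; ID_WSA_04062022_Buyers_Sellers_DR.R is the authoritative script):
#Hypothetical source flags per system: does the system purchase water from,
#or sell water to, other systems?
library(dplyr)
systems <- data.frame(
  wsa_id    = c("A", "B", "C", "D"),
  purchases = c(TRUE, FALSE, TRUE, FALSE),
  sells     = c(FALSE, TRUE, TRUE, FALSE)
)
systems %>%
  mutate(buy_sell = case_when(
    purchases & sells ~ "buys and sells",
    purchases         ~ "buyer",
    sells             ~ "seller",
    TRUE              ~ "neutral"    #only local water supplies
  ))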
This data package includes the underlying data and files to replicate the calculations, charts, and tables presented in Reality Check for the Global Economy, PIIE Briefing 16-3.
If you use the data, please cite as: Acalin, Julien, Olivier Blanchard, Monica de Bolle, José De Gregorio, Caroline Freund, Joseph E. Gagnon, Nicholas R. Lardy, Adam S. Posen, David J. Stockton, and Nicolas Véron. (2016). Reality Check for the Global Economy. PIIE Briefing 16-3. Peterson Institute for International Economics.
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
The XBT/CTD pairs dataset (Version 1) is the dataset used to calculate the historical XBT fall rate and temperature corrections presented in Cowley, R., Wijffels, S., Cheng, L., Boyer, T., and Kizu, S. (2013). Biases in Expendable Bathythermograph Data: A New View Based on Historical Side-by-Side Comparisons. Journal of Atmospheric and Oceanic Technology, 30, 1195–1225, doi:10.1175/JTECH-D-12-00127.1. http://journals.ametsoc.org/doi/abs/10.1175/JTECH-D-12-00127.1
4,115 pairs from 114 datasets were used to derive the fall rate and temperature corrections. Each dataset contains the scientifically quality-controlled version and (where available) the originator's data. The XBT/CTD pairs are identified in the document 'XBT_CTDpairs_metadata_V1.csv'. Note that future versions of the XBT/CTD pairs database may supersede this version; please check more recent versions for updates to individual datasets. Lineage: Data is sourced from the World Ocean Database, NOAA, CSIRO Marine and Atmospheric Research, Bundesamt für Seeschifffahrt und Hydrographie (BSH), Hamburg, Germany, and the Australian Antarctic Division. Original and raw data files are included where available. Quality-controlled datasets follow the procedure of Bailey, R., Gronell, A., Phillips, H., Tanner, E., and Meyers, G. (1994). Quality control cookbook for XBT data, Version 1.1. CSIRO Marine Laboratories Reports, 221. Quality-controlled data is in the 'MQNC' format used at CSIRO Marine and Atmospheric Research; the MQNC format is described in the document 'XBT_CTDpairs_descriptionV1.pdf'.
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
From 11 March 2025, the dataset will be updated to include one new field, Date of Deregistration (see the help file for details).
From 7 August 2018, the Company dataset is updated weekly, every Tuesday. As a result, the information might not be accurate at the time you check the Company dataset. ASIC Connect updates information in real time, so please consider accessing information on that platform if you need up-to-date information.
***
ASIC is Australia’s corporate, markets and financial services regulator. ASIC contributes to Australia’s economic reputation and wellbeing by ensuring that Australia’s financial markets are fair and transparent, supported by confident and informed investors and consumers.
Australian companies are required to keep their details up to date on ASIC's Company Register. Information contained in the register is made available to the public to search via ASIC's website.
Select data from ASIC's Company Register will be uploaded each week to www.data.gov.au. The data made available will be a snapshot of the register at a point in time. Legislation prescribes the type of information ASIC is allowed to disclose to the public.
The information included in the downloadable dataset is:
* Company Name
* Australian Company Number (ACN)
* Type
* Class
* Sub Class
* Status
* Date of Registration
* Date of Deregistration (Available from 11 March 2025)
* Previous State of Registration (where applicable)
* State Registration Number (where applicable)
* Modified since last report – flag to indicate if data has been modified since last report
* Current Name Indicator
* Australian Business Number (ABN)
* Current Name
* Current Name Start Date
Additional information about companies can be found via ASIC's website. Accessing some information may attract a fee.
More information about searching ASIC's registers.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract
The dataset was derived by the Bioregional Assessment Programme from multiple datasets. The source dataset is identified in the Lineage field in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement. The dataset consists of an excel spreadsheet and shapefile representing the locations of simulation nodes used in the AWRA-R model. Some of the nodes correspond to gauging station locations or dam locations, whereas other locations represent river confluences or catchment outlets which have no gauging. These are marked as "Dummy".
Purpose
Locations are used as pour points in order to define reach areas for river system modelling.
Dataset History
Subset of data for the Hunter that was extracted from the Bureau of Meteorology's Hydstra system; includes all gauges where data has been received from the lead water agency of each jurisdiction. Simulation nodes were added in locations in which the model will provide simulated streamflow. There are 3 files that have been extracted from the Hydstra database to aid in identifying sites in each bioregion and the type of data collected from each one. These data were used to determine the simulation node locations where model outputs were generated. The 3 files contained within the source dataset used for this determination are:
Site - lists all sites available in Hydstra from data providers. The data provider is listed in the #Station as _xxx. For example, sites in NSW are _77, QLD are _66. Some sites do not have locational information and will not be able to be plotted.
Period - the period table lists all the variables that are recorded at each site and the period of record.
Variable - the variable table shows variable codes and names, which can be linked to the period table.
Relevant location information and other data were extracted to construct the spreadsheet and shapefile within this dataset.
Dataset Citation
Bioregional Assessment Programme (XXXX) HUN AWRA-R simulation nodes v01. Bioregional Assessment Derived Dataset. Viewed 13 March 2019, http://data.bioregionalassessments.gov.au/dataset/fda20928-d486-49d2-b362-e860c1918b06.
Dataset Ancestors
Derived From National Surface Water sites Hydstra
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The datasets presented here were partially used in “Formulation and MIP-heuristics for the lot sizing and scheduling problem with temporal cleanings” (Toscano, A., Ferreira, D., Morabito, R., Computers & Chemical Engineering) [1], in “A decomposition heuristic to solve the two-stage lot sizing and scheduling problem with temporal cleaning” (Toscano, A., Ferreira, D., Morabito, R., Flexible Services and Manufacturing Journal) [2], and in “A heuristic approach to optimize the production scheduling of fruit-based beverages” (Toscano et al., Gestão & Produção, 2020) [3]. In fruit-based production processes, there are two production stages: preparation tanks and production lines. This production process has some process-specific characteristics, such as temporal cleanings and synchrony between the two production stages, which make optimized production planning and scheduling even more difficult. In this sense, some papers in the literature have proposed different methods to solve this problem. To the best of our knowledge, there are no standard datasets used by researchers in the literature to verify the accuracy and performance of proposed methods or to serve as a benchmark for other researchers considering this problem. The authors have been using small datasets that do not satisfactorily represent different production scenarios. Since demand in the beverage sector is seasonal, a wide range of scenarios enables us to evaluate the effectiveness of the methods proposed in the scientific literature in solving real scenarios of the problem. The datasets presented here include data based on real data collected from five beverage companies. We present four datasets that are specifically constructed assuming a scenario of restricted capacity and balanced costs. These datasets are supplementary data for the paper submitted to Data in Brief [4]. [1] Toscano, A., Ferreira, D., Morabito, R., Formulation and MIP-heuristics for the lot sizing and scheduling problem with temporal cleanings, Computers & Chemical Engineering. 142 (2020) 107038. Doi: 10.1016/j.compchemeng.2020.107038. [2] Toscano, A., Ferreira, D., Morabito, R., A decomposition heuristic to solve the two-stage lot sizing and scheduling problem with temporal cleaning, Flexible Services and Manufacturing Journal. 31 (2019) 142-173. Doi: 10.1007/s10696-017-9303-9. [3] Toscano, A., Ferreira, D., Morabito, R., Trassi, M. V. C., A heuristic approach to optimize the production scheduling of fruit-based beverages. Gestão & Produção, 27(4), e4869, 2020. https://doi.org/10.1590/0104-530X4869-20. [4] Piñeros, J., Toscano, A., Ferreira, D., Morabito, R., Datasets for lot sizing and scheduling problems in the fruit-based beverage production process. Data in Brief (2021).
Interannual differences in the water quality of Anvil Lake, WI, were examined to determine how water level and climate affect the hydrodynamics and trophic state of shallow lakes, and how important these factors are compared to anthropogenic changes in the watershed. To determine how changes in water level may affect these processes, the General Lake Model (GLM), run in R, was used to simulate how the lake’s thermal structure should change in response to changes in water level. This dataset includes the data inputs to the GLM model and the direct outputs from the model: Model Calibration (GLM_CalibrationZ); Simulation with Deep Lake and Cold Weather (GLM_Deep_Cold_SimulationZ); Simulation with Deep Lake and Hot Weather (GLM_Deep_Hot_SimulationZ); Simulation with Shallow Lake and Hot Weather (GLM_Shallow_Hot_SimulationZ); Simulation with Shallow Lake and Cold Weather (GLM_Shallow_Cold_SimulationZ).
Phase 1: ASK
1. Business Task * Cyclist is looking to increase their earnings, and wants to know if creating a social media campaign can influence "Casual" users to become "Annual" members.
2. Key Stakeholders: * The main stakeholder from Cyclist is Lily Moreno, who is the Director of Marketing and responsible for the development of campaigns and initiatives to promote their bike-share program. The other teams involved with this project will be Marketing & Analytics, and the Executive Team.
3. Business Task: * Comparing the two kinds of users and defining how they use the platform, what variables they have in common, what variables are different, and how they can get Casual users to become Annual members.
Phase 2: PREPARE:
1. Determine Data Credibility * Cyclist provided data from years 2013-2021 (through March 2021), all of which is first-hand data collected by the company.
2. Sort & Filter Data: * The stakeholders want to know how the current users are using their service, so I am focusing on using the data from 2020-2021 since this is the most relevant period of time to answer the business task.
#Installing packages
install.packages("tidyverse", repos = "http://cran.us.r-project.org")
install.packages("readr", repos = "http://cran.us.r-project.org")
install.packages("janitor", repos = "http://cran.us.r-project.org")
install.packages("geosphere", repos = "http://cran.us.r-project.org")
install.packages("gridExtra", repos = "http://cran.us.r-project.org")
library(tidyverse)
library(lubridate) #wday() used below is from lubridate (not attached by older tidyverse versions)
library(readr)
library(janitor)
library(geosphere)
library(gridExtra)
#Importing data & verifying the information within the dataset
all_tripdata_clean <- read.csv("/Data Projects/cyclist/cyclist_data_cleaned.csv")
glimpse(all_tripdata_clean)
summary(all_tripdata_clean)
Phase 3: PROCESS
1. Cleaning Data & Preparing for Analysis: * Once the data has been placed into one dataset and checked for errors, we begin cleaning the data. * Eliminating data that correlates to the company servicing the bikes, and any ride with a traveled distance of zero. * New columns will be added to assist in the analysis and to provide accurate assessments of who is using the bikes.
#Eliminating any data that represents the company performing maintenance (the "HQ QR" station) and trips with a negative ride length
#(Note: ride_length is calculated from started_at/ended_at in the step below; run that calculation first when working from raw data)
all_tripdata_clean <- all_tripdata_clean[!(all_tripdata_clean$start_station_name == "HQ QR" | all_tripdata_clean$ride_length < 0),]
#Creating columns for the individual date components (the date column is created first so the derived columns can use it)
all_tripdata_clean$date <- as.Date(all_tripdata_clean$started_at)
all_tripdata_clean$day <- format(as.Date(all_tripdata_clean$date), "%d")
all_tripdata_clean$month <- format(as.Date(all_tripdata_clean$date), "%m")
all_tripdata_clean$year <- format(as.Date(all_tripdata_clean$date), "%Y")
all_tripdata_clean$day_of_week <- format(as.Date(all_tripdata_clean$date), "%A")
**Now I will begin calculating the length of rides being taken, distance traveled, and the mean amount of time & distance.**
#Calculating the ride length in miles & minutes
all_tripdata_clean$ride_length <- difftime(all_tripdata_clean$ended_at,all_tripdata_clean$started_at,units = "mins")
all_tripdata_clean$ride_distance <- distGeo(matrix(c(all_tripdata_clean$start_lng, all_tripdata_clean$start_lat), ncol = 2), matrix(c(all_tripdata_clean$end_lng, all_tripdata_clean$end_lat), ncol = 2))
all_tripdata_clean$ride_distance = all_tripdata_clean$ride_distance/1609.34 #converting to miles
#Calculating the mean time and distance based on the user groups
userType_means <- all_tripdata_clean %>%
  group_by(member_casual) %>%
  summarise(mean_time = mean(ride_length), mean_distance = mean(ride_distance))
Adding in calculations that will differentiate between bike types and which type of user is using each specific bike type.
#Calculations
with_bike_type <- all_tripdata_clean %>% filter(rideable_type=="classic_bike" | rideable_type=="electric_bike")
#Totals by user type, bike type, and weekday
with_bike_type %>%
  mutate(weekday = wday(started_at, label = TRUE)) %>%
  group_by(member_casual, rideable_type, weekday) %>%
  summarise(totals = n(), .groups = "drop")
#Totals by user type and bike type
with_bike_type %>%
  group_by(member_casual, rideable_type) %>%
  summarise(totals = n(), .groups = "drop")
#Calculating the ride differential between user types by weekday
all_tripdata_clean %>%
  mutate(weekday = wday(started_at, label = TRUE)) %>%
  group_by(member_casual, weekday) %>%
  summarise(number_of_rides = n(),
            average_duration = mean(ride_length), .groups = "drop") %>%
  arrange(member_casual, weekday)
analyze the current population survey (cps) annual social and economic supplement (asec) with r
the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no.
despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population.
the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show.
this new github repository contains three scripts:
2005-2012 asec - download all microdata.R
* download the fixed-width file containing household, family, and person records
* import by separating this file into three tables, then merge 'em together at the person-level
* download the fixed-width file containing the person-level replicate weights
* merge the rectangular person-level file with the replicate weights, then store it in a sql database
* create a new variable - one - in the data table
2012 asec - analysis examples.R
* connect to the sql database created by the 'download all microdata' program
* create the complex sample survey object, using the replicate weights
* perform a boatload of analysis examples
replicate census estimates - 2011.R
* connect to the sql database created by the 'download all microdata' program
* create the complex sample survey object, using the replicate weights
* match the sas output shown in the png file below
2011 asec replicate weight sas output.png: statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document.
click here to view these three scripts
for more detail about the current population survey - annual social and economic supplement (cps-asec), visit:
* the census bureau's current population survey page
* the bureau of labor statistics' current population survey page
* the current population survey's wikipedia article
notes: interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name. as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.
confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
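the import pattern those scripts rely on looks roughly like this - a hedged sketch only, where the urls are placeholders and not the actual nber/census addresses wired into the download script:
library(SAScii)     #parse/read fixed-width files using sas layouts
library(DBI)
library(RSQLite)
#placeholder urls - the real nber sas script and census fixed-width file
#addresses live in the '2005-2012 asec - download all microdata.R' script
sas_layout <- "http://example.com/cpsmar2012.sas"
fwf_file   <- "http://example.com/asec2012_pubuse.zip"
#parse.SAScii() reads the sas INPUT block to recover column positions and widths
layout <- parse.SAScii(sas_layout)
#read.SAScii() uses that layout to pull the fixed-width microdata into a data.frame
asec <- read.SAScii(fwf_file, sas_layout, zipped = TRUE)
#store it in a sqlite database so later analyses don't need everything in memory
db <- dbConnect(SQLite(), "asec.sqlite")
dbWriteTable(db, "asec2012", asec)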
The dataset was derived by the Bioregional Assessment Programme from multiple source datasets. The source datasets are identified in the Lineage field in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.
AWRA-R model implementation.
The metadata within the dataset contains the workflow, processes, input and output data and instructions to implement the Namoi AWRA-R model for model calibration or simulation. In the Namoi, the AWRA-R simulation was done twice: first without the baseflow input from the groundwater modelling, and then a second set of runs was carried out with the input from the groundwater modelling.
Each sub-folder in the associated data has a readme file indicating folder contents and providing general instructions about the workflow performed.
Detailed documentation of the AWRA-R model, is provided in: https://publications.csiro.au/rpr/download?pid=csiro:EP154523&dsid=DS2
Documentation about the implementation of AWRA-R in the Namoi bioregion is provided in BA NAM 2.6.1.3 and 2.6.1.4 products.
'..\AWRAR_Metadata_WGW\Namoi_Model_Sequence.pptx' shows the AWRA-L/R modelling sequence.
BA Surface water modelling for Namoi bioregion
The directories within contain the input data and outputs of the Namoi AWRA-R model for model calibration and simulation. The calibration folders are used as an example; simulation uses mirror files of these data, albeit with longer time series depending on the simulation period.
Detailed documentation of the AWRA-R model, is provided in: https://publications.csiro.au/rpr/download?pid=csiro:EP154523&dsid=DS2
Documentation about the implementation of AWRA-R in the Namoi bioregion is provided in the BA NAM 2.6.1.3 and 2.6.1.4 products.
Additional data needed to generate some of the inputs needed to implement AWRA-R are detailed in the corresponding metadata statement as stated below.
Here is the parent folder:
'..\AWRAR_Metadata_WGW..'
Input data needed:
Gauge/node topological information in '...\model calibration\NAM5.3.1_low_calib\gis\sites\AWRARv5.00_reaches.csv'.
Look up table for soil thickness in '...\model calibration\NAM5.3.1_low_calib\gis\ASRIS_soil_properties\NAM_AWRAR_ASRIS_soil_thickness_v5.00.csv'. (check metadata statement)
Look up tables of AWRA-LG groundwater parameters in '...\model calibration\NAM5.3.1_low_calib\gis\AWRA-LG_gw_parameters\'.
Look up table of AWRA-LG catchment grid cell contribution in '...model calibration\NAM5.3.1_low_calib\gis\catchment-boundary\AWRA-R_catchment_x_AWRA-L_weight.csv'. (check metadata statement)
Look up tables of link lengths for main river, tributaries and distributaries within a reach in \model calibration\NAM5.3.1_low_calib\gis\rivers\'. (check metadata statement)
Time series data of AWRA-LG outputs: evaporation, rainfall, runoff and depth to groundwater.
Gridded data of AWRA-LG groundwater parameters, refer to explanation in '...'\model calibration\NAM5.3.1_low_calib\rawdata\AWRA_LG_output\gw_parameters\README.txt'.
Time series of observed or simulated reservoir level, volume and surface area for reservoirs used in the simulation: Keepit Dam, Split Rock and Chaffey Creek Dam.
located in '...\model calibration\NAM5.3.1_low_calib\rawdata\reservoirs\'.
Gauge station cross sections in '...\model calibration\NAM5.3.1_low_calib\rawdata\Site_Station_Sections\'. (check metadata statement)
Daily Streamflow and level time-series in'...\model calibration\NAM5.3.1_low_calib\rawdata\streamflow_and_level_all_processed\'.
Irrigation input, configuration and parameter files in '...\model calibration\NAM5.3.1_low_calib\inputs\NAM\irrigation\'.
These come from the separate calibration of the AWRA-R irrigation module in:
'...\irrigation calibration simulation\', refer to explanation in readme.txt file therein.
' ..\AWRAR_Metadata_WGW\dam model calibration simulation\Chaffey\readme.txt'
'... \AWRAR_Metadata_WGW\dam model calibration simulation\Split_Rock_and_Keepit\readme.txt'
Relevant outputs include:
AWRA-R time series of stores and fluxes in river reaches ('...\AWRAR_Metadata_WGW\model calibration\NAM5.3.1_low_calib\outputs\jointcalibration\v00\NAM\simulations\')
including simulated streamflow in files denoted XXXXXX_full_period_states_nonrouting.csv where XXXXXX denotes gauge or node ID.
AWRA-R time series of stores and fluxes for irrigation/mining in the same directory as above in files XXXXXX_irrigation_states.csv
AWRA-R calibration validation goodness of fit metrics ('...\AWRAR_Metadata_WGW\model calibration\NAM5.3.1_low_calib\outputs\jointcalibration\v00\NAM\postprocessing\')
in files calval_results_XXXXXX_v5.00.csv
Bioregional Assessment Programme (2017) Namoi AWRA-R model implementation (post groundwater input). Bioregional Assessment Derived Dataset. Viewed 12 March 2019, http://data.bioregionalassessments.gov.au/dataset/8681bd56-1806-40a8-892e-4da13cda86b8.
Derived From Historical Mining Footprints DTIRIS NAM 20150914
Derived From GEODATA 9 second DEM and D8: Digital Elevation Model Version 3 and Flow Direction Grid 2008
Derived From Namoi Environmental Impact Statements - Mine footprints
Derived From Namoi Surface Water Mine Footprints - digitised
Derived From River Styles Spatial Layer for New South Wales
Derived From National Surface Water sites Hydstra
Derived From Namoi AWRA-L model
Derived From Namoi Hydstra surface water time series v1 extracted 140814
Derived From Namoi AWRA-R (restricted input data implementation)
Derived From Namoi Existing Mine Development Surface Water Footprints