85 datasets found
  1. Integrated Library System (ILS) Data Dictionary

    • cos-data.seattle.gov
    • data.seattle.gov
    • +1 more
    application/rdfxml +5
    Updated May 10, 2017
    Cite
    City of Seattle (2017). Integrated Library System (ILS) Data Dictionary [Dataset]. https://cos-data.seattle.gov/Community-and-Culture/Integrated-Library-System-ILS-Data-Dictionary/pbt3-ytbc
    Explore at:
    Available download formats: tsv, csv, application/rssxml, application/rdfxml, json, xml
    Dataset updated
    May 10, 2017
    Dataset authored and provided by
    City of Seattle
    License

    U.S. Government Works (https://www.usa.gov/government-works)
    License information was derived automatically

    Description

    Lookup table of Horizon item and borrower codes. The sources of this data are the code definition tables in Horizon, such as bstat, btype, collection, and itype.

    This dataset is useful for understanding the codes used in some of Seattle Public Library's other open datasets. These codes (namely "ItemType" and "ItemCollection") are used systematically to catalog items within the library's Integrated Library System (ILS), Horizon (SirsiDynix).

  2. Datasets for figures and tables

    • datasets.ai
    • catalog.data.gov
    • +1 more
    57
    Updated Aug 6, 2024
    + more versions
    Cite
    U.S. Environmental Protection Agency (2024). Datasets for figures and tables [Dataset]. https://datasets.ai/datasets/datasets-for-figures-and-tables
    Explore at:
    Available download formats: 57
    Dataset updated
    Aug 6, 2024
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Authors
    U.S. Environmental Protection Agency
    Description

    Software

    Model simulations were conducted using WRF version 3.8.1 (available at https://github.com/NCAR/WRFV3) and CMAQ version 5.2.1 (available at https://github.com/USEPA/CMAQ). The meteorological and concentration fields created using these models are too large to archive on ScienceHub, approximately 1 TB, and are archived on EPA’s high performance computing archival system (ASM) at /asm/MOD3APP/pcc/02.NOAH.v.CLM.v.PX/.

    Figures

    Figures 1 – 6 and Figure 8 were created using NCAR Command Language (NCL) scripts (https://www.ncl.ucar.edu/get_started.shtml). NCL code can be downloaded from the NCAR website (https://www.ncl.ucar.edu/Download/) at no cost. The data used for these figures are archived on EPA’s ASM system and are available upon request.

    Figures 7, 8b-c, 8e-f, 8h-i, and 9 were created using the AMET utility developed by U.S. EPA/ORD. AMET can be freely downloaded and used at https://github.com/USEPA/AMET. The modeled data paired in space and time provided in this archive can be used to recreate these figures.

    The data contained in the compressed zip files are organized in comma delimited files with descriptive headers or space delimited files that match tabular data in the manuscript. The data dictionary provides additional information about the files and their contents.

    This dataset is associated with the following publication: Campbell, P., J. Bash, and T. Spero. Updates to the Noah Land Surface Model in WRF‐CMAQ to Improve Simulated Meteorology, Air Quality, and Deposition. Journal of Advances in Modeling Earth Systems. John Wiley & Sons, Inc., Hoboken, NJ, USA, 11(1): 231-256, (2019).

  3. Open Data Portal Catalogue

    • open.canada.ca
    • datasets.ai
    • +1 more
    csv, json, jsonl, png +2
    Updated Jun 14, 2025
    Cite
    Treasury Board of Canada Secretariat (2025). Open Data Portal Catalogue [Dataset]. https://open.canada.ca/data/en/dataset/c4c5c7f1-bfa6-4ff6-b4a0-c164cb2060f7
    Explore at:
    Available download formats: csv, sqlite, json, png, jsonl, xlsx
    Dataset updated
    Jun 14, 2025
    Dataset provided by
    Treasury Board of Canada Secretariat (http://www.tbs-sct.gc.ca/)
    Treasury Board of Canada (https://www.canada.ca/en/treasury-board-secretariat/corporate/about-treasury-board.html)
    License

    Open Government Licence - Canada 2.0 (https://open.canada.ca/en/open-government-licence-canada)
    License information was derived automatically

    Description

    The open data portal catalogue is a downloadable dataset containing some key metadata for the general datasets available on the Government of Canada's Open Data portal. Resource 1 is generated using the ckanapi tool. Resources 2 - 8 are generated using the Flatterer utility.

    Description of resources:

    1. Dataset is a JSON Lines file where the metadata of each Dataset/Open Information Record is one line of JSON. The file is compressed with GZip. The file is heavily nested and is recommended for users familiar with working with nested JSON.
    2. Catalogue is an XLSX workbook where the nested metadata of each Dataset/Open Information Record is flattened into worksheets for each type of metadata.
    3. datasets metadata contains metadata at the dataset level. This is also referred to as the package in some CKAN documentation. This is the main table/worksheet in the SQLite database and XLSX output.
    4. Resources Metadata contains the metadata for the resources contained within each dataset.
    5. resource views metadata contains the metadata for the views applied to each resource, if a resource has a view configured.
    6. datastore fields metadata contains the DataStore information for CSV datasets that have been loaded into the DataStore. This information is displayed in the Data Dictionary for DataStore enabled CSVs.
    7. Data Package Fields contains a description of the fields available in each of the tables within the Catalogue, as well as the count of the number of records each table contains.
    8. data package entity relation diagram displays the title and format for each column in each table in the Data Package, in the form of an ERD diagram. The Data Package resource offers a text based version.
    9. SQLite Database is a .db database, similar in structure to Catalogue. This can be queried with database or analytical software tools for doing analysis.
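    Because the Dataset resource is described as gzip-compressed JSON Lines, it can be streamed one record at a time rather than loaded whole. The following is a minimal sketch using only the Python standard library; the helper name `iter_jsonl_gz`, the two-record sample, and the `id` and `resources` keys are illustrative assumptions, not the confirmed layout of the real file.

```python
import gzip
import io
import json


def iter_jsonl_gz(stream):
    """Yield one parsed record per non-empty line of a gzip-compressed JSON Lines stream."""
    with gzip.open(stream, mode="rt", encoding="utf-8") as lines:
        for line in lines:
            line = line.strip()
            if line:
                yield json.loads(line)


# Hypothetical two-record sample standing in for the real catalogue file.
sample = (
    b'{"id": "a", "resources": [{"format": "CSV"}]}\n'
    b'{"id": "b", "resources": []}\n'
)
compressed = io.BytesIO(gzip.compress(sample))

records = list(iter_jsonl_gz(compressed))
# Count the nested resources attached to each record (key names are assumed).
resource_counts = {record["id"]: len(record["resources"]) for record in records}
```

    For a file on disk, a path can be passed to `iter_jsonl_gz` in place of the in-memory stream, since `gzip.open` accepts either.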

  4. APD Data Dictionary

    • catalog.data.gov
    • datahub.austintexas.gov
    • +1 more
    Updated Apr 25, 2025
    Cite
    data.austintexas.gov (2025). APD Data Dictionary [Dataset]. https://catalog.data.gov/dataset/apd-data-dictionary
    Explore at:
    Dataset updated
    Apr 25, 2025
    Dataset provided by
    data.austintexas.gov
    Description

    A table of the values and definitions of fields used in Austin Police Department datasets. City of Austin Open Data Terms of Use - https://data.austintexas.gov/stories/s/ranj-cccq

  5. [Superseded] Intellectual Property Government Open Data 2019

    • researchdata.edu.au
    • data.gov.au
    Updated Jun 6, 2019
    + more versions
    Cite
    IP Australia (2019). [Superseded] Intellectual Property Government Open Data 2019 [Dataset]. https://researchdata.edu.au/superseded-intellectual-property-data-2019/2994670
    Explore at:
    Dataset updated
    Jun 6, 2019
    Dataset provided by
    Data.gov (https://data.gov/)
    Authors
    IP Australia
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    What is IPGOD?

    The Intellectual Property Government Open Data (IPGOD) includes over 100 years of registry data on all intellectual property (IP) rights administered by IP Australia. It also has derived information about the applicants who filed these IP rights, to allow for research and analysis at the regional, business and individual level. This is the 2019 release of IPGOD.

    How do I use IPGOD?

    IPGOD is large, with millions of data points across up to 40 tables, making them too large to open with Microsoft Excel. Furthermore, analysis often requires information from separate tables which would need specialised software for merging. We recommend that advanced users interact with the IPGOD data using the right tools with enough memory and compute power. This includes a wide range of programming and statistical software such as Tableau, Power BI, Stata, SAS, R, Python, and Scalar.

    IP Data Platform

    IP Australia is also providing free trials to a cloud-based analytics platform with the capabilities to enable working with large intellectual property datasets, such as the IPGOD, through the web browser, without any installation of software.

    References

    The following pages can help you gain the understanding of the intellectual property administration and processes in Australia to help your analysis on the dataset.

    • Patents
    • Trade Marks
    • Designs
    • Plant Breeder’s Rights

    Updates

    Tables and columns

    Due to the changes in our systems, some tables have been affected.

    • We have added IPGOD 225 and IPGOD 325 to the dataset!
    • The IPGOD 206 table is not available this year.
    • Many tables have been re-built, and as a result may have different columns or different possible values. Please check the data dictionary for each table before use.

    Data quality improvements

    Data quality has been improved across all tables.

    • Null values are simply empty rather than '31/12/9999'.
    • All date columns are now in ISO format 'yyyy-mm-dd'.
    • All indicator columns have been converted to Boolean data type (True/False) rather than Yes/No, Y/N, or 1/0.
    • All tables are encoded in UTF-8.
    • All tables use the backslash \ as the escape character.
    • The applicant name cleaning and matching algorithms have been updated. We believe that this year's method improves the accuracy of the matches. Please note that the "ipa_id" generated in IPGOD 2019 will not match with those in previous releases of IPGOD.
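    The conventions listed above (empty nulls, ISO dates, True/False indicators, backslash escapes) map directly onto a CSV reader configuration. The sketch below uses the Python standard library; apart from "ipa_id", the column names, the sample rows, and the assumption that fields are comma-delimited and double-quoted are hypothetical and should be checked against each table's data dictionary.

```python
import csv
import io
from datetime import date


def parse_row(row, date_cols, bool_cols):
    """Convert ISO dates and True/False strings; empty strings become None."""
    out = dict(row)
    for col in date_cols:
        out[col] = date.fromisoformat(row[col]) if row[col] else None
    for col in bool_cols:
        out[col] = {"True": True, "False": False}.get(row[col])
    return out


# Hypothetical sample; a quote inside a quoted field is escaped with a backslash.
sample = (
    "ipa_id,name,filing_date,is_granted\n"
    '1,"Acme \\"Pty\\" Ltd",2001-05-17,True\n'
    "2,Example Co,,False\n"
)

reader = csv.DictReader(io.StringIO(sample), escapechar="\\", doublequote=False)
rows = [parse_row(r, ["filing_date"], ["is_granted"]) for r in reader]
```

    When reading a real table from disk, opening it with `open(path, encoding="utf-8", newline="")` matches the stated UTF-8 encoding.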

  6. Open Data Dictionary Template Individual

    • opendata.dc.gov
    • catalog.data.gov
    Updated Jan 5, 2023
    Cite
    City of Washington, DC (2023). Open Data Dictionary Template Individual [Dataset]. https://opendata.dc.gov/documents/cb6a686b1e344eeb8136d0103c942346
    Explore at:
    Dataset updated
    Jan 5, 2023
    Dataset authored and provided by
    City of Washington, DC
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Description

    This template covers section 2.5 Resource Fields: Entity and Attribute Information of the Data Discovery Form cited in the Open Data DC Handbook (2022). It completes documentation elements that are required for publication. Each field column (attribute) in the dataset needs a description clarifying the contents of the column. Data originators are encouraged to enter the code values (domains) of the column to help end-users translate the contents of the column where needed, especially when lookup tables do not exist.

  7. Film Circulation dataset

    • zenodo.org
    • data.niaid.nih.gov
    bin, csv, png
    Updated Jul 12, 2024
    Cite
    Skadi Loist; Evgenia (Zhenya) Samoilova (2024). Film Circulation dataset [Dataset]. http://doi.org/10.5281/zenodo.7887672
    Explore at:
    Available download formats: csv, png, bin
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Skadi Loist; Evgenia (Zhenya) Samoilova
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Complete dataset of “Film Circulation on the International Film Festival Network and the Impact on Global Film Culture”

    A peer-reviewed data paper for this dataset is in review to be published in NECSUS_European Journal of Media Studies - an open access journal aiming at enhancing data transparency and reusability, and will be available from https://necsus-ejms.org/ and https://mediarep.org

    Please cite this when using the dataset.


    Detailed description of the dataset:

    1 Film Dataset: Festival Programs

    The Film Dataset consists of a data scheme image file, a codebook and two dataset tables in csv format.

    The codebook (csv file “1_codebook_film-dataset_festival-program”) offers a detailed description of all variables within the Film Dataset. Along with the definition of variables it lists explanations for the units of measurement, data sources, coding and information on missing data.

    The csv file “1_film-dataset_festival-program_long” comprises a dataset of all films and the festivals, festival sections, and the year of the festival edition that they were sampled from. The dataset is structured in the long format, i.e. the same film can appear in several rows when it appeared in more than one sample festival. However, films are identifiable via their unique ID.

    The csv file “1_film-dataset_festival-program_wide” consists of the dataset listing only unique films (n=9,348). The dataset is in the wide format, i.e. each row corresponds to a unique film, identifiable via its unique ID. For easy analysis, and since the overlap is only six percent, in this dataset the variable sample festival (fest) corresponds to the first sample festival where the film appeared. For instance, if a film was first shown at Berlinale (in February) and then at Frameline (in June of the same year), the sample festival will list “Berlinale”. This file includes information on unique and IMDb IDs, the film title, production year, length, categorization in length, production countries, regional attribution, director names, genre attribution, the festival, festival section and festival edition the film was sampled from, and information whether there is festival run information available through the IMDb data.
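    The relationship between the long and wide tables can be illustrated with a small sketch: long-format rows are collapsed to one row per film, keeping the earliest sample festival. Only the variable name "fest" comes from the description above; the other column names (`film_id`, `year`, `month`), the sample year, and the helper name are hypothetical, and this is not the authors' own procedure.

```python
def first_festival_per_film(long_rows):
    """Collapse long-format rows to one row per film, keeping the earliest appearance."""
    wide = {}
    for row in long_rows:
        key = row["film_id"]
        current = wide.get(key)
        # Compare (year, month) tuples so earlier festival dates win.
        if current is None or (row["year"], row["month"]) < (current["year"], current["month"]):
            wide[key] = row
    return list(wide.values())


# Hypothetical rows: film 1 appears at two sample festivals in the same year.
long_rows = [
    {"film_id": 1, "fest": "Frameline", "year": 2013, "month": 6},
    {"film_id": 1, "fest": "Berlinale", "year": 2013, "month": 2},
    {"film_id": 2, "fest": "Frameline", "year": 2013, "month": 6},
]

wide_rows = first_festival_per_film(long_rows)
fest_by_film = {row["film_id"]: row["fest"] for row in wide_rows}
```

    As in the Berlinale and Frameline example in the text, the film shown at both festivals is attributed to the earlier one.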


    2 Survey Dataset

    The Survey Dataset consists of a data scheme image file, a codebook and two dataset tables in csv format.

    The codebook “2_codebook_survey-dataset” includes coding information for both survey datasets. It lists the definition of the variables or survey questions (corresponding to Samoilova/Loist 2019), units of measurement, data source, variable type, range and coding, and information on missing data.

    The csv file “2_survey-dataset_long-festivals_shared-consent” consists of a subset (n=161) of the original survey dataset (n=454), where respondents provided festival run data for films (n=206) and gave consent to share their data for research purposes. This dataset consists of the festival data in a long format, so that each row corresponds to the festival appearance of a film.

    The csv file “2_survey-dataset_wide-no-festivals_shared-consent” consists of a subset (n=372) of the original dataset (n=454) of survey responses corresponding to sample films. It includes data only for those films for which respondents provided consent to share their data for research purposes. This dataset is shown in wide format of the survey data, i.e. information for each response corresponding to a film is listed in one row. This includes data on film IDs, film title, survey questions regarding completeness and availability of provided information, information on number of festival screenings, screening fees, budgets, marketing costs, market screenings, and distribution. As the file name suggests, no data on festival screenings is included in the wide format dataset.


    3 IMDb & Scripts

    The IMDb dataset consists of a data scheme image file, one codebook and eight datasets, all in csv format. It also includes the R scripts that we used for scraping and matching.

    The codebook “3_codebook_imdb-dataset” includes information for all IMDb datasets. This includes ID information and their data source, coding and value ranges, and information on missing data.

    The csv file “3_imdb-dataset_aka-titles_long” contains film title data in different languages scraped from IMDb in a long format, i.e. each row corresponds to a title in a given language.

    The csv file “3_imdb-dataset_awards_long” contains film award data in a long format, i.e. each row corresponds to an award of a given film.

    The csv file “3_imdb-dataset_companies_long” contains data on production and distribution companies of films. The dataset is in a long format, so that each row corresponds to a particular company of a particular film.

    The csv file “3_imdb-dataset_crew_long” contains data on names and roles of crew members in a long format, i.e. each row corresponds to each crew member. The file also contains binary gender assigned to directors based on their first names using the GenderizeR application.

    The csv file “3_imdb-dataset_festival-runs_long” contains festival run data scraped from IMDb in a long format, i.e. each row corresponds to the festival appearance of a given film. The dataset does not include each film screening, but the first screening of a film at a festival within a given year. The data includes festival runs up to 2019.

    The csv file “3_imdb-dataset_general-info_wide” contains general information about films such as genre as defined by IMDb, languages in which a film was shown, ratings, and budget. The dataset is in wide format, so that each row corresponds to a unique film.

    The csv file “3_imdb-dataset_release-info_long” contains data about non-festival releases (e.g., theatrical, digital, TV, DVD/Blu-ray). The dataset is in a long format, so that each row corresponds to a particular release of a particular film.

    The csv file “3_imdb-dataset_websites_long” contains data on available websites (official websites, miscellaneous, photos, video clips). The dataset is in a long format, so that each row corresponds to a website of a particular film.

    The dataset includes 8 text files containing the script for webscraping. They were written using the R-3.6.3 version for Windows.

    The R script “r_1_unite_data” demonstrates the structure of the dataset that we use in the following steps to identify, scrape, and match the film data.

    The R script “r_2_scrape_matches” reads in the dataset with the film characteristics described in “r_1_unite_data” and uses various R packages to create a search URL for each film from the core dataset on the IMDb website. The script attempts to match each film from the core dataset to IMDb records by first conducting an advanced search based on the movie title and year, and then potentially using an alternative title and a basic search if no matches are found in the advanced search. The script scrapes the title, release year, directors, running time, genre, and IMDb film URL from the first page of the suggested records from the IMDb website. The script then defines a loop that matches (including matching scores) each film in the core dataset with suggested films on the IMDb search page. Matching used data on directors, production year (+/- one year), and title. Titles were compared with a fuzzy matching approach using two methods, “cosine” and “osa”: cosine similarity is used to match titles with a high degree of similarity, and the OSA (optimal string alignment) algorithm is used to match titles that may have typos or minor variations.
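    To make the two fuzzy-matching measures concrete, the sketch below implements a cosine similarity over character bigram counts and an optimal string alignment (OSA) distance in plain Python. It is an illustration only, not the authors' R code; the function names and the bigram size are assumptions, and the original scripts may tokenize or normalize titles differently.

```python
from collections import Counter
from math import sqrt


def qgram_cosine_similarity(a, b, q=2):
    """Cosine similarity between the character q-gram count vectors of two strings."""
    pa = Counter(a[i:i + q] for i in range(len(a) - q + 1))
    pb = Counter(b[i:i + q] for i in range(len(b) - q + 1))
    dot = sum(pa[g] * pb[g] for g in pa)
    norm = sqrt(sum(v * v for v in pa.values())) * sqrt(sum(v * v for v in pb.values()))
    return dot / norm if norm else 0.0


def osa_distance(a, b):
    """Optimal string alignment distance: edits plus adjacent transpositions."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Swap of two adjacent characters counts as a single edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[len(a)][len(b)]
```

    A swapped pair of letters, as in `osa_distance("amelie", "ameile")`, costs a single edit, which is why OSA suits titles with typos, while the cosine measure rewards titles that share most of their character sequences.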

    The script “r_3_matching” creates a dataset with the matches for a manual check. Each pair of films (the original film from the core dataset and the suggested match from the IMDb website) was categorized into one of the following five categories: a) 100% match: perfect match on title, year, and director; b) likely good match; c) maybe match; d) unlikely match; and e) no match. The script also checks for possible doubles in the dataset and identifies them for a manual check.

    The script “r_4_scraping_functions” creates a function for scraping the data from the identified matches (based on the scripts described above and manually checked). These functions are used for scraping the data in the next script.

    The script “r_5a_extracting_info_sample” uses the functions defined in “r_4_scraping_functions” to scrape the IMDb data for the identified matches. This script does that for the first 100 films only, to check that everything works. Scraping the entire dataset took a few hours, so a test with a subsample of 100 films is advisable.

    The script “r_5b_extracting_info_all” extracts the data for the entire dataset of the identified matches.

    The script “r_5c_extracting_info_skipped” checks the films with missing data (where data was not scraped) and tries to extract the data one more time, to make sure that the errors were not caused by disruptions in the internet connection or other technical issues.

    The script “r_check_logs” is used for troubleshooting and tracking the progress of all of the R scripts used. It gives information on the amount of missing values and errors.


    4 Festival Library Dataset

    The Festival Library Dataset consists of a data scheme image file, one codebook and one dataset, all in csv format.

    The codebook (csv file “4_codebook_festival-library_dataset”) offers a detailed description of all variables within the Library Dataset. It lists the definition of variables, such as location and festival name, and festival categories,

  8. Census 2020 Table P1 12011 Blocks

    • data.pompanobeachfl.gov
    • broward-county-demographics-bcgis.hub.arcgis.com
    • +2 more
    Updated Feb 28, 2023
    + more versions
    Cite
    External Datasets (2023). Census 2020 Table P1 12011 Blocks [Dataset]. https://data.pompanobeachfl.gov/dataset/census-2020-table-p1-12011-blocks
    Explore at:
    Available download formats: zip, kml, geojson, html, csv, arcgis geoservices rest api
    Dataset updated
    Feb 28, 2023
    Dataset provided by
    cjennings_BCGIS
    Authors
    External Datasets
    Description

    2020 Census P.L. 94-171 is the first detailed data release from the 2020 Decennial Census of Population and Housing. The web layer is based on an extract for Table P1 – Race at the block level geography of Broward County, Florida. The data extract was then joined to the 2020 Census TIGER/Line Shapefiles.

    For details on field names, table hierarchy, and table contents, refer to the TABLE (MATRIX) SECTION in Chapter 6, Data Dictionary, of the 2020 Census State Public Law 94-171 Summary File Technical Documentation: https://www2.census.gov/programs-surveys/decennial/2020/technical-documentation/complete-tech-docs/summary-file/2020Census_PL94_171Redistricting_StatesTechDoc_English.pdf

  9. Medical Service Study Area Data Dictionary

    • data.chhs.ca.gov
    Updated Sep 5, 2024
    Cite
    Department of Health Care Access and Information (2024). Medical Service Study Area Data Dictionary [Dataset]. https://data.chhs.ca.gov/dataset/medical-service-study-area-data-dictionary
    Explore at:
    Available download formats: csv, geojson, html, zip, kml, arcgis geoservices rest api
    Dataset updated
    Sep 5, 2024
    Dataset provided by
    CA Department of Health Care Access and Information
    Authors
    Department of Health Care Access and Information
    Description
    Field Name     Data Type   Description
    Statefp        Number      US Census Bureau unique identifier of the state
    Countyfp       Number      US Census Bureau unique identifier of the county
    Countynm       Text        County name
    Tractce        Number      US Census Bureau unique identifier of the census tract
    Geoid          Number      US Census Bureau unique identifier of the state + county + census tract
    Aland          Number      US Census Bureau defined land area of the census tract
    Awater         Number      US Census Bureau defined water area of the census tract
    Asqmi          Number      Area calculated in square miles from the Aland
    MSSAid         Text        ID of the Medical Service Study Area (MSSA) the census tract belongs to
    MSSAnm         Text        Name of the Medical Service Study Area (MSSA) the census tract belongs to
    Definition     Text        Type of MSSA, possible values are urban, rural and frontier.
    TotalPovPop    Number      US Census Bureau total population for whom poverty status is determined of the census tract, taken from the 2020 ACS 5 YR S1701
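    Two of the fields above are derived from others: Geoid combines the state, county, and tract identifiers, and Asqmi is computed from Aland. The sketch below shows one way these derivations could work, assuming the standard Census Bureau layout (2-digit state, 3-digit county, 6-digit tract) and assuming Aland is given in square meters; the function names and sample values are hypothetical, and the publisher's exact calculation is not stated in the source.

```python
# One international mile is 1609.344 m, so one square mile is 1609.344 ** 2 square meters.
SQ_METERS_PER_SQ_MILE = 2_589_988.110336


def build_geoid(statefp, countyfp, tractce):
    """Zero-pad and concatenate state (2), county (3), and tract (6) codes."""
    return f"{int(statefp):02d}{int(countyfp):03d}{int(tractce):06d}"


def land_area_sq_miles(aland_sq_m):
    """Convert a land area in square meters (assumed unit of Aland) to square miles."""
    return aland_sq_m / SQ_METERS_PER_SQ_MILE


# Hypothetical sample values for a single census tract.
geoid = build_geoid(6, 1, 400100)
asqmi = land_area_sq_miles(5_179_976.220672)
```

    Zero-padding matters here because the identifier fields are typed as Number, so leading zeros such as California's state code 06 may be dropped in the stored values.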
  10. Integrated Tax System Data Dictionary

    • opendata.dc.gov
    • catalog.data.gov
    • +1 more
    Updated Feb 26, 2018
    Cite
    City of Washington, DC (2018). Integrated Tax System Data Dictionary [Dataset]. https://opendata.dc.gov/datasets/integrated-tax-system-data-dictionary/api
    Explore at:
    Dataset updated
    Feb 26, 2018
    Dataset authored and provided by
    City of Washington, DC
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Description

    Table defining the fields for the attribute table of the owner points feature. It reflects the new table structure OTR adopted in Tax Year 2005 reflecting the systems update to the public release file.

  11. Census 2020 Table P2 12011 Blocks

    • data.pompanobeachfl.gov
    Updated Feb 28, 2023
    Cite
    External Datasets (2023). Census 2020 Table P2 12011 Blocks [Dataset]. https://data.pompanobeachfl.gov/dataset/census-2020-table-p2-12011-blocks
    Explore at:
    Available download formats: csv, html, geojson, zip, kml, arcgis geoservices rest api
    Dataset updated
    Feb 28, 2023
    Dataset provided by
    cjennings_BCGIS
    Authors
    External Datasets
    Description

    2020 Census P.L. 94-171 is the first detailed data release from the 2020 Decennial Census of Population and Housing. The web layer is based on an extract for Table P2 – Hispanic or Latino, and Not Hispanic or Latino by Race at the block level geography of Broward County, Florida. The data extract was then joined to the 2020 Census TIGER/Line Shapefiles.

    For details on field names, table hierarchy, and table contents, refer to the TABLE (MATRIX) SECTION in Chapter 6, Data Dictionary, of the 2020 Census State Public Law 94-171 Summary File Technical Documentation: https://www2.census.gov/programs-surveys/decennial/2020/technical-documentation/complete-tech-docs/summary-file/2020Census_PL94_171Redistricting_StatesTechDoc_English.pdf

  12. Restaurant Menu Items

    • kaggle.com
    Updated Apr 28, 2022
    Cite
    Pranali Bose (2022). Restaurant Menu Items [Dataset]. https://www.kaggle.com/datasets/pranalibose/restaurant
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 28, 2022
    Dataset provided by
    Kaggle
    Authors
    Pranali Bose
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    About

    The dataset has menu items price for around 100 restaurants.

    Task

    1. Perform EDA and share insights from the data.
    2. Identify stores or items which are priced too high or too low with respect to other menu items or competitors.

    Data Dictionary

    Column        Description
    Restaurant    Name of the Restaurant
    Section       Food Item Section
    Item          Food Item Name
    Description   Description of the Food Item
    Price         Cost of the Food Item
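    The second task, spotting items priced too high or too low relative to their peers, could be approached with a simple per-section comparison against the median price. The sketch below is one crude heuristic under stated assumptions: the column names follow the data dictionary above, prices are assumed to be already numeric, and the factor of 2 and the sample rows are hypothetical choices rather than anything prescribed by the dataset.

```python
from collections import defaultdict
from statistics import median


def flag_price_outliers(items, factor=2.0):
    """Flag items priced above factor x, or below 1/factor x, their section's median."""
    by_section = defaultdict(list)
    for item in items:
        by_section[item["Section"]].append(item["Price"])

    flagged = []
    for item in items:
        mid = median(by_section[item["Section"]])
        if item["Price"] > mid * factor:
            flagged.append((item["Item"], "high"))
        elif item["Price"] < mid / factor:
            flagged.append((item["Item"], "low"))
    return flagged


# Hypothetical menu rows for illustration only.
menu = [
    {"Section": "Sides", "Item": "Fries", "Price": 4.0},
    {"Section": "Sides", "Item": "Salad", "Price": 5.0},
    {"Section": "Sides", "Item": "Soup", "Price": 5.0},
    {"Section": "Sides", "Item": "Onion Rings", "Price": 6.0},
    {"Section": "Sides", "Item": "Truffle Fries", "Price": 20.0},
    {"Section": "Drinks", "Item": "Water", "Price": 0.5},
    {"Section": "Drinks", "Item": "Cola", "Price": 2.0},
    {"Section": "Drinks", "Item": "Juice", "Price": 3.0},
]

outliers = flag_price_outliers(menu)
```

    Grouping by Section keeps drinks from being compared with main dishes; the median is used instead of the mean so that a single extreme price does not shift the reference point.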
  13. Data from: Pesticide Data Program (PDP)

    • agdatacommons.nal.usda.gov
    txt
    Updated Nov 30, 2023
    + more versions
    Cite
    U.S. Department of Agriculture (USDA), Agricultural Marketing Service (AMS) (2023). Pesticide Data Program (PDP) [Dataset]. http://doi.org/10.15482/USDA.ADC/1520764
    Explore at:
    Available download formats: txt
    Dataset updated
    Nov 30, 2023
    Dataset provided by
    Ag Data Commons
    Authors
    U.S. Department of Agriculture (USDA), Agricultural Marketing Service (AMS)
    License

    U.S. Government Works (https://www.usa.gov/government-works)
    License information was derived automatically

    Description

    The Pesticide Data Program (PDP) is a national pesticide residue database program. Through cooperation with State agriculture departments and other Federal agencies, PDP manages the collection, analysis, data entry, and reporting of pesticide residues on agricultural commodities in the U.S. food supply, with an emphasis on those commodities highly consumed by infants and children. This dataset provides information on where each tested sample was collected, where the product originated from, what type of product it was, and what residues were found on the product, for calendar years 1992 through 2020. The data can measure residues of individual compounds and classes of compounds, as well as provide information about the geographic distribution of the origin of samples, from growers, packers and distributors. The dataset also includes information on where the samples were taken, what laboratory was used to test them, and all testing procedures (by sample, so can be linked to the compound that is identified). The dataset also contains a reference variable for each compound that denotes the limit of detection for a pesticide/commodity pair (LOD variable). The metadata also includes EPA tolerance levels or action levels for each pesticide/commodity pair. The dataset will be updated on a continual basis, with a new resource data file added annually after the PDP calendar-year survey data is released.

    Resources in this dataset:

    • Resource Title: CSV Data Dictionary for PDP. File Name: PDP_DataDictionary.csv. Resource Description: Machine-readable Comma Separated Values (CSV) format data dictionary for PDP Database Zip files. Defines variables for the sample identity and analytical results data tables/files. The ## characters in the Table and Text Data File name refer to the 2-digit year for the PDP survey, like 97 for 1997 or 01 for 2001. For details on table linking, see PDF. Resource Software Recommended: Microsoft Excel, url: https://www.microsoft.com/en-us/microsoft-365/excel
    • Resource Title: Data dictionary for Pesticide Data Program. File Name: PDP DataDictionary.pdf. Resource Description: Data dictionary for PDP Database Zip files. Resource Software Recommended: Adobe Acrobat, url: https://www.adobe.com
    • Resource Title: 2019 PDP Database Zip File. File Name: 2019PDPDatabase.zip
    • Resource Title: 2018 PDP Database Zip File. File Name: 2018PDPDatabase.zip
    • Resource Title: 2017 PDP Database Zip File. File Name: 2017PDPDatabase.zip
    • Resource Title: 2016 PDP Database Zip File. File Name: 2016PDPDatabase.zip
    • Resource Title: 2015 PDP Database Zip File. File Name: 2015PDPDatabase.zip
    • Resource Title: 2014 PDP Database Zip File. File Name: 2014PDPDatabase.zip
    • Resource Title: 2013 PDP Database Zip File. File Name: 2013PDPDatabase.zip
    • Resource Title: 2012 PDP Database Zip File. File Name: 2012PDPDatabase.zip
    • Resource Title: 2011 PDP Database Zip File. File Name: 2011PDPDatabase.zip
    • Resource Title: 2010 PDP Database Zip File. File Name: 2010PDPDatabase.zip
    • Resource Title: 2009 PDP Database Zip File. File Name: 2009PDPDatabase.zip
    • Resource Title: 2008 PDP Database Zip File. File Name: 2008PDPDatabase.zip
    • Resource Title: 2007 PDP Database Zip File. File Name: 2007PDPDatabase.zip
    • Resource Title: 2005 PDP Database Zip File. File Name: 2005PDPDatabase.zip
    • Resource Title: 2004 PDP Database Zip File. File Name: 2004PDPDatabase.zip
    • Resource Title: 2003 PDP Database Zip File. File Name: 2003PDPDatabase.zip
    • Resource Title: 2002 PDP Database Zip File. File Name: 2002PDPDatabase.zip
    • Resource Title: 2001 PDP Database Zip File. File Name: 2001PDPDatabase.zip
    • Resource Title: 2000 PDP Database Zip File. File Name: 2000PDPDatabase.zip
    • Resource Title: 1999 PDP Database Zip File. File Name: 1999PDPDatabase.zip
    • Resource Title: 1998 PDP Database Zip File. File Name: 1998PDPDatabase.zip
    • Resource Title: 1997 PDP Database Zip File. File Name: 1997PDPDatabase.zip
    • Resource Title: 1996 PDP Database Zip File. File Name: 1996PDPDatabase.zip
    • Resource Title: 1995 PDP Database Zip File. File Name: 1995PDPDatabase.zip
    • Resource Title: 1994 PDP Database Zip File. File Name: 1994PDPDatabase.zip
    • Resource Title: 1993 PDP Database Zip File. File Name: 1993PDPDatabase.zip
    • Resource Title: 1992 PDP Database Zip File. File Name: 1992PDPDatabase.zip
    • Resource Title: 2006 PDP Database Zip File. File Name: 2006PDPDatabase.zip
    • Resource Title: 2020 PDP Database Zip File. File Name: 2020PDPDatabase.zip. Resource Description: Data and supporting files for PDP 2020 survey. Resource Software Recommended: Microsoft Access, url: https://products.office.com/en-us/access
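
The naming conventions described above (a four-digit year in each zip file name, and a two-digit year standing in for ## in the table and text file names) can be sketched in Python. This is only an illustrative sketch: the function names are hypothetical, and the 1992 to 2020 range is taken from the resource list in this description rather than from any PDP tooling.

```python
FIRST_YEAR = 1992  # earliest survey year listed in the description
LAST_YEAR = 2020   # latest survey year listed in the description


def pdp_archive_name(year: int) -> str:
    """Return the zip file name listed for a survey year, e.g. 2019PDPDatabase.zip."""
    if not FIRST_YEAR <= year <= LAST_YEAR:
        raise ValueError(f"no PDP archive is listed for {year}")
    return f"{year}PDPDatabase.zip"


def pdp_two_digit_year(year: int) -> str:
    """Return the 2-digit code that replaces '##', e.g. 97 for 1997 or 01 for 2001."""
    return f"{year % 100:02d}"


if __name__ == "__main__":
    for y in (1997, 2001, 2019):
        print(y, pdp_archive_name(y), pdp_two_digit_year(y))
```

The full table and text file names inside each archive are not given here, so the sketch stops at the two-digit code; the PDF data dictionary is the place to look for the actual names.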

  14. Predict Women's Basketball Since 1997

    • kaggle.com
    Updated Oct 27, 2022
    Cite
    Aman Chauhan (2022). Predict Women's Basketball Since 1997 [Dataset]. https://www.kaggle.com/datasets/whenamancodes/predict-womens-basketball-since-1997
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Oct 27, 2022
    Dataset provided by
    Kaggle
    Authors
    Aman Chauhan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Women's National Basketball Association (WNBA) is a professional basketball league in the United States. It is composed of twelve teams. The league was founded on April 22, 1996, as the women's counterpart to the National Basketball Association (NBA), and league play started in 1997. The regular season is played from May to September, with the All-Star Game played midway through the season in July (except in Olympic years) and the WNBA Finals running from the end of September until the beginning of October.

    About Files:

    • wnba_elo.csv contains game-by-game Elo ratings and forecasts since 1997.
    • wnba_elo_latest.csv contains game-by-game Elo ratings and forecasts for only the latest season.

    Data Dictionary

    Column | Definition
    season | Year of season
    date | Date of game
    playoff | Whether game was in playoffs
    neutral | Whether game was on a neutral site
    status | post if the game already happened; pre if it hasn't happened yet; live if it is being played at the time of data export
    home_team | Home team name
    away_team | Away team name
    home_team_abbr | Home team abbreviation. Multiple team names can fall under the same team_abbr because of name changes or moves. Interactive is grouped by team_abbr.
    away_team_abbr | Away team abbreviation. Multiple team names can fall under the same team_abbr because of name changes or moves. Interactive is grouped by team_abbr.
    home_team_pregame_rating | Home team's Elo rating before the game
    away_team_pregame_rating | Away team's Elo rating before the game
    home_team_winprob | Home team's probability of winning according to team ratings
    away_team_winprob | Away team's probability of winning according to team ratings
    home_team_score | Home team's score (will be blank for pre and live games)
    away_team_score | Away team's score (will be blank for pre and live games)
    home_team_postgame_rating | Home team's rating after the game (will be blank for pre and live games)
    away_team_postgame_rating | Away team's rating after the game (will be blank for pre and live games)
    commissioners_cup_final | Whether this game was the WNBA Commissioner's Cup Championship Game. All Championship Games shift teams' Elo ratings but are excluded from Monte Carlo simulations, as they do not count towards teams' regular season records.
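
The win-probability columns are described only as "according to team ratings"; the exact model behind them is not stated here. As a rough illustration, the textbook Elo expected-score formula maps a rating gap to a probability as sketched below. The function name and the 400-point scale are assumptions, and the published values may include adjustments (such as home-court advantage) that this sketch ignores.

```python
def elo_win_probability(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Textbook Elo expected score for team A against team B.

    Assumption: the standard logistic form with a 400-point scale. The
    dataset's own winprob columns may be computed differently.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


if __name__ == "__main__":
    home, away = 1600.0, 1400.0  # made-up pregame ratings for illustration
    p_home = elo_win_probability(home, away)
    p_away = elo_win_probability(away, home)
    print(f"home: {p_home:.3f}, away: {p_away:.3f}")
```

Under this formula the two probabilities always sum to 1, which matches how home_team_winprob and away_team_winprob are paired in the table.
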
  15. Park, Beach, Open Space, or Coastline Access

    • data.chhs.ca.gov
    • healthdata.gov
    • +3more
    csv, html, pdf, xlsx +1
    Updated Apr 21, 2025
    Cite
    California Department of Public Health (2025). Park, Beach, Open Space, or Coastline Access [Dataset]. https://data.chhs.ca.gov/dataset/park-beach-open-space-or-coastline-access
    Explore at:
    xlsx, zip, pdf, csv(129337734), html (available download formats)
    Dataset updated
    Apr 21, 2025
    Dataset authored and provided by
    California Department of Public Health (https://www.cdph.ca.gov/)
    Description

    This table contains data on access to parks, measured as the percent of the population within ½ mile of a park, beach, open space, or coastline, for California, its regions, counties, county subdivisions, cities, towns, and census tracts. More information on the data table and a data dictionary can be found in the Data and Resources section. As communities become increasingly urban, parks and the protection of green and open spaces within cities increase in importance. Parks and natural areas buffer pollutants and contribute to the quality of life by providing communities with social and psychological benefits such as leisure, play, sports, and contact with nature. Parks are critical to human health by providing spaces for health and wellness activities. The access to parks table is part of a series of indicators in the Healthy Communities Data and Indicators Project (HCI) of the Office of Health Equity. The goal of HCI is to enhance public health by providing data, a standardized set of statistical measures, and tools that a broad array of sectors can use for planning healthy communities and evaluating the impact of plans, projects, policy, and environmental changes on community health. The creation of healthy social, economic, and physical environments that promote healthy behaviors and healthy outcomes requires coordination and collaboration across multiple sectors, including transportation, housing, education, agriculture, and others. Statistical metrics, or indicators, are needed to help local, regional, and state public health and partner agencies assess community environments and plan for healthy communities that optimize public health. The format of the access to parks table is based on the standardized data format for all HCI indicators. As a result, this data table contains certain variables used in the HCI project (e.g., indicator ID and indicator definition). Some of these variables may contain the same value for all observations.

  16. ‘Sample Super Store’ analyzed by Analyst-2

    • analyst-2.ai
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com), ‘Sample Super Store’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-sample-super-store-2a4f/latest
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Sample Super Store’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/ibrahimelsayed182/sample-super-store on 13 February 2022.

    --- Dataset description provided by original source is as follows ---

    Context

    A superstore in the USA; the data contain about 10,000 rows.

    Data Dictionary

    Attributes | Definition | Example
    Ship Mode | | Second Class
    Segment | Segment Category | Consumer
    Country | | United State
    City | | Los Angeles
    State | | California
    Postal Code | | 90032
    Region | | West
    Category | Categories of product | Technology
    Sub-Category | | Phones
    Sales | number of sales | 114.9
    Quantity | | 3
    Discount | | 0.45
    Profit | | 14.1694

    Acknowledgements

    All thanks to The Sparks Foundation for making this data set.

    Inspiration

    Get the data and try to draw insights from it. Good luck ❤️

    --- Original source retains full ownership of the source dataset ---

  17. ‘Titanic dataset’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Feb 13, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Titanic dataset’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-titanic-dataset-7d6b/0f1d826e/?iid=009-906&v=presentation
    Dataset updated
    Feb 13, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Titanic dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/ibrahimelsayed182/titanic-dataset on 13 February 2022.

    --- Dataset description provided by original source is as follows ---

    Overview

    This is the Titanic dataset.

    Data Dictionary

    Attributes | Definition | Key
    sex | Sex/Gender | male/female
    age | Age |
    sibsp | Siblings of the passenger | 0/1/2 ...
    parch | Parents / children aboard the Titanic | 0/1/2 ...
    fare | Passenger fare |
    embarked | Port of embarkation | C: Cherbourg, Q: Queenstown, S: Southampton
    class | Ticket class | First / Second / Third
    who | Category of passenger | male, female, child
    alone | Whether the passenger was alone on the ship | 0/1
    survived | | 0/1

    --- Original source retains full ownership of the source dataset ---

  18. Secondary Network (SPEN_015) Data Quality Checks - Dataset - Datopian CKAN...

    • demo.dev.datopian.com
    Updated May 27, 2025
    Cite
    (2025). Secondary Network (SPEN_015) Data Quality Checks - Dataset - Datopian CKAN instance [Dataset]. https://demo.dev.datopian.com/dataset/sp-energy-networks--spen_data_quality_secondary_substation
    Dataset updated
    May 27, 2025
    Description

    This data table provides the detailed data quality assessment scores for the Secondary Network dataset. The quality assessment was carried out on 31st March. At SPEN, we are dedicated to sharing high-quality data with our stakeholders and being transparent about its quality. This is why we openly share the results of our data quality assessments. We collaborate closely with Data Owners to address any identified issues and enhance our overall data quality. To demonstrate our progress, we conduct annual assessments of our data quality in line with the dataset refresh rate. To learn more about how we assess data quality, visit Data Quality - SP Energy Networks. We welcome feedback and questions from our stakeholders regarding this process. Our Open Data Team is available to answer any enquiries or receive feedback on the assessments. You can contact them via our Open Data mailbox at opendata@spenergynetworks.co.uk.

    The first phase of our comprehensive data quality assessment measures the quality of our datasets across three dimensions. Please refer to the data table schema for the definitions of these dimensions. We are now expanding our quality assessments to include additional dimensions for a more comprehensive evaluation, and will update the data tables with the results when available.

  19. Long Term Development Statement (LTDS) Table 3a Observed Peak Demand

    • ukpowernetworks.opendatasoft.com
    Updated May 30, 2025
    Cite
    (2025). Long Term Development Statement (LTDS) Table 3a Observed Peak Demand [Dataset]. https://ukpowernetworks.opendatasoft.com/explore/dataset/ltds-table-3a-load-data-observed/
    Dataset updated
    May 30, 2025
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction

    The Long Term Development Statements (LTDS) report on a 0-5 year period, describing a forecast of load on the network and envisioned network developments. The LTDS is published at the end of May and November each year. This is Table 3a from our current LTDS report (published 30 May 2025), showing the observed substation peak demands with no correction for demand served by generation. More information and full reports are available from the landing page below: Long Term Development Statement and Network Development Plan Landing Page

    Methodological Approach

    • Site Functional Locations (FLOCs) are used to associate the Substation to "Key characteristics of active Grid and Primary sites" (UK Power Networks)
    • ID field added to identify row number for reference purposes

    Quality Control Statement

    Quality Control Measures include:

    • Verification steps to match features only with confirmed functional locations.
    • Manual review and correction of data inconsistencies.
    • Use of additional verification steps to ensure accuracy in the methodology.

    Assurance Statement

    The Open Data Team and Network Insights Team worked together to ensure data accuracy and consistency.

    Other

    Download dataset information: Metadata (JSON)

    Definitions of key terms related to this dataset can be found in the Open Data Portal Glossary: https://ukpowernetworks.opendatasoft.com/pages/glossary/

  20. NASICON-type solid electrolyte materials named entity recognition dataset

    • scidb.cn
    Updated Apr 27, 2023
    Cite
    Liu Yue; Liu Dahui; Yang Zhengwei; Shi Siqi (2023). NASICON-type solid electrolyte materials named entity recognition dataset [Dataset]. http://doi.org/10.57760/sciencedb.j00213.00001
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 27, 2023
    Dataset provided by
    Science Data Bank
    Authors
    Liu Yue; Liu Dahui; Yang Zhengwei; Shi Siqi
    Description

    1. Framework overview. This paper proposes a pipeline to construct high-quality datasets for text mining in materials science. First, we use a traceable automatic literature acquisition scheme to ensure the traceability of textual data. Then, a data processing method driven by downstream tasks is performed to generate high-quality pre-annotated corpora conditioned on the characteristics of materials texts. On this basis, we define a general annotation scheme derived from the materials science tetrahedron to complete high-quality annotation. Finally, a conditional data augmentation model incorporating materials domain knowledge (cDA-DK) is constructed to augment the data quantity.

    2. Dataset information. The experimental datasets used in this paper include the Matscholar dataset published by Weston et al. (DOI: 10.1021/acs.jcim.9b00470) and the NASICON entity recognition dataset constructed by ourselves. Herein, we mainly introduce the details of the NASICON entity recognition dataset.

    2.1 Data collection and preprocessing. First, 55 materials science papers related to the NASICON system are collected through Crystallographic Information File (CIF), which contains a wealth of structure-activity relationship information. Note that materials science literature is mostly stored in portable document format (PDF), with content arranged in columns and mixed with tables, images, and formulas, which significantly compromises the readability of the text sequence. To tackle this issue, we employ the text parser PDFMiner (a Python toolkit) to standardize, segment, and parse the original documents, thereby converting PDF literature into plain text. In this process, the entire textual information of the literature, encompassing title, author, abstract, keywords, institution, publisher, and publication year, is retained and stored as a unified TXT document. Subsequently, we apply rules based on Python regular expressions to remove redundant information, such as garbled characters and line breaks caused by figures, tables, and formulas. This results in a cleaner text corpus, enhancing its readability and enabling more efficient data analysis. Note that special symbols may also appear as garbled characters, but we refrain from directly deleting them, as they may contain valuable information such as chemical units. Therefore, we converted all such symbols to a special token
