100+ datasets found
  1. Data tables for customs declarants and declaration volumes for international trade in 2023

    • gov.uk
    Updated May 31, 2024
    + more versions
    Cite
    HM Revenue & Customs (2024). Data tables for customs declarants and declaration volumes for international trade in 2023 [Dataset]. https://www.gov.uk/government/statistical-data-sets/data-tables-for-customs-declarants-and-declaration-volumes-for-international-trade-in-2023
    Explore at:
    Dataset updated
    May 31, 2024
    Dataset provided by
    GOV.UK
    Authors
    HM Revenue & Customs
    Description

    The following data tables contain the number of customs declarants and declarations for international trade in goods in 2023, with breakdowns by direction of movement, partner country, calendar month, declarant representation, location of entry/exit, and declaration type category.

    [Customs declarants and declaration volumes for international trade in 2023 dataset](https://assets.publishing.service.gov.uk/media/665d86887b792ffff71a8629/Customs_declarants_and_declarations_volumes_for_international_trade_in_goods_in_2023_dataset.ods)

    ODS, 31.9 KB

    This file is in an OpenDocument format
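
    A minimal sketch of loading the data tables from the ODS file with pandas (this assumes the odfpy package is installed; pandas uses it for OpenDocument spreadsheets):

    import pandas as pd

    # Read every sheet of the OpenDocument spreadsheet into a dict of DataFrames
    tables = pd.read_excel(
        "Customs_declarants_and_declarations_volumes_for_international_trade_in_goods_in_2023_dataset.ods",
        engine="odf",
        sheet_name=None,
    )
    for name, df in tables.items():
        print(name, df.shape)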

  2. CPP Dataset

    • kaggle.com
    zip
    Updated Apr 21, 2025
    Cite
    Mujtaba Ahmed (2025). CPP Dataset [Dataset]. https://www.kaggle.com/datasets/rajamujtabaahmed/cpp-dataset
    Explore at:
    Available download formats: zip (82876 bytes)
    Dataset updated
    Apr 21, 2025
    Authors
    Mujtaba Ahmed
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This dataset contains 10,000 unique C++ programming prompts along with their corresponding code responses, designed specifically for training and evaluating natural language generation models such as Transformers. **Each row in the CSV contains:**

    id: A unique identifier for each record.

    prompt: A C++ programming instruction or task, phrased in natural language.

    response: The corresponding C++ source code fulfilling the prompt.

    The prompts include a wide range of programming concepts, such as:

    Basic arithmetic operations

    Loops and conditionals

    Class and object creation

    Recursion and algorithm design

    Template functions and data structures

    This dataset is ideal for:

    Fine-tuning code generation models (e.g., GPT-style models)

    Creating educational tools or auto-code assistants

    Exploring zero-shot/few-shot learning in code generation

    The following code can be used to complete all #TODO programs in the dataset:

    import pandas as pd
    from transformers import AutoModelForCausalLM, AutoTokenizer
    import torch
    from tqdm import tqdm

    # Load your dataset
    df = pd.read_csv("/Path/CPP_Dataset_MujtabaAhmed.csv")

    # Load the model and tokenizer (CodeGen 350M - specialized for programming)
    model_name = "Salesforce/codegen-350M-mono"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name).cuda()  # Use .cpu() if no GPU

    # Function to complete C++ code containing a TODO marker
    def complete_code(prompt):
        input_text = prompt.strip() + " "
        inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
        output = model.generate(
            **inputs,
            max_length=512,
            num_return_sequences=1,
            temperature=0.7,
            do_sample=True,
            top_p=0.95,
            pad_token_id=tokenizer.eos_token_id,
        )
        decoded = tokenizer.decode(output[0], skip_special_tokens=True)
        return decoded.replace(prompt.strip(), "").strip()

    # Iterate and fill TODOs
    completed_responses = []
    for i, row in tqdm(df.iterrows(), total=len(df), desc="Processing"):
        prompt, response = row["prompt"], row["response"]
        if "TODO" in response:
            generated = complete_code(prompt + " " + response.split("TODO")[0])
            response_filled = response.replace("TODO", generated)
        else:
            response_filled = response
        completed_responses.append(response_filled)

    # Update DataFrame and save
    df["response"] = completed_responses
    df.to_csv("CPP_Dataset_Completed.csv", index=False)
    print("✅ Completed CSV saved as 'CPP_Dataset_Completed.csv'")

  3. Replication Data for: Revisiting 'The Rise and Decline' in a Population of Peer Production Projects

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 22, 2023
    Cite
    TeBlunthuis, Nathan; Aaron Shaw; Benjamin Mako Hill (2023). Replication Data for: Revisiting 'The Rise and Decline' in a Population of Peer Production Projects [Dataset]. http://doi.org/10.7910/DVN/SG3LP1
    Explore at:
    Dataset updated
    Nov 22, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    TeBlunthuis, Nathan; Aaron Shaw; Benjamin Mako Hill
    Description

    This archive contains code and data for reproducing the analysis for “Replication Data for Revisiting ‘The Rise and Decline’ in a Population of Peer Production Projects”. Depending on what you hope to do with the data you probably do not want to download all of the files. Depending on your computation resources you may not be able to run all stages of the analysis. The code for all stages of the analysis, including typesetting the manuscript and running the analysis, is in code.tar. If you only want to run the final analysis or to play with datasets used in the analysis of the paper, you want intermediate_data.7z or the uncompressed tab and csv files.

    The data files are created in a four-stage process. The first stage uses the program “wikiq” to parse mediawiki xml dumps and create tsv files that have edit data for each wiki. The second stage generates the all.edits.RDS file, which combines these tsvs into a dataset of edits from all the wikis. This file is expensive to generate and at 1.5GB is pretty big. The third stage builds smaller intermediate files that contain the analytical variables from these tsv files. The fourth stage uses the intermediate files to generate smaller RDS files that contain the results. Finally, knitr and latex typeset the manuscript. A stage will only run if the outputs from the previous stages do not exist, so if the intermediate files exist they will not be regenerated and only the final analysis will run. The exception is that stage 4, fitting models and generating plots, always runs. If you only want to replicate from the second stage onward, you want wikiq_tsvs.7z. If you want to replicate everything, you want wikia_mediawiki_xml_dumps.7z.001, wikia_mediawiki_xml_dumps.7z.002, and wikia_mediawiki_xml_dumps.7z.003. These instructions work backwards from building the manuscript using knitr, loading the datasets, running the analysis, to building the intermediate datasets.

    Building the manuscript using knitr: This requires working latex, latexmk, and knitr installations. Depending on your operating system you might install these packages in different ways. On Debian Linux you can run apt install r-cran-knitr latexmk texlive-latex-extra. Alternatively, you can upload the necessary files to a project on Overleaf.com. Download code.tar; this has everything you need to typeset the manuscript. Unpack the tar archive (on a unix system this can be done by running tar xf code.tar) and navigate to code/paper_source. Install R dependencies: in R, run install.packages(c("data.table","scales","ggplot2","lubridate","texreg")). On a unix system you should be able to run make to build the manuscript generalizable_wiki.pdf. Otherwise you should try uploading all of the files (including the tables, figure, and knitr folders) to a new project on Overleaf.com.

    Loading intermediate datasets: The intermediate datasets are found in the intermediate_data.7z archive. They can be extracted on a unix system using the command 7z x intermediate_data.7z. The files are 95MB uncompressed. These are RDS (R data set) files and can be loaded in R using readRDS, for example newcomer.ds <- readRDS("newcomers.RDS"). If you wish to work with these datasets using a tool other than R, you might prefer to work with the .tab files.

    Running the analysis: Fitting the models may not work on machines with less than 32GB of RAM. If you have trouble, you may find the functions in lib-01-sample-datasets.R useful to create stratified samples of data for fitting models. See line 89 of 02_model_newcomer_survival.R for an example. Download code.tar and intermediate_data.7z to your working folder and extract both archives. On a unix system this can be done with the command tar xf code.tar && 7z x intermediate_data.7z. Install R dependencies: install.packages(c("data.table","ggplot2","urltools","texreg","optimx","lme4","bootstrap","scales","effects","lubridate","devtools","roxygen2")). On a unix system you can simply run regen.all.sh to fit the models, build the plots and create the RDS files.

    Generating datasets: Building the intermediate files. The intermediate files are generated from all.edits.RDS. This process requires about 20GB of memory. Download all.edits.RDS, userroles_data.7z, selected.wikis.csv, and code.tar. Unpack code.tar and userroles_data.7z (on a unix system this can be done using tar xf code.tar && 7z x userroles_data.7z). Install R dependencies: in R run install.packages(c("data.table","ggplot2","urltools","texreg","optimx","lme4","bootstrap","scales","effects","lubridate","devtools","roxygen2")). Run 01_build_datasets.R.

    Building all.edits.RDS: The intermediate RDS files used in the analysis are created from all.edits.RDS. To replicate building all.edits.RDS, you only need to run 01_build_datasets.R when the int... Visit https://dataone.org/datasets/sha256%3Acfa4980c107154267d8eb6dc0753ed0fde655a73a062c0c2f5af33f237da3437 for complete metadata about this dataset.
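
    For those working outside R, a minimal sketch of loading one of the uncompressed .tab files with pandas (the file name is illustrative; the .tab files are tab-separated text):

    import pandas as pd

    # Read a tab-separated data file extracted from intermediate_data.7z (file name illustrative)
    newcomers = pd.read_csv("newcomers.tab", sep="\t")
    print(newcomers.shape)
    print(newcomers.head())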

  4. Part C Black Lung Claim Adjudications at the District Director Level During Fiscal Year

    • catalog.data.gov
    • s.cnmilf.com
    • +1more
    Updated Dec 30, 2024
    Cite
    Office of Workers’ Compensation Programs (2024). Part C Black Lung Claim Adjudications at the District Director Level During Fiscal Year [Dataset]. https://catalog.data.gov/dataset/part-c-black-lung-claim-adjudications-at-the-district-director-level-during-fiscal-year-9b2c2
    Explore at:
    Dataset updated
    Dec 30, 2024
    Dataset provided by
    Office of Workers’ Compensation Programs (https://www.dol.gov/agencies/owcp)
    Description

    This dataset represents the information collected during the claims management process. It contains information about the number of claims decisions and approvals at the District Director level.

  5. Annual Budget 2013 Table C FCC - Dataset - data.smartdublin.ie

    • data.smartdublin.ie
    Updated Jun 28, 2021
    + more versions
    Cite
    (2021). Annual Budget 2013 Table C FCC - Dataset - data.smartdublin.ie [Dataset]. https://data.smartdublin.ie/dataset/annual-budget-2013-table-c-fcc2
    Explore at:
    Dataset updated
    Jun 28, 2021
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains data from the Council’s Annual Budget. The budget is comprised of Tables A to F and Appendix 1. Each table is represented by a separate data file. Table C is the Calculation of the Annual Rate on Valuation for the Financial Year for Balbriggan Town Council. It contains:
    • Estimate of ‘Money Demanded’
    • Adopted ‘Money Demanded’
    • Estimated ‘Irrecoverable rates and cost of collection’
    • Adopted ‘Irrecoverable rates and cost of collection’
    • Total Sum to be Raised, which is the sum of ‘Money Demanded’ and ‘Irrecoverable rates and cost of collection’
    • ‘Annual Rate on Valuation to meet Total Sum to be Raised’

    This dataset is used to create Table C in the published Annual Budget document, which can be found at www.fingal.ie. The data is best understood by comparing it to Table C. Data fields for Table C are as follows:
    • Doc: Table Reference
    • Heading: Indicates sections in the Table - Table C is comprised of one section, therefore Heading value for all records = 1
    • Ref: Town Reference
    • Desc: Town Description
    • MD_Est: Money Demanded Estimated
    • MD_Adopt: Money Demanded Adopted
    • IR_Est: Irrecoverable rates and cost of collection Estimated
    • IR_Adopt: Irrecoverable rates and cost of collection Adopted
    • NEV: Annual Rate on Valuation to meet Total Sum to be Raised
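
    As a worked example of the calculation above, a minimal pandas sketch (field names as listed; the file name is illustrative) deriving the Total Sum to be Raised from the adopted figures:

    import pandas as pd

    # Load the Table C data file (file name illustrative)
    table_c = pd.read_csv("annual_budget_2013_table_c_fcc.csv")

    # Total Sum to be Raised = Money Demanded + Irrecoverable rates and cost of collection
    table_c["Total_Sum_Adopted"] = table_c["MD_Adopt"] + table_c["IR_Adopt"]

    print(table_c[["Desc", "MD_Adopt", "IR_Adopt", "Total_Sum_Adopted", "NEV"]])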

  6. R

    Data from: C Project Dataset

    • universe.roboflow.com
    zip
    Updated Sep 29, 2022
    Cite
    seoyh rb (2022). C Project Dataset [Dataset]. https://universe.roboflow.com/seoyh-rb-gdiwi/c-project/dataset/2
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 29, 2022
    Dataset authored and provided by
    seoyh rb
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Tooth Bounding Boxes
    Description

    C Project

    ## Overview
    
    C Project is a dataset for object detection tasks - it contains Tooth annotations for 713 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
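    
    A minimal sketch of pulling the dataset with the Roboflow Python package, assuming the workspace, project, and version IDs from the URL in the citation above (an API key from your Roboflow account is required, and the export format is a choice):
    
    from roboflow import Roboflow
    
    # Authenticate and fetch version 2 of the project (IDs taken from the dataset URL)
    rf = Roboflow(api_key="YOUR_API_KEY")
    project = rf.workspace("seoyh-rb-gdiwi").project("c-project")
    dataset = project.version(2).download("coco")  # e.g. "coco" or "yolov8"
    
    print(dataset.location)  # local folder containing images and annotations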
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  7. fineweb-c

    • huggingface.co
    Updated Jan 14, 2025
    Cite
    Data Is Better Together (2025). fineweb-c [Dataset]. https://huggingface.co/datasets/data-is-better-together/fineweb-c
    Explore at:
    Dataset updated
    Jan 14, 2025
    Dataset authored and provided by
    Data Is Better Together
    Description

    FineWeb-C: Educational content in many languages, labelled by the community

    Multilingual data is better together!

    Note: We are not actively working on this project anymore. You can continue to contribute annotations and we'll occasionally refresh the exported data.

      What is this?
    

    FineWeb-C is a collaborative, community-driven project that expands upon the FineWeb2 dataset. The goal is to create high-quality educational content annotations across hundreds of… See the full description on the dataset page: https://huggingface.co/datasets/data-is-better-together/fineweb-c.
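
    A minimal sketch of loading one language subset with the Hugging Face datasets library (the config name below is illustrative; check the dataset page for the available language subsets):

    from datasets import load_dataset

    # Load a single language configuration of FineWeb-C (config name illustrative)
    fineweb_c = load_dataset("data-is-better-together/fineweb-c", name="dan_Latn", split="train")

    print(fineweb_c)
    print(fineweb_c[0])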

  8. Annotator C Dataset

    • universe.roboflow.com
    zip
    Updated Feb 4, 2025
    Cite
    ANNOTATORSREADY (2025). Annotator C Dataset [Dataset]. https://universe.roboflow.com/annotatorsready/annotator-c-sqp4k
    Explore at:
    Available download formats: zip
    Dataset updated
    Feb 4, 2025
    Dataset authored and provided by
    ANNOTATORSREADY
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Cars Houses Trees ETHm Bounding Boxes
    Description

    Annotator C

    ## Overview
    
    Annotator C is a dataset for object detection tasks - it contains Cars Houses Trees ETHm annotations for 1,080 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  9. Data from: Ecosystem-Level Determinants of Sustained Activity in Open-Source Projects: A Case Study of the PyPI Ecosystem

    • zenodo.org
    application/gzip, bin +2
    Updated Aug 2, 2024
    + more versions
    Cite
    Marat Valiev; Marat Valiev; Bogdan Vasilescu; James Herbsleb; Bogdan Vasilescu; James Herbsleb (2024). Ecosystem-Level Determinants of Sustained Activity in Open-Source Projects: A Case Study of the PyPI Ecosystem [Dataset]. http://doi.org/10.5281/zenodo.1419788
    Explore at:
    Available download formats: bin, application/gzip, zip, text/x-python
    Dataset updated
    Aug 2, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Marat Valiev; Marat Valiev; Bogdan Vasilescu; James Herbsleb; Bogdan Vasilescu; James Herbsleb
    License

    https://www.gnu.org/licenses/old-licenses/gpl-2.0-standalone.html

    Description
    Replication pack, FSE2018 submission #164:
    ------------------------------------------
    
    **Working title:** Ecosystem-Level Factors Affecting the Survival of Open-Source Projects: 
    A Case Study of the PyPI Ecosystem
    
    **Note:** link to data artifacts is already included in the paper. 
    Link to the code will be included in the Camera Ready version as well.
    
    
    Content description
    ===================
    
    - **ghd-0.1.0.zip** - the code archive. This code produces the dataset files 
     described below
    - **settings.py** - settings template for the code archive.
    - **dataset_minimal_Jan_2018.zip** - the minimally sufficient version of the dataset.
     This dataset only includes stats aggregated by the ecosystem (PyPI)
    - **dataset_full_Jan_2018.tgz** - full version of the dataset, including project-level
     statistics. It is ~34Gb unpacked. This dataset still doesn't include PyPI packages
     themselves, which take around 2TB.
    - **build_model.r, helpers.r** - R files to process the survival data 
      (`survival_data.csv` in **dataset_minimal_Jan_2018.zip**, 
      `common.cache/survival_data.pypi_2008_2017-12_6.csv` in 
      **dataset_full_Jan_2018.tgz**)
    - **Interview protocol.pdf** - approximate protocol used for semistructured interviews.
    - LICENSE - text of GPL v3, under which this dataset is published
    - INSTALL.md - replication guide (~2 pages)
    Replication guide
    =================
    
    Step 0 - prerequisites
    ----------------------
    
    - Unix-compatible OS (Linux or OS X)
    - Python interpreter (2.7 was used; Python 3 compatibility is highly likely)
    - R 3.4 or higher (3.4.4 was used, 3.2 is known to be incompatible)
    
    Depending on the level of detail (see Step 2 for more details):
    - up to 2TB of disk space (see Step 2 detail levels)
    - at least 16GB of RAM (64GB preferable)
    - a few hours to a few months of processing time
    
    Step 1 - software
    ----------------
    
    - unpack **ghd-0.1.0.zip**, or clone from gitlab:
    
       git clone https://gitlab.com/user2589/ghd.git
       git checkout 0.1.0
     
     `cd` into the extracted folder. 
     All commands below assume it as a current directory.
      
    - copy `settings.py` into the extracted folder. Edit the file:
      * set `DATASET_PATH` to some newly created folder path
      * add at least one GitHub API token to `SCRAPER_GITHUB_API_TOKENS` 
    - install docker. For Ubuntu Linux, the command is 
      `sudo apt-get install docker-compose`
    - install libarchive and headers: `sudo apt-get install libarchive-dev`
    - (optional) to replicate on NPM, install yajl: `sudo apt-get install yajl-tools`
     Without this dependency, you might get an error on the next step, 
     but it's safe to ignore.
    - install Python libraries: `pip install --user -r requirements.txt` . 
    - disable all APIs except GitHub (Bitbucket and Gitlab support were
     not yet implemented when this study was in progress): edit
     `scraper/__init__.py`, comment out everything except GitHub support
     in `PROVIDERS`.
    
    Step 2 - obtaining the dataset
    -----------------------------
    
    The ultimate goal of this step is to get output of the Python function 
    `common.utils.survival_data()` and save it into a CSV file:
    
      # copy and paste into a Python console
      from common import utils
      survival_data = utils.survival_data('pypi', '2008', smoothing=6)
      survival_data.to_csv('survival_data.csv')
    
    Since full replication will take several months, here are some ways to speed up
    the process:
    
    #### Option 2.a, difficulty level: easiest
    
    Just use the precomputed data. Step 1 is not necessary under this scenario.
    
    - extract **dataset_minimal_Jan_2018.zip**
    - get `survival_data.csv`, go to the next step
    
    #### Option 2.b, difficulty level: easy
    
    Use precomputed longitudinal feature values to build the final table.
    The whole process will take 15-30 minutes.
    
    - create a folder `
  10. LinkedIn Data | C-Level Executives Worldwide | Verified Work Emails & Contact Details from 700M+ Dataset | Best Price Guarantee

    • datarade.ai
    Updated Jan 1, 2018
    Cite
    Success.ai (2018). LinkedIn Data | C-Level Executives Worldwide | Verified Work Emails & Contact Details from 700M+ Dataset | Best Price Guarantee [Dataset]. https://datarade.ai/data-products/linkedin-data-c-level-executives-worldwide-verified-work-success-ai
    Explore at:
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Jan 1, 2018
    Area covered
    Latvia, Saint Pierre and Miquelon, Marshall Islands, Burundi, Bermuda, Palestine, Cambodia, Malta, Netherlands, United States Minor Outlying Islands
    Description

    Success.ai proudly offers our exclusive LinkedIn Data product, targeting C-level executives from around the globe. This premium dataset is meticulously curated to empower your business development, recruitment strategies, and market research efforts with direct access to top-tier professionals.

    Global Reach and Detailed Insights: Our LinkedIn Data encompasses profiles of C-level executives worldwide, offering detailed insights that include professional histories, current and past affiliations, as well as direct contact information such as verified work emails and phone numbers. This data spans across industries such as finance, technology, healthcare, manufacturing, and more, ensuring you have comprehensive coverage no matter your sector focus.

    Accuracy and Compliance: Accuracy is paramount in executive-level data. Each profile within our dataset undergoes rigorous verification processes, using advanced AI algorithms to ensure data accuracy and reliability. Our datasets are also compliant with global data privacy laws such as GDPR, CCPA, and others, providing you with data you can trust and use with confidence.

    Empower Your Business Strategies: Leverage our LinkedIn Data to enhance various business functions:

    • Sales and Marketing: Directly reach decision-makers, reducing sales cycles and increasing conversion rates.
    • Recruitment and Talent Acquisition: Identify and engage with potential candidates for executive roles within your organization.
    • Market Research and Competitive Analysis: Gain insights into competitor leadership and strategic moves by analyzing executive backgrounds and professional networks.

    Robust Data Points Include:

    • Full Names and Titles: Gain access to the full names and current positions of C-level executives.
    • Professional Emails and Phone Numbers: Direct communication channels to ensure your messages reach the intended audience.
    • Company Information: Understand the organizational context with details about the company size, industry, and role within the corporation.
    • Professional History: Detailed career trajectories, highlighting roles, responsibilities, and achievements.
    • Education and Certifications: Educational backgrounds and certifications that enrich the professional profiles of these executives.

    Flexible Delivery and Integration: Our LinkedIn Data is available in multiple formats, including CSV, Excel, and via API, allowing easy integration into your CRM systems or other sales platforms. We provide continuous updates to our datasets, ensuring you always have access to the most current information available.

    Competitive Pricing with Best Price Guarantee: Success.ai offers this valuable data at the most competitive rates in the industry, backed by our best price guarantee. We are committed to providing you with the highest quality data at prices that fit your budget, ensuring excellent return on investment.

    Sample Data and Custom Solutions: To demonstrate the quality and depth of our LinkedIn Data, we offer a sample dataset for initial evaluation. For specific needs, our team is skilled at creating customized datasets tailored to your exact business requirements.

    Client Success Stories: Our clients, from startups to Fortune 500 companies, have successfully leveraged our LinkedIn Data to drive growth and strategic initiatives. We provide case studies and testimonials that showcase the effectiveness of our data in real-world applications.

    Engage with Success.ai Today: Connect with us to explore how our LinkedIn Data can transform your strategic initiatives. Our data experts are ready to assist you in leveraging the full potential of this dataset to meet your business goals.

    Reach out to Success.ai to access the world of C-level executives and propel your business to new heights with strategic data insights that drive success.

  11. Coastal Change Analysis Program (C-CAP) Regional Land Cover Data and Change Data

    • catalog.data.gov
    • datasets.ai
    • +3more
    Updated Apr 15, 2025
    + more versions
    Cite
    NOAA Office for Coastal Management (Point of Contact, Custodian) (2025). Coastal Change Analysis Program (C-CAP) Regional Land Cover Data and Change Data [Dataset]. https://catalog.data.gov/dataset/coastal-change-analysis-program-c-cap-regional-land-cover-data-and-change-data2
    Explore at:
    Dataset updated
    Apr 15, 2025
    Dataset provided by
    National Oceanic and Atmospheric Administration (http://www.noaa.gov/)
    Description

    The NOAA Coastal Change Analysis Program (C-CAP) produces national standardized land cover and change products for the coastal regions of the U.S. C-CAP products inventory coastal intertidal areas, wetlands, and adjacent uplands with the goal of monitoring changes in these habitats, on a one-to-five year repeat cycle. The timeframe for this metadata is reported as 1985 - 2010-Era, but the Landsat imagery used to create the land cover may have been acquired a few years before or after each era. These maps are developed utilizing Landsat Thematic Mapper imagery and can be used to track changes in the landscape through time. This trend information gives important feedback to managers on the success or failure of management policies and programs, and aids in developing a scientific understanding of the Earth system and its response to natural and human-induced changes. This understanding allows for the prediction of impacts due to these changes and the assessment of their cumulative effects, helping coastal resource managers make more informed regional decisions. NOAA C-CAP is a contributing member of the Multi-Resolution Land Characteristics consortium, and C-CAP products are included as the coastal expression of land cover within the National Land Cover Database.

  12. Datn C Dataset

    • universe.roboflow.com
    zip
    Updated Nov 18, 2025
    Cite
    hai (2025). Datn C Dataset [Dataset]. https://universe.roboflow.com/hai-fw7at/datn-c-vcn9o/model/1
    Explore at:
    Available download formats: zip
    Dataset updated
    Nov 18, 2025
    Dataset authored and provided by
    hai
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Cookie Bounding Boxes
    Description

    DATN C

    ## Overview
    
    DATN C is a dataset for object detection tasks - it contains Cookie annotations for 300 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  13. The Canada Trademarks Dataset

    • zenodo.org
    pdf, zip
    Updated Jul 19, 2024
    Cite
    Jeremy Sheff; Jeremy Sheff (2024). The Canada Trademarks Dataset [Dataset]. http://doi.org/10.5281/zenodo.4999655
    Explore at:
    Available download formats: zip, pdf
    Dataset updated
    Jul 19, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jeremy Sheff; Jeremy Sheff
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Canada Trademarks Dataset

    18 Journal of Empirical Legal Studies 908 (2021), prepublication draft available at https://papers.ssrn.com/abstract=3782655, published version available at https://onlinelibrary.wiley.com/share/author/CHG3HC6GTFMMRU8UJFRR?target=10.1111/jels.12303

    Dataset Selection and Arrangement (c) 2021 Jeremy Sheff

    Python and Stata Scripts (c) 2021 Jeremy Sheff

    Contains data licensed by Her Majesty the Queen in right of Canada, as represented by the Minister of Industry, the minister responsible for the administration of the Canadian Intellectual Property Office.

    This individual-application-level dataset includes records of all applications for registered trademarks in Canada since approximately 1980, and of many preserved applications and registrations dating back to the beginning of Canada’s trademark registry in 1865, totaling over 1.6 million application records. It includes comprehensive bibliographic and lifecycle data; trademark characteristics; goods and services claims; identification of applicants, attorneys, and other interested parties (including address data); detailed prosecution history event data; and data on application, registration, and use claims in countries other than Canada. The dataset has been constructed from public records made available by the Canadian Intellectual Property Office. Both the dataset and the code used to build and analyze it are presented for public use on open-access terms.

    Scripts are licensed for reuse subject to the Creative Commons Attribution License 4.0 (CC-BY-4.0), https://creativecommons.org/licenses/by/4.0/. Data files are licensed for reuse subject to the Creative Commons Attribution License 4.0 (CC-BY-4.0), https://creativecommons.org/licenses/by/4.0/, and also subject to additional conditions imposed by the Canadian Intellectual Property Office (CIPO) as described below.

    Terms of Use:

    As per the terms of use of CIPO's government data, all users are required to include the above-quoted attribution to CIPO in any reproductions of this dataset. They are further required to cease using any record within the datasets that has been modified by CIPO and for which CIPO has issued a notice on its website in accordance with its Terms and Conditions, and to use the datasets in compliance with applicable laws. These requirements are in addition to the terms of the CC-BY-4.0 license, which require attribution to the author (among other terms). For further information on CIPO’s terms and conditions, see https://www.ic.gc.ca/eic/site/cipointernet-internetopic.nsf/eng/wr01935.html. For further information on the CC-BY-4.0 license, see https://creativecommons.org/licenses/by/4.0/.

    The following attribution statement, if included by users of this dataset, is satisfactory to the author, but the author makes no representations as to whether it may be satisfactory to CIPO:

    The Canada Trademarks Dataset is (c) 2021 by Jeremy Sheff and licensed under a CC-BY-4.0 license, subject to additional terms imposed by the Canadian Intellectual Property Office. It contains data licensed by Her Majesty the Queen in right of Canada, as represented by the Minister of Industry, the minister responsible for the administration of the Canadian Intellectual Property Office. For further information, see https://creativecommons.org/licenses/by/4.0/ and https://www.ic.gc.ca/eic/site/cipointernet-internetopic.nsf/eng/wr01935.html.

    Details of Repository Contents:

    This repository includes a number of .zip archives which expand into folders containing either scripts for construction and analysis of the dataset or data files comprising the dataset itself. These folders are as follows:

    • /csv: contains the .csv versions of the data files
    • /do: contains Stata do-files used to convert the .csv files to .dta format and perform the statistical analyses set forth in the paper reporting this dataset
    • /dta: contains the .dta versions of the data files
    • /py: contains the python scripts used to download CIPO’s historical trademarks data via SFTP and generate the .csv data files

    If users wish to construct rather than download the datafiles, the first script that they should run is /py/sftp_secure.py. This script will prompt the user to enter their IP Horizons SFTP credentials; these can be obtained by registering with CIPO at https://ised-isde.survey-sondage.ca/f/s.aspx?s=59f3b3a4-2fb5-49a4-b064-645a5e3a752d&lang=EN&ds=SFTP. The script will also prompt the user to identify a target directory for the data downloads. Because the data archives are quite large, users are advised to create a target directory in advance and ensure they have at least 70GB of available storage on the media in which the directory is located.

    The sftp_secure.py script will generate a new subfolder in the user’s target directory called /XML_raw. Users should note the full path of this directory, which they will be prompted to provide when running the remaining python scripts. Each of the remaining scripts, the filenames of which begin with “iterparse”, corresponds to one of the data files in the dataset, as indicated in the script’s filename. After running one of these scripts, the user’s target directory should include a /csv subdirectory containing the data file corresponding to the script; after running all the iterparse scripts the user’s /csv directory should be identical to the /csv directory in this repository. Users are invited to modify these scripts as they see fit, subject to the terms of the licenses set forth above.
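
    The iterparse scripts stream CIPO's large XML archives rather than loading them into memory whole. A minimal sketch of that pattern with xml.etree.ElementTree (the tag and field names here are hypothetical, not CIPO's actual schema):

    import csv
    import xml.etree.ElementTree as ET

    # Stream a large XML file and write selected fields to CSV (tag/field names hypothetical)
    with open("applications.csv", "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["application_number", "filing_date"])
        for event, elem in ET.iterparse("trademarks.xml", events=("end",)):
            if elem.tag == "Application":
                writer.writerow([elem.findtext("ApplicationNumber"), elem.findtext("FilingDate")])
                elem.clear()  # free memory as elements are processed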

    With respect to the Stata do-files, only one of them is relevant to construction of the dataset itself. This is /do/CA_TM_csv_cleanup.do, which converts the .csv versions of the data files to .dta format, and uses Stata’s labeling functionality to reduce the size of the resulting files while preserving information. The other do-files generate the analyses and graphics presented in the paper describing the dataset (Jeremy N. Sheff, The Canada Trademarks Dataset, 18 J. Empirical Leg. Studies (forthcoming 2021)), available at https://papers.ssrn.com/abstract=3782655). These do-files are also licensed for reuse subject to the terms of the CC-BY-4.0 license, and users are invited to adapt the scripts to their needs.

    The python and Stata scripts included in this repository are separately maintained and updated on Github at https://github.com/jnsheff/CanadaTM.

    This repository also includes a copy of the current version of CIPO's data dictionary for its historical XML trademarks archive as of the date of construction of this dataset.

  14. CEOS Cal Val Test Site - Dome C, Antarctica - Instrumented Site - Dataset - NASA Open Data Portal

    • data.nasa.gov
    Updated Mar 31, 2025
    + more versions
    Cite
    nasa.gov (2025). CEOS Cal Val Test Site - Dome C, Antarctica - Instrumented Site - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/ceos-cal-val-test-site-dome-c-antarctica-instrumented-site
    Explore at:
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Area covered
    Antarctica, Dome C
    Description

    On the background of these requirements for sensor calibration, intercalibration and product validation, the subgroup on Calibration and Validation of the Committee on Earth Observation Satellites (CEOS) formulated the following recommendation during the plenary session held in China at the end of 2004, with the goal of setting up and operating an internet-based system to provide sensor data, protocols and guidelines for these purposes:

    Background: Reference Datasets are required to support the understanding of climate change and quality assure operational services by Earth Observing satellites. The data from different sensors and the resulting synergistic data products require a high level of accuracy that can only be obtained through continuous traceable calibration and validation activities.

    Requirement: Initiate an activity to document a reference methodology to predict Top of Atmosphere (TOA) radiance for which currently flying and planned wide swath sensors can be intercompared, i.e. define a standard for traceability. Also create and maintain a fully accessible web page containing, on an instrument basis, links to all instrument characteristics needed for intercomparisons as specified above, ideally in a common format. In addition, create and maintain a database (e.g. SADE) of instrument data for specific vicarious calibration sites, including site characteristics, in a common format. Each agency is responsible for providing data for their instruments in this common format.

    Recommendation: The required activities described above should be supported for an implementation period of two years and a maintenance period over two subsequent years. The CEOS should encourage a member agency to accept the lead role in supporting this activity. CEOS should request all member agencies to support this activity by providing appropriate information and data in a timely manner.

    Instrumented Sites: Dome C, Antarctica is one of eight instrumented sites that are CEOS Reference Test Sites. The CEOS instrumented sites are provisionally being called LANDNET. These instrumented sites are primarily used for field campaigns to obtain radiometric gain, and these sites can serve as a focus for international efforts, facilitating traceability and inter-comparison to evaluate biases of in-flight and future instruments in a harmonized manner. In the longer term it is anticipated that these sites will all be fully automated and provide surface and atmospheric measurements to the WWW in an autonomous manner, reducing some of the cost of a manned campaign; at present three can operate in this manner.

  15. Student Performance & Behavior Dataset

    • kaggle.com
    zip
    Updated May 28, 2025
    Cite
    Mahmoud Elhemaly (2025). Student Performance & Behavior Dataset [Dataset]. https://www.kaggle.com/datasets/mahmoudelhemaly/students-grading-dataset
    Explore at:
    Available download formats: zip (1020509 bytes)
    Dataset updated
    May 28, 2025
    Authors
    Mahmoud Elhemaly
    Description

    Student Performance & Behavior Dataset

    This dataset is real data of 5,000 records collected from a private learning provider. The dataset includes key attributes necessary for exploring patterns, correlations, and insights related to academic performance.

    Columns:
    01. Student_ID: Unique identifier for each student.
    02. First_Name: Student’s first name.
    03. Last_Name: Student’s last name.
    04. Email: Contact email (can be anonymized).
    05. Gender: Male, Female, Other.
    06. Age: The age of the student.
    07. Department: Student's department (e.g., CS, Engineering, Business).
    08. Attendance (%): Attendance percentage (0-100%).
    09. Midterm_Score: Midterm exam score (out of 100).
    10. Final_Score: Final exam score (out of 100).
    11. Assignments_Avg: Average score of all assignments (out of 100).
    12. Quizzes_Avg: Average quiz scores (out of 100).
    13. Participation_Score: Score based on class participation (0-10).
    14. Projects_Score: Project evaluation score (out of 100).
    15. Total_Score: Weighted sum of all grades.
    16. Grade: Letter grade (A, B, C, D, F).
    17. Study_Hours_per_Week: Average study hours per week.
    18. Extracurricular_Activities: Whether the student participates in extracurriculars (Yes/No).
    19. Internet_Access_at_Home: Does the student have access to the internet at home? (Yes/No).
    20. Parent_Education_Level: Highest education level of parents (None, High School, Bachelor's, Master's, PhD).
    21. Family_Income_Level: Low, Medium, High.
    22. Stress_Level (1-10): Self-reported stress level (1: Low, 10: High).
    23. Sleep_Hours_per_Night: Average hours of sleep per night.

    Attendance is not part of the Total_Score, or carries only minimal weight.

    Calculating the weighted sum: Total_Score = a·Midterm + b·Final + c·Assignments + d·Quizzes + e·Participation + f·Projects

    | Component | Weight (%) |
    | --- | --- |
    | Midterm | 15% |
    | Final | 25% |
    | Assignments Avg | 15% |
    | Quizzes Avg | 10% |
    | Participation | 5% |
    | Projects Score | 30% |
    | Total | 100% |
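
    A minimal sketch of applying these weights with pandas, assuming the column names listed above (Participation_Score is on a 0-10 scale, so it is rescaled to 0-100 here):

    import pandas as pd

    df = pd.read_csv("Students_Grading_Dataset_Biased.csv")

    # Participation is recorded on a 0-10 scale; rescale to 0-100 before weighting
    df["Participation_100"] = df["Participation_Score"] * 10

    # Component weights from the table above
    weights = {
        "Midterm_Score": 0.15,
        "Final_Score": 0.25,
        "Assignments_Avg": 0.15,
        "Quizzes_Avg": 0.10,
        "Participation_100": 0.05,
        "Projects_Score": 0.30,
    }

    # Recompute the weighted total and compare it with the published Total_Score
    df["Total_Score_check"] = sum(df[col] * w for col, w in weights.items())
    print(df[["Student_ID", "Total_Score", "Total_Score_check"]].head())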

    The dataset contains:
    - Missing values (nulls) in some records (e.g., Attendance, Assignments, or Parent Education Level).
    - Bias in some data (e.g., grading: students with high attendance get slightly better grades).
    - Imbalanced distributions (e.g., some departments having more students).

    Note:
    - The dataset is real, but I included some bias to create a greater challenge for my students.
    - Some columns have been masked as the data owner requested.
    - "Students_Grading_Dataset_Biased.csv" contains the biased dataset; "Students Performance Dataset" contains the masked dataset.

  16. GRACEnet: GHG Emissions, C Sequestration and Environmental Benefits

    • kaggle.com
    zip
    Updated Jan 19, 2023
    Cite
    The Devastator (2023). GRACEnet: GHG Emissions, C Sequestration and [Dataset]. https://www.kaggle.com/datasets/thedevastator/gracenet-ghg-emissions-c-sequestration-and-envir
    Explore at:
    Available download formats: zip (1943875 bytes)
    Dataset updated
    Jan 19, 2023
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    GRACEnet: GHG Emissions, C Sequestration and Environmental Benefits

    Quantifying Climate Change Mitigation and Sustainable Agricultural Practices

    By US Open Data Portal, data.gov [source]

    About this dataset

    This Kaggle dataset showcases the groundbreaking research undertaken by the GRACEnet program, which is attempting to better understand and minimize greenhouse gas (GHG) emissions from agro-ecosystems in order to create a healthier world for all. Through multi-location field studies that utilize standardized protocols – combined with models, producers, and policy makers – GRACEnet seeks to: typify existing production practices, maximize C sequestration, minimize net GHG emissions, and meet sustainable production goals. This Kaggle dataset allows us to evaluate the impact of different management systems on factors such as carbon dioxide and nitrous oxide emissions, C sequestration levels, crop/forest yield levels – plus additional environmental effects like air quality etc. With this data we can start getting an idea of the ways that agricultural policies may be influencing our planet's ever-evolving climate dilemma


    How to use the dataset

    Step 1: Familiarize yourself with the columns in this dataset. In particular, pay attention to Spreadsheet tab description (brief description of each spreadsheet tab), Element or value display name (name of each element or value being measured), Description (detailed description), Data type (type of data being measured), Unit (unit of measurement for the data), Calculation (calculation used to determine a value or percentage), Format (format required for submitting values), and Low Value and High Value (range for acceptable entries).

    Step 2: Familiarize yourself with any additional information related to calculations. Most calculations made use of accepted best estimates based on standard protocols defined by GRACEnet. Every calculation was described in detail and included post-processing steps such as quality assurance/quality control changes as well as measurement uncertainty assessment, etc., as available sources permit. Relevant calculations were discussed collaboratively between all participating partners at every level where they felt necessary. All terms were rigorously reviewed before all partners agreed upon any decision(s). A range was established when several assumptions were needed or when there was a high possibility that samples might fall outside previously accepted ranges associated with standard protocol conditions set up at GRACEnet Headquarters laboratories, resulting from other external factors like soil type, climate, etc.

    Step 3: Determine what types of operations are allowed within each spreadsheet tab (.csv file). For example on some tabs operations like adding an entire row may be permitted but using formulas is not permitted since all non-standard manipulations often introduce errors into an analysis which is why users are encouraged only add new rows/columns provided it is seen fit for their specific analysis operations like fill blank cells by zeros or delete rows/columns made redundant after standard filtering process which have been removed earlier from different tabs should be avoided since these nonstandard changes create unverified extra noise which can bias your results later on during robustness testing processes related to self verification process thereby creating erroneous output results also such action also might result into additional FET values due API's specially crafted excel documents while selecting two ways combo box therefore

    Research Ideas

    • Analyzing and comparing the environmental benefits of different agricultural management practices, such as crop yields and carbon sequestration rates.
    • Developing an app or other mobile platform to help farmers find management practices that maximize carbon sequestration and minimize GHG emissions in their area, based on their specific soil condition and climate data.
    • Building an AI-driven model to predict net greenhouse gas emissions and C sequestration from potential weekly/monthly production plans across different regions in the world, based on optimal allocation of resources such as fertilizers, equipment, water etc

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data source: US Open Data Portal, data.gov.

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the ...

  17. Annual Budget 2010 Table C FCC

    • data.smartdublin.ie
    • datasalsa.com
    • +2more
    Updated Jun 28, 2021
    + more versions
    Cite
    (2021). Annual Budget 2010 Table C FCC [Dataset]. https://data.smartdublin.ie/dataset/annual-budget-2010-table-c-fcc2
    Explore at:
    Dataset updated
    Jun 28, 2021
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains data from the Council’s Annual Budget. The budget is comprised of Tables A to F and Appendix 1. Each table is represented by a separate data file. Table C is the Calculation of the Annual Rate on Valuation for the Financial Year for Balbriggan Town Council. It contains:
    • Estimate of ‘Money Demanded’
    • Adopted ‘Money Demanded’
    • Estimated ‘Irrecoverable rates and cost of collection’
    • Adopted ‘Irrecoverable rates and cost of collection’
    • Total Sum to be Raised, which is the sum of ‘Money Demanded’ and ‘Irrecoverable rates and cost of collection’
    • ‘Annual Rate on Valuation to meet Total Sum to be Raised’

    This dataset is used to create Table C in the published Annual Budget document, which can be found at www.fingal.ie. The data is best understood by comparing it to Table C. Data fields for Table C are as follows:
    • Doc: Table Reference
    • Heading: Indicates sections in the Table - Table C is comprised of one section, therefore Heading value for all records = 1
    • Ref: Town Reference
    • Desc: Town Description
    • MD_Est: Money Demanded Estimated
    • MD_Adopt: Money Demanded Adopted
    • IR_Est: Irrecoverable rates and cost of collection Estimated
    • IR_Adopt: Irrecoverable rates and cost of collection Adopted
    • NEV: Annual Rate on Valuation to meet Total Sum to be Raised

  18. Fineweb-c

    • sprogteknologi.dk
    Updated Jan 30, 2025
    Cite
    Privatperson (2025). Fineweb-c [Dataset]. https://sprogteknologi.dk/dataset/fineweb-c
    Explore at:
    Available download formats: parquet (http://publications.europa.eu/resource/authority/file-type/parquet)
    Dataset updated
    Jan 30, 2025
    Dataset authored and provided by
    Privatperson
    License

    Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
    License information was derived automatically

    Description

    FineWeb-C: Educational content in many languages, labelled by the community. This is a link to the Danish part of the dataset.

    This is a collaborative, community-driven project that expands upon the FineWeb2 dataset. Our goal is to create high-quality educational content annotations across hundreds of languages.

    By enhancing web content with these annotations, we aim to improve the development of Large Language Models (LLMs) in all languages, making AI technology more accessible and effective globally.

    The annotations in this dataset will help train AI systems to automatically identify high-quality educational content in more languages and in turn help build better Large Language Models for all languages.

    What the community is doing:
    • For a given language, look at a page of web content from the FineWeb2 dataset in Argilla.
    • Rate how educational the content is.
    • Flag problematic content, i.e. content that is malformed or in the wrong language.
    Once a language reaches 1,000 annotations, the data will be included in this dataset! Alongside rating the educational quality of the content, different language communities are discussing other ways to improve the quality of data for their language in our Discord discussion channel.

    The use of this dataset is also subject to CommonCrawl's Terms of Use.

  19. Louisville Metro KY - Annual Open Data Report 2022

    • catalog.data.gov
    • data.louisvilleky.gov
    • +3more
    Updated Jul 30, 2025
    + more versions
    Cite
    Louisville/Jefferson County Information Consortium (2025). Louisville Metro KY - Annual Open Data Report 2022 [Dataset]. https://catalog.data.gov/dataset/louisville-metro-ky-annual-open-data-report-2022
    Explore at:
    Dataset updated
    Jul 30, 2025
    Dataset provided by
    Louisville/Jefferson County Information Consortium
    Area covered
    Louisville, Kentucky
    Description

    On August 25th, 2022, Metro Council passed the Open Data Ordinance; previously, open data reports were published under Mayor Fischer's Executive Order. You can find here both the Open Data Ordinance, 2022 (PDF) and the Mayor's Open Data Executive Order, 2013.

    Open Data Annual Reports

    Page 6 of the Open Data Ordinance: "Within one year of the effective date of this Ordinance, and thereafter no later than September 1 of each year, the Open Data Management Team shall submit to the Mayor and Metro Council an annual Open Data Report."

    The Open Data Management Team (also known as the Data Governance Team) is currently led by the city's Data Officer, Andrew McKinney, in the Office of Civic Innovation and Technology. Previously, it was led by the former Data Officer, Michael Schnuerle, and prior to that by the Director of IT.

    Open Data Ordinance O-243-22 Text

    Louisville Metro Government
    Legislation Text
    File #: O-243-22, Version: 3
    ORDINANCE NO. _, SERIES 2022

    AN ORDINANCE CREATING A NEW CHAPTER OF THE LOUISVILLE/JEFFERSON COUNTY METRO CODE OF ORDINANCES CREATING AN OPEN DATA POLICY AND REVIEW. (AMENDMENT BY SUBSTITUTION) (AS AMENDED).

    SPONSORED BY: COUNCIL MEMBERS ARTHUR, WINKLER, CHAMBERS ARMSTRONG, PIAGENTINI, DORSEY, AND PRESIDENT JAMES

    WHEREAS, Metro Government is the catalyst for creating a world-class city that provides its citizens with safe and vibrant neighborhoods, great jobs, a strong system of education and innovation and a high quality of life;

    WHEREAS, it should be easy to do business with Metro Government. Online government interactions mean more convenient services for citizens and businesses, and online government interactions improve the cost effectiveness and accuracy of government operations;

    WHEREAS, an open government also makes certain that every aspect of the built environment also has reliable digital descriptions available to citizens and entrepreneurs for deep engagement mediated by smart devices;

    WHEREAS, every citizen has the right to prompt, efficient service from Metro Government;

    WHEREAS, the adoption of open standards improves transparency, access to public information and improved coordination and efficiencies among Departments and partner organizations across the public, non-profit and private sectors;

    WHEREAS, by publishing structured standardized data in machine readable formats, Metro Government seeks to encourage the local technology community to develop software applications and tools to display, organize, analyze, and share public record data in new and innovative ways;

    WHEREAS, Metro Government's ability to review data and datasets will facilitate a better understanding of the obstacles the city faces with regard to equity;

    WHEREAS, Metro Government's understanding of inequities, through data and datasets, will assist in creating better policies to tackle inequities in the city;

    WHEREAS, through this Ordinance, Metro Government desires to maintain its continuous improvement in open data and transparency that it initiated via Mayoral Executive Order No. 1, Series 2013;

    WHEREAS, Metro Government's open data work has repeatedly been recognized, as evidenced by its achieving What Works Cities Silver (2018), Gold (2019), and Platinum (2020) certifications. What Works Cities recognizes and celebrates local governments for their exceptional use of data to inform policy and funding decisions, improve services, create operational efficiencies, and engage residents. The Certification program assesses cities on their data-driven decision-making practices, such as whether they are using data to set goals and track progress, allocate funding, evaluate the effectiveness of programs, and achieve desired outcomes. These data-informed strategies enable Certified Cities to be more resilient, respond in crisis situations, increase economic mobility, protect public health, and increase resident satisfaction; and

    WHEREAS, in commitment to the spirit of Open Government, Metro Government will consider public information to be open by default and will proactively publish data and data containing information, consistent with the Kentucky Open Meetings and Open Records Act.

    NOW, THEREFORE, BE IT ORDAINED BY THE COUNCIL OF THE LOUISVILLE/JEFFERSON COUNTY METRO GOVERNMENT AS FOLLOWS:

    SECTION I: A new chapter of the Louisville Metro Code of Ordinances ("LMCO") mandating an Open Data Policy and review process is hereby created as follows:

    § XXX.01 DEFINITIONS. For the purpose of this Chapter, the following definitions shall apply unless the context clearly indicates or requires a different meaning.

    OPEN DATA. Any public record as defined by the Kentucky Open Records Act, which could be made available online using Open Format data, as well as best practice Open Data structures and formats when possible, that is not Protected Information or Sensitive Information, with no legal restrictions on use or reuse. Open Data is not information that is treated as exempt under KRS 61.878 by Metro Government.

    OPEN DATA REPORT. The annual report of the Open Data Management Team, which shall (i) summarize and comment on the state of Open Data availability in Metro Government Departments from the previous year, including, but not limited to, the progress toward achieving the goals of Metro Government's Open Data portal, an assessment of the current scope of compliance, a list of datasets currently available on the Open Data portal and a description and publication timeline for datasets envisioned to be published on the portal in the following year; and (ii) provide a plan for the next year to improve online public access to Open Data and maintain data quality.

    OPEN DATA MANAGEMENT TEAM. A group consisting of representatives from each Department within Metro Government and chaired by the Data Officer, who is responsible for coordinating implementation of an Open Data Policy and creating the Open Data Report.

    DATA COORDINATORS. The members of an Open Data Management Team facilitated by the Data Officer and the Office of Civic Innovation and Technology.

    DEPARTMENT. Any Metro Government department, office, administrative unit, commission, board, advisory committee, or other division of Metro Government.

    DATA OFFICER. The staff person designated by the city to coordinate and implement the city's open data program and policy.

    DATA. The statistical, factual, quantitative or qualitative information that is maintained or created by or on behalf of Metro Government.

    DATASET. A named collection of related records, with the collection containing data organized or formatted in a specific or prescribed way.

    METADATA. Contextual information that makes the Open Data easier to understand and use.

    OPEN DATA PORTAL. The internet site established and maintained by or on behalf of Metro Government, located at https://data.louisvilleky.gov/ or its successor website.

    OPEN FORMAT. Any widely accepted, nonproprietary, searchable, platform-independent, machine-readable method for formatting data which permits automated processes.

    PROTECTED INFORMATION. Any Dataset or portion thereof to which the Department may deny access pursuant to any law, rule or regulation.

    SENSITIVE INFORMATION. Any Data which, if published on the Open Data Portal, could raise privacy, confidentiality or security concerns or have the potential to jeopardize public health, safety or welfare to an extent that is greater than the potential public benefit of publishing that data.

    § XXX.02 OPEN DATA PORTAL

    (A) The Open Data Portal shall serve as the authoritative source for Open Data provided by Metro Government.

    (B) Any Open Data made accessible on Metro Government's Open Data Portal shall use an Open Format.

    (C) In the event a successor website is used, the Data Officer shall notify the Metro Council and shall provide notice to the public on the main city website.

    § XXX.03 OPEN DATA MANAGEMENT TEAM

    (A) The Data Officer of Metro Government will work with the head of each Department to identify a Data Coordinator in each Department. The Open Data Management Team will work to establish a robust, nationally recognized platform that addresses digital infrastructure and Open Data.

    (B) The Open Data Management Team will develop an Open Data Policy that will adopt prevailing Open Format standards for Open Data and develop agreements with regional partners to publish and maintain Open Data that is open and freely available while respecting exemptions allowed by the Kentucky Open Records Act or other federal or state law.

    § XXX.04 DEPARTMENT OPEN DATA CATALOGUE

    (A) Each Department shall retain ownership over the Datasets they submit to the Open Data Portal. The Departments shall also be responsible for all aspects of the quality, integrity and security of the Dataset contents, including updating its Data and associated Metadata.

    (B) Each Department shall be responsible for creating an Open Data catalogue, which shall include comprehensive inventories of information possessed and/or managed by the Department.

    (C) Each Department's Open Data catalogue will classify information holdings as currently "public" or "not yet public;" Departments will work with the Office of Civic Innovation and Technology to develop strategies and timelines for publishing Open Data containing information in a way that is complete, reliable and has a high level of detail.

    § XXX.05 OPEN DATA REPORT AND POLICY REVIEW

    (A) Within one year of the effective date of this Ordinance, and thereafter no later than September 1 of each year, the Open Data Management Team shall submit to the Mayor and Metro Council an annual Open Data Report.

    (B) Metro Council may request a specific Department to report on any data or dataset that may be beneficial or pertinent in implementing policy and legislation.

    (C) In acknowledgment that technology changes rapidly, in the future the Open Data Policy shall be reviewed annually and considered for revisions or additions that will continue to position Metro Government as a leader on issues of
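    As an illustration of the machine-readable publication the ordinance describes, the minimal sketch below loads an Open Format (CSV) dataset into a DataFrame with pandas. The dataset URL is a placeholder standing in for any CSV export hosted on the Open Data Portal at https://data.louisvilleky.gov/; it is not a documented endpoint, so substitute the download link of a real dataset.

import pandas as pd

# Placeholder URL: replace with the CSV download link of any dataset
# published on the Louisville Open Data Portal.
DATASET_URL = "https://data.louisvilleky.gov/path/to/some-dataset.csv"

def load_open_dataset(url: str) -> pd.DataFrame:
    """Read a machine-readable (CSV) Open Format dataset into a DataFrame."""
    return pd.read_csv(url)

if __name__ == "__main__":
    df = load_open_dataset(DATASET_URL)
    print(df.shape)   # number of rows and columns
    print(df.dtypes)  # column names and inferred types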

  20. Q

    Interstate War Initiation and Termination (I-WIT) data set

    • data.qdr.syr.edu
    pdf, txt
    Updated Jan 27, 2018
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Tanisha Fazal; Page Fortna (2018). Interstate War Initiation and Termination (I-WIT) data set [Dataset]. http://doi.org/10.5064/F6JW8BSD
    Explore at:
    pdf(20823), pdf(30743), pdf(26663), pdf(28899), pdf(26915), pdf(22945), pdf(38196), pdf(20543), pdf(23298), pdf(22274), pdf(27937), pdf(31238), pdf(25515), pdf(21490), pdf(17162), pdf(30230), pdf(37051), pdf(19327), pdf(40735), pdf(36793), pdf(22905), pdf(52067), txt(5709), pdf(16656), pdf(23304), pdf(33318), pdf(23788), pdf(33029), pdf(19777), pdf(25561), pdf(27179), pdf(21593), pdf(27236), pdf(35499), pdf(31560), pdf(78157), pdf(14867), pdf(38456), pdf(26235), pdf(27218), pdf(21608), pdf(30580), pdf(36239), pdf(34064), pdf(37069), pdf(41691), pdf(22760), pdf(25855), pdf(205800), pdf(35868), pdf(17490), pdf(39803), pdf(18104), pdf(14144), pdf(31152), pdf(22618), pdf(29328), pdf(42574), pdf(27961), pdf(39315), pdf(22624), pdf(25356), pdf(18522), pdf(30614), pdf(54106), pdf(18430), pdf(16431), pdf(36124), pdf(29261), pdf(32211), pdf(28049), pdf(26778), pdf(17728), pdf(18920), pdf(27686), pdf(35681), pdf(33715), pdf(25790), pdf(30998), pdf(25920), pdf(17586), pdf(30323), pdf(35613), pdf(26445), pdf(29821), pdf(26032), pdf(42873), pdf(26647), pdf(22476), pdf(17934), pdf(41245), pdf(30815), pdf(37642), pdf(23754), pdf(33771), pdf(21492), pdf(33045), pdf(17763), pdf(35641), pdf(62086), pdf(34431), pdf(24910), pdf(21064), pdf(174541), pdf(26897), pdf(63566), pdf(37213), pdf(38948), pdf(31810), pdf(16640), pdf(31247), pdf(47882), pdf(16274), pdf(21243), pdf(26918), pdf(38934), pdf(48977), pdf(31957), pdf(25789), pdf(33588), pdf(26961), pdf(24169), pdf(36577), pdf(30572), pdf(17886), pdf(44210), pdf(21675), pdf(31866), pdf(21565), pdf(24852), pdf(40299), pdf(19557), pdf(32327), pdf(33361), pdf(26256), pdf(24729), pdf(18306), pdf(29396), pdf(22833), pdf(32456), pdf(73597), pdf(36733), pdf(19616), pdf(28512)Available download formats
    Dataset updated
    Jan 27, 2018
    Dataset provided by
    Qualitative Data Repository
    Authors
    Tanisha Fazal; Page Fortna
    License

    https://qdr.syr.edu/policies/qdr-standard-access-conditions

    Time period covered
    Jan 1, 1816 - Dec 31, 2007
    Area covered
    Global
    Description

    Project Summary: The Interstate War Initiation and Termination (I-WIT) data set was created to enable study of macro-historical change in war initiation and termination. I-WIT is based on the Correlates of War (COW) version 4 list of interstate wars, and contains most of the interstate wars in the COW list; those excluded were wars the researchers believe do not meet the COW criteria for interstate wars. For each war, research assistants (RAs) coded a host of variables relating to war initiation and termination, including whether each side issued a declaration of war, the political and military outcomes of the war (which are coded separately), and the nature of any agreement that concluded the war. One argument made in several publications based on these data (also part of a larger book project) is that the proliferation of codified international humanitarian law has created disincentives for states to admit that they are in a state of war. Declaring war or concluding a peace treaty would constitute an admission of being in a state of war. As international humanitarian law has proliferated and changed in character over the past 100 years or so, it has set the costs of compliance – and also the costs of finding a state to be out of compliance – very high. Thus, states avoid declaring war and concluding peace treaties to try to perpetrate a type of legal fiction – that they are not at war – to limit their liability for any violations of the laws of war.

    Data Abstract: The data cover the period from 1816 to 2007 and span the entire world. Dozens of graduate and undergraduate RAs working between 2004 and 2010 compiled existing data from secondary sources and, when available online, primary sources to code variables listed and described in the coding instrument. RAs were given a coding instrument with a description and rules for coding each variable. Typically, they consulted both secondary and primary sources, although off-site archival sources were not consulted. They filled in a spreadsheet for each war with variable values, and produced a narrative report (henceforward, "narrative") of 5-10 pages that gave background information on the war and also justified their coding. Each war was assigned to at least two RAs to check for inter-coder reliability. If there was disagreement between the first two RAs, a third RA was brought in to code discrepant variables for that war. Where possible, a 2/3 rule was followed in resolving discrepancies. Remaining discrepancies are addressed in the "discrepancy narrative," which lists the discrepancies and documents final coding decisions.

    Files Description: Some sources were scanned (e.g., declarations of war or peace treaties), but for the most part, RAs took notes on their assigned cases and produced their coding and narratives based on these notes. The coding instrument and the discrepancy narrative are included in the data documentation files, and all data files produced – including original codings that were discrepant with later codings – are included in the interest of allowing other researchers to make their own judgments as to the final coding decisions. A companion data set – C-WIT (Civil War Initiation and Termination) – is still under construction and thus not shared at this time.
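    A minimal sketch of the 2/3 resolution rule described in the Data Abstract above, assuming each war-variable pair has codings from two or three RAs; the function name and example values are illustrative and are not taken from the project's actual files.

from collections import Counter

def resolve_coding(codings):
    """Apply a 2/3 majority rule to RA codings for one variable of one war.

    `codings` is a list of values from two or three coders. Returns the
    majority value when at least two coders agree; otherwise returns None,
    meaning the discrepancy is left to the discrepancy narrative.
    """
    value, votes = Counter(codings).most_common(1)[0]
    return value if votes >= 2 else None

# Example: two RAs disagree on whether a declaration of war was issued,
# so a third RA codes the variable and the majority value is kept.
print(resolve_coding(["yes", "no", "yes"]))  # -> "yes"
print(resolve_coding(["yes", "no"]))         # -> None (bring in a third RA)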
