13 datasets found
  1. Complete BHL Metadata Object Description Schema (MODS) XML Data Export

    • smithsonian.figshare.com
    bin
    Updated Jun 1, 2023
    Cite
    Jacqueline Dearborn; Mike Lichtenberg (2023). Complete BHL Metadata Object Description Schema (MODS) XML Data Export [Dataset]. http://doi.org/10.25573/data.21081526.v1
    Explore at: bin
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Smithsonian Libraries and Archives
    Authors
    Jacqueline Dearborn; Mike Lichtenberg
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    The datasets containing metadata in MODS for the entire BHL collection (both hosted and externally linked content) can be downloaded from the following locations:

    bhlitem.mods.xml, bhlitem.mods.xml.zip, bhlpart.mods.xml, bhlpart.mods.xml.zip, bhltitle.mods.xml, bhltitle.mods.xml.zip

    For contextual information and key definitions about this dataset see the Biodiversity Heritage Library Open Data Collection.

    Data Dictionary: https://www.loc.gov/standards/mods/v3/mods-3-8.xsd
    Release Date: First of the month
    Frequency: Monthly
    bureauCode: 452:11
    Access Level: public
    Rights: http://rightsstatements.org/vocab/NoC-US/
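The exports above are MODS XML, so they can be read with any XML tooling. A minimal sketch of stream-parsing titles with the Python standard library; the sample record below is hypothetical, but the MODS v3 namespace is the one used by the schema linked in the data dictionary. Because the full exports are very large, `iterparse` is used rather than loading the whole tree.

```python
import io
import xml.etree.ElementTree as ET

MODS_NS = "{http://www.loc.gov/mods/v3}"

# Hypothetical single-record sample standing in for a real BHL export.
sample = b"""<?xml version="1.0"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
  <mods>
    <titleInfo><title>Example BHL item title</title></titleInfo>
    <identifier type="uri">https://www.biodiversitylibrary.org/item/1</identifier>
  </mods>
</modsCollection>"""

def iter_titles(stream):
    """Stream-parse a (potentially huge) MODS export, yielding record titles."""
    for event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == MODS_NS + "mods":
            yield elem.findtext(f"{MODS_NS}titleInfo/{MODS_NS}title")
            elem.clear()  # free memory as we go; the real exports are large

titles = list(iter_titles(io.BytesIO(sample)))
print(titles)  # ['Example BHL item title']
```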

  2. BHL-hosted Metadata Object Description Schema (MODS) XML Data Export

    • smithsonian.figshare.com
    bin
    Updated May 30, 2023
    Cite
    Jacqueline Dearborn; Mike Lichtenberg (2023). BHL-hosted Metadata Object Description Schema (MODS) XML Data Export [Dataset]. http://doi.org/10.25573/data.21082192.v1
    Explore at: bin
    Dataset updated
    May 30, 2023
    Dataset provided by
    Smithsonian Libraries and Archives
    Authors
    Jacqueline Dearborn; Mike Lichtenberg
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    The datasets containing metadata in MODS for only items hosted by BHL can be downloaded from the following locations:

    bhlitem.mods.xml, bhlitem.mods.xml.zip, bhlpart.mods.xml, bhlpart.mods.xml.zip, bhltitle.mods.xml, bhltitle.mods.xml.zip

    For contextual information and key definitions about this dataset see the Biodiversity Heritage Library Open Data Collection.

    Data Dictionary: https://www.loc.gov/standards/mods/v3/mods-3-8.xsd
    Release Date: First of the month
    Frequency: Monthly
    bureauCode: 452:11
    Access Level: public
    Rights: http://rightsstatements.org/vocab/NoC-US/

  3. Resources for Harvard Identity and Access Management Data Customers

    • dataverse.harvard.edu
    Updated Oct 2, 2016
    Cite
    Harvard Identity and Access Management (2016). Resources for Harvard Identity and Access Management Data Customers [Dataset]. http://doi.org/10.7910/DVN/ILMCGS
    Explore at: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Oct 2, 2016
    Dataset provided by
    Harvard Dataverse
    Authors
    Harvard Identity and Access Management
    License

    https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.7910/DVN/ILMCGS

    Description

    Developers and others working with Harvard data customers can find help guides, definitions, and other key information on working with data import and export here. Still looking for what you need? Please contact iam@harvard.edu. You will be required to log in before downloading these files.

    For All Data Customers:
    • XML Schema Definition for Harvard People Data

    For Import Customers:
    • IdM Import Developers' Guide: File Names and Delivery
    • IdM Import User's Guide: Log File Errors and Warnings
    • IdM XML Email Import Provider's Guide
    • IdM XML Student Directory Listing and Emergency Contact Data Provider's Guide

    For Export Customers:
    • Developers' Guide to Data Display and Applying Privacy
    • IdM XML Export Data User's Guide
    • IdM Export Developers' Guide: File Names and Privacy

  4. Data from: TPB Public Register

    • data.gov.au
    • data.wu.ac.at
    xml, zip
    Updated May 7, 2023
    Cite
    Tax Practitioners Board (2023). TPB Public Register [Dataset]. https://data.gov.au/data/dataset/groups/tpb-register
    Explore at: xml (75246260), zip (6459125)
    Dataset updated
    May 7, 2023
    Dataset authored and provided by
    Tax Practitioners Board
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    TPB Public Register

    To view XML content follow these steps:

    1. Download the TPB Public Register XML data from the website via right-click and 'Save link as'.
    2. Open Excel and create a new worksheet.
    3. Click Developer > Import. If you don't see the Developer tab, see 'Show the Developer tab'.
    4. In the Import XML dialog box, locate and select the XML data file (.xml) you want to import, and click Import.
    5. In the Import Data dialog box, select 'XML table in existing worksheet' to import the contents of the XML data file into an XML table in your worksheet at the specified cell location.
    6. Click 'OK' for any prompts that are displayed.
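The Excel steps above can also be done in code. This is a rough sketch of flattening register-style XML into CSV with the Python standard library; the `<practitioner>` element and its fields are hypothetical stand-ins, since the actual TPB schema is not shown here.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical fragment standing in for the real register XML.
sample = """<register>
  <practitioner><name>Jane Example</name><state>NSW</state></practitioner>
  <practitioner><name>John Example</name><state>VIC</state></practitioner>
</register>"""

root = ET.fromstring(sample)
# One dict per record: child element tag -> text content.
rows = [{child.tag: child.text for child in record} for record in root]

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["name", "state"])
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

The same `rows` list can be fed to a spreadsheet or a dataframe library instead of `csv`.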

  5. Extended Information Delivery Specifications (IDS) dataset for the digital building permit process

    • researchdata.tuwien.at
    zip
    Updated Oct 14, 2025
    Cite
    Patrick Loibl; Simon Fischer; Harald Urban; Christian Schranz (2025). Extended Information Delivery Specifications (IDS) dataset for the digital building permit process [Dataset]. http://doi.org/10.48436/yh1wr-8t679
    Explore at: zip
    Dataset updated
    Oct 14, 2025
    Dataset provided by
    TU Wien
    Authors
    Patrick Loibl; Simon Fischer; Harald Urban; Christian Schranz
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    The published data is research data from the publication "Extending Information Delivery Specifications for digital building permit requirements" (https://doi.org/10.1016/j.dibe.2024.100560). This publication examines the potential for extending the Information Delivery Specification (IDS) schema to facilitate its integration into the building permit process.
    IDS is an open specification based on XML for defining and verifying information requirements for digital building models in the IFC format (an open format for BIM models).
    The publication presents concepts for extending IDS to define information requirements for escape route analysis and code compliance checks against Austrian fire resistance regulations.

    This dataset contains the results of the mentioned publication. This includes the edited IDS schema (XML Schema Definition - XSD), the created IDS files, and BCF files (BIM Collaboration Format) used for the validation. The IDS files were used as input for a self-developed IDS software to check two IFC test models. The two test models are published in related datasets:
    Custom test model for escape route analysis in IFC format: https://doi.org/10.48436/hx8gz-zw339
    Real-world test model for escape route analysis in IFC format: https://doi.org/10.48436/fnmrh-crh59
    The checking results were saved in the BCF files. Therefore, the checking results can be visualised by applying the published BCF files to the corresponding IFC models.

    Technical details

    The dataset contains three folders, one for each of the three data types:

    • XSD-file: contains the adapted XML Schema Definition (XSD) for IDS. The basis for the adaptation was IDS version 1.0.
    • IDS-files: contains IDS files for three different purposes:
      • Information-Requirements-EscapeRouteAnalysis.ids contains targeted information requirements ensuring an IFC model carries the data required to perform an escape route analysis.
      • Code-Compliance-Checking-OIB2-Tab1b contains code compliance checks for the fire resistance regulations defined in the Austrian OIB guideline 2 in Table 1b.
      • Code-Compliance-Checking-OIB2-Tab2a contains code compliance checks for the fire resistance regulations defined in the Austrian OIB guideline 2 in Table 2a.
    • BCF-files: contains all BCF files (version 2.1) created in this research. The folder is structured into two subfolders, each containing 6 BCF files differentiated by the IDS file and test model used:
      • Applicable-elements: contains BCF files listing the applicable elements for each IDS specification
      • Checking-reports: contains BCF files listing the failed elements for each IDS specification. These are classic checking reports.

    Further details

    • The IDS files correspond to the adapted schema. Therefore, commercial IDS software cannot process these files. Their purpose is to show how the schema extension affects the structure of IDS files.
    • The BCF files use a standard BCF version and can be imported into commercial BIM software. The combination with the cited test models allows for visualising the checking results.
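Since IDS files are plain XML, the specifications they contain can be listed with standard XML tooling even when commercial IDS software rejects the adapted schema. A minimal sketch with the Python standard library; the XML below is a simplified, hypothetical stand-in for the dataset's structure, not the official IDS schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified IDS-like file.
sample = """<ids>
  <specifications>
    <specification name="EscapeRouteWidth" ifcVersion="IFC4"/>
    <specification name="FireResistanceREI90" ifcVersion="IFC4"/>
  </specifications>
</ids>"""

root = ET.fromstring(sample)
# Collect the name attribute of every specification element.
names = [spec.get("name") for spec in root.iter("specification")]
print(names)  # ['EscapeRouteWidth', 'FireResistanceREI90']
```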
  6. Directory.gov.au export

    • data.gov.au
    • researchdata.edu.au
    • +1more
    xml
    Updated Jul 13, 2017
    Cite
    Department of Finance (2017). Directory.gov.au export [Dataset]. https://data.gov.au/data/dataset/directory-gov-au-export
    Explore at: xml
    Dataset updated
    Jul 13, 2017
    Dataset authored and provided by
    Department of Finance
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    The purpose of Directory (directory.gov.au) is to provide a guide to the structure, organisations and key people in the Australian Government.

    A new and improved Directory.gov.au system was released in mid-2017 which consolidated the former Directory, AusGovBoards and Australian Government Organisations Register, into one system. The consolidated system contains key information about Australian Government entities, contact details of key stakeholders within organisations, and a listing of government board appointments.

    This XML dataset is a full extract of the Directory content, updated daily.

  7. Current and last 100 days of planning applications - Dataset - data.gov.uk

    • ckan.publishing.service.gov.uk
    Updated Feb 24, 2016
    + more versions
    Cite
    ckan.publishing.service.gov.uk (2016). Current and last 100 days of planning applications - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/current-and-last-100-days-of-planning-applications
    Explore at:
    Dataset updated
    Feb 24, 2016
    Dataset provided by
    CKAN (https://ckan.org/)
    Description

    Current (undetermined) and most recent 100 days of determined applications at Surrey County Council, Surrey, UK. N.B. Surrey County Council does not oversee all planning applications for the county; SCC oversees applications of a specific nature, for example related to waste, minerals, highways and schools.

    Data can be visualised and accessed for the whole of Surrey County (all planning authorities), including API access to the data, here: http://digitalservices.surreyi.gov.uk/. Data updated daily. Data formatted according to the ODUG/LGA national schema, see: http://schemas.opendata.esd.org.uk/PlanningApplications. Also available as JSON, XML and CSV.

    To save as CSV, use the CSV download link and 'save page as' .csv for correct comma-separated formatting. This will provide you with a neatly formatted CSV file which can be opened in a spreadsheet package such as Excel or LibreOffice Calc.

    PSMA End User Licence: http://www.ordnancesurvey.co.uk/business-and-government/public-sector/mapping-agreements/end-user-licence.html

  8. DEPD -- The Differentially Expressed Protein Database

    • opendata.pku.edu.cn
    Updated Nov 20, 2015
    Cite
    Peking University Open Research Data Platform (2015). DEPD -- The Differentially Expressed Protein Database [Dataset]. http://doi.org/10.18170/DVN/UYUEUE
    Explore at:
    Dataset updated
    Nov 20, 2015
    Dataset provided by
    Peking University Open Research Data Platform
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Access to Data

    The Differentially Expressed Protein Database (DEPD) is a publicly available, web-based database. It was designed to store the output of comparative proteomics and provides a query and analysis platform for further data mining. Currently, the DEPD contains information about more than 3,000 DEPs, manually extracted from published literature, mostly from studies of serious human diseases including lung cancer, breast cancer and liver cancer. Towards establishing a data exchange standard for comparative proteomics, DEPD provides a new XML schema named CPXS 0.1 (Comparative Proteomics XML Schema). Additionally, a user-friendly web interface has been set up with tools for querying, visualising and analysing the results of published comparative proteomics studies. All of the DEPD data can be downloaded freely from the web site (http://protchem.hunnu.edu.cn/depd/).

  9. Zenodo metadata JSON records as of 2019-09-16

    • data.europa.eu
    unknown
    Cite
    Zenodo, Zenodo metadata JSON records as of 2019-09-16 [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-3531504?locale=no
    Explore at: unknown (1786)
    Dataset authored and provided by
    Zenodo (http://zenodo.org/)
    Description

    This preliminary dataset contains the application/vnd.zenodo.v1+json JSON records of Zenodo deposits as retrieved on 2019-09-16.

    Files:

    • zenodo-records-json-2019-09-16.tar.xz (Zenodo JSON records): XZ-compressed tar archive of individual JSON records as retrieved from Zenodo. Filenames reflect the record ID, e.g. 1310621.json was retrieved from https://zenodo.org/api/records/1310621 using content negotiation for application/vnd.zenodo.v1+json.
    • zenodo-records-json-2019-09-16-filtered.jsonseq.xz (concatenated Zenodo JSON records): XZ-compressed RFC 7464 JSON text sequence stream, readable by jq. Concatenation of the Zenodo JSON records; order not significant.
    • zenodo-records.sh (retrieve Zenodo JSON records): a retrospectively created Bash shell script that shows the commands used to retrieve the JSON files and concatenate them to jsonseq.
    • ro-crate-metadata.jsonld: RO-Crate 0.2 structured metadata.
    • ro-crate-preview.html: browser rendering of the RO-Crate structured metadata.
    • README.md: this dataset description.

    License:

    This dataset is provided under the Apache License, version 2.0. Copyright 2019 The University of Manchester. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0. Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

    CC0 for Zenodo metadata: the Zenodo metadata in zenodo-records-json-2019-09-16.tar.xz is reused under the terms of https://creativecommons.org/publicdomain/zero/1.0/

    Reproducibility:

    To retrieve the Zenodo JSON it was deemed necessary to use undocumented parts of the Zenodo API. From the Zenodo source code it was identified that the REST template https://zenodo.org/api/records/{pid_value} could be used with pid_value as the numeric part of the OAI-PMH identifier, e.g. for oai:zenodo.org:1310621 the Zenodo JSON can be retrieved at https://zenodo.org/api/records/1310621. The JSON API supports content negotiation; the content types supported as of 2019-09-20 include:

    • application/vnd.zenodo.v1+json: the Zenodo record in Zenodo's internal JSON schema (v1)
    • application/ld+json: JSON-LD Linked Data using the http://schema.org/ vocabulary
    • application/x-datacite-v41+xml: DataCite v4 XML
    • application/marcxml+xml: MARC 21 XML

    Using these (currently) undocumented parts of the Zenodo API thus avoids the need for HTML scraping while also giving individual complete records that are suitable to redistribute in a filtered dataset. This preliminary exploration will be adapted into a reproducible CWL workflow; for now it is included as the Bash script zenodo-records.sh. Execution time was about 3 days from a server on the University of Manchester network with a single 1 Gbps network link. The script does the following:

    1. Retrieve each of the first 3.5 million Zenodo records as Zenodo JSON by iterating over possible numeric IDs (the maximum ID of 3450000 was estimated from "Recent uploads").
    2. Filter the list to exclude records that are not found, moved or deleted; the presence of the key conceptrecid is used as the marker.
    3. Use jq to ensure each JSON record is on a single line.
    4. Join the JSON files using the ASCII Record Separator (RS, 0x1e) to make an application/json-seq JSON text sequence stream.
    5. Save the JSON stream as a single xz-compressed file.
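The jsonseq file uses RFC 7464 framing: each record is prefixed with the ASCII Record Separator (0x1e) and followed by a newline. A small sketch of writing and reading that framing with the Python standard library (the record contents below are made up):

```python
import json

RS = "\x1e"  # ASCII Record Separator, RFC 7464 framing byte

def to_json_seq(records):
    """Serialize records as an application/json-seq stream."""
    return "".join(RS + json.dumps(r, separators=(",", ":")) + "\n" for r in records)

def from_json_seq(stream):
    """Parse a JSON text sequence back into Python objects."""
    return [json.loads(chunk) for chunk in stream.split(RS) if chunk.strip()]

records = [{"conceptrecid": "1310620", "id": 1310621}, {"conceptrecid": "99", "id": 100}]
stream = to_json_seq(records)
assert from_json_seq(stream) == records
```

The same stream is what `jq` reads when pointed at the decompressed `.jsonseq` file.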

  10. BOREAS Forest Cover Data Layers Over the SSA-MSA in Raster Format

    • s.cnmilf.com
    • search.dataone.org
    • +7more
    Updated Sep 19, 2025
    + more versions
    Cite
    ORNL_DAAC (2025). BOREAS Forest Cover Data Layers Over the SSA-MSA in Raster Format [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/boreas-forest-cover-data-layers-over-the-ssa-msa-in-raster-format-65ce0
    Explore at:
    Dataset updated
    Sep 19, 2025
    Dataset provided by
    ORNL_DAAC
    Description

    This data set was prepared by BORIS staff by processing the original vector data into raster files. The original data were received as ARC/INFO coverages or as export files from SERM. The data include information on forest parameters for the BOREAS SSA MSA. The data are stored in binary, image format files.

  11. Jula lexicographic data collected during January 2019

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 25, 2024
    Cite
    Donaldson, Coleman (2024). Jula lexicographic data collected during January 2019 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_2566070
    Explore at:
    Dataset updated
    Jul 25, 2024
    Dataset provided by
    University of Hamburg
    Authors
    Donaldson, Coleman
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Rough lexicographic data for 62 Jula lexemes collected during January 2019 in western Burkina Faso by Coleman Donaldson in the following formats:

    1) a .lift file (of the LIFT XML schema) exported from LexiquePro.

    2) a .txt file in Toolbox format

    3) a .pdf file of a formatted export from LexiquePro

    4) a .rtf file of an export from LexiquePro.

    5) a .pdf file of a formatted export from Toolbox's MDF tool.

    6) a .rtf file of an export from Toolbox's MDF tool.

    I follow the de facto official phonemic orthography, synthesizing the various national standards that linguists use, while also marking tone. Grave diacritics mark low tones and acute diacritics mark high tones. An unmarked vowel carries the same tone as the last marked vowel before it. A lexeme without any diacritics means I am unsure of its tone.
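The .lift file is plain XML, so the lexemes can be extracted with standard tooling. A small sketch with the Python standard library; the entry below is hypothetical, and the element layout follows the common LIFT pattern (lexical-unit/form/text), which may differ in detail from the LexiquePro export.

```python
import xml.etree.ElementTree as ET

# Hypothetical single-entry LIFT fragment.
sample = """<lift version="0.13">
  <entry id="jula_example">
    <lexical-unit><form lang="dyu"><text>mùso</text></form></lexical-unit>
    <sense><gloss lang="en"><text>woman</text></gloss></sense>
  </entry>
</lift>"""

root = ET.fromstring(sample)
# Map each lexeme's form to its English gloss.
lexicon = {
    entry.findtext("lexical-unit/form/text"): entry.findtext("sense/gloss/text")
    for entry in root.iter("entry")
}
print(lexicon)  # {'mùso': 'woman'}
```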

  12. SEC Form 4 Filings

    • kaggle.com
    Updated Sep 30, 2025
    Cite
    SecFilingApi (2025). SEC Form 4 Filings [Dataset]. https://www.kaggle.com/datasets/secfilingapi/sec-form-4-filings
    Explore at: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 30, 2025
    Dataset provided by
    Kaggle
    Authors
    SecFilingApi
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Historical SEC dataset containing all insider transactions (Form 4 filings). The data is public, sourced from the SEC's EDGAR database from their XML filings and lightly processed for easier consumption. Covers all Form 4 filings from January 2020 to June 2025.

    Why this exists

    Form 4s are noisy to work with: amended filings, multiple insiders per transaction, and inconsistent tables. This dataset provides clean, normalized insider-transaction data with a stable schema so you can backtest signals and monitor insider activity without scraping.

    What you get:

    • Company (name, ticker if known), CIK
    • Insider(s) with role (Director/Officer/10% Owner, etc.)
    • Trade details: transaction date, type, number of shares, price, total consideration
    • Holdings before/after the transaction

    Update cadence: monthly (moving to daily as we scale). Source: U.S. SEC EDGAR Form 4 filings.

    We're building a real-time API for new filings with clean JSON endpoints and low latency. If interested, sign up to our waiting list: 👉 https://secfilingapi.com/?utm_source=kaggle&utm_medium=dataset&utm_campaign=form4

    Ideas to try with the data:

    • Event study: top-decile insider buys vs sector ETF over 30/60/90 days
    • Cluster insiders by role + transaction size; test persistence
    • Filter for CEO buys after 20% drawdowns (momentum/contrarian mix)
    • Find unusual clusters (multiple insiders buying within 10 days)
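The last idea above can be sketched with the standard library alone: flag tickers where several distinct insiders bought within a 10-day window. The rows and field names below are hypothetical stand-ins for the dataset's schema.

```python
from datetime import date, timedelta

# Hypothetical rows; "P" stands in for a purchase transaction type.
trades = [
    {"ticker": "XYZ", "insider": "CEO A", "type": "P", "date": date(2024, 3, 1)},
    {"ticker": "XYZ", "insider": "CFO B", "type": "P", "date": date(2024, 3, 6)},
    {"ticker": "XYZ", "insider": "Dir C", "type": "P", "date": date(2024, 3, 9)},
    {"ticker": "ABC", "insider": "CEO D", "type": "P", "date": date(2024, 3, 2)},
]

def buy_clusters(trades, window=timedelta(days=10), min_insiders=2):
    """Tickers where >= min_insiders distinct insiders bought within `window`."""
    buys = [t for t in trades if t["type"] == "P"]
    clustered = set()
    for t in buys:
        # Distinct insiders buying this ticker in the window starting at t.
        insiders = {u["insider"] for u in buys
                    if u["ticker"] == t["ticker"]
                    and t["date"] <= u["date"] <= t["date"] + window}
        if len(insiders) >= min_insiders:
            clustered.add(t["ticker"])
    return clustered

print(buy_clusters(trades))  # {'XYZ'}
```

This is quadratic in the number of buys per ticker; sorting by date and using a sliding window would scale to the full dataset.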

    Limitations & notes:

    • There are a few missing filings in the historical data (fewer than 1000 in total, all very old filings from before the SEC launched the XML format).
    • Forms 4B (amendments to Form 4; rare) are not included.
    • Mapping from CIK→ticker can lag for recent IPOs/SPACs.
    • Always verify edge cases against the EDGAR link when publishing results.

    Feedback & requests welcome in the Discussion tab.

    DISCLAIMER: It is possible that inaccuracies or other errors were introduced into the data sets during the process of extracting the data and compiling the data sets. The data set is intended to assist the public in analyzing data contained in Commission filings; however, they are not a substitute for such filings. Investors should review the full Commission filings before making any investment decision.

  13. Liaisons maritimes gérées par la région Bretagne

    • data.bretagne.bzh
    • data.geocatalogue.fr
    • +3more
    csv, excel, geojson +1
    Updated May 17, 2018
    Cite
    (2018). Liaisons maritimes gérées par la région Bretagne [Dataset]. https://data.bretagne.bzh/explore/dataset/liaisons-maritimes-gerees-par-la-region-bretagne/
    Explore at: csv, geojson, json, excel
    Dataset updated
    May 17, 2018
    License

    Licence Ouverte / Open Licence 1.0 (https://www.etalab.gouv.fr/wp-content/uploads/2014/05/Open_Licence.pdf)
    License information was derived automatically

    Area covered
    Bretagne
    Description

    Maritime links between the islands and the mainland managed by the Région Bretagne. Covered are the islands of Bréhat, Ouessant, Molène, Sein, Groix, Belle-Île, Houat, Hoedic and Arz.
