13 datasets found
  1. Complete BHL Metadata Object Description Schema (MODS) XML Data Export

    • smithsonian.figshare.com
    bin
    Updated Jun 1, 2023
    Cite
    Jacqueline Dearborn; Mike Lichtenberg (2023). Complete BHL Metadata Object Description Schema (MODS) XML Data Export [Dataset]. http://doi.org/10.25573/data.21081526.v1
    Explore at: bin
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Smithsonian Libraries and Archives
    Authors
    Jacqueline Dearborn; Mike Lichtenberg
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    The datasets containing metadata in MODS for the entire BHL collection (both hosted and externally linked content) can be downloaded from the following locations:

    bhlitem.mods.xml, bhlitem.mods.xml.zip, bhlpart.mods.xml, bhlpart.mods.xml.zip, bhltitle.mods.xml, bhltitle.mods.xml.zip

    For contextual information and key definitions about this dataset see the Biodiversity Heritage Library Open Data Collection.

    Data Dictionary: https://www.loc.gov/standards/mods/v3/mods-3-8.xsd
    Release Date: First of the month
    Frequency: Monthly
    bureauCode: 452:11
    Access Level: public
    Rights: http://rightsstatements.org/vocab/NoC-US/
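The exports above are MODS XML, so they can be read with any XML tooling. A minimal sketch of stream-parsing titles with the Python standard library; the sample record below is hypothetical, but the MODS v3 namespace is the one used by the schema linked in the data dictionary. Because the full exports are very large, `iterparse` is used rather than loading the whole tree.

```python
import io
import xml.etree.ElementTree as ET

MODS_NS = "{http://www.loc.gov/mods/v3}"

# Hypothetical single-record sample standing in for a real BHL export.
sample = b"""<?xml version="1.0"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
  <mods>
    <titleInfo><title>Example BHL item title</title></titleInfo>
    <identifier type="uri">https://www.biodiversitylibrary.org/item/1</identifier>
  </mods>
</modsCollection>"""

def iter_titles(stream):
    """Stream-parse a (potentially huge) MODS export, yielding record titles."""
    for event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == MODS_NS + "mods":
            yield elem.findtext(f"{MODS_NS}titleInfo/{MODS_NS}title")
            elem.clear()  # free memory as we go; the real exports are large

titles = list(iter_titles(io.BytesIO(sample)))
print(titles)  # ['Example BHL item title']
```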

  2. BHL-hosted Metadata Object Description Schema (MODS) XML Data Export

    • smithsonian.figshare.com
    bin
    Updated May 30, 2023
    Cite
    Jacqueline Dearborn; Mike Lichtenberg (2023). BHL-hosted Metadata Object Description Schema (MODS) XML Data Export [Dataset]. http://doi.org/10.25573/data.21082192.v1
    Explore at: bin
    Dataset updated
    May 30, 2023
    Dataset provided by
    Smithsonian Libraries and Archives
    Authors
    Jacqueline Dearborn; Mike Lichtenberg
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    The datasets containing metadata in MODS for only items hosted by BHL can be downloaded from the following locations:

    bhlitem.mods.xml, bhlitem.mods.xml.zip, bhlpart.mods.xml, bhlpart.mods.xml.zip, bhltitle.mods.xml, bhltitle.mods.xml.zip

    For contextual information and key definitions about this dataset see the Biodiversity Heritage Library Open Data Collection.

    Data Dictionary: https://www.loc.gov/standards/mods/v3/mods-3-8.xsd
    Release Date: First of the month
    Frequency: Monthly
    bureauCode: 452:11
    Access Level: public
    Rights: http://rightsstatements.org/vocab/NoC-US/

  3. Resources for Harvard Identity and Access Management Data Customers

    • dataverse.harvard.edu
    Updated Oct 2, 2016
    Cite
    Harvard Identity and Access Management (2016). Resources for Harvard Identity and Access Management Data Customers [Dataset]. http://doi.org/10.7910/DVN/ILMCGS
    Explore at: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Oct 2, 2016
    Dataset provided by
    Harvard Dataverse
    Authors
    Harvard Identity and Access Management
    License

    https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.7910/DVN/ILMCGS

    Description

    Developers and others working with Harvard data customers can find help guides, definitions, and other key information on working with data import and export here. Still looking for what you need? Please contact iam@harvard.edu. You will be required to log in before downloading these files.

    For All Data Customers:
    • XML Schema Definition for Harvard People Data

    For Import Customers:
    • IdM Import Developers' Guide: File Names and Delivery
    • IdM Import User's Guide: Log File Errors and Warnings
    • IdM XML Email Import Provider's Guide
    • IdM XML Student Directory Listing and Emergency Contact Data Provider's Guide

    For Export Customers:
    • Developers' Guide to Data Display and Applying Privacy
    • IdM XML Export Data User's Guide
    • IdM Export Developers' Guide: File Names and Privacy

  4. Data from: TPB Public Register

    • data.gov.au
    • data.wu.ac.at
    xml, zip
    Updated May 7, 2023
    Cite
    Tax Practitioners Board (2023). TPB Public Register [Dataset]. https://data.gov.au/data/dataset/groups/tpb-register
    Explore at: xml (75246260), zip (6459125)
    Dataset updated
    May 7, 2023
    Dataset authored and provided by
    Tax Practitioners Board
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    TPB Public Register

    To view XML content follow these steps:

    1. Download the TPB Public Register XML data from the website via right-click and 'Save link as'.
    2. Open Excel and create a new worksheet.
    3. Click Developer > Import. If you don't see the Developer tab, see 'Show the Developer tab'.
    4. In the Import XML dialog box, locate and select the XML data file (.xml) you want to import, and click Import.
    5. In the Import Data dialog box, select 'XML table in existing worksheet' to import the contents of the XML data file into an XML table in your worksheet at the specified cell location.
    6. Click 'OK' for any prompts that are displayed.
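The Excel steps above can also be done in code. This is a rough sketch of flattening register-style XML into CSV with the Python standard library; the `<practitioner>` element and its fields are hypothetical stand-ins, since the actual TPB schema is not shown here.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical fragment standing in for the real register XML.
sample = """<register>
  <practitioner><name>Jane Example</name><state>NSW</state></practitioner>
  <practitioner><name>John Example</name><state>VIC</state></practitioner>
</register>"""

root = ET.fromstring(sample)
# One dict per record: child element tag -> text content.
rows = [{child.tag: child.text for child in record} for record in root]

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["name", "state"])
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

The same `rows` list can be fed to a spreadsheet or a dataframe library instead of `csv`.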

  5. Extended Information Delivery Specifications (IDS) dataset for the digital building permit process

    • researchdata.tuwien.at
    zip
    Updated Oct 14, 2025
    Cite
    Patrick Loibl; Simon Fischer; Harald Urban; Christian Schranz (2025). Extended Information Delivery Specifications (IDS) dataset for the digital building permit process [Dataset]. http://doi.org/10.48436/yh1wr-8t679
    Explore at: zip
    Dataset updated
    Oct 14, 2025
    Dataset provided by
    TU Wien
    Authors
    Patrick Loibl; Simon Fischer; Harald Urban; Christian Schranz
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    The published data is research data from the publication "Extending Information Delivery Specifications for digital building permit requirements" (https://doi.org/10.1016/j.dibe.2024.100560). This publication examines the potential for extending the Information Delivery Specification (IDS) schema to facilitate its integration into the building permit process.
    IDS is an open specification based on XML for defining and verifying information requirements for digital building models in the IFC format (an open format for BIM models).
    The publication presents concepts for extending IDS to define information requirements for escape route analysis and code compliance checks against Austrian fire resistance regulations.

    This dataset contains the results of the mentioned publication. This includes the edited IDS schema (XML Schema Definition - XSD), the created IDS files, and BCF files (BIM Collaboration Format) used for the validation. The IDS files were used as input for a self-developed IDS software to check two IFC test models. The two test models are published in related datasets:
    Custom test model for escape route analysis in IFC format: https://doi.org/10.48436/hx8gz-zw339
    Real-world test model for escape route analysis in IFC format: https://doi.org/10.48436/fnmrh-crh59
    The checking results were saved in the BCF files. Therefore, the checking results can be visualised by applying the published BCF files to the corresponding IFC models.

    Technical details

    The dataset contains three folders, one for each of the three data types:

    • XSD-file: contains the adapted XML Schema Definition (XSD) for IDS. The basis for the adaptation was IDS version 1.0.
    • IDS-files: contains IDS files for three different purposes:
      • Information-Requirements-EscapeRouteAnalysis.ids contains targeted information requirements ensuring an IFC model carries the data required to perform an escape route analysis.
      • Code-Compliance-Checking-OIB2-Tab1b contains code compliance checks for the fire resistance regulations defined in the Austrian OIB guideline 2 in Table 1b.
      • Code-Compliance-Checking-OIB2-Tab2a contains code compliance checks for the fire resistance regulations defined in the Austrian OIB guideline 2 in Table 2a.
    • BCF-files: contains all BCF files (version 2.1) created in this research. The folder is structured into two subfolders, each containing 6 BCF files differentiated by the IDS file and test model used:
      • Applicable-elements: contains BCF files listing the applicable elements for each IDS specification
      • Checking-reports: contains BCF files listing the failed elements for each IDS specification. These are classic checking reports.

    Further details

    • The IDS files correspond to the adapted schema. Therefore, commercial IDS software cannot process these files. Their purpose is to show how the schema extension affects the structure of IDS files.
    • The BCF files use a standard BCF version and can be imported into commercial BIM software. The combination with the cited test models allows for visualising the checking results.
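Since IDS files are plain XML, the specifications they contain can be listed with standard XML tooling even when commercial IDS software rejects the adapted schema. A minimal sketch with the Python standard library; the XML below is a simplified, hypothetical stand-in for the dataset's structure, not the official IDS schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified IDS-like file.
sample = """<ids>
  <specifications>
    <specification name="EscapeRouteWidth" ifcVersion="IFC4"/>
    <specification name="FireResistanceREI90" ifcVersion="IFC4"/>
  </specifications>
</ids>"""

root = ET.fromstring(sample)
# Collect the name attribute of every specification element.
names = [spec.get("name") for spec in root.iter("specification")]
print(names)  # ['EscapeRouteWidth', 'FireResistanceREI90']
```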
  6. Directory.gov.au export

    • data.gov.au
    • researchdata.edu.au
    • +1more
    xml
    Updated Jul 13, 2017
    Cite
    Department of Finance (2017). Directory.gov.au export [Dataset]. https://data.gov.au/data/dataset/directory-gov-au-export
    Explore at: xml
    Dataset updated
    Jul 13, 2017
    Dataset authored and provided by
    Department of Finance
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    The purpose of Directory (directory.gov.au) is to provide a guide to the structure, organisations and key people in the Australian Government.

    A new and improved Directory.gov.au system was released in mid-2017 which consolidated the former Directory, AusGovBoards and Australian Government Organisations Register, into one system. The consolidated system contains key information about Australian Government entities, contact details of key stakeholders within organisations, and a listing of government board appointments.

    This XML dataset is a full extract of the Directory content, updated daily.

  7. Current and last 100 days of planning applications - Dataset - data.gov.uk

    • ckan.publishing.service.gov.uk
    Updated Feb 24, 2016
    + more versions
    Cite
    ckan.publishing.service.gov.uk (2016). Current and last 100 days of planning applications - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/current-and-last-100-days-of-planning-applications
    Explore at:
    Dataset updated
    Feb 24, 2016
    Dataset provided by
    CKAN (https://ckan.org/)
    Description

    Current (undetermined) and most recent 100 days of determined applications at Surrey County Council, Surrey, UK. N.B. Surrey County Council does not oversee all planning applications for the county; SCC oversees applications of a specific nature, for example related to waste, minerals, highways and schools.

    Data can be visualised and accessed for the whole of Surrey County (all planning authorities), including API access to the data, here: http://digitalservices.surreyi.gov.uk/. Data updated daily. Data formatted according to the ODUG/LGA national schema, see: http://schemas.opendata.esd.org.uk/PlanningApplications. Also available as JSON, XML and CSV.

    To save as CSV, use the CSV download link and 'save page as' .csv for correct comma-separated formatting. This will provide you with a neatly formatted CSV file which can be opened in a spreadsheet package such as Excel or LibreOffice Calc.

    PSMA End User Licence: http://www.ordnancesurvey.co.uk/business-and-government/public-sector/mapping-agreements/end-user-licence.html

  8. DEPD -- The Differentially Expressed Protein Database

    • opendata.pku.edu.cn
    Updated Nov 20, 2015
    Cite
    Peking University Open Research Data Platform (2015). DEPD -- The Differentially Expressed Protein Database [Dataset]. http://doi.org/10.18170/DVN/UYUEUE
    Explore at:
    Dataset updated
    Nov 20, 2015
    Dataset provided by
    Peking University Open Research Data Platform
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Access to Data

    The Differentially Expressed Protein Database (DEPD) is a publicly available, web-based database. It was designed to store the output of comparative proteomics and provides a query and analysis platform for further data mining. Currently, the DEPD contains information about more than 3,000 DEPs, manually extracted from published literature, mostly from studies of serious human diseases including lung cancer, breast cancer and liver cancer. Towards establishing a data exchange standard for comparative proteomics, DEPD provides a new XML schema named CPXS 0.1 (Comparative Proteomics XML Schema). Additionally, a user-friendly web interface has been set up with tools for querying, visualising and analysing the results of published comparative proteomics studies. All of the DEPD data can be downloaded freely from the web site (http://protchem.hunnu.edu.cn/depd/).

  9. Zenodo metadata JSON records as of 2019-09-16

    • data.europa.eu
    unknown
    Cite
    Zenodo, Zenodo metadata JSON records as of 2019-09-16 [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-3531504?locale=no
    Explore at: unknown (1786)
    Dataset authored and provided by
    Zenodo (http://zenodo.org/)
    Description

    This preliminary dataset contains the application/vnd.zenodo.v1+json JSON records of Zenodo deposits as retrieved on 2019-09-16.

    Files:

    • zenodo-records-json-2019-09-16.tar.xz (Zenodo JSON records): XZ-compressed tar archive of individual JSON records as retrieved from Zenodo. Filenames reflect the record ID, e.g. 1310621.json was retrieved from https://zenodo.org/api/records/1310621 using content negotiation for application/vnd.zenodo.v1+json.
    • zenodo-records-json-2019-09-16-filtered.jsonseq.xz (concatenated Zenodo JSON records): XZ-compressed RFC 7464 JSON text sequence stream, readable by jq. Concatenation of the Zenodo JSON records; order not significant.
    • zenodo-records.sh (retrieve Zenodo JSON records): a retrospectively created Bash shell script that shows the commands used to retrieve the JSON files and concatenate them to jsonseq.
    • ro-crate-metadata.jsonld: RO-Crate 0.2 structured metadata.
    • ro-crate-preview.html: browser rendering of the RO-Crate structured metadata.
    • README.md: this dataset description.

    License:

    This dataset is provided under the Apache License, version 2.0. Copyright 2019 The University of Manchester. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0. Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

    CC0 for Zenodo metadata: the Zenodo metadata in zenodo-records-json-2019-09-16.tar.xz is reused under the terms of https://creativecommons.org/publicdomain/zero/1.0/

    Reproducibility:

    To retrieve the Zenodo JSON it was deemed necessary to use undocumented parts of the Zenodo API. From the Zenodo source code it was identified that the REST template https://zenodo.org/api/records/{pid_value} could be used with pid_value as the numeric part of the OAI-PMH identifier, e.g. for oai:zenodo.org:1310621 the Zenodo JSON can be retrieved at https://zenodo.org/api/records/1310621. The JSON API supports content negotiation; the content types supported as of 2019-09-20 include:

    • application/vnd.zenodo.v1+json: the Zenodo record in Zenodo's internal JSON schema (v1)
    • application/ld+json: JSON-LD Linked Data using the http://schema.org/ vocabulary
    • application/x-datacite-v41+xml: DataCite v4 XML
    • application/marcxml+xml: MARC 21 XML

    Using these (currently) undocumented parts of the Zenodo API thus avoids the need for HTML scraping while also giving individual complete records that are suitable to redistribute in a filtered dataset. This preliminary exploration will be adapted into a reproducible CWL workflow; for now it is included as the Bash script zenodo-records.sh. Execution time was about 3 days from a server on the University of Manchester network with a single 1 Gbps network link. The script does the following:

    1. Retrieve each of the first 3.5 million Zenodo records as Zenodo JSON by iterating over possible numeric IDs (the maximum ID of 3450000 was estimated from "Recent uploads").
    2. Filter the list to exclude records that are not found, moved or deleted; the presence of the key conceptrecid is used as the marker.
    3. Use jq to ensure each JSON record is on a single line.
    4. Join the JSON files using the ASCII Record Separator (RS, 0x1e) to make an application/json-seq JSON text sequence stream.
    5. Save the JSON stream as a single xz-compressed file.
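The jsonseq file uses RFC 7464 framing: each record is prefixed with the ASCII Record Separator (0x1e) and followed by a newline. A small sketch of writing and reading that framing with the Python standard library (the record contents below are made up):

```python
import json

RS = "\x1e"  # ASCII Record Separator, RFC 7464 framing byte

def to_json_seq(records):
    """Serialize records as an application/json-seq stream."""
    return "".join(RS + json.dumps(r, separators=(",", ":")) + "\n" for r in records)

def from_json_seq(stream):
    """Parse a JSON text sequence back into Python objects."""
    return [json.loads(chunk) for chunk in stream.split(RS) if chunk.strip()]

records = [{"conceptrecid": "1310620", "id": 1310621}, {"conceptrecid": "99", "id": 100}]
stream = to_json_seq(records)
assert from_json_seq(stream) == records
```

The same stream is what `jq` reads when pointed at the decompressed `.jsonseq` file.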

  10. BOREAS Forest Cover Data Layers Over the SSA-MSA in Raster Format

    • s.cnmilf.com
    • search.dataone.org
    • +7more
    Updated Sep 19, 2025
    + more versions
    Cite
    ORNL_DAAC (2025). BOREAS Forest Cover Data Layers Over the SSA-MSA in Raster Format [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/boreas-forest-cover-data-layers-over-the-ssa-msa-in-raster-format-65ce0
    Explore at:
    Dataset updated
    Sep 19, 2025
    Dataset provided by
    ORNL_DAAC
    Description

    This data set was prepared by BORIS staff by processing the original vector data into raster files. The original data were received as ARC/INFO coverages or as export files from SERM. The data include information on forest parameters for the BOREAS SSA MSA. The data are stored in binary, image format files.

  11. Jula lexicographic data collected during January 2019

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 25, 2024
    Cite
    Donaldson, Coleman (2024). Jula lexicographic data collected during January 2019 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_2566070
    Explore at:
    Dataset updated
    Jul 25, 2024
    Dataset provided by
    University of Hamburg
    Authors
    Donaldson, Coleman
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Rough lexicographic data for 62 Jula lexemes collected during January 2019 in western Burkina Faso by Coleman Donaldson in the following formats:

    1) a .lift file (of the LIFT XML schema) exported from LexiquePro.

    2) a .txt file in Toolbox format

    3) a .pdf file of a formatted export from LexiquePro

    4) a .rtf file of an export from LexiquePro.

    5) a .pdf file of a formatted export from Toolbox's MDF tool.

    6) a .rtf file of an export from Toolbox's MDF tool.

    I follow the de facto official phonemic orthography, synthesizing the various national standards that linguists use, while also marking tone. Grave diacritics mark low tones and acute diacritics mark high tones. An unmarked vowel carries the same tone as the last marked vowel before it. A lexeme without any diacritics means I am unsure of its tone.
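The .lift file is plain XML, so the lexemes can be extracted with standard tooling. A small sketch with the Python standard library; the entry below is hypothetical, and the element layout follows the common LIFT pattern (lexical-unit/form/text), which may differ in detail from the LexiquePro export.

```python
import xml.etree.ElementTree as ET

# Hypothetical single-entry LIFT fragment.
sample = """<lift version="0.13">
  <entry id="jula_example">
    <lexical-unit><form lang="dyu"><text>mùso</text></form></lexical-unit>
    <sense><gloss lang="en"><text>woman</text></gloss></sense>
  </entry>
</lift>"""

root = ET.fromstring(sample)
# Map each lexeme's form to its English gloss.
lexicon = {
    entry.findtext("lexical-unit/form/text"): entry.findtext("sense/gloss/text")
    for entry in root.iter("entry")
}
print(lexicon)  # {'mùso': 'woman'}
```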

  12. SEC Form 4 Filings

    • kaggle.com
    Updated Sep 30, 2025
    Cite
    SecFilingApi (2025). SEC Form 4 Filings [Dataset]. https://www.kaggle.com/datasets/secfilingapi/sec-form-4-filings
    Explore at: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 30, 2025
    Dataset provided by
    Kaggle
    Authors
    SecFilingApi
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Historical SEC dataset containing all insider transactions (Form 4 filings). The data is public, sourced from the SEC's EDGAR database from their XML filings and lightly processed for easier consumption. Covers all Form 4 filings from January 2020 to June 2025.

    Why this exists

    Form 4s are noisy to work with: amended filings, multiple insiders per transaction, and inconsistent tables. This dataset provides clean, normalized insider-transaction data with a stable schema so you can backtest signals and monitor insider activity without scraping.

    What you get:

    • Company (name, ticker if known), CIK
    • Insider(s) with role (Director/Officer/10% Owner, etc.)
    • Trade details: transaction date, type, number of shares, price, total consideration
    • Holdings before/after the transaction

    Update cadence: monthly (moving to daily as we scale). Source: U.S. SEC EDGAR Form 4 filings.

    We're building a real-time API for new filings with clean JSON endpoints and low latency. If interested, sign up to our waiting list: 👉 https://secfilingapi.com/?utm_source=kaggle&utm_medium=dataset&utm_campaign=form4

    Ideas to try with the data:

    • Event study: top-decile insider buys vs sector ETF over 30/60/90 days
    • Cluster insiders by role + transaction size; test persistence
    • Filter for CEO buys after 20% drawdowns (momentum/contrarian mix)
    • Find unusual clusters (multiple insiders buying within 10 days)
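The last idea above can be sketched with the standard library alone: flag tickers where several distinct insiders bought within a 10-day window. The rows and field names below are hypothetical stand-ins for the dataset's schema.

```python
from datetime import date, timedelta

# Hypothetical rows; "P" stands in for a purchase transaction type.
trades = [
    {"ticker": "XYZ", "insider": "CEO A", "type": "P", "date": date(2024, 3, 1)},
    {"ticker": "XYZ", "insider": "CFO B", "type": "P", "date": date(2024, 3, 6)},
    {"ticker": "XYZ", "insider": "Dir C", "type": "P", "date": date(2024, 3, 9)},
    {"ticker": "ABC", "insider": "CEO D", "type": "P", "date": date(2024, 3, 2)},
]

def buy_clusters(trades, window=timedelta(days=10), min_insiders=2):
    """Tickers where >= min_insiders distinct insiders bought within `window`."""
    buys = [t for t in trades if t["type"] == "P"]
    clustered = set()
    for t in buys:
        # Distinct insiders buying this ticker in the window starting at t.
        insiders = {u["insider"] for u in buys
                    if u["ticker"] == t["ticker"]
                    and t["date"] <= u["date"] <= t["date"] + window}
        if len(insiders) >= min_insiders:
            clustered.add(t["ticker"])
    return clustered

print(buy_clusters(trades))  # {'XYZ'}
```

This is quadratic in the number of buys per ticker; sorting by date and using a sliding window would scale to the full dataset.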

    Limitations & notes:

    • There are a few missing filings in the historical data (fewer than 1000 in total, all very old filings from before the SEC launched the XML format).
    • Forms 4B (amendments to Form 4; rare) are not included.
    • Mapping from CIK→ticker can lag for recent IPOs/SPACs.
    • Always verify edge cases against the EDGAR link when publishing results.

    Feedback & requests welcome in the Discussion tab.

    DISCLAIMER: It is possible that inaccuracies or other errors were introduced into the data sets during the process of extracting the data and compiling the data sets. The data set is intended to assist the public in analyzing data contained in Commission filings; however, they are not a substitute for such filings. Investors should review the full Commission filings before making any investment decision.

  13. Liaisons maritimes gérées par la région Bretagne

    • data.bretagne.bzh
    • data.geocatalogue.fr
    • +3more
    csv, excel, geojson +1
    Updated May 17, 2018
    Cite
    (2018). Liaisons maritimes gérées par la région Bretagne [Dataset]. https://data.bretagne.bzh/explore/dataset/liaisons-maritimes-gerees-par-la-region-bretagne/
    Explore at: csv, geojson, json, excel
    Dataset updated
    May 17, 2018
    License

    Licence Ouverte / Open Licence 1.0 (https://www.etalab.gouv.fr/wp-content/uploads/2014/05/Open_Licence.pdf)
    License information was derived automatically

    Area covered
    Bretagne
    Description

    Maritime links between the islands and the mainland managed by the Région Bretagne. Covered are the islands of Bréhat, Ouessant, Molène, Sein, Groix, Belle-Île, Houat, Hoedic and Arz.
