4 datasets found
  1. Harvard Art Museums API

    • dataverse.harvard.edu
    • datasetcatalog.nlm.nih.gov
    Updated Dec 11, 2015
    Cite
    Jeff Steward (2015). Harvard Art Museums API [Dataset]. http://doi.org/10.7910/DVN/AGTG4E
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Dec 11, 2015
    Dataset provided by
    Harvard Dataverse
    Authors
    Jeff Steward
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    The Harvard Art Museums API is a REST-style service designed for developers who wish to explore and integrate the museums’ collections in their projects. The API provides direct access to JSON formatted data that describes many aspects of the museums. Details at http://www.harvardartmuseums.org/collections/api and https://github.com/harvardartmuseums/api-docs.
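A minimal sketch of querying the API described above. The base URL and the `apikey`, `classification`, and `size` parameters follow the public documentation linked in the description; the key value is a placeholder, and the sample response below is abbreviated and illustrative, not real data.

```python
import json
from urllib.parse import urlencode

# Base URL for collection objects, per the documentation at
# harvardartmuseums.org/collections/api (an API key is required).
BASE_URL = "https://api.harvardartmuseums.org/object"

def build_object_query(api_key: str, classification: str, size: int = 10) -> str:
    """Build a URL for the /object endpoint."""
    params = {"apikey": api_key, "classification": classification, "size": size}
    return f"{BASE_URL}?{urlencode(params)}"

def titles(response: dict) -> list:
    """Pull titles out of a decoded JSON response; `records` holds the objects."""
    return [rec.get("title", "") for rec in response.get("records", [])]

# Abbreviated, illustrative response shape:
sample = json.loads(
    '{"info": {"totalrecords": 2},'
    ' "records": [{"title": "Self-Portrait"}, {"title": "Water Lilies"}]}'
)

url = build_object_query("YOUR_API_KEY", "Paintings", size=2)
print(url)
print(titles(sample))
```

In a real client you would fetch `url` with any HTTP library and pass the decoded JSON to `titles`.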

  2. Harvard Faculty Finder

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 21, 2023
    Cite
    Waldo, Jim (2023). Harvard Faculty Finder [Dataset]. http://doi.org/10.7910/DVN/PLMNRW
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Waldo, Jim
    Description

    The Harvard Faculty Finder (HFF) creates an institution-wide view of the breadth and depth of Harvard faculty and scholarship, and it helps students, faculty, administrators, and the general public locate Harvard faculty by research and teaching expertise. More information about the HFF website and the data it contains can be found on the Harvard University Faculty Development & Diversity website. HFF is a Semantic Web application, which means its content can be read and understood by other computer programs. This enables the data associated with a person, such as titles, contact information, and publications, to be shared with other institutions and displayed on other websites. Below are the technical details for building a computer program that can export data from HFF. The data is available through an API; no authentication is required. Documentation can be found at http://api.facultyfinder.harvard.edu, or you can see a snapshot of the documentation in the data for this entry. The API entry points are described in the documentation.
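A hedged sketch of calling the HFF API. The description only says the API requires no authentication and is documented at api.facultyfinder.harvard.edu; the `/people` endpoint, the `expertise` parameter, and the response fields below are hypothetical placeholders, so consult the documentation snapshot in this entry for the real entry points.

```python
import json
from urllib.parse import quote

# Documented API host; the endpoint and parameter below are hypothetical.
BASE_URL = "http://api.facultyfinder.harvard.edu"

def faculty_search_url(expertise: str) -> str:
    """Build a search URL for a (hypothetical) expertise query."""
    return f"{BASE_URL}/people?expertise={quote(expertise)}"

def names(response: dict) -> list:
    """Extract display names from a decoded JSON response (assumed shape)."""
    return [person["name"] for person in response.get("people", [])]

# Illustrative response, matching the assumed shape above:
sample = json.loads('{"people": [{"name": "Waldo, Jim"}]}')

print(faculty_search_url("distributed systems"))
print(names(sample))
```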

  3. Reddit May 2019 Submissions

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 22, 2023
    Cite
    Baumgartner, Jason (2023). Reddit May 2019 Submissions [Dataset]. http://doi.org/10.7910/DVN/JVI8CT
    Dataset updated
    Nov 22, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Baumgartner, Jason
    Description

    Dataset Metrics
    Total size of data uncompressed: 59,515,177,346 bytes
    Number of objects (submissions): 19,456,493
    Reddit API documentation: https://www.reddit.com/dev/api/

    Overview
    This dataset contains all available submissions from Reddit during the month of May 2019 (using UTC time boundaries). The data has been split to accommodate the file upload limits of Dataverse. Each file is a collection of JSON objects (ndjson), compressed with Zstandard (https://facebook.github.io/zstd). The files should be ordered by the id of the submission (the id field). The time each object was ingested is recorded in the retrieved_on field (in epoch seconds).

    Methodology
    Monthly Reddit ingests are usually started around a week into a new month for the previous month (but could be delayed). This gives submission scores, gildings, and num_comments time to "settle" close to their eventual values before Reddit archives the posts (usually six months after a post's creation). All submissions are ingested via Reddit's API (using the /api/info endpoint). This is a "best effort" attempt to get all available data at the time of ingest. Because subreddits can go from private to public at any time, more submissions could be found by rescanning missing ids; the author highly encourages researchers to do a sanity check on the data and to rescan for missing ids to ensure all available data has been gathered. If you need assistance, you can contact the author directly. Generally, more than 95% of all ids are captured. Missing data could result from Reddit API errors, submissions that were private during the ingest but later became public, and subreddits that were quarantined and not added to the whitelist before ingest. When collecting the data, two scans are done: the first scan collects all available data for each id via the /api/info endpoint, and a second scan requests only the ids missing from the first. This helps keep the data as complete and comprehensive as possible.

    Contact
    If you have any questions about the data or require more details on the methodology, you are welcome to contact the author.
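A minimal sketch of consuming the ndjson files described above. The files are Zstandard-compressed, so decompress them first (for example with the `zstd` CLI, `zstd -d file.zst`, or the third-party `zstandard` Python package); the function below then works on plain ndjson lines. The `id` and `retrieved_on` fields are named in the description; `subreddit` is a standard Reddit submission field and is assumed here.

```python
import io
import json

def count_by_subreddit(lines):
    """Count submissions per subreddit from an iterable of ndjson lines,
    skipping blank lines; each non-blank line is one JSON submission."""
    counts = {}
    for line in lines:
        if not line.strip():
            continue
        submission = json.loads(line)
        sub = submission["subreddit"]
        counts[sub] = counts.get(sub, 0) + 1
    return counts

# Illustrative two-line ndjson sample (not real dataset records):
sample = io.StringIO(
    '{"id": "bk1abc", "subreddit": "python", "retrieved_on": 1559900000}\n'
    '{"id": "bk1abd", "subreddit": "python", "retrieved_on": 1559900001}\n'
)
print(count_by_subreddit(sample))
```

The same function can be pointed at a decompressed dataset file with `count_by_subreddit(open("RS_2019-05", encoding="utf-8"))`, where the file name is a placeholder for whichever split you downloaded.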

  4. Harvard Catalyst Profiles

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Oct 2, 2016
    Cite
    Griffin Weber (2016). Harvard Catalyst Profiles [Dataset]. http://doi.org/10.7910/DVN/SOZSJA
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Oct 2, 2016
    Dataset provided by
    Harvard Dataverse
    Authors
    Griffin Weber
    License

    Custom license: https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.7910/DVN/SOZSJA

    Description

    Harvard Catalyst Profiles is a Semantic Web application, which means its content can be read and understood by other computer programs. This enables the data in profiles, such as addresses and publications, to be shared with other institutions and appear on other websites. If you click the "Export RDF" link on the left sidebar of a profile page, you can see what computer programs see when visiting a profile. The section below describes the technical details for building a computer program that can export data from Harvard Catalyst Profiles. There are four types of application programming interfaces (APIs) in Harvard Catalyst Profiles.
    • RDF crawl. Because Harvard Catalyst Profiles is a Semantic Web application, every profile has both an HTML page and a corresponding RDF document, which contains the data for that page in RDF/XML format. Web crawlers can follow the links embedded within the RDF/XML to access additional content.
    • SPARQL endpoint. SPARQL is a query language that enables arbitrary queries against RDF data. This provides the most flexibility in accessing data; the downsides are the complexity of writing SPARQL queries and performance. In general, the XML Search API (see below) is better to use than SPARQL. However, if you require access to the SPARQL endpoint, please contact Griffin Weber.
    • XML Search API. This web service supports the most common types of queries. It is designed to be easier to use and to offer better performance than SPARQL, at the expense of fewer options. It enables full-text search across all entity types, with faceting, pagination, and sorting options. The request message to the web service is in XML format, but the output is in RDF/XML format. The URL of the XML Search API is https://connects.catalyst.harvard.edu/API/Profiles/Public/Search.
    • Old XML-based web services. These provide backwards compatibility for institutions that built applications against the older version of Harvard Catalyst Profiles. They do not take advantage of many of the new features, and users are encouraged to switch to one of the new APIs. The URL of the old XML web service is https://connects.catalyst.harvard.edu/ProfilesAPI.
    For more information about the APIs, please see the documentation and example files.
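A hedged sketch of composing a request for the XML Search API above. The description says only that the request is XML and the response is RDF/XML; the element names used here (`SearchOptions`, `MatchOptions`, `SearchString`, `OutputOptions`, `MaxRecords`) are illustrative guesses, so check the API's documentation and example files for the real request schema before sending anything to the endpoint.

```python
import xml.etree.ElementTree as ET

# Documented endpoint; the request schema below is a guess, not the real one.
SEARCH_URL = "https://connects.catalyst.harvard.edu/API/Profiles/Public/Search"

def build_search_request(text: str, max_records: int = 25) -> str:
    """Serialize a (hypothetical) full-text search request as XML."""
    root = ET.Element("SearchOptions")
    match = ET.SubElement(root, "MatchOptions")
    ET.SubElement(match, "SearchString").text = text
    output = ET.SubElement(root, "OutputOptions")
    ET.SubElement(output, "MaxRecords").text = str(max_records)
    return ET.tostring(root, encoding="unicode")

request_xml = build_search_request("cardiology")
print(request_xml)
```

A real client would POST `request_xml` to `SEARCH_URL` and parse the RDF/XML response with an RDF library.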

  5. Not seeing a result you expected?
    Learn how you can add new datasets to our index.
