4 datasets found
  1. Harvard Art Museums API

    • dataverse.harvard.edu
    • datasetcatalog.nlm.nih.gov
    Updated Dec 11, 2015
    Cite
    Jeff Steward (2015). Harvard Art Museums API [Dataset]. http://doi.org/10.7910/DVN/AGTG4E
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Dec 11, 2015
    Dataset provided by
    Harvard Dataverse
    Authors
    Jeff Steward
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    The Harvard Art Museums API is a REST-style service designed for developers who wish to explore and integrate the museums’ collections in their projects. The API provides direct access to JSON formatted data that describes many aspects of the museums. Details at http://www.harvardartmuseums.org/collections/api and https://github.com/harvardartmuseums/api-docs.
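A minimal sketch of querying the API described above. The base URL and the `apikey`, `classification`, and `size` parameters follow the public documentation linked in the description; the key value is a placeholder, and the sample response below is abbreviated and illustrative, not real data.

```python
import json
from urllib.parse import urlencode

# Base URL for collection objects, per the documentation at
# harvardartmuseums.org/collections/api (an API key is required).
BASE_URL = "https://api.harvardartmuseums.org/object"

def build_object_query(api_key: str, classification: str, size: int = 10) -> str:
    """Build a URL for the /object endpoint."""
    params = {"apikey": api_key, "classification": classification, "size": size}
    return f"{BASE_URL}?{urlencode(params)}"

def titles(response: dict) -> list:
    """Pull titles out of a decoded JSON response; `records` holds the objects."""
    return [rec.get("title", "") for rec in response.get("records", [])]

# Abbreviated, illustrative response shape:
sample = json.loads(
    '{"info": {"totalrecords": 2},'
    ' "records": [{"title": "Self-Portrait"}, {"title": "Water Lilies"}]}'
)

url = build_object_query("YOUR_API_KEY", "Paintings", size=2)
print(url)
print(titles(sample))
```

In a real client you would fetch `url` with any HTTP library and pass the decoded JSON to `titles`.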

  2. Harvard Faculty Finder

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 21, 2023
    Cite
    Waldo, Jim (2023). Harvard Faculty Finder [Dataset]. http://doi.org/10.7910/DVN/PLMNRW
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Waldo, Jim
    Description

    The Harvard Faculty Finder (HFF) creates an institution-wide view of the breadth and depth of Harvard faculty and scholarship, and it helps students, faculty, administrators, and the general public locate Harvard faculty by research and teaching expertise. More information about the HFF website and the data it contains can be found on the Harvard University Faculty Development & Diversity website. HFF is a Semantic Web application, which means its content can be read and understood by other computer programs. This enables the data associated with a person, such as titles, contact information, and publications, to be shared with other institutions and displayed on other websites. Below are the technical details for building a computer program that can export data from HFF. The data is available through an API; no authentication is required. Documentation can be found at http://api.facultyfinder.harvard.edu, or you can see a snapshot of the documentation in the data for this entry. The API entry points are described in the documentation.
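A hedged sketch of calling the HFF API. The description only says the API requires no authentication and is documented at api.facultyfinder.harvard.edu; the `/people` endpoint, the `expertise` parameter, and the response fields below are hypothetical placeholders, so consult the documentation snapshot in this entry for the real entry points.

```python
import json
from urllib.parse import quote

# Documented API host; the endpoint and parameter below are hypothetical.
BASE_URL = "http://api.facultyfinder.harvard.edu"

def faculty_search_url(expertise: str) -> str:
    """Build a search URL for a (hypothetical) expertise query."""
    return f"{BASE_URL}/people?expertise={quote(expertise)}"

def names(response: dict) -> list:
    """Extract display names from a decoded JSON response (assumed shape)."""
    return [person["name"] for person in response.get("people", [])]

# Illustrative response, matching the assumed shape above:
sample = json.loads('{"people": [{"name": "Waldo, Jim"}]}')

print(faculty_search_url("distributed systems"))
print(names(sample))
```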

  3. Reddit May 2019 Submissions

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 22, 2023
    Cite
    Baumgartner, Jason (2023). Reddit May 2019 Submissions [Dataset]. http://doi.org/10.7910/DVN/JVI8CT
    Dataset updated
    Nov 22, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Baumgartner, Jason
    Description

    Dataset Metrics
    Total size of data uncompressed: 59,515,177,346 bytes
    Number of objects (submissions): 19,456,493
    Reddit API documentation: https://www.reddit.com/dev/api/

    Overview
    This dataset contains all available submissions from Reddit during the month of May 2019 (using UTC time boundaries). The data has been split to accommodate the file upload limits of Dataverse. Each file is a collection of JSON objects (ndjson), compressed with Zstandard (https://facebook.github.io/zstd). The files should be ordered by the id of the submission (the id field). The time each object was ingested is recorded in the retrieved_on field (in epoch seconds).

    Methodology
    Monthly Reddit ingests are usually started around a week into a new month for the previous month (but could be delayed). This gives submission scores, gildings, and num_comments time to "settle" close to their eventual values before Reddit archives the posts (usually six months after a post's creation). All submissions are ingested via Reddit's API (using the /api/info endpoint). This is a "best effort" attempt to get all available data at the time of ingest. Because subreddits can go from private to public at any time, more submissions could be found by rescanning missing ids; the author highly encourages researchers to do a sanity check on the data and to rescan for missing ids to ensure all available data has been gathered. If you need assistance, you can contact the author directly. Generally, more than 95% of all ids are captured. Missing data could result from Reddit API errors, submissions that were private during the ingest but later became public, and subreddits that were quarantined and not added to the whitelist before ingest. When collecting the data, two scans are done: the first scan collects all available data for each id via the /api/info endpoint, and a second scan requests only the ids missing from the first. This helps keep the data as complete and comprehensive as possible.

    Contact
    If you have any questions about the data or require more details on the methodology, you are welcome to contact the author.
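A minimal sketch of consuming the ndjson files described above. The files are Zstandard-compressed, so decompress them first (for example with the `zstd` CLI, `zstd -d file.zst`, or the third-party `zstandard` Python package); the function below then works on plain ndjson lines. The `id` and `retrieved_on` fields are named in the description; `subreddit` is a standard Reddit submission field and is assumed here.

```python
import io
import json

def count_by_subreddit(lines):
    """Count submissions per subreddit from an iterable of ndjson lines,
    skipping blank lines; each non-blank line is one JSON submission."""
    counts = {}
    for line in lines:
        if not line.strip():
            continue
        submission = json.loads(line)
        sub = submission["subreddit"]
        counts[sub] = counts.get(sub, 0) + 1
    return counts

# Illustrative two-line ndjson sample (not real dataset records):
sample = io.StringIO(
    '{"id": "bk1abc", "subreddit": "python", "retrieved_on": 1559900000}\n'
    '{"id": "bk1abd", "subreddit": "python", "retrieved_on": 1559900001}\n'
)
print(count_by_subreddit(sample))
```

The same function can be pointed at a decompressed dataset file with `count_by_subreddit(open("RS_2019-05", encoding="utf-8"))`, where the file name is a placeholder for whichever split you downloaded.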

  4. Harvard Catalyst Profiles

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Oct 2, 2016
    Cite
    Griffin Weber (2016). Harvard Catalyst Profiles [Dataset]. http://doi.org/10.7910/DVN/SOZSJA
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Oct 2, 2016
    Dataset provided by
    Harvard Dataverse
    Authors
    Griffin Weber
    License

    Custom license: https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.7910/DVN/SOZSJA

    Description

    Harvard Catalyst Profiles is a Semantic Web application, which means its content can be read and understood by other computer programs. This enables the data in profiles, such as addresses and publications, to be shared with other institutions and appear on other websites. If you click the "Export RDF" link on the left sidebar of a profile page, you can see what computer programs see when visiting a profile. The section below describes the technical details for building a computer program that can export data from Harvard Catalyst Profiles. There are four types of application programming interfaces (APIs) in Harvard Catalyst Profiles.
    • RDF crawl. Because Harvard Catalyst Profiles is a Semantic Web application, every profile has both an HTML page and a corresponding RDF document, which contains the data for that page in RDF/XML format. Web crawlers can follow the links embedded within the RDF/XML to access additional content.
    • SPARQL endpoint. SPARQL is a query language that enables arbitrary queries against RDF data. This provides the most flexibility in accessing data; the downsides are the complexity of writing SPARQL queries and performance. In general, the XML Search API (see below) is better to use than SPARQL. However, if you require access to the SPARQL endpoint, please contact Griffin Weber.
    • XML Search API. This web service supports the most common types of queries. It is designed to be easier to use and to offer better performance than SPARQL, at the expense of fewer options. It enables full-text search across all entity types, with faceting, pagination, and sorting options. The request message to the web service is in XML format, but the output is in RDF/XML format. The URL of the XML Search API is https://connects.catalyst.harvard.edu/API/Profiles/Public/Search.
    • Old XML-based web services. These provide backwards compatibility for institutions that built applications against the older version of Harvard Catalyst Profiles. They do not take advantage of many of the new features, and users are encouraged to switch to one of the new APIs. The URL of the old XML web service is https://connects.catalyst.harvard.edu/ProfilesAPI.
    For more information about the APIs, please see the documentation and example files.
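A hedged sketch of composing a request for the XML Search API above. The description says only that the request is XML and the response is RDF/XML; the element names used here (`SearchOptions`, `MatchOptions`, `SearchString`, `OutputOptions`, `MaxRecords`) are illustrative guesses, so check the API's documentation and example files for the real request schema before sending anything to the endpoint.

```python
import xml.etree.ElementTree as ET

# Documented endpoint; the request schema below is a guess, not the real one.
SEARCH_URL = "https://connects.catalyst.harvard.edu/API/Profiles/Public/Search"

def build_search_request(text: str, max_records: int = 25) -> str:
    """Serialize a (hypothetical) full-text search request as XML."""
    root = ET.Element("SearchOptions")
    match = ET.SubElement(root, "MatchOptions")
    ET.SubElement(match, "SearchString").text = text
    output = ET.SubElement(root, "OutputOptions")
    ET.SubElement(output, "MaxRecords").text = str(max_records)
    return ET.tostring(root, encoding="unicode")

request_xml = build_search_request("cardiology")
print(request_xml)
```

A real client would POST `request_xml` to `SEARCH_URL` and parse the RDF/XML response with an RDF library.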

  5. Not seeing a result you expected?
    Learn how you can add new datasets to our index.
