This dataset includes the contents of the Répertoire du Catalogue collectif de France (CCFr) for its “Bibliothèques” part.
Description of the Directory of Libraries
More than 5,000 French libraries are registered in the Directory of the CCFr, including:
* All municipal libraries in cities with more than 10,000 inhabitants,
* All French university libraries (updated with data from the Directory of Sudoc Resource Centres),
* All libraries reporting collections in the CCFr.
The Directory also lists libraries and documentation centres of all types and sizes when they request it. Each entry includes both practical and scientific information (address, opening hours, access conditions, web address of the catalogue and/or library, services offered, description of collections, ...) for users and professionals.
For more information on the Directory of Libraries:
The RCR number
Each library is identified by its RCR number (9 characters), assigned by the Bibliographic Agency for Higher Education (ABES) and composed as follows:
* the first two digits correspond to the department number,
* the following three digits correspond to the INSEE code of the municipality,
* the following two digits correspond to the type of library,
* the last two form a sequential number.
The RCR number is widely used within the CCFr to locate the collections or documents described in the Heritage Base, in the General Catalogue of Manuscripts (CGM) or in the Repertoire of French Literary Manuscripts of the 20th Century (Palme).
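The RCR layout described above (2 + 3 + 2 + 2 characters) can be sketched as a small splitter. This is only an illustration: the function name `parse_rcr`, the dictionary keys and the sample value are assumptions made for the example, and the sample value has not been checked against the Directory.

```python
def parse_rcr(rcr: str) -> dict:
    """Split a 9-character RCR number into the four parts listed above."""
    if len(rcr) != 9:
        raise ValueError(f"expected 9 characters, got {len(rcr)}")
    return {
        "department": rcr[0:2],    # first two characters: department number
        "municipality": rcr[2:5],  # next three: INSEE code of the municipality
        "library_type": rcr[5:7],  # next two: type of library
        "sequence": rcr[7:9],      # last two: sequential number
    }


# Illustrative value only (hypothetical, not taken from the dataset).
parts = parse_rcr("751052116")
print(parts["department"], parts["municipality"], parts["library_type"], parts["sequence"])
```

The sketch does not validate the individual parts, since the description above only fixes their positions and lengths.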
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Study and population characteristics.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This package contains all output data and documentation associated with a meta-analysis study to understand semantic search issues and mechanisms when applied to searching for data, papers or articles. The analysis was conducted on the entire contents of 9 digital libraries (ACM Digital Library, arXiv, Engineering Village, IEEE Xplore, SBC OpenLib, Springer Link, Scopus, Wiley Online Library and Web of Science), covering hundreds of thousands of documents.
Comprehensive dataset of 30 Public libraries in Santa Cruz de Tenerife, Spain as of June, 2025. Includes verified contact information (email, phone), geocoded addresses, customer ratings, reviews, business categories, and operational details. Perfect for market research, lead generation, competitive analysis, and business intelligence. Download a complimentary sample to evaluate data quality and completeness.
This dataset contains an extract of the K10plus library union catalog with its subject indexing data:
K10plus is a union catalog of German libraries, run by library service centers BSZ and VZG since 2019. The catalog contains bibliographic data of the majority of academic libraries in Germany. The core data of K10plus is made available as OpenData via APIs and in form of database dumps. More information can be found here:
The data is provided in its raw internal format, called PICA+, to avoid losing information during conversion. In particular, the data is given in PICA Normalized Format with one record per line. Each record consists of a list of fields and each field consists of a list of subfields.
The data can best be processed with the command line tools pica-rs or picadata. A detailed description of the PICA format and its processing is given in the German textbook Einführung in die Verarbeitung von PICA-Daten.
For visual inspection, PICA Normalized Format is best converted into PICA Plain Format (pica-rs command pica print). The following example record contains seven fields:
003@ $0010003231
044K $9106080474$VTsv1$7gnd/4077343-7$3209204761$aSekte
044N $aReligionsgemeinschaft
045E $a12
045F $a291
045Q/01 $9181570408$VTkv$a11.97$jNeue religiöse Bewegungen$jSekten
045R $91270641751$VTkv$7rvk/11410:$3200641751$aBG 9600$jAllgemeines$kTheologie und Religionswissenschaften$kFundamentaltheologie$kKirche und Kirchen$kFreikirchen und Sekten
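The example record above can be read into fields and subfields with a few lines of Python. This is a minimal sketch, not a replacement for pica-rs or picadata: the function name is made up for the example, and the handling of `$$` as an escaped dollar sign inside a value is an assumption about PICA Plain Format.

```python
import re

# One subfield: "$", a one-character code, then the value up to the next "$".
# Assumption: a literal "$" inside a value is written as "$$".
SUBFIELD = re.compile(r"\$([A-Za-z0-9])((?:[^$]|\$\$)*)")


def parse_pica_plain_line(line: str):
    """Return (tag, [(code, value), ...]) for one PICA Plain field line."""
    tag, _, rest = line.strip().partition(" ")
    subfields = [
        (m.group(1), m.group(2).replace("$$", "$"))
        for m in SUBFIELD.finditer(rest)
    ]
    return tag, subfields


record = [
    "003@ $0010003231",
    "044K $9106080474$VTsv1$7gnd/4077343-7$3209204761$aSekte",
    "045Q/01 $9181570408$VTkv$a11.97$jNeue religiöse Bewegungen$jSekten",
]
for line in record:
    print(parse_pica_plain_line(line))
```

The tag keeps its occurrence suffix (`045Q/01`), and repeated subfield codes such as `$j` are preserved in order because subfields are returned as a list rather than a dictionary.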
Each K10plus record is uniquely identified by its record identifier PPN, given in field 003@ subfield $0. The PPN can be used:
The data is limited to records having at least one holding by a library participating in K10plus. Records are provided with “offline expansion” (some subfields have been added automatically to facilitate re-use of the data) and limited to the following fields:
003@: internal record identifier “PPN” in subfield $0
041A: keywords
044.: all subject indexing fields starting with 044
045.: all subject indexing fields starting with 045
144Z: local library keywords
145S: local library classification
145Z: local library classification
Documentation of the fields can be found at https://format.k10plus.de/k10plushelp.pl?cmd=pplist&katalog=Standard#titel
The current dump contains 41,786,820 records with subject indexing out of 71,429,482 K10plus records in total.
For reference, the dump has been created from a full dump of K10plus with this chain of commands:
pica filter 003@? --reduce 003@.0,044.,045.,144Z,145S,145Z kxp-catalog-full_2022-06-30_*.dat | grep -Pv '^003@..0[0-9X]+.$' > kxp-subjects_2021-06-30.dat
The file was then split into chunks of 5,000,000 records each:
split -l 5000000 --numeric-suffixes=01 kxp-subjects_2021-06-30.dat kxp-subjects_2021-06-30_
Processing examples
Extract CSV file of PPN and RVK-Notation:
pica filter '045R?' kxp-subjects_2021-06-30.dat | pica select '003@$0,045Ra'
Get a list of PPN of records having RVK but not BK:
pica filter '045R? & !045Q/01' kxp-subjects_2021-06-30.dat | pica select '003@$0'
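For readers without pica-rs at hand, the second filter above (RVK present, BK absent) can be approximated in plain Python. The sketch assumes that PICA Normalized Format terminates each field with byte 0x1E and starts each subfield with byte 0x1F, with one record per line; all function names are hypothetical, and the test is a simple tag-presence check, not a reimplementation of the pica-rs filter language.

```python
FIELD_END = "\x1e"       # assumed end-of-field character
SUBFIELD_START = "\x1f"  # assumed subfield indicator


def parse_normalized_record(line: str):
    """Parse one line of PICA Normalized Format into [(tag, [(code, value)])]."""
    fields = []
    for raw in line.rstrip("\n").split(FIELD_END):
        if not raw:
            continue
        head, *subs = raw.split(SUBFIELD_START)
        fields.append((head.strip(), [(s[0], s[1:]) for s in subs if s]))
    return fields


def get_ppn(fields):
    """Return the value of field 003@ subfield $0, or None if missing."""
    for tag, subs in fields:
        if tag == "003@":
            for code, value in subs:
                if code == "0":
                    return value
    return None


def has_rvk_but_not_bk(fields) -> bool:
    """True if the record has an RVK field (045R) but no BK field (045Q/01)."""
    tags = {tag for tag, _ in fields}
    return "045R" in tags and "045Q/01" not in tags


# Hand-built sample line; real input would come from a dump file opened in text mode.
sample = "003@ \x1f0010003231\x1e045R \x1faBG 9600\x1e\n"
fields = parse_normalized_record(sample)
if has_rvk_but_not_bk(fields):
    print(get_ppn(fields))
```

For the full dump, iterating over the file line by line keeps memory use flat, since each record is independent.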
License
This dataset contains normalized subject indexing data of K10plus library union catalog. It includes links between bibliographic records in K10plus and concepts (subjects or classes) from controlled vocabularies:
kxp-subjects_2022-06-30.tsv.gz: TSV format
kxp-subjects_2022-06-30.nt.gz: RDF format (in form of NTriples)
vocabularies.json: information about vocabularies
K10plus
K10plus is a union catalog of German libraries, run by library service centers BSZ and VZG since 2019. The catalog contains bibliographic data of the majority of academic libraries in Germany. Bibliographic records in K10plus are uniquely identified by a PPN identifier.
Several APIs exist to retrieve more data for a record via its PPN, e.g. link into K10plus OPAC:
https://opac.k10plus.de/PPNSET?PPN={PPN}
Retrieve full record in MARC/XML format:
https://unapi.k10plus.de/?format=marcxml&id=opac-de-627:ppn:{PPN}
Get formatted citation for display:
https://ws.gbv.de/suggest/csl2?citationstyle=ieee&language=en&database=opac-de-627&query=pica.ppn=${PPN}
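The three URL templates above can be filled in from a PPN with a small helper. This is a sketch that only builds strings and performs no network request; the function name and dictionary keys are invented for the example, and the `${PPN}` in the citation template is read here as the same placeholder as `{PPN}` in the other two, which is an assumption.

```python
def ppn_links(ppn: str) -> dict:
    """Build the OPAC, MARC/XML and citation URLs for one K10plus PPN."""
    return {
        "opac": f"https://opac.k10plus.de/PPNSET?PPN={ppn}",
        "marcxml": (
            "https://unapi.k10plus.de/?format=marcxml"
            f"&id=opac-de-627:ppn:{ppn}"
        ),
        "citation": (
            "https://ws.gbv.de/suggest/csl2?citationstyle=ieee&language=en"
            f"&database=opac-de-627&query=pica.ppn={ppn}"
        ),
    }


for name, url in ppn_links("0010000011").items():
    print(name, url)
```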
APIs to look up more data from a notation or identifier of a vocabulary can be found in https://bartoc.org/. For instance, BK class 58.55 can be retrieved via the DANTE API:
https://api.dante.gbv.de/data?uri=http%3A%2F%2Furi.gbv.de%2Fterminology%2Fbk%2F58.55
See vocabularies.json for the mapping of vocabulary symbols to BARTOC URIs and additional information.
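The DANTE request shown above passes the concept URI as a percent-encoded query parameter. A minimal sketch of that encoding step, using only the Python standard library (the helper name `dante_url` is an assumption, and no request is sent):

```python
from urllib.parse import quote


def dante_url(concept_uri: str) -> str:
    """Return the DANTE data URL for a concept URI, percent-encoding the URI."""
    # safe="" makes quote() encode ":" and "/" as well.
    return "https://api.dante.gbv.de/data?uri=" + quote(concept_uri, safe="")


print(dante_url("http://uri.gbv.de/terminology/bk/58.55"))
```

Encoding the whole URI matters because an unescaped `/` or `:` inside the parameter value could otherwise be misread as part of the request path or scheme.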
Statistics
The TSV dataset contains 24,009,936 records and 82,937,252 links to concepts.
Number of concepts per vocabulary:
asb 5340
stw 104118
nlm 129289
ssd 153242
kab 159543
sfb 432141
sdnb 4593798
lcc 5232208
ddc 9248794
rvk 10172838
bk 13321229
gnd 39384712
Number of RDF Triples: 82,937,252
TSV
The .tsv file contains three tab-separated columns:
An example:
0010000011 bk 58.55
0010000011 gnd 4036582-7
Record 0010000011 is indexed with class 58.55 from Basic Classification and with authority record 4036582-7 from Integrated authority file.
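A per-vocabulary count like the one in the Statistics section can be sketched from the TSV rows as follows. The column order (PPN, vocabulary symbol, notation or identifier) is inferred from the example rows above, and the function name is made up for the illustration.

```python
import io
from collections import Counter


def count_links_per_vocabulary(lines) -> Counter:
    """Count rows per vocabulary symbol (second column) in a TSV line iterator."""
    counts = Counter()
    for line in lines:
        columns = line.rstrip("\n").split("\t")
        if len(columns) == 3:  # skip blank or malformed lines
            counts[columns[1]] += 1
    return counts


# Two-row sample taken from the example above.
sample = io.StringIO("0010000011\tbk\t58.55\n0010000011\tgnd\t4036582-7\n")
print(count_links_per_vocabulary(sample))
```

For the compressed file, the same function accepts the file object returned by `gzip.open(path, "rt", encoding="utf-8")`, since it only iterates over text lines.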
RDF
The NTriples file contains the same information as given in TSV file but identifiers are mapped to URIs. An example:
<http://uri.gbv.de/document/opac-de-627:ppn:0010000011> <http://purl.org/dc/terms/subject> <http://d-nb.info/gnd/4036582-7> .
<http://uri.gbv.de/document/opac-de-627:ppn:0010000011> <http://purl.org/dc/terms/subject> <http://uri.gbv.de/terminology/bk/58.55> .
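The mapping from a TSV row to an NTriples line can be sketched from the two examples above. Only the `bk` and `gnd` namespaces are covered, because those are the ones visible in the examples; other vocabularies would need their URI prefixes from vocabularies.json. The function name is hypothetical, and notations that contain spaces or other characters not allowed in URIs are not handled.

```python
DOCUMENT_PREFIX = "http://uri.gbv.de/document/opac-de-627:ppn:"
SUBJECT_PROPERTY = "http://purl.org/dc/terms/subject"

# Namespaces taken from the two example triples above.
NAMESPACES = {
    "bk": "http://uri.gbv.de/terminology/bk/",
    "gnd": "http://d-nb.info/gnd/",
}


def to_ntriple(ppn: str, vocabulary: str, notation: str):
    """Return one NTriples line for a TSV row, or None for unknown vocabularies."""
    namespace = NAMESPACES.get(vocabulary)
    if namespace is None:
        return None
    return f"<{DOCUMENT_PREFIX}{ppn}> <{SUBJECT_PROPERTY}> <{namespace}{notation}> ."


print(to_ntriple("0010000011", "gnd", "4036582-7"))
print(to_ntriple("0010000011", "bk", "58.55"))
```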
Changelog
License and provenance
All data is public domain but references are welcome. See https://coli-conc.gbv.de/ for related projects and documentation.
The data has been derived from a larger dataset of all subject indexing data, published at https://doi.org/10.5281/zenodo.6817455.
This dataset has been created with public scripts from git repository https://github.com/gbv/k10plus-subjects. Comments and feature requests are welcome!
Comprehensive dataset of 5 Children's libraries in State of Rio de Janeiro, Brazil as of June, 2025. Includes verified contact information (email, phone), geocoded addresses, customer ratings, reviews, business categories, and operational details. Perfect for market research, lead generation, competitive analysis, and business intelligence. Download a complimentary sample to evaluate data quality and completeness.
data.bnf.fr gathers data from the different databases of the Bibliothèque nationale de France, so as to create Web pages about Works, Authors, Subjects and Places together with a RDF view on the extracted data.
There are links to id.loc.gov for languages and nationalities, to dewey.info for subjects, and to DCMI type for types.
The authors and subjects are matched to Agrovoc, Geonames, DBpedia, Wikipedia, VIAF, Stitch, Dewey, ISNI, the Library of Congress, and the German national library.
We use SKOS, FOAF, DC and RDA vocabularies, in a FRBR model.
This dataset is a spectroscopic library structured as a document database. It contains Raman and ATR-FTIR spectra of weathered and biofouled polymers.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The University of Minnesota Insect Collection’s mission is to explore, describe, and preserve representative specimens of Earth’s remarkable diversity of insects and to make these specimens available to the global community for research and education. Contributions to the collection began in 1879 with specimens of insects and spiders from the North Shore of Lake Superior. During the last 140 years, the collection’s holdings have grown from a regional collection of 3,000 specimens to a major national and international resource of more than 4.06 million specimens. The collection is one of the largest university-affiliated insect collections in North America. Enhancing the collection’s status are 5 resident systematists, computerized inventory management and specimen databases, and the large and historically important affiliated University of Minnesota Natural Resources Library. Current and past National Science Foundation grants have made significant progress in digitizing the collection’s specimen holdings. Research projects associated with the collection have broad taxonomic and geographic scope. Faculty and graduate student research focuses on both aquatic and terrestrial insect groups and includes taxonomic, phylogenetic, and applied questions. The collection is the mainstay of graduate training in systematic entomology at the University of Minnesota.
This dataset consists of bibliographic records extracted from the Base patrimoine of the Catalogue collectif de France (CCFr). It was produced on the occasion of the study day “Naissance et essor du roman anglophone en France” on 11 April 2019. It is dedicated to the study of English-language holdings and collections that may be of interest to research and are conserved by CCFr’s partner municipal libraries.
### Dataset content
This dataset consists of 16 files of bibliographic records, each corresponding to a selection linked to an author, a provenance or, more broadly, the English works present in a particular collection, that is, one originally assembled by a collector. The dataset of the Dijon library, dedicated to rare editions of, or parodying, Laurence Sterne, is accompanied by the photographic file of the copies. The content of each file is described more precisely in the “Dataset Presentation” document.
### Production context
This dataset has been extracted from the Heritage Base of the Catalogue collectif de France (CCFr), whose purpose is to report works preserved by municipal libraries in old, local, thematic and/or bequeathed collections. The complete database is searchable online on the CCFr via a specific search interface. The dataset does not take into account the 1000 record limit set for the results lists.
### Dataset format
The dataset is delivered in the form of:
* A raw UNIMARC export containing all the information and codes of the source data (including collection and library codes). The mrk format, which can be opened with the MarcEdit software or a word processor, was preferred over the ISO2709 exchange format.
* A simplified export as an Excel spreadsheet, containing the main information useful for identifying the edition and locating the copy.
### API and related datasets
This dataset is an extraction made from the data of the Heritage Base, which is also made available on this site. It is related to the Library Description Sheets (CCFr Directory) and Fund Description Sheets (CCFr Directory), which provide a detailed description of the institutions and holdings referred to in the bibliographic data.
### To learn more
on the Heritage Base: https://www.bnf.fr/fr/la-base-patrimoine
on the UNIMARC format: https://www.transition-bibliographique.fr/systemes-et-donnees/manuel-unimarc-format-bibliographique/
on RCR numbers: http://documentation.abes.fr/sudoc/manuels/administration/gestion_bibliotheques/index.html#IlnRcr
on MarcEdit: https://marcedit.reeset.net/