Published February 21, 2013 | Version v1
Conference paper Open

Use of Solr and Xapian in the Invenio document repository software

Authors/Creators

  • European Organization for Nuclear Research

Description

Invenio is a free, comprehensive, web-based document repository and digital library software suite originally developed at CERN. It serves a variety of use cases, from an institutional repository or digital library to a web journal. To make full-text documents available for efficient search and ranking, Solr was integrated into Invenio through a generic bridge. Solr indexes the extracted full texts and the most relevant metadata, so Invenio can take advantage of Solr's efficient search and word-similarity ranking capabilities. In this paper, we first give an overview of Invenio and its capabilities and features. We then present our open source Solr integration and the scalability challenges that arose for an Invenio-based multi-million record repository, the CERN Document Server. We also compare our Solr adapter to an alternative Xapian adapter that uses the same generic bridge. Both integrations are distributed with the Invenio package and are ready to be used by institutions using or adopting Invenio.
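
The "generic bridge" idea in the description can be illustrated with a minimal sketch. This is not Invenio's actual code or API: the names FulltextBackend, InMemoryBackend and ranked_search are hypothetical, and the toy backend ranks by plain term counts rather than by Solr's or Xapian's real scoring. It only shows how one interface could let a repository swap search engines without changing the calling code.

```python
from abc import ABC, abstractmethod
from collections import Counter
from typing import Dict, List


class FulltextBackend(ABC):
    """Hypothetical bridge interface; a Solr or Xapian adapter would implement it."""

    @abstractmethod
    def index(self, recid: int, fulltext: str) -> None:
        """Store the extracted full text of one record."""

    @abstractmethod
    def search(self, query: str) -> List[int]:
        """Return matching record ids, best match first."""


class InMemoryBackend(FulltextBackend):
    """Toy stand-in for a real adapter; it talks to no external engine."""

    def __init__(self) -> None:
        self._docs: Dict[int, Counter] = {}

    def index(self, recid: int, fulltext: str) -> None:
        # Naive whitespace tokenisation, lower-cased.
        self._docs[recid] = Counter(fulltext.lower().split())

    def search(self, query: str) -> List[int]:
        terms = query.lower().split()
        scores: Dict[int, int] = {}
        for recid, counts in self._docs.items():
            # Score = total occurrences of the query terms in the record.
            score = sum(counts[term] for term in terms)
            if score > 0:
                scores[recid] = score
        # Highest score first; ties are broken by record id.
        return sorted(scores, key=lambda recid: (-scores[recid], recid))


def ranked_search(backend: FulltextBackend, query: str) -> List[int]:
    """Repository-side call that depends only on the bridge interface."""
    return backend.search(query)
```

Because ranked_search depends only on the abstract interface, a second adapter could be substituted for the first without touching the caller, which is the property the description attributes to the bridge.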

Files

arXiv:1310.0250.pdf

Files (1.4 MB)

md5:1dc86d21f1df921182562c8cdb27841b (519.3 kB)
md5:ca6ad26170d62db18ac0697ad6763f46 (846.6 kB)

Additional details

Identifiers

CDS
1634050
CDS Report Number
CERN-IT-2013-006
Aleph number
000735952CER

Related works

Is variant form of
Other: 1256679 (Inspire)
Other: arXiv:1310.0250 (arXiv)
References
Event: 1634051 (CDS)

CERN

Department
IT

Conference

Title
charlottetown20130708

Linked records