Open Archives Initiative

The Open Archives Initiative (OAI) is an organization to develop and apply technical interoperability standards for archives to share catalog information (metadata).[1] It attempts to build a "low-barrier interoperability framework" for archives (institutional repositories) containing digital content (digital libraries). It allows people (service providers) to harvest metadata (from data providers). This metadata is used to provide "value-added services", often by combining different data sets.

OAI has been involved in developing a technological framework and interoperability standards for enhancing access to eprint archives, which make scholarly communications such as academic journals available and are associated with the open access publishing movement. The relevant technology and standards are applicable beyond scholarly publishing.

The OAI technical infrastructure, specified in the Protocol for Metadata Harvesting (OAI-PMH) version 2.0, defines a mechanism for data providers to expose their metadata. The protocol mandates that individual archives map their metadata to Dublin Core, a common metadata set used for this purpose, so that harvested content always carries item descriptions in a shared format. Object Reuse and Exchange (OAI-ORE) defines standards for the description and exchange of aggregations of web resources.
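The mapping step described above can be illustrated with a minimal sketch. The two namespace URIs are the ones used for unqualified Dublin Core in OAI-PMH; the local field names (`headline`, `author`, `year`, `keywords`) and the mapping table are hypothetical, invented only for this example, and a real archive would define its own crosswalk.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical crosswalk from a local catalog schema to Dublin Core elements.
FIELD_TO_DC = {
    "headline": "title",
    "author": "creator",
    "year": "date",
    "keywords": "subject",
}


def to_dublin_core(local_record):
    """Build an oai_dc:dc element from a local record given as a dict."""
    ET.register_namespace("oai_dc", OAI_DC_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for field, dc_name in FIELD_TO_DC.items():
        value = local_record.get(field)
        if value is None:
            continue  # Every Dublin Core element is optional.
        # Dublin Core elements are repeatable, so lists become repeated elements.
        values = value if isinstance(value, list) else [value]
        for item in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{dc_name}")
            child.text = str(item)
    return root


record = {
    "headline": "A sample eprint",
    "author": ["A. Author", "B. Author"],
    "year": 2002,
}
dc = to_dublin_core(record)
print(ET.tostring(dc, encoding="unicode"))
```

Because every Dublin Core element is optional and repeatable, the sketch skips missing fields and emits one element per list entry rather than joining values into a single string.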

Funding for the initiative comes from the Andrew W. Mellon Foundation, Coalition for Networked Information (CNI), Digital Library Federation (DLF), National Science Foundation (NSF), the Alfred P. Sloan Foundation, and other organizations.[1]

References

  1. Open Archives Initiative, "About OAI".


AIM25

AIM25 is a non-profit collaborative archive project: a single point of networked access to collection-level descriptions of the archives of over one hundred higher education institutions, learned societies and specialist archives within the M25 Greater London area of the United Kingdom. It holds over 7500 collection-level descriptions on subjects including social sciences, politics, social and economic history, women's history and military history. Each description on AIM25 provides a link to ARCHON, which gives contact details of the repository holding that archive.

AIM25 follows ISAD(G) and is interoperable with Encoded Archival Description, Open Archives Initiative and Dublin Core. AIM25 is based at King's College London. Partner institutions update the records for their holdings, and collection-level descriptions are indexed at King's College London using personal-name, corporate-name, place-name and subject thesauri.

AIM25 is freely available and forms part of the UK national network of archives.

The relaunched interface has Web 2.0 features including a tag cloud, RSS feeds and space for the upload of images.

The project was initially funded by the Research Support Libraries Programme.

BASE (search engine)

BASE (Bielefeld Academic Search Engine) is a multi-disciplinary search engine for scholarly internet resources, created by Bielefeld University Library in Bielefeld, Germany. It is based on free and open-source software such as Apache Solr and VuFind. It harvests OAI metadata from institutional repositories and other academic digital libraries that implement the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), and then normalizes and indexes the data for searching. In addition to OAI metadata, the library indexes selected web sites and local data collections, all of which can be searched via a single search interface.

Users can search bibliographic metadata, including abstracts where available. However, BASE does not currently offer full-text search. It differs from commercial search engines in several ways, including the kinds of resources it searches and the information it offers about the results it finds. Results can be narrowed down using drill-down menus (faceted search). Bibliographic data is provided in several formats, and the results may be sorted by multiple fields, such as author or year of publication.

Paying customers include EBSCO Information Services, which integrated BASE into its EBSCO Discovery Service (EDS). Non-commercial services can integrate BASE search for free using an API. BASE is becoming an increasingly important component of open access initiatives concerned with enhancing the visibility of their digital archive collections. On 6 October 2016, BASE surpassed the 100 million documents threshold, having indexed 100,183,705 documents from 4,695 content sources.


CiteSeerx

CiteSeerx (originally called CiteSeer) is a public search engine and digital library for scientific and academic papers, primarily in the fields of computer and information science. CiteSeer holds United States patent 6289342, titled "Autonomous citation indexing and literature browsing using citation context," granted on September 11, 2001. Stephen R. Lawrence, C. Lee Giles, and Kurt D. Bollacker are the inventors of this patent, which is assigned to NEC Laboratories America, Inc. The patent was filed on May 20, 1998, with a priority date of January 5, 1998. A continuation patent on the same invention, US patent 6738780, was filed on May 16, 2001 and granted on May 18, 2004 to the same inventors, and is also assigned to NEC Labs. CiteSeer is considered a predecessor of academic search tools such as Google Scholar and Microsoft Academic Search. CiteSeer-like engines and archives usually only harvest documents from publicly available websites and do not crawl publisher websites. For this reason, authors whose documents are freely available are more likely to be represented in the index.

CiteSeer's goal is to improve the dissemination of and access to academic and scientific literature. As a non-profit service that can be freely used by anyone, it has been considered part of the open access movement that is attempting to change academic and scientific publishing to allow greater access to scientific literature. CiteSeer freely provided Open Archives Initiative metadata of all indexed documents and linked indexed documents, when possible, to other sources of metadata such as DBLP and the ACM Portal. To promote open data, CiteSeerx shares its data for non-commercial purposes under a Creative Commons license. The name can be read in at least two ways. As a pun, a 'sightseer' is a tourist who looks at the sights, so a 'cite seer' would be a researcher who looks at cited papers. Alternatively, a 'seer' is a prophet, so a 'cite seer' is a prophet of citations. CiteSeer changed its name to ResearchIndex at one point and then changed it back.


CogPrints

CogPrints is an electronic archive in which authors can self-archive papers in any area of cognitive science, including psychology, neuroscience, and linguistics, and many areas of computer science (e.g., artificial intelligence, robotics, vision, learning, speech, neural networks), philosophy (e.g., mind, language, knowledge, science, logic), biology (e.g., ethology, behavioral ecology, sociobiology, behaviour genetics, evolutionary theory), medicine (e.g., psychiatry, neurology, human genetics, imaging), anthropology (e.g., primatology, cognitive ethnology, archeology, paleontology), as well as any other portions of the physical, social and mathematical sciences that are pertinent to the study of cognition.

CogPrints is moderated by Stevan Harnad. The archive was launched in 1997 and now contains over 2000 freely downloadable articles.

Some cite CogPrints, along with the physics archive arXiv, as evidence that the author self-archiving model of Open Access can work, although under the influence of the Open Archives Initiative and its OAI-PMH, the emphasis in self-archiving has since moved away from such central repositories in the direction of distributed self-archiving in institutional repositories.

CogPrints was first made OAI-compliant, and then the software was converted into the EPrints software at the University of Southampton by Rob Tansley, who then went on to design DSpace. EPrints is now maintained by Christopher Gutteridge at Southampton.

Dove Medical Press

Dove Medical Press is an academic publisher of open access peer-reviewed scientific and medical journals, with offices in Manchester, London (United Kingdom), Princeton, New Jersey (United States), and Auckland (New Zealand). In September 2017, Dove Medical Press was acquired by the Taylor and Francis Group (Informa PLC). As an open access publisher, Dove charges a publication fee to authors or their institutions or funders. This charge allows Dove to recover its editorial and production costs and to create a pool of funds that can be used to provide fee waivers for authors from less developed countries. Published articles are available via an interface following the Open Archives Initiative Protocol for Metadata Harvesting, a set of uniform standards promulgated by the Open Archives Initiative that allows metadata on archive holdings to be harvested. Dove is a member of the Association of Learned and Professional Society Publishers, the Committee on Publication Ethics, and the Open Archives Initiative. As of September 2016, it publishes over 100 journals.


EPrints

EPrints is a free and open-source software package for building open access repositories that are compliant with the Open Archives Initiative Protocol for Metadata Harvesting. It shares many of the features commonly seen in document management systems, but is primarily used for institutional repositories and scientific journals. EPrints has been developed at the University of Southampton School of Electronics and Computer Science and released under a GPL license. The EPrints software is not to be confused with "eprints" (or "e-prints"), which are preprints (before peer review) and postprints (after peer review) of research journal articles (eprints = preprints + postprints).

Herbert Van de Sompel

Herbert Van de Sompel is a Belgian librarian, computer scientist, and musician, best known for his role in the development of the Open Archives Initiative (OAI) and standards such as OpenURL, Object Reuse and Exchange, and the OAI Protocol for Metadata Harvesting.

Hyper Articles en Ligne

Hyper Articles en Ligne, generally shortened to HAL, is an open archive where authors can deposit scholarly documents from all academic fields. It has a good position in the international web repository ranking. HAL is run by the Centre pour la communication scientifique directe, a French computing centre, which is part of the French National Centre for Scientific Research (CNRS). Other French institutions, such as INRIA, have joined the system. While it is primarily directed towards French academics, participation is not restricted to them.

Documents in HAL are uploaded either by one of the authors with the consent of the others or by an authorized person on their behalf. Since 2017 it has also been possible to use a tool for easy, semi-automated deposit. HAL is a tool for direct scientific communication between academics. A text posted to HAL is normally comparable to a paper that an investigator might submit for publication in a peer-reviewed scientific journal or conference proceedings. A document deposited in HAL is not subjected to any detailed scientific evaluation, only to a rapid overview to ensure that it falls within the category defined above.

An uploaded document does not need to have been published or even to be intended for publication: It may be posted to HAL as long as its scientific content justifies it. But should the article be published, contributors are invited to indicate the relevant bibliographic information and the digital object identifier (DOI).

HAL aims to ensure the long term preservation of the deposited documents that are stored there permanently and will receive a stable web address. Thus, like any publication in a traditional scientific journal, it can be cited in other work.

The free online access to these documents provided by HAL is intended to promote the best possible dissemination of research work; the intellectual property remains that of the authors.

For physics, mathematics and other natural science topics, HAL has an automated depositing agreement with arXiv. A similar agreement exists with PubMed Central for biomedical topics.

Over 120 institutions have their own entrance to HAL, called portals. HAL hosts institutional repositories (for universities, research organizations and units) as well as subject repositories; one example is the Arts and Humanities eprint repository, hprints.

As an open access repository, HAL complies with the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) as well as with the European OpenAIRE project.


OAIster

OAIster is an online combined bibliographic catalogue of open access material aggregated using OAI-PMH. It began at the University of Michigan in 2002, funded by a grant from the Andrew W. Mellon Foundation, with the purpose of establishing a retrieval service for publicly available digital library resources provided by the research library community. During its tenure at the University of Michigan, OAIster grew to become one of the largest aggregations of records pointing to open access collections in the world.

In 2009, OCLC formed a partnership with the University of Michigan to provide continued access to open access collections aggregated in OAIster. Since OCLC began managing OAIster, it has grown to include over 30 million records contributed by over 1,500 organizations. OCLC is evolving OAIster to a model of self-service contribution for all open access digital repositories to ensure the long-term sustainability of this rich collection of open access materials.

OAIster data is harvested from Open Archives Initiative (OAI)-compliant digital libraries, institutional repositories, and online journals using the self-service WorldCat Digital Collection Gateway.

OPUS (software)

OPUS is an open source software package under the GNU General Public License used for creating Open Access repositories that are compliant with the Open Archives Initiative Protocol for Metadata Harvesting. It provides tools for creating collections of digital resources, as well as for their storage and dissemination. It is usually used at universities, libraries and research institutes as a platform for institutional repositories.

Object Reuse and Exchange

The Open Archives Initiative Object Reuse and Exchange (OAI-ORE) defines standards for the description and exchange of aggregations of web resources. The OAI-ORE specification implements the ORE Model which introduces the resource map (ReM) that makes it possible to associate an identity with aggregations of resources and make assertions about their structure and semantics.

These aggregations (sometimes called compound digital objects or compound information objects) may combine distributed resources together, and with multiple media types including text, images, data, and video. The goal of OAI-ORE is to expose the rich content in aggregations to applications that support authoring, deposit, exchange, visualization, reuse, and preservation.
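The relationship between a resource map, the aggregation it describes, and the aggregated resources can be sketched as a set of subject-predicate-object triples. This is a minimal illustration rather than a complete resource map: the `ore:describes` and `ore:aggregates` terms come from the ORE vocabulary, while the function name and all `example.org` URIs are hypothetical, and a real resource map would also carry metadata such as its creator and modification date.

```python
ORE = "http://www.openarchives.org/ore/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def resource_map_triples(rem_uri, aggregation_uri, resource_uris):
    """Return (subject, predicate, object) triples for a bare-bones resource map."""
    triples = [
        (rem_uri, RDF_TYPE, ORE + "ResourceMap"),
        (aggregation_uri, RDF_TYPE, ORE + "Aggregation"),
        # The resource map gives the aggregation an identity by describing it.
        (rem_uri, ORE + "describes", aggregation_uri),
    ]
    # Each constituent resource is linked from the aggregation, not from the map.
    triples += [(aggregation_uri, ORE + "aggregates", uri) for uri in resource_uris]
    return triples


# Hypothetical compound object: an article with its PDF and a data file.
triples = resource_map_triples(
    "https://example.org/rem/article-1",
    "https://example.org/aggregation/article-1",
    [
        "https://example.org/files/article-1.pdf",
        "https://example.org/files/article-1-data.csv",
    ],
)
for triple in triples:
    print(triple)
```

Keeping the resource map and the aggregation under separate URIs mirrors the distinction the text draws: the aggregation is the compound object itself, and the resource map is a document making assertions about it.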

The Andrew W. Mellon Foundation funded two years of work on the OAI-ORE project in 2006-2008. Version 1.0 of the specification was released on 17 October 2008.

Open-access repository

An open-access repository or open archive is a digital platform that holds research output and provides free, immediate and permanent access to research results for anyone to use, download and distribute. To facilitate open access, such repositories must be interoperable according to the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). Search engines harvest the content of open access repositories, constructing a worldwide database of research available free of charge. As opposed to a simple institutional repository or disciplinary repository, open-access repositories provide free access to research for users outside the institutional community and are one of the recommended ways to achieve the open access vision described in the Budapest Open Access Initiative definition of open access. This is sometimes referred to as the self-archiving or "green" route to open access.

Open Archival Information System

An Open Archival Information System (or OAIS) is an archive, consisting of an organization of people and systems, that has accepted the responsibility to preserve information and make it available for a Designated Community.

The term OAIS also refers, by extension, to the ISO OAIS Reference Model for an OAIS. This reference model is defined by recommendation CCSDS 650.0-B-2 of the Consultative Committee for Space Data Systems; this text is identical to ISO 14721:2012. The CCSDS's purview is space agencies, but the OAIS model it developed has proved useful to a wide variety of other organizations and institutions with digital archiving needs.

The information being maintained has been deemed to need "long term preservation", even if the OAIS itself is not permanent. "Long term" is long enough to be concerned with the impacts of changing technologies, including support for new media and data formats, or with a changing user community. "Long term" may extend indefinitely. In this reference model there is a particular focus on digital information, both as the primary forms of information held and as supporting information for both digitally and physically archived materials. Therefore, the model accommodates information that is inherently non-digital (e.g., a physical sample), but the modeling and preservation of such information is not addressed in detail. As strictly a conceptual framework, the OAIS model does not require the use of any particular computing platform, system environment, system design paradigm, system development methodology, database management system, database design paradigm, data definition language, command language, system interface, user interface, technology, or media for an archive to be compliant. Its aim is to set the standard for the activities that are involved in preserving a digital archive rather than the method for carrying out those activities.

The acronym OAIS should not be confused with OAI, which is the Open Archives Initiative.

Open Archives Initiative Protocol for Metadata Harvesting

The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a protocol developed for harvesting metadata descriptions of records in an archive so that services can be built using metadata from many archives. An implementation of OAI-PMH must support representing metadata in Dublin Core, but may also support additional representations. The protocol is usually just referred to as the OAI Protocol.

OAI-PMH uses XML over HTTP. Version 2.0 of the protocol was released in 2002; the document was last updated in 2015. It is released under a Creative Commons BY-SA license.
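As a rough sketch of what "XML over HTTP" means for a harvester, the example below builds a `ListRecords` request URL and parses a response. The base URL `https://repository.example.org/oai`, the helper names, and the sample XML are assumptions made for illustration; the verb, the `metadataPrefix` and `resumptionToken` arguments, and the namespaces follow OAI-PMH 2.0. No network call is made, so the parsing runs against an inline, heavily trimmed response.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build the URL of a ListRecords request."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it replaces all others.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_list_records(xml_text):
    """Return ([(identifier, [titles])], resumption token or None)."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", namespaces=NS)
        titles = [t.text for t in rec.iterfind(".//dc:title", NS)]
        records.append((identifier, titles))
    token = root.findtext(".//oai:resumptionToken", namespaces=NS)
    return records, (token or None)


# Hypothetical, trimmed response; real responses also carry responseDate,
# request, datestamp and other elements.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A sample eprint</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

first_url = list_records_url("https://repository.example.org/oai")
records, token = parse_list_records(SAMPLE)
next_url = list_records_url("https://repository.example.org/oai", resumption_token=token)
print(first_url)
print(records, token)
print(next_url)
```

A harvester would repeat the request with each returned token until the response carries an empty or absent `resumptionToken`, which signals the end of the list.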

Open Research Online

Open Research Online (ORO) is a repository of research publications run by The Open University (OU). It uses the GNU EPrints software, and its repositories use the Open Archives Initiative Protocol for Metadata Harvesting. It is an open access repository and accepts books, journal articles, patents, conference articles, and theses. As of 21 September 2008, 496 of its publications were from the Mathematics and Computing Department of the OU, while over two thousand were from the Science department.


Postprint

In academic publishing, a postprint is a digital draft of a research journal article after it has been peer reviewed. A digital draft before peer review is called a preprint. Jointly, postprints and preprints are called eprints. Expressed in the CrossRef terminology, any draft starting from the author's original version but prior to the accepted version is a preprint, whereas any draft from the accepted version onward, including the version of record or definitive work, is a postprint.

Since the advent of the Open Archives Initiative, preprints and postprints have been deposited in institutional repositories, which are interoperable because they are compliant with the Open Archives Initiative Protocol for Metadata Harvesting.

Eprints are at the heart of the open access initiative to make research freely accessible online. Eprints were first deposited or self-archived on arbitrary websites and then harvested by virtual archives such as CiteSeer (and, more recently, Google Scholar), or they were deposited in central disciplinary archives such as arXiv or PubMed Central.

Research Object

In computing, a Research Object is a method for the identification, aggregation and exchange of scholarly information on the Web. The primary goal of the research object approach is to provide a mechanism to associate related resources about a scientific investigation so that they can be shared together using a single identifier. As such, research objects are an advanced form of enhanced publication. Current implementations build upon existing Web technologies and methods including Linked Data, HTTP, Uniform Resource Identifiers (URIs), the Open Archives Initiative Object Reuse and Exchange (OAI-ORE) and the Open Annotation model, as well as existing approaches for identification and knowledge representation in the scientific domain including Digital Object Identifiers for documents, ORCID identifiers for people, and the Investigation, Study, and Assay (ISA) data model.


ScientificCommons

ScientificCommons is a project of the University of St. Gallen Institute for Media and Communications Management. The major aim of the project is to develop the world’s largest archive of scientific knowledge with full texts freely accessible to the public.

ScientificCommons includes a search engine for publications and author profiles. It also allows the user to turn searches into customized RSS feeds of new publications. ScientificCommons also provides a fulltext caching service for researchers.

Since the beginning of 2013, ScientificCommons has been inaccessible. All visitors are forwarded to an administration login for the server virtualization management software Proxmox VE, and the site no longer presents a valid TLS certificate.

This page is based on a Wikipedia article written by authors (here).
Text is available under the CC BY-SA 3.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.