DjVu (/ˌdeɪʒɑːˈvuː/ DAY-zhah-VOO, like French "déjà vu") is a computer file format designed primarily to store scanned documents, especially those containing a combination of text, line drawings, indexed color images, and photographs. It uses technologies such as image layer separation of text and background/images, progressive loading, arithmetic coding, and lossy compression for bitonal (monochrome) images. This allows high-quality, readable images to be stored in a minimum of space, so that they can be made available on the web.
DjVu has been promoted as providing smaller files than PDF for most scanned documents. The DjVu developers report that color magazine pages compress to 40–70 kB, black-and-white technical papers compress to 15–40 kB, and ancient manuscripts compress to around 100 kB; a satisfactory JPEG image typically requires 500 kB. Like PDF, DjVu can contain an OCR text layer, making it easy to perform copy and paste and text search operations.
Free creators, manipulators, converters, browser plug-ins, and desktop viewers are available. DjVu is supported by a number of multi-format document viewers and e-book reader software on Linux (Okular, Evince) and Windows (SumatraPDF).
|Internet media type|
|Developed by||AT&T Labs – Research|
|Type of format||Image file formats|
|Open format?||GNU GPLv2 for DjVu Reference Library and DjVuLibre-3.5;|
License grants under the GNU GPL for several patents that cover aspects of the library
Due to its declared higher compression ratio (and thus smaller file size) and the ease of converting large volumes of text into DjVu format, and because it is an open file format, it has been considered superior to PDF. Independent technologist Brewster Kahle in a 2004 talk on IT Conversations discussed the benefits of allowing easier access to DjVu files.
The DjVu library distributed as part of the open-source package DjVuLibre has become the reference implementation for the DjVu format. DjVuLibre has been maintained and updated by the original developers of DjVu since 2002.
The DjVu file format specification has gone through a number of revisions, the most recent being from 2005.
|Support status||Version||Release date||Notes|
|Unsupported||1–19||1996–1999||Developmental versions by AT&T labs preceding the sale of the format to LizardTech.|
|Unsupported||Version 20||April 1999||DjVu version 3. DjVu changed from a single-page format to a multipage format.|
|Older, still supported||Version 21||September 1999||Indirect storage format replaced. The searchable text layer was added.|
|Older, still supported||Version 22||April 2001||Page orientation, color JB2|
|Unsupported||Version 23||July 2002||CID chunk|
|Unsupported||Version 24||February 2003||LTAnno chunk|
|Older, still supported||Version 25||May 2003||NAVM chunk. Support for DjVu bookmarks (outlines) was added. Changes made by Versions 23 and 24 were made obsolete.|
|Current||Version 26||April 2005||Text/line annotations|
The primary usage of the DjVu format has been the electronic distribution of documents with a quality comparable to that of printed documents. As that niche is also the primary usage for PDF, it was inevitable that the two formats would become competitors. It should however be observed that the two formats approach the problem of delivering high resolution documents in very different ways: PDF primarily encodes graphics and text as vectorised data, whereas DjVu primarily encodes them as pixmap images. This means PDF places the burden of rendering the document on the reader, whereas DjVu places that burden on the creator.
For a number of years, significantly overlapping with the period when DjVu was being developed, there were no PDF viewers for free operating systems; a particular stumbling block was the rendering of vectorised fonts, which are essential for combining small file size with high resolution in PDF. Since displaying DjVu was a simpler problem for which free software was available, there were suggestions that the free software movement should employ DjVu instead of PDF for distributing documentation; rendering for creating DjVu is in principle not much different from rendering for a device-specific printer driver, and DjVu can as a last resort be generated from scans of paper media. However, when FreeType 2.0 began to provide rendering of all major vectorised font formats in 2000, that specific advantage of DjVu began to erode.
In the 2000s, with the growth of the world wide web and before widespread adoption of broadband, DjVu was often adopted by digital libraries as their format of choice, thanks to its integration with software like Greenstone and the Internet Archive, browser plugins which allowed advanced online browsing, smaller file size for comparable quality of book scans and other image-heavy documents and support for embedding and searching full text from OCR. Some features such as the thumbnail previews were later integrated in the Internet Archive's BookReader and DjVu browsing was deprecated in its favour as around 2015 some major browsers stopped supporting Java applets and DjVu plugins with them.
The DjVu file format is based on the Interchange File Format (IFF) and is composed of hierarchically organized chunks. The IFF structure is preceded by a 4-byte AT&T magic number. Following is a single FORM chunk with a secondary identifier of either DJVU or DJVM for a single-page or a multi-page document, respectively.
|Chunk identifier||Contained by||Description|
|FORM:DJVU||FORM:DJVM||Describes a single page. Can either be at the root of a document and be a single-page document or referred to from a |
|FORM:DJVM||N/A||Describes a multi-page document. Is the document's root chunk.|
|FORM:DJVI||FORM:DJVM||Contains data shared by multiple pages.|
|INFO||FORM:DJVU||Must be the first chunk. Describes the page width, height, format version, resolution, gamma, and rotation.|
|DIRM||FORM:DJVM||Must be the first chunk. References other |
|NAVM||FORM:DJVM||If present, must immediately follow the |
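The header layout described above can be checked with a few lines of code. The following is a minimal sketch, not the reference implementation: the function name `read_djvu_header` and the returned dictionary keys are hypothetical, and the sketch assumes the usual IFF convention that the chunk identifier is followed by a big-endian 32-bit length.

```python
import struct


def read_djvu_header(data: bytes) -> dict:
    """Parse the magic number and the root FORM chunk header (illustrative only)."""
    if len(data) < 16:
        raise ValueError("too short to hold a DjVu header")
    if data[:4] != b"AT&T":
        raise ValueError("missing AT&T magic number")
    # Assumption: IFF chunk header = 4-byte identifier + big-endian 32-bit length.
    chunk_id, length = struct.unpack(">4sI", data[4:12])
    if chunk_id != b"FORM":
        raise ValueError("root chunk is not FORM")
    secondary = data[12:16]
    kinds = {b"DJVU": "single-page", b"DJVM": "multi-page"}
    if secondary not in kinds:
        raise ValueError("unexpected secondary identifier")
    return {
        "form_length": length,
        "secondary_id": secondary.decode("ascii"),
        "kind": kinds[secondary],
    }


# Hand-built 16-byte header for demonstration; not a complete DjVu file.
sample = b"AT&T" + b"FORM" + struct.pack(">I", 4) + b"DJVM"
header = read_djvu_header(sample)
print(header["kind"])
```

The sketch stops at the root chunk; walking the nested chunks (INFO, DIRM, NAVM and so on) would repeat the same identifier-plus-length read inside the FORM payload.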
DjVu divides a single image into many different images, then compresses them separately. To create a DjVu file, the initial image is first separated into three images: a background image, a foreground image, and a mask image. The background and foreground images are typically lower-resolution color images (e.g., 100 dpi); the mask image is a high-resolution bilevel image (e.g., 300 dpi) and is typically where the text is stored. The background and foreground images are then compressed using a wavelet-based compression algorithm named IW44. The mask image is compressed using a method called JB2 (similar to JBIG2). The JB2 encoding method identifies nearly identical shapes on the page, such as multiple occurrences of a particular character in a given font, style, and size. It compresses the bitmap of each unique shape separately, and then encodes the locations where each shape appears on the page. Thus, instead of compressing a letter "e" in a given font multiple times, it compresses the letter "e" once (as a compressed bit image) and then records every place on the page it occurs.
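The idea of storing each shape once and then recording where it occurs can be illustrated with a toy example. This is only a sketch of the dictionary principle, not the JB2 codec: it matches shapes by exact equality, omits the arithmetic coding stage, and the names `encode_shapes` and `decode_shapes` are hypothetical.

```python
def encode_shapes(glyphs):
    """Split a page into a shape dictionary and a list of placements.

    glyphs: list of (bitmap, x, y), where bitmap is a tuple of row strings.
    """
    dictionary = []   # each unique bitmap is stored once
    index_of = {}     # bitmap -> position in the dictionary
    placements = []   # (shape_index, x, y) for every occurrence
    for bitmap, x, y in glyphs:
        if bitmap not in index_of:
            index_of[bitmap] = len(dictionary)
            dictionary.append(bitmap)
        placements.append((index_of[bitmap], x, y))
    return dictionary, placements


def decode_shapes(dictionary, placements):
    """Rebuild the list of (bitmap, x, y) from the dictionary and placements."""
    return [(dictionary[i], x, y) for i, x, y in placements]


# Two tiny made-up glyphs; "#" is ink, "." is background.
E = ("###", "#..", "###")
T = ("###", ".#.", ".#.")
page = [(E, 0, 0), (T, 4, 0), (E, 8, 0)]

dictionary, placements = encode_shapes(page)
print(len(dictionary), placements)
```

Here the glyph `E` appears twice on the page but is stored only once; the second occurrence costs just one placement record.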
Optionally, these shapes may be mapped to UTF-8 codes (either by hand or potentially by a text recognition system) and stored in the DjVu file. If this mapping exists, it is possible to select and copy text.
Since JBIG2 was based on JB2, both compression methods have the same problems when performing lossy compression. Numbers may be substituted with similarly looking numbers (such as replacing 6 with 8) if the text was scanned at a low resolution prior to lossy compression.
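The substitution risk can be reproduced with a toy matcher. The sketch below is an assumption-laden illustration rather than the matching rule used by JB2 or JBIG2: it treats two equal-sized bitmaps as "the same shape" when they differ in at most `max_diff` pixels, and the names `pixel_difference` and `match_or_add` are hypothetical.

```python
def pixel_difference(a, b):
    """Count differing pixels between two equal-sized bitmaps (tuples of row strings)."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def match_or_add(dictionary, bitmap, max_diff):
    """Return the index of a stored shape within max_diff pixels, adding the bitmap if none."""
    for i, known in enumerate(dictionary):
        if pixel_difference(known, bitmap) <= max_diff:
            return i
    dictionary.append(bitmap)
    return len(dictionary) - 1


# Made-up low-resolution digits: at 3x5 pixels, "6" and "8" differ in one pixel.
SIX = ("###", "#..", "###", "#.#", "###")
EIGHT = ("###", "#.#", "###", "#.#", "###")

strict = []
strict_ids = [match_or_add(strict, EIGHT, 0), match_or_add(strict, SIX, 0)]

lossy = []
lossy_ids = [match_or_add(lossy, EIGHT, 1), match_or_add(lossy, SIX, 1)]

print(strict_ids, lossy_ids)
```

With a tolerance of zero the two digits stay distinct; with a tolerance of one pixel the "6" is mapped onto the stored "8" and would be drawn as an "8" on decoding. At higher scan resolutions the two digits differ in many more pixels, which is why the problem is associated with low-resolution input.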
DjVu is an open file format with patents. The file format specification is published, as well as source code for the reference library. The original authors distribute an open-source implementation named "DjVuLibre" under the GNU General Public License. The rights to the commercial development of the encoding software have been transferred to different companies over the years, including AT&T Corporation, LizardTech, Celartem and Cuminas.
Despite its advantages, DjVu is not widely supported by scanning and viewing software. While viewers can be downloaded, opening DjVu files is not implemented in most operating systems by default.
In 2002, the DjVu file format was chosen by the Internet Archive as a format in which its Million Book Project provides scanned public-domain books online (along with TIFF and PDF). In February 2016, the IA announced that DjVu would no longer be used for new uploads.
ABBYY FineReader is an optical character recognition (OCR) application developed by ABBYY.
The program converts image documents (photos, scans, PDF files) into editable electronic formats, in particular Microsoft Word, Microsoft Excel, Microsoft PowerPoint, Rich Text Format, HTML, PDF/A, searchable PDF, CSV and plain text (TXT) files. Starting with version 11, files can be saved in the DjVu format. Version 14 supports recognition of text in 192 languages and has a built-in spell check for 48 of them.
There are more than 20 million users of ABBYY FineReader worldwide.
ABBYY licenses the FineReader optical character recognition technology to several companies, including Fujitsu, Panasonic, Xerox and Samsung.
Version 12 of the software received an "Excellent" rating from PC Magazine.
Antarctic Floristic Kingdom
The Antarctic Floristic Kingdom, also the Holantarctic Kingdom, is a floristic kingdom. It includes most areas of the world south of 40°S latitude. It was first identified by botanist Ronald Good, and later by Armen Takhtajan. The Antarctic Floristic Kingdom is a classification in phytogeography, different from the Antarctic ecozone classification in biogeography, and from Antarctic flora genera/species classifications in botany.
Chief Justice of the Leeward Islands
The Chief Justice of the Leeward Islands headed the Supreme Court of the Leeward Islands.
The British Leeward Islands was a British colony existing between 1833 and 1960, and consisted of Antigua, Barbuda, the British Virgin Islands, Montserrat, Saint Kitts, Nevis, Anguilla and Dominica (to 1940). Prior to 1871, when the Supreme Court was established, the individual islands had their own courts.
In 1939 the Windward and Leeward Islands Supreme Court and the Windward and Leeward Islands Court of Appeal were established; they were replaced in 1967 by the Eastern Caribbean Supreme Court, which provides both functions.
Comparison of e-book formats
The following is a comparison of e-book formats used to create and publish e-books.
The EPUB format is the most widely supported vendor-independent XML-based (as opposed to PDF) e-book format; that is, it is supported by the largest number of e-Readers, including Amazon Kindle Fire (but not standard Kindle). See table below for details.
Dean of Gloucester
The Dean of Gloucester is the head (primus inter pares: first among equals) and chair of the chapter of canons, the ruling body of Gloucester Cathedral. The dean and chapter are based at the Cathedral Church of St Peter and the Holy and Indivisible Trinity in Gloucester. The cathedral is the mother church of the Diocese of Gloucester and seat of the Bishop of Gloucester. The current dean is Stephen Lake.
Dean of Hereford
The Dean of Hereford is the head (primus inter pares: first among equals) and chair of the chapter of canons, the ruling body of Hereford Cathedral. The dean and chapter are based at the Cathedral Church of Blessed Virgin Mary and St Ethelbert in Hereford. The cathedral is the mother church of the Diocese of Hereford and seat of the Bishop of Hereford. The current dean is Michael Tavinor.
Evince
Evince is a document viewer for PDF, PostScript, DjVu, TIFF, XPS and DVI formats. It was designed for the GNOME desktop environment. The developers of Evince intended to replace the multiple GNOME document viewers with a single and simple application. The Evince motto sums up the project aim: "Simply a Document Viewer". GNOME releases have included Evince since GNOME 2.12 (September 2005). Evince code consists mainly of C, with a small part (the code that interfaces with Poppler) written in C++. A large number of Linux distributions, including Ubuntu, Fedora and Linux Mint, include Evince as the default document viewer.
Evince is free and open-source software subject to the requirements of the GNU General Public License version 2 or later.
The Evince FAQ highlights the meaning of the word "Evince" as "to show or express something clearly".
Geographical Dictionary of the Kingdom of Poland
The Geographical Dictionary of the Kingdom of Poland and other Slavic Countries (Polish: Słownik geograficzny Królestwa Polskiego i innych krajów słowiańskich) is a monumental Polish gazetteer, published 1880–1902 in Warsaw by Filip Sulimierski, Bronisław Chlebowski, Władysław Walewski and others.
List of Deans of St Asaph
This is a list of the deans of St Asaph Cathedral, Wales.
-1357 Llywelyn ap Madog
1357–1376 William Spridlington
1403 Richard Courtenay (afterwards Dean of Wells, 1410)
1463-1492 John Tapton
1511-1542 Fouke Salisbury
1543-1556 Richard Puskyn
1556-c.1558 John Gruffith
c.1559 Maurice Blayne, alias Gruffith
1559 John Lloyd
1560-1587 Hugh Evans
1587-1634 Thomas Banks
1634-before 1654 Andrew Morris
1660-1663 David Lloyd
1663 Humphrey Lloyd
1674-1689 Nicholas Stratford
1689-1696 George Bright
1696-1706 Daniel Price
1706-1731 William Stanley
1731-1751 William Powell
1751-1774 William Herring
1774-1826 William Shipley
1826-1854 Charles Luxmoore
1886-1889 Armitage James
1889-1892 John Owen
1892-1899 Watkin Williams
1899–1910 Shadrach Pryce
1910–1927 Llewelyn Wynne Jones
1927–1938 John Du Buisson
1938–1957 Spencer Ellis
1957–1971 Harold Charles
1971–1992 Raymond Renowden
1993–2001 Kerry Goulstone
2001-2011 Chris Potter
2011-Present Nigel Williams
Léon Bottou
Léon Bottou (born 1965) is a researcher best known for his work in machine learning and data compression. His work presents stochastic gradient descent as a fundamental learning algorithm. He is also one of the main creators of the DjVu image compression technology (together with Yann LeCun and Patrick Haffner), and the maintainer of DjVuLibre, the open source implementation of DjVu. He is the original developer of the Lush programming language.
Pantropical
A pantropical ("all tropics") distribution is one which covers tropical regions of both hemispheres. Examples include the plant genera Acacia and Bacopa. Neotropical is a zoogeographic term that covers a large part of the Americas, roughly from Mexico and the Caribbean southwards (including cold regions in southernmost South America).
Palaeotropical refers to geographical occurrence. For a distribution to be palaeotropical a taxon must occur in tropical regions in the Old World.
According to Takhtajan (1978), the following families have a pantropical distribution:
Annonaceae, Hernandiaceae, Lauraceae, Piperaceae, Urticaceae, Dilleniaceae, Tetrameristaceae, Passifloraceae, Bombacaceae, Euphorbiaceae, Rhizophoraceae, Myrtaceae, Anacardiaceae, Sapindaceae, Malpighiaceae, Proteaceae, Bignoniaceae, Orchidaceae and Arecaceae.
Peter Tait (physicist)
Peter Guthrie Tait FRSE (28 April 1831 – 4 July 1901) was a Scottish mathematical physicist and early pioneer in thermodynamics. He is best known for the mathematical physics textbook Treatise on Natural Philosophy, which he co-wrote with Kelvin, and his early investigations into knot theory.
His work on knot theory contributed to the eventual formation of topology as a mathematical discipline. His name is known in graph theory mainly for Tait's conjecture.
Prophecy in the Seventh-day Adventist Church
Seventh-day Adventists believe that Ellen G. White, one of the church's co-founders, was a prophetess, understood today as an expression of the New Testament spiritual gift of prophecy. Seventh-day Adventists believe that White had the spiritual gift of prophecy, but that her writings are inferior to the Bible, which has ultimate authority. The 28 Fundamentals, the core set of theological beliefs held by the Seventh-day Adventist Church, state that Adventists accept the Bible as their only creed; they can be read online on the website of the Seventh-day Adventist Church. The 18th of the 28 Fundamentals states the Adventist viewpoint on the gift of prophecy:
"One of the gifts of the Holy Spirit is prophecy. This gift is an identifying mark of the remnant church and was manifested in the ministry of Ellen G. White. As the Lord's messenger, her writings are a continuing and authoritative source of truth which provide for the church comfort, guidance, instruction, and correction. They also make clear that the Bible is the standard by which all teaching and experience must be tested. (Joel 2:28, 29; Acts 2:14-21; Heb. 1:1-3; Rev. 12:17; 19:10.)"
According to one church document, "her expositions on any given Bible passage offer an inspired guide to the meaning of texts without exhausting their meaning or preempting the task of exegesis". In other words, White's writings are considered an inspired commentary on Scripture, although Scripture remains ultimately authoritative.
Adventists believe she had the spiritual gift of prophecy as outlined in Revelation 19:10. Her restorationist writings endeavor to showcase the hand of God in Christian history. This cosmic conflict, referred to as the "Great Controversy theme", is foundational to the development of Seventh-day Adventist theology.
Questions on Doctrine
Seventh-day Adventists Answer Questions on Doctrine (generally known by the shortened title Questions on Doctrine, abbreviated QOD) is a book published by the Seventh-day Adventist Church in 1957 to help explain Adventism to conservative Protestants and Evangelicals. The book generated greater acceptance of the Adventist church within the evangelical community, where it had previously been widely regarded as a cult. However, it also proved to be one of the most controversial publications in Adventist history and the release of the book brought prolonged alienation and separation within Adventism and evangelicalism.
Although no authors are listed on the title of the book (credit is given to "a representative group" of Adventist "leaders, Bible teachers and editors"), the primary contributors to the book were Le Roy Edwin Froom, Walter E. Read, and Roy Allan Anderson (sometimes referred to as "FREDA").
In Adventist culture, the phrase Questions on Doctrine has come to encompass not only the book itself but also the history leading up to its publication and the prolonged theological controversy which it sparked. This article covers all of these facets of the book's history and legacy.
STDU Explorer
STDU Explorer is a file manager for previewing and managing PDF, DjVu, Comic Book Archive (CBR or CBZ), XPS and image file formats such as BMP, GIF, JPEG, PNG, PSD and WMF. It works under Microsoft Windows, and is free for non-commercial use.
STDU Viewer
STDU Viewer is computer software, a compact viewer for many computer file formats: Portable Document Format (PDF), World Wide Fund for Nature (WWF), DjVu, comic book archive (CBR or CBZ), FB2, ePUB, XML Paper Specification (XPS), Text Compression for Reader (TCR), Mobipocket (MOBI), AZW, multi-page TIFF, text file (TXT), PalmDoc (PDB), Enhanced Metafile (EMF), Windows Metafile (WMF), bitmap (BMP), Graphics Interchange Format (GIF), JPEG-JPG, Portable Network Graphics (PNG), Photoshop Document (PSD), PiCture eXchange (PCX-DCX). It works under Microsoft Windows, and is free for non-commercial use. STDU Viewer is developed in the programming language C++.
Sumatra PDF
Sumatra PDF is a free and open-source document viewer that supports many document formats including: Portable Document Format (PDF), Microsoft Compiled HTML Help (CHM), DjVu, EPUB, FictionBook (FB2), MOBI, PRC, Open XML Paper Specification (OpenXPS, OXPS, XPS), and Comic Book Archive file (CB7, CBR, CBT, CBZ). If Ghostscript is installed, it supports PostScript files. It is developed exclusively for Microsoft Windows, but it can run under Linux using Wine.
Timarit.is
Timarit.is (also known as Tímarit.is, Tidarrit.fo and Aviisitoqqat.gl) is an open access digital library run by the National and University Library of Iceland which hosts digital editions of newspapers and magazines published in Iceland, Faroe Islands and Greenland as well as publications in their languages elsewhere, such as Canada which had a large influx of Icelanders in the late 19th and early 20th centuries. The project was initially sponsored by the West Nordic Council and launched its web interface under the title VESTNORD in 2002. The web interface has since undergone two major revisions, in 2003 and 2008. With the last revision a decision was made to gradually convert images from the DjVu image format to the more common PDF. Hence, part of the collection can be viewed with the DjVu plugin and part with a PDF reader.
The digital collection covers material from the 17th century to the early 21st century. Users can collect bookmarks on a free account for ease of use and run a text search on the majority of the collection. As of February 2009 there were more than 2.6 million images in the archive, of which 2 million had been OCRed.
Initially the aim was to limit access to newspapers published before 1930 to avoid questions of copyright, but shortly afterwards the project made an agreement with Morgunblaðið to scan and publish all of their issues to the year 2000. This agreement was followed with others involving both current and defunct newspapers published in the 20th century. Newspapers published after 2000 are usually sent to the library in digital format. The general rule, depending on agreements with each publisher, is to make these available 2–3 years after their initial publication.
Zathura (document viewer)
Zathura is a free, plugin-based document viewer. Plugins are available for PDF (via poppler or MuPDF), PostScript, DjVu, and EPUB. It was written to be lightweight and controlled with vi-like keybindings. Zathura's customizability makes it well-liked by many Linux users. Zathura has a mature, well-established codebase and a large development team. It has official packages available in Arch Linux, Debian, Fedora, Gentoo, OpenBSD, OpenSUSE, Source Mage GNU/Linux and Ubuntu, and an unofficial macOS package provided by MacPorts. Zathura was named after the 2005 film Zathura: A Space Adventure.