Project Gutenberg (PG) is a volunteer effort to digitize and archive cultural works, to "encourage the creation and distribution of eBooks". It was founded in 1971 by American writer Michael S. Hart and is the oldest digital library. Most of the items in its collection are the full texts of public domain books. The project tries to make these as free as possible, in long-lasting, open formats that can be used on almost any computer. As of 23 June 2018, Project Gutenberg had reached 57,000 items in its collection of free eBooks.
The releases are available in plain text but, wherever possible, other formats are included, such as HTML, PDF, EPUB, MOBI, and Plucker. Most releases are in the English language, but many non-English works are also available. There are multiple affiliated projects that are providing additional content, including regional and language-specific works. Project Gutenberg is also closely affiliated with Distributed Proofreaders, an Internet-based community for proofreading scanned texts.
Established: December 1, 1971 (first document posted)
Size: Over 57,000 documents
Project Gutenberg was started by Michael Hart in 1971 with the digitization of the United States Declaration of Independence. Hart, a student at the University of Illinois, obtained access to a Xerox Sigma V mainframe computer in the university's Materials Research Lab. Through friendly operators, he received an account with a virtually unlimited amount of computer time; its value at that time has since been variously estimated at $100,000 or $100,000,000. Hart has said he wanted to "give back" this gift by doing something that could be considered to be of great value. His initial goal was to make the 10,000 most consulted books available to the public at little or no charge, and to do so by the end of the 20th century.
This particular computer was one of the 15 nodes on ARPANET, the computer network that would become the Internet. Hart believed that computers would one day be accessible to the general public and decided to make works of literature available in electronic form for free. He had a copy of the United States Declaration of Independence in his backpack, and this became the first Project Gutenberg e-text. He named the project after Johannes Gutenberg, the fifteenth-century German printer who propelled the movable type printing press revolution.
By the mid-1990s, Hart was running Project Gutenberg from Illinois Benedictine College. More volunteers had joined the effort. All of the text was entered manually until 1989 when image scanners and optical character recognition software improved and became more widely available, which made book scanning more feasible. Hart later came to an arrangement with Carnegie Mellon University, which agreed to administer Project Gutenberg's finances. As the volume of e-texts increased, volunteers began to take over the project's day-to-day operations that Hart had run.
Starting in 2004, an improved online catalog made Project Gutenberg content easier to browse, access and hyperlink. Project Gutenberg is now hosted by ibiblio at the University of North Carolina at Chapel Hill.
Italian volunteer Pietro Di Miceli developed and administered the first Project Gutenberg website and started the development of the Project online Catalog. In his ten years in this role (1994–2004), the Project web pages won a number of awards, often being featured in "best of the Web" listings, and contributing to the project's popularity.
Hart died on 6 September 2011 at his home in Urbana, Illinois at the age of 64.
In 2000, a non-profit corporation, the Project Gutenberg Literary Archive Foundation, Inc. was chartered in Mississippi, United States to handle the project's legal needs. Donations to it are tax-deductible. Long-time Project Gutenberg volunteer Gregory Newby became the foundation's first CEO.
Also in 2000, Charles Franks founded Distributed Proofreaders (DP), which allowed the proofreading of scanned texts to be distributed among many volunteers over the Internet. This effort greatly increased the number and variety of texts being added to Project Gutenberg, as well as making it easier for new volunteers to start contributing. DP became officially affiliated with Project Gutenberg in 2002. As of 2018, the 36,000+ DP-contributed books comprised almost two-thirds of the nearly 57,000 books in Project Gutenberg.
In August 2003, Project Gutenberg created a CD containing approximately 600 of the "best" e-books from the collection. The CD is available for download as an ISO image. When users are unable to download the CD, they can request to have a copy sent to them, free of charge.
In December 2003, a DVD was created containing nearly 10,000 items. At the time, this represented almost the entire collection. In early 2004, the DVD also became available by mail.
In July 2007, a new edition of the DVD was released containing over 17,000 books, and in April 2010, a dual-layer DVD was released, containing nearly 30,000 items.
The majority of the DVDs, and all of the CDs mailed by the project, were recorded on recordable media by volunteers. However, the new dual-layer DVDs were manufactured, as it proved more economical than having volunteers burn them. As of October 2010, the project had mailed approximately 40,000 discs. As of 2017, the delivery of free CDs has been discontinued, though the ISO image is still available for download.
As of August 2015, Project Gutenberg claimed over 57,000 items in its collection, with an average of over 50 new e-books being added each week. These are primarily works of literature from the Western cultural tradition. In addition to literature such as novels, poetry, short stories and drama, Project Gutenberg also has cookbooks, reference works and issues of periodicals. The Project Gutenberg collection also has a few non-text items such as audio files and music-notation files.
Most releases are in English, but there are also significant numbers in many other languages. As of April 2016, the non-English languages most represented are: French, German, Finnish, Dutch, Italian, and Portuguese.
Whenever possible, Gutenberg releases are available in plain text, mainly using US-ASCII character encoding but frequently extended to ISO-8859-1 (needed to represent accented characters in French and the German scharfes S, ß, for example). Besides being copyright-free, a release had to include a plain-text version in Latin characters; this had been a criterion of Michael Hart's since the founding of Project Gutenberg, as he believed this was the format most likely to be readable in the extended future. Out of necessity, this criterion has been extended for the sizable collection of texts in East Asian languages such as Chinese and Japanese now in the collection, where UTF-8 is used instead.
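The progression from US-ASCII to ISO-8859-1 to UTF-8 described above can be illustrated with a minimal Python sketch. The function name `smallest_encoding` and the sample strings are hypothetical, chosen only for illustration; this is not a tool used by Project Gutenberg. It simply tries each encoding in turn and reports the first one that can represent the whole text.

```python
def smallest_encoding(text: str) -> str:
    """Return the first of ASCII, ISO-8859-1, UTF-8 that can encode `text`.

    Hypothetical helper for illustration only.
    """
    for name in ("ascii", "iso-8859-1", "utf-8"):
        try:
            text.encode(name)  # raises UnicodeEncodeError if a character does not fit
            return name
        except UnicodeEncodeError:
            continue
    raise ValueError("text cannot be encoded in any of the listed encodings")


# Plain English fits in ASCII; French accents and the German ß need
# ISO-8859-1; Chinese or Japanese text needs UTF-8.
print(smallest_encoding("Declaration of Independence"))
print(smallest_encoding("Straße"))
print(smallest_encoding("日本語"))
```

Trying the narrowest encoding first mirrors the preference described in the text: ASCII where it suffices, a wider character set only when the language requires it.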
Other formats may be released as well when submitted by volunteers. The most common non-ASCII format is HTML, which allows markup and illustrations to be included. Some project members and users have requested more advanced formats, believing them to be much easier to read. But some formats that are not easily editable, such as PDF, are generally not considered to fit in with the goals of Project Gutenberg. Project Gutenberg also accepts two master formats, from which all other files are generated: customized versions of the Text Encoding Initiative standard (since 2005) and reStructuredText (since 2011).
Michael Hart said in 2004, "The mission of Project Gutenberg is simple: 'To encourage the creation and distribution of ebooks'". His goal was, "to provide as many e-books in as many formats as possible for the entire world to read in as many languages as possible". Likewise, a project slogan is to "break down the bars of ignorance and illiteracy", because its volunteers aim to continue spreading public literacy and appreciation for the literary heritage just as public libraries began to do in the late 19th century.
Project Gutenberg is intentionally decentralized. For example, there is no selection policy dictating what texts to add. Instead, individual volunteers work on what they are interested in, or have available. The Project Gutenberg collection is intended to preserve items for the long term, so they cannot be lost by any one localized accident. In an effort to ensure this, the entire collection is backed up regularly and mirrored on servers in many different locations.
Project Gutenberg is careful to verify the status of its ebooks according to United States copyright law. Material is added to the Project Gutenberg archive only after it has received a copyright clearance, and records of these clearances are saved for future reference. Project Gutenberg does not claim new copyright on titles it publishes. Instead, it encourages their free reproduction and distribution.
Most books in the Project Gutenberg collection are distributed as public domain under United States copyright law. There are also a few copyrighted texts, such as some by science fiction author Cory Doctorow, that Project Gutenberg distributes with permission. These are subject to further restrictions as specified by the copyright holder, although they generally tend to be licensed under Creative Commons.
"Project Gutenberg" is a trademark of the organization, and the mark cannot be used in commercial or modified redistributions of public domain texts from the project. There is no legal impediment to the reselling of works in the public domain if all references to Project Gutenberg are removed, but Gutenberg contributors have questioned the appropriateness of directly and commercially reusing content that has been formatted by volunteers. There have been instances of books being stripped of attribution to the project and sold for profit in the Kindle Store and other booksellers, one being the 1906 book Fox Trapping.
With the U.S. copyright term for works published in 1923 set to expire, those items will be added to the public domain effective January 1, 2019.
As of 28 February 2018, Project Gutenberg is no longer accessible within Germany, in order to comply with a court order obtained by S. Fischer Verlag regarding the works of Heinrich Mann, Thomas Mann and Alfred Döblin. Although the works were in the public domain in the United States, the court recognized the infringement of copyrights still active in Germany, and asserted that the Project Gutenberg website was under German jurisdiction because it hosts content in the German language.
The text files use the format of plain text encoded in UTF-8 and wrapped at 65–70 characters, with paragraphs separated by a double line break. In recent decades, the resulting relatively bland appearance and the lack of a markup possibility have often been perceived as a drawback of this format. Project Gutenberg attempts to address this by making many texts available in HTML, ePub, and PDF versions as well, but faithful to the mission of offering data that is easy to handle with computer code, plain ASCII text remains the most important format, and the ePub version still contains extra line breaks between paragraphs.
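Because the layout described above is so regular (hard-wrapped lines, paragraphs separated by a blank line), it is easy to process with code. The following is a minimal Python sketch, assuming that layout; the function names `split_paragraphs` and `rewrap` and the sample text are hypothetical and not part of any Project Gutenberg tooling.

```python
import re
import textwrap
from typing import List


def split_paragraphs(text: str) -> List[str]:
    """Split hard-wrapped plain text into paragraphs, one string each.

    Assumes paragraphs are separated by at least one blank line.
    """
    text = text.replace("\r\n", "\n")           # normalize Windows line endings
    blocks = re.split(r"\n\s*\n", text.strip())  # a blank line ends a paragraph
    # Join the wrapped lines of each paragraph back into a single line.
    return [" ".join(line.strip() for line in block.split("\n"))
            for block in blocks if block.strip()]


def rewrap(paragraphs: List[str], width: int = 70) -> str:
    """Re-wrap paragraphs to `width` columns, separated by a blank line."""
    return "\n\n".join(textwrap.fill(p, width=width) for p in paragraphs)


sample = (
    "When in the Course of human events,\n"
    "it becomes necessary.\n"
    "\n"
    "We hold these truths\n"
    "to be self-evident."
)
paragraphs = split_paragraphs(sample)
print(paragraphs)
print(rewrap(paragraphs, width=40))
```

Unwrapping first and re-wrapping afterwards lets a reader program pick its own line width instead of the fixed 65–70 columns of the source file.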
In December 1994, Project Gutenberg was criticized by the Text Encoding Initiative for failing to include apparatus (documentation) of the decisions unavoidable in preparing a text, or in some cases, documenting which of several (conflicting) versions of a text has been the one digitized.
The selection of works (and editions) available has been determined by popularity, ease of scanning, being out of copyright, and other factors; this would be difficult to avoid in any crowd-sourced project.
In March 2004, a new initiative was begun by Michael Hart and John S. Guagliardo to provide low-cost intellectual properties. The initial name for this project was Project Gutenberg 2 (PG II), which created controversy among PG volunteers because of the re-use of the project's trademarked name for a commercial venture.
All affiliated projects are independent organizations that share the same ideals and have been given permission to use the Project Gutenberg trademark. They often have a particular national or linguistic focus.
You can view or edit ASCII text using just about every text editor or viewer in the world. [...] Unicode is steadily gaining ground, with at least some support in every major operating system, but we're nowhere near the point where everyone can just open a text based on Unicode and read and edit it.
A Short Biographical Dictionary of English Literature is a collection of biographies of writers by John William Cousin (1849–1910), published in 1910. Most of the entries consist of only one paragraph but some entries, like William Shakespeare's, are quite lengthy. The book was the 5,000th e-book provided by the Distributed Proofreaders project to Project Gutenberg, where it was released on August 21, 2004.
Allan Quatermain
Allan Quatermain is the protagonist of H. Rider Haggard's 1885 novel King Solomon's Mines and its sequels. Allan Quatermain was also the title of a book in this sequence. An English big game hunter and adventurer, in film and television he has been portrayed by Richard Chamberlain, Sean Connery, Cedric Hardwicke, Patrick Swayze and Stewart Granger among others.
Ann Veronica
Ann Veronica is a New Woman novel by H. G. Wells published in 1909.
Ann Veronica describes the rebellion of Ann Veronica Stanley, "a young lady of nearly two-and-twenty", against her middle-class father's stern patriarchal rule. The novel dramatizes the contemporary problem of the New Woman. It is set in Victorian era London and environs, except for an Alpine excursion. Ann Veronica offers vignettes of the Women's suffrage movement in Great Britain and features a chapter inspired by the 1908 attempt of suffragettes to storm Parliament.
Cape Hawke
Cape Hawke (32°12′S 152°34′E) is a coastal headland in Australia on the New South Wales coast, just south of Forster/Tuncurry and within the Booti Booti National Park.
The cape was named by Captain Cook when he passed it on his Endeavour voyage on 12 May 1770, honoring Edward Hawke, who was First Lord of the Admiralty.
Distributed Proofreaders Canada
Distributed Proofreaders Canada (DP Canada) is a volunteer organization that converts books into digital format and releases them as public domain books in formats readable by electronic devices. It was launched in December 2007 and as of 2018 has published about 4,200 books. Books that are released are stored on a book archive called Faded Page. While its focus is on Canadian publications and preserving Canadiana, it also includes books from other countries. It is modelled after Distributed Proofreaders, and performs the same function as similar projects in other parts of the world such as Project Gutenberg in the United States and Project Gutenberg Australia.
Duck
Duck is the common name for a large number of species in the waterfowl family Anatidae which also includes swans and geese. Ducks are divided among several subfamilies in the family Anatidae; they do not represent a monophyletic group (the group of all descendants of a single common ancestral species) but a form taxon, since swans and geese are not considered ducks. Ducks are mostly aquatic birds, mostly smaller than the swans and geese, and may be found in both fresh water and sea water.
Ducks are sometimes confused with several types of unrelated water birds with similar forms, such as loons or divers, grebes, gallinules, and coots.
Elements of art
A work of art can be analyzed by considering a variety of aspects of it individually. These aspects are often called the elements of art. A commonly used list of the main elements includes form, shape, line, color, value, space and texture.
Encyclopædia Britannica, Eleventh Edition
The Encyclopædia Britannica, Eleventh Edition (1910–11) is a 29-volume reference work, an edition of the Encyclopædia Britannica. It was developed during the encyclopaedia's transition from a British to an American publication. Some of its articles were written by the best-known scholars of the time. This edition of the encyclopedia, containing 40,000 entries, is now in the public domain, and many of its articles have been used as a basis for articles in Wikipedia. However, the outdated nature of some of its content makes its use as a source for modern scholarship problematic. Some articles have special value and interest to modern scholars as cultural artifacts of the 19th and early 20th centuries.
His Last Bow
His Last Bow: Some Reminiscences of Sherlock Holmes is a collection of previously published Sherlock Holmes stories by Arthur Conan Doyle, including the titular short story, "His Last Bow. The War Service of Sherlock Holmes" (1917). The collection's first US edition adjusts the anthology's subtitle to Some Later Reminiscences of Sherlock Holmes. All editions contain a brief preface, by "John H. Watson, M.D.", that assures readers that as of the date of publication (1917), Holmes is long retired from his profession of detective but is still alive and well, albeit suffering from a touch of rheumatism.
Hyrrokkin
In Norse mythology, Hyrrokkin ("Fire-Smoked", possibly referring to a dark, shrivelled appearance) is a giantess. She appears to be depicted on one of the surviving stones from the Hunnestad Monument near Marsvinsholm, Sweden, called DR 284.
James Wood (encyclopaedist)
The Reverend James Wood (12 October 1820 – 17 March 1901) was a Scottish editor and Free Church minister. He was born in Leith and studied at the University of Edinburgh, living most of his life in Edinburgh. His admiration for Thomas Carlyle and John Ruskin may have contributed to his failure to secure a ministry. Instead he earned a living as a writer. He translated Auguste Barth's Religions of India and edited Nuttall's Standard Dictionary, The Nuttall Encyclopaedia, Warne's Dictionary of Quotations (later titled Nuttall's Dictionary of Quotations), Bagster & Sons' Helps to the Bible, and a Carlyle School Reader. In 1881 he published anonymously The Strait Gate, and Other Discourses, with a Lecture on Thomas Carlyle, by a Scotch Preacher. He is described by P. J. E. Wilson as "that most conscientious of pedants".
Project Gutenberg Australia
Project Gutenberg Australia, abbreviated as PGA, is an Internet site which was founded in 2001 by Colin Choat. It is a sister site of Project Gutenberg, though there is no formal relationship between the two organizations. The site hosts free ebooks or e-texts which are in the public domain in Australia. Volunteers have prepared and submitted the ebooks.
To complement the extensive amount of original source material available in the form of ebooks, a great deal of information about the history and the exploration of Australia is provided, together with a "Library of Australiana", a list of ebooks available about Australia or written by Australians.
Because of differences between Australian and United States (where Project Gutenberg is based) copyright law, Project Gutenberg Australia contains many works not available in Project Gutenberg, including works by Margaret Mitchell, George Orwell, Ayn Rand, H. P. Lovecraft, Edgar Wallace, S. S. Van Dine and Dylan Thomas.
With the introduction of the U.S.-Australia Free Trade Agreement, works of authors who died after 31 December 1954 will now not enter the public domain in Australia until at least 1 January 2026. However, all such works which were already public domain under Australian law as of the end of 2004 remain in the public domain, and thus continue to be hosted at Project Gutenberg of Australia.
Project Gutenberg Canada
Project Gutenberg Canada, also known as Project Gutenberg of Canada, is a Canadian digital library founded July 1, 2007. Its website allows Canadian residents to create e-texts and download books that are otherwise not in the public domain in other countries.
It is not formally affiliated with the original Project Gutenberg, though both share the common objective of making public domain books available for free to the general public as e-books. Project Gutenberg Canada primarily focuses on works by Canadian authors or about Canada, as well as works in Canadian French. Distributed Proofreaders Canada began contributing ebooks to Project Gutenberg Canada when launched on December 1, 2007.
Rat-catcher
A rat-catcher is a person who practices rat-catching as a professional form of pest control.
Keeping the rat population under control was practiced in Europe to prevent the spread of diseases, most notoriously the Black Plague, and to prevent damage to food supplies.
In modern developed countries, such a professional is otherwise known as a pest control operative or pest technician.
Sidney Lee
Sir Sidney Lee (5 December 1859 – 3 March 1926) was an English biographer, writer and critic.
Slave Narrative Collection
Slave Narratives: A Folk History of Slavery in the United States (often referred to as the WPA Slave Narrative Collection) was a massive compilation of histories by former slaves undertaken by the Federal Writers' Project of the Works Progress Administration from 1936 to 1938. It was the simultaneous effort of state-level branches of FWP in seventeen states, working largely separately from each other. The collections, as works of the US federal government, are in the public domain. The collection has been digitized and is available online. In addition, excerpts have been published by various publishers as printed books or on the Internet. The total collection contains more than 10,000 typed pages, representing more than 2000 interviews. The Library of Congress also has a digitized collection of recordings that were sometimes made during these interviews.
Tarzan the Invincible
Tarzan the Invincible is a novel by American writer Edgar Rice Burroughs, the fourteenth in his series of books about the title character Tarzan. The novel was originally serialized in the magazine Blue Book from October, 1930 through April, 1931 as Tarzan, Guard of the Jungle.
The Head of Kay's
The Head of Kay's is a novel by English author P.G. Wodehouse.
William Smith (lexicographer)
Sir William Smith (20 May 1813 – 7 October 1893) was an English lexicographer. He also made advances in the teaching of Greek and Latin in schools.