Tag (metadata)

In information systems, a tag is a keyword or term assigned to a piece of information (such as an Internet bookmark, digital image, database record, or computer file). This kind of metadata helps describe an item and allows it to be found again by browsing or searching.[1] Tags are generally chosen informally and personally by the item's creator or by its viewer, depending on the system, although they may also be chosen from a controlled vocabulary.[2]:68

Tagging was popularized by websites associated with Web 2.0 and is an important feature of many Web 2.0 services.[2][3] It is now also part of other database systems, desktop applications, and operating systems.[4]

A tag cloud with terms related to Web 2.0


People use tags to aid classification, mark ownership, note boundaries, and indicate online identity. Tags may take the form of words, images, or other identifying marks. An analogous example of tags in the physical world is museum object tagging. People were using textual keywords to classify information and objects long before computers. Computer-based search algorithms made the use of such keywords a rapid way of exploring records.

Tagging gained popularity due to the growth of social bookmarking, image sharing, and social networking websites.[2] These sites allow users to create and manage labels (or "tags") that categorize content using simple keywords. Websites that include tags often display collections of tags as tag clouds,[5] as do some desktop applications.[6] On websites that aggregate the tags of all users, an individual user's tags can be useful both to them and to the larger community of the website's users.
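As an illustration of how a tag cloud might be drawn, the sketch below scales each tag's font size by how often the tag is used. The text does not specify a scaling rule, so the linear mapping, the function name tag_cloud_sizes, the point sizes, and the sample counts are all assumptions made for the example.

```python
def tag_cloud_sizes(counts, min_size=10, max_size=32):
    """Map tag usage counts to font sizes by linear interpolation."""
    if not counts:
        return {}
    low, high = min(counts.values()), max(counts.values())
    span = high - low
    sizes = {}
    for tag, count in counts.items():
        # When every tag is equally common, fall back to the smallest size.
        weight = (count - low) / span if span else 0.0
        sizes[tag] = round(min_size + weight * (max_size - min_size))
    return sizes


# Hypothetical usage counts, for illustration only.
sizes = tag_cloud_sizes({"web2.0": 50, "folksonomy": 20, "rss": 5})
print(sizes)
```

Real tag clouds often use logarithmic rather than linear scaling so that a few very popular tags do not dwarf the rest; the linear version is shown only because it is the simplest to follow.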

Tagging systems have sometimes been classified into two kinds: top-down and bottom-up.[3]:142[4]:24 Top-down taxonomies are created by an authorized group of designers (sometimes in the form of a controlled vocabulary), whereas bottom-up taxonomies (called folksonomies) are created by all users.[3]:142 This definition of "top down" and "bottom up" should not be confused with the distinction between a single hierarchical tree structure (in which there is one correct way to classify each item) versus multiple non-hierarchical sets (in which there are multiple ways to classify an item); the structure of both top-down and bottom-up taxonomies may be either hierarchical, non-hierarchical, or a combination of both.[3]:142–143 Some researchers and applications have experimented with combining hierarchical and non-hierarchical tagging to aid in information retrieval.[7][8][9] Others are combining top-down and bottom-up tagging,[10] including in some large library catalogs (OPACs) such as WorldCat.[11][12]:74[13][14]

When tags or other taxonomies have further properties (or semantics) such as relationships and attributes, they constitute an ontology.[3]:56–62

Metadata tags as described in this article should not be confused with the use of the word "tag" in some software to refer to an automatically generated cross-reference; examples of the latter are tags tables in Emacs[15] and smart tags in Microsoft Office.[16]


The use of keywords as part of an identification and classification system long predates computers. Paper data storage devices, notably edge-notched cards, that permitted classification and sorting by multiple criteria were already in use prior to the twentieth century, and faceted classification has been used by libraries since the 1930s.

In the late 1970s and early 1980s, the Unix text editor Emacs offered a companion software program called Tags that could automatically build a table of cross-references called a tags table that Emacs could use to jump between a function call and that function's definition.[17] This use of the word "tag" did not refer to metadata tags, but was an early use of the word "tag" in software to refer to a word index.

Online databases and early websites deployed keyword tags as a way for publishers to help users find content. In the early days of the World Wide Web, the keywords meta element was used by web designers to tell web search engines what the web page was about, but these keywords were only visible in a web page's source code and were not modifiable by users.

"A Description of the Equator and Some ØtherLands", collaborative hypercinema portal, produced by documenta X, 1997. User upload page associating user contributed media with the term Tag.

In 1997, the collaborative portal "A Description of the Equator and Some ØtherLands" produced by documenta X, Germany, used the folksonomic term Tag for its co-authors and guest authors on its Upload page.[18] In "The Equator" the term Tag for user-input was described as an abstract literal or keyword to aid the user. However, users defined singular Tags, and did not share Tags at that point.

In 2003, the social bookmarking website Delicious provided a way for its users to add "tags" to their bookmarks (as a way to help find them later);[2]:162 Delicious also provided browseable aggregated views of the bookmarks of all users featuring a particular tag.[19] Within a couple of years, the photo sharing website Flickr allowed its users to add their own text tags to each of their pictures, constructing flexible and easy metadata that made the pictures highly searchable.[20] The success of Flickr and the influence of Delicious popularized the concept,[21] and other social software websites—such as YouTube, Technorati, and Last.fm—also implemented tagging.[22] In 2005, the Atom web syndication standard provided a "category" element for inserting subject categories into web feeds, and in 2007 Tim Bray proposed a "tag" URN.[23]


Within a blog

Many blog systems (and other web content management systems) allow authors to add free-form tags to a post, along with (or instead of) placing the post into a predetermined category.[5] For example, a post may display that it has been tagged with baseball and tickets. Each of those tags is usually a web link leading to an index page listing all of the posts associated with that tag. The blog may have a sidebar listing all the tags in use on that blog, with each tag leading to an index page. To reclassify a post, an author edits its list of tags. All connections between posts are automatically tracked and updated by the blog software; there is no need to relocate the page within a complex hierarchy of categories.
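The per-tag index pages described above amount to an inverted index from tags to posts. The following is a minimal sketch of that idea, not the implementation of any particular blog system; the function name build_tag_index and the sample posts are hypothetical.

```python
from collections import defaultdict


def build_tag_index(posts):
    """Map each tag to the sorted titles of the posts that carry it."""
    index = defaultdict(list)
    for title, tags in posts.items():
        for tag in tags:
            index[tag].append(title)
    return {tag: sorted(titles) for tag, titles in index.items()}


# Hypothetical posts, each with a free-form set of tags.
posts = {
    "Opening day": {"baseball", "tickets"},
    "Season preview": {"baseball"},
}
index = build_tag_index(posts)
print(index["baseball"])

# Reclassifying a post only means editing its tag set and rebuilding the index.
posts["Season preview"] = {"tickets"}
reindexed = build_tag_index(posts)
print(reindexed["tickets"])
```

Because the index is derived entirely from each post's tag list, no post has to be moved within a category hierarchy when its tags change.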

Within application software

Some desktop applications and web applications feature their own tagging systems, such as email tagging in Gmail and Mozilla Thunderbird,[12]:73 bookmark tagging in Firefox,[24] audio tagging in iTunes or Winamp, and photo tagging in various applications.[25] Some of these applications display collections of tags as tag clouds.[6]

Assigned to computer files

There are various systems for applying tags to the files in a computer's file system. In Apple's macOS, the operating system has allowed users to assign multiple arbitrary tags as extended file attributes to any file or folder ever since OS X 10.9 was released in 2013,[26] and before that time the open-source OpenMeta standard provided similar tagging functionality in macOS.[27] Several semantic file systems that implement tags are available for the Linux kernel, including Tagsistant.[28] Microsoft Windows allows users to set tags only on Microsoft Office documents and some kinds of picture files.[29]

Cross-platform file tagging standards include Extensible Metadata Platform (XMP), an ISO standard for embedding metadata into popular image, video and document file formats, such as JPEG and PDF, without breaking their readability by applications that do not support XMP.[30] XMP largely supersedes the earlier IPTC Information Interchange Model. Exif is a standard that specifies the image and audio file formats used by digital cameras, including some metadata tags.[31] TagSpaces is an open-source cross-platform application for tagging files; it inserts tags into the filename.[32]

For an event

An official tag is a keyword adopted by events and conferences for participants to use in their web publications, such as blog entries, photos of the event, and presentation slides.[33] Search engines can then index them to make relevant materials related to the event searchable in a uniform way. In this case, the tag is part of a controlled vocabulary.

In research

A researcher may work with a large collection of items (e.g. press quotes, a bibliography, images) in digital form. If they wish to associate each item with a small number of themes (e.g. chapters of a book, or sub-themes of the overall subject), then a group of tags for these themes can be attached to each of the items in the larger collection.[34] In this way, freeform classification allows the author to manage what would otherwise be unwieldy amounts of information.[35]

Special types

Triple tags

A triple tag or machine tag uses a special syntax to define extra semantic information about the tag, making it easier or more meaningful for interpretation by a computer program.[36] Triple tags comprise three parts: a namespace, a predicate, and a value. For example, geo:long=50.123456 is a tag for the geographical longitude coordinate whose value is 50.123456. This triple structure is similar to the Resource Description Framework model for information.
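The three-part structure can be made concrete with a small parser. This is a sketch that assumes the namespace:predicate=value layout shown in the example above; the names TripleTag and parse_triple_tag are hypothetical, and real services may apply stricter rules about which characters are allowed.

```python
from typing import NamedTuple, Optional


class TripleTag(NamedTuple):
    namespace: str
    predicate: str
    value: str


def parse_triple_tag(tag: str) -> Optional[TripleTag]:
    """Split 'namespace:predicate=value'; return None for ordinary tags."""
    # Split on the first '=' so that the value itself may contain '=' or ':'.
    head, sep, value = tag.partition("=")
    if not sep:
        return None
    namespace, sep, predicate = head.partition(":")
    if not sep or not namespace or not predicate:
        return None
    return TripleTag(namespace, predicate, value)


print(parse_triple_tag("geo:long=50.123456"))
print(parse_triple_tag("baseball"))
```

The value is kept as a string; converting it to a number or a date is left to whatever program interprets the namespace and predicate.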

The triple tag format was first devised for geolicious in November 2004,[37] to map Delicious bookmarks, and gained wider acceptance after its adoption by Mappr and GeoBloggers to map Flickr photos.[38] In January 2007, Aaron Straup Cope at Flickr introduced the term machine tag as an alternative name for the triple tag, adding some questions and answers on purpose, syntax, and use.[39]

Specialized metadata for geographical identification is known as geotagging; machine tags are also used for other purposes, such as identifying photos taken at a specific event or naming species using binomial nomenclature.[40]


Hashtags

A hashtag is a kind of metadata tag marked by the prefix #, sometimes known as a "hash" symbol. This form of tagging is used on microblogging and social networking services such as Twitter, Facebook, Google+, VK and Instagram.

Knowledge tags

A knowledge tag is a type of meta-information that describes or defines some aspect of a piece of information (such as a document, digital image, database table, or web page).[41] Knowledge tags are more than traditional non-hierarchical keywords or terms; they are a type of metadata that captures knowledge in the form of descriptions, categorizations, classifications, semantics, comments, notes, annotations, hyperdata, hyperlinks, or references that are collected in tag profiles (a kind of ontology).[41] These tag profiles reference an information resource that resides in a distributed, and often heterogeneous, storage repository.[41]

Knowledge tags are part of a knowledge management discipline that leverages Enterprise 2.0 methodologies for users to capture insights, expertise, attributes, dependencies, or relationships associated with a data resource.[3]:251[42] Different kinds of knowledge can be captured in knowledge tags, including factual knowledge (that found in books and data), conceptual knowledge (found in perspectives and concepts), expectational knowledge (needed to make judgments and hypotheses), and methodological knowledge (derived from reasoning and strategies).[42] These forms of knowledge often exist outside the data itself and are derived from personal experience, insight, or expertise. Knowledge tags are considered an expansion of the information itself that adds value, context, and meaning to the information. Knowledge tags are valuable for preserving organizational intelligence that is often lost due to turnover, for sharing knowledge stored in the minds of individuals that is typically isolated and unharnessed by the organization, and for connecting knowledge that is often lost or disconnected from an information resource.[43]

Advantages and disadvantages

In a typical tagging system, there is no explicit information about the meaning or semantics of each tag, and a user can apply new tags to an item as easily as applying older tags.[2] Hierarchical classification systems can be slow to change, and are rooted in the culture and era that created them; in contrast, the flexibility of tagging allows users to classify their collections of items in the ways that they find useful, but the personalized variety of terms can present challenges when searching and browsing.

When users can freely choose tags (creating a folksonomy, as opposed to selecting terms from a controlled vocabulary), the resulting metadata can include homonyms (the same tags used with different meanings) and synonyms (multiple tags for the same concept), which may lead to inappropriate connections between items and inefficient searches for information about a subject.[44] For example, the tag "orange" may refer to the fruit or the color, and items related to a version of the Linux kernel may be tagged "Linux", "kernel", "Penguin", "software", or a variety of other terms. Users can also choose tags that are different inflections of words (such as singular and plural),[45] which can contribute to navigation difficulties if the system does not include stemming of tags when searching or browsing. Larger-scale folksonomies address some of the problems of tagging, in that users of tagging systems tend to notice the current use of "tag terms" within these systems, and thus use existing tags in order to easily form connections to related items. In this way, folksonomies may collectively develop a partial set of tagging conventions.
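To show how inflected variants of a tag might be merged when searching or browsing, the sketch below groups tags under a crude normalized key. It is deliberately naive: it only folds case and drops a single trailing plural "s", which is an assumption made for illustration and not a real stemming algorithm. The function names normalize and group_variants are hypothetical.

```python
from collections import defaultdict


def normalize(tag: str) -> str:
    """Crude key: trim, case-fold, and drop one trailing plural 's'."""
    key = tag.strip().casefold()
    # Leave short words and words ending in 'ss' (e.g. 'glass') untouched.
    if len(key) > 3 and key.endswith("s") and not key.endswith("ss"):
        key = key[:-1]
    return key


def group_variants(tags):
    """Group the tags that share the same normalized key."""
    groups = defaultdict(set)
    for tag in tags:
        groups[normalize(tag)].add(tag)
    return dict(groups)


groups = group_variants(["Orange", "oranges", "kernel", "Kernels", "glass"])
print(groups)
```

Normalization of this kind addresses only inflections. It cannot tell the fruit "orange" from the color, nor recognize that "Linux" and "Penguin" may refer to the same subject.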

Complex system dynamics

Despite the apparent lack of control, research has shown that a simple form of shared vocabulary emerges in social bookmarking systems. Collaborative tagging exhibits a form of complex systems dynamics (or self-organizing dynamics).[46] Thus, even if no central controlled vocabulary constrains the actions of individual users, the distribution of tags converges over time to stable power law distributions.[46] Once such stable distributions form, simple folksonomic vocabularies can be extracted by examining the correlations that form between different tags. In addition, research has suggested that it is easier for machine learning algorithms to learn tag semantics when users tag "verbosely"—when they annotate resources with a wealth of freely associated, descriptive keywords.[47]
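The raw material for such analyses is simple to compute: how often each tag is used, and how often pairs of tags appear on the same item. The sketch below gathers only those counts and does not fit a power law or extract a vocabulary. The function name tag_statistics and the sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def tag_statistics(tagged_items):
    """Return (tag frequencies, pairwise co-occurrence counts)."""
    frequency = Counter()
    cooccurrence = Counter()
    for tags in tagged_items:
        # Deduplicate and sort so each pair has one canonical ordering.
        unique = sorted(set(tags))
        frequency.update(unique)
        cooccurrence.update(combinations(unique, 2))
    return frequency, cooccurrence


# Hypothetical items, each represented by the tags users applied to it.
items = [
    ["linux", "kernel", "software"],
    ["linux", "kernel"],
    ["linux", "penguin"],
    ["software"],
]
frequency, cooccurrence = tag_statistics(items)
print(frequency["linux"])
print(cooccurrence[("kernel", "linux")])
```

Sorting the frequencies in descending order gives the rank-frequency curve whose shape the cited research examines, and the co-occurrence counts are one possible starting point for finding correlated tags.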


Tagging systems open to the public are also open to tag spam, in which people apply an excessive number of tags or unrelated tags to an item (such as a YouTube video) in order to attract viewers. This abuse can be mitigated using human or statistical identification of spam items.[48] The number of tags allowed may also be limited to reduce spam.


Some tagging systems provide a single text box for entering tags, so a separator is needed to tokenize the input string. Two popular separators are the space character and the comma. To allow separator characters within a tag, a system may support higher-level delimiters (such as quotation marks) or escape characters. Systems can avoid separators altogether by allowing only one tag per input widget at a time, although this makes adding multiple tags more time-consuming.
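Both separator styles can be sketched with Python's standard library, which already handles quotation marks and escape characters. This is one possible approach, not a description of how any particular tagging system parses its input; the function names are hypothetical.

```python
import csv
import shlex


def split_space_separated(text: str) -> list[str]:
    """Split on whitespace; quotes or a backslash keep a space inside a tag."""
    return shlex.split(text)


def split_comma_separated(text: str) -> list[str]:
    """Split on commas; double quotes keep a comma inside a tag."""
    row = next(csv.reader([text], skipinitialspace=True))
    # Trim stray whitespace and drop empty entries such as a trailing comma.
    return [field.strip() for field in row if field.strip()]


print(split_space_separated('baseball "new york" tickets'))
print(split_space_separated(r'new\ york mets'))
print(split_comma_separated('baseball, "new york, ny", tickets'))
```

Note that shlex.split raises ValueError when a quotation mark is left unclosed, so a real input form would need to catch that error and report it to the user.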

Within HTML, the rel-tag microformat uses the rel attribute with the value "tag" (i.e., rel="tag") to indicate that the linked-to page acts as a tag for the current context.[49]
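A consumer of this markup might collect tags by scanning a page for links whose rel attribute contains "tag". The sketch below does so with Python's standard html.parser module. It assumes that the tag name is the last path segment of the linked URL, which is how the rel-tag convention is usually described but is not stated in the text above. The class name RelTagParser and the example.com URLs are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import unquote, urlparse


class RelTagParser(HTMLParser):
    """Collect tag names from <a rel="tag" href="..."> links."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, e.g. rel="tag nofollow".
        rel_values = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if "tag" in rel_values and href:
            # Assumption: the tag name is the last path segment of the URL.
            path = urlparse(href).path.rstrip("/")
            self.tags.append(unquote(path.rsplit("/", 1)[-1]))


html = (
    '<a href="https://example.com/tags/baseball" rel="tag">baseball</a> '
    '<a href="https://example.com/about">About</a>'
)
parser = RelTagParser()
parser.feed(html)
print(parser.tags)
```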

See also


  1. ^ Some users, however, see tags not as metadata but as "just more content": Berendt, Bettina; Hanser, Christoph (2007). "Tags are not metadata, but 'just more content'—to some people" (PDF). Proceedings of the International Conference on Weblogs and Social Media (ICWSM), Boulder, Colorado, USA, March 26–28, 2007. Menlo Park, CA: International Joint Conferences on Artificial Intelligence. OCLC 799635928.
  2. ^ a b c d e Smith, Gene (2008). Tagging: people-powered metadata for the social web. Berkeley: New Riders Press. ISBN 9780321529176. OCLC 154806677.
  3. ^ a b c d e f Breslin, John G.; Passant, Alexandre; Decker, Stefan (2009). The social semantic web. Heidelberg; New York: Springer-Verlag. doi:10.1007/978-3-642-01172-6. ISBN 9783642011719. OCLC 506401195.
  4. ^ a b Jones, Rodney H.; Hafner, Christoph A. (2012). "Networks and organization". Understanding digital literacies: a practical introduction. Milton Park, Abingdon, Oxon; New York: Routledge. pp. 23–28. ISBN 9780415673167. OCLC 711041611.
  5. ^ a b For example, Blogger and WordPress can display tag clouds.
  6. ^ a b For example: Leap is a macOS application that features a clickable tag cloud of macOS tags: Hampton-Smith, Sam (12 April 2013). "The pro designer's guide to photo organization". creativebloq.com. Archived from the original on 16 April 2013. Retrieved 10 March 2017. As with all the other options here, meta data can be added to individual files to help improve their find-ability, and uniquely the tag cloud field within Leap's interface allows you to quickly drill down to individually labelled files without fuss. TaggTool is a Windows application that permits tagging files and displaying a tag cloud: Henry, Alan (28 April 2010). "TaggTool: organize your files by keyword". pcmag.com. PC Magazine. Archived from the original on 11 July 2015. Retrieved 10 March 2017.
  7. ^ Heymann, Paul; Garcia-Molina, Hector (2006). Collaborative creation of communal hierarchical taxonomies in social tagging systems (Technical report). Stanford University. Summarized in: Heymann, Paul (2006). "Tag hierarchies". infolab.stanford.edu. Archived from the original on 25 June 2016. Retrieved 10 March 2017.
  8. ^ Quintarelli, Emanuele; Resmini, Andrea; Rosati, Luca (June 2007). "Information architecture: Facetag: integrating bottom-up and top-down classification in a social tagging system". Bulletin of the American Society for Information Science and Technology. 33 (5): 10–15. doi:10.1002/bult.2007.1720330506.
  9. ^ Wu, Harris; Zubair, Mohammad; Maly, Kurt (2007). "Collaborative classification of growing collections with evolving facets". Proceedings of the eighteenth conference on hypertext and hypermedia, Manchester, UK, September 10–12, 2007. HT '07. New York: Association for Computing Machinery. pp. 167–170. doi:10.1145/1286240.1286289. ISBN 9781595938206.
  10. ^ Carcillo, Franco; Rosati, Luca (2007). "Tags for citizens: integrating top-down and bottom-up classification in the Turin municipality website". In Schuler, Douglas (ed.). Online communities and social computing: second international conference, OCSC 2007, held as part of HCI International 2007, Beijing, China, July 22–27, 2007: proceedings. 4564. Berlin; New York: Springer-Verlag. pp. 256–264. doi:10.1007/978-3-540-73257-0_29. ISBN 9783540732563. OCLC 184906067.
  11. ^ Wilson, Katie (2007). "OPAC 2.0: next generation online library catalogues ride the Web 2.0 wave!". Online Currents. 21 (10): 406–413.
  12. ^ a b Yee, Raymond (2008). "Understanding tagging and folksonomies". Pro Web 2.0 mashups: remixing data and Web services. The expert's voice in Web development. Berkeley: Apress. pp. 61–75. doi:10.1007/978-1-4302-0286-8_3. ISBN 9781590598580. OCLC 148910044.
  13. ^ Willey, Eric (2011). "A cautious partnership: the growing acceptance of folksonomy as a complement to indexing digital images and catalogs". Library Student Journal. Retrieved 10 March 2017.
  14. ^ Gerolimos, Michalis (January 2013). "Tagging for libraries: a review of the effectiveness of tagging systems for library catalogs". Journal of Library Metadata. 13 (1): 36–58. doi:10.1080/19386389.2013.778730.
  15. ^ Raman, T. V. (1997). Auditory user interfaces: toward the speaking computer. Boston: Kluwer Academic Publishers. p. 107. doi:10.1007/978-1-4615-6225-2. ISBN 978-0792399841. OCLC 37109286. Calling a function defined in one compilation unit from within another is analogous to cross references in large hypertext documents. By using tags tables, the Emacs environment enables the user to turn program source code into powerful hypertext documents.
  16. ^ Wempen, Faithe (2010). Teach yourself visually Microsoft Access 2010. Teach yourself visually. Indianapolis: John Wiley & Sons. p. 69. ISBN 9780470577653. OCLC 495271168. You can turn on smart tags for a field to make it easier to cross-reference data between the Access database and Microsoft Outlook (or another personal information and e-mail program) and the Web.
  17. ^ Meyrowitz, Norman; Dam, Andries (September 1982). "Interactive Editing Systems: Part II". ACM Computing Surveys (CSUR). 14 (3): 353–415 (366–367). doi:10.1145/356887.356890. EMACS is an M.I.T. display editor designed to be 'extensible, customizable, and self-documenting' [...] Another interesting facility for program editing is the TAGS package. The separate program TAGS builds a TAGS table containing the file name and position in that file in which each application program function is defined. This table is loaded into EMACS; specifying the command Meta, function name causes EMACS to select the appropriate file and go to the proper function definition within that file.
  18. ^ "A Description of the Equator and Some ØtherLands". aporee.org. Archived from the original on 18 August 2001. Retrieved 10 March 2017.
  19. ^ See, for example: Screenshot of tags on del.icio.us in 2004 and Screenshot of a tag page on del.icio.us, also in 2004, both published by Joshua Schachter on July 9, 2007.
  20. ^ Garrett, Jesse James (4 August 2005). "An Interview with Flickr's Eric Costello". Tags were not in the initial version of Flickr. Stewart Butterfield wanted to add them. He liked the way they worked on del.icio.us, the social bookmarking application. We added very simple tagging functionality, so you could tag your photos, and then look at all your photos with a particular tag, or any one person's photos with a particular tag. Soon thereafter, users started telling us that what was really interesting about tagging was not just how you've tagged your photos, but how the whole Flickr community has been tagging photos. So we started seeing a lot of requests from users to be able to see a global view of the tagscape.
  21. ^ Mathes, Adam (December 2004). "Folksonomies: cooperative classification and communication through shared metadata". adammathes.com. Archived from the original on 9 March 2017. Retrieved 10 March 2017.
  22. ^ Gupta, Manish; Li, Rui; Yin, Zhijun; Han, Jiawei (2011). "An overview of social tagging and applications". In Aggarwal, Charu C. (ed.). Social network data analytics. New York: Springer-Verlag. pp. 447–497. doi:10.1007/978-1-4419-8462-3_16. ISBN 9781441984616. OCLC 709712928.
  23. ^ Bray, Tim (1 February 2007). "A Uniform Resource Name (URN) namespace for tag metadata". tbray.org. Archived from the original on 5 November 2016. Retrieved 10 March 2017.
  24. ^ "Firefox tip: find bookmarks faster with tags". blog.mozilla.org. Mozilla Foundation. Archived from the original on 12 October 2016. Retrieved 10 March 2017.
  25. ^ Hinton, Mark Justice; Obermeier, Barbara; Sahlin, Doug (2010). "Tagging photos". Editing digital photos for dummies. Hoboken, NJ: John Wiley & Sons. ISBN 9780470591451. OCLC 606841528.
  26. ^ Siracusa, John (22 October 2013). "OS X 10.9 Mavericks: The Ars Technica Review: Tags". arstechnica.com. Ars Technica. Archived from the original on 9 January 2017. Retrieved 10 March 2013.
  27. ^ Cherp, Aleh (17 March 2011). "Tagging". macademic.org. Academic workflows on a Mac. Archived from the original on 30 April 2016. Retrieved 10 March 2017.
  28. ^ "Extended attributes and tag file systems". lesbonscomptes.com. 2 July 2015. Archived from the original on 11 August 2016. Retrieved 10 March 2017.
  29. ^ Schultz, Greg (23 March 2011). "Tag your files for easier searches in Windows 7". techrepublic.com. TechRepublic. Archived from the original on 29 August 2016. Retrieved 10 March 2017.
  30. ^ Gasiorowski-Denis, Elizabeth (22 March 2012). "Adobe Extensible Metadata Platform (XMP) becomes an ISO standard". iso.org. International Organization for Standardization. Archived from the original on 10 March 2017. Retrieved 10 March 2017.
  31. ^ Płoszajski, Grzegorz (2017). "Metadata in long-term digital preservation". In Traczyk, Tomasz; Ogryczak, Włodzimierz; Pałka, Piotr; Śliwiński, Tomasz (eds.). Digital preservation: putting it to work. Studies in computational intelligence. 700. New York: Springer-Verlag. pp. 15–61. doi:10.1007/978-3-319-51801-5_2. ISBN 9783319518008. OCLC 969844731.
  32. ^ Devcic, Ivana Isadora (9 October 2015). "Tag, you're it! How to manage files on Linux with TagSpaces". makeuseof.com. MakeUseOf. Archived from the original on 28 December 2016. Retrieved 10 March 2017.
  33. ^ Finch, Curt (26 May 2011). "Hashtag techniques for businesses". inc.com. Inc. Magazine. Retrieved 10 March 2017.
  34. ^ Parry, David (11 March 2007). "Tagging files—or how to keep research organized". academhack.outsidethetext.com. Archived from the original on 2 August 2016. Retrieved 10 March 2017.
  35. ^ Smith, Richard (December 2010). "Strategies for coping with information overload". The BMJ. 341: c7126. doi:10.1136/bmj.c7126. PMID 21159764.
  36. ^ Bainbridge, Scott; Page, Geoff; Jaroensutasinee, Mullica; Jaroensutasinee, Krisanadej (September 2011). "Towards a services based architecture for real time marine observing data". OCEANS '11 MTS/IEEE Kona, Waikoloa, Hawaii, USA, 19–22 September 2011. Piscataway, NJ: IEEE. pp. 740–745. ISBN 9781457714276. OCLC 777270556.
  37. ^ Maron, Mikel (5 November 2004). "geo.lici.us: geotagging hosted services". brainoff.com. Archived from the original on 28 April 2007. Retrieved 10 March 2017.
  38. ^ Catt, Dan (11 January 2006). "Advanced Tagging and TripleTags". Archived from the original on 18 October 2007. Retrieved 10 March 2017.
  39. ^ Straup Cope, Aaron (24 January 2007). "Machine tags". flickr.com. Archived from the original on 20 April 2016. Retrieved 10 March 2017.
  40. ^ "The Encyclopedia of Life Flickr group rules". flickr.com. Encyclopedia of Life. Archived from the original on 10 February 2017. Retrieved 10 March 2017. Includes the required use of a taxonomy machine tag.
  41. ^ a b c Panda, Mrutyunjaya; El-Bendary, Nashwa; Salama, Mostafa A.; Hassanien, Aboul Ella; Abraham, Ajith (2012). "Computational social networks: tools, perspectives, and challenges" (PDF). In Abraham, Ajith; Hassanien, Aboul-Ella (eds.). Computational social networks: tools, perspectives, and applications. New York: Springer-Verlag. pp. 3–23 [14–15]. doi:10.1007/978-1-4471-4048-1_1. ISBN 9781447140474. OCLC 798568503.
  42. ^ a b Wiig, Karl M. (March 1997). "Knowledge management: an introduction and perspective". Journal of Knowledge Management. 1 (1): 6–14. doi:10.1108/13673279710800682.
  43. ^ Alavi, Maryam; Leidner, Dorothy E. (February 1999). "Knowledge management systems: issues, challenges, and benefits". Communications of the AIS. 1 (2es): 1.
  44. ^ Golder, Scott A.; Huberman, Bernardo A. (April 2006). "Usage patterns of collaborative tagging systems". Journal of Information Science. 32 (2): 198–208. doi:10.1177/0165551506062337.
  45. ^ Devens, Keith (24 December 2004). "Singular vs. plural tags in a tag-based categorization system (such as del.icio.us)". keithdevens.com. Archived from the original on 10 May 2012. Retrieved 10 March 2017.
  46. ^ a b Halpin, Harry; Robu, Valentin; Shepherd, Hana (2007). "The complex dynamics of collaborative tagging" (PDF). Proceedings of the 16th international conference on World Wide Web, Banff, Alberta, Canada, May 08–12, 2007. WWW '07. New York: Association for Computing Machinery. pp. 211–220. doi:10.1145/1242572.1242602. ISBN 9781595936547. OCLC 173331796.
  47. ^ Körner, Christian; Benz, Dominik; Hotho, Andreas; Strohmaier, Markus; Stumme, Gerd (2010). "Stop thinking, start tagging: tag semantics emerge from collaborative verbosity" (PDF). Proceedings of the 19th International Conference on World Wide Web, Raleigh, North Carolina, USA, April 26–30, 2010. WWW '10. New York: Association for Computing Machinery. pp. 521–530. doi:10.1145/1772690.1772744. ISBN 9781605587998. OCLC 671101543.
  48. ^ Heymann, Paul. "Tag spam". stanford.edu. Stanford University. Retrieved 10 March 2017.
  49. ^ "Microformats wiki: rel='tag'". microformats.org. 10 January 2005. Retrieved 10 March 2017.

ApexKB

ApexKB (formerly Jumper 2.0) is an open-source script for collaborative search and knowledge management, powered by a shared enterprise bookmarking engine that is a fork of KnowledgebasePublisher. It was publicly announced on 29 September 2008. A stable version of Jumper was publicly released under the GNU General Public License and made available on SourceForge on 26 March 2009 as a free software download. Jumper is Enterprise 2.0 software that empowers users to compile and share collaborative bookmarks by crowdsourcing their knowledge, experience and insights using knowledge tags. Users tag, link, and rate structured data and unstructured data sources, including relational databases, flat file databases, medical imaging, content management systems, and any network file system. It is an interactive, user-submitted recommendation engine that uses peer and social-networking principles to reference any information located in distributed storage devices, either inside or outside the firewall, and to capture the collective knowledge about that content, media, or data.

Automatic indexing

Automatic indexing is the computerized process of scanning large volumes of documents against a controlled vocabulary, taxonomy, thesaurus or ontology and using those controlled terms to quickly and effectively index large electronic document depositories. As the number of documents exponentially increases with the proliferation of the Internet, automatic indexing will become essential to maintaining the ability to find relevant information in a sea of irrelevant information. Automatic indexing is the process of analyzing an item to extract the information to be permanently kept in an index.

The automated process can encounter problems, primarily caused by two factors: 1) the complexity of the language; and 2) the lack of intuitiveness and the difficulty in extrapolating concepts out of statements on the part of the computing technology. These are primarily linguistic challenges, and the specific problems involve semantic and syntactic aspects of language.

Edge-notched card

Edge-notched cards or edge-punched cards are an obsolete technology used to store a small amount of binary or logical data on paper index cards, encoded via the presence or absence of notches in the edges of the cards. The notches allowed efficient sorting and selecting of specific cards matching multiple desired criteria, from a larger number of cards in a paper-based database of information. In the mid-20th century they were also known by commercial names such as Cope-Chat cards, E-Z Sort cards, and McBee Keysort cards.

Enterprise bookmarking

Enterprise bookmarking is a method for Enterprise 2.0 users to tag, organize, store, and search bookmarks of both web pages on the Internet and data resources stored in a distributed database or fileserver. This is done collectively and collaboratively in a process by which users add tags (metadata) and knowledge tags. In early versions of the software, these tags were applied as non-hierarchical keywords, or terms assigned by a user to a web page, and were collected in tag clouds.

Examples of this software are Connectbeam and Dogear. New versions of the software such as Jumper 2.0 and Knowledge Plaza expand tag metadata in the form of knowledge tags that provide additional information about the data and are applied to structured and semi-structured data and are collected in tag profiles.

Faceted classification

A faceted classification is a classification scheme used in organizing knowledge into a systematic order. A faceted classification uses semantic categories, either general or subject-specific, that are combined to create the full classification entry. Many library classification systems use a combination of a fixed, enumerative taxonomy of concepts with subordinate facets that further refine the topic.
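
A small Python sketch may help show how facets combine. The catalog, the facet names, and the citation order below are hypothetical and chosen only for illustration; they do not reproduce any particular library scheme.

```python
# Hypothetical catalog; facet names and values are made up for illustration.
CATALOG = [
    {"title": "Item 1", "facets": {"form": "chair", "material": "wood", "period": "1800s"}},
    {"title": "Item 2", "facets": {"form": "chair", "material": "steel", "period": "1900s"}},
    {"title": "Item 3", "facets": {"form": "table", "material": "wood", "period": "1900s"}},
]


def classification_entry(facets, order=("form", "material", "period")):
    """Combine facet values in a fixed order into one classification entry."""
    return " : ".join(facets[name] for name in order if name in facets)


def filter_by_facets(catalog, **wanted):
    """Return titles of items whose facets match every requested value."""
    return [
        item["title"]
        for item in catalog
        if all(item["facets"].get(name) == value for name, value in wanted.items())
    ]


entry = classification_entry(CATALOG[0]["facets"])
wooden = filter_by_facets(CATALOG, material="wood")
wooden_chairs = filter_by_facets(CATALOG, material="wood", form="chair")
print(entry)
print(wooden, wooden_chairs)
```

Because each facet is independent, an item can be reached through any combination of facet values rather than through a single fixed path.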

Index term

An index term, subject term, subject heading, or descriptor, in information retrieval, is a term that captures the essence of the topic of a document. Index terms make up a controlled vocabulary for use in bibliographic records. They are an integral part of bibliographic control, which is the function by which libraries collect, organize and disseminate documents. They are used as keywords to retrieve documents in an information system, for instance, a catalog or a search engine. A popular form of keyword on the web is the tag, which is directly visible and can be assigned by non-experts. Index terms can consist of a word, phrase, or alphanumerical term. They are created by analyzing the document either manually with subject indexing or automatically with automatic indexing or more sophisticated methods of keyword extraction. Index terms can either come from a controlled vocabulary or be freely assigned.

Keywords are stored in a search index. Common words like articles (a, an, the) and conjunctions (and, or, but) are not treated as keywords because indexing them is inefficient. Almost every English-language site on the Internet contains the article "the", so it makes no sense to search for it. The most popular search engine, Google, removed stop words such as "the" and "a" from its indexes for several years, but then re-introduced them, making certain types of precise search possible again.
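
The effect of skipping common words can be sketched with a tiny inverted index in Python. The stop-word list is taken from the examples above; the sample documents and the function name build_index are assumptions made for illustration.

```python
from collections import defaultdict

# Stop words from the examples in the text: articles and conjunctions.
STOP_WORDS = {"a", "an", "the", "and", "or", "but"}


def build_index(documents, stop_words=STOP_WORDS):
    """Map each keyword to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            word = word.strip(".,;:!?\"'()")
            if word and word not in stop_words:
                index[word].add(doc_id)
    return index


# Hypothetical sample documents.
documents = {
    1: "The tag and the index term.",
    2: "A tag cloud, or a folksonomy.",
}
index = build_index(documents)
print(sorted(index))
print(index["tag"])
```

An index built this way cannot answer an exact-phrase query that contains a stop word, which illustrates why re-introducing such words makes certain precise searches possible again.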

The term "descriptor" was coined by Calvin Mooers in 1948. It is used in particular for a preferred term from a thesaurus.

The Simple Knowledge Organization System (SKOS) language provides a way to express index terms with the Resource Description Framework (RDF) for use in the context of the Semantic Web.

MIKE2.0 methodology

The Method for an Integrated Knowledge Environment (MIKE2.0) is an open source delivery methodology for enterprise information management consultants. MIKE2.0 was released in December 2006 by BearingPoint, a management and technology consulting company, under the Creative Commons Attribution License. The project is now run by the MIKE2.0 Governance Association, a non-profit organisation based in Switzerland, with BearingPoint and Deloitte as the founding members. In March 2013, a book promoting it, Information Development Using MIKE2.0, was published.

Machine-readable document

A machine-readable document is a document whose content can be readily processed by computers. Such documents are distinguished from machine-readable data by virtue of having sufficient structure to provide the necessary context to support the business processes for which they are created.

Media meshing

Media meshing is the process of using one medium, such as a blog or a website, to enhance the experience of another medium, such as a newspaper article or a fictional television program. "Meshing" may describe activities and motivations of an information receiver which are completely independent of the intentions of the source of that information. "Meshing" may also be used to describe the activities of an information provider, who may intentionally encourage consumer engagement by using multiple media or channels of information exchange related to a product or organization, as in an integrated advertising campaign. Strictly, in both cases, "media meshing" ultimately describes the behaviour of the person receiving the information.

When the meshing of media sources is encouraged by the source itself, as is often the case with commercial products and special interest promotions, media meshing may be better described as integrated media. This describes the activity desired by commercial entities when they encourage web traffic through non-web media such as billboards or newspaper articles in a comprehensive advertising campaign.

However, from the perspective of the information consumer, Lerma describes meshing as a way to experience media that is fundamentally different from the way people interacted with mass media in the past. When the participants themselves choose the channels, pieces, sources, and specific media that they use to enhance or disassemble the first piece of media they encounter, especially when these are perceived to be mostly independent sources, the result is an enriched self-discovery experience with a level of engagement not possible from even the most well designed guided experiences. Meshing goes beyond simply mutually inclusive public relations across media channels, pieces, and product placement. Broadband technology and diversity of online services have increased the accessibility and prevalence of simultaneous media experiences to the point where some would argue there is a new, more demanding and more inquisitive, incarnation of information consumers. The richness of experience in meshing becomes even more evident when information receivers put their new views and interpretations back into the media by posting or answering questions on a discussion forum or blog, publishing a formal in-depth response, or creating an independent fan site. In cases like these, the persons may be called information prosumers, as they are both consuming information from various sources and producing a web of related information for other "meshers" to consume (or prosumers to mesh).

It is arguable, however, that this is the traditional approach of historians and scientists to information sources, and that what is new is the speed and ease with which cross-referencing and elaboration can be achieved with the media delivery and display technology that is used, and the amount of time invested in multi-source research by the non-specialized population. It is also arguable that multi-form/multi-source information gathering only "seems" new when contrasted to the recent era of single-voice and non-interactive commercial television and radio information broadcasting. It is also important to note that if the sources are not chosen carefully to be as orthogonal as possible, media meshing will result in a false reassurance of the "facts" presented by the single voice of the original articles, but it would still technically be media meshing. For example, confirming a CNN television story about an event in London by meshing the CNN.com web site, BBC radio, and The Guardian newspaper will probably confirm the original facts, but a different set of facts may be forthcoming from an email to a friend who lives there or from reading a blog from a beat cop in the area.


Musepack

Musepack or MPC is an open source lossy audio codec, specifically optimized for transparent compression of stereo audio at bitrates of 160–180 kbit/s (manual settings allow bitrates up to 320 kbit/s). It was formerly known as MPEGplus, MPEG+ or MP+.

Development of MPC was initiated in 1997 by Andree Buschmann and later taken over by Frank Klemm; as of 2004 it is maintained by the Musepack Development Team (MDT) with assistance from Buschmann and Klemm. Encoders and decoders are available for Microsoft Windows, Linux and Mac OS X, and plugins for several third-party media players are available from the Musepack website, licensed under the GNU Lesser General Public License (LGPL) or BSD licenses. The website also offers an extensive list of programs supporting the format.

Personal knowledge base

A personal knowledge base (PKB) is an electronic tool used to express, capture, and later retrieve the personal knowledge of an individual. It differs from a traditional database in that it contains subjective material particular to the owner, that others may not agree with nor care about. Importantly, a PKB consists primarily of knowledge, rather than information; in other words, it is not a collection of documents or other sources an individual has encountered, but rather an expression of the distilled knowledge the owner has extracted from those sources.

The term personal knowledge base was mentioned as early as the 1980s, but it came to prominence when it was described at length in publications by computer scientist Stephen Davies and colleagues, who compared PKBs on a number of different dimensions, the most important of which is the data model that each PKB uses to organize knowledge.

Davies and colleagues examined three aspects of the data models of PKBs:

their structural framework, which prescribes rules about how knowledge elements can be structured and interrelated (as a tree, graph, tree plus graph, spatially, categorically, or as n-ary links);

their knowledge elements, or basic building blocks of information that a user creates and works with, and the level of granularity of those knowledge elements (such as word/concept, phrase/proposition, free text notes, links to information sources, or composite); and

their schema, which involves the level of formal semantics introduced into the data model (such as a type system and related schemas, keywords, attribute–value pairs, etc.).

Davies and colleagues also differentiated PKBs according to their architecture: file-based, database-based, or client–server systems (including Internet-based systems accessed through desktop computers and/or handheld mobile devices).

Non-electronic personal knowledge bases have probably existed in some form for centuries: Da Vinci's notebooks are a famous example. More commonly, files of index cards (in German, Zettelkasten), edge-notched cards and annotated private libraries have served this function in the pre-electronic age. Undoubtedly the most famous early formulation of an electronic PKB was Vannevar Bush's description of the "memex" in 1945. In a 1962 technical report, human–computer interaction pioneer Douglas Engelbart (who would later become famous for his 1968 "Mother of All Demos" that demonstrated almost all the fundamental elements of modern personal computing) described his use of edge-notched cards to partially model Bush's memex.

Philip Pocock

Philip Pocock is a Canadian artist, photographer and researcher. He was born in Ottawa, Ontario, in 1954. Since the early 1990s, his work has been collaborative, situational, time-, code-, net-based and participatory.

In photography, in the 1980s, Philip Pocock produced two bodies of work: lyrical documentary explorations in New York and Berlin, and alchemical Cibachrome photographs. In 1980, "The Obvious Illusion: Murals from the Lower East Side", a monograph of his color photographs, was published by George Braziller to accompany public exhibitions of his Cibachrome photographs at the Cooper Union, in New York, in 1980, and the Art Gallery of Ontario, in Toronto, in 1981.

In New York City, in 1988, collaborating with the painter John Zinsser, Philip Pocock co-founded, co-published, co-edited, and designed on an Apple Macintosh Plus and LaserWriter the early low-cost, interview-based, desktop-published Journal of Contemporary Art, announced in the New York Times on 1988-01-22.

Relocating to Europe in 1990, Philip Pocock continued collaborative practice, painting and drawing, with German artist Walter Dahn, song lyrics from American popular music sources, from the Blues to Indie, under the label Music Security Administration, in Cologne, from 1993 to 1995, before entering telecommunication space with FAX performance, database cinema and cybernetic installation from 1993 onward.

In 1993, with Swiss photographer Felix Stephan Huber, Philip Pocock extended collaborative practice with digital cameras, laptops and a fax modem, co-producing for the Venice Biennale's Electronic Café a digital performance and facsimile book, Black Sea Diary.

In 1995, Huber and Pocock created an art weblog, mixing regularly posted live journal, sound and video entries with emails from their users' forum on the web. Travel-art-art-as-information, a cyber-roadmovie, Arctic Circle investigates contemporary loneliness, taking the duo by van from Vancouver, British Columbia, over thousands of kilometers, to walk along the Arctic Circle in Canada's northern wilderness, simultaneously searching for any sign of life on the other side, the cyber-side, of their laptop screens. Driving, acting, uploading, what began as 1970s conceptual performance mutated into 1990s pulp melodrama when two hitchhikers, Nora and Nicolas, hopped on board, all becoming fictional characters playing in a digital documentary of their own making. Arctic Circle was produced for the traveling exhibition Fotographie nach der Fotographie, in 1995-97, and included in documenta X, in 1997.

Philip Pocock was invited by documenta X's director, Catherine David, in 1996, to produce an Internet cinema piece for the event in 1997. He presented the work in the context of the documenta X - 100 Days 100 Guests event. For A Description of the Equator and Some ØtherLands, Philip Pocock assembled longtime collaborator Felix Stephan Huber, Udo Noll, and Florian Wenz to produce an early online, user-generated, database-driven hypercinematic work, which introduced the term tag (metadata), taking Pocock and Wenz first to Uganda, then Pocock and Noll to the Java Sea to traverse the Earth's equator, and with thousands of users pursue the potential one of corresponding identities in cyberspace. A Description of the Equator and Some ØtherLands was coded with open source software: php1.0 msql on a Redhat linux operating system. Philip Pocock did not visit the site of his collaboration at the documenta X in Kassel until he participated as a guest speaker in its 100 Days 100 Guests programme, 1997-08-23.

In 1999, with another group of collaborators, notably the Italian architecture collective Gruppo A12, net programmer Daniel Burckhardt, Brazilian artist Roberto Cabot, Thing.net founder Wolfgang Staehle, as well as the Equator group, Philip Pocock produced H|u|m|b|o|t for the ZKM Center for Art and Media's net_condition exhibition in Karlsruhe, Germany, initiated with support from the Goethe-Institut, Caracas, Venezuela. H|u|m|b|o|t is a movie-mapping, an atlas plotted to ubiquitous screens, transmitted from a database of text and video, mapped as a single screen-world, with the help of an intelligent, self-organizing mapping algorithm from the Finnish mathematician Teuvo Kohonen. H|u|m|b|o|t's text source was Alexander von Humboldt's scientific travelogue, Personal Narrative of a Journey to the Equinoctial Regions of the New Continent 1799–1804, each paragraph of which was specifically identified according to its Global Positioning System (GPS) metadata, as well as annotated with emotion, keyword and location markers, using H|u|m|b|o|t's XML editor. This metadata translates into a topography of Humboldt's historical narrative, tagged, visually and semantically connecting clusters of text to one shared screen (FLATBOOK), collated with contemporary videos from Venezuela and Cuba by H|u|m|b|o|t authors (FLATMOVIE). Together, an atlas is composed through which users travel, each logged as possible itineraries for future users. H|u|m|b|o|t was installed in Hans Ulrich Obrist's Voilà: Le monde dans la tête exhibition at the Musée d'art moderne, Paris, 2000.

In 2002, pre-YouTube, UNMOVIE, a future cinema, codes tagged, user-generated, flash video on-the-fly, the UNMOVIE Stream, mashed up from words generated by synthespian dialogue from the UNMOVIE Stage. Synthespians (chatterbots) were coded from: the entire oeuvre of Bob Dylan, Beyond Good and Evil by Nietzsche, Sculpting in Time by Tarkovsky, The Philosophy of Andy Warhol by Drella, anecdotes by the 13th-century Zen master Dogen, male-female cybersex chat from Geisha, and visitors to the Stage, You_01 - 06. As the synthespians match words, some are sent to the database to cull user video to play on the 'Stage'. With info architect Axel Heide, sculptor Gregor Stehle, and designers Onesandzeros, Philip Pocock produced UNMOVIE for the traveling Future Cinema: The Cinematic Imaginary After Film exhibition at ZKM Karlsruhe. UNMOVIE opened in November 2001, and has been writing itself and playing 24-7 ever since. UNMOVIE has been installed at the Kiasma, Helsinki, Finland, and the NTT InterCommunication Center (ICC) gallery, Tokyo, Japan.

In 2006, Philip Pocock created SpacePlace: Art in the Age of Orbitization with Peter Weibel, ZKM, Axel Heide and Onesandzeros. As well as being an on-line, web2.0 mashup and repository for outer space-related art and culture, the SpacePlace database generated a multimedia platform, SpacePlace mobile, as well as a dual-screen, free public access Bluetooth installation for specific locations, such as ZKMax, Munich, Germany, where urban guests were greeted by a cellphone message and projected video wobbling to the sound of outer space, opening June 7, 2006, in support of the United Nations Office for Outer Space Affairs conference in Vienna to check and balance peaceful and cultural utilization of near-Earth orbit and beyond.

Philip Pocock produced and directed, in collaboration with several art and design school students and grads, the ZKM Island YOUniverse in Second Life, with cyber-robotic avatars and avatar-sensing cinema structures, participatory and converging with a Moblog and mobile media sculpture presented at the YOU_ser: The Century of the Consumer exhibition curated by Peter Weibel at ZKM, Karlsruhe. User-generated images are emailed to Second Life mashup cinemas. ZKM Island in Second Life presents vitrine architecture in a globally warmed wasteland, each supermodern structure's components simultaneously screen, wall and window, sensitive to avatar movement and orientation. It includes a Boxing Ring where avatars can get in the ring with six cyber-robotic avatars of German cultural theorists and philosophers and punch it out while waxing philosophical, just for fun.

Commissioned by the Seville Biennial (BIACS), Spain, in 2008, Philip Pocock collaboratively produced with Alex Wenger, Linus Stolz, Julian Finn, Daniel Burckhardt and other students Aland: Scopic Regimes of Uncertainty, three telescopic, participatory, multi-screen sculptures that converse incessantly and convivially. Alan∂, short for Al-Andalus, a rare moment of cultural conviviality on the Iberian Peninsula between the 8th and 15th centuries, begins with an artificially intelligent, incessant dialogue between Federico García Lorca, raised in Christian Andalusia (his 20th-century poetry), and the Jewish and Muslim Al-Andalus contemporaries Moses Maimonides (his 12th-century book, A Guide for the Perplexed) and Muhammad Ibn Tufail (his 12th-century novel, Alive, Son of Awake). The dialogue drives database searches for images of Andalusia in the contemporary blogosphere, compiling them into rhythmic and subtitled video clips, which are surveilled by telescopes, the details captured, retrieving similar images from Andalusian cyberspace. In short, it is scopic media that are surveilled, and pictures looking at pictures, for pictures to display over sculpted arrays of recycled and DIY screens. Web-cams, sculpturally integrated as well, mix portraits of installation guests with a mashed up overabundance of Andalusia's scopic regime.

Revision tag

A revision tag is a textual label that can be associated with a specific revision of a project maintained by a version control system. This allows the user to define a meaningful name to be given to a particular state of a project that is under version control. This label can then be used in place of the revision identifier for commands supported by the version control system.

For example, in software development, a tag may be used to identify a specific release of the software such as "version 1.2".
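
The relationship between a tag and a revision identifier can be sketched with a toy model in Python. The Repository class, its method names, and the revision identifiers are hypothetical and do not correspond to any particular version control system.

```python
class Repository:
    """Toy model: revisions have opaque identifiers, and tags name them."""

    def __init__(self):
        self.revisions = {}  # revision identifier -> project state
        self.tags = {}       # tag name -> revision identifier

    def commit(self, revision_id, state):
        """Record a project state under a revision identifier."""
        self.revisions[revision_id] = state

    def tag(self, name, revision_id):
        """Attach a meaningful name to an existing revision."""
        if revision_id not in self.revisions:
            raise KeyError(f"unknown revision: {revision_id}")
        self.tags[name] = revision_id

    def resolve(self, ref):
        """Accept either a tag name or a raw revision identifier."""
        return self.tags.get(ref, ref)

    def checkout(self, ref):
        """Return the project state for a tag or revision identifier."""
        return self.revisions[self.resolve(ref)]


repo = Repository()
repo.commit("a1f3", "draft")
repo.commit("b7c9", "release text")
repo.tag("version 1.2", "b7c9")
print(repo.checkout("version 1.2"))
```

The resolve step is what lets a command accept the label in place of the revision identifier.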

Schema crosswalk

A schema crosswalk is a table that shows equivalent elements (or "fields") in more than one database schema. It maps the elements in one schema to the equivalent elements in another schema.

Crosswalk tables are often employed within or in parallel to enterprise systems, especially when multiple systems are interfaced or when the system includes legacy system data. In the context of interfaces, they function as a sort of internal Extract, Transform, Load (ETL) mechanism.

For example, this is a metadata crosswalk from MARC standards to Dublin Core:

Crosswalks show people where to put the data from one scheme into a different scheme. They are often used by libraries, archives, museums, and other cultural institutions to translate data to or from MARC standards, Dublin Core, Text Encoding Initiative (TEI), and other metadata schemes. For example, say an archive has a MARC record in its catalog describing a manuscript. If the archive makes a digital copy of that manuscript and wants to display it on the web along with the information from the catalog, it will have to translate the data from the MARC catalog record into a different format, such as Metadata Object Description Schema (MODS), that is viewable in a webpage. Because MARC has different fields than MODS, decisions must be made about where to put the data into MODS. This type of "translating" from one format to another is often called "metadata mapping" or "field mapping", and is related to "data mapping" and "semantic mapping".
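
As a rough sketch of field mapping, the Python example below applies a small crosswalk table to a flat record. The mapping is a simplified, illustrative subset of MARC-to-Dublin-Core correspondences, not a complete or official crosswalk, and the sample record and the function name crosswalk are hypothetical.

```python
# Simplified, illustrative subset of a MARC-to-Dublin-Core crosswalk.
# It is an assumption for this sketch, not a complete official mapping.
MARC_TO_DC = {
    "245": "title",
    "260$b": "publisher",
    "260$c": "date",
    "520": "description",
    "650": "subject",
}


def crosswalk(record, mapping=MARC_TO_DC):
    """Translate a flat {field: [values]} record into target elements.

    Returns the translated record plus the source fields that have no
    equivalent in the mapping, so that a person can decide what to do.
    """
    translated = {}
    unmapped = []
    for field, values in record.items():
        target = mapping.get(field)
        if target is None:
            unmapped.append(field)
            continue
        translated.setdefault(target, []).extend(values)
    return translated, unmapped


# Hypothetical catalog record describing a manuscript.
marc_record = {
    "245": ["Letters from the archive"],
    "260$c": ["1897"],
    "650": ["Manuscripts", "Correspondence"],
    "300$a": ["12 leaves"],
}
dc_record, unmapped = crosswalk(marc_record)
print(dc_record)
print(unmapped)
```

Reporting the unmapped fields separately reflects the point made above: where the schemes differ, someone has to decide where the data should go.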

Crosswalks also have several technical capabilities. They help databases using different metadata schemes to share information. They help metadata harvesters create union catalogs. They enable search engines to search multiple databases simultaneously with a single query.


SciCrunch

SciCrunch is a collaboratively edited knowledge base about scientific resources, a community portal for researchers and a content management system for data and databases. It is intended to provide a common source of data to the research community, along with the data about Research Resource Identifiers (RRIDs), which can be used in scientific publications. In some respects, it is for science and scholarly publishing what Wikidata is for Wikimedia Foundation projects. Hosted by the University of California, San Diego, SciCrunch was also designed to help communities of researchers create their own portals to provide access to resources, databases and tools of relevance to their research areas.


Steve.museum

The steve.museum project was a collaborative effort to improve public access to and engagement with US art museum collections. It explored the possibilities of user-generated descriptions of works of art, also known as folksonomy. Project staff in 2011 comprised a group of volunteers, mostly from art museums, including the Guggenheim Museum, the Cleveland Museum of Art, the Metropolitan Museum of Art and the San Francisco Museum of Modern Art, as well as Archives & Museum Informatics.

In a folksonomy, users tag content for the purposes of later retrieval. It allows the public to introduce new search terms, in the form of tags, to the formal library catalog that art and cataloging professionals themselves might not have included. It also allows curators and other museum professionals to see what the public sees in works of art. These terms will enrich the catalog and increase the likelihood that searchers of all levels will find what they are looking for. In the end, it is hoped that museum collections will be fully searchable by keywords rather than just by name or artist. Early results from the project found that a number of tags were applied often, while others were applied just once per work of art.

The project received a $1 million grant from the US Institute of Museum and Library Services, with which the Indianapolis Museum of Art is working to apply folksonomy to its collection. It is one of a number of related projects currently working to make art more accessible and to find its role in the digital age.

WannaCry ransomware attack

The WannaCry ransomware attack was a May 2017 worldwide cyberattack by the WannaCry ransomware cryptoworm, which targeted computers running the Microsoft Windows operating system by encrypting data and demanding ransom payments in the Bitcoin cryptocurrency. It propagated through EternalBlue, an exploit developed by the US National Security Agency (NSA) for older Windows systems that was released by The Shadow Brokers a few months prior to the attack. While Microsoft had released patches previously to close the exploit, much of WannaCry's spread was from organizations that had not applied these, or were using older Windows systems that were past their end-of-life. WannaCry also took advantage of installing backdoors onto infected systems.

The attack was stopped within a few days of its discovery due to emergency patches released by Microsoft, and the discovery of a kill switch that prevented infected computers from spreading WannaCry further. The attack was estimated to have affected more than 200,000 computers across 150 countries, with total damages ranging from hundreds of millions to billions of dollars. Security experts believed from preliminary evaluation of the worm that the attack originated from North Korea or agencies working for the country.

In December 2017, the United States, United Kingdom and Australia formally asserted that North Korea was behind the attack.

A new variant of WannaCry ransomware forced Taiwan Semiconductor Manufacturing Company (TSMC) to temporarily shut down several of its chip-fabrication factories in August 2018. The virus spread to 10,000 machines in TSMC's most advanced facilities.

Website correlation

Website correlation, or website matching, is a process used to identify websites that are similar or related. Websites are inherently easy to duplicate. This has led to a proliferation of identical or very similar websites for purposes ranging from translation to Internet marketing (especially affiliate marketing) to Internet crime. Locating similar websites is inherently problematic because they may be in different languages, on different servers, or in different countries (with different top-level domains).

