Knowledge Graph

The Knowledge Graph is a knowledge base used by Google and its services to enhance its search engine's results with information gathered from a variety of sources. The information is presented to users in an infobox next to the search results. Knowledge Graph infoboxes were added to Google's search engine in May 2012, starting in the United States, with international expansion by the end of the year. The Knowledge Graph was powered in part by Freebase.[1] The information covered by the Knowledge Graph grew significantly after launch, tripling its size within seven months (covering 570 million entities and 18 billion facts[2]) and answering "roughly one-third" of the 100 billion monthly searches Google processed in May 2016. The Knowledge Graph has been criticized for providing answers without source attribution or citation.

Information from the Knowledge Graph is presented as a box, which Google has referred to as the "knowledge panel", to the right (top on mobile) of search results.[3] According to Google, this information is retrieved from many sources, including the CIA World Factbook, Wikidata, and Wikipedia.[4][5] In October 2016, Google announced that the Knowledge Graph held over 70 billion facts.[6] There is no official documentation on the technology used for the Knowledge Graph implementation.[7]

Information from the Knowledge Graph is used to answer direct spoken questions in Google Assistant[8][9] and Google Home voice queries.[10]

Figure: Google knowledge panel showing Knowledge Graph data about Thomas Jefferson displayed on Google Search, as of January 2015.

History

Google announced Knowledge Graph on May 16, 2012, as a way to significantly enhance the value of information returned by Google searches.[4] Initially only available in English, the Knowledge Graph was expanded in December 2012 to Spanish, French, German, Portuguese, Japanese, Russian, and Italian.[11] Support for Bengali was added in March 2017.[12]

In August 2014, New Scientist reported that Google had launched Knowledge Vault, a new initiative to succeed the capabilities of the Knowledge Graph. Unlike a database, which deals with numbers, the Knowledge Vault was meant to deal with facts, automatically gathering and merging information from across the Internet into a knowledge base capable of answering direct questions, such as "Where was Madonna born?" It was reported that its main advantage over the Knowledge Graph was its ability to gather information automatically rather than relying on crowdsourced facts compiled by humans; by the time of the 2014 report, it had collected over 1.6 billion facts, 271 million of which were considered "confident facts", a term for information deemed more than 90% likely to be true.[13] However, after publication, Google reached out to Search Engine Land to explain that Knowledge Vault was a research paper, not an active Google service, and in its report, Search Engine Land referenced indications by the company that "numerous models" were being experimented with to examine the possibility of automatically gathering meaning from text.[14]

Criticism

Lack of source attribution

By May 2016, knowledge boxes were appearing for "roughly one-third" of the estimated 100 billion monthly searches the company processed. Dario Taraborelli, head of research at the Wikimedia Foundation, told The Washington Post that Google's omission of sources in its knowledge boxes "undermines people’s ability to verify information and, ultimately, to develop well-informed opinions". The publication also reported that the boxes are "frequently unattributed", such as a knowledge box on the age of actress Betty White, which is "as unsourced and absolute as if handed down by God".[15]

Declining Wikipedia article readerships

According to The Register, the implementation of direct answers in Google search results has caused significant readership declines for the online encyclopedia Wikipedia, from which the Knowledge Graph obtains some of its information.[16] The Daily Dot noted that "Wikipedia still has no real competitor as far as actual content is concerned. All that's up for grabs are traffic stats. And as a nonprofit, traffic numbers don't equate into revenue in the same way they do for a commercial media site". After the article's publication, a spokesperson for the Wikimedia Foundation, which owns Wikipedia, reached out to state that it "welcomes" the Knowledge Graph functionality, that it was "looking into" the traffic drops, and that "We've also not noticed a significant drop in search engine referrals. We also have a continuing dialog with staff from Google working on the Knowledge Panel".[17]

References

  1. ^ Singhal, Amit (May 16, 2012). "Introducing the Knowledge Graph: things, not strings". Official Google Blog. Google. Retrieved September 6, 2014.
  2. ^ Newton, Casey (December 4, 2012). "Google's Knowledge Graph tripled in size in seven months". CNET. CBS Interactive. Retrieved December 10, 2017.
  3. ^ "Your business information in the Knowledge Panel". Google My Business Help. Google. Retrieved December 10, 2017.
  4. ^ a b Singhal, Amit (May 16, 2012). "Introducing the Knowledge Graph: things, not strings". Official Google Blog. Google. Retrieved December 10, 2017.
  5. ^ Schwartz, Barry (December 17, 2014). "Google's Freebase To Close After Migrating To Wikidata: Knowledge Graph Impact?". Search Engine Roundtable. Retrieved December 10, 2017.
  6. ^ Vincent, James (October 4, 2016). "Apple boasts about sales; Google boasts about how good its AI is". The Verge. Vox Media. Retrieved December 10, 2017.
  7. ^ Ehrlinger, Lisa; Wöß, Wolfram (2016). "Towards a Definition of Knowledge Graphs" (PDF).
  8. ^ Lynley, Matthew (May 18, 2016). "Google unveils Google Assistant, a virtual assistant that's a big upgrade to Google Now". TechCrunch. Oath Inc. Retrieved December 10, 2017.
  9. ^ Kovach, Steve (October 4, 2016). "Google is going to win the next major battle in computing". Business Insider. Axel Springer SE. Retrieved December 10, 2017.
  10. ^ Bohn, Dieter (May 18, 2016). "Google Home: a speaker to finally take on the Amazon Echo". The Verge. Vox Media. Retrieved December 10, 2017.
  11. ^ Newton, Casey (December 14, 2012). "How Google is taking the Knowledge Graph global". CNET. CBS Interactive. Retrieved December 10, 2017.
  12. ^ "Making it easier to Search in Bengali". Official Google India Blog. Retrieved January 26, 2018.
  13. ^ Hodson, Hal (August 20, 2014). "Google's fact-checking bots build vast knowledge bank". New Scientist. Retrieved December 10, 2017.
  14. ^ Sterling, Greg (August 25, 2014). "Google "Knowledge Vault" To Power Future Of Search". Search Engine Land. Retrieved December 10, 2017.
  15. ^ Dewey, Caitlin (May 11, 2016). "You probably haven't even noticed Google's sketchy quest to control the world's knowledge". The Washington Post. Retrieved December 10, 2017.
  16. ^ Orlowski, Andrew (January 13, 2014). "Google stabs Wikipedia in the front". The Register. Retrieved December 10, 2017.
  17. ^ Kloc, Joe (January 8, 2014). "Is Google accidentally killing Wikipedia?". The Daily Dot. Retrieved December 10, 2017.

ABox

In computer science, an ABox is an "assertion component": a fact associated with a conceptual model or ontology within a knowledge base.

The terms ABox and TBox are used to describe two different types of statements in knowledge bases. TBox statements describe a domain of interest by defining classes and properties as a domain vocabulary. ABox statements are TBox-compliant statements that use that vocabulary.

TBox statements are sometimes associated with object-oriented classes, and ABox statements with instances of those classes.

Together ABox and TBox statements make up a knowledge base or a Knowledge Graph.
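The split between the two kinds of statement can be illustrated with a minimal Python sketch. The `KnowledgeBase` class, its method names, and the example vocabulary (`Person`, `Place`, `bornIn`) are hypothetical and chosen only for illustration. Treating TBox compliance as a simple type check is a simplifying assumption; description logic reasoners work differently.

```python
class KnowledgeBase:
    """Toy knowledge base split into a TBox (vocabulary) and an ABox (assertions)."""

    def __init__(self):
        self.classes = set()   # TBox: class names
        self.properties = {}   # TBox: property -> (domain class, range class)
        self.types = {}        # ABox: individual -> class
        self.facts = []        # ABox: (subject, property, object)

    def define_class(self, name):
        self.classes.add(name)

    def define_property(self, name, domain, range_):
        if domain not in self.classes or range_ not in self.classes:
            raise ValueError("domain and range must be defined classes")
        self.properties[name] = (domain, range_)

    def assert_type(self, individual, cls):
        if cls not in self.classes:
            raise ValueError(f"unknown class: {cls}")
        self.types[individual] = cls

    def assert_fact(self, subject, prop, obj):
        # An ABox statement may only use vocabulary defined in the TBox.
        if prop not in self.properties:
            raise ValueError(f"unknown property: {prop}")
        domain, range_ = self.properties[prop]
        if self.types.get(subject) != domain or self.types.get(obj) != range_:
            raise ValueError("assertion does not comply with the TBox")
        self.facts.append((subject, prop, obj))


kb = KnowledgeBase()
# TBox: the domain vocabulary.
kb.define_class("Person")
kb.define_class("Place")
kb.define_property("bornIn", "Person", "Place")
# ABox: statements about individuals that use the vocabulary.
kb.assert_type("Madonna", "Person")
kb.assert_type("BayCity", "Place")
kb.assert_fact("Madonna", "bornIn", "BayCity")
```

In this sketch, reversing the subject and object of `bornIn` raises a `ValueError`, because the assertion no longer matches the domain and range declared in the TBox.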

Android TV

Android TV is a version of the Android operating system designed for digital media players. As a replacement for Google TV, it features a user interface designed around content discovery and voice search, surfacing content aggregated from various media apps and services, and integration with other recent Google technologies such as Assistant, Cast, and Knowledge Graph.

The platform was first unveiled in June 2014, and its launch device, the Nexus Player, was unveiled that October. The platform has also been adopted as smart TV middleware by companies such as Sony and Sharp, while Android TV products have been deployed as set-top boxes by a number of IPTV television providers.

DARPA Agent Markup Language

The DARPA Agent Markup Language (DAML) was the name of a US funding program at the US Defense Advanced Research Projects Agency (DARPA) started in 1999 by then-Program Manager James Hendler, and later run by Murray Burke, Mark Greaves and Michael Pagels. The program focused on the creation of machine-readable representations for the Web. One of the investigators working on the program was Tim Berners-Lee. To a great degree through his influence, and in collaboration with the program managers, the effort created technologies and demonstrations for what is now called the Semantic Web, which in turn led to the growth of Knowledge Graph technology.

A primary outcome of the DAML program was the DAML language, an agent markup language based on RDF. This language was then followed by an extension entitled DAML+OIL which included researchers outside of the DARPA program in the design. The 2002 submission of the DAML+OIL language to the World Wide Web Consortium (W3C) captures the work done by DAML contractors and the EU/U.S. ad hoc Joint Committee on Markup Languages. This submission was the starting point for the language (later called OWL) to be developed by W3C's web ontology working group, WebOnt.

DAML+OIL was a syntax, layered on RDF and XML, that could be used to describe sets of facts making up an ontology.

DAML+OIL had its roots in three main languages: DAML, as described above; OIL (Ontology Inference Layer); and SHOE, an earlier US research project.

A major innovation of the languages was to use RDF and XML as a basis, and to use RDF namespaces to organize and assist with the integration of arbitrarily many different and incompatible ontologies.
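The role of namespaces in keeping incompatible vocabularies apart can be sketched in a few lines of Python. This is an illustration only, not DAML+OIL syntax: the namespace URIs, the `qualify` helper, and the example facts are all hypothetical. The one assumption taken from RDF is that a term's full identifier is formed by joining a namespace URI with a local name.

```python
# Hypothetical namespace URIs, chosen only for this illustration.
LIBRARY_NS = "http://example.org/library#"
STAFF_NS = "http://example.org/staff#"


def qualify(namespace, local_name):
    """Expand a local name into a full identifier by prefixing its namespace."""
    return namespace + local_name


# Both vocabularies define a term called "title", with unrelated meanings.
library_facts = [
    (qualify(LIBRARY_NS, "book1"), qualify(LIBRARY_NS, "title"), "Moby-Dick"),
]
staff_facts = [
    (qualify(STAFF_NS, "person1"), qualify(STAFF_NS, "title"), "Professor"),
]

# Merging is plain concatenation; the qualified names keep the terms apart.
merged = library_facts + staff_facts
predicates = {predicate for _, predicate, _ in merged}
book_titles = [obj for _, predicate, obj in merged
               if predicate == qualify(LIBRARY_NS, "title")]
```

After the merge, `predicates` still contains two distinct `title` identifiers, so a query for book titles does not pick up job titles.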

Articulation ontologies can link these competing ontologies through codification of analogous subsets in a neutral point of view, as is done in Wikipedia.

Current ontology research derived in part from DAML is leading toward the expression of ontologies and rules for reasoning and action.

Much of the work in DAML has now been incorporated into RDF Schema, OWL, and their successor languages and technologies, including schema.org.

Diffbot

Diffbot is a developer of machine learning and computer vision algorithms and public APIs for extracting data from web pages / web scraping. The company was founded in 2008 at Stanford University and was the first company funded by StartX (then Stanford Student Enterprises), Stanford's on-campus venture capital fund.

The company has gained interest from its application of computer vision technology to web pages, wherein it visually parses a web page for important elements and returns them in a structured format. In 2015 Diffbot announced it was working on its version of an automated "Knowledge Graph" by crawling the web and using its automatic web page extraction to build a large database of structured web data.

The company's products allow software developers to analyze web home pages and article pages, and extract the "important information" while ignoring elements deemed not core to the primary content.

In August 2012 the company released its Page Classifier API, which automatically categorizes web pages into specific "page types". As part of this, Diffbot analyzed 750,000 web pages shared on the social media service Twitter and revealed that photos, followed by articles and videos, are the predominant web media shared on the social network.

The company raised $2 million in funding in May 2012 from investors including Andy Bechtolsheim and Sky Dayton.

Diffbot's customers include Adobe, AOL, Cisco, DuckDuckGo, eBay, Instapaper, Microsoft, Onswipe and Springpad.

Freebase

Freebase was a large collaborative knowledge base consisting of data composed mainly by its community members. It was an online collection of structured data harvested from many sources, including individual, user-submitted wiki contributions. Freebase aimed to create a global resource that allowed people (and machines) to access common information more effectively. It was developed by the American software company Metaweb and ran publicly beginning in March 2007. Metaweb was acquired by Google in a private sale announced 16 July 2010. Google's Knowledge Graph was powered in part by Freebase.

Freebase data was available for commercial and non-commercial use under a Creative Commons Attribution License, and an open API, an RDF endpoint, and a database dump were provided for programmers.

On 16 December 2014, Google announced that it would shut down Freebase over the succeeding six months and help with the move of the data from Freebase to Wikidata. On 16 December 2015, Google officially announced the Knowledge Graph API, which is meant to be a replacement for the Freebase API. Freebase.com was officially shut down on 2 May 2016. On 8 September 2018, Google published the source code of the graphd server, the Freebase backend, on GitHub.

Google Hummingbird

Hummingbird is the codename given to a significant algorithm change in Google Search in 2013. Its name was derived from the speed and accuracy of the hummingbird. The change was announced on September 26, 2013, having already been in use for a month. "Hummingbird" places greater emphasis on natural language queries, considering context and meaning over individual keywords. It also looks deeper at content on individual pages of a website, with improved ability to lead users directly to the most appropriate page rather than just a website's homepage.

The upgrade marked the most significant change to Google search in years, with more "human" search interactions and a much heavier focus on conversation and meaning. Thus, web developers and writers were encouraged to optimize their sites with natural writing rather than forced keywords, and make effective use of technical web development for on-site navigation.

Google Now

Google Now was a feature of Google Search in the Google app for Android and iOS. Google Now proactively delivered information to users in the form of informational cards, predicting (based on search habits and other factors) what they might need. Google Now branding is no longer used, but the functionality continues in the Google app and its feed.

Google first included Google Now in Android 4.1 ("Jelly Bean"), which launched on July 9, 2012, and the Galaxy Nexus smartphone was first to support it. The service became available for iOS on April 29, 2013, without most of its features. In 2014, Google added Now cards to the notification center in Chrome OS and in the Chrome browser. Later, however, Google removed the notification center entirely from Chrome. Popular Science named Google Now the "Innovation of the Year" for 2012.

From 2015, Google gradually phased out references to "Google Now" in the Google app, largely removing the remaining uses of "Now" in October 2016, including replacing Now cards with Feed. At Google I/O 2016, Google showcased its new intelligent personal assistant Google Assistant, in some ways an evolution of Google Now. Unlike Google Now, however, Assistant can engage in two-way dialogue with the user.

Google Search

Google Search, also referred to as Google Web Search or simply Google, is a web search engine developed by Google LLC. It is the most used search engine on the World Wide Web across all platforms, with 92.74% market share as of October 2018, handling more than 3.5 billion searches each day.

The order of search results returned by Google is based, in part, on a priority rank system called "PageRank". Google Search also provides many different options for customized search, using symbols to include, exclude, specify or require certain search behavior, and offers specialized interactive experiences, such as flight status and package tracking, weather forecasts, currency, unit and time conversions, word definitions, and more.

The main purpose of Google Search is to hunt for text in publicly accessible documents offered by web servers, as opposed to other data, such as images or data contained in databases. It was originally developed by Larry Page and Sergey Brin in 1997. In June 2011, Google introduced "Google Voice Search" to search for spoken, rather than typed, words. In May 2012, Google introduced a Knowledge Graph semantic search feature in the U.S.

Analysis of the frequency of search terms may indicate economic, social and health trends. Data about the frequency of use of search terms on Google can be openly queried via Google Trends; these data have been shown to correlate with flu outbreaks and unemployment levels, and to provide the information faster than traditional reporting methods and surveys. As of mid-2016, Google's search engine has begun to rely on deep neural networks.

Competitors of Google include Baidu and Soso.com in China; Naver.com and Daum.net in South Korea; Yandex in Russia; Seznam.cz in the Czech Republic; Yahoo in Japan, Taiwan and the US, as well as Bing and DuckDuckGo. Some smaller search engines offer facilities not available with Google, e.g. not storing any private or tracking information.

As of July 2018, Microsoft Sites handled 24.2 percent of all search queries in the United States. During the same period, Oath (formerly known as Yahoo) had a search market share of 11.5 percent. Market leader Google generated 63.2 percent of all core search queries in the United States.

Google TV

Google TV is a discontinued smart TV platform from Google co-developed by Intel, Sony, and Logitech that was launched in October 2010 with official devices initially made by Sony and Logitech. Google TV integrates the Android operating system and the Google Chrome web browser to create an interactive television overlay on top of existing online video sites to add a 10-foot user interface, for a smart TV experience.

Google TV's first generation devices were all based on x86 architecture processors and were created and commercialized by Sony and Logitech. The second generation of devices are all based on ARM architecture processors and were produced with additional partners including LG, Samsung, Vizio and Hisense. In 2013, more second generation Google TV-supported devices were announced by new partners, including Hisense, Netgear, TCL, and Asus, some of which included 3D video support.

Google TV was succeeded in June 2014 by Android TV, a newer platform which shares closer ties with the Android platform and has a revamped user experience integrating with Knowledge Graph, and providing casting support from mobile devices. While a "small subset" of Google TV devices will be upgraded to the Android TV platform, the majority will not. In June 2014, the Google TV SDK was no longer available, ending any future software development for existing devices and effectively deprecating the platform.

International Semantic Web Conference

The International Semantic Web Conference (ISWC) is a series of academic conferences and the premier international forum for the Semantic Web, Linked Data and Knowledge Graph community. Here, scientists, industry specialists, and practitioners meet to discuss the future of practical, scalable, user-friendly, and game-changing solutions. Its proceedings are published in the Lecture Notes in Computer Science by Springer-Verlag.

Knowledge engine

Knowledge engine or Knowledge Engine may refer to:

Wolfram Alpha, a computational knowledge engine or answer engine developed by Wolfram Research

Knowledge Engine (Wikimedia Foundation), a search engine project by the Wikimedia Foundation

Knowledge graph, the concept in information science

Knowledge Graph, a knowledge base used by Google to enhance its search engine's search results with semantic search information gathered from a wide variety of sources

LawMoose

LawMoose, launched in September 2000, is believed to have been the first U.S. regional legal search engine operating its own independent web crawler.

Initially LawMoose provided a searchable index drawn from Minnesota law and government sites. Later, it added a similar capability for Wisconsin law sites and select general legal reference starting point sites.

LawMoose has since evolved into a hybrid bi-level public and subscription legal knowledge environment, featuring a thesaurus-based topical map of legal and governmental web resources (which spans the U.S. and globe and adds non-legal resources in a subscriber edition), a list of the largest one hundred Minnesota law firms, ranked by number of Minnesota lawyers, the Minnesota Legal Periodical Index, listing and topically categorizing more than 39,000 articles published in Minnesota legal publications from 1984 to the present (in the public edition), and a densely interconnected, constantly evolving legal words, phrases, concepts and resources knowledge graph (in a subscriber edition).

LawMoose's legal words, phrases, concepts and resources knowledge graph consists of more than 337,000 legal, governmental, business, insurance, and popular terms, interconnected through more than 1,425,000 semantic relationships. Interconnections are based on a relationships vocabulary of 300 relationship types. This multi-dimensional intellectual network functions as a navigable intellectual model of law and law practice.

This semantic, intellectual network-based approach to organizing legal knowledge, legal diagnostic and problem solving concepts, law practice processes, and conceptually locating legal and law practice resources is a significant departure from traditional hierarchical, case law-specific legal taxonomies, such as the taxonomy utilized by the West American Digest System, and from collections of searchable primary law.

The Minnesota Legal Periodical Index has been continuously maintained by the Minnesota State Law Library since 1984. Since 2002, it has appeared on LawMoose through a collaboration with LawMoose publisher, Pritchard Law Webs, Minneapolis, Minnesota.

Ontology (information science)

In computer science and information science, an ontology encompasses a representation, formal naming, and definition of the categories, properties, and relations between the concepts, data, and entities that substantiate one, many, or all domains.

Every field creates ontologies to limit complexity and organize information into data and knowledge. As new ontologies are made, their use hopefully improves problem solving within that domain. Translating research papers within every field is a problem made easier when experts from different countries maintain a controlled vocabulary of jargon between each of their languages.

Since Google started an initiative called Knowledge Graph, a substantial amount of research has used the phrase knowledge graph as a generalized term. Although there is no clear definition for the term knowledge graph, it is sometimes used as a synonym for ontology. One common interpretation is that a knowledge graph represents a collection of interlinked descriptions of entities – real-world objects, events, situations or abstract concepts. Unlike ontologies, knowledge graphs, such as Google's Knowledge Graph, often contain large volumes of factual information with less formal semantics. In some contexts, the term knowledge graph is used to refer to any knowledge base that is represented as a graph.

Ontotext

Ontotext is a Bulgarian software company headquartered in Sofia. It is the semantic technology branch of Sirma Group. Its main domain of activity is the development of software based on the Semantic Web languages and standards, in particular RDF, OWL and SPARQL. Ontotext is best known for the Ontotext GraphDB semantic graph database engine. Another major business line is the development of enterprise knowledge management and analytics systems that involve big knowledge graphs. Those systems are developed on top of the Ontotext Platform that builds on top of GraphDB capabilities for text mining using big knowledge graphs.

Together with the BBC, Ontotext developed one of the early large-scale industrial semantic applications, Dynamic Semantic Publishing, starting in 2010.

Ontotext content management systems deliver semantic tagging, classification, recommendation, search and discovery services. Typically they involve semantic data integration that results in a big knowledge graph, which combines proprietary master data with open data and commercially available datasets. These big knowledge graphs are used to provide context about the corresponding domain and semantic profiles of the key concepts and entities in it.

Search engine results page

Search engine results pages (SERPs) are the pages displayed by search engines in response to a query by a searcher. The main component of the SERP is the listing of results that are returned by the search engine in response to a keyword query, although the pages may also contain other results such as advertisements.

The results are of two general types, organic search (i.e., retrieved by the search engine's algorithm) and sponsored search (i.e., advertisements). The results are normally ranked by relevance to the query. Each result displayed on the SERP normally includes a title, a link that points to the actual page on the Web, and a short description showing where the keywords have matched content within the page for organic results. For sponsored results, the advertiser chooses what to display.

Because a huge number of items may be available for or related to a query, a single search usually yields several pages of results, as the search engine or the user's preferences restrict viewing to a subset of results per page. Each succeeding page tends to have lower-ranking or less relevant results. As in traditional print media and its advertising, this enables competitive pricing for page real estate, but it is complicated by the dynamics of consumer expectations and intent. In static print media, the content and the advertising on every page are the same all of the time for all viewers, even though such hard copy is localized to some degree (usually geographically, by state, metro area, city, or neighborhood); search engine results, by contrast, can vary based on individual factors such as browsing habits.

Semantic network

A semantic network, or frame network, is a knowledge base that represents semantic relations between concepts in a network. This is often used as a form of knowledge representation. It is a directed or undirected graph consisting of vertices, which represent concepts, and edges, which represent semantic relations between concepts, mapping or connecting semantic fields.

Typical standardized semantic networks are expressed as semantic triples.

Semantic networks are used in natural language processing applications such as semantic parsing and word-sense disambiguation.
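The triple representation described above can be sketched in Python as follows. This is a minimal illustration, not a standard library or format: the example concepts, the relation names (`is_a`, `has_part`), and the helper functions `build_graph` and `ancestors` are all hypothetical.

```python
from collections import defaultdict

# Each triple is (subject concept, relation, object concept).
triples = [
    ("cat", "is_a", "mammal"),
    ("mammal", "is_a", "animal"),
    ("cat", "has_part", "fur"),
]


def build_graph(triples):
    """Index the triples as a directed graph: concept -> [(relation, concept)]."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph


def ancestors(graph, concept, relation="is_a"):
    """Follow edges of one relation type transitively from a starting concept."""
    found = []
    seen = {concept}
    stack = [concept]
    while stack:
        node = stack.pop()
        for rel, target in graph.get(node, []):
            if rel == relation and target not in seen:
                seen.add(target)
                found.append(target)
                stack.append(target)
    return found


graph = build_graph(triples)
cat_ancestors = ancestors(graph, "cat")  # ["mammal", "animal"]
```

Concepts are the vertices and each triple contributes one labeled edge, so a question such as "what kinds of thing is a cat?" becomes a traversal restricted to `is_a` edges.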

Timeline of Google Search

Google Search, offered by Google, is the most widely used search engine on the World Wide Web as of 2014, with over three billion searches a day. This page covers key events in the history of Google's search service.

For a history of Google the company, including all of Google's products, acquisitions, and corporate changes, see the history of Google page.

Wikipediocracy

Wikipediocracy is a website for discussion and criticism of Wikipedia. Its members have brought information about Wikipedia's controversies to the attention of the media. The site was founded in March 2012 by users of Wikipedia Review, another site critical of Wikipedia.

The site is "known for digging up dirt on Wikipedia's top brass", wrote reporter Kevin Morris in the Daily Dot. Novelist Amanda Filipacchi wrote in The Wall Street Journal that the site "intelligently discusses and entertainingly lambastes Wikipedia’s problematic practices".

Yummly

Yummly is a mobile app and website that provides recipe recommendations personalized to the individual's tastes, semantic recipe search, a digital recipe box, shopping list and one-hour grocery delivery. The Yummly app is available for iOS, Android and web browsers. The Yummly app was named "Best of 2014" in Apple's App Store.

Yummly uses patent-pending technology and a hand-curated knowledge graph to offer a semantic web search engine for food, cooking and recipes. Yummly allows users to search by ingredient, diet, allergy, nutrition, price, cuisine, time, taste, meal courses and sources, and "learns" about users based on their likes and dislikes. Yummly uses this information to categorize food for search and make recommendations.

Yummly is located in Redwood City, California, previously at 165 University Avenue, the former home of other successful internet companies.

In 2014, Yummly had 15 million active users in the US and had launched international websites in the UK, Germany and the Netherlands.

In May 2017, the company was acquired by appliance maker Whirlpool Corporation. The site will continue to operate as a subsidiary, keeping its current head office.

This page is based on a Wikipedia article written by its authors.
Text is available under the CC BY-SA 3.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.