GRDDL

GRDDL (pronounced "griddle") is a markup format for Gleaning Resource Descriptions from Dialects of Languages. It is a W3C Recommendation that enables users to obtain RDF triples from XML documents, including XHTML. The GRDDL specification shows examples using XSLT; however, it was intended to be abstract enough to allow other implementations as well. It became a Recommendation on September 11, 2007.[1]

Mechanism

XHTML and transformations

A document specifies its associated transformations in one of several ways.

For instance, an XHTML document may contain the following markup:

<head profile="http://www.w3.org/2003/g/data-view
		http://dublincore.org/documents/dcq-html/
		http://gmpg.org/xfn/11">

<link rel="transformation" href="grokXFN.xsl" />

The page informs document consumers that GRDDL transformations are available by including the following URI in the profile attribute of the head element:

http://www.w3.org/2003/g/data-view

The available transformations are revealed through one or more link elements:

<link rel="transformation" href="grokXFN.xsl" />

This code is valid for XHTML 1.x only. The profile attribute has been dropped in HTML5, including its XML serialisation.
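The discovery step described above can be sketched in Python. This is a minimal illustration using only the standard library, not a complete GRDDL-aware agent: the function name find_transformations and the SAMPLE document are hypothetical, and the sketch only lists the transformation links without fetching or running them.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"


def find_transformations(xhtml_text):
    """Return the href of every rel="transformation" link element.

    Returns an empty list when the head element does not declare the
    GRDDL profile. (Hypothetical helper, for illustration only.)
    """
    root = ET.fromstring(xhtml_text)
    head = root.find(f"{{{XHTML_NS}}}head")
    if head is None:
        return []
    # The profile attribute holds a whitespace-separated list of URIs.
    profiles = head.get("profile", "").split()
    if GRDDL_PROFILE not in profiles:
        return []
    hrefs = []
    for link in root.iter(f"{{{XHTML_NS}}}link"):
        # rel is also a whitespace-separated list of tokens.
        if "transformation" in link.get("rel", "").split() and link.get("href"):
            hrefs.append(link.get("href"))
    return hrefs


# Hypothetical sample page modelled on the markup shown above.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view
                 http://gmpg.org/xfn/11">
    <title>Sample</title>
    <link rel="transformation" href="grokXFN.xsl" />
  </head>
  <body></body>
</html>"""

print(find_transformations(SAMPLE))
```

A real agent would then resolve each href against the document's base URI, fetch the stylesheet, and apply it to the document to obtain RDF.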

Microformats and profile transformations

If an XHTML page contains microformats, there is usually a specific profile for them.

For instance, a document with hCard information should have:

<head profile="http://www.w3.org/2003/g/data-view http://www.w3.org/2006/03/hcard">

When fetched, http://www.w3.org/2006/03/hcard has:

<head profile="http://www.w3.org/2003/g/data-view">

and

<p>Use of this profile licenses RDF data extracted by
   <a rel="profileTransformation" href="../vcard/hcard2rdf.xsl">hcard2rdf.xsl</a>
    from <a href="http://www.w3.org/2006/vcard/ns">the 2006 vCard/RDF work</a>.
</p>

A GRDDL-aware agent can then use that profileTransformation to extract all hCard data from pages that reference that profile.
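The profile lookup can be sketched as follows. This is a hedged illustration under stated assumptions: the helper name profile_transformations is hypothetical, PROFILE_DOC is a shortened stand-in for the document served at the profile URI (nothing is fetched over the network), and relative links are resolved against the profile URI with the standard library's urljoin.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XHTML_NS = "http://www.w3.org/1999/xhtml"


def profile_transformations(profile_uri, profile_text):
    """Return absolute URIs of rel="profileTransformation" links.

    profile_text is the XHTML already retrieved from profile_uri.
    (Hypothetical helper, for illustration only.)
    """
    root = ET.fromstring(profile_text)
    found = []
    for tag in ("a", "link"):
        for el in root.iter(f"{{{XHTML_NS}}}{tag}"):
            rel_tokens = el.get("rel", "").split()
            if "profileTransformation" in rel_tokens and el.get("href"):
                # Resolve relative hrefs against the profile's own URI.
                found.append(urljoin(profile_uri, el.get("href")))
    return found


# Shortened stand-in for the hCard profile document quoted above.
PROFILE_DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view">
    <title>hCard profile</title>
  </head>
  <body>
    <p>Use of this profile licenses RDF data extracted by
      <a rel="profileTransformation" href="../vcard/hcard2rdf.xsl">hcard2rdf.xsl</a>.
    </p>
  </body>
</html>"""

print(profile_transformations("http://www.w3.org/2006/03/hcard", PROFILE_DOC))
```

The agent would apply the resulting stylesheet to every page that lists the hCard profile in its head element, so individual pages do not need their own transformation link.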

XML and transformations

In a similar fashion to XHTML, GRDDL transformations can be attached to XML documents.

XML namespace transformations

Just as a profile can, an XML namespace can have a transformation associated with it.

This allows entire XML dialects (for instance, KML or Atom) to provide meaningful RDF.

An XML document simply points to a namespace:

<foo xmlns="http://example.com/1.0/">
   <!-- document content here -->
</foo>

and when fetched, http://example.com/1.0/ points to a namespaceTransformation.

This also allows very large amounts of the existing XML data in the wild to become RDF/XML with minimal effort from the namespace author.
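As a rough sketch of the first step an agent would take for namespace transformations, the snippet below reads the namespace URI of a document's root element. The function name root_namespace is hypothetical, and the sketch stops before dereferencing the namespace document and looking for its namespaceTransformation.

```python
import xml.etree.ElementTree as ET


def root_namespace(xml_text):
    """Return the namespace URI of the root element, or None if it has none.

    (Hypothetical helper, for illustration only.)
    """
    tag = ET.fromstring(xml_text).tag
    # ElementTree writes qualified names as "{namespace-uri}local-name".
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return None


SAMPLE = '<foo xmlns="http://example.com/1.0/"><!-- document content here --></foo>'

print(root_namespace(SAMPLE))
```

The returned URI is what the agent would fetch in order to discover the transformation associated with the whole dialect.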

Output

Once a document has been transformed, there is an RDF representation of that data.

This output is generally put into a database and queried via SPARQL.

Implementations

GRDDL consumers (also known as GRDDL-aware agents)

See also

  • Microformats – a simplified approach to semantically annotate data in websites
  • RDFa – a W3C Recommendation for annotating websites with RDF data
  • eRDF – an alternative to RDFa

References

  1. W3C press release announcing that GRDDL reached Recommendation status.
  • Kerner, Sean Michael (2006-10-26). "W3C Looks to GRDDL For Semantic Web Sense". internetnews.com.

Acct (protocol)

The acct URI scheme is a proposed internet standard published by the Internet Engineering Task Force, defined by RFC 7565. The purpose of the scheme is to identify, rather than interact with, user accounts hosted by a service provider. This scheme differs from the DNS name, which specifies the service provider. The acct URI was intended to be the single URI scheme that would return information about a person (or possibly a thing) that holds an account at a given domain.

Brian Suda

Brian Suda (born 29 May 1979, St. Louis, Missouri) is an American informatician living in Reykjavík, Iceland.

Suda received a bachelor's degree in computer science from St. Louis University in 2001 and a master's degree in informatics from the University of Edinburgh in 2003. Much of his adult life has been spent abroad, first in Scotland and then in Iceland, where in 2008 he was one of three founders of Skólapúlsinn, a company that helps Icelandic schools measure the engagement, academic ability, and well-being of students. Suda was an invited expert in the W3C's GRDDL working group in 2008, co-author of the hCard microformat specification, and in 2010 wrote a book, A Practical Guide to Designing with Data, published by Five Simple Steps. He has written for many online and print publications including A List Apart, Linux Format, Viðskiptablaðið, and SitePoint.

Embedded RDF

Embedded RDF (eRDF) is a syntax for writing HTML in such a way that the information in the HTML document can be extracted (with an eRDF parser or XSLT style sheet) into Resource Description Framework (RDF). This can be of great use for searching within data.

It was invented by Ian Davis in 2005, and partly inspired by microformats, a simplified approach to semantically annotate data in websites. This specification is obsolete, superseded by RDFa, Microdata, and JSON-LD.

Internationalized Resource Identifier

The Internationalized Resource Identifier (IRI) is an internet protocol standard that extends the ASCII character subset of the Uniform Resource Identifier (URI) protocol. It was defined by the Internet Engineering Task Force (IETF) in 2005 as a new internet standard to extend the existing URI scheme. The primary standard is defined by RFC 3987. While URIs are limited to a subset of the ASCII character set, IRIs may contain characters from the Universal Character Set (Unicode/ISO 10646), including Chinese or Japanese kanji, Korean, Cyrillic characters, and so forth.

Microformat

A microformat (sometimes abbreviated μF) is a World Wide Web-based approach to semantic markup which uses HTML/XHTML tags supported for other purposes to convey additional metadata and other attributes in web pages and other contexts that support (X)HTML, such as RSS. This approach allows software to process information intended for end-users (such as contact information, geographic coordinates, calendar events, and similar information) automatically.

Although the content of web pages has been capable of some "automated processing" since the inception of the web, such processing is difficult because the markup tags used to display information on the web do not describe what the information means. Microformats can bridge this gap by attaching semantics, and thereby obviate other, more complicated, methods of automated processing, such as natural language processing or screen scraping. The use, adoption and processing of microformats enables data items to be indexed, searched for, saved or cross-referenced, so that information can be reused or combined. As of 2013, microformats allow the encoding and extraction of event details, contact information, social relationships and similar information.

Open data

Open data is the idea that some data should be freely available to everyone to use and republish as they wish, without restrictions from copyright, patents or other mechanisms of control. The goals of the open-source data movement are similar to those of other "open(-source)" movements such as open-source software, hardware, open content, open education, open educational resources, open government, open knowledge, open access, open science, and the open web. Paradoxically, the growth of the open data movement is paralleled by a rise in intellectual property rights. The philosophy behind open data has been long established (for example in the Mertonian tradition of science), but the term "open data" itself is recent, gaining popularity with the rise of the Internet and World Wide Web and, especially, with the launch of open-data government initiatives such as Data.gov, Data.gov.uk and Data.gov.in.

Open data can also be linked data; when it is, it is linked open data. One of the most important forms of open data is open government data (OGD), which is a form of open data created by ruling government institutions. Open government data's importance is born from it being a part of citizens' everyday lives, down to the most routine and mundane tasks that are seemingly far removed from government.

Open science data

Open science data is a type of open data focused on making observations and results of scientific activities available for anyone to analyze and reuse. A major purpose of the drive for open data is to allow the verification of scientific claims, by allowing others to look at the reproducibility of results, and to allow data from many sources to be integrated to give new knowledge. While the idea of open science data has been actively promoted since the 1950s, the rise of the Internet has significantly lowered the cost and time required to publish or obtain data.

RDFa

RDFa (or Resource Description Framework in Attributes) is a W3C Recommendation that adds a set of attribute-level extensions to HTML, XHTML and various XML-based document types for embedding rich metadata within Web documents. The RDF data-model mapping enables its use for embedding RDF subject-predicate-object expressions within XHTML documents. It also enables the extraction of RDF model triples by compliant user agents.

The RDFa community runs a wiki website to host tools, examples, and tutorials.

Resource Description Framework

The Resource Description Framework (RDF) is a family of World Wide Web Consortium (W3C) specifications originally designed as a metadata data model. It has come to be used as a general method for conceptual description or modeling of information that is implemented in web resources, using a variety of syntax notations and data serialization formats. It is also used in knowledge management applications.

RDF was adopted as a W3C recommendation in 1999. The RDF 1.0 specification was published in 2004, the RDF 1.1 specification in 2014.

Semantic Web

The Semantic Web is an extension of the World Wide Web through standards by the World Wide Web Consortium (W3C). The standards promote common data formats and exchange protocols on the Web, most fundamentally the Resource Description Framework (RDF). According to the W3C, "The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries". The Semantic Web is therefore regarded as an integrator across different content, information applications and systems.

The term was coined by Tim Berners-Lee for a web of data (or data web) that can be processed by machines, that is, one in which much of the meaning is machine-readable. While its critics have questioned its feasibility, proponents argue that applications in library and information science, industry, biology and human sciences research have already proven the validity of the original concept. Berners-Lee originally expressed his vision of the Semantic Web as follows:

I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A "Semantic Web", which makes this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The "intelligent agents" people have touted for ages will finally materialize.

The 2001 Scientific American article by Berners-Lee, Hendler, and Lassila described an expected evolution of the existing Web to a Semantic Web. In 2006, Berners-Lee and colleagues stated that: "This simple idea…remains largely unrealized".

In 2013, more than four million Web domains contained Semantic Web markup.

Uniform Resource Identifier

A Uniform Resource Identifier (URI) is a string of characters that unambiguously identifies a particular resource. To guarantee uniformity, all URIs follow a predefined set of syntax rules, but also maintain extensibility through a separately defined hierarchical naming scheme (e.g. http://).

Such identification enables interaction with representations of the resource over a network, typically the World Wide Web, using specific protocols. Schemes specifying a concrete syntax and associated protocols define each URI. The most common form of URI is the Uniform Resource Locator (URL), frequently referred to informally as a web address. More rarely seen in usage is the Uniform Resource Name (URN), which was designed to complement URLs by providing a mechanism for the identification of resources in particular namespaces.

Web standards

Web standards are the formal, non-proprietary standards and other technical specifications that define and describe aspects of the World Wide Web. In recent years, the term has been more frequently associated with the trend of endorsing a set of standardized best practices for building web sites, and a philosophy of web design and development that includes those methods.

World Wide Web Consortium

The World Wide Web Consortium (W3C) is the main international standards organization for the World Wide Web (abbreviated WWW or W3).

Founded and currently led by Tim Berners-Lee, the consortium is made up of member organizations which maintain full-time staff for the purpose of working together in the development of standards for the World Wide Web. As of 19 November 2018, the World Wide Web Consortium (W3C) has 476 members. The W3C also engages in education and outreach, develops software and serves as an open forum for discussion about the Web.


This page is based on a Wikipedia article written by authors (here).
Text is available under the CC BY-SA 3.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.