Edgar F. Codd

Edgar Frank "Ted" Codd (19 August 1923 – 18 April 2003) was an English computer scientist who, while working for IBM, invented the relational model for database management, the theoretical basis for relational databases and relational database management systems. He made other valuable contributions to computer science, but the relational model, a very influential general theory of data management, remains his most mentioned, analyzed and celebrated achievement.[6][7]

Edgar "Ted" Codd
Born: Edgar Frank Codd, 19 August 1923[1][2]
Died: 18 April 2003 (aged 79), Williams Island, Aventura, Florida, USA
Alma mater: Exeter College, Oxford; University of Michigan
Known for: OLAP; Relational model[3]; Codd's cellular automaton; Codd's 12 rules; Boyce–Codd normal form
Awards: Turing Award (1981)[4]
Scientific career
Fields: Computer Science
Institutions: University of Oxford; University of Michigan; IBM
Thesis: Propagation, Computation, and Construction in Two-dimensional cellular spaces (1965)
Doctoral advisor: John Henry Holland[5]

Biography

Edgar Frank Codd was born in Fortuneswell, on the Isle of Portland in Dorset, England. After attending Poole Grammar School, he studied mathematics and chemistry at Exeter College, Oxford, before serving as a pilot in the RAF Coastal Command during the Second World War, flying Sunderlands.[8] In 1948, he moved to New York to work for IBM as a mathematical programmer. In 1953, angered by Senator Joseph McCarthy, Codd moved to Ottawa, Ontario, Canada. In 1957 he returned to the US to work for IBM, and from 1961 to 1965 he pursued his doctorate in computer science at the University of Michigan in Ann Arbor. Two years later he moved to San Jose, California, to work at IBM's San Jose Research Laboratory, where he continued to work until the 1980s.[1][9] He was appointed IBM Fellow in 1976. During the 1990s, his health deteriorated and he ceased work.[10]

Codd received the Turing Award in 1981,[1] and in 1994 he was inducted as a Fellow of the Association for Computing Machinery.[11]

Codd died of heart failure at his home in Williams Island, Florida, at the age of 79 on 18 April 2003.[12]

Work

Codd received a PhD in 1965 from the University of Michigan, Ann Arbor, advised by John Henry Holland.[5][10][13] His thesis was about self-replication in cellular automata, extending the work of von Neumann and showing that a set of eight states was sufficient for universal computation and construction.[14] His design for a self-replicating computer was only implemented in 2010.

In the 1960s and 1970s he worked out his theories of data arrangement, issuing his paper "A Relational Model of Data for Large Shared Data Banks"[3] in 1970, after an internal IBM paper one year earlier.[15] To his disappointment, IBM proved slow to exploit his suggestions until commercial rivals started implementing them.

Initially, IBM refused to implement the relational model in order to preserve revenue from IMS/DB. Codd then showed IBM customers the potential of an implementation of his model, and they in turn pressured IBM. IBM then included a System R subproject in its Future Systems project, but put it in the charge of developers who were not thoroughly familiar with Codd's ideas, and isolated the team from Codd. As a result, they did not use Codd's own Alpha language but created a non-relational one, SEQUEL. Even so, SEQUEL was so superior to pre-relational systems that in 1979 Larry Ellison, of Relational Software Inc, copied it in his Oracle Database, working from pre-launch papers presented at conferences. Oracle Database reached the market before SQL/DS. Because the original name was by then already proprietary, SEQUEL had been renamed SQL.

Codd continued to develop and extend his relational model, sometimes in collaboration with Christopher J. Date. One of the normalised forms, the Boyce–Codd normal form, is named after him.

Codd's theorem, a result proven in his seminal work on the relational model, equates the expressive power of relational algebra and relational calculus (both of which, lacking recursion, are strictly less powerful than first-order logic).

As the relational model started to become fashionable in the early 1980s, Codd fought a sometimes bitter campaign to prevent the term being misused by database vendors who had merely added a relational veneer to older technology. As part of this campaign, he published his 12 rules to define what constituted a relational database. This made his position in IBM increasingly difficult, so he left to form his own consulting company with Chris Date and others.

Codd coined the term online analytical processing (OLAP) and wrote the "twelve laws of online analytical processing".[16] Controversy erupted, however, after it was discovered that this paper had been sponsored by Arbor Software (subsequently Hyperion, now acquired by Oracle), a conflict of interest that had not been disclosed, and Computerworld withdrew the paper.[17]

In 2004, SIGMOD renamed its highest prize to the SIGMOD Edgar F. Codd Innovations Award, in his honour.

Publications

  • Codd, E. F. (1968). Cellular Automata. Academic Press, Inc. LCCN 68-23486.
  • Codd, E. F. (1970). "Relational Completeness of Data Base Sublanguages". Database Systems: 65–98. CiteSeerX 10.1.1.86.9277.
  • Codd, E. F. (9 November 1981). "1981 Turing Award Lecture – Relational Database: A Practical Foundation for Productivity".
  • Codd, E. F. (1990). The Relational Model for Database Management (Version 2 ed.). Addison Wesley Publishing Company. ISBN 978-0-201-14192-4.
  • Codd, E. F.; Codd, S. B.; Salley, C. T. (1993). "Providing OLAP to User-Analysts: An IT Mandate" (PDF).

References

  1. ^ a b c Date, C. J. "A. M. Turing Award – Edgar F. ("Ted") Codd". ACM. Retrieved 2 September 2013. United States – 1981. For his fundamental and continuing contributions to the theory and practice of database management systems.
  2. ^ "12 simple rules: How Ted Codd transformed the humble database". The Register. Retrieved 19 August 2013.
  3. ^ a b Codd, E. F. (1970). "A relational model of data for large shared data banks" (PDF). Communications of the ACM. 13 (6): 377–387. doi:10.1145/362384.362685.
  4. ^ Codd, E. F. (1982). "Relational database: A practical foundation for productivity". Communications of the ACM. 25 (2): 109–117. doi:10.1145/358396.358400.
  5. ^ a b Edgar F. Codd at the Mathematics Genealogy Project
  6. ^ E. F. Codd at DBLP Bibliography Server
  7. ^ Edgar F. Codd author profile page at the ACM Digital Library
  8. ^ "Edgar F. ("Ted") Codd". A. M. Turing Award. "he volunteered for active duty and became a flight lieutenant in the Royal Air Force Coastal Command, flying Sunderlands"
  9. ^ Rubenstein, Steve. "Edgar F. Codd – computer pioneer in databases." San Francisco Chronicle 24 April 2003: A21. Gale Biography in Context. Web. 1 December 2011.
  10. ^ a b Martin Campbell-Kelly (1 May 2003). "Edgar Codd". The Independent. Retrieved 24 October 2011.
  11. ^ ACM Fellows Archived 15 June 2009 at the Wayback Machine
  12. ^ Edgar F Codd Passes Away, IBM Research, 2003 Apr 23.
  13. ^ Codd, Edgar (1965). Propagation, Computation, and Construction in Two-dimensional cellular spaces (PhD thesis). University of Michigan.
  14. ^ Codd, E. F. (1968). Cellular Automata. London: Academic Pr. ISBN 978-0-12-178850-6.
  15. ^ Michael Owens. The Definitive Guide to SQLite, p.47. New York: Apress (Springer-Verlag) 2006. ISBN 978-1-59059-673-9.
  16. ^ Providing OLAP to User-Analysts: An IT Mandate by E F Codd, S B Codd and C T Salley, ComputerWorld, 26 July 1993.
  17. ^ Whitehorn, Mark (26 January 2007). "OLAP and the need for SPEED". The Register. Retrieved 30 December 2014.

Alpha (programming language)

The Alpha language was the original database language proposed by Edgar F. Codd, the inventor of the relational database approach. It was defined in Codd's 1971 paper "A Data Base Sublanguage Founded on the Relational Calculus". Alpha influenced the design of QUEL. It was eventually supplanted by SQL (which is however based on the relational algebra defined by Codd in "Relational Completeness of Data Base Sublanguages"), which IBM developed for its first commercial relational database product.

Boyce–Codd normal form

Boyce–Codd normal form (or BCNF or 3.5NF) is a normal form used in database normalization. It is a slightly stronger version of the third normal form (3NF). BCNF was developed in 1974 by Raymond F. Boyce and Edgar F. Codd to address certain types of anomalies not dealt with by 3NF as originally defined. If a relational schema is in BCNF, then all redundancy based on functional dependency has been removed, although other types of redundancy may still exist. A relational schema R is in Boyce–Codd normal form if and only if, for every one of its dependencies X → Y, at least one of the following conditions holds:

X → Y is a trivial functional dependency (Y ⊆ X)

X is a superkey for schema R
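
The two conditions above can be checked mechanically. The following is a minimal sketch, not a definitive implementation: it computes attribute closures and reports every listed dependency whose left-hand side is neither trivial nor a superkey. The schema, the dependencies, and the function names `closure` and `bcnf_violations` are illustrative assumptions, and only the dependencies explicitly given are tested.

```python
def closure(attrs, fds):
    """Return every attribute determined by `attrs` under the dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left-hand side is already determined, so is the right-hand side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Return the listed dependencies X -> Y that are non-trivial and whose X is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        trivial = rhs <= lhs
        superkey = closure(lhs, fds) >= schema
        if not (trivial or superkey):
            violations.append((lhs, rhs))
    return violations


# Hypothetical schema: each instructor teaches one course,
# and a (student, course) pair determines the instructor.
schema = frozenset({"student", "course", "instructor"})
fds = [
    (frozenset({"student", "course"}), frozenset({"instructor"})),
    (frozenset({"instructor"}), frozenset({"course"})),
]

violations = bcnf_violations(schema, fds)
for lhs, rhs in violations:
    print(sorted(lhs), "->", sorted(rhs), "violates BCNF")
```

In this assumed example, {student, course} is a superkey, so the first dependency is fine, while instructor → course is reported because the closure of {instructor} does not cover the whole schema.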

Cardinality (data modeling)

In database design, the cardinality of one data aspect with respect to another is a critical feature: it describes how many instances of one can be related to an instance of the other. The relationship must be stated precisely in order to explain how the aspects link together.

In the relational model, tables can be related as "one-to-many", "many-to-many", "one-to-zero-or-one", and so on. This is said to be the cardinality of a given table in relation to another.

For example, consider a database designed to keep track of hospital records. Such a database could have many tables like:

a doctor table with information about physicians;

a patient table for medical subjects undergoing treatment;

and a department table with an entry for each division of a hospital.

In that model:

There is a many-to-many relationship between the records in the doctor table and records in the patient table because doctors have many patients, and a patient could have several doctors;

There is a one-to-many relationship between the department table and the doctor table because each doctor may work for only one department, but one department could have many doctors.

A "one-to-one" relationship is mostly used to split a table in two in order to provide information concisely and make it more understandable. In the hospital example, such a relationship could be used to keep apart doctors' own unique professional information from administrative details.

In data modeling, collections of data elements are grouped into "data tables", which contain groups of data field names called "database attributes". Tables are linked by "key fields". A "primary key" is a field, or combination of fields, whose value uniquely identifies each row of its table. For example, a doctor identifier might serve as the primary key of the Doctor table; the "Doctor Last Name" field alone would not be suitable, because several doctors can share a last name. A table can also have a foreign key, which indicates that a field is linked to the primary key of another table.
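
The hospital example can be sketched with Python's standard `sqlite3` module. This is one possible layout under stated assumptions, not a prescribed design: the table and column names are hypothetical, each table uses an integer identifier as its primary key rather than a name field, and a junction table named `doctor_patient` is assumed as the way to represent the many-to-many relationship.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
CREATE TABLE department (
    department_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL
);
-- One-to-many: each doctor row points at exactly one department.
CREATE TABLE doctor (
    doctor_id     INTEGER PRIMARY KEY,
    last_name     TEXT NOT NULL,
    department_id INTEGER NOT NULL REFERENCES department(department_id)
);
CREATE TABLE patient (
    patient_id INTEGER PRIMARY KEY,
    last_name  TEXT NOT NULL
);
-- Many-to-many: one row per (doctor, patient) pairing.
CREATE TABLE doctor_patient (
    doctor_id  INTEGER NOT NULL REFERENCES doctor(doctor_id),
    patient_id INTEGER NOT NULL REFERENCES patient(patient_id),
    PRIMARY KEY (doctor_id, patient_id)
);
""")

conn.execute("INSERT INTO department VALUES (1, 'Cardiology')")
# Two doctors share a last name; the identifier keeps their rows distinct.
conn.executemany("INSERT INTO doctor VALUES (?, ?, ?)",
                 [(1, "Smith", 1), (2, "Smith", 1)])
conn.executemany("INSERT INTO patient VALUES (?, ?)",
                 [(1, "Jones"), (2, "Lee")])
conn.executemany("INSERT INTO doctor_patient VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 1)])

doctors_of_patient_1 = [row[0] for row in conn.execute(
    "SELECT doctor_id FROM doctor_patient WHERE patient_id = 1 ORDER BY doctor_id")]


def rejects_unknown_department():
    """Try to add a doctor in a department that does not exist."""
    try:
        conn.execute("INSERT INTO doctor VALUES (3, 'Brown', 99)")
    except sqlite3.IntegrityError:
        return True
    return False
```

The foreign key on `doctor.department_id` is what makes the department-to-doctor relationship one-to-many, and the composite primary key on the junction table prevents the same doctor and patient from being paired twice.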

A complex data model can involve hundreds of related tables. A renowned computer scientist, Edgar F. Codd, created a systematic method to decompose and organize relational databases. Codd's steps for organizing database tables and their keys are called database normalization, which avoids certain hidden database design errors (delete anomalies or update anomalies). In real life the process of database normalization ends up breaking tables into a larger number of smaller tables.

In the real world, data modeling is critical because as the data grows voluminous, tables linked by keys must be used to speed up programmed retrieval of data. If a data model is poorly crafted, even a computer applications system with just a million records will give the end-users unacceptable response time delays. For this reason, data modeling is a keystone in the skills needed by a modern software developer.

Codd

Codd is a surname. Notable people with the surname include:

Bernard Codd (died 2013), English motorcycle racer

Edgar F. Codd (1923–2003), British computer scientist

Frederick Codd (1832–), English Gothic revival architect

Hiram Codd (1838–1887), English engineer who invented and patented the Codd Bottle

Leslie Codd (1908–1999), South African botanist

Mike Codd (1939–), former senior Australian public servant

Codd's 12 rules

Codd's twelve rules are a set of thirteen rules (numbered zero to twelve) proposed by Edgar F. Codd, a pioneer of the relational model for databases, designed to define what is required from a database management system in order for it to be considered relational, i.e., a relational database management system (RDBMS). They are sometimes jokingly referred to as "Codd's Twelve Commandments".

Codd's cellular automaton

Codd's cellular automaton is a cellular automaton (CA) devised by the British computer scientist Edgar F. Codd in 1968. It was designed to recreate the computation- and construction-universality of von Neumann's CA but with fewer states: 8 instead of 29. Codd showed that it was possible to make a self-reproducing machine in his CA, in a similar way to von Neumann's universal constructor, but never gave a complete implementation.

Codd's theorem

Codd's theorem states that relational algebra and the domain-independent relational calculus queries, two well-known foundational query languages for the relational model, are precisely equivalent in expressive power. That is, a database query can be formulated in one language if and only if it can be expressed in the other.

The theorem is named after Edgar F. Codd, the father of the relational model for database management.

The domain independent relational calculus queries are precisely those relational calculus queries that are invariant under choosing domains of values beyond those appearing in the database itself. That is, queries that may return different results for different domains are excluded. An example of such a forbidden query is the query "select all tuples other than those occurring in relation R", where R is a relation in the database. Assuming different domains, i.e., sets of atomic data items from which tuples can be constructed, this query returns different results and thus is clearly not domain independent.

Codd's Theorem is notable since it establishes the equivalence of two syntactically quite dissimilar languages: relational algebra is a variable-free language, while relational calculus is a logical language with variables and quantification.
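
To make the contrast concrete, here is a small sketch of one query written in both styles over relations modelled as Python sets of tuples. The relations, column positions, and helper names (`select`, `project`, `join`) are illustrative assumptions. The example shows two formulations agreeing on one input; it does not demonstrate the theorem itself.

```python
# Hypothetical relations as sets of tuples.
doctor = {(1, "Smith", 10), (2, "Patel", 20), (3, "Chen", 10)}  # (doctor_id, last_name, department_id)
department = {(10, "Cardiology"), (20, "Oncology")}             # (department_id, name)


# Relational algebra style: variable-free operators applied to whole relations.
def select(rel, pred):
    return {t for t in rel if pred(t)}


def project(rel, idxs):
    return {tuple(t[i] for i in idxs) for t in rel}


def join(r, s, ri, si):
    """Equijoin on column ri of r and column si of s; result tuples are concatenated."""
    return {a + b for a in r for b in s if a[ri] == b[si]}


algebra_result = project(
    select(join(doctor, department, 2, 0), lambda t: t[4] == "Cardiology"),
    [1],
)

# Relational calculus style: a logical formula with a tuple variable d
# and an existentially quantified variable p.
calculus_result = {
    (d[1],)
    for d in doctor
    if any(p[0] == d[2] and p[1] == "Cardiology" for p in department)
}

print(algebra_result == calculus_result)
```

The algebra version composes operators without naming individual tuples, while the calculus version describes the wanted tuples with variables and a quantifier; both yield the last names of the doctors in the assumed Cardiology department.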

Relational calculus is essentially equivalent to first-order logic, and indeed, Codd's Theorem had been known to logicians since the late 1940s.

Query languages that are equivalent in expressive power to relational algebra were called relationally complete by Codd. By Codd's Theorem, this includes relational calculus. Relational completeness clearly does not imply that any interesting database query can be expressed in relationally complete languages. Well-known examples of inexpressible queries include simple aggregations (counting tuples, or summing up values occurring in tuples, which are operations expressible in SQL but not in relational algebra) and computing the transitive closure of a graph given by its binary edge relation (see also expressive power). Codd's theorem also doesn't consider SQL nulls and the three-valued logic they entail; the logical treatment of nulls remains mired in controversy. (For recent work extending Codd's theorem in this direction see the 2012 paper of Franconi and Tessaris.) Additionally, SQL allows duplicate rows (it has multiset semantics). Nevertheless, relational completeness constitutes an important yardstick by which the expressive power of query languages can be compared.
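
The transitive closure case can be illustrated with a short sketch. Each pass below is a single join followed by a projection, which relational algebra can express; what a fixed algebra expression lacks is the surrounding loop, whose number of iterations depends on the data. The function name and the sample edge relation are assumptions made for illustration.

```python
def transitive_closure(edges):
    """Repeat a join-and-project step until no new pairs appear (a fixpoint)."""
    closure = set(edges)
    while True:
        # One algebra-expressible step: join closure with edges on
        # closure.dst == edges.src, then project to (src, dst).
        step = {(a, d) for (a, b) in closure for (c, d) in edges if b == c}
        if step <= closure:
            return closure
        closure |= step


# Hypothetical binary edge relation of a four-node chain.
edges = {(1, 2), (2, 3), (3, 4)}
reachable = transitive_closure(edges)
print(sorted(reachable))
```

A longer chain would need more passes, which is why no single expression built from a fixed number of joins covers every input.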

Database normalization

Database normalization is the process of structuring a relational database in accordance with a series of so-called normal forms in order to reduce data redundancy and improve data integrity. It was first proposed by Edgar F. Codd as an integral part of his relational model.

Normalization entails organizing the columns (attributes) and tables (relations) of a database to ensure that their dependencies are properly enforced by database integrity constraints. It is accomplished by applying some formal rules either by a process of synthesis (creating a new database design) or decomposition (improving an existing database design).
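
As a rough illustration of the decomposition route, the sketch below splits a redundant relation along an assumed dependency student_id → student_name and checks that joining the two parts gives back the original rows. The relation contents and names are hypothetical.

```python
# Hypothetical relation in which the student's name is repeated on every enrollment row.
enrollment = {
    ("s1", "Ada", "db101"),
    ("s1", "Ada", "os201"),
    ("s2", "Alan", "db101"),
}  # (student_id, student_name, course)

# Decompose along the assumed dependency student_id -> student_name.
student = {(sid, name) for (sid, name, _course) in enrollment}
takes = {(sid, course) for (sid, _name, course) in enrollment}

# Joining the two smaller relations on student_id reconstructs the original.
rejoined = {
    (sid, name, course)
    for (sid, name) in student
    for (sid2, course) in takes
    if sid == sid2
}

print(rejoined == enrollment)
```

After the split, each name is stored once in `student`, so renaming a student touches a single row instead of one row per course.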

David DeWitt

David J. DeWitt is a computer scientist specializing in database management system research at the Massachusetts Institute of Technology. Prior to moving to MIT, DeWitt was the John P. Morgridge Professor (Emeritus) of Computer Sciences at the University of Wisconsin–Madison. He was also a Technical Fellow at Microsoft, leading the Microsoft Jim Gray Systems Lab at Madison, Wisconsin. Professor DeWitt received a B.A. degree from Colgate University in 1970, and a Ph.D. from the University of Michigan in 1976. He then joined the University of Wisconsin-Madison and started the Wisconsin Database Group, which he led for more than 30 years.

Professor DeWitt is known for his research in the areas of parallel databases, benchmarking, object-oriented databases, and XML databases. He is an elected member of the National Academy of Engineering (1998), and a Fellow of the Association for Computing Machinery.

He received the ACM SIGMOD Innovations Award (now renamed SIGMOD Edgar F. Codd Innovations Award) in 1995 for his contributions to the database systems field. In 2009, ACM recognized the seminal contributions of his Gamma parallel database system project with the ACM Software System Award. Also in 2009, he received the IEEE Emanuel R. Piore Award for his contributions to the database systems field.

David Maier

David Maier (born 2 June 1953) is the Maseeh Professor of Emerging Technologies in the Department of Computer Science at Portland State University. Born in Eugene, OR, he has also been a computer science faculty member at the State University of New York at Stony Brook (1978–82), Oregon Graduate Center (OGC, 1982-2001), University of Wisconsin (UW, 1997-8), Oregon Health & Science University (2001–present) and National University of Singapore (2012-5). He holds a B.A. in Mathematics and Computer Science from the University of Oregon (Honors College, 1974) and a Ph.D. in Electrical Engineering and Computer Science from Princeton University (1978).

Maier has been chairman of the program committee of ACM SIGMOD. He also served as an associate editor of ACM Transactions on Database Systems. Maier has consulted with Tektronix, Inc., Servio Corporation, the Microelectronics and Computer Technology Corporation (MCC), Digital Equipment Corporation, Altair, Honeywell, Texas Instruments, IBM, Microsoft, Informix, Oracle Corporation, NCR, and Object Design, as well as several governmental agencies. He is a founding member of the Data-Intensive Systems Center (DISC), a joint project of OGI and Portland State University. He is the author of books on relational databases, logic programming and object-oriented databases, as well as papers in database theory, object-oriented technology and scientific databases. He received the Presidential Young Investigator Award from the National Science Foundation in 1984 at OGC, and was awarded the 1997 SIGMOD Edgar F. Codd Innovations Award for his contributions in objects and databases at UW. He is also an ACM Fellow.

Maier established some of the earliest results on using the relational model. Together with his thesis advisor, Jeffrey Ullman, and fellow Princeton students, including Alberto O. Mendelzon and Yehoshua Sagiv, he co-authored a number of influential papers that laid out the fundamental issues and approaches for relational databases. In a now-famous paper (Maier, Mendelzon and Sagiv, TODS 1979), he introduced the chase, a method for testing implication of data dependencies that is now of widespread use in the database theory literature. This work has been highly influential: it is used, directly or indirectly, on an everyday basis by people who design databases, and it is used in commercial systems to reason about the consistency and correctness of a data design. New applications of the chase in meta-data management and data exchange are still being discovered.

He is credited with coining the term Datalog.

Gerhard Weikum

Gerhard Weikum is a Research Director at the Max Planck Institute for Informatics in Saarbrücken, Germany, where he is leading the databases and information systems department. His current research interests include transactional and distributed systems, self-tuning database systems, data and text integration, and the automatic construction of knowledge bases. He is one of the creators of the YAGO knowledge base. He is also the Dean of the International Max Planck Research School for Computer Science (IMPRS-CS).

Earlier he held positions at Saarland University in Saarbrücken, Germany, at ETH Zurich, Switzerland, at MCC in Austin, Texas, and he was a visiting senior researcher at Microsoft Research in Redmond, Washington. He received his diploma and doctoral degrees from the TU Darmstadt, Germany.

He acted as the President of the VLDB endowment in 2005 and 2006. The endowment organizes the yearly International Conference on Very Large Databases, a scientific conference for researchers in the area of database research.

In 2005 the Association for Computing Machinery appointed Gerhard Weikum a fellow, one of the highest honors of the ACM. Weikum has been honored for his research in the fields of databases and information systems, in particular for his contributions to improve the reliability and the performance of large-scale, distributed information systems. In 2010 he was elected as a fellow of the Gesellschaft für Informatik and received a Google Focused Research Award. He received the ACM SIGMOD Contributions Award in 2011, an ERC Synergy Grant in 2013, and the ACM SIGMOD Edgar F. Codd Innovations Award in 2016.

IBM Research - Almaden

IBM Research - Almaden is in Almaden Valley, San Jose, California, and is one of IBM's twelve worldwide research labs that form IBM Research. Its scientists perform basic and applied research in computer science, services, storage systems, physical sciences, and materials science and technology. The center opened in 1986, and continues the research started in San Jose more than fifty years ago. Nearly all of Almaden’s approximately 500 research employees are in technical functions and more than half of these hold Ph.D.s. The lab is home to ten IBM Fellows, ten IBM Distinguished Engineers, nine IBM Master Inventors and seventeen members of the IBM Academy of Technology.

Almaden occupies part of a site owned by IBM at 650 Harry Road on nearly 700 acres (2.8 km2) of land in the hills above Silicon Valley. The site, built in 1985 for the research center, was chosen because of its close proximity to Stanford University, UC Santa Cruz, UC Berkeley and other collaborative academic institutions. Today, the research division is still the largest tenant of the site, but the majority of occupants work for other divisions of IBM.

IBM opened its first West Coast research centre, the San Jose Research Laboratory in 1952, managed by Reynold B. Johnson. Amongst its first developments was the IBM 350, the first commercial moving head hard disk drive. Launched in 1956, this saw use in the IBM 305 RAMAC computer system. Subdivisions included the Advanced Systems Development Division. Directors of the center include hard disc drive developer Jack Harker.

Prompted by a need for additional space, the center moved to its present Almaden location in 1986.

Scientists at IBM Almaden have contributed to several scientific discoveries such as the development of photoresists and the quantum mirage effect.

The following are some of the famous scientists who have worked in the past or are currently working in this laboratory: Rakesh Agrawal, John Backus, Raymond F. Boyce, Donald D. Chamberlin, Ashok K. Chandra, Edgar F. Codd, Mark Dean, Cynthia Dwork, Don Eigler, Ronald Fagin, Jim Gray, Laura M. Haas, Joseph Halpern, Andreas J. Heinrich, Reynold B. Johnson, Maria Klawe, Jaishankar Menon, Dharmendra Modha, William E. Moerner, C. Mohan, Stuart Parkin, Nick Pippenger, Patricia Selinger, Ted Selker, Barbara Simons, Ramakrishnan Srikant, Larry Stockmeyer, Moshe Vardi, Jennifer Widom.

Michael J. Carey (professor)

Michael J. Carey is an American computer scientist. He currently serves as Bren Professor of Information and Computer Science in the Donald Bren School at the University of California, Irvine.

Phil Bernstein

Philip A. Bernstein is a computer scientist specializing in database research in the Database Group of Microsoft Research. Bernstein is also an affiliate professor at the University of Washington and frequent committee member or chair of conferences such as VLDB and SIGMOD. He won the SIGMOD Edgar F. Codd Innovations Award in 1994, and in 2011 with Jayant Madhavan and Erhard Rahm the VLDB 10 Year Best Paper Award for their VLDB 2001 paper "Generic Schema Matching with Cupid". Bernstein is a member of the National Academy of Engineering (elected 2003) and an elected Fellow of the Association for Computing Machinery. He is a charter member of the Washington State Academy of Sciences (2008) and serves on their board of directors.

Ronald Fagin

Ronald Fagin (born 1945) is an American mathematician and computer scientist, and IBM Fellow at the IBM Almaden Research Center. He is known for his work in database theory, finite model theory, and reasoning about knowledge.

Rudolf Bayer

Rudolf Bayer (born 3 March 1939) is a German computer scientist.

He is professor emeritus of Informatics at the Technical University of Munich where he had been employed since 1972. He is noted for inventing three data sorting structures: the B-tree (with Edward M. McCreight), the UB-tree (with Volker Markl) and the red-black tree.

Bayer is a recipient of 2001 ACM SIGMOD Edgar F. Codd Innovations Award. In 2005 he was elected as a fellow of the Gesellschaft für Informatik.

SIGMOD

SIGMOD is the Association for Computing Machinery's Special Interest Group on Management of Data, which specializes in large-scale data management problems and databases.

The annual ACM SIGMOD Conference, which began in 1975, is considered one of the most important in the field. While traditionally this conference had always been held within North America, it took place in Paris in 2004, Beijing in 2007, Athens in 2011, and Melbourne in 2015. The acceptance rate of the ACM SIGMOD Conference, averaged from 1996 to 2012, was 18%, and it was 17% in 2012. In association with SIGACT and SIGART, SIGMOD also sponsors the annual ACM Symposium on Principles of Database Systems (PODS) conference on the theoretical aspects of database systems. PODS began in 1982, and has been held jointly with the SIGMOD conference since 1991.

Each year, the group gives out several awards for contributions to the field of data management. The most important of these is the SIGMOD Edgar F. Codd Innovations Award (named after the computer scientist Edgar F. Codd), which is awarded for "innovative and highly significant contributions of enduring value to the development, understanding, or use of database systems and databases". Additionally, SIGMOD presents a Best Paper Award to recognize the highest quality paper at each conference, and the Jim Gray Dissertation Award for the best Ph.D. thesis in data management.

SIGMOD Edgar F. Codd Innovations Award

The ACM SIGMOD Edgar F. Codd Innovations Award is a lifetime research achievement award given by the ACM Special Interest Group on Management of Data, at its yearly flagship conference (also called SIGMOD). According to its homepage, it is given "for innovative and highly significant contributions of enduring value to the development, understanding, or use of database systems and databases". The award has been given since 1992.

Tuple relational calculus

Tuple calculus is a calculus that was created and introduced by Edgar F. Codd as part of the relational model, in order to provide a declarative database-query language for data manipulation in this data model. It formed the inspiration for the database-query languages QUEL and SQL, of which the latter, although far less faithful to the original relational model and calculus, is now the de facto standard database-query language; a dialect of SQL is used by nearly every relational-database-management system. Michel Lacroix and Alain Pirotte proposed domain calculus, which is closer to first-order logic and together with Codd showed that both of these calculi (as well as relational algebra) are equivalent in expressive power. Subsequently, query languages for the relational model were called relationally complete if they could express at least all of these queries.

This page is based on a Wikipedia article written by authors (here).
Text is available under the CC BY-SA 3.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.