Google transliteration

Google transliteration (formerly Google Indic Transliteration) is a transliteration typing service for Hindi and other languages.

This tool first appeared in Blogger, Google's popular blogging service.[1] It was later offered as a separate online tool, and as it grew popular it was embedded in Gmail and Orkut. In December 2009, Google released an offline version named Google IME.

The tool uses a dictionary-based phonetic transliteration approach. In contrast to older Indic typing tools, which transliterate according to a fixed scheme, Google transliteration matches words typed in the Latin alphabet against a built-in dictionary. Because users do not need to memorize a transliteration scheme, the service is easy to use even for beginners.
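The dictionary approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `LEXICON` table, its entries, and the function names `suggest` and `transliterate` are hypothetical and do not come from Google's service, whose actual dictionary and ranking are not described here.

```python
# Toy lexicon: Latin-script spelling -> candidate Devanagari words, best first.
# All entries are illustrative assumptions, not data from Google's service.
LEXICON = {
    "namaste": ["नमस्ते"],
    "namastey": ["नमस्ते"],   # a variant spelling maps to the same word
    "bharat": ["भारत"],
    "hindi": ["हिंदी", "हिन्दी"],
}


def suggest(word):
    """Return the candidate transliterations for one Latin-script word."""
    return LEXICON.get(word.lower(), [])


def transliterate(text):
    """Replace each known word with its top candidate; keep unknown words as typed."""
    out = []
    for word in text.split():
        candidates = suggest(word)
        out.append(candidates[0] if candidates else word)
    return " ".join(out)


if __name__ == "__main__":
    print(transliterate("Namaste Bharat"))
    print(suggest("hindi"))
```

Because lookup is by whole word, variant spellings such as "namaste" and "namastey" can lead to the same result, which is why the user does not have to follow one fixed spelling scheme.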

For transliteration between scripts, there was, until July 2011, a separate service named Google Script Converter.

See also


Arabic

Arabic (Arabic: العَرَبِيَّة‎, al-ʻarabiyyah [alʕaraˈbijːa], or عَرَبِيّ‎, ʻarabī [ˈʕarabiː] or [ʕaraˈbij]) is a Central Semitic language that first emerged in Iron Age northwestern Arabia and is now the lingua franca of the Arab world. It is named after the Arabs, a term initially used to describe peoples living in the area bounded by Mesopotamia in the east and the Anti-Lebanon mountains in the west, in northwestern Arabia, and in the Sinai Peninsula. Arabic is classified as a macrolanguage comprising 30 modern varieties, including its standard form, Modern Standard Arabic, which is derived from Classical Arabic.

As the modern written language, Modern Standard Arabic is widely taught in schools and universities, and is used to varying degrees in workplaces, government, and the media. The two formal varieties are grouped together as Literary Arabic (fuṣḥā), which is the official language of 26 states and the liturgical language of Islam. Modern Standard Arabic largely follows the grammatical standards of Classical Arabic and uses much of the same vocabulary. However, it has discarded some grammatical constructions and vocabulary that no longer have any counterpart in the spoken varieties, and has adopted certain new constructions and vocabulary from the spoken varieties. Much of the new vocabulary denotes concepts that have arisen in the post-classical era, especially in modern times.

Because it is grounded in Classical Arabic, Modern Standard Arabic is removed by over a millennium from everyday speech, which is construed as a multitude of dialects of this language. These dialects and Modern Standard Arabic are described by some scholars as not mutually comprehensible. The former are usually acquired in families, while the latter is taught in formal education settings. However, some studies have reported a degree of comprehension of stories told in the standard variety among preschool-aged children. The relation between Modern Standard Arabic and these dialects is sometimes compared to that of Latin and vernaculars (or today's French, Czech or German) in medieval and early modern Europe. This view, however, does not take into account the widespread use of Modern Standard Arabic as a medium of audiovisual communication in today's mass media, a function Latin has never performed.

During the Middle Ages, Literary Arabic was a major vehicle of culture in Europe, especially in science, mathematics and philosophy. As a result, many European languages have borrowed words from it. Arabic influence, mainly in vocabulary, is seen in European languages, chiefly Spanish and to a lesser extent Portuguese and Catalan, owing to both the proximity of Christian European and Muslim Arab civilizations and 800 years of Arabic culture and language in the Iberian Peninsula, referred to in Arabic as al-Andalus. Sicilian has about 500 Arabic words as a result of Sicily being progressively conquered by Arabs from North Africa, from the mid-9th to mid-10th centuries. Many of these words relate to agriculture and related activities. Balkan languages, including Greek and Bulgarian, have also acquired a significant number of Arabic words through contact with Ottoman Turkish.

Arabic has influenced many languages around the globe throughout its history. Some of the most influenced languages are Persian, Turkish, Spanish, Urdu, Kashmiri, Kurdish, Bosnian, Kazakh, Bengali, Hindi, Malay, Maldivian, Indonesian, Pashto, Punjabi, Tagalog, Sindhi, and Hausa, and some languages in parts of Africa. Conversely, Arabic has borrowed words from other languages, including Greek and Persian in medieval times, and contemporary European languages such as English and French in modern times.

Classical Arabic is the liturgical language of 1.8 billion Muslims, and Modern Standard Arabic is one of six official languages of the United Nations. All varieties of Arabic combined are spoken by perhaps as many as 422 million speakers (native and non-native) in the Arab world, making it the fifth most spoken language in the world. Arabic is written with the Arabic alphabet, which is an abjad script and is written from right to left, although the spoken varieties are sometimes written in ASCII Latin from left to right with no standardized orthography.

Azhagi (software)

Azhagi (Tamil: அழகி) is a freeware transliteration tool for typing in Tamil, Hindi and several other Indian languages. In 2002, The Hindu named it among the transliteration tools that "stand out". Azhagi has supported Tamil transliteration since 2000; support was later expanded to about 13 Indian languages.

In 2006 Azhagi was the recipient of the Manthan Award of India's Digital Empowerment Foundation and the World Summit Award project, in the category Localization. In the same year Azhagi was identified as a "success story" by Microsoft's Indic language computing site.

Bengali input methods

Bengali input methods refer to different systems developed to type Bengali language characters using a typewriter or a computer keyboard.

Devanagari transliteration

There are several methods of transliteration from Devanāgarī to the Roman script (a process known as romanization) which share similarities, although no single system of transliteration has emerged as the standard. This process has been termed Romanagari, a portmanteau of the words Roman and Devanagari. (Devanagari is the name of the script in which Hindi is written.) The term may also be used for other languages that use Devanagari as the standard writing script, such as Marathi, Nepali or Sanskrit.

Google IME

Google IME is a set of typing tools (input method editors) by Google for 22 languages, including Amharic, Arabic, Bengali, Chinese, Greek, Gujarati, Hindi, Japanese, Kannada, Malayalam, Marathi, Nepali, Persian, Punjabi, Russian, Sanskrit, Serbian, Tamil, Telugu, Tigrinya, and Urdu. It is a virtual keyboard that allows users to type in their local language text directly in any application without the hassle of copying and pasting.

Google Script Converter

Google Script Converter was an online tool for transliteration (script conversion) between Hindi, Romanagari and various other scripts. It was launched in November 2009 and discontinued in July 2011, when Google shut down Google Labs and all associated projects.

The tool could transliterate plain text as well as complete web pages from one script to another. Transliteration was done in real time, and the transliterated page could be viewed in the browser immediately.

The accuracy of this tool was reported to be better than that of other tools of its kind. Because it was based on the dictionary approach, the service was particularly useful for converting Romanagari text to Unicode Hindi (Devanagari).


ITRANS

The "Indian languages TRANSliteration" (ITRANS) is an ASCII transliteration scheme for Indic scripts, particularly for Devanagari script.

The need for a simple encoding scheme that used only keys available on an ordinary keyboard was felt in the early days of the RMIM newsgroup, where lyrics and trivia about Indian popular movie songs were being discussed. In parallel, a Sanskrit mailing list quickly felt the need for an exact and unambiguous encoding. ITRANS emerged on the RMIM newsgroup as early as 1994. The effort was spearheaded by Avinash Chopde, who developed a transliteration package. Its latest version is v5.34. The package also enables automatic conversion of the Roman script to the Indic version.

ITRANS was in use for the encoding of Indian etexts. It is wider in scope than the Harvard-Kyoto scheme for Devanagari transliteration, with which it coincides largely, but not entirely. The Sanskrit mailing list of the early 1990s, which began at almost the same time as RMIM, developed into the full-blown Sanskrit Documents project and now uses ITRANS extensively, with thousands of encoded texts. With the wider implementation of Unicode, the traditional IAST is increasingly used for electronic texts as well.

Like the Harvard-Kyoto scheme, the ITRANS romanization only uses diacritical signs found on the common English-language computer keyboard, and it is quite easy to read and pick up.
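As a rough illustration of how a scheme-based converter of this kind can work, the following Python sketch converts a small subset of ITRANS to Devanagari. It is not Avinash Chopde's package and does not reproduce its behaviour: the tables cover only some letters, and the names `tokenize` and `to_devanagari` are hypothetical. The sketch assumes greedy longest-match reading of the input and the usual Devanagari convention that a consonant not followed by a vowel takes a virama.

```python
VIRAMA = "\u094d"  # Devanagari sign that suppresses the inherent vowel

# ITRANS vowel -> (independent letter, dependent vowel sign). Subset only.
VOWELS = {
    "a": ("अ", ""), "aa": ("आ", "ा"), "i": ("इ", "ि"), "ii": ("ई", "ी"),
    "u": ("उ", "ु"), "uu": ("ऊ", "ू"), "e": ("ए", "े"), "ai": ("ऐ", "ै"),
    "o": ("ओ", "ो"), "au": ("औ", "ौ"),
}

# ITRANS consonant -> Devanagari letter. Subset only.
CONSONANTS = {
    "k": "क", "kh": "ख", "g": "ग", "gh": "घ", "ch": "च", "j": "ज",
    "t": "त", "th": "थ", "d": "द", "dh": "ध", "n": "न",
    "p": "प", "ph": "फ", "b": "ब", "bh": "भ", "m": "म",
    "y": "य", "r": "र", "l": "ल", "v": "व", "sh": "श", "s": "स", "h": "ह",
}

# Longest keys first, so that "bh" is tried before "b" and "aa" before "a".
KEYS = sorted(list(VOWELS) + list(CONSONANTS), key=len, reverse=True)


def tokenize(text):
    """Split ITRANS input into known tokens using greedy longest match."""
    tokens, i = [], 0
    while i < len(text):
        for key in KEYS:
            if text.startswith(key, i):
                tokens.append(key)
                i += len(key)
                break
        else:
            tokens.append(text[i])  # unknown character, passed through
            i += 1
    return tokens


def to_devanagari(text):
    """Convert a string written in the ITRANS subset above to Devanagari."""
    out = []
    pending = False  # True while the last consonant still awaits a vowel
    for tok in tokenize(text):
        if tok in CONSONANTS:
            if pending:
                out.append(VIRAMA)  # consonant cluster: join with virama
            out.append(CONSONANTS[tok])
            pending = True
        elif tok in VOWELS:
            independent, sign = VOWELS[tok]
            out.append(sign if pending else independent)
            pending = False
        else:
            if pending:
                out.append(VIRAMA)
            out.append(tok)
            pending = False
    if pending:
        out.append(VIRAMA)  # trailing consonant with no vowel
    return "".join(out)


if __name__ == "__main__":
    print(to_devanagari("namaste"))
    print(to_devanagari("bhaarata"))
```

Unlike a dictionary-based tool, this converter is deterministic: the same keystrokes always give the same output, so the writer has to know the scheme (for example, typing "aa" for the long vowel).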

Input method

An input method (or input method editor, commonly abbreviated IME) is an operating system component or program that allows any data, such as keyboard strokes or mouse movements, to be received as input. In this way users can enter characters and symbols not found on their input devices. Using an input method is obligatory for any language that has more graphemes than there are keys on the keyboard.

For instance, on the computer, this allows the user of Latin keyboards to input Chinese, Japanese, Korean and Indic characters; on many hand-held devices, such as mobile phones, it enables using the numeric keypad to enter Latin alphabet characters (or any other alphabet characters) or a screen display to be touched to do so. On some operating systems, an input method is also used to define the behaviour of the dead keys.
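One simple input method of the keypad kind mentioned above is multi-tap, in which a key is pressed repeatedly to cycle through its letters. The Python sketch below decodes such key presses. It assumes the common telephone keypad layout and a space between key groups; the function name `decode_multitap` is hypothetical, and multi-tap is only one of several keypad schemes, chosen here for illustration.

```python
# Letters assigned to each digit on the common telephone keypad layout.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


def decode_multitap(presses):
    """Decode space-separated groups of repeated digits, e.g. '44 33'."""
    out = []
    for group in presses.split():
        key = group[0]
        if key not in KEYPAD or set(group) != {key}:
            raise ValueError(f"invalid key group: {group!r}")
        letters = KEYPAD[key]
        # Pressing a key n times selects its n-th letter, wrapping around.
        out.append(letters[(len(group) - 1) % len(letters)])
    return "".join(out)


if __name__ == "__main__":
    print(decode_multitap("44 33 555 555 666"))
```

The same idea, a mapping from a short sequence of available keys to a character that has no key of its own, underlies the more elaborate input methods used for Chinese, Japanese, Korean and Indic scripts.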

Kannada script

The Kannada script (IAST: Kannaḍa lipi) is an abugida of the Brahmic family, used primarily to write the Kannada language, one of the Dravidian languages of South India, especially in the state of Karnataka. The Kannada script is also widely used for writing Sanskrit texts in Karnataka. Several minor languages, such as Tulu, Konkani, Kodava, Sanketi and Beary, also use alphabets based on the Kannada script. The Kannada and Telugu scripts share high mutual intelligibility, and are often considered to be regional variants of a single script. Other scripts similar to the Kannada script are the Sinhala script (which included some elements from the Kadamba script) and the Old Peguan script (used in Burma).

The Kannada script (ಅಕ್ಷರಮಾಲೆ akṣaramāle or ವರ್ಣಮಾಲೆ varṇamāle) is a phonemic abugida of forty-nine letters, and is written from left to right. The character set is almost identical to that of other Brahmic scripts. Consonantal letters imply an inherent vowel. Letters representing consonants are combined to form digraphs (ಒತ್ತಕ್ಷರ ottakṣara) when there is no intervening vowel. Otherwise, each letter corresponds to a syllable.
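In Unicode text, a consonant combination of this kind is stored as the consonant letters joined by the Kannada virama sign, which suppresses the inherent vowel of the preceding letter. The short Python sketch below shows this using the standard `unicodedata` module. The Unicode encoding is brought in here purely for illustration, and the helper names `conjunct` and `describe` are hypothetical.

```python
import unicodedata

VIRAMA = "\u0ccd"  # KANNADA SIGN VIRAMA: removes the inherent vowel


def conjunct(*consonants):
    """Join consonant letters with the virama to encode a consonant cluster."""
    return VIRAMA.join(consonants)


def describe(text):
    """List the Unicode character names that make up a string."""
    return [unicodedata.name(ch) for ch in text]


if __name__ == "__main__":
    a, ka, ssa, ra = "\u0c85", "\u0c95", "\u0cb7", "\u0cb0"
    # Code points for the word akṣara: the cluster kṣ is KA + VIRAMA + SSA.
    akshara = a + conjunct(ka, ssa) + ra
    print(akshara)
    for name in describe(akshara):
        print(name)
```

The cluster is shown on screen as a single combined shape, but it remains three code points in the stored text, which matters when counting characters or searching.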

The letters are classified into three categories: ಸ್ವರ svara (vowels), ವ್ಯಂಜನ vyañjana (consonants), and ಯೋಗವಾಹಕ yōgavāhaka (semiconsonants).

The Kannada words for a letter of the script are ಅಕ್ಷರ akṣara, ಅಕ್ಕರ akkara, and ವರ್ಣ varṇa. Each letter has its own form (ಆಕಾರ ākāra) and sound (ಶಬ್ದ śabda), providing the visible and audible representations, respectively.

Microsoft Indic Language Input Tool

Microsoft Indic Language Input Tool is a typing tool (input method editor) for languages written in Indic scripts. It is a virtual keyboard that allows users to type Indic text directly in any application without the hassle of copying and pasting. It is available for both online and offline use. It was released in December 2009.

It uses a dictionary-based phonetic transliteration approach: whatever the user types in Latin characters is matched against its dictionary and transliterated. It also offers suggestions for matching words.

Tamil blogosphere

The Tamil blogosphere is the online community of Tamil-language weblogs that are a part of the larger Indian blogosphere. The Tamil blogosphere has a considerable number of contributors from Sri Lanka and Singapore, and is one of the largest blogospheres resident in India.

Tamil input methods

Tamil input methods refer to different systems developed to type Tamil language characters using a typewriter or a computer keyboard.

Several programs, such as Azhagi and NHM Writer, provide both fixed and phonetic layouts for typing.


This page is based on a Wikipedia article written by authors (here).
Text is available under the CC BY-SA 3.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.