A file format is a standard way that information is encoded for storage in a computer file. It specifies how bits are used to encode information in a digital storage medium. File formats may be either proprietary or free and may be either unpublished or open.
Some file formats are designed for very particular types of data: PNG files, for example, store bitmapped images using lossless data compression. Other file formats, however, are designed for storage of several different types of data: the Ogg format can act as a container for different types of multimedia including any combination of audio and video, with or without text (such as subtitles), and metadata. A text file can contain any stream of characters, including possible control characters, and is encoded in one of various character encoding schemes. Some file formats, such as HTML, scalable vector graphics, and the source code of computer software are text files with defined syntaxes that allow them to be used for specific purposes.
File formats often have a published specification describing the encoding method and enabling testing of program intended functionality. Not all formats have freely available specification documents, partly because some developers view their specification documents as trade secrets, and partly because other developers never author a formal specification document, letting precedent set by other already existing programs that use the format define the format via how these existing programs use it.
If the developer of a format doesn't publish free specifications, another developer looking to utilize that kind of file must either reverse engineer the file to find out how to read it or acquire the specification document from the format's developers for a fee and by signing a non-disclosure agreement. The latter approach is possible only when a formal specification document exists. Both strategies require significant time, money, or both; therefore, file formats with publicly available specifications tend to be supported by more programs.
Patent law, rather than copyright, is more often used to protect a file format. Although patents for file formats are not directly permitted under US law, some formats encode data using patented algorithms. For example, using compression with the GIF file format requires the use of a patented algorithm, and though the patent owner did not initially enforce their patent, they later began collecting royalty fees. This has resulted in a significant decrease in the use of GIFs, and is partly responsible for the development of the alternative PNG format. However, the patent expired in the US in mid-2003, and worldwide in mid-2004.
Different operating systems have traditionally taken different approaches to determining a particular file's format, with each approach having its own advantages and disadvantages. Most modern operating systems and individual applications need to use all of the following approaches to read "foreign" file formats, if not work with them completely.
One popular method used by many operating systems, including Windows, Mac OS X, CP/M, DOS, VMS and VM/CMS is to determine the format of a file based on the end of its name, more specifically the letters following the final period. This portion of the filename is known as the filename extension. For example, HTML documents are identified by names that end with .html (or .htm), and GIF images by .gif. In the original FAT filesystem, file names were limited to an eight-character identifier and a three-character extension, known as an 8.3 filename. There are only so many three-letter extensions, so, often any given extension might be linked to more than one program. Many formats still use three-character extensions even though modern operating systems and application programs no longer have this limitation. Since there is no standard list of extensions, more than one format can use the same extension, which can confuse both the operating system and users.
One artifact of this approach is that the system can easily be tricked into treating a file as a different format simply by renaming it—an HTML file can, for instance, be easily treated as plain text by renaming it from filename.html to filename.txt. Although this strategy was useful to expert users who could easily understand and manipulate this information, it was often confusing to less technical users, who could accidentally make a file unusable (or "lose" it) by renaming it incorrectly.
This led more recent operating system shells, such as Windows 95 and Mac OS X, to hide the extension when listing files. This prevents the user from accidentally changing the file type, and allows expert users to turn this feature off and display the extensions.
Hiding the extension, however, can create the appearance of two or more identical filenames in the same folder. For example, a company logo may be needed both in .eps format (for publishing) and .png format (for web sites). With the extensions visible, these would appear as the unique filenames: "CompanyLogo.eps" and "CompanyLogo.png". On the other hand, hiding the extensions would make both appear as "CompanyLogo", which can lead to confusion.
Hiding extensions can also pose a security risk. For example, a malicious user could create an executable program with an innocent name such as "Holiday photo.jpg.exe". The ".exe" would be hidden and an unsuspecting user would see "Holiday photo.jpg", which would appear to be a JPEG image, usually unable to harm the machine. However, the operating system would still see the ".exe" extension and run the program, which would then be able to cause harm to the computer. The same is true with files with only one extension: as it is not shown to the user, no information about the file can be deduced without explicitly investigating the file. To further trick users, it is possible to store an icon inside the program, in which case some operating systems' icon assignment for the executable file (.exe) would be overridden with an icon commonly used to represent JPEG images, making the program look like an image. Extensions can also be spoofed: some Microsoft Word macro viruses create a Word file in template format and save it with a .doc extension. Since Word generally ignores extensions and looks at the format of the file, these would open as templates, execute, and spread the virus. These issues require users with extensions hidden to be vigilant and never let the operating system choose with what program to open a file not known to be trustworthy (which contradicts the idea of making things easier for the user). This represents a practical problem for Windows systems where extension-hiding is turned on by default.
A second way to identify a file format is to use information regarding the format stored inside the file itself, either information meant for this purpose or binary strings that happen to always be in specific locations in files of some formats. Since the easiest place to locate them is at the beginning, such area is usually called a file header when it is greater than a few bytes, or a magic number if it is just a few bytes long.
The metadata contained in a file header are usually stored at the start of the file, but might be present in other areas too, often including the end, depending on the file format or the type of data contained. Character-based (text) files usually have character-based headers, whereas binary formats usually have binary headers, although this is not a rule. Text-based file headers usually take up more space, but being human-readable, they can easily be examined by using simple software such as a text editor or a hexadecimal editor.
As well as identifying the file format, file headers may contain metadata about the file and its contents. For example, most image files store information about image format, size, resolution and color space, and optionally authoring information such as who made the image, when and where it was made, what camera model and photographic settings were used (Exif), and so on. Such metadata may be used by software reading or interpreting the file during the loading process and afterwards.
File headers may be used by an operating system to quickly gather information about a file without loading it all into memory, but doing so uses more of a computer's resources than reading directly from the directory information. For instance, when a graphic file manager has to display the contents of a folder, it must read the headers of many files before it can display the appropriate icons, but these will be located in different places on the storage medium thus taking longer to access. A folder containing many files with complex metadata such as thumbnail information may require considerable time before it can be displayed.
If a header is binary hard-coded such that the header itself needs complex interpretation in order to be recognized, especially for metadata content protection's sake, there is a risk that the file format can be misinterpreted. It may even have been badly written at the source. This can result in corrupt metadata which, in extremely bad cases, might even render the file unreadable.
A more complex example of file headers are those used for wrapper (or container) file formats.
One way to incorporate file type metadata, often associated with Unix and its derivatives, is just to store a "magic number" inside the file itself. Originally, this term was used for a specific set of 2-byte identifiers at the beginnings of files, but since any binary sequence can be regarded as a number, any feature of a file format which uniquely distinguishes it can be used for identification. GIF images, for instance, always begin with the ASCII representation of either GIF87a or GIF89a, depending upon the standard to which they adhere. Many file types, especially plain-text files, are harder to spot by this method. HTML files, for example, might begin with the string <html> (which is not case sensitive), or an appropriate document type definition that starts with <!DOCTYPE HTML>, or, for XHTML, the XML identifier, which begins with <?xml. The files can also begin with HTML comments, random text, or several empty lines, but still be usable HTML.
The magic number approach offers better guarantees that the format will be identified correctly, and can often determine more precise information about the file. Since reasonably reliable "magic number" tests can be fairly complex, and each file must effectively be tested against every possibility in the magic database, this approach is relatively inefficient, especially for displaying large lists of files (in contrast, file name and metadata-based methods need to check only one piece of data, and match it against a sorted index). Also, data must be read from the file itself, increasing latency as opposed to metadata stored in the directory. Where file types don't lend themselves to recognition in this way, the system must fall back to metadata. It is, however, the best way for a program to check if the file it has been told to process is of the correct format: while the file's name or metadata may be altered independently of its content, failing a well-designed magic number test is a pretty sure sign that the file is either corrupt or of the wrong type. On the other hand, a valid magic number does not guarantee that the file is not corrupt or is of a correct type.
So-called shebang lines in script files are a special case of magic numbers. Here, the magic number is human-readable text that identifies a specific command interpreter and options to be passed to the command interpreter.
Another operating system using magic numbers is AmigaOS, where magic numbers were called "Magic Cookies" and were adopted as a standard system to recognize executables in Hunk executable file format and also to let single programs, tools and utilities deal automatically with their saved data files, or any other kind of file types when saving and loading data. This system was then enhanced with the Amiga standard Datatype recognition system. Another method was the FourCC method, originating in OSType on Macintosh, later adapted by Interchange File Format (IFF) and derivatives.
A final way of storing the format of a file is to explicitly store information about the format in the file system, rather than within the file itself.
This approach keeps the metadata separate from both the main data and the name, but is also less portable than either file extensions or "magic numbers", since the format has to be converted from filesystem to filesystem. While this is also true to an extent with filename extensions—for instance, for compatibility with MS-DOS's three character limit—most forms of storage have a roughly equivalent definition of a file's data and name, but may have varying or no representation of further metadata.
Note that zip files or archive files solve the problem of handling metadata. A utility program collects multiple files together along with metadata about each file and the folders/directories they came from all within one new file (e.g. a zip file with extension .zip). The new file is also compressed and possibly encrypted, but now is transmissible as a single file across operating systems by FTP systems or attached to email. At the destination, it must be unzipped by a compatible utility to be useful, but the problems of transmission are solved this way.
The Mac OS' Hierarchical File System stores codes for creator and type as part of the directory entry for each file. These codes are referred to as OSTypes. These codes could be any 4-byte sequence, but were often selected so that the ASCII representation formed a sequence of meaningful characters, such as an abbreviation of the application's name or the developer's initials. For instance a HyperCard "stack" file has a creator of WILD (from Hypercard's previous name, "WildCard") and a type of STAK. The BBEdit text editor has a creator code of
R*ch referring to its original programmer, Rich Siegel. The type code specifies the format of the file, while the creator code specifies the default program to open it with when double-clicked by the user. For example, the user could have several text files all with the type code of TEXT, but which each open in a different program, due to having differing creator codes. This feature was intended so that, for example, human-readable plain-text files could be opened in a general purpose text editor, while programming or HTML code files would open in a specialized editor or IDE. However, this feature was often the source of user confusion, as which program would launch when the files were double-clicked was often unpredictable. RISC OS uses a similar system, consisting of a 12-bit number which can be looked up in a table of descriptions—e.g. the hexadecimal number
FF5 is "aliased" to PoScript, representing a PostScript file.
A Uniform Type Identifier (UTI) is a method used in Mac OS X for uniquely identifying "typed" classes of entity, such as file formats. It was developed by Apple as a replacement for OSType (type & creator codes).
The UTI is a Core Foundation string, which uses a reverse-DNS string. Some common and standard types use a domain called public (e.g. public.png for a Portable Network Graphics image), while other domains can be used for third-party types (e.g. com.adobe.pdf for Portable Document Format). UTIs can be defined within a hierarchical structure, known as a conformance hierarchy. Thus, public.png conforms to a supertype of public.image, which itself conforms to a supertype of public.data. A UTI can exist in multiple hierarchies, which provides great flexibility.
In addition to file formats, UTIs can also be used for other entities which can exist in OS X, including:
The HPFS, FAT12 and FAT16 (but not FAT32) filesystems allow the storage of "extended attributes" with files. These comprise an arbitrary set of triplets with a name, a coded type for the value and a value, where the names are unique and values can be up to 64 KB long. There are standardized meanings for certain types and names (under OS/2). One such is that the ".TYPE" extended attribute is used to determine the file type. Its value comprises a list of one or more file types associated with the file, each of which is a string, such as "Plain Text" or "HTML document". Thus a file may have several types.
The NTFS filesystem also allows storage of OS/2 extended attributes, as one of the file forks, but this feature is merely present to support the OS/2 subsystem (not present in XP), so the Win32 subsystem treats this information as an opaque block of data and does not use it. Instead, it relies on other file forks to store meta-information in Win32-specific formats. OS/2 extended attributes can still be read and written by Win32 programs, but the data must be entirely parsed by applications.
On Unix and Unix-like systems, the ext2, ext3, ReiserFS version 3, XFS, JFS, FFS, and HFS+ filesystems allow the storage of extended attributes with files. These include an arbitrary list of "name=value" strings, where the names are unique and a value can be accessed through its related name.
The PRONOM Persistent Unique Identifier (PUID) is an extensible scheme of persistent, unique and unambiguous identifiers for file formats, which has been developed by The National Archives of the UK as part of its PRONOM technical registry service. PUIDs can be expressed as Uniform Resource Identifiers using the info:pronom/ namespace. Although not yet widely used outside of UK government and some digital preservation programmes, the PUID scheme does provide greater granularity than most alternative schemes.
MIME types are widely used in many Internet-related applications, and increasingly elsewhere, although their usage for on-disc type information is rare. These consist of a standardised system of identifiers (managed by IANA) consisting of a type and a sub-type, separated by a slash—for instance, text/html or image/gif. These were originally intended as a way of identifying what type of file was attached to an e-mail, independent of the source and target operating systems. MIME types identify files on BeOS, AmigaOS 4.0 and MorphOS, as well as store unique application signatures for application launching. In AmigaOS and MorphOS the Mime type system works in parallel with Amiga specific Datatype system.
There are problems with the MIME types though; several organisations and people have created their own MIME types without registering them properly with IANA, which makes the use of this standard awkward in some cases.
File format identifiers is another, not widely used way to identify file formats according to their origin and their file category. It was created for the Description Explorer suite of software. It is composed of several digits of the form
NNNNNNNNN-XX-YYYYYYY. The first part indicates the organisation origin/maintainer (this number represents a value in a company/standards organisation database), the 2 following digits categorize the type of file in hexadecimal. The final part is composed of the usual file extension of the file or the international standard number of the file, padded left with zeros. For example, the PNG file specification has the FFID of
31 indicates an image file,
0015948 is the standard number and
000000001 indicates the International Organization for Standardization (ISO).
Another but less popular way to identify the file format is to examine the file contents for distinguishable patterns among file types. The contents of a file are a sequence of bytes and a byte has 256 unique permutations (0–255). Thus, counting the occurrence of byte patterns that is often referred as byte frequency distribution gives distinguishable patterns to identify file types. There are many content-based file type identification schemes that use byte frequency distribution to build the representative models for file type and use any statistical and data mining techniques to identify file types 
There are several types of ways to structure data in a file. The most usual ones are described below.
Earlier file formats used raw data formats that consisted of directly dumping the memory images of one or more structures into the file.
This has several drawbacks. Unless the memory images also have reserved spaces for future extensions, extending and improving this type of structured file is very difficult. It also creates files that might be specific to one platform or programming language (for example a structure containing a Pascal string is not recognized as such in C). On the other hand, developing tools for reading and writing these types of files is very simple.
The limitations of the unstructured formats led to the development of other types of file formats that could be easily extended and be backward compatible at the same time.
In this kind of file structure, each piece of data is embedded in a container that somehow identifies the data. The container's scope can be identified by start- and end-markers of some kind, by an explicit length field somewhere, or by fixed requirements of the file format's definition.
Throughout the 1970s, many programs used formats of this general kind. For example, word-processors such as troff, Script, and Scribe, and database export files such as CSV. Electronic Arts and Commodore-Amiga also used this type of file format in 1985, with their IFF (Interchange File Format) file format.
A container is sometimes called a "chunk", although "chunk" may also imply that each piece is small, and/or that chunks do not contain other chunks; many formats do not impose those requirements.
The information that identifies a particular "chunk" may be called many different things, often terms including "field name", "identifier", "label", or "tag". The identifiers are often human-readable, and classify parts of the data: for example, as a "surname", "address", "rectangle", "font name", etc. These are not the same thing as identifiers in the sense of a database key or serial number (although an identifier may well identify its associated data as such a key).
With this type of file structure, tools that do not know certain chunk identifiers simply skip those that they do not understand. Depending on the actual meaning of the skipped data, this may or may not be useful (CSS explicitly defines such behavior).
This concept has been used again and again by RIFF (Microsoft-IBM equivalent of IFF), PNG, JPEG storage, DER (Distinguished Encoding Rules) encoded streams and files (which were originally described in CCITT X.409:1984 and therefore predate IFF), and Structured Data Exchange Format (SDXF).
Indeed, any data format must somehow identify the significance of its component parts, and embedded boundary-markers are an obvious way to do so:
This is another extensible format, that closely resembles a file system (OLE Documents are actual filesystems), where the file is composed of 'directory entries' that contain the location of the data within the file itself as well as its signatures (and in certain cases its type). Good examples of these types of file structures are disk images, OLE documents TIFF, libraries. ODT and DOCX, being PKZip-based are chunked and also carry a directory.
3GP (3GPP file format) is a multimedia container format defined by the Third Generation Partnership Project (3GPP) for 3G UMTS multimedia services. It is used on 3G mobile phones but can also be played on some 2G and 4G phones.
3G2 (3GPP2 file format) is a multimedia container format defined by the 3GPP2 for 3G CDMA2000 multimedia services. It is very similar to the 3GP file format but consumes less space & bandwidth also has some extensions and limitations in comparison to 3GP.Audio file format
An audio file format is a file format for storing digital audio data on a computer system. The bit layout of the audio data (excluding metadata) is called the audio coding format and can be uncompressed, or compressed to reduce the file size, often using lossy compression. The data can be a raw bitstream in an audio coding format, but it is usually embedded in a container format or an audio data format with defined storage layer.BMP file format
The BMP file format, also known as bitmap image file or device independent bitmap (DIB) file format or simply a bitmap, is a raster graphics image file format used to store bitmap digital images, independently of the display device (such as a graphics adapter), especially on Microsoft Windows and OS/2 operating systems.
The BMP file format is capable of storing two-dimensional digital images both monochrome and color, in various color depths, and optionally with data compression, alpha channels, and color profiles. The Windows Metafile (WMF) specification covers the BMP file format. Among others, wingdi.h defines BMP constants and structures.Binary file
A binary file is a computer file that is not a text file. The term "binary file" is often used as a term meaning "non-text file". Many binary file formats contain parts that can be interpreted as text; for example, some computer document files containing formatted text, such as older Microsoft Word document files, contain the text of the document but also contain formatting information in binary form.Doc (computing)
In computing, DOC or doc (an abbreviation of "document") is a filename extension for word processing documents, most commonly in the proprietary Microsoft Word Binary File Format. Historically, the extension was used for documentation in plain text, particularly of programs or computer hardware on a wide range of operating systems. During the 1980s, WordPerfect used DOC as the extension of their proprietary format. Later, in 1983, Microsoft chose to use the DOC extension for their proprietary Microsoft Word format. These uses for the extension have largely disappeared from the PC world.Exif
Exchangeable image file format (officially Exif, according to JEIDA/JEITA/CIPA specifications) is a standard that specifies the formats for images, sound, and ancillary tags used by digital cameras (including smartphones), scanners and other systems handling image and sound files recorded by digital cameras. The specification uses the following existing file formats with the addition of specific metadata tags: JPEG discrete cosine transform (DCT) for compressed image files, TIFF Rev. 6.0 (RGB or YCbCr) for uncompressed image files, and RIFF WAV for audio files (Linear PCM or ITU-T G.711 μ-Law PCM for uncompressed audio data, and IMA-ADPCM for compressed audio data). It is not used in JPEG 2000 or GIF.
This standard consists of the Exif image file specification and the Exif audio file specification.Gzip
gzip is a file format and a software application used for file compression and decompression. The program was created by Jean-loup Gailly and Mark Adler as a free software replacement for the compress program used in early Unix systems, and intended for use by GNU (the "g" is from "GNU"). Version 0.1 was first publicly released on 31 October 1992, and version 1.0 followed in February 1993.ISO base media file format
ISO base media file format (ISO/IEC 14496-12 – MPEG-4 Part 12) defines a general structure for time-based multimedia files such as video and audio.
The identical text is published as ISO/IEC 15444-12 (JPEG 2000, Part 12).It is designed as a flexible, extensible format that facilitates interchange, management, editing and presentation of the media. The presentation may be local, or via a network or other stream delivery mechanism. The file format is designed to be independent of any particular network protocol while enabling support for them in general. It is used as the basis for other media file formats (e.g. container formats MP4 and 3GP).Image file formats
Image file formats are standardized means of organizing and storing digital images. Image files are composed of digital data in one of these formats that can be rasterized for use on a computer display or printer. An image file format may store data in uncompressed, compressed, or vector formats. Once rasterized, an image becomes a grid of pixels, each of which has a number of bits to designate its color equal to the color depth of the device displaying it.JPEG
JPEG ( JAY-peg) is a commonly used method of lossy compression for digital images, particularly for those images produced by digital photography. The degree of compression can be adjusted, allowing a selectable tradeoff between storage size and image quality. JPEG typically achieves 10:1 compression with little perceptible loss in image quality.JPEG compression is used in a number of image file formats. JPEG/Exif is the most common image format used by digital cameras and other photographic image capture devices; along with JPEG/JFIF, it is the most common format for storing and transmitting photographic images on the World Wide Web. These format variations are often not distinguished, and are simply called JPEG.
The term "JPEG" is an initialism/acronym for the Joint Photographic Experts Group, which created the standard. The MIME media type for JPEG is image/jpeg, except in older Internet Explorer versions, which provides a MIME type of image/pjpeg when uploading JPEG images. JPEG files usually have a filename extension of .jpg or .jpeg.
JPEG/JFIF supports a maximum image size of 65,535×65,535 pixels, hence up to 4 gigapixels for an aspect ratio of 1:1.MPEG-4 Part 14
MPEG-4 Part 14 or MP4 is a digital multimedia container format most commonly used to store video and audio, but it can also be used to store other data such as subtitles and still images. Like most modern container formats, it allows streaming over the Internet. The only official filename extension for MPEG-4 Part 14 files is .mp4. MPEG-4 Part 14 (formally ISO/IEC 14496-14:2003) is a standard specified as a part of MPEG-4.
Portable media players are sometimes advertised as "MP4 Players", although some are simply MP3 Players that also play AMV video or some other video format, and do not necessarily play the MPEG-4 Part 14 format.Portable Network Graphics
Portable Network Graphics (PNG, pronounced PEE-en-JEE or PING) is a raster-graphics file-format that supports lossless data compression. PNG was developed as an improved, non-patented replacement for Graphics Interchange Format (GIF).
PNG supports palette-based images (with palettes of 24-bit RGB or 32-bit RGBA colors), grayscale images (with or without alpha channel for transparency), and full-color non-palette-based RGB/RGBA images (with or without alpha channel). The PNG working group designed the format for transferring images on the Internet, not for professional-quality print graphics, and therefore it does not support non-RGB color spaces such as CMYK. A PNG file contains a single image in an extensible structure of "chunks", encoding the basic pixels and other information such as textual comments and integrity checks documented in RFC 2083.PNG files nearly always use the file extension PNG or png and are assigned MIME media type image/png.
PNG was published as informational RFC 2083 in March 1997 and as an ISO/IEC standard in 2004.RSS
RSS (originally RDF Site Summary; later, two competing approaches emerged, which used the backronyms Rich Site Summary and Really Simple Syndication respectively) is a type of web feed which allows users and applications to access updates to online content in a standardized, computer-readable format. These feeds can, for example, allow a user to keep track of many different websites in a single news aggregator. The news aggregator will automatically check the RSS feed for new content, allowing the content to be automatically passed from website to website or from website to user. This passing of content is called web syndication. Websites usually use RSS feeds to publish frequently updated information, such as blog entries, news headlines, or episodes of audio and video series. RSS is also used to distribute podcasts. An RSS document (called "feed", "web feed", or "channel") includes full or summarized text, and metadata, like publishing date and author's name.
A standard XML file format ensures compatibility with many different machines/programs. RSS feeds also benefit users who want to receive timely updates from favourite websites or to aggregate data from many sites.
Subscribing to a website RSS removes the need for the user to manually check the website for new content. Instead, their browser constantly monitors the site and informs the user of any updates. The browser can also be commanded to automatically download the new data for the user.
RSS feed data is presented to users using software called a news aggregator. This aggregator can be built into a website, installed on a desktop computer, or installed on a mobile device. Users subscribe to feeds either by entering a feed's URI into the reader or by clicking on the browser's feed icon. The RSS reader checks the user's feeds regularly for new information and can automatically download it, if that function is enabled. The reader also provides a user interface.STL (file format)
STL (an abbreviation of "stereolithography") is a file format native to the stereolithography CAD software created by 3D Systems. STL has several after-the-fact backronyms such as "Standard Triangle Language" and "Standard Tessellation Language". This file format is supported by many other software packages; it is widely used for rapid prototyping, 3D printing and computer-aided manufacturing. STL files describe only the surface geometry of a three-dimensional object without any representation of color, texture or other common CAD model attributes. The STL format specifies both ASCII and binary representations. Binary files are more common, since they are more compact.An STL file describes a raw, unstructured triangulated surface by the unit normal and vertices (ordered by the right-hand rule) of the triangles using a three-dimensional Cartesian coordinate system. In the original specification, all STL coordinates were required to be positive numbers, but this restriction is no longer enforced and negative coordinates are commonly encountered in STL files today. STL files contain no scale information, and the units are arbitrary.TIFF
Tagged Image File Format, abbreviated TIFF or TIF, is a computer file format for storing raster graphics images, popular among graphic artists, the publishing industry, and photographers. TIFF is widely supported by scanning, faxing, word processing, optical character recognition, image manipulation, desktop publishing, and page-layout applications. The format was created by Aldus Corporation for use in desktop publishing. It published the latest version 6.0 in 1992, subsequently updated with an Adobe Systems copyright after the latter acquired Aldus in 1994. Several Aldus or Adobe technical notes have been published with minor extensions to the format, and several specifications have been based on TIFF 6.0, including TIFF/EP (ISO 12234-2), TIFF/IT (ISO 12639), TIFF-F (RFC 2306) and TIFF-FX (RFC 3949).TIFF/EP
Tag Image File Format/Electronic Photography (TIFF/EP) is a digital image file format standard – ISO 12234-2, titled "Electronic still-picture imaging – Removable memory – Part 2: TIFF/EP image data format". This is different from the Tagged Image File Format, which is a standard administered by Adobe currently called "TIFF, Revision 6.0 Final – June 3, 1992".
The TIFF/EP standard is based on a subset of the Adobe TIFF standard, and a subset of the JEITA Exif standard, with some differences and extensions.
One of the uses of TIFF/EP is as a raw image format. A characteristic of most digital cameras (but excluding those using the Foveon X3 sensor or similar, hence especially Sigma cameras) is that they use a color filter array (CFA). Software processing a raw image format for such a camera needs information about the configuration of the color filter array, so that the raw image can identify separate data from the individual sites of the sensor. Ideally this information is held within the raw image file itself, and TIFF/EP uses the tags that begin "CFA", CFARepeatPatternDim and CFAPattern, which are only relevant for raw images.
This standard has not been adopted by most camera manufacturers – Exif/DCF is the current industry standard file organisation system which uses the Exchangeable image file format. However, TIFF/EP provided a basis for the raw image formats of a number of cameras. One example is Nikon's NEF raw file format, which uses the tag TIFF/EPStandardID (with value 126.96.36.199). Adobe's DNG (Digital Negative) raw file format was based on TIFF/EP, and the DNG specification states "DNG... is compatible with the TIFF-EP standard". Several cameras use DNG as their raw file format, so in that limited sense they use TIFF/EP too.Tar (computing)
In computing, tar is a computer software utility for collecting many files into one archive file, often referred to as a tarball, for distribution or backup purposes. The name is derived from (t)ape (ar)chive, as it was originally developed to write data to sequential I/O devices with no file system of their own. The archive data sets created by tar contain various file system parameters, such as name, time stamps, ownership, file access permissions, and directory organization. The command line utility was first introduced in the Version 7 Unix in January 1979, replacing the tp program. The file structure to store this information was standardized in POSIX.1-1988 and later POSIX.1-2001, and became a format supported by most modern file archiving systems.WAV
Waveform Audio File Format (WAVE, or more commonly known as WAV due to its filename extension; pronounced "wave" or WAV) is a Microsoft and IBM audio file format standard for storing an audio bitstream on PCs. It is an application of the Resource Interchange File Format (RIFF) bitstream format method for storing data in "chunks", and thus is also close to the 8SVX and the AIFF format used on Amiga and Macintosh computers, respectively. It is the main format used on Microsoft Windows systems for raw and typically uncompressed audio. The usual bitstream encoding is the linear pulse-code modulation (LPCM) format.Zip (file format)
ZIP is an archive file format that supports lossless data compression. A ZIP file may contain one or more files or directories that may have been compressed. The ZIP file format permits a number of compression algorithms, though DEFLATE is the most common. This format was originally created in 1989 and released to the public domain on February 14, 1989 by Phil Katz, and was first implemented in PKWARE, Inc.'s PKZIP utility, as a replacement for the previous ARC compression format by Thom Henderson. The ZIP format was then quickly supported by many software utilities other than PKZIP. Microsoft has included built-in ZIP support (under the name "compressed folders") in versions of Microsoft Windows since 1998. Apple has included built-in ZIP support in Mac OS X 10.3 (via BOMArchiveHelper, now Archive Utility) and later. Most free operating systems have built in support for ZIP in similar manners to Windows and Mac OS X.
ZIP files generally use the file extensions .zip or .ZIP and the MIME media type application/zip. ZIP is used as a base file format by many programs, usually under a different name. When navigating a file system via a user interface, graphical icons representing ZIP files often appear as a document or other object prominently featuring a zipper.