File Types and Formats: Complete Guide

Files are what allow software to hold configuration, process information, store data, and more. We all use many kinds of files every day (text, video, images, executable binaries, …). Because they are so important, this article covers the main types of files and how a file type differs from a file format and a file extension.

Index of contents

  • What is a file?
    • File Operations
  • File types
  • File formats
  • How is a format identified?
  • Convert between formats and file types

What is a file?

When we refer to a file, or computer file, we mean a sequence of bytes stored in some type of memory, usually a secondary storage medium such as a hard drive, a CD/DVD, a USB flash drive, or an SD card.

These files are organized by a file system (NTFS, ext4, HFS, …), which is what we commonly mean when we talk about the format of a drive. Within it, files can be grouped into folders or directories to keep the information organized.

A directory or folder is not a regular file but a container. It can be thought of as a special file that holds other files as well as other folders, or subdirectories.

Files also have a series of basic elements: the name that identifies the file, the extension that indicates its format (e.g. .jpg, .avi, .zip, .c), metadata (creation date, modification date, owner, size, …), and the content itself (the bytes that hold the information, which can represent text, binary data, …).
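As a minimal sketch of these elements, the following Python snippet reads the name, extension, and some metadata of a file. The function name describe_file and the sample file notes.txt are hypothetical and exist only for this illustration; which metadata fields are available depends on the operating system.

```python
import tempfile
import time
from pathlib import Path


def describe_file(path):
    """Return the basic elements of a file: name, extension and some metadata."""
    p = Path(path)
    info = p.stat()  # metadata kept by the file system
    return {
        "name": p.stem,                # file name without the extension
        "extension": p.suffix,         # for example ".txt"
        "size_bytes": info.st_size,    # size of the content in bytes
        "modified": time.ctime(info.st_mtime),
        "owner_uid": getattr(info, "st_uid", None),
    }


# Hypothetical sample file, created only for this demonstration.
with tempfile.TemporaryDirectory() as folder:
    sample = Path(folder) / "notes.txt"
    sample.write_bytes(b"hello")  # the content: five bytes
    details = describe_file(sample)
    print(details)
```

The content is whatever bytes were written, and everything else in the dictionary is information the file system keeps about those bytes.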

File Operations

Several operations can be performed on files, such as:

  • Create: files can be created by a user or by software. This is done through a system call, such as creat().
  • Open: files can be opened to access the information they contain, for reading, for writing, or both. This also requires a system call, such as open().
  • Close: once a file is no longer needed, it can be closed with the close() system call.
  • Modify: as mentioned above, an open file can also be written to in order to change its content or data.
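The operations above can be sketched with Python's os module, whose functions are thin wrappers around the corresponding system calls. This is only an illustration under some assumptions: the path example.txt is hypothetical, and creation is done through open() with the O_CREAT flag rather than a direct creat() call.

```python
import os
import tempfile

# Hypothetical path used only for this demonstration.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "example.txt")

# Create: O_CREAT asks the operating system to create the file if it is missing.
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
# Modify: write bytes through the open file descriptor.
os.write(fd, b"first line\n")
# Close: release the descriptor once it is no longer needed.
os.close(fd)

# Open again, this time read-only, and read the content back.
fd = os.open(path, os.O_RDONLY)
content = os.read(fd, 1024)
os.close(fd)

print(content)

# Remove the demonstration file and folder.
os.remove(path)
os.rmdir(folder)
```

In everyday Python code the built-in open() function, used inside a with block, is the more usual choice because it closes the file automatically.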

File types

Files can be classified by their content or purpose. Here are some of the existing types:

  • Text: as the name suggests, this type of file contains text. It can be plain text or formatted text stored in a binary format. Source code files also belong here: they are plain text files that are later compiled or interpreted, depending on the language.
  • Data: hold input and output data that programs process. Databases, which store information in an organized structure, also fall into this group. Some data files that belong to the operating system itself, such as certain drivers and shared libraries, can be included here too.
  • Audio: contain the bytes needed to reproduce sound.
  • Video: contain a sequence of frames, with or without accompanying sound.
  • Image:
    • 3D image: contains images modeled in three dimensions.
    • Raster image or bitmap: made up of a grid of pixels, with a color defined for each pixel.
    • Vector image: made up of geometric objects, each with attributes that define its properties.
  • Executable: a file in binary format that contains the instructions and data that define a program, so that it can be loaded into RAM and executed by the CPU. This type also includes similar, usually smaller files that executable programs use, such as plugins or modules that extend the capabilities of those programs.
  • Web: contain code for web pages or web apps and can be processed by web browsers.
  • Packaged: a file or set of files bundled into a single file. Examples include backup tarballs, disk images, and snapshots.
  • Encoded/encrypted: files stored in an encoded format, or encrypted to prevent third parties from accessing the content without authorization.
  • Other: there are other types of files as well, but the ones above are the most common.

File formats

Within each type of file there are different formats. The format is usually indicated by an extension, which was limited to 3 characters on old FAT file systems and DOS and can be longer on current file systems and operating systems. For example, we have extensions like .jpg, .html, .db, etc.

The format tells you exactly how the information in a file of a given type is stored. For example, .png and .jpg are both image formats, but they are different: .png can store transparency, while .jpg cannot. The difference comes from the way each format encodes the data.

There are proprietary formats, created by companies, whose encoding is often kept secret, although programs that support them can still be written through reverse engineering. There are also open file formats.

For example, here are some formats that correspond to the file types described above:

  • Text: .txt, .docx, .odt, .rtf, .c, .js, .vhdl, .cpp,…
  • Data: .csv, .ppt, .pdf, .xml, .pct, .sql, .db, .mdb, .fnt, .fon, .ttf, .sys, .dll, .a, .lnk, .msi, .ics,…
  • Audio: .mp3, .ogg, .m4a, .wav,…
  • Video: .flv, .mp4, .avi, .mpg, .mkv, …
  • Image:
    • 3D image: .obj, .3dm, .max,…
    • Raster image or bitmap: .bmp, .png, .tiff, .psd, .jpg, .gif,…
    • Vector image: .ai, .svg, .spg, .eps,…
  • Executable: .apk, .rom, .gam, .exe, .jar, …
  • Web: .html, .php, .aspx,…
  • Packaged: .zip, .rar, .xz, .7z, .gz, .bz2, .tmp, .tar, .bak, .tape, .dmg, .iso, .mdf, …
  • Encoded/encrypted: .hgx, .mim, .uue, .cypher, .cyp, .gpg, .axx, .key, .cha, .kdbx, .epm, …

How is a format identified?

File formats need to be identified in order to know what program they can be opened with, or how they can be handled. Depending on the operating system, this can be done in one way or another. Here are the most popular methods:

  • Extension: operating systems can use the extension that follows the file name to identify the type of file. Windows, DOS, and other systems use the extension this way. The drawback is that if the extension is changed, the file may be interpreted as a different type.
  • Metadata: some systems also use internal metadata to identify the file.
  • Header: the system can also read the header of the file to try to identify the format.
  • Magic number: this method, used by Unix and Unix-like systems such as Linux, relies on an identifier stored inside the file itself, usually in its first few bytes (2 bytes in the original Unix scheme, often more in current formats). It can be a code, a tag, or a shebang.
  • Type codes: macOS can also use codes stored in the file system to identify formats. These codes are known as OSTypes and are sequences of 4 bytes.
  • UTI (Uniform Type Identifier): an identifier used in macOS to replace the previous method.
  • Extended attributes: some systems, such as IBM OS/2, also used extended attributes, such as .TYPE, to identify formats.
  • MIME type: another way to identify formats, especially on the Internet. It is used by web browsers, as well as by systems such as AmigaOS, BeOS, MorphOS, Unix and Unix-like systems, etc.
  • Others: there are also other, less common methods, such as PUIDs, FFIDs, content-based identification, etc.
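To compare two of these methods, here is a minimal Python sketch that identifies a format by its extension, using the standard mimetypes module, and by its magic number, using a small hand-written table of well-known signatures. The table is deliberately incomplete, and the function names and the file name holiday.jpg are hypothetical, chosen only for this example.

```python
import mimetypes

# A few well-known signatures (magic numbers) and the format each one indicates.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"%PDF-", "PDF document"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"\x7fELF", "ELF executable"),
    (b"#!", "script with a shebang"),
]


def identify_by_content(data):
    """Return a format name if the leading bytes match a known signature."""
    for signature, name in SIGNATURES:
        if data.startswith(signature):
            return name
    return None


def identify_by_extension(filename):
    """Return the MIME type that the extension suggests, or None."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type


# A PNG header stored under a misleading, hypothetical file name.
fake_name = "holiday.jpg"
png_header = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8

by_extension = identify_by_extension(fake_name)
by_content = identify_by_content(png_header)
print(by_extension, by_content)
```

The two answers disagree here, which illustrates the weakness of the extension method: the name claims a JPEG, while the bytes say PNG.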

Convert between formats and file types

Finally, many readers wonder whether a file can be converted from one format to another, or from one type to another. Here are the answers:

  • Converting between types: converting from one type to another usually makes little sense. It is generally not possible, because there is no meaningful way to convert, for example, a video file into a text file, or an executable binary into an image file. In some cases, however, it is possible and useful. For example, you can convert a video file into an audio file to turn a music video into a music-only file for playback. Another example is converting a .jpg into a PDF document, that is, an image into a data file.
  • Converting between formats: this case is less restricted than the previous one, although it also has limitations. For example, we can convert a .png, which is usually larger, into a .jpg to reduce its size before uploading it to a website. We can convert an .avi into an .mp4, going from one codec to another, so that the video plays on a player or device that does not support the .avi format. Or we can convert a .docx into an .odt to use the native format of free office suites or other software alternatives.

Either way, conversion software is generally needed to be able to convert between types or formats.
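As a small illustration of what conversion software does, the following Python sketch converts between two data formats using only the standard library. It assumes CSV input with a header row and uses JSON as the target format, which is our choice for the example and is not one of the formats listed above. The function name and the sample data are hypothetical.

```python
import csv
import io
import json


def csv_to_json(csv_text):
    """Convert CSV text with a header row into a JSON array of objects."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2)


# Hypothetical sample data.
sample_csv = "name,extension\nimage,.png\nvideo,.mp4\n"
converted = csv_to_json(sample_csv)
print(converted)
```

The information stays the same and only its encoding changes, which is why conversions within the same type tend to be straightforward. Converting images, audio, or video follows the same idea but usually requires dedicated libraries or tools.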