What is UTF-8 in HTML?

Last Updated : 26 May, 2026

When creating websites and web applications, it is important to ensure that content displays correctly for users worldwide. Text encoding defines how characters are represented digitally, and UTF-8 (Unicode Transformation Format 8-bit) is one of the most commonly used encodings because it supports characters from many languages and scripts.

  • UTF-8 uses one to four bytes to represent characters through variable-length encoding.
  • It supports all valid Unicode character code points.
  • ASCII characters use only one byte, while all other characters use two to four bytes.
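The byte counts above can be checked directly. The following is a minimal sketch in Python, used here only for illustration. The helper name `utf8_length` and the sample characters are assumptions made for this example, not part of any standard API.

```python
# Sample characters chosen for illustration: an ASCII letter, an accented
# letter, a currency symbol, and an emoji.
samples = ["A", "\u00e9", "€", "😊"]

def utf8_length(text: str) -> int:
    """Return the number of bytes needed to encode `text` in UTF-8."""
    return len(text.encode("utf-8"))

for ch in samples:
    print(f"{ch!r}: {utf8_length(ch)} byte(s)")
```

Each sample falls into a different length class, so the printed counts range from one byte for the ASCII letter to four bytes for the emoji.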

Why Use UTF-8 in HTML

  • Global Compatibility: UTF-8 supports characters from nearly all languages, including alphabets, symbols, and emojis, making it suitable for international websites.
  • Standard Web Encoding: The HTML standard requires documents to be encoded in UTF-8, and the encoding is supported by all modern browsers and web servers.
  • Space Efficiency: UTF-8 uses a single byte for ASCII characters, making it efficient for English text and common symbols, while still supporting non-ASCII characters.

Setting UTF-8 in HTML

To ensure that the browser reads an HTML document as UTF-8, declare the encoding within the <head> section using the <meta> tag. The file itself must also be saved as UTF-8.

Syntax:

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>UTF-8 in HTML</title>
</head>
</html>

In the above example:

  • The meta tag with charset="UTF-8" ensures that the browser interprets the HTML file using the UTF-8 encoding.
  • It allows the page to display special characters and symbols correctly.
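The HTML standard expects the encoding declaration to appear within the first 1024 bytes of the document, which is why the <meta> tag is normally placed at the top of <head>. As a rough check, the sketch below looks for a charset declaration in that range. It is written in Python for illustration, the function name `declared_charset` is hypothetical, and the regular expression is a simplification, not a full HTML parser.

```python
import re
from typing import Optional

# Simplified pattern for declarations such as <meta charset="UTF-8">.
# This is an approximation for illustration, not a real HTML parser.
META_CHARSET = re.compile(
    rb"<meta[^>]*?charset\s*=\s*[\"']?([A-Za-z0-9_-]+)", re.IGNORECASE
)

def declared_charset(html_bytes: bytes) -> Optional[str]:
    """Return the charset named in a <meta> tag within the first 1024 bytes."""
    match = META_CHARSET.search(html_bytes[:1024])
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()

page = b'<!DOCTYPE html><html lang="en"><head><meta charset="UTF-8"></head></html>'
print(declared_charset(page))
```

A result of `None` suggests the declaration is missing or placed too late in the file.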

UTF-8 and Special Characters

UTF-8 enables you to include special characters directly in the HTML content. This is particularly useful when you need to display non-English characters, emojis, or mathematical symbols. For example:

<p>Smiley Face Emoji: 😊</p>
<p>Math Symbol: ∑ (summation)</p>
<p>Chinese Characters: 汉字</p>

Without the proper UTF-8 declaration, the browser might display such characters as garbled text (known as "mojibake").
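Mojibake can be reproduced by decoding UTF-8 bytes with the wrong encoding. The following Python sketch illustrates the effect. The helper name `misdecode` is hypothetical, and Latin-1 is assumed as the wrong encoding only because it can decode any byte sequence without raising an error.

```python
def misdecode(text: str, wrong_encoding: str = "latin-1") -> str:
    """Encode `text` as UTF-8, then decode the bytes with the wrong encoding."""
    return text.encode("utf-8").decode(wrong_encoding)

# The two bytes of "é" are read as two separate Latin-1 characters.
print(misdecode("\u00e9"))

# Multi-byte characters turn into longer runs of unrelated characters.
print(repr(misdecode("汉字")))
```

With these assumptions, "é" comes out as "Ã©", which is the kind of garbled text a browser shows when it guesses the encoding incorrectly.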

Verifying UTF-8 Encoding

The server should also send a Content-Type header that specifies UTF-8, because a charset given in the HTTP header takes precedence over the <meta> declaration. This can be configured in server configuration files such as .htaccess for Apache or nginx.conf for Nginx.

For Apache, we can add the following line to the .htaccess file:

AddDefaultCharset UTF-8

For Nginx, add this to the nginx.conf file:

charset utf-8;

These configurations ensure that the server tells the browser to interpret the content as UTF-8.
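Once the server is configured, the response should carry a header such as `Content-Type: text/html; charset=utf-8`. The sketch below shows one way to read the charset out of such a header value in Python, using the standard library's `email.message.Message` class. The function name `charset_from_content_type` and the sample header values are assumptions made for illustration.

```python
from email.message import Message
from typing import Optional

def charset_from_content_type(header_value: str) -> Optional[str]:
    """Return the lowercased charset parameter of a Content-Type value, if any."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns None when no charset parameter is present.
    return msg.get_content_charset()

print(charset_from_content_type("text/html; charset=UTF-8"))
print(charset_from_content_type("text/html"))
```

If the second case applies to a real response, the server is not declaring an encoding and the browser has to rely on the <meta> tag or its own detection.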

Example: This example demonstrates how to set up an HTML page with UTF-8 encoding and display text and symbols from various languages. It creates a simple webpage that showcases text in multiple languages, special characters, and emojis, all encoded with UTF-8.

HTML
<!DOCTYPE html>
<html lang="en">

<head>
    <meta charset="UTF-8">
    <title>UTF-8 HTML Example</title>
    <style>
        body {
            font-family: Arial, sans-serif;
            line-height: 1.5;
            background-color: #f4f4f4;
            margin: 0;
            padding: 20px;
        }

        .container {
            max-width: 800px;
            margin: 0 auto;
            background: white;
            padding: 20px;
            border-radius: 8px;
            box-shadow: 0 0 10px rgba(0, 0, 0, 0.1);
        }

        h1 {
            color: #333;
        }

        .example-text {
            margin-top: 20px;
            padding: 10px;
            background: #e9e9e9;
            border-radius: 5px;
        }
    </style>
</head>

<body>
    <div class="container">
        <h1>Welcome to the UTF-8 HTML Example</h1>
        <p>This page demonstrates UTF-8 encoding in an HTML 
            document, allowing the display of a wide range of
            characters, including various languages, symbols, and emojis.</p>

        <div class="example-text">
            <h2>Languages:</h2>
            <p>English: Hello, World!</p>
            <p>Spanish: ¡Hola, Mundo!</p>
            <p>Chinese: 你好,世界!</p>
            <p>Hindi: नमस्ते, दुनिया!</p>
            <p>Arabic: مرحباً بالعالم!</p>
            <p>Japanese: こんにちは、世界!</p>
            <p>Korean: 안녕하세요, 세계!</p>
        </div>

        <div class="example-text">
            <h2>Special Characters &amp; Symbols:</h2>
            <p>Mathematical Symbol: ∑ (summation)</p>
            <p>Greek Letter: Ω (Omega)</p>
            <p>Currencies: $ (Dollar), € (Euro), ¥ (Yen), ₹ (Rupee)</p>
        </div>

        <div class="example-text">
            <h2>Emojis:</h2>
            <p>Smiley Face: 😊</p>
            <p>Rocket: 🚀</p>
            <p>Thumbs Up: 👍</p>
            <p>Earth Globe: 🌍</p>
        </div>

        <footer>
            <p>&copy; 2024 UTF-8 HTML Example. All rights reserved.</p>
        </footer>
    </div>
</body>

</html>