Unicode Escape Converter

Turn any text into Unicode escape sequences — or paste escapes and get the text back. Pick the format your language expects: \uXXXX for JavaScript and Java, \u{1F600} for modern ECMAScript, U+1F600 notation, or HTML entities. Emoji and other astral characters are handled correctly, with surrogate pairs generated where the format needs them.

How to use the Unicode Escape Converter

Choose Encode to turn text into escape sequences, then pick the format your target language or document expects. With Only escape non-ASCII ticked, plain letters, digits, and spaces are left as-is and only characters above U+007F are escaped — the most readable result for source code. Untick it to escape every character.

Choose Decode to go the other way. Decoding is format-agnostic: it recognizes \uXXXX, \u{X}, \xXX, U+XXXX, and both hex (&#x1F600;) and decimal (&#128512;) HTML entities in the same input, so you can paste mixed text and it will all resolve. Consecutive \uXXXX surrogate pairs are recombined into the single emoji or astral character they represent.
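As a rough illustration of format-agnostic decoding, here is a minimal Python sketch. It is not this tool's actual code: the name decode, the regular expression, and the choice to turn lone surrogates into U+FFFD are all assumptions made for the example.

```python
import re

# One alternation per supported spelling; \u{X} is tried before \uXXXX.
_ESCAPE = re.compile(
    r"\\u\{([0-9A-Fa-f]{1,6})\}"   # \u{X}
    r"|\\u([0-9A-Fa-f]{4})"        # \uXXXX
    r"|\\x([0-9A-Fa-f]{2})"        # \xXX
    r"|U\+([0-9A-Fa-f]{4,6})"      # U+XXXX
    r"|&#[xX]([0-9A-Fa-f]{1,6});"  # hex HTML entity
    r"|&#([0-9]{1,7});"            # decimal HTML entity
)


def decode(text):
    """Hypothetical decoder: resolve every recognized escape in `text`."""
    units = []  # integer code units / code points, in input order
    pos = 0
    for m in _ESCAPE.finditer(text):
        units.extend(ord(ch) for ch in text[pos:m.start()])
        groups = m.groups()
        if groups[5] is not None:
            value = int(groups[5], 10)
        else:
            value = int(next(g for g in groups[:5] if g is not None), 16)
        if value > 0x10FFFF:
            # Not a valid code point: keep the escape as literal text.
            units.extend(ord(ch) for ch in m.group(0))
        else:
            units.append(value)
        pos = m.end()
    units.extend(ord(ch) for ch in text[pos:])

    out = []
    i = 0
    while i < len(units):
        u = units[i]
        nxt = units[i + 1] if i + 1 < len(units) else None
        if 0xD800 <= u <= 0xDBFF and nxt is not None and 0xDC00 <= nxt <= 0xDFFF:
            # Adjacent high + low surrogate: recombine into one code point.
            out.append(chr(0x10000 + ((u - 0xD800) << 10) + (nxt - 0xDC00)))
            i += 2
        elif 0xD800 <= u <= 0xDFFF:
            # Assumption: a lone surrogate becomes U+FFFD.
            out.append("\ufffd")
            i += 1
        else:
            out.append(chr(u))
            i += 1
    return "".join(out)


print(decode(r"caf\u00E9 \uD83D\uDE00"))  # prints: café 😀
```

Collecting integers first and pairing surrogates in a second pass keeps the two concerns separate. Note that this sketch pairs any adjacent high and low surrogate values, whichever escape spelling produced them, which may be more lenient than a real converter.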

Everything runs in your browser as you type. Use Copy output to grab the result.

Unicode escapes and the surrogate-pair trap

A Unicode escape is a way of writing a character using its code point number instead of the glyph itself. This matters whenever a character is hard to type, invisible, or unsafe to store in a particular file: a smart quote in a JSON config, an emoji in a Java .properties file, or a zero-width space you want to make visible. Each environment has its own spelling for the same idea: JavaScript and Java historically use \uXXXX with exactly four hex digits, modern JavaScript adds \u{...} that accepts a full code point, the Unicode standard itself writes U+1F600, and HTML uses numeric character references like &#x1F600; or &#128512;.

The four-hex-digit format hides a famous trap. Sixteen bits can only address code points up to U+FFFF, the Basic Multilingual Plane. Everything beyond it (most emoji, many historic scripts, mathematical alphanumerics) lives in the supplementary planes and cannot fit in a single \uXXXX. UTF-16 solves this with surrogate pairs: 0x10000 is subtracted from the code point and the remaining 20 bits are split into two 10-bit halves, carried by a high surrogate in the range U+D800–U+DBFF and a low surrogate in U+DC00–U+DFFF, written as two consecutive \u escapes. So 😀 (U+1F600) becomes \uD83D\uDE00. A converter that ignores this either fails on emoji or emits invalid escapes.
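The surrogate arithmetic is short enough to write out. The Python sketch below follows the standard UTF-16 rule; the function names to_surrogate_pair and from_surrogate_pair are made up for this example and are not part of the tool.

```python
from typing import Tuple


def to_surrogate_pair(cp: int) -> Tuple[int, int]:
    """Split a supplementary code point into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only U+10000..U+10FFFF need a surrogate pair")
    v = cp - 0x10000                  # leaves a 20-bit value
    high = 0xD800 + (v >> 10)         # top 10 bits
    low = 0xDC00 + (v & 0x3FF)        # bottom 10 bits
    return high, low


def from_surrogate_pair(high: int, low: int) -> int:
    """Recombine a (high, low) surrogate pair into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogate_pair(0x1F600)
print("\\u%04X\\u%04X" % (high, low))  # prints: \uD83D\uDE00
```

For U+1F600, subtracting 0x10000 leaves 0xF600. Its top ten bits are 0x3D and its bottom ten bits are 0x200, giving 0xD83D and 0xDE00.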

This tool encodes to whichever format you select — generating surrogate pairs only for the fixed-width \uXXXX form, and emitting the clean single code point for \u{...}, U+, and HTML entities, which can all represent supplementary characters directly. When decoding, it reads the input code point by code point, recombines any adjacent high/low surrogate pair back into one character, and accepts all the common formats at once. That round-trip safety — text in, escapes out, the same text back — is the whole point, and it is why getting the surrogate handling right matters.
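The encoding side can be sketched the same way. This is an illustrative Python version, not the tool's implementation: the function name encode, the format keys "u4", "brace", "uplus", and "html", and the decision to concatenate tokens without separators are all assumptions.

```python
def encode(text, fmt="u4", only_non_ascii=True):
    """Hypothetical encoder: escape `text` in the chosen format.

    fmt: "u4" (\\uXXXX), "brace" (\\u{X}), "uplus" (U+XXXX), "html" (&#xX;).
    """
    out = []
    for ch in text:              # Python iterates by code point
        cp = ord(ch)
        if only_non_ascii and cp <= 0x7F:
            out.append(ch)       # leave ASCII untouched
        elif fmt == "u4":
            if cp > 0xFFFF:
                # Fixed-width form: supplementary characters need a pair.
                v = cp - 0x10000
                out.append("\\u%04X\\u%04X"
                           % (0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)))
            else:
                out.append("\\u%04X" % cp)
        elif fmt == "brace":
            out.append("\\u{%X}" % cp)
        elif fmt == "uplus":
            out.append("U+%04X" % cp)
        elif fmt == "html":
            out.append("&#x%X;" % cp)
        else:
            raise ValueError("unknown format: %r" % (fmt,))
    return "".join(out)


print(encode("caf\u00e9 \U0001F600"))            # prints: caf\u00E9 \uD83D\uDE00
print(encode("caf\u00e9 \U0001F600", "brace"))   # prints: caf\u{E9} \u{1F600}
```

Only the "u4" branch ever produces surrogates. The other three formats can carry the full code point, so they write it directly.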

Common use cases

  • Embedding text in source code. Escape accented names or symbols so a string literal stays pure ASCII and survives any encoding.
  • Reading escaped strings. Decode a \uXXXX-laden JSON value or log line to see the actual characters.
  • HTML numeric entities. Produce &#x...; references for characters you cannot type directly into markup.
  • Debugging emoji. Reveal the surrogate pairs behind an emoji, or rebuild the emoji from its escapes.

Frequently asked questions

Does it handle emoji and other astral characters?

Yes. For the \uXXXX format it emits the correct UTF-16 surrogate pair; for \u{X}, U+, and HTML entities it emits the single code point. Decoding recombines surrogate pairs back into one character.

What formats can the decoder read?

It accepts \uXXXX, \u{X}, \xXX, U+XXXX, and HTML hex (&#x...;) and decimal (&#...;) entities, even mixed together in the same input.

What does "Only escape non-ASCII" do?

When on, characters in the printable ASCII range are left unchanged and only characters above U+007F are escaped — ideal for keeping source code readable. Turn it off to escape every character.

Is my text sent anywhere?

No. All conversion happens locally in your browser, so you can safely process private strings and tokens.