You may have heard of the concept of homoglyphs, or confusables as ีnicode calls them. To make it simplะต just because a letter on a cฮฟmputer looks like an X does not mean it is an ฮง. In fact a few of character๊ฎช in this last paragraph are not the standard Latin characters.
Above I avoided some of the more obⅴіo∪s ones. What blends in varies greatly with the font you are using and the formatting around it. Overuse of them can cause the ransom note effect. The more devious among you, or those who have read about this before, may have realized the problem with having identical-looking characters. wíkipedia.org is a great example. At first glance it may appear to be Wikipedia, but it's actually Desciclopédia, the Portuguese version of Uncyclopedia.
Those who followed the link may have noticed they are protected from such an attack: in the address bar it changes to http://xn--wkipedia-c2a.org. Different programs handle this differently. For example, Discord does the same thing, but Slack will happily show the Unicode version. The reason is that DNS requires hostnames to be in ASCII but still needs to support non-Latin characters. The solution to this is called Punycode.
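To see the translation for yourself, here is a minimal sketch using the `idna` codec that ships with Python. The helper names `to_ascii` and `to_unicode` are my own, and Python's codec follows the older IDNA 2003 rules, so treat it as an illustration rather than exactly what a browser does.

```python
def to_ascii(hostname: str) -> str:
    """Encode a Unicode hostname into the ASCII form that DNS sees."""
    # The built-in "idna" codec Punycode-encodes every label that
    # contains non-ASCII characters and adds the "xn--" prefix.
    return hostname.encode("idna").decode("ascii")


def to_unicode(hostname: str) -> str:
    """Decode an ASCII hostname back into its Unicode form."""
    return hostname.encode("ascii").decode("idna")


lookalike = "w\u00edkipedia.org"  # i with an acute accent, not a plain i

print(to_ascii(lookalike))        # the xn-- form shown in the address bar
print(to_ascii("wikipedia.org"))  # plain ASCII passes through unchanged
print(to_unicode(to_ascii(lookalike)) == lookalike)
```

A program that always displays the result of `to_ascii` makes the lookalike obvious, which is roughly what the browser is doing in the address bar.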
Unicode is massive. As of this writing the latest version is 11.0, which contains 137,439 characters from 146 different scripts. These range from languages in active use to things like Inscriptional Parthian and Linear B. There is even some fun stuff in there like ︘ (U+FE18), whose name contains a misspelling of bracket as brakcet. There are 1,114,112 code points in the Unicode codespace (a limit set by UTF-16), meaning that just over a tenth are used. Redundancy is therefore not a problem, which may go some way toward explaining why homoglyphs are allowed to exist.
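If you want to check a couple of those numbers, the sketch below uses Python's `unicodedata` module. The character count is simply the figure quoted above, not something the code measures, and the installed Python will carry a newer Unicode version than 11.0.

```python
import sys
import unicodedata

# U+FE18, the character whose official name contains the typo.
print(unicodedata.name("\uFE18"))

# The codespace is 17 planes of 65,536 code points each.
codespace = 17 * 2**16
print(codespace, codespace == sys.maxunicode + 1)

# Share of the codespace taken by the 137,439 characters quoted above.
print(f"{137_439 / codespace:.1%}")
```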
If you want to play with what I am about to talk about yourself, head over to the website I made and give it a try: https://textconstructor.y42.xyz/
Cyrillic
Cyrillic is by far one of the easiest scripts in which to find equivalents to Latin characters. It should be obvious why it is in Unicode, since it is the script used by many people.
а с е о р х у - Lowercase Russian Cyrillic
a c e o p x y - Lowercase Latin
А В С Е Н І Ј К М О Р Ѕ Т Х - Uppercase Russian Cyrillic
A B C E H I J K M O P S T X - Uppercase Latin
і ј ԛ ѕ ԝ Ү Ғ Ԍ - Misc Cyrillic
i j q s w Y F G - Latin
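The pairs above look the same but are entirely different code points. As a rough sketch of how a program could notice that, the snippet below takes the first word of each letter's Unicode name as a crude stand-in for its script. The function name `scripts_in` is my own invention, and real confusable detection is considerably more involved than this.

```python
import unicodedata


def scripts_in(text: str) -> set[str]:
    """Return the first word of each letter's Unicode name, e.g. LATIN."""
    found = set()
    for ch in text:
        if ch.isalpha():
            # Names look like "CYRILLIC SMALL LETTER A"; the first word
            # is only an approximation of the script property.
            found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found


latin = "cape"
cyrillic = "\u0441\u0430\u0440\u0435"  # Cyrillic letters that look like c a p e
mixed = "c\u0430pe"                    # only the a is Cyrillic

print(latin == cyrillic)     # the strings differ despite looking alike
print(scripts_in(latin))
print(scripts_in(cyrillic))
print(scripts_in(mixed))     # more than one script in a word is a warning sign
```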
Greek
Greek shares an overlap with Latin characters, mostly capitals. Interestingly it also has an overlap with Cyrillic which I will not get into here.
Α Β Ε Η Ι Κ Μ Ν Ο Ρ Τ Χ Υ Ζ ο ν - Greek
A B E H I K M N O P T X Y Z o v - Latin
Armenian
There is not as much overlap here, but there are still a few shared characters.
Լ Տ օ ո ս - Armenian
L S o n u - Latin
Roman Numerals
A rather odd thing to include, given that they are just Latin letters. However, they were added to Unicode to make compatibility with older character encodings easier.
Ⅰ Ⅴ Ⅹ Ⅼ Ⅽ Ⅾ Ⅿ ⅰ ⅴ ⅹ ⅼ ⅽ ⅾ ⅿ - Roman numerals
I V X L C D M i v x l c d m - Latin
Bold/Italic/Sans-serif
These are all functionally the same as normal Latin characters but with some sort of formatting involved. Unicode says you should not use these for presentation markup, meaning you should not just substitute the bold set in to display a bolded Latin character. The sans-serif set will appear identical to the normal Latin set in a sans-serif font.
𝐚𝐛𝐜𝐝𝐞𝐟𝐠𝐡𝐢𝐣𝐤𝐥𝐦𝐧𝐨𝐩𝐪𝐫𝐬𝐭𝐮𝐯𝐰𝐱𝐲𝐳𝐀𝐁𝐂𝐃𝐄𝐅𝐆𝐇𝐈𝐉𝐊𝐋𝐌𝐍𝐎𝐏𝐐𝐑𝐒𝐓𝐔𝐕𝐖𝐗𝐘𝐙 - Bold
𝘢𝘣𝘤𝘥𝘦𝘧𝘨𝘩𝘪𝘫𝘬𝘭𝘮𝘯𝘰𝘱𝘲𝘳𝘴𝘵𝘶𝘷𝘸𝘹𝘺𝘻𝘈𝘉𝘊𝘋𝘌𝘍𝘎𝘏𝘐𝘑𝘒𝘓𝘔𝘕𝘖𝘗𝘘𝘙𝘚𝘛𝘜𝘝𝘞𝘟𝘠𝘡 - Italic
𝙖𝙗𝙘𝙙𝙚𝙛𝙜𝙝𝙞𝙟𝙠𝙡𝙢𝙣𝙤𝙥𝙦𝙧𝙨𝙩𝙪𝙫𝙬𝙭𝙮𝙯𝘼𝘽𝘾𝘿𝙀𝙁𝙂𝙃𝙄𝙅𝙆𝙇𝙈𝙉𝙊𝙋𝙌𝙍𝙎𝙏𝙐𝙑𝙒𝙓𝙔𝙕 - Bold Italic
𝖠𝖡𝖢𝖣𝖤𝖥𝖦𝖧𝖨𝖩𝖪𝖫𝖬𝖭𝖮𝖯𝖰𝖱𝖲𝖳𝖴𝖵𝖶𝖷𝖸𝖹𝖺𝖻𝖼𝖽𝖾𝖿𝗀𝗁𝗂𝗃𝗄𝗅𝗆𝗇𝗈𝗉𝗊𝗋𝗌𝗍𝗎𝗏𝗐𝗑𝗒𝗓 - Sans Serif (these also come in bold, italic, and bold/italic variants)
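Because these sets are laid out in alphabetical order, you can build one with simple arithmetic, and Unicode's NFKC normalization will fold it back to plain Latin. The sketch below does this for the bold set only; `to_math_bold` is a name I made up for illustration.

```python
import unicodedata


def to_math_bold(text: str) -> str:
    """Swap ASCII letters for their Mathematical Bold counterparts."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            # U+1D400 is MATHEMATICAL BOLD CAPITAL A.
            out.append(chr(0x1D400 + ord(ch) - ord("A")))
        elif "a" <= ch <= "z":
            # U+1D41A is MATHEMATICAL BOLD SMALL A.
            out.append(chr(0x1D41A + ord(ch) - ord("a")))
        else:
            out.append(ch)
    return "".join(out)


bold = to_math_bold("Unicode")
print(bold)
print(bold == "Unicode")                    # different code points
print(unicodedata.normalize("NFKC", bold))  # folded back to plain Latin
```

Running text through NFKC before comparing it is one common way to stop these sets from sneaking past a filter, though it does nothing for lookalikes from other scripts such as Cyrillic.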
Bubble
This is a very old part of Unicode, though it has been updated and expanded since it first appeared. It was used for making things like lists. ® © ℗ are not actually part of this set; they are considered unique symbols instead.
ⓐⓑⓒⓓⓔⓕⓖⓗⓘⓙⓚⓛⓜⓝⓞⓟⓠⓡⓢⓣⓤⓥⓦⓧⓨⓩⒶⒷⒸⒹⒺⒻⒼⒽⒾⒿⓀⓁⓂⓃⓄⓅⓆⓇⓈⓉⓊⓋⓌⓍⓎⓏ - Circled Latin
🅐🅑🅒🅓🅔🅕🅖🅗🅘🅙🅚🅛🅜🅝🅞🅟🅠🅡🅢🅣🅤🅥🅦🅧🅨🅩 - Negative Circled Latin
Small Capitals
Like normal capital letters but smaller. These are meant for phonetic notation such as the IPA. This set is missing the x, and the f, q, and s are not available on some systems.
ᴀʙᴄᴅᴇꜰɢʜɪᴊᴋʟᴍɴᴏᴘꞯʀꜱᴛᴜᴠᴡʏᴢ - Small caps
Superscript / Subscript
The most interesting thing here is the lack of the letter q in superscript lowercase. If this post does one thing, I hope it starts momentum to fix this oversight. Both subscript and superscript capital are missing a number of characters.
ᴬᴮᴰᴱᴳᴴᴵᴶᴷᴸᴹᴺᴼᴾᴿᵀᵁᵂ - Superscript capital
ᵃᵇᶜᵈᵉᶠᵍʰⁱʲᵏˡᵐⁿᵒᵖʳˢᵗᵘᵛʷˣʸᶻ - Superscript lowercase
ₐₑₕᵢⱼₖₗₘₙₒₚᵣₛₜᵤᵥₓ - Subscript
Upside Down Text
This one is a hodgepodge of different characters, and different people implement the idea differently. Wikipedia actually has a chart comparing sites that have done this (the site I made uses the list on that page), which may be close to the most ridiculous feature comparison on Wikipedia. Most of the truly flipped characters are used in IPA.
Z⅄XMΛ∩⊥SᴚტԀONW˥ꓘſIH⅁ℲƎᗡƆᗺ∀zʎxʍʌnʇsɹbdouɯןʞɾıɥƃɟǝpɔqɐ - Upside Down Latin
Parenthesized
These are from the same block as the bubble letters above and are used for a similar purpose.
⒜⒝⒞⒟⒠⒡⒢⒣⒤⒥⒦⒧⒨⒩⒪⒫⒬⒭⒮⒯⒰⒱⒲⒳⒴⒵ - Parenthesized Latin
Squared
From the same block as the Parenthesized and Bubble letters. This block also contains the regional symbols (🇺 🇸), which are used to make the country flag emojis. The negative version of squared does not render the same for every character because some are also used as emoji for blood types (A, B, O, and AB) and parking (the P). There are a few combination squares as well: 🆑🆒🆓🆔🆕🆖🆗🆘🆙🆚 🆋🆌🆍🆎🆏🆐
🄰🄱🄲🄳🄴🄵🄶🄷🄸🄹🄺🄻🄼🄽🄾🄿🅀🅁🅂🅃🅄🅅🅆🅇🅈🅉 - Squared
🅰🅱🅲🅳🅴🅵🅶🅷🅸🅹🅺🅻🅼🅽🅾🅿🆀🆁🆂🆃🆄🆅🆆🆇🆈🆉 - Negative Squared
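The regional symbols mentioned in this section turn into flags when two of them sit next to each other and spell a country code. A minimal sketch of that pairing follows; the function name `flag` is my own, and whether you see a flag or two boxed letters depends entirely on your font and platform.

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A


def flag(country_code: str) -> str:
    """Map a two-letter country code to its pair of regional indicators."""
    code = country_code.upper()
    if len(code) != 2 or not all("A" <= c <= "Z" for c in code):
        raise ValueError("expected two ASCII letters, e.g. 'US'")
    return "".join(chr(REGIONAL_INDICATOR_A + ord(c) - ord("A")) for c in code)


us = flag("US")
print(us)
print([f"U+{ord(c):04X}" for c in us])  # the two code points behind the flag
```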
Blackboard Bold
This group is used for math, specifically number sets. It is not new, and is thought to have come from a 1965 textbook on complex analysis.
𝕒𝕓𝕔𝕕𝕖𝕗𝕘𝕙𝕚𝕛𝕜𝕝𝕞𝕟𝕠𝕡𝕢𝕣𝕤𝕥𝕦𝕧𝕨𝕩𝕪𝕫𝔸𝔹ℂ𝔻𝔼𝔽𝔾ℍ𝕀𝕁𝕂𝕃𝕄ℕ𝕆ℙℚℝ𝕊𝕋𝕌𝕍𝕎𝕏𝕐ℤ - Blackboard Bold
Full Width
The official text of ａｅｓｔｈｅｔｉｃ. This is a holdover from early usage of Chinese on computers. A Chinese character is closer to a square, so it takes up two character slots on a terminal. To keep the formatting of ASCII text consistent, full-width characters were added, which take up two normal ASCII character slots.
ａｂｃｄｅｆｇｈｉｊｋｌｍｎｏｐｑｒｓｔｕｖｗｘｙｚＡＢＣＤＥＦＧＨＩＪＫＬＭＮＯＰＱＲＳＴＵＶＷＸＹＺ - Full Width Latin
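The full-width forms mirror printable ASCII at a fixed offset, so converting is one addition per character. Here is a small sketch; `to_fullwidth` is a hypothetical helper, and mapping the space to the ideographic space is my own choice rather than anything required.

```python
import unicodedata

# FULLWIDTH EXCLAMATION MARK (U+FF01) minus "!" (U+0021).
FULLWIDTH_OFFSET = 0xFF01 - 0x21


def to_fullwidth(text: str) -> str:
    """Shift printable ASCII into the full-width block."""
    out = []
    for ch in text:
        if "!" <= ch <= "~":
            out.append(chr(ord(ch) + FULLWIDTH_OFFSET))
        elif ch == " ":
            out.append("\u3000")  # IDEOGRAPHIC SPACE, also two slots wide
        else:
            out.append(ch)
    return "".join(out)


wide = to_fullwidth("aesthetic")
print(wide)
print(unicodedata.east_asian_width(wide[0]))  # F: full width
print(unicodedata.east_asian_width("a"))      # Na: narrow
print(unicodedata.normalize("NFKC", wide))    # back to plain ASCII
```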
Script
Unicode includes another set of letters for the purpose of mathematics. It is quite fancy looking and comes in normal and bold variants. It has less support than many other items on this list; notably, ChromeOS cannot display it.
𝒶𝒷𝒸𝒹𝑒𝒻𝑔𝒽𝒾𝒿𝓀𝓁𝓂𝓃𝑜𝓅𝓆𝓇𝓈𝓉𝓊𝓋𝓌𝓍𝓎𝓏𝒜𝐵𝒞𝒟𝐸𝐹𝒢𝐻𝐼𝒥𝒦𝐿𝑀𝒩𝒪𝒫𝒬𝑅𝒮𝒯𝒰𝒱𝒲𝒳𝒴𝒵 - Script Latin
𝓪𝓫𝓬𝓭𝓮𝓯𝓰𝓱𝓲𝓳𝓴𝓵𝓶𝓷𝓸𝓹𝓺𝓻𝓼𝓽𝓾𝓿𝔀𝔁𝔂𝔃𝓐𝓑𝓒𝓓𝓔𝓕𝓖𝓗𝓘𝓙𝓚𝓛𝓜𝓝𝓞𝓟𝓠𝓡𝓢𝓣𝓤𝓥𝓦𝓧𝓨𝓩 - Bold Script Latin
Fraktur
This might be one of the weirder alphabets on this list. There are 400 years of history behind the usage of it. Up until the early 20th century it was the German lettering of choice. The move away from it was a dispute worthy of a Wikipedia article. It does contain the entire Latin alphabet so it is included here.
𝔞𝔟𝔠𝔡𝔢𝔣𝔤𝔥𝔦𝔧𝔨𝔩𝔪𝔫𝔬𝔭𝔮𝔯𝔰𝔱𝔲𝔳𝔴𝔵𝔶𝔷𝔄𝔅ℭ𝔇𝔈𝔉𝔊ℌℑ𝔍𝔎𝔏𝔐𝔑𝔒𝔓𝔔ℜ𝔖𝔗𝔘𝔙𝔚𝔛𝔜ℨ - Fraktur
𝖆𝖇𝖈𝖉𝖊𝖋𝖌𝖍𝖎𝖏𝖐𝖑𝖒𝖓𝖔𝖕𝖖𝖗𝖘𝖙𝖚𝖛𝖜𝖝𝖞𝖟𝕬𝕭𝕮𝕯𝕰𝕱𝕲𝕳𝕴𝕵𝕶𝕷𝕸𝕹𝕺𝕻𝕼𝕽𝕾𝕿𝖀𝖁𝖂𝖃𝖄𝖅 - Bold Fraktur
This is not a comprehensive list. Unicode is massive and they continue to add to it every year. In the future more homoglyphs may appear. I welcome any feedback, interesting tidbits, and corrections. There will be an attempt to keep this article up to date but no guarantee. I hope this inspires you to look at the text of Unicode in a different way and make your own discoveries about the weirdness it contains.