Thoughts on URLs and Dropcaps in Medium (Tamil Language)


International URLs

I’ve noticed that the recent round of shiny new updates on Medium has introduced an issue with the URLs of posts written in non-Latin scripts, specifically in the Indic language Tamil.

In the earlier version, when I wrote a post in Tamil and published it, the URL looked like this: https://medium.com/tamil-writing/b1f739cd7c3f. Medium evidently omitted the Unicode characters from the URL to keep it clean and unique.

In the latest version, however, when I publish a post in Tamil, the URL looks like this in the address bar:

The address bar shows the Unicode characters as they are, with the correct characters. But when someone wants to share the article and copies the URL, this is what they get:

https://medium.com/tamil-writing/%E0%AE%B5%E0%AF%86%E0%AE%AF%E0%AE%BF%E0%AE%B2%E0%AE%BF%E0%AE%B2%E0%AF%8D-%E0%AE%AE%E0%AE%B4%E0%AF%88-%E0%AE%A4%E0%AF%87%E0%AE%9F%E0%AF%81%E0%AE%95%E0%AE%BF%E0%AE%B1%E0%AE%BE%E0%AE%A9%E0%AF%8D-54090a86868d

If someone clicks “Share on Twitter”, they also get a popup like this:

This is an inconvenient way to handle URLs that contain Indic letters. Medium uses percent-encoding for the URL, which becomes very long, at least for Indic languages: each Tamil character in the URL above turns into nine characters, such as %E0%AE%B5.
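To make the length problem concrete, here is a small Python sketch that decodes the last part of the shared URL above and compares the two lengths. It only uses the standard `urllib.parse` module; the variable names are my own, and it is meant as an illustration rather than a description of how Medium builds its URLs.

```python
from urllib.parse import quote, unquote

# Last path segment of the shared URL quoted above.
encoded = (
    "%E0%AE%B5%E0%AF%86%E0%AE%AF%E0%AE%BF%E0%AE%B2%E0%AE%BF%E0%AE%B2%E0%AF%8D"
    "-%E0%AE%AE%E0%AE%B4%E0%AF%88"
    "-%E0%AE%A4%E0%AF%87%E0%AE%9F%E0%AF%81%E0%AE%95%E0%AE%BF%E0%AE%B1%E0%AE%BE%E0%AE%A9%E0%AF%8D"
    "-54090a86868d"
)

# Percent-escapes -> UTF-8 bytes -> Unicode text.
decoded = unquote(encoded)

# Every non-ASCII character here is a Tamil code point.
tamil = [ch for ch in decoded if ord(ch) > 127]

print(decoded)
print(len(decoded), "characters as shown in the address bar")
print(len(encoded), "characters after percent-encoding")

# Encoding the readable form again gives back the long form.
print(quote(decoded) == encoded)
```

Each Tamil code point takes three bytes in UTF-8, and each byte is written as a three-character escape, which is why the shared link grows so much.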

To get around this issue, different platforms have adopted different approaches.

For example, WordPress has an option called the post slug, where the author can set a specific word or phrase to use in the URL. This overrides the default behaviour, in which WordPress itself builds the URL from the title. That default is the same as Medium's current approach.

Wikipedia provides a custom shortened link for each article in order to avoid this issue.

URL in the address bar of the Tamil version of Wikipedia
A shortened URL is provided under the title of each article.

But I think none of these approaches would suit the Medium platform. Medium is unique, minimal and simple. All of the above add another layer of work for the publisher, which is not user-friendly at all.

I found that Medium already has a solution built in. I reckon it is only a matter of enabling it for language-specific posts.

I believe Medium has a unique post ID for each post, which is the heart of the whole Medium system, including its awesome analytics. Any post can be accessed by its unique ID, even without knowing the whole URL. The post I mentioned above can also be accessed at this URL: https://medium.com/p/54090a86868d

Using this URL in the sharing options would get rid of the percent-encoding issue, I believe. This is my humble opinion and proposed solution for the URL issue.
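As a rough sketch of the idea, the short link could be derived from the full post URL like this. The function name `short_share_url` is hypothetical, and the code assumes that the post ID is the run of lowercase hex digits at the end of the URL, as it is in the two URLs quoted in this post. I don't know how Medium handles this internally.

```python
import re
from urllib.parse import urlsplit


def short_share_url(post_url: str) -> str:
    """Build a https://medium.com/p/<id> link from a full post URL.

    Assumption: the post ID is the run of lowercase hex digits at the end
    of the last path segment, either on its own or after the final hyphen.
    """
    last_segment = urlsplit(post_url).path.rstrip("/").rsplit("/", 1)[-1]
    match = re.search(r"(?:^|-)([0-9a-f]+)$", last_segment)
    if match is None:
        raise ValueError(f"no post ID found in {post_url!r}")
    return f"https://medium.com/p/{match.group(1)}"


# Old-style URL: the last segment is just the ID.
print(short_share_url("https://medium.com/tamil-writing/b1f739cd7c3f"))

# New-style URL: percent-encoded title, then a hyphen, then the ID.
print(short_share_url(
    "https://medium.com/tamil-writing/%E0%AE%AE%E0%AE%B4%E0%AF%88-54090a86868d"
))
```

The share button could then hand this short form to Twitter or the clipboard, while the address bar keeps showing the readable Tamil title.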

Drop caps


Before I dive into the drop cap cases, I want to give some insight into how Tamil characters work. The Tamil alphabet has 247 characters (excluding Grantha, which is another set of characters derived from another Indic script).

Most of the letters (216 of them, from 18 consonants times 12 vowels) are formed by combining a consonant and a vowel. As a result, a single letter sometimes consists of two glyphs together.

Here is a quick infographic I made to illustrate the formation of letters.

A quick infographic, illustrating the letter formation in Tamil language.
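The same formation can be sketched in a few lines of Python. This builds the 216 compound letters by attaching each vowel sign to each consonant and prints the Unicode code points behind the first few. The list names are my own. Note that it counts code points as stored in text, which is related to, but not the same as, the glyphs a font finally draws.

```python
# The 18 Tamil consonants, each carrying the inherent vowel "a".
CONSONANTS = "கஙசஞடணதநபமயரலவழளறன"

# The 11 dependent vowel signs. Together with the inherent "a"
# (no sign added) they cover the 12 vowels.
VOWEL_SIGNS = [
    "",        # a (inherent, nothing added)
    "\u0bbe",  # aa
    "\u0bbf",  # i
    "\u0bc0",  # ii
    "\u0bc1",  # u
    "\u0bc2",  # uu
    "\u0bc6",  # e
    "\u0bc7",  # ee
    "\u0bc8",  # ai
    "\u0bca",  # o
    "\u0bcb",  # oo
    "\u0bcc",  # au
]

# One compound letter per (consonant, vowel) pair.
compound_letters = [c + sign for c in CONSONANTS for sign in VOWEL_SIGNS]

print(len(compound_letters))  # 18 * 12 = 216

# Show the first consonant's row with the code points behind each letter.
for letter in compound_letters[:12]:
    print(letter, [f"U+{ord(ch):04X}" for ch in letter])
```

Every letter in that list reads as a single letter to a Tamil reader, yet most of them are stored as two code points.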

Considering the letter formation and the fluctuating baseline of the glyphs, I personally believe that automating drop caps for Tamil in any writing system will be fairly complex. The following are some of my takes on implementing drop caps for Tamil.

Regular drop cap

Regular drop cap without bold phrase
Regular drop cap with bold phrase

As you can see in the drop cap பே, the letter has two glyphs even though it is a single letter as a whole. The drop cap formatting algorithm needs to take this caveat into account.
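A minimal sketch of what that could mean in code: instead of taking the first character of the paragraph, take the first base character together with every combining mark that follows it. The function `first_letter` and the sample words are my own choices for illustration. This is a simplification of proper grapheme cluster segmentation (Unicode UAX #29), so a real implementation would probably rely on a text segmentation library instead.

```python
import unicodedata


def first_letter(text: str) -> str:
    """Return the first base character plus any combining marks after it.

    Simplified rule: keep consuming characters while their Unicode
    general category is a mark (Mn, Mc or Me).
    """
    if not text:
        return ""
    end = 1
    while end < len(text) and unicodedata.category(text[end]).startswith("M"):
        end += 1
    return text[:end]


word = "\u0baa\u0bc7\u0b9a\u0bcd\u0b9a\u0bc1"  # sample word starting with பே

naive = word[0]            # only the consonant, the vowel sign is lost
letter = first_letter(word)  # consonant and vowel sign kept together

print(naive, [f"U+{ord(ch):04X}" for ch in naive])
print(letter, [f"U+{ord(ch):04X}" for ch in letter])
```

The vowel sign in பே is stored after the consonant but drawn to its left, so styling only the first stored character would pull the letter apart.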

A two-letter word

Two-letter word drop cap

The example above, a drop cap for a two-letter word, looks nice and fits well with the article. But consider the following scenario.

Two-letter word with three glyphs

Here is a two-letter word with three glyphs. The drop cap looks long compared to a regular drop cap, which can distract from readability.

Also, the first letter மூ has a long extension running underneath it, which affects the phrase beneath. (See the example above.)

I hope these details will help you implement the drop cap feature for Tamil in the awesome Medium. If you need more information, please do buzz me.

Thank you.

Tharique Azeez (@enathu)