How to Design a Spam Filter for Gmail

This is one of the trending topics in interview questions.

I have read a few blogs but did not find any useful information for answering it. In reality, many lengthy algorithms run behind such a filter, and they are really hard to explain in an interview. However, the interviewer is not expecting those algorithms from the candidate either.

Here I am trying to figure out how a developer can think about it, and I have gathered the information below from a couple of blogs.

 — Phishing (online fraud that tricks the victim into revealing sensitive details by posing as a trustworthy sender)
 — Spam
 — Account hijacking
 — Clickjacking (malicious hyperlinks hidden beneath legitimate clickable content)

 — Text filters
 — Blacklisted domains/senders
 — Spoofed email addresses, e.g. the Greek character “Σ” used in place of the Latin character “E”
 — Unconfirmed sender
 — Messages already marked as spam, and by how many people
 — Empty message content
 — User has already tried to unsubscribe
 — Community feedback
 — Language difference
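
To make a few of these signals concrete, here is a minimal rule-based sketch in Python. It is only an illustration of how a developer might start the discussion, not how Gmail actually works. The `Email` fields, the `BLOCKED_DOMAINS` set, the weights and the threshold are all assumptions made up for this example. It covers blacklisted domains, spoofed addresses that mix alphabets, unconfirmed senders, empty content and the number of users who already reported the message.

```python
import unicodedata
from dataclasses import dataclass

# Hypothetical blocklist; a real system would load this from a maintained store.
BLOCKED_DOMAINS = {"spam.example", "phish.example"}
SPAM_THRESHOLD = 5  # assumed cut-off, chosen only for this example


@dataclass
class Email:
    sender: str
    body: str
    spam_reports: int = 0          # how many users already marked it as spam
    sender_confirmed: bool = True  # False for an unconfirmed sender


def scripts_in(text):
    """Return the alphabets (e.g. LATIN, GREEK) used by the letters in text."""
    scripts = set()
    for ch in text:
        if ch.isalpha():
            # Unicode names start with the script: "GREEK CAPITAL LETTER SIGMA"
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def spam_score(email):
    """Add up illustrative weights for each signal that fires."""
    score = 0
    domain = email.sender.rsplit("@", 1)[-1].lower()
    if domain in BLOCKED_DOMAINS:
        score += 5                      # blacklisted domain
    if len(scripts_in(email.sender)) > 1:
        score += 4                      # mixed alphabets, likely spoofed address
    if not email.sender_confirmed:
        score += 1                      # unconfirmed sender
    if not email.body.strip():
        score += 2                      # empty message content
    score += min(email.spam_reports // 10, 5)  # community reports, capped
    return score


def is_spam(email):
    return spam_score(email) >= SPAM_THRESHOLD


if __name__ == "__main__":
    ok = Email("friend@mail.example", "Lunch tomorrow?")
    spoofed = Email("billing@googlΣ.example", "", spam_reports=40,
                    sender_confirmed=False)
    print(is_spam(ok), is_spam(spoofed))
```

In an interview, the point of a sketch like this is that each signal is cheap to compute on its own and the scores can be combined. The hand-picked weights could later be replaced by a trained classifier.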

There are many other points we could add here, so please feel free to leave your comments. I am going to update this post with the HLD and LLD of this spam filter. Please share any useful links on the same topic.