System Interaction Design of Sponsored SMS systems

A case study on how these messaging systems can be built

Vaibhav Singh
Javarevisited
5 min read · Jul 11, 2021

The startup world has been buzzing over the last few years with news of messaging service providers such as Line and Twilio going public in the US at billion-dollar valuations; the latest one in that trend is MessageBird.

This trend has now reached India, where home-grown messaging service providers like Route Mobile have gone public and a few others like SMS Gupshup are in line to ride the IPO bandwagon.

What’s Special about an SMS Service?

Well, it’s the scale these delivery services deal with. The message itself is tiny (just 160 characters, roughly 140 bytes with the 7-bit GSM encoding), but the volumes are on the scale of 100+ million messages a day at least, which translates to 3 to 4 billion messages a month at minimum.

So how do you build a system like this?

Well, that’s the topic of this blog, so without wasting any more time let’s dive into my case study of the system interaction design of a Sponsored SMS system.
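To make the scale concrete, here is a small back-of-the-envelope sketch of the arithmetic. The class and method names are mine, and the 30-day month and 140 bytes per message (160 seven-bit characters) are simplifying assumptions, not measurements.

```java
// Back-of-the-envelope sizing; all names here are hypothetical.
final class CapacityEstimate {
    static final long SECONDS_PER_DAY = 24L * 60 * 60; // 86,400
    static final long BYTES_PER_SMS = 140;             // assumption: 160 chars * 7 bits / 8

    /** Average messages per second for a daily volume (integer division, rounds down). */
    static long averagePerSecond(long messagesPerDay) {
        return messagesPerDay / SECONDS_PER_DAY;
    }

    /** Monthly volume, assuming a 30-day month. */
    static long perMonth(long messagesPerDay) {
        return messagesPerDay * 30;
    }

    /** Raw message payload per day in bytes, ignoring headers and metadata. */
    static long payloadBytesPerDay(long messagesPerDay) {
        return messagesPerDay * BYTES_PER_SMS;
    }
}
```

For 100 million messages a day this gives an average of about 1,157 messages per second and 3 billion messages a month. Promotional bursts will push the peak rate well above that average.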

System Interaction Design

Design Considerations

  • Has to be fast, with very low latency, i.e. delivery within a few seconds
  • Handle volumes of at least 200 million messages a day
  • Has to be lossless and reliable for transactional messaging
  • Handle traffic bursts (promotional traffic arrives in short windows)
  • Low infrastructure cost; preferably it should auto-scale
  • Capable of message filtering, spam detection, and scrubbing
  • Support various protocols for incoming and outbound traffic
  • Provide real-time tracking and reporting at a granular level

Above are a few of the demands placed on a sponsored messaging system like Twilio’s. It should be heavily distributed, built from microservices, load balanced, cache enabled, and backed by file storage. We may have to build custom solutions for some of this, but I will not deep dive into the implementation of those libraries/solutions here (I will write a blog on that later).

The basic flow is:

USERS → SPONSORED SMS SYSTEM → OPERATOR

First, who are the users of this system? The answer is businesses that use promotional and transactional messaging, e.g. banks, e-commerce portals, promotional campaigners, etc. Operators are the mobile service providers like Airtel, Vodafone, Reliance Jio, etc. The system diagram below represents the information that follows:

Interaction flow of Sponsored SMS System

Considerations of the Sponsored SMS System

A Sponsored SMS System provides multiple ways to send the 160-character SMS. One is the SMPP protocol, supported by the SMPP MsgListener Service. A second could be uploading a file to the Web Portal (or an approval-based/custom solution).

But I believe 90% of messages are submitted via HTTP directly to the FrontAPI component, as sponsored systems are usually integrated with different business software. Submission to FrontAPI can be done via synchronous APIs (which accept the request, do a basic authenticity check, and return a response) or asynchronous APIs (which just accept the request and process it at a later point in time).

Systems downstream of FrontAPI have to be asynchronous to support the volume; otherwise too many machines would be required. This can be handled with a distributed queue (Redis-based, RabbitMQ, etc.) or a custom asynchronous file-based queue (optimized for object storage, since Java serialization can get messy), with messages processed later.

(More on the custom asynchronous file-based queue in my upcoming blogs.)

Producer-consumer patterns, the LMAX Disruptor, and their multithreaded variants should be implemented to achieve high performance and squeeze every bit of juice out of the infrastructure.

Synchronous DB calls on the message path have to be absolutely zero to achieve scale; for this, writes to the DB should happen asynchronously from MQs (message queues). Table data should be served through a write-through cache layer built on Redis or Memcached.

A Near Real Time (NRT) reporting dashboard is also needed to track the progress of each sponsored SMS as it passes through all the applications. This can be achieved by sending events to another system, called the Event Processor, using a transporter such as Scribe. Note that the scale of this system is almost 10 times that of the messaging path: one SMS generates about 10 different events, so daily reporting and analytics run on at least 1 billion events.
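As a rough illustration of the producer-consumer idea, here is a minimal sketch using only the JDK's `BlockingQueue`; it is not the LMAX Disruptor and not a production design. The class name `SmsPipeline`, the poison-pill marker, and the counter that stands in for real processing are all assumptions made for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class SmsPipeline {
    // Hypothetical shutdown marker; a real message must never equal it.
    private static final String POISON_PILL = "__STOP__";

    private final BlockingQueue<String> queue;
    private final ExecutorService workers;
    private final int workerCount;
    private final AtomicInteger processed = new AtomicInteger();

    SmsPipeline(int capacity, int workerCount) {
        // Bounded queue: producers block when consumers fall behind (back-pressure).
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.workerCount = workerCount;
        this.workers = Executors.newFixedThreadPool(workerCount);
        for (int i = 0; i < workerCount; i++) {
            workers.submit(this::consume);
        }
    }

    /** Producer side: blocks while the queue is full. */
    void submit(String message) throws InterruptedException {
        queue.put(message);
    }

    /** Consumer loop: runs on each worker thread until it sees the poison pill. */
    private void consume() {
        try {
            while (true) {
                String message = queue.take();
                if (POISON_PILL.equals(message)) {
                    return;
                }
                processed.incrementAndGet(); // placeholder for scrubbing/dispatch work
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Sends one pill per worker, waits for the workers, returns the processed count. */
    int shutdownAndCount() throws InterruptedException {
        for (int i = 0; i < workerCount; i++) {
            queue.put(POISON_PILL);
        }
        workers.shutdown();
        workers.awaitTermination(10, TimeUnit.SECONDS);
        return processed.get();
    }
}
```

Because the queue is FIFO and the pills are enqueued after all messages, every message is consumed before the workers stop. The bounded capacity is the part that matters for burst traffic: it turns overload into blocking at the producer instead of unbounded memory growth.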
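A minimal sketch of the write-through idea follows, with a `ConcurrentHashMap` standing in for the Redis or Memcached layer and plain functions standing in for the table access. The class name and the `loader`/`writer` parameters are hypothetical; a real deployment would use an actual cache client and would also need expiry and invalidation, which are left out here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;
import java.util.function.Function;

final class WriteThroughCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>(); // stand-in for Redis/Memcached
    private final Function<K, V> loader;   // hypothetical: reads one row from the table
    private final BiConsumer<K, V> writer; // hypothetical: writes one row to the table

    WriteThroughCache(Function<K, V> loader, BiConsumer<K, V> writer) {
        this.loader = loader;
        this.writer = writer;
    }

    /** Served from the cache; falls back to the table only on a miss. */
    V get(K key) {
        // If the loader returns null, nothing is cached and null is returned.
        return cache.computeIfAbsent(key, loader);
    }

    /** Write-through: persist first, then update the cache, so the cache never leads the table. */
    void put(K key, V value) {
        writer.accept(key, value);
        cache.put(key, value);
    }
}
```

With this shape, hot lookups such as account configuration never reach the DB after the first read, which is what keeps DB calls off the message path.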

Roles & Responsibilities of various Components

FrontAPI should do the following (note that it has to return its response within milliseconds):

  • Input validation
  • Account authenticity checks without DB calls
  • SMS template matching
  • Use a distributed cache for the lookups above
  • Pass the message asynchronously to the SMS Engine component

WebHook Service is another outbound component; it collects incoming responses from users (for example, viewers of Kaun Banega Crorepati, India’s version of Who Wants to Be a Millionaire, sent their replies to a 4-digit short code) and forwards them to the company through the configured URL.
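The FrontAPI responsibilities listed above could be sketched roughly as follows. Everything here is illustrative: the `FrontApi` class, the status names, the 10 to 15 digit number format, and the in-memory map and queue that stand in for the distributed cache and the hand-off to the SMS Engine are all assumptions.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.regex.Pattern;

final class FrontApi {
    enum Status { ACCEPTED, INVALID_INPUT, UNKNOWN_ACCOUNT, TEMPLATE_MISMATCH }

    // Assumed number format: 10 to 15 digits, no '+' prefix.
    private static final Pattern MSISDN = Pattern.compile("\\d{10,15}");

    // Stand-in for the distributed cache: account id -> approved message template.
    private final Map<String, Pattern> templatesByAccount;
    // Stand-in for the asynchronous hand-off to the SMS Engine.
    private final Queue<String[]> outbound = new ConcurrentLinkedQueue<>();

    FrontApi(Map<String, Pattern> templatesByAccount) {
        this.templatesByAccount = templatesByAccount;
    }

    /** Validates, checks account and template from the cache only, then enqueues. */
    Status submit(String accountId, String msisdn, String text) {
        if (accountId == null || msisdn == null || text == null
                || text.isEmpty() || text.length() > 160
                || !MSISDN.matcher(msisdn).matches()) {
            return Status.INVALID_INPUT;
        }
        Pattern template = templatesByAccount.get(accountId); // no DB call
        if (template == null) {
            return Status.UNKNOWN_ACCOUNT;
        }
        if (!template.matcher(text).matches()) {
            return Status.TEMPLATE_MISMATCH;
        }
        outbound.add(new String[] {accountId, msisdn, text});
        return Status.ACCEPTED; // the caller gets an answer before any sending happens
    }

    int queued() {
        return outbound.size();
    }
}
```

The point of the shape is that `submit` touches only memory and a queue, which is what makes a millisecond response realistic.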

SMS Engine (basically the heart of the system):

  • Bifurcation, diversion, and distribution of promotional and transactional traffic to Orchestrator instances
  • Applying various scrubbing logic, e.g. NV (Number Validation), NCPR (National Customer Preference Register of TRAI), MNP (Mobile Number Portability), BlockNo service, etc.
  • Filtering spam and scheduled messages and moving them to the corresponding SpamHolder/SchedulerHolder components

The SchedulerHolder component’s job is to store the message until the scheduled date and then apply the same scrubbing rules for that day, since scrubbing data changes on a daily and weekly basis.

SpamHolder is the gating component for promotional messages that contain objectionable or spam content. It has word-engine logic that marks these messages as spam and informs the customer relationship team of such campaigns.

Orchestrator is the distributor component that asynchronously receives the heavy traffic and distributes it across servers based on where the account is configured.

MessageDispatcher is the sending-logic processor. It uses greedy algorithms to choose which operator to send through, since some operators are cheaper in specific areas of India while others cost more but guarantee delivery. This component therefore implements algorithms such as Leaky Bucket, along with optimization on cost, delivery, and time parameters.

The logging component asynchronously receives and stores all outgoing messages in tables/files (binary or Parquet format) for reconciliation. This acts as one source of input for the Reconciliation component, whose job is to create the usage and billing report for each customer.
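One way to picture the scrubbing step is as an ordered chain of checks where the first failing rule decides the outcome. This is only a sketch: the `ScrubbingChain` and `Rule` names are mine, and real NV, NCPR, MNP, and BlockNo checks would query their own data sets rather than the simple predicates passed in here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

final class ScrubbingChain {
    /** A named check; the predicate returns true when the number must be rejected. */
    static final class Rule {
        final String name;
        final Predicate<String> rejects;

        Rule(String name, Predicate<String> rejects) {
            this.name = name;
            this.rejects = rejects;
        }
    }

    private final List<Rule> rules;

    ScrubbingChain(List<Rule> rules) {
        this.rules = new ArrayList<>(rules); // defensive copy, order is preserved
    }

    /** Returns the name of the first rule that rejects the number, or empty if all pass. */
    Optional<String> firstRejection(String msisdn) {
        for (Rule rule : rules) {
            if (rule.rejects.test(msisdn)) {
                return Optional.of(rule.name);
            }
        }
        return Optional.empty();
    }
}
```

Keeping the rules as data makes it easy to reload them when the scrubbing data changes and to apply different chains to promotional and transactional traffic.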
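The word engine could be as simple as a blocked-word lookup over the tokens of a message. The sketch below is a deliberately naive version under that assumption; the class name and the word list are hypothetical, and a real engine would also have to deal with obfuscated spellings and multiple languages.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class SpamWordEngine {
    private final Set<String> blockedWords = new HashSet<>();

    SpamWordEngine(String... words) {
        for (String word : words) {
            blockedWords.add(word.toLowerCase(Locale.ROOT));
        }
    }

    /** True when any token of the message is on the blocked list (case-insensitive). */
    boolean isSpam(String text) {
        // Split on anything that is not a letter or a digit.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (blockedWords.contains(token)) {
                return true;
            }
        }
        return false;
    }
}
```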
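To make the dispatcher idea concrete, here is a hedged sketch that combines a leaky bucket (used as a rate meter per operator) with a greedy cheapest-first choice. The class names, the per-operator cost field, and the injected clock are assumptions for illustration; the actual selection logic would also weigh delivery guarantees and time parameters.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.function.LongSupplier;

/** Leaky bucket as a meter: each send adds one unit, the level drains at a fixed rate. */
final class LeakyBucket {
    private final long capacity;
    private final double leakPerMilli;
    private final LongSupplier clockMillis; // injected so the sketch is testable
    private double level;
    private long lastLeak;

    LeakyBucket(long capacity, double leakPerSecond, LongSupplier clockMillis) {
        this.capacity = capacity;
        this.leakPerMilli = leakPerSecond / 1000.0;
        this.clockMillis = clockMillis;
        this.lastLeak = clockMillis.getAsLong();
    }

    synchronized boolean tryAcquire() {
        long now = clockMillis.getAsLong();
        level = Math.max(0.0, level - (now - lastLeak) * leakPerMilli); // drain since last call
        lastLeak = now;
        if (level + 1 > capacity) {
            return false; // bucket would overflow: this operator is saturated for now
        }
        level += 1;
        return true;
    }
}

final class Operator {
    final String name;
    final double costPerSms; // hypothetical cost unit
    final LeakyBucket bucket;

    Operator(String name, double costPerSms, LeakyBucket bucket) {
        this.name = name;
        this.costPerSms = costPerSms;
        this.bucket = bucket;
    }
}

final class MessageDispatcher {
    /** Greedy choice: the cheapest operator whose bucket still has room. */
    static Optional<Operator> pick(List<Operator> operators) {
        List<Operator> byCost = new ArrayList<>(operators);
        byCost.sort(Comparator.comparingDouble(o -> o.costPerSms));
        for (Operator operator : byCost) {
            if (operator.bucket.tryAcquire()) {
                return Optional.of(operator);
            }
        }
        return Optional.empty();
    }
}
```

When the cheapest operator is saturated, traffic spills over to the next cheapest one and returns as soon as the bucket drains, which is the greedy behaviour described above.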

MsgSender:

  • Submits the message to the operator SMSC
  • Collects the SMPP submit response from the operator SMSC (operators deal in SMPP messages)
  • Identifies whether the message is to be retried or not
  • Collects delivery reports from the operator SMSC and passes them on to the Delivery Report System for further processing
  • Sends incoming responses to the Web Hook Message Receiver Service, which in turn forwards them to the WebHook Service, which forwards each response to the customer’s configured URL

SMSRetryCallback System: This is the retry system. When the first try has failed, it resets the properties of the SMS to point to another operator, so that the MessageDispatcher retries it on a different operator SMSC.
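The retry behaviour can be sketched as bookkeeping on the message itself: remember which operators have already failed and hand back the next untried one. The class name, the attempt limit, and the operator identifiers are hypothetical; how the real system chooses the fallback operator is not specified here.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

final class RetryableSms {
    final String id;
    private final int maxAttempts; // assumed cap on total tries
    private final Set<String> triedOperators = new LinkedHashSet<>();

    RetryableSms(String id, int maxAttempts) {
        this.id = id;
        this.maxAttempts = maxAttempts;
    }

    /**
     * Records the failed operator and returns the next untried candidate,
     * or empty when the attempts are exhausted or no candidate is left.
     */
    Optional<String> nextOperatorAfterFailure(String failedOperator, List<String> candidates) {
        triedOperators.add(failedOperator);
        if (triedOperators.size() >= maxAttempts) {
            return Optional.empty();
        }
        return candidates.stream()
                .filter(candidate -> !triedOperators.contains(candidate))
                .findFirst();
    }
}
```

An empty result is the signal to stop retrying and report the message as failed.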

Delivery Report (DR) System:

This microservice should handle the delivery reports, persisting them to the DB/files and caching them alongside. Cached DRs speed up the reconciliation process done by the MsgReconciliation component.

The Event Processor’s responsibilities should be:

  • Provide Near Real Time (NRT) tracking of every message
  • Notify the technical support team of failures within the system
  • Provide a warehousing system to generate reports and analyze trends by region, circle, time, operator, etc.
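For the tracking and reporting side, a tiny in-memory sketch of per-operator event counting looks like this. It only illustrates the aggregation idea; the class name, the event type strings, and the key format are assumptions, and the real Event Processor would consume events from a transport and write to a warehouse.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

final class EventCounters {
    // Key format "operator|eventType" is an assumption made for this sketch.
    private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();

    /** Thread-safe increment; LongAdder keeps contention low under heavy write load. */
    void record(String operator, String eventType) {
        counts.computeIfAbsent(operator + "|" + eventType, key -> new LongAdder()).increment();
    }

    long count(String operator, String eventType) {
        LongAdder adder = counts.get(operator + "|" + eventType);
        return adder == null ? 0L : adder.sum();
    }
}
```

The same pattern extends to region, circle, or time bucket by adding those dimensions to the key.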

Questions? Suggestions? Comments?
