collections

Apart from the built-in general-purpose container data structures list, dict, set, and tuple, Python provides the collections module, which implements some specialized container data types.

The following container data types are present in the collections module in Python 3.6.

  1. namedtuple(): factory function for creating tuple subclasses with named fields.
  2. deque: list-like container with fast appends and pops on either end.
  3. ChainMap: dict-like class for creating a single view of multiple mappings.
  4. Counter: dict subclass for counting hashable objects.
  5. OrderedDict: dict subclass that remembers the order entries were added.
  6. defaultdict: dict subclass that calls a factory function to supply missing values. …
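As a quick illustration of the six types listed above, here is a minimal sketch using only the standard library. The variable names and sample data (Point, settings, word_counts, and so on) are made up for this example and are not from the original post.

```python
from collections import (
    ChainMap, Counter, OrderedDict, defaultdict, deque, namedtuple,
)

# namedtuple: a tuple subclass whose fields can be read by name.
Point = namedtuple("Point", ["x", "y"])  # hypothetical example type
p = Point(x=1, y=2)

# deque: cheap appends and pops on both ends.
d = deque([1, 2, 3])
d.appendleft(0)
d.append(4)
first = d.popleft()  # removes the 0 that was just added on the left

# ChainMap: one view over several mappings, searched left to right.
defaults = {"color": "red", "user": "guest"}
overrides = {"user": "admin"}
settings = ChainMap(overrides, defaults)

# Counter: counts hashable objects.
word_counts = Counter("to be or not to be".split())

# OrderedDict: remembers insertion order and can reorder entries.
od = OrderedDict()
od["b"] = 1
od["a"] = 2
od.move_to_end("b")  # "b" is now the last key

# defaultdict: calls the factory (here, list) for missing keys.
groups = defaultdict(list)
for word in ["apple", "avocado", "banana"]:
    groups[word[0]].append(word)
```

Note that ChainMap does not copy the underlying dicts, so a later change to defaults or overrides is visible through settings.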



In this blog post, I look back at the things I did in 2019: open source contributions, conferences, talks, blogs, and more. It has been a very important year for my career for several reasons, and you will find them as you go through this blog. :)

Open Source Contribution

I started contributing to open source with Hacktoberfest 2017, and OpenFaaS was the first project I contributed to. Since then I have been an active contributor to the OpenFaaS project. By the end of 2019, I had a total of 157 commits merged into the OpenFaaS organization. …



In this blog, I will discuss the MapReduce programming model and how it works. This blog is based on the original MapReduce research paper from Google, “MapReduce: Simplified Data Processing on Large Clusters”.

MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key.

Programs written in this functional style are automatically parallelized and executed on a large cluster of commodity machines. The runtime system takes care of the details of partitioning the input data, scheduling the program’s execution across a set of machines, handling machine failures, and managing the required inter-machine communication. This allows programmers without any experience with parallel and distributed systems to easily utilize the resources of a large distributed system. …
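To make the map and reduce roles concrete, here is a minimal single-process sketch of word counting. It only illustrates the programming model: there is no partitioning, scheduling, or fault tolerance, and the names map_fn, reduce_fn, run_mapreduce, and the sample documents are assumptions made for this example.

```python
from collections import defaultdict


def map_fn(doc_name, text):
    """Map: emit an intermediate (word, 1) pair for every word in the text."""
    for word in text.split():
        yield word.lower(), 1


def reduce_fn(word, counts):
    """Reduce: merge all intermediate values that share the same key."""
    return word, sum(counts)


def run_mapreduce(inputs, map_fn, reduce_fn):
    """Toy runtime: group intermediate pairs by key, then reduce each group."""
    intermediate = defaultdict(list)
    for key, value in inputs.items():
        for ikey, ivalue in map_fn(key, value):
            intermediate[ikey].append(ivalue)
    return dict(reduce_fn(k, v) for k, v in intermediate.items())


# Hypothetical input: document name -> document contents.
documents = {
    "a.txt": "the quick fox",
    "b.txt": "the lazy dog the end",
}
result = run_mapreduce(documents, map_fn, reduce_fn)
```

In a real deployment the grouping step in run_mapreduce is what the runtime performs across machines; the user still writes only the two small functions.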


In this blog post, I will show how you can access OpenFaaS function logs from the OpenFaaS CLI.


At the time of writing, logs are available only for the Kubernetes provider for OpenFaaS, which is also called faas-netes. Log support was added in faas-netes version 0.8.0, so please make sure you are running faas-netes version 0.8.0 or later.

As of today, this feature is available if you are already using faas-netes. Stand by for support in other providers as well.

Installation

Follow the instructions here to install the latest version of OpenFaaS on Kubernetes using helm.

You should also have faas-cli installed, at version 0.8.21 or later. Follow the instructions here to install the latest version of the CLI. …



A Bloom filter is a probabilistic data structure invented by Burton Howard Bloom in 1970. It allows for membership checks in constant space and time. A Bloom filter trades exactness for efficiency and has a large number of applications in software engineering.

Some of the properties of Bloom filters are:

  • It allows for membership lookups in constant space and time. A Bloom filter can very quickly answer YES/NO questions, like “is this item in the set?”.
  • Very infrequently it gives a false positive: it says YES when the correct answer is NO. A YES therefore means “probably in the set”.
  • It never gives a false negative: it never says NO when the correct answer is YES. A NO therefore means “definitely not in the set”. …
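The properties above can be seen in a minimal sketch of a Bloom filter. This is an illustration under assumptions, not a tuned implementation: the class name, the default size of 1024 bits, the choice of 3 hash functions, and the use of salted SHA-256 digests as the hash family are all choices made for this example.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter backed by a fixed-size list of booleans."""

    def __init__(self, size=1024, num_hashes=3):
        self.size = size              # number of bits (assumed value)
        self.num_hashes = num_hashes  # number of hash functions (assumed value)
        self.bits = [False] * size

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash with a seed.
        for seed in range(self.num_hashes):
            data = f"{seed}:{item}".encode("utf-8")
            digest = hashlib.sha256(data).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        # False means "definitely not in the set".
        # True means "probably in the set" (false positives are possible).
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter()
bf.add("apple")
```

After this, bf.might_contain("apple") is guaranteed to be True, because adding an item sets every bit that a later lookup checks. A lookup for an item that was never added usually returns False, but it can return True if other items happen to have set the same bits.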



Caching is the process of storing data in a cache. A cache is a temporary storage area that is relatively small in size and has a faster access time. Whenever your application has to read data, it should first try to retrieve the data from the cache. Only if the data is not found in the cache should it try to get the data from the data store. Caching improves latency and can reduce the load on your servers and databases.
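The read path described above can be sketched in a few lines. This is a minimal illustration that uses plain dicts for both the cache and the data store; the class name CachedReader, the store_reads counter, and the sample key are assumptions made for this example, and there is no eviction or expiry.

```python
class CachedReader:
    """Reads through an in-memory cache before touching the data store."""

    def __init__(self, data_store):
        self.data_store = data_store  # stands in for a database (assumption)
        self.cache = {}
        self.store_reads = 0          # counts how often the store is hit

    def get(self, key):
        # 1. Try the cache first.
        if key in self.cache:
            return self.cache[key]
        # 2. On a miss, read from the data store and remember the result.
        self.store_reads += 1
        value = self.data_store[key]
        self.cache[key] = value
        return value


reader = CachedReader({"user:1": "Alice"})
first_read = reader.get("user:1")   # miss: goes to the data store
second_read = reader.get("user:1")  # hit: served from the cache
```

Both reads return the same value, but only the first one reaches the data store, which is where the latency and load savings come from.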

Caching can be done at different levels:

  • Client Caching

Caches are located on the client side, for example in the OS, in the browser, or in a server that acts as a client for another server, such as a reverse proxy.



A distributed system is a system in which components are located on different networked computers, which can communicate and coordinate their actions by passing messages to one another. The components interact with one another in order to achieve a common goal.

Key characteristics of distributed systems are:

  • Resource Sharing

Resource sharing means that the existing resources in a distributed system can be accessed, locally or remotely, across multiple computers in the system. Computers in a distributed system share resources like hardware (disks and printers), software (files, windows and data objects) and data. Hardware resources are shared to reduce cost and for convenience. …



Database sharding is the process of splitting up a database across multiple machines to improve the scalability of an application. In sharding, the data is broken into two or more smaller chunks, called logical shards. The logical shards are then distributed across separate database nodes, referred to as physical shards.

Database shards are autonomous: they don’t share any of the same data or computing resources. In some cases, though, it makes sense to replicate certain tables into each shard to serve as reference tables.

Often, sharding is implemented at the application level, meaning that the application includes code that decides which shard to send reads and writes to. However, some database management systems have sharding capabilities built in, allowing you to implement sharding directly at the database level. …
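As an illustration of application-level sharding, here is a minimal sketch of hash-based routing, which is one common way to pick a shard and is used here only as an example strategy. The shard names and the function shard_for are hypothetical. A stable hash from hashlib is used because Python's built-in hash() for strings changes between processes.

```python
import hashlib

# Hypothetical physical shards (names are made up for this example).
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2"]


def shard_for(key, shards=SHARDS):
    """Pick the shard that should serve reads and writes for this key."""
    digest = hashlib.sha256(str(key).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


# The same key always routes to the same shard.
target = shard_for("user-42")
```

One design caveat: with plain modulo routing, changing the number of shards remaps most keys, so schemes such as consistent hashing or a lookup table are often preferred when shards are added or removed regularly.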



The event-driven architecture pattern is a popular distributed asynchronous architecture pattern used to produce highly scalable applications. It is also highly adaptable and can be used for small applications as well as large, complex ones. The event-driven architecture is made up of highly decoupled, single-purpose event processing components that asynchronously receive and process events.

It is suitable for applications or systems that transmit events among loosely coupled software components and services. An event-driven system typically consists of event emitters (or agents), event consumers (or sinks), and event channels. Emitters are responsible for detecting, gathering, and transferring events. They know nothing about the consumers: not whether they exist, nor how they process the events. Sinks are responsible for reacting as soon as an event is produced. A sink can process the event itself, or transform it and forward it to another component.
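The roles of emitter, channel, and sink can be sketched with a tiny in-process channel. This is a simplification under assumptions: delivery here is synchronous and in-memory, whereas a real event-driven system would usually deliver events asynchronously through a queue or broker. The class EventChannel and the event names are made up for this example.

```python
from collections import defaultdict


class EventChannel:
    """Toy event channel: routes each event to the sinks subscribed to it."""

    def __init__(self):
        self._sinks = defaultdict(list)

    def subscribe(self, event_type, sink):
        self._sinks[event_type].append(sink)

    def publish(self, event_type, payload):
        # The emitter only knows the channel, never the sinks behind it.
        for sink in self._sinks[event_type]:
            sink(payload)


channel = EventChannel()
audit_log = []


def normalize(payload):
    # A sink that transforms the event and forwards it to another component.
    channel.publish("order.normalized", payload.upper())


channel.subscribe("order.created", normalize)
channel.subscribe("order.normalized", audit_log.append)

# Emitter side: publish without knowing who consumes the event.
channel.publish("order.created", "order-1")
```

After the publish call, audit_log holds the transformed event, even though the emitter never referenced either sink directly.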



Web applications were originally designed around a simple client-server model, in which the web client initiates an HTTP request for some data from the server. For example, the flow of a basic web application in the client-server model is as follows:

  1. A client makes an HTTP request for a web page from a server.
  2. The server calculates the response.
  3. The server sends the response to the client.

As developers began to explore ways to implement more “real-time” applications. …

About

Vivek Kumar Singh

Contributor@OpenFaaS, Gopher, Pythonista. https://www.viveksyngh.xyz/
