The second part of the article where I review and compare various Pandas optimization tools.


In the previous article, we looked at some simple ways to speed up pandas through JIT compilation and multiprocessing, using tools like Numba and Pandarallel. This time we will talk about more powerful tools that let you not only speed up pandas but also run it on a cluster, which makes it possible to process big data.

  • Numba
  • Multiprocessing
  • Pandarallel

Chapter 2:

  • Swifter
  • Modin
  • Dask

Swifter

Swifter is another small but smart pandas wrapper. Depending on the situation, it chooses the most effective of the available optimization methods: vectorization, parallelization, or the plain pandas implementation. …


In this article I will tell you about six tools that can significantly speed up your pandas code. For most of them, you just install the module and add a couple of lines of code.


Pandas has long been an indispensable tool for any developer thanks to its simple and understandable API, as well as its rich set of tools for cleaning, exploring and analyzing data. And everything would be fine, but when the data does not fit into RAM or requires complex calculations, pandas performance is not enough.

In this article, I will not describe qualitatively different approaches to data analysis, such as Spark or DataFlow. Instead, I will describe six interesting tools and demonstrate the results of their use:

  • Numba
  • Multiprocessing
  • Pandarallel

Chapter 2:

  • Swifter
  • Modin
  • Dask

Numba

This tool directly accelerates Python itself. Numba is a JIT compiler that likes loops, mathematical operations and NumPy, the library at the core of pandas. Let’s check in practice what advantages it gives. …
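As a rough illustration of the kind of code Numba is good at, here is a minimal sketch of a plain numeric loop compiled with `njit`. The function `sum_of_squares` is a hypothetical example, not taken from the article, and the fallback decorator is an assumption added so that the snippet still runs (without any speed-up) when Numba is not installed.

```python
# Optional dependency: fall back to a no-op decorator when Numba is missing,
# so the sketch still runs, just without JIT compilation.
try:
    from numba import njit
except ImportError:
    def njit(func):
        return func


@njit
def sum_of_squares(n):
    # A tight numeric loop: slow in pure Python, fast once compiled.
    total = 0.0
    for i in range(n):
        total += i * i
    return total


print(sum_of_squares(10))
```

The first call includes compilation time, so any benefit shows up only on repeated calls or large inputs.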


In this article I am going to show how to control the throughput of a queue in systems based on a distributed task queue or, in simpler terms, how to set its rate limit. As an example, I’ll take Python and my favorite Celery + RabbitMQ kit, although the algorithm that I use does not depend on these tools and can be implemented on any other stack.


So what’s the problem?

First, a few words about the kind of problem I’m trying to solve. The fact is that 99.9% of Internet services restrict access to their resources: they do not let you hit them with 100/1000 req/s and threaten to return 403 or 500 in response. Greedy, aren’t they? Sometimes even your own DB can act as such a service… Well, you can’t trust anyone nowadays, so you need to limit yourself somehow.

Of course, if we have only one process, there is no problem. But we work with Celery, which means we may have not only N processes (hereinafter referred to as workers) but also M servers, and synchronizing all of this no longer seems so trivial. …
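For reference, the easy single-process case might look like the token bucket below. This is a generic technique, not necessarily the algorithm the article goes on to describe; the class name `TokenBucket` and its parameters are hypothetical. With several workers on several servers, the bucket state would have to live in storage shared by all of them, which is exactly the synchronization problem mentioned above.

```python
import time


class TokenBucket:
    """Allow about `rate` operations per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum number of stored tokens
        self.clock = clock        # injectable clock (an assumption, for testing)
        self.tokens = capacity
        self.updated = clock()

    def try_acquire(self):
        now = self.clock()
        # Refill in proportion to the elapsed time, never above capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A caller would check `try_acquire()` before each request and retry later when it returns `False`.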


Lightning network

In the previous article, we analysed in detail how payment channels work and several methods of securing the payments that pass through them. However, that is not enough to build a functioning network of channels: even if we are sure that everyone inside each channel is playing fair, we cannot guarantee the delivery of funds along a route that passes through several channels. That is where the smart contracts known as HTLCs (hashed timelock contracts) come in. …
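The two conditions that give an HTLC its name can be modelled in a few lines. The sketch below is a toy model under simplifying assumptions, not Bitcoin Script: the class `HTLC`, its method names and the use of a block height as the timeout are all hypothetical, and SHA-256 is used for the hash lock purely for illustration.

```python
import hashlib


class HTLC:
    """Toy hash-and-time-locked payment: claim with the secret or refund after timeout."""

    def __init__(self, amount, payment_hash, timeout_height):
        self.amount = amount
        self.payment_hash = payment_hash      # hex digest of the secret preimage
        self.timeout_height = timeout_height  # height from which a refund is allowed
        self.settled = False

    def claim(self, preimage, current_height):
        # Hash lock: the receiver must reveal the preimage before the timeout.
        if self.settled or current_height >= self.timeout_height:
            return False
        if hashlib.sha256(preimage).hexdigest() != self.payment_hash:
            return False
        self.settled = True
        return True

    def refund(self, current_height):
        # Time lock: the sender gets the funds back once the timeout has passed.
        if self.settled or current_height < self.timeout_height:
            return False
        self.settled = True
        return True


secret = b"coffee"
htlc = HTLC(amount=1000,
            payment_hash=hashlib.sha256(secret).hexdigest(),
            timeout_height=100)
print(htlc.claim(b"wrong guess", current_height=50))
print(htlc.claim(secret, current_height=50))
```

Because revealing the preimage is what unlocks the funds, the same secret can be reused to settle every hop along a route.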


Lightning network

Learn how to pay for a cup of coffee with bitcoins.

The Lightning Network is a decentralized off-chain technology that allows tens of thousands of transactions per second, similar to what can be done, for instance, with Visa. Currently Bitcoin, the most popular cryptocurrency in the world, cannot support more than approximately seven transactions per second, and high fees together with long confirmation times rule out microtransactions. The Lightning Network solves both problems.

Table of contents

  • Introduction
  • Payment channels
  • Simple payment channel example
  • Trustless channels
  • Using timelocks
  • Asymmetric revocable commitments
  • Conclusion
  • Links

Introduction

The Lightning Network is a system of payment channels, which are nothing more than ordinary multisig wallets. To open a channel, the parties create a multisig wallet and send funds to it. The amount received becomes the balance of the channel, and all subsequent transactions between the participants are made outside the blockchain. The channel can be closed at any moment by either party. In this case, the last off-chain transaction, which determines the final balance of the channel, is sent to the network and invalidates all intermediate transactions, because they all spend the same output. As a result, we need only one transaction to open the channel and one more to close it, while all the intermediate transactions are made instantaneously, without a record in the blockchain. …
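The bookkeeping described above can be sketched as a toy model. It ignores signatures, fees and the multisig script entirely; the class `PaymentChannel`, the party names and the counters are hypothetical and exist only to show that any number of off-chain updates costs just two on-chain transactions.

```python
class PaymentChannel:
    """Toy two-party channel: one funding transaction, one closing transaction."""

    def __init__(self, deposit_alice, deposit_bob):
        self.balances = {"alice": deposit_alice, "bob": deposit_bob}
        self.onchain_txs = 1       # the funding transaction
        self.offchain_updates = 0  # balance updates never broadcast
        self.is_open = True

    def pay(self, sender, receiver, amount):
        if not self.is_open:
            raise RuntimeError("channel is closed")
        if amount <= 0 or self.balances[sender] < amount:
            raise ValueError("invalid amount")
        # Off-chain: only the latest balance split matters.
        self.balances[sender] -= amount
        self.balances[receiver] += amount
        self.offchain_updates += 1

    def close(self):
        # Only the final state is broadcast, as the closing transaction.
        self.is_open = False
        self.onchain_txs += 1
        return dict(self.balances)


channel = PaymentChannel(deposit_alice=10, deposit_bob=0)
for _ in range(3):
    channel.pay("alice", "bob", 1)
print(channel.close(), channel.onchain_txs, channel.offchain_updates)
```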



The scalability of Bitcoin is one of its main problems and a focus of active efforts. One of the proposed solutions is, for instance, the Lightning Network, but its implementation does not yet seem possible because of several vulnerabilities. Another solution, Segregated Witness, also aims to improve scalability, but it solves a number of other problems as well, including the aforementioned vulnerability that stands in the way of the Lightning Network. In this article, we will have a look at the advantages of Segregated Witness and describe how it works.

Segregated Witness, or SegWit, is a soft fork described in a series of BIPs (141, 142, 143, 144 and 145), whose main aim is to optimise the structure of transactions and blocks by moving signatures (called ‘scriptSig’, ‘witness’ or ‘unlocking script’) from the transaction to a separate structure. Not only does this reduce the size of transactions, leaving more room in blocks, but it also solves the issue of transaction malleability (the vulnerability we spoke about at the beginning of the text), which is crucial for technologies such as payment channels or Lightning that are built on the Bitcoin transaction structure. …
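Why moving signatures out of the hashed data removes malleability can be shown with a toy example. The byte strings below are placeholders, not real Bitcoin serialization, and the helper names `legacy_txid` and `segwit_txid` are hypothetical; the only point is that an identifier computed without the signature cannot change when the signature encoding is altered.

```python
import hashlib


def double_sha256(data):
    # Bitcoin transaction ids are a double SHA-256 of the serialized data.
    return hashlib.sha256(hashlib.sha256(data).digest()).hexdigest()


def legacy_txid(body, signature):
    # Legacy layout: the signature is part of the hashed data.
    return double_sha256(body + signature)


def segwit_txid(body, signature):
    # SegWit layout: the signature lives in the witness and is not hashed.
    return double_sha256(body)


body = b"inputs-and-outputs"
sig = b"signature"
tweaked_sig = b"signature-reencoded"  # same meaning, different bytes

print(legacy_txid(body, sig) == legacy_txid(body, tweaked_sig))
print(segwit_txid(body, sig) == segwit_txid(body, tweaked_sig))
```

A stable identifier matters for payment channels because later transactions refer to earlier, still unconfirmed ones by their id.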
