From Understanding It to Utilizing It in Your Tests

Photo by Susan Yin on Unsplash

When you start writing unit tests for your project, you may need to understand unittest.mock in Python. Imagine that you are building a library that interacts with Google Spreadsheet and you want to test it. Do you need to connect to Google Spreadsheet for every test? That sounds really time-consuming. And what if your project gets bigger and bigger? The number of HTTP connections will be huge! But don’t worry: you don’t need to speed up your Wi-Fi or call Google to explain that your requests aren’t a DoS attack :)

In this post, I’ll explain unittest.mock, a built-in library for testing in…
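As a minimal sketch of the idea, the test below replaces a spreadsheet client with a Mock object, so no HTTP request is made. The function count_rows and the client method get_rows are hypothetical names invented for this illustration; they are not part of any Google API.

```python
from unittest import mock


def count_rows(client, sheet_id):
    """Return the number of rows in a sheet (hypothetical library function)."""
    rows = client.get_rows(sheet_id)  # a real client would hit the network here
    return len(rows)


def test_count_rows():
    # Stand-in for the real client: get_rows returns canned data.
    fake_client = mock.Mock()
    fake_client.get_rows.return_value = [["a", 1], ["b", 2]]

    assert count_rows(fake_client, "sheet-123") == 2
    # Check that the code under test called the client as expected.
    fake_client.get_rows.assert_called_once_with("sheet-123")


test_count_rows()
```

Because the mock records every call, the test can check both the return value and how the client was used.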


Smaller Code, Less Pain

Photo by Ferenc Horvath on Unsplash

For an NLP task, you might need to tokenize text or build a vocabulary during pre-processing. You have probably experienced pre-processing code that is as messy as your desk. Forgive me if your desk is clean :) I have had that experience too. That’s why I created LineFlow to ease your pain! It will make your “desk” as clean as possible. What does the real code look like? Take a look at the figure below. The pre-processing includes tokenization, building the vocabulary, and indexing.
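For reference, here is roughly what those three steps look like in plain Python. This sketch does not use LineFlow's API; the sample sentences, the whitespace tokenizer, and the "<unk>" token are assumptions made for illustration.

```python
from collections import Counter

texts = ["the cat sat", "the dog sat down"]  # toy corpus (assumption)

# 1. Tokenization: a naive whitespace split.
tokenized = [text.split() for text in texts]

# 2. Vocabulary: map each token to an integer id; id 0 is reserved for unknowns.
counter = Counter(token for tokens in tokenized for token in tokens)
vocab = {"<unk>": 0}
for token, _count in counter.most_common():
    vocab[token] = len(vocab)


# 3. Indexing: replace every token with its id.
def to_ids(tokens):
    return [vocab.get(token, vocab["<unk>"]) for token in tokens]


indexed = [to_ids(tokens) for tokens in tokenized]
```

Even for a toy corpus, the steps are spread over several loops and helper structures, which is the kind of clutter a pre-processing library aims to hide.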


Understanding Its Restriction and How to Avoid It

TL;DR

You can write the function below to use multiprocessing with a lambda function:

Photo by Mike Enerio on Unsplash

When Do We Need Multiprocessing?

When you handle tons of text files or images, you might want to use multiprocessing to speed up the processing. An intuitive way to do this in Python is shown below:

import multiprocessing

with multiprocessing.Pool() as p:
    result = p.map(lambda x: x ** 2, range(100))

Unfortunately, this won’t work: multiprocessing pickles the function to send it to the worker processes, and the standard pickle module cannot serialize a lambda function or a closure. You can find the details in the Why? section below.
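The restriction can be observed without starting any worker process, because it comes from pickle itself. The following is a minimal sketch; square and is_picklable are helper names made up for this illustration.

```python
import pickle


def square(x):
    """A module-level function: pickle stores it by its importable name."""
    return x ** 2


def is_picklable(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# A lambda has no importable name, so pickle cannot look it up again.
lambda_ok = is_picklable(lambda x: x ** 2)
# The named function is defined at module level, so it can be found by name.
named_ok = is_picklable(square)
```

This is why passing a function defined with def at module level to Pool.map is the usual workaround.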

When you google this problem, you’ll find people suggesting joblib, pathos, or something like that…


Photo by Denys Nevozhai on Unsplash

The purpose of the shortest paths problem is to find the shortest path from the starting vertex to the goal vertex. Algorithms that solve it are widely used, from competitive programming to Google Maps directions search. Once you understand the key notion, “edge relaxation”, it becomes much easier to understand concrete algorithms such as Dijkstra’s algorithm or the Bellman-Ford algorithm. In other words, it might be difficult to make these algorithms your own without understanding edge relaxation. In this post, I focus on edge relaxation and explain the general structure for solving the shortest paths problem. Also, we’ll…
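To make edge relaxation concrete, here is a minimal sketch that pairs a relax function with a Bellman-Ford style loop. The example graph and all names are assumptions made for illustration, and the sketch leaves out negative-cycle detection.

```python
import math


def relax(u, v, weight, dist, prev):
    """Relax edge (u, v): keep the shorter of the known path and the path via u."""
    if dist[u] + weight < dist[v]:
        dist[v] = dist[u] + weight
        prev[v] = u
        return True
    return False


def bellman_ford(vertices, edges, source):
    """Relax every edge up to |V| - 1 times, stopping early once nothing changes."""
    dist = {v: math.inf for v in vertices}
    prev = {v: None for v in vertices}
    dist[source] = 0
    for _ in range(len(vertices) - 1):
        updated = False
        for u, v, w in edges:
            if relax(u, v, w, dist, prev):
                updated = True
        if not updated:
            break
    return dist, prev


# Directed edges as (from, to, weight) triples.
edges = [("s", "a", 4), ("s", "b", 1), ("b", "a", 2), ("a", "t", 1)]
dist, prev = bellman_ford(["s", "a", "b", "t"], edges, "s")
```

Here the direct edge from s to a costs 4, but relaxing the edge from b to a lowers the distance of a to 3, which in turn lowers the distance of t to 4.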


Say Goodbye to Your Messy Code!

Photo by Jamie Templeton on Unsplash

Do you happen to know the library AllenNLP? If you’re working on Natural Language Processing (NLP), you might have heard the name. However, I guess only a few people actually use it. Others have tried it before but didn’t know where to start because it has so many functions. For those who aren’t familiar with AllenNLP, I will give a brief overview of the library and explain the advantages of integrating it into your project.

AllenNLP is a deep learning library for NLP. The Allen Institute for Artificial Intelligence, which is one of the leading research organizations of Artificial…


Grab your own powerful IPython in a very simple way

TL;DR

Just copy the alias below and paste it into your .bashrc, .zshrc, or another shell configuration file:

alias ipy="ipython --no-confirm-exit --no-banner --quick --InteractiveShellApp.extensions=\"['autoreload']\" --InteractiveShellApp.exec_lines=\"['%autoreload 2', 'import os,sys']\""
Photo by Geetanjal Khanna on Unsplash

I often use IPython to develop my libraries or do research, because it has great features such as the following:

  • Run common shell commands: ls, cp, rm, etc.
  • Also, run any shell command with !(some command).
  • IPython provides a lot of magic commands: run, debug, timeit, etc.
  • Great autocompletion
  • Preload your favorite modules in Python
  • Autoreload Extension

You probably already know that IPython provides a good Python interpreter, but you may also know that you…


Photo by Sébastien Marchand on Unsplash

There are two fundamental graph search methods: breadth-first search (BFS) and depth-first search (DFS). In this post, I’ll explain depth-first search, focusing on its relation to topological sorting. Topological sorting is closely related to dynamic programming, which you should know when you tackle competitive programming. I use Python for the implementation. If you’d like to learn about breadth-first search, check my other post: Understanding the Breadth-First Search with Python.

1. The algorithm of the depth-first search

In depth-first search, we keep visiting vertices until we reach a dead end where we cannot find…
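A minimal recursive sketch of depth-first search that also produces a topological order might look like this. It assumes the input is a directed acyclic graph given as an adjacency dict (no cycle detection), and the example graph is made up for illustration.

```python
def dfs_topological_sort(graph):
    """Return the vertices of a DAG in topological order using DFS."""
    visited = set()
    finished = []  # vertices in the order their DFS calls finish

    def visit(u):
        visited.add(u)
        for v in graph.get(u, []):
            if v not in visited:
                visit(v)
        # Every vertex reachable from u is finished, so u can be recorded.
        finished.append(u)

    for u in graph:
        if u not in visited:
            visit(u)
    # Reversing the finishing order puts each vertex before its successors.
    return finished[::-1]


graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
order = dfs_topological_sort(graph)
```

For very deep graphs the recursion can hit Python's recursion limit, in which case an explicit stack is the usual alternative.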


The Important Data Structure for Search Algorithms

Photo by Rick Mason on Unsplash

Today I will explain the heap, which is one of the basic data structures. Famous search algorithms such as Dijkstra's algorithm and A* use the heap. A* can appear in the Hidden Markov Model (HMM), which is often applied to time-series pattern recognition. Please note that this post isn’t about search algorithms. I’ll explain how a heap works, its time complexity, and its Python implementation. The MIT OpenCourseWare lecture really helped me understand the heap, so I follow its explanations, summarizing a little and adding some Python implementations…
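As a quick sketch of the data structure, Python's standard heapq module maintains a binary min-heap inside a plain list. The sample data and the is_min_heap helper are assumptions added for illustration; this uses the standard library and is not the implementation developed in the post.

```python
import heapq


def is_min_heap(items):
    """Check the heap property: every parent is <= both of its children."""
    n = len(items)
    for i in range(n):
        for child in (2 * i + 1, 2 * i + 2):
            if child < n and items[i] > items[child]:
                return False
    return True


data = [5, 3, 8, 1, 9, 2]
heap = list(data)
heapq.heapify(heap)      # rearrange in place into a min-heap, O(n)
smallest = heap[0]       # the minimum always sits at index 0

heapq.heappush(heap, 0)  # insert while keeping the heap property, O(log n)
# Each pop removes and returns the current minimum, O(log n).
popped = [heapq.heappop(heap) for _ in range(3)]
```

The list is never fully sorted; only the parent-child ordering is maintained, which is what keeps push and pop logarithmic.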


One of the essential search algorithms for competitive programmers

Photo by saeed mhmdi on Unsplash

There are two basic graph search algorithms: one is breadth-first search (BFS) and the other is depth-first search (DFS). Today I focus on breadth-first search, which is one of the essential search algorithms for competitive programming. In this post, I’ll explain how to implement breadth-first search and its time complexity. Please note that I don’t explain how to use it in competitive programming, although the ideas here are useful there. I use Python for the implementation. …
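A minimal sketch of breadth-first search with a queue is shown below. It assumes the graph is an adjacency dict, and the example graph is made up for illustration. Each vertex is enqueued at most once and each edge is examined once, so the running time is O(V + E).

```python
from collections import deque


def bfs(graph, start):
    """Return the distance (number of edges) from start to each reachable vertex."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()  # O(1) with deque, unlike list.pop(0)
        for v in graph.get(u, []):
            if v not in dist:  # first visit is along a shortest path
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}
dist = bfs(graph, "a")
```

The dist dict doubles as the visited set, so no separate bookkeeping is needed.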


Photo by Giammarco Boscaro on Unsplash

I often use Python, but I rarely pay attention to how it works internally. So today I focus on the Python list and explain its internal implementation. Python’s list.append and list.pop change the list size dynamically, yet they run in amortized O(1) time. Please note that list.pop takes constant time only when it removes the last item. In this post, I’ll show why this is possible.

A data structure like the Python list is called a dynamic array, while a normal fixed-size array is called a static array. This post is structured as follows.

  1. What is a dynamic…
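To illustrate why appending can be cheap on average, here is a toy dynamic array that doubles its capacity whenever it is full. This is a simplified sketch, not CPython's implementation: the doubling policy, the class name, and the resize_count attribute are assumptions for illustration, and CPython's real list uses a different over-allocation pattern.

```python
class DynamicArray:
    """A toy dynamic array that doubles its capacity when full."""

    def __init__(self):
        self._capacity = 1
        self._size = 0
        self._items = [None] * self._capacity  # stands in for a static array
        self.resize_count = 0

    def __len__(self):
        return self._size

    def append(self, item):
        if self._size == self._capacity:
            self._resize(2 * self._capacity)  # the rare O(n) step
        self._items[self._size] = item        # the common O(1) step
        self._size += 1

    def pop(self):
        """Remove and return the last item in O(1)."""
        if self._size == 0:
            raise IndexError("pop from empty array")
        self._size -= 1
        item = self._items[self._size]
        self._items[self._size] = None
        return item

    def _resize(self, new_capacity):
        # Copy the existing items into a larger backing store.
        new_items = [None] * new_capacity
        new_items[: self._size] = self._items[: self._size]
        self._items = new_items
        self._capacity = new_capacity
        self.resize_count += 1


arr = DynamicArray()
for i in range(100):
    arr.append(i)
```

With doubling, 100 appends trigger only 7 resizes, so the total copying work stays proportional to the number of appends.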

Yasufumi TANIGUCHI

Software engineer interested in Natural Language Processing
