50th Monthly Technical Session (MTS) Report

Bagus Aryabima
Published in henngeblog
Oct 15, 2018

The 50th Monthly Technical Session (MTS) was held on September 21st, 2018. MTS is a knowledge-sharing event in which HDE members present topics and hold Q&A sessions, all in English.

“Network Troubleshooting with Wireshark” by Arakawa

Arakawa presenting “Network Troubleshooting with Wireshark”

Wireshark is arguably the most well-known open source packet analyzer. It is used for network troubleshooting, analysis, software and communications protocol development, and education. In this talk, Arakawa shared some tips and tricks for troubleshooting networks with Wireshark.

Arakawa utilizes tcpdump to capture packets. To reduce the size of the packet captures, we can restrict the capture to the client network (i.e. filter by IP address) when executing tcpdump. It is also recommended to rotate and delete old packet captures regularly (e.g. using cron).
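To make the idea concrete, here is a small Python sketch that composes such a tcpdump invocation; the function name, interface, network, and file pattern are made up for illustration, and the capture-rotation flags (`-G` rotates by time, `-W` caps the number of files) are one way to bound disk usage without cron:

```python
import shlex

def build_tcpdump_cmd(interface, client_net, outfile,
                      rotate_seconds=3600, keep_files=24):
    """Compose a tcpdump command that captures only traffic to/from the
    client network and rotates capture files to bound disk usage."""
    cmd = [
        "tcpdump",
        "-i", interface,            # capture interface
        "-w", outfile,              # write raw packets for later Wireshark analysis
        "-G", str(rotate_seconds),  # rotate the capture file every N seconds
        "-W", str(keep_files),      # keep at most this many rotated files
        "net", client_net,          # BPF filter: only the client network
    ]
    return shlex.join(cmd)

print(build_tcpdump_cmd("eth0", "192.0.2.0/24", "capture-%H%M.pcap"))
```

The resulting `.pcap` files can then be opened in Wireshark for analysis.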

Wireshark highlights common network problems (e.g. packet loss). It also provides details about TCP streams and TCP handshakes, among other things.

In conclusion, Wireshark helps us troubleshoot networks and identify gaps in network security.

“Tears of Refactoring” by Matsuura

Matsuura presenting “Tears of Refactoring”

Matsuura and his team have been refactoring one of our internal projects. This talk is about the major hardships he encountered and eventually overcame.

“Admission Control Service: Design and Implementation” by Tanabe

Tanabe presenting “Admission Control Service: Design and Implementation”

Admission control is a validation process in communication systems. Before a connection is established, a check is performed to see whether the available resources can handle the proposed connection. With admission control, a system will not accept requests that it cannot answer within a reasonable latency (as it shouldn't). Even services like G Suite and Exchange Online have limits, which implies that they, too, perform admission control.

There are several approaches to admission control, namely rate limiting and flow control (back pressure). As the name implies, rate limiting keeps the rate of connections (requests) under a predefined limit. Flow control, on the other hand, responds within a latency that depends on current resource usage.
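As a concrete illustration of rate limiting, here is a minimal token-bucket limiter in Python. This is a sketch, not the design discussed in the talk; the class name and parameters are invented for illustration. Requests are admitted only while tokens remain, and tokens refill at a fixed rate up to a burst capacity:

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter: admit a request only if a token
    is available; tokens refill at a fixed rate up to a burst capacity."""
    def __init__(self, rate, capacity):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # reject immediately instead of queueing: fail fast

bucket = TokenBucket(rate=1, capacity=5)
results = [bucket.allow() for _ in range(8)]
print(results)  # the first 5 (the burst) are admitted, the rest rejected
```

Rejecting excess requests up front is exactly the point of admission control: the client gets a fast "no" rather than a slow, resource-starved response.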

“PyCon MY 2018” by Bumi

Bumi presenting “PyCon MY 2018”

PyCon MY 2018 was held on August 25th and 26th at the University of Malaya, Kuala Lumpur, Malaysia. About 123 people attended the event.

There were 27 talks (in 2 concurrent tracks), 2 keynotes, and several lightning talks. HDE members Bumi, Iskandar, and Jonas were among the speakers. Bumi’s talk was Enhancing Angklung Performances with Python. Iskandar’s talk was Building a Personal Assistant with Python. Jonas’ talk was Artisanal Async Adventures.

“PyCon JP 2018” by Iskandar

Iskandar presenting “PyCon JP 2018”

PyCon JP 2018 was held from September 15th to September 18th. The first day was a sprint, an impromptu coding meeting where developers get together to make quick progress on a project they are interested in. This event was held at our very own HDE, Inc. headquarters. Tutorials were held on the second day. Finally, the last two days were the main conference.

Some of the talks that Iskandar found interesting were "Argentina in Python: Community, Dreams, Travel, and Learning" by Manuel Kaufmann, "Interpretable Machine Learning, Making Black Box Models Explainable with Python!" by David Low, and "Fun with Python and Kanji" by Michael Penkov. Iskandar himself was one of the speakers, presenting "From Data to Web Application: Anime Character Image Recognition with Transfer Learning."

“Exiting the Matrix: Explaining Virtualization with the Example of AWS” by Anastasiia

Anastasiia presenting “Exiting the Matrix: Explaining Virtualization with the Example of AWS”

Anastasiia is one of our Global Internship Program (GIP) participants.

In the context of her talk, virtualization is the act of running multiple virtual machines on a single physical machine, with each virtual machine sharing the physical machine's resources. Virtualization lowers costs and eases maintenance, backups, and recovery, among other benefits.

There are several virtualization approaches, such as pure software-based virtualization, paravirtualization, and hardware-assisted virtualization. Technically, all virtualization is software-based: techniques such as binary translation, memory shadowing, and I/O emulation are used to simulate hardware in software.

In the paravirtualization approach, a guest OS is recompiled prior to installation inside a virtual machine. This provides the virtual machine with an interface that is similar (but not necessarily identical) to the underlying hardware-software interface.

Hardware-assisted virtualization involves hardware specifically designed to make virtualization easier and more performant; such hardware is equipped with features that make hypervisors more efficient.

Among these approaches, pure software-based virtualization has the most overhead, followed by paravirtualization, with hardware-assisted virtualization having the least. AWS itself has been moving from pure software-based virtualization to hardware-assisted virtualization. Its latest offering, Nitro, is a lightweight hypervisor that works on top of hardware-virtualized network, storage I/O, and even interrupts. Its overhead is often less than 1%.

“Underwater Sensor Networks — a Deep Dive into the Unknown” by Arvind

Arvind presenting “Underwater Sensor Networks — a Deep Dive into the Unknown”

Arvind is also one of our GIP participants.

Wireless sensor networks are low-power, short-range mesh networks. They are an easy-to-scale and fairly cost-efficient approach to monitoring and recording sensor data. Arvind has been involved in at least two projects using this technology.

The first project is about using sensors to combat ghost fishing, the death of marine life caused by lost or abandoned fishing gear; it is a significant contributor to overfishing. His solution is to attach low-cost acoustic communication tags to both fishing vessels and gear. The locations of lost gear can then be determined from PINGs sent by fishing vessels and PONGs returned by the gear. The system works up to a range of 80 meters. It simplifies the process of locating missing fishing gear, but extracting that gear from the sea remains a challenge.
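The underlying range estimate can be sketched in a few lines of Python. This is an illustration, not Arvind's actual implementation: the function name and processing delay are invented, and the speed of sound in seawater (about 1500 m/s) is an assumed typical value. The acoustic signal travels the vessel-to-gear distance twice (PING out, PONG back), so the corrected travel time is halved:

```python
SPEED_OF_SOUND_WATER = 1500.0  # m/s, a typical value in seawater (assumption)

def range_from_round_trip(rtt_seconds, processing_delay=0.0):
    """Estimate vessel-to-gear distance from a PING/PONG round trip.
    The signal covers the distance twice, so halve the (delay-corrected)
    travel time before multiplying by the speed of sound."""
    travel = max(rtt_seconds - processing_delay, 0.0)
    return SPEED_OF_SOUND_WATER * travel / 2.0

# A round trip of 0.1 s (minus an assumed 2 ms tag processing delay)
# puts the gear 73.5 m away, within the system's stated 80 m range.
print(range_from_round_trip(0.1, processing_delay=0.002))  # 73.5
```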

The second project is about detecting blockages in urban water pipelines. In Tokyo alone, there are more than 24,000 kilometers of freshwater pipelines, about 400 kilometers of which are replaced annually. Furthermore, the annual cost of water leakage due to pipeline deficiencies runs as high as $2 billion. Current blockage detection methods are labor-intensive, inaccurate, costly, and prone to damage. Arvind proposed a blockage detection method that uses sonar waves. His detection devices cost less than $50 to make, achieve 95% accuracy in the 2–15 meter range, and can detect blockages as far as 100 meters away. The solution also includes an acoustic communication system for real-time data transfer.

“Processing Big Data in the Spark Way” by Camilo

Camilo presenting “Processing Big Data in the Spark Way”

Camilo is another one of our GIP participants.

Apache Spark is a unified analytics engine for large-scale data processing. It was preceded by MapReduce, a programming model for processing and generating big data with a parallel, distributed algorithm on a cluster; Apache Hadoop is, in essence, an implementation of MapReduce. Compared to Hadoop, Spark is faster, more expressive, and can run in various environments (it does not depend on a Hadoop environment).

Spark provides both cluster and standalone modes, but the concepts are much the same. A Spark cluster consists of a cluster manager, worker nodes, and data sources. The cluster manager acquires resources on the cluster, while worker nodes run application code. Data sources can be HDFS, a SQL database, cloud object storage (e.g. S3), or even regular files (in standalone mode).

The most basic data structure in Apache Spark is the RDD (Resilient Distributed Dataset). Simply put, an RDD is a distributed list of objects. It supports two types of operations: transformations and actions. Transformations are lazy operations that create one or more new RDDs, such as map, flatMap, filter, and join. Actions are operations that produce non-RDD values, such as collect, reduce, and saveAsTextFile.
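The lazy-transformation/eager-action split is easiest to see in code. The toy class below is plain Python, not Spark: it only mimics a sliver of the RDD API to show that `map` and `filter` merely record work, while the `collect` action actually executes the pipeline:

```python
class ToyRDD:
    """A toy stand-in for Spark's RDD, illustrating lazy transformations:
    map/filter only record pending work; collect actually runs it."""
    def __init__(self, data, ops=()):
        self._data = data
        self._ops = ops  # pending transformations, not yet executed

    def map(self, f):
        return ToyRDD(self._data, self._ops + (("map", f),))

    def filter(self, p):
        return ToyRDD(self._data, self._ops + (("filter", p),))

    def collect(self):  # action: apply the recorded pipeline now
        items = iter(self._data)
        for kind, f in self._ops:
            items = map(f, items) if kind == "map" else filter(f, items)
        return list(items)

rdd = ToyRDD(range(10)).map(lambda x: x * x).filter(lambda x: x % 2 == 0)
# Nothing has been computed yet; collect() triggers the whole pipeline.
print(rdd.collect())  # [0, 4, 16, 36, 64]
```

In real Spark, this laziness lets the engine optimize and distribute the whole chain of transformations before any data moves.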

Camilo concluded his talk by demonstrating how one would use Spark to count the words in a large set of books.

“Illumination with Shadertoy” by Siangping

Siangping presenting “Illumination with Shadertoy”

Siangping is the last of our GIP participants.

Computer graphics is Siangping's specialty. According to her, rendering graphics is essentially an act of simulating light: given information about objects, lights, and the camera, we want to produce a 2D image of a scene.

Shaders play an important role in rendering. They are programs that determine how the color and brightness of a surface vary with lighting, and most of them run on GPUs. Computer graphics APIs such as OpenGL and WebGL include shaders in their rendering pipelines.

Shadertoy is a cross-browser online community and tool for creating and sharing shaders. Before demonstrating a fragment (per-pixel) shader on Shadertoy, Siangping went over some shading and rendering algorithms related to the demonstration, explaining ray casting, the Phong reflection model, and ray-sphere intersection in detail.
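Of these, ray-sphere intersection is the most self-contained, so here is the standard derivation as a Python sketch (her shader would express the same math in GLSL; the function name and test scene are made up for illustration). Substituting the ray origin + t·direction into the sphere equation yields a quadratic in t, and the smaller non-negative root is the visible hit:

```python
import math

def ray_sphere_intersect(origin, direction, center, radius):
    """Return the distance t to the nearest intersection of the ray
    origin + t*direction (direction assumed normalized) with a sphere,
    or None if the ray misses. Solves the quadratic t^2 + 2bt + c = 0."""
    oc = [o - c for o, c in zip(origin, center)]
    b = sum(d * x for d, x in zip(direction, oc))  # half the linear coefficient
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - c
    if disc < 0:
        return None           # ray misses the sphere entirely
    t = -b - math.sqrt(disc)  # nearest root; negative means behind the ray
    return t if t >= 0 else None

# Ray from the origin along +z toward a unit sphere centered at (0, 0, 5):
print(ray_sphere_intersect((0, 0, 0), (0, 0, 1), (0, 0, 5), 1.0))  # 4.0
```

A fragment shader runs this test per pixel, then feeds the hit point and surface normal into the reflection model to shade it.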

Finally, she demonstrated her shader, which results in a scene with two spheres and a mirror. One of the spheres is stationary, while the other is bouncing and revolving around it. Light-wise, the scene is quite complex. The spheres are reflective, a mirror is involved, and there are also shadows to worry about. It was quite an impressive demonstration.

“Photo Booth in Depth” by Kaoriya

Kaoriya presenting “Photo Booth in Depth”

We are honored to have Kaoriya visit us and talk about the photo booth he developed for BuildersCon Tokyo 2018.

He started by listing the hardware he used to build the photo booth, emphasizing two components: the Intel RealSense Depth Camera D415 and the Intel NUC 7 Enthusiast mini PC.

The image processing procedure was also explained in detail. First, the camera captures a color image; being a depth camera, it captures a depth image as well. Binarization is applied to the depth image to separate foreground objects from the background. Various techniques are then applied to smooth out the extracted foreground. Finally, a magazine cover design is applied to replace the background.
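The binarize-then-composite core of that pipeline can be sketched in a few lines. This is a simplified illustration, not Kaoriya's code: images are plain nested lists, the function name and threshold are invented, and the smoothing steps are omitted. Pixels whose depth is closer than the threshold are kept as foreground; everything else is replaced by the cover design:

```python
def composite_with_cover(color, depth, cover, threshold):
    """Keep pixels whose depth is closer than `threshold` (the foreground
    subject) and replace everything else with the magazine-cover design.
    Images are nested lists here; a real pipeline would use arrays."""
    out = []
    for color_row, depth_row, cover_row in zip(color, depth, cover):
        out.append([
            c if d < threshold else b  # binarize depth, then composite
            for c, d, b in zip(color_row, depth_row, cover_row)
        ])
    return out

color = [["p1", "p2"], ["p3", "p4"]]  # captured color image
depth = [[0.8, 3.0], [0.9, 2.5]]      # depth in meters from the camera
cover = [["c1", "c2"], ["c3", "c4"]]  # magazine cover design
print(composite_with_cover(color, depth, cover, threshold=1.5))
# [['p1', 'c2'], ['p3', 'c4']]
```

Without the smoothing steps, this hard threshold would produce jagged edges around the subject, which is exactly why the real pipeline refines the mask before compositing.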

Kaoriya also went over some details of the software involved in developing his photo booth. It turns out to have been a cross-platform effort, involving both Windows and Ubuntu.

Naturally, he segued into how his photo booth could be built on other platforms. Apparently, there is a similar implementation on iOS. Fun fact: the iOS implementation was presented at iOSDC Japan 2018, only a week apart from BuildersCon Tokyo 2018. Kaoriya ended his talk by comparing his photo booth with the iOS implementation in terms of cost, reliability, availability, and quality.

As usual, we had a party afterwards :)

50th MTS after-party

But wait, there’s more.

To celebrate our 50th MTS, we bought cakes!

50th MTS cake :)
