Let’s refactor: std_json_io.

Roman Chvanikov
May 7, 2017

Today I finished refactoring the std_json_io library (the fork, with all the commits mentioned in this article in place, can be found at https://github.com/chvanikoff/std_json_io; a pull request to the original library is here) and would like to share how it went.
std_json_io is a nice Elixir library for talking to external scripts that read JSON input from STDIN and write JSON output to STDOUT. The library is often used to communicate with react-stdio to achieve server-side rendering of React JS code. Usage is as follows: you call StdJsonIo.json_call(%{component: "my/server/js/component.js", props: %{}}); StdJsonIo encodes the data to JSON and sends it (via the Porcelain library) to react-stdio's STDIN; react-stdio writes the compiled HTML to its STDOUT; StdJsonIo reads the output (via Porcelain) and returns it to the caller.
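To make that round trip concrete, here is a minimal, self-contained sketch of the same JSON-over-stdio protocol. This is not the library's code: it is an illustrative Python stand-in, where the inline worker plays the role of react-stdio (write one JSON line to the child's STDIN, read one JSON line back).

```python
import json
import subprocess
import sys

# A stand-in worker that, like react-stdio, reads JSON lines from STDIN
# and writes one JSON reply per line to STDOUT (here it simply echoes).
WORKER = (
    "import sys, json\n"
    "for line in sys.stdin:\n"
    "    req = json.loads(line)\n"
    "    sys.stdout.write(json.dumps({'response': req}) + '\\n')\n"
    "    sys.stdout.flush()\n"
)

def json_call(request):
    # Spawn the worker unbuffered (-u), send one JSON line, read one back.
    proc = subprocess.Popen(
        [sys.executable, "-u", "-c", WORKER],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
    )
    proc.stdin.write(json.dumps(request) + "\n")
    proc.stdin.flush()
    reply = json.loads(proc.stdout.readline())
    proc.terminate()
    proc.wait()
    return reply

result = json_call({"component": "my/server/js/component.js", "props": {}})
```

The trailing "\n" and the flush after each write are what keep both sides of the pipe from blocking; both details come up again later in the article.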
What’s wrong with the original version?

  1. Inability to handle a response from react-stdio larger than 64KB: a response over 64KB is delivered to StdJsonIo in chunks, which it can't handle.
  2. The requirement to create an additional module in your app to use the library: of course this can be necessary in many cases, but as we'll see later, StdJsonIo is not one of them.
  3. The requirement to add the StdJsonIo supervisor to your app's supervision tree: I think this kind of library should be designed as an OTP application with its own supervision tree.
  4. Outdated dependencies: it is usually bad to depend on outdated libraries, but who knows whether we can painlessly upgrade the deps if…
  5. …Tests are absent. This one is the most important. It is impossible for software not to become legacy if it has no tests. I can't even update dependencies and make sure everything works as before, or figure out what broke after an update. I can't be sure the library functions the way it is supposed to. I have to test every change manually and deal with a lot of assumptions, which of course slows down development significantly.
  6. The Elixir compiler emits some warnings at compile time, and it's a good idea to fix that.
  7. StdJsonIo.Reloader is useless. Since the primary target for std_json_io is react-stdio, the Reloader aims to restart connections to a script when the (server-side) JS is updated. But there's no point: we send a file path to react-stdio, and it will return a new response as soon as the file changes, even over old connections, so the Reloader does nothing useful and only confuses library users. Restarting connections to a script might only matter when the script itself changes, which is questionable. So I'm going to remove the Reloader altogether.
  8. And there’s always more being found during refactoring :)

Now that the problems are identified, it's time to fork the project, clone it, check out a refactoring branch, install the deps (cd std_json_io && mix deps.get) and start with the most important one: problem #5, tests. But even before starting on that problem, it turns out mix.lock has an outdated format. Let's commit a new one:

git add mix.lock && git commit -m 'Updated mix.lock format'

We will not start with problem #6, tempting as it is; in fact we'll leave it for the end, because many of the warnings may disappear by then. Instead, we will start with tests. Without them we can't modify a single line of code or remove anything (problem #7) and be sure the application API is unaffected, so let's go:

Setting up test environment

First, create a fixture script, test/fixtures/echo.py, that echoes every JSON line it reads from STDIN back to STDOUT, wrapped in a "response" key:

#!/usr/bin/env python
import sys

for line in iter(sys.stdin.readline, ''):
    line = line.rstrip('\n')
    sys.stdout.write('{"response": ' + line + '}')

Make the script executable with chmod +x test/fixtures/echo.py and verify it works by running echo "{}" | test/fixtures/echo.py; the result is as expected: {"response": {}}
Next, we need to create a module that will use StdJsonIo (problem #2). Let's do it in test/test_helper.exs, so it will look like:

ExUnit.start()

defmodule StdJsonIoMock do
  use StdJsonIo, otp_app: :std_json_io, script: "python -u test/fixtures/echo.py"
end

We use python -u test/fixtures/echo.py as the script because by default Python buffers its output; to avoid this, we pass it the -u flag.
Next, let's update test/std_json_io_test.exs: we must start the StdJsonIo supervisor before querying it, so define a simple setup function:

setup do
  {:ok, _} = StdJsonIoMock.start_link([])
  {:ok, %{}}
end

The test environment has been set up, and it's time to commit the changes:

git add test && git commit -m 'Setup test environment'

Writing first tests

test "Call to json_call returns correct value" do
message = %{"hello" => "world"}
expected = {:ok, %{"response" => message}}
assert StdJsonIoMock.json_call(message) == expected
end
test "Call to json_call! returns correct value" do
message = %{"hello" => "world"}
expected = %{"response" => message}
assert StdJsonIoMock.json_call!(message) == expected
end

Done! Run mix test and… we get 2 similar errors saying that

** (ArgumentError) expected :name option to be one of:

* nil
* atom
* {:global, term}
* {:via, module, term}

Got: {:local, StdJsonIoMock}

stacktrace:
(elixir) lib/supervisor.ex:332: Supervisor.start_link/3
test/std_json_io_test.exs:6: StdJsonIoTest.__ex_unit_setup_0/1
test/std_json_io_test.exs:1: StdJsonIoTest.__ex_unit__/2

Ok, the Supervisor is trying to start with a name of {:local, StdJsonIoMock}, and while it's fine in Erlang to start a supervisor this way, Elixir rejects this form. That's weird, since the app definitely worked with this type of registration before, so out of curiosity I googled a bit and found that the change was introduced in this PR: https://github.com/elixir-lang/elixir/pull/5027. It turns out that since that PR, Elixir converts atom_name to {:local, atom_name} automatically and no longer accepts the explicit {:local, atom_name} form. This means our update will be painless: no logic changes, and the new way of starting a supervisor is completely backward-compatible. Just replace Supervisor.start_link(__MODULE__, :ok, name: {:local, __MODULE__}) with Supervisor.start_link(__MODULE__, :ok, name: __MODULE__) in lib/std_json_io.ex. Once done, it's a good idea to commit the change:

git add lib/std_json_io.ex && git commit -m 'Updated call to Supervisor.start_link by passing :atom instead of {:local, :atom} as a name'

Now run mix test again. After about 20s, another 2 similar errors show up:

1) test Call to json_call returns correct value (StdJsonIoTest)
test/std_json_io_test.exs:10
** (exit) exited in: GenServer.call(#PID<0.189.0>, {:json, %{"hello" => "world"}}, 10000)
** (EXIT) time out
stacktrace:
(elixir) lib/gen_server.ex:737: GenServer.call/3
src/poolboy.erl:76: :poolboy.transaction/3
test/test_helper.exs:4: StdJsonIoMock.json_call/2
test/std_json_io_test.exs:12: (test)

Here we can see that Porcelain didn't send a response to StdJsonIo within 10s, which should be more than enough. So what's wrong, and why didn't it send a response? The obvious reason is that it didn't get one from the program. But we tested the program from the shell! Let's return to our script to check what could go wrong. echo.py reads line by line from STDIN and writes line by line to STDOUT, but it seems StdJsonIo sends data to it (via Porcelain) with no trailing newline, so echo.py is simply not aware the input has ended and keeps waiting for a \n character to come. Let's check this assumption by updating the worker module lib/std_json_io/worker.ex: add a trailing \n to the JSON we send via Porcelain: Proc.send_input(state.js_proc, json <> "\n")
Now run mix test and… hooray, it's all green, the tests pass! We had to change the logic a bit by appending \n to the data sent to the program, but that's completely fine: many programs expect it, and ones that don't rely on it (like react-stdio) will simply ignore it, so we just solved a problem we hadn't even encountered yet. It's time to commit our changes, the trailing newline and the first tests:

git add lib/std_json_io/worker.ex && git commit -m 'Updated StdJsonIo.Worker to append trailing newline to data sent to external program'
git add test && git commit -m 'Added test for StdJsonIo.json_call and StdJsonIo.json_call!'

Keep on testing

test "Can handle big response" do
message = %{"thisishuge" => String.duplicate("Lorem Ipsum Dolor Sit Amet", 10000)}
expected = {:ok, %{"response" => message}}
assert StdJsonIoMock.json_call(message) == expected
end

Running mix test, we get an error:

..

1) test Can handle big response (StdJsonIoTest)
test/std_json_io_test.exs:22
** (MatchError) no match of right hand side value: {:error, :invalid}
stacktrace:
test/test_helper.exs:4: StdJsonIoMock.json_call/2
test/std_json_io_test.exs:25: (test)

Finished in 0.4 seconds
3 tests, 1 failure

Randomized with seed 50902

18:00:59.512 [error] GenServer #PID<0.215.0> terminating
** (FunctionClauseError) no function clause matching in StdJsonIo.Worker.handle_info/2
(std_json_io) lib/std_json_io/worker.ex:33: StdJsonIo.Worker.handle_info({#PID<0.216.0>, :data, :out, "lor Sit Amet<…>Lorem Ipsum Dolor Sit AmetLo (truncated)

(I removed part of the real output since it's irrelevant, but kept what really matters.)
It's time to get this fixed, and to write a fix we must understand why it happens. Look at the test error stacktrace: StdJsonIo.Worker.handle_info/2 was unable to match the message it received: {#PID<0.200.0>, :data, :out, "lor Sit Amet<...>"}. But if we open lib/std_json_io/worker.ex, we quickly find that such data is expected in a receive instruction inside handle_call/3, not in handle_info/2. Our worker sends data to Porcelain, which handles communication with the external program (echo.py) and passes the program's response back to the worker as a process message. That response is then decoded from JSON and returned to the caller (our test suite in this case). There's another detail in the error we got: StdJsonIoMock returned {:error, :invalid}, which looks like a JSON decoding error. Putting these two facts together, it seems the response was sent by Porcelain in more than one chunk; after the first chunk arrived at the receive instruction, the second was handled by handle_info/2. And since the first chunk is not complete JSON, Poison couldn't decode it and returned an error. The simplest solution is the following: replace the current receive block with a function that receives chunks and concatenates them in a loop until valid JSON arrives:

receiver = fn f, data ->
  receive do
    {_pid, :data, :out, msg} ->
      new_data = data <> msg
      case Poison.decode(new_data) do
        {:error, _} ->
          # Couldn't decode JSON; there are more chunks
          # to receive and concatenate
          f.(f, new_data)
        {:ok, _} ->
          # All chunks received
          {:reply, {:ok, new_data}, state}
      end
    other ->
      {:reply, {:error, other}, state}
  end
end

Proc.send_input(state.js_proc, json <> "\n")
receiver.(receiver, "")

As soon as the change is applied, mix test shows all 3 tests passing. Time to commit; this time there's no need to split files into separate commits, since the test and logic changes relate to a single problem:

git add . && git commit -m 'Fixed processing big response'

Removing StdJsonIo.Reloader

Remove the Reloader spec from the supervisor setup in lib/std_json_io.ex:

-    files = Keyword.get(@options, :watch_files)
-
-    if files && length(files) > 0 do
-      Application.ensure_started(:fs, :permanent)
-
-      reloader_spec = worker(
-        StdJsonIo.Reloader,
-        [__MODULE__, Enum.map(files, &Path.expand/1)],
-        []
-      )
-
-      children = [reloader_spec | children]
-    end

Also, as we know, the Reloader was used to run the StdJsonIo.restart_io_workers!/0 function, which is useless as we already determined, so let's remove it as well:

-      def restart_io_workers! do
-        case Process.whereis(@pool_name) do
-          nil ->
-            Supervisor.restart_child(__MODULE__, @pool_name)
-          _pid ->
-            Supervisor.terminate_child(__MODULE__, @pool_name)
-            Supervisor.restart_child(__MODULE__, @pool_name)
-        end
-      end

The fs dependency is also no longer needed:

-      applications: [:logger, :porcelain],
-      included_applications: [:fs]
+      applications: [:logger, :porcelain]
...
-      {:poison, "~> 1.5.0"},
-      {:fs, "~> 0.9.1"},
+      {:poison, "~> 1.5.0"}

The final step is to clean up our deps: mix do deps.unlock fs, deps.clean fs. Now run mix test to confirm we're all good. Time to commit:

git add . && git commit -m 'Removed Reloader module and fs dependency'

Getting rid of unnecessary module

Currently, to use the library you have to define a module in your app like this:

defmodule Myapp.StdJsonIo do
  use StdJsonIo, otp_app: :myapp
end

However, after a quick inspection of lib/std_json_io.ex it turns out the macro does nothing special at all; in fact it just lets you call Myapp.StdJsonIo.json_call/1 instead of StdJsonIo.json_call/1. The otp_app option is also useless: it exists so you can configure StdJsonIo as config :myapp, Myapp.StdJsonIo, config_list, while the same could easily be achieved by letting users configure StdJsonIo the standard OTP-app way, via config :std_json_io, config_list. Time to get rid of the __using__ macro in lib/std_json_io.ex and make StdJsonIo a full-fledged OTP application with its own supervision tree. Create a file lib/std_json_io/application.ex with a basic Application/Supervisor setup:

defmodule StdJsonIo.Application do
  use Application

  def start(_type, _args) do
    import Supervisor.Spec, warn: false

    children = []
    opts = [strategy: :one_for_one, name: StdJsonIo.Supervisor]
    Supervisor.start_link(children, opts)
  end
end

and update the application/0 function in mix.exs:

def application do
  [
    applications: [:logger, :porcelain],
    mod: {StdJsonIo.Application, []}
  ]
end

Once done, commit:

git add . && git commit -m 'Added OTP Application and Supervisor' 

The test environment should be updated accordingly: we will no longer need either the setup to start the supervisor or the StdJsonIoMock module using StdJsonIo, since that's exactly what we aim to get rid of. So test/test_helper.exs ends up with a single line, ExUnit.start(), and in test/std_json_io_test.exs we remove the setup block entirely and replace StdJsonIoMock with StdJsonIo.

Since the app will use config instead of options passed to the __using__/1 macro, it's time to fill config/config.exs with values suitable for running tests:

config :std_json_io,
  pool_size: 5,
  pool_max_overflow: 10,
  script: "python -u test/fixtures/echo.py"

The next step is to move all the supervisor setup (excluding the Reloader, problem #7) from lib/std_json_io.ex to lib/std_json_io/application.ex:

def start(_type, _args) do
  import Supervisor.Spec, warn: false

  config = Application.get_all_env(:std_json_io)

  pool_options = [
    name: {:local, StdJsonIo.Pool},
    worker_module: StdJsonIo.Worker,
    size: Keyword.get(config, :pool_size, 15),
    max_overflow: Keyword.get(config, :pool_max_overflow, 10)
  ]

  children = [
    :poolboy.child_spec(StdJsonIo.Pool, pool_options, [script: Keyword.fetch!(config, :script)])
  ]

  opts = [strategy: :one_for_one, name: StdJsonIo.Supervisor]
  Supervisor.start_link(children, opts)
end

Since we moved all the setup to lib/std_json_io/application.ex, we can clean up lib/std_json_io.ex and leave only the 2 API functions, json_call/2 and json_call!/2. The whole module now fits in just 24 LOC:

defmodule StdJsonIo do
  def json_call!(map, timeout \\ 10000) do
    case json_call(map, timeout) do
      {:ok, data} -> data
      {:error, reason} -> raise "Failed to call to json service #{__MODULE__} #{to_string(reason)}"
    end
  end

  def json_call(map, timeout \\ 10000) do
    result = :poolboy.transaction(StdJsonIo.Pool, fn worker ->
      GenServer.call(worker, {:json, map}, timeout)
    end)

    case result do
      {:ok, json} ->
        {:ok, data} = Poison.decode(json)
        if data["error"] do
          {:error, Map.get(data, "error")}
        else
          {:ok, data}
        end
      other -> other
    end
  end
end
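The case at the end of json_call/2 encodes a small error-envelope convention: a decoded response carrying an "error" key becomes an error tuple, and anything else is a success. For illustration only (not the library's code), the same convention in a few lines of Python:

```python
def classify(decoded):
    # Mirror of json_call/2's result handling: an "error" key in the
    # decoded response signals failure; anything else is a success.
    if isinstance(decoded, dict) and "error" in decoded:
        return ("error", decoded["error"])
    return ("ok", decoded)

ok = classify({"response": {"hello": "world"}})
err = classify({"error": "component not found"})
```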

Seems all is done, time to run mix test:

Finished in 0.1 seconds
3 tests, 0 failures

Great! We changed a lot, but everything works as before; time to commit:

git add . && git commit -m 'Moved init functionality to OTP Application start/2 function'

So far we resolved problems #1, #2, #3, #5 and #7. The rest is to update dependencies and to fix warnings.

Updating dependencies

Checking the dependencies (for example with mix hex.outdated) gives the following picture:

Dependency   Current   Latest   Update possible
poison       1.5.0     3.1.0    No
poolboy      1.5.1     1.5.1
porcelain    2.0.1     2.0.3    Yes

Let's bump the Porcelain version required in mix.exs to 2.0.3 (instead of 2.0), run mix deps.update porcelain and then mix test. Tests pass, dependency updated, everything is fine. Next in the queue is Poison:

Update Poison version required in mix.exs to 3.1.0, run mix deps.update poison and then mix test… Error:

..
18:50:51.991 [error] GenServer #PID<0.329.0> terminating
** (CaseClauseError) no case clause matching: {:error, :invalid, 65536}
(std_json_io) lib/std_json_io/worker.ex:24: anonymous fn/3 in StdJsonIo.Worker.handle_call/3
(stdlib) gen_server.erl:615: :gen_server.try_handle_call/4
(stdlib) gen_server.erl:647: :gen_server.handle_msg/5
(stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
Last message: {:json, %{"thisishuge" => "Lorem Ipsum Dolor Sit Amet<…>Lorem Ipsum Dolor Sit Amet (truncated)

1) test Can handle big response (StdJsonIoTest)
test/std_json_io_test.exs:17
** (exit) exited in: GenServer.call(#PID<0.329.0>, {:json, %{"thisishuge" => "Lorem Ipsum Dolor Sit Amet<…>Lorem Ipsum Dolor Sit Amet"}}, 10000)
** (EXIT) an exception was raised:
** (CaseClauseError) no case clause matching: {:error, :invalid, 65536}
(std_json_io) lib/std_json_io/worker.ex:24: anonymous fn/3 in StdJsonIo.Worker.handle_call/3
(stdlib) gen_server.erl:615: :gen_server.try_handle_call/4
(stdlib) gen_server.erl:647: :gen_server.handle_msg/5
(stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
stacktrace:
(elixir) lib/gen_server.ex:737: GenServer.call/3
src/poolboy.erl:76: :poolboy.transaction/3
(std_json_io) lib/std_json_io.ex:10: StdJsonIo.json_call/2
test/std_json_io_test.exs:20: (test)

Finished in 0.1 seconds
3 tests, 1 failure

Let's check what's going on at lib/std_json_io/worker.ex:24: there's a call to Poison.decode/1. As you can see, 2 tests passed while the big-response test failed. That means when Poison.decode/1 returns {:ok, _} everything is fine, but the expected {:error, error} is no longer returned for incomplete JSON: Poison.decode/1 was updated and now returns {:error, :invalid, 65536} instead of {:error, error}. A quick search shows the change was introduced in this PR: https://github.com/devinus/poison/pull/100, and 65536 is the position where the Poison parser failed while decoding the data (since it was just a chunk of the JSON, not a complete and valid document). We don't actually care about the position, and can simply fix the problem by changing the case block as follows:

case Poison.decode(new_data) do
  {:ok, _} ->
    # All chunks received
    {:reply, {:ok, new_data}, state}
  _ ->
    # Couldn't decode JSON; there are more chunks
    # to receive and concatenate
    f.(f, new_data)
end
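Reporting the position of a parse failure is common across JSON parsers, which makes the new Poison tuple less surprising. For comparison (purely as an illustration), Python's json module exposes the same information on its decode exception:

```python
import json

truncated = '{"thisishuge": "Lorem Ipsum'  # first chunk of a larger response
try:
    json.loads(truncated)
    pos = None  # not reached: the JSON above is incomplete
except json.JSONDecodeError as exc:
    # Like Poison's {:error, :invalid, 65536}, the exception records the
    # position where the parser gave up on the truncated input.
    pos = exc.pos
```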

Now the tests pass again. A good reason to commit:

git add . && git commit -m 'Upgraded Poison and Porcelain dependencies'

Now that problem #4 is resolved, it's time to get to problem #6: removing all Elixir compiler warnings. When we started refactoring there were a lot of them, but most lived in StdJsonIo.__using__/1 and are gone now, leaving just these two:

warning: variable "docs" does not exist and is being expanded to "docs()", please use parentheses to remove the ambiguity or change the variable name
  mix.exs:20

warning: variable "deps" does not exist and is being expanded to "deps()", please use parentheses to remove the ambiguity or change the variable name
  mix.exs:21

Both are as easy to fix as adding () to the deps and docs function calls in mix.exs:

-     docs: docs,
-     deps: deps]
+     docs: docs(),
+     deps: deps()]

Now there are no more warnings, but info messages from Porcelain still appear:

19:07:23.311 [info] [Porcelain]: goon executable not found

19:07:23.311 [info] [Porcelain]: falling back to the basic driver.

19:07:23.311 [info] [Porcelain]: (set `config :porcelain, driver: Porcelain.Driver.Basic` or `config :porcelain, goon_warn_if_missing: false` to disable this warning)

Since I don't have goon installed and am not planning to install it, I'll add config :porcelain, driver: Porcelain.Driver.Basic to config/config.exs so these info messages disappear and don't bother me every time I run tests. Now when I run mix test, the output is clean and nice:

➜ std_json_io git:(refactoring) ✗ mix test

Finished in 0.2 seconds
3 tests, 0 failures

Randomized with seed 158445

Commit:

git add . && git commit -m 'Fixed warnings'

Looks like we are done with problems #1 through #7, but when it comes to refactoring, it's never enough. The solution for handling chunked responses from Porcelain is a bit hacky and has some disadvantages, so it sounds like a good idea to refactor it to use GenServer's handle_info/2 callback…

Refactoring is never enough!

First, update lib/std_json_io/worker.ex: spawn the Porcelain process in init/1 and accept requests via handle_cast/2:

def init(script) do
  Process.flag(:trap_exit, true)
  pproc = Porcelain.spawn_shell(script, in: :receive, out: {:send, self()})
  {:ok, %{pproc: pproc, buffer: "", reply_to: nil}}
end

def handle_cast({:json, data, reply_to}, %{pproc: pproc} = state) do
  {:ok, json} = Poison.encode(data)
  PProc.send_input(pproc, json <> "\n")
  {:noreply, %{state | reply_to: reply_to}}
end

Also, let's rename the Porcelain.Process alias from Proc to PProc, so it can't be accidentally confused with Process and better reflects its namespace:

-  alias Porcelain.Process, as: Proc
+  alias Porcelain.Process, as: PProc

Now when Porcelain sends a response to the server, it is handled in handle_info/2, which is pretty similar to our receive approach:

def handle_info({pproc_pid, :data, :out, data}, %{pproc: %PProc{pid: pproc_pid}, buffer: buffer} = state) do
  new_buffer = buffer <> data
  case Poison.decode(new_buffer) do
    {:ok, decoded} ->
      Process.send(state[:reply_to], decoded, [])
      {:noreply, %{state | buffer: ""}}
    _ ->
      {:noreply, %{state | buffer: new_buffer}}
  end
end

Also, it's not a bad idea to update the handle_info/2 clause responsible for :result (termination) messages from Porcelain, sent in case the program it talks to exits, to be stricter about the incoming message:

-  def handle_info({_js_pid, :result, %Result{err: _, status: _status}}, state) do
+ def handle_info({pproc_pid, :result, %Result{err: _, status: _status}}, %{pproc: %PProc{pid: pproc_pid}} = state) do

Because we renamed js_proc to pproc and the Proc alias to PProc, they should also be renamed in the terminate/2 function. Next, the functions terminate(_, _), handle_call(:stop, _from, state) and start_io_server/1 can be removed: the first because it's useless, the second because it isn't used anywhere in the codebase, and the third because we moved starting the Porcelain process to init/1. The last step in updating the module is to remove the handle_call/3 function entirely, as it's no longer needed.

UPDATE WARNING: what follows is not the best approach for API <-> Worker communication and has disadvantages. Check out the library fork repo for a better design approach.

Now it's time to update the API functions in lib/std_json_io.ex to work with the updated worker. This is pretty simple: just replace GenServer.call/2 with GenServer.cast/2 plus a receive block. Also, if you were paying attention, you noticed that we now send decoded JSON to the API rather than the raw string; that's done to avoid decoding twice. The new version of StdJsonIo.json_call/2 looks like this:

def json_call(data, timeout \\ 10000) do
  result = :poolboy.transaction(StdJsonIo.Pool, fn worker ->
    GenServer.cast(worker, {:json, data, self()})
    receive do
      response -> response
    after
      timeout -> %{"error" => "timeout"}
    end
  end)

  if result["error"] do
    {:error, Map.get(result, "error")}
  else
    {:ok, result}
  end
end
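The cast-then-receive pattern, fire an asynchronous request carrying the caller's identity, then block waiting for the reply with a timeout, can be mimicked in Python with one queue per request. This is an illustrative sketch under my own names, not the library's code:

```python
import queue
import threading

def worker(inbox):
    # Handle one request: echo it back, wrapped, to the caller's reply queue.
    data, reply_q = inbox.get()
    reply_q.put({"response": data})

def json_call(inbox, data, timeout=10.0):
    reply_q = queue.Queue()
    inbox.put((data, reply_q))                       # the "cast"
    try:
        return ("ok", reply_q.get(timeout=timeout))  # the "receive ... after"
    except queue.Empty:
        return ("error", "timeout")

inbox = queue.Queue()
threading.Thread(target=worker, args=(inbox,), daemon=True).start()
result = json_call(inbox, {"hello": "world"})
```

One known weakness of the Elixir version, which the warning above alludes to, is that a bare receive can pick up unrelated messages from the caller's mailbox; the per-request reply queue used here sidesteps that.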

Now it seems everything is in place, and it's time to run mix test:

Finished in 0.1 seconds
3 tests, 0 failures

git add . && git commit -m 'Updated Worker design'

The final touch to all of the above is updating README.md so everyone knows how to use the refactored library, but I think that's of no interest to copy-paste here; you can check it out in the repo.

As I mentioned at the very beginning, you can find my fork at https://github.com/chvanikoff/std_json_io with all the commits mentioned in the article in place and pull request to the original library can be found here: https://github.com/hassox/std_json_io/pull/13.

That’s all folks.
