I’ve been moving into more IoT / connected devices recently. There are a load of platforms but a lot of them seem very closed-off or limited in scope (Losant being an interesting exception). At the other end, there’s something like AWS IoT, which has all sorts of benefits (scalability, machine learning, big data) but with a steeper learning curve and the very present fear that I will accidentally rack up a huge bill.
I’ve used InfluxDB before for logging data locally, and am pretty impressed with its performance. For another project, I’ve been streaming data over BLE every 20ms while querying batches of data with a Python process, and it’s performed admirably with very low latency. It was also convenient for running machine learning on the data (offline training, and real-time execution). That project also used an ESP32 MCU, which I’ve found great for prototyping, though they’re playing catch-up with the documentation and codebase.
I thought that using InfluxDB for an IoT application would be a good introduction to streaming data over WiFi and some less intimidating cloud stuff. This was broken up into three parts:
- Sending data over HTTP to an InfluxDB server on my Mac
- As above, but HTTPS
- Deploying the server on AWS and streaming to it.
Setting up InfluxDB
InfluxDB is a NoSQL database designed explicitly for time-series data (i.e. all the data I ever work with). For the first two stages, I ran InfluxDB on my Mac: brew install influxdb. The current version is 1.4.2; it seems there were some major changes recently and the Python client claims not to support it, but in practice I’ve had no issues querying data.
To start the server, run influxd (no need to sudo on a Mac). In another terminal tab, use the command line interface (influx) to set up a database:
Connected to http://localhost:8086 version v1.4.2
InfluxDB shell version: v1.4.2
> CREATE DATABASE esp32_tst
If you switch back to the tab running the daemon, you should see the corresponding message appearing:
When you are running embedded devices without a serial port for displaying error messages, checking this window can be useful for figuring out what’s going wrong.
Now, we can add some data using the HTTP API, which requires a string containing the measurement name, tag sets and field sets (the space in the string is what separates tags and fields):
curl -i -XPOST 'http://localhost:8086/write?db=esp32_tst' --data-binary "dummy,device=mac field_1=69"
The response should be HTTP/1.1 204 No Content, meaning a successful write (a 404 probably means you haven’t created the database, and a 400 means you didn’t send a valid string). We created a tag called “device” and a field called, creatively, “field_1”. More info on tags and fields here. We can double check by querying the data in the CLI:
> use esp32_tst
Using database esp32_tst
> SELECT * FROM dummy
time device field_1
---- ------ -------
1516658590825700342 mac 69
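The line protocol string is simple enough to assemble programmatically. As a rough sketch (this is my own helper, not part of any InfluxDB client), a Python function that builds a point from a measurement name, a dict of tags, and a dict of fields might look like this:

```python
def make_point(measurement, tags, fields):
    """Build an InfluxDB line protocol string: measurement,tags fields."""
    def esc(s):
        # Commas and spaces must be escaped in measurements and tag keys/values.
        return str(s).replace(",", "\\,").replace(" ", "\\ ")

    def fmt_field(v):
        if isinstance(v, bool):
            return "true" if v else "false"
        if isinstance(v, int):
            return f"{v}i"  # integers take an 'i' suffix (bare numbers are floats)
        if isinstance(v, float):
            return repr(v)
        return f'"{v}"'  # string field values are double-quoted

    tag_part = "".join(f",{esc(k)}={esc(v)}" for k, v in tags.items())
    field_part = ",".join(f"{esc(k)}={fmt_field(v)}" for k, v in fields.items())
    return f"{esc(measurement)}{tag_part} {field_part}"

print(make_point("dummy", {"device": "mac"}, {"field_1": 69}))
# dummy,device=mac field_1=69i
```

Note the space between the tag set and the field set, which is exactly what the curl example above relies on.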
There’s more configuration of InfluxDB later, but this will do for now.
I put together a simple library for writing data using the HTTP API. It’s a pretty thin wrapper around the HTTPClient library, but makes life easier. All it requires is the local IP address of the machine where the InfluxDB server is running (in this case, my Mac’s address, which you can find using ipconfig getifaddr en0), and the port (we’re using the default of 8086). It also has options for authorisation and HTTPS, which we’ll get to later.
You can test that it works by running the WritePoints.ino example, setting your own values for the WiFi credentials and database name (make sure the database exists or you’ll get a 404 error). You can see the data coming in on the daemon window, or use the CLI to see what new data has come in:
SELECT * FROM test WHERE time > now() - 5s
time count new_tag random_var
---- ----- ------- ----------
1517223799607069455 91 Yes 0.802
1517223800492976221 92 Yes 0.263
1517223801509686351 93 Yes 0.092
1517223802510211739 94 Yes 0.322
1517223803492709390 95 Yes 0.765
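Under the hood, all the library does is build the /write URL and POST the line protocol string. The same flow from Python, using only the standard library (the IP address is a placeholder, and these function names are my own, not the library’s API):

```python
import urllib.parse
import urllib.request

def write_url(host, db, port=8086):
    """Build the /write endpoint URL for InfluxDB's HTTP API."""
    params = urllib.parse.urlencode({"db": db})
    return f"http://{host}:{port}/write?{params}"

def write_point(url, line):
    """POST a line protocol string; a 204 response means a successful write."""
    req = urllib.request.Request(url, data=line.encode(), method="POST")
    with urllib.request.urlopen(req) as resp:
        return resp.status

print(write_url("192.168.0.10", "esp32_tst"))
# http://192.168.0.10:8086/write?db=esp32_tst
```

On the ESP32 side the wrapper does the equivalent with HTTPClient, checking for the same 204/400/404 status codes.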
By default, anyone can write to the database. To make sure we know who the data are coming from, we can limit privileges by adding users. In the influx CLI, create an admin user, and then another user:
CREATE USER admin WITH PASSWORD '<password>' WITH ALL PRIVILEGES
CREATE USER tom WITH PASSWORD 'nicetry' WITH ALL PRIVILEGES
Then, save a config file to a location of your choosing, as a .conf file. Edit the auth-enabled value of the config file to true (it’s under the [http] header), and restart the InfluxDB daemon with the config file as an argument: influxd -config /path/to/conf/file/<configname.conf>. Now, if you enter the influx CLI as usual and try to run any command, you’ll get the following message:
ERR: unable to parse authentication details. Instead, we need to log in to the CLI: influx -username tom -password <password>. Note that running SHOW DATABASES won’t display anything, because the previous database was created as the root user.
If you create the database and try running the same Arduino code as before, unsurprisingly you’ll see a 401 (unauthorised) error. In the sketch, uncomment the influx.authorise line and add your details, and you should be back to a successful 204 code.
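With authentication enabled, InfluxDB 1.x also accepts credentials as u and p query parameters on the /write endpoint, which is essentially what the authorise call sets up. A sketch of the URL construction (credentials and host are placeholders; in practice you’d prefer HTTPS before putting a password in a URL, which is exactly where this post goes next):

```python
import urllib.parse

def authed_write_url(host, db, user, password, port=8086):
    """Build a /write URL carrying InfluxDB 1.x u/p query-parameter credentials."""
    params = urllib.parse.urlencode({"db": db, "u": user, "p": password})
    return f"http://{host}:{port}/write?{params}"

print(authed_write_url("192.168.0.10", "esp32_tst", "tom", "nicetry"))
# http://192.168.0.10:8086/write?db=esp32_tst&u=tom&p=nicetry
```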
Authorisation might be a bit safer, but it’s still possible for other people to intercept and read the data you’re sending. To get around this, we can encrypt it using TLS, sending via HTTPS. The HTTPClient library can do this, using the mbed TLS library which Espressif has conveniently ported to the ESP32 (it’s not an easy task).
We are going to be using a self-signed certificate, as opposed to sending it to Symantec or someone to approve it. This means our data will be encrypted, but not verified (i.e. we can’t officially prove we are who we say we are). And that’s fine, but it did cause me a lot of grief for a long time. We have to modify the source slightly, by changing a line in ssl_client.cpp from the WiFiClientSecure library:
// mbedtls_ssl_conf_authmode(&ssl_client->ssl_conf, MBEDTLS_SSL_VERIFY_REQUIRED);
mbedtls_ssl_conf_authmode(&ssl_client->ssl_conf, MBEDTLS_SSL_VERIFY_NONE); //REQUIRED FOR SELF-SIGNED CERT
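For comparison, skipping verification from a desktop client (say, when testing the server from Python) is the same idea as the mbed TLS change above: create a TLS context that encrypts but doesn’t check the certificate chain. A minimal sketch:

```python
import ssl

# An SSL context that encrypts the connection but skips certificate
# verification -- the Python analogue of MBEDTLS_SSL_VERIFY_NONE.
# Only acceptable for self-signed certs on hosts you control.
ctx = ssl.create_default_context()
ctx.check_hostname = False          # must be disabled before verify_mode
ctx.verify_mode = ssl.CERT_NONE

# Pass it along, e.g. urllib.request.urlopen(url, context=ctx)
```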
Create your self-signed certificate according to InfluxData’s instructions. I made a directory in /usr/local/etc/ssl to save them in, but I guess anywhere out of the way is fine ¯\_(ツ)_/¯. In your .conf file, just down from the auth-enabled setting, set the following:
https-enabled = true
https-certificate = "/usr/local/etc/ssl/<cert-name>.crt"
https-private-key = "/usr/local/etc/ssl/<key-name>.key"
Restart the daemon, making sure to add the config file argument. If you try and access the CLI, you’ll get an error because it’s still trying to use HTTP:
influx -username tom -password dontthinksomate
Failed to connect to https://localhost:8086: Get http://localhost:8086/ping: net/http: HTTP/1.x transport connection broken: malformed HTTP response "\x15\x03\x01\x00\x02\x02"
Please use the -ssl flag to connect using SSL.
As it says, we need to use the -ssl flag. Because our certificate is self-signed, we also need the -unsafeSsl flag, which seems to be hidden from the documentation.
influx -ssl -unsafeSsl -username tom -password comeon should let you back in as usual.
If you try to run the existing Arduino sketch, InfluxDB will throw a TLS handshake error. To start using HTTPS, you need to copy your certificate into the InfluxCert.hpp template, making sure to include the “begin/end certificate” lines, the newlines, and the backslashes that let the compiler know this is one big string. Also make sure to #include "InfluxCert.hpp" and uncomment the influx.addCertificate(ROOT_CERT); line. Once compiled, you should get the same old 204 response. If you get anything mentioning an EOF error, that probably means you’ve messed up copying/formatting the certificate in InfluxCert.hpp. If you get “remote error: tls: bad certificate”, that means ssl_client.cpp wasn’t changed correctly and it’s still trying to verify the certificate.
Grafana is a popular open-source time series visualisation/analytics tool, which supports integration with InfluxDB. (InfluxData makes a similar program called Chronograf, which I used a couple of years ago and wasn’t so keen on, but I should give it another try because a lot will have changed, and I find the Grafana user experience quite unintuitive.)
brew install grafana
brew tap homebrew/services
Start Grafana using one of these:
brew services start grafana
grafana-server --config=/usr/local/etc/grafana/grafana.ini --homepath /usr/local/share/grafana
If you’ve already got Grafana installed, then make sure you update, because there was a problem with using only CA certs (it insisted on a client certificate and key, and according to the GitHub issue I can’t find anymore, it was also a security flaw).
In your browser, head to http://localhost:3000/login (3000 being Grafana’s default port), and log in as admin with password admin. Obviously you should change this password: go to your profile by navigating to the Grafana logo in the top left.
Add the data source by filling out the forms. Most of it is self-explanatory: make sure to tick the “With CA Cert” box, and then the Skip TLS Verification box (because ours is self-signed, remember). Also make sure to change the default URL to HTTPS.
On the Grafana menu on the top left, click on Dashboards and create a new one. Drag a graph into the hatched space, then click on the title, then Edit, to actually select what data to plot (weird, right?). I made a very basic query to plot the random variable generated in the sketch. We’ll cover more things you can do later. You can remove predefined elements (such as mean aggregation, and group by time) by clicking on the left-hand part of the box. Clicking on the top right of the window lets you control the time history, including a “refresh every” option at the bottom left of the menu, which periodically refreshes the plots so you see the new data coming in.
That’s it — remember to save the dashboard by clicking the floppy disk at the top, otherwise you’ll have to repeat it all next time you log in…