Log monitoring with Promtail and Grafana Cloud

Aggregate logs from PythonAnywhere

Robert Szulist
6 min read · Oct 7, 2021

Zabbix is my go-to monitoring tool, but it’s not perfect. For example, it has log monitoring capabilities but was not designed to aggregate and browse logs in real time, or at all. You can give it a go, but it won’t be as good as something designed specifically for this job, like Loki from Grafana Labs. And the best part is that Loki is included in Grafana Cloud’s free offering.

In this article we’ll take a look at how to use Grafana Cloud and Promtail to aggregate and analyse logs from apps hosted on PythonAnywhere.

Initial setup

The first thing we need to do is set up an account in Grafana Cloud. The process is pretty straightforward, but be sure to pick a nice username, as it will be part of your instance's URL, a detail that might matter if you ever decide to share your stats with friends or family. With that out of the way, we can start setting up log collection.

Onboarding/Walkthrough subpage

Navigate to Onboarding>Walkthrough and select “Forward metrics, logs and traces”. There you’ll see a variety of options for forwarding collected data. We are interested in Loki, the “Prometheus, but for logs”. You will be asked to generate an API key; creating it also produces a boilerplate Promtail configuration, which should look similar to this:

server:
  http_listen_port: 0
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

client:
  url: https://(REDACTED)/api/prom/push

scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: varlogs
          __path__: /var/log/*.log

Take note of the url parameter, as it contains the authorization details for your Loki instance. Obviously, you should never share it with anyone you don’t trust.
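For reference, the push URL embeds your credentials as HTTP basic auth, so it follows roughly this shape (the user ID and API key below are placeholders, and the host depends on your stack’s region):

https://<user-id>:<api-key>@logs-prod-us-central1.grafana.net/api/prom/push

Anyone holding that string can write to your Loki instance, hence the warning above.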

Running Promtail on PythonAnywhere

Getting Promtail

Now let’s move to PythonAnywhere. We start by downloading the Promtail binary. The latest release can always be found on the project’s GitHub page. As of the time of writing, the newest version is 2.3.0. To download it, just run:

$ wget https://github.com/grafana/loki/releases/download/v2.3.0/promtail-linux-amd64.zip -O /tmp/promtail-linux-amd64.zip

After this we can unzip the archive and copy the binary to a location of our choice. I like to keep executables and scripts in ~/bin and all related configuration files in ~/etc; this makes it easy to keep things tidy. Remember to set the proper permissions on the extracted file.

$ mkdir ~/bin && cd ~/bin
$ unzip /tmp/promtail-linux-amd64.zip
$ chmod +x promtail-linux-amd64

Regardless of where you decided to keep this executable, you might want to add it to your PATH. It’s as easy as appending a single line to ~/.bashrc. For example: $ echo 'export PATH=$PATH:~/bin' >> ~/.bashrc. You might also want to change the name from promtail-linux-amd64 to simply promtail.
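If you do rename it, a quick sanity check confirms that the binary runs; the -version flag should print build information:

$ mv ~/bin/promtail-linux-amd64 ~/bin/promtail
$ source ~/.bashrc
$ promtail -version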

Configuring Promtail

The boilerplate configuration file serves as a nice starting point, but it needs some refinement. Below you will find a more elaborate configuration that does more than just ship all logs found in a directory.

Set the url parameter to the value from your boilerplate and save the file as ~/etc/promtail.yaml.

server:
  http_listen_port: 0
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

client:
  url: https://<secret>
  external_labels:
    source: pythonanywhere

scrape_configs:
  - job_name: access
    static_configs:
      - labels:
          job: access_log
          __path__: /var/log/*.access.log
    pipeline_stages:
      - regex:
          expression: >-
            ^(?P<ip>.*?) (?P<remote_log_name>.*?)
            (?P<userid>.*?) \[(?P<date>.*?) (?P<timezone>.*?)\]
            \"(?P<request_method>.*?) (?P<path>.*?)
            (?P<request_version>HTTP/.*)?\" (?P<status>.*?)
            (?P<length>.*?) \"(?P<referrer>.*?)\"
            \"(?P<user_agent>.*?)\" (?P<session_id>.*?)
            response-time=(?P<response_time>.*)
      - labels:
          ip:
          userid:
          date:
          timezone:
          request_method:
          path:
          request_version:
          status:
          length:
          referrer:
          user_agent:
  - job_name: error
    static_configs:
      - labels:
          job: error_log
          __path__: /var/log/*.error.log
  - job_name: server
    static_configs:
      - labels:
          job: server_log
          __path__: /var/log/*.server.log
  - job_name: cron
    static_configs:
      - labels:
          job: cron_log
          __path__: /var/log/tasklog*.log

Scraping is nothing more than the discovery of log files based on certain rules. Once Promtail detects that a line was added, it passes that line through a pipeline, which is a set of stages meant to transform each log line. In this instance, certain parts of the access log are extracted with a regex and used as labels. Please note that the label values are empty; they will be populated with the values of the corresponding capture groups.
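To make this concrete, here is a made-up access-log line in the shape the regex expects (all values are invented for illustration; real entries will differ):

203.0.113.7 - - [07/Oct/2021:11:55:46 +0000] "GET /robots.txt HTTP/1.1" 200 132 "-" "Mozilla/5.0" "203.0.113.7" response-time=0.012

From this line the stages above would attach, among others, ip=203.0.113.7, request_method=GET, path=/robots.txt and status=200 as labels.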

There is a limit on how many labels can be applied to a log entry, so don’t go too wild or you will encounter the following error:

level=error ts=2021-10-06T11:55:46.626337138Z caller=client.go:355 component=client host=logs-prod-us-central1.grafana.net msg="final error sending batch" status=400 error="server returned HTTP status 400 Bad Request (400): entry for stream '(REDACTED)' has 20 label names; limit 15"

You will also notice that there are several scrape configs. That is because each one targets a different log type, each with a different purpose and format. Having separate configurations makes applying custom pipelines that much easier, so if I ever need to change something for, say, the error logs, it won’t be too much of a problem.
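For instance, if the error logs ever needed their own processing, only that one job would change. A minimal sketch, assuming a hypothetical error-log format that starts with a timestamp followed by a severity level (check your actual log format before using anything like this):

  - job_name: error
    static_configs:
      - labels:
          job: error_log
          __path__: /var/log/*.error.log
    pipeline_stages:
      - regex:
          expression: '^(?P<timestamp>\S+ \S+) (?P<level>\w+)'
      - labels:
          level: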

Now it’s time to do a test run, just to see that everything is working. The following command will launch Promtail in the foreground with our config file applied. Note the -dry-run option: it forces Promtail to print log streams instead of sending them to Loki, which is really helpful during troubleshooting.

promtail-linux-amd64 -dry-run -config.file ~/etc/promtail.yaml

Take note of any errors that might appear on your screen. If everything went well, you can just kill Promtail with CTRL+C.

Always-on tasks

Running Promtail directly in the command line isn’t the best solution. Luckily, PythonAnywhere provides something called an “Always-on task”. As the name implies, it is meant to manage programs that should run constantly in the background, and what’s more, if the process fails for any reason it will be restarted automatically. This is as close to an actual daemon as we can get. The configuration is quite easy: just provide the command used to start the task. In this case we can use the same command that was used to verify our configuration (without -dry-run, obviously). So at the very end the configuration should look like this.
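The command for the task, assuming the file locations used throughout this article, is:

~/bin/promtail-linux-amd64 -config.file ~/etc/promtail.yaml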

Always-on task configuration

Once the service starts you can investigate its logs for good measure. If there are no errors, you can go ahead and browse all logs in Grafana Cloud.

Browsing logs and creating dashboards

Once everything is done, you should have a live view of all incoming logs. They are browsable through the Explore section, where you can filter logs using LogQL to get at relevant information. Below you’ll find a sample query that matches any request that didn’t return the OK response. By using the predefined filename label, it is possible to narrow the search down to a specific log source.
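A LogQL query along these lines does the trick; the filename value is a placeholder, so substitute the path of your own access log:

{filename="/var/log/yourapp.pythonanywhere.com.access.log", job="access_log", status!="200"}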

Sample Loki query

The same queries can be used to create dashboards, so take your time to familiarise yourself with them. Once a query has executed, you should be able to see all matching logs. Below you’ll find an example line from the access log in its raw form. Clicking on it reveals all the extracted labels.

Detailed view of a log entry.

The nice thing is that labels come with their own ad-hoc statistics. For example, in the picture above you can see that in the selected time frame 67% of all requests were made to /robots.txt and the other 33% were someone being naughty. This is possible because we made a label out of the requested path for every line in access_log.

It is also possible to create a dashboard showing the data in a more readable form. For example, when creating a panel you can convert log entries into a table using the Labels to Fields transformation.

Logs in a table. Each column corresponds to a specific label.

Conclusion

This is how you can monitor logs of your applications using Grafana Cloud. Of course, this is only a small sample of what can be achieved using this solution.

If you have any questions, please feel free to leave a comment.

