Analytics with Plausible

By Connor Taffe | Published .

While server-side logs are the most accurate record of who is accessing what on your site, client-side analytics can complement them by counting only browsers that have JavaScript enabled, rather than every client.

This blog is an exercise in creating a site from scratch, both code and infrastructure, so Google Analytics is off the table. I'd also like to avoid readers having to load scripts from a third party site, or give that tool access to my readers' behavior. Plausible is an open source solution that I can run myself, where I retain control over all analytics and how they are used.

I've been running Plausible for over a week now. Here's what I see when I navigate to my instance:

Plausible Site View

I can quickly see where readers are connecting from geographically, including some countries I wouldn't expect like Russia, China, and Germany; where readers find my content, which is unsurprisingly Google but also some locally popular search engines; and which pages are most popular, by far my article on AirPrint with CUPS. This is all information I didn't have prior to setting up Plausible -- it should be available in the logs, but processing logs is something I haven't taken a whack at yet.

Below I detail how I set up Plausible in my environment. Plausible includes helpful materials in the Community Edition repo including a Docker Compose file I based my Kubernetes configuration on.

Installation

Before we start, you'll need to install PostgreSQL and ClickHouse in your environment. As I haven't yet solved the problem of persistent volumes on Kubernetes at home, I use a dedicated Fedora Linux 38 VM at db.home.arpa, running on VMware, to host these databases.

VMware Web UI

Install PostgreSQL

  1. On our database instance, install PostgreSQL:

    ; sudo dnf install postgresql-server postgresql-contrib
    ; sudo postgresql-setup --initdb --unit postgresql
    ; sudo systemctl enable --now postgresql
    
  2. Now connect and optionally create a user and database for yourself, so you can log in with your own user on the VM. The user name must match your Linux user name for peer authentication to succeed.

    ; whoami
    cptaffe
    ; sudo -u postgres psql
    psql (15.4)
    Type "help" for help.
    
    postgres=# CREATE ROLE cptaffe LOGIN;
    postgres=# CREATE DATABASE cptaffe;
    
  3. Log in as that new user, and optionally set a password for connecting over the network:

    ; psql
    psql (15.4)
    Type "help" for help.
    
    cptaffe=> \password cptaffe
    

    Save this password in a password manager.

  4. Edit /var/lib/pgsql/data/pg_hba.conf to control access. Mine looks like:

    # TYPE  DATABASE        USER            ADDRESS                 METHOD
    
    # "local" is for Unix domain socket connections only
    local   all             all                                     peer
    # IPv4 local connections:
    host    all             all             127.0.0.1/32            ident
    # IPv6 local connections:
    host    all             all             ::1/128                 ident
    # Allow replication connections from localhost, by a user with the
    # replication privilege.
    local   replication     all                                     peer
    host    replication     all             127.0.0.1/32            ident
    host    replication     all             ::1/128                 ident
    host    all             all             samenet                 scram-sha-256
    

    The important line here is the last one, which allows network connections for all users and databases, but only from hosts on the same network, and only if they authenticate using scram-sha-256. Restart PostgreSQL to apply the change:

    ; sudo systemctl restart postgresql.service
    
  5. Add a firewall rule which allows connections to PostgreSQL:

    ; sudo firewall-cmd --permanent --new-service=postgres
    ; sudo firewall-cmd --permanent --service=postgres --add-port=5432/tcp
    ; sudo firewall-cmd --permanent --add-service=postgres
    ; sudo firewall-cmd --reload
    
  6. Now test that you can log in over the network. From another machine (assuming you have the same username):

    ; psql postgres://db.home.arpa
    Password for user cptaffe:
    psql (14.9 (Homebrew), server 15.4)
    WARNING: psql major version 14, server major version 15.
            Some psql features might not work.
    Type "help" for help.
    
    cptaffe=>
    

Install ClickHouse

  1. On the same instance, or another dedicated instance, install ClickHouse:

    ; sudo yum install -y yum-utils
    ; sudo yum-config-manager --add-repo https://packages.clickhouse.com/rpm/clickhouse.repo
    ; sudo yum install -y clickhouse-server clickhouse-client
    ; sudo systemctl enable --now clickhouse-server
    
  2. Edit /etc/clickhouse-server/config.xml to enable listening for remote connections:

    <listen_host>::</listen_host>
    

    I also set

    <display_name>db.home.arpa</display_name>
    

    and commented out any unused protocols like mysql_port, postgresql_port, etc.

  3. Generate a random password and a hash for that password:

    ; PASSWORD=$(base64 < /dev/urandom | head -c8); echo "$PASSWORD"; echo -n "$PASSWORD" | sha256sum | tr -d '-'
    

    Save this password in a password manager.

    Then edit `` and add the line:

    <password_sha256_hex>xyz</password_sha256_hex>
    

    where xyz is replaced with the password hash from the above command.

  4. Restart the service

    ; sudo systemctl restart clickhouse-server
    

    and ensure you can connect to it:

    ; clickhouse-client
    Password for user (default):
    
    db.home.arpa :)
    
  5. Add firewall rules to allow connection to ClickHouse:

    ; sudo firewall-cmd --permanent --new-service=clickhouse
    ; sudo firewall-cmd --permanent --service=clickhouse --add-port=9000/tcp
    ; sudo firewall-cmd --permanent --service=clickhouse --add-port=8123/tcp
    ; sudo firewall-cmd --permanent --add-service=clickhouse
    ; sudo firewall-cmd --reload
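
The hashing half of step 3 can be pulled out into a small helper, which is handy for re-deriving the hash of a password you have already saved. This is only a sketch: `sha256_hex` is a hypothetical name, not part of ClickHouse, and it assumes GNU `sha256sum` is installed, as it is on the Fedora VM.

```shell
# Hypothetical helper: print the lowercase hex SHA-256 digest of the first argument.
sha256_hex() {
    # printf '%s' hashes the password itself, without the trailing newline
    # that a plain echo would append.
    # cut keeps only the digest and drops the "  -" file name column.
    printf '%s' "$1" | sha256sum | cut -d ' ' -f 1
}
```

Running `sha256_hex "$PASSWORD"` prints just the 64-character digest, ready to paste into the password_sha256_hex element.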
    

Credentials

Next, we should create dedicated accounts on both systems for Plausible, to limit access. From our VM, run the following commands for PostgreSQL, replacing xyz with a secure random password.

; sudo -u postgres psql
psql (15.4)
Type "help" for help.

postgres=# CREATE DATABASE plausible;
postgres=# CREATE USER plausible WITH ENCRYPTED PASSWORD 'xyz';
postgres=# GRANT ALL PRIVILEGES ON DATABASE plausible TO plausible;
postgres=# \c plausible
plausible=# GRANT ALL ON SCHEMA public TO plausible;

Next do the same for ClickHouse:

; clickhouse-client
Password for user (default):

db.home.arpa :) CREATE USER plausible IDENTIFIED WITH sha256_password BY 'xyz';
db.home.arpa :) CREATE DATABASE plausible;
db.home.arpa :) GRANT SELECT, INSERT, ALTER, CREATE DATABASE, CREATE TABLE, CREATE VIEW, CREATE DICTIONARY, DROP DATABASE, DROP TABLE, DROP VIEW, DROP DICTIONARY, TRUNCATE ON plausible.* TO plausible;
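
Because these passwords later end up inside connection URLs, it can help to pick ones that need no percent-encoding; base64 output may contain characters such as `/`, which do. A minimal sketch, assuming a hex-only password is acceptable for your setup; `random_hex` is a hypothetical helper name:

```shell
# Hypothetical helper: print N random bytes as lowercase hex (2N characters).
random_hex() {
    # od prints the bytes as space-separated hex pairs;
    # tr removes the spaces and line breaks between them.
    head -c "$1" /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# 16 random bytes give a 32-character password containing only 0-9 and a-f.
random_hex 16; echo
```

Use a separate value for each of the two databases, and save both in a password manager as before.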

Kubernetes

What follows is the Kubernetes configuration I use for my Plausible setup.

  1. First, create a new namespace for Plausible:

    apiVersion: v1
    kind: Namespace
    metadata:
      name: plausible
    
  2. Create a secret in that namespace populated with the login information from above:

    apiVersion: v1
    kind: Secret
    metadata:
      name: plausible
      namespace: plausible
    type: Opaque
    stringData:
      BASE_URL: https://plausible.example.com
      SECRET_KEY_BASE:
      MAXMIND_LICENSE_KEY:
      MAXMIND_EDITION: GeoLite2-City
      GOOGLE_CLIENT_ID:
      GOOGLE_CLIENT_SECRET:
      DATABASE_URL: postgres://plausible:xyz@db.home.arpa:5432/plausible
      CLICKHOUSE_DATABASE_URL: http://plausible:xyz@db.home.arpa:8123/plausible
      DISABLE_REGISTRATION: invite_only
    

    See the documentation for details on configuration. Replace BASE_URL with the Internet-accessible domain name of your instance.

    A new SECRET_KEY_BASE value can be generated simply with:

    ; head -c 64 < /dev/urandom | base64
    
  3. Create the deployment which will be configured by the secret:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: plausible
      namespace: plausible
    spec:
      selector:
        matchLabels:
          app: plausible

      template:
        metadata:
          labels:
            app: plausible
        spec:
          containers:
            - name: plausible
              image: plausible/analytics:latest
              command: ["/bin/sh"]
              args:
                [
                  "-c",
                  "sleep 10 && /entrypoint.sh db createdb && /entrypoint.sh db migrate && /entrypoint.sh run",
                ]
              ports:
                - name: http
                  containerPort: 8000
              envFrom:
                - secretRef:
                    name: plausible
    
  4. Create the service which will make our Plausible instance accessible on our local network:

    apiVersion: v1
    kind: Service
    metadata:
      name: plausible
      namespace: plausible
    spec:
      selector:
        app: plausible
      ports:
        - name: http
          port: 80
          targetPort: http
    

    Once Plausible is running, navigate to it and set up your account. On my network, pfSense delegates k8s.home.arpa to Kubernetes, so we can navigate to https://plausible.plausible.svc.k8s.home.arpa/.

  5. Finally, create a new Ingress which will make the service available from the Internet. On my cluster, kubernetes-pfsense-controller syncs the Ingress configuration to HAProxy running on pfSense.

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: plausible
      namespace: plausible
    spec:
      ingressClassName: traefik
      rules:
        - host: plausible.example.com
          http:
            paths:
              - backend:
                  service:
                    name: plausible
                    port:
                      name: http
                path: /
                pathType: Prefix
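
One detail worth checking for the SECRET_KEY_BASE value in step 2: GNU coreutils `base64` wraps its output at 76 columns by default, so on Linux the generated key may span two lines. The sketch below strips the line breaks so the value can be pasted into the Secret as a single line; `gen_secret_key_base` is a hypothetical helper name, not something Plausible provides.

```shell
# Hypothetical helper: print a base64 key built from 64 random bytes on one line.
gen_secret_key_base() {
    # tr removes any line breaks base64 inserts, so the key stays on a single line.
    head -c 64 /dev/urandom | base64 | tr -d '\n'
}

# 64 random bytes encode to 88 base64 characters.
gen_secret_key_base; echo
```

On systems where `base64` does not wrap, the extra `tr` is harmless.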
    

DNS

We need a new DNS entry for our Plausible server's domain. Navigate to your DNS provider and copy the A or AAAA records of your main domain to your new Plausible domain.

Cloudflare DNS Configuration

pfSense

Configuration of HAProxy is automatically handled by kubernetes-pfsense-controller, so we only need to ensure our ACME certificate can handle our new plausible.example.com domain.

  1. Navigate to Services, ACME Certificates.
  2. Click edit on your certificate.
  3. In the Domain SAN List, add our new domain name, copying the configuration (e.g. webroot) from your other domains.
  4. Then back at the certificates list, click Issue/Renew on the certificate to ask Let's Encrypt to issue a new certificate with the updated domains list.

If successful, we have our updated certificate and HTTPS will work on our Plausible services.

Setup

Now that our Plausible server is accessible from the Internet, and we've created an account, we can add the analytics script to our site. For each page or template, add the following XML [1] snippet at the end of the <head> element:

<script async="async" data-domain="example.com" src="https://plausible.example.com/js/script.js"></script>

This differs from the snippet Plausible provides in two ways:


  1. Yes, HTML 5 supports XML serialization.