Announcing monad-logger-aeson: Structured logging in Haskell for cheap

Updates

Intro

Structured logging is one of those things where once we have it, it’s hard to imagine going back to not having it.

Without structured logging, our log messages likely are in some number of bespoke text formats, and any metadata associated with the text component of a message is often converted to text then smashed into that very same text component, e.g.:

[Debug] Doing stuff: x=42 @(main:Main app/readme-example.hs:12:3)

With structured logging, our log messages are generally in some standard format (e.g. JSON) so they are inherently parsable, and it follows that the text component and metadata component of a log message can be recorded in separate fields.

{
  "time": "2022-05-17T04:52:15.5559417Z",
  "level": "debug",
  "location": {
    "package": "main",
    "module": "Main",
    "file": "app/readme-example.hs",
    "line": 11,
    "char": 3
  },
  "message": {
    "text": "Doing stuff",
    "meta": {
      "x": 42
    }
  }
}

Structured logging plays very nicely with log aggregation/analysis services and is an important tool in improving the observability of our programs.

This post serves as an announcement for the monad-logger-aeson package. This package provides structured JSON logging using monad-logger’s interface, in the format seen above. Specifically, it is intended to be a (largely) drop-in replacement for monad-logger’s Control.Monad.Logger.CallStack module.

There are many interesting logging libraries to choose from in Haskell: monad-logger, di, logging-effect, katip, and so on. Both by comparing the reverse dependency list for monad-logger with the other logging libraries’ reverse dependency lists, and also consulting our personal experiences working on Haskell codebases, monad-logger would seem to be the most prevalent logging library in the wild. In developing our library as a (largely) drop-in replacement for monad-logger, we hope to empower Haskellers using this popular logging interface to add structured logging to their programs with minimal fuss.

Tutorial

Assuming we have the following monad-logger-based code:

{-# LANGUAGE BlockArguments #-}
{-# LANGUAGE OverloadedStrings #-}
module Main
  ( main
  ) where

import Control.Monad.Logger.CallStack
import Data.Text (pack)

doStuff :: (MonadLogger m) => Int -> m ()
doStuff x = do
  logDebug $ "Doing stuff: x=" <> pack (show x)

main :: IO ()
main = do
  runStdoutLoggingT do
    doStuff 42
    logInfo "Done"

We would get something like this log output:

[Debug] Doing stuff: x=42 @(main:Main app/readme-example.hs:12:3)
[Info] Done @(main:Main app/readme-example.hs:18:5)

We can change our import from this:

import Control.Monad.Logger.CallStack

To this:

import Control.Monad.Logger.Aeson

In changing the import, we’ll have one compiler error to address:

monad-logger-aeson/app/readme-example.hs:12:35: error:
    • Couldn't match expected type ‘Message’
                  with actual type ‘Data.Text.Internal.Text’
    • In the second argument of ‘(<>)’, namely ‘pack (show x)’
      In the second argument of ‘($)’, namely
        ‘"Doing stuff: x=" <> pack (show x)’
      In a stmt of a 'do' block:
        logDebug $ "Doing stuff: x=" <> pack (show x)
   |
12 |   logDebug $ "Doing stuff: x=" <> pack (show x)
   |

This indicates that we need to provide the logDebug call a Message rather than a Text value. The Message type is central to monad-logger-aeson, as this is how we provide structure to our log messages. It consists of a textual component and a metadata component:

data Message = Text :# [Series]

The compiler error seen above gives us a choice depending upon our current time constraints: we can either go ahead and convert this Text value to a “proper” Message by moving the metadata it encodes into structured data (i.e. a [Series] value, where Series is an aeson key and encoded value), or we can defer doing that for now by tacking on an empty [Series] value. We’ll opt for the former here:

logDebug $ "Doing stuff" :# ["x" .= x]

Note that the logInfo call did not give us a compiler error, as Message has an IsString instance:

instance IsString Message where
  fromString string = Text.pack string :# []

With the above in place, our log output now looks like this (formatted for readability here with jq):

{
  "time": "2022-05-15T20:52:15.5559417Z",
  "level": "debug",
  "location": {
    "package": "main",
    "module": "Main",
    "file": "app/readme-example.hs",
    "line": 11,
    "char": 3
  },
  "message": {
    "text": "Doing stuff",
    "meta": {
      "x": 42
    }
  }
}
{
  "time": "2022-05-15T20:52:15.5560448Z",
  "level": "info",
  "location": {
    "package": "main",
    "module": "Main",
    "file": "app/readme-example.hs",
    "line": 17,
    "char": 5
  },
  "message": {
    "text": "Done"
  }
}

Voilà! Now our Haskell code is using structured logging. Our logs are fit for parsing, ingestion into our log aggregation/analysis service of choice, etc.

Extras

The previous section summarizes basic usage of monad-logger-aeson, but the library provides additional functionality we may wish to leverage.

Thread context

It can often be convenient to register some structured, contextual info for the duration of a block of code. To support this, monad-logger-aeson provides the following function:

withThreadContext :: (MonadIO m, MonadMask m) => [Pair] -> m a -> m a

All messages logged within the provided action will automatically include this contextual info. This function is thread-safe, as the contextual info is scoped to the calling thread only.

Registering thread context for messages can be useful in many scenarios. One particularly apt scenario is in wai middlewares. We can generate an ID for each incoming request then include it in the thread context:

import Control.Monad.Logger.Aeson ((.=), withThreadContext)
import Network.Wai (Middleware)
import qualified Data.UUID.V4 as UUID

addRequestId :: Middleware
addRequestId app = \request sendResponse -> do
  uuid <- UUID.nextRandom
  withThreadContext ["requestId" .= uuid] do
    app request sendResponse

Now all messages subsequently logged from our endpoint handler will automatically include a request ID local to each specific request (we’ve elided the "time" and "location" fields here for brevity):

{
  "level": "debug",
  "context": {
    "requestId": "29fb7473-4f20-4b29-8c61-b2c7bf0b5310"
  },
  "message": {
    "text": "Fetching current weather",
    "meta": {
      "zipCode": "12345"
    }
  }
}
{
  "level": "debug",
  "context": {
    "requestId": "7df6d573-918b-4b66-9b59-14d4c9648530"
  },
  "message": {
    "text": "Fetching current weather",
    "meta": {
      "zipCode": "12345"
    }
  }
}
{
  "level": "debug",
  "context": {
    "requestId": "7df6d573-918b-4b66-9b59-14d4c9648530"
  },
  "message": {
    "text": "Fetched current weather",
    "meta": {
      "zipCode": "12345"
    }
  }
}
{
  "level": "error",
  "context": {
    "requestId": "29fb7473-4f20-4b29-8c61-b2c7bf0b5310"
  },
  "message": {
    "text": "Failed to fetch current weather",
    "meta": {
      "zipCode": "12345",
      "reason": "blah blah"
    }
  }
}

withThreadContext is additive: if we nest calls to it, each nested call will add to the existing thread context. In the case of overlapping keys, the nested call’s Pair value(s) will win. Whenever the inner action completes, the thread context is rolled back to its value set in the enclosing action.

If we wish to include the existing thread context from one thread in another thread, we must register the thread context explicitly on that other thread. The myThreadContext function can be leveraged in this case.

If we’re coming from a Java background, it may be helpful for us to draw parallels between withThreadContext and log4j2’s ThreadContext (or perhaps log4j’s MDC). They all enable the same thing: setting some thread-local info that will be automatically pulled into each logged message.

Customizing logging

If we would like to have monad-logger-aeson automatically include the ThreadId in each message’s thread context, we can do so via the lower-level defaultOutputWith function:

defaultOutputWith
  :: OutputOptions
  -> Loc
  -> LogSource
  -> LogLevel
  -> LogStr
  -> IO ()

This function lets us build our own LoggingT “runner” functions if the existing ones in monad-logger-aeson do not suit our needs:

import qualified Data.ByteString.Char8 as BS8
import qualified System.IO as IO

-- ...

customStdoutLoggingT :: LoggingT m a -> m a
customStdoutLoggingT = flip runLoggingT logger
  where
  logger :: Loc -> LogSource -> LogLevel -> LogStr -> IO ()
  logger =
    defaultOutputWith (defaultOutputOptions outputAction)
      { outputIncludeThreadId = True
      }

  outputAction :: LogLevel -> BS8.ByteString -> IO ()
  outputAction _logLevel logMessageBytes = do
    BS8.hPutStrLn IO.stdout logMessageBytes

Our thread context field ("context") will now automatically include a Pair capturing the ThreadId, e.g.:

{ ..., "tid": "ThreadId 1", ... }

We can also register a “base” thread context, i.e. a set of Pair that will always be present in log messages. We’ll extend the previous example to demonstrate this:

import Data.Text (Text)

-- ...

customStdoutLoggingT :: Text -> LoggingT m a -> m a
customStdoutLoggingT service = flip runLoggingT logger -- service param is new
  where
  logger :: Loc -> LogSource -> LogLevel -> LogStr -> IO ()
  logger =
    defaultOutputWith (defaultOutputOptions outputAction)
      { outputIncludeThreadId = True
      , outputBaseThreadContext = ["service" .= service] -- this line is new
      }

  outputAction :: LogLevel -> BS8.ByteString -> IO ()
  outputAction _logLevel logMessageBytes = do
    BS8.hPutStrLn IO.stdout logMessageBytes

Our thread context field ("context") will now always contain the “base” thread context, e.g.:

{ ..., "service": "weather", ... }

If we subsequently use withThreadContext to register some thread context for our messages, if any of the keys in those Pair values overlap with the “base” thread context, then the overlapped Pair values in the “base” thread context will be overridden for the duration of the action provided to withThreadContext.

Additional LoggingT runners

We may find some of the additional LoggingT runner functions monad-logger-aeson provides to be useful:

runFastLoggingT :: LoggerSet -> LoggingT m a -> m a
runHandleLoggingT :: (LogLevel -> Handle) -> LoggingT m a -> m a

The former is convenient when we are working with a fast-logger LoggerSet, and the latter is convenient for CLIs/tools where we’d like to appropriately route log messages to stdout/stderr based on the message’s LogLevel. We can get a default implementation for runHandleLoggingT’s input function via defaultHandleFromLevel.

Parsing logs

monad-logger-aeson provides the following type:

data LoggedMessage = LoggedMessage
  { loggedMessageTimestamp :: UTCTime
  , loggedMessageLevel :: LogLevel
  , loggedMessageLoc :: Maybe Loc
  , loggedMessageLogSource :: Maybe LogSource
  , loggedMessageThreadContext :: KeyMap Value
  , loggedMessageText :: Text
  , loggedMessageMeta :: KeyMap Value
  }

This type is the Haskell representation of each JSON log message produced by the library. While we never interact with this type directly when logging messages with monad-logger-aeson, we may wish to use this type (and it’s FromJSON/ToJSON instances) if we are parsing/processing log files generated by this library.

End

We’ve now walked through the major features of monad-logger-aeson. We’ve seen how we can add structured logging to our Haskell programs with minimal effort, and we’ve added it using an interface many of us are already familiar with/likely already have in our codebases: MonadLogger/LoggingT.