Introducing the universal `ci` modifier

TL;DR: in Seq 2020.1, the postfix ci keyword turns any text comparison into a case-insensitive one. Download the preview or pull datalust/seq:preview from Docker Hub to try it out!

Grouping by request path: case-sensitive (CS) vs. case-insensitive (CI). In a case-sensitive grouping, /API/ORDERS and /api/orders identify different groups. The CI modifier makes the request path grouping case-insensitive, so these two values identify the same group.

Past attempts at specifying case-insensitivity

String comparisons present some challenges in the design of a query language for log search and analysis. Should comparisons be case-sensitive (CS) or case-insensitive (CI) by default? And whichever default is chosen, how does a user switch between them? Do the CS and CI variants of an operation syntactically resemble each other?

With the introduction of the SQL-style syntax way back in version 3.0, Seq chose to make operators like = case-sensitive, and to provide case-insensitive alternatives through built-in functions:

Case-sensitive comparison:

Environment = 'test'

Case-insensitive comparison:

EqualIgnoreCase(Environment, 'test')

The ergonomics of this scheme turned out to be poor: after typing the first version and realizing that the second was needed, a user had to switch syntaxes, shuffle the expression around, and remember an infrequently-used function name.

To work around this, Seq's like operator, shipped in the same 3.0 release, was made case-insensitive. This meant that a single, slightly more predictable syntax transformation could change a comparison from CS to CI.

Case-insensitive in Seq 3.0 to 5.1:

Environment like 'test'

(Mis-)using like this way meant StartsWithIgnoreCase() (like 'test%'), EndsWithIgnoreCase() (like '%test'), and ContainsIgnoreCase() (like '%test%') could all be expressed in a convenient syntax.

Unfortunately, this also meant that:

  • Seq missed out on a case-sensitive like operator,
  • Users would be surprised to find out that like deviates from the case-sensitive SQL norms,
  • Operators such as IndexOf() that can't be expressed with like still needed verbose names (IndexOfIgnoreCase()),
  • Case-modifiable SQL constructs such as group by and order by would require yet another mechanism (e.g. group by ToLower(..)) to approximate case-insensitive versions.

We've lived with this solution for some time, but kept returning to the problem in the hope of identifying a better one, and finally, we think we have it.

Enter ci

Seq 2020.1 introduces a new modifier, ci, that turns a case-sensitive expression:

Environment = 'test'

into a case-insensitive one:

Environment = 'test' ci

Just like that 😎.

Why is ci so much better than the mish-mash of solutions we've carried along so far?

  • It's universal
  • It's ergonomic
  • It's predictable

ci is universal

You've just seen how a case-sensitive, infix = operator can be modified to a case-insensitive one. Case-insensitive inequality <> follows naturally.

How about case-insensitive set membership - in?

Environment in ['test', 'staging'] ci

Aggregate operators? Function-call distinct() syntax can be modified, too:

select distinct(Environment) ci from stream

And groupings?

select count(*) from stream group by Environment ci

The ci modifier is universal. Wherever Seq operates on text, you can apply ci to switch from case-sensitive to case-insensitive comparisons.
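
To make the grouping case concrete, here is a minimal Python sketch of the semantics only, not of how Seq implements them. The group_counts helper is hypothetical, and casefold() is an assumption standing in for whatever case-folding rule Seq actually applies.

```python
from collections import Counter

def group_counts(values, ci=False):
    """Count occurrences per group key.

    With ci=True, values that differ only by case fall into the same group.
    """
    # Assumption: casefold() approximates Seq's case-insensitive comparison.
    keys = (v.casefold() if ci else v for v in values)
    return Counter(keys)

paths = ["/api/orders", "/API/ORDERS", "/api/orders", "/api/customers"]

cs_groups = group_counts(paths)           # like: group by RequestPath
ci_groups = group_counts(paths, ci=True)  # like: group by RequestPath ci
```

In the case-sensitive result, /api/orders and /API/ORDERS are counted separately; in the case-insensitive one they share a single group. The sketch labels that group with the folded value, which may not match how Seq displays it.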

ci is ergonomic

Postfix operators are unusual, so why did we choose one for ci? Because for us, and we suspect for many others, the point where we think about modifying the case-sensitivity of a comparison is:

Environment = 'test'█

That is, more often than not the cursor is at the end, so <space> ci requires the fewest keystrokes or mouse clicks of the various options we considered.

Updating like: ci is predictable

The success of ci depends on it being predictable: it's always a postfix keyword operator, and built-in operations are always case-sensitive without it.

This means Seq finally has a normal, case-sensitive like operator. 🎉

Case-sensitive like:

Environment like 'test%'

Case-insensitive like:

Environment like 'test%' ci
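
The difference between the two forms can be sketched in Python. This is a rough model of the behavior described above, not Seq's matcher: the like function is a hypothetical helper, only the % wildcard is modeled (every other character is treated literally), and re.IGNORECASE is an assumption standing in for Seq's case-insensitive rules.

```python
import re

def like(text, pattern, ci=False):
    """Return True if text matches pattern, where % matches any run of characters."""
    # Escape each literal segment, then join the segments with '.*' for each %.
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    flags = re.DOTALL | (re.IGNORECASE if ci else 0)
    # fullmatch: the pattern must cover the whole text, as like patterns do.
    return re.fullmatch(regex, text, flags) is not None

like("test-eu", "test%")           # matches: same case
like("TEST-EU", "test%")           # no match: like is case-sensitive by default
like("TEST-EU", "test%", ci=True)  # matches: ci ignores case
```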

Every silver cloud has a ... dark lining? ... 🤔 ... This does mean ci brings with it a breaking change. If you're used to case-insensitive like, 2020.1 will require some reprogramming of your muscle-memory.

Upgrading from Seq 5.1

Changing the case-sensitivity of like isn't something we embarked upon lightly (or, in fact, ever imagined that we'd do!). There's a lot of opportunity to break things, and to break things in subtle ways. Because of this, we took special care to automatically fix as much affected configuration as possible.

If you upgrade an existing Seq 5.1 instance, the migrator that runs the first time 2020.1 starts up will scan all existing signals, queries, dashboards, and other expressions, to insert the ci modifier where appropriate and keep the semantics of these stable.

If you're using Seq directly through its API, you'll need to check that any searches or queries using like will work correctly with case-sensitive semantics.
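
If your query strings live in your own code or configuration, a quick audit can help with that check. The following Python sketch is a heuristic of our own invention, not part of Seq or its API client: likes_without_ci is a hypothetical helper that assumes lower-case like and ci keywords, and simple single-quoted literals with no embedded quotes. Treat its output as a list of places to review by hand.

```python
import re

# Heuristic: the keyword `like`, a single-quoted literal, then optionally `ci`.
# Assumption: literals contain no embedded or escaped quote characters.
LIKE_EXPR = re.compile(r"\blike\s+'[^']*'(\s+ci\b)?")

def likes_without_ci(query):
    """Return the like comparisons in query that have no trailing ci modifier."""
    return [m.group(0) for m in LIKE_EXPR.finditer(query) if m.group(1) is None]

query = "Environment like 'test%' ci and Application like '%api%'"
needs_review = likes_without_ci(query)  # only the second comparison is reported
```

Each reported comparison was case-insensitive under 5.1 and becomes case-sensitive under 2020.1, so decide for each one whether to append ci.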

Learn more

There's more discussion and detailed examples in the RFC for this feature. You may need to bear with us while we update the Seq docs over the next few weeks.

Questions or feedback?

Let us know what you think, and if you hit any edge cases or need help figuring out how to express something in the new syntax, please drop us a line.

HApPy lOgGiNg! 😄

Nicholas Blumhardt
