TL;DR: in Seq 2020.1, the postfix ci keyword turns any text comparison into a case-insensitive one. Download the preview or pull datalust/seq:preview from Docker Hub to try it out!
Past attempts at specifying case-insensitivity
String comparisons present some challenges in the design of a query language for log search and analysis. Should comparisons be case-sensitive (CS) or case-insensitive (CI) by default? And whichever default is chosen, how does a user switch between them? Do the CS and CI variants of an operation syntactically resemble each other?
With the introduction of the SQL-style syntax way back in version 3.0, Seq chose to make operators like = case-sensitive, and to provide case-insensitive alternatives through built-in functions:
Environment = 'test'
The ergonomics of this scheme turned out to be poor: after typing the first version and realizing that the second was needed, a user had to switch syntaxes, shuffle the expression around, and remember an infrequently-used function name.
To work around this, Seq's like operator, shipped in the same 3.0 release, was case-insensitive. This meant one, slightly more predictable syntax transformation could change a comparison from CS to CI.
Case-insensitive in Seq 3.0 to 5.1:
Environment like 'test'
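Implementing like this way meant that case-insensitive equality, prefix matches (like 'test%'), suffix matches (like '%test'), and substring matches (like '%test%') could all be expressed in a convenient syntax.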
Unfortunately, this also meant that:
- Seq missed out on a case-insensitive
- Users would be surprised to find out that like deviates from the case-sensitive SQL norms,
- Operators such as IndexOf() that can't be expressed with like still needed verbose names (
- Case-modifiable SQL constructs such as order by would require yet another mechanism (e.g. group by ToLower(..)) to approximate case-insensitive versions.
We've lived with this solution for some time, but kept returning to the problem in the hope of identifying a better one, and finally, we think we have it.
Seq 2020.1 introduces a new modifier, ci, that turns a case-sensitive expression:
Environment = 'test'
into a case-insensitive one:
Environment = 'test' ci
Just like that 😎.
Why is ci so much better than the mish-mash of solutions we've carried along so far?
- It's universal
- It's ergonomic
- It's predictable
ci is universal
You've just seen how a case-sensitive, infix = operator can be modified to a case-insensitive one. Case-insensitive inequality <> follows naturally.
How about case-insensitive set membership?
Environment in ['test', 'staging'] ci
Aggregate operators? Function-call distinct() syntax can be modified, too:
select distinct(Environment) ci from stream
select count(*) from stream group by Environment ci
The ci modifier is universal. Wherever Seq operates on text, you can apply ci to switch from case-sensitive to case-insensitive comparisons.
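To make the semantics concrete, here is a rough Python model of what ci does to the operations above. This is a sketch, not Seq's implementation: it assumes that a case-insensitive comparison behaves like comparing case-folded strings, and the helper names and sample values are hypothetical.

```python
from collections import Counter

def fold(value):
    """Case-fold strings; leave non-string values untouched."""
    return value.casefold() if isinstance(value, str) else value

def equals(a, b, ci=False):
    """Model of `a = b` and `a = b ci`."""
    return fold(a) == fold(b) if ci else a == b

def is_in(value, candidates, ci=False):
    """Model of `value in [...]` and `value in [...] ci`."""
    return any(equals(value, c, ci) for c in candidates)

def distinct(values, ci=False):
    """Model of `distinct(x)` and `distinct(x) ci`; keeps the first spelling seen."""
    seen = {}
    for v in values:
        seen.setdefault(fold(v) if ci else v, v)
    return list(seen.values())

def count_by(values, ci=False):
    """Model of `count(*) ... group by x` and `... group by x ci`."""
    return Counter(fold(v) if ci else v for v in values)

environments = ["test", "Test", "TEST", "staging"]  # made-up sample data
```

In this model, equals("Test", "test") is false while equals("Test", "test", ci=True) is true, and count_by(environments, ci=True) collapses the three spellings of test into a single group. Which spelling Seq itself reports for a folded group is not something this sketch tries to capture.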
ci is ergonomic
Postfix operators are unusual, so why did we choose one for ci? Because for us, and we suspect for many others, the point where we think about modifying the case-sensitivity of a comparison is:
Environment = 'test'█
That is, more often than not the cursor is at the end, so <space> ci requires the fewest keystrokes or mouse clicks of the various options we considered.
ci is predictable
The success of ci depends on it being predictable: it's always a postfix keyword operator, and built-in operations are always case-sensitive without it.
This means Seq finally has a normal, case-sensitive like operator. 🎉
Environment like 'test%'
Environment like 'test%' ci
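The difference between the two expressions above can be modeled in a few lines of Python. This is only a sketch of the semantics, not how Seq evaluates like: it handles just the % wildcard, and the like() helper and its ci parameter are hypothetical names chosen to mirror the query syntax.

```python
import re

def like(text, pattern, ci=False):
    """Model of `text like pattern` (and `... ci`), supporting only `%`."""
    # Escape each literal chunk and let `%` match any run of characters.
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    flags = re.DOTALL | (re.IGNORECASE if ci else 0)
    return re.fullmatch(regex, text, flags) is not None
```

Under this model, like("Testing", "test%") is false because the comparison is case-sensitive, while like("Testing", "test%", ci=True) is true.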
Every silver cloud has a ... dark lining? ... 🤔 ... This does mean ci brings with it a breaking change. If you're used to case-insensitive like, 2020.1 will require some reprogramming of your muscle-memory.
Upgrading from Seq 5.1
Changing the case-sensitivity of like isn't something we embarked upon lightheartedly (or, in fact, ever imagined that we'd do!). There's a lot of opportunity to break things, and to break things in subtle ways. Because of this, we took special care to automatically fix as much affected configuration as possible.
If you upgrade an existing Seq 5.1 instance, the migrator that runs the first time 2020.1 starts up will scan all existing signals, queries, dashboards, and other expressions, to insert the ci modifier where appropriate and keep the semantics of these stable.
If you're using Seq directly through its API, you'll need to check that any searches or queries using like will work correctly with case-sensitive semantics.
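If your API client keeps its queries as strings in source code or configuration, a quick scan can help you find the ones to review. The helper below is a hypothetical sketch, not part of Seq or its migrator: it uses a simple regular expression rather than a real parser, so it will miss or misreport queries containing escaped quotes or the word like inside a string literal.

```python
import re

# Matches `like '<pattern>'` and captures an optional trailing `ci`.
# Simplification: no handling of escaped quotes or `like` inside literals.
LIKE_CLAUSE = re.compile(r"\blike\s+'[^']*'(\s+ci\b)?", re.IGNORECASE)

def likes_without_ci(query):
    """Return the `like` clauses in `query` that lack a `ci` modifier."""
    return [m.group(0) for m in LIKE_CLAUSE.finditer(query) if m.group(1) is None]
```

Any clause this returns was case-insensitive under Seq 5.1 and becomes case-sensitive in 2020.1, so decide for each one whether to append ci or keep the new behavior.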
There's more discussion and detailed examples in the RFC for this feature. You may need to bear with us while we update the Seq docs over the next few weeks.
Questions or feedback?
Let us know what you think, and if you hit any edge cases or need help figuring out how to express something in the new syntax, please drop us a line.
HApPy lOgGiNg! 😄