Introducing the universal `ci` modifier
TL;DR: in Seq 2020.1, the postfix `ci` keyword turns any text comparison into a case-insensitive one. Download the preview or pull `datalust/seq:preview` from Docker Hub to try it out!
Past attempts at specifying case-insensitivity
String comparisons present some challenges in the design of a query language for log search and analysis. Should comparisons be case-sensitive (CS) or case-insensitive (CI) by default? And whichever default is chosen, how does a user switch between them? Do the CS and CI variants of an operation syntactically resemble each other?
With the introduction of the SQL-style syntax way back in version 3.0, Seq chose to make operators like `=` case-sensitive, and to provide case-insensitive alternatives through built-in functions:
Case-sensitive comparison:
Environment = 'test'
Case-insensitive comparison:
EqualIgnoreCase(Environment, 'test')
The ergonomics of this scheme turned out to be poor: after typing the first version and realizing that the second was needed, a user had to switch syntaxes, shuffle the expression around, and remember an infrequently-used function name.
To work around this, Seq's `like` operator, shipped in the same 3.0 release, was case-insensitive. This meant one, slightly more predictable syntax transformation could change a comparison from CS to CI.
Case-insensitive in Seq 3.0 to 5.1:
Environment like 'test'
(Mis-)using `like` this way meant `StartsWithIgnoreCase()` (`like 'test%'`), `EndsWithIgnoreCase()` (`like '%test'`), and `ContainsIgnoreCase()` (`like '%test%'`) could all be expressed in a convenient syntax.
Unfortunately, this also meant that:
- Seq missed out on a case-sensitive `like` operator,
- Users would be surprised to find out that `like` deviates from the case-sensitive SQL norms,
- Operators such as `IndexOf()` that can't be expressed with `like` still needed verbose names (`IndexOfIgnoreCase()`),
- Case-modifiable SQL constructs such as `group by` and `order by` would require yet another mechanism (e.g. `group by ToLower(..)`) to approximate case-insensitive versions.
We've lived with this solution for some time, but kept returning to the problem in the hope of identifying a better one, and finally, we think we have it.
Enter ci
Seq 2020.1 introduces a new modifier, `ci`, that turns a case-sensitive expression:
Environment = 'test'
into a case-insensitive one:
Environment = 'test' ci
Just like that 😎.
Why is `ci` so much better than the mish-mash of solutions we've carried along so far?
- It's universal
- It's ergonomic
- It's predictable
`ci` is universal
You've just seen how a case-sensitive, infix `=` operator can be modified to a case-insensitive one. Case-insensitive inequality `<>` follows naturally.
How about case-insensitive set membership - `in`?
Environment in ['test', 'staging'] ci
Aggregate operators? Function-call `distinct()` syntax can be modified, too:
select distinct(Environment) ci from stream
And groupings?
select count(*) from stream group by Environment ci
The `ci` modifier is universal. Wherever Seq operates on text, you can apply `ci` to switch from case-sensitive to case-insensitive comparisons.
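To make the semantics concrete, here is a rough Python model of what `ci` does to equality, `in`, `distinct()`, and `group by`. It is only a sketch: the function names are made up for illustration, and `str.casefold()` stands in for whatever case-folding rules Seq actually applies, which this post doesn't spell out.

```python
from collections import Counter


def fold(text, ci=False):
    # Assumption: case-insensitivity is modeled with str.casefold();
    # Seq's real folding rules may differ.
    return text.casefold() if ci else text


def equals(left, right, ci=False):
    # Models: left = right [ci]
    return fold(left, ci) == fold(right, ci)


def is_in(value, candidates, ci=False):
    # Models: value in [...] [ci]
    return fold(value, ci) in {fold(c, ci) for c in candidates}


def distinct(values, ci=False):
    # Models: select distinct(value) [ci]
    # Which spelling Seq reports for each folded value is not modeled here.
    return {fold(v, ci) for v in values}


def group_counts(values, ci=False):
    # Models: select count(*) ... group by value [ci]
    return Counter(fold(v, ci) for v in values)


environments = ["test", "Test", "TEST", "staging"]

print(equals("Test", "test"))                           # False
print(equals("Test", "test", ci=True))                  # True
print(is_in("TEST", ["test", "staging"], ci=True))      # True
print(sorted(distinct(environments, ci=True)))          # ['staging', 'test']
print(group_counts(environments, ci=True)["test"])      # 3
```

The point of the model is that one optional flag covers every operation, which is what the postfix modifier does in the query language.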
`ci` is ergonomic
Postfix operators are unusual, so why did we choose one for `ci`? Because for us, and we suspect for many others, the point where we think about modifying the case-sensitivity of a comparison is:
Environment = 'test'█
That is, more often than not the cursor is at the end, so typing a space followed by `ci` requires the fewest keystrokes or mouse clicks of the various options we considered.
Updating `like`: `ci` is predictable
The success of `ci` depends on it being predictable: it's always a postfix keyword operator, and built-in operations are always case-sensitive without it.
This means Seq finally has a normal, case-sensitive `like` operator. 🎉
Case-sensitive `like`:
Environment like 'test%'
Case-insensitive `like`:
Environment like 'test%' ci
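For readers who want to reason about the new behavior outside Seq, the two forms above can be approximated in Python. This is a hedged sketch, not Seq's implementation: the `like` function below is hypothetical, it handles only the `%` wildcard shown in this post, and it uses Python's regex case-insensitivity rather than Seq's own rules.

```python
import re


def like(text, pattern, ci=False):
    # Only the % wildcard from this post is handled; any other wildcard
    # or escape syntax is out of scope for this sketch.
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    flags = re.DOTALL | (re.IGNORECASE if ci else 0)
    # fullmatch: the pattern must cover the whole text, as with SQL like.
    return re.fullmatch(regex, text, flags) is not None


print(like("testing", "test%"))                 # True
print(like("Testing", "test%"))                 # False: case-sensitive by default
print(like("Testing", "test%", ci=True))        # True
print(like("my-TEST-env", "%test%", ci=True))   # True
```

The second call shows the 2020.1 default, where a differently-cased prefix no longer matches unless `ci` is applied.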
Every silver cloud has a ... dark lining? ... 🤔 ... This does mean `ci` brings with it a breaking change. If you're used to case-insensitive `like`, 2020.1 will require some reprogramming of your muscle-memory.
Upgrading from Seq 5.1
Changing the case-sensitivity of `like` isn't something we embarked upon lightheartedly (or, in fact, ever imagined that we'd do!). There's a lot of opportunity to break things, and to break things in subtle ways. Because of this, we took special care to automatically fix as much affected configuration as possible.
If you upgrade an existing Seq 5.1 instance, the migrator that runs the first time 2020.1 starts up will scan all existing signals, queries, dashboards, and other expressions, to insert the `ci` modifier where appropriate and keep the semantics of these stable.
If you're using Seq directly through its API, you'll need to check that any searches or queries using `like` will work correctly with case-sensitive semantics.
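If your API clients keep query strings in scripts or config files, a quick scan can point out the ones worth a second look. The helper below is a rough heuristic, not a Seq tool: `needs_review` is a hypothetical name, and the regex assumes string literals are single-quoted with embedded quotes written as `''`. It flags any `like` followed by a literal pattern that has no trailing `ci`, so it will also flag queries that are intentionally case-sensitive.

```python
import re

# Heuristic: `like '<literal>'` that is not followed by the `ci` modifier.
# Assumption: literals are single-quoted and escape quotes by doubling them.
LIKE_WITHOUT_CI = re.compile(
    r"\blike\s+'(?:[^']|'')*'(?!')(?!\s+ci\b)",
    re.IGNORECASE,
)


def needs_review(expression):
    # True when the expression contains a `like` comparison without `ci`.
    return LIKE_WITHOUT_CI.search(expression) is not None


queries = [
    "Environment like 'test%'",
    "Environment like 'test%' ci",
    "Environment = 'test'",
]

for query in queries:
    print(needs_review(query), query)
# True Environment like 'test%'
# False Environment like 'test%' ci
# False Environment = 'test'
```

Anything flagged still needs a human decision: add `ci` to keep the old behavior, or leave the expression alone if case-sensitive matching is what you want.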
Learn more
There's more discussion and detailed examples in the RFC for this feature. You may need to bear with us while we update the Seq docs over the next few weeks.
Questions or feedback?
Let us know what you think, and if you hit any edge cases or need help figuring out how to express something in the new syntax, please drop us a line.
HApPy lOgGiNg! 😄