kJQ Filters

An introduction to filtering queries with kJQ

Overview

JQ is a popular, practical language described as 'like sed for JSON data'. Data inspect supports JQ-like filters on Kafka topics. We call this kJQ!

kPow implements a subset of JQ, allowing you to search JSON, Avro, Transit, EDN, String, and Custom Serdes with complex queries on structured data.

Language

In simple terms, a kJQ query consists of one or more comma-separated filters, then one optional reduction, then one optional negation, in that order.

kJQ filters can be applied to keys, values, and headers. kPow will scan tens of thousands of messages a second to find matching data.

kJQ is not whitespace sensitive.

kJQ Grammar

A kJQ filter is a limited version of a basic JQ filter.

A filter consists of a selector optionally followed by either a comparator or a function.

A selector is a JQ dot notation Object Index or zero-based Array Index.

e.g. .foo.bar, .[1], .foo[1].bar
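As a rough mental model only (this is not kPow's implementation), a selector can be pictured as a path walked through the message. The sketch below is Python and assumes the selector has already been split into steps, so .foo[1].bar becomes ["foo", 1, "bar"]. The name select and the choice to return None for a missing step are assumptions made for illustration.

```python
from typing import Any, Sequence, Union

Step = Union[str, int]  # an object key or a zero-based array index


def select(data: Any, path: Sequence[Step]) -> Any:
    """Walk `data` along `path`; return None if any step is missing.

    Hypothetical helper: models a selector, it is not part of kPow.
    """
    current = data
    for step in path:
        if isinstance(step, int) and isinstance(current, list) and 0 <= step < len(current):
            current = current[step]  # Array Index
        elif isinstance(step, str) and isinstance(current, dict) and step in current:
            current = current[step]  # Object Index
        else:
            return None  # assumption: a missing step yields null
    return current


message = {"foo": [{"bar": 1}, {"bar": 2}]}
print(select(message, ["foo", 1, "bar"]))  # models .foo[1].bar
print(select(message, ["foo", 5, "bar"]))  # index out of range
```

Under these assumptions the first call reaches the second element's bar value and the second call falls off the end of the array and yields None.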

A comparator is an operator followed by a selector or a scalar.

Valid operators: ==, !=, <, <=, >, >=

e.g. >= 10, != false, == "text", < .foo.baz

A function is a pipe followed by a function name with a text parameter.

Valid function names: startswith, endswith, contains

e.g. | startswith("text"), | endswith("text"), | contains("text")
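To make the two filter tails concrete, here is a small Python sketch that models a comparator and a function as predicates over an already-selected value. It is an illustration under stated assumptions, not kPow's code: the names compare and apply_function are hypothetical, and treating mismatched types as "no match" is an assumption rather than documented kJQ behaviour.

```python
import operator

# The six operators listed above, mapped to Python equivalents.
COMPARATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}

# The three function names listed above.
FUNCTIONS = {
    "startswith": lambda value, text: value.startswith(text),
    "endswith": lambda value, text: value.endswith(text),
    "contains": lambda value, text: text in value,
}


def compare(selected, op, operand):
    """Model `<selector> <op> <operand>`; incomparable types do not match (assumption)."""
    try:
        return bool(COMPARATORS[op](selected, operand))
    except TypeError:
        return False


def apply_function(selected, name, text):
    """Model `<selector> | name("text")`; non-string values do not match (assumption)."""
    return isinstance(selected, str) and FUNCTIONS[name](selected, text)


print(compare(11, ">", 10))                                  # models .foo.bar > 10
print(apply_function("IDDQDXXXXX", "contains", "IDDQD"))     # models | contains("IDDQD")
```

In this model the operand of a comparator can be a scalar or the result of another selector, which mirrors the two comparator forms described above.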

kJQ Query Evaluation

Multiple comma-separated kJQ filters are evaluated in isolation and then combined into a single query predicate with a reduction.

Strict filter isolation means the behaviour of kJQ is slightly different from JQ, particularly regarding pipe precedence.

The default reduction is all, which is a logical and of your filters. You can explicitly set the query reduction to all, or to any, which is a logical or of your filters.

e.g. | any, | all

kJQ Query Negation

A kJQ query predicate can be negated. The result is the logical negation of the combined predicate.

e.g. | not
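The evaluation order described above (filters in isolation, then a reduction, then an optional negation) can be sketched in Python as follows. This is a model of the semantics as this page states them, not kPow's implementation; the function name combine and its parameters are hypothetical.

```python
from typing import Iterable


def combine(filter_results: Iterable[bool], reduction: str = "all", negate: bool = False) -> bool:
    """Reduce per-filter results to one predicate, then optionally negate it."""
    results = list(filter_results)
    if reduction == "all":
        combined = all(results)  # logical and (the default)
    elif reduction == "any":
        combined = any(results)  # logical or
    else:
        raise ValueError(f"unknown reduction: {reduction!r}")
    return not combined if negate else combined


# Three filters evaluated in isolation against one message.
results = [True, False, False]
print(combine(results))                                # default all
print(combine(results, reduction="any"))               # | any
print(combine(results, reduction="any", negate=True))  # | any | not
```

Because negation applies to the combined predicate rather than to each filter, | any | not only matches when none of the filters is true.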

Examples

Truthy Filter

.foo

Matches where the selector is not null (e.g. {"foo": true} or {"foo": 1} will match, {"bar": true} will not)

Scalar Comparator Filter

.foo.bar > 10

Matches where the selector > 10 (e.g. {"foo": {"bar": 11}} will match, {"foo": {"bar": 8}} will not)

Selector Comparator Filter

.foo.bar == .foo.zoo

Matches where both selectors are equal (e.g. {"foo": {"bar": 10, "zoo": 10}} will match, {"foo": {"bar": 10, "zoo": 7}} will not)

Function Filter

.foo.baz[0] | contains("IDDQD")

Matches where the selector contains the text (e.g. {"foo": {"baz": ["IDDQDXXXXX"]}} will match, {"foo": {"baz": ["XXXXX"]}} will not)

Negated Filter

.[0].foo | contains("IDDQD") | not

Matches where the selector does not contain the text

Quoted and Clojure Selectors / Scalars

.foo/bar.baz > 10,
.foo."field!" == :some-keyword

kJQ understands quoted selectors (e.g. ."field!") and Clojure data (e.g. the namespaced key .foo/bar and the keyword :some-keyword)

Multiple Filters + (Default) All

.foo.bar > 10,
.foo.bar == .foo.zoo,
.foo.baz[0] | contains("IDDQD")

Matches where every filter is true (default). (e.g. {"foo": {"bar": 12, "zoo": 12, "baz": ["IDDQDXXXXX"]}} will match)

Multiple Filters + Explicit All

.foo.bar > 10,
.foo.bar == .foo.zoo,
.foo.baz[0] | contains("IDDQD")
| all

Matches where every filter is true.

Multiple Filters + Any

.foo.bar == .foo.zoo,
.foo.baz[0] | contains("IDDQD")
| any

Matches where any filter is true. (e.g. {"foo": {"baz": ["IDDQDXXX"]}} will match)

Multiple Filters + Any + Negated

.foo.bar > 10,
.foo.bar == .foo.zoo,
.foo.baz[0] | contains("IDDQD")
| any
| not

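Matches where no filter is true, that is, where every filter is false. This is the negation of the any predicate, so a message with even one true filter will not match.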
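As a worked check of this last query, the Python sketch below evaluates the three filters by hand against sample messages and then applies any followed by not. It reads negation as the logical negation of the combined predicate, as described in the Query Negation section. The helper names and the sample messages are made up for illustration and are not kPow output.

```python
def filter_results(msg):
    """Evaluate the three example filters in isolation against one message."""
    foo = msg["foo"]
    return [
        foo["bar"] > 10,          # .foo.bar > 10
        foo["bar"] == foo["zoo"],  # .foo.bar == .foo.zoo
        "IDDQD" in foo["baz"][0],  # .foo.baz[0] | contains("IDDQD")
    ]


def matches(msg):
    """Model `| any | not`: negate the logical or of the filter results."""
    return not any(filter_results(msg))


all_true = {"foo": {"bar": 12, "zoo": 12, "baz": ["IDDQDXXXXX"]}}
one_true = {"foo": {"bar": 12, "zoo": 7, "baz": ["XXXXX"]}}
none_true = {"foo": {"bar": 8, "zoo": 7, "baz": ["XXXXX"]}}

for name, msg in [("all_true", all_true), ("one_true", one_true), ("none_true", none_true)]:
    print(name, matches(msg))
```

Only none_true matches in this model: one_true has two false filters, but its single true filter makes the any predicate true, so the negation rejects it.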