Data policies

Kpow supports configurable redaction of Data inspection results with Data policies.

Data policies are defined in a YAML file and configured with an environment variable:

DATA_POLICY_CONFIGURATION_FILE=/path/to/masking/config.yml

Data policies are a declarative way of defining how redactions are applied to query results (both Data inspect and ksqlDB queries).

Kpow supports redaction of the key, value, and header attributes of records. It can redact scalar types (e.g. strings) as well as fields within structured data types (e.g. maps, collections).

Structured data redaction currently supports Protobuf, AVRO, JSON, Transit, and EDN data formats as well as Custom Serdes with JSON format.

String serdes are removed from Data inspect when Data policies are configured as they could be used to circumvent redaction.

Exclusions

Define exclusions: in your Data policies YAML file to exclude specific topics from redaction and allow them to be inspected with String serdes.

exclusions:
  topics: ["tx_meta", "tx_metrics"]

Data policies

The YAML configuration defines a list of policies. Each policy contains:

  • name: the unique name of the data policy
  • resources: the resources governed by the policy
  • category: the category for this policy
  • redaction: the redaction function to be applied
  • type: the type of data (either scalar or non-scalar)
  • fields: the fields to redact for non-scalar data

Example YAML

Example: A Credit Card policy that shows only the last four digits of specific fields in all topics.

policies:
  - name: Credit Card
    category: PII
    resources:
      - [ 'cluster', '*', 'topic', '*', 'value']
    redaction: ShowLast4
    type: non-scalar
    fields: [ credit_card, creditcard, pan ]
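
A policy like the one above can be sanity-checked before it is deployed. The following is a minimal sketch, not Kpow's own validation: the function name `policy_problems` is hypothetical, the policy is assumed to have already been parsed from YAML into a Python dict, and the rule that `fields` is required for non-scalar policies is an assumption drawn from the key descriptions above.

```python
from typing import Any, Dict, List

# Redaction function names as listed under "Redaction Functions" in this document.
KNOWN_REDACTIONS = {
    "Full", "SHAHash", "ShowEmailHost", "ShowEmailPart",
    "ShowFirst", "ShowFirst2", "ShowFirst4", "ShowFirst6",
    "ShowLast", "ShowLast2", "ShowLast4", "ShowLast6",
}
REQUIRED_KEYS = ("name", "category", "resources", "redaction", "type")


def policy_problems(policy: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found.

    Hypothetical helper for illustration only.
    """
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in policy]
    if "redaction" in policy and policy["redaction"] not in KNOWN_REDACTIONS:
        problems.append(f"unknown redaction: {policy['redaction']}")
    if "type" in policy and policy["type"] not in ("scalar", "non-scalar"):
        problems.append(f"unknown type: {policy['type']}")
    # Assumption: a non-scalar policy needs at least one field name to redact.
    if policy.get("type") == "non-scalar" and not policy.get("fields"):
        problems.append("non-scalar policy has no fields")
    return problems


credit_card_policy = {
    "name": "Credit Card",
    "category": "PII",
    "resources": [["cluster", "*", "topic", "*", "value"]],
    "redaction": "ShowLast4",
    "type": "non-scalar",
    "fields": ["credit_card", "creditcard", "pan"],
}
print(policy_problems(credit_card_policy))  # []
```

The Data Policy Sandbox described later remains the authoritative way to test a configuration; a check like this only catches obvious typos early.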

Resource

Resources are defined through a taxonomy that describes the hierarchy of objects in Kpow:

[DOMAIN_TYPE, DOMAIN_ID, OBJECT_TYPE?, OBJECT_ID?, OBJECT_RESOURCE?]

Where:

  • DOMAIN_TYPE: always cluster for data policies
  • DOMAIN_ID: the ID of the cluster or * for all clusters.
  • OBJECT_TYPE: always topic for data policies
  • OBJECT_ID: the name of the topic or * for all topics.
  • OBJECT_RESOURCE: (optional) either key, headers or value

Specifying a topic is optional, as is narrowing the resource to the key, value, or headers of that topic.

Example Resources

| Resource | Effect |
|---|---|
| ["cluster", "*"] | All clusters and topics |
| ["cluster", "N9xnGujkR32eYxHICeaHuQ"] | All topics for a specific cluster |
| ["cluster", "*", "topic", "MyTopic"] | Specific topic on all clusters (key and value) |
| ["cluster", "*", "topic", "MyTopic", "key"] | Specific topic on all clusters (key only) |
| ["cluster", "*", "topic", "*", "value"] | All topics on all clusters (value only) |
| ["cluster", "*", "topic", "MyTopic", "headers"] | Specific topic on all clusters (headers only) |
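
To illustrate how such resource patterns can be read, here is a minimal sketch of wildcard matching against a fully specified record location. It is an illustration under stated assumptions, not Kpow's implementation: the function `resource_matches` is hypothetical, `*` is assumed to match any single element, and a shorter pattern is assumed to act as a prefix that covers everything beneath it. The text above does not say whether a topic-level resource also covers headers, so treat that part of the prefix behaviour as a guess.

```python
from typing import Sequence


def resource_matches(pattern: Sequence[str], location: Sequence[str]) -> bool:
    """Return True if a policy resource pattern covers a concrete location.

    pattern:  e.g. ["cluster", "*", "topic", "MyTopic", "key"]
    location: fully specified, e.g.
              ["cluster", "N9xnGujkR32eYxHICeaHuQ", "topic", "MyTopic", "key"]
    """
    if len(pattern) > len(location):
        return False
    # Assumption: omitted trailing elements match anything (prefix semantics).
    return all(p == "*" or p == actual for p, actual in zip(pattern, location))


location = ["cluster", "N9xnGujkR32eYxHICeaHuQ", "topic", "MyTopic", "key"]
print(resource_matches(["cluster", "*"], location))                        # True
print(resource_matches(["cluster", "*", "topic", "MyTopic"], location))    # True
print(resource_matches(["cluster", "*", "topic", "*", "value"], location)) # False
```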

Redaction Functions

Supported redaction functions include:

| Redaction | Description | Example Data | Example Result |
|---|---|---|---|
| Full | Fully redact the matched value | John Smith | ************ |
| SHAHash | Apply a SHA512 hash to the value | John Smith | ed014a19bb67a.. |
| ShowEmailHost | Show the email host | [email protected] | *********@corp.org |
| ShowEmailPart | Show first character and host | [email protected] | j********@corp.org |
| ShowFirst | Show the first character | John Smith | J********* |
| ShowFirst2 | Show the first two characters | John Smith | Jo******** |
| ShowFirst4 | Show the first four characters | John Smith | John****** |
| ShowFirst6 | Show the first six characters | John Smith | John S**** |
| ShowLast | Show the last character | John Smith | *********h |
| ShowLast2 | Show the last two characters | John Smith | ********th |
| ShowLast4 | Show the last four characters | John Smith | ******mith |
| ShowLast6 | Show the last six characters | John Smith | **** Smith |
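
The masking behaviour in the table can be approximated in a few lines. This is a rough sketch for building intuition, not Kpow's code: the function names are hypothetical, the mask is assumed to preserve the length of the hidden portion (which matches the ShowFirst and ShowLast rows but not necessarily every Kpow output), and the email address used below is a made-up example.

```python
def show_first(value: str, n: int, mask: str = "*") -> str:
    """Keep the first n characters and mask the remainder."""
    return value[:n] + mask * max(len(value) - n, 0)


def show_last(value: str, n: int, mask: str = "*") -> str:
    """Mask everything except the last n characters."""
    hidden = max(len(value) - n, 0)
    return mask * hidden + value[hidden:]


def show_email_host(value: str, mask: str = "*") -> str:
    """Mask the local part of an email address and keep the host."""
    local, sep, host = value.rpartition("@")
    if not sep:
        # Not an email address: assume everything is masked.
        return mask * len(value)
    return mask * len(local) + "@" + host


def show_email_part(value: str, mask: str = "*") -> str:
    """Keep the first character of the local part and the host."""
    local, sep, host = value.rpartition("@")
    if not sep or not local:
        return mask * len(value)
    return local[0] + mask * (len(local) - 1) + "@" + host


print(show_first("John Smith", 4))          # John******
print(show_last("John Smith", 6))           # **** Smith
print(show_email_host("jdoe@example.org"))  # ****@example.org
print(show_email_part("jdoe@example.org"))  # j***@example.org
```

Note that in this sketch a value shorter than `n` is returned unmasked; a production redactor would likely want a stricter rule for short values.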

Nested Redaction

Kpow supports redaction of nested data structures.

Example: Applying the example Credit Card policy to a JSON message.

{
  "user_details": {
    "email_address": "[email protected]",
    "payment_options": [
      { "credit_card": "376953644924215" }
    ]
  }
}

The data is masked accordingly when displayed in Data inspect search results:

{
  "user_details": {
    "email_address": "[email protected]",
    "payment_options": [
      { "credit_card": "***4215" }
    ]
  }
}

Kpow is conservative when applying data policies. If the selected redaction function cannot be applied to a field, Kpow falls back to the Full redaction function. For example:

{
  "user_details": {
    "email_address": "[email protected]",
    "payment_options": [
      {
        "credit_card": {
          "pan": "376953644924215",
          "expiry": "10/10/2010"
        }
      }
    ]
  }
}

Applying the same Credit Card policy to this data results in a Full redaction at the credit_card field, because Kpow cannot apply the configured "ShowLast4" redactor to a structured value (in this case a map with "pan" and "expiry" fields).

The result is effectively truncated:

{
  "user_details": {
    "email_address": "[email protected]",
    "payment_options": [
      { "credit_card": "***" }
    ]
  }
}
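
The nested behaviour and the conservative fallback described above can be sketched as a recursive walk over the parsed message. This is an illustrative approximation only: `redact`, `show_last4`, and `FULL_MASK` are hypothetical names, and the fixed three-character mask is chosen simply to mirror the example output shown above.

```python
from typing import Any, Callable, Set

FULL_MASK = "***"  # assumed stand-in for the Full redaction result


def show_last4(value: str) -> str:
    """Hypothetical redactor: fixed mask followed by the last four characters."""
    return FULL_MASK + value[-4:]


def redact(data: Any, fields: Set[str], redactor: Callable[[str], str]) -> Any:
    """Return a copy of data with the named fields redacted at any depth."""
    if isinstance(data, dict):
        result = {}
        for key, value in data.items():
            if key in fields:
                # Conservative fallback: a non-string value is fully masked
                # because the string redactor cannot be applied to it.
                result[key] = redactor(value) if isinstance(value, str) else FULL_MASK
            else:
                result[key] = redact(value, fields, redactor)
        return result
    if isinstance(data, list):
        return [redact(item, fields, redactor) for item in data]
    return data


FIELDS = {"credit_card", "creditcard", "pan"}

flat = {"user_details": {"payment_options": [{"credit_card": "376953644924215"}]}}
nested = {
    "user_details": {
        "payment_options": [
            {"credit_card": {"pan": "376953644924215", "expiry": "10/10/2010"}}
        ]
    }
}
print(redact(flat, FIELDS, show_last4))
print(redact(nested, FIELDS, show_last4))
```

In the second call the whole credit_card map is replaced by the mask, so the inner "pan" and "expiry" values never reach the output.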

Data Policy Sandbox

Kpow comes with a built-in Data Policy Sandbox for experimenting with your currently configured policies and for creating and testing new configurations.

To access the Data Policy Sandbox, navigate to Admin -> Data policies.