NATS Engine
This engine allows integrating ClickHouse with NATS.
NATS lets you:
- Publish or subscribe to message subjects.
- Process new messages as they become available.
Creating a Table
Required parameters:
- nats_url– host:port (for example,- localhost:5672)..
- nats_subjects– List of subject for NATS table to subscribe/publish to. Supports wildcard subjects like- foo.*.baror- baz.>
- nats_format– Message format. Uses the same notation as the SQL- FORMATfunction, such as- JSONEachRow. For more information, see the Formats section.
Optional parameters:
- nats_schema– Parameter that must be used if the format requires a schema definition. For example, Cap'n Proto requires the path to the schema file and the name of the root- schema.capnp:Messageobject.
- nats_num_consumers– The number of consumers per table. Default:- 1. Specify more consumers if the throughput of one consumer is insufficient.
- nats_queue_group– Name for queue group of NATS subscribers. Default is the table name.
- nats_max_reconnect– Deprecated and has no effect, reconnect is performed permanently with nats_reconnect_wait timeout.
- nats_reconnect_wait– Amount of time in milliseconds to sleep between each reconnect attempt. Default:- 5000.
- nats_server_list- Server list for connection. Can be specified to connect to NATS cluster.
- nats_skip_broken_messages- NATS message parser tolerance to schema-incompatible messages per block. Default:- 0. If- nats_skip_broken_messages = Nthen the engine skips N NATS messages that cannot be parsed (a message equals a row of data).
- nats_max_block_size- Number of row collected by poll(s) for flushing data from NATS. Default: max_insert_block_size.
- nats_flush_interval_ms- Timeout for flushing data read from NATS. Default: stream_flush_interval_ms.
- nats_username- NATS username.
- nats_password- NATS password.
- nats_token- NATS auth token.
- nats_credential_file- Path to a NATS credentials file.
- nats_startup_connect_tries- Number of connect tries at startup. Default:- 5.
- nats_max_rows_per_message— The maximum number of rows written in one NATS message for row-based formats. (default :- 1).
- nats_handle_error_mode— How to handle errors for NATS engine. Possible values: default (the exception will be thrown if we fail to parse a message), stream (the exception message and raw message will be saved in virtual columns- _errorand- _raw_message).
SSL connection:
For secure connection use nats_secure = 1.
The default behaviour of the used library is not to check if the created TLS connection is sufficiently secure. Whether the certificate is expired, self-signed, missing or invalid: the connection is simply permitted. More strict checking of certificates can possibly be implemented in the future.
Writing to NATS table:
If table reads only from one subject, any insert will publish to the same subject.
However, if table reads from multiple subjects, we need to specify which subject we want to publish to.
That is why whenever inserting into table with multiple subjects, setting stream_like_engine_insert_queue is needed.
You can select one of the subjects the table reads from and publish your data there. For example:
Also format settings can be added along with nats-related settings.
Example:
The NATS server configuration can be added using the ClickHouse config file. More specifically you can add Redis password for NATS engine:
Description
SELECT is not particularly useful for reading messages (except for debugging), because each message can be read only once. It is more practical to create real-time threads using materialized views. To do this:
- Use the engine to create a NATS consumer and consider it a data stream.
- Create a table with the desired structure.
- Create a materialized view that converts data from the engine and puts it into a previously created table.
When the MATERIALIZED VIEW joins the engine, it starts collecting data in the background. This allows you to continually receive messages from NATS and convert them to the required format using SELECT.
One NATS table can have as many materialized views as you like, they do not read data from the table directly, but receive new records (in blocks), this way you can write to several tables with different detail level (with grouping - aggregation and without).
Example:
To stop receiving streams data or to change the conversion logic, detach the materialized view:
If you want to change the target table by using ALTER, we recommend disabling the material view to avoid discrepancies between the target table and the data from the view.
Virtual Columns
- _subject- NATS message subject. Data type:- String.
Additional virtual columns when nats_handle_error_mode='stream':
- _raw_message- Raw message that couldn't be parsed successfully. Data type:- Nullable(String).
- _error- Exception message happened during failed parsing. Data type:- Nullable(String).
Note: _raw_message and _error virtual columns are filled only in case of exception during parsing, they are always NULL when message was parsed successfully.
Data formats support
NATS engine supports all formats supported in ClickHouse. The number of rows in one NATS message depends on whether the format is row-based or block-based:
- For row-based formats the number of rows in one NATS message can be controlled by setting nats_max_rows_per_message.
- For block-based formats we cannot divide block into smaller parts, but the number of rows in one block can be controlled by general setting max_block_size.