You can create changefeeds on tables with more than one column family. Changefeeds will emit individual messages per column family on a table.
For further detail, see the following sections:
Syntax
To target a table with multiple column families, set the split_column_families
option when creating a changefeed:
CREATE CHANGEFEED FOR TABLE {table} INTO {sink} WITH split_column_families;
To emit messages for a specific column family, use the FAMILY
keyword:
CREATE CHANGEFEED FOR TABLE {table} FAMILY {family} INTO {sink};
You can also use Core changefeeds on tables with column families by using the EXPERIMENTAL CHANGEFEED FOR
statement with split_column_families
or the FAMILY
keyword.
If a table has multiple column families, the FAMILY
keyword will ensure the changefeed emits messages for each column family you define with FAMILY
in the CREATE CHANGEFEED
statement. If you do not specify FAMILY
, then the changefeed will emit messages for all the table's column families.
To specify multiple families on the same table, it is necessary to define the table and family in both instances:
CREATE CHANGEFEED FOR TABLE tbl FAMILY f_1, TABLE tbl FAMILY f_2;
Message format
The response will follow a typical changefeed message format, but with the family name appended to the table name with a .
, in the format table.family
:
{"after":{"column":"value"},"key":[1],"topic":"table.family"}
For cloud storage sinks, the filename will include the family name appended to the table name with a +
, in the format table+primary
.
Avro schema names will include the family name concatenated to the table name.
The primary key columns will appear in the key
for all column families, and will also appear in the value only for the families that they are a member of.
For example, if the table office_dogs
has a column family primary
, containing the primary key and a STRING
column, and a secondary
column family containing a different STRING
column, then you'll receive two messages for an insert.
CREATE TABLE office_dogs (
id INT PRIMARY KEY,
name STRING,
owner STRING,
FAMILY primary (id, name),
FAMILY secondary (owner)
);
The changefeed targeting this table (started with split_column_families
) will emit the following when there are inserts to the table:
{"after":{"id":4,"name":"Toby"},"key":[4],"topic":"office_dogs.primary"}],"length":1}
{"after":{"owner":"Ashley"},"key":[4],"topic":"office_dogs.secondary"}],"length":1}
The output shows the primary
column family with 4
in the value ({"id":4,"name":"Toby"}
) and the key ("key":[4]
). The secondary
family doesn't contain the id
column, so the primary key 4
is only in the key and not the value. For an update that only affects data in one column family, the changefeed will send one message for that update relating to the family.
Considerations
- If you create a table without column families and then start a changefeed with the
split_column_families
option, it is not possible to add column families. A subsequentALTER TABLE
statement adding a column family to the table will cause the changefeed to fail. - When you do not specify column family names in the
CREATE
orALTER TABLE
statement, the family names will default to either of the following:primary
: Sinceprimary
is a key word, you'll receive a syntax error if you runCREATE CHANGEFEED FOR table FAMILY primary
. To avoid this syntax error, use double quotes:CREATE CHANGEFEED FOR table FAMILY "primary"
. You'll receive output from the changefeed like:table.primary
.fam_<zero-indexed family id>_<delimited list of columns>
: For a table that does not include a name for the family:FAMILY (id, name)
, you'll receive output from the changefeed containing:table.fam_0_id_name
. This references the table, the family ID and the two columns that this column family includes.
- Creating a changefeed with CDC queries is not supported on tables with more than one column family.
- When you create a changefeed on a table with more than one column family, the changefeed will emit messages per column family in separate streams. As a result, changefeed messages for different column families will arrive at the sink under separate topics. For more details, refer to Message format.
For examples of starting changefeeds on tables with column families, see the following examples for Enterprise and Core changefeeds.
Create a changefeed on a table with column families
In this example, you'll set up changefeeds on two tables that have column families. You'll use a single-node cluster sending changes to a webhook sink for this example, but you can use any changefeed sink to work with tables that include column families.
If you do not already have one, request a trial Enterprise license.
Use the
cockroach start-single-node
command to start a single-node cluster:cockroach start-single-node --insecure --listen-addr=localhost
As the
root
user, open the built-in SQL client:cockroach sql --insecure
Set your organization and license key:
SET CLUSTER SETTING cluster.organization = '<organization name>';
SET CLUSTER SETTING enterprise.license = '<secret>';
Enable the
kv.rangefeed.enabled
cluster setting:SET CLUSTER SETTING kv.rangefeed.enabled = true;
In a separate terminal window, set up your HTTP server. Clone the test repository:
git clone https://github.com/cockroachlabs/cdc-webhook-sink-test-server.git
cd cdc-webhook-sink-test-server/go-https-server
Next make the script executable and then run the server (passing a specific port if preferred, otherwise it will default to
:3000
):chmod +x ./server.sh
./server.sh <port>
Back in your SQL shell, create a database called
cdc_demo
:CREATE DATABASE cdc_demo;
Set the database as the default:
USE cdc_demo;
Create a table with two column families:
CREATE TABLE office_dogs ( id INT PRIMARY KEY, name STRING, dog_owner STRING, FAMILY dogs (id, name), FAMILY employee (dog_owner) );
Insert some data into the table:
INSERT INTO office_dogs (id, name, dog_owner) VALUES (1, 'Petee', 'Lauren'), (2, 'Max', 'Taylor'), (3, 'Patch', 'Sammy'), (4, 'Roach', 'Ashley');
Create a second table that also defines column families:
CREATE TABLE office_plants ( id INT PRIMARY KEY, plant_name STRING, office_floor INT, safe_for_dogs BOOL, FAMILY dog_friendly (office_floor, safe_for_dogs), FAMILY plant (id, plant_name) );
Insert some data into
office_plants
:INSERT INTO office_plants (id, plant_name, office_floor, safe_for_dogs) VALUES (1, 'Sansevieria', 11, false), (2, 'Monstera', 11, false), (3, 'Peperomia', 10, true), (4, 'Jade', 9, true);
Create a changefeed on the
office_dogs
table targeting one of the column families. Use theFAMILY
keyword in theCREATE
statement:CREATE CHANGEFEED FOR TABLE office_dogs FAMILY employee INTO 'webhook-https://localhost:3000?insecure_tls_skip_verify=true';
You'll receive one message for each of the inserts that affects the specified column family:
{"payload":[{"after":{"dog_owner":"Lauren"},"key":[1],"topic":"office_dogs.employee"}],"length":1} {"payload":[{"after":{"dog_owner":"Sammy"},"key":[3],"topic":"office_dogs.employee"}],"length":1} {"payload":[{"after":{"dog_owner":"Taylor"},"key":[2],"topic":"office_dogs.employee"}],"length":1} {"payload":[{"after":{"dog_owner":"Ashley"},"key":[4],"topic":"office_dogs.employee"}],"length":1}
Note:The ordering of messages is not guaranteed. That is, you may not always receive messages for the same row, or even the same change to the same row, next to each other.
Alternatively, create a changefeed using the
FAMILY
keyword across two tables:CREATE CHANGEFEED FOR TABLE office_dogs FAMILY employee, TABLE office_plants FAMILY dog_friendly INTO 'webhook-https://localhost:3000?insecure_tls_skip_verify=true';
You'll receive one message for each insert that affects the specified column families:
{"payload":[{"after":{"dog_owner":"Lauren"},"key":[1],"topic":"office_dogs.employee"}],"length":1} {"payload":[{"after":{"office_floor":11,"safe_for_dogs":false},"key":[1],"topic":"office_plants.dog_friendly"}],"length":1} {"payload":[{"after":{"office_floor":9,"safe_for_dogs":true},"key":[4],"topic":"office_plants.dog_friendly"}],"length":1} {"payload":[{"after":{"dog_owner":"Taylor"},"key":[2],"topic":"office_dogs.employee"}],"length":1} {"payload":[{"after":{"office_floor":11,"safe_for_dogs":false},"key":[2],"topic":"office_plants.dog_friendly"}],"length":1} {"payload":[{"after":{"office_floor":10,"safe_for_dogs":true},"key":[3],"topic":"office_plants.dog_friendly"}],"length":1} {"payload":[{"after":{"dog_owner":"Ashley"},"key":[4],"topic":"office_dogs.employee"}],"length":1} {"payload":[{"after":{"dog_owner":"Sammy"},"key":[3],"topic":"office_dogs.employee"}],"length":1}
This allows you to define particular column families for the changefeed to target, without necessarily specifying every family in a table.
Note:To create a changefeed specifying two families on one table, ensure that you define the table and family in both instances:
CREATE CHANGEFEED FOR TABLE office_dogs FAMILY employee, TABLE office_dogs FAMILY dogs INTO {sink};
To create a changefeed that emits messages for all column families in a table, use the
split_column_families
option:CREATE CHANGEFEED FOR TABLE office_dogs INTO 'webhook-https://localhost:3000?insecure_tls_skip_verify=true' with split_column_families;
You'll receive output for both of the column families in the
office_dogs
table:{"payload":[{"after":{"id":1,"name":"Petee"},"key":[1],"topic":"office_dogs.dogs"}],"length":1} {"payload":[{"after":{"dog_owner":"Lauren"},"key":[1],"topic":"office_dogs.employee"}],"length":1} {"payload":[{"after":{"id":2,"name":"Max"},"key":[2],"topic":"office_dogs.dogs"}],"length":1} {"payload":[{"after":{"dog_owner":"Taylor"},"key":[2],"topic":"office_dogs.employee"}],"length":1} {"payload":[{"after":{"id":3,"name":"Patch"},"key":[3],"topic":"office_dogs.dogs"}],"length":1} {"payload":[{"after":{"dog_owner":"Sammy"},"key":[3],"topic":"office_dogs.employee"}],"length":1} {"payload":[{"after":{"id":4,"name":"Roach"},"key":[4],"topic":"office_dogs.dogs"}],"length":1} {"payload":[{"after":{"dog_owner":"Ashley"},"key":[4],"topic":"office_dogs.employee"}],"length":1}
Note:You can find details of your changefeed job using
SHOW CHANGEFEED JOBS
. Changefeeds streaming to Kafka or Google Cloud Pub/Sub will populate thetopics
field in theSHOW CHANGEFEED JOBS
output.When using the
FAMILY
keyword, thetopics
field will display in the formattopic.family
, e.g.,office_dogs.employee,office_dogs.dogs
. With thesplit_column_families
option set,topics
will show the topic name and a family placeholdertopic.{family}
, e.g.,office_dogs.{family}
.Update one of the values in the table:
UPDATE office_dogs SET name = 'Izzy' WHERE id = 4;
This only affects one column family, which means you'll receive one message:
{"payload":[{"after":{"id":4,"name":"Izzy"},"key":[4],"topic":"office_dogs.dogs"}],"length":1}
Create a Core changefeed on a table with column families
In this example, you'll set up Core changefeeds on two tables that have column families. You'll use a single-node cluster with the Core changefeed sending changes to the client.
Use the
cockroach start-single-node
command to start a single-node cluster:cockroach start-single-node --insecure --listen-addr=localhost
As the
root
user, open the built-in SQL client:cockroach sql --url="postgresql://root@127.0.0.1:26257?sslmode=disable" --format=csv
Enable the
kv.rangefeed.enabled
cluster setting:SET CLUSTER SETTING kv.rangefeed.enabled = true;
Create a database called
cdc_demo
:CREATE DATABASE cdc_demo;
Set the database as the default:
USE cdc_demo;
Create a table with two column families:
CREATE TABLE office_dogs ( id INT PRIMARY KEY, name STRING, dog_owner STRING, FAMILY dogs (id, name), FAMILY employee (dog_owner) );
Insert some data into the table:
INSERT INTO office_dogs (id, name, dog_owner) VALUES (1, 'Petee', 'Lauren'), (2, 'Max', 'Taylor'), (3, 'Patch', 'Sammy'), (4, 'Roach', 'Ashley');
Create another table that also defines two column families:
CREATE TABLE office_plants ( id INT PRIMARY KEY, plant_name STRING, office_floor INT, safe_for_dogs BOOL, FAMILY dog_friendly (office_floor, safe_for_dogs), FAMILY plant (id, plant_name) );
Insert some data into
office_plants
:INSERT INTO office_plants (id, plant_name, office_floor, safe_for_dogs) VALUES (1, 'Sansevieria', 11, false), (2, 'Monstera', 11, false), (3, 'Peperomia', 10, true), (4, 'Jade', 9, true);
Create a changefeed on the
office_dogs
table targeting one of the column families. Use theFAMILY
keyword in the statement:EXPERIMENTAL CHANGEFEED FOR TABLE office_dogs FAMILY employee;
You'll receive one message for each of the inserts that affects the specified column family:
table,key,value office_dogs.employee,[1],"{""after"": {""owner"": ""Lauren""}}" office_dogs.employee,[2],"{""after"": {""owner"": ""Taylor""}}" office_dogs.employee,[3],"{""after"": {""owner"": ""Sammy""}}" office_dogs.employee,[4],"{""after"": {""owner"": ""Ashley""}}"
Note:The ordering of messages is not guaranteed. That is, you may not always receive messages for the same row, or even the same change to the same row, next to each other.
Alternatively, create a changefeed using the
FAMILY
keyword across two tables:EXPERIMENTAL CHANGEFEED FOR TABLE office_dogs FAMILY employee, TABLE office_plants FAMILY dog_friendly;
You'll receive one message for each insert that affects the specified column families:
table,key,value office_plants.dog_friendly,[1],"{""after"": {""office_floor"": 11, ""safe_for_dogs"": false}}" office_plants.dog_friendly,[2],"{""after"": {""office_floor"": 11, ""safe_for_dogs"": false}}" office_plants.dog_friendly,[3],"{""after"": {""office_floor"": 10, ""safe_for_dogs"": true}}" office_plants.dog_friendly,[4],"{""after"": {""office_floor"": 9, ""safe_for_dogs"": true}}" office_dogs.employee,[1],"{""after"": {""dog_owner"": ""Lauren""}}" office_dogs.employee,[2],"{""after"": {""dog_owner"": ""Taylor""}}" office_dogs.employee,[3],"{""after"": {""dog_owner"": ""Sammy""}}" office_dogs.employee,[4],"{""after"": {""dog_owner"": ""Ashley""}}"
This allows you to define particular column families for the changefeed to target, without necessarily specifying every family in a table.
Note:To create a changefeed specifying two families on one table, ensure that you define the table and family in both instances:
EXPERIMENTAL CHANGEFEED FOR TABLE office_dogs FAMILY employee, TABLE office_dogs FAMILY dogs;
To create a changefeed that emits messages for all column families in a table, use the
split_column_families
option:EXPERIMENTAL CHANGEFEED FOR TABLE office_dogs WITH split_column_families;
In your other terminal window, insert some more values:
cockroach sql --insecure -e "INSERT INTO cdc_demo.office_dogs (id, name, dog_owner) VALUES (5, 'Daisy', 'Cameron'), (6, 'Sage', 'Blair'), (7, 'Bella', 'Ellis');"
Your changefeed will output the following:
table,key,value office_dogs.dogs,[1],"{""after"": {""id"": 1, ""name"": ""Petee""}}" office_dogs.employee,[1],"{""after"": {""owner"": ""Lauren""}}" office_dogs.dogs,[2],"{""after"": {""id"": 2, ""name"": ""Max""}}" office_dogs.employee,[2],"{""after"": {""owner"": ""Taylor""}}" office_dogs.dogs,[3],"{""after"": {""id"": 3, ""name"": ""Patch""}}" office_dogs.employee,[3],"{""after"": {""owner"": ""Sammy""}}" office_dogs.dogs,[4],"{""after"": {""id"": 4, ""name"": ""Roach""}}" office_dogs.employee,[4],"{""after"": {""owner"": ""Ashley""}}" office_dogs.dogs,[5],"{""after"": {""id"": 5, ""name"": ""Daisy""}}" office_dogs.employee,[5],"{""after"": {""owner"": ""Cameron""}}" office_dogs.dogs,[6],"{""after"": {""id"": 6, ""name"": ""Sage""}}" office_dogs.employee,[6],"{""after"": {""owner"": ""Blair""}}" office_dogs.dogs,[7],"{""after"": {""id"": 7, ""name"": ""Bella""}}" office_dogs.employee,[7],"{""after"": {""owner"": ""Ellis""}}"
In your other terminal window, update one of the values in the table:
cockroach sql --insecure -e "UPDATE cdc_demo.office_dogs SET name = 'Izzy' WHERE id = 4;"
This only affects one column family, which means you'll receive one message:
office_dogs.dogs,[4],"{""after"": {""id"": 4, ""name"": ""Izzy""}}"