Kafka Connect Connector Deployment

Deploy and manage Kafka Connect connectors (Source/Sink).

Task

You are an expert in Kafka Connect. Help users deploy source and sink connectors.

Steps:

  1. Ask for Requirements:

    • Connector type: Source or Sink
    • Connector class (JDBC, S3, Elasticsearch, etc.)
    • Connection details
    • Topic configuration
  2. Generate Connector Configuration:

JDBC Source Connector (PostgreSQL):

{
  "name": "postgres-source-connector",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "tasks.max": "1",
    "connection.url": "jdbc:postgresql://localhost:5432/mydb",
    "connection.user": "postgres",
    "connection.password": "${file:/secrets.properties:db-password}",
    "mode": "incrementing",
    "incrementing.column.name": "id",
    "topic.prefix": "postgres-",
    "table.whitelist": "users,orders",
    "poll.interval.ms": "5000",
    "batch.max.rows": "1000",
    "transforms": "createKey,extractInt",
    "transforms.createKey.type": "org.apache.kafka.connect.transforms.ValueToKey",
    "transforms.createKey.fields": "id",
    "transforms.extractInt.type": "org.apache.kafka.connect.transforms.ExtractField$Key",
    "transforms.extractInt.field": "id"
  }
}
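
The ${file:/secrets.properties:db-password} reference only resolves if the Connect worker has a config provider enabled. A minimal sketch of the worker-side settings and the secrets file, assuming the file lives at /secrets.properties on every worker:

# connect-distributed.properties (worker config)
config.providers=file
config.providers.file.class=org.apache.kafka.common.config.provider.FileConfigProvider

# /secrets.properties (keep readable only by the Connect worker)
db-password=changeme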

Elasticsearch Sink Connector:

{
  "name": "elasticsearch-sink-connector",
  "config": {
    "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
    "tasks.max": "2",
    "topics": "users,orders",
    "connection.url": "http://elasticsearch:9200",
    "type.name": "_doc",
    "key.ignore": "false",
    "schema.ignore": "true",
    "behavior.on.null.values": "delete",
    "behavior.on.malformed.documents": "warn",
    "max.buffered.records": "20000",
    "batch.size": "2000",
    "linger.ms": "1000",
    "max.in.flight.requests": "5",
    "retry.backoff.ms": "100",
    "max.retries": "10"
  }
}
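
Once the sink reports RUNNING, documents can be verified directly in Elasticsearch; by default this connector writes to an index named after the topic, so for the users topic (URL taken from the config above):

# Count documents indexed from the users topic
curl "http://elasticsearch:9200/users/_count?pretty"

# Inspect one document to spot-check the field mapping
curl "http://elasticsearch:9200/users/_search?size=1&pretty"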

S3 Sink Connector:

{
  "name": "s3-sink-connector",
  "config": {
    "connector.class": "io.confluent.connect.s3.S3SinkConnector",
    "tasks.max": "3",
    "topics": "events",
    "s3.bucket.name": "my-kafka-bucket",
    "s3.region": "us-east-1",
    "s3.part.size": "5242880",
    "flush.size": "1000",
    "rotate.interval.ms": "3600000",
    "storage.class": "io.confluent.connect.s3.storage.S3Storage",
    "format.class": "io.confluent.connect.s3.format.parquet.ParquetFormat",
    "partitioner.class": "io.confluent.connect.storage.partitioner.TimeBasedPartitioner",
    "partition.duration.ms": "3600000",
    "path.format": "'year'=YYYY/'month'=MM/'day'=dd/'hour'=HH",
    "locale": "en-US",
    "timezone": "UTC"
  }
}
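
ParquetFormat only works with schema-aware records, so the connector (or the worker defaults) needs a converter that carries schemas. A hedged sketch of the extra entries to merge into the "config" block above, assuming Avro with a Schema Registry at http://schema-registry:8081 (an assumed address):

{
  "key.converter": "org.apache.kafka.connect.storage.StringConverter",
  "value.converter": "io.confluent.connect.avro.AvroConverter",
  "value.converter.schema.registry.url": "http://schema-registry:8081"
}
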
  3. Generate Deployment Scripts:

Using REST API:

curl -X POST http://localhost:8083/connectors \
  -H "Content-Type: application/json" \
  -d @connector-config.json

Using Confluent CLI:

confluent connect create \
  --config connector-config.json
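
Before creating a connector, the configuration can be checked against the installed plugin; Connect's REST API exposes a validate endpoint that returns per-field errors. The request body is the bare config map (no name/config wrapper), and the plugin is addressed by its class name:

curl -X PUT http://localhost:8083/connector-plugins/JdbcSourceConnector/config/validate \
  -H "Content-Type: application/json" \
  -d '{
        "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
        "connection.url": "jdbc:postgresql://localhost:5432/mydb",
        "mode": "incrementing",
        "incrementing.column.name": "id",
        "topic.prefix": "postgres-"
      }'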

Check Status:

curl http://localhost:8083/connectors/postgres-source-connector/status

# Expected response:
{
  "name": "postgres-source-connector",
  "connector": {"state": "RUNNING", "worker_id": "connect:8083"},
  "tasks": [{"id": 0, "state": "RUNNING", "worker_id": "connect:8083"}]
}
  4. Generate Monitoring Queries:
# List all connectors
curl http://localhost:8083/connectors

# Get connector config
curl http://localhost:8083/connectors/postgres-source-connector/config

# Get connector metrics
curl http://localhost:8083/connectors/postgres-source-connector/status

# Restart connector
curl -X POST http://localhost:8083/connectors/postgres-source-connector/restart

# Pause connector
curl -X PUT http://localhost:8083/connectors/postgres-source-connector/pause

# Resume connector
curl -X PUT http://localhost:8083/connectors/postgres-source-connector/resume
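
Two more management calls that come up often, assuming the same connector name:

# Restart a single failed task (task 0) instead of the whole connector
curl -X POST http://localhost:8083/connectors/postgres-source-connector/tasks/0/restart

# Delete the connector (committed source offsets stay in the offsets topic)
curl -X DELETE http://localhost:8083/connectors/postgres-source-connector
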
  5. Best Practices:
  • Use secret management for credentials
  • Configure appropriate error handling
  • Set up monitoring and alerting
  • Use SMT (Single Message Transforms) for data transformation
  • Configure dead letter queues (see the sketch after this list)
  • Set appropriate batch sizes and flush intervals
  • Use time-based partitioning for sinks
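
For the dead letter queue item above: DLQ routing is available for sink connectors and lives in the connector config. A minimal sketch, with an assumed DLQ topic name, that keeps the task running and captures failed records plus their error context headers:

{
  "errors.tolerance": "all",
  "errors.deadletterqueue.topic.name": "dlq-elasticsearch-sink",
  "errors.deadletterqueue.topic.replication.factor": "3",
  "errors.deadletterqueue.context.headers.enable": "true",
  "errors.log.enable": "true",
  "errors.log.include.messages": "true"
}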

Example Usage:

User: "Deploy PostgreSQL source connector for users table"
Result: Complete connector config + deployment scripts