Sometimes you need to move data between environments without introducing a heavy ETL tool like Azure Data Factory.
In my case, I needed to sync a table from a production Azure SQL database into a development Azure SQL database.
The goal:
Incrementally sync new rows every 2 minutes.
No ADF. No linked servers. No manual exports.
Here’s how I built a lightweight, production-ready sync using:
- a Kubernetes CronJob (running on OpenShift)
- the mcr.microsoft.com/mssql-tools image (sqlcmd + bcp)
- a small watermark table for incremental state
Architecture Overview
A CronJob pod wakes up every two minutes, reads the watermark from Dev, exports any newer rows from Prod with bcp, imports them into Dev, and advances the watermark.
Networking: the cluster must be able to reach both Azure SQL servers on port 1433 (for Azure SQL, that typically means firewall rules allowing the cluster’s egress IPs).
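In sequence, each run of the job does the following:

CronJob pod (mssql-tools)
  1. sqlcmd -> Dev  : read LastSyncedId from dbo.DataSyncWatermark
  2. bcp    -> Prod : export rows WHERE Id > LastSyncedId to /tmp/data.dat
  3. bcp    -> Dev  : import /tmp/data.dat, keeping identity values
  4. sqlcmd -> Dev  : advance the watermark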
Step 1: Create a Watermark Table in Dev
This keeps track of what has already been synced.
CREATE TABLE dbo.DataSyncWatermark (
TableName sysname PRIMARY KEY,
LastSyncedId bigint NOT NULL DEFAULT(0),
UpdatedAt datetime2 NOT NULL DEFAULT SYSUTCDATETIME()
);
INSERT INTO dbo.DataSyncWatermark(TableName, LastSyncedId)
VALUES ('dbo.ActivityLogs', 0);
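You can check sync progress at any point with a quick query against Dev:

SELECT TableName, LastSyncedId, UpdatedAt
FROM dbo.DataSyncWatermark
WHERE TableName = 'dbo.ActivityLogs';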
Step 2: Example Table Schema (Generic)
To keep this reusable, here’s a sample table structure:
CREATE TABLE dbo.ActivityLogs (
Id bigint IDENTITY(1,1) NOT NULL PRIMARY KEY,
UserId nvarchar(100),
ActionType nvarchar(200),
Payload nvarchar(max),
Response nvarchar(max),
StatusCode int,
CreatedAt datetime2,
ExtraMetadata nvarchar(2048)
);
Both Prod and Dev must have an identical structure; bcp’s native format is positional, so column order and types have to match exactly.
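A quick way to spot drift is to run the same metadata query against both databases and diff the output (a sketch using the standard INFORMATION_SCHEMA views):

SELECT c.ORDINAL_POSITION, c.COLUMN_NAME, c.DATA_TYPE,
       c.CHARACTER_MAXIMUM_LENGTH, c.IS_NULLABLE
FROM INFORMATION_SCHEMA.COLUMNS AS c
WHERE c.TABLE_SCHEMA = 'dbo' AND c.TABLE_NAME = 'ActivityLogs'
ORDER BY c.ORDINAL_POSITION;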
Step 3: Create Kubernetes Secret
Never hardcode credentials in YAML.
Create a secret like this:
oc create secret generic sql-sync-secret \
--from-literal=PROD_SERVER=prod-sql.database.windows.net \
--from-literal=PROD_DB=prod_database \
--from-literal=PROD_USER=sync_user \
--from-literal=PROD_PASS='StrongPassword!' \
--from-literal=DEV_SERVER=dev-sql.database.windows.net \
--from-literal=DEV_DB=dev_database \
--from-literal=DEV_USER=sync_user \
--from-literal=DEV_PASS='StrongPassword!'
Now the CronJob can safely consume these via environment variables.
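To double-check what the job will see without echoing secrets into your shell history, oc can list the keys (these become the environment variable names through envFrom) and decode a single value if you really need to:

oc describe secret sql-sync-secret
oc get secret sql-sync-secret -o jsonpath='{.data.PROD_SERVER}' | base64 -d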
Step 4: Required SQL Permissions
On Prod:
GRANT SELECT ON dbo.ActivityLogs TO sync_user;
On Dev:
GRANT SELECT, INSERT ON dbo.ActivityLogs TO sync_user;
GRANT SELECT, INSERT, UPDATE ON dbo.DataSyncWatermark TO sync_user;
GRANT ALTER ON dbo.ActivityLogs TO sync_user;
ALTER is required because we preserve identity values during import: bulk-loading with KEEPIDENTITY (bcp’s -E flag) needs ALTER permission on the target table.
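For context, bcp -E is the bulk equivalent of this manual pattern (with a made-up example row), which is the same operation SET IDENTITY_INSERT gates behind ALTER:

SET IDENTITY_INSERT dbo.ActivityLogs ON;

INSERT INTO dbo.ActivityLogs
  (Id, UserId, ActionType, Payload, Response, StatusCode, CreatedAt, ExtraMetadata)
VALUES
  (42, 'user-1', 'Login', NULL, NULL, 200, SYSUTCDATETIME(), NULL);

SET IDENTITY_INSERT dbo.ActivityLogs OFF;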
Step 5: The CronJob YAML
This version:
- runs every two minutes (schedule: "*/2 * * * *")
- never overlaps itself (concurrencyPolicy: Forbid)
- reads credentials from the secret via envFrom
- exits early when Prod has nothing new
apiVersion: batch/v1
kind: CronJob
metadata:
  name: sql-table-sync
spec:
  schedule: "*/2 * * * *"
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: runner
              image: mcr.microsoft.com/mssql-tools
              envFrom:
                - secretRef:
                    name: sql-sync-secret
              command: ["/bin/bash","-lc"]
              args:
                - |
                  set -e
                  SQLCMD="/opt/mssql-tools/bin/sqlcmd"
                  BCP="/opt/mssql-tools/bin/bcp"
                  TABLE="dbo.ActivityLogs"
                  WM="dbo.DataSyncWatermark"

                  # Last Id we already copied, according to the watermark in Dev
                  LAST_ID=$($SQLCMD -S "$DEV_SERVER" -d "$DEV_DB" -U "$DEV_USER" -P "$DEV_PASS" \
                    -h -1 -W -Q "SET NOCOUNT ON; SELECT LastSyncedId FROM $WM WHERE TableName = '$TABLE';")

                  # Safety net: if Dev already holds rows beyond the watermark
                  # (e.g. a previous run imported but failed to update it),
                  # start from Dev's actual max Id to avoid duplicate inserts.
                  DEV_MAX_ID=$($SQLCMD -S "$DEV_SERVER" -d "$DEV_DB" -U "$DEV_USER" -P "$DEV_PASS" \
                    -h -1 -W -Q "SET NOCOUNT ON; SELECT ISNULL(MAX(Id),0) FROM $TABLE;")
                  if [ "$LAST_ID" -lt "$DEV_MAX_ID" ]; then
                    LAST_ID="$DEV_MAX_ID"
                  fi

                  # Nothing new in Prod? Exit quietly.
                  NEW_COUNT=$($SQLCMD -S "$PROD_SERVER" -d "$PROD_DB" -U "$PROD_USER" -P "$PROD_PASS" \
                    -h -1 -W -Q "SET NOCOUNT ON; SELECT COUNT(1) FROM $TABLE WHERE Id > ${LAST_ID};")
                  if [ "$NEW_COUNT" = "0" ]; then
                    exit 0
                  fi

                  # Export the delta from Prod in native format; the COLLATE casts
                  # normalize string collations for the transfer.
                  $BCP "
                  SELECT
                    Id,
                    UserId COLLATE DATABASE_DEFAULT,
                    ActionType COLLATE DATABASE_DEFAULT,
                    Payload COLLATE DATABASE_DEFAULT,
                    Response COLLATE DATABASE_DEFAULT,
                    StatusCode,
                    CreatedAt,
                    ExtraMetadata COLLATE DATABASE_DEFAULT
                  FROM $TABLE
                  WHERE Id > ${LAST_ID}
                  ORDER BY Id
                  " queryout /tmp/data.dat \
                    -S "$PROD_SERVER" -d "$PROD_DB" -U "$PROD_USER" -P "$PROD_PASS" -n

                  # Import into Dev; -E preserves the original identity values.
                  $BCP $TABLE in /tmp/data.dat \
                    -S "$DEV_SERVER" -d "$DEV_DB" -U "$DEV_USER" -P "$DEV_PASS" \
                    -n -E -b 5000

                  # Advance the watermark to Dev's new max Id.
                  $SQLCMD -S "$DEV_SERVER" -d "$DEV_DB" -U "$DEV_USER" -P "$DEV_PASS" -Q "
                  UPDATE $WM
                  SET LastSyncedId = (SELECT MAX(Id) FROM $TABLE),
                      UpdatedAt = SYSUTCDATETIME()
                  WHERE TableName = '$TABLE';
                  "
Common Issues I Faced
Why This Approach Works
When You Should Use Something Else
If you need:
Then consider:
Final Thoughts
Sometimes you don’t need a heavy data pipeline.
A well-designed incremental job, proper networking, and a watermark table can solve the problem cleanly.