Data Stream connector
The Data Stream connector pushes every shipment data change to an AWS S3 bucket managed by Carriyo in near real-time. You pull from that bucket to feed your data warehouse or analytics pipeline.
This connector is an add-on subscription for enterprise accounts. The subscription covers:
- Data streaming pipeline: Carriyo pushes your data every 5 minutes to a staging area in S3 through a dedicated pipeline. The subscription covers the compute resources that maintain this data flow.
- Data storage: Carriyo stores the data in its S3 bucket and retains it for one year.
- Security & maintenance: the subscription covers ongoing system maintenance and security protocols.
Contact your Carriyo account manager to learn more or enable this feature.
Carriyo pushes changes to the S3 bucket at approximately 5-minute intervals and provides an AWS role or user (or both) for accessing the bucket.
Folder structure
Output files are named:
<table-name>-<version>-YYYY-MM-DD-hh-mm-ss-<uuid v4>
Files follow a folder structure of /YYYY/MM/DD/hh. For example, shipment data
streamed at 10 a.m. (UTC) on April 1, 2024:
- 2024
  - 04
    - 01
      - 10
        - PROD_<TENANT>_Shipment-2-2024-04-01-10-00-00-d963b748-56e9-4b75-a334-29e2cc3a43a1
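The naming and folder conventions above can be turned into a small helper for locating and parsing files. This is a minimal sketch, not Carriyo's own tooling. The function names are hypothetical, the tenant "ACME" in the usage note is a placeholder, and the sketch assumes that object keys start directly with the year (no leading slash or extra root folder) and that the timestamp in the file name is in UTC.

```python
import re
from datetime import datetime, timezone

# <table-name>-<version>-YYYY-MM-DD-hh-mm-ss-<uuid v4>
FILE_NAME = re.compile(
    r"^(?P<table>.+)-(?P<version>[^-]+)-"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2})-"
    r"(?P<uuid>[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})$"
)


def hour_prefix(moment):
    """Return the YYYY/MM/DD/hh/ prefix for a timezone-aware datetime.

    Assumption: keys have no leading slash and no extra root folder.
    """
    return moment.astimezone(timezone.utc).strftime("%Y/%m/%d/%H/")


def parse_file_name(name):
    """Split an output file name into table, version, timestamp and uuid."""
    match = FILE_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    # Assumption: the timestamp embedded in the name is UTC.
    stamp = datetime.strptime(match["stamp"], "%Y-%m-%d-%H-%M-%S")
    return {
        "table": match["table"],
        "version": match["version"],
        "timestamp": stamp.replace(tzinfo=timezone.utc),
        "uuid": match["uuid"],
    }
```

For example, `hour_prefix(datetime(2024, 4, 1, 10, tzinfo=timezone.utc))` gives `2024/04/01/10/`, and `parse_file_name("PROD_ACME_Shipment-2-2024-04-01-10-00-00-d963b748-56e9-4b75-a334-29e2cc3a43a1")` reports the table `PROD_ACME_Shipment` and version `2`.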
File format
Each file may contain multiple change records. Each record is a single line of JSON with the following components:
- Keys: the Shipment ID this change record relates to.
- OldImage: a snapshot of the shipment data before the change.
- NewImage: a snapshot of the shipment data after the change.
- Other metadata: change-log details such as timestamps and byte size.
Both OldImage and NewImage follow the Shipment object model.
The sample file is formatted for readability and contains a single change record. In practice, files contain multiple records, each formatted as a single line of JSON without extra line breaks or spacing.
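Because each record occupies one line, a file can be read line by line and each line decoded as JSON. The sketch below illustrates this under some assumptions that the text above does not confirm: the file content is uncompressed UTF-8 text, Keys, OldImage, and NewImage appear as top-level keys of each record (adjust the lookups if they are nested in your files), and a missing OldImage or NewImage means the shipment was created or deleted. The function names are hypothetical.

```python
import json


def read_change_records(text):
    """Yield one dict per non-empty line of a data stream file."""
    for line in text.splitlines():
        line = line.strip()
        if line:
            yield json.loads(line)


def change_kind(record):
    """Classify a change record by which snapshots it carries.

    Assumption: OldImage and NewImage are top-level keys, and an absent
    snapshot means the shipment did not exist on that side of the change.
    """
    old = record.get("OldImage")
    new = record.get("NewImage")
    if old is None and new is None:
        return "unknown"
    if old is None:
        return "created"
    if new is None:
        return "deleted"
    return "updated"
```

A warehouse loader would typically keep only NewImage for the latest state of each shipment, and compare it with OldImage when it needs to know what changed.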
Accessing the S3 bucket
Using an AWS role
If you have an AWS account, Carriyo will configure access for your AWS account ID and provide:
- The S3 bucket name
- An ExternalId (used for secure sharing)
- An AWS role ARN
You can pull data from the bucket via API calls, the AWS SDK, or a data warehouse tool using the bucket name, role ARN, and ExternalId.
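With the AWS SDK for Python (boto3), role-based access usually means calling STS AssumeRole with the role ARN and ExternalId, then building an S3 client from the temporary credentials it returns. The following is a minimal sketch rather than an official Carriyo integration. The function names and the session name are hypothetical, and boto3 is a third-party package that must be installed separately.

```python
def role_session_kwargs(sts_client, role_arn, external_id,
                        session_name="carriyo-data-stream"):
    """Assume the role and return keyword arguments for an S3 client.

    session_name is a hypothetical label; any valid session name works.
    """
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
        ExternalId=external_id,
    )
    creds = response["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }


def s3_client_for_role(role_arn, external_id):
    """Build an S3 client that acts through the role provided by Carriyo."""
    import boto3  # third-party: pip install boto3

    # The STS call itself is signed with your own AWS account's credentials.
    sts_client = boto3.client("sts")
    return boto3.client(
        "s3", **role_session_kwargs(sts_client, role_arn, external_id)
    )
```

The temporary credentials expire, so a long-running job should call `s3_client_for_role` again rather than hold on to one client indefinitely.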
Using an AWS user
If you don't have an AWS account or prefer not to connect yours, Carriyo will give you direct access to the bucket. Carriyo provides:
- The S3 bucket name
- AWS user credentials (access key ID and secret access key)
Pull data using the bucket name and the provided user credentials.
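With user credentials the S3 client can be created directly. Listing one hour's folder and downloading each file might then look like the sketch below. It uses boto3 (a third-party package), the function names are hypothetical, and it assumes the files are uncompressed UTF-8 text, which the description above does not state.

```python
def s3_client_for_user(access_key_id, secret_access_key):
    """Build an S3 client from the user credentials provided by Carriyo."""
    import boto3  # third-party: pip install boto3

    return boto3.client(
        "s3",
        aws_access_key_id=access_key_id,
        aws_secret_access_key=secret_access_key,
    )


def list_keys(s3_client, bucket, prefix):
    """Yield every object key under a prefix such as '2024/04/01/10/'."""
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages for an empty prefix carry no "Contents" entry.
        for item in page.get("Contents", []):
            yield item["Key"]


def read_text(s3_client, bucket, key):
    """Download one file and decode it (assumes uncompressed UTF-8)."""
    body = s3_client.get_object(Bucket=bucket, Key=key)["Body"]
    return body.read().decode("utf-8")
```

The same `list_keys` and `read_text` helpers work with any S3 client, including one built from role-based temporary credentials. Store the secret access key in a secrets manager or environment variable rather than in source code.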