Data Stream connector

Updated May 26, 2026

The Data Stream connector pushes every shipment data change to an AWS S3 bucket managed by Carriyo in near real-time. You pull from that bucket to feed your data warehouse or analytics pipeline.

Enterprise add-on

This connector is an add-on subscription for enterprise accounts. The subscription covers:

Data streaming pipeline: Carriyo pushes your data to a staging area in S3 about every 5 minutes through a dedicated pipeline. The subscription covers the compute resources needed to maintain this data flow.
Data storage: Carriyo stores the data in its own S3 bucket and retains it for one year.
Security & maintenance: the subscription covers ongoing system maintenance and security protocols.

Contact your Carriyo account manager to learn more or enable this feature.

Carriyo pushes changes to the S3 bucket at approximately 5-minute intervals and provides an AWS role or user (or both) for accessing the bucket.

Folder structure

Output files are named:

<table-name>-<version>-YYYY-MM-DD-hh-mm-ss-<uuid v4>

Files follow a folder structure of /YYYY/MM/DD/hh. For example, shipment data streamed at 10 a.m. (UTC) on April 1, 2024:

- 2024
    - 04
        - 01
            - 10
                - PROD_<TENANT>_Shipment-2-2024-04-01-10-00-00-d963b748-56e9-4b75-a334-29e2cc3a43a1
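The naming and folder conventions above can be handled with a few lines of Python. This is a minimal sketch, not an official Carriyo helper: the function and class names are hypothetical, the tenant name ACME in the usage note is a placeholder, and the sketch assumes that object keys begin with YYYY/MM/DD/hh/ without a leading slash and that the timestamp in the file name is in UTC, like the folder path. Check both assumptions against your own bucket.

```python
import re
import uuid
from datetime import datetime, timezone
from typing import NamedTuple


class StreamFileName(NamedTuple):
    """Components of <table-name>-<version>-YYYY-MM-DD-hh-mm-ss-<uuid v4>."""
    table_name: str
    version: str
    timestamp: datetime
    file_id: uuid.UUID


# The timestamp and UUID have fixed shapes, so they are matched from the end
# of the name; whatever precedes them is the table name and version.
_FILE_NAME = re.compile(
    r"^(?P<table>.+)-(?P<version>[^-]+)"
    r"-(?P<ts>\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2})"
    r"-(?P<uuid>[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})$"
)


def hour_prefix(moment: datetime) -> str:
    """Return the YYYY/MM/DD/hh/ folder prefix for a timezone-aware datetime."""
    return moment.astimezone(timezone.utc).strftime("%Y/%m/%d/%H/")


def parse_file_name(name: str) -> StreamFileName:
    """Split an output file name into its documented components."""
    match = _FILE_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    # Assumption: the file-name timestamp is UTC.
    timestamp = datetime.strptime(match["ts"], "%Y-%m-%d-%H-%M-%S")
    timestamp = timestamp.replace(tzinfo=timezone.utc)
    return StreamFileName(
        match["table"], match["version"], timestamp, uuid.UUID(match["uuid"])
    )
```

For a file named PROD_ACME_Shipment-2-2024-04-01-10-00-00-d963b748-56e9-4b75-a334-29e2cc3a43a1, parse_file_name returns the table name PROD_ACME_Shipment and version "2", and hour_prefix applied to the parsed timestamp returns "2024/04/01/10/".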

File format

Each file may contain multiple change records. Each record is a single line of JSON with the following components:

Keys: the Shipment ID this change record relates to.
OldImage: a snapshot of the shipment data before the change.
NewImage: a snapshot of the shipment data after the change.
Other metadata: change-log details such as timestamps and byte size.

Both OldImage and NewImage follow the Shipment object model.

Download a sample file →

Note

The sample file is formatted for readability and contains a single change record. In practice, files contain multiple records, each formatted as a single line of JSON without extra line breaks or spacing.
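Because each record is one line of JSON, a file can be read line by line. The sketch below shows one possible way to do that in Python and to list which top-level fields differ between OldImage and NewImage. The function names are hypothetical, the sketch assumes the file content is already available as uncompressed UTF-8 text, and it treats a missing OldImage or NewImage as an empty object, which is an assumption rather than documented behavior. Keys is passed through untouched because its inner structure is not described here.

```python
import json
from typing import Any, Dict, Iterator, List

_MISSING = object()  # distinguishes "field absent" from "field is null"


def iter_change_records(text: str) -> Iterator[Dict[str, Any]]:
    """Yield one parsed change record per non-empty line of a streamed file."""
    for line in text.splitlines():
        line = line.strip()
        if line:
            yield json.loads(line)


def changed_fields(record: Dict[str, Any]) -> List[str]:
    """Return the sorted top-level field names that differ between images."""
    # Assumption: an absent image is treated as an empty object.
    old = record.get("OldImage") or {}
    new = record.get("NewImage") or {}
    return sorted(
        name
        for name in set(old) | set(new)
        if old.get(name, _MISSING) != new.get(name, _MISSING)
    )
```

A typical loop would call iter_change_records on the text of each file, read record["Keys"] to identify the shipment, and load record["NewImage"] into the warehouse.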

Accessing the S3 bucket

Using an AWS role

If you have an AWS account, Carriyo will configure access for your AWS account ID and provide:

The S3 bucket name
An ExternalId (used for secure sharing when assuming the role)
An AWS role ARN

You can pull data from the bucket via API calls, the AWS SDK, or a data warehouse tool using the bucket name, role ARN, and ExternalId.
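With the AWS SDK for Python (boto3), role-based access could look like the sketch below. It assumes boto3 is installed and that your own AWS credentials are already configured in the environment; the function names and the session name are hypothetical, and the role ARN and ExternalId are the values Carriyo provides. The code calls STS AssumeRole with the ExternalId and builds an S3 client from the temporary credentials it returns.

```python
from typing import Any, Dict


def build_assume_role_request(
    role_arn: str,
    external_id: str,
    session_name: str = "carriyo-data-stream",  # hypothetical session label
) -> Dict[str, str]:
    """Build the parameters for the STS AssumeRole call."""
    return {
        "RoleArn": role_arn,
        "RoleSessionName": session_name,
        "ExternalId": external_id,
    }


def s3_client_from_role(role_arn: str, external_id: str) -> Any:
    """Assume the provided role and return an S3 client using its credentials."""
    import boto3  # third-party: pip install boto3

    sts = boto3.client("sts")
    response = sts.assume_role(**build_assume_role_request(role_arn, external_id))
    credentials = response["Credentials"]  # temporary, expire after a while
    return boto3.client(
        "s3",
        aws_access_key_id=credentials["AccessKeyId"],
        aws_secret_access_key=credentials["SecretAccessKey"],
        aws_session_token=credentials["SessionToken"],
    )
```

The temporary credentials expire, so a long-running job would need to call s3_client_from_role again when they do.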

Using an AWS user

If you don't have an AWS account or prefer not to connect yours, Carriyo will give you direct access to the bucket. Carriyo provides:

The S3 bucket name
AWS user credentials (access key ID and secret access key)

Pull data using the bucket name and the provided user credentials.
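As an illustration, the sketch below uses boto3 with the provided access key pair to list the files under one hourly folder and read a file as text. It assumes boto3 is installed, that object keys begin with YYYY/MM/DD/hh/ without a leading slash, and that files are uncompressed UTF-8 text; the function names are hypothetical. The listing and reading helpers take the client as an argument, so they work equally with a client created from an assumed role.

```python
from typing import Any, Iterator


def s3_client_from_keys(access_key_id: str, secret_access_key: str) -> Any:
    """Create an S3 client from the user credentials Carriyo provides."""
    import boto3  # third-party: pip install boto3

    return boto3.client(
        "s3",
        aws_access_key_id=access_key_id,
        aws_secret_access_key=secret_access_key,
    )


def list_keys(s3_client: Any, bucket: str, prefix: str) -> Iterator[str]:
    """Yield every object key under a prefix, following pagination."""
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages for an empty prefix carry no "Contents" entry.
        for obj in page.get("Contents", []):
            yield obj["Key"]


def read_text(s3_client: Any, bucket: str, key: str) -> str:
    """Download one object and decode it (assumed to be UTF-8 text)."""
    body = s3_client.get_object(Bucket=bucket, Key=key)["Body"]
    return body.read().decode("utf-8")
```

For example, list_keys(client, bucket_name, "2024/04/01/10/") would yield the files streamed during the 10 a.m. (UTC) hour on April 1, 2024, if the prefix assumption holds for your bucket.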