Carriyo Data Stream

Carriyo Data Stream is a Carriyo feature that streams every change to client data to an AWS S3 bucket for analytics and data warehousing.

[Diagram: Carriyo pushes data to the AWS S3 bucket every 5 minutes; the client pulls data from the bucket.]

New data is pushed to the S3 bucket at approximately 5-minute intervals. Carriyo provides you, the client, with either an AWS role or an AWS user (or both) that you may use to access the S3 bucket.

File Structure

The output files are named using the format <table name>-<version>-YYYY-MM-DD-hh-mm-ss-<uuid v4> in a directory structure of format /YYYY/MM/DD/hh.

As an example, shipment data streamed at 10am on 1 April 2022 will be available in this structure:

  • 2022
    • 04
      • 01
        • 10
          • PROD_<TENANT>_Shipment-2-2022-04-01-10-00-00-d963b748-56e9-4b75-a334-29e2cc3a43a1
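The naming scheme above can be handled with a small helper. This is a minimal sketch, not Carriyo-provided code: the function names are hypothetical, the tenant name ACME in the usage note is made up, the prefix is written without a leading slash (an assumption about how the keys appear in the bucket), and the timestamp is parsed as a naive datetime because the time zone is not stated here.

```python
import re
from datetime import datetime

# "<table name>-<version>-YYYY-MM-DD-hh-mm-ss-<uuid v4>"
FILE_NAME_RE = re.compile(
    r"^(?P<table>.+)-(?P<version>\d+)"
    r"-(?P<ts>\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2})"
    r"-(?P<uuid>[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})$"
)


def hour_prefix(moment):
    """Return the YYYY/MM/DD/hh directory prefix for a datetime."""
    return moment.strftime("%Y/%m/%d/%H")


def parse_file_name(name):
    """Split an output file name into table, version, timestamp and uuid."""
    match = FILE_NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    return {
        "table": match["table"],
        "version": int(match["version"]),
        # Naive datetime: the time zone is not stated in this document.
        "timestamp": datetime.strptime(match["ts"], "%Y-%m-%d-%H-%M-%S"),
        "uuid": match["uuid"],
    }
```

For example, parse_file_name("PROD_ACME_Shipment-2-2022-04-01-10-00-00-d963b748-56e9-4b75-a334-29e2cc3a43a1") yields the table name PROD_ACME_Shipment and version 2, and hour_prefix applied to its timestamp gives the directory the file sits in.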

A file can contain multiple change logs. Each log is a single line of JSON in the DynamoDB row data structure, with fields as listed in the documentation of the respective entity type (e.g. Shipment).
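Reading such a line could look like the sketch below. It assumes that "DynamoDB row data structure" means the standard DynamoDB JSON encoding, where every value is wrapped in a type tag such as {"S": "text"} or {"N": "1.5"}, and that each line is a flat map of attribute name to tagged value. Both points are assumptions to check against a real file; the function names are hypothetical.

```python
import json
from decimal import Decimal


def from_dynamodb(value):
    """Convert one tagged DynamoDB value, e.g. {"S": "abc"}, to a Python value."""
    # A tagged value is a dict with exactly one key: the type tag.
    (type_tag, payload), = value.items()
    if type_tag == "S":
        return payload
    if type_tag == "N":
        return Decimal(payload)  # numbers arrive as strings
    if type_tag == "BOOL":
        return bool(payload)
    if type_tag == "NULL":
        return None
    if type_tag == "L":
        return [from_dynamodb(item) for item in payload]
    if type_tag == "M":
        return {key: from_dynamodb(item) for key, item in payload.items()}
    if type_tag == "SS":
        return set(payload)
    if type_tag == "NS":
        return {Decimal(number) for number in payload}
    raise ValueError(f"unsupported DynamoDB type tag: {type_tag!r}")


def parse_change_log_line(line):
    """Parse one line of a data stream file into a plain dict."""
    row = json.loads(line)
    return {name: from_dynamodb(attr) for name, attr in row.items()}
```

A file would then be processed by calling parse_change_log_line on each non-empty line. Binary attributes (the B and BS tags) are left out of this sketch and raise ValueError.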

How to access the S3 bucket

With AWS Role

Carriyo will create a stream for your AWS account ID and provide you with an S3 bucket name, an ExternalId (similar to a shared token) and an AWS role ARN. You can then pull data from the S3 bucket via API calls, the AWS SDK, or a data warehouse tool using the S3 bucket name, role ARN and ExternalId.
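With the AWS SDK for Python (boto3), role-based access might be sketched as follows. The helper names and the session name carriyo-data-stream are hypothetical; the role ARN and ExternalId are the values Carriyo gives you. The code assumes boto3 is installed and that your own AWS credentials are already configured, since they are needed to call STS.

```python
def s3_client_kwargs_from_role(sts_client, role_arn, external_id,
                               session_name="carriyo-data-stream"):
    """Assume the role via STS and return kwargs for boto3.client("s3", ...)."""
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,  # hypothetical label for this session
        ExternalId=external_id,
    )
    creds = response["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }


def open_s3_client_with_role(role_arn, external_id):
    """Return an S3 client that uses temporary credentials from the role."""
    import boto3  # third-party; imported here so the helper above has no dependency

    sts_client = boto3.client("sts")
    return boto3.client(
        "s3", **s3_client_kwargs_from_role(sts_client, role_arn, external_id)
    )
```

The temporary credentials returned by STS expire, so a long-running job would need to call open_s3_client_with_role again from time to time.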

With AWS User

Carriyo will create a stream and provide you with an S3 bucket name and AWS user credentials (client ID and secret key). You can then pull data from the S3 bucket via API calls, the AWS SDK, or a data warehouse tool using the S3 bucket name and the user credentials.
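With user credentials, listing the files for one hour could be sketched with boto3 as below. The function names are hypothetical, and the sketch assumes that the client ID and secret key correspond to an AWS access key ID and secret access key, and that object keys carry no leading slash.

```python
def open_s3_client_for_user(access_key_id, secret_access_key):
    """Return an S3 client that signs requests with the AWS user credentials."""
    import boto3  # third-party; assumed to be installed

    return boto3.client(
        "s3",
        aws_access_key_id=access_key_id,
        aws_secret_access_key=secret_access_key,
    )


def list_keys(s3_client, bucket, prefix):
    """Yield every object key under a prefix such as "2022/04/01/10"."""
    # The paginator follows continuation tokens, so large listings
    # are not cut off at the first page of results.
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            yield obj["Key"]
```

A pull job could call list_keys once per hour prefix and then fetch each key with the client's get_object method. list_keys works equally well with a client obtained through the AWS role.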