Ultimately, we want to connect our C3 cluster to a source of data. This is done through the process of 'Data Integration'. Generally, data flows from a source file or system into a so-called 'Canonical' Type, which is meant to mirror the data source directly. Next, a 'Canonical Transform' is defined, which connects the Canonical Type to another C3 Type that is part of your final data model. In outline, the flow is: source file or system, then Canonical Type, then Canonical Transform, then a Type in the final data model.

Additional C3.ai resources:

Specialized Types

First, we'll discuss the specialized Types used throughout the Data Integration system and what happens once the data enters the C3 AI Suite. It is easier to start inside the C3 AI Suite. Once we've established how things work there, we'll cover how to get data into the first step of the C3 AI Suite's Data Integration system.

Canonical Types

A Canonical Type is the entry point of data into the C3 AI Suite. It is a special Type which mixes in the `Canonical` Type. Mixing in `Canonical` tells the C3 AI Suite to add some capabilities, such as a RESTful API endpoint to ingest data, the ability to grab data from a seed data directory, and the ability to kick off the Data Integration pipeline when new data arrives. Conventionally, the name of a Canonical Type should start with the word 'Canonical'. Its fields should match the names of fields in the intended source, and the fields should be primitive types. For example, let's look at the Type 'CanonicalSmartBulb' from the lightbulbAD tutorial package (see C3 lightbulbAD Package).

/*
 * Copyright 2009-2020 C3 (www.c3.ai). All Rights Reserved.
 * This material, including without limitation any software, is the confidential trade secret and proprietary
 * information of C3 and its licensors. Reproduction, use and/or distribution of this material in any form is
 * strictly prohibited except as set forth in a written license agreement with C3 and/or its authorized distributors.
 * This material may be covered by one or more patents or pending patent applications.
 */

/**
* This type represents the raw data that will represent {@link SmartBulb} information.
*/
type CanonicalSmartBulb mixes Canonical<CanonicalSmartBulb> {
  /**
   * This represents the manufacturer of a {@link LightBulb}
   */
  Manufacturer: string

  /**
   * This represents the bulbType of a {@link LightBulb}
   */
  BulbType:     string

  /**
   * This represents the wattage of a {@link LightBulb}
   */
  Wattage:      decimal

  /**
   * This represents the id of a {@link LightBulb}
   */
  SN:           string

  /**
   * This represents the startDate of a {@link LightBulb}
   */
  StartDate:    datetime

  /**
   * This represents the latitude of a {@link SmartBulb}
   */
  Latitude:     double

  /**
   * This represents the longitude of a {@link SmartBulb}
   */
  Longitude:    double
}

Notice first that far fewer fields are present here than in the SmartBulb Type. This is because the Canonical Type is used only as an entry point to the C3 AI Suite. You don't need to define any methods, and the only fields necessary are those needed to hold data from the source. You'll also notice that the 'CanonicalSmartBulb' Type doesn't use the `entity` keyword, which means it isn't persisted either.

Generally speaking, you need to define a new Canonical Type for each type of data source.
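
To make the "mirror the source" idea concrete, here is a minimal plain-Python sketch (not C3 code) of a record whose fields match the CanonicalSmartBulb fields above, filled from one CSV row. The class name `CanonicalSmartBulbRow`, the helper `parse_row`, and the sample row are all invented for illustration. In practice the C3 AI Suite performs the ingestion itself.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal


@dataclass
class CanonicalSmartBulbRow:
    """Plain-Python stand-in for CanonicalSmartBulb: same field names, primitive values only."""
    Manufacturer: str
    BulbType: str
    Wattage: Decimal
    SN: str
    StartDate: datetime
    Latitude: float
    Longitude: float


def parse_row(row: dict) -> CanonicalSmartBulbRow:
    """Convert one CSV row (all strings) into typed primitive fields (hypothetical helper)."""
    return CanonicalSmartBulbRow(
        Manufacturer=row["Manufacturer"],
        BulbType=row["BulbType"],
        Wattage=Decimal(row["Wattage"]),
        SN=row["SN"],
        StartDate=datetime.fromisoformat(row["StartDate"]),
        Latitude=float(row["Latitude"]),
        Longitude=float(row["Longitude"]),
    )


# Made-up sample data; the header names match the canonical field names exactly.
SAMPLE_CSV = (
    "Manufacturer,BulbType,Wattage,SN,StartDate,Latitude,Longitude\n"
    "Bell,LED,9.5,SMBLB1,2011-01-01T04:00:00,37.48,-122.26\n"
)

rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
```

Because the column headers and the field names are identical, no renaming or business logic is needed at this stage. That work is left to the Transform.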

C3.ai resources on Canonical Types

Transform Types

With a Canonical Type defined to receive new data into the C3 AI Suite, we need to move this data into the data model. Transform Types define this operation. First, a Transform Type mixes in the destination Type and then uses the `transforms` keyword, followed by the source Canonical Type, to indicate where it should take data from. Second, Transform Types support a special syntax for their fields. This syntax defines an expression for each field which takes data from the source Type and produces a result to be stored in the given field of the target Type. Let's look at an example and discuss some of the syntax.

/**
 * This type encapsulates the data flow from the {@link CanonicalSmartBulb} to the {@link SmartBulb} type.
 */
type TransformCanonicalSmartBulbToSmartBulb mixes SmartBulb transforms CanonicalSmartBulb {

  id:              ~ expression "SN"
  manufacturer:    ~ expression {id: "Manufacturer"}
  bulbType:        ~ expression "BulbType"
  wattage:         ~ expression "Wattage"
  startDate:       ~ expression "StartDate"
  latitude:        ~ expression "Latitude"
  longitude:       ~ expression "Longitude"
  lumensUOM:       ~ expression "{ \"id\": \"'lumen'\" }"
  powerUOM:        ~ expression "{ \"id\": \"'watt'\" }"
  temperatureUOM:  ~ expression "{ \"id\": \"'degrees_fahrenheit'\" }"
  voltageUOM:      ~ expression "{ \"id\": \"'volt'\" }"

}

There are a few different kinds of fields you might see in a Transform Type, some of which are visible here. Let's go through them.

1. Constant

Constant fields are not present in the current example, but they would appear without the '~ expression' notation. For example, it might look like this:

id: "ID"
value: 1.5

In this case, the field is simply set to the constant value listed.

2. Copy Property of Canonical Type

Many transforms simply copy a value to the right field in the destination Type. This is done with the syntax '~ expression "<origin_field>"'. Let's take a look at the first field, 'id':

id: ~ expression "SN"

Here, the `~` means the result of the expression is stored in the field. This is true of all of the fields here. The expression itself is evaluated in the context of the source Type. So in this case, the property SN is evaluated and its value is copied to the 'id' field.
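
The copy rule can be modeled in a few lines of plain Python. This is only a sketch of the behavior described above, not how the C3 AI Suite implements it: `apply_copy_rules`, `COPY_RULES`, and the sample record are hypothetical names and data.

```python
def apply_copy_rules(source: dict, rules: dict) -> dict:
    """Build a target record by copying source[source_field] into each target field."""
    return {
        target_field: source[source_field]
        for target_field, source_field in rules.items()
    }


# Target field -> source field, mirroring lines such as:  id: ~ expression "SN"
COPY_RULES = {
    "id": "SN",
    "bulbType": "BulbType",
    "wattage": "Wattage",
    "startDate": "StartDate",
    "latitude": "Latitude",
    "longitude": "Longitude",
}

# Made-up source record using the CanonicalSmartBulb field names.
source = {
    "SN": "SMBLB1",
    "BulbType": "LED",
    "Wattage": 9.5,
    "StartDate": "2011-01-01T04:00:00",
    "Latitude": 37.48,
    "Longitude": -122.26,
}

target = apply_copy_rules(source, COPY_RULES)
```

The key on the left is the destination field and the string on the right is looked up in the source record, which is the same left-to-right reading as the `~ expression` lines.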

3. Fetch field Type via a key

We often want to link our destination Types to other, related Types. In this example, the SmartBulb Type contains a field 'manufacturer' which is of type Manufacturer. How can we grab the correct Manufacturer? This is done with another expression, this time using the '{}' notation. Let's have a look:

manufacturer: ~ expression {id: "Manufacturer"}

This means we fetch the Manufacturer by its 'id' key, using the value of the expression "Manufacturer", which here is the value of the 'Manufacturer' field in the transform's source Type. In other words, the source's Manufacturer field is used to find the Manufacturer instance with a matching id, and that instance is set as the value of the 'manufacturer' field of the destination Type.
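
As a rough illustration of the lookup just described, the following plain-Python sketch resolves a reference by key against an in-memory store. The store `MANUFACTURERS`, the `name` field, the helper `resolve_by_id`, and the sample values are assumptions made for this example. They are not part of the lightbulbAD package or the C3 API.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Manufacturer:
    id: str
    name: str  # hypothetical extra field, only here to show a full object is returned


# Hypothetical in-memory store standing in for the persisted Manufacturer instances.
MANUFACTURERS: Dict[str, Manufacturer] = {
    m.id: m
    for m in (Manufacturer("Bell", "Bell Lighting"), Manufacturer("GE", "GE Lighting"))
}


def resolve_by_id(store: Dict[str, Manufacturer], key_field: str, source: dict) -> Optional[Manufacturer]:
    """Read the key from the source record, then fetch the instance with that id (None if absent)."""
    return store.get(source[key_field])


# Made-up source record; "Manufacturer" holds the id of the related instance.
source = {"SN": "SMBLB1", "Manufacturer": "Bell"}

target = {
    "id": source["SN"],
    "manufacturer": resolve_by_id(MANUFACTURERS, "Manufacturer", source),
}
```

The source record carries only a plain string, while the destination field ends up pointing at a full Manufacturer instance. What happens when no instance matches is not covered above, so returning `None` here is purely a choice of this sketch.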

Finally, we can do the same but use a constant as the key. For example:

lumensUOM: ~ expression "{ \"id\": \"'lumen'\" }"

This can also be written:

lumensUOM: ~ expression { id: "'lumen'" }

Here, the expression is evaluated with a constant key. This means the Unit Type with id 'lumen' will always be used for the lumensUOM field of the destination type.
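
The nested quotes in "'lumen'" are what separate a constant from a field reference. The toy evaluator below sketches that distinction in plain Python, assuming (as the two examples suggest) that single-quoted text inside an expression is a string literal and unquoted text names a field of the source Type. The function `evaluate` is hypothetical and handles only these two cases. It is not the C3 expression engine.

```python
def evaluate(expression: str, source: dict):
    """Toy evaluator: 'text' in single quotes is a constant; anything else is a source field name."""
    is_literal = (
        len(expression) >= 2 and expression[0] == "'" and expression[-1] == "'"
    )
    if is_literal:
        return expression[1:-1]  # strip the surrounding single quotes
    return source[expression]    # otherwise read the named field from the source record


# Made-up source record.
source = {"Manufacturer": "Bell"}

# Mirrors:  manufacturer: ~ expression {id: "Manufacturer"}
manufacturer_key = {"id": evaluate("Manufacturer", source)}

# Mirrors:  lumensUOM: ~ expression { id: "'lumen'" }
lumens_key = {"id": evaluate("'lumen'", source)}
```

With this reading, `manufacturer_key` varies with each source record, while `lumens_key` is the same for every record.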

C3.ai resources on Canonical Transform Types

Application Types

Finally, a brief word on application Types. They have no special syntax; however, any data you wish to store and later retrieve must end up in a persistable Type. These Types start with the `entity` keyword. Together, your defined Types build a data model against which you can easily make complex queries that previously would have required complex multi-database queries.

For instance, the SmartBulb Type includes several other Types, such as the 'Manufacturer' mentioned earlier. This allows us, for example, to select SmartBulbs based on Manufacturer.
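
The selection mentioned above can be pictured with a small plain-Python sketch. It models only the idea of filtering through a linked Type. The classes, the helper `bulbs_by_manufacturer`, and the sample instances are invented for this example and do not reflect the C3 query API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Manufacturer:
    id: str


@dataclass(frozen=True)
class SmartBulb:
    id: str
    manufacturer: Manufacturer  # link to another Type in the data model
    wattage: float


def bulbs_by_manufacturer(bulbs: List[SmartBulb], manufacturer_id: str) -> List[SmartBulb]:
    """Select the bulbs whose linked Manufacturer has the given id."""
    return [b for b in bulbs if b.manufacturer.id == manufacturer_id]


# Made-up instances.
bell, ge = Manufacturer("Bell"), Manufacturer("GE")
bulbs = [
    SmartBulb("SMBLB1", bell, 9.5),
    SmartBulb("SMBLB2", ge, 60.0),
    SmartBulb("SMBLB3", bell, 9.5),
]

selected = bulbs_by_manufacturer(bulbs, "Bell")
```

The filter follows the `manufacturer` link directly, which is what the transform's key lookup makes possible once the data is in the model.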

Basic Data Sources

CSV Files

JSON Files

Seed Data

Sending Data Via POST

Complex Data Sources

Custom External Database

C3 Supported Database technologies

Examples
