
The C3 AI Suite provides researchers many tools to analyze data and build and deploy machine learning models. To get started using the C3 AI Suite, please follow the C3.ai DTI Quickstart guide. This guide explains how to connect to the C3 AI Suite, access data using C3 AI methods, and convert C3 AI method outputs to an easy-to-analyze form. Additionally, the guide provides more detailed instructions to DTI members using the Covid-19 Data Lake. Examples in this guide rely on the 'baseCovidDataLake' package available in this git repository.

Please note, this guide covers how to run read-only queries on the C3 AI Suite. For more advanced topics, such as loading data, building metrics, or configuring and training machine learning models, please refer to the following wikis:

  • Data Integration (Not yet available)
  • Metrics (Not yet available)
  • Machine Learning (Not yet available)

Connect to a C3.ai cluster

There are several options to connect to a C3.ai Cluster:

Fetching Instances of Types

All data in C3 is represented by a 'Type'. Data for a specific type can be 'fetched' from C3 using the 'fetch' API. In each language, each Type has a 'fetch' function to which a FetchSpec Type can be passed. This function then retrieves the data in a FetchResult Type which can be opened and used for data analysis.

Fetching is governed by the FetchSpec Type, and you can find full documentation of that Type here. However, a bare API description may not be particularly helpful at first. Generally, the FetchSpec Type defines a set of constraints to apply when gathering Type data. It can be 'empty' (i.e., without constraints), but generally you'll want to apply one or more constraints to get reasonable results.

The most useful properties of the FetchSpec are:

  • filter: Defines an expression to evaluate for each instance of the Type. When the expression evaluates to true, that instance is fetched.
  • limit: Fetch only 'limit' results. Can be useful to debug a fetch which might grab a lot of results.
  • include: Define specific properties of the Type to include in the fetch result. If not defined, all properties will be grabbed.
  • order: An expression which defines the order to return the results in.

Examples of Fetch operations

Consider the DTI housing example located here: https://github.com/c3aidti/HouseCoverageExample

In this example, the Type BlockInfo defines information aggregated about census blocks. We can, for example, fetch BlockInfo instances for which the 'prp_bf_lr' property is defined, and then order them based on their 'id' property.

Code Block
languagejs
BlockInfo.fetch({
	'limit': -1,
	'filter': 'exists(prp_bf_lr)',
	'order': 'descending(id)',
	'include': 'pct_i_l,pct_t_l,prp_res_lr,pop10_ha_lr,hu10_ha_lr,eroom_ha_lr,med10_age,prp_bf_lr',
})

Here's how you would perform the same fetch in python:

Code Block
languagepy
raw_data = c3.BlockInfo.fetch(spec={
    'limit': -1,
    'filter': 'exists(prp_bf_lr)',
    'order': 'descending(id)',
    'include': 'pct_i_l,pct_t_l,prp_res_lr,pop10_ha_lr,hu10_ha_lr,eroom_ha_lr,med10_age,prp_bf_lr'
})

Another useful fetch command is fetchCount. This function is nearly identical to the fetch commands above, but it just returns a count of the number of objects which match the fetch filter. This is useful when trying to determine whether a given search is refined enough.

In the Javascript Console this is:

Code Block
BlockInfo.fetchCount({'filter': 'exists(prp_bf_lr)'})

The same in python is:

Code Block
c3.BlockInfo.fetchCount(spec={'filter': 'exists(prp_bf_lr)'})

More examples of fetch commands can be found in our examples here:

Here is a list of C3 documentation mentioning fetching:

...

Terminology

To best understand the C3 AI Suite and this guide, let's introduce key terminology used by C3 AI Suite developers:

  • Type: Nearly all aspects of the C3 AI Suite (e.g., data, machine learning models, cloud-provider microservices) are stored and accessed through Types. C3 AI Types are logical objects akin to a Java class, and contain 'fields' and 'methods'. Some Types are persisted to internal databases (like Postgres or Cassandra), while others are not. 
  • Field: A field of a Type. Fields contain attributes or data associated with the Type.
  • Method: A method declared on a Type. Methods define business logic associated with the Type. 
  • Vanity Url: The URL at which a specific tenant/tag of a C3 Cluster can be accessed. 
  • Cluster: A deployment of the C3 AI Suite. A C3 Cluster is a collection of hardware or virtualized cloud instances (e.g., servers, databases, load balancers) used to run the C3 AI Suite and C3 AI Applications. The C3 AI Suite can run on any public or private cloud infrastructure or on a local machine (in a docker container).
  • Tenant: A logical partition of a C3 Cluster. All tenants in a cluster generally share the same compute and storage resources. Data within tenants, while stored in a single database, are logically separated. C3 AI Suite users on one tenant can't see data stored on another tenant. In other words, users are only able to view data for tenants to which they are explicitly granted access.
  • Tag: A logical partition of a tenant. A single tag hosts one C3 AI application (deployed package). 
  • Package: All the code a developer writes for an application. The C3 AI Suite runs a package on a tag. 
  • Provisioning: Deploying a package onto a tenant/tag in a C3 Cluster. 
  • Static Console: A browser-based tool that C3 AI developers use to interact with the C3 AI Suite. Static console is available on all modern browsers, including Google Chrome, Mozilla Firefox, and Apple Safari. Developers use the console to query data, evaluate expressions, view documentation, monitor environments, develop and test code, and manage environment configurations. You can access the static console at the url 'https://<vanity_url>/static/console' (replacing <vanity_url> with your Vanity Url).
  • Metric: Expressions that transform raw data into a time-series. 

C3 Cluster Overview

The C3 AI Suite is a Platform as a Service (PaaS), which enables organizations to build, deploy, and operate enterprise-scale Big Data, AI, and IoT applications. The C3 AI Suite can be deployed on any private or public cloud infrastructure such as AWS, Azure, and Google Cloud Platform.

When developing and operating applications, a C3 Cluster is responsible for managing and supporting all the features of the C3 AI Suite. A C3 Cluster has at least one Master node and many Worker nodes. Master nodes prioritize and distribute jobs to Worker nodes and handle user requests. Worker nodes carry out jobs allocated by the Master node. Other components of a C3 Cluster include databases (e.g., Postgres, Cassandra, Azure Blob), logging services (i.e., Splunk), and Jupyter.

Atop these hardware or virtualized cloud resources is a logical software structure, with the highest level being a Cluster. A C3 Cluster is broken out into numerous tenants. Tenants are logically separated from each other (i.e., a particular tenant's data and packages are not accessible or visible to any other tenants), and contain many tags. Tags host C3 AI Packages (i.e., the code that C3 AI developers write and provision to the C3 AI Suite). A typical multi-tag, multi-tenant C3 Cluster is shown in a logical diagram below:

Image Added

To learn more about the architecture of a C3 cluster, please see the training materials here:

Provision a C3 AI Package

To provision a package to your tag, follow the instructions available at the DTI Guide: Provisioning.

To run the examples in this guide you will need to provision the 'baseCovidDataLake' by following the directions in the 'COVID-19 Data Lake Provisioning' section.

To learn more about provisioning, please see the C3 AI Developer Documentation here:

...

  • Developer Documentation

      ...

      ...

      Connecting to the C3 AI Suite 

      The static console is the main tool that developers use to interact with the C3 AI Suite. However, we anticipate that most DTI members will use Python (via Jupyter notebook) for data analysis. That being said, the static console is an essential part of working with the C3 AI Suite and you will use it frequently. For example, the static console is the best place to find documentation tailored directly to your package. It's also a great place to quickly test queries as no specialized environments need to be set up to use it. Static console is ready-to-go in all modern browsers, including Google Chrome, Mozilla Firefox, and Apple Safari.

      Accessing the Static Console

      Once you have provisioned a package to your tag, navigate to the static console page at this url: 'https://<vanity_url>/static/console' (replacing <vanity_url> with your Vanity Url provided in your C3.ai DTI Training Cluster Onboarding Email). The static console page looks like this:

      Image Added

      The 'Tools' drop-down menu in the upper left-hand corner contains a list of available developer tools. The most relevant tool is the Provisioner, though there are also utilities for loading JavaScript files, debugging JS code, and inspecting Errors.

      Image Added

      The 'Help' drop-down menu in the upper left-hand corner allows users to access console documentation and a C3 Cluster hosted documentation portal.

      Image Added

      Most tools are also accessible through a series of icons in the upper right-hand corner:

      Image Added

      Using the Static Console

      Developers interact with the static console through the JavaScript console tab in the browser. When the static console page loads (or when you run the c3ImportAll() command), JavaScript methods associated with all of your Package's defined Types are populated. You can write and run JavaScript code directly in the console tab to interact with your package.

      You can also open Javascript console with the 'Ctrl+Shift+I' keyboard shortcut (in most browsers). Javascript console is also available through the browser's developer tools. If the 'Ctrl+Shift+I' keyboard shortcut doesn't work for you, review your browser's documentation on developer tools. Here's how the static console looks in Firefox, with the JavaScript console open:

      Image Added

      Finally, let's write some JavaScript commands to see the console in action!

      Image Added

      Tutorial Video

      The DTI Team have recorded a short video introducing and describing the static console functionality:

      Multimedia
      nameConsole_demo_compress.mp4


      Console Commands

      Here are common JavaScript console commands used on the static console page.

      • c3ImportAll: A console command which loads all the C3 Types from your package into the static console. Always be sure to refresh the static console page or run c3ImportAll() after provisioning a new package.
      • c3Grid: A console command to display a table of data stored within a C3 Type (e.g., data returned from a fetch operation or an evaluate operation, among many others).
      • c3Viz: A console command to produce quick visualizations or plots for some C3 Types. (e.g., time-series data like EvalMetricsResult)
      • c3ShowType: A console command to access documentation for a given C3 Type. (e.g., c3ShowType(OutbreakLocation))

      Official C3 AI Training Materials on Static Console

      Using Python with the C3 AI Suite

      We anticipate most DTI researchers will want to use Python for data analysis. There are two options to connect to a C3 Cluster via Python. Please follow the links below for detailed information about each.

      To learn more about the general structure of a C3 cluster, please see the resources here:

      Fetching Instances of C3 Types

      All data in the C3 AI Suite are stored in C3 Types. Users can access data from a Type with the 'fetch' method. Behind the scenes, the 'fetch' method submits a query directly to the database underlying a Type, and retrieves and presents the query results.

      The C3 AI Suite returns the 'fetch' query's response in a FetchResult Type for data analysis (see example below). The response includes:

      • Data from the Type itself.
      • Metadata for the 'fetch' query (e.g., the number of objects, whether additional data exists in the database).

      To learn more about the 'fetch' method, please see the C3 AI resources here:

      Users can also provide a FetchSpec (or parameters) to the 'fetch' method to describe particular data to retrieve (e.g., only retrieve gene sequences collected in Germany). The FetchSpec can be 'empty' (e.g., OutbreakLocation.fetch()), or contain several parameters to return a subset of the data.

      Some example FetchSpec parameters include:

      • filter: Filter expression to return a subset of the data (e.g., age <= 20). Filter expressions must evaluate to a Boolean type (i.e., true or false)
      • limit: The maximum number of rows that should be returned. By default, if no limit is specified, the C3 AI Suite returns 2,000 rows from the C3 Type. Specifying a limit is often helpful to debug a 'fetch' call without returning too many records.
      • include: Specifies the particular fields from a C3 Type to return to the FetchResult. By default, if no include spec is defined, all fields from the Type will be returned.
      • order: Specifies the order to return the query's results (either "ascending" or "descending").

      Note: Please see the official FetchSpec documentation for a full list of parameters: https://developer.c3.ai/docs/7.12.17/type/FetchSpec
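
The parameters above can be collected into a plain dictionary before a fetch call. The following is a minimal sketch, not part of the C3 AI API: `build_fetch_spec` is a hypothetical helper introduced here for illustration, and it assumes, as in this guide's examples, that 'include' is passed as one comma-separated string.

```python
def build_fetch_spec(filter_expr=None, limit=None, include=None, order=None):
    """Assemble a FetchSpec-style dictionary, omitting unset parameters.

    Hypothetical helper for illustration; it only builds a dict and does
    not talk to a C3 cluster.
    """
    spec = {}
    if filter_expr is not None:
        spec["filter"] = filter_expr
    if limit is not None:
        spec["limit"] = limit
    if include:
        # The guide's examples pass 'include' as one comma-separated string.
        spec["include"] = ", ".join(include)
    if order is not None:
        spec["order"] = order
    return spec


spec = build_fetch_spec(
    filter_expr="exists(latestTotalPopulation)",
    limit=10,
    include=["id", "name", "latestTotalPopulation"],
    order="descending(latestTotalPopulation)",
)
print(spec)
```

With a live connection, a dictionary like this could be passed to 'fetch' in the same way as the literal dictionaries in this guide's examples. Starting with a small 'limit' keeps exploratory queries cheap.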

      Examples of Fetch Calls

      The OutbreakLocation Type contains information from various locations for which the Covid-19 Data Lake has virus-related information. We can fetch OutbreakLocation records for which the 'latestTotalPopulation' field exists (i.e., is not null). We can also retrieve these records in descending order by their 'latestTotalPopulation':

      Code Block
      languagejs
      res = OutbreakLocation.fetch({
      	'limit': -1,
      	'filter': 'exists(latestTotalPopulation)',
      	'order': 'descending(latestTotalPopulation)',
      	'include': 'id, name, latestTotalPopulation, populationOfAllChildren, countryArea, countryCode'
      })

      And we can show these results in the C3 AI static console using the c3Grid command:

      Image Added

      You can run this same fetch in Python:

      Code Block
      languagepy
      raw_data = c3.OutbreakLocation.fetch({
      	'limit': -1,
      	'filter': 'exists(latestTotalPopulation)',
      	'order': 'descending(latestTotalPopulation)',
      	'include': 'id, name, latestTotalPopulation, populationOfAllChildren, countryArea, countryCode'
      })

      Additional details on "Fetching in Python" are available in this C3 AI Developer documentation: https://developer.c3.ai/docs/7.12.25/topic/ds-jupyter-notebooks

      Additional examples of fetch calls can be found here:

      Tutorial Video

      This tutorial video goes over fetching and filtering:

      Multimedia
      nameFetching_And_Filtering_OutbreakLocation_compress.mp4

      The fetchCount Method

      Another useful command is 'fetchCount'. Like 'fetch', users can also provide a FetchSpec (or parameters) to 'fetchCount'. The 'fetchCount' method then returns the number of records that match the FetchSpec. This is useful when trying to determine whether a given search is refined enough.

      Code Block
      languagejs
      OutbreakLocation.fetchCount({'filter': 'exists(latestTotalPopulation)'})

      You can run the same 'fetchCount' in python: 

      Code Block
      languagepy
      c3.OutbreakLocation.fetchCount(spec={'filter': 'exists(latestTotalPopulation)'})

      To learn more about the 'fetchCount' method, please see the fetchCount method definition in the Persistable Type documentation: https://developer.c3.ai/docs/7.12.25/type/Persistable

      Converting Fetch results to usable forms in Jupyter Notebook

      When using a Jupyter Notebook, C3 AI developers typically modify FetchResults for data analysis. This section shows a couple of ways to convert FetchResults into easy-to-analyze forms.

      Python

      In python, first retrieve the 'objs' field from the FetchResults object, and then call the toJson() function. The toJson() function returns an array of dictionaries each with keys equal to the requested fields of the fetched C3 Type. Using the Pandas library, this array can be turned into an analysis-ready DataFrame, as the below example shows:

      Code Block
      languagepy
      import pandas as pd

      # Build a DataFrame from the list of dictionaries returned by toJson()
      df = pd.DataFrame(raw_data.objs.toJson())

      # Drop the bookkeeping columns that are not needed for analysis
      df = df.drop(columns=['meta', 'type', 'version', 'id'])
      df.head()


      Image Added

      Users can then manipulate the resulting DataFrame, using common programming libraries and frameworks.
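
The column clean-up above can also be rehearsed without a C3 connection or pandas. The sketch below assumes that each record from toJson() is a dictionary carrying the 'meta', 'type', 'version', and 'id' keys dropped above. The sample records are made up for illustration and are not Data Lake values, and `strip_metadata` is a hypothetical helper.

```python
# Keys the guide drops before analysis.
METADATA_KEYS = {"meta", "type", "version", "id"}


def strip_metadata(records, drop=METADATA_KEYS):
    """Return copies of the records without the bookkeeping keys."""
    return [{k: v for k, v in rec.items() if k not in drop} for rec in records]


# Made-up records standing in for raw_data.objs.toJson(); values are invented.
sample_records = [
    {"type": "OutbreakLocation", "id": "LocationA", "version": 1, "meta": {},
     "name": "Location A", "latestTotalPopulation": 100},
    {"type": "OutbreakLocation", "id": "LocationB", "version": 1, "meta": {},
     "name": "Location B", "latestTotalPopulation": 50},
]

rows = strip_metadata(sample_records)
print(rows)
```

The resulting list of dictionaries has the same shape that the pandas DataFrame constructor accepts.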

      ExpressionEngineFunctions

      The C3 AI Suite also provides a pre-built library of "ExpressionEngineFunctions". Expression Engine Functions take a variety of arguments and perform various data processing tasks. For example, the function 'contains' takes two strings as arguments and checks whether the first argument contains the second argument. The function 'lowerCase' takes as input a string and returns that same string with all lowercase letters. In addition to these string processing functions, the C3 AI Suite's ExpressionEngine includes many math functions (such as 'log', 'avg', and 'abs') which operate on various input data types (e.g. int, double, float).

      The ExpressionEngine Functions are used in several places, such as:

      • 'fetch' filters
      • Simple and compound metric expressions
      • tsDecl metric values

      To learn more about ExpressionEngineFunctions, please see the C3 AI resources here:

      Computations on C3 Types using Evaluate

      Using the 'evaluate' method, developers can run aggregations or other computations on data fetched from a C3 Type (e.g., compute the average area across all countries with area data available in the OutbreakLocation Type).

      The 'evaluate' method takes several parameters:

      • projection: [Required] A comma-separated list of expressions (from the ExpressionEngineFunction library) to apply to data from a C3 Type (e.g., avg, unique, min, max). You can simply think about a projection as the columns/fields, calculated or otherwise, which the "evaluate" method should return.
      • group: A comma-separated list of columns/fields to group the aggregated/transformed data by (e.g., compute the average area by the 'locationType' field in OutbreakLocation). Please note, in any 'evaluate' command, all columns in the 'group' field MUST ALSO BE in the 'projection' field, as the example below shows.
      • having: A SQL style having clause.
      • order: A comma-separated list of columns/fields, to order aggregated/transformed data by. Users can access data in 'ascending' or 'descending' order. Please note, in any 'evaluate' command, all columns in the 'order' field MUST ALSO BE in the 'projection' field.
      • filter: A filter expression that restricts the rows in a C3 Type on which the evaluate method is run. 
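
The rule that every 'group' and 'order' expression must also appear in 'projection' is easy to check before sending a spec. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the spec is a plain dictionary, and expressions are compared as exact strings, so differences in spacing inside an expression would count as a mismatch.

```python
def split_top_level(expr_list):
    """Split a comma-separated expression list, ignoring commas inside parentheses."""
    parts, depth, current = [], 0, []
    for ch in expr_list:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def unwrap_direction(expr):
    """Strip an outer ascending(...) or descending(...) wrapper, if present."""
    for prefix in ("ascending(", "descending("):
        if expr.startswith(prefix) and expr.endswith(")"):
            return expr[len(prefix):-1].strip()
    return expr


def missing_from_projection(spec):
    """List 'group' and 'order' expressions that are absent from 'projection'."""
    projection = set(split_top_level(spec.get("projection", "")))
    required = split_top_level(spec.get("group", ""))
    required += [unwrap_direction(e) for e in split_top_level(spec.get("order", ""))]
    return [e for e in required if e not in projection]


good_spec = {
    "projection": "ethnicity, count(ethnicity)",
    "group": "ethnicity",
    "order": "descending(count(ethnicity))",
}
bad_spec = {"projection": "avg(countryArea)", "group": "locationType"}

print(missing_from_projection(good_spec))  # []
print(missing_from_projection(bad_spec))   # ['locationType']
```

This is only a local sanity check; the C3 AI Suite remains the authority on whether a spec is valid.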

      In static console, 'c3Grid' displays the 'evaluate' method results nicely:

      (Note: the 'locationType' expression within the 'group' field is also within the 'projection' field. This is required.)

      Code Block
      languagejs
      var eval_result = OutbreakLocation.evaluate({
          'projection': 'avg(countryArea), locationType',
          'group': 'locationType',
          'filter': 'exists(countryArea) && exists(locationType)'
      })
      c3Grid(eval_result)

      Image Added


      Users can also run the 'evaluate' method in python. In this case, users often modify the 'evaluate' method's results for data analysis. To view and analyze the 'evaluate' method's results in Python, please use the helper function available in C3 DTI's c3python module here: https://github.com/c3aidti/c3python

      NOTE: The 'locationType' expression within the 'group' field is also within the 'projection' field. This is required.

      Code Block
      languagepy
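      # Assumes the c3python module linked above is installed in the notebook environment
      import c3python

      eval_spec = {
          'projection': 'avg(countryArea), locationType',
          'group': 'locationType',
          'filter': 'exists(countryArea) && exists(locationType)'
      }
      eval_res = c3.OutbreakLocation.evaluate(eval_spec)
      # Convert the evaluate result into a pandas DataFrame
      df = c3python.EvaluateResultToPandas(result=eval_res, eval_spec=eval_spec)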

      Image Added


      Here's another example of running the 'evaluate' method in Python, this time using the 'order' parameter as well:

      NOTE: The 'count(ethnicity)' expression within the 'order' field is also within the 'projection' field. This is required.

      Code Block
      languagepy
      spec = c3.EvaluateSpec(
          projection="ethnicity, count(ethnicity)",
          order='descending(count(ethnicity))',
          group="ethnicity"
      )
      c3python.EvaluateResultToPandas(result=c3.SurveyData.evaluate(spec), eval_spec=spec)

      Image Added



      To learn more about the 'evaluate' method, please see the C3 AI resources here:

      Developing Metrics on Timeseries data

      The C3 AI Suite also offers several features to handle time series data. To interact with time series, C3 AI developers typically use simple and compound metrics. These metrics are used in several places in the C3 AI Suite such as:

      • Alerts and Application Logic
      • Machine Learning Features
      • User Interface (to Visualize Data)

      Timeseries Video Tutorial

      To supplement the documentation below, we also have recorded a video lecture about Time Series data on the C3 AI Platform.

      Multimedia
      nameTime_Series_compress.mp4

      Simple Metrics

      Simple metrics allow developers to produce time-series from raw data and are often used to construct more advanced metrics (i.e., Compound Metrics), in practice. Simple metrics are linked to a specific C3 Type and reference the timeseries data stored within that Type. To declare a simple metric, users should specify the following fields:

      1. id: The simple metric's unique id, which should follow the convention "name_srcType" (e.g., Apple_DrivingMobility_OutbreakLocation).
      2. name: The simple metric's name (e.g., Apple_DrivingMobility).
      3. description: The simple metric's description (optional field).
      4. srcType: The C3 Type the simple metric is analyzed on (e.g., OutbreakLocation).
      5. path: The path from the srcType to the C3 Type that stores the raw data referenced by the simple metric (e.g., pointMeasurements)
        Note: If the srcType itself stores the raw data referenced by the simple metric, path field is optional.
      6. expression: The expression (or ExpressionEngineFunction) applied to the raw data referenced by the simple metric (e.g., avg(avg(normalized.data.quantity))).
        Note: The "normalized" key term instructs the simple metric to use normalized (instead of raw) data on the C3 AI Suite. To learn more about Normalization, see this C3 AI Developer Documentation: https://developer.c3.ai/docs/7.12.25/topic/normalization.
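
The field list above, including the "name_srcType" id convention, can be captured in a small helper. This is a sketch only: `simple_metric_fields` is a hypothetical function that builds a plain dictionary and does not create anything on a C3 cluster. The example values are the ones quoted in the list above.

```python
def simple_metric_fields(name, src_type, expression, path=None, description=None):
    """Build the fields of a simple metric, deriving the id as name_srcType."""
    fields = {
        "id": f"{name}_{src_type}",
        "name": name,
        "srcType": src_type,
        "expression": expression,
    }
    # 'path' is optional when the srcType itself stores the raw data.
    if path is not None:
        fields["path"] = path
    # 'description' is an optional field.
    if description is not None:
        fields["description"] = description
    return fields


fields = simple_metric_fields(
    name="Apple_DrivingMobility",
    src_type="OutbreakLocation",
    path="pointMeasurements",
    expression="avg(avg(normalized.data.quantity))",
)
print(fields["id"])  # Apple_DrivingMobility_OutbreakLocation
```

Deriving the id from the name and srcType avoids the two drifting apart when a metric is renamed.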

      Here is an example of a Simple Metric:

      Code Block
      languagepy
      met = c3.SimpleMetric(
          id='JHU_ConfirmedCases2_OutbreakLocation',
          name='JHU_ConfirmedCases2',
          srcType='OutbreakLocation',
          path="aggregateMeasurements.(measurementType == 'confirmed' && origin == "
               "'Johns Hopkins University')",
          expression='interpolate(avg(avg(normalized.data.value)), "PREVIOUS", "MISSING")'
      )

      To learn more about Simple Metrics, please see the C3 AI resources here:

      Another type of SimpleMetric is a tsDecl (Timeseries Declaration) metric. tsDecl metrics are often used to turn non-time series raw data (e.g., event data, status data, or data with irregular intervals) into time series. tsDecl metrics have the same fields as standard SimpleMetric, except for the 'tsDecl' field, which replaces the 'expression' field. tsDecl metrics may allow users the added flexibility to define new metrics that the expression field may not support. Using a tsDecl metric, the above metric can be re-written as:

      Code Block
      languagepy
      met = c3.SimpleMetric(
      	id='JHU_ConfirmedCases3_OutbreakLocation',
      	name='JHU_ConfirmedCases3',
      	srcType='OutbreakLocation',
      	path="aggregateMeasurements.(measurementType == 'confirmed' && origin == "
             "'Johns Hopkins University')",
      	tsDecl={
      		'data': 'data',
      		'treatment': 'AVG',
      		'start': 'start',
      		'value': 'value'
      	}
      )

      To learn more about tsDecl metrics, please see the C3 AI resources here:

      Compound Metrics

      Compound metrics allow C3 AI developers to manipulate or combine existing metrics into more complex time series. Compound metrics are built on top of one or many existing Simple or Compound metrics. Please note, to evaluate a Compound metric on a C3 Type, all Simple metrics used in that Compound metric must be defined on that Type as well. If not, an error is returned.

      To declare a compound metric, users should specify the following fields:

      1. id: The compound metric's unique id, typically the same as 'name' (e.g., BLS_UnemploymentRate).
      2. name: The compound metric's name (e.g., BLS_UnemploymentRate).
      3. description: The compound metric's description (optional field).
      4. expression: The expression (or ExpressionEngineFunction) applied to the metrics underlying the Compound metric (e.g., "BLS_LaborForcePopulation ? 100 * BLS_UnemployedPopulation / BLS_LaborForcePopulation: null").
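
To make the example expression above concrete, here is a rough pure-Python analogue of what it computes at each time step. It is an illustration of the arithmetic only, not how the C3 AI Suite evaluates metrics. It assumes the two underlying series are aligned lists and that a zero or missing labor force should yield no value. The numbers are made up.

```python
def unemployment_rate(unemployed, labor_force):
    """Element-wise analogue of
    'BLS_LaborForcePopulation ? 100 * BLS_UnemployedPopulation / BLS_LaborForcePopulation : null'.
    """
    return [
        100 * u / lf if lf else None  # guard against zero or missing denominators
        for u, lf in zip(unemployed, labor_force)
    ]


# Made-up values for illustration only.
unemployed = [5.0, 4.0, 0.0]
labor_force = [100.0, 0.0, 50.0]

print(unemployment_rate(unemployed, labor_force))  # [5.0, None, 0.0]
```

The guard in the expression is what keeps a compound metric from dividing by zero when the denominator series has gaps.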

      An example CompoundMetric is:

      Code Block
      languagepy
      met = c3.CompoundMetric(
      	id='JHU_CaseFatalityRate',
      	name='JHU_CaseFatalityRate',
      	expression='JHU_ConfirmedDeaths/JHU_ConfirmedCases',
      )

      To learn more about Compound metrics, please see the C3 AI resources here:

      ExpressionEngineFunctions

      Several spots in the C3 API allow for the use of so-called 'ExpressionEngineFunctions'. These functions take a variety of arguments and perform a variety of processing tasks. For example, the function 'contains' takes two strings and checks whether the first argument contains the second argument. The function 'lowerCase' takes an input string, and returns that string with all uppercase letters replaced with lowercase ones. In addition to those string processing functions, many math functions exist as well such as 'log' which operates on a variety of input data types.

      These functions are very helpful, and can be used in a number of places such as:

      • 'filter' field of FetchSpec
      • 'expression' field of CompoundMetric
      • 'value' field of tsDecl component of tsDecl Metrics.

      Official C3 documentation for ExpressionEngineFunctions: https://developer.c3.ai/docs/7.12.0/type/ExpressionEngineFunction

      Evaluating Metrics on Time series data

      C3 can store timeseries data using many different Types. However, knowing the exact Type of the timeseries data isn't important when evaluating 'Metrics' on that data.

      Normalization Process

      Usually, timeseries data goes through a 'normalization' process. Its purpose is to take non-uniform, and possibly multiple, datasets and produce a single uniform timeseries that is easier to analyze. The list below reproduces the normalization steps currently performed within the C3 platform, as given in C3's official documentation here: https://developer.c3.ai/docs/7.12.0/guide/guide-c3aisuite-basic/ts-normalization-engine

      1. Drop data points with irregular dates. For example, dates where start date is after end date, dates are > 50 years apart, etc.
      2. Remove duplicate data points that might have been sent due to data loading issues or issues with IoT sensor hardware.
      3. Correctly apportion the values in the correct time interval in case of overlapping data points.
      4. Convert data points in various units into a homogeneous unit utilizing C3's unit conversion capabilities.
      5. Automatic detection of the natural frequency of the data.
      6. Aggregate or disaggregate data into coarse or finer intervals to optimize for storage and accuracy.

      Once the normalization process is complete, a single time series sampled at a uniform interval is given.
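
A few of the steps listed above can be sketched in plain Python to show the idea. This is a simplified illustration, not the C3 normalization engine: it covers only dropping points with a start date after the end date (step 1), removing exact duplicates (step 2), and averaging values into one bucket per start date (a simplified take on step 6). The data points are made up, and `normalize_daily` is a hypothetical name.

```python
from collections import defaultdict
from datetime import date


def normalize_daily(points):
    """points: list of (start_date, end_date, value) tuples."""
    # Step 1: drop points whose start date is after their end date.
    valid = [p for p in points if p[0] <= p[1]]
    # Step 2: remove exact duplicates while keeping first-seen order.
    unique = list(dict.fromkeys(valid))
    # Step 6 (simplified): average all values that share a start date.
    buckets = defaultdict(list)
    for start, _end, value in unique:
        buckets[start].append(value)
    return {day: sum(vals) / len(vals) for day, vals in sorted(buckets.items())}


# Made-up data points for illustration only.
raw = [
    (date(2020, 3, 1), date(2020, 3, 2), 10.0),
    (date(2020, 3, 1), date(2020, 3, 2), 10.0),  # exact duplicate
    (date(2020, 3, 1), date(2020, 3, 2), 20.0),
    (date(2020, 3, 3), date(2020, 3, 2), 99.0),  # start after end
    (date(2020, 3, 2), date(2020, 3, 3), 30.0),
]

series = normalize_daily(raw)
print(series)
```

Unlike the real engine, this sketch does not fill in days with no data, convert units, or apportion overlapping intervals.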

      SimpleMetrics

      Simple metrics form the 'base' of the Metrics system. They are defined on a specific Type and reference timeseries data stored within. Essentially, the Simple metric defines:

      1. The Type on which the metric is defined
      2. How to find the timeseries data on the Type
      3. Configuration of the Normalization engine
      4. The name of the metric

      An example SimpleMetric is:

      Code Block
      languagepy
      sample_met = c3.SimpleMetric({
      	'id': 'SampleMetric_SampleType',
      	'name': 'SampleMetric',
      	'srcType': 'SampleType',
      	'path': 'timeseriesValues',
      	'expression': 'avg(avg(normalized.data.x))'
      })

      Another variety of SimpleMetric is a tsDecl Metric. These can be used to turn traditionally non-timeseries data such as event data or status data into timeseries. A tsDecl metric is the same as a SimpleMetric, but instead of an 'expression' field, a 'tsDecl' field is used. tsDecl metrics can sometimes provide some additional flexibility to define new metrics which the expression field may not support. The same example can be re-written as:

      Code Block
      sample_met = c3.SimpleMetric({
      	'id': 'SampleMetric_SampleType',
      	'name': 'SampleMetric',
      	'srcType': 'SampleType',
      	'path': 'timeseriesValues',
      	'tsDecl': {
      		'data': 'data',
      		'treatment': 'AVERAGE',
      		'start': 'start',
      		'value': 'value'
      	}
      })

      Please note that the above examples do not have an example context in which they work. This will be updated soon with a version backed up by a working exercise.

          CompoundMetrics

          Compound metrics are generally easier to define and use, as they operate on already defined metrics, either Simple or Compound. They essentially consist of an id/name and an expression defining the metric in terms of constants and already defined metrics. If you try to evaluate a CompoundMetric on a Type for which a necessary SimpleMetric is not defined, you'll get an error.

          Essentially, a CompoundMetric defines:

          1. The name/id of the metric
          2. The expression defining the metric
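The dependency between compound and simple metrics can be sketched in plain Python. This is only an illustration of the idea, not the C3 AI Suite's implementation: the dictionaries, the metric names, and the `evaluate` function are all hypothetical stand-ins, and the numbers are made up.

```python
# Hypothetical in-memory stand-ins; these are not C3 Types or C3 API calls.
simple_metrics = {
    'Cases': [1.0, 3.0, 6.0],
    'Deaths': [0.0, 1.0, 2.0],
}

compound_metrics = {
    # name -> (names of metrics it depends on, function combining their values)
    'DeathsPerCase': (('Deaths', 'Cases'), lambda d, c: d / c),
}


def evaluate(name):
    """Return the series for a simple or compound metric, or raise KeyError."""
    if name in simple_metrics:
        return simple_metrics[name]
    if name not in compound_metrics:
        raise KeyError(f'metric {name!r} is not defined')
    dependencies, combine = compound_metrics[name]
    # Evaluating each dependency first surfaces a missing metric as an error.
    series = [evaluate(dep) for dep in dependencies]
    return [combine(*point) for point in zip(*series)]


ratio = evaluate('DeathsPerCase')
```

Because `evaluate` calls itself for each dependency, a compound metric in this sketch may build on other compound metrics as well as simple ones.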

          An example CompoundMetric is:

          Code Block
          languagepy
          sample_compound_met = c3.CompoundMetric.make({
          	'id': 'CompoundMetric',
          	'name': 'CompoundMetric',
          	'expression': 'window("AVG", SimpleMetric, 0, 7)',
          })

          Please note that the above example does not have an example context in which it will work. This will be updated soon with a version backed up by a working exercise.

          For more detail, see the C3 documentation on CompoundMetrics here: https://developer.c3.ai/docs/7.12.0/guide/guide-c3aisuite-basic/metrics-compound-metrics

          Finding, Evaluating, and Visualizing Metrics

          Users can find, evaluate, and visualize metrics built in the C3 AI Suite via the JavaScript console or a hosted Jupyter notebook.

          Finding Metrics

          All metrics that users build and deploy in the C3 AI Suite are also stored in C3 Types. To view a list of all the simple and compound metrics applicable to a C3 Type, run the 'listMetrics' method as shown below:

          JavaScript:

          Code Block
          languagejs
          var metrics = OutbreakLocation.listMetrics()
          c3Grid(metrics)

          Python:

          Code Block
          languagepy
          import pandas as pd
          pd.DataFrame(c3.OutbreakLocation.listMetrics().toJson())

          DTI Members using the Covid-19 Data Lake: 'listMetrics' returns a list of metrics, but the output is bare-bones when a metric's 'description' field is not filled in. The Covid-19 Data Lake API documentation provides an extensive list of production-ready metrics along with detailed descriptions and usage examples.

          After finding a metric, the next step is to evaluate it on data in a C3 Type.

          Evaluating Metrics

          Metrics are evaluated with either the 'evalMetrics' or 'evalMetricsWithMetadata' method. Behind the scenes, both methods fetch and transform raw data from a C3 Type into easy-to-analyze timeseries data. 'evalMetrics' evaluates metrics provisioned (deployed) to a tenant/tag. 'evalMetricsWithMetadata' evaluates metrics that are either provisioned to a tenant/tag or defined on the fly in the JavaScript console or a hosted Jupyter notebook (typically for debugging).

          To learn more about the differences between 'evalMetrics' and 'evalMetricsWithMetadata', see the MetricEvaluatable documentation.

          Finding Metrics

          Not all SimpleMetrics are defined on all Types. Types on which you can evaluate a metric mix in the 'MetricEvaluatable' Type (C3 documentation: https://developer.c3.ai/docs/7.12.0/type/MetricEvaluatable). This gives the Type the 'listMetrics' function (among others), so if you're unsure which metrics are available for a given Type, run that function to get a list. For example:

          Javascript:

          Code Block
          languagejs
          var metrics = SmartBulb.listMetrics()
          c3Grid(metrics)

          Python:

          Code Block
          languagepy
          pd.DataFrame(c3.SmartBulb.listMetrics().toJson())

          Once you have the metric you want to evaluate in mind, you can evaluate it.

          Evaluating Metrics

          With a metric in mind, you can evaluate it using the 'evalMetrics' API function, which the MetricEvaluatable Type provides. The evalMetrics function takes an 'EvalMetricsSpec' Type, which contains the following:

          1. A list of Type ids you want the metrics to be evaluated on
          2. A list of metrics you want to be evaluated
          3. A start date (in ISO 8601 format)
          4. An end date (in ISO 8601 format)
          5. An evaluation interval

          ...

          25/type/MetricEvaluatable

          To evaluate a metric, users must provide the following parameters (called an EvalMetricsSpec) to the 'evalMetrics' or 'evalMetricsWithMetadata' methods:

          1. ids ([string]): A list of ids in the C3 Type on which you want to evaluate the metrics (e.g., "Germany", "California_UnitedStates").
          2. expressions ([string]): A list of metrics to evaluate (e.g., "JHU_ConfirmedCases", "Apple_DrivingMobility").
          3. start (datetime): Start datetime of the time range to be evaluated (in ISO 8601 format) (e.g., "2020-01-01").
          4. end (datetime): End datetime of the time range to be evaluated (in ISO 8601 format) (e.g., "2020-08-01").
          5. interval (string): Desired interval for the resulting timeseries data (e.g., MINUTE, HOUR, DAY, MONTH, YEAR).
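The five parameters above can be collected and sanity-checked in plain Python before they are sent to the C3 AI Suite. The sketch below is a hypothetical helper, not part of the C3 API: the name `build_eval_spec`, the set of accepted interval names (taken from the examples listed above), and the rule that 'start' must precede 'end' are all assumptions made for illustration.

```python
from datetime import date

# Assumption: only the interval names listed above are accepted here.
ALLOWED_INTERVALS = {'MINUTE', 'HOUR', 'DAY', 'MONTH', 'YEAR'}


def build_eval_spec(ids, expressions, start, end, interval='DAY'):
    """Return a plain dict with the five fields described above, after basic checks."""
    if not ids or not expressions:
        raise ValueError('ids and expressions must each contain at least one entry')
    # date.fromisoformat raises ValueError on strings it cannot parse as an ISO 8601 date.
    if date.fromisoformat(start) >= date.fromisoformat(end):
        raise ValueError('start must be earlier than end')
    if interval not in ALLOWED_INTERVALS:
        raise ValueError(f'unsupported interval: {interval}')
    return {
        'ids': list(ids),
        'expressions': list(expressions),
        'start': start,
        'end': end,
        'interval': interval,
    }


spec = build_eval_spec(
    ids=['Illinois_UnitedStates', 'California_UnitedStates'],
    expressions=['JHU_ConfirmedCases'],
    start='2020-01-01',
    end='2020-08-01',
)
```

The returned dict uses the same keys as the parameter list above, so catching a malformed date or an empty id list locally avoids a round trip to the server.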

          Here's an example of evaluating a metric in Python:

          Code Block
          languagepy
          spec = c3.EvalMetricsSpec(
            ids=[ 'Illinois_UnitedStates', 'California_UnitedStates', 'UnitedStates' ],
            expressions=[ 'JHU_ConfirmedCases', 'JHU_ConfirmedDeaths' ],
            start='2020-01-01',
            end='2020-08-01',
            interval='DAY',
          )
          
          results = c3.OutbreakLocation.evalMetrics(spec=spec)

          In Python, you can also pass the spec as a dictionary without creating an EvalMetricsSpec Type:

          Code Block
          languagepy
          results = c3.OutbreakLocation.evalMetrics(spec={
          	'ids': [ 'Illinois_UnitedStates', 'California_UnitedStates', 'UnitedStates' ],
          	'expressions': [ 'JHU_ConfirmedCases', 'JHU_ConfirmedDeaths' ],
          	'start': '2020-01-01',
          	'end': '2020-08-01',
          	'interval': 'DAY',
          })


          The C3 AI Suite returns the evaluated metric results (a timeseries) as an 'EvalMetricsResult' Type. With helper functions, C3 AI developers can then convert this timeseries into a pandas DataFrame (via the 'Dataset' Type) for further data analysis or model development in a Jupyter notebook, as shown below:

          Code Block
          languagepy
          ds = c3.Dataset.fromEvalMetricsResult(result=results)
          df = c3.Dataset.toPandas(dataset=ds)
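For readers without a C3 connection, the sketch below illustrates the general idea of flattening nested per-id, per-metric series into table rows, which is the shape a DataFrame expects. It uses only plain Python and mock data. The nested dictionary layout, the `flatten_results` function, and the numbers are hypothetical; the real 'EvalMetricsResult' layout may differ.

```python
def flatten_results(nested, dates):
    """Turn {id: {metric: [values]}} into a list of row dicts, one per id and date."""
    rows = []
    for type_id, metrics in nested.items():
        for i, day in enumerate(dates):
            row = {'id': type_id, 'date': day}
            for metric_name, values in metrics.items():
                row[metric_name] = values[i]
            rows.append(row)
    return rows


# Mock values for illustration only; these are not real case counts.
dates = ['2020-03-01', '2020-03-02']
nested = {
    'Illinois_UnitedStates': {'JHU_ConfirmedCases': [1.0, 2.0]},
    'California_UnitedStates': {'JHU_ConfirmedCases': [3.0, 4.0]},
}
rows = flatten_results(nested, dates)
```

Each row holds one id, one date, and one column per metric, so a list like `rows` could be passed directly to a DataFrame constructor.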

          Additionally, users can visualize evaluated metric results directly in the web browser (i.e., the JavaScript console) with the 'c3Viz' function.

          Here's an example of evaluating and visualizing a metric in the JavaScript console:

          Code Block
          languagejs
          var spec = EvalMetricsSpec.make({
          	'ids': [ 'Illinois_UnitedStates', 'California_UnitedStates', 'UnitedStates' ],
          	'expressions': [ 'JHU_ConfirmedCases', 'JHU_ConfirmedDeaths' ],
          	'start': '2020-01-01',
          	'end': '2020-08-01',
          	'interval': 'DAY'
          })
          
          var results = OutbreakLocation.evalMetrics(spec)
          c3Viz(results)

          Similarly, we don't have to explicitly create an EvalMetricsSpec type:

          Code Block
          languagejs
          var results = OutbreakLocation.evalMetrics({
              'ids': ['Illinois_UnitedStates', 'California_UnitedStates', 'UnitedStates' ],
              'expressions': [ 'JHU_ConfirmedCases', 'JHU_ConfirmedDeaths' ],
              'start': '2020-01-01',
              'end': '2020-08-01',
              'interval': 'DAY'
          })
          c3Viz(results)

          Bonus: An additional function, 'evalMetric', is also available. See the MetricEvaluatable documentation for how it differs from 'evalMetrics': https://developer.c3.ai/docs/7.12.0/type/MetricEvaluatable

          Conclusion

          To get started quickly, focus on 'CompoundMetrics'. They're the easiest to use, and for most cases, the 'AVG' treatment works well.

          Official C3 documentation:


          To learn more about evaluating and visualizing metrics, please see the C3 AI Developer Documentation here:

          Note: Metrics can only be evaluated on C3 Types that mix in the 'MetricEvaluatable' Type.

          Additional Resources

          Official C3 AI Developer Documentation:

          Review and Next Steps


          For most data exploration and analysis, C3 AI developers run the 'fetch' and 'evalMetrics' methods. This C3.ai DTI Quickstart guide provides an introduction to these methods, in which the C3 AI Suite is used as a read-only database accessed via APIs. In the following guides, you will learn how to run 'write' operations on the C3 AI Suite, such as:

          • Defining new Types
          • Loading new data
          • Cleaning up databases in your tag
          • Training machine learning models

          Welcome to the start of your experience with the C3 AI Suite.