
The C3 AI Suite provides researchers with many tools to analyze data and to build and deploy machine learning models. To get started using the C3 AI Suite, please follow the C3.ai DTI Quickstart guide. This guide explains how to connect to a C3.ai cluster, access data using C3.ai methods, and convert C3.ai method outputs to an easy-to-analyze form.

Please note, this guide covers how to run read-only queries on the C3 AI Suite. For more advanced topics, such as loading data, building metrics, or configuring and training machine learning models, please refer to the following wikis:

  • Data Integration (Not yet available)
  • Metrics (Not yet available)
  • Machine Learning (Not yet available)

Connect to a C3.ai cluster

There are several options to connect to a C3.ai Cluster:

Fetching Instances of Types

All data in the C3 AI Suite are stored in C3.ai Types. Users can access data from a C3.ai Type with the 'fetch' method. Behind the scenes, the 'fetch' method submits a query directly to the database underlying a C3.ai Type, then retrieves the query results and presents them to C3 AI Suite users.

The C3 AI Suite returns the 'fetch' query's response in a FetchResult type, which is used for data analysis (see example below). The response includes (1) data from the C3.ai Type itself and (2) metadata for the 'fetch' query (e.g., the number of objects, whether additional data exists in the database).

To learn more about the 'fetch' method, please see the following C3.ai Developer Documentation:

Users can also provide a FetchSpec (or parameters) to the 'fetch' method to describe particular data to retrieve (e.g., only retrieve gene sequences collected in Germany). The FetchSpec can be 'empty' (e.g., OutbreakLocation.fetch()), or contain several parameters to return a subset of the data.

Some example FetchSpec parameters include:

  • filter: Filter expression to return a subset of the data (e.g., age <= 20). Filter expressions must evaluate to a Boolean type (i.e., true or false)
  • limit: The maximum number of rows that should be returned. By default, if no limit is specified, the C3 AI Suite returns 2,000 rows from the C3.ai Type. Specifying a limit is often helpful when debugging a 'fetch' call, since it avoids returning too many records.
  • include: Specifies the particular fields from a C3.ai Type to return to the FetchResult. By default, if no include spec is defined, all fields from the C3.ai Type will be returned.
  • order: Specifies the order to return the query's results (either "ascending" or "descending")

Note: Please see this C3.ai Developer Documentation for the full list of FetchSpec parameters: https://developer.c3.ai/docs/7.12.17/type/FetchSpec
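The parameters above can be assembled in plain Python before calling 'fetch'. The sketch below is only an illustration: build_fetch_spec is a hypothetical helper, not part of the C3 AI Suite, and it assumes the spec is passed as a dictionary with a comma-separated 'include' string, as in the examples in this guide.

```python
def build_fetch_spec(filter_expr=None, limit=None, include=None, order=None):
    """Assemble a FetchSpec-style dictionary (hypothetical helper, not a C3 API).

    Only the parameters that are actually provided end up in the spec, so a
    call without arguments yields an 'empty' spec.
    """
    spec = {}
    if filter_expr is not None:
        spec['filter'] = filter_expr
    if limit is not None:
        spec['limit'] = limit
    if include:
        # The examples in this guide pass 'include' as one comma-separated string.
        spec['include'] = ','.join(include)
    if order is not None:
        spec['order'] = order
    return spec


spec = build_fetch_spec(
    filter_expr='exists(prp_bf_lr)',
    limit=10,  # a small limit keeps debugging runs short
    include=['med10_age', 'prp_bf_lr'],
    order='descending(id)',
)
print(spec)

# In a notebook connected to a cluster, the spec would then be used as:
# raw_data = c3.BlockInfo.fetch(spec=spec)
```

Building the spec separately makes it easy to inspect or reuse it (for example with fetchCount) before any query is sent.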

Examples of Fetch Calls

As an example, please see the DTI Housing Coverage Example here: https://github.com/c3aidti/HouseCoverageExample

In this example, the BlockInfo Type contains information aggregated about census blocks. We can fetch BlockInfo records, for which the 'prp_bf_lr' field exists (i.e., is not null). We can also retrieve these records in descending order by their 'id'.

BlockInfo.fetch({
	'limit': -1,
	'filter': 'exists(prp_bf_lr)',
	'order': 'descending(id)',
	'include': 'pct_i_l,pct_t_l,prp_res_lr,pop10_ha_lr,hu10_ha_lr,eroom_ha_lr,med10_age,prp_bf_lr',
})

You can run this same fetch in Python:

raw_data = c3.BlockInfo.fetch(spec={
    'limit': -1,
    'filter': 'exists(prp_bf_lr)',
    'order': 'descending(id)',
    'include': 'pct_i_l,pct_t_l,prp_res_lr,pop10_ha_lr,hu10_ha_lr,eroom_ha_lr,med10_age,prp_bf_lr'
})

Additional details on "Fetching in Python" are available in this C3.ai Developer Documentation: https://developer.c3.ai/docs/7.12.0/topic/ds-jupyter-notebooks
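As an alternative to 'limit': -1, large result sets can be retrieved page by page. The sketch below is a rough illustration that runs offline: fetch_all_pages and fake_fetch are hypothetical names, and the 'offset' spec parameter and the 'hasMore' result field are assumptions that should be checked against the FetchSpec and FetchResult documentation.

```python
def fetch_all_pages(fetch_page, page_size=500):
    """Collect all records by requesting successive pages.

    fetch_page is any callable that takes a spec dictionary and returns a
    dictionary with an 'objs' list and a 'hasMore' flag (assumed field names).
    """
    records = []
    offset = 0
    while True:
        result = fetch_page({'limit': page_size, 'offset': offset})
        objs = result.get('objs') or []
        records.extend(objs)
        # Stop when the backend reports no more data or returns an empty page.
        if not result.get('hasMore') or not objs:
            break
        offset += len(objs)
    return records


# Offline stand-in for a cluster: seven made-up records.
_FAKE_ROWS = [{'id': str(i)} for i in range(1, 8)]


def fake_fetch(spec):
    start = spec['offset']
    end = start + spec['limit']
    return {'objs': _FAKE_ROWS[start:end], 'hasMore': end < len(_FAKE_ROWS)}


rows = fetch_all_pages(fake_fetch, page_size=3)
print(len(rows))
```

Against a real cluster, fetch_page would wrap a call such as c3.BlockInfo.fetch(spec=...) and adapt the returned FetchResult to the dictionary shape used here.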

Additional examples of fetch calls can be found in our examples here:

The fetchCount Method

Another useful fetch command is fetchCount. This function is nearly identical to the fetch commands above, but it returns only the number of records that match the filter. This is useful for checking whether a given search is refined enough before fetching the records themselves.

BlockInfo.fetchCount({'filter': 'exists(prp_bf_lr)'})

The same call in Python is:

c3.BlockInfo.fetchCount(spec={'filter': 'exists(prp_bf_lr)'})
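One way to use fetchCount is as a guard before a full fetch. The following sketch assumes only the fetchCount(spec=...) and fetch(spec=...) call shapes shown above; fetch_if_small and FakeBlockInfo are hypothetical names, and the fake class exists solely so the example runs without a cluster.

```python
def fetch_if_small(c3_type, filter_expr, max_rows=2000):
    """Fetch records only when the filter matches at most max_rows of them.

    Returns (count, result); result is None when the search is too broad.
    """
    count = c3_type.fetchCount(spec={'filter': filter_expr})
    if count > max_rows:
        return count, None
    result = c3_type.fetch(spec={'filter': filter_expr, 'limit': -1})
    return count, result


class FakeBlockInfo:
    """Offline stand-in for c3.BlockInfo with made-up rows."""

    _rows = [
        {'id': 'a', 'prp_bf_lr': 0.1},
        {'id': 'b', 'prp_bf_lr': None},
        {'id': 'c', 'prp_bf_lr': 0.3},
    ]

    @classmethod
    def _matching(cls, spec):
        # The stand-in only understands the single filter used in this demo.
        assert spec['filter'] == 'exists(prp_bf_lr)'
        return [r for r in cls._rows if r['prp_bf_lr'] is not None]

    @classmethod
    def fetchCount(cls, spec):
        return len(cls._matching(spec))

    @classmethod
    def fetch(cls, spec):
        return cls._matching(spec)


count, result = fetch_if_small(FakeBlockInfo, 'exists(prp_bf_lr)', max_rows=5)
print(count, result)

# With a stricter threshold the fetch is skipped and only the count is returned.
count_big, result_big = fetch_if_small(FakeBlockInfo, 'exists(prp_bf_lr)', max_rows=1)
print(count_big, result_big)
```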


Converting Fetch results to usable forms in Jupyter Notebook

When using a Jupyter Notebook, C3.ai developers typically convert FetchResults into other forms for data analysis. This section shows a couple of ways to convert FetchResults into easy-to-analyze forms.

Python

In Python, first retrieve the 'objs' field from the FetchResult object, and then call the toJson() function. The toJson() function returns an array of dictionaries, each with keys equal to the requested fields of the fetched C3.ai Type. Using the Pandas library, this array can be turned into an analysis-ready DataFrame, as the example below shows.

A Code Example in Jupyter Notebook:

## continue from above ##
import pandas as pd

# Build a DataFrame from the list of dictionaries returned by toJson()
df = pd.DataFrame(raw_data.objs.toJson())
df.head()

# Drop the bookkeeping columns that are not needed for analysis
df.drop(columns=['meta', 'type', 'version', 'id'], inplace=True)
df.head()

Users can then manipulate the resulting DataFrame, using common programming libraries and frameworks.
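The bookkeeping fields can also be removed before the DataFrame is built, which avoids errors when one of them is absent from a record. This is a minimal sketch on made-up records shaped like toJson() output; strip_bookkeeping is a hypothetical helper and the sample values are invented for illustration.

```python
BOOKKEEPING_KEYS = ('meta', 'type', 'version', 'id')


def strip_bookkeeping(records, drop=BOOKKEEPING_KEYS):
    """Return copies of the records without the listed bookkeeping keys."""
    return [
        {key: value for key, value in record.items() if key not in drop}
        for record in records
    ]


# Invented sample shaped like the list of dictionaries from toJson().
sample = [
    {'type': 'BlockInfo', 'id': 'b1', 'version': 1, 'meta': {},
     'med10_age': 34.5, 'prp_bf_lr': 0.12},
    {'type': 'BlockInfo', 'id': 'b2', 'med10_age': 41.0},
]

clean = strip_bookkeeping(sample)
print(clean)

# In a notebook, pd.DataFrame(clean) would then hold only the data columns.
```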

ExpressionEngineFunctions

The C3 AI Suite also provides a pre-built library of "ExpressionEngineFunctions". ExpressionEngineFunctions take a variety of arguments and perform various data processing tasks. For example, the function 'contains' takes two strings as arguments, and checks whether the first argument contains the second argument. The function 'lowerCase' takes a string as input, and returns that same string with all lowercase letters. In addition to these string processing functions, the C3 AI Suite's ExpressionEngine includes many math functions such as 'log', 'avg', and 'abs', which operate on various input data types (e.g., int, double, float).

ExpressionEngineFunctions are used in several places, such as:

  • 'fetch' filters
  • simple and compound metric expressions
  • tsDecl metric values

Please see this C3.ai Developer Documentation for a full list of the C3 AI Suite's ExpressionEngineFunctions: https://developer.c3.ai/docs/7.12.0/type/ExpressionEngineFunction
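As a rough illustration of ExpressionEngineFunctions inside a 'fetch' filter, the sketch below composes a filter string from several clauses. It assumes that clauses can be combined with '&&' (an assumption to verify against the documentation); all_of is a hypothetical helper and 'name' is a hypothetical field used only for this example.

```python
def all_of(*clauses):
    """Combine filter clauses so that every clause must hold.

    Assumes '&&' is the logical AND operator in filter expressions.
    Empty clauses are skipped.
    """
    kept = [clause for clause in clauses if clause]
    return ' && '.join('(' + clause + ')' for clause in kept)


filter_expr = all_of(
    'exists(prp_bf_lr)',
    # 'name' is a hypothetical string field; lowerCase makes the match
    # case-insensitive before contains checks for the substring.
    'contains(lowerCase(name), "county")',
)
print(filter_expr)

# The string could then be passed as the 'filter' entry of a fetch spec.
```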

Developing Metrics on Timeseries data

The C3 AI Suite also offers several features to handle timeseries data. To interact with timeseries, C3.ai developers typically use simple and compound metrics. These metrics are used in several places in the C3 AI Suite, such as:

  • Alerts and Application Logic
  • Machine Learning Features
  • User Interface (to Visualize Data)

Simple Metrics

Simple metrics allow C3.ai developers to produce timeseries from raw data. In practice, they are often used as building blocks for more advanced metrics (i.e., Compound Metrics). Simple metrics are linked to a specific C3.ai Type and reference the timeseries data stored within that C3.ai Type. To declare a simple metric, users should specify the following fields:

  1. id: simple metric's unique id, which should follow the convention "name_srcType" (e.g., Apple_DrivingMobility_OutbreakLocation)
  2. name: simple metric's name (e.g., Apple_DrivingMobility)
  3. description: simple metric's description (optional field)
  4. srcType: the C3.ai Type the simple metric is analyzed on (e.g., OutbreakLocation)
  5. path: path from the srcType to the C3.ai Type that stores the raw data referenced by the simple metric (e.g., pointMeasurements). Note: if the srcType itself stores the raw data referenced by the simple metric, the path field is optional.
  6. expression: the expression (or ExpressionEngineFunction) applied to the raw data referenced by the simple metric (e.g., avg(avg(normalized.data.quantity))). Note: the "normalized" key term instructs the simple metric to use normalized (instead of raw) data on the C3 AI Suite (to learn more about Normalization, see this C3.ai Developer Documentation: https://developer.c3.ai/docs/7.12.17/topic/normalization )
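The field list above, including the "name_srcType" id convention, can be captured in a small helper. This is only a sketch: simple_metric_fields is a hypothetical function that builds a plain dictionary and does not create anything on the C3 AI Suite.

```python
def simple_metric_fields(name, src_type, expression, path=None, description=None):
    """Build the field dictionary for a simple metric (hypothetical helper).

    The id follows the "name_srcType" convention; optional fields are only
    included when provided.
    """
    fields = {
        'id': name + '_' + src_type,
        'name': name,
        'srcType': src_type,
        'expression': expression,
    }
    if path is not None:
        fields['path'] = path
    if description is not None:
        fields['description'] = description
    return fields


fields = simple_metric_fields(
    name='Apple_DrivingMobility',
    src_type='OutbreakLocation',
    path='pointMeasurements',
    expression='avg(avg(normalized.data.quantity))',
)
print(fields['id'])
```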

Here is an example of a Simple Metric:

sample_met = c3.SimpleMetric({
	'id': 'SampleMetric_SampleType',
	'name': 'SampleMetric',
	'srcType': 'SampleType',
	'path': 'timeseriesValues',
	'expression': 'avg(avg(normalized.data.x))'
})

To learn more about Simple Metrics, please see the C3.ai Developer Documentation here:

Another type of SimpleMetric is a tsDecl (Timeseries Declaration) metric. tsDecl metrics are often used to turn non-timeseries raw data (e.g., event data, status data, or data with irregular intervals) into timeseries. tsDecl metrics have the same fields as a standard SimpleMetric, except that the 'tsDecl' field replaces the 'expression' field. tsDecl metrics give users added flexibility to define metrics that the 'expression' field may not support. Using a tsDecl metric, the above metric can be rewritten as:

sample_met = c3.SimpleMetric({
	'id': 'SampleMetric_SampleType',
	'name': 'SampleMetric',
	'srcType': 'SampleType',
	'path': 'timeseriesValues',
	'tsDecl': {
		'data': 'data',
		'treatment': 'AVERAGE',
		'start': 'start',
		'value': 'value'
	}
})

Please note that the above examples do not have an example context in which they work. This will be updated soon with a version backed up by a working exercise.
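To give an intuition for what turning irregular data into a timeseries involves, the sketch below averages made-up, irregularly spaced points into one value per day. It is a conceptual illustration in plain Python, not the C3 AI Suite's normalization logic, and it assumes that an averaging treatment means taking the mean of the raw values that fall into each interval.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta


def daily_average(points, start, end):
    """Average irregular (timestamp, value) points into one value per day.

    Returns a list of (day, average) for each day in [start, end); days
    without any points get None.
    """
    buckets = defaultdict(list)
    for timestamp, value in points:
        buckets[timestamp.date()].append(value)

    series = []
    day = start
    while day < end:
        values = buckets.get(day)
        average = sum(values) / len(values) if values else None
        series.append((day, average))
        day += timedelta(days=1)
    return series


# Made-up event data with irregular spacing.
points = [
    (datetime(2019, 1, 1, 8, 0), 10.0),
    (datetime(2019, 1, 1, 20, 30), 20.0),
    (datetime(2019, 1, 3, 12, 0), 4.0),
]

series = daily_average(points, date(2019, 1, 1), date(2019, 1, 4))
for day, average in series:
    print(day, average)
```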

To learn more about tsDecl metrics, please see the C3.ai Developer Documentation here:

Compound Metrics

Compound metrics are generally easier to define and use, because they operate on already defined metrics, either Simple or Compound. They consist of an id/name and an expression that defines the metric in terms of constants and already defined metrics. If you try to evaluate a CompoundMetric on a type for which a necessary SimpleMetric is not defined, you'll get an error.

Essentially, a CompoundMetric defines:

  1. The name/id of the metric
  2. The expression defining the metric

An example CompoundMetric is:

sample_compound_met = c3.CompoundMetric.make({
	'id': 'CompoundMetric',
	'name': 'CompoundMetric',
	'expression': 'window("AVG", SimpleMetric, 0, 7)',
})

Please note that the above example does not have an example context in which it will work. This will be updated soon with a version backed up by a working exercise.

For more detail, see the C3 documentation on CompoundMetrics here: https://developer.c3.ai/docs/7.12.0/guide/guide-c3aisuite-basic/metrics-compound-metrics
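Since a CompoundMetric fails when a metric it depends on is not defined for the type, it can help to check the expression first. The sketch below uses a deliberately simplistic parse: it treats every identifier that is not a function call or a quoted string as a metric name. referenced_metrics and missing_metrics are hypothetical helpers, and the set of available metric names would in practice come from the type itself.

```python
import re


def referenced_metrics(expression):
    """Guess the metric names used in a compound metric expression.

    Simplistic: quoted strings are removed, and identifiers that are
    immediately followed by '(' are treated as function names.
    """
    without_strings = re.sub(r'"[^"]*"|\'[^\']*\'', '', expression)
    return set(re.findall(r'\b[A-Za-z_]\w*\b(?!\s*\()', without_strings))


def missing_metrics(expression, available):
    """Return the referenced metric names that are not in 'available'."""
    return referenced_metrics(expression) - set(available)


expression = 'window("AVG", SimpleMetric, 0, 7)'
print(referenced_metrics(expression))
print(missing_metrics(expression, available={'SampleMetric'}))
```

Keywords or constants that look like identifiers would be misread as metric names by this parse, so the result is a hint rather than a guarantee.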

Evaluating Metrics

Types on which you can evaluate a metric mix in the Type 'MetricEvaluatable' (C3 Docs here: https://developer.c3.ai/docs/7.12.0/type/MetricEvaluatable).

Finding Metrics

Not all SimpleMetrics are defined on all types. Mixing in MetricEvaluatable gives a type the function 'listMetrics' (among others), so if you're unsure which metrics are available for a given type, execute that function to get a list. For example:

Javascript:

var metrics = SmartBulb.listMetrics()
c3Grid(metrics)

Python:

pd.DataFrame(c3.SmartBulb.listMetrics().toJson())

Once you have the metric you want to evaluate in mind, you can evaluate it.

Evaluating Metrics

With a metric in mind, you can use the 'evalMetrics' API function, which is brought in with the MetricEvaluatable type, to actually evaluate the metric. The evalMetrics function takes an 'EvalMetricsSpec' type, which contains the following:

  1. A list of Type ids you want the metrics to be evaluated on
  2. A list of metrics you want to be evaluated
  3. A start date (in ISO 8601 format)
  4. An end date (in ISO 8601 format)
  5. An evaluation interval

Such an evaluation in Python might look like this:

spec = c3.EvalMetricsSpec({
	'ids': [ 'A', 'B', 'C' ],
	'expressions': [ 'SampleMetric', 'SampleMetric2' ],
	'start': '2019-01-01',
	'end': '2019-05-01',
	'interval': 'DAY',
})

results = c3.SampleType.evalMetrics(spec=spec)
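The spec fields can be validated locally before any call is made, for example to catch dates that are not in ISO 8601 format. This is a minimal sketch: build_eval_metrics_spec is a hypothetical helper that returns a plain dictionary and only accepts date-only strings (YYYY-MM-DD), which is narrower than full ISO 8601.

```python
from datetime import date


def build_eval_metrics_spec(ids, expressions, start, end, interval='DAY'):
    """Validate and assemble the fields of an evalMetrics spec.

    Raises ValueError for empty lists, malformed dates, or an end date that
    is not after the start date.
    """
    if not ids or not expressions:
        raise ValueError('ids and expressions must both be non-empty')
    start_date = date.fromisoformat(start)  # ValueError if not YYYY-MM-DD
    end_date = date.fromisoformat(end)
    if end_date <= start_date:
        raise ValueError('end must be after start')
    return {
        'ids': list(ids),
        'expressions': list(expressions),
        'start': start,
        'end': end,
        'interval': interval,
    }


spec_fields = build_eval_metrics_spec(
    ids=['A', 'B', 'C'],
    expressions=['SampleMetric', 'SampleMetric2'],
    start='2019-01-01',
    end='2019-05-01',
)
print(spec_fields)

# In a connected notebook, the dictionary could be passed on as:
# spec = c3.EvalMetricsSpec(spec_fields)
```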

The results are in the form of an 'EvalMetricsResult'. By itself, this type isn't easily usable; however, C3 offers the type 'Dataset', which is better suited for data analysis. We can convert the EvalMetricsResult to a Dataset using a convenient helper function and then, in the case of Python, to a pandas DataFrame using another helper function.

ds = c3.Dataset.fromEvalMetricsResult(result=results)
df = c3.Dataset.toPandas(dataset=ds)

If you're in the browser using Javascript, you can utilize the 'c3Viz' console function to display the result of eval metrics. The whole evaluation might look like this:

var spec = EvalMetricsSpec.make({
	ids: ['A', 'B', 'C'],
	expressions: ['SampleMetric', 'SampleMetric2'],
	start: '2019-01-01',
	end: '2019-05-01',
	interval: 'DAY'
})

var results = SampleType.evalMetrics(spec)
c3Viz(results)

Bonus: An additional function, evalMetric, is available as well. Have a look at the MetricEvaluatable documentation to see how it differs from evalMetrics: https://developer.c3.ai/docs/7.12.0/type/MetricEvaluatable

Conclusion

To get started quickly, focus on 'CompoundMetrics'. They're the easiest to use, and for most cases, the 'AVG' treatment works well.

Official C3 documentation:

Review and Next Steps

For most data exploration, you'll find yourself 'Fetching' and running 'evalMetrics'. This guide provides a good foundation for these activities, which might be described as 'Read-Only': you're using the C3 AI Suite simply as a readable database and API. The next set of things to learn would be 'Write' type operations. How do you define new types? How do you 'persist' new instances of a type? How do you clean up the databases in your tag? And so on. These will be the subject of a planned DTI Guide.
