Welcome to the C3.ai Digital Transformation Institute!
You have been given a grant as part of the new C3.ai Digital Transformation Institute (DTI)!
To make the start of your DTI experience as fast as possible, we have assembled a set of resources to:
- Introduce researchers of all stripes to the C3.ai system
- Help researchers determine what level of training they will need to leverage C3.ai's resources
- Point researchers directly to relevant documentation they will need
- Provide worked examples of different research workflows and how they may be ported into
the C3.ai environment, or may use C3.ai's resources
If you have questions not covered by this guide, please contact the DTI team at the email help+c3ai@ncsa.illinois.edu
Introduction to the C3 AI Suite
The C3 AI® Suite is a data analytics engine designed to make the ingestion and analysis of heterogeneous data sources
as painless as possible. The C3 AI Suite joins data from multiple sources into a single unified federated data image.
With the federated data image defined, C3 AI then provides an API to access that data and, in the case of time-series data,
to perform numerous transformations and computations, all producing normalized time-series data at regular intervals.
More background on the C3 AI Suite can be found in a one-hour C3.ai DTI™ webinar describing its capabilities here.
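As a rough illustration of what "normalized time-series data at regular intervals" means, the sketch below averages irregular readings into fixed-width buckets and carries the last value forward across empty buckets. It is plain Python rather than the C3 AI Suite API; the function name `normalize`, the averaging rule, and the forward-fill rule are assumptions chosen for illustration, and the sample readings are made up.

```python
from datetime import datetime, timedelta


def normalize(observations, start, end, interval):
    """Average irregular (timestamp, value) pairs into fixed-width buckets.

    Buckets without observations repeat the previous bucket's value
    (None if nothing has been seen yet). Illustrative only; this is not
    how the C3 AI Suite is documented to normalize data.
    """
    n_buckets = (end - start) // interval  # whole buckets in [start, end)
    sums = [0.0] * n_buckets
    counts = [0] * n_buckets
    for ts, value in observations:
        if ts < start:
            continue
        idx = (ts - start) // interval
        if idx < n_buckets:
            sums[idx] += value
            counts[idx] += 1

    series = []
    last = None
    for i in range(n_buckets):
        if counts[i]:
            last = sums[i] / counts[i]
        series.append((start + i * interval, last))
    return series


# Made-up readings taken at irregular times over three days.
readings = [
    (datetime(2020, 3, 1, 8), 10.0),
    (datetime(2020, 3, 1, 20), 14.0),
    (datetime(2020, 3, 3, 12), 20.0),
]
daily = normalize(readings, datetime(2020, 3, 1), datetime(2020, 3, 4), timedelta(days=1))
for day, value in daily:
    print(day.date(), value)
```

The output has exactly one value per day regardless of how many raw readings fell on that day, which is the property that makes series from different sources directly comparable.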
C3 AI also supports R and Python Jupyter notebook analysis of the federated data image. These notebooks provide a
great way for researchers to analyze data close to where the data is stored. While C3 AI supports many data science
capabilities familiar to researchers, some expected functionality may be missing. For these cases, C3 AI supports
implementing new data processing functions in Python and JavaScript.
As with any other API, porting your own workflows will take some care and time to learn properly. Please use this guide
to make understanding the C3.ai platform and porting your workflow as quick and easy as possible.
Services available from C3.ai
- Covid-19 Datalake: This unified federated Datalake includes data from numerous sources.
- C3.ai computing platform
- C3.ai Integrated Development Studio
- C3.ai Jupyter notebooks
- C3.ai Marketplace
- C3.ai UI system for creating dashboards
How does C3.ai differ from traditional HPC systems?
- Traditional HPC systems are similar to Hardware as a Service (HaaS), while C3.ai is more like a Platform as a Service (PaaS).
Users are encouraged to work within the platform's API to achieve the best performance out of C3.ai.
- C3.ai offers a state-of-the-art data integration system as the basis for all Data Science operations.
This is in contrast to HPC systems where all components of data management and the analysis pipeline must be installed and managed independently.
What types of software can be run on C3.ai?
- Nearly any Python module may be installed and used through pip or conda.
- Nearly any R package may be installed and used within the R Jupyter environment.
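Before porting a workflow, it can help to check which of its Python dependencies are already importable in a given environment and which would still need a pip or conda install. The sketch below uses only the standard library; the helper name `missing_modules` and the example module names are hypothetical, and it checks top-level module names only.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical dependency list for a workflow; replace with your own.
required = ["json", "csv", "a_module_that_does_not_exist"]
print("Still to install:", missing_modules(required))
```

Running this inside the notebook environment you plan to use shows what is left to install there.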
What types of software cannot be run on C3.ai?
- General binary executables are not supported by C3.ai out of the box.
- MPI-based Python software
- Packages that must be built from scratch on the platform, or that require specific hardware drivers
- Python modules that require specially built binaries may not run either.
How do I get started?
Use this guide to determine what training you need to use C3.ai resources effectively. We have identified four
levels of usage of the C3.ai platform. For each level, we include basic examples of workflows that might fall into it,
the pros and cons of operating at that level, and a list of training resources we recommend researchers
complete on the DTI training environment before starting their C3.ai allocations. This will ensure researchers are
able to use their allocation as efficiently as possible.
Examine the high-level overviews of each level below, then click the section titles to go to more in-depth
discussions related to that level, such as the recommended training.
Level 1: Use COVID-19 Datalake Only
Many researchers will simply want to leverage the C3.ai COVID-19 Federated Data Image.
Pros:
- Easy to integrate into existing scientific workflows and run on existing scientific computational hardware
- Publicly available API means no credentials are needed to access the data
- Assuming you have access to your own computational resources, you don't have to worry about allocations
on the C3.ai platform.
Cons:
- All data used from the Datalake must be streamed to wherever you're processing data
- Performance benefits from working with the Datalake using C3.ai will not be available.
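A Level 1 workflow amounts to sending HTTP requests to the public API from your own hardware and processing the streamed results locally. The sketch below only builds such a request, without sending it. The base URL, the `outbreaklocation` type, the `fetch` action, and the `{"spec": ...}` body shape are assumptions here, not details confirmed by this guide, so please check them against the current Datalake documentation before relying on them.

```python
import json
import urllib.request

# Assumed base URL of the public COVID-19 Datalake API (not confirmed by this guide).
BASE_URL = "https://api.c3.ai/covid/api/1"


def build_request(type_name, action, spec):
    """Build (but do not send) a POST request for one Datalake API call."""
    url = f"{BASE_URL}/{type_name}/{action}"
    body = json.dumps({"spec": spec}).encode("utf-8")
    headers = {"Accept": "application/json", "Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Hypothetical query: fetch a single location record by id.
request = build_request(
    "outbreaklocation",
    "fetch",
    {"filter": "id == 'California_UnitedStates'", "limit": 1},
)
print(request.get_method(), request.full_url)

# To actually send it (requires network access):
# with urllib.request.urlopen(request) as response:
#     payload = json.load(response)
```

No credentials appear anywhere in the request, which matches the first "Pro" above; every record in the response still has to travel over the network to your machine, which is the first "Con".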
Level 2: GUI-based data analysis on C3.ai
C3.ai provides a wonderful GUI-based interface to the C3.ai system with their Integrated Development Studio. Such
an environment is likely to be attractive to many researchers. This level is the easiest way to integrate new
data onto the Datalake.
Pros:
- GUI interface to manage C3.ai Types and data integration.
- GUI interface to piece together ML pipelines.
- Ability to load new data onto the Datalake
Cons:
- Some types of workflows may not be easily defined within the GUI framework
Level 3: Utilize C3.ai Suite and Jupyter notebook analysis
Some researchers will want to write their own C3.ai package and leverage more of the AI Suite through Jupyter notebooks.
C3.ai allows researchers to define their own types and methods, and to use R and Python to perform analysis on Datalake data.
Pros:
- Researchers can use a Jupyter notebook to interface directly with their Data Model.
- Researchers can often use exactly the same workflow they were using before.
Cons:
- We recommend users take a full set of training to completely familiarize themselves
with the C3.ai system before embarking on their analysis.
Level 4: State-of-the-art ML workflows requiring special ML models and/or GPUs
Some researchers will want to bring state-of-the-art ML workflows to C3.ai. C3.ai can support such workflows, but
extra work may be needed.
Pros:
- Researchers can bring state-of-the-art workflows close to the COVID-19 Datalake
Cons:
- The DTI team will evaluate on a case-by-case basis whether a workflow is appropriate for C3.ai.
- Some workflows may require major effort to fit within the C3.ai framework.
Accessing C3.ai
This section introduces the process to access C3.ai. Generally speaking, once you receive your grant,
the DTI team will reach out and discuss with you what your needs are.
- Determine which researchers will require access to a C3.ai environment
- Each researcher will be given a C3.ai developer portal login.
- Each researcher will be given a tag on the C3.ai DTI training cluster.
- Once training is complete, discuss with the DTI team what your needs
for a C3.ai cluster will be.
- The C3.ai DTI will work with C3.ai to stand up a new tag for your research.
- Access to that tag will be granted to your researchers
- Research can then proceed until your allocation is exhausted!
Essential C3.ai Concepts
C3.ai is quite different from traditional HPC resources. We have written an introduction to C3.ai from the
perspective of a scientific researcher. We go over several important C3.ai concepts and relate them to
what scientists are more familiar with.
C3.ai Allocation Management
This section introduces how researchers will be expected to manage their allocation while on the C3.ai platform.
This section will be expanded once the DTI team understands how this procedure will look to the researcher.
Special Compute Resource Information
Here you can find information about the special compute resources available to C3.ai DTI researchers.
Comprehensive List of Available Training and Resources
See the above link for a comprehensive list and categorization of the available training
materials. This includes C3.ai Documentation, DTI introductions, and DTI created examples and exercises.
Help! This guide doesn't solve my problem!
...
If you have any questions, feel free to contact the C3.ai DTI team at the email help@c3dti.ai.
C3 AI Suite Guides
We have prepared several self-study guides, organized by your level of familiarity with the C3 AI Suite:
- DTI Readiness Checklist: Before you go to any guides below, please check if you have C3 AI Suite Access with this Readiness Checklist.
- DTI C3 AI Suite Quickstart Guide: If this is your first time using the C3 AI Suite, this is the guide you are looking for. It introduces the core concepts of the platform and includes visual guides (images and videos) to give you a basic understanding of it.
- C3.ai DTI Tutorial to C3 Types: This is a C3.ai DTI tutorial guide for essential concepts about C3 Types. This is one of the most important concepts in using the C3 AI Suite. The C3 Type is similar to the concept of the "class" in Java and Python.
- DTI Guide: C3 Expressions: This is a C3.ai DTI guide to understanding C3 Expressions which appear throughout the platform.
- C3.ai DTI Guide for Jupyter Notebook on C3 AI Suite: If you have a basic understanding of the C3 AI Suite and want to use Jupyter notebooks on it, these two basic guides will help you get started. The Python Action Runtime setup guide is also included.
- DTI Guide: Python on C3 AI Suite: This guide describes some basic details of using Python on the C3 AI Suite including how to define Python runtime environments.
- C3.ai DTI Data Integration: Use this guide if you want to integrate data into the C3 AI Suite. It includes official C3 AI documentation for data integration, a detailed explanation of the integration structure, and a recorded demo video. This guide assumes some basic understanding of the C3 AI Suite and C3 Types.
- C3.ai DTI Guide to Machine Learning: This guide gives you the basic knowledge needed to start doing Machine Learning on the C3 AI Suite. It covers the substrate/concrete C3 Types that you will use frequently on the platform, along with a couple of usage examples. This guide assumes some basic understanding of the C3 AI Suite and C3 Types.
- C3.ai DTI RESTful API Documentation: This guide provides a basic understanding of the C3 RESTful APIs that are automatically generated when you deploy a C3 Package to your C3 Tag.
- C3.ai Tips and Tricks: This guide provides a number of useful tips and tricks for working with the C3 platform.
- DTI Platform Resource Management Guide: This DTI-managed guide explains how to manage your cluster resources. It is especially useful for custom Jupyter sessions.
- C3.ai DTI Training Videos: If you are a visual learner and prefer to learn the basic concepts with a lecturer, feel free to check these training videos covering topics selected by the C3.ai DTI DevOps staff. These are a collaboration between the C3.ai DTI team and Armaan Butt from C3 AI and go through various aspects of development and usage on the C3 AI Suite.
- List of Additional Available Training Resources: Several additional training materials are not listed above. Please visit this page for more details about them.
- Official C3 AI Training: If you would like to take the official C3 AI training, here is the link. There are multiple online courses, and each takes about a week of working time to finish, but they can be very helpful for understanding the platform in detail.
Other Computational Resource Guides
We also support additional computational resources, including Azure Cloud and supercomputing facilities:
- Introduction to Azure Cloud: This page presents information related to Azure Cloud. For more detail, please contact the DTI support staff so that we can set up a one-on-one Zoom meeting.
- NCSA Blue Waters Resource Summary: This page is the system summary page for the Blue Waters High-Performance Computing Resources.
- UCB NERSC Resource Summary: This page is the resource summary page for the National Energy Research Scientific Computing Center's (NERSC) High-Performance Computing Resources.
Feedback
If you feel aspects of this guide are incomplete or inaccurate, please send an email to help@c3dti.ai with the
issue or suggestion, and we will work to incorporate it to make the documentation better. We appreciate the new perspective
more eyes can bring to a software project!
Your DTI Team
NCSA
Jay Roloff - Executive Director
Matthew Krafczyk - Research Scientist
Yifang Zhang - Data Analyst
Darren Adams - Senior Research Programmer
Weddie Jackson - Research Programmer
Bruno Abreu - Research Programmer
Berkeley
Larry Rohrbach - Executive Director
Eric Fraser - Chief Technology Officer
Greg Merritt - DevOps Lead
Matt Podolsky - Managing Director of Research Technology
Contact Us
If you have additional questions or needs that are not covered by this series of guides, please feel free to contact the DTI team at help@c3dti.ai. We can help with your questions via help desk tickets and/or Zoom meetings.
Legal Notice
C3 AI, C3.ai, C3DTI, C3.ai DTI, C3.ai Digital Transformation Institute, and the C3.ai logos are trademarks or registered trademarks of C3.ai, Inc. in the United States and/or other countries. All other product names, trademarks, and registered trademarks are the property of their respective owners.