...
- Request accounts on Nano cluster.
- Copy the scripts to Nano and learn how to use them on simple batch-scheduled applications.
- (Update the scripts if needed.)
- Go over the TensorFlow getting-started tutorials (https://www.tensorflow.org/tutorials/).
- Figure out how to use the above scripts with the TensorFlow tutorial examples; collect and visualize the data.
- Collect data from the TensorFlow benchmarks (https://www.tensorflow.org/guide/performance/benchmarks).
- Look into integrating the scripts with Nano's system monitor (https://nano.ncsa.illinois.edu:3000/d/3QVrDIFmz/nano-status?refresh=1m&orgId=1).
- Yan: InfluxDB takes HTTP requests with JSON-formatted data. Also check relevant Telegraf plugins (cpu, mem, disk, diskio, zfs, etc.) as these are currently being used in admin (non-public) dashboards and may provide some of the desired metrics.
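
A minimal sketch of how the scripts might push a collected sample to InfluxDB over HTTP, for the integration item above. One caveat to verify against the version Nano runs: to my knowledge, the InfluxDB 1.x `/write` endpoint expects line protocol in the request body (JSON is what queries return), so this sketch formats line protocol. The measurement name `gpu_util`, the tags, the host `nano01`, the database `metrics`, and the URL `http://localhost:8086` are all hypothetical placeholders, not values taken from the Nano setup.

```python
import urllib.parse
import urllib.request


def _escape(text, chars):
    """Backslash-escape each character in `chars` (line-protocol rule)."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def _format_field(value):
    """Render one field value: bool, integer (i suffix), float, or quoted string."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Build one line: measurement,tag=val,... field=val,... timestamp."""
    if not fields:
        raise ValueError("line protocol requires at least one field")
    head = _escape(measurement, ", ")
    for key in sorted(tags):  # sorted tag keys, as InfluxDB recommends
        head += "," + _escape(key, ",= ") + "=" + _escape(str(tags[key]), ",= ")
    body = ",".join(
        _escape(key, ",= ") + "=" + _format_field(value)
        for key, value in fields.items()
    )
    return f"{head} {body} {timestamp_ns}"


def build_write_request(base_url, database, lines):
    """Prepare (but do not send) a POST to the 1.x-style /write endpoint."""
    query = urllib.parse.urlencode({"db": database, "precision": "ns"})
    url = base_url.rstrip("/") + "/write?" + query
    data = "\n".join(lines).encode("utf-8")
    return urllib.request.Request(url, data=data, method="POST")


if __name__ == "__main__":
    # Hypothetical sample; a real script would read these from the collectors.
    line = to_line_protocol(
        "gpu_util",
        {"host": "nano01", "job": "42"},
        {"percent": 87.5, "mem_mb": 1024},
        1550000000000000000,
    )
    request = build_write_request("http://localhost:8086", "metrics", [line])
    print(request.get_full_url())
    print(line)
```

To actually send the sample, pass the prepared request to `urllib.request.urlopen`. Authentication and the real database name would have to come from the Nano admins. If the Telegraf plugins Yan mentions already cover a metric, reusing them is probably simpler than writing it from the scripts.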