Dotscience allows teams to put their machine learning development and deployment into a robust model management framework. Within the framework, models and all accompanying files and datasets are versioned. The provenance of every file is auto-recorded throughout training and into deployment, and metadata about each run of a model is auto-recorded as well. This record lets teams and organizations reliably keep track of their work and share insights about model behavior, reducing the need to re-execute costly training runs or reverse-engineer models.
Dotscience tracks the variables of model runs during training and optimization, such as their hyperparameter combinations and corresponding accuracy metrics. This record provides a run history of your team's work, so that team members can easily compare the performance of models built with different code, data versions, hyperparameter combinations and execution environments, such as hardware accelerators.
The run history can also be visualized on the Dotscience web interface to give insights into model behavior: for instance, to show the effect of a hyperparameter choice on accuracy metrics. This helps teams decide the next experiment to try. The provenance graph of each file used can also be viewed. Models can be deployed via the web interface to production environments.
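To illustrate, a single annotated training run might look roughly like the snippet below. The ds.start(), ds.parameter(), ds.summary() and ds.publish() calls follow the examples in the dotscience-python library (github.com/dotmesh-io/dotscience-python); the hyperparameter names and metric value are placeholders, so treat this as a sketch rather than the authoritative API.

import dotscience as ds

ds.interactive()                      # annotating runs from inside a Jupyter notebook
ds.start()                            # begin a tracked run

ds.parameter("learning_rate", 0.001)  # hyperparameters recorded in the run history
ds.parameter("batch_size", 64)

# ... build and train the model here ...

ds.summary("accuracy", 0.94)          # placeholder metric, comparable across runs in the web interface
ds.publish("baseline run")            # record the run and its metadata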
To use Dotscience, you need to run the Dotscience container image on your chosen runner and annotate your model code with the Dotscience Python library. Any additional files required by the model, including training data, can be uploaded via the Dotscience web interface to place them under version control.
Models can be defined and trained in Jupyter notebooks, accessed via the Dotscience web interface. Runners can be any cloud VM or local machine, and may be GPU-enabled.
You will need a runner (any cloud VM or local machine) running either Ubuntu 16.04+ or CentOS 7, with Docker installed. If you want to run on your runner's GPUs, you will also need to install the NVIDIA Container Runtime; for installation instructions, see the NVIDIA documentation.
Copy the value of TOKEN from the code snippet shown, then set it on your runner as an environment variable named TOKEN:
$ export TOKEN="<your token>" # replace <your token> with your copied value
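# start the Dotscience runner agent; it uses the host's Docker socket to launch further containers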
docker run --name dotscience-runner -d -e TOKEN=$TOKEN \
--restart always -v /var/run/docker.sock:/var/run/docker.sock \
-v dotscience-task-spool:/spool \
nvcr.io/nvidia/dotmesh/dotscience-runner:latest #TODO update image location
Note that the dotscience-runner container will boot up a couple more Docker containers on your runner.
Create a Dotscience project in the Projects view. Upload the data files you will need, then click Launch Jupyter to open a JupyterLab instance using your runner as the backing compute. You can use the terminal in JupyterLab to import more data, libraries, and other files.
Annotate your model code with the Dotscience Python library. See documentation and examples here: github.com/dotmesh-io/dotscience-python
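For example, a standalone training script might be annotated roughly as follows. This is a sketch based on the examples in that repository (the repository is authoritative for function names and signatures), and the file paths, epoch count and metric value are placeholders.

import dotscience as ds

ds.script()                                # annotating a standalone script (use ds.interactive() in a notebook)
ds.start()

train_path = ds.input("data/train.csv")    # declare an input file so its provenance is tracked
ds.parameter("epochs", 10)                 # record a hyperparameter for this run

# ... load train_path, build and fit the model, then evaluate it ...

ds.summary("val_accuracy", 0.91)           # placeholder evaluation result
ds.output("model/model.h5")                # declare the model file produced by this run
ds.publish("trained model, 10 epochs")     # record the run, its provenance and its metadata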
Visualize your model metrics and provenance graph in the Dotscience web interface. Access collaboration features, including GitHub-style fork and merge, and version control for massive datasets.
Join the Dotscience community Slack here.
Dotscience is commercial software for enterprises. Access to Dotscience is provided here as a time-limited trial. Contact sales@dotscience.com to discuss signing up for an enterprise pilot.