pip install -i https://pypi.anaconda.org/octo/simple ddui
pip install -i https://pypi.anaconda.org/octo/label/DD-191/simple ddui
pip install -i https://pypi.anaconda.org/octo/label/dev/simple ddui
pip install -i https://pypi.anaconda.org/octo/label/test_anaconda_version/simple ddui
In a machine learning project, a recurring problem is the gap between the source code used for local interactive modeling and the source code of production pipelines. Constantly switching between experimentation and production is error-prone and, as a consequence, time-consuming.
The Datadriver project aims to solve this issue by providing the glue code, based on Pandas and sklearn for modeling and on Airflow for automating, scheduling, and monitoring training and prediction pipelines.
Datadriver UI (ddui) is the Airflow plugin we developed to track our models. Combined with the Datadriver API (pyddapi), it offers a DAG view to track a machine learning workflow (or dataflow).
More specifically, it shows the output of any Airflow task, along with metrics and charts.
git clone git_url_of_this_project && cd this_project
Local install:
pip install -e .
ddui install
Docker install:
./run_docker.sh
ddui/
dash_app -> the application, defined as a Dash application with callbacks and event handling. It is later imported in plugin.py
dash_components -> custom HTML components such as a Panel or an Alert div
orm -> functions to access the Airflow metastore and retrieve the list of DAGs and their info
plot -> functions using plotly; they return a Graph object
plugin -> defines the DataDriverUI plugin that implements Airflow's Plugin interface https://airflow.apache.org/plugins.html#interface
views -> a FlaskAdminView that also integrates Dash, so that plotly charts can be included in Airflow
There is an existing DAG in tests/dags that mocks the behavior of the Datadriver API without any dependency on pyddapi.
You can use it to develop the user interface with the script located in tests/dev_tools.
cd tests/dev_tools
python run_webserver.py
It runs the Airflow webserver and overrides AIRFLOW__CORE__DAGS_FOLDER so that Airflow looks for DAGs in tests/dags.
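For readers who want to reproduce this override by hand, here is a minimal sketch of the idea. It is not the actual content of run_webserver.py: the helper names build_dev_env and run_webserver are hypothetical, and the sketch assumes the airflow command is available on the PATH (for example inside the activated virtualenv).

```python
import os
import subprocess
from pathlib import Path


def build_dev_env(repo_root, base_env=None):
    """Return a copy of the environment with the DAGs folder pointed at tests/dags.

    Hypothetical helper, written for illustration only.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Airflow maps AIRFLOW__{SECTION}__{KEY} variables onto airflow.cfg entries,
    # so this overrides [core] dags_folder for the child process only.
    env["AIRFLOW__CORE__DAGS_FOLDER"] = str(Path(repo_root) / "tests" / "dags")
    return env


def run_webserver(repo_root):
    """Start the Airflow webserver with the overridden DAGs folder.

    Assumes the `airflow` CLI is on PATH. Blocks until the webserver exits.
    """
    return subprocess.run(
        ["airflow", "webserver"],
        env=build_dev_env(repo_root),
        check=True,
    )
```

Calling run_webserver(".") from the repository root would then start the webserver against the mock DAG. Because the variable is set only in the child process environment, your regular Airflow configuration is left untouched.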
virtualenv venv
source venv/bin/activate
pip install -e .
pip install -r ci/tests_requirements.txt
ddui install