Tool for encapsulating, running, and reproducing data science projects.
Take any directory full of stuff that you're working on: web apps, scripts, Jupyter notebooks, data files, whatever it may be.
Add an anaconda-project.yml to this project directory, and a single anaconda-project run command will be able to set up all dependencies and then launch the project.
Anaconda projects should run in the same way on your machine, on a colleague's machine, or when deployed to a server.
Running an Anaconda project executes a command specified in the anaconda-project.yml (any arbitrary commands can be configured).
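anaconda-project.yml automates project setup; Anaconda can establish all prerequisite conditions for the project's commands to execute successfully. These conditions could include a conda environment with the right versions of packages, credentials supplied by the user, data files, and running services such as a database server.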
The goal is that if your project runs on your machine, it will also run on others' machines (or on your future machine after you reboot a few times and forget how your project works).
The command anaconda-project init DIRECTORY_NAME creates an anaconda-project.yml, converting your project directory into an Anaconda project.
Traditional build scripts such as setup.py automate "building" the project (going from source code to something runnable), while anaconda-project automates "running" the project (taking build artifacts and doing any necessary setup prior to executing them).
anaconda-project automates the setup steps; the README can say "type anaconda-project run" and that's it.
anaconda-project automates environment creation and verifies that environments have the right versions of packages.
With anaconda-project, you can call os.getenv("DB_PASSWORD") in your code and configure anaconda-project to prompt the user for any missing credentials.
With anaconda-project, someone who wants to reproduce your analysis can ensure they have exactly the same setup that you have on your machine.
anaconda-project.yml tells hosting providers how to run your project, so there's no special setup needed when you move from your local machine to the web.
Check out the complete documentation, including a tutorial and reference guide, at: http://anaconda-project.readthedocs.io/en/latest/index.html
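The credentials point above (reading DB_PASSWORD with os.getenv instead of hard-coding it) can be sketched roughly as follows. This is a minimal illustration, not code from anaconda-project itself; the function name read_db_password and the wording of the error message are assumptions made for the example.

```python
import os


def read_db_password():
    """Return the database password from the DB_PASSWORD environment variable.

    read_db_password is a hypothetical helper name used for illustration.
    The project code only reads the variable; something outside the code
    (for example anaconda-project prompting the user) has to set it.
    """
    password = os.getenv("DB_PASSWORD")
    if not password:
        # An unset or empty value is treated as missing, so the failure is
        # explicit rather than a confusing login error later on.
        raise RuntimeError("DB_PASSWORD is not set in the environment")
    return password
```

Keeping the secret out of the source means the same code can be shared, and each user supplies their own credentials when the project is run.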
If you already use conda env and environment.yml, anaconda-project has similar functionality and may be more convenient. The advantage of anaconda-project for environment handling is that it performs conda operations and records them in a config file for reproducibility, in one step.
For example, if you do anaconda-project add-packages bokeh=0.11, that will install Bokeh with conda, and add bokeh=0.11 to an environment spec in anaconda-project.yml (the effect is comparable to adding it to environment.yml). In this way, "your current conda environment's state" and "your configuration to be shared with others" won't get out of sync.
anaconda-project will also automatically set up environments for a colleague when they type anaconda-project run on their machine; they don't have to do a separate step to create, update, or activate environments before they run the code. This may be especially useful when you change the required dependencies; with conda env people can forget to re-run it and update their packages, while anaconda-project run will automatically add missing packages every time.
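If you want to drive the two commands mentioned above (anaconda-project add-packages and anaconda-project run) from a Python script, one possible sketch is shown here. The helper names project_command and run_in_project are hypothetical, and the sketch assumes the anaconda-project executable is on your PATH; it only wraps the command lines already shown in this README.

```python
import shutil
import subprocess


def project_command(subcommand, *args):
    """Build the argument list for an anaconda-project invocation."""
    return ["anaconda-project", subcommand, *args]


def run_in_project(project_dir, subcommand, *args):
    """Run the command with project_dir as the working directory.

    Returns the process exit code, or None when the anaconda-project
    executable cannot be found on PATH.
    """
    if shutil.which("anaconda-project") is None:
        return None
    completed = subprocess.run(project_command(subcommand, *args), cwd=project_dir)
    return completed.returncode


if __name__ == "__main__":
    # Print the command lines without executing them.
    print(" ".join(project_command("add-packages", "bokeh=0.11")))
    print(" ".join(project_command("run")))
```

Calling run_in_project(".", "run") from the project directory would then be equivalent to typing anaconda-project run there by hand.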
In addition to environment creation, anaconda-project can perform other kinds of setup, such as adding data files and running a database server. It's a superset of conda env in that sense.
For the time being, the Anaconda project API and command line syntax are subject to change in future releases. A project created with the current “beta” version of Anaconda project may always need to be run with that version of Anaconda project and not Anaconda project 1.0. When we think things are solid, we’ll switch from “beta” to “1.0” and you’ll be able to rely on long-term interface stability.
Please report issues right here on GitHub.
Please see CONTRIBUTING.md in this directory.