Guidelines for bioconda recipes¶
bioconda recipe checklist¶
Source URL is stable (details)
sha256 or md5 hash included for source download (details)
Appropriate build number (details)
.bat file for Windows removed (details)
Remove unnecessary comments (details)
Adequate tests included (details)
Files created by the recipe follow the FHS (details)
License allows redistribution and license is indicated in meta.yaml
Package does not already exist in the defaults or conda-forge channels with some exceptions (details)
Package is appropriate for bioconda (details)
If the recipe installs custom wrapper scripts, usage notes should be added to extra -> notes in the meta.yaml.
Update 21 Jan 2019: Recipes that contain pure Python packages should be marked as “noarch” (details).
Update 7 Mar 2018: When patching a recipe, please provide details on how you tried to address the problem upstream (details).
Stable urls¶
While supported by conda, git_url and git_rev are not as stable as a git tarball. Ideally, a GitHub repo should have tagged releases that are accessible as tarballs from the “releases” section of the GitHub repo. Correspondingly, a Bitbucket repo should have tagged versions that are accessible as tarballs from the “Downloads” -> “tags” section of the Bitbucket repo.
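As a rough illustration, the tarball of a tagged GitHub release can usually be addressed by a URL built from the owner, repository, and tag names. The sketch below assumes GitHub's archive URL layout; the helper name github_tarball_url and the USERNAME, mypackage, and v0.1.0 values are hypothetical placeholders.

```shell
#!/bin/bash
# github_tarball_url OWNER REPO TAG: print the tarball URL for a tagged release.
# Hypothetical helper; assumes the https://github.com/OWNER/REPO/archive/TAG.tar.gz layout.
github_tarball_url() {
    printf 'https://github.com/%s/%s/archive/%s.tar.gz\n' "$1" "$2" "$3"
}

# Placeholder values, not a real repository.
github_tarball_url USERNAME mypackage v0.1.0
```

The printed URL is the kind of value that belongs in the url field of the source section, next to its checksum.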
TODO: additional info on the various R and bioconductor URLs
Hashes¶
We support either sha256 or md5 checksums to verify the integrity of the source package. If a checksum is provided alongside the source package, then use that. Otherwise we prefer sha256 over md5.
Use shasum -a 256 ... (or sha256sum or openssl sha256 etc) on a file to compute its sha256 hash, and copy this into the recipe, for example:
wget -O- $URL | shasum -a 256
Likewise use md5sum (Linux) or md5 (OSX) on a file to compute its md5 hash, and copy this into the recipe.
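For a source archive that is already on disk, the hashing step can be wrapped in a small helper that prints only the digest. This is a minimal sketch, not an official bioconda tool: the function name sha256_of is hypothetical, and it assumes that either sha256sum or shasum is available on the PATH.

```shell
#!/bin/bash
# sha256_of FILE: print only the hex digest of FILE, ready to paste into meta.yaml.
# Hypothetical helper; falls back to shasum where sha256sum is not installed.
sha256_of() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1"
    else
        shasum -a 256 "$1"
    fi | awk '{print $1}'   # both tools print "<digest>  <filename>"
}
```

Calling sha256_of on the downloaded tarball prints a single line that can be copied into the sha256 field of the recipe.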
Build numbers¶
The build number (see conda docs) can be used to trigger a new build for a package whose version has not changed. This is useful for fixing errors in recipes. The first recipe for a new version should always have a build number of 0.
.bat files¶
When creating a recipe using one of the conda skeleton tools, a .bat file for Windows will be created. Since bioconda does not support Windows and to reduce clutter, please remove these files.
Filesystem Hierarchy Standard¶
Recipes should conform to the Filesystem Hierarchy Standard (FHS). This is most important for libraries and Java packages; for these cases use one of the recipes below as a guideline.
Existing package exceptions¶
If a package already exists in one of the dependent channels but is broken or cannot be used as-is, please first consider fixing the package in that channel. If this is not possible, please indicate this in the PR and notify @bioconda/core in the PR.
Packages appropriate for bioconda¶
Bioconda is a bioinformatics channel, so we prefer to host packages specific to this domain. If a bioinformatics recipe has more general dependencies, please consider opening a pull request with conda-forge which hosts general packages.
All CRAN packages that do not depend on a package in bioconda should be added to conda-forge instead. This applies even if the CRAN package is directly related to bioinformatics.
If uploading of an unreleased version is necessary, please follow the versioning scheme of conda for pre- and post-releases (e.g. using a, b, rc, and dev suffixes, see here).
“Noarch” packages¶
A “noarch” package must be created for pure Python packages. To do so, add noarch: python to the build section of the meta.yaml file.
For other generic packages (like a data package), add noarch: generic to the build section.
Dependencies¶
There is currently no mechanism to define, in the meta.yaml file, that a particular dependency should come from a particular channel. This means that a recipe must have its dependencies in one of the following:
as-yet-unbuilt recipes in the repo that will be included in the PR
conda-forge channel
bioconda channel
default Anaconda channel
Otherwise, you will have to write the recipes for those dependencies and include them in the PR. One shortcut is to use anaconda search -t conda <dependency name> to look for packages built by others. Inspecting those recipes can give some clues into building a version of the dependency for bioconda.
Test Dependencies¶
During a Bioconda package build, the stringent mulled-build test runs the tests listed in the test section of the recipe’s meta.yaml file in an environment which does not have the dependencies listed in the test: requires: section. Because of this, all tests in that section must only rely on runtime dependencies or the build will fail.
If you do want to run tests that rely on dependencies listed in test: requires:, those tests should not be written directly in the meta.yaml file but instead put in a run_test.sh file in the same directory as the meta.yaml file. This file is picked up and run by the standard conda build test that occurs before the mulled-build test, but is not passed on to and run by the mulled-build test.
Patching¶
Some recipes require small patches to get the tests to pass, for example, fixing hard-coded shebang lines (as described at /usr/bin/perl or /usr/bin/python not found). Other patches are more extensive. When patching a recipe, please first make an effort to fix the issue upstream and document that effort in your pull request by either linking to the relevant upstream PR or indicating that you have contacted the author. The goal is not to block merging your PR until upstream is fixed, but rather to make sure upstream authors know there’s an issue that other users (including non-bioconda users) might be having. Ideally, upstream would fix the issue quickly and the PR could be modified, but it’s fine to merge with the patches and if/when upstream fixes, a separate bioconda PR could be opened that pulls in those upstream changes.
Python¶
If a Python package is available on PyPI, use conda skeleton pypi <packagename>, or the more recent alternative grayskull pypi <packagename>, to create a recipe, then remove the bld.bat and any extra comments in meta.yaml and build.sh. The test that is automatically added is probably sufficient. The exception is when the package also installs a command-line tool, in which case that should be tested as well.
Note
conda skeleton pypi and grayskull pypi only work with packages that have a source distribution produced with python setup.py sdist. Packages on PyPI which only have a wheel will not work. Packages containing only “built source” distributions produced with python setup.py bdist on UNIX will likewise not work.
Note
Make sure you have a conda-build version 3.x when running conda skeleton pypi <packagename>. If you are still using conda-build 2.x, either update your conda-build package, or follow the migration guidelines in cb3-main.
By default, Python recipes (those that have python listed as a dependency) must be successfully built and tested on all supported Python versions in order to pass. However, many Python packages are not fully compatible across all Python versions. Use the preprocessing selectors in the meta.yaml file along with the build/skip entry to indicate that a recipe should be skipped.
For example, a recipe that only runs on Python 2.7 should include the following:
host:
  - python <3
Or a package that only runs on Python 3.6 and later:
host:
  - python >=3.6
Alternatively, for straightforward compatibility fixes you can apply a patch in the meta.yaml.
R (CRAN)¶
Note
Most R packages on CRAN should be submitted at Conda-Forge. Specifically, if the CRAN package has a Bioconductor package dependency, it belongs in Bioconda. If the CRAN package does not have a Bioconductor package dependency, it belongs in Conda-Forge.
Note
Using the conda skeleton cran method results in a recipe intended to be built for Windows as well, with lines like:
{% set posix = 'm2-' if win else '' %}
{% set native = 'm2w64-' if win else '' %}
and
test:
  commands:
    - $R -e "library('RNeXML')" # [not win]
    - "\"%R%\" -e \"library('RNeXML')\"" # [win]
and
requirements:
  build:
    - {{ posix }}zip # [win]
The bioconda channel does not build for Windows. To keep recipes streamlined, please remove the “set posix” and “set native” lines described above, remove the whole build entry under requirements, and convert the test:commands: block to only:
test:
  commands:
    - $R -e "library('RNeXML')"
Use conda skeleton cran <packagename> where packagename is a package available on CRAN and is case-sensitive. Either run that command in the recipes dir or move the recipe it creates to recipes. The recipe name will have an r- prefix and will be converted to lowercase. Typically the recipe can be used without modification, though dependencies may also need recipes. For further details on skeleton entries, you can also refer to the cran skeleton template.
Please remove any unnecessary comments and delete the bld.bat file, which is used only on Windows.
If the recipe was created using conda skeleton cran or the scripts/bioconductor_skeleton.py script, the default test is probably sufficient. Otherwise see the examples below to see how tests are performed for R packages.
recipe for R package not on CRAN, also with patch: spp
R (Bioconductor)¶
Use the bioconda-utils bioconductor-skeleton tool to build a Bioconductor skeleton. After using the bootstrap method to set up a testing environment and activating that environment (which will ensure the correct versions of bioconda-utils and conda-build), from the top level of the bioconda-recipes repository run:
bioconda-utils bioconductor-skeleton recipes config.yml DESeq2
Note that the provided package name is a case-sensitive package available on Bioconductor. The output recipe name will have a bioconductor- prefix and will be converted to lowercase. Data packages will be detected automatically, and a post-link script is generated for them (see https://github.com/bioconda/bioconda-utils/pull/169 for details). Typically the resulting recipe can be used without modification, though dependencies may also need recipes. Recipes for dependencies with an r- prefix should be created at Conda-Forge unless the CRAN package has a Bioconductor dependency; see R (CRAN) above.
typical bioconductor recipe: bioconductor-limma/meta.yaml
R (other sources)¶
If a package is only provided in a public repository (e.g. at GitHub or Bitbucket) or via some other website, first check with the authors of the package whether they are planning to publish it on CRAN or Bioconductor. This is always preferable, as it will ensure quality control and permanent availability at a stable URL, and can warrant waiting for such a publication. If this is not planned, you should check if a tagged version is available in a public repo (see the information on stable URLs above) or if the authors are willing to generate one. Only if none of this succeeds should the risk of the source repository or website disappearing be accepted.
Once you have obtained a stable URL to the package, follow the guidelines for R packages on CRAN above and adjust the URL and checksum accordingly.
Java¶
If the software is typically called via java -jar ..., add a wrapper script. Sometimes the software already comes with one; for example, fastqc already had a wrapper script, but peptide-shaker did not.
New recipes should use the openjdk package from conda-forge (recipe feedstock); the java-jdk package from bioconda is deprecated.
JAR files should go in $PREFIX/share/$PKG_NAME-$PKG_VERSION-$PKG_BUILDNUM. A wrapper script should be placed here as well, and symlinked to $PREFIX/bin.
Example with added wrapper script: peptide-shaker
Example with patch to fix memory: fastqc
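The layout described in this section could be produced by a build.sh along the following lines. This is only a sketch under stated assumptions: the tool name mytool and the file mytool.jar are hypothetical, the fallback values exist only so that the script runs outside conda-build, and the wrapper relies on readlink -f, which may not be available on older macOS systems.

```shell
#!/bin/bash
set -euo pipefail

# conda-build sets these variables; the fallbacks only let the sketch run on its own.
PREFIX="${PREFIX:-$(mktemp -d)}"
PKG_NAME="${PKG_NAME:-mytool}"        # hypothetical package name
PKG_VERSION="${PKG_VERSION:-1.0}"
PKG_BUILDNUM="${PKG_BUILDNUM:-0}"
SRC_DIR="${SRC_DIR:-$(mktemp -d)}"

# Stand-in for the upstream JAR so that the sketch is self-contained.
[ -f "$SRC_DIR/mytool.jar" ] || : > "$SRC_DIR/mytool.jar"

# JAR and wrapper both live in share/<name>-<version>-<buildnum>.
outdir="$PREFIX/share/$PKG_NAME-$PKG_VERSION-$PKG_BUILDNUM"
mkdir -p "$outdir" "$PREFIX/bin"
cp "$SRC_DIR/mytool.jar" "$outdir/"

# The wrapper resolves the symlink in bin back to the directory holding the JAR.
cat > "$outdir/mytool" <<'EOF'
#!/bin/bash
jar_dir="$(dirname "$(readlink -f "$0")")"
exec java -jar "$jar_dir/mytool.jar" "$@"
EOF
chmod +x "$outdir/mytool"

# Expose the wrapper on the PATH of the environment.
ln -sf "$outdir/mytool" "$PREFIX/bin/mytool"
```

Keeping the wrapper next to the JAR means that the symlink in bin is the only file outside the versioned share directory.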
Perl¶
Use conda skeleton cpan <packagename> to build a recipe for Perl and place the recipe in the recipes dir. The recipe will have the perl- prefix.
Note
Historically, before conda-forge hosted Perl packages, many non-bioinformatics-related Perl packages were hosted on Bioconda. These are slowly being migrated to conda-forge.
An example of such a package is perl-module-build.
Alternatively, you can ensure that the build requirements for the recipe include perl-app-cpanminus, and then the build.sh script can be simplified. An example of this simplification is perl-time-hires.
If the recipe was created with conda skeleton cpan, the tests are likely sufficient. Otherwise, test the import of modules (see the imports section of the meta.yaml files in above examples).
Additionally, if the recipe was created with conda skeleton cpan, several modifications are necessary to satisfy bioconda policies:
remove the bld.bat script
remove the source/fn entry in meta.yaml
the requirements/build keyword in meta.yaml should be changed to requirements/host
C/C++¶
Build tools (e.g., autoconf) and compilers (e.g., gcc) should be specified in the build requirements. Compilers are handled via a special macro. E.g., {{ compiler('c') }} ensures that the correct version of gcc is used. For the C++ variant g++, you need to use {{ compiler('cxx') }}. These rules apply for both Linux and macOS.
Conda distinguishes between tools needed to carry out the build, such as compilers (the build section), and dependencies that must be present while building, such as libraries to link against (the host section). For example, the following
requirements:
  build:
    - {{ compiler('c') }}
  host:
    - zlib
  run:
    - zlib
specifies that a recipe needs the C compiler to build, and zlib present during building and running.
For two examples see:
If the package uses zlib, then please see the troubleshooting section on zlib.
If your package links dynamically against a particular library, it is often necessary to pin the version against which it was compiled, in order to avoid ABI incompatibilities. Instead of hardcoding a particular version in the recipe, we rely on conda doing this automatically. We use globally defined configurations, namely this for dependencies from conda-forge and this for dependencies in bioconda. If you need to pin another library, please notify @bioconda/core, and we will extend these lists.
It’s not uncommon to have difficulty compiling a package into a portable conda package. Since there is no single solution, here are some examples of how bioconda contributors have solved compiling issues to give you some ideas on what to try:
ococo edits the source in build.sh to accommodate the C++ compiler on OSX
muscle patches the makefile on OSX so it doesn’t use static libs
metavelvet, eautils, preseq have several patches to their makefiles to fix LIBS and INCLUDES, INCLUDEARGS, and CFLAGS
mapsplice includes an older version of samtools; the included samtools’ makefile is patched to work in conda envs.
mosaik has platform-specific patches: one removes -static on linux, and the other sets BLD_PLATFORM correctly on OSX
mothur and soapdenovo have many fixes to makefiles
Rust¶
The Rust compiler is included with the special macro {{ compiler('rust') }}. In addition, if your package relies on the C/C++ compilers, either directly, or indirectly through building dependencies, use the macros for those compilers too.
Keeping in line with conda-forge policies, all (Rust) dependencies of your package should have their license included in the package. This can be done by including, and using, cargo-bundle-licenses in the build process. Make sure to also include the resulting license files in the package (see below).
Lastly, there are some Cargo flags that are recommended for making the build process as reproducible as possible across different systems. --locked ensures the build is deterministic by asserting that the exact same dependencies and versions are used as in the Cargo.lock file. --no-track prevents Cargo from tracking the installed binaries, ensuring a clean and reproducible build environment for the package. --root $PREFIX installs the package into the correct place in the Conda environment. --path specifies the path to the package source code (usually the current directory .).
Wrapping it all up, a typical Rust recipe would look like this:
{% set version = "0.1.0" %}
{% set name = "mypackage" %}
package:
  name: {{ name }}
  version: {{ version }}
source:
  url: https://github.com/USERNAME/{{ name }}/archive/{{ version }}.tar.gz
  sha256: 1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef
build:
  number: 0
  run_exports:
    - {{ pin_subpackage('mypackage', max_pin="x.x") }}
  script:
    - cargo-bundle-licenses --format yaml --output THIRDPARTY.yml
    - cargo install -v --locked --no-track --root $PREFIX --path .
requirements:
  build:
    - {{ compiler('rust') }}
    - cargo-bundle-licenses
    # - {{ compiler('c') }} add these C/C++ compilers if needed
    # - {{ compiler('cxx') }}
test:
  commands:
    - mypackage --version
about:
  home: https://example.com
  license: MIT # please use https://spdx.org identifiers
  license_file:
    - LICENSE # the license file for your package
    - THIRDPARTY.yml # this file is generated by cargo-bundle-licenses
  summary: A Rust package
extra:
  recipe-maintainers:
    - username # your GitHub username, without a leading @ (a plain YAML value cannot start with @)
Haskell¶
Bioconda has a small number of Haskell tools. Most often they are built with stack (which is available on conda-forge). NGLess provides an example of how to call stack. Here are a few notes:
LD_LIBRARY_PATH/LIBRARY_PATH are set to include both ${PREFIX}/lib and the system paths (otherwise, stack setup will fail).
Create a directory (called fake-home in this example) and set it as $HOME, further setting $STACK_ROOT to use a subdirectory of this $HOME.
Mac OS X support is generally missing (any help is appreciated, see #6607).
General command-line tools¶
If a command-line tool is installed, it should be tested. If it has a shebang line, it should be patched to use /usr/bin/env for more general use. An example of this is fastq-screen.
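One way to make such a shebang change in build.sh is a one-line sed edit. The following is a minimal sketch, not the approach taken by any particular recipe: the temporary file stands in for a script shipped by upstream, and the interpreter (perl) is only an example.

```shell
#!/bin/bash
set -euo pipefail

# Stand-in for an upstream script with a hard-coded interpreter path.
script="$(mktemp)"
printf '%s\n' '#!/usr/bin/perl' 'print "hello\n";' > "$script"

# Rewrite only the first line so that the interpreter is looked up via env.
# -i.bak works with both GNU and BSD sed; the backup is removed afterwards.
sed -i.bak '1s|^#!/usr/bin/perl|#!/usr/bin/env perl|' "$script"
rm -f "$script.bak"
```

Note that interpreter options on the shebang line (such as -w) generally do not survive the switch to /usr/bin/env on Linux, so they may need to be moved into the script itself.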
For command-line tools, running the program with no arguments, checking the program's version (e.g. with -v) or checking the command-line help is sufficient if doing so returns an exit code 0. Often the output is piped to /dev/null to avoid output during recipe builds.
Examples:
exit code 0: bedtools
exit code 255 in a separate script: ucsc-bedgraphtobigwig
confirm expected text in stderr: weblogo
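For tools that exit with a non-zero code when run without arguments, a separate test script can accept the expected code and check the usage text instead. The sketch below is an illustration only: mytool is a hypothetical stand-in defined as a shell function, and the expected exit code (255) and the "usage:" text are assumptions that have to be adapted to the real program.

```shell
#!/bin/bash
# Hypothetical stand-in for a tool that prints usage to stderr and exits with 255
# when run without arguments; a real test script would call the installed program.
mytool() {
    echo "usage: mytool [options] input output" >&2
    return 255
}

# check_usage CMD [ARGS...]: succeed only if CMD exits with 255 and its stderr
# contains "usage:". Stdout of CMD is discarded.
check_usage() {
    local status=0 stderr_text
    stderr_text="$("$@" 2>&1 >/dev/null)" || status=$?
    [ "$status" -eq 255 ] && printf '%s\n' "$stderr_text" | grep -q "usage:"
}

# The exit status of this last command becomes the exit status of the script.
check_usage mytool
```

Because the check itself returns 0 on success, the script as a whole still satisfies the requirement that a test returns an exit code of 0.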
If a package depends on Python and has a custom build string, then py{{CONDA_PY}} must be contained in that build string. Otherwise Python will be automatically pinned to one minor version, resulting in dependency conflicts with other packages. See mapsplice for an example of this.
Metapackages¶
Metapackages tie together other packages. All they do is define dependencies. For example, the hubward-all metapackage specifies the various other conda packages needed to get a full hubward installation running just by installing one package. Other metapackages might tie together conda packages with a theme, for example all UCSC utilities related to bigBed files, or a set of packages useful for variant calling.
For metapackages that are not anchored to a particular package (as in the last example above), we recommend semantic versioning starting at 1.0.0.
Other examples of interest¶
Packaging is hard. Here are some examples, in no particular order, of how contributors have solved various problems:
graphviz has an OS-specific option to configure
crossmap removes libs that are shipped with the source distribution
hisat2 runs 2to3 to make it Python 3 compatible, and copies over individual scripts to the bin dir
krona has a post-link.sh script that gets called after installation to alert the user a manual step is required
htslib has a small test script that creates example data and runs multiple programs on it
spectacle runs 2to3 to make the wrapper script Python 3 compatible, patches the wrapper script to have a shebang line, deletes example data to avoid taking up space in the bioconda channel, and includes a script for downloading the example data separately.
gatk is a package for licensed software that cannot be redistributed. The package installs a placeholder script (in this case doubling as the jar wrapper) to alert the user if the program is not installed, along with a separate script (gatk-register) to copy in a user-supplied archive/binary to the conda environment
Name collisions (identical names)¶
In rare cases, a new recipe may have the same name as an existing conda, conda-forge, bioconda, or Python package. This sort of naming collision should be avoided. For example, the weblogo recipe is for the standard command-line tool. There is also a Python package called weblogo on PyPI. In this case, to reduce ambiguity we prefixed the Python package with python- (see python-weblogo).
If in doubt about how to handle a naming collision, please submit an issue, or ask @bioconda/core in a PR.
Tests¶
An adequate test must be included in the recipe. An “adequate” test depends on the recipe, but must be able to detect a successful installation. While many packages may ship their own test suite (unit tests or otherwise), including these in the recipe is not recommended since it may cause the build to time out on CircleCI. We especially want to avoid including any kind of test data in the repository.
Note that a test must return an exit code of 0. The test can be in the test field of meta.yaml, or can be a separate script (see the relevant conda docs for testing).
It is recommended to pipe unneeded stdout/stderr to /dev/null to avoid cluttering the output in the CircleCI build environment.
Link and unlink scripts (pre- and post- install hooks)¶
It is possible to include scripts that are executed before or after installing a package, or before uninstalling a package. These scripts can be helpful for alerting the user that manual actions are required after adding or removing a package. For example, a post-link.sh script may be used to alert the user that they will need to create a database or modify a settings file. Any package that requires a manual preparatory step before it can be used should consider alerting the user via an echo statement in a post-link.sh script. These scripts may be added at the same level as meta.yaml and build.sh:
pre-link.sh is executed prior to linking (installation). An error causes conda to stop.
post-link.sh is executed after linking (installation). When the post-link step fails, no package metadata is written, and the package is not considered installed.
pre-unlink.sh is executed prior to unlinking (uninstallation). Errors are ignored. Used for cleanup.
These scripts have access to the following environment variables:
$PREFIX: the install prefix
$PKG_NAME: the name of the package
$PKG_VERSION: the version of the package
$PKG_BUILDNUM: the build number of the package
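A post-link.sh that alerts the user could look roughly like the sketch below. Several things here are assumptions rather than statements from this guide: the package name mytool and the helper mytool-download-db are hypothetical, the fallback values exist only so that the script runs outside conda, and the messages are appended to $PREFIX/.messages.txt on the assumption that conda shows the contents of that file after installation while plain stdout from link scripts is usually not shown.

```shell
#!/bin/bash
# conda sets these variables when it runs the script; the fallbacks only let the
# sketch run on its own.
PREFIX="${PREFIX:-$(mktemp -d)}"
PKG_NAME="${PKG_NAME:-mytool}"        # hypothetical package name
PKG_VERSION="${PKG_VERSION:-1.0}"

# Assumption: conda displays $PREFIX/.messages.txt to the user after the install.
{
    echo "$PKG_NAME $PKG_VERSION needs a reference database before first use."
    echo "Run 'mytool-download-db' (hypothetical helper) in this environment to fetch it."
} >> "$PREFIX/.messages.txt"
```

The script only informs the user and does nothing that could fail, which matters because a failing post-link step leaves the package uninstalled.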
Versions¶
In general, recipes can be updated in-place. The older package[s] will continue to be hosted and available on anaconda.org while the recipe will reflect just the most recent package.
However, if an older version of a package is required but has not yet had a package built, create a subdirectory of the recipe named after the old version and put the recipe there. Examples of this can be found in bowtie2, bx-python, and others.
Comments in recipes¶
When creating a recipe using one of the conda skeleton tools, often many comments are included, for example, to point out sections that can be uncommented and used. Please delete all auto-generated comments in meta.yaml and build.sh. But please add any comments that you feel could help future maintainers of the recipe, especially if there’s something non-standard.