recipe bioconductor-genproseq

Generating Protein Sequences with Deep Generative Models






Generative modeling for protein engineering is key to solving fundamental problems in synthetic biology, medicine, and materials science. Machine learning has made it possible to generate useful protein sequences at a variety of scales. Generative models are machine learning methods that seek to model the distribution underlying the data, allowing the generation of novel samples with properties similar to those on which the model was trained. Generative models of proteins can learn biologically meaningful representations that are helpful for a variety of downstream tasks. Furthermore, they can learn to generate protein sequences that have not been observed before and to assign higher probability to protein sequences that satisfy desired criteria. This package provides common deep generative models for protein sequences: variational autoencoders (VAE), generative adversarial networks (GAN), and autoregressive models. The VAE and GAN use Word2vec for embedding, while the autoregressive model applies a transformer encoder to protein sequences.

package bioconductor-genproseq




depends bioconductor-deeppincs:


depends bioconductor-ttgsea:


depends r-base:


depends r-catencoders:

depends r-keras:

depends r-mclust:

depends r-reticulate:

depends r-tensorflow:

depends r-word2vec:



You need a conda-compatible package manager (currently either micromamba, mamba, or conda) and the Bioconda channel already activated (see set-up-channels).

While any of the above package managers is fine, it is currently recommended to use either micromamba or mamba (see here for installation instructions). We will show all commands using mamba below, but the arguments are the same for the other two.
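Because the arguments are the same across the three tools, a script can simply use whichever one is installed. The following is a minimal sketch, not part of Bioconda itself: the helper name ``pick_manager`` is hypothetical, and the preference order (micromamba, then mamba, then conda) is an assumption based on the recommendation above.

```shell
#!/bin/sh
# Hypothetical helper: print the first conda-compatible package manager
# found on PATH. The order micromamba, mamba, conda is an assumption
# based on the recommendation above, not a Bioconda requirement.
pick_manager() {
    for candidate in micromamba mamba conda; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    # None of the three tools was found.
    return 1
}

if manager=$(pick_manager); then
    echo "Using: $manager"
    # The install arguments are the same for all three tools, e.g.:
    #   "$manager" install bioconductor-genproseq
else
    echo "No conda-compatible package manager found on PATH" >&2
fi
```

The actual install command is left as a comment so that running the sketch only reports which tool would be used.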

Given that you already have a conda environment in which you want to have this package, install with:

   mamba install bioconductor-genproseq

and update with::

   mamba update bioconductor-genproseq

To create a new environment, run:

   mamba create --name myenvname bioconductor-genproseq

with myenvname being a reasonable name for the environment (see e.g. the mamba docs for details and further options).
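After creating the environment, it can be useful to check that the package actually loads in R. The sketch below is one way to do that, under a few assumptions: the R package loads as ``GenProSeq`` (its Bioconductor name), ``mamba run`` is used to execute a command inside the environment, and ``ENV_NAME``, ``DRY_RUN``, ``run``, and ``setup_genproseq`` are illustrative names introduced here. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Illustrative variables (not Bioconda settings):
#   ENV_NAME  name of the environment to create
#   DRY_RUN   1 = only print the commands (default), 0 = execute them
ENV_NAME="${ENV_NAME:-myenvname}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Create the environment, then try to load the package inside it.
# Assumes the R package is loadable as GenProSeq.
setup_genproseq() {
    run mamba create --yes --name "$ENV_NAME" bioconductor-genproseq &&
    run mamba run -n "$ENV_NAME" Rscript -e 'library(GenProSeq)'
}

setup_genproseq
```

Setting ``DRY_RUN=0`` executes the commands for real. If the ``library()`` call fails, ``Rscript`` exits with a non-zero status, so the script can serve as a quick smoke test.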

Alternatively, use the docker container:

   docker pull<tag>

(see `bioconductor-genproseq/tags`_ for valid values for ``<tag>``)
