recipe biopet-fastqsplitter

This tool divides a fastq file into smaller fastq files, based on the number of output files specified.

Homepage:

https://github.com/biopet/fastq-splitter

License:

MIT

Recipe:

/biopet-fastqsplitter/meta.yaml

This tool divides a FASTQ file into smaller FASTQ files, based on the number of output files specified. For example, if one specifies 5 output files, it will split the input into 5 files of roughly equal size. This is useful when a pipeline has a chunking option: FastqSplitter can generate exactly the number of FASTQ files (chunks) needed.

FastqSplitter reads the input in groups of reads (100 reads per group) and distributes these groups evenly over the output FASTQ files. It cycles through the output files in order, writing one group to each before returning to the first.

Example: a FASTQ file is split with a group size of 100 and three output files.

  • reads 1-100 are assigned to output1

  • reads 101-200 are assigned to output2

  • reads 201-300 are assigned to output3

  • reads 301-400 are assigned to output1

  • reads 401-500 are assigned to output2

and so on.

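This keeps the output FASTQ files close to equal in size (they differ by at most one group of reads) and avoids positional bias: each output file contains reads from across the whole input rather than from one contiguous region.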
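The round-robin distribution described above can be sketched in Python. This is only an illustration of the idea, not FastqSplitter's actual implementation: the function names read_groups and split_round_robin are hypothetical, and plain integers stand in for FASTQ records so the assignment is easy to follow.

```python
from itertools import islice


def read_groups(records, group_size=100):
    """Yield consecutive groups of up to `group_size` records (hypothetical helper)."""
    it = iter(records)
    while True:
        group = list(islice(it, group_size))
        if not group:
            return
        yield group


def split_round_robin(records, n_outputs, group_size=100):
    """Assign each group to an output in turn: group 0 -> output 0, group 1 -> output 1, ..."""
    outputs = [[] for _ in range(n_outputs)]
    for i, group in enumerate(read_groups(records, group_size)):
        # Cycle through the outputs; after the last one, start again at the first.
        outputs[i % n_outputs].extend(group)
    return outputs


# Integers 1..500 stand in for reads (assumption for illustration only).
reads = list(range(1, 501))
chunks = split_round_robin(reads, n_outputs=3, group_size=100)

for idx, chunk in enumerate(chunks, start=1):
    print(f"output{idx}: {len(chunk)} reads, first={chunk[0]}, last={chunk[-1]}")
```

With 500 reads and three outputs, output1 receives reads 1-100 and 301-400, output2 receives reads 101-200 and 401-500, and output3 receives reads 201-300, matching the example above.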

For documentation and manuals visit our github.io page: https://biopet.github.io/fastq-splitter

package biopet-fastqsplitter

Versions:

0.1-4, 0.1-3, 0.1-2, 0.1-1

Depends:
  • openjdk >=8,<9

  • python

Installation

You need a conda-compatible package manager (currently pixi, conda, or micromamba) with the Bioconda channel already set up (see Usage). Below, we show how to install with pixi or conda; for micromamba and mamba, the commands are essentially the same as with conda.

Pixi

With pixi installed and the Bioconda channel set up (see Usage), to install globally, run:

pixi global install biopet-fastqsplitter

To add it to an existing workspace instead, run:

pixi add biopet-fastqsplitter

In the latter case, make sure to first add bioconda and conda-forge to the channels considered by the workspace:

pixi workspace channel add conda-forge
pixi workspace channel add bioconda

Conda

With conda installed and the Bioconda channel set up (see Usage), to install into an existing and activated environment, run:

conda install biopet-fastqsplitter

Alternatively, to install into a new environment, run:

conda create -n envname biopet-fastqsplitter

with envname being the name of the desired environment.

Container

Alternatively, every Bioconda package is available as a container image for use with your preferred container runtime. For example, with Docker, run:

docker pull quay.io/biocontainers/biopet-fastqsplitter:<tag>

(see biopet-fastqsplitter/tags for valid values for <tag>).

Integrated deployment

Finally, note that many scientific workflow management systems directly integrate both conda-based and container-based software deployment. Workflow steps can therefore often be annotated directly to use the package, so that the workflow management system deploys it automatically, which improves reproducibility and transparency. Check the documentation of your workflow management system for details on this integration.

Notes

biopet-fastqsplitter is a Java program that comes with a custom wrapper shell script. The command that runs the program is 'biopet-fastqsplitter'. The wrapper sets no default Java options. To set memory options yourself, specify them directly after the command, for example 'biopet-fastqsplitter -Xms512m -Xmx1g'. If _JAVA_OPTIONS is set globally, it takes precedence.
