sync_reads - resynchronize paired FASTQ files
sync_reads [options] --fwd left_reads --rev right_reads
sync_reads re-synchronizes two FASTQ files containing paired reads that are no longer in sync because reads were removed individually during pre-processing (trimming, filtering, etc.). Here, "in sync" means that both files contain the same number of reads and that, at any given read position, the corresponding reads in the two files form a proper pair. The resulting files contain matching reads in order (assuming the input files were properly ordered). Unpaired reads can optionally be written to separate files. Memory usage depends not on the input file size but on the maximum distance between paired reads in the two files, because the read cache is flushed each time a pair is identified. In the worst-case scenario (one file has a single read that pairs with the last read in the other file), memory usage can approach the size of the larger file, but in typical usage it rarely exceeds a few MB regardless of file size.
IMPORTANT: Reads in input files MUST be in the same order, aside from missing reads, or the output will report many valid pairs as singletons.
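The caching behavior described above can be illustrated with a minimal Python sketch. This is not the program's actual implementation; it is one way to realize the same idea under stated assumptions. The function names (read_fastq, sync_pairs), the event tuples, and the rule for matching read names (header text up to the first whitespace, minus a trailing /1 or /2) are all hypothetical choices made for illustration.

```python
import io


def read_fastq(handle):
    """Yield (name, record) tuples, assuming strict four-line FASTQ records."""
    while True:
        lines = [handle.readline() for _ in range(4)]
        if not lines[0]:
            return
        # ASSUMPTION: mates share the header text up to the first whitespace,
        # ignoring a trailing /1 or /2 tag. The real tool may match differently.
        name = lines[0][1:].split()[0]
        if name.endswith(("/1", "/2")):
            name = name[:-2]
        yield name, "".join(lines)


def _flush_until(cache, name):
    """Pop cached records from the front up to and including `name`."""
    skipped = []
    for key in list(cache):
        record = cache.pop(key)
        if key == name:
            return skipped, record
        skipped.append(record)
    raise KeyError(name)


def sync_pairs(fwd, rev):
    """Yield ('pair', f, r), ('fwd_single', f) or ('rev_single', r) events."""
    cache = {"fwd": {}, "rev": {}}  # dicts preserve insertion (file) order
    readers = {"fwd": read_fastq(fwd), "rev": read_fastq(rev)}
    active = ["fwd", "rev"]
    while active:
        # Alternate between the files, one read at a time.
        for side in list(active):
            other = "rev" if side == "fwd" else "fwd"
            item = next(readers[side], None)
            if item is None:
                active.remove(side)
                continue
            name, record = item
            if name not in cache[other]:
                cache[side][name] = record
                continue
            # Pair found: cached reads that precede it cannot pair any more
            # (given ordered input), so they are flushed as singletons.
            skipped, mate = _flush_until(cache[other], name)
            for single in skipped:
                yield (other + "_single", single)
            for single in cache[side].values():
                yield (side + "_single", single)
            cache[side].clear()
            if side == "fwd":
                yield ("pair", record, mate)
            else:
                yield ("pair", mate, record)
    # Anything still cached at end of input never found a mate.
    for side in ("fwd", "rev"):
        for single in cache[side].values():
            yield (side + "_single", single)


if __name__ == "__main__":
    def fastq(names, tag):
        text = "".join(f"@{n}/{tag}\nACGT\n+\nIIII\n" for n in names)
        return io.StringIO(text)

    # Forward file lost read C; reverse file lost read A.
    for event in sync_pairs(fastq("ABD", 1), fastq("BCD", 2)):
        print(event[0], [rec.split()[0] for rec in event[1:]])
```

In this sketch the caches only ever hold reads seen since the last identified pair, which is why memory use tracks the distance between mates rather than file size. It also shows why ordering matters: if the inputs are not in the same order, mates are flushed from the cache before they can be matched and end up reported as singletons.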
Specify FASTQ file containing the first of the trimmed read pairs
Specify FASTQ file containing the second of the trimmed read pairs
Specify output name for synced forward reads
Specify output name for synced reverse reads
Specify output name for forward singleton reads
Specify output name for reverse singleton reads
Specify suffix to add to synced read output files. This will be added to the input file name before the final suffix (i.e. after the last period). Default is 'sync'.
Specify type of compression for output files (will compress all output files)
If given, unpaired reads will be written to separate output files. Default is FALSE.
Specify suffix to add to singleton read output files. This will be added to the input file name before the final suffix (i.e. after the last period). Default is 'singles'.
Display this usage page
Print version information
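The default output naming described for the two suffix options above can be sketched as follows. This is an assumed reading of that description, not the program's actual code: the helper name add_suffix is hypothetical, and the handling of names without a period or with compound extensions is a guess.

```python
import os


def add_suffix(path, suffix):
    """Insert `suffix` before the final extension of `path`.

    ASSUMPTION: 'reads.fq' with suffix 'sync' becomes 'reads.sync.fq'.
    Behavior for names without a period is a guess, not documented.
    """
    base, ext = os.path.splitext(path)
    return f"{base}.{suffix}{ext}"


if __name__ == "__main__":
    print(add_suffix("sample_R1.fq", "sync"))
    print(add_suffix("sample_R1.fq", "singles"))
```

Under this reading, an input named sample_R1.fq would yield sample_R1.sync.fq for synced reads and sample_R1.singles.fq for singletons unless explicit output names are given.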
Currently, no validation is performed on the input files. Files are assumed to be in standard FASTQ format, with each read represented by exactly four lines and no extraneous information present. CRITICALLY, they are also assumed to be in the same order after accounting for deleted reads (the software will fail miserably if this is not the case).
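Because the program itself does no validation, it may be worth checking inputs beforehand. The following Python sketch checks only the four-line layout assumed above; it is a hypothetical helper (check_fastq is not part of this distribution) and does not verify that the two files share the same read order.

```python
def check_fastq(handle):
    """Return the record count, or raise ValueError at the first bad record.

    Checks only the strict four-line layout: '@' header, sequence,
    '+' separator, and a quality line as long as the sequence.
    """
    count = 0
    while True:
        header = handle.readline()
        if not header:
            return count
        seq, plus, qual = [handle.readline() for _ in range(3)]
        count += 1
        if not qual:
            raise ValueError(f"record {count}: fewer than four lines")
        if not header.startswith("@"):
            raise ValueError(f"record {count}: header does not start with '@'")
        if not plus.startswith("+"):
            raise ValueError(f"record {count}: third line does not start with '+'")
        if len(seq.rstrip("\r\n")) != len(qual.rstrip("\r\n")):
            raise ValueError(f"record {count}: sequence and quality lengths differ")


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as fh:
        print(check_fastq(fh), "records look well-formed")
```

Multi-line FASTQ records, blank lines, and trailing comments would all be rejected by this check, which matches the strict format the program assumes.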
Please submit bug reports to the issue tracker in the distribution repository.
Jeremy Volkening (jeremy.volkening@base2bio.com)
Copyright 2014-23 Jeremy Volkening
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.