summarize_assembly - print basic summary info for a file of assembly scaffolds
summarize_assembly [--cutoffs cutoff_1 cutoff_2 .. cutoff_N] [--strip_N] [--split_N] [--fasta input_file]
This program takes a FASTA file and optionally a list of cutoff values as input and prints out summary information about the contigs/scaffolds contained in the file. You can, of course, supply a FASTA file of any sort of nucleic acid sequences, but the summary information makes most sense for contigs from genomic sequencing assemblies.
--fasta input_file: Contig/scaffold file from which to read input (default: STDIN)
--cutoffs cutoff_1 cutoff_2 .. cutoff_N: Space-separated list of integer cutoffs to calculate (e.g. '--cutoffs 50 90' will output N50 and N90 values) (default: 50)
--strip_N: If specified, Ns will be stripped from scaffold sequences before statistics are calculated (default: FALSE)
--split_N: If specified, scaffold sequences will be split at runs of one or more Ns before statistics are calculated (e.g. to get contig-level stats from a scaffold file). If this flag is specified, '--strip_N' is ignored. (default: FALSE)
Display this usage page
Print version information
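The effect of the cutoff and N-handling options above can be illustrated with a short Python sketch. This is not the program's own code: the function names nx and prepare are hypothetical, the sketch assumes the common definition of Nx (the largest length L such that sequences of length L or longer contain at least x percent of all bases), and it assumes lowercase 'n' is treated like 'N'. The program itself may differ in these details.

```python
import re

def nx(lengths, cutoff):
    """Return the Nx value for a list of sequence lengths.

    Assumed definition: walk the lengths from longest to shortest and
    return the first length at which the running total reaches
    cutoff percent of all bases.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 100 >= total * cutoff:
            return length
    return 0  # no sequences supplied

def prepare(seqs, strip_n=False, split_n=False):
    """Apply the documented N handling; splitting takes precedence."""
    if split_n:
        # Split at runs of one or more Ns and drop empty pieces.
        # Matching lowercase 'n' as well is an assumption of this sketch.
        return [piece for s in seqs for piece in re.split(r"[Nn]+", s) if piece]
    if strip_n:
        # Remove every N but keep each scaffold as a single sequence.
        return [re.sub(r"[Nn]", "", s) for s in seqs]
    return list(seqs)

if __name__ == "__main__":
    scaffolds = ["ACGTNNNNACGTAC", "ACGTACGTAC", "ACNGT"]  # toy data
    for label, opts in [
        ("as-is", {}),
        ("strip_N", {"strip_n": True}),
        ("split_N", {"split_n": True}),
    ]:
        lengths = [len(s) for s in prepare(scaffolds, **opts)]
        print(label, "N50:", nx(lengths, 50), "N90:", nx(lengths, 90))
```

With these three toy scaffolds, N50 is 10 for the unmodified input and 6 after splitting, which shows why contig-level statistics from a scaffold file are usually smaller than scaffold-level ones.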
Please submit bug reports to the issue tracker in the distribution repository.
Jeremy Volkening (jeremy.volkening@base2bio.com)
Copyright 2014-23 Jeremy Volkening
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.