<<

NAME

ncbi_blast_update - intelligent management of updates to local BLAST databases from remote servers

SYNOPSIS

ncbi_blast_update [options] --db db1,db2,etc --local_path path/to/local/db

OPTIONS

--server string

FQDN of FTP server to use

--remote_path string

Absolute path on the FTP server to the BLAST database files

--local_path

Full path to the local directory where BLAST database files are stored. The SQLite tracking database is also created in this directory if it does not already exist. The user running the update must have full read/write permissions on this directory.

--passive

Use passive FTP. This is often necessary when downloading from behind a firewall (default: TRUE).

--timeout integer

Set FTP timeout, in seconds (default: 600)

--attempts integer

Number of times to attempt a download before giving up (default: 3)

--list

Don't attempt any downloads - just query the remote server and print a list of all databases available for download.

--clean

Create a fresh SQLite database before commencing download (overwriting the existing database file if necessary). Use with caution: this option wipes out the download history and forces a new download of all requested databases. It does not delete BLAST files on disk, although deleting them before running this command is recommended to keep things clean and synchronized.

--db string

Comma-separated list of database names to check/update. Example: 'nt,nr'

--chmod

After updating database, make files world-readable.

--syslog

Send status and error messages to the syslog daemon, if running

--verbose

Print additional warnings and status messages to STDERR

--version

Print software name, version, and license info and exit

DESCRIPTION

This program handles updating and tracking of currently installed preformatted NCBI BLAST databases. It tracks local versions using SQLite and compares the MD5 sums of remote files against records of previous downloads, downloading only those database files whose MD5 sums have changed. In theory this makes incremental updates possible, although experience suggests that NCBI changes every file in a database during an update, so all MD5 sums differ and all database files are downloaded again. New downloads are first saved to a temporary directory, and no existing files are removed from disk until all previous steps have completed successfully, which makes the program relatively tolerant of errors during download or decompression.
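The MD5 comparison described above can be illustrated with a minimal Python sketch. This is not the program's actual code: the table name "downloads", its columns, the function name, and the file names and checksums are all hypothetical, chosen only to show the idea of downloading a file when its remote MD5 sum differs from the recorded one or when no record exists.

```python
import sqlite3


def files_needing_download(conn, remote_md5s):
    """Return the sorted names of remote files whose MD5 sum is new or changed.

    remote_md5s maps file name -> MD5 hex digest reported by the server.
    The 'downloads' table is a hypothetical stand-in for the tracking database.
    """
    recorded = dict(conn.execute("SELECT filename, md5 FROM downloads"))
    return sorted(
        name for name, md5 in remote_md5s.items()
        if recorded.get(name) != md5  # missing record or different checksum
    )


# Hypothetical tracking database holding the results of a previous run.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE downloads (filename TEXT PRIMARY KEY, md5 TEXT)")
conn.executemany(
    "INSERT INTO downloads VALUES (?, ?)",
    [("nt.00.tar.gz", "aaa"), ("nt.01.tar.gz", "bbb")],
)

# Hypothetical checksums currently advertised by the remote server.
remote_md5s = {
    "nt.00.tar.gz": "aaa",  # unchanged, skipped
    "nt.01.tar.gz": "ccc",  # changed, downloaded again
    "nt.02.tar.gz": "ddd",  # never seen before, downloaded
}

to_fetch = files_needing_download(conn, remote_md5s)
print(to_fetch)
```

In this sketch only the changed and the new file are selected; if every checksum changes during an update, as the description notes is typical, every file is selected.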

The suggested way to use this script is as a cron job (daily, weekly, monthly, etc., as desired). For this use case, syslog logging (--syslog) is provided so that tracking of success or failure can be integrated into standard system monitoring and reporting.
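For scheduled runs, one possible approach is a small wrapper that assembles the command line from the options documented above. The sketch below is only an illustration: the function name and the path /data/blast are hypothetical, and the wrapper itself is an assumption rather than part of the software. It uses only the --db, --local_path, and --syslog options listed in this document.

```python
import shlex


def build_update_command(dbs, local_path, use_syslog=True):
    """Assemble an ncbi_blast_update argument list from documented options."""
    cmd = [
        "ncbi_blast_update",
        "--db", ",".join(dbs),       # comma-separated list, e.g. nt,nr
        "--local_path", local_path,  # directory holding the BLAST files
    ]
    if use_syslog:
        cmd.append("--syslog")       # report status to the syslog daemon
    return cmd


# Hypothetical local directory; adjust to the real installation.
cmd = build_update_command(["nt", "nr"], "/data/blast")
print(shlex.join(cmd))
```

The resulting list could be passed to subprocess.run from a script that cron invokes, or the printed line could be placed directly in a crontab entry.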

CAVEATS

Currently, the software does not make much effort to handle orphaned files (e.g. files in the local BLAST directory that, for whatever reason, are not tracked in the current SQLite database). This can be convenient, since it allows non-NCBI databases to co-exist in the same directory without fear of being inadvertently removed. The only files the software will remove on its own are those that are currently listed in the SQLite database but no longer exist remotely.
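The removal rule described above amounts to simple set logic, sketched here in Python. This illustrates the stated behavior and is not the program's implementation; the function name and all file names are hypothetical.

```python
def plan_cleanup(local_files, tracked_files, remote_files):
    """Split local files into those to remove and those left untouched.

    A file is removed only if it is tracked in the database and no longer
    exists remotely. Untracked (orphaned) files are never removed.
    """
    local = set(local_files)
    tracked = set(tracked_files)
    remote = set(remote_files)
    to_remove = sorted((local & tracked) - remote)
    left_alone = sorted(local - tracked)
    return to_remove, left_alone


# Hypothetical directory contents, database records, and remote listing.
local_files = ["nt.00.nhr", "old.00.nhr", "custom.nhr"]
tracked_files = ["nt.00.nhr", "old.00.nhr"]
remote_files = ["nt.00.nhr"]

to_remove, left_alone = plan_cleanup(local_files, tracked_files, remote_files)
print(to_remove)   # tracked, but gone from the server
print(left_alone)  # never tracked, e.g. a non-NCBI database
```

Here the tracked file that has disappeared from the server is marked for removal, while the untracked custom database is left in place.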

AUTHOR

Jeremy Volkening (jeremy.volkening@base2bio.com)

COPYRIGHT AND LICENSE

Copyright 2014-2023 Jeremy Volkening

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

<<