List of Options

[Input/Output and -so]
  • -in or -i [XXX] Input FASTA file (accepted extensions: .fasta, .fna, .mfa, .fa, .txt)
  • -outdir or -out [XXX] Output directory. By default, its name begins with "Result_", followed by the input filename and the current date and time (day, month, year, hours, minutes, and seconds)
  • -LOG or -log Write LOG files (default: 0)
  • -keepAll or -keep Keep secondary folders/files (Prodigal/Prokka, Cas-finder, rawFASTA, Properties) (default: 0)
  • -HTML or -html Display results as a static HTML web page. The generated page (index.html) depends on the provided CSS file (supplementary_files/crispr.css).
  • -copyCSS [XXX] Copy the provided CSS file into the "Visualization" directory if option -HTML is set (default: 'supplementary_files/crispr.css')
  • -soFile or -so [XXX] Shared object file to use if it is not present in the current directory (default: 'sel392v2.so')
[Detection of CRISPR arrays]
  • -mismDRs or -md [XXX] Percentage of mismatches allowed between DRs (default: 20)
  • -truncDR or -t [XXX] Percentage of mismatches allowed for a truncated DR (default: 33.3)
  • -minDR or -mr [XXX] Minimum DR size (default: 23)
  • -maxDR or -xr [XXX] Maximum DR size (default: 55)
  • -minSP or -ms [XXX] Minimum spacer size (default: 25)
  • -maxSP or -xs [XXX] Maximum spacer size (default: 60)
  • -noMism or -n Do not allow mismatches (the default value is 1 when this option is not given, i.e. mismatches are allowed by default)
  • -percSPmin or -pm [XXX] Minimum spacer size as a function of DR size (default: 0.6)
  • -percSPmax or -px [XXX] Maximum spacer size as a function of DR size (default: 2.5)
  • -spSim or -s [XXX] Maximum percentage of similarity allowed between spacers (default: 60)
  • -DBcrispr or -dbc [XXX] CSV file of all CRISPR candidates contained in CRISPRdb (from the last update) (default: 'supplementary_files/CRISPR_crisprdb.csv')
  • -repeats or -rpts [XXX] List of consensus repeats generated by CRISPRdb, used to assign IDs and occurrences (default: 'supplementary_files/Repeat_List.csv')
  • -DIRrepeat or -drpt [XXX] File containing repeat IDs and orientations according to CRISPRDirection (default: 'supplementary_files/repeatDirection.tsv')
  • -flank or -fl [XXX] Size of the flanking regions, in base pairs (bp), for each analyzed CRISPR array (default: 100)
  • -levelMin or -lMin [XXX] Minimum evidence level of the CRISPR arrays to display (default: 1)
[Detection of Cas clusters]
  • -cas or -cs Search for the corresponding Cas genes using Prokka (default kingdom: 'Bacteria') and MacSyFinder (default: 0)
  • -ccvRep or -ccvr Write the CRISPR-Cas vicinity report (CRISPRs and Cas) if option -cas is set (default: 0)
  • -vicinity or -vi [XXX] Number of nucleotides separating a CRISPR array from its neighboring Cas system (default: 600)
  • -CASFinder or -cf [XXX] Directory containing the new CasFinder provided by Institut Pasteur (default: 'CasFinder-2.0')
  • -cpuMacSyFinder or -cpuM [XXX] Number of CPUs to use for MacSyFinder (default: 1)
  • -rcfowce Run CasFinder only when at least one CRISPR exists (default: 0) (set if -cas is set)
  • -definition or -def [XXX] Stringency of the Cas-finder definition (if option -cas is set) (default: 'General' or 'G'). Other allowed values are 'Typing' (or 'T') and 'SubTyping' (or 'S'). For more information, see the MacSyFinder documentation
  • -gffAnnot or -gff [XXX] User-provided GFF annotation file (if options -cas and -faa are set) (default: '')
  • -proteome or -faa [XXX] User-provided proteome file ('.faa') (if options -cas and -gff are set) (default: '')
  • -cluster or -ccc [XXX] Group CRISPR arrays or Cas systems into clusters given a threshold, e.g. 20000 bp (default: 0)
  • -getSummaryCasfinder or -gscf Get the summary file of Cas-finder (MacSyFinder) and copy it to the TSV directory (default: 0)
  • -geneticCode or -gcode [XXX] Genetic code (translation table) used for CDS annotation (default: 11)
[Use Prokka instead of Prodigal (the default)] Prokka (https://github.com/tseemann/prokka) must be installed to use the following options
  • -useProkka or -prokka Use Prokka instead of Prodigal (default: 0)
  • -cpuProkka or -cpuP [XXX] Number of CPUs to use for Prokka (default: 1)
  • -metagenome or -meta Improve the analysis of metagenomes with Prokka (default: '')
  • -ArchaCas or -ac Same as -cas, but using 'Archaea' as the default kingdom instead of 'Bacteria' (default: 0). To be used together with -prokka.

Options that expect a parameter (filename, text, or number) are followed by the symbol "[XXX]". All other options can be treated as booleans (yes or no, 1 or 0).
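
The distinction between "[XXX]" options and boolean flags can be sketched in a small wrapper that assembles a command line. This is only an illustrative Python helper, not part of CRISPRCasFinder: the function name build_command and its keyword convention (True for a boolean flag, any other value for an option that takes a parameter) are assumptions made for this example. Only option names listed above are used in the usage line.

```python
def build_command(fasta, script="CRISPRCasFinder.pl", **options):
    """Return an argument list for the script (hypothetical helper).

    True  -> boolean flag, emitted alone (e.g. -cas)
    other -> "[XXX]" option, emitted as flag followed by its value
    False/None -> option omitted
    """
    cmd = ["perl", script, "-in", fasta]
    for name, value in options.items():
        if value is False or value is None:
            continue  # option not requested
        cmd.append(f"-{name}")
        if value is not True:
            cmd.append(str(value))  # parameter expected after the flag
    return cmd


if __name__ == "__main__":
    args = build_command("multifasta.fna", cas=True, cf="CasFinder-2.0",
                         ccc=20000, out="Results_with_clusters")
    print(" ".join(args))
```

The resulting list can be passed to subprocess.run as is; keeping the arguments as a list avoids shell quoting issues with paths that contain spaces.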

Command-line examples

(1) The minimal command line:

perl CRISPRCasFinder.pl sequence.fasta
In this example, the result folder will be named "Result_sequence_DD_MM_YYYY_h_m_s", where "DD_MM_YYYY_h_m_s" represents the current date (day, month, and year) and time (hours, minutes, and seconds).
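
To locate the result folder from a script, the naming pattern above can be reproduced approximately. The Python sketch below assumes that each date and time field is zero-padded to two digits (four for the year) and that the input filename is used without its extension; the pattern in this README does not state the padding, so check an actual run before relying on it. The function name default_result_dir is hypothetical.

```python
from datetime import datetime
from pathlib import Path


def default_result_dir(input_path, now=None):
    """Build a name of the form Result_<input stem>_DD_MM_YYYY_h_m_s.

    Assumption: fields are zero-padded; the real tool may format them differently.
    """
    now = now or datetime.now()
    stem = Path(input_path).stem  # "sequence.fasta" -> "sequence"
    stamp = now.strftime("%d_%m_%Y_%H_%M_%S")
    return f"Result_{stem}_{stamp}"


if __name__ == "__main__":
    # Fixed timestamp so the output is reproducible.
    print(default_result_dir("sequence.fasta", datetime(2024, 3, 9, 14, 5, 7)))
    # prints Result_sequence_09_03_2024_14_05_07
```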

(2) Setting the detection parameters explicitly (the values shown here are the defaults):

perl CRISPRCasFinder.pl -in sequence.fasta -md 20 -t 33.3 -mr 23 -xr 55 -ms 25 -xs 60 -pm 0.6 -px 2.5 -s 60
The result folder is named following the same pattern as in (1).
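
Example (2) passes both absolute spacer bounds (-ms, -xs) and bounds relative to the DR size (-pm, -px). The Python sketch below shows one way these four values could be combined. It assumes that -pm and -px are multipliers of the DR length and that a spacer must satisfy the absolute and the relative bounds at the same time; this README does not spell out how the program combines them, so treat the rule as an assumption. The function name spacer_size_ok is hypothetical.

```python
def spacer_size_ok(spacer_len, dr_len, min_sp=25, max_sp=60,
                   perc_sp_min=0.6, perc_sp_max=2.5):
    """Return True if a spacer length passes all four size bounds.

    Assumed rule (not confirmed by the option list):
      min_sp <= spacer_len <= max_sp
      perc_sp_min * dr_len <= spacer_len <= perc_sp_max * dr_len
    """
    absolute_ok = min_sp <= spacer_len <= max_sp
    relative_ok = perc_sp_min * dr_len <= spacer_len <= perc_sp_max * dr_len
    return absolute_ok and relative_ok


if __name__ == "__main__":
    print(spacer_size_ok(32, 30))  # within both ranges
    print(spacer_size_ok(20, 30))  # shorter than the absolute minimum of 25
    print(spacer_size_ok(58, 23))  # longer than 2.5 x 23 = 57.5
```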

(3) Specifying the input data files explicitly (the paths shown here are the defaults):

perl CRISPRCasFinder.pl -in multifasta.fna -drpt supplementary_files/repeatDirection.tsv -rpts supplementary_files/Repeat_List.csv -cas -ccvr -dbc supplementary_files/CRISPR_crisprdb.csv -cf CasFinder-2.0 -html

(4) Improving performance:

perl CRISPRCasFinder.pl -in multifasta.fna -cas -fr -cf CasFinder-2.0 -rcfowce -log -out Results_multifasta -cpuMacSyFinder 8

(5) Grouping CRISPR arrays and/or Cas systems into clusters whose elements are at most 20000 bp apart:

perl CRISPRCasFinder.pl -in multifasta.fna -cas -cf CasFinder-2.0 -ccc 20000 -def SubTyping -out Results_with_clusters_and_Cas_subtyping_level
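
The idea behind the -ccc threshold can be sketched as a simple gap-based grouping. The Python example below is not the program's implementation: it assumes that elements lie on a single sequence, that the distance is measured from the end of the cluster so far to the start of the next element, and that a new cluster begins whenever this gap exceeds the threshold. The function name cluster_by_distance and the coordinates are hypothetical.

```python
def cluster_by_distance(elements, threshold=20000):
    """Group (name, start, end) elements into clusters of nearby elements.

    Assumed rule: after sorting by start, an element joins the current
    cluster if its start is at most `threshold` bp after the furthest end
    seen in that cluster; otherwise it opens a new cluster.
    """
    clusters = []
    prev_end = None
    for name, start, end in sorted(elements, key=lambda e: e[1]):
        if prev_end is not None and start - prev_end <= threshold:
            clusters[-1].append(name)
            prev_end = max(prev_end, end)
        else:
            clusters.append([name])
            prev_end = end
    return clusters


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    found = [("CRISPR_1", 1000, 1500),
             ("Cas_1", 9000, 15000),
             ("CRISPR_2", 60000, 60400)]
    print(cluster_by_distance(found))
    # prints [['CRISPR_1', 'Cas_1'], ['CRISPR_2']]
```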

(6) Providing proteome and annotation (GFF) files for searching Cas genes:

perl CRISPRCasFinder.pl -in sequence.fasta -cas -cf CasFinder-2.0 -gff path/to/sequence.gff -proteome path/to/sequence.faa -keep

(7) Displaying the help documentation:

perl CRISPRCasFinder.pl -help

(8) Displaying the current version of the program:

perl CRISPRCasFinder.pl -v