List of Options
[Input/Output and -so]
- -in or -i [XXX] Input FASTA file (with extensions: .fasta, .fna, .mfa, .fa, .txt)
- -outdir or -out [XXX] Output directory. By default, its name begins with "Result_", followed by the input filename and the current date and time (day, month, year, hours, minutes, and seconds)
- -LOG or -log Write LOG files (default: 0)
- -keepAll or -keep Keep secondary folders/files (Prodigal/Prokka, Cas-finder, rawFASTA, Properties) (default: 0)
- -HTML or -html Display results as a static HTML web page. The generated page (index.html) depends on the provided CSS file (supplementary_files/crispr.css).
- -copyCSS [XXX] Copy the provided CSS file into the "Visualization" directory if option -HTML is set (default: 'supplementary_files/crispr.css')
- -soFile or -so [XXX] Shared object file to use if it is not present in the current directory (default: 'sel392v2.so')
- -mismDRs or -md [XXX] Percentage of mismatches allowed between direct repeats (DRs) (default: 20)
- -truncDR or -t [XXX] Percentage of mismatches allowed for a truncated DR (default: 33.3)
- -minDR or -mr [XXX] Minimal size of DRs (default: 23)
- -maxDR or -xr [XXX] Maximal size of DRs (default: 55)
- -minSP or -ms [XXX] Minimal size of Spacers (default: 25)
- -maxSP or -xs [XXX] Maximal size of Spacers (default: 60)
- -noMism or -n Do not allow mismatches (the default value is 1 when this option is not given, i.e. mismatches are allowed by default)
- -percSPmin or -pm [XXX] Minimal spacer size as a function of DR size (default: 0.6)
- -percSPmax or -px [XXX] Maximal spacer size as a function of DR size (default: 2.5)
- -spSim or -s [XXX] Maximal percentage of similarity allowed between spacers (default: 60)
- -DBcrispr or -dbc [XXX] CSV file of all CRISPR candidates contained in CRISPRdb (from its last update) (default: 'supplementary_files/CRISPR_crisprdb.csv')
- -repeats or -rpts [XXX] Consensus repeats list generated by CRISPRdb, used to assign IDs and occurrences (default: 'supplementary_files/Repeat_List.csv')
- -DIRrepeat or -drpt [XXX] File containing repeat IDs and orientations according to CRISPRDirection (default: 'supplementary_files/repeatDirection.tsv')
- -flank or -fl [XXX] Size of the flanking regions, in base pairs (bp), for each analyzed CRISPR array (default: 100)
- -levelMin or -lMin [XXX] Minimum evidence level of the CRISPR arrays to display (default: 1)
- -cas or -cs Search for the corresponding Cas genes using Prokka (default kingdom: 'Bacteria') and MacSyFinder (default: 0)
- -ccvRep or -ccvr Write the CRISPR-Cas vicinity report (CRISPRs and Cas) if option -cas is set (default: 0)
- -vicinity or -vi [XXX] Number of nucleotides separating a CRISPR array from its neighboring Cas system (default: 600)
- -CASFinder or -cf [XXX] Directory containing the new CasFinder provided by Institut Pasteur (default: 'CasFinder-2.0')
- -cpuMacSyFinder or -cpuM [XXX] Number of CPUs to use for MacSyFinder (default: 1)
- -rcfowce Run CasFinder only when at least one CRISPR exists (default: 0) (set if -cas is set)
- -definition or -def [XXX] Stringency of the Cas-finder definition, if option -cas is set (default: 'General' or 'G'). Other allowed values are 'Typing' (or 'T') and 'SubTyping' (or 'S'). For more information, please see the MacSyFinder documentation
- -gffAnnot or -gff [XXX] Annotation GFF file provided by the user (if options -cas and -faa are set) (default: '')
- -proteome or -faa [XXX] Proteome file ('.faa') provided by the user (if options -cas and -gff are set) (default: '')
- -cluster or -ccc [XXX] Group CRISPR arrays or Cas systems into clusters given a distance threshold, e.g. 20000 bp (default: 0)
- -getSummaryCasfinder or -gscf Get the summary file of Cas-finder (MacSyFinder) and copy it to the TSV directory (default: 0)
- -geneticCode or -gcode [XXX] Genetic code (translation table) used for CDS annotation (default: 11)
- -useProkka or -prokka Use Prokka instead of Prodigal (default: 0)
- -cpuProkka or -cpuP [XXX] Number of CPUs to use for Prokka (default: 1)
- -metagenome or -meta Improve the analysis of metagenomes with Prokka (default: '')
- -ArchaCas or -ac Same as -cas, but with 'Archaea' as the default kingdom instead of 'Bacteria' (default: 0). To be used if -prokka is used.
Options that expect a parameter (filename, text, or number) are followed by the symbol "[XXX]". The other options can be treated as booleans (yes or no, 1 or 0).
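The distinction between "[XXX]" options and boolean switches matters when a command line is assembled by a script. The following is a minimal Python sketch of that idea, not part of CRISPRCasFinder: the helper `build_command` and the two sets are hypothetical, and they only cover a subset of the short option names listed above.

```python
import shlex

# Options documented with "[XXX]": they must be followed by a value.
# Subset copied from the list above; not exhaustive.
VALUED = {"-in", "-out", "-md", "-t", "-mr", "-xr", "-ms", "-xs",
          "-pm", "-px", "-s", "-fl", "-vi", "-cf", "-def", "-ccc", "-gcode"}
# Options documented without "[XXX]": plain switches.
SWITCHES = {"-log", "-keep", "-html", "-n", "-cas", "-ccvr", "-rcfowce",
            "-gscf", "-prokka", "-ac"}


def build_command(options, script="CRISPRCasFinder.pl"):
    """Return an argv list: valued options become 'flag value', switches 'flag'."""
    argv = ["perl", script]
    for flag, value in options.items():
        if flag in VALUED:
            if value is None or isinstance(value, bool):
                raise ValueError(f"{flag} expects a parameter")
            argv += [flag, str(value)]
        elif flag in SWITCHES:
            if value:  # a falsy value means "leave the switch off"
                argv.append(flag)
        else:
            raise ValueError(f"unknown option: {flag}")
    return argv


cmd = build_command({"-in": "sequence.fasta", "-cas": True, "-log": False, "-md": 20})
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` as is; `shlex.join` is only used here to show the command as it would be typed in a shell.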
Examples of command line
(1) The minimal command line:
perl CRISPRCasFinder.pl sequence.fasta
In this example, the results are written to a directory named "Result_sequence_DD_MM_YYYY_h_m_s", where "DD_MM_YYYY_h_m_s" represents the current date (day, month, and year) and time (hours, minutes, and seconds).
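Because the default directory name embeds the run time, scripts that post-process results usually have to search for it. Here is a minimal Python sketch for locating such directories; the helpers `result_dir_regex` and `find_result_dirs` are hypothetical, and the sketch assumes that each date/time field consists of digits only and that the input name is used without its extension, as in the example above.

```python
import re
from pathlib import Path


def result_dir_regex(input_file):
    """Regex for 'Result_<input name>_DD_MM_YYYY_h_m_s' as described above.

    Assumption: every date/time field is one or more digits (the year four);
    whether the tool zero-pads the fields is not stated in this document.
    """
    stem = Path(input_file).stem  # 'sequence.fasta' -> 'sequence'
    return re.compile(rf"Result_{re.escape(stem)}_\d+_\d+_\d{{4}}_\d+_\d+_\d+")


def find_result_dirs(input_file, parent="."):
    """List directories under `parent` whose names match the default pattern."""
    pattern = result_dir_regex(input_file)
    return sorted(p for p in Path(parent).iterdir()
                  if p.is_dir() and pattern.fullmatch(p.name))


# Illustrative directory name, not the output of a real run.
name = "Result_sequence_05_03_2024_14_07_09"
print(bool(result_dir_regex("sequence.fasta").fullmatch(name)))
```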
(2) Changing some default parameters:
perl CRISPRCasFinder.pl -in sequence.fasta -md 20 -t 33.3 -mr 23 -xr 55 -ms 25 -xs 60 -pm 0.6 -px 2.5 -s 60
The result folder will be the same as in (1).
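The -pm and -px values in command (2) are expressed relative to the DR size, whereas -ms and -xs are absolute sizes. The Python sketch below works through that arithmetic with the default values. It reads "in function of DR size" as a multiplier of the DR length, and it assumes that a spacer must satisfy both the absolute and the relative limits; how CRISPRCasFinder actually combines them is not stated here. Both function names are hypothetical.

```python
def relative_spacer_bounds(dr_length, perc_min=0.6, perc_max=2.5):
    """Spacer size range implied by -pm / -px for a DR of `dr_length` bp."""
    return perc_min * dr_length, perc_max * dr_length


def spacer_size_ok(spacer_length, dr_length,
                   min_sp=25, max_sp=60, perc_min=0.6, perc_max=2.5):
    """True if the spacer respects -ms/-xs and -pm/-px at the same time.

    Assumption: both pairs of limits must hold simultaneously.
    """
    low, high = relative_spacer_bounds(dr_length, perc_min, perc_max)
    return max(min_sp, low) <= spacer_length <= min(max_sp, high)


for dr in (23, 30, 55):
    low, high = relative_spacer_bounds(dr)
    print(f"DR of {dr} bp: relative spacer range {low:.1f} to {high:.1f} bp")
```

For instance, with a 30 bp DR the relative range is wider than the absolute one (25 to 60 bp), so under the stated assumption the absolute limits are the binding ones.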
(3) Changing some default input data:
perl CRISPRCasFinder.pl -in multifasta.fna -drpt supplementary_files/repeatDirection.tsv -rpts supplementary_files/Repeat_List.csv -cas -ccvr -dbc supplementary_files/CRISPR_crisprdb.csv -cf CasFinder-2.0 -html
(4) Improving performance:
perl CRISPRCasFinder.pl -in multifasta.fna -cas -fr -cf CasFinder-2.0 -rcfowce -log -out Results_multifasta -cpuMacSyFinder 8
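Command (4) passes a fixed value of 8 to -cpuMacSyFinder. When the same script runs on machines of different sizes, the value can be capped at the number of cores that are actually available. This is a minimal Python sketch under that assumption; the helper `macsyfinder_cpu_args` is hypothetical and is not part of CRISPRCasFinder.

```python
import os


def macsyfinder_cpu_args(requested=8):
    """Return ['-cpuMacSyFinder', N], with N capped at the detected core count."""
    available = os.cpu_count() or 1  # os.cpu_count() may return None
    return ["-cpuMacSyFinder", str(max(1, min(requested, available)))]


print(macsyfinder_cpu_args())
```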
(5) Grouping CRISPR arrays and/or Cas systems into clusters whose elements are at most 20000 bp apart:
perl CRISPRCasFinder.pl -in multifasta.fna -cas -cf CasFinder-2.0 -ccc 20000 -def SubTyping -out Results_with_clusters_and_Cas_subtyping_level
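To illustrate what a 20000 bp threshold means in command (5), here is a minimal Python sketch of distance-based grouping on a single sequence. It is not CRISPRCasFinder's implementation: the function name, the coordinates, and the choice to measure the gap from the furthest end seen so far to the next start are all assumptions made for illustration.

```python
def cluster_by_distance(elements, threshold=20000):
    """Group (name, start, end) elements separated by at most `threshold` bp.

    Elements are scanned by increasing start; an element joins the current
    cluster when its start lies within `threshold` bp of the furthest end
    seen so far, otherwise it opens a new cluster.
    """
    clusters = []
    furthest_end = None
    for name, start, end in sorted(elements, key=lambda e: e[1]):
        if clusters and start - furthest_end <= threshold:
            clusters[-1].append(name)
            furthest_end = max(furthest_end, end)
        else:
            clusters.append([name])
            furthest_end = end
    return clusters


# Hypothetical coordinates, for illustration only.
elements = [
    ("CRISPR_1", 1_000, 1_800),
    ("Cas_1", 5_000, 12_000),
    ("CRISPR_2", 60_000, 60_900),
]
print(cluster_by_distance(elements))
# [['CRISPR_1', 'Cas_1'], ['CRISPR_2']]
```

CRISPR_1 and Cas_1 are 3200 bp apart and end up together, while CRISPR_2 starts 48000 bp after Cas_1 ends and forms its own cluster.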
(6) Providing proteome and annotation (GFF) files for searching Cas genes:
perl CRISPRCasFinder.pl -in sequence.fasta -cas -cf CasFinder-2.0 -gff path/to/sequence.gff -proteome path/to/sequence.faa -keep
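Command (6) combines -cas, -gff, and -proteome because the option list states that the GFF and proteome options are only meaningful together and with -cas. Such dependencies can be checked before launching a long run. The Python sketch below encodes only the dependencies stated in the option list; the tables and the helper `missing_dependencies` are hypothetical, and the sketch treats -ac as satisfying -cas since the list describes it as the same option with a different kingdom.

```python
# Each key is only meaningful when all listed options are also present
# (taken from the "if option ... is set" notes in the option list).
REQUIRES = {
    "-ccvr": {"-cas"},
    "-def": {"-cas"},
    "-gff": {"-cas", "-faa"},
    "-faa": {"-cas", "-gff"},
    "-copyCSS": {"-html"},
}

# Map alternative spellings to the names used in REQUIRES.
ALIASES = {
    "-cs": "-cas", "-ArchaCas": "-cas", "-ac": "-cas",
    "-ccvRep": "-ccvr", "-definition": "-def",
    "-gffAnnot": "-gff", "-proteome": "-faa", "-HTML": "-html",
}


def missing_dependencies(argv):
    """Return {option: [missing companion options]} for a list of arguments."""
    present = {ALIASES.get(arg, arg) for arg in argv if arg.startswith("-")}
    problems = {}
    for option, needed in REQUIRES.items():
        if option in present and not needed <= present:
            problems[option] = sorted(needed - present)
    return problems


example_6 = ["-in", "sequence.fasta", "-cas", "-cf", "CasFinder-2.0",
             "-gff", "path/to/sequence.gff",
             "-proteome", "path/to/sequence.faa", "-keep"]
print(missing_dependencies(example_6))                        # {}
print(missing_dependencies(["-in", "x.fasta", "-gff", "a.gff"]))
```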
(7) Displaying the help documentation:
perl CRISPRCasFinder.pl -help
(8) Displaying the current version of the program:
perl CRISPRCasFinder.pl -v