Note: a -h flag after a command (e.g. >structural_pangenome find-ngaps -h) generates a help message for said command.
Step 1: Ngap detection
Description | Report the stretches of N (Ngaps) in each assembly |
Command | haptic find-ngaps assembly_fasta_file -o output_file structural_pangenome find-ngaps assembly_fasta_file -o output_file OR haptic find-ngaps assembly_fasta_file -p prefix_folder structural_pangenome find-ngaps assembly_fasta_file -p prefix_folder |
Parameters | -o: output filename (with complete path). -p: prefix folder in which the resulting output file is stored. -t: optional threshold, the minimum recurrence of an Ngap sequence length for it to affect the statistics (e.g. 100). |
Input | Fasta file for each assembly (FILE) |
Output | Ngap.bed file for each assembly (OUTPUT) |
Usage | structural_pangenome find-ngaps [-h] (-p PREFIX | -o OUTPUT) [-t THRESHOLD] FILE |
Example | haptic find-ngaps fastas/assembly_1.fas -o outputs/resulting_ngaps.bed structural_pangenome find-ngaps fastas/assembly_1.fas -o outputs/resulting_ngaps.bed |
Can be Parallelized | Yes |
Optional | No |
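
To make the idea of an Ngap concrete, here is a minimal Python sketch of reporting runs of N as BED-style intervals (0-based start, exclusive end). It is not the implementation behind find-ngaps: the function names and the `min_length` filter are assumptions made for illustration, and `min_length` is not the same thing as the tool's -t threshold.

```python
import re
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (sequence_name, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the sequence name.
            name, chunks = (line[1:].split() or [""])[0], []
        elif line:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def find_ngaps(lines: Iterable[str], min_length: int = 1) -> List[Tuple[str, int, int]]:
    """Return (name, start, end) for every run of N at least min_length long.

    min_length is a hypothetical filter for this sketch only.
    """
    gaps = []
    for name, seq in read_fasta(lines):
        for match in re.finditer(r"[Nn]+", seq):
            if match.end() - match.start() >= min_length:
                gaps.append((name, match.start(), match.end()))
    return gaps


# Tiny in-memory assembly; a real run would pass an open FASTA file instead.
example = [">chr1 demo", "ACGTNNNNAC", "GTNNACGT", ">chr2", "NNNACGT"]
for name, start, end in find_ngaps(example):
    print(f"{name}\t{start}\t{end}")
```

The same function accepts an open file handle, so `find_ngaps(open("fastas/assembly_1.fas"))` would scan the assembly used in the example row above.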
...
Description | Indexing the assembly fasta file |
Command | haptic index assembly_fasta_file -o output_file structural_pangenome index assembly_fasta_file -o output_file OR haptic index assembly_fasta_file -p prefix_folder structural_pangenome index assembly_fasta_file -p prefix_folder |
Parameters | -o: output filename (with complete path). -p: prefix folder in which the resulting output file is stored. |
Input | Fasta file for each assembly (FILE) |
Output | Indexed fasta file for each assembly (OUTPUT) |
Usage | structural_pangenome index [-h] (-p PREFIX | -o OUTPUT) FILE |
Example | haptic index fastas/assembly_1.fas -p outputs/indexed_fasta structural_pangenome index fastas/assembly_1.fas -p outputs/indexed_fasta |
Can be Parallelized | Yes |
Optional | No |
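
Because this step handles each assembly independently, the per-assembly commands can be generated once and run concurrently. The Python sketch below follows the usage line above (`structural_pangenome index FILE -p PREFIX`). The helper names and the worker count are hypothetical, and it assumes that -p produces a distinct output file for each input, which should be checked against the -h output.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import List


def per_assembly_commands(subcommand: str, fastas: List[str], prefix: str,
                          tool: str = "structural_pangenome") -> List[List[str]]:
    """Build one argument list per assembly, following the documented usage."""
    return [[tool, subcommand, fasta, "-p", prefix] for fasta in fastas]


def run_all(commands: List[List[str]], workers: int = 4) -> List[int]:
    """Run the commands concurrently and return their exit codes.

    Requires the tool to be installed and on PATH; not called in this sketch.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda cmd: subprocess.run(cmd, check=False), commands))
    return [result.returncode for result in results]


# Hypothetical input files; only the command lines are printed here.
commands = per_assembly_commands(
    "index",
    ["fastas/assembly_1.fas", "fastas/assembly_2.fas"],
    "outputs/indexed_fasta",
)
for command in commands:
    print(" ".join(command))
```

Calling `run_all(commands)` would then execute the indexing jobs; a non-zero entry in the returned list marks a failed assembly.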
...
Description | Build the pangenome graph iteratively using minigraph. The order in which the assemblies are processed must be defined beforehand. First, choose the best assembly to start with. Then rank the remaining assemblies according to their completeness and their distance from the first assembly (from closest to farthest). |
Command | haptic build-graph assembly1_fasta_file assembly2_fasta_file -o assembly1__assembly2_graph_filename structural_pangenome build-graph assembly1_fasta_file assembly2_fasta_file -o assembly1__assembly2_graph_filename OR haptic build-graph assembly1_fasta_file assembly2_fasta_file -p folder_to_populate_output structural_pangenome build-graph assembly1_fasta_file assembly2_fasta_file -p folder_to_populate_output |
Parameters | -o: output filename (with complete path). -p: prefix folder in which the resulting output file is stored (with complete path). |
Input | Fasta file for each assembly (FASTA) and, optionally, an existing gfa file to add to (GRAPH) |
Output | One gfa file for all assemblies (OUTPUT) |
Usage | structural_pangenome build-graph [-h] (-p PREFIX | -o OUTPUT) GRAPH FASTA |
Example | haptic build-graph fastas/assembly_1.fas fastas/assembly_2.fas -p outputs/assembly_pangenome_graph structural_pangenome build-graph fastas/assembly_1.fas fastas/assembly_2.fas -p outputs/assembly_pangenome_graph |
Parallelizable | No |
Optional | No |
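
The ordering rule and the iterative build described above can be sketched in Python as follows. Everything here is illustrative: the `completeness` and `distance` scores, the tie-breaking rule (distance first, then completeness) and the `graph_stepN.gfa` output names are assumptions, not part of the tool. Only the command shape (`build-graph GRAPH FASTA -o OUTPUT`) comes from the usage line.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Assembly:
    fasta: str
    completeness: float  # hypothetical score, higher means more complete
    distance: float      # hypothetical distance to the starting assembly


def order_assemblies(start: Assembly, others: List[Assembly]) -> List[Assembly]:
    """Put the starting assembly first, then the rest from closest to farthest.

    Ties on distance are broken by higher completeness (an assumption).
    """
    ranked = sorted(others, key=lambda a: (a.distance, -a.completeness))
    return [start] + ranked


def build_graph_commands(ordered: List[Assembly], out_dir: str,
                         tool: str = "structural_pangenome") -> List[List[str]]:
    """Chain build-graph calls: each step adds one assembly to the previous graph."""
    commands = []
    current = ordered[0].fasta  # the first step starts from a fasta file
    for step, assembly in enumerate(ordered[1:], start=1):
        output = f"{out_dir}/graph_step{step}.gfa"  # hypothetical naming scheme
        commands.append([tool, "build-graph", current, assembly.fasta, "-o", output])
        current = output  # the next step extends this graph
    return commands


start = Assembly("fastas/assembly_1.fas", completeness=0.99, distance=0.0)
others = [
    Assembly("fastas/assembly_3.fas", completeness=0.95, distance=0.20),
    Assembly("fastas/assembly_2.fas", completeness=0.97, distance=0.05),
]
ordered = order_assemblies(start, others)
graph_commands = build_graph_commands(ordered, "outputs")
for command in graph_commands:
    print(" ".join(command))
```

Each command depends on the output of the previous one, so the list has to be run sequentially, which is consistent with this step not being parallelizable.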
...