Step1: Ngap detection
Description | Report the stretches of N (Ngaps) in each assembly |
Command | haptic find-ngaps assembly_fasta_file -o output_file OR haptic find-ngaps assembly_fasta_file -p prefix_folder |
Parameters | o: output filename (with complete path); p: prefix_folder which stores the resulting output file; t: optional threshold, the minimum recurrence of Ngap sequence length that affects statistics (e.g. 100) |
Input | Fasta file for each assembly |
Output | Ngap.bed file for each assembly |
Example | haptic find-ngaps fastas/assembly_1.fas -o outputs/resulting_ngaps.bed |
Can be Parallelized | Yes |
Optional | No |
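To make the expected content of the Ngap BED file more concrete, the following is a minimal Python sketch of N-run detection. It is not the haptic implementation: the function names, the `min_len` filter (which is not claimed to match the `t` option), and the 0-based, half-open BED coordinates are all assumptions made for illustration.

```python
import re
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (sequence_name, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the sequence name.
            name, chunks = (line[1:].split() or [""])[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def find_ngaps(lines: Iterable[str], min_len: int = 1) -> List[Tuple[str, int, int]]:
    """Return (name, start, end) intervals for every run of N or n.

    Coordinates are 0-based and half-open (assumed BED convention).
    min_len is a hypothetical filter on the run length.
    """
    gaps = []
    for name, seq in read_fasta(lines):
        for match in re.finditer(r"[Nn]+", seq):
            if match.end() - match.start() >= min_len:
                gaps.append((name, match.start(), match.end()))
    return gaps


if __name__ == "__main__":
    demo = [">contig1", "ACGTNNNNAC", "GTNNACGT", ">contig2", "nnACGT"]
    for name, start, end in find_ngaps(demo):
        print(f"{name}\t{start}\t{end}")
```

Runs that span a line break in the FASTA file are joined before scanning, so a gap is reported once rather than once per line.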
Step2.1: Indexing
Description | Indexing the assembly fasta file |
Command | haptic index assembly_fasta_file -o output_file OR haptic index assembly_fasta_file -p prefix_folder |
Parameters | o: output filename (with complete path); p: prefix_folder which stores the resulting output file |
Input | Fasta file for each assembly |
Output | Indexed fasta file for each assembly |
Example | haptic index fastas/assembly_1.fas -p outputs/indexed_fasta |
Can be Parallelized | Yes |
Optional | No |
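Because this step can be parallelized, one way to run it over many assemblies is sketched below in Python. The command line follows the `haptic index ... -p ...` form shown above; the helper names (`build_index_commands`, `run_all`, `dry_run`) and the choice of a thread pool are assumptions for illustration, not part of haptic.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional, Sequence


def build_index_commands(fastas: Sequence[str], prefix_folder: str) -> List[List[str]]:
    """Build one `haptic index <fasta> -p <prefix_folder>` argv list per assembly."""
    return [["haptic", "index", fasta, "-p", prefix_folder] for fasta in fastas]


def run_command(argv: List[str]) -> int:
    """Run one command and return its exit code (needs haptic on the PATH)."""
    return subprocess.run(argv, check=False).returncode


def dry_run(argv: List[str]) -> int:
    """Print the command instead of executing it."""
    print(" ".join(argv))
    return 0


def run_all(
    commands: Sequence[List[str]],
    runner: Optional[Callable[[List[str]], int]] = None,
    max_workers: int = 4,
) -> List[int]:
    """Run the commands concurrently and return exit codes in input order."""
    runner = runner or run_command
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(runner, commands))


if __name__ == "__main__":
    cmds = build_index_commands(
        ["fastas/assembly_1.fas", "fastas/assembly_2.fas"], "outputs/indexed_fasta"
    )
    run_all(cmds, runner=dry_run)
```

Threads are sufficient here because each worker only waits on an external process. The same pattern applies to the other steps marked as parallelizable.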
Step2.2: Iterative Pangenome graph build
Description | Build the pangenome graph iteratively using minigraph. The order in which the assemblies are processed must be defined beforehand. First, choose the best assembly to start with. Then rank the remaining genomes according to the completeness of the assembly and the distance relative to the first chosen assembly (from closest to farthest). |
Command | haptic build-graph assembly1_fasta_file assembly2_fasta_file -o assembly1__assembly2_graph_filename OR haptic build-graph assembly1_fasta_file assembly2_fasta_file -p folder_to_populate_output |
Parameters | o: output filename (with complete path); p: prefix_folder which stores the resulting output file (with complete path) |
Input | Fasta file for each assembly |
Output | One GFA file for all assemblies |
Example | haptic build-graph fastas/assembly_1.fas fastas/assembly_2.fas -p outputs/assembly_pangenome_graph |
Can be Parallelized | No |
Optional | No |
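The ordering rule in the description can be sketched in Python as follows. The source does not say how completeness and distance are combined, so this sketch assumes the starting assembly is chosen by hand, the others are sorted by distance to it (closest first), and ties are broken by higher completeness. The function name and both input dictionaries are hypothetical.

```python
from typing import Dict, List


def order_assemblies(
    start: str,
    completeness: Dict[str, float],
    distance_to_start: Dict[str, float],
) -> List[str]:
    """Return the processing order for the iterative graph build.

    start: name of the assembly chosen as the starting point.
    completeness: hypothetical score per assembly, higher is better.
    distance_to_start: hypothetical distance of each other assembly to `start`.
    """
    if start not in completeness:
        raise KeyError(f"unknown starting assembly: {start}")
    others = [name for name in completeness if name != start]
    # Closest first; among equally distant assemblies, most complete first.
    others.sort(key=lambda name: (distance_to_start[name], -completeness[name]))
    return [start] + others


if __name__ == "__main__":
    completeness = {"asm_A": 0.99, "asm_B": 0.95, "asm_C": 0.97, "asm_D": 0.90}
    distance = {"asm_B": 0.02, "asm_C": 0.02, "asm_D": 0.01}
    print(order_assemblies("asm_A", completeness, distance))
```

A different weighting of the two criteria may suit a given dataset better; the point is only that the order is fixed before any `build-graph` call is made.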
Step2.3: Graph & stat - WIP
Step2.4: Pangenome Sequence Length distribution
Description | Get the pangenome sequence length distribution |
Command | haptic find-seqlen pangenome_fasta_file -o output_file OR haptic find-seqlen pangenome_fasta_file -p prefix_folder |
Parameters | o: output filename (with complete path); p: prefix_folder which stores the resulting output file (with complete path); g: optional parameter which also displays a graph of the sequence length distribution |
Input | Pangenome fasta file |
Output | One CSV file for the whole pangenome |
Example | haptic find-seqlen pangenomes/pangenome__fasta.fas -o outputs/pangenome_seqlen.csv |
Can be Parallelized | No |
Optional | Yes |
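As an illustration of what a sequence length distribution contains, here is a minimal Python sketch that counts how many sequences have each length and writes the result as CSV. The column names `length` and `count` and all function names are assumptions; the actual layout of the haptic CSV file is not documented here.

```python
import csv
import io
from collections import Counter
from typing import Dict, Iterable, TextIO


def sequence_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {sequence_name: length} for FASTA-formatted lines."""
    lengths: Dict[str, int] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = (line[1:].split() or [""])[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def length_distribution(lengths: Dict[str, int]) -> Counter:
    """Count how many sequences share each length."""
    return Counter(lengths.values())


def write_distribution_csv(distribution: Counter, handle: TextIO) -> None:
    """Write the distribution sorted by length, with a header row."""
    writer = csv.writer(handle)
    writer.writerow(["length", "count"])  # hypothetical column names
    for length in sorted(distribution):
        writer.writerow([length, distribution[length]])


if __name__ == "__main__":
    demo = [">s1", "ACGT", "AC", ">s2", "ACGTAC", ">s3", "AC"]
    out = io.StringIO()
    write_distribution_csv(length_distribution(sequence_lengths(demo)), out)
    print(out.getvalue())
```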
Step2.5: Assembly vs pangenome
Description | Compare each assembly against the pangenome |
Command | haptic align indexed_assembly_fasta_file pangenome_fasta_file -o output_file OR haptic align indexed_assembly_fasta_file pangenome_fasta_file -p prefix_folder |
Parameters | o: output filename (with complete path); p: prefix_folder which stores the resulting output file (with complete path) |
Input | Indexed fasta file for each assembly, plus the pangenome fasta file (shared separately) |
Output | Pangenome PAF file |
Example | haptic align fastas/assembly_1_indexed.fas pangenomes/pangenome__fasta.fas -o outputs/output_pangenome.paf |
Can be Parallelized | Yes |
Optional | No |
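To inspect the resulting PAF file, the sketch below parses the 12 mandatory PAF columns in Python and derives two simple summary values. The column layout follows the PAF format; the record and function names are illustrative, and which of the assembly or the pangenome ends up as query or target in haptic's output is not stated in this document.

```python
from typing import NamedTuple


class PafRecord(NamedTuple):
    query_name: str
    query_len: int
    query_start: int   # 0-based
    query_end: int     # exclusive
    strand: str        # "+" or "-"
    target_name: str
    target_len: int
    target_start: int  # 0-based
    target_end: int    # exclusive
    matches: int       # number of matching bases
    block_len: int     # alignment block length
    mapq: int


def parse_paf_line(line: str) -> PafRecord:
    """Parse the 12 mandatory tab-separated PAF columns; extra tags are ignored."""
    f = line.rstrip("\n").split("\t")
    if len(f) < 12:
        raise ValueError(f"expected at least 12 columns, got {len(f)}")
    return PafRecord(
        f[0], int(f[1]), int(f[2]), int(f[3]), f[4],
        f[5], int(f[6]), int(f[7]), int(f[8]),
        int(f[9]), int(f[10]), int(f[11]),
    )


def match_fraction(rec: PafRecord) -> float:
    """Matching bases divided by the alignment block length."""
    return rec.matches / rec.block_len


def query_coverage(rec: PafRecord) -> float:
    """Fraction of the query sequence covered by this alignment."""
    return (rec.query_end - rec.query_start) / rec.query_len


if __name__ == "__main__":
    line = "ctg1\t1000\t100\t900\t+\tpan1\t5000\t2000\t2800\t760\t800\t60"
    rec = parse_paf_line(line)
    print(rec.query_name, rec.target_name, match_fraction(rec), query_coverage(rec))
```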
Step3: Convert the PAF file into a delta file
Description | Convert the PAF file into a delta file. |
Command | haptic convert paf2delta paf_file assembly__fasta pangenome__fasta -o output_file OR haptic convert paf2delta paf_file assembly__fasta pangenome__fasta -p prefix_folder |
Parameters | o: output filename (with complete path); p: prefix_folder which stores the resulting output file (with complete path) |
Input | PAF file, assembly fasta file and pangenome fasta file |
Output | Assembly delta file |
Example | haptic convert paf2delta pafs/assembly_1_vs_pangenome__fasta.paf fastas/assembly_1.fas pangenomes/pangenome__fasta.fas -p outputs/paf2delta |
Can be Parallelized | Yes |
Optional | No |
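One part of a PAF-to-delta conversion is the change of coordinate convention, sketched below in Python. It assumes MUMmer-style delta coordinates (1-based, inclusive, with a reverse-strand match written as query start greater than query end) and assumes the PAF target maps to the delta reference. Both points are assumptions about what `paf2delta` does, and the sketch leaves out the indel offsets that a complete delta record also carries.

```python
from typing import Tuple


def paf_to_delta_coords(
    query_start: int,
    query_end: int,
    strand: str,
    target_start: int,
    target_end: int,
) -> Tuple[int, int, int, int]:
    """Convert PAF interval coordinates to delta-style coordinates.

    PAF intervals are 0-based and half-open on the forward strand.
    The result is (ref_start, ref_end, qry_start, qry_end), 1-based inclusive.
    """
    ref_start, ref_end = target_start + 1, target_end
    if strand == "+":
        qry_start, qry_end = query_start + 1, query_end
    elif strand == "-":
        # Reverse-strand match: swap so that start > end on the query.
        qry_start, qry_end = query_end, query_start + 1
    else:
        raise ValueError(f"unexpected strand: {strand!r}")
    return ref_start, ref_end, qry_start, qry_end


if __name__ == "__main__":
    print(paf_to_delta_coords(100, 900, "+", 2000, 2800))
    print(paf_to_delta_coords(100, 900, "-", 2000, 2800))
```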
Step4: Rendering dotplots
Description | Optional dotplot of the whole-genome alignment (WGA) |
Command | haptic render-dotplot delta_file -o output_file OR haptic render-dotplot delta_file -p prefix_folder |
Parameters | o: output filename (with complete path); p: prefix_folder which stores the resulting output file (with complete path) |
Input | Assembly fasta delta file |
Output | Assembly Fasta gp file; Assembly Fasta rplot file; Assembly Fasta fplot file; Assembly Fasta PNG file |
Example | haptic render-dotplot deltas/assembly_1_vs_pangenome__fasta.delta -p outputs/dotplot |
Can be Parallelized | Yes |
Optional | Yes |
Step5: Filtering Delta files - WIP
Step6: Rendering dotplots for filtered Delta files
Description | Optional dotplot of the whole-genome alignment (WGA), using the filtered delta file |
Command | haptic render-dotplot filtered_delta_file -o output_file OR haptic render-dotplot filtered_delta_file -p prefix_folder |
Parameters | o: output filename (with complete path); p: prefix_folder which stores the resulting output file (with complete path) |
Input | Assembly fasta filtered delta file |
Output | Assembly Fasta gp file; Assembly Fasta rplot file; Assembly Fasta fplot file; Assembly Fasta PNG file |
Example | haptic render-dotplot deltas/assembly_1_vs_pangenome__fasta_bbmhFilter.delta -p outputs/dotplot |
Can be Parallelized | Yes |
Optional | Yes |
Step7: Reverse Filtered Delta Files - WIP
Step8: Coordinate Pangenome Creation
Description | Create the coordinate pangenome system in JSON |
Command | haptic build-json reverse_filtered_delta_file -o output_file OR haptic build-json reverse_filtered_delta_file -p prefix_folder |
Parameters | o: output filename (with complete path); p: prefix_folder which stores the resulting output file (with complete path) |
Input | Reversed filtered delta file |
Output | Pangenome coordinate JSON file |
Example | haptic build-json deltas/pangenome__fasta_vs_assembly_1_bbmhFilter.delta -o outputs/coordinate-json.json |
Can be Parallelized | Yes |
Optional | No |
Step9: Pangenome Sequence Length Distribution
Description | Get the sequence length distribution of the pangenome segments that have coordinates in the pangenome |
Command | haptic find-seqlen reverse_filtered_delta_file -o output_file -g OR haptic find-seqlen reverse_filtered_delta_file -p prefix_folder -g |
Parameters | o: output filename (with complete path); p: prefix_folder which stores the resulting output file (with complete path); g: option to yield the sequence length distribution graph |
Input | Reversed filtered delta file |
Output | pangenome_coordinate_seqLength graph |
Example | haptic find-seqlen deltas/pangenome__fasta_vs_assembly_1_bbmhFilter.delta -p outputs/seqlen -g |
Can be Parallelized | Yes |
Optional | Yes |
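To show where segment lengths can come from in a delta file, the Python sketch below collects the reference-side length of each alignment. It assumes the MUMmer delta layout (two header lines, `>ref qry ref_len qry_len` records, alignment headers of seven integers, then single-integer indel lines ending with 0) and assumes that the pangenome is the reference in the reversed delta file. It is an illustration rather than a description of how haptic computes its distribution.

```python
from typing import Dict, Iterable, List


def reference_segment_lengths(delta_lines: Iterable[str]) -> Dict[str, List[int]]:
    """Return {reference_name: [aligned segment lengths]} from delta lines."""
    lengths: Dict[str, List[int]] = {}
    current_ref = None
    for index, line in enumerate(delta_lines):
        if index < 2:
            continue  # file paths line and alignment type line
        fields = line.split()
        if not fields:
            continue
        if fields[0].startswith(">"):
            current_ref = fields[0][1:]
            lengths.setdefault(current_ref, [])
        elif len(fields) == 7 and current_ref is not None:
            # Alignment header: ref_start ref_end qry_start qry_end + 3 counters.
            ref_start, ref_end = int(fields[0]), int(fields[1])
            lengths[current_ref].append(abs(ref_end - ref_start) + 1)
        # Single-integer lines are indel offsets and are skipped.
    return lengths


if __name__ == "__main__":
    demo = [
        "/path/pangenome.fas /path/assembly_1.fas",
        "NUCMER",
        ">pan1 ctg1 5000 1000",
        "2001 2800 101 900 40 40 0",
        "0",
        "3001 3100 950 901 2 2 0",
        "-10",
        "0",
    ]
    print(reference_segment_lengths(demo))
```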
Step10: Pangenome Path Creation - WIP