Note (Help Messages): adding the -h flag after a command (e.g. structural_pangenome find-ngaps -h) prints the help message for that command.
Step1: Ngap detection
Description | Report the stretches of N (Ngaps) in each assembly |
Command | structural_pangenome find-ngaps assembly_fasta_file -o output_file OR structural_pangenome find-ngaps assembly_fasta_file -p prefix_folder |
haptic command | haptic find-ngaps assembly_fasta_file -o output_file OR haptic find-ngaps assembly_fasta_file -p prefix_folder |
Parameters | o: output filename (with complete path) p: prefix_folder which stores the resulting output file. t: optional parameter, threshold: minimum recurrence of Ngap sequence length that affects statistics (e.g. 100) |
Input | Fasta file for each assembly (FILE) |
output | Ngap.bed file for each assembly (OUTPUT) |
Usage | structural_pangenome find-ngaps [-h] (-p PREFIX | -o OUTPUT) [-t THRESHOLD] FILE |
Example | haptic find-ngaps fastas/assembly_1.fas -o outputs/resulting_ngaps.bed OR structural_pangenome find-ngaps fastas/assembly_1.fas -o outputs/resulting_ngaps.bed |
Can be Parallelized | Yes |
Optional | No |
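To make the reported intervals concrete, here is a minimal Python sketch of what an Ngap is. It is not the tool's implementation: the function name `find_ngaps` is hypothetical, the -t threshold is not modelled, and the 0-based, half-open coordinates simply follow the usual BED convention.

```python
import re


def find_ngaps(name, sequence):
    """Return BED-style (chrom, start, end) tuples, 0-based and half-open,
    for every uninterrupted run of N or n in the sequence."""
    return [(name, m.start(), m.end()) for m in re.finditer(r"[Nn]+", sequence)]


if __name__ == "__main__":
    # Print one tab-separated BED line per Ngap found in a toy sequence.
    for chrom, start, end in find_ngaps("contig_1", "ACGTNNNNACGTnnAC"):
        print(f"{chrom}\t{start}\t{end}")
```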
Step2.1: Indexing
Description | Indexing the assembly fasta file |
Command | structural_pangenome index assembly_fasta_file -o output_file OR structural_pangenome index assembly_fasta_file -p prefix_folder |
haptic command | haptic index assembly_fasta_file -o output_file OR haptic index assembly_fasta_file -p prefix_folder |
Parameters | o: output filename (with complete path) p: prefix_folder which stores the resulting output file. |
Input | Fasta file for each assembly (FILE) |
output | Indexed fasta file for each assembly (OUTPUT) |
Usage | structural_pangenome index [-h] (-p PREFIX | -o OUTPUT) FILE |
Example | haptic index fastas/assembly_1.fas -p outputs/indexed_fasta OR structural_pangenome index fastas/assembly_1.fas -p outputs/indexed_fasta |
Can be Parallelized | Yes |
Optional | No |
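Steps marked as parallelizable can be launched once per assembly at the same time. The Python sketch below shows one way to do that under stated assumptions: the helper names `index_command` and `run_all`, the worker count, and the injectable `runner` argument are all hypothetical, and only the documented `index FILE -p PREFIX` form is used.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def index_command(fasta, prefix):
    # Mirrors the documented form: structural_pangenome index FILE -p PREFIX
    return ["structural_pangenome", "index", fasta, "-p", prefix]


def run_all(commands, runner=None, workers=4):
    """Run independent commands concurrently; return their results in input order."""
    if runner is None:
        # Default runner executes the command and reports its exit code.
        runner = lambda cmd: subprocess.run(cmd, check=False).returncode
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(runner, commands))
```

Passing a custom `runner` (for example one that only logs the command) allows a dry run before anything is executed.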
Step2.2: Iterative Pangenome graph build
Description | Build the pangenome graph iteratively with minigraph. The order in which the assemblies are processed must be defined beforehand: first choose the best assembly to start with, then rank the remaining genomes by assembly completeness and by distance from the starting assembly (from closest to farthest). |
Command | structural_pangenome build-graph assembly1_fasta_file assembly2_fasta_file -o assembly1__assembly2_graph_filename OR structural_pangenome build-graph assembly1_fasta_file assembly2_fasta_file -p folder_to_populate_output |
haptic command | haptic build-graph assembly1_fasta_file assembly2_fasta_file -o assembly1__assembly2_graph_filename OR haptic build-graph assembly1_fasta_file assembly2_fasta_file -p folder_to_populate_output |
Parameters | o: output filename (with complete path) p: prefix_folder which stores the resulting output file (with complete path) |
Input | FASTA file for each assembly (FASTA) and, where applicable, the GFA file to which the assembly is added (GRAPH) |
output | One gfa file for all assemblies (OUTPUT) |
Usage | structural_pangenome build-graph [-h] (-p PREFIX | -o OUTPUT) GRAPH FASTA |
Example | haptic build-graph fastas/assembly_1.fas fastas/assembly_2.fas -p outputs/assembly_pangenome_graph OR structural_pangenome build-graph fastas/assembly_1.fas fastas/assembly_2.fas -p outputs/assembly_pangenome_graph |
Parallelizable | No |
Optional | No |
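The ordering rule and the iterative build described above can be sketched in Python as follows. Several things are assumptions: the distance and completeness values are supplied by the user, distance is used as the primary sort key with completeness as a tie-breaker (the text does not say how the two are combined), the intermediate graph file names are hypothetical, and feeding the previous graph back as the first positional argument is inferred from the `GRAPH FASTA` usage line.

```python
def order_assemblies(start, others):
    """others maps a FASTA path to (distance_to_start, completeness).
    Closest assemblies come first; higher completeness breaks ties."""
    ranked = sorted(others, key=lambda f: (others[f][0], -others[f][1]))
    return [start] + ranked


def plan_builds(ordered_fastas, out_dir="graphs"):
    """Return one build-graph command per added assembly, chaining the outputs."""
    commands, current = [], ordered_fastas[0]
    for step, fasta in enumerate(ordered_fastas[1:], start=2):
        out = f"{out_dir}/graph_{step}.gfa"  # hypothetical naming scheme
        commands.append(["structural_pangenome", "build-graph", current, fasta, "-o", out])
        current = out  # the next iteration extends the graph just written
    return commands
```

Because each command depends on the output of the previous one, the plan has to be executed sequentially, which matches the "Parallelizable | No" entry for this step.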
Step2.3: Graph & stat
Description | Extract all fragmented sequences from the minigraph pangenome into a FASTA file |
Command | structural_pangenome convert gfa2fa assembly_1_2_3.gfa > pangenome__fasta.fas |
Parameters | gfa2fa: convert GFA file to FASTA file |
Input | One gfa file for all assemblies (GFA) |
Output | One FASTA file for the whole pangenome (OUTPUT) |
Usage | structural_pangenome convert gfa2fa GFA > OUTPUT |
Example | structural_pangenome convert gfa2fa 2.2.gfas/assembly123.gfa > 2.3.Stat/pangenome__fasta.fas |
Parallelizable | No |
Optional | No |
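For orientation, the sketch below shows the general idea of a GFA to FASTA conversion in Python: each GFA segment line (record type `S`) carries a segment name and its sequence, and each one becomes a FASTA record. This is an illustration only, not the code behind `convert gfa2fa`, and the function name `gfa_to_fasta` is hypothetical.

```python
def gfa_to_fasta(gfa_text):
    """Turn every GFA segment line (S <name> <sequence> ...) into a FASTA record."""
    records = []
    for line in gfa_text.splitlines():
        fields = line.split("\t")
        # Only segment lines hold sequences; headers, links and paths are skipped.
        if fields[0] == "S" and len(fields) >= 3:
            records.append(f">{fields[1]}\n{fields[2]}")
    return "\n".join(records) + ("\n" if records else "")
```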
Step2.4: Pangenome Sequence Length distribution
Description | Get the pangenome sequence length distribution |
Command | structural_pangenome find-seqlen pangenome_fasta_file -o output_file OR structural_pangenome find-seqlen pangenome_fasta_file -p prefix_folder |
haptic command | haptic find-seqlen pangenome_fasta_file -o output_file OR haptic find-seqlen pangenome_fasta_file -p prefix_folder |
Parameters | o: output filename (with complete path) (OUTPUT) p: prefix_folder which stores the resulting output file (with complete path) (PREFIX) g: optional parameter which also displays a graph of the sequence length distribution (-g) |
Input | Pangenome FASTA file (FILE) |
output | One csv file for the whole Pangenome (OUTPUT) |
Usage | structural_pangenome find-seqlen [-h] (-p PREFIX | -o OUTPUT) [-g] FILE |
Example | haptic find-seqlen pangenomes/pangenome__fasta.fas -o outputs/pangenome_seqlen.csv |
Can be Parallelized | No |
Optional | Yes |
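A sequence length table of this kind can be sketched in a few lines of Python. The layout of the tool's CSV is not documented here, so the column names `sequence` and `length` and both function names are assumptions made for illustration.

```python
import csv
import io


def sequence_lengths(fasta_text):
    """Map each FASTA record name (first word of the header) to its sequence length."""
    lengths, name = {}, None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = (line[1:].split() or [""])[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)  # sequences may span several lines
    return lengths


def to_csv(lengths):
    """Serialize the lengths as CSV text with a header row (hypothetical columns)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["sequence", "length"])
    writer.writerows(lengths.items())
    return buffer.getvalue()
```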
Step2.5: Assembly vs pangenome
Description | Compare Assembly vs pangenome |
Command | structural_pangenome align indexed_assembly_fasta_file pangenome_fasta_file -o output_file OR structural_pangenome align indexed_assembly_fasta_file pangenome_fasta_file -p prefix_folder |
haptic command | haptic align indexed_assembly_fasta_file pangenome_fasta_file -o output_file OR haptic align indexed_assembly_fasta_file pangenome_fasta_file -p prefix_folder |
Parameters | o: output filename (with complete path) p: prefix_folder which stores the resulting output file (with complete path) |
Input | Indexed FASTA file for each assembly (TARGET) and the pangenome FASTA file (QUERY), which will be shared separately |
output | Pangenome paf file (OUTPUT) |
Usage | structural_pangenome align [-h] (-p PREFIX | -o OUTPUT) TARGET QUERY |
Example | haptic align fastas/assembly_1_indexed.fas pangenome/pangenome__fasta.fas -o outputs/output_pangenome.paf OR structural_pangenome align fastas/assembly_1_indexed.fas pangenome/pangenome__fasta.fas -o outputs/output_pangenome.paf |
Can be Parallelized | Yes |
Optional | No |
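To inspect the resulting PAF file, the twelve mandatory PAF columns can be read with a short Python sketch like the one below. The names `PafRecord`, `parse_paf_line` and `identity` are hypothetical helpers, and the identity measure (matching bases divided by alignment block length) is one common summary, not something this pipeline is stated to compute.

```python
from typing import NamedTuple


class PafRecord(NamedTuple):
    query: str
    query_len: int
    query_start: int
    query_end: int
    strand: str
    target: str
    target_len: int
    target_start: int
    target_end: int
    matches: int
    block_len: int
    mapq: int


def parse_paf_line(line):
    """Parse the 12 mandatory PAF columns; optional tags after them are ignored."""
    f = line.rstrip("\n").split("\t")
    return PafRecord(f[0], int(f[1]), int(f[2]), int(f[3]), f[4],
                     f[5], int(f[6]), int(f[7]), int(f[8]),
                     int(f[9]), int(f[10]), int(f[11]))


def identity(record):
    """Matching bases divided by alignment block length (0.0 for an empty block)."""
    return record.matches / record.block_len if record.block_len else 0.0
```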
Step3: Convert the PAF file into a delta file
Description | Convert the PAF file into a delta file. |
Command | structural_pangenome convert paf2delta paf_file assembly__fasta pangenome__fasta -o output_file OR structural_pangenome convert paf2delta paf_file assembly__fasta pangenome__fasta -p prefix_folder |
haptic command | haptic convert paf2delta paf_file assembly__fasta pangenome__fasta -o output_file OR haptic convert paf2delta paf_file assembly__fasta pangenome__fasta -p prefix_folder |
Parameters | o: output filename (with complete path) p: prefix_folder which stores the resulting output file (with complete path) |
Input | PAF file (FILE), assembly FASTA file (TARGET) and pangenome FASTA file (QUERY) |
output | Assembly Delta File (OUTPUT) |
Usage | structural_pangenome convert paf2delta FILE TARGET QUERY [-h] (-p PREFIX | -o OUTPUT) |
Example | haptic convert paf2delta pafs/assembly_1_vs_pangenome__fasta.paf fastas/assembly_1.fas pangenomes/pangenome__fasta.fas -p outputs/paf2delta OR structural_pangenome convert paf2delta pafs/assembly_1_vs_pangenome__fasta.paf fastas/assembly_1.fas pangenomes/pangenome__fasta.fas -p outputs/paf2delta |
Can be Parallelized | Yes |
Optional | No |
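One detail that any PAF to delta conversion has to handle is the change of coordinate convention: PAF intervals are 0-based and half-open, while MUMmer delta files use 1-based inclusive coordinates and write reverse-strand matches with start greater than end. The Python sketch below shows only that coordinate step. It assumes the standard PAF and MUMmer conventions, the function name is hypothetical, and it says nothing about how `convert paf2delta` is implemented.

```python
def paf_to_delta_coords(start, end, strand="+"):
    """Convert a 0-based half-open PAF interval to 1-based inclusive
    delta-style coordinates; a '-' strand returns them as (high, low)."""
    first, last = start + 1, end
    return (last, first) if strand == "-" else (first, last)
```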
Step4: Rendering dotplots
Description | Optional dotplot of the whole-genome alignment (WGA) |
Command | structural_pangenome render-dotplot delta_file -o output_file OR structural_pangenome render-dotplot delta_file -p prefix_folder |
haptic command | haptic render-dotplot delta_file -o output_file OR haptic render-dotplot delta_file -p prefix_folder |
Parameters | o: output filename (with complete path) (OUTPUT) p: prefix_folder which stores the resulting output file (with complete path) (PREFIX) |
Input | Assembly fasta delta file (DELTA) |
output | Assembly Fasta gp file, Assembly Fasta rplot file, Assembly Fasta fplot file, Assembly Fasta PNG file |
Usage | structural_pangenome render-dotplot [-h] (-p PREFIX | -o OUTPUT) DELTA |
Example | haptic render-dotplot deltas/assembly_1_vs_pangenome__fasta.delta -p outputs/dotplot OR structural_pangenome render-dotplot deltas/assembly_1_vs_pangenome__fasta.delta -p outputs/dotplot |
Can be Parallelized | Yes |
Optional | Yes |
Step5: Filtering Delta files
Description | Filter the minimap2 result (filter unwanted regions from delta file) |
Command | structural_pangenome filter-delta delta_file -o output_file OR structural_pangenome filter-delta delta_file -p prefix_folder |
Parameters | o: output filename (with complete path) (OUTPUT) p: prefix_folder which stores the resulting output file (with complete path) (PREFIX) m: optional, metrics: save filtering metrics to a file (name auto-generated) --inner_cutoff: optional, (INNER_CUTOFF): Threshold to filter the delta results based on the in between two existing fragments. default value: 0.8 --outer_cutoff: optional, (OUTER_CUTOFF): Threshold to filter the delta results based on the overlap with another fragment. default value: 0.5 |
Input | Assembly Delta File |
output | Filtered Assembly Delta File |
Usage | structural_pangenome filter-delta [-h] (-p PREFIX | -o OUTPUT) FILE [--inner_cutoff INNER_CUTOFF] [--outer_cutoff OUTER_CUTOFF] [-m] |
Example | structural_pangenome filter-delta deltas/assembly_1_vs_pangenome__fasta.delta -o filtered_deltas/assembly_1_vs_pangenome__fasta_bbmh_filter.delta -m |
Can be Parallelized | Yes |
Optional | No |
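The cutoff parameters are expressed as fractions (defaults 0.8 and 0.5). One plausible reading of the overlap criterion, and only an assumption here, is the fraction of one fragment that is covered by another. The Python sketch below computes that fraction so the 0.5 default can be related to concrete coordinates; the function name is hypothetical and the tool's actual filtering rule may differ.

```python
def overlap_fraction(fragment, other):
    """Fraction of `fragment` covered by `other`; both are (start, end), half-open."""
    start, end = fragment
    length = end - start
    if length <= 0:
        return 0.0
    overlap = min(end, other[1]) - max(start, other[0])
    return max(0, overlap) / length
```

Under this reading, a fragment spanning 0 to 100 that shares 50 bases with a neighbour has an overlap fraction of exactly 0.5, which sits right at the default --outer_cutoff value.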
Step6: Rendering dotplots for filtered Delta files
Description | Optional dotplot of the filtered whole-genome alignment (WGA) |
Command | structural_pangenome render-dotplot filtered_delta_file -o output_file OR structural_pangenome render-dotplot filtered_delta_file -p prefix_folder |
haptic command | haptic render-dotplot filtered_delta_file -o output_file OR haptic render-dotplot filtered_delta_file -p prefix_folder |
Parameters | o: output filename (with complete path) (OUTPUT) p: prefix_folder which stores the resulting output file (with complete path) (PREFIX) |
Input | Assembly fasta filtered delta file (DELTA) |
output | Assembly Fasta gp file, Assembly Fasta rplot file, Assembly Fasta fplot file, Assembly Fasta PNG file |
Usage | structural_pangenome render-dotplot [-h] (-p PREFIX | -o OUTPUT) DELTA |
Example | haptic render-dotplot deltas/assembly_1_vs_pangenome__fasta_bbmhFilter.delta -p outputs/dotplot OR structural_pangenome render-dotplot deltas/assembly_1_vs_pangenome__fasta_bbmhFilter.delta -p outputs/dotplot |
Can be Parallelized | Yes |
Optional | Yes |
Step7: Reverse Filtered Delta Files
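Description | Reverse the filtered delta file so that the assembly and pangenome roles are swapped (assembly_vs_pangenome becomes pangenome_vs_assembly) |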
Command | structural_pangenome reverse-delta assembly_1_vs_pangenome__fasta_bbmhFilter.delta -o pangenome__fasta_vs_assembly_1_bbmhFilter.delta |
Parameters | o: output filename (with complete path) (OUTPUT) p: prefix_folder which stores the resulting output file (with complete path) (PREFIX) |
Input | Assembly fasta filtered delta file (FILE) |
output | Reverse filtered delta file (OUTPUT) |
Usage | structural_pangenome reverse-delta [-h] (-p PREFIX | -o OUTPUT) FILE |
Example | structural_pangenome reverse-delta filtered_deltas/_assembly1_v1_bbmh-filter.delta -o reversed_filtered_deltas/pangenome__fasta_vs_assembly_1_bbmhFilter.delta |
Can be parallelized | Yes |
Optional | No |
Step8: Coordinate Pangenome Creation
Description | Create the coordinate pangenome system in JSON |
Command | structural_pangenome build-coordinate reverse_filtered_delta_file -o output_file OR structural_pangenome build-coordinate reverse_filtered_delta_file -p prefix_folder |
haptic command | haptic build-json reverse_filtered_delta_file -o output_file OR haptic build-json reverse_filtered_delta_file -p prefix_folder |
Parameters | o: output filename (with complete path) p: prefix_folder which stores the resulting output file (with complete path) unzipped: optional, set this flag to generate an uncompressed file (--unzipped) pretty: optional, set this flag to generate a human-readable JSON file (--pretty) m: optional, metrics: whether to save coordinate metrics; the filename is auto-generated if not given (-m) |
Input | Reversed filtered delta file(s) (DELTA [DELTA ...]) |
output | Pangenome coordinate JSON file (OUTPUT) |
Usage | structural_pangenome build-coordinate [-h] (-p PREFIX | -o OUTPUT) [--unzipped] [--pretty] [-m [METRICS]] DELTA [DELTA ...] |
Example | haptic build-json deltas/pangenome__fasta_vs_assembly_1_bbmhFilter.delta -o outputs/coordinate-json.json |
Can be Parallelized | Yes |
Optional | No |
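The effect of the --unzipped and --pretty flags can be pictured with the Python sketch below. It assumes gzip as the compression format, which the documentation does not state, and the helper names `dump_json` and `load_json` are hypothetical; the real files written by the tool may be laid out differently.

```python
import gzip
import json


def dump_json(data, unzipped=False, pretty=False):
    """Serialize to bytes: indented when pretty, gzip-compressed unless unzipped."""
    text = json.dumps(data, indent=2 if pretty else None)
    raw = text.encode("utf-8")
    return raw if unzipped else gzip.compress(raw)


def load_json(blob):
    """Load JSON bytes, transparently decompressing gzip input."""
    if blob[:2] == b"\x1f\x8b":  # gzip magic number
        blob = gzip.decompress(blob)
    return json.loads(blob.decode("utf-8"))
```

A reader written this way accepts both variants, so downstream scripts do not need to know which flags were used when the file was produced.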
Step9: Pangenome Sequence Length Distribution
Description | Get the sequence length distribution of the pangenome segments that have coordinates in the pangenome |
Command | structural_pangenome find-coordinate-seqlen pangenome_coordinate_file sequence_lengths_file -o output_file -g OR structural_pangenome find-coordinate-seqlen pangenome_coordinate_file sequence_lengths_file -p prefix_folder -g |
haptic command | haptic find-seqlen reverse_filtered_delta_file -o output_file -g OR haptic find-seqlen reverse_filtered_delta_file -p prefix_folder -g |
Parameters | o: output filename (with complete path) p: prefix_folder which stores the resulting output file (with complete path) g: optional, to yield sequence length distribution graph |
Input | pangenome coordinate file (COORDINATE) sequence lengths file (.csv) (SEQLEN) |
output | pangenome_coordinate_seqLength Graph (OUTPUT) |
Usage | structural_pangenome find-coordinate-seqlen [-h] (-p PREFIX | -o OUTPUT) [-g] COORDINATE SEQLEN |
Example | haptic find-seqlen deltas/pangenome__fasta_vs_assembly_1_bbmhFilter.delta -p outputs/seqlen OR structural_pangenome find-coordinate-seqlen Step8.CoordPangenome/coordAssem_pangenome.json Step2.4.seqLens/pangenome__fasta_seqlen.csv -p outputs/seqlenPangenome -g |
Can be Parallelized | Yes |
Optional | Yes |
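A length distribution such as the one plotted with -g boils down to counting sequences per length bin. The Python sketch below shows that counting step only; the function name and the fixed `bin_size` of 1000 bases are assumptions chosen for illustration, not the binning used by the tool.

```python
from collections import Counter


def length_histogram(lengths, bin_size=1000):
    """Count lengths per bin; each key is the lower bound of a bin_size-wide bin."""
    counts = Counter((length // bin_size) * bin_size for length in lengths)
    return dict(sorted(counts.items()))
```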
Step10: Pangenome Path Creation
Description | Builds a pangenome path json file, based on the coordinate json file |
Command | structural_pangenome build-path pangenome_coordinate_json_file -o output_file |
Parameters | o: output filename (with complete path) p: prefix_folder which stores the resulting output file (with complete path) unzipped: optional, set this flag to generate an uncompressed file (--unzipped) pretty: optional, set this flag to generate a human-readable JSON file (--pretty) |
Input | pangenome coordinate JSON file (JSON) |
output | pangenome path JSON FILE (OUTPUT) |
Usage | structural_pangenome build-path [-h] (-p PREFIX | -o OUTPUT) [--unzipped] [--pretty] JSON |
Example | structural_pangenome build-path Step8.CoordPangenome/coordAssem_pangenome.json -o 10.PathPangenome/path__pangenome.json --unzipped --pretty |
Can be Parallelized | Yes |
Optional | No |