Note (Help Messages): a -h flag after a command (e.g. >structural_pangenome find-ngaps -h) generates a help message for said command.
>structural_pangenome --help displays available structural_pangenome commands.
Step1: Ngap detection
Description | Report the stretches of Ns (Ngaps) in each assembly | |
Command | structural_pangenome find-ngaps assembly_fasta_file -o output_file OR structural_pangenome find-ngaps assembly_fasta_file -p prefix_folder | |
haptic command | haptic find-ngaps assembly_fasta_file -o output_file OR haptic find-ngaps assembly_fasta_file -p prefix_folder | |
Parameters | o: output filename (with complete path) p: prefix_folder which stores the resulting output file. t: optional parameter, threshold: minimum recurrence of Ngap sequence length that affects statistics (integer, e.g. 100) | |
Input | Fasta file for each assembly (FILE) | |
output | Ngap.bed file for each assembly (OUTPUT) | |
Usage | structural_pangenome find-ngaps [-h] (-p PREFIX | -o OUTPUT) [-t THRESHOLD] FILE | |
Example | structural_pangenome find-ngaps fastas/assembly_1.fas -o outputs/1.Ngaps/resulting_ngaps.bed | |
haptic example | haptic find-ngaps fastas/assembly_1.fas -o outputs/resulting_ngaps.bed | |
Can be Parallelized | Yes | |
Optional | No |
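To make the idea of an Ngap concrete, the following is a minimal Python sketch of how runs of N can be located in a FASTA file and reported as BED-style intervals (0-based start, end exclusive). It is an illustration only, not the implementation used by find-ngaps; the helper names `read_fasta` and `find_ngaps`, the `min_len` filter, and the demo sequences are all hypothetical, and `min_len` is not meant to reproduce the -t threshold.

```python
import re
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (record_name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the record name.
            name, chunks = line[1:].split()[0], []
        elif line:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def find_ngaps(name: str, sequence: str, min_len: int = 1) -> Iterator[Tuple[str, int, int]]:
    """Yield (name, start, end) for every run of N/n that is at least min_len long."""
    for match in re.finditer(r"[Nn]+", sequence):
        if match.end() - match.start() >= min_len:
            yield name, match.start(), match.end()


# Made-up demo records (hypothetical data, not from a real assembly).
fasta = [">chr1 demo", "ACGTNNNNAC", "GTNNACGT", ">chr2", "NNNACGT"]
bed_rows = [row for name, seq in read_fasta(fasta) for row in find_ngaps(name, seq)]
for row in bed_rows:
    print("\t".join(map(str, row)))
```

Each printed row has the three mandatory BED columns (sequence name, start, end); the exact columns written by the tool may differ.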
Step2.1: Indexing
Description | Indexing the assembly fasta file |
Command | structural_pangenome index assembly_fasta_file -o output_file OR structural_pangenome index assembly_fasta_file -p prefix_folder |
haptic command | haptic index assembly_fasta_file -o output_file OR haptic index assembly_fasta_file -p prefix_folder |
Parameters | o: output filename (with complete path) p: prefix_folder which stores the resulting output file. |
Input | Fasta file for each assembly (FILE) |
output | Indexed fasta file for each assembly (OUTPUT) |
Usage | structural_pangenome index [-h] (-p PREFIX | -o OUTPUT) FILE |
Example | structural_pangenome index fastas/assembly_1.fas -p outputs/2.1.indexed_fasta |
haptic example | haptic index fastas/assembly_1.fas -p outputs/2.1.indexed_fasta/ |
Can be Parallelized | Yes |
Optional | No |
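The usage lines in this guide follow the pattern `(-p PREFIX | -o OUTPUT)`, meaning that exactly one of the two options must be given. The sketch below reproduces that pattern with Python's argparse, purely to illustrate how the two options exclude each other. That the tool itself is built on argparse is an assumption, and the help texts are paraphrased from the Parameters row.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mimicking the documented usage of the index command."""
    parser = argparse.ArgumentParser(prog="structural_pangenome index")
    # Exactly one of -p / -o is required, as in "(-p PREFIX | -o OUTPUT)".
    destination = parser.add_mutually_exclusive_group(required=True)
    destination.add_argument("-p", dest="prefix", metavar="PREFIX",
                             help="folder which stores the resulting output file")
    destination.add_argument("-o", dest="output", metavar="OUTPUT",
                             help="output filename (with complete path)")
    parser.add_argument("file", metavar="FILE", help="assembly fasta file")
    return parser


args = build_parser().parse_args(
    ["fastas/assembly_1.fas", "-p", "outputs/2.1.indexed_fasta"]
)
print(args.prefix, args.output, args.file)
```

Passing both -p and -o, or neither, makes the parser exit with an error message, which matches the either/or wording of the Command rows.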
Step2.2: Iterative Pangenome graph build
Description | Build the pangenome graph iteratively with minigraph. You must define the order in which the assemblies are processed: first choose the best assembly to start with, then rank the remaining genomes by the completeness of the assembly and by their distance from the first chosen assembly (from closest to farthest). |
Command | structural_pangenome build-graph assembly1_fasta_file assembly2_fasta_file -o assembly1__assembly2_graph_filename OR structural_pangenome build-graph assembly1_fasta_file assembly2_fasta_file -p folder_to_populate_output |
haptic command | haptic build-graph assembly1_fasta_file assembly2_fasta_file -o assembly1__assembly2_graph_filename OR haptic build-graph assembly1_fasta_file assembly2_fasta_file -p folder_to_populate_output |
Parameters | o: output filename (with complete path) p: prefix_folder which stores the resulting output file (with complete path) |
Input | Fasta file for each assembly (FASTA), and possibly a gfa file to add to (GRAPH) |
output | One gfa file for all assemblies (OUTPUT) |
Usage | structural_pangenome build-graph [-h] (-p PREFIX | -o OUTPUT) GRAPH FASTA |
Example | structural_pangenome build-graph fastas/assembly_1.fas fastas/assembly_2.fas -p outputs/2.2gfas/assembly_pangenome_graph |
haptic example | haptic build-graph fastas/assembly_1.fas fastas/assembly_2.fas -p outputs/assembly_pangenome_graph |
Parallelizable | No |
Optional | No |
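Because the build is iterative, the commands have to be issued in the chosen assembly order. The following Python sketch plans that sequence of build-graph calls without running them. It assumes, based on the `GRAPH FASTA` usage line, that the graph produced by one round is passed as the first argument of the next round; the function name `plan_graph_builds` and the `graph_stepN.gfa` output names are hypothetical.

```python
from typing import List


def plan_graph_builds(ordered_fastas: List[str], out_dir: str) -> List[List[str]]:
    """Return one build-graph command per added assembly, in processing order."""
    commands = []
    current = ordered_fastas[0]  # the assembly chosen as the starting point
    for step, fasta in enumerate(ordered_fastas[1:], start=1):
        output = f"{out_dir}/graph_step{step}.gfa"  # hypothetical naming scheme
        commands.append(
            ["structural_pangenome", "build-graph", current, fasta, "-o", output]
        )
        # Assumption: the next round extends the graph that was just written.
        current = output
    return commands


# Assemblies listed from the starting assembly to the most distant one.
order = ["fastas/assembly_1.fas", "fastas/assembly_2.fas", "fastas/assembly_3.fas"]
plan = plan_graph_builds(order, "outputs/2.2gfas")
for command in plan:
    print(" ".join(command))
```

The commands depend on each other, so they must run one after another, which is consistent with this step not being parallelizable.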
Step2.3: Graph & stat - WIP
Description | Extract from minigraph pangenome all fragmented sequences in a FASTA file |
Command | structural_pangenome convert gfa2fa assembly_1_2_3.gfa -o 2.3.Stat/pangenome__fasta.fas |
Parameters | gfa2fa: convert GFA file to FASTA file |
Input | One gfa file for all assemblies (GFA) |
Output | One FASTA file for the whole pangenome (OUTPUT) |
Usage | structural_pangenome convert gfa2fa GFA -o OUTPUT |
Example | structural_pangenome convert gfa2fa 2.2.gfas/assembly123.gfa -o 2.3.Stat/pangenome__fasta.fas |
Parallelizable | No |
Optional | No |
Step2.4: Pangenome Sequence Length distribution
Description | Get the pangenome sequence length distribution | ||
Command | structural_pangenome find-seqlen pangenome_fasta_file -o output_file OR structural_pangenome find-seqlen pangenome_fasta_file -p prefix_folder | ||
haptic command | haptic find-seqlen pangenome_fasta_file -o output_file OR haptic find-seqlen pangenome_fasta_file -p prefix_folder | ||
Parameters | o: output filename (with complete path) (OUTPUT) p: prefix_folder which stores the resulting output file (with complete path) (PREFIX) g: optional parameter which also displays a graph of the sequence length distribution (-g) | ||
Input | Fasta file for each assembly (FILE) | ||
output | One csv file for the whole Pangenome (OUTPUT) | ||
Usage | structural_pangenome find-seqlen [-h] (-p PREFIX | -o OUTPUT) [-g] FILE | ||
Example | structural_pangenome find-seqlen pangenomes/pangenome__fasta.fas -o outputs/2.4.PangSeqLen/pangenome_seqlen.csv -g | ||
haptic example | haptic find-seqlen pangenomes/pangenome__fasta.fas -o outputs/pangenome_seqlen.csv | ||
Can be Parallelized | No | ||
Optional | Yes |
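As an illustration of what a sequence length distribution is, the sketch below computes the length of every record in a FASTA file and counts how many records share each length, then writes the result as CSV. It is not the tool's implementation: the function names, the `length,count` column layout, and the demo records are assumptions, and the real csv file may be organised differently.

```python
import csv
import io
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def sequence_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Map each FASTA record name to the total length of its sequence."""
    lengths: Dict[str, int] = {}
    name = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            name = line[1:].split()[0]
            lengths[name] = 0
        elif line and name is not None:
            lengths[name] += len(line)
    return lengths


def length_distribution(lengths: Dict[str, int]) -> List[Tuple[int, int]]:
    """Return (length, number_of_sequences) pairs sorted by length."""
    return sorted(Counter(lengths.values()).items())


# Made-up demo records (hypothetical data).
fasta = [">s1", "ACGT", "AC", ">s2", "ACGTAC", ">s3", "AC"]
lengths = sequence_lengths(fasta)
distribution = length_distribution(lengths)

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["length", "count"])  # hypothetical column names
writer.writerows(distribution)
print(buffer.getvalue())
```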
...
Description | Compare Assembly vs pangenome |
Command | structural_pangenome align indexed_assembly_fasta_file pangenome_fasta_file -o output_file OR structural_pangenome align indexed_assembly_fasta_file pangenome_fasta_file -p prefix_folder |
haptic command | haptic align indexed_assembly_fasta_file pangenome_fasta_file -o output_file OR haptic align indexed_assembly_fasta_file pangenome_fasta_file -p prefix_folder |
Parameters | o: output filename (with complete path) p: prefix_folder which stores the resulting output file (with complete path) |
Input | Indexed fasta file for each assembly (TARGET) - From Step2.1, and the Pangenome Fasta file (QUERY) - From Step2.3; the Pangenome Fasta file will be shared separately |
output | Pangenome paf file (OUTPUT) |
Usage | structural_pangenome align [-h] (-p PREFIX | -o OUTPUT) TARGET QUERY |
Example | structural_pangenome align fastas/assembly_1_indexed.fas pangenome/pangenome__fasta.fas -o outputs/output_pangenome.paf |
haptic example | haptic align fastas/assembly_1_indexed.fas pangenome/pangenome__fasta.fas -o outputs/output_pangenome.paf |
Can be Parallelized | Yes |
Optional | No |
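To inspect the alignment output, a PAF line can be split into its twelve mandatory tab-separated columns. The sketch below assumes that the .paf file follows the standard PAF layout used by minimap2 (query name, length, start, end, strand, then target name, length, start, end, followed by matches, alignment block length, and mapping quality). The `PafRecord` type and the example line are made up for illustration.

```python
from typing import NamedTuple


class PafRecord(NamedTuple):
    """The twelve mandatory PAF columns, in file order."""
    query_name: str
    query_length: int
    query_start: int
    query_end: int
    strand: str
    target_name: str
    target_length: int
    target_start: int
    target_end: int
    matches: int
    block_length: int
    mapq: int


def parse_paf_line(line: str) -> PafRecord:
    """Parse one PAF line; optional tag columns after the twelfth are ignored."""
    f = line.rstrip("\n").split("\t")
    return PafRecord(
        f[0], int(f[1]), int(f[2]), int(f[3]), f[4],
        f[5], int(f[6]), int(f[7]), int(f[8]),
        int(f[9]), int(f[10]), int(f[11]),
    )


# Hypothetical alignment of a pangenome segment (query) to an assembly (target).
line = "s1\t1000\t0\t900\t+\tchr1\t50000\t100\t1000\t850\t900\t60\ttp:A:P"
record = parse_paf_line(line)
# Fraction of matching residues over the alignment block.
identity = record.matches / record.block_length
print(record.query_name, record.target_name, round(identity, 3))
```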
...
Description | Convert the PAF file into a delta file. | ||
Command | structural_pangenome convert paf2delta paf_file assembly__fasta pangenome__fasta -o output_file OR structural_pangenome convert paf2delta paf_file assembly__fasta pangenome__fasta -p prefix_folder | ||
haptic command | haptic convert paf2delta paf_file assembly__fasta pangenome__fasta -o output_file OR haptic convert paf2delta paf_file assembly__fasta pangenome__fasta -p prefix_folder | ||
Parameters | o: output filename (with complete path) p: prefix_folder which stores the resulting output file (with complete path) | ||
Input | PAF file (FILE), assembly Fasta file (TARGET), Pangenome Fasta file (QUERY) | ||
output | Assembly Delta File (OUTPUT) | ||
Usage | structural_pangenome convert paf2delta FILE TARGET QUERY [-h] (-p PREFIX | -o OUTPUT) | ||
Example | structural_pangenome convert paf2delta pafs/assembly_1_vs_pangenome__fasta.paf fastas/assembly_1.fas pangenomes/pangenome__fasta.fas -p outputs/paf2delta | ||
haptic example | haptic convert paf2delta pafs/assembly_1_vs_pangenome__fasta.paf fastas/assembly_1.fas pangenomes/pangenome__fasta.fas -p outputs/paf2delta | ||
Can be Parallelized | Yes | ||
Optional | No |
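Steps marked as parallelizable can be run once per assembly at the same time. The sketch below shows one possible way to do that from Python with a thread pool. By default it only formats the commands (a dry run); the `execute` helper is the part that would actually launch them. The assembly names, the helper names, and the worker count are hypothetical; the command layout is copied from the Example row.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def paf2delta_command(paf: str, assembly_fasta: str,
                      pangenome_fasta: str, prefix: str) -> List[str]:
    """Build the paf2delta command line for one assembly."""
    return ["structural_pangenome", "convert", "paf2delta",
            paf, assembly_fasta, pangenome_fasta, "-p", prefix]


def dry_run(command: List[str]) -> str:
    """Return the command as a single string instead of executing it."""
    return " ".join(command)


def execute(command: List[str]) -> int:
    """Run the command and return its exit code (raises if it fails)."""
    return subprocess.run(command, check=True).returncode


def run_all(commands: List[List[str]], runner: Callable = dry_run,
            workers: int = 4) -> list:
    """Apply runner to every command concurrently, keeping the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(runner, commands))


assemblies = ["assembly_1", "assembly_2"]  # hypothetical assembly names
commands = [
    paf2delta_command(f"pafs/{name}_vs_pangenome__fasta.paf",
                      f"fastas/{name}.fas",
                      "pangenomes/pangenome__fasta.fas",
                      "outputs/paf2delta")
    for name in assemblies
]
results = run_all(commands)  # pass runner=execute to really launch the tool
for text in results:
    print(text)
```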
...
Description | OPTIONAL dotplot of the WGA |
Command | structural_pangenome render-dotplot delta_file -o output_file OR structural_pangenome render-dotplot delta_file -p prefix_folder |
haptic command | haptic render-dotplot delta_file -o output_file OR haptic render-dotplot delta_file -p prefix_folder |
Parameters | o: output filename (with complete path) (OUTPUT) p: prefix_folder which stores the resulting output file (with complete path) (PREFIX) |
Input | Assembly fasta delta file (DELTA) - From Step3. |
output | · Assembly Fasta gp file · Assembly Fasta rplot file · Assembly Fasta fplot file · Assembly Fasta PNG file |
Usage | structural_pangenome render-dotplot [-h] (-p PREFIX | -o OUTPUT) DELTA |
Example | structural_pangenome render-dotplot Step3.deltas/assembly_1_vs_pangenome__fasta.delta -p outputs/Step4.Render |
haptic example | haptic render-dotplot deltas/assembly_1_vs_pangenome__fasta.delta -p outputs/dotplot |
Can be Parallelized | Yes |
Optional | Yes |
...
Step5: Filtering Delta files – WIP
Description | Filter the minimap2 result (filter unwanted regions from delta file) |
Command | structural_pangenome filter-delta assembly_1_vs_pangenome__fasta.delta |
Parameters | o: output filename (with complete path) (OUTPUT) p: prefix_folder which stores the resulting output file (with complete path) (PREFIX) m: optional, metrics: save filtering metrics to a file (name auto-generated) --inner_cutoff: optional, (INNER_CUTOFF): Threshold to filter the delta results based on the in between two existing fragments. default value: 0.8 --outer_cutoff: optional, (OUTER_CUTOFF): Threshold to filter the delta results based on the overlap with another fragment. default value: 0.5 |
Input | Assembly Delta File - From Step3. |
output | Filtered Assembly Delta File |
Usage | structural_pangenome filter-delta [-h] (-p PREFIX | -o OUTPUT) FILE [--inner_cutoff INNER_CUTOFF] [--outer_cutoff OUTER_CUTOFF] [-m] |
Example | structural_pangenome filter-delta deltas/assembly_1_vs_pangenome__fasta.delta -o filtered_deltas/assembly_1_vs_pangenome__fasta_bbmh_filter.delta -m |
Can be Parallelized | Yes |
Optional | No |
Step6: Rendering dotplots for filtered Delta files
Description | OPTIONAL dotplot of the WGA | |
Command | structural_pangenome render-dotplot filtered_delta_file -o output_file OR structural_pangenome render-dotplot filtered_delta_file -p prefix_folder | |
haptic command | haptic render-dotplot filtered_delta_file -o output_file OR haptic render-dotplot filtered_delta_file -p prefix_folder | |
Parameters | o: output filename (with complete path) (OUTPUT) p: prefix_folder which stores the resulting output file (with complete path) (PREFIX) | |
Input | Assembly fasta filtered delta file (DELTA) - From Step5. | |
output | · Assembly Fasta gp file · Assembly Fasta rplot file · Assembly Fasta fplot file · Assembly Fasta PNG file | |
Usage | structural_pangenome render-dotplot [-h] (-p PREFIX | -o OUTPUT) DELTA | |
Example | structural_pangenome render-dotplot deltas/assembly_1_vs_pangenome__fasta_bbmhFilter.delta -p outputs/dotplot | |
haptic example | haptic render-dotplot deltas/assembly_1_vs_pangenome__fasta_bbmhFilter.delta -p outputs/dotplot | |
Can be Parallelized | Yes | |
Optional | Yes |
...
Step7: Reverse Filtered Delta Files - WIP
Description | Reverse the filtered delta file |
Command | structural_pangenome reverse-delta assembly_1_vs_pangenome__fasta_bbmhFilter.delta -o pangenome__fasta_vs_assembly_1_bbmhFilter.delta |
Parameters | o: output filename (with complete path) (OUTPUT) p: prefix_folder which stores the resulting output file (with complete path) (PREFIX) |
Input | Assembly fasta filtered delta file (FILE) - From Step5. |
output | Reverse filtered delta file (OUTPUT) |
Usage | structural_pangenome reverse-delta [-h] (-p PREFIX | -o OUTPUT) FILE |
Example | structural_pangenome reverse-delta filtered_deltas/_assembly1_v1_bbmh-filter.delta -o reversed_filtered_deltas/pangenome__fasta_vs_assembly_1_bbmhFilter.delta |
Can be parallelized | Yes |
Optional | No |
Step8: Coordinate Pangenome Creation
Description | Create the coordinate pangenome system in JSON | |
Command | structural_pangenome build-coordinate reverse_filtered_delta_file -o output_file OR structural_pangenome build-coordinate delta_file -p prefix_folder | |
haptic command | haptic build-json reverse_filtered_delta_file -o output_file OR haptic build-json delta_file -p prefix_folder | |
Parameters | o: output filename (with complete path) p: prefix_folder which stores the resulting output file (with complete path) unzipped, optional: set this flag to generate an uncompressed file (--unzipped) pretty, optional: set this flag to generate a human-readable JSON file (--pretty) m, optional: metrics, whether to save coordinate metrics; the filename is auto-generated if not given (-m) | |
Input | Reversed filtered delta file(s) (DELTA [DELTA ...]) - From Step7. | |
output | Pangenome coordinate JSON file (OUTPUT) | |
Example | structural_pangenome build-coordinate deltas/pangenome__fasta_vs_assembly_1_bbmhFilter.delta -o outputs/coordinate_json.json --pretty --unzipped -m | |
haptic example | haptic build-json deltas/pangenome__fasta_vs_assembly_1_bbmhFilter.delta -o outputs/coordinate-json.json | |
Usage | structural_pangenome build-coordinate [-h] (-p PREFIX | -o OUTPUT) [--unzipped] [--pretty] [-m [METRICS]] DELTA [DELTA ...] | |
Can be Parallelized | Yes | |
Optional | No |
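The --unzipped and --pretty flags control how the JSON file is written. The sketch below illustrates the difference between a compressed and an uncompressed file, and between compact and indented JSON. It assumes the compressed form is gzip, which this guide does not state. The `write_json` helper and the demo dictionary are hypothetical and do not reflect the real structure of the coordinate file.

```python
import gzip
import json
import os
import tempfile


def write_json(data: dict, path: str, unzipped: bool = False, pretty: bool = False) -> None:
    """Write data as JSON; gzip-compressed unless unzipped is set (assumption)."""
    text = json.dumps(data, indent=2 if pretty else None)
    if unzipped:
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(text)
    else:
        with gzip.open(path, "wt", encoding="utf-8") as handle:
            handle.write(text)


# Hypothetical content; the actual coordinate JSON layout is not shown here.
demo = {"segment_1": {"start": 0, "end": 100}}

with tempfile.TemporaryDirectory() as folder:
    plain_path = os.path.join(folder, "coordinate.json")
    packed_path = os.path.join(folder, "coordinate.json.gz")

    write_json(demo, plain_path, unzipped=True, pretty=True)
    write_json(demo, packed_path)

    with open(plain_path, encoding="utf-8") as handle:
        plain_text = handle.read()
    with gzip.open(packed_path, "rt", encoding="utf-8") as handle:
        packed_data = json.load(handle)

print(plain_text)
```

The indented, uncompressed form is convenient for reading by eye, while the compact compressed form is smaller on disk.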
...
Description | Get the sequence length distribution of the pangenome segment with coordinate in the pangenome | ||
Command | structural_pangenome find-coordinate-seqlen pangenome_coordinate_file sequence_lengths_file -o output_file -g OR structural_pangenome find-coordinate-seqlen pangenome_coordinate_file sequence_lengths_file -p prefix_folder -g | ||
haptic command | haptic find-seqlen reverse_filtered_delta_file -o output_file -g OR haptic find-seqlen reverse_filtered_delta_file -p prefix_folder -g | ||
Parameters | o: output filename (with complete path) p: prefix_folder which stores the resulting output file (with complete path) g: optional, generate the sequence length distribution graph | ||
Input | pangenome coordinate file (COORDINATE) - From Step8. sequence lengths file (.csv) (SEQLEN) - From Step2.4 | ||
output | pangenome_coordinate_seqLength Graph (OUTPUT) | ||
Usage | structural_pangenome find-coordinate-seqlen [-h] (-p PREFIX | -o OUTPUT) [-g] COORDINATE SEQLEN | ||
Example | structural_pangenome find-coordinate-seqlen Step8.CoordPangenome/coordAssem_pangenome.json Step2.4.seqLens/pangenome__fasta_seqlen.csv -p outputs/seqlenPangenome -g | ||
haptic example | haptic find-seqlen deltas/pangenome__fasta_vs_assembly_1_bbmhFilter.delta -p outputs/seqlen | ||
Can be Parallelized | Yes | ||
Optional | Yes |
Step10: Pangenome Path Creation - WIP
Description | Builds a pangenome path json file, based on the coordinate json file |
Command | structural_pangenome build-path pangenome_coordinate_json_file -o output_file |
Parameters | o: output filename (with complete path) p: prefix_folder which stores the resulting output file (with complete path) unzipped, optional: set this flag to generate an uncompressed file (--unzipped) pretty, optional: set this flag to generate a human-readable JSON file (--pretty) |
Input | pangenome coordinate JSON file (JSON) - From Step8. |
output | pangenome path JSON file (OUTPUT) |
Usage | structural_pangenome build-path [-h] (-p PREFIX | -o OUTPUT) [--unzipped] [--pretty] JSON |
Example | structural_pangenome build-path Step8.CoordPangenome/coordAssem_pangenome.json -o 10.PathPangenome/path__pangenome.json --unzipped --pretty |
Can be Parallelized | Yes |
Optional | No |