Note (Help Messages): adding the -h flag after a command (e.g. structural_pangenome find-ngaps -h) prints the help message for that command.

structural_pangenome --help lists the available structural_pangenome commands.


Step1: Ngap detection

Description 

Report the stretches of Ns (Ngaps) in each assembly

Command 

structural_pangenome find-ngaps assembly_fasta_file -o output_file

OR

structural_pangenome find-ngaps assembly_fasta_file -p prefix_folder

haptic command

haptic find-ngaps assembly_fasta_file -o output_file

OR

haptic find-ngaps assembly_fasta_file -p prefix_folder

Parameters 

o: output filename (with complete path)

p: prefix_folder which stores the resulting output file.

t: optional parameter, threshold: minimum recurrence of Ngap sequence length that affects statistics (integer)

Input 

Fasta file for each assembly  (FILE)

Output

Ngap.bed file for each assembly (OUTPUT)

Usage

structural_pangenome find-ngaps [-h] (-p PREFIX | -o OUTPUT) [-t THRESHOLD] FILE

Example 

structural_pangenome find-ngaps fastas/assembly_1.fas -o outputs/1.Ngaps/resulting_ngaps.bed

haptic example

haptic find-ngaps fastas/assembly_1.fas -o outputs/resulting_ngaps.bed

Can be Parallelized 

Yes 

Optional  

No 
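
To illustrate what this step reports, here is a minimal Python sketch. It is not the tool's implementation. It scans FASTA records for runs of N and yields BED-style intervals (0-based start, exclusive end). The function names iter_fasta and find_ngaps are hypothetical, and treating lowercase n as part of a gap is an assumption made for the example.

```python
import re
from typing import Iterable, Iterator, Tuple


def iter_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the sequence name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def find_ngaps(lines: Iterable[str]) -> Iterator[Tuple[str, int, int]]:
    """Yield (name, start, end) for every run of N (assumption: n counts too)."""
    for name, seq in iter_fasta(lines):
        for match in re.finditer(r"[Nn]+", seq):
            yield name, match.start(), match.end()


if __name__ == "__main__":
    demo = [">chr1", "ACGTNNNNAC", "GTNNA"]
    for name, start, end in find_ngaps(demo):
        # One tab-separated BED line per gap.
        print(f"{name}\t{start}\t{end}")
```

Sequence lines are joined before searching, so a gap that spans a line break in the FASTA file is reported as a single interval.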

...

Description

Extract all fragmented sequences from the minigraph pangenome into a FASTA file

Command

structural_pangenome convert gfa2fa assembly_1_2_3.gfa -o 2.3.Stat/pangenome__fasta.fas

Parameters

gfa2fa: convert GFA file to FASTA file

Input

One GFA file for all assemblies (GFA)

Output

One FASTA file for the whole pangenome (OUTPUT)

Usage

structural_pangenome convert gfa2fa GFA -o OUTPUT

Example

structural_pangenome convert gfa2fa 2.2.gfas/assembly123.gfa -o 2.3.Stat/pangenome__fasta.fas

Can be Parallelized

No

Optional

No
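
As a rough illustration of a GFA to FASTA conversion, the sketch below reads the segment (S) lines of a GFA file, whose second and third tab-separated fields are the segment name and its sequence, and writes one FASTA record per segment. It is a simplified sketch and not the gfa2fa implementation. The function names, the 60-column line width and the choice to skip segments whose sequence is "*" are assumptions.

```python
from typing import Iterable, Iterator, Tuple


def gfa_segments(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) for every S line of a GFA file."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # S lines look like: S <name> <sequence> [optional tags...]
        if len(fields) >= 3 and fields[0] == "S" and fields[2] != "*":
            yield fields[1], fields[2]


def to_fasta(segments: Iterable[Tuple[str, str]], width: int = 60) -> str:
    """Format segments as FASTA text, wrapping sequences at `width` columns."""
    out = []
    for name, seq in segments:
        out.append(f">{name}")
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "".join(line + "\n" for line in out)


if __name__ == "__main__":
    demo = [
        "H\tVN:Z:1.0",
        "S\ts1\tACGT\tLN:i:4",
        "L\ts1\t+\ts2\t+\t0M",
        "S\ts2\tGG",
    ]
    print(to_fasta(gfa_segments(demo)), end="")
```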

...

Description 

Get the pangenome sequence length distribution 

Command 

structural_pangenome find-seqlen pangenome_fasta_file -o output_file

OR

structural_pangenome find-seqlen pangenome_fasta_file -p prefix_folder

haptic command

haptic find-seqlen pangenome_fasta_file -o output_file

OR

haptic find-seqlen pangenome_fasta_file -p prefix_folder

Parameters 

o: output filename (with complete path) (OUTPUT)

p: prefix_folder which stores the resulting output file (with complete path) (PREFIX)

g: optional parameter which also displays a graph of the sequence length distribution (-g)

Input 

Pangenome Fasta file (FILE)

Output

One csv file for the whole Pangenome (OUTPUT)

Usage

structural_pangenome find-seqlen [-h] (-p PREFIX | -o OUTPUT) [-g] FILE

Example 

structural_pangenome find-seqlen pangenomes/pangenome__fasta.fas -o outputs/2.4.PangSeqLen/pangenome_seqlen.csv -g

haptic example

haptic find-seqlen pangenomes/pangenome__fasta.fas -o outputs/pangenome_seqlen.csv

Can be Parallelized 

No 

Optional  

Yes 
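
The idea behind a sequence length distribution can be sketched as follows: measure each FASTA record, count how many records share each length, and write the counts as CSV. This is only an illustration. The column names length and count, and the function names, are assumptions. The actual layout of the CSV written by find-seqlen is not described here.

```python
import csv
import io
from collections import Counter
from typing import Dict, Iterable


def sequence_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {sequence name: length} for FASTA-formatted lines."""
    lengths: Dict[str, int] = {}
    name = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif line and name is not None:
            lengths[name] += len(line)
    return lengths


def length_distribution(lengths: Dict[str, int]) -> Counter:
    """Count how many sequences have each length."""
    return Counter(lengths.values())


def to_csv(distribution: Counter) -> str:
    """Render the distribution as CSV text, sorted by length."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["length", "count"])  # hypothetical column names
    for length in sorted(distribution):
        writer.writerow([length, distribution[length]])
    return buffer.getvalue()


if __name__ == "__main__":
    demo = [">a", "ACGT", "AC", ">b", "GGGGGG", ">c", "TT"]
    print(to_csv(length_distribution(sequence_lengths(demo))), end="")
```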

...

Description 

Compare Assembly vs pangenome 

Command 

structural_pangenome align indexed_assembly_fasta_file pangenome_fasta_file -o output_file

OR

structural_pangenome align indexed_assembly_fasta_file pangenome_fasta_file -p prefix_folder

haptic command

haptic align indexed_assembly_fasta_file pangenome_fasta_file -o output_file

OR

haptic align indexed_assembly_fasta_file pangenome_fasta_file -p prefix_folder

Parameters 

o: output filename (with complete path)

p: prefix_folder which stores the resulting output file (with complete path)

Input 

Indexed Fasta file for each assembly (TARGET) - From Step2.1, and the Pangenome Fasta file (QUERY), which will be shared separately - From Step2.3

Output

Pangenome paf file  (OUTPUT)

Usage

structural_pangenome align [-h] (-p PREFIX | -o OUTPUT) TARGET QUERY

Example 

structural_pangenome align fastas/assembly_1_indexed.fas pangenome/pangenome__fasta.fas -o outputs/output_pangenome.paf

haptic example

haptic align fastas/assembly_1_indexed.fas pangenome/pangenome__fasta.fas -o outputs/output_pangenome.paf

Can be Parallelized 

Yes 

Optional 

No 
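
To inspect the resulting PAF file, the twelve mandatory tab-separated PAF columns can be read as in the sketch below. In this step the query columns describe the pangenome sequences and the target columns describe the assembly. The class and function names are hypothetical, and the matches / block length ratio is only a rough identity estimate, not a metric the pipeline is stated to use.

```python
from typing import NamedTuple


class PafRecord(NamedTuple):
    query_name: str
    query_length: int
    query_start: int   # 0-based
    query_end: int
    strand: str        # "+" or "-"
    target_name: str
    target_length: int
    target_start: int  # 0-based
    target_end: int
    matches: int       # number of matching bases
    block_length: int  # alignment block length, gaps included
    mapq: int


def parse_paf_line(line: str) -> PafRecord:
    """Parse the 12 mandatory columns of one PAF line; extra tags are ignored."""
    f = line.rstrip("\n").split("\t")
    if len(f) < 12:
        raise ValueError(f"expected at least 12 PAF columns, got {len(f)}")
    return PafRecord(
        f[0], int(f[1]), int(f[2]), int(f[3]), f[4],
        f[5], int(f[6]), int(f[7]), int(f[8]),
        int(f[9]), int(f[10]), int(f[11]),
    )


def approximate_identity(record: PafRecord) -> float:
    """Matching bases divided by alignment block length."""
    return record.matches / record.block_length


if __name__ == "__main__":
    line = "s1\t1000\t0\t500\t+\tchr1\t5000\t100\t600\t450\t500\t60\ttp:A:P"
    record = parse_paf_line(line)
    print(record.query_name, record.target_name, approximate_identity(record))
```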

...

Description 

Convert the PAF file into a delta file. 

Command 

structural_pangenome convert paf2delta paf_file assembly__fasta pangenome__fasta -o output_file

OR

structural_pangenome convert paf2delta paf_file assembly__fasta pangenome__fasta -p prefix_folder

haptic command

haptic convert paf2delta paf_file assembly__fasta pangenome__fasta -o output_file

OR

haptic convert paf2delta paf_file assembly__fasta pangenome__fasta -p prefix_folder

Parameters 

o: output filename (with complete path)

p: prefix_folder which stores the resulting output file (with complete path)

Input 

  • Assembly Paf File to convert (FILE) - From Step2.5
  • Assembly Fasta File (TARGET)
  • Pangenome Fasta File (QUERY) - From Step2.3

Output

Assembly Delta File (OUTPUT)

Usage

structural_pangenome convert paf2delta FILE TARGET QUERY [-h] (-p PREFIX | -o OUTPUT)

Example 

structural_pangenome convert paf2delta pafs/assembly_1_vs_pangenome__fasta.paf fastas/assembly_1.fas pangenomes/pangenome__fasta.fas -p outputs/paf2delta

haptic example

haptic convert paf2delta pafs/assembly_1_vs_pangenome__fasta.paf fastas/assembly_1.fas pangenomes/pangenome__fasta.fas -p outputs/paf2delta

Can be Parallelized 

Yes 

Optional 

No 
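
Because this step can be parallelized, one command per assembly can be launched concurrently. The sketch below builds the argument lists following the file naming used in the Example above and runs them in a thread pool. The helper names, the assembly names and the worker count are assumptions, and the directory layout is copied from the Example, not required by the tool.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


def paf2delta_command(assembly: str, pangenome_fasta: str, prefix: str) -> List[str]:
    """Build the paf2delta argument list for one assembly (paths are assumptions)."""
    return [
        "structural_pangenome", "convert", "paf2delta",
        f"pafs/{assembly}_vs_pangenome__fasta.paf",  # FILE
        f"fastas/{assembly}.fas",                    # TARGET
        pangenome_fasta,                             # QUERY
        "-p", prefix,
    ]


def run_all(commands: Sequence[List[str]], workers: int = 4) -> List[int]:
    """Run the commands concurrently and return their exit codes in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda cmd: subprocess.run(cmd).returncode, commands))


if __name__ == "__main__":
    assemblies = ["assembly_1", "assembly_2"]  # hypothetical assembly names
    commands = [
        paf2delta_command(name, "pangenomes/pangenome__fasta.fas", "outputs/paf2delta")
        for name in assemblies
    ]
    # Dry run: print the commands. Call run_all(commands) to execute them.
    for command in commands:
        print(" ".join(command))
```

Threads are sufficient here because each worker only waits on an external process. The same pattern applies to the other steps marked as parallelizable.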

...

Description 

Optional: render a dotplot of the whole-genome alignment (WGA)

Command 

structural_pangenome render-dotplot delta_file -o output_file

OR

structural_pangenome render-dotplot delta_file -p prefix_folder 

haptic command

haptic render-dotplot delta_file -o output_file

OR

haptic render-dotplot delta_file -p prefix_folder 

Parameters 

o: output filename (with complete path) (OUTPUT)

p: prefix_folder which stores the resulting output file (with complete path) (PREFIX)

Input 

Assembly fasta delta file (DELTA) - From Step3.

Output

  • Assembly Fasta gp file
  • Assembly Fasta rplot file
  • Assembly Fasta fplot file
  • Assembly Fasta PNG file

Usage

structural_pangenome render-dotplot [-h] (-p PREFIX | -o OUTPUT) DELTA

Example 

structural_pangenome render-dotplot Step3.deltas/assembly_1_vs_pangenome__fasta.delta -p outputs/dotplotStep4.Render

haptic example

haptic render-dotplot deltas/assembly_1_vs_pangenome__fasta.delta -p outputs/dotplot

Can be Parallelized 

Yes 

Optional 

Yes 

...

Description

Filter the minimap2 result (remove unwanted regions from the delta file)

Command

structural_pangenome filter-delta assembly_1_vs_pangenome__fasta.delta -o output_file

Parameters

o: output filename (with complete path) (OUTPUT)

p: prefix_folder which stores the resulting output file (with complete path) (PREFIX)

m: optional, metrics: save filtering metrics to a file (name auto-generated)

--inner_cutoff: optional, (INNER_CUTOFF): Threshold to filter the delta results based on the region in between two existing fragments.

default value: 0.8

--outer_cutoff: optional, (OUTER_CUTOFF): Threshold to filter the delta results based on the overlap with another fragment.

default value: 0.5

Input

Assembly Delta File - From Step3.

Output

Filtered Assembly Delta File

Usage

structural_pangenome filter-delta [-h] (-p PREFIX | -o OUTPUT) FILE [--inner_cutoff INNER_CUTOFF] [--outer_cutoff OUTER_CUTOFF] [-m]

Example

structural_pangenome filter-delta deltas/assembly_1_vs_pangenome__fasta.delta -o filtered_deltas/assembly_1_vs_pangenome__fasta_bbmh_filter.delta -m

Can be Parallelized

Yes

Optional

No

...

Description 

Optional: render a dotplot of the filtered whole-genome alignment (WGA)

Command 

structural_pangenome render-dotplot filtered_delta_file -o output_file

OR

structural_pangenome  render-dotplot filtered_delta_file -p prefix_folder

haptic command

haptic render-dotplot filtered_delta_file -o output_file

OR

haptic render-dotplot filtered_delta_file -p prefix_folder

Parameters 

o: output filename (with complete path) (OUTPUT)

p: prefix_folder which stores the resulting output file (with complete path) (PREFIX)

Input 

Assembly fasta filtered delta file (DELTA) - From Step5.

Output

  • Assembly Fasta gp file
  • Assembly Fasta rplot file
  • Assembly Fasta fplot file
  • Assembly Fasta PNG file

Usage

structural_pangenome render-dotplot [-h] (-p PREFIX | -o OUTPUT) DELTA

Example 

structural_pangenome render-dotplot deltas/assembly_1_vs_pangenome__fasta_bbmhFilter.delta -p outputs/dotplot

haptic example

haptic render-dotplot deltas/assembly_1_vs_pangenome__fasta_bbmhFilter.delta -p outputs/dotplot

Can be Parallelized 

Yes 

Optional 

Yes 

...

Description

Reverse the delta file: move the pangenome from query to reference and the assembly from reference to query

Command

structural_pangenome reverse-delta assembly_1_vs_pangenome__fasta_bbmhFilter.delta -o pangenome__fasta_vs_assembly_1_bbmhFilter.delta

Parameters

o: output filename (with complete path) (OUTPUT)

p: prefix_folder which stores the resulting output file (with complete path) (PREFIX)

Input

Assembly fasta filtered delta file (FILE) - From Step5.

Output

Reverse filtered delta file (OUTPUT)

Usage

structural_pangenome reverse-delta [-h] (-p PREFIX | -o OUTPUT) FILE

Example

structural_pangenome reverse-delta filtered_deltas/_assembly1_v1_bbmh-filter.delta -o reversed_filtered_deltas/pangenome__fasta_vs_assembly_1_bbmhFilter.delta

Can be Parallelized

Yes

Optional

No

...

Description 

Create the coordinate pangenome system in JSON 

Command 

structural_pangenome build-coordinate reverse_filtered_delta_file -o output_file

OR

structural_pangenome build-coordinate delta_file -p prefix_folder

haptic command

haptic build-json reverse_filtered_delta_file -o output_file

OR

haptic build-json delta_file -p prefix_folder

Parameters 

o: output filename (with complete path)

p: prefix_folder which stores the resulting output file (with complete path)

unzipped, optional: set this flag to generate an uncompressed file (--unzipped)

pretty, optional: set this flag to generate a human-readable JSON file (--pretty)

m, optional: metrics, whether to save coordinate metrics; the filename is auto-generated if not given (-m)

Input 

Reversed filtered delta file(s) (DELTA [DELTA ...]) - From Step7.

Output

Pangenome coordinate JSON file (OUTPUT)

Example 

structural_pangenome build-coordinate deltas/pangenome__fasta_vs_assembly_1_bbmhFilter.delta -o outputs/coordinate_json.json --pretty --unzipped -m

haptic example

haptic build-json deltas/pangenome__fasta_vs_assembly_1_bbmhFilter.delta -o outputs/coordinate-json.json

Usage

structural_pangenome build-coordinate [-h] (-p PREFIX | -o OUTPUT) [--unzipped] [--pretty] [-m [METRICS]] DELTA [DELTA ...]

Can be Parallelized 

Yes 

Optional 

No 
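
The effect of the --pretty and --unzipped flags can be pictured with the sketch below, which writes the same data either as compact or indented JSON, and either compressed or plain. It assumes gzip as the compression format, which is not stated above, and the sample data is a placeholder, not the real coordinate schema. The loader accepts both variants by checking for the gzip magic bytes.

```python
import gzip
import json
from typing import Any

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def dump_json(data: Any, pretty: bool = False, unzipped: bool = False) -> bytes:
    """Serialize data as JSON bytes; gzip-compressed unless `unzipped` is set."""
    if pretty:
        text = json.dumps(data, indent=2, sort_keys=True)
    else:
        text = json.dumps(data, separators=(",", ":"), sort_keys=True)
    raw = text.encode("utf-8")
    return raw if unzipped else gzip.compress(raw)


def load_json(payload: bytes) -> Any:
    """Load JSON from bytes that may or may not be gzip-compressed."""
    if payload[:2] == GZIP_MAGIC:
        payload = gzip.decompress(payload)
    return json.loads(payload.decode("utf-8"))


if __name__ == "__main__":
    sample = {"segment_1": [0, 10]}  # placeholder, not the real schema
    print(dump_json(sample, pretty=True, unzipped=True).decode("utf-8"))
    print(load_json(dump_json(sample)) == sample)
```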

...

Description 

Get the sequence length distribution of the pangenome segment with coordinate in the pangenome 

Command 

structural_pangenome find-coordinate-seqlen pangenome_coordinate_file sequence_lengths_file -o output_file -g

OR

structural_pangenome find-coordinate-seqlen pangenome_coordinate_file sequence_lengths_file -p prefix_folder -g

haptic command

haptic find-seqlen reverse_filtered_delta_file -o output_file -g

OR

haptic find-seqlen reverse_filtered_delta_file -p prefix_folder -g

Parameters 

o: output filename (with complete path)

p: prefix_folder which stores the resulting output file (with complete path)

g: optional, set this flag to generate a sequence length distribution graph (-g)

Input 

pangenome coordinate file (COORDINATE) - From Step8.

sequence lengths file (.csv) (SEQLEN) - From Step2.4

Output

pangenome_coordinate_seqLength Graph (OUTPUT)

Usage

structural_pangenome find-coordinate-seqlen [-h] (-p PREFIX | -o OUTPUT) [-g] COORDINATE SEQLEN

Example

structural_pangenome find-coordinate-seqlen Step8.CoordPangenome/coordAssem_pangenome.json Step2.4.seqLens/pangenome__fasta_seqlen.csv -p outputs/seqlenPangenome -g

haptic example

haptic find-seqlen deltas/pangenome__fasta_vs_assembly_1_bbmhFilter.delta -p outputs/seqlen

Can be Parallelized 

Yes 

Optional 

Yes 

...

Description

Build a pangenome path JSON file based on the coordinate JSON file

Command

structural_pangenome build-path pangenome_coordinate_json_file -o output_file

Parameters

o: output filename (with complete path)

p: prefix_folder which stores the resulting output file (with complete path)

unzipped, optional: set this flag to generate an uncompressed file (--unzipped)

pretty, optional: set this flag to generate a human-readable JSON file (--pretty)

Input

Pangenome coordinate JSON file (JSON) - From Step8.

Output

Pangenome path JSON file (OUTPUT)

Usage

structural_pangenome build-path [-h] (-p PREFIX | -o OUTPUT) [--unzipped] [--pretty] JSON

Example

structural_pangenome build-path Step8.CoordPangenome/coordAssem_pangenome.json -o 10.PathPangenome/path__pangenome.json --unzipped --pretty

Can be Parallelized

Yes

Optional

No

...