Note: adding the -h flag after a command (e.g. structural_pangenome find-ngaps -h) prints a help message for that command.

Step 1: Ngap detection

Description 

Report the stretches of N bases (Ngaps) in each assembly.

Command 

structural_pangenome find-ngaps assembly_fasta_file -o output_file

OR

structural_pangenome find-ngaps assembly_fasta_file -p prefix_folder

Parameters 

o: output filename (with complete path)

p: prefix_folder which stores the resulting output file.

t: threshold (optional): minimum recurrence of an Ngap sequence length for it to affect the statistics (e.g. 100)

Input 

FASTA file for each assembly (FILE)

Output 

Ngap.bed file for each assembly (OUTPUT)
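To illustrate what an Ngap report contains, here is a minimal sketch that scans a FASTA file for runs of N and prints them as BED intervals. It is not the implementation used by find-ngaps: the function name ngaps_to_bed is hypothetical, and the three-column layout (sequence name, 0-based start, exclusive end) is an assumption based on the usual BED convention, since the exact columns of the tool's output are not described here.

```shell
# Sketch only: report runs of N/n in a FASTA file as BED intervals.
# Assumption: 3-column BED (name, 0-based start, end exclusive).
ngaps_to_bed() {
    awk '
        # Print the gap that is currently open, if any.
        function emit_gap() {
            if (in_gap) { print name "\t" start "\t" pos; in_gap = 0 }
        }
        # Header line: close any open gap and reset the position counter.
        /^>/ { emit_gap(); name = substr($1, 2); pos = 0; next }
        {
            n = length($0)
            for (i = 1; i <= n; i++) {
                c = substr($0, i, 1)
                if (c == "N" || c == "n") {
                    if (!in_gap) { in_gap = 1; start = pos }
                } else {
                    emit_gap()
                }
                pos++
            }
        }
        END { emit_gap() }
    ' "$1"
}
```

For example, ngaps_to_bed fastas/assembly_1.fas > assembly_1.ngaps.bed would write one line per stretch of N, including stretches that span several FASTA lines.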

Usage

structural_pangenome find-ngaps [-h] (-p PREFIX | -o OUTPUT) [-t THRESHOLD] FILE

Example 

structural_pangenome find-ngaps fastas/assembly_1.fas -o outputs/resulting_ngaps.bed

Can be Parallelized 

Yes 
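Because each assembly is processed independently, the step can be run once per FASTA file. The sketch below only prints one find-ngaps command per assembly, using the -o option documented above. The helper name ngap_commands, the output folder, and the .ngaps.bed suffix are illustrative assumptions, not part of the tool.

```shell
# Sketch only: print one find-ngaps command per FASTA file.
# Hypothetical helper; the .ngaps.bed suffix is an assumed naming scheme.
ngap_commands() {
    out_dir=$1
    shift
    for fasta in "$@"; do
        name=$(basename "$fasta")
        name=${name%.*}    # drop the file extension
        printf 'structural_pangenome find-ngaps %s -o %s/%s.ngaps.bed\n' \
            "$fasta" "$out_dir" "$name"
    done
}

# Example: list the commands for two assemblies.
ngap_commands outputs fastas/assembly_1.fas fastas/assembly_2.fas
```

Piping the printed lines to sh runs them one after another. Where xargs supports -P (GNU and BSD versions do), something like ngap_commands outputs fastas/*.fas | xargs -P 4 -I CMD sh -c CMD would run up to four of them at once.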

Optional  

No 

...

Description 

Indexing the assembly fasta file

Command 

structural_pangenome index assembly_fasta_file -o output_file

OR

structural_pangenome index assembly_fasta_file -p prefix_folder

Parameters 

o: output filename (with complete path)

p: prefix_folder which stores the resulting output file.

Input 

FASTA file for each assembly (FILE)

Output 

Indexed FASTA file for each assembly (OUTPUT)

Usage

structural_pangenome index [-h] (-p PREFIX | -o OUTPUT) FILE

Example 

structural_pangenome index fastas/assembly_1.fas -p outputs/indexed_fasta

Can be Parallelized 

Yes 

Optional  

No 

...

Description 

Build the pangenome graph iteratively using minigraph. The order in which the assemblies are processed must be defined beforehand: first choose the best assembly to start with, then rank the remaining genomes by assembly completeness and by distance to that first assembly (from closest to farthest). 

Command 

structural_pangenome build-graph assembly1_fasta_file assembly2_fasta_file -o assembly1__assembly2_graph_filename

OR

structural_pangenome build-graph assembly1_fasta_file assembly2_fasta_file -p folder_to_populate_output

Parameters 

o: output filename (with complete path)

p: prefix_folder which stores the resulting output file (with complete path)

Input 

FASTA file for each assembly (FASTA), and possibly a GFA file to add to (GRAPH)

Output 

One GFA file for all assemblies (OUTPUT)

Usage

structural_pangenome build-graph [-h] (-p PREFIX | -o OUTPUT) GRAPH FASTA

Example 

structural_pangenome build-graph fastas/assembly_1.fas fastas/assembly_2.fas -p outputs/assembly_pangenome_graph
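The iterative build over an ordered list of assemblies can be sketched as a chain of build-graph calls. The helper below only prints the commands and makes two assumptions that this page does not confirm: that the graph written by one call can be passed as the first (GRAPH) argument of the next call, and that intermediate graphs are named graph_step_N.gfa. The helper name iterative_graph_commands is hypothetical.

```shell
# Sketch only: print the chain of build-graph commands for an ordered
# list of assemblies (best assembly first, then closest to farthest).
# Assumption: each output graph is reused as the GRAPH argument of the
# next call; the graph_step_N.gfa names are illustrative.
iterative_graph_commands() {
    out_dir=$1
    current=$2    # starting assembly
    shift 2
    step=1
    for fasta in "$@"; do
        next="$out_dir/graph_step_$step.gfa"
        printf 'structural_pangenome build-graph %s %s -o %s\n' \
            "$current" "$fasta" "$next"
        current=$next
        step=$((step + 1))
    done
}

# Example: three assemblies, already ranked.
iterative_graph_commands outputs fastas/assembly_1.fas fastas/assembly_2.fas fastas/assembly_3.fas
```

Each printed command depends on the output of the previous one, so the chain has to run in order. This matches the step being marked as not parallelizable.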

Parallelizable 

No 

Optional  

No 

...