Step1: Ngap detection

Description 

Report the stretches of N bases (Ngaps) in each assembly 

Command 

haptic find-ngaps assembly_fasta_file -o output_file

OR

haptic find-ngaps assembly_fasta_file -p prefix_folder

Parameters 

o: output filename (with complete path)

p: prefix_folder which stores the resulting output file.

Input 

Fasta file for each assembly 

Output 

Ngap.bed file for each assembly 

Example 

haptic find-ngaps fastas/assembly_1.fas -o outputs/resulting_ngaps.bed

Can be Parallelized 

Yes 

Optional  

No 
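
Because this step can be parallelized, one way to process a whole folder of assemblies is to start one background job per file. The sketch below is only an illustration: the function name, the .fas extension and the <assembly>_ngaps.bed naming are assumptions, and only the documented "haptic find-ngaps ... -o" form is used.

```sh
# Run `haptic find-ngaps` for every assembly in a folder, one background job each.
# Assumptions (not from the haptic documentation): assemblies end in .fas and
# outputs are named <assembly>_ngaps.bed.
find_ngaps_all() {
    fasta_dir=$1
    out_dir=$2
    mkdir -p "$out_dir"
    for fasta in "$fasta_dir"/*.fas; do
        [ -e "$fasta" ] || continue          # skip when the glob matches nothing
        name=$(basename "$fasta" .fas)
        haptic find-ngaps "$fasta" -o "$out_dir/${name}_ngaps.bed" &
    done
    wait                                     # block until all jobs have finished
}
```

Usage would look like: find_ngaps_all fastas outputs/ngaps. The number of simultaneous jobs is not limited here, so with many assemblies it may be worth running them in smaller batches.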


Step2.1: Indexing

Description 

Indexing the assembly fasta file

Command 

haptic index assembly_fasta_file -o output_file

OR

haptic index assembly_fasta_file -p prefix_folder

Parameters 

o: output filename (with complete path)

p: prefix_folder which stores the resulting output file.

Input 

Fasta file for each assembly 

Output 

Indexed fasta file for each assembly 

Example 

haptic index fastas/assembly_1.fas -p outputs/indexed_fasta

Can be Parallelized 

Yes 

Optional  

No 


Step2.2: Iterative Pangenome graph build

Description 

Build the pangenome graph iteratively with minigraph. The order in which the assemblies are processed matters, so define it first: choose the best assembly as the starting point, then rank the remaining genomes by assembly completeness and by distance from the starting assembly (from closest to farthest). 

Command 

haptic build-graph assembly1_fasta_file assembly2_fasta_file -o assembly1__assembly2_graph_filename

OR

haptic build-graph assembly1_fasta_file assembly2_fasta_file -p folder_to_populate_output

Parameters 

o: output filename (with complete path)

p: prefix_folder which stores the resulting output file (with complete path)

Input 

Fasta file for each assembly 

Output 

One gfa file for all assemblies 

Example 

haptic build-graph fastas/assembly_1.fas fastas/assembly_2.fas -p outputs/assembly_pangenome_graph

Can be Parallelized 

No 

Optional  

No 
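
To make the effect of the ordering concrete, the sketch below calls minigraph directly with the incremental graph generation invocation from its README (minigraph -cxggs ...), adding assemblies in the order given in a text file. How haptic build-graph drives minigraph internally is not described here, so this is an assumption about the underlying tool, not a description of haptic. The function name and the order file are hypothetical.

```sh
# Build a graph with minigraph directly, adding assemblies in the order listed
# in a plain-text file (one FASTA path per line, starting assembly first).
# Assumptions: the order file is hypothetical and paths contain no newlines.
build_graph_in_order() {
    order_file=$1
    out_gfa=$2
    threads=${3:-4}
    # Collect the non-empty lines, in file order, as positional parameters.
    set --
    while IFS= read -r fasta || [ -n "$fasta" ]; do
        [ -n "$fasta" ] && set -- "$@" "$fasta"
    done < "$order_file"
    if [ "$#" -lt 2 ]; then
        echo "need at least two assemblies in $order_file" >&2
        return 1
    fi
    # -cxggs is the incremental graph generation call shown in the minigraph README.
    minigraph -cxggs -t"$threads" "$@" > "$out_gfa"
}
```

With an order file listing fastas/assembly_1.fas first and fastas/assembly_2.fas second, assembly_1 forms the backbone of the graph and assembly_2 is added to it, which is why the ranking described above should be settled before building.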


Step2.3: Graph & stat - WIP


Step2.4: Pangenome Sequence Length distribution

Description 

Get the pangenome sequence length distribution 

Command 

haptic find-seqlen pangenome_fasta_file -o output_file

OR

haptic find-seqlen pangenome_fasta_file -p prefix_folder

Parameters 

o: output filename (with complete path)

p: prefix_folder which stores the resulting output file (with complete path)

g: optional parameter which also displays a graph of the sequence length distribution

Input 

Pangenome fasta file 

Output 

One csv file for the whole Pangenome 

Example 

haptic find-seqlen pangenomes/pangenome__fasta.fas -o outputs/pangenome_seqlen.csv

Can be Parallelized 

No 

Optional  

Yes 
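
For a quick cross-check of the csv, the per-sequence lengths of a fasta file can be computed with plain awk. This is a minimal sketch: the function name and the "sequence,length" layout are assumptions and may not match the columns that haptic find-seqlen writes.

```sh
# Write "name,length" for every record of a FASTA file.
# Assumption: this CSV layout is illustrative, not haptic's actual output format.
fasta_lengths_csv() {
    awk '
        BEGIN { print "sequence,length" }
        /^>/  {
            if (name != "") print name "," len   # flush the previous record
            name = substr($1, 2)                 # header text up to the first space
            len = 0
            next
        }
        { len += length($0) }                    # a sequence may span many lines
        END { if (name != "") print name "," len }
    ' "$1"
}
```

Usage would look like: fasta_lengths_csv pangenomes/pangenome__fasta.fas > outputs/lengths_check.csv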


Step2.5: Assembly vs pangenome

Description 

Compare Assembly vs pangenome 

Command 

haptic align indexed_assembly_fasta_file pangenome_fasta_file -o output_file

OR

haptic align indexed_assembly_fasta_file pangenome_fasta_file -p prefix_folder

Parameters 

o: output filename (with complete path)

p: prefix_folder which stores the resulting output file (with complete path)

Input 

Indexed fasta file for each assembly, and the pangenome fasta file (shared separately) 

Output 

Pangenome paf file 

Example 

haptic align fastas/assembly_1_indexed.fas pangenome/pangenome__fasta.fas -o outputs/output_pangenome.paf

Can be Parallelized 

Yes 

Optional 

No 
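
Before moving on to the conversion step, it can help to sanity-check the paf file. The sketch below relies only on the standard PAF layout (12 mandatory tab-separated columns, with column 10 the number of matching bases and column 11 the alignment block length). The function name and the output wording are hypothetical.

```sh
# Summarise a PAF file: number of alignments, matching bases (column 10),
# alignment block length (column 11) and their ratio.
paf_summary() {
    awk -F '\t' '
        NF < 12 { bad++; next }                  # PAF has 12 mandatory columns
        { n++; matches += $10; block += $11 }
        END {
            printf "alignments=%d malformed=%d matches=%d block=%d", n, bad, matches, block
            if (block > 0) printf " match_fraction=%.4f", matches / block
            printf "\n"
        }
    ' "$1"
}
```

Usage would look like: paf_summary outputs/output_pangenome.paf. A non-zero "malformed" count suggests a truncated or otherwise damaged file.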


Step3: Convert the PAF file into a delta file

Description 

Convert the PAF file into a delta file. 

Command 

haptic convert paf2delta paf_file assembly__fasta pangenome__fasta -o output_file

OR

haptic convert paf2delta paf_file assembly__fasta pangenome__fasta -p prefix_folder

Parameters 

o: output filename (with complete path)

p: prefix_folder which stores the resulting output file (with complete path)

Input 

  • Pangenome Fasta File 
  • Assembly Fasta File 
  • Assembly Paf File 

Output 

Assembly Delta File 

Example 

haptic convert paf2delta pafs/assembly_1_vs_pangenome__fasta.paf fastas/assembly_1.fas pangenomes/pangenome__fasta.fas -p outputs/paf2delta

Can be Parallelized 

Yes 

Optional 

No 
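
Steps 2.1, 2.5 and 3 feed into each other, so for a single assembly they can be chained with explicit -o paths. The sketch below uses only the command forms documented above. The function name and the intermediate file names are assumptions chosen for illustration, not names that haptic produces by default.

```sh
# Steps 2.1, 2.5 and 3 for a single assembly, using explicit -o paths so that
# each step knows where the previous one wrote its result.
# Assumption: the file names below are illustrative, not haptic defaults.
assembly_to_delta() {
    assembly=$1      # e.g. fastas/assembly_1.fas
    pangenome=$2     # e.g. pangenomes/pangenome__fasta.fas
    out_dir=$3
    name=$(basename "$assembly" .fas)
    mkdir -p "$out_dir"
    # `&&` stops the chain as soon as one step fails.
    haptic index "$assembly" -o "$out_dir/${name}_indexed.fas" &&
    haptic align "$out_dir/${name}_indexed.fas" "$pangenome" \
        -o "$out_dir/${name}_vs_pangenome.paf" &&
    haptic convert paf2delta "$out_dir/${name}_vs_pangenome.paf" \
        "$assembly" "$pangenome" -o "$out_dir/${name}_vs_pangenome.delta"
}
```

Usage would look like: assembly_to_delta fastas/assembly_1.fas pangenomes/pangenome__fasta.fas outputs. Since each of these steps can be parallelized, the function can be started once per assembly.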


Step4: Rendering dotplots

Description 

Optional dotplot of the whole-genome alignment (WGA) 

Command 

haptic render-dotplot delta_file -o output_file

OR

haptic render-dotplot delta_file -p prefix_folder 

Parameters 

o: output filename (with complete path)

p: prefix_folder which stores the resulting output file (with complete path)

Input 

Assembly fasta delta file 

Output 

  • Assembly Fasta gp file 
  • Assembly Fasta rplot file 
  • Assembly Fasta fplot file 
  • Assembly Fasta PNG file 

Example 

haptic render-dotplot deltas/assembly_1_vs_pangenome__fasta.delta -p outputs/dotplot

Can be Parallelized 

Yes 

Optional 

Yes 


Step5: Filtering Delta files – WIP


Step6: Rendering dotplots for filtered Delta files

Description 

Optional dotplot of the filtered whole-genome alignment (WGA) 

Command 

haptic render-dotplot filtered_delta_file -o output_file

OR

haptic render-dotplot filtered_delta_file -p prefix_folder 

Parameters 

o: output filename (with complete path)

p: prefix_folder which stores the resulting output file (with complete path)

Input 

Assembly fasta filtered delta file 

Output 

  • Assembly Fasta gp file 
  • Assembly Fasta rplot file 
  • Assembly Fasta fplot file 
  • Assembly Fasta PNG file 

Example 

haptic render-dotplot deltas/assembly_1_vs_pangenome__fasta_bbmhFilter.delta -p outputs/dotplot

Can be Parallelized 

Yes 

Optional 

Yes 


Step7: Reverse Filtered Delta Files - WIP


Step8: Coordinate Pangenome Creation

Description 

Create the coordinate pangenome system in JSON 

Command 

haptic build-json reverse_filtered_delta_file -o output_file

OR

haptic build-json reverse_filtered_delta_file -p prefix_folder 

Parameters 

o: output filename (with complete path)

p: prefix_folder which stores the resulting output file (with complete path)

Input 

Reversed filtered delta file 

Output 

Pangenome coordinate JSON file 

Example 

haptic build-json deltas/pangenome__fasta_vs_assembly_1_bbmhFilter.delta -o outputs/coordinate-json.json

Can be Parallelized 

Yes 

Optional 

No 




Step9: Pangenome Sequence Length Distribution

Description 

Get the sequence length distribution of the pangenome segments that have coordinates in the pangenome 

Command 

haptic find-seqlen reverse_filtered_delta_file -o output_file -g

OR

haptic find-seqlen reverse_filtered_delta_file -p prefix_folder -g

Parameters 

o: output filename (with complete path)

p: prefix_folder which stores the resulting output file (with complete path)

g: option to yield sequence length distribution graph

Input 

Reversed filtered delta file 

Output 

pangenome_coordinate_seqLength Graph 

Example 

haptic find-seqlen deltas/pangenome__fasta_vs_assembly_1_bbmhFilter.delta -p outputs/seqlen -g

Can be Parallelized 

Yes 

Optional 

Yes 
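
To see what kind of numbers go into such a distribution, the sketch below extracts alignment lengths from a delta file and bins them. It assumes the delta files follow the MUMmer delta layout (alignment header lines with seven integers: reference start, reference end, query start, query end, and three error counts) and that the pangenome is the reference side of the reversed delta file. Both assumptions, the function names and the "bin_start,count" output are illustrative and may differ from what haptic find-seqlen computes.

```sh
# Print the reference-side length of every alignment in a MUMmer-style delta file.
delta_ref_lengths() {
    awk '
        /^>/ { next }                    # sequence-pair header: >ref qry reflen qrylen
        NF == 7 {                        # alignment header: sR eR sQ eQ err simErr stops
            len = $2 - $1
            if (len < 0) len = -len
            print len + 1                # coordinates are inclusive
        }
    ' "$1"
}

# Count alignments per length bin (default bin width 1000), as "bin_start,count".
delta_length_histogram() {
    delta_ref_lengths "$1" |
        awk -v width="${2:-1000}" '{ print int($1 / width) * width }' |
        sort -n | uniq -c | awk '{ print $2 "," $1 }'
}
```

Usage would look like: delta_length_histogram deltas/pangenome__fasta_vs_assembly_1_bbmhFilter.delta 5000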


Step10: Pangenome Path Creation - WIP