About De Novo Assembly

De Bruijn Graphs: Sequencing reads are broken into k-mers (subsequences of length k). Each k-mer becomes a directed edge from the node for its prefix (k-1)-mer to the node for its suffix (k-1)-mer. In the ideal case, the assembly corresponds to an Eulerian path, a walk through the graph that traverses each edge exactly once.
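The construction can be sketched in a few lines of Python. This is a minimal illustration, not a production assembler: the function names (`build_de_bruijn`, `eulerian_path`, `spell`) and the two toy reads are made up for the example, and it assumes error-free, single-stranded reads, one edge per distinct k-mer, and a graph that actually contains an Eulerian path.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each prefix (k-1)-mer to the suffix (k-1)-mers it points to.

    Assumption for this sketch: one edge per distinct k-mer, so k-mer
    multiplicity (coverage) is ignored.
    """
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    graph = defaultdict(list)
    for kmer in sorted(kmers):
        graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


def eulerian_path(graph):
    """Hierholzer's algorithm; assumes an Eulerian path exists."""
    out_edges = {node: list(targets) for node, targets in graph.items()}
    in_degree = defaultdict(int)
    for targets in graph.values():
        for target in targets:
            in_degree[target] += 1
    # Start where out-degree exceeds in-degree by one, else at any node.
    start = next(
        (n for n in out_edges if len(out_edges[n]) - in_degree[n] == 1),
        next(iter(out_edges)),
    )
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if out_edges.get(node):
            stack.append(out_edges[node].pop())  # follow an unused edge
        else:
            path.append(stack.pop())  # dead end: node is finished
    path.reverse()
    return path


def spell(path):
    """Merge overlapping (k-1)-mers along a path into one sequence."""
    return path[0] + "".join(node[-1] for node in path[1:])


graph = build_de_bruijn(["ATGGCGT", "GGCGTGC"], k=4)
assembled = spell(eulerian_path(graph))
print(assembled)  # ATGGCGTGC
```

The two overlapping reads share the k-mers GGCG and GCGT, so their edges merge in the graph and the Eulerian path spells the combined sequence.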

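Repeats Create Branches: When a sequence of at least k-1 bases appears more than once in the genome, its copies collapse onto the same nodes, and the differing flanking sequences create branch points where a node has several incoming or outgoing edges. The assembler cannot tell which branch to follow at these ambiguous junctions, so it stops there and produces multiple contigs instead of one complete sequence.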
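To make the effect concrete, the sketch below extracts contigs as maximal non-branching paths, stopping at every node that does not have exactly one incoming and one outgoing edge. The function name `unitigs` and both toy sequences are hypothetical. For brevity it takes k-mers directly from a known sequence rather than from reads, and it ignores isolated cycles.

```python
from collections import defaultdict


def unitigs(sequence, k):
    """Return contigs spelled by maximal non-branching paths.

    Assumptions for this sketch: k-mers are taken straight from
    `sequence`, each distinct k-mer is one edge, and isolated cycles
    are not reported.
    """
    kmers = {sequence[i:i + k] for i in range(len(sequence) - k + 1)}
    out_edges = defaultdict(list)
    in_degree = defaultdict(int)
    for kmer in sorted(kmers):
        out_edges[kmer[:-1]].append(kmer[1:])
        in_degree[kmer[1:]] += 1

    def is_unambiguous(node):
        # Exactly one way in and one way out: safe to walk through.
        return in_degree.get(node, 0) == 1 and len(out_edges.get(node, [])) == 1

    contigs = []
    for node in list(out_edges):
        if is_unambiguous(node):
            continue  # paths start only at branch points or path ends
        for nxt in out_edges[node]:
            contig = node + nxt[-1]
            while is_unambiguous(nxt):
                nxt = out_edges[nxt][0]
                contig += nxt[-1]
            contigs.append(contig)
    return sorted(contigs)


no_repeat = unitigs("ATGCGTACC", k=4)
with_repeat = unitigs("ATGCGTACCGTAGG", k=4)  # CGTA occurs twice
print(no_repeat)    # ['ATGCGTACC']
print(with_repeat)  # ['ATGCGT', 'CGTA', 'GTACCGT', 'GTAGG']
```

In the second sequence the repeated CGTA gives node CGT two incoming edges and node GTA two outgoing edges, so the single contig breaks into four.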

N50 Metric: N50 is the largest contig length L such that contigs of length L or longer together contain at least 50% of the total assembled sequence. A higher N50 indicates better assembly contiguity, though not necessarily better accuracy. Higher coverage helps resolve ambiguities, but it cannot fully resolve repeats longer than the read length.
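The definition translates directly into a short calculation: sort the contigs from longest to shortest and accumulate lengths until half of the total is reached. The function name `n50` and the example lengths are illustrative only.

```python
def n50(contig_lengths):
    """Largest length L such that contigs >= L cover at least half the total."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return 0  # no contigs


example = n50([100, 80, 50, 30, 20, 20])
print(example)  # 80
```

Here the total is 300 bases. The longest contig alone covers 100, and adding the second brings the running sum to 180, which passes the halfway mark of 150, so N50 is 80.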