Use mmseqs2 MSA Generation

Official Neurosnap webserver for accessing mmseqs2 MSA Generation online.

Overview

Generate multiple sequence alignments (MSAs) with mmseqs2, a tool built to process large sequence datasets with high speed and accuracy. mmseqs2 generates deep MSAs to support downstream analysis in protein structure prediction, evolutionary studies, and function annotation.

Neurosnap Overview

The mmseqs2 MSA Generation online webserver allows anybody with a Neurosnap account to run and access mmseqs2 MSA Generation, no downloads required. Information submitted through this webserver is kept confidential and never sold to third parties as detailed by our strong terms of service and privacy policy.

Features

  • Generates high-quality multiple sequence alignments for protein sequences.
  • Supports multiple input sequences with unpaired, paired, and combined pairing modes.
  • Enables deep MSAs, aiding in accurate protein structure prediction and functional analysis.
  • Ideal for applications in evolutionary studies, protein annotation, and bioinformatics research.

Statistics

Neurosnap periodically calculates runtime statistics based on job execution data. These estimates provide a general guideline for how long your job may take, but actual runtimes can vary significantly depending on factors like input size or settings used.

Reported statistics: credit usage rate, estimated total cost, runtime mean, runtime median, runtime standard deviation, runtime 90th percentile, and longest runtime.

API Request

Access mmseqs2 MSA Generation using the Neurosnap API by sending a request using any programming language with HTTP support. To safely generate an API key, visit the API tab of your overview page.

Job Note

Provide a name or description for your job to help you organize and track its results. This input is solely for organizational purposes and does not impact the outcome of the job.

Configuration & Options

Service Inputs

The amino acid sequences of the proteins you want to generate an MSA for. This will not generate a separate MSA for each provided protein but will instead generate a single paired MSA using all provided sequences. NOTE: This service uses the ColabFold mmseqs2 API and will share your input sequences with that API. If you are not comfortable sharing your sequences with the third-party API, or are not authorized to do so, do not use this service.

Choose a mode for MSA generation: 'unpaired' creates individual MSAs for each sequence; 'paired' generates MSAs for related sequences to preserve inter-sequence relationships; 'unpaired_paired' combines both approaches, maximizing coverage by first pairing sequences, then supplementing with unpaired MSAs if needed. For more information see this blog post: https://neurosnap.ai/blog/post/6711802c8bedccdda9e14fe4
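The difference between the three modes can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the function name `sketch_msa`, the use of a species label as the pairing key, and the output layout are hypothetical and do not describe how the service or mmseqs2 pairs sequences internally.

```python
VALID_MODES = ("unpaired", "paired", "unpaired_paired")


def sketch_msa(chain_hits, mode):
    """Toy illustration of the three modes (not the service's implementation).

    chain_hits: one dict per input sequence, mapping a hypothetical species
    label to the homolog found for that sequence.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}")
    if not chain_hits:
        raise ValueError("at least one input sequence is required")

    result = {"paired": [], "unpaired": []}

    if mode in ("paired", "unpaired_paired"):
        # Assumption: a paired row needs a homolog for every input sequence
        # under the same label, so only shared labels are kept.
        shared = set.intersection(*(set(hits) for hits in chain_hits))
        result["paired"] = [
            tuple(hits[label] for hits in chain_hits) for label in sorted(shared)
        ]

    if mode in ("unpaired", "unpaired_paired"):
        # Each input sequence keeps its own independent list of homologs.
        result["unpaired"] = [
            [hits[label] for label in sorted(hits)] for hits in chain_hits
        ]

    return result


chain_a = {"human": "MKT", "mouse": "MKS", "yeast": "MRT"}
chain_b = {"human": "GLV", "mouse": "GIV"}
combined = sketch_msa([chain_a, chain_b], "unpaired_paired")
```

In this sketch the yeast homolog of the first sequence has no partner, so it appears only in the unpaired part of the result, which is the coverage gain that the combined mode is meant to provide.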

This parameter determines the minimum percentage of sequence coverage required for a sequence to be included in the multiple sequence alignment (MSA). Specifically, it refers to the proportion of residues in the query sequence that must align with residues in a potential sequence match. For example, with a coverage threshold of 50% (the default), at least half of the positions in the query sequence must have aligned positions in the matched sequence. This ensures that only sequences with sufficient overlap in the alignment space are included, which can be especially useful in filtering out shorter or partial matches that may not provide meaningful conservation information. Adjusting this parameter can fine-tune the sensitivity of the MSA, balancing the trade-off between the amount of information retained and the level of similarity across sequences.
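As a rough illustration of the coverage threshold, the sketch below computes query coverage from a pair of already-aligned sequences, with `-` marking gaps. The function names and the gap convention are assumptions for this example; mmseqs2 computes coverage internally and its exact definition may differ.

```python
def query_coverage(query_aln: str, hit_aln: str) -> float:
    """Fraction of query residues that align to a residue (not a gap) in the hit."""
    if len(query_aln) != len(hit_aln):
        raise ValueError("aligned sequences must have equal length")
    # Count only real query residues; gap columns in the query are ignored.
    query_residues = sum(1 for q in query_aln if q != "-")
    if query_residues == 0:
        return 0.0
    aligned = sum(
        1 for q, h in zip(query_aln, hit_aln) if q != "-" and h != "-"
    )
    return aligned / query_residues


def passes_coverage(query_aln: str, hit_aln: str, min_coverage: float = 0.5) -> bool:
    """Apply a minimum-coverage filter (0.5 mirrors the 50% default above)."""
    return query_coverage(query_aln, hit_aln) >= min_coverage
```

For example, a hit that aligns to 4 of 8 query residues has a coverage of 0.5 and is kept at the default threshold, while a hit aligning to only 2 of 8 residues is dropped.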

This parameter sets the minimum percentage of sequence identity required for a sequence to be included in the MSA. Sequence identity refers to the exact match of amino acid or nucleotide residues at aligned positions between sequences. With an identity threshold of 90% (the default), only sequences with at least 90% identity to the query sequence are retained. This setting is critical for filtering out highly divergent sequences that may not accurately represent conserved evolutionary information or functional relevance. Lowering the identity threshold increases diversity within the MSA, potentially capturing more distant homologs, whereas a higher threshold ensures that only close homologs, which are more functionally similar, are considered.
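The identity threshold can be sketched in the same spirit. This example assumes identity is measured over columns where both sequences have a residue; other denominators are in common use, and the function names are hypothetical rather than part of mmseqs2 or the Neurosnap API.

```python
def sequence_identity(query_aln: str, hit_aln: str) -> float:
    """Fraction of aligned (non-gap) columns where query and hit are identical."""
    if len(query_aln) != len(hit_aln):
        raise ValueError("aligned sequences must have equal length")
    # Assumption: columns with a gap in either sequence are excluded.
    aligned = [(q, h) for q, h in zip(query_aln, hit_aln) if q != "-" and h != "-"]
    if not aligned:
        return 0.0
    matches = sum(1 for q, h in aligned if q == h)
    return matches / len(aligned)


def passes_identity(query_aln: str, hit_aln: str, min_identity: float = 0.9) -> bool:
    """Apply a minimum-identity filter (0.9 mirrors the 90% default above)."""
    return sequence_identity(query_aln, hit_aln) >= min_identity
```

With a 10-residue query, a hit differing at one position has 90% identity and is retained at the default, whereas a hit differing at two positions (80%) is filtered out unless the threshold is lowered.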

Maximum number of sequences to include in the final MSA. If the generated MSA exceeds this number, it will be trimmed down to this limit (default: 2048).
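A minimal sketch of this kind of trimming on FASTA-style text is shown below. It assumes trimming keeps the first records in file order, so the query (conventionally the first record) is preserved; the service does not state which sequences it drops, and `trim_fasta_msa` is a hypothetical helper.

```python
def trim_fasta_msa(text: str, max_sequences: int = 2048) -> str:
    """Keep at most max_sequences records from FASTA-style MSA text."""
    if max_sequences < 1:
        raise ValueError("max_sequences must be at least 1")
    records = []
    for line in text.splitlines():
        if line.startswith(">"):
            # A header line starts a new record.
            records.append([line])
        elif records and line.strip():
            # Sequence lines belong to the most recent header.
            records[-1].append(line)
    kept = records[:max_sequences]
    if not kept:
        return ""
    return "\n".join("\n".join(record) for record in kept) + "\n"


example = ">query\nMKT\n>hit1\nMKA\n>hit2\nMRT\n"
trimmed = trim_fasta_msa(example, max_sequences=2)
```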
