neurosnap.nucleotide module
Provides functions and classes related to processing nucleotide data.
- neurosnap.nucleotide.get_reverse_complement(seq)
Generate the reverse complement of a DNA or RNA sequence, i.e. the complementary strand read in reverse order.
- Parameters:
seq (str) – A string representing the nucleotide sequence. Valid characters are ‘A’, ‘T’, ‘C’, ‘G’ for DNA and ‘A’, ‘U’, ‘C’, ‘G’ for RNA.
- Returns:
A string representing the reverse complementary strand of the input sequence.
- Return type:
str
- Raises:
KeyError – If the input sequence contains invalid nucleotide characters.
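The behaviour described above can be illustrated with a minimal standalone sketch. This is not the library's implementation: the function name reverse_complement_sketch, the two lookup tables, and the rule of treating any sequence containing ‘U’ as RNA are all assumptions made for illustration. It shows only how a dictionary lookup naturally produces a KeyError for invalid characters.

```python
# Hypothetical lookup tables (not taken from neurosnap).
DNA_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
RNA_COMPLEMENT = {"A": "U", "U": "A", "C": "G", "G": "C"}


def reverse_complement_sketch(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA or RNA string."""
    # Assumption: a sequence containing 'U' is treated as RNA.
    table = RNA_COMPLEMENT if "U" in seq else DNA_COMPLEMENT
    # Walk the sequence backwards and complement each base.
    # An unknown character raises KeyError from the dict lookup.
    return "".join(table[base] for base in reversed(seq))


print(reverse_complement_sketch("ATGC"))  # DNA input
print(reverse_complement_sketch("AUGC"))  # RNA input
```

In this sketch, a sequence that mixes ‘T’ and ‘U’ also raises KeyError, because neither table contains both characters. Whether the library behaves the same way is not stated in its documentation.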
- neurosnap.nucleotide.split_interleaved_fastq(fn_in, output_dir, *, preserve_identifier_names=False)
Split an interleaved FASTQ into left/right FASTQ files.
Assumes pairs are adjacent (left read followed by right read). By default, headers are rewritten as “@<index>/1” and “@<index>/2”.
Supports gzip-compressed inputs with filenames ending in “.fastq.gz” or “.fq.gz”. Compression is detected from the filename, and compressed input is decompressed as a stream while reading.
- Parameters:
fn_in (Union[str, Path]) – Path to the interleaved FASTQ file.
output_dir (Union[str, Path]) – Directory to write outputs into.
preserve_identifier_names (bool) – If True, preserve the input read identifiers (normalizing mate suffix to “/1” or “/2”). If False, rewrite identifiers as “@<index>/1” and “@<index>/2”.
- Returns:
Paths to the left and right FASTQ output files.
- Return type:
Tuple[Path, Path]
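The splitting logic described above can be sketched as a standalone function. This is an illustration under stated assumptions, not the library's code: the name split_interleaved_sketch, the output filenames left.fastq and right.fastq, and the 0-based pair index are all hypothetical. The sketch covers only the default header rewriting and omits the preserve_identifier_names option.

```python
import gzip
from pathlib import Path
from typing import Tuple, Union


def split_interleaved_sketch(
    fn_in: Union[str, Path], output_dir: Union[str, Path]
) -> Tuple[Path, Path]:
    """Split an interleaved FASTQ file into left and right files."""
    fn_in = Path(fn_in)
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    # Hypothetical output names; the library's naming is not documented here.
    left_path = output_dir / "left.fastq"
    right_path = output_dir / "right.fastq"

    # Detect gzip compression from the filename only.
    is_gzip = fn_in.name.endswith((".fastq.gz", ".fq.gz"))
    opener = gzip.open if is_gzip else open

    with opener(fn_in, "rt") as fin, \
            open(left_path, "w") as left, \
            open(right_path, "w") as right:
        record_number = 0
        while True:
            header = fin.readline()
            if not header:
                break  # end of input
            # Remaining three lines of the record: sequence, '+', qualities.
            body = [fin.readline() for _ in range(3)]
            # Even records are left mates, odd records are right mates.
            pair_index, mate = divmod(record_number, 2)
            out = left if mate == 0 else right
            # Assumption: pair indices start at 0.
            out.write(f"@{pair_index}/{mate + 1}\n")
            out.writelines(body)
            record_number += 1

    return left_path, right_path
```

Reading one record at a time keeps memory use constant regardless of file size, which is the practical benefit of streaming the decompression.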