guido.locus
Module Contents
Classes
Functions
- _prepare_annotation: Prepare annotation file for use with pyranges.
- locus_from_coordinates: Create a locus from coordinates. Coordinates are 1-based.
- locus_from_sequence: Create a locus from sequence.
- locus_from_gene: Create a locus from gene name.
- class guido.locus.Locus(sequence, name=None, start=1, end=None, genome=None, annotation=None, **kwargs)
- guide(ix)
Fetch a guide from the locus by its index or name.
- Parameters:
- ix : str or int
Name (str) or index (int) of the gRNA.
- Returns:
- g : Guide
Guide object representing a gRNA.
Examples
>>> import guido
>>> seq = "TTATCATCCACTCTGACGGGTGGTATTGCGCAACTCCACGCCATCAAACATGTTCAGATTATGCAATCGTGAGTATTCGTTGACCACCGCTTGACCTGTGT"
>>> loc = guido.Locus(
...     sequence=seq, name="AgamP4_2R", start=48714554, end=48714654
... )
>>> loc.find_guides()
>>> loc.guide("gRNA-1")
gRNA-1(CGCAATACCACCCGTCAGAGTGG|AgamP4_2R:48714561-48714584|-|)
>>> loc.guide(0)
gRNA-1(CGCAATACCACCCGTCAGAGTGG|AgamP4_2R:48714561-48714584|-|)
- find_guides(pam='NGG', min_flanking_length=0, selected_features='all')
Find gRNAs in the locus.
- Parameters:
- pam : str, optional
gRNA PAM sequence. By default "NGG".
- min_flanking_length : int, optional
Length of the flanking region of the locus in which gRNAs are ignored. By default 0. Note that simulate_end_joining() requires a flanking region of 75 bp to simulate MMEJ.
- selected_features : str, optional
Limit the gRNA search to the specified genomic features. Features are defined in the provided genome annotation file. By default "all".
- Returns:
- sorted_guides : list
List of gRNAs sorted by their position in the locus.
Examples
>>> import guido
>>> genome = guido.load_genome_from_file(
...     guido_file="/Users/nkranjc/imperial/ref/new/AgamP4.guido"
... )
>>> loc = guido.locus_from_coordinates(genome, "AgamP4_2R", 48714541, 48714666)
>>> loc.find_guides()
>>> loc.guides
[gRNA-1(AAGTTTATCATCCACTCTGACGG|AgamP4_2R:48714550-48714572|+|),
 gRNA-2(CGCAATACCACCCGTCAGAGTGG|AgamP4_2R:48714561-48714583|-|),
 gRNA-3(AGTTTATCATCCACTCTGACGGG|AgamP4_2R:48714551-48714573|+|),
 gRNA-4(TTATCATCCACTCTGACGGGTGG|AgamP4_2R:48714554-48714576|+|),
 gRNA-5(TCTGAACATGTTTGATGGCGTGG|AgamP4_2R:48714589-48714611|-|),
 gRNA-6(CATAATCTGAACATGTTTGATGG|AgamP4_2R:48714594-48714616|-|),
 gRNA-7(GTTTAACACAGGTCAAGCGGTGG|AgamP4_2R:48714637-48714659|-|),
 gRNA-8(TATGTTTAACACAGGTCAAGCGG|AgamP4_2R:48714640-48714662|-|)]
Searching for gRNAs in a specific genomic feature:
>>> loc.find_guides(selected_features="exon")
>>> loc.guides
[gRNA-1(AAGTTTATCATCCACTCTGACGG|AgamP4_2R:48714550-48714572|+|),
 gRNA-2(CGCAATACCACCCGTCAGAGTGG|AgamP4_2R:48714561-48714583|-|),
 gRNA-3(AGTTTATCATCCACTCTGACGGG|AgamP4_2R:48714551-48714573|+|),
 gRNA-4(TTATCATCCACTCTGACGGGTGG|AgamP4_2R:48714554-48714576|+|),
 gRNA-5(TCTGAACATGTTTGATGGCGTGG|AgamP4_2R:48714589-48714611|-|),
 gRNA-6(CATAATCTGAACATGTTTGATGG|AgamP4_2R:48714594-48714616|-|)]
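To illustrate what a PAM-based gRNA search involves, here is a minimal sketch that scans both strands of a sequence for a 20-nt protospacer followed by the PAM. It is not guido's implementation: the function names, the 20-nt protospacer length, the 0-based half-open coordinates, and the way `min_flanking_length` is applied (sites must lie entirely outside the flanks) are all assumptions made for this example.

```python
import re

# Minimal IUPAC table, enough for simple PAMs such as "NGG" (assumption:
# extend it if a PAM uses other ambiguity codes).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def scan_pam_sites(sequence, pam="NGG", protospacer_len=20, min_flanking_length=0):
    """Hypothetical helper: list (start, end, strand, site) tuples.

    Coordinates are 0-based, half-open, and always given on the forward strand.
    """
    sequence = sequence.upper()
    n = len(sequence)
    pam_regex = "".join(IUPAC[base] for base in pam.upper())
    site_len = protospacer_len + len(pam)
    # A lookahead lets the regex report overlapping sites.
    pattern = re.compile(rf"(?=([ACGT]{{{protospacer_len}}}{pam_regex}))")

    hits = []
    for match in pattern.finditer(sequence):
        start = match.start()
        hits.append((start, start + site_len, "+", match.group(1)))
    for match in pattern.finditer(reverse_complement(sequence)):
        # Map the reverse-strand position back to forward-strand coordinates.
        end = n - match.start()
        hits.append((end - site_len, end, "-", match.group(1)))

    # Drop sites that overlap the ignored flanking regions.
    low, high = min_flanking_length, n - min_flanking_length
    return sorted(hit for hit in hits if hit[0] >= low and hit[1] <= high)
```

Calling `scan_pam_sites(seq)` on a short sequence returns the candidate sites sorted by start position; raising `min_flanking_length` removes sites near either end.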
- simulate_end_joining(n_patterns=5, length_weight=20)
Simulate end-joining and find MMEJ deletion patterns for each gRNA.
Microhomology scores are calculated using the scoring model proposed by Bae et al. (2014).
- Parameters:
- n_patterns : int, optional
Number of top-scoring MMEJ deletion patterns reported. By default 5.
- length_weight : int, optional
Length weight parameter used in MMEJ scoring, as defined by Bae et al. (2014). By default 20.
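The role of `length_weight` and `n_patterns` can be sketched as follows. The scoring form used here is an assumption about the microhomology model, not a copy of guido's code: each pattern is taken to score 100 × exp(−deletion_length / length_weight) × (number of A/T bases + 2 × number of G/C bases) in the microhomology. The function names are hypothetical.

```python
import math


def mmej_pattern_score(microhomology, deletion_length, length_weight=20):
    """Score one deletion pattern (assumed form of the microhomology model)."""
    mh = microhomology.upper()
    n_gc = sum(base in "GC" for base in mh)
    n_at = len(mh) - n_gc
    # Longer deletions are penalised exponentially; length_weight sets the scale.
    length_factor = math.exp(-deletion_length / length_weight)
    # G/C pairs are assumed to count twice as much as A/T pairs.
    return 100 * length_factor * (n_at + 2 * n_gc)


def top_patterns(patterns, n_patterns=5, length_weight=20):
    """Return the n_patterns best (score, microhomology, deletion_length) tuples."""
    scored = [
        (mmej_pattern_score(mh, deletion_length, length_weight), mh, deletion_length)
        for mh, deletion_length in patterns
    ]
    return sorted(scored, reverse=True)[:n_patterns]
```

Under this form, a larger `length_weight` reduces the penalty on long deletions, so longer deletion patterns move up the ranking.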
- find_off_targets(external_genome=None, **kwargs)
Find off-targets in the genome for each gRNA.
- Parameters:
- external_genome : Genome, optional
If provided, the off-target search is performed in the external genome rather than in the genome that the Locus is part of. By default None.
- _guide_sequence_diversity(guide, g, pos)
Calculate sequence diversity for each region of the guide.
- _guide_alt_ac(guide, g, pos)
Calculate alternative allele count for each region of the guide.
- _guide_n_variants(guide, g, pos)
Calculate number of variants for each region of the guide.
- _apply_variation_layer_data(guides, layer_name, layer_genotype_data, layer_pos)
Apply sequence diversity, alternative allele count and number of variants as layers.
- add_layer(name, layer_data, layer_pos=None, apply_to_guides=True, is_variation=False)
Add a layer of data to the locus.
- Parameters:
- name : str
Name of the layer.
- layer_data : np.ndarray
Layer data. Needs to be the same shape as the locus.
- apply_to_guides : bool, optional
Apply layer data to gRNAs when adding it to the locus. By default True.
Examples
>>> import numpy as np
>>> from guido.locus import Locus
>>> locus = Locus(sequence="ACGT" * 25, name="chr1", start=101, end=200)
>>> layer_data = np.random.rand(100)
>>> locus.add_layer("random", layer_data)
- _prepare_alt_matrix(rank_layer_names, method=np.mean)
Prepare a numerical matrix of gRNA layer data for use in ranking.
- Parameters:
- rank_layer_names : list
List of layer names to be used in the ranking.
- method : callable, optional
Function used to combine the layer data. By default np.mean.
- Returns:
- np.ndarray
Matrix with the layer data for each gRNA.
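As a rough illustration of how per-position layer data could be collapsed into a guides-by-layers matrix, the sketch below applies `method` to the slice of each layer that a guide covers. The function name, the `guide_spans` argument, and the 0-based half-open offsets are assumptions for this example and do not reflect guido's internal data structures.

```python
import numpy as np


def build_layer_matrix(guide_spans, layers, rank_layer_names, method=np.mean):
    """Hypothetical helper: one row per guide, one column per layer.

    guide_spans: list of (start, end) offsets into the locus, 0-based half-open.
    layers: dict mapping layer name to a 1-D array with one value per position.
    """
    matrix = np.empty((len(guide_spans), len(rank_layer_names)), dtype=float)
    for col, name in enumerate(rank_layer_names):
        data = np.asarray(layers[name], dtype=float)
        for row, (start, end) in enumerate(guide_spans):
            # Combine the per-position values under the guide into one number.
            matrix[row, col] = method(data[start:end])
    return matrix
```

Passing `method=np.max` instead of the default would score each guide by its worst (or best) position rather than its average.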
- rank_guides(layer_names=None, layer_is_benefit=None, weight_vector=None, ranking_method='TOPSIS', norm_method='Vector')
Rank guides based on the layer data.
- Returns:
- list
List of ranked guides.
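The default `ranking_method='TOPSIS'` with `norm_method='Vector'` refers to a standard multi-criteria decision method. The following is a textbook sketch of TOPSIS with vector normalisation in NumPy, given as an illustration only; the function name and argument layout are hypothetical, and guido may delegate to a library or differ in details such as tie handling.

```python
import numpy as np


def topsis_rank(matrix, weights, is_benefit):
    """Return (order, closeness) for an alternatives-by-criteria matrix.

    is_benefit[j] is True when higher values of criterion j are better.
    """
    matrix = np.asarray(matrix, dtype=float)
    weights = np.asarray(weights, dtype=float)
    is_benefit = np.asarray(is_benefit, dtype=bool)

    # Vector normalisation: divide each column by its Euclidean norm.
    norms = np.linalg.norm(matrix, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for all-zero columns
    weighted = matrix / norms * (weights / weights.sum())

    # Ideal and anti-ideal points depend on whether a criterion is a benefit.
    ideal = np.where(is_benefit, weighted.max(axis=0), weighted.min(axis=0))
    anti_ideal = np.where(is_benefit, weighted.min(axis=0), weighted.max(axis=0))

    d_best = np.linalg.norm(weighted - ideal, axis=1)
    d_worst = np.linalg.norm(weighted - anti_ideal, axis=1)
    denom = d_best + d_worst
    closeness = np.divide(d_worst, denom, out=np.zeros_like(d_worst), where=denom > 0)

    # Highest closeness first.
    order = np.argsort(-closeness, kind="stable")
    return order, closeness
```

In this sketch, `is_benefit` plays the role that `layer_is_benefit` appears to play in `rank_guides`: a layer such as predicted efficiency would be a benefit, while a layer such as off-target count would be a cost.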
- add_azimuth_score()
Apply Azimuth score to a list of guides.
Azimuth is a machine-learning model that predicts CRISPR/Cas9 guide efficiency. It is sometimes referred to as the Doench 2016 score.
Described in https://doi.org/10.1038/nbt.3437 (Doench et al., 2016)
- guido.locus._prepare_annotation(annotation_file_abspath, as_df=True)
Prepare annotation file for use with pyranges.
- guido.locus.locus_from_coordinates(genome, chromosome, start, end)
Create a locus from coordinates. Coordinates are 1-based. If an annotation file is provided, it will be used to annotate the locus.
- Parameters:
- genome : Genome
Genome object. Can be created using the Genome class.
- chromosome : str
Chromosome name.
- start : int
Start position.
- end : int
End position.
- Returns:
- Locus
Locus object.
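Because the coordinates are 1-based, converting them to Python's 0-based slicing is a common source of off-by-one errors. The sketch below shows the conversion, assuming the end coordinate is inclusive; that assumption matches the Locus example above, where start=48714554 and end=48714654 accompany a 101-base sequence. The helper name is hypothetical and is not part of guido.

```python
def slice_one_based(chromosome_sequence, start, end):
    """Extract [start, end] (1-based, end assumed inclusive) from a sequence."""
    if start < 1 or end < start or end > len(chromosome_sequence):
        raise ValueError(f"invalid 1-based interval: {start}-{end}")
    # 1-based inclusive [start, end] maps to 0-based half-open [start - 1, end).
    return chromosome_sequence[start - 1:end]


def interval_length(start, end):
    """Number of bases in a 1-based interval with an inclusive end."""
    return end - start + 1
```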
- guido.locus.locus_from_sequence(sequence, sequence_name=None)
Create a locus from sequence.
- Parameters:
- sequence : str
DNA sequence.
- sequence_name : str, optional
Sequence name. By default None.
- Returns:
- Locus
Object representing a locus from the given sequence.
- guido.locus.locus_from_gene(genome, gene_name)
Create a locus from gene name. If an annotation file is provided, it will be used to annotate the locus.
- Parameters:
- genome : Genome
Genome object. Can be created using the Genome class.
- gene_name : str
Gene name. Needs to be present in the annotation file.
- Returns:
- Locus
Locus object.