BCAWT package

BCAWT.BCAWT module

BCAWT.BCAWT(main_fasta_file, save_path='', ref_fasta_file=None, genetic_code_=1, Auto=False)

BCAWT (Bio Codon Analysis Workflow Tool) manages a complete workflow for analyzing codon usage bias in the genes and genomes of any organism.

Args:

main_fasta_file (list): list of file paths (strings) or file-like objects

save_path (str): absolute path to the directory to save the result in, default = the current directory

ref_fasta_file (list): list of file paths (strings) or file-like objects, default = None

genetic_code_ (int): default = 1, the genetic code number as defined by NCBI (https://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi)

Auto (bool): default = False; keep False when ref_fasta_file is provided (not None).

Notes:

  • Auto (bool): set to True to auto-generate a reference set when ref_fasta_file is not available (= None)
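The interaction between ref_fasta_file and Auto described in the note can be sketched as a small helper. `build_bcawt_kwargs` and the file paths are hypothetical and not part of BCAWT; the entry point is written as `BCAWT.BCAWT`, which assumes the function is named after its module.

```python
def build_bcawt_kwargs(main_fasta_files, save_path="", ref_fasta_files=None, genetic_code=1):
    """Assemble keyword arguments for the workflow call (hypothetical helper)."""
    return {
        "main_fasta_file": list(main_fasta_files),
        "save_path": save_path,
        "ref_fasta_file": ref_fasta_files,
        "genetic_code_": genetic_code,
        # Auto=True asks the tool to generate the reference set itself,
        # which is only needed when no reference file is supplied.
        "Auto": ref_fasta_files is None,
    }


# Hypothetical input and output locations.
kwargs = build_bcawt_kwargs(["genes.fasta"], save_path="results")

# With the package installed, the call would then look like:
#     from BCAWT import BCAWT
#     BCAWT.BCAWT(**kwargs)
```

Deriving Auto from the presence of a reference file keeps the two arguments from contradicting each other.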

Returns:

BCAWT.BCAWT_auto_test module

BCAWT_auto_test.auto_check_files(path)

Check the expected outputs.

Args:
path: absolute path to the directory to save the result in

Returns:

None

BCAWT_auto_test.auto_test(path='', test_file='')

Run a demo test that produces 23 results (see the documentation for more details about the expected output).

Args:

path: absolute path to the directory to save the result in

test_file: absolute path to the fasta file that will be tested

Returns:

None

BCAWT.ATCG3 module

ATCG3.ACTG3(sequ, A=False, T=False, C=False, G=False)

Calculate the A, T, C, and G content at the third codon position.

Args:

sequ (str): DNA sequence

A (bool): default = False

T (bool): default = False

C (bool): default = False

G (bool): default = False

Returns:
  • A3 content if arg(A) is True
  • T3 content if arg(T) is True
  • C3 content if arg(C) is True
  • G3 content if arg(G) is True
  • None if all args are False
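As an illustration of what a third-position base content measures, here is a minimal sketch that does not use BCAWT. `third_position_content` is a hypothetical name, and the result is reported as a fraction of complete codons, which is an assumption; BCAWT may use a different scale or normalization.

```python
def third_position_content(seq, base):
    """Fraction of complete codons whose third base equals `base`.

    Returns None when the sequence holds no complete codon.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3   # ignore a trailing partial codon
    third_bases = seq[2:usable:3]      # every third base, starting at index 2
    if not third_bases:
        return None
    return third_bases.count(base.upper()) / len(third_bases)


# Codons ATG, AAA, TTC have third bases G, A, C.
a3 = third_position_content("ATGAAATTC", "A")
```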

BCAWT.CA module

CA.CA(file)

Perform correspondence analysis on the genes’ RSCU values.

Args:

file (directory): CSV file containing the genes’ RSCU values

Returns:
  • CSV file containing the genes’ values for the first 4 axes of the correspondence analysis result
  • CSV file containing the codons’ values for the first 4 axes of the correspondence analysis result
  • plot of the genes’ values on the first 2 axes of the correspondence analysis result
  • plot of the codons’ values on the first 2 axes of the correspondence analysis result

BCAWT.CA_RSCU module

CA_RSCU.CA_RSCU(allseq, allseq_name, The_Genetic_Codes_number=1)

Calculate RSCU values for correspondence analysis.

Args:

allseq (str): DNA sequence

allseq_name (str): gene name

The_Genetic_Codes_number (int): default = 1, the genetic code number as defined by NCBI (https://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi)

Returns:

DataFrame: DataFrame containing [gene_name and RSCU values]
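For readers unfamiliar with RSCU (relative synonymous codon usage), the sketch below shows the usual definition: a codon's observed count divided by the count expected if all synonymous codons of its amino acid were used equally. It is not BCAWT's implementation. The function `rscu` is hypothetical, only the standard genetic code (NCBI table 1) is hard-coded, and skipping stop codons and single-codon amino acids is an assumption.

```python
from collections import Counter
from itertools import product

# Standard genetic code (NCBI table 1) in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def rscu(seq):
    """Return {codon: RSCU} for amino acids that occur in `seq`."""
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families, one per amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        # Skip stops, single-codon amino acids (Met, Trp), and absent amino acids.
        if aa == "*" or len(family) < 2 or total == 0:
            continue
        expected = total / len(family)  # count per codon under equal usage
        for codon in family:
            values[codon] = counts[codon] / expected
    return values


# Four leucine codons: CTG three times, CTA once.
leu = rscu("CTGCTGCTGCTA")
```

A value above 1 means the codon is used more often than equal usage would predict, and a value below 1 means it is used less often.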

BCAWT.GC123 module

GC123.GC1(sequ)

Calculates the GC content at the first codon position.

Args:

sequ (str): DNA sequence

Returns:

int: GC1 value

GC123.GC12(sequ)

Calculates the average GC content at the 1st and 2nd codon positions.

Args:

sequ (str): DNA sequence

Returns:

int: GC12 value

GC123.GC2(sequ)

Calculates the GC content at the 2nd codon position.

Args:

sequ (str): DNA sequence

Returns:

int: GC2 value

GC123.GC3(sequ)

Calculates the GC content at the 3rd codon position.

Args:

sequ (str): DNA sequence

Returns:

int: GC3 value
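The four GC measures differ only in which codon position is sampled. The following sketch illustrates that idea without calling BCAWT. The names `gc_at_position` and `gc12` are hypothetical, and values are returned as fractions between 0 and 1, which is an assumption about scale rather than a statement about what GC123 returns.

```python
def gc_at_position(seq, position):
    """GC fraction at codon position 1, 2, or 3 over complete codons."""
    if position not in (1, 2, 3):
        raise ValueError("position must be 1, 2, or 3")
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3        # drop a trailing partial codon
    bases = seq[position - 1:usable:3]      # bases at the chosen codon position
    if not bases:
        return 0.0
    return sum(base in "GC" for base in bases) / len(bases)


def gc12(seq):
    """Average of the GC fractions at the 1st and 2nd codon positions."""
    return (gc_at_position(seq, 1) + gc_at_position(seq, 2)) / 2


# Codons ATG, GCC, TAA: position 1 is A/G/T, position 2 is T/C/A, position 3 is G/C/A.
example = "ATGGCCTAA"
gc1, gc2, gc3 = (gc_at_position(example, p) for p in (1, 2, 3))
```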

BCAWT.GRAVY_AROMO module

GRAVY_AROMO.GRAvy_ARomo(seq, genetic_code_=1, G=False, A=False)

Calculate the Gravy and Aroma values for a DNA sequence.

Args:
seq (str): DNA sequence

genetic_code_ (int): default = 1, the genetic code number as defined by NCBI (https://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi)

G (bool): default = False

A (bool): default = False

Returns:
  • Gravy value if arg(G) is True
  • Aroma value if arg(A) is True
  • None if both args are False
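A rough sketch of the two quantities, independent of BCAWT: the sequence is translated, then GRAVY is taken as the mean Kyte-Doolittle hydropathy of the residues and Aroma is read as aromaticity, the fraction of Phe, Trp, and Tyr residues. That reading of "Aroma", the hard-coded standard genetic code, and the function names are assumptions.

```python
from itertools import product

# Standard genetic code (NCBI table 1) in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Kyte-Doolittle hydropathy scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def translate(seq):
    """Translate complete codons up to the first stop; skip unknown codons."""
    seq = seq.upper()
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE.get(seq[i:i + 3])
        if aa == "*":
            break
        if aa is not None:
            protein.append(aa)
    return "".join(protein)


def gravy(seq):
    """Mean hydropathy of the translated sequence, or None if it is empty."""
    protein = translate(seq)
    if not protein:
        return None
    return sum(KYTE_DOOLITTLE[aa] for aa in protein) / len(protein)


def aromaticity(seq):
    """Fraction of aromatic residues (F, W, Y), or None if there is no protein."""
    protein = translate(seq)
    if not protein:
        return None
    return sum(aa in "FWY" for aa in protein) / len(protein)


# ATG TTT TGG TAA translates to Met-Phe-Trp followed by a stop.
example = "ATGTTTTGGTAA"
```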

BCAWT.Optimal_codon_corr_method module

Optimal_codon_corr_method.op_corr(ENc_file_name, RSCU_file_name)

Determine the optimal codons using the correlation method described at https://doi.org/10.1371/journal.pgen.1000556

Args:

ENc_file_name (file): file containing the ENc values for a set of genes

RSCU_file_name (file): file containing the RSCU values for a set of genes

Returns:

DataFrame containing the optimal codons
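One way to read the correlation idea is sketched below. It assumes that a codon whose RSCU rises as ENc falls (a negative correlation across genes) is favored in strongly biased genes. This is not the op_corr implementation: it works on in-memory lists instead of files, uses a plain Pearson coefficient, and replaces any significance test with an arbitrary cutoff. All names and the threshold are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def candidate_optimal_codons(enc_values, rscu_by_codon, threshold=-0.5):
    """Return {codon: r} for codons whose RSCU correlates with ENc at r <= threshold.

    enc_values: one ENc value per gene.
    rscu_by_codon: {codon: [RSCU per gene]}, in the same gene order.
    threshold: arbitrary stand-in for a significance test (assumption).
    """
    selected = {}
    for codon, rscu_values in rscu_by_codon.items():
        r = pearson(enc_values, rscu_values)
        if r <= threshold:
            selected[codon] = r
    return selected


# Toy data for four genes: CTG usage drops as ENc rises, CTA usage climbs.
enc = [30, 40, 50, 60]
rscu = {"CTG": [3.0, 2.5, 2.0, 1.5], "CTA": [0.5, 1.0, 1.5, 2.0]}
optimal = candidate_optimal_codons(enc, rscu)
```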

BCAWT.P2_index module

P2_index.P2_index(sequ, wwc=False, sst=False, wwy=False, ssy=False, p2=False)

Calculate the P2 index.

Args:
sequ (str): DNA sequence

wwc (bool): default = False

sst (bool): default = False

wwy (bool): default = False

ssy (bool): default = False

p2 (bool): default = False

Returns:
  • wwc value if arg(wwc) is True
  • sst value if arg(sst) is True
  • wwy value if arg(wwy) is True
  • ssy value if arg(ssy) is True
  • p2 value if arg(p2) is True
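The relationship between the four codon classes and the index can be sketched as follows, using the common definition P2 = (WWC + SST) / (WWY + SSY), where W is A or T, S is G or C, and Y is C or T. This is an illustration under that assumed definition, not BCAWT's code, and `p2_index` here is a hypothetical stand-alone function that returns all values at once.

```python
def p2_index(seq):
    """Return (class counts, P2) for a DNA coding sequence.

    P2 is None when no WWY or SSY codon is present.
    """
    seq = seq.upper()
    counts = {"WWC": 0, "SST": 0, "WWY": 0, "SSY": 0}
    for i in range(0, len(seq) - len(seq) % 3, 3):
        first, second, third = seq[i:i + 3]
        if first in "AT" and second in "AT":
            prefix = "WW"            # two weak (A/T) bases
        elif first in "GC" and second in "GC":
            prefix = "SS"            # two strong (G/C) bases
        else:
            continue
        if third in "CT":            # pyrimidine at the third position
            counts[prefix + "Y"] += 1
        if prefix == "WW" and third == "C":
            counts["WWC"] += 1
        if prefix == "SS" and third == "T":
            counts["SST"] += 1
    denominator = counts["WWY"] + counts["SSY"]
    p2 = (counts["WWC"] + counts["SST"]) / denominator if denominator else None
    return counts, p2


# Codons AAC, AAT, GGT, GGC, ATG.
counts, p2 = p2_index("AACAATGGTGGCATG")
```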

BCAWT.PR2_plot_data module

PR2_plot_data.PR2_plot(sequ, o=False, a=False)

Generate data for PR2 plot.

Args:
sequ (str): DNA sequence

o (bool): default = False

a (bool): default = False

Returns:
  • ordinate for PR2 plot if arg (o) is True
  • abscissa for PR2 plot if arg (a) is True
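A PR2 (parity rule 2) plot conventionally places G3/(G3 + C3) on the abscissa and A3/(A3 + T3) on the ordinate. The sketch below computes one such point from all third codon positions. That simplification is an assumption (PR2 analyses are often restricted to four-fold degenerate codons), and `pr2_point` is a hypothetical name, not a BCAWT function.

```python
def pr2_point(seq):
    """Return (abscissa, ordinate) of a PR2 plot point for one sequence.

    abscissa = G3 / (G3 + C3), ordinate = A3 / (A3 + T3); a coordinate is
    None when its denominator is zero.
    """
    seq = seq.upper()
    third = seq[2:len(seq) - len(seq) % 3:3]   # third base of each complete codon
    a, t, g, c = (third.count(base) for base in "ATGC")
    abscissa = g / (g + c) if g + c else None
    ordinate = a / (a + t) if a + t else None
    return abscissa, ordinate


# Codons ATG, AAA, TTC, GGT have third bases G, A, C, T.
x, y = pr2_point("ATGAAATTCGGT")
```

A point at (0.5, 0.5) indicates no skew between G and C or between A and T at the third position.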