Clustal Omega 1.2.4
hhalign_wrapper.h File Reference

Functions

void SetDefaultHhalignPara (hhalign_para *prHhalignPara)
 FIXME.
 
int PosteriorProbabilities (mseq_t *prMSeq, hmm_light rHMMalignment, hhalign_para rHhalignPara, char *pcPosteriorfile)
 PosteriorProbabilities() calculates posterior probabilities of aligning a single sequence onto an alignment containing this sequence.
 
double PileUp (mseq_t *prMSeq, hhalign_para rHhalignPara, int iClustersize)
 sequentially align (chain) sequences
 
double HHalignWrapper (mseq_t *mseq, int *piOrderLR, double *pdSeqWeights, int iNodeCount, hmm_light *prHMMList, int iHMMCount, int iProfProfSeparator, hhalign_para rHhalignPara)
 wrapper for hhalign. This is a frontend function to the ported hhalign code.
 
void SanitiseUnknown (mseq_t *mseq)
 get rid of unknown residues
 

Function Documentation

◆ HHalignWrapper()

double HHalignWrapper ( mseq_t * prMSeq,
int * piOrderLR,
double * pdSeqWeights,
int iNodeCount,
hmm_light * prHMMList,
int iHMMCount,
int iProfProfSeparator,
hhalign_para rHhalignPara )
extern

wrapper for hhalign. This is a frontend function to the ported hhalign code.

Parameters
[in,out] prMSeq  holds the unaligned sequences [in] and the final alignment [out]
[in] piOrderLR  holds the order in which sequences/profiles are to be aligned; even elements specify left nodes, odd elements right nodes; if the even and odd elements are the same then the node is a leaf
[in] pdSeqWeights  weight per sequence; no weights are used if NULL
[in] iNodeCount  number of nodes in the tree; piOrderLR has 2*iNodeCount elements
[in] prHMMList  list of background HMMs (transition/emission probabilities)
[in] iHMMCount  number of input background HMMs
[in] iProfProfSeparator  gives the number of sequences in the first profile, if in profile/profile alignment mode (iNodeCount==3). This assumes that prMSeq holds the sequences of profile 1 followed by those of profile 2.
[in] rHhalignPara  various parameters read from the command line
Returns
score of the alignment (FIXME: what is this?)
Note
Complex function; it could use some simplification, more documentation and a structuring of piOrderLR.
HHalignWrapper can be entered in two different ways: (i) all sequences are unaligned, or (ii) there are two (aligned) profiles. In the unaligned case (i) the sequences come straight from Squid, that is, they have been sanitised: all non-alphabetic residues have been rendered as X's. In profile mode (ii) one profile may have been produced internally. In that case residues may have been translated back into their 'native' form, that is, they may contain unsanitised residues. These will cause trouble during alignment.
Introduced argument hhalign_para rHhalignPara; FS, r240 -> r241
If hhalign() fails then try with Viterbi by setting MAC-RAM=0; FS, r241 -> r243
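
The piOrderLR convention from the parameter list can be illustrated with a small sketch. It takes the description at face value (two entries per node, left at the even index, right at the odd index, equal entries marking a leaf); the real source may store further entries per node. The helper names IsLeafNode and CountLeaves are hypothetical and are not part of hhalign_wrapper.h.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if node iNode is a leaf, that is, if its
 * left (even index) and right (odd index) entries are identical. */
static int IsLeafNode(const int *piOrderLR, int iNode)
{
    assert(piOrderLR != NULL && iNode >= 0);
    return piOrderLR[2 * iNode] == piOrderLR[2 * iNode + 1];
}

/* Hypothetical helper: counts the leaves among iNodeCount nodes.
 * Assumes piOrderLR holds 2*iNodeCount elements, as documented. */
static int CountLeaves(const int *piOrderLR, int iNodeCount)
{
    int iLeaves = 0;
    for (int i = 0; i < iNodeCount; i++) {
        iLeaves += IsLeafNode(piOrderLR, i);
    }
    return iLeaves;
}
```

For an array such as {0,0, 1,1, 0,1} with iNodeCount==3 (illustrative values only), the first two nodes are leaves and the third joins a left and a right node, which matches the shape of the profile/profile case.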

Translate back ambiguity residues: hhalign translates ambiguity codes (B, Z) into unknown residues (X). As we still have the original input, we can substitute them back.
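
A minimal sketch of this back-substitution, assuming the aligned sequence is a NUL-terminated string with '-' as the gap character and that the original, ungapped input is still available. The function name RestoreAmbiguityCodes is hypothetical; the actual implementation may differ.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: walks the aligned sequence and the original
 * (ungapped) sequence in parallel. Wherever the aligned residue is 'X'
 * and the original residue at that position was 'B' or 'Z', the
 * original code is written back. Gaps ('-', an assumption) are skipped. */
static void RestoreAmbiguityCodes(char *pcAligned, const char *pcOriginal)
{
    assert(pcAligned != NULL && pcOriginal != NULL);
    size_t iOrig = 0;
    for (size_t i = 0; pcAligned[i] != '\0'; i++) {
        if (pcAligned[i] == '-') {
            continue;               /* a gap consumes no original residue */
        }
        if (pcOriginal[iOrig] == '\0') {
            break;                  /* original exhausted: stop defensively */
        }
        if (pcAligned[i] == 'X' &&
            (pcOriginal[iOrig] == 'B' || pcOriginal[iOrig] == 'Z')) {
            pcAligned[i] = pcOriginal[iOrig];
        }
        iOrig++;
    }
}
```

A residue that was already X in the input stays X, because only positions whose original code was B or Z are rewritten.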

◆ PileUp()

double PileUp ( mseq_t * prMSeq,
hhalign_para rHhalignPara,
int iClustersize )
extern

sequentially align (chain) sequences

Parameters
[in,out] prMSeq  holds the unaligned sequences [in] and the final alignment [out]
[in] rHhalignPara  various parameters needed by hhalign()
[in] iClustersize  parameter that controls how often the HMM is updated
Note
Chained alignment takes much longer than balanced alignment because at every step ClustalO has to scan all previously aligned residues. For a balanced tree this takes N*log(N) time, but for a chained tree it takes N^2 time.
This function has a short-cut: the HMM need not be updated for every single alignment step; instead, the HMM from the previous step(s) can be recycled. The HMM is updated (i) at the very first step, (ii) if a gap has been inserted into the HMM during alignment, or (iii) if the HMM has been used for too many steps without having been updated.
This update frequency is controlled by the input parameter iClustersize, which is the number of sequences that must have been used to build an HMM to allow for one non-updating step. For example, if iClustersize=100 and an HMM has been built from 100 sequences, then this HMM can be used once without updating. If the HMM has been built from 700 sequences (and iClustersize=100), then this HMM can be used 7 times without having to be updated, etc. For this reason the initial iClustersize sequences are always aligned with fully updated HMMs.
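
The update rule described in this note can be sketched as follows. The function names and arguments are hypothetical, and integer division for counts that are not multiples of iClustersize is an assumption; the note only gives the 100 and 700 sequence examples.

```c
#include <assert.h>

/* Hypothetical helper: number of steps an HMM built from iSeqsInHMM
 * sequences may be reused without updating. Integer division is an
 * assumption for values between multiples of iClustersize. */
static int AllowedReuseSteps(int iSeqsInHMM, int iClustersize)
{
    assert(iSeqsInHMM >= 0 && iClustersize > 0);
    return iSeqsInHMM / iClustersize;
}

/* Hypothetical helper: returns 1 if the HMM must be rebuilt before the
 * next alignment step, following conditions (i) to (iii) of the note. */
static int NeedsHMMUpdate(int iStep, int iGapInserted, int iStepsSinceUpdate,
                          int iSeqsInHMM, int iClustersize)
{
    if (iStep == 0) {
        return 1;   /* (i) very first step */
    }
    if (iGapInserted) {
        return 1;   /* (ii) a gap was inserted into the HMM */
    }
    /* (iii) HMM reused for too many steps without an update */
    return iStepsSinceUpdate >= AllowedReuseSteps(iSeqsInHMM, iClustersize);
}
```

With iClustersize=100, an HMM built from fewer than 100 sequences gets zero reuse steps under this rule, so it is updated at every step, which is consistent with the last sentence of the note.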

◆ PosteriorProbabilities()

int PosteriorProbabilities ( mseq_t * prMSeq,
hmm_light rHMMalignment,
hhalign_para rHhalignPara,
char * pcPosteriorfile )
extern

PosteriorProbabilities() calculates posterior probabilities of aligning a single sequence onto an alignment containing this sequence.

Parameters
[in] prMSeq  holds the aligned sequences
[in] rHMMalignment  HMM of the alignment in prMSeq
[in] rHhalignPara  various parameters read from the command line, needed by hhalign()
[in] pcPosteriorfile  name of the file into which posterior probability information is written
Returns
score of the alignment (FIXME: what is this?)
Note
The PP-loop can be parallelised easily (FIXME).

◆ SanitiseUnknown()

void SanitiseUnknown ( mseq_t * mseq)

get rid of unknown residues

Note
HHalignWrapper can be entered in two different ways: (i) all sequences are unaligned, or (ii) there are two (aligned) profiles. In the unaligned case (i) the sequences come straight from Squid, that is, they have been sanitised: all non-alphabetic residues have been rendered as X's. In profile mode (ii) one profile may have been produced internally. In that case residues may have been translated back into their 'native' form, that is, they may contain unsanitised residues. These will cause trouble during alignment. FS, r213 -> r214
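
The kind of sanitising described in this note might look like the sketch below, which operates on a single NUL-terminated sequence string rather than on an mseq_t. The function name SanitiseResidues is hypothetical, and treating '-' and '.' as gap characters to be preserved is an assumption; the real SanitiseUnknown() may use a different residue test.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: renders every non-alphabetic residue as 'X' and
 * returns the number of replacements. Gap characters ('-' and '.', an
 * assumption) are left untouched so an existing alignment is preserved. */
static int SanitiseResidues(char *pcSeq)
{
    assert(pcSeq != NULL);
    int iReplaced = 0;
    for (char *pc = pcSeq; *pc != '\0'; pc++) {
        if (*pc == '-' || *pc == '.') {
            continue;               /* keep gaps as they are */
        }
        if (!isalpha((unsigned char)*pc)) {
            *pc = 'X';              /* unknown residue becomes X */
            iReplaced++;
        }
    }
    return iReplaced;
}
```

Applying such a pass to both profiles before alignment would bring case (ii) back in line with case (i), where Squid has already done the work.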

◆ SetDefaultHhalignPara()

void SetDefaultHhalignPara ( hhalign_para * prHhalignPara)
extern

FIXME.

Note
prHhalignPara has to point to an already allocated instance.