Fig. 5 | BMC Genomic Data

Fig. 5

From: Developing a bioinformatics pipeline for comparative protein classification analysis

Forest histogram and bimodal distribution. Forest (A): the forest is computed, and the sequences that correlate most strongly are marked with circles. The circles indicate the scores selected as showing higher correlation between the reference sequences and the population. Sequences whose score exceeds a preset threshold, fixed at 200 in this case, are highlighted. Bimodal distribution (B): the statistics of the score distribution are shown to explain the score that was used. The distribution is divided into two zones: a lower zone, highlighted in green, and an upper zone, highlighted in blue. Detail of the upper zone of the bimodal distribution (C): the blue zone of (B) is enlarged along the x-axis. The upper zone contains only the sequences that possess a strong correlation with the references. Their distribution, highlighted in blue, is almost uniform. Detail of the lower zone of the bimodal distribution (D): the green zone of (B) is enlarged along the y-axis and rotated. In the lower zone, the distribution, highlighted in green, resembles a Gaussian distribution, as expected for a merely random correlation.
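
The split described in the caption can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name split_scores, the synthetic scores, and the Gaussian and uniform parameters are assumptions chosen only to mimic a bimodal score distribution; the threshold of 200 is the only value taken from the caption.

```python
import random
import statistics

THRESHOLD = 200  # preset cut-off quoted in the caption


def split_scores(scores, threshold=THRESHOLD):
    """Return (lower, upper): upper holds the scores strictly above threshold."""
    lower = [s for s in scores if s <= threshold]
    upper = [s for s in scores if s > threshold]
    return lower, upper


# Synthetic data (assumption, not the paper's data):
# a Gaussian background of random correlations plus a roughly
# uniform tail of strongly correlated sequences.
rng = random.Random(0)
background = [rng.gauss(80, 20) for _ in range(1000)]
matches = [rng.uniform(201, 600) for _ in range(50)]

lower, upper = split_scores(background + matches)
print("lower zone:", len(lower), "scores, mean", round(statistics.mean(lower), 1))
print("upper zone:", len(upper), "scores, range", round(min(upper)), "to", round(max(upper)))
```

With real data, the list passed to split_scores would be the per-sequence correlation scores, and a histogram of each zone would correspond to panels (C) and (D).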
