Fig. 7 | BMC Genomic Data

Fig. 7

From: Developing a bioinformatics pipeline for comparative protein classification analysis

Model of the bioinformatics pipeline method. A Raw data can be filtered to retrieve information at the level of families and superfamilies (domains, catalytic activities, conserved motifs). Deep learning is not always a feasible solution, because it requires a high signal-to-noise ratio. The input data must therefore be preprocessed to minimize noise, either by selecting the best hits with BLASTp or by performing an AlphaFold 2 analysis. If neither preprocessing step is possible (mainly because databases lack unique identification numbers), one can proceed by filtering functional annotations across various databases, to avoid false-positive weights in sampling and bias in the analysis. B In the latter case, the way to proceed is the filtering pipeline shown here, which summarizes the investigation carried out in this paper.
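The "best hits" preprocessing step mentioned in panel A can be illustrated with a minimal sketch. It assumes BLASTp results were saved in the standard 12-column tabular format (`-outfmt 6`). The function name `best_hits`, the `max_evalue` cutoff of 1e-5, and the example rows are hypothetical choices for illustration, not values or code taken from the paper.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(handle, max_evalue=1e-5):
    """Return {query_id: row}, keeping the highest-bitscore hit per query.

    Hits with an E-value above max_evalue (an assumed cutoff) are dropped.
    """
    best = {}
    for fields in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines.
        if not fields or fields[0].startswith("#"):
            continue
        row = dict(zip(BLAST_COLUMNS, fields))
        if float(row["evalue"]) > max_evalue:
            continue
        current = best.get(row["qseqid"])
        if current is None or float(row["bitscore"]) > float(current["bitscore"]):
            best[row["qseqid"]] = row
    return best


# Hypothetical example rows (made-up identifiers and scores).
example = (
    "q1\ts1\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-50\t200\n"
    "q1\ts2\t95.0\t100\t5\t0\t1\t100\t1\t100\t1e-60\t250\n"
    "q2\ts3\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t20\n"
)
hits = best_hits(io.StringIO(example))
for query_id, row in hits.items():
    print(query_id, row["sseqid"], row["bitscore"])
```

Keeping a single hit per query is only one way to reduce noise; a stricter or looser E-value cutoff, or additional identity and coverage thresholds, may suit a given dataset better.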
