CNN-T4SE is a tool that runs on Windows and Linux operating systems. It identifies effector proteins from their amino acid sequence, evolutionary information in the form of a position-specific scoring matrix (PSSM), and secondary structure and solvent accessibility information. The following steps explain how to use CNN-T4SE to identify effectors.
STEP-1: Prepare the input files for the different prediction methods
PSSSA: To use PSSSA encoding, the amino acid sequence(s) under study must first be converted to "secondary structure" and "solvent accessibility" sequences using the protein structure prediction tool SCRATCH. A local version of SCRATCH is also provided HERE.
Onehot: The one-hot encoding procedure is integrated in the software, so the FASTA-format sequence file can be used directly as the input.
PSSM: The PSSM files can be generated with the web tool POSSUM.
Sample data for these three kinds of input files can be downloaded HERE.
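To illustrate what the one-hot input described above looks like, here is a minimal Python sketch that reads FASTA text and turns each sequence into a matrix with one row per residue. It is not the encoder shipped with CNN-T4SE: the function names `read_fasta` and `one_hot_encode`, the 20-letter alphabet order, the optional `max_len` padding, and the all-zero rows for non-standard residues are assumptions made for illustration only.

```python
# Illustrative sketch only; the names and conventions below are assumptions,
# not the actual CNN-T4SE implementation.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed column order (20 standard residues)
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def one_hot_encode(sequence, max_len=None):
    """Return a list of rows, one per position, each with 20 entries.

    Sequences longer than max_len are truncated; shorter ones are padded
    with all-zero rows. Residues outside AMINO_ACIDS also get all-zero rows.
    """
    length = len(sequence) if max_len is None else max_len
    matrix = [[0] * len(AMINO_ACIDS) for _ in range(length)]
    for pos, residue in enumerate(sequence[:length]):
        idx = AA_INDEX.get(residue)
        if idx is not None:
            matrix[pos][idx] = 1
    return matrix


if __name__ == "__main__":
    demo = ">protein_1 hypothetical example\nMKT\nAYX\n"
    for name, seq in read_fasta(demo):
        encoded = one_hot_encode(seq, max_len=8)
        print(name, len(encoded), "rows x", len(encoded[0]), "columns")
```

Padding to a fixed length is included because convolutional networks generally expect inputs of uniform shape; the length actually used by CNN-T4SE is not stated here, so `max_len` is left as a parameter.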
STEP-2: Annotate Novel T4SE Protein(s) Using CNN-T4SE
J. J. Hong, Y. C. Luo, M. J. Mou, J. B. Fu, Y. Zhang, W. W. Xue, T. Xie, L. Tao*, Y. Lou*, F. Zhu*. Convolutional neural network-based annotation of bacterial type IV secretion system effectors with enhanced accuracy and reduced false discovery. Brief Bioinform. doi: 10.1093/bib/bbz120 (2019).
If you find any bugs, please report them to Mr. Hong (email@example.com) or Prof. Zhu (firstname.lastname@example.org). Thank you for using and improving CNN-T4SE. You are welcome to visit our lab at https://idrblab.org/
IDRB: Innovative Drug Research and Bioinformatics Group
All rights are reserved by: Innovative Drug Research and Bioinformatics Group (IDRB)
College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, P.R. China, 310058.
Contact number: (86 - 571)88208444