A wavelet filtration technology for gene expression profiles is presented that removes background “white” noise using a Shannon entropy criterion, with the entropy calculated by the James-Stein estimator. A structural block diagram of the procedure for determining the wavelet filter parameters is proposed; it involves calculating the Shannon entropy of both the filtered signal and the removed noise component.
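For reference, the James-Stein estimate of the Shannon entropy can be written out as follows. This is the commonly used shrinkage form with a uniform target; that the paper uses exactly this variant is an assumption, and the symbols (bin counts $y_k$, sample size $n$, number of bins $p$) are introduced here for illustration only.

```latex
\hat{\theta}_k^{\mathrm{ML}} = \frac{y_k}{n}, \qquad
t_k = \frac{1}{p}, \qquad
\hat{\lambda} = \frac{1 - \sum_{k=1}^{p} \bigl(\hat{\theta}_k^{\mathrm{ML}}\bigr)^2}
                     {(n-1)\sum_{k=1}^{p} \bigl(t_k - \hat{\theta}_k^{\mathrm{ML}}\bigr)^2}
\quad (\text{clipped to } [0,1]),

\hat{\theta}_k^{\mathrm{Shrink}} = \hat{\lambda}\, t_k + (1-\hat{\lambda})\, \hat{\theta}_k^{\mathrm{ML}}, \qquad
\hat{H} = -\sum_{k=1}^{p} \hat{\theta}_k^{\mathrm{Shrink}} \log \hat{\theta}_k^{\mathrm{Shrink}}.
```

Shrinking the empirical bin frequencies toward the uniform distribution stabilises the entropy estimate when the number of samples per bin is small, which is typical of short expression profiles.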
Introduction. The problem addressed is improving the quality of gene expression profiles that are used to reconstruct gene regulatory networks. Filtration is one of the data preprocessing stages; it improves data quality by removing the background “white” noise component. The aim of the paper is to develop a wavelet filtration technology for gene expression profiles based on the Shannon entropy criterion, with the entropy calculated using the James-Stein shrinkage estimator. Methods. The research uses computer simulation, wavelet analysis, and entropy methods for estimating the informativeness of the studied data. Results. The simulation results show that, for filtering gene expression profiles, the choice between orthogonal and biorthogonal mother wavelets is not decisive. In terms of the relative criterion, calculated as the ratio of the Shannon entropies of the filtered gene expression profiles and the extracted noise component, the best results are obtained with the biorthogonal wavelet bior1.5; however, the difference from the results obtained with other types of wavelets is insignificant. The decisive factors are the choice of the particular wavelet within the mother wavelet family, the level of wavelet decomposition, and the value of the thresholding coefficient. Conclusions. Based on the simulation performed, a wavelet filtration technology for gene expression profiles is proposed that jointly uses methods for estimating the informativeness of the filtered signal and of the extracted noise component. Implementing this technology makes it possible to optimise the wavelet filtration of complex signals in order to remove the “white” noise component.
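The parameter search described above can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation, and it makes several assumptions: a Haar wavelet stands in for bior1.5 to keep the code self-contained, soft thresholding is used, the entropy is estimated from a 16-bin histogram with James-Stein shrinkage toward the uniform distribution, and the relative criterion is taken as filtered-signal entropy divided by noise entropy. All function names and the synthetic profile are hypothetical.

```python
import math
import random

def haar_dwt(signal, level):
    """Multi-level orthonormal Haar transform; len(signal) must be divisible by 2**level."""
    approx, details = list(signal), []
    s = math.sqrt(2.0)
    for _ in range(level):
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / s for a, b in pairs])
        approx = [(a + b) / s for a, b in pairs]
    return approx, details  # details[0] holds the finest scale

def haar_idwt(approx, details):
    """Inverse of haar_dwt."""
    s = math.sqrt(2.0)
    out = list(approx)
    for det in reversed(details):
        nxt = []
        for a, d in zip(out, det):
            nxt.extend([(a + d) / s, (a - d) / s])
        out = nxt
    return out

def soft(x, thr):
    """Soft thresholding: shrink |x| by thr, zeroing anything smaller."""
    return math.copysign(max(abs(x) - thr, 0.0), x)

def wavelet_filter(signal, level, thr):
    """Return (filtered signal, removed noise component)."""
    approx, details = haar_dwt(signal, level)
    details = [[soft(d, thr) for d in det] for det in details]
    filtered = haar_idwt(approx, details)
    noise = [x - f for x, f in zip(signal, filtered)]
    return filtered, noise

def js_entropy(values, bins=16):
    """Shannon entropy (nats) of a histogram, James-Stein shrinkage toward uniform."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # guard against a constant signal
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    n = len(values)
    ml = [c / n for c in counts]  # maximum-likelihood bin frequencies
    target = 1.0 / bins
    denom = (n - 1) * sum((target - p) ** 2 for p in ml)
    if denom == 0:
        lam = 1.0  # frequencies already uniform (or a single sample)
    else:
        lam = min(1.0, max(0.0, (1 - sum(p * p for p in ml)) / denom))
    shrunk = [lam * target + (1 - lam) * p for p in ml]
    return -sum(p * math.log(p) for p in shrunk if p > 0)

def entropy_ratio(signal, level, thr):
    """Assumed relative criterion: H(filtered) / H(noise)."""
    filtered, noise = wavelet_filter(signal, level, thr)
    h_noise = js_entropy(noise)
    return js_entropy(filtered) / h_noise if h_noise > 0 else float("inf")

# Synthetic "profile": a smooth curve plus Gaussian white noise (illustrative only).
random.seed(0)
signal = [math.sin(2 * math.pi * i / 64) + random.gauss(0.0, 0.2) for i in range(64)]

# Tabulate the criterion over decomposition level and threshold value.
for level in (1, 2, 3):
    for thr in (0.1, 0.3, 0.5):
        print(level, thr, round(entropy_ratio(signal, level, thr), 3))
```

The sketch only tabulates the criterion over the grid; which direction of the ratio counts as "best" is not stated in this abstract, so no optimum is selected here. With a wavelet library available, the same loop could also iterate over wavelet names to compare families.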