Introduction. Endoscopic dextranomer/hyaluronic acid copolymer (DxHA) injection is the most commonly used minimally invasive method of surgical treatment of vesicoureteral reflux (VUR) in children.

Purpose of the study. To estimate the accuracy of logistic regression models and an artificial neural network in predicting the outcome of a single endoscopic DxHA injection for VUR.

Materials and methods. Endoscopic DxHA injection was performed in 582 patients (783 ureteric units) with reflux of grades I–IV (I: 20, II: 133, III: 443, IV: 187); 53 ureters had complete duplication. The overall effectiveness of surgery was 53.2%. A binary logistic regression model and an artificial neural network (multilayer perceptron) were built with the following independent variables: reflux grade, the patient's age and sex, ureteral duplication, and the ureteral dilatation index.

Results. Univariate logistic regression showed that the selected predictors were strongly related to the treatment outcome. Both binary logistic regression and the neural network predicted the outcome with high accuracy: the area under the ROC curve was 0.7 for the logistic regression model (sensitivity 70.7%, specificity 66.3%) and 0.74 for the artificial neural network (sensitivity 85.5%, specificity 65.3%). The synaptic weights of the neural network and the logistic regression parameters were used in a scoring model to predict the outcome of a single endoscopic DxHA injection in 2 independent hospitals. Outcome analysis in these hospitals showed good predictive quality both with logistic regression (75% and 90% correct predictions) and with the neural network (89.7% and 77% correct predictions).

Conclusion. An artificial neural network and a binary logistic regression model are effective tools to assist urologists in selecting and applying endoscopic treatment for VUR in children.

Vesicoureteral reflux (VUR) is a common urinary tract pathology in children, with an estimated prevalence of about 1%. Open ureteral reimplantation is highly effective, but it is associated with considerable surgical trauma and a long hospital stay. Although less effective than reimplantation, endoscopic treatment (ET) of VUR with various bulking agents has become widespread because it is minimally invasive [1][2]. Several studies have addressed prediction of the ET outcome on the basis of multivariate statistical analysis of the factors that influence ET results [3]. However, the quality of such analysis is limited when biological systems violate the assumptions of the underlying mathematical models, for example when there are complex interactions between variables. It is also limited when continuous variables are not normally distributed or when the variables are binary or interval in nature. These factors restrict the use of linear regression for building predictive models. In such cases, predictive models based on logistic regression (LR) or an artificial neural network (ANN) are more appropriate [4]. Accordingly, Serrano-Durbá et al. [5] and later Seckiner et al. [6] used LR and ANN to predict ET results and obtained good predictive quality on historical data. However, the work of Seckiner et al. [6] contains no prospective analysis of the quality of the predictive model on results obtained in other clinics.

Undoubtedly, the use of a high-quality predictive model in clinical practice will help to answer the question concerning the probability of a positive result obtained using ET, which will ultimately contribute to the choice of the most effective treatment method for a patient with VUR.

The purpose of the study was to assess whether the result of a single ET with dextranomer/hyaluronic acid (DxHA) can be predicted and to select the optimal predictive model from LR and the ANN.

A retrospective clinical study was conducted to assess the possibility of predicting the result of a single ET of VUR in pediatric patients and to identify the clinical and radiological predictors that affect the result of ET performed by the STING method. Inclusion criteria: children with primary VUR of grades I–IV according to the classification of the International Committee for the Reflux Study. Exclusion criteria: primary VUR of grade V (refluxing megaureter); ureterocele; secondary VUR associated with neurogenic bladder, urethral valves, or bladder exstrophy; previous open, laparoscopic, or endoscopic surgery on the bladder or distal ureter.

The study included 582 patients with VUR aged from 2 months to 16 years (median 44 [19.8; 85.3] months) who underwent ET with DxHA as the bulking agent, implanted by the STING method. There were 192 boys (33%) and 390 girls (67%). Unilateral reflux was observed in 381 patients (65.5%) and bilateral reflux in 201 (34.5%). In total, 783 ureters were operated on. Complete ureteral duplication was observed in 62 (7.9%) cases.

The examination protocol for patients with VUR included laboratory tests, ultrasound examination of the urinary system, static nephroscintigraphy, and voiding cystourethrography with calculation of the ureteral dilatation index (UDI), defined as the ratio of the distal ureter diameter to the height of the third lumbar vertebra (Fig. 1). In toilet-trained children, a voiding diary was also analyzed and uroflowmetry with measurement of residual urine was performed; when signs of dysfunction were present, a comprehensive urodynamic examination was added. Bladder dysfunction was treated conservatively, and ET was performed only after the function of the lower urinary tract had normalized.

Fig. 1. Measurements used to calculate the ureter dilatation index on a voiding cystourethrogram (A – diameter of the ureter, B – height of the 3rd vertebra)
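The index defined above is a simple ratio, which the following minimal sketch illustrates. The function name, the millimetre units, and the sample measurements are assumptions made for illustration only; they are not taken from the study data.

```python
def ureteral_dilatation_index(ureter_diameter_mm: float, l3_height_mm: float) -> float:
    """Return the UDI: distal ureter diameter (A) divided by L3 vertebra height (B)."""
    if ureter_diameter_mm < 0 or l3_height_mm <= 0:
        raise ValueError("diameter must be non-negative and vertebra height positive")
    return ureter_diameter_mm / l3_height_mm


# Hypothetical measurements read from a voiding cystourethrogram.
example_udi = ureteral_dilatation_index(ureter_diameter_mm=8.0, l3_height_mm=20.0)
print(f"UDI = {example_udi:.2f}")
```

Because both measurements are taken from the same image, the ratio does not depend on the magnification of the cystourethrogram.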

The indications for surgical treatment were recurrent pyelonephritis despite antibiotic prophylaxis, the appearance of new renal scars on nephroscintigraphy, and parental preference.

All patients underwent follow-up ultrasound examination and voiding cystourethrography 6 months after ET. Complete resolution of VUR was considered a successful treatment result. The overall efficiency of a single ET was 64.9%. The outcomes of a single ET in the patients included in the study are presented in Table 1.

Table 1. Single endoscopic treatment outcomes

Variables | Ureters, qty. (n = 783) | Outcome: success (n = 508) | Outcome: failure (n = 275) | p
Age, months | | 54 [23; 89] | 29 [14; 73] | 0.001
Gender: | | | |
male | 253 | 137 | 116 | < 0.001
female | 530 | 371 | 159 |
VUR grade: | | | |
I | 20 | 18 (90%) | 2 (10%) | < 0.001
II | 133 | 110 (82.7%) | 23 (17.3%) |
III | 443 | 310 (70%) | 133 (30%) |
IV | 187 | 70 (37.4%) | 117 (62.6%) |
Complete duplication | 53 | 31 (58.5%) | 22 (41.5%) | 0.001
UDI | | 0.4 ± 0.02 | 0.6 ± 0.03 | < 0.001

Notes: VUR – vesicoureteral reflux; UDI – ureteral dilatation index.

Statistical analysis. Qualitative variables were compared with the χ² test, and continuous variables with a non-normal distribution were compared with nonparametric methods. Cohen's kappa coefficient was used to measure the consistency of VUR grade assessment between clinics.
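As an illustration of the χ² comparison, the sketch below computes the Pearson χ² statistic for the VUR grade by outcome counts given in Table 1. It is a plain-Python reimplementation written for this example, not the SPSS procedure used in the study; the critical value 16.27 is the tabulated χ² quantile for 3 degrees of freedom at α = 0.001.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Rows: VUR grades I-IV; columns: success, failure (counts from Table 1).
grade_by_outcome = [[18, 2], [110, 23], [310, 133], [70, 117]]
stat, dof = chi_square_statistic(grade_by_outcome)

CRITICAL_DF3_ALPHA_0001 = 16.27  # tabulated chi-square quantile, df = 3, alpha = 0.001
print(f"chi2 = {stat:.1f}, df = {dof}, significant at 0.001: {stat > CRITICAL_DF3_ALPHA_0001}")
```

The statistic exceeds the critical value by a wide margin, which agrees with the p < 0.001 reported for VUR grade in Table 1.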

To predict ET results, a regression model (binary logistic regression) and a multilayer ANN (multilayer perceptron) with two hidden layers and a sigmoid activation function were used. The training sample of the ANN comprised a randomly selected 70% of the active data. An interactive learning method was used, as it is the most suitable for data sets with associated independent variables. The ANN was built repeatedly until the estimated importance of each variable became stable. The estimated synaptic weights of the ANN were then stored in an XML file for scoring purposes. The age and gender of the patients, the VUR grade, the presence of ureteral duplication, and the UDI were selected as independent variables.

The “gender” and “ureter duplication” variables were encoded as follows: boy – 1, girl – 0; duplication – 1, no duplication – 0. The output variable “result” was encoded as 0 for failure (no resolution of VUR) and 1 for success (resolution of VUR). These independent variables were chosen because they are the most frequently used predictors of VUR treatment results in the literature and because they influenced the result of a single ET (Table 1). The models were built on the data of three clinics, with an emphasis on testing the main ANN and LR models, built on the reference clinic data, against the data of the two additional clinical institutions.
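The encoding and the network architecture described above can be sketched as a forward pass through a multilayer perceptron with two hidden layers and sigmoid activations. This is an illustrative sketch only: the layer sizes, the input scaling, the sigmoid output unit, and the randomly generated weights are all assumptions, and the weights are placeholders rather than the synaptic weights stored by the authors.

```python
import math
import random


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def encode(age_months, is_boy, vur_grade, has_duplication, udi):
    """Encode one ureter as a feature vector (boy = 1, girl = 0; duplication = 1, none = 0)."""
    return [
        age_months / 192.0,               # assumed scaling: 16 years = 192 months
        1.0 if is_boy else 0.0,
        vur_grade / 4.0,                  # assumed scaling of grades I-IV
        1.0 if has_duplication else 0.0,
        udi,
    ]


def dense(inputs, weights, biases):
    """One fully connected layer followed by a sigmoid activation."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def mlp_forward(features, layers):
    """Propagate the features through all layers and return the single output."""
    activation = features
    for weights, biases in layers:
        activation = dense(activation, weights, biases)
    return activation[0]


def random_layer(n_in, n_out, rng):
    """Placeholder weights; a trained network would load stored values instead."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases


rng = random.Random(0)
layers = [random_layer(5, 4, rng), random_layer(4, 3, rng), random_layer(3, 1, rng)]
features = encode(age_months=44, is_boy=True, vur_grade=3, has_duplication=False, udi=0.4)
probability = mlp_forward(features, layers)
print(f"untrained network output: {probability:.3f}")
```

With trained weights in place of the random ones, the output would be read as the estimated probability that the encoded result equals 1.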

The resulting model was used to predict the ET results in two clinical institutions of the Russian Federation that used a similar DxHA ET technique. Predictive quality was assessed by the area under the ROC curve. The IBM SPSS Statistics 22 software package (SPSS Regression Models and SPSS Neural Networks modules) was used for the calculations.

Univariate regression analysis revealed a statistically significant effect of the selected variables on the outcome of a single ET (Table 2).

Table 2. Univariate logistic regression of factors affecting the outcomes of endoscopic treatment

Variables | B/b | S | OR | 95% CI | p
Age | 0.96/-0.7 | 0.2 | 0.5 | 0.37–0.67 | < 0.001
Gender | 0.9/-0.7 | 0.2 | 0.5 | 0.37–0.73 | < 0.001
VUR grade | 3.95/-1 | 0.1 | 0.4 | 0.27–0.44 | < 0.001
Ureteral duplication | 0.7/-0.9 | 0.3 | 0.4 | 0.23–0.76 | < 0.005
UDI | 1.3/-2.9 | 0.6 | 0.2 | 0.04–0.6 | < 0.001

Notes: 1) B – constant; b – regression coefficient; S – root-mean-square error; OR – odds ratio; CI – confidence interval. 2) VUR – vesicoureteral reflux; UDI – ureteral dilatation index.

Table 3. Multivariate logistic regression of factors affecting the outcomes of endoscopic treatment

Variables | b | S | p | OR | 95% CI, lower limit | 95% CI, upper limit
Model with all variables: | | | | | |
Gender | -0.46 | 0.34 | 0.17 | 0.62 | 0.32 | 1.2
Age | -0.003 | 0.004 | 0.42 | 1.0 | 0.99 | 1.01
UDI | -2.16 | 0.7 | 0.002 | 0.16 | 0.03 | 0.45
VUR grade | -0.62 | 0.29 | 0.032 | 0.54 | 0.31 | 0.95
Duplication | -0.57 | 0.57 | 0.33 | 0.57 | 0.19 | 1.75
Constant | 3.3 | 0.97 | 0.001 | 27.3 | |
Final model (UDI only): | | | | | |
UDI | -3.1 | 0.62 | < 0.001 | 0.05 | 0.01 | 0.16
Constant | 1.47 | 0.33 | < 0.001 | 4.3 | |

Notes: 1) b – regression coefficient; S – root-mean-square error; p – statistical significance; OR – odds ratio; CI – confidence interval. 2) VUR – vesicoureteral reflux; UDI – ureteral dilatation index.
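The relation between the columns of Table 3 can be checked with a short calculation: the odds ratio is the exponent of the regression coefficient, and an approximate 95% confidence interval is obtained from the coefficient plus or minus 1.96 times its error. The sketch below assumes that the column S can be treated as the standard error of b and that a Wald-type interval was used; both are assumptions, not statements from the study.

```python
import math


def odds_ratio_with_ci(b: float, s: float, z: float = 1.96):
    """Return (OR, lower limit, upper limit) for coefficient b with error s."""
    return math.exp(b), math.exp(b - z * s), math.exp(b + z * s)


# VUR grade row of Table 3: b = -0.62, S = 0.29.
grade_or, grade_low, grade_high = odds_ratio_with_ci(-0.62, 0.29)
print(f"OR = {grade_or:.2f}, 95% CI {grade_low:.2f}-{grade_high:.2f}")
```

For the VUR grade row this gives values close to the tabulated OR of 0.54 with limits of 0.31 and 0.95; small differences are expected because the published coefficients are rounded.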

When all the explanatory variables were included in the multiple LR model, gender, age, and duplication proved statistically insignificant (Table 3). The Wald method was used to select variables for the final multiple LR model. The percentage of correct predictions, sensitivity, and specificity of the model were determined at each selection step. The best model quality was obtained with the UDI as the only variable (Table 3). This model gave 71.3% correct predictions with a sensitivity of 70.7% and a specificity of 66.3%; the area under the receiver operating characteristic (ROC) curve was 0.7 (CI 0.63–0.78, P 0.001).
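The final model with the UDI as the only variable can be written as a probability formula, p = 1 / (1 + exp(-(constant + b × UDI))). The sketch below uses the constant (1.47) and coefficient (-3.1) from Table 3 and assumes that the modeled outcome is the one encoded as 1 (resolution of VUR); the function names are hypothetical.

```python
import math

CONSTANT = 1.47   # constant of the final model (Table 3)
B_UDI = -3.1      # UDI coefficient of the final model (Table 3)


def predicted_success_probability(udi: float) -> float:
    """Logistic model: probability of the outcome encoded as 1 for a given UDI."""
    logit = CONSTANT + B_UDI * udi
    return 1.0 / (1.0 + math.exp(-logit))


def udi_at_probability(p: float) -> float:
    """Invert the model: the UDI at which the predicted probability equals p."""
    return (math.log(p / (1.0 - p)) - CONSTANT) / B_UDI


for udi in (0.2, 0.4, 0.6):
    print(f"UDI {udi:.1f}: predicted probability {predicted_success_probability(udi):.2f}")
print(f"UDI at 50% probability: {udi_at_probability(0.5):.2f}")
```

Under these assumptions the predicted probability falls below 50% once the UDI exceeds roughly 0.47, which lies between the mean UDI values of the success and failure groups in Table 1.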

The ANN ranked the UDI, the VUR grade, and duplication as the most important variables (normalized importance of 100%, 92.3%, and 49.4%, respectively). The ANN gave 74.5% correct predictions, with an area under the ROC curve of 0.77 (CI 0.67–0.65, P 0.001), a sensitivity of 85.5%, and a specificity of 65.3%.
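The quality measures reported for both models (percentage of correct predictions, sensitivity, specificity, and area under the ROC curve) can be computed as in the following sketch. The outcome labels and predicted scores are invented toy data, the 0.5 cut-off is an assumption, and the AUC is calculated with the rank-based (Mann-Whitney) formulation rather than with the SPSS routine used in the study.

```python
def confusion_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Toy data: 1 = reflux resolved, 0 = reflux persisted.
outcomes = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.5, 0.3, 0.2]
predictions = [1 if s >= 0.5 else 0 for s in scores]

sensitivity, specificity, accuracy = confusion_metrics(outcomes, predictions)
auc = roc_auc(outcomes, scores)
print(f"sensitivity {sensitivity:.2f}, specificity {specificity:.2f}, "
      f"accuracy {accuracy:.3f}, AUC {auc:.4f}")
```

Sensitivity and specificity depend on the chosen cut-off, whereas the AUC summarizes the ranking quality over all cut-offs.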

When the samples from the two independent clinics were compared with the sample from the reference clinic, on which the prognostic models were built, statistically significant differences in some variables were noted (Table 4). Compared with Clinic 2, the Reference Clinic, where ET was performed in older children, differed significantly in the distribution of VUR grades and in the overall ET effectiveness. The results of surgery for grade III VUR also differed significantly from those in both Clinic 1 and Clinic 2.

Given that the clinics used the same ET technique, different interpretations of VUR severity could underlie the differences found in surgical effectiveness.

Table 4. Overview of patients’ demographics and outcomes of a single DxHA injection in 2 independent hospitals

Hospital | Age, mo. | Gender, male/female | VUR grade (ureters, qty.) | ET efficiency, % | p
1 | Me 36 [18; 65] | 109 / 147 | I (4) | 100 | 0.5
1 | | | II (27) | 74.1 | 0.3
1 | | | III (297) | 62.3 | 0.03
1 | | | IV (48) | 47.9 | 0.2
1 | | | Duplication (34) | 32.4 | 0.09
1 | | | Total efficiency | 61.7 | 0.72
2 | Me 32.5 [12; 72] | 46 / 70 | II (22) | 77.3 | 0.5
2 | | | III (47) | 42.6 | < 0.001
2 | | | IV (47) | 23.4 | 0.07
2 | | | Duplication (27) | 44.4 | 0.9
2 | | | Total efficiency | 41.3 | < 0.001

Notes: 1) p – the significance of differences concerning the reference hospital’s data (χ² test); 2) VUR – vesicoureteral reflux; UDI – ureteral dilatation index.

To find out whether VUR grade was assessed inconsistently, the interpretations of 80 voiding cystourethrograms by the operating surgeons were compared. The VUR grade assessment was inconsistent in 42.9% of cases. Consistency between the Reference Clinic and Clinic 1 was weak (Cohen's kappa coefficient 0.26). Consistency between the Reference Clinic and Clinic 2 could be assessed as moderate (Cohen's kappa coefficient 0.61).
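Cohen's kappa corrects the observed agreement between two raters for the agreement expected by chance: kappa = (observed - expected) / (1 - expected). The sketch below shows the calculation on invented gradings of ten cystourethrograms; the rater lists are hypothetical and do not reproduce the study data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equally long lists of categorical ratings."""
    n = len(rater_a)
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


# Hypothetical VUR grades assigned to the same ten images by two clinics.
reference_clinic = [1, 2, 2, 3, 3, 3, 4, 4, 2, 3]
other_clinic = [1, 2, 3, 3, 3, 4, 4, 4, 2, 2]
kappa = cohen_kappa(reference_clinic, other_clinic)
print(f"kappa = {kappa:.2f}")
```

In this toy example the raters agree on 7 of 10 images, yet kappa is noticeably lower than 0.7 because part of that agreement is expected by chance.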

When multiple regression was applied to Clinic 1 and Clinic 2 (with the Wald method used to select variables for the final model), the UDI, age, and duplication were included in the model for Clinic 1. This model showed 69% correct predictions with an area of 0.7 under the ROC curve (p 0.001, CI 0.6–0.8). For Clinic 2, the UDI and the VUR grade were selected. This regression model showed 72% correct predictions with an area of 0.8 under the ROC curve (CI 0.7–0.9, p 0.001).

The ANN for Clinic 1 showed that, as in the Reference Clinic, the neural network considered the UDI (100%) and the VUR grade (78.5%) to be the most significant variables. According to the ROC analysis, the model showed good quality (AUC = 0.72, CI 0.7–0.8, p 0.001) with a correct prognosis of 69%, a sensitivity of 91%, and a specificity of 30.2%. For Clinic 2, the ANN correctly predicted the ET result in 78% of the operated ureters, with the UDI and the VUR grade being the most important variables (100% and 94%, respectively). The model had a sensitivity of 70% and a specificity of 82%, with an area of 0.8 under the ROC curve (CI 0.7–0.9; p 0.001). Notably, the synaptic weights stored for the Reference Clinic model also predicted the ET results in the other clinics: Clinic 1 (70% of correct prognosis, P 0.001, sensitivity 90.5% and specificity 40.5%, AUC value = 0.7; P 0.001, CI 0.6–0.7) and Clinic 2 (72% of correct prognosis, P 0.001, sensitivity 77% and specificity 68%, AUC value = 0.73; P 0.001, CI 0.6–0.8). Similarly, the stored LR coefficients obtained from the Reference Clinic predicted the ET results for both Clinic 1 (70% of correct prognosis, P 0.001, sensitivity 90.1% and specificity 33.3%, AUC value = 0.64; P 0.001, CI 0.6–0.7) and Clinic 2 (71.6% of correct prognosis, P 0.001, sensitivity 75% and specificity 69.1%, AUC value = 0.7; P 0.001, CI 0.63–0.82).
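External validation of this kind means that the coefficients estimated in one clinic are kept fixed and applied, without refitting, to the records of another clinic. The sketch below illustrates the idea with the UDI-only LR coefficients from Table 3; the external records, the 0.5 cut-off, and the function names are hypothetical.

```python
import math

# Coefficients of the final LR model (Table 3), kept fixed during validation.
REFERENCE_CONSTANT = 1.47
REFERENCE_B_UDI = -3.1


def score(udi: float) -> float:
    """Predicted probability of the outcome encoded as 1, using the frozen model."""
    return 1.0 / (1.0 + math.exp(-(REFERENCE_CONSTANT + REFERENCE_B_UDI * udi)))


def share_correct(records, threshold: float = 0.5) -> float:
    """Share of records whose predicted class matches the observed outcome."""
    correct = sum(1 for udi, outcome in records
                  if (score(udi) >= threshold) == bool(outcome))
    return correct / len(records)


# Hypothetical external records: (UDI, observed outcome; 1 = reflux resolved).
external_clinic = [(0.2, 1), (0.3, 1), (0.45, 0), (0.5, 0),
                   (0.7, 0), (0.35, 1), (0.6, 1), (0.8, 0)]
print(f"correct predictions: {share_correct(external_clinic):.0%}")
```

Keeping the coefficients fixed is what distinguishes this check from refitting a separate model in each clinic, as was done in the preceding paragraph.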

Currently, many surgeons consider ET with various bulking agents to be the initial treatment of choice for children with VUR [7][8]. The meta-analysis performed by Elder et al. [9] shows a fairly wide range (from 50% to 85%) in the overall effectiveness of a single ET reported in the included studies. Later, Routh et al. [10] published a systematic review of the results of VUR treatment using DxHA. Of the 47 studies reviewed, 3 were randomized, 7 were prospective, and the rest were retrospective. The authors also noted significant heterogeneity in the published results: the overall ET effectiveness varied from 42% to 92%. Possible sources of this heterogeneity include different criteria for evaluating ET effectiveness, sample size and representativeness, variants of the ureterovesical segment anatomy, and the surgical experience of the authors. However, the authors of both reviews note that, despite the significant variability of ET results, there was a general tendency for a higher VUR grade to worsen the results of the intervention. At the same time, heterogeneity was noted not only in the overall results but also in the subgroups formed by VUR grade according to the international classification. It is therefore possible to assume another source of heterogeneity: different interpretations of VUR severity.

Indeed, a study by Shaeffer et al. [11] showed only 59% agreement in VUR grade assessment among surveyed specialists experienced in diagnosing VUR in children. Similarly, in a survey of pediatric urologists, O'Neil et al. [12] found that the VUR grade was overestimated in 29–35% of cases, which could lead to overly aggressive treatment. Thus, it can be assumed that predicting treatment results from a qualitative assessment of VUR severity is subjective and can significantly affect the reliability of statistics. The present study confirmed this.

The concept of so-called “dilating” VUR has appeared in the literature in recent years. Ureteral dilatation is considered a possible predictor of the results of conservative or endoscopic reflux treatment. In 2008, Argibay et al. [13] graded ureteral dilatation qualitatively (normal diameter, moderate and severe dilatation) and showed that the ET outcome in VUR of grades III–IV depended on the dilatation grade more strongly than on the VUR grade. However, such a stratification cannot be sufficiently objective, since it is based on a subjective assessment of ureteral dilatation. Later, to make this assessment objective, an index was proposed, calculated as the ratio of the ureter diameter in the pelvic region to the lumbar vertebra height [14]. This avoids the subjectivity of a qualitative assessment. Notably, ureteral dilatation was the only statistically significant risk factor for the ET outcome in the study by Helmy et al. [15]. The degree of ureteral dilatation probably reflects the severity of ureterovesical segment decompensation and may thereby affect the ET outcome.

The works of Altug et al. [16] and Alizadeh et al. [17] showed that a golf-hole or horseshoe configuration of the ureteral orifice negatively affected the ET outcomes. It should be noted that the above-mentioned variables retained their significance regardless of the injection method (HIT or STING) or the type of bulking material [18][19].

Most studies devoted to predicting ET outcomes are based on traditional statistics and usually use linear or logistic regression [3][20]. Maximum likelihood estimates for LR models are often severely biased because of separation and multicollinearity arising from a large number of highly correlated predictors, especially when there are complex interactions between variables. The main problem with multicollinearity is the difficulty of interpreting the influence of individual variables on the dependent variable. Correlation between variables can also cause the model results to differ significantly when the model is applied to other samples. At the same time, in medical research multicollinearity may reflect the way the sample was formed and have a physiological meaning. In this study, the sample was formed from historical data of patients who had indications for ET. Statistical analysis showed that severe VUR was more frequent in the younger age group. This is probably because most younger children with low-grade VUR are treated conservatively, whereas ET for low-grade VUR was performed as a concomitant procedure in a bilateral process. The same relationship was found between VUR severity and gender: severe reflux prevailed in boys. This may be explained by the broader indications for ET in girls, who develop infection more frequently [21], even with low VUR grades. Thus, the analyzed sample, strictly speaking, was not a sample from the general population of children with VUR, and the variables were collinear. Because of this collinearity, univariate regression can lead to an erroneous conclusion about the influence of gender and age on the ET result.
The mutual influence of the factors can also restrict the use of multiple regression as a method for studying the influence of several variables on the dependent variable and can make the predictive model impossible to interpret correctly. In this study, variable exclusion based on the probability of the Wald statistic was used to reduce collinearity. This reduction of the feature space made it possible to obtain a good-quality predictive model using LR.
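One common way to quantify collinearity between two predictors is the Pearson correlation coefficient r and the corresponding variance inflation factor, VIF = 1 / (1 - r²). The sketch below is an illustration on invented values of VUR grade and UDI; it is not an analysis performed in the study, where collinearity was handled by Wald-based variable exclusion.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long numeric lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(ss_x * ss_y)


def variance_inflation_factor(r: float) -> float:
    """VIF for a model with two predictors whose correlation is r."""
    return 1.0 / (1.0 - r ** 2)


# Hypothetical values: VUR grade and UDI rising together.
vur_grade = [1, 2, 3, 4]
udi = [0.2, 0.3, 0.5, 0.6]
r = pearson_r(vur_grade, udi)
vif = variance_inflation_factor(r)
print(f"r = {r:.3f}, VIF = {vif:.1f}")
```

A large VIF indicates that the two coefficients cannot be estimated independently, which is the situation in which removing one of the correlated predictors stabilizes the model.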

ANNs are increasingly used in medicine [4][22][23]. Arlen et al. [24] used an ANN to predict febrile infection in children with VUR and reported high predictive accuracy (the area under the ROC curve was 0.76). Knudson et al. [25] showed high accuracy of an ANN (0.86) in predicting early resolution with conservative VUR treatment. As for the ANN as a method for predicting ET effectiveness, only one study was found in the available literature: Serrano-Durbá et al. [5] established the high predictive value of the ANN (prediction accuracy of 0.77).

Another advantage of the ANN is the possibility of training over several cycles with random reordering of the observations. The interactive learning method allows the network to be trained when the input variables are not independent of each other. It is also possible to evaluate the importance of variables on the combined training and test samples. This analysis reduces the sensitivity to complex interactions between variables and reduces the feature space.

In this study, LR and ANN demonstrated fairly high predictive quality in different clinics that used the same DxHA surgical technique. The degree of ureteral dilatation was the most important variable for predicting the result of a single ET. Thus, predictive models based on binary LR and ANN can be used prospectively in patients with VUR to determine the probability of the ET outcome.