Application of Artificial Neural Networks to Estimate Wastewater Treatment Plant Inlet Biochemical Oxygen Demand

Emrah Dogan,a Asude Ates,b Ece Ceren Yilmaz,b Beytullah Erenb

a Department of Civil Engineering, Sakarya University, Esentepe Campus, 54187 Sakarya, Turkey; [email protected] (for correspondence)
b Department of Environmental Engineering, Sakarya University, Esentepe Campus, 54187 Sakarya, Turkey

Published online 30 July 2008 in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/ep.10295

Biochemical oxygen demand (BOD) has been shown to be an important variable in water quality management and planning. However, BOD is difficult to measure, and the test takes a long time (5 days) to yield results. Artificial neural networks (ANNs) are increasingly used to predict and forecast water resource variables. The objective of this research was to develop an ANN model to estimate daily BOD at the inlet of wastewater biochemical treatment plants. The plant-scale data set (364 daily records from the year 2005) was obtained from a local wastewater treatment plant. Various combinations of daily water quality data, namely chemical oxygen demand (COD), water discharge (Qw), suspended solids (SS), total nitrogen (N), and total phosphorus (P), are used as inputs to the ANN to evaluate the degree of effect of each of these variables on the daily inlet BOD. The results of the ANN model are compared with those of the multiple linear regression (MLR) model. Mean square error, average absolute relative error, and coefficient of determination statistics are used as comparison criteria for evaluating model performance. The ANN model whose inputs are COD, Qw, SS, N, and P gave a mean square error of 708.01, an average absolute relative error of 10.03%, and a coefficient of determination of 0.919. On the basis of the comparisons, it was found that the ANN model could be employed successfully in estimating the daily BOD at the inlet of wastewater biochemical treatment plants. © 2008 American Institute of Chemical Engineers Environ Prog, 27: 439–446, 2008

Environmental Progress (Vol.27, No.4) DOI 10.1002/ep

Keywords: water quality, artificial neural network, wastewater treatment plant, multiple linear regression model

INTRODUCTION

Industrial and municipal wastewaters are major contamination sources of aquatic biota, accounting for several thousand types of chemicals released into the environment. Thus, the importance of implementing efficient monitoring and control techniques for wastewater treatment systems is well known. A reliable model for any wastewater treatment plant is essential in order to provide a tool for predicting its performance and to form a basis for controlling the operation of the process. This process is complex and attains a high degree of nonlinearity due to the presence of bio-organic constituents that are difficult to model by using mechanistic approaches. Predicting the plant's operational parameters using conventional experimental techniques is also a time-consuming step and an obstacle to efficient control of such processes.

Biochemical oxygen demand is one of the major parameters for wastewater management and planning. It is an approximate measure of the amount of biochemically degradable organic matter present in a water sample. It is defined by the amount of oxygen required for the aerobic microorganisms present in the sample to oxidize the organic matter to a stable inorganic form. The oxygen consumption from degradation of organic material is normally measured as biochemical oxygen demand (BOD) and chemical oxygen demand (COD), so there is an important relation between them. Performing the test for BOD requires significant time and commitment for preparation and analysis. This process requires five days, with data collection and evaluation occurring on the last day [1].

Several water quality models, such as traditional mechanistic approaches, have been developed in order to manage the best practices for conserving the quality of water. Most of these models need several different input data that are not easily accessible, which makes modeling a very expensive and time-consuming process.

Artificial neural networks (ANNs) are computational techniques that attempt to simulate the functionality and decision-making processes of the human brain. In the past few decades, ANNs have been extensively used in a wide range of engineering applications. Recently, ANNs have been increasingly applied in modeling water quality. ANNs have been successfully used in hydrological processes, water resources, water quality prediction, and reservoir operation [2–7]. They have been used especially for forecasting water quality parameters and estimating nutrient concentrations from pollution sources in watersheds [8–10].

The research presented in this study is motivated by a desire to explore the potential of ANNs for estimating the biochemical oxygen demand of the input stream of a biochemical wastewater treatment plant. There is a five-day delay in the determination of BOD, and when this is added to the hydraulic residence time, it is often too late to make proper adjustments in the wastewater treatment process. The ANN is well suited to estimating inlet BOD owing to its ability to provide good generalization performance in capturing nonlinear regression relationships between predictors and the predictand. The performance of the ANN model is compared with that of multiple linear regression (MLR). Comparison of the results shows that the ANN model is statistically superior to the MLR model.

METHODS

ANNs

ANNs consist of a large number of processing elements with their interconnections. ANNs are basically parallel computing systems similar to biological neural networks. They can be characterized by three components:

- Nodes
- Weights (connection strength)
- An activation (transfer) function

ANN modeling is a nonlinear statistical technique; it can be used to solve problems that are not amenable to conventional statistical and mathematical methods. In the past few years there has been constantly increasing interest in neural network modeling in different fields of hydrology engineering [11]. The basic unit in the artificial neural network is the node. Nodes are connected to each other by links known as synapses; associated with each synapse there is a weight factor. Usually neural networks are trained so that a particular set of inputs produces, as nearly as possible, a specific set of target outputs.

The most commonly used ANN is the three-layer feed-forward ANN. In a feed-forward neural network architecture, there are layers and nodes at each layer. Each node in the input and inner layers receives input values, processes them, and passes them on to the next layer. This process is governed by weights; a weight is the connection strength between two nodes. The numbers of neurons in the input layer and the output layer are determined by the numbers of input and output parameters, respectively. In the present study, feed-forward artificial neural networks are used. The model is shown in Figure 1, where i, j, and k denote nodes in the input layer, hidden layer, and output layer, respectively, and w is the weight of a connection. Subscripts specify the connections between the nodes. For example, wij is the weight between nodes i and j. The term "feed-forward" means that a node connection only exists from a node in the input layer to nodes in the hidden layer or from a node in the hidden layer to nodes in the output layer, and that the nodes within a layer are not interconnected.

Figure 1. A typical three-layer feed forward ANN.
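The feed-forward structure described in this section can be illustrated with a minimal sketch. It assumes a log-sigmoid transfer function in both layers (the function the study reports using) and adds bias terms, which the text does not mention. The names `logsig`, `forward`, `w_ij`, `w_jk`, `b_j`, and `b_k`, as well as the random weights and inputs, are hypothetical and serve only to show how values pass from layer to layer.

```python
import numpy as np

def logsig(z):
    """Log-sigmoid transfer function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, w_ij, b_j, w_jk, b_k):
    """One forward pass: input nodes i -> hidden nodes j -> output nodes k."""
    hidden = logsig(x @ w_ij + b_j)      # weighted sums into the hidden nodes
    return logsig(hidden @ w_jk + b_k)   # weighted sums into the output nodes

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 5, 3, 1          # layer sizes chosen for illustration
w_ij = rng.normal(size=(n_in, n_hidden)) # weights between input and hidden nodes
b_j = np.zeros(n_hidden)                 # hidden-layer biases (assumed)
w_jk = rng.normal(size=(n_hidden, n_out))
b_k = np.zeros(n_out)                    # output-layer bias (assumed)

x = rng.uniform(0.1, 0.9, size=(4, n_in))  # four hypothetical scaled records
y = forward(x, w_ij, b_j, w_jk, b_k)       # one output value per record
```

Because no node feeds a node in its own layer or in an earlier layer, the whole pass reduces to two matrix products, one per set of connections.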

Table 1. The daily statistical parameters of each data set.

Data set      xmean      Sx        Cv (Sx/xmean)   Csx     xmin     xmax     Correlation with BOD
BOD (mg/L)    237.054    99.13     0.42            -0.16   33       610      1.000
COD (mg/L)    445.407    178.85    0.40            -0.31   73       865      0.954
N (mg/L)      48.42      17.89     0.37            -0.53   9.5      81       0.871
P (mg/L)      4.35       1.97      0.45            0.24    0.5      9.8      0.650
SS (mg/L)     322.228    162.28    0.50            1.87    52       1395     0.452
Qw (m3/s)     63,562     17,233    0.27            -0.73   15,920   94,410   -0.357

Multiple Linear Regression

If it is assumed that the dependent variable Y is affected by m independent variables X1, X2, ..., Xm and a linear equation is selected for the relation among them, the regression equation of Y can be written as:

$$y = a + b_1 x_1 + b_2 x_2 + \cdots + b_m x_m \qquad (1)$$

y in this equation is the expected value of the variable Y when the independent variables take the values X1 = x1, X2 = x2, ..., Xm = xm. The regression coefficients a, b1, b2, ..., bm are evaluated, as in simple regression, by minimizing the sum of the squared distances e_yi of the observation points from the plane expressed by the regression equation [12]:

$$\sum_{i=1}^{N} e_{y_i}^{2} = \sum_{i=1}^{N} \left( y_i - a - b_1 x_{1i} - b_2 x_{2i} - \cdots - b_m x_{mi} \right)^{2} \qquad (2)$$
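Minimizing the sum in Equation 2 is an ordinary least-squares problem. The following minimal sketch solves it with NumPy on synthetic, noise-free data, so the fitted coefficients can be checked against the values used to generate the data. The sample size, predictor count, and coefficient values are hypothetical and are not taken from the study.

```python
import numpy as np

rng = np.random.default_rng(1)
N, m = 50, 3                               # hypothetical sample size and predictor count
X = rng.uniform(0.0, 10.0, size=(N, m))    # synthetic predictors x_1 .. x_m
true_a = 2.0                               # hypothetical intercept
true_b = np.array([0.5, -1.0, 3.0])        # hypothetical slopes
y = true_a + X @ true_b                    # noise-free synthetic response

# Prepend a column of ones so the intercept a is estimated together with b_1 .. b_m.
A = np.column_stack([np.ones(N), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
a, b = coef[0], coef[1:]

residuals = y - (a + X @ b)
sse = float(np.sum(residuals ** 2))        # the quantity minimized in Equation 2
```

With measured data the residual sum `sse` would not vanish; the estimated `a` and `b` are then the coefficients that make it as small as possible.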

In this study, the coefficients of the regressions were determined using the least-squares method.

COMPILATION OF DATA

The daily BOD, COD, total phosphorus (P), total nitrogen (N), suspended solids (SS), and water discharge (Qw) data were obtained from one of the inlets of the wastewater treatment plant in Adapazari city, Turkey. The data set is based on the analysis of daily composite samples, each representing the average composition of the wastewater entering the plant and formed from automatic grab samples taken every hour over a 24-h period. The daily statistical parameters of each data set are given in Table 1. In this table, xmean, Sx, Cv, Csx, xmin, and xmax denote the mean, standard deviation, coefficient of variation, skewness coefficient, minimum, and maximum of the data, respectively. Table 1 shows that SS is the most variable data set (Cv = 0.50). The highest correlation coefficient with BOD (0.954) belongs to COD. BOD and Qw are inversely related, with a correlation coefficient of -0.357.

APPLICATION OF ANN MODEL

In this study, before the training of the model, both input and output variables were normalized within the range 0.1 to 0.9 as follows:

$$x_i = 0.8\,\frac{x - x_{\min}}{x_{\max} - x_{\min}} + 0.1 \qquad (3)$$

where x_i is the normalized value of a certain parameter, x is the measured value for this parameter, and x_min and x_max are the minimum and maximum values in the database for this parameter, respectively.

To develop an ANN model for estimating BOD, the available data set (364 daily water quality records from the year 2005) was randomly partitioned into a training set and a test set. About 67% of the available record (244 daily records) was selected for training, while the remaining 33% (120 daily records) was used for testing. For all networks, the general structure of one input layer, one hidden layer, and one output layer was used. To determine the optimal architecture, several neural networks were trained with different iteration numbers (epochs) and numbers of nodes in the hidden layer. For all cases a "log sigmoid transfer function (logsig)" was used in the hidden and output layers. Because logsig outputs lie in the range 0–1, the inputs and the outputs were normalized to within this range. The most accurate estimations of the ANNs were obtained with the log sigmoid transfer function [13].

SELECTION OF INPUT PARAMETERS FOR THE ANN

The selection of the input parameters is a very important aspect of neural network modeling. To use ANN structures effectively, the input variables of the phenomenon must be selected with great care, which depends heavily on a good understanding of the problem. So as not to confuse the training process of a given ANN architecture, key variables must be introduced and unnecessary variables must be avoided. For this purpose, a sensitivity analysis can be used to find the key parameters; when sufficient data are available, it can also determine the relative importance or effectiveness of each variable on the output. Input variables that do not have a significant effect on the performance of an ANN can be excluded, resulting in a more compact network. Methods such as sensitivity analysis are therefore needed to make an ANN work effectively.

BOD depends on several independent parameters, which can be written in the form BOD = f(COD, N, P, Qw, SS). Five ANN models were established, each using one independent parameter separately, and sensitivity analysis was applied to find the most effective input parameters. The coefficient of determination (R²) obtained for each parameter is given in Figure 2. It is clearly seen from Figure 2 that the most effective parameter is COD.

Figure 2. Comparison of ANN results and observed BOD depending on each input parameter.

Table 2. Determination of the most appropriate ANN architecture.

ANN structure (number of nodes in layers)   Iteration number (epoch)   Determination coefficient (R²)   Mean square error (MSE)
ANN(5, 2, 1)                                1000                       0.917                            716.21
ANN(5, 3, 1)                                1000                       0.919                            708.01
ANN(5, 5, 1)                                1000                       0.915                            737.30
ANN(5, 7, 1)                                1000                       0.913                            757.42
ANN(5, 10, 1)                               1000                       0.895                            869.38
ANN(5, 2, 1)                                2000                       0.908                            743.79
ANN(5, 3, 1)                                2000                       0.909                            755.84
ANN(5, 5, 1)                                2000                       0.894                            864.95
ANN(5, 7, 1)                                2000                       0.898                            828.60
ANN(5, 10, 1)                               2000                       0.884                            966.52
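One single-input model of the kind used in this screening can be sketched as follows. The sketch scales the data with Equation 3, partitions 364 records at random into 244 training and 120 test records, and trains a one-hidden-layer logsig network. It assumes plain batch gradient descent with a fixed learning rate and bias terms, since the text does not state the training algorithm. The data are synthetic, the linear relation between the input and BOD is invented for illustration, and the names `scale`, `predict`, and `train` are hypothetical.

```python
import numpy as np

def logsig(z):
    """Log-sigmoid transfer function."""
    return 1.0 / (1.0 + np.exp(-z))

def scale(x, x_min, x_max):
    """Equation 3: map measured values linearly into the range 0.1 to 0.9."""
    return 0.8 * (x - x_min) / (x_max - x_min) + 0.1

def predict(params, x):
    """Forward pass of a network with one hidden layer and one output node."""
    w1, b1, w2, b2 = params
    return logsig(logsig(x @ w1 + b1) @ w2 + b2)

def train(x, t, n_hidden=3, epochs=1000, lr=0.5, seed=0):
    """Batch gradient descent on the MSE; returns the weights and per-epoch MSE."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(scale=0.5, size=(x.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    w2 = rng.normal(scale=0.5, size=(n_hidden, 1))
    b2 = np.zeros(1)
    history = []
    for _ in range(epochs):
        h = logsig(x @ w1 + b1)                    # hidden-layer activations
        o = logsig(h @ w2 + b2)                    # network output
        err = o - t
        history.append(float(np.mean(err ** 2)))
        d_o = 2.0 * err * o * (1.0 - o) / len(x)   # MSE gradient at the output node
        d_h = (d_o @ w2.T) * h * (1.0 - h)         # MSE gradient at the hidden nodes
        w2 -= lr * (h.T @ d_o)
        b2 -= lr * d_o.sum(axis=0)
        w1 -= lr * (x.T @ d_h)
        b1 -= lr * d_h.sum(axis=0)
    return (w1, b1, w2, b2), history

# Synthetic stand-in data: 364 hypothetical daily records of one input and BOD.
rng = np.random.default_rng(42)
cod = rng.uniform(73.0, 865.0, size=364)
bod = 0.5 * cod + rng.normal(scale=20.0, size=364)  # invented relation

x = scale(cod, cod.min(), cod.max()).reshape(-1, 1)
t = scale(bod, bod.min(), bod.max()).reshape(-1, 1)

order = rng.permutation(364)                        # random partition of the record
train_idx, test_idx = order[:244], order[244:]

params, history = train(x[train_idx], t[train_idx])
test_mse = float(np.mean((predict(params, x[test_idx]) - t[test_idx]) ** 2))
```

Repeating the call once per candidate input and comparing the resulting test statistics mirrors the comparison summarized in Figure 2. Varying `n_hidden` and `epochs` in the same way corresponds to the kind of architecture search reported in Table 2.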

DETERMINATION OF AN APPROPRIATE ANN MODEL

One of the problems that occur during neural network training is called overfitting. Overfitting is indicated when the error on the training set is driven to a very small value while the error for test data presented to the network is large. That means the network has memorized the training examples but has not learned to generalize to new situations. To avoid overfitting the training data, an appropriate epoch number, number of hidden layers, and number of nodes in the hidden layers must be chosen by a trial-and-error process.

Networks are sensitive to the number of nodes in their hidden layers. Too few nodes can lead to underfitting, and too many nodes can result in overfitting. To reach an optimum number of hidden-layer nodes, networks with different numbers of hidden nodes were tested. The results for 2, 3, 5, 7, and 10 hidden nodes are shown in Table 2. The first column in this table denotes the nodes of each layer for the ANN models. Accordingly, an ANN structure written as ANN(i,j,k) indicates a network architecture with i, j, and k nodes in the input, hidden, and output layers, respectively. In this case the input layer covers COD, total phosphorus, total nitrogen, suspended solids, and water discharge (COD, P, N, SS, Qw), and the output layer consists of BOD. It can be seen from Table 2 that the ANN(5,3,1) model with 1000 iterations, having an R² value of 0.919 and an MSE value of 708.01, is the best model.

Networks are also sensitive to the number of hidden layers. In this study, only ANN architectures with one and two hidden layers were tested, since systems with three or more hidden layers are known to cause unnecessary computational overload. The variation of training and test MSE values for ANN models with one and two hidden layers is presented in Figures 3a and 3b. As seen from these figures, the ANN architecture with one hidden layer turns out to be a more stable design. Thus, for this study an ANN with one hidden layer, three hidden-layer nodes, and 1000 epochs has been adopted. If the epoch number exceeds 1000, deviations appear and, thus, overfitting begins, as illustrated in Figure 3b.

SENSITIVITY ANALYSIS

It appears that while assessing the performance of any model for its applicability in estimating BOD, it is important to evaluate not only the average prediction error but also the distribution of prediction errors. The statistical performance evaluation criteria employed so far in this study are global statistics (R² and MSE) and do not provide any information on the distribution of errors [14]. Therefore, in order to test the robustness of the model developed, it is important to test the model using other performance evaluation criteria such as the average absolute relative error (AARE). The AARE not only gives the performance index in terms of predicting BOD but also shows the distribution of the prediction errors. These criteria can be computed as:

$$\mathrm{AARE} = \frac{1}{N} \sum_{p=1}^{N} \left| \mathrm{RE}_p \right| \qquad (4)$$

in which

$$\mathrm{RE}_p = \frac{t_p - o_p}{t_p} \times 100 \qquad (5)$$

where RE_p is the relative error in the forecast expressed as a percentage, t_p is the observed BOD for the pth pattern, o_p is the computed BOD for the pth pattern produced by the ANN, and N is the total number of testing patterns. Clearly, the smaller the value of AARE, the better the performance. The performance of the ANN outputs was evaluated by estimating the coefficient of determination (R²), which is defined as:

$$R^2 = \frac{\mathrm{BOD}_o - \mathrm{BOD}_s}{\mathrm{BOD}_o} \qquad (6)$$

where

$$\mathrm{BOD}_o = \sum_{p=1}^{N} \left( t_p - t_{\mathrm{mean}} \right)^2 \qquad (7)$$

$$\mathrm{BOD}_s = \sum_{p=1}^{N} \left( t_p - o_p \right)^2 \qquad (8)$$

and t_mean is the mean observed BOD. The mean square error (MSE) is defined as:

$$\mathrm{MSE} = \frac{1}{N} \sum_{p=1}^{N} \left( t_p - o_p \right)^2 \qquad (9)$$

COD is used as the common parameter for the rest of the sensitivity analysis. The performance of all possible combinations of variables, each of which includes COD, was also investigated. The findings are listed in Table 3. Based on these findings, the ANN model with five inputs (COD, N, P, SS, Qw) gives the best estimation.

Figure 3. The observed and estimated biological oxygen demand values. (a) ANN system with one hidden layer (MATLAB, 2004). (b) ANN system with two hidden layers (MATLAB, 2004).

Figure 4. The observed and estimated biological oxygen demand values.

APPLICATION OF MLR

The performance criteria for the test results of the MLR model are given in Table 4. As can be seen from Table 4, the model has the highest MSE and AARE and the lowest R² values when COD is the only input. Adding total nitrogen, total phosphorus, and suspended solids improves the model's performance, whereas adding water discharge decreases it. Thus, although the effect of water discharge on the phenomenon was shown in Table 1 and Figure 2, the MLR model does not capture this effect, which is a drawback of the MLR.
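The three criteria reported for both models (Equations 4 to 9) can be computed as in the following minimal sketch. R² is taken as (BOD_o - BOD_s)/BOD_o, the standard form of the coefficient of determination. The function names and the four observed and estimated values are hypothetical and are chosen only so that the results can be checked by hand.

```python
import numpy as np

def aare(t, o):
    """Average absolute relative error in percent (Equations 4 and 5)."""
    return float(np.mean(np.abs((t - o) / t * 100.0)))

def r_squared(t, o):
    """Coefficient of determination (Equations 6 to 8)."""
    bod_o = np.sum((t - np.mean(t)) ** 2)   # spread of observations about their mean
    bod_s = np.sum((t - o) ** 2)            # squared estimation errors
    return float((bod_o - bod_s) / bod_o)

def mse(t, o):
    """Mean square error (Equation 9)."""
    return float(np.mean((t - o) ** 2))

t = np.array([100.0, 200.0, 300.0, 400.0])  # hypothetical observed BOD (mg/L)
o = np.array([110.0, 190.0, 310.0, 380.0])  # hypothetical model estimates (mg/L)

scores = {"AARE": aare(t, o), "MSE": mse(t, o), "R2": r_squared(t, o)}
```

For these values the relative errors are 10%, 5%, 3.33%, and 5%, so the AARE is about 5.83%, the MSE is 175, and R² is 0.986. Because the AARE divides each error by the observed value, errors on low-BOD days weigh more heavily in it than they do in the MSE.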

Table 3. Performance evaluation of the effective parameters using ANN for sensitivity analysis.

Performance   COD      COD + N   COD + N + P   COD + N + P + SS   COD + N + P + SS + Qw
AARE (%)      10.52    10.25     10.18         10.10              10.03*
MSE           819.41   778.67    726.25        714.69             708.01*
R²            0.903    0.903     0.911         0.915              0.919*

Best results are indicated by "*".

Table 4. The performances of the MLR in the test period.

Performance   COD      COD + N   COD + N + P   COD + N + P + SS   COD + N + P + SS + Qw
AARE (%)      10.35    10.21     10.21         10.20*             10.34
MSE           798.92   747.73    741.26        738.98*            752.37
R²            0.902    0.908     0.908         0.909*             0.906

Best results are indicated by "*".
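The size of the differences between the two models can be read directly from the MSE rows of Tables 3 and 4. The short sketch below does this arithmetic. The numbers are copied from the tables, while the variable names and the choice of MSE as the ranking criterion are illustrative assumptions.

```python
combos = ["COD", "COD+N", "COD+N+P", "COD+N+P+SS", "COD+N+P+SS+Qw"]
ann_mse = [819.41, 778.67, 726.25, 714.69, 708.01]   # Table 3, MSE row
mlr_mse = [798.92, 747.73, 741.26, 738.98, 752.37]   # Table 4, MSE row

# Lowest test MSE for each model, paired with its input combination.
best_ann = min(zip(ann_mse, combos))
best_mlr = min(zip(mlr_mse, combos))

# Relative reduction in MSE of the best ANN with respect to the best MLR, in percent.
reduction_pct = 100.0 * (best_mlr[0] - best_ann[0]) / best_mlr[0]

# Change in MLR test MSE when Qw is added to COD+N+P+SS.
qw_penalty = mlr_mse[4] - mlr_mse[3]
```

On these figures the best ANN (all five inputs) has a test MSE about 4.2% lower than the best MLR (COD, N, P, and SS), and adding Qw raises the MLR test MSE by 13.39.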

The performance of the selected neural network model and the MLR model in predicting BOD is shown in Figure 4 for the test data set. As can be seen from Figure 4, both the ANN and MLR estimates follow the corresponding experimentally measured data, with high R² values of 0.919 and 0.909, respectively. Furthermore, the ANN statistically outperforms the MLR in terms of BOD estimation.

CONCLUSION

The present study demonstrates the capabilities of the ANN model for BOD modeling; however, the choice of ANN architecture and input parameters is crucial for obtaining good estimation accuracy. Thus, a sensitivity analysis was conducted to determine the degree of effectiveness of the variables by using various performance statistics. From the results obtained, an ANN model appears to be a useful tool for prediction of the inlet BOD. The results demonstrate that COD is more effective for BOD estimation than the other four parameters. The remaining parameters were added one by one in estimating BOD. After the application of sensitivity analysis, the other effective parameters were determined as total nitrogen (N), total phosphorus (P), suspended solids (SS), and water discharge (Qw), in that order. The model whose inputs are COD, water discharge, suspended solids, total nitrogen, and total phosphorus has the best performance criteria among the input combinations tried in the study. This indicates that all these variables are needed for better BOD modeling. The MLR model was also used for predicting BOD. However, although the sensitivity analysis showed that each independent parameter is effective, the MLR model does not capture the effect of water discharge, which is a drawback of the MLR. On the basis of the comparison results, the ANN technique was found to be superior to the MLR technique.

LITERATURE CITED

1. Chapman, D. (1992). Water quality assessments (1st Edition, pp. 80–81), London: Chapman and Hall.
2. Suen, J. P., Eheart, J. W., & Asce, M. (2003). Evaluation of neural networks for modelling nitrate concentration in rivers, Journal of Water Resource Plan and Man, 129, 505–510.
3. Aguilera, P. A., Frenich, A. G., Torres, J. A., Castro, H., Vidal, J. L. M., & Canton, M. (2001). Application of the Kohonen neural network in coastal water management: Methodological development for the assessment and prediction of water quality, Water Research, 35, 4053–4062.
4. Lobbrect, A. H., & Solomatine, D. P. (1999). Control of water levels in polder areas using neural networks and fuzzy adaptive systems, Water Industry Systems: Modeling and Optimization Applications, 1, 509–518.
5. Maier, H. R., & Dandy, G. C. (1996). The use of artificial neural networks for the prediction of water quality parameters, Water Resources Research, 32, 1013–1022.
6. Wen, C. G., & Lee, C. S. (1998). A neural network approach to multiobjective optimization for water quality management in a river basin, Water Resources Research, 34, 427–436.
7. Zaheer, I., & Bai, C. G. (2003). Application of artificial neural network for water quality management, Lowland Technology International, 5, 10–15.
8. Fogelman, S., Blumenstein, M., & Zhao, H. (2006). Estimation of chemical oxygen demand by ultraviolet spectroscopic profiling and artificial neural networks, Neural Computation and Applications, 15, 197–203.
9. Sengorur, B., Dogan, E., Koklu, R., & Samandar, A. (2006). Dissolved oxygen estimation using artificial neural network for water quality control, Fresenius Environmental Bulletin, 15(9a), 1064–1067.
10. Sovan, L. G., Maritxu, A., & Giraudel, J. (1999). Prediction of stream nitrogen concentration from watershed features using neural network, Water Resources Research, 33, 3469–3478.
11. ASCE Task Committee. (2000). Artificial neural networks in hydrology. I. Preliminary concepts, Journal of Hydrologic Engineering ASCE, 5, 115–123.
12. Bayazıt, M., & Oguz, B. (1998). Probability and statistics for engineers (p. 159), Istanbul, Turkey: Birsen Publishing House.
13. MATLAB. (2004). Documentation Neural Network Toolbox Help Version 7.0, Release 14, The MathWorks, Inc.
14. Dogan, E., Sasal, M., & Isik, S. (2005). Suspended sediment load estimation in lower Sakarya river by using soft computational methods, Proceeding of the International Conference on Computational and Mathematical Methods in Science and Engineering, CMMSE (pp. 395–406), 2005, Alicante, Spain.

Environmental Progress (Vol.27, No.4) DOI 10.1002/ep

Biochemical oxygen demand (BOD) has been shown to be an important variable in water quality management and planning. However, BOD is difﬁcult to measure and needs longer time periods (5 days) to get results. Artiﬁcial neural networks (ANNs) are being used increasingly to predict and forecast water resource variables. The objective of this research was to develop an ANNs model to estimate daily BOD in the inlet of wastewater biochemical treatment plants. The plantscale data set (364 daily records of the year 2005) was obtained from a local wastewater treatment plant. Various combinations of daily water quality data, namely chemical oxygen demand (COD), water discharge (Qw), suspended solid (SS), total nitrogen (N), and total phosphorus (P) are used as inputs into the ANN so as to evaluate the degree of effect of each of these variables on the daily inlet BOD. The results of the ANN model are compared with the multiple linear regression model (MLR). Mean square error, average absolute relative error, and coefﬁcient of determination statistics are used as comparison criteria for the evaluation of the model performance. The ANN technique whose inputs Ó 2008 American Institute of Chemical Engineers

Environmental Progress (Vol.27, No.4) DOI 10.1002/ep

are COD, Qw, SS, N, and P gave mean square errors of 708.01, average absolute relative errors of 10.03%, and a coefﬁcient of determination 0.919, respectively. On the basis of the comparisons, it was found that the ANN model could be employed successfully in estimating the daily BOD in the inlet of wastewater biochemical treatment plants. Ó 2008 American Institute of Chemical Engineers Environ Prog, 27: 439–446, 2008

Keywords: water quality, artiﬁcial neural network, wastewater treatment plant, multiple linear regression model INTRODUCTION

Industrial and municipal wastewaters are major contamination sources of aquatic biota, accounting for several thousand types of chemicals released into the environment. Thus, the importance of implementing efﬁcient monitoring and control techniques for wastewater treatment systems is well known. A reliable model for any wastewater treatment plant is essential in order to provide a tool for predicting its performance and to form a basis for controlling the operation of the process. This process is complex and attains a high degree of nonlinearity due to the presence of bio-organic constituents that are difﬁcult December 2008 439

to model by using mechanistic approaches. Predicting the plant’s operational parameters using conventional experimental techniques, is also a time-consuming step and an obstacle in the way to an efﬁcient control of such processes. Biochemical oxygen demand is one of the major parameters for wastewater management and planning. It is an approximate measure of the amount of biochemical degradable organic matter present in a water sample. It is deﬁned by the amount of oxygen required for the aerobic microorganisms present in the sample to oxidize the organic matter to a stable organic form. The oxygen consumption from degradation of organic material is normally measured as biochemical oxygen demand (BOD) and chemical oxygen demand (COD), so there is an important relation between them. Performing the test for BOD requires signiﬁcant time and commitment for preparation and analysis. This process requires ﬁve days, with data collection and evaluation occurring on the last day [1]. Several water quality models such as traditional mechanistic approaches have been developed in order to manage the best practices for conserving the quality of water. Most of these models need several different input data, which are not easily accessible and make it a very expensive and time-consuming process. Artiﬁcial neural networks (ANNs) are computer techniques that attempt to simulate the functionality and decision-making processes of the human brain. In the past few decades, ANNs have been extensively used in a wide range of engineering applications. Recently, ANNs have been increasingly applied in modeling water quality. ANNs have been successfully used in hydrological processes, water resources, water quality prediction, and reservoir operation [2–7]. They have been used especially for the forecasting of water quality parameters and estimating nutrient concentration from pollution sources of watershed [8–10]. 
The research presented in this study is motivated by a desire to explore the potential of ANN estimation of biochemical oxygen demands of the input stream of a biochemical wastewater treatment plant. There is a ﬁve-day delay in determination of BOD, and when this is added to the hydraulic residence time, it is often too late to make proper adjustments in the wastewater treatment process. The ANN would be ideally suited for estimating inlet BOD owing to its ability to provide good generalization performance in capturing nonlinear regression relationships between predictors and the predictand. The performance of the ANN model is compared with multiple linear regression (MLR). Comparison of the results shows that the ANN model is statistically superior to the MLR model.

Figure 1. A typical three-layer feed forward ANN.

ral networks. They can be characterized by three components: Nodes Weights (connection strength) An activation (transfer) function ANN modeling is a nonlinear statistical technique; it can be used to solve problems that are not amenable to conventional statistical and mathematical methods. In the past few years there has been constantly increasing interest in neural networks modeling in different ﬁelds of hydrology engineering [11]. The basic unit in the artiﬁcial neural network is the node. Nodes are connected to each other by links known as synapses; associated with each synapse there is a weight factor. Usually neural networks are trained so that a particular set of inputs produces, as nearly as possible, a speciﬁc set of target outputs. The most commonly used ANN is the three-layer feed-forward ANN. In feed-forward neural networks architecture, there are layers and nodes at each layer. Each node at input and inner layers receives input values, processes, and passes them on to the next layer. This process is conducted by weights. Weight is the connection strength between two nodes. The numbers of neurons in the input layer and the output layer are determined by the numbers of input and output parameters, respectively. In the present feedforward artiﬁcial neural networks are used. The model is shown in Figure 1. In Figure 1, i, j, k denote nodes input layer, hidden layer, and output layer, respectively. w is the weight of the nodes. Subscripts specify the connections between the nodes. For example, wij is the weight between nodes i and j. The term ‘‘feed-forward’’ means that a node connection only exists from a node in the input layer to other nodes in the hidden layer or from a node in the hidden layer to nodes in the output layer, and that the nodes within a layer are not interconnected to each other.

METHODS

ANNs ANNs consist of large number of processing elements with their interconnections. ANNs are basically parallel computing systems similar to biological neu440 December 2008

Multiple Linear Regression If it is assumed that the dependent variable Y is affected by m independent variables X1, X2,. . .,Xm and a linear equation is selected for the relation Environmental Progress (Vol.27, No.4) DOI 10.1002/ep

Table 1. The daily statistical parameters of each data set.

Data set BOD (mg/L) COD (mg/L) N (mg/L) P (mg/L) SS (mg/L) Qw (m3/s)

xmean

Sx

Cv(Sx/xmean)

Csx

xmin

xmax

Correlation with BOD

237.054 445.407 48.42 4.35 322.228 63,562

99.13 178.85 17.89 1.97 162.28 17,233

0.42 0.40 0.37 0.45 0.50 0.27

20.16 20.31 20.53 0.24 1.87 20.73

33 73 9.5 0.5 52 15,920

610 865 81 9.8 1395 94,410

1.000 0.954 0.871 0.650 0.452 20.357

among them, the regression equation of Y can be written as:

$$ y = a + b_1 x_1 + b_2 x_2 + \cdots + b_m x_m \tag{1} $$

Here y is the expected value of the variable Y when the independent variables take the values X1 = x1, X2 = x2, ..., Xm = xm. The regression coefficients a, b1, b2, ..., bm are evaluated, as in simple regression, by minimizing the sum of the squared distances $e_{y_i}$ of the observation points from the plane expressed by the regression equation [12]:

$$ \sum_{i=1}^{N} e_{y_i}^2 = \sum_{i=1}^{N} \left( y_i - a - b_1 x_{1i} - b_2 x_{2i} - \cdots - b_m x_{mi} \right)^2 \tag{2} $$
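As a minimal sketch of how the coefficients in Eq. (1) can be obtained by minimizing the sum in Eq. (2), the following solves the normal equations with Gauss-Jordan elimination. The function names `fit_mlr` and `predict_mlr` and the small synthetic data set are hypothetical and only illustrate the technique; they are not the software or the data used in this study.

```python
def fit_mlr(X, y):
    """Least-squares estimate of [a, b1, ..., bm] for y = a + b1*x1 + ... + bm*xm.

    X is a list of observations, each a list of m independent-variable values;
    y is the list of observed dependent values. Solves the normal equations
    (A^T A) beta = A^T y by Gauss-Jordan elimination with partial pivoting.
    """
    A = [[1.0] + list(row) for row in X]  # leading 1.0 carries the intercept a
    p = len(A[0])
    # Augmented matrix [A^T A | A^T y].
    M = [
        [sum(r[i] * r[j] for r in A) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(A, y))]
        for i in range(p)
    ]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular normal equations (collinear inputs?)")
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate this column from every other row.
        for r in range(p):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [u - f * v for u, v in zip(M[r], M[col])]
    return [M[i][p] / M[i][i] for i in range(p)]

def predict_mlr(beta, row):
    """Evaluate Eq. (1) for one observation."""
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))

# Synthetic check: data generated exactly from y = 2 + 3*x1 - 1*x2.
X = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 8]]
y = [2 + 3 * x1 - x2 for x1, x2 in X]
beta = fit_mlr(X, y)
print(beta)
```

With noise-free synthetic data the fitted coefficients should reproduce the generating values (2, 3, -1) up to rounding, which is a quick sanity check before applying the same routine to measured data.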

In this study, the coefficients of the regressions were determined using the least squares method.

COMPILATION OF DATA

The daily BOD, COD, total phosphorus (P), total nitrogen (N), suspended solids (SS), and water discharge (Qw) data were obtained from one of the inlets of the wastewater treatment plant in Adapazari city, Turkey. The data set is based on the analysis of composite daily samples, which represent the average composition of automatic grab samples taken every hour (over a 24-h period) from the wastewater entering the plant. The daily statistical parameters of each data set are given in Table 1. In this table, xmean, Sx, Cv, Csx, xmin, and xmax denote the mean, standard deviation, coefficient of variation, skewness coefficient, minimum, and maximum of the data, respectively. It is clearly seen from Table 1 that the most variable data set (Cv = 0.50) is SS. The highest correlation coefficient with BOD (0.954) belongs to COD. There is an inverse relationship between BOD and Qw, with a correlation of -0.357.

APPLICATION OF ANN MODEL

In this study, before the training of the model, both input and output variables were normalized to the range 0.1 to 0.9 as follows:

$$ x_i = 0.8 \, \frac{x - x_{\min}}{x_{\max} - x_{\min}} + 0.1 \tag{3} $$

where $x_i$ is the normalized value of a certain parameter, $x$ is the measured value for this parameter, and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values in the database for this parameter, respectively.

To develop an ANN model for estimating BOD, the available data set (364 daily water quality records of the year 2005) was randomly partitioned into a training set and a test set. About 67% of the available record (244 daily records) was selected for training, while the remaining 33% (120 daily records) was used for testing. For all created neural networks, the general structure of one input, one hidden, and one output layer was used. To determine the optimal architecture, several neural networks were trained with different iteration numbers (epochs) and numbers of nodes in the hidden layer. For all cases a log sigmoid transfer function (logsig) was used in the hidden and output layers. Because the logsig output lies between 0 and 1, the inputs and outputs were normalized to lie within this range. The most accurate estimations of the ANNs were obtained with the log sigmoid transfer function [13].

SELECTION OF INPUT PARAMETERS FOR THE ANN
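Looking back at the preprocessing described in the preceding section, the normalization of Eq. (3) and the random 244/120 partition can be sketched as follows. The function names, the inverse transform, and the fixed random seed are assumptions added for illustration; the text does not state how the random partition was drawn.

```python
import random

def normalize(values, lo=0.1, hi=0.9):
    """Eq. (3): scale values linearly so the minimum maps to lo and the maximum to hi."""
    x_min, x_max = min(values), max(values)
    span = x_max - x_min  # must be non-zero, i.e. the values must not all be equal
    return [(hi - lo) * (x - x_min) / span + lo for x in values]

def denormalize(scaled, x_min, x_max, lo=0.1, hi=0.9):
    """Invert Eq. (3) to express network outputs in the original units."""
    return [(s - lo) * (x_max - x_min) / (hi - lo) + x_min for s in scaled]

def random_split(n_records, n_train, seed=0):
    """Shuffle record indices and cut them into training and test index lists."""
    idx = list(range(n_records))
    random.Random(seed).shuffle(idx)  # seeded only so the sketch is repeatable
    return idx[:n_train], idx[n_train:]

# Minimum and maximum BOD from Table 1, plus one interior value.
bod = [33.0, 237.0, 610.0]
scaled = normalize(bod)
restored = denormalize(scaled, min(bod), max(bod))

# 364 daily records: 244 for training, 120 for testing.
train_idx, test_idx = random_split(364, 244)
print(scaled, len(train_idx), len(test_idx))
```

Keeping the inverse transform next to the forward one makes it straightforward to report errors such as the MSE in mg/L rather than in normalized units.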

The selection of the input parameters is a very important aspect of neural network modeling. To use ANN structures effectively, the input variables of the phenomenon must be selected with great care, which depends strongly on a good understanding of the problem. In a sound ANN architecture, key variables must be introduced and unnecessary variables avoided so as not to confuse the training process. For this purpose, a sensitivity analysis can be used to find the key parameters and, when sufficient data are available, to determine their relative importance, that is, the effect of a change in each variable on the output. Input variables that do not have a significant effect on the performance of an ANN can be excluded, resulting in a more compact network. Methods such as sensitivity analysis are therefore needed to make an ANN work effectively. BOD depends on several independent parameters, which can be expressed in the form BOD = f(COD, N, P, Qw, SS). Five ANN models were established, each using one independent parameter separately, and sensitivity analysis was applied to find the most effective input parameters. The coefficient of determination (R2) obtained for each parameter is given in Figure 2. It is clearly

Figure 2. Comparison of ANN results and observed BOD depending on each input parameter.


Table 2. Determination of the most appropriate ANN architecture.

ANN structure                  Iteration number   Determination       Mean square
(number of nodes in layers)    (epoch)            coefficient (R2)    error (MSE)
ANN (5, 2, 1)                  1000               0.917               716.21
ANN (5, 3, 1)                  1000               0.919               708.01
ANN (5, 5, 1)                  1000               0.915               737.30
ANN (5, 7, 1)                  1000               0.913               757.42
ANN (5, 10, 1)                 1000               0.895               869.38
ANN (5, 2, 1)                  2000               0.908               743.79
ANN (5, 3, 1)                  2000               0.909               755.84
ANN (5, 5, 1)                  2000               0.894               864.95
ANN (5, 7, 1)                  2000               0.898               828.60
ANN (5, 10, 1)                 2000               0.884               966.52

seen from Figure 2 that the most effective parameter is COD.

DETERMINATION OF AN APPROPRIATE ANN MODEL

One of the problems that occur during neural network training is overfitting. Overfitting is indicated when the error on the training set is driven to a very small value while the error for the test data presented to the network is large. This means that the network has memorized the training examples but has not learned to generalize to new situations. To avoid overfitting the training data, the appropriate epoch number, number of hidden layers, and number of nodes in the hidden layers must be chosen by a trial and error process. Networks are sensitive to the number of nodes in their hidden layers: too few nodes can lead to underfitting and too many nodes can result in overfitting. To find the optimum number of hidden layer nodes, 2, 3, 5, 7, and 10 nodes were tested. The results are shown in Table 2. The first column in this table denotes the number of nodes in each layer of the ANN models. Accordingly, an ANN structure written as ANN(i,j,k) indicates a network architecture with i, j, and k nodes in the input, hidden, and output layers, respectively. In this case the input layer covers COD, total phosphorus, total nitrogen, suspended solids, and water discharge (COD, P, N, SS, Qw), and the output layer consists of the BOD. It can be seen from Table 2 that the ANN(5,3,1) model with 1000 iterations, having an R2 value of 0.919 and an MSE value of 708.01, is the best model.

Networks are also sensitive to the number of hidden layers. In this study, only ANN architectures with one and two hidden layers were tested, since systems with three or more hidden layers are known to cause unnecessary computational overload. The variations of the training and test MSE values for the ANN models with one and two hidden layers are presented in Figures 3a and 3b. As seen from these figures, the ANN architecture with one hidden layer turns out to be the more stable design. Thus, for this study an ANN with one hidden layer, three hidden layer nodes, and 1000 epochs has been adopted. If the epoch number exceeds 1000, deviations appear and, thus, overfitting begins, as illustrated in Figure 3b.

SENSITIVITY ANALYSIS

It appears that, when assessing the performance of any model for its applicability in estimating BOD, it is important to evaluate not only the average prediction error but also the distribution of prediction errors. The statistical performance evaluation criteria employed so far in this study are global statistics (R2 and MSE) and do not provide any information on the distribution of errors [14]. Therefore, to test the robustness of the model developed, it is important to test the model using other performance evaluation criteria such as the average absolute relative error (AARE). The AARE not only gives the performance index in terms of predicting BOD but also reflects the distribution of the prediction errors. It can be computed as:

$$ \mathrm{AARE} = \frac{1}{N} \sum_{p=1}^{N} \left| \mathrm{RE}_p \right| \tag{4} $$

in which

$$ \mathrm{RE}_p = \frac{t_p - o_p}{t_p} \times 100 \tag{5} $$

where $\mathrm{RE}_p$ is the relative error in the forecast expressed as a percentage, $t_p$ is the observed BOD for the pth pattern, $o_p$ is the computed BOD for the pth pattern produced by the ANN, and $N$ is the total number of testing patterns. Clearly, the smaller the value of AARE, the better the performance. The performance of the ANN outputs was also evaluated by estimating the coefficient of determination (R2), which is defined as:

$$ R^2 = \frac{\mathrm{BOD}_o - \mathrm{BOD}_s}{\mathrm{BOD}_o} \tag{6} $$

where

$$ \mathrm{BOD}_o = \sum_{p=1}^{N} \left( t_p - t_{\mathrm{mean}} \right)^2 \tag{7} $$

Figure 3. The observed and estimated biological oxygen demand values. (a) ANN system with one hidden layer (MATLAB, 2004). (b) ANN system with two hidden layers (MATLAB, 2004).

Figure 4. The observed and estimated biological oxygen demand values.

$$ \mathrm{BOD}_s = \sum_{p=1}^{N} \left( t_p - o_p \right)^2 \tag{8} $$

where $t_{\mathrm{mean}}$ is the mean observed BOD. The mean square error (MSE) is defined as:

$$ \mathrm{MSE} = \frac{1}{N} \sum_{p=1}^{N} \left( t_p - o_p \right)^2 \tag{9} $$

COD is used as the common parameter for the rest of the sensitivity analysis. The performance of all possible combinations of variables that include COD was also investigated. The findings are listed in Table 3. Based on these findings, the ANN model with five inputs (COD, N, P, SS, Qw) gives the best estimation.

APPLICATION OF MLR

The performance criteria for the test results of the MLR model are given in Table 4. As can be seen from Table 4, the model has the highest MSE and AARE and the lowest R2 values when only COD is used as input. Adding total nitrogen, total phosphorus, and suspended solids improves the model's performance, whereas adding water discharge degrades it. Thus, although the effect of water discharge on the phenomenon was shown in Table 1 and Figure 2, the MLR model does not capture it. This is a drawback of the MLR.
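The three comparison criteria used in Tables 3 and 4 (AARE, MSE, and R2, Eqs. 4 to 9) can be sketched as below. R2 is taken here in its usual form, with the total sum of squares BODo in the denominator. The function names and the three-point example are illustrative assumptions and are not data from the plant.

```python
def aare(observed, computed):
    """Eqs. (4)-(5): mean of |t_p - o_p| / t_p * 100 over all test patterns."""
    return sum(
        abs((t - o) / t * 100.0) for t, o in zip(observed, computed)
    ) / len(observed)

def mse(observed, computed):
    """Eq. (9): mean of the squared differences between observed and computed values."""
    return sum((t - o) ** 2 for t, o in zip(observed, computed)) / len(observed)

def r_squared(observed, computed):
    """Eqs. (6)-(8): share of the observed variation explained by the model."""
    t_mean = sum(observed) / len(observed)
    bod_o = sum((t - t_mean) ** 2 for t in observed)               # Eq. (7)
    bod_s = sum((t - o) ** 2 for t, o in zip(observed, computed))  # Eq. (8)
    return (bod_o - bod_s) / bod_o                                 # Eq. (6)

# Made-up observed and computed BOD values (mg/L) for illustration only.
t = [100.0, 200.0, 300.0]
o = [110.0, 190.0, 300.0]
print(aare(t, o), mse(t, o), r_squared(t, o))
```

Note that the AARE weights each error by the observed value, so the same absolute error counts more at low BOD than at high BOD, whereas the MSE treats all patterns alike.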

Table 3. Performance evaluation of the effective parameters using ANN for sensitivity analysis.

Inputs                     AARE (%)   MSE       R2
COD                        10.52      819.41    0.903
COD + N                    10.25      778.67    0.903
COD + N + P                10.18      726.25    0.911
COD + N + P + SS           10.10      714.69    0.915
COD + N + P + SS + Qw      10.03*     708.01*   0.919*

Best results are indicated by "*".
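The search over input combinations that always contain COD can be sketched as follows. The helper names are hypothetical; the MSE values are copied from Table 3, which reports five nested input sets out of the sixteen combinations that contain COD.

```python
from itertools import combinations

OTHERS = ["N", "P", "SS", "Qw"]

def input_sets_with_cod(others=OTHERS):
    """Every input set made of COD plus any subset of the remaining variables."""
    sets = []
    for k in range(len(others) + 1):
        for extra in combinations(others, k):
            sets.append(("COD",) + extra)
    return sets

def best_input_set(score_fn, candidates):
    """Return the candidate with the lowest score, for example the test-set MSE."""
    return min(candidates, key=score_fn)

all_sets = input_sets_with_cod()
print(len(all_sets))

# Test-set MSE values reported in Table 3 for the five nested input sets.
table3_mse = {
    ("COD",): 819.41,
    ("COD", "N"): 778.67,
    ("COD", "N", "P"): 726.25,
    ("COD", "N", "P", "SS"): 714.69,
    ("COD", "N", "P", "SS", "Qw"): 708.01,
}
best = best_input_set(table3_mse.get, list(table3_mse))
print(best)
```

In a full analysis the scoring function would train a network on each candidate set and return its test error; here the tabulated values stand in for that step.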

Table 4. The performances of the MLR in the test period.

Inputs                     AARE (%)   MSE       R2
COD                        10.35      798.92    0.902
COD + N                    10.21      747.73    0.908
COD + N + P                10.21      741.26    0.908
COD + N + P + SS           10.20*     738.98*   0.909*
COD + N + P + SS + Qw      10.34      752.37    0.906

Best results are indicated by "*".
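The size of the gap between the best ANN row of Table 3 and the best MLR row of Table 4 can be quantified with a few lines of arithmetic. This is only a sketch on the tabulated values; the helper name is hypothetical, and it should be kept in mind that the two best rows use different input sets (five inputs for the ANN, four for the MLR).

```python
# Best-row test statistics copied from Table 3 (ANN) and Table 4 (MLR).
ann = {"AARE": 10.03, "MSE": 708.01, "R2": 0.919}
mlr = {"AARE": 10.20, "MSE": 738.98, "R2": 0.909}

def relative_change(new, old):
    """Percentage change of `new` relative to `old`; negative means a reduction."""
    return (new - old) / old * 100.0

mse_change = relative_change(ann["MSE"], mlr["MSE"])
aare_change = relative_change(ann["AARE"], mlr["AARE"])
r2_gain = ann["R2"] - mlr["R2"]  # absolute difference in R2

print(f"MSE change:  {mse_change:.2f}%")
print(f"AARE change: {aare_change:.2f}%")
print(f"R2 gain:     {r2_gain:.3f}")
```

On these figures the ANN lowers the test MSE by roughly 4% relative to the MLR.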

The performance of the selected neural network model and the MLR model in predicting BOD is demonstrated in Figure 4 for the test data set. As can be seen from Figure 4, both the ANN and MLR estimates follow the corresponding measured data closely, with high R2 values of 0.919 and 0.909, respectively. On all three criteria (AARE, MSE, and R2), the ANN outperforms the MLR in BOD estimation.

CONCLUSION

The present study demonstrates the capabilities of the ANN model for BOD modeling; however, the choice of ANN architecture and input parameters is crucial for obtaining good estimation accuracy. Thus, a sensitivity analysis was conducted to determine the degree of effectiveness of the variables by using various performance statistics. From the results obtained, an ANN model appears to be a useful tool for prediction of the inlet BOD. The results demonstrate that COD is more effective for BOD estimation than the other four parameters. The remaining parameters were then added one by one in estimating BOD. After the application of sensitivity analysis, the other effective parameters were determined as total nitrogen (N), total phosphorus (P), suspended solids (SS), and water discharge (Qw), in that order. The model whose inputs are COD, water discharge, suspended solids, total nitrogen, and total phosphorus has the best performance criteria among the input combinations tried in the study. This indicates that all these variables are needed for better BOD modeling. The MLR model was also used for predicting BOD. However, although the effectiveness of the independent parameters was shown during the sensitivity analysis, the MLR model does not capture the effect of water discharge. This is a drawback of the MLR. On the basis of the comparison results, the ANN technique was found to be superior to the MLR technique.

LITERATURE CITED

1. Chapman, D. (1992). Water quality assessments (1st Edition, pp. 80–81). London: Chapman and Hall.
2. Suen, J. P., Eheart, J. W., & Asce, M. (2003). Evaluation of neural networks for modelling nitrate concentration in rivers. Journal of Water Resource Plan and Man, 129, 505–510.
3. Aguilera, P. A., Frenich, A. G., Torres, J. A., Castro, H., Vidal, J. L. M., & Canton, M. (2001). Application of the Kohonen neural network in coastal water management: Methodological development for the assessment and prediction of water quality. Water Research, 35, 4053–4062.
4. Lobbrect, A. H., & Solomatine, D. P. (1999). Control of water levels in polder areas using neural networks and fuzzy adaptive systems. Water Industry Systems: Modeling and Optimization Applications, 1, 509–518.
5. Maier, H. R., & Dandy, G. C. (1996). The use of artificial neural networks for the prediction of water quality parameters. Water Resources Research, 32, 1013–1022.
6. Wen, C. G., & Lee, C. S. (1998). A neural network approach to multiobjective optimization for water quality management in a river basin. Water Resources Research, 34, 427–436.
7. Zaheer, I., & Bai, C. G. (2003). Application of artificial neural network for water quality management. Lowland Technology International, 5, 10–15.
8. Fogelman, S., Blumenstein, M., & Zhao, H. (2006). Estimation of chemical oxygen demand by ultraviolet spectroscopic profiling and artificial neural networks. Neural Computation and Applications, 15, 197–203.
9. Sengorur, B., Dogan, E., Koklu, R., & Samandar, A. (2006). Dissolved oxygen estimation using artificial neural network for water quality control. Fresenius Environmental Bulletin, 15 (9a), 1064–1067.
10. Sovan, L. G., Maritxu, A., & Giraudel, J. (1999). Prediction of stream nitrogen concentration from watershed features using neural network. Water Resources Research, 33, 3469–3478.
11. ASCE Task Committee. (2000). Artificial neural networks in hydrology. I. Preliminary concepts. Journal of Hydrologic Engineering ASCE, 5, 115–123.
12. Bayazıt, M., & Oguz, B. (1998). Probability and statistics for engineers (p. 159). Istanbul, Turkey: Birsen Publishing House.
13. MATLAB. (2004). Documentation Neural Network Toolbox Help Version 7.0, Release 14. The MathWorks, Inc.
14. Dogan, E., Sasal, M., & Isik, S. (2005). Suspended sediment load estimation in lower Sakarya river by using soft computational methods. Proceeding of the International Conference on Computational and Mathematical Methods in Science and Engineering, CMMSE (pp. 395–406), 2005, Alicante, Spain.