An Algorithm for Determining Optimum Link Traffic Volume Counts for Estimation of Origin-Destination Matrix

Travel demand information is one of the most important inputs in transportation planning. Estimating the origin-destination (OD) matrix from traffic volume counts has attracted researchers' attention because link flow volumes can be measured with high accuracy, at much lower cost and in a short time. In such methods, the number and location of the counted links are key parameters; hence a better OD matrix can be obtained by choosing the optimum links. In this paper, an algorithm is presented to determine the number and location of the optimum links for traffic volume counts. The method identifies the minimum set of links that covers the maximum number of OD matrix elements. The algorithm is especially useful when the OD matrix (ODM) is estimated by the gradient method, because that method adjusts and estimates only the O-D pairs covered by link traffic counts. The algorithm is scripted with EMME/2 and FoxPro and implemented on a large-scale real network (Mashhad). The results show that about 95% of the ODM can be covered, and then adjusted, by counting only 8% of the links in the network of Mashhad.


Introduction
Travel demand information, expressed in the form of origin-destination (OD) matrices, is one of the most important inputs in transportation planning and engineering and is the basic information for the design and management of transportation systems. Obtaining such information directly is very difficult and requires a lot of time, money and human resources. However, the flow volumes on the network links can be obtained easily, with high accuracy and at very low cost. Hence, in recent years, many researchers have focused on estimating ODMs from the flow volumes on the network links. Among the approaches presented for estimating the ODM from traffic volume count data, the gradient method proposed by Spiess [1] is particularly effective for real, large-scale problems. The Spiess mathematical model is as follows:

$$\min_{g} \; Z(g) = \frac{1}{2} \sum_{a \in \hat{A}} \left( v_a - \hat{v}_a \right)^2 \quad \text{subject to} \quad v = \mathrm{assign}(g)$$

where
Â: the set of network links where the flow is counted.
v̂: the vector of counted flow volumes on the network links.
g: the O-D demand matrix.
assign(g): an equilibrium traffic assignment model for assigning demand matrix g to the network.
v: the vector of network link flows (resulting from the assignment of demand matrix g to the network).

The algorithm of the gradient method is presented in Figure 1. As shown in steps 3 and 5 of the algorithm, the gradient method can adjust (estimate) only those O-D pairs for which a counted link lies on the shortest path between them (δ_ak = 1). Otherwise, the method cannot adjust the O-D pair [2].
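The two ideas in this passage can be illustrated with a minimal sketch. It assumes the least-squares form of the Spiess objective (half the sum of squared differences between assigned and counted volumes on the counted links) and treats an O-D pair as adjustable when at least one of its used paths contains a counted link. All names and toy values are hypothetical and are not taken from the paper.

```python
def count_deviation(assigned, counted):
    """Half the sum of squared differences between assigned and counted
    volumes, taken over the counted links only (assumed objective form)."""
    return 0.5 * sum((assigned[a] - v_hat) ** 2 for a, v_hat in counted.items())


def adjustable_pairs(used_paths, counted_links):
    """Return the O-D pairs that have a counted link on at least one used path.

    used_paths maps an O-D pair to a list of paths, each path being a list
    of link identifiers.
    """
    counted_links = set(counted_links)
    return {
        od
        for od, paths in used_paths.items()
        if any(counted_links & set(path) for path in paths)
    }


# Toy data (hypothetical): three links, two of them counted.
assigned = {"a1": 120.0, "a2": 80.0, "a3": 40.0}
counted = {"a1": 100.0, "a3": 50.0}
used_paths = {
    ("A", "B"): [["a1", "a2"]],
    ("A", "C"): [["a2"]],          # no counted link: cannot be adjusted
    ("B", "C"): [["a3"], ["a2"]],
}

z = count_deviation(assigned, counted)
pairs = adjustable_pairs(used_paths, counted)
```

In this toy case the pair (A, C) uses only the uncounted link a2, so a count-based adjustment would leave it unchanged, which is the motivation for choosing counting links by O-D coverage.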
Until the last two decades, no particular method had been proposed for choosing the network links to be counted for the estimation of ODMs, and the links were mostly selected at random [3]. Subsequently, links were selected on the basis of expert opinion by considering a screen line on the network. In a study on the importance of link traffic counts for the reliability of the estimated ODM, Young et al. presented methods to locate the link traffic volume counts [4].
In another study, Yang et al. used the maximum possible relative error (MPRE) to examine the reliability of the ODM as a function of the number and position of link traffic volume counts [5], and proposed the following rules for locating the traffic counting points.
1) Covering rule: the link traffic counts must be located at points where a certain portion of the trips between any O-D pair is observed.
2) Maximum flow fraction rule: for any O-D pair, the traffic counting points should be located on links where the share of the link flow contributed by that O-D pair is as large as possible.
3) Maximum flow intercepting rule: the chosen links should intercept as many flows as possible.
4) Link independence rule: the traffic counting points should be located on the network links so that the resultant traffic counts on all chosen links are linearly independent.
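As an illustration of the covering and flow-intercepting rules, the following sketch computes, for a chosen set of counting links, how much flow of each O-D pair is intercepted. It is a simplified reading of the rules, not the formulation of the cited study: the path-flow data structure and all names are hypothetical, and the link independence rule (which requires a rank check on the count equations) is not covered.

```python
def intercepted_by_pair(path_flows, chosen_links):
    """Flow of each O-D pair that passes at least one chosen link.

    path_flows maps an O-D pair to a list of (links, flow) tuples,
    one per used path.
    """
    chosen = set(chosen_links)
    return {
        od: sum(flow for links, flow in paths if chosen & set(links))
        for od, paths in path_flows.items()
    }


def satisfies_covering_rule(path_flows, chosen_links):
    """True if every O-D pair has some positive flow observed."""
    observed = intercepted_by_pair(path_flows, chosen_links)
    return all(flow > 0 for flow in observed.values())


# Toy data (hypothetical): two O-D pairs, three links.
path_flows = {
    ("A", "B"): [(("a1", "a2"), 60.0), (("a3",), 40.0)],
    ("A", "C"): [(("a2",), 30.0)],
}

only_a1 = intercepted_by_pair(path_flows, ["a1"])
only_a2 = intercepted_by_pair(path_flows, ["a2"])
```

Counting only a1 leaves the pair (A, C) unobserved, while counting a2 observes both pairs and intercepts more total flow, so a2 is preferable under both rules in this toy case.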
Gentili and Mirchandani suggested a two-level model for sensor location flow estimation (SLFE) through the ODM. The upper level of this optimization model chose the best set of link traffic volume counts based on the solutions of the lower level for each candidate set. In this model, the link set was selected so as to minimize the possible estimation error while using the trip prediction data [6,7].
In another study, Wang and Mirchandani used a Bayesian statistical method to select the location of links through a decision-making technique based on previous data and observations of traffic volume counts [8]. They developed and evaluated the model, in which the travel distribution method provided the initial information and routes. To simplify the solution, a sub-model was developed for the selection of a single link at a time, and the proposed algorithm was implemented and solved iteratively.
Saraswalti and Cancheria suggested a method for the selection and prioritization of network links using information theory [9]. Assigning a set of weights to each link (based on the available data and the part of the OD matrix covered), they determined the optimum links with a binary integer programming (BIP) model and implemented and evaluated the method on a hypothetical network.
Abd-al-Shakour and Sashama addressed the determination of optimum link traffic counts using screen lines, based on covering all paths that carry a certain high traffic volume so as to cover the maximum volume [10]. They suggested two models, of which the first determined the optimum number of links and the second specified their locations.
Bianco et al. showed that the sensor location problem (SLP) is NP-hard and proposed linear algorithms to solve it [11]. Based on the work of Bianco et al., Morrison et al. examined a more difficult, general case of the problem [12,13].
Shaw et al. formulated a weighted sensor location problem (SLP) to find the optimum set of links based on importance, cost, etc., presented a greedy algorithm to solve it, and worked through two numerical examples [14].
Lu et al. employed Kalman filtering to locate the links [15]. In addition, Lee et al. [16] and Dunjik and Liu [17] studied the factors affecting link selection in freeway corridors.
Gholami et al. proposed three models to determine the link traffic counts for turning movements at intersections [18] and, in other research, presented a structure for locating the links [19].
Castillo et al. studied the problem of finding the minimum number of link traffic volume counts, and their locations, needed to obtain the traffic volumes of the uncounted links [20]. Young and Van analyzed the problem by developing a relationship between the OD matrix and the observations and estimates [21]. Zang Dang et al. proposed a model for locating the link traffic counts so as to fully cover the traffic volume on the links and formulated the problem as an integer linear program [22].
In this paper, an algorithm is proposed to obtain the optimum network links for traffic volume counting and for estimation (adjustment) of the origin-destination matrix. Since O-D pair coverage is a basic parameter in the success of the gradient method and of ODM estimation, the proposed algorithm finds the minimum number of network links that give the greatest coverage of O-D pairs. If some network links are already counted, or must be included in the set of optimum links for any reason, the algorithm can be extended to calculate a set of optimum links covering the maximum number of O-D pairs given those links. The method is efficient enough to run on large-scale (macro-scale) networks. It is implemented on the network of Mashhad (a macro-scale network) and the results are presented below.
Step 2: Assign matrix g_i to the network and calculate the traffic volume on the network links (v_a) using the network equilibrium traffic assignment model.
Step 3: Determine the set of used paths K_i and the equilibrium path flows h_k for each origin-destination pair i (i ∈ I) and calculate [...] and [...] according to the equations of Figure 1.
Step 4: If the requirement is not met, i.e. [...] ≤ 1, adjust the step length.
Step 5: Calculate the new values of matrix g_i.
Step 6: Set l = l + 1 and [...].
Step 7: If the stopping criterion is not met, go to step 2; otherwise stop.

Proposed Algorithm
The algorithm proposed to determine optimum link traffic volume counts (the highest origin-destination coverage) is presented below.
Step (0) Set: CL: the set of candidate links.
Step (1) Assign ĝ_i to the network (network equilibrium traffic assignment model).
Step (2) Determine the used paths K_i and their corresponding links for each origin-destination pair i (i ∈ I).
Step (3) Determine the link a from the set UT that covers the largest number of O-D pairs of the set UC. (If several links satisfy this condition, choose the link that covers the largest number of O-D pairs from the set I.)
Step (4) Calculate the set NCO (NCO ⊂ UC), which equals the new O-D pairs covered by link a.
Step (5) Add the optimum link a (the link found in step (3)) to the end of the set OL: OL = OL ∪ {a}.
Step (10) If the stopping criterion is not met, go to step (3); otherwise stop.
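The selection loop described by these steps can be sketched as a greedy maximum-coverage procedure. This is a minimal illustration under stated assumptions, not the authors' EMME/2 and FoxPro implementation: the steps between (5) and (10) are not listed in the text, so the bookkeeping here (removing newly covered pairs from UC and the chosen link from the candidates) is assumed, the candidate set CL is used as the pool of selectable links, and only a maximum link count is used as the stopping rule. The function and variable names are hypothetical.

```python
def select_count_links(od_links, candidate_links, max_links=None):
    """Greedy choice of counting links by O-D pair coverage.

    od_links maps each O-D pair to the set of links on its used paths
    (the result of steps 1 and 2). Returns the ordered list of chosen
    links (OL) and the set of O-D pairs still uncovered (UC).
    """
    # Invert the path data: link -> set of O-D pairs it covers.
    link_pairs = {a: set() for a in candidate_links}
    for od, links in od_links.items():
        for a in links:
            if a in link_pairs:
                link_pairs[a].add(od)

    uncovered = set(od_links)          # UC: all pairs start uncovered
    remaining = set(candidate_links)   # links not yet selected
    optimum = []                       # OL, in priority order

    while remaining and uncovered:
        if max_links is not None and len(optimum) >= max_links:
            break
        # Step 3: most uncovered pairs first; ties broken by the total
        # number of pairs covered, then by link identifier.
        best = max(
            sorted(remaining),
            key=lambda a: (len(link_pairs[a] & uncovered), len(link_pairs[a])),
        )
        new_pairs = link_pairs[best] & uncovered   # Step 4: NCO
        if not new_pairs:
            break                                  # no link adds coverage
        optimum.append(best)                       # Step 5: OL = OL ∪ {a}
        uncovered -= new_pairs                     # assumed update of UC
        remaining.discard(best)
    return optimum, uncovered


# Toy network (hypothetical): five O-D pairs, four candidate links.
od_links = {1: {"a", "b"}, 2: {"a", "c"}, 3: {"a"}, 4: {"b", "d"}, 5: {"d"}}
chosen, left = select_count_links(od_links, ["a", "b", "c", "d"])
```

In the toy case, link a is chosen first because it covers three pairs, and link d second because it adds two new pairs while b would add only one. Links that must be counted for other reasons could be handled by placing them in the chosen list and removing their pairs from the uncovered set before the loop starts.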

Stopping Criteria in Algorithm
As discussed in the proposed algorithm, step (10) concerns the stopping criterion. In the algorithm for determining optimum link traffic volume counts, the stopping criterion can be any one of the three conditions below, or a combination of them:
1) The "number of optimum links" criterion: the maximum number of links to be selected as optimum links, which is in fact the counter l in the algorithm.
2) The "increase in the coverage of O-D pairs between two successive iterations" criterion.
3) The "reaching a certain percentage of coverage of O-D pairs" criterion (e.g. reaching coverage of 95% of the O-D pairs).
Figure 2 illustrates the flow chart for determination of optimum link traffic volume counts.
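The three stopping conditions of this section can be combined in a single predicate, as in the sketch below. The second criterion is interpreted here as "stop when the number of newly covered O-D pairs in the latest iteration falls below a threshold"; that interpretation, the parameter names and the sample values are assumptions made for illustration.

```python
def should_stop(n_selected, n_covered, n_pairs, last_gain,
                max_links=None, min_gain=None, target_share=None):
    """Return True if any enabled stopping criterion is met.

    n_selected   : number of optimum links chosen so far (counter l)
    n_covered    : number of O-D pairs covered so far
    n_pairs      : total number of O-D pairs to be covered
    last_gain    : new O-D pairs covered in the latest iteration
    max_links    : criterion 1, limit on the number of optimum links
    min_gain     : criterion 2, minimum acceptable gain per iteration
    target_share : criterion 3, required coverage as a fraction (e.g. 0.95)
    """
    if max_links is not None and n_selected >= max_links:
        return True
    if min_gain is not None and last_gain < min_gain:
        return True
    if target_share is not None and n_covered >= target_share * n_pairs:
        return True
    return False


# Hypothetical states of a run on a network with 7293 O-D pairs.
hit_limit = should_stop(120, 6600, 7293, 5, max_links=120)
below_target = should_stop(10, 2984, 7293, 100, target_share=0.95)
small_gain = should_stop(50, 5000, 7293, 0, min_gain=1)
```

Criteria left as `None` are ignored, so a budget limit, a coverage target and a minimum-gain rule can be used alone or together.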

Implementation of Algorithm
The method is first implemented and tested on some small networks to evaluate the efficiency and reliability of the proposed algorithm. It is found that the algorithm identifies the network links with the greatest coverage of O-D pairs and finally specifies a set of links that allows the maximum number of demand matrix elements to be adjusted. The method is then implemented on the network of Mashhad to evaluate its efficiency on large-scale networks. In this study, the EMME/2 software and the FoxPro programming language are used together to implement the proposed method. EMME/2 is used for the equilibrium traffic assignment, and the used paths and the links of the shortest path between the O-D pairs are also obtained from it. FoxPro is used to carry out the O-D pair coverage calculations and to build the databases of shortest paths and links.
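The path and link databases mentioned here can be pictured as an index from each link to the O-D pairs whose paths use it. The sketch below builds such an index from flat records of the form (origin, destination, link from-node, link to-node). This record layout is a hypothetical stand-in; the actual EMME/2 export format and FoxPro table structure are not described in the paper.

```python
from collections import defaultdict


def build_link_index(path_records):
    """Group path records by link.

    Each record is (origin, destination, from_node, to_node), meaning that
    the link (from_node, to_node) lies on a used path of the O-D pair
    (origin, destination). Returns link -> set of O-D pairs.
    """
    index = defaultdict(set)
    for origin, destination, from_node, to_node in path_records:
        index[(from_node, to_node)].add((origin, destination))
    return dict(index)


# Hypothetical records for two O-D pairs sharing the link (11, 12).
records = [
    (1, 3, 10, 11),
    (1, 3, 11, 12),
    (2, 3, 11, 12),
]
index = build_link_index(records)
```

With this index, the number of O-D pairs covered by a link is the size of its set, and the pairs newly covered by a link are obtained by intersecting its set with the set of still-uncovered pairs.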

Implementation of Algorithm on the Network of Mashhad
To demonstrate the efficiency of the algorithm on large-scale real networks, the proposed method is implemented on the network of Mashhad in Iran. Mashhad is one of the biggest cities in Iran; because of its religious importance and its considerable number of pilgrims and visitors, it has a high travel demand and persistent traffic and transportation problems. The network of Mashhad has 2430 links and 1048 nodes, of which 163 nodes are the regional centers serving as origin and destination points. In other words, the ODM of the city is a 163 × 163 matrix with 26569 elements, of which 7293 O-D pairs are effective (non-zero) [11].

Results of Implementation of Algorithm on the Network of Mashhad
The results of applying the method to find 120 optimum links among all links of the Mashhad network are presented in Table 1. The optimum links are listed in priority order, and each row gives the origin node, the destination node, the number of O-D pairs covered by the link, the number of newly covered O-D pairs, and the percentage of O-D pairs covered by the set of optimum links. As shown in the table, the 120 optimum links cover more than 90% of the origin-destination pairs. If greater O-D pair coverage is required, or the traffic count budget allows more links to be counted, the method can be continued to find more optimum links. Figures 3 and 4 show the results of this situation. Figure 3 shows the percentage of O-D pairs covered by the optimum links and Figure 4 presents the number of O-D pairs covered by the optimum links. In Figure 3, the horizontal axis represents the number of optimum links (in priority order) and the vertical axis shows the percentage of O-D pairs covered by the optimum links. For example, 50 links cover about 80% of the ODM, while the 50th link covers about 2% of the O-D pairs. As seen in Figure 3, almost all of the O-D pairs can be covered by 380 links.
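The link counts quoted above can be put in proportion to the size of the Mashhad network with a few lines of arithmetic. The inputs (2430 links, a 163 × 163 matrix, 7293 non-zero O-D pairs, and the 120 and 380 link sets) are taken from the text; the helper name is hypothetical.

```python
TOTAL_LINKS = 2430        # links in the Mashhad network
NONZERO_PAIRS = 7293      # effective (non-zero) O-D pairs
MATRIX_CELLS = 163 * 163  # elements of the 163 x 163 ODM


def share(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


print(MATRIX_CELLS)                                   # 26569
print(round(share(NONZERO_PAIRS, MATRIX_CELLS), 1))   # 27.4
print(round(share(120, TOTAL_LINKS), 1))              # 4.9
print(round(share(380, TOTAL_LINKS), 1))              # 15.6
```

So the 120 links that cover more than 90% of the O-D pairs are about 4.9% of the network links, and the 380 links that cover almost all pairs are about 15.6% of them.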
In Figure 4, the horizontal axis also shows the number of optimum links. The left vertical axis represents the number of O-D pairs covered by each link alone, and the right vertical axis represents the number of O-D pairs covered by the set of optimum links. The figure shows the coverage of origin-destination pairs in three ways: the continuous line represents the total number of O-D pairs covered by the set of optimum links, the dashed line represents the number of new O-D pairs covered by each link, and the dotted line indicates the number of O-D pairs covered by each link alone. The figure clearly demonstrates that the first links have the greatest influence on coverage, so that most O-D pairs are covered by these links. For example, the first 10 links alone cover 41% of the O-D pairs (2984 O-D pairs). This value is 57.5% for the first 20 links (4203 O-D pairs); in other words, the second 10 links (links 11 to 20) increase the O-D pair coverage by 16.5%. By contrast, the last 10 links (links 371 to 380) increase the coverage by only 0.1%. The dotted line also shows that some links cover a relatively large number of O-D pairs on their own but perform poorly in covering new O-D pairs not already covered by previous links. For example, links 183 and 276 individually cover 374 and 379 O-D pairs, respectively, but add only 3 and 1 new O-D pairs to the list of covered pairs; that is, they increase the O-D pair coverage by 0.04% and 0.01%, respectively. The dashed line also indicates that the increase in O-D pair coverage is very slight after link 150.

Conclusion
In this paper, the problem of finding the optimum (minimum) set of link traffic volume counts for estimation (adjustment) of the ODM is addressed and a method is suggested to solve it. The algorithm is efficient enough to run on large-scale (macro-scale) real networks. Its most important feature is that it covers the maximum number of ODM elements using the minimum number of link traffic volume counts. EMME/2 and FoxPro are used together to implement the proposed method.
To demonstrate the efficiency of the method on large-scale networks, the proposed algorithm is implemented on the network of Mashhad (with 2430 links, 1048 nodes and 163 regional centers). Applying the method determines 120 optimum link traffic volume counts in Mashhad, and these 120 links cover more than 90% of the O-D pairs.

Figure 1. Steps of gradient method for estimation of demand matrix using traffic volume count data

Symbols used in the proposed algorithm:
ĝ_i: the initial O-D matrix.
A: the set of network links.
I: the set of O-D pairs.
UC: the set of uncovered O-D pairs.
OL: the set of optimum links.
CO: the set of O-D pairs covered by the set of optimum links (OL).
NCO: the set of new O-D pairs covered by optimum link a (a ∈ A).

Figure 2. Flow chart of method for determining optimum link traffic volume counts

Figure 3. Percentage of O-D pairs covered by optimum links

Table 1. List of optimum link traffic volume counts in the network of Mashhad, prioritized in terms of coverage of maximum number of O-D pairs