The difference data makes
05-17-2014 at 02:03 PM
Comparing data from two forex brokers shows differences in data feeds can have a significant impact on testing and trading — especially short-term time frames.
A common experience for many FX traders is trading a system through different forex brokerages. Because the spot FX market lacks a centralized exchange and is traded “over-the-counter,” forex brokerages can construct their
data feeds using any liquidity providers (typically banks) they want. As a result, the exact price data that appears on the charts of one FX brokerage may differ from another, even though the two feeds will be similar.
This creates several problems for trading system generation, testing, and trading since the specific data used in simulations may not match the characteristics of the data feed that is ultimately traded.
Here we’ll look at representative forex brokerage data differences, measuring their magnitude and the degree to which they will likely impact strategy design and actual trading. The Euro/U.S. dollar (EUR/USD) data feeds of
two large, regulated retail forex brokerages will be compared to quantify their differences across several different time frames. To capture possible differences in liquidity providers, one brokerage is from the United States
(National Futures Association, NFA, regulated) while the other is from the United Kingdom (Financial Services Authority, FSA, regulated).
Adjusting time stamps
The first and probably most important divergence between FX brokerages is the difference in time stamps. When you want to trade the same trading strategy through different brokerages, their respective price
bar structures must match perfectly. This means you may need to remove bars from the beginning or end of the week so weekly starting and ending times match. You should also adjust all historical
data used in simulations and trading to match each feed's GMT shift and daylight saving time (DST) offset. For example, in this case one of the brokerages we want to compare has a GMT shift of 0 while the other has a
GMT shift of +2 hours with a DST offset of +3 hours March through November. One brokerage starts the week at 00:00 GMT on Monday while the other starts the week at 22:00-23:00 GMT on Sunday;
one ends the week at 16:00 GMT while the other ends the week at 17:00-18:00 GMT on Friday. As a result, to allow proper comparison the second brokerage’s time was changed to a GMT shift of 0, taking
care to eliminate the DST according to the feed’s DST start and end times. Sunday bars and bars after 16:00 GMT on Friday were removed, because these do not exist for the other brokerage.
After this adjustment process we are left with two data feeds that are directly comparable. The final data sets consist of 60-minute (one-hour), 240-minute (four-hour) and 1,440-minute (daily) bars spanning
from January 2012 to March 2014. After processing the price data to ensure matching time stamps, we can now assess differences between the open, high, low, and close values.
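The adjustment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used for the article: the function names, the tuple layout of a bar, and the DST window dates are all hypothetical, and the real DST start and end times must come from the feed itself. It also assumes bars are stamped with their opening time, so a Friday bar stamped 16:00 GMT or later is dropped.

```python
from datetime import datetime, timedelta

# Hypothetical DST windows (in broker server time) for the second feed.
# These dates are placeholders; use the feed's own DST start and end times.
DST_WINDOWS = {
    2013: (datetime(2013, 3, 10), datetime(2013, 11, 3)),
}


def to_gmt(server_time, base_shift_hours=2, dst_extra_hours=1):
    """Convert a naive broker server timestamp to GMT (shift of 0).

    The feed is assumed to run at GMT+2, or GMT+3 while DST is active.
    Years missing from DST_WINDOWS are treated as having no DST.
    """
    shift = base_shift_hours
    window = DST_WINDOWS.get(server_time.year)
    if window is not None and window[0] <= server_time < window[1]:
        shift += dst_extra_hours
    return server_time - timedelta(hours=shift)


def keep_bar(gmt_time, week_end_hour=16):
    """Return True only for bars that exist in both feeds."""
    weekday = gmt_time.weekday()  # Monday = 0 ... Sunday = 6
    if weekday >= 5:
        # Sunday bars (and any stray Saturday bars) are removed.
        return False
    if weekday == 4 and gmt_time.hour >= week_end_hour:
        # Friday bars stamped at or after 16:00 GMT are removed.
        return False
    return True


def normalise(bars):
    """bars: list of (server_time, open, high, low, close) tuples."""
    out = []
    for server_time, o, h, l, c in bars:
        gmt_time = to_gmt(server_time)
        if keep_bar(gmt_time):
            out.append((gmt_time, o, h, l, c))
    return out
```

For example, a summer bar stamped Monday 01:00 server time maps to Sunday 22:00 GMT and is dropped, while the bar stamped Monday 03:00 maps to Monday 00:00 GMT and is kept.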
FIGURE 1 : DAILY DATA DIFFERENCE
FIGURE 2 : FOUR HOURS DATA DIFFERENCE
FIGURE 3 : ONE HOUR DATA DIFFERENCE
Comparing the data
Figure 1 shows the absolute differences across the two feeds for the open, high, low, and closing prices for each daily price bar. The overall difference is lowest for the high and close values while it’s highest
for the open and low values. As you might expect, there are several spikes in the differences during periods of reduced liquidity — e.g., early January, around Christmas, after big news events, etc. —
although these are very limited in nature; less than 0.01% of the overall data set exceeds three standard deviations.
Opening prices tend to show greater differences than closes because of larger price gapping, while lows exhibit larger differences possibly because of a tendency for sharp down moves during low
liquidity events. Figures 2 and 3 show the large differences on the daily time frame are also apparent on the four-hour and one-hour time frames, respectively, confirming they are generated by
low-liquidity news events rather than extended periods of accumulation in differences from one-hour bars.
Surprisingly, the absolute price differences are relatively low and constant across the three time frames. Table 1 shows the median differences for the open/high/low/close prices. (The median is used here
instead of the average because of the presence of outliers that would otherwise skew measurements.) The fact that the daily and hourly bar differences are very similar in magnitude suggests there’s no accumulation of these
disparities throughout the day; hourly differences are cancelled as each day evolves. This means the difference between our two brokerages is not systematic but more akin to “noise” — which is what we’d expect from slightly
different latencies and quotes from FX liquidity providers. The differences are also notably small, with the average overall median difference representing less than one tick (0.0001).
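As a rough illustration of the measurement, the median absolute difference per price field could be computed as follows. The data layout (a dict keyed by time stamp) and the function name are assumptions made for this sketch, and the sample prices are invented rather than taken from either brokerage.

```python
from statistics import median

FIELDS = ("open", "high", "low", "close")


def median_abs_diff(feed_a, feed_b):
    """Median absolute difference per OHLC field.

    feed_a, feed_b: dicts mapping a time stamp to a dict with the keys
    in FIELDS. Only time stamps present in both feeds are compared, so
    at least one common bar is required.
    """
    common = sorted(set(feed_a) & set(feed_b))
    result = {}
    for field in FIELDS:
        diffs = [abs(feed_a[t][field] - feed_b[t][field]) for t in common]
        result[field] = median(diffs)
    return result


# Invented sample: the third bar has an outlier on the open.
feed_a = {
    1: {"open": 1.3000, "high": 1.3050, "low": 1.2980, "close": 1.3020},
    2: {"open": 1.3020, "high": 1.3060, "low": 1.3000, "close": 1.3040},
    3: {"open": 1.3040, "high": 1.3070, "low": 1.3010, "close": 1.3030},
}
feed_b = {
    1: {"open": 1.3001, "high": 1.3050, "low": 1.2981, "close": 1.3020},
    2: {"open": 1.3022, "high": 1.3061, "low": 1.3002, "close": 1.3041},
    3: {"open": 1.3090, "high": 1.3070, "low": 1.3013, "close": 1.3031},
}
diffs = median_abs_diff(feed_a, feed_b)
```

In this sample the open differences are roughly 0.0001, 0.0002 and 0.0050. The median stays near 0.0002, whereas the mean would be pulled above 0.0017 by the single outlier, which is why the median is the more robust summary.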
Time frame considerations
Nonetheless, it’s important to consider the differences as a percentage of the average bar high-low distance (see also Table 1). While a one-tick difference is hardly important for the daily close (1.09%), this difference
becomes very important on the hourly time frame bars (4.85%). This means while brokerage dependency may play a small role on longer-term time frames, it will play a very important role on shorter-term
time frames. If you’re designing a system to trade on the daily time frame across these two brokerages, you may hardly ever get different signals. A system designed for the hourly time frame, however, will
have a higher chance of showing this divergence. If the noise per bar remains constant across time frames while the range of each bar shrinks on shorter time frames, we can expect the share of a bar's range that depends
on the brokerage to be described by a “power law” dynamic. Figure 4 shows power law regressions for the differences studied. The four lines represent the open, high, low, and closing prices. The vertical axis represents
the price difference as a percentage of the average bar high-low range and the horizontal axis shows the time frame in minutes. Between the two brokers the median price difference as a percentage of the average bar range
increases roughly according to this equation:
Median of difference as % of average range = -4 + 25 × (time frame in minutes)^(-0.5)
This means on the five-minute time frame, you would expect the median price difference to be roughly 7.2% of the bar range, while on the one-minute time frame the value will be approximately 21%.
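The arithmetic behind these two figures can be checked with a short sketch. The function name is hypothetical; the constants are simply the ones quoted in the equation above.

```python
def fitted_median_diff_pct(timeframe_minutes):
    """Evaluate the fitted equation quoted in the text.

    Returns the expected median price difference as a percentage of
    the average bar range for a time frame given in minutes.
    """
    if timeframe_minutes <= 0:
        raise ValueError("time frame must be a positive number of minutes")
    return -4.0 + 25.0 * timeframe_minutes ** -0.5


one_minute = fitted_median_diff_pct(1)             # 21.0
five_minute = round(fitted_median_diff_pct(5), 1)  # 7.2
```

As printed, the expression falls below zero for time frames longer than about 39 minutes, so it is best read as a rough guide for short intraday time frames only.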
Any system you design will be further affected by these differences, depending on the number of data-point comparisons used to make trading decisions. For example, if your strategy uses a price pattern defined by three data
points on the one-minute time frame, the odds that trade signals will differ across brokerages will be much higher than if the strategy used, say, a 20-bar moving average. Because the brokerage difference in this case can be described as noise, its impact is diminished by averaging and augmented by point-to-point comparisons.
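The effect of averaging versus point-to-point comparison can be explored with a small simulation. Everything here is an assumption chosen for illustration: the prices are a synthetic random walk rather than real brokerage data, the second feed is the first plus independent Gaussian noise, and the two signal rules and all parameter values are hypothetical.

```python
import random


def simulate(n_bars=5000, step_sd=0.0003, noise_sd=0.00006, seed=1):
    """Build two synthetic close series: a random walk and a noisy copy."""
    rng = random.Random(seed)
    feed_a = []
    price = 1.3000
    for _ in range(n_bars):
        price += rng.gauss(0.0, step_sd)
        feed_a.append(price)
    # The second feed differs from the first only by per-bar noise.
    feed_b = [p + rng.gauss(0.0, noise_sd) for p in feed_a]
    return feed_a, feed_b


def pattern_signal(closes, i):
    """Three-point pattern: two consecutive higher closes."""
    return closes[i] > closes[i - 1] > closes[i - 2]


def sma_signal(closes, i, period=20):
    """Close above its simple moving average over the last `period` bars."""
    window = closes[i - period + 1 : i + 1]
    return closes[i] > sum(window) / period


def mismatch_rate(signal, feed_a, feed_b, start=19):
    """Fraction of bars on which the two feeds give different signals."""
    indices = range(start, len(feed_a))
    mismatches = sum(signal(feed_a, i) != signal(feed_b, i) for i in indices)
    return mismatches / len(indices)


feed_a, feed_b = simulate()
pattern_rate = mismatch_rate(pattern_signal, feed_a, feed_b)
sma_rate = mismatch_rate(sma_signal, feed_a, feed_b)
```

Under these assumptions the three-point pattern should disagree across the two feeds noticeably more often than the moving-average rule, because the pattern compares individual noisy closes while the average dilutes the noise over 20 bars.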
Correct time stamps, use higher time frames
If you want to ensure the smallest possible difference between two data feeds (your historical data and a live feed, or two live brokerage feeds) make sure you correct the time stamps and weekly starting and ending times so
they match.
Also, focus on longer time frames, since shorter time frames will increase the risk of price data discrepancies resulting from the higher percentage of bar range attributable to “noise.” If you want to use shorter time frames, though, keep in mind it becomes essential to develop and test strategies on price data from the same source that will be used in live trading. Otherwise, the differences between testing and trading will be too great.
Article written by Daniel Fernandez for the April 2014 issue of Currency Trader magazine.