Data computation
Once data from multiple sources has been aggregated, DIA applies several layers of processing to build high-quality feeds that are resilient to bad actors. Every step can be customized with tailor-made methodologies to best serve each use case.
Step 1: Data cleaning and outlier detection
The price estimation process needs to be resilient against trades whose prices deviate from the current market price. Such deviations can be caused by market manipulation, errors, or flash crashes on individual exchanges, among other things.
To guard against these irregularities, the first processing step is data cleaning and outlier removal, so that a feed is not built on data that lies far from the median. Detecting and excluding outliers is especially important in low-volume and illiquid markets. Otherwise, a single low-volume trade can skew the price estimate, and if that asset serves as a base token for pricing other assets, the error can set off a chain reaction of misaligned price data.
How does this look in practice? By applying an Interquartile Range (IR) filter, DIA excludes data points, and even entire data sets, that lie outside an acceptable range relative to the interquartile range. The filter takes all trades in a predefined time range and sorts them by recorded price. The sorted prices are then divided into four blocks, the quartiles; the lowest and highest prices in the range bound the first and the last quartile respectively. To remove outliers, trades falling into the first or the last quartile are discarded, and only trades in the two "middle" quartiles move on to further processing.
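The filtering step described above can be sketched roughly as follows. This is a minimal illustration, not DIA's implementation: the `Trade` type and the `interquartile_filter` function are hypothetical names, and the sketch assumes the quartiles are formed by trade count (each outer quartile holds a quarter of the sorted trades).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trade:
    """Hypothetical trade record used only for this illustration."""
    price: float      # executed price in USD
    volume: float     # traded amount of the asset
    timestamp: float  # Unix time in seconds


def interquartile_filter(trades):
    """Return the trades whose price falls into the two middle quartiles.

    Assumption: quartiles are split by trade count, so the lowest and the
    highest quarter of the price-sorted trades are discarded.
    """
    ordered = sorted(trades, key=lambda t: t.price)
    cut = len(ordered) // 4  # number of trades in each outer quartile
    return ordered[cut:len(ordered) - cut]


# Example: two obvious outliers (10.0 and 250.0) among trades near 100 USD.
prices = [100.0, 101.0, 99.0, 100.5, 250.0, 99.5, 100.2, 10.0]
sample = [Trade(price=p, volume=1.0, timestamp=0.0) for p in prices]
kept = interquartile_filter(sample)
print([t.price for t in kept])  # [99.5, 100.0, 100.2, 100.5]
```

With fewer than four trades, `cut` is zero and nothing is removed; a production filter would likely need an explicit policy for such thin blocks.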
-> Learn more about outliers and market manipulation
Step 2: Price determination methodology application
Now that we understand how deviating trade data is cleared out, how is the final price actually calculated from the remaining data points? To arrive at a single USD price for every asset, DIA uses trade-based price determination methodologies, also called filters. A filter is a function that derives a single price point from a collection of trades in a block.
Filters are calculated for each asset on each exchange individually, as well as for each asset across all exchanges combined. The combined filter result is the closest approximation of the true "whole market" price that this system can determine. Because each use case has different data needs, DIA can provide multiple filters to best serve the requirements of each application.
Let’s look at a couple of examples:
Volume Weighted Average Price (VWAP): a methodology for trade-based price determination that takes the different volumes of trades into account. All trades from the queried time range are collected and weighted by their volume, meaning that each trade's executed price is multiplied by its volume. These volume-price products are summed over all trades, and the total is divided by the sum of all volumes, which normalizes the weights.
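As a rough sketch of the VWAP calculation described above: the function name `vwap` and the `(price, volume)` tuple layout are assumptions made for illustration only, not DIA's actual interface.

```python
def vwap(trades):
    """Volume-weighted average price of (price, volume) pairs.

    Hypothetical helper: sums price * volume over all trades and divides
    by the total traded volume.
    """
    trades = list(trades)
    total_volume = sum(volume for _, volume in trades)
    if total_volume == 0:
        raise ValueError("cannot compute VWAP without traded volume")
    weighted_sum = sum(price * volume for price, volume in trades)
    return weighted_sum / total_volume


# Example: the 2.0-unit trade at 100 USD counts twice as much as the others.
print(vwap([(100.0, 2.0), (110.0, 1.0), (90.0, 1.0)]))  # 100.0
```

Dividing by the total volume is what turns the raw volumes into normalized weights, so large trades move the result more than small ones.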
Moving Average with Interquartile Range Filter (MAIR): in this case, all trades collected in the queried time range are ordered by timestamp. A block is created for each second in the time range, and each trade is placed into the block matching its timestamp. Once the trades for every block have been collected, each data point is weighted by its volume and the weighted average price is taken to arrive at the final price.
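One possible reading of the MAIR steps above is sketched below. Several details are assumptions rather than DIA's documented behavior: the function name `mair`, the `(timestamp, price, volume)` tuple layout, applying a count-based interquartile cut inside each one-second block, and combining the surviving trades with a single volume-weighted average.

```python
from collections import defaultdict


def mair(trades):
    """Sketch of a moving average with an interquartile range filter.

    trades: iterable of (timestamp_seconds, price, volume) tuples with
    non-negative timestamps. All names here are hypothetical.
    """
    # Order trades by timestamp and place each one into its one-second block.
    blocks = defaultdict(list)
    for timestamp, price, volume in sorted(trades):
        blocks[int(timestamp)].append((price, volume))

    weighted_sum = 0.0
    total_volume = 0.0
    for second in sorted(blocks):
        # Assumed outlier step: drop the lowest and highest quarter of
        # the block's trades by price (count-based quartiles).
        ordered = sorted(blocks[second])
        cut = len(ordered) // 4
        for price, volume in ordered[cut:len(ordered) - cut]:
            weighted_sum += price * volume
            total_volume += volume

    if total_volume == 0:
        raise ValueError("no traded volume left after filtering")
    return weighted_sum / total_volume


# Example: second 0 holds four trades, one of them an outlier at 500 USD;
# second 1 holds a single trade.
sample = [
    (0.1, 100.0, 1.0),
    (0.4, 102.0, 1.0),
    (0.6, 101.0, 2.0),
    (0.9, 500.0, 1.0),
    (1.2, 103.0, 1.0),
]
print(mair(sample))  # 101.75
```

In this example the cut removes the 100 USD and 500 USD trades from the first block, leaving a total volume of 4.0 across both blocks.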
-> See all pricing methodologies
Additionally, DIA can implement new methodologies on demand and build new methodologies together with users.