Is the Pain in Using Alternative Data Worth the Alpha Gain?
Alternative data sets have expanded rapidly over the last five years. From satellite imagery to POS transaction receipts and phone location data, the realm of the measurable has grown, and exploiting it properly requires infrastructure and skills. AI and machine learning are helpful, but is this merely an intellectual pursuit, or does it add value to a fundamental process? After a short definition and overview of alternative data sets, we will explore some use cases in finance.
Definition
Personal data gathering has been at the forefront of the news flow for years, as much for the seemingly unbridled exploitation of such data for political use as for the more mundane marketing prowess it has facilitated. Point-of-sale (POS) data such as credit card receipts, mobile phone location data, and online consumer behavior are part of what is called “alternative data.” Satellite imagery, public records, sensors of all types, and anything published on the internet are other sources.
Alternative to what traditional data? The line is blurred and a hard definition is difficult to come by. Some features common to these alternative data sets include:
The samples contain a very large number of rows (a billion or more), which necessitates big-data-type frameworks (infrastructure, software, and skills) to work with them properly.
Data needs to be curated and cleaned to obtain usable sample sets.
By themselves, they rarely provide insights. Unlike a financial statement number or an address, which are explicit, alternative data sets act as a complement to other known factual data sets, including news flow, and only in combination do they provide the necessary insights.
Machine learning techniques come into play at some stage to link those unstructured data sets with other, more structured samples and to calibrate the explanatory power of the data itself.
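As a rough illustration of what "calibrating" an alternative data series against a structured one can mean at its simplest, the sketch below computes the share of variance in a reported metric that a one-variable linear fit on an alternative signal explains (R-squared). The function name `r_squared` and both series are hypothetical, and the numbers are made up for illustration only. Real workflows would involve far more data, cleaning, and out-of-sample checks.

```python
from statistics import mean


def r_squared(signal, target):
    """Share of the variance in `target` explained by a one-variable
    linear fit on `signal` (the squared Pearson correlation)."""
    if len(signal) != len(target) or len(signal) < 2:
        raise ValueError("series must have equal length of at least 2")
    mx, my = mean(signal), mean(target)
    sxx = sum((x - mx) ** 2 for x in signal)
    syy = sum((y - my) ** 2 for y in target)
    sxy = sum((x - mx) * (y - my) for x, y in zip(signal, target))
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no variance to explain")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical, made-up quarterly figures (assumption, not real data):
# growth in store foot traffic derived from location data, and the
# same-store sales growth later reported by the company.
foot_traffic_growth = [0.02, -0.01, 0.04, 0.03, 0.00, 0.05]
reported_sales_growth = [0.015, -0.005, 0.03, 0.02, 0.005, 0.04]

print(round(r_squared(foot_traffic_growth, reported_sales_growth), 3))
```

A high value on a handful of points proves little by itself. It only indicates that the alternative series is worth testing further against the structured data it is meant to complement.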
Use Cases in Finance
Numerous use cases are available: some shared in online publications, some as examples of trades from hedge fund managers, and some in data provider presentations. The breadth of applications is extremely wide and does not lend itself to a single framework for evaluating their usefulness.
In the last three years, the number of providers has skyrocketed. Neudata, an alternative data consultant, reckons more than 60 data sets were made available in 2019, with web scraping and event databases seeing the most launches.
On a side note, the resurgence of demand for ESG data is affecting the alternative data offering. The need for higher-frequency data on company behavior and adherence—including NLP analysis of patents, news, and company communications—is bound to change the dynamics of ESG reporting for the better.
Correlating ship movements through filing data, GPS tracking, and satellite imagery is not an easy task, yet it can reveal what is shipped where and at what speed. This, in turn, provides insights into commodity demand, especially for oil, or, in the case of China, into the export volume of manufactured goods as a key macro data point. (Source: IHS Markit)
An interesting and very well-documented case is the infamous “chicken sandwich war” of the summer of 2019, initiated by restaurant chain Popeyes and involving Wendy’s and Chick-fil-A. Popeyes (QSR) stock price started to move upwards late in the spring. From April on, some analysts became aware of very strong POS data from a few restaurants where a new chicken sandwich was on trial. Analyzing the trial data for profitability, customer behavior, and blowout same-store sales, analysts started to gather information about a wider rollout, which happened on August 17. That week, Popeyes’s Twitter followers increased 25%, which triggered the #chickensandwichwars hashtag and an explosion of flows on the topic. The stock started the year at $55 and reached $78.50 on August 29.
Applications to Systematic Strategies
With public companies very limited in what they can share, and private companies not obliged to provide much information at all, alternative data can provide invaluable data points as a complement to fundamental analysis.
At the same time, the temptation to use the information for systematic strategy implementations is strong. However, the shared use cases show that the extensive use of alternative data for such methodologies is still in its infancy and very sensitive to a lack of depth, to unstructured and badly sorted data, and to inexperienced data scientists.
The systematic strategy conundrum has long been known. How does one develop a profitable algorithm based on historical data without optimizing such a system to the data set itself rather than to the real world? Countless methodologies exhibiting great track records proved useless the moment they started trading for real. Beyond trading costs, the biggest issue is data mining and over-optimization, which experienced quant developers understand very well. Processes and checks are in place to avoid these pitfalls, which is somewhat easier with structured data, such as prices over time and financial data, than with the unstructured nature of the typical alternative data sample set. Is the apple mentioned in a tweet or an article the fruit or the company? Is the jaguar a car, an animal, or a jet fighter? How is useful context derived from such information? And how can one rely on POS data when only 3–5% of transactions are captured, depending on data quality?
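The data mining problem described above can be made concrete with a small simulation. The sketch below is an illustration under stated assumptions, not a description of any manager's process: it generates many hypothetical "strategies" whose daily returns are pure noise (zero true edge by construction), picks the one with the best in-sample average return, and then looks at how that same strategy does out of sample. The function name, the number of strategies, and the volatility figure are all arbitrary choices made for the example.

```python
import random
from statistics import mean


def best_in_sample_vs_out(n_strategies=200, n_days=250, seed=0):
    """Select the best of many zero-edge strategies on in-sample data
    and report how it fares on fresh, out-of-sample data.

    Returns (best in-sample mean, the same strategy's out-of-sample
    mean, average out-of-sample mean across all strategies).
    """
    rng = random.Random(seed)  # seeded so the run is reproducible
    results = []
    for _ in range(n_strategies):
        # Daily returns are independent noise with a true mean of zero.
        in_sample = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        out_sample = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        results.append((mean(in_sample), mean(out_sample)))
    # "Optimization": keep whichever strategy looked best historically.
    best_in, best_out = max(results, key=lambda r: r[0])
    avg_out = mean(r[1] for r in results)
    return best_in, best_out, avg_out


best_in, best_out, avg_out = best_in_sample_vs_out()
print(f"best in-sample mean daily return:  {best_in:.5f}")
print(f"same strategy, out-of-sample mean: {best_out:.5f}")
print(f"average out-of-sample mean:        {avg_out:.5f}")
```

Because the winner is chosen after the fact from many candidates, its in-sample figure is biased upward even though no strategy has any real edge. The out-of-sample figure carries no such selection bias, which is why holding back data that played no part in the selection is one of the standard checks.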
The key point is that whatever methodology is used to derive insights from large data sets, understanding and acting on data quality is critical to the process.
Conclusion
Alternative data changes the way research and investment management bring insights and knowledge into the investment process. Far from being a monolith, these data sets are wide-ranging, and the connections between them are where data scientists can add value. For investors who require curated data and useful reports without extensive machine learning work, most providers tailor their data sets to particular use cases, greatly reducing the time to market of such data.
Any serious analyst or investment manager has to become familiar with alternative data, at least to know what the competition is using and how. Using only structured data is a disadvantage in the alpha world, whether for fundamental, quantamental, or pure quant processes.
Sources
Source Orbital Insight. Object Recognition
A Taxonomy of Alternative Data
5 Alternative Data Use Cases for Financial Services
Alternative Data Uses Cases, ed. 6
About the Author
Eric Bissonnier, CFA, started his investment career in 1992 at Chase Manhattan Private Bank in Geneva. After a year in New York, he took over portfolio responsibilities for multi-asset European clients, when multi-currency meant something in Europe. In 1998, Eric joined EIM as portfolio manager and analyst, taking on chief investment officer duties in 2000. During that time, EIM grew to be a leader in alternative investments, managing $15 billion in hedge fund investments for multiple institutions globally. Following its merger with Gottex, the group changed its name to LumX, focusing on hedge funds, delegated services, and fintech. Over his 25+ years of investing, Eric has thus accumulated knowledge in portfolio and risk management, product development, and process improvement. Lately, as part of the LumRisk fintech, he has extended his knowledge of the reporting and portfolio control challenges faced by institutions to regulatory and ESG-related challenges. Eric is a frequent speaker at investment conferences worldwide. He has been a CFA® charterholder since 1999 and holds an MSc in economics from the University of Geneva.