By Robert Hansen
Preface: Investing well has become an art form. Fundamentals, competition, market share, and long-term viability are all concepts that the Warren Buffetts of the world rely on when deciding which investments to focus on. Basing financial decisions on alternative tactics that promise short-term gains is often risky, especially if it involves privileged information. Forecasting future events without breaking the law is the holy grail of investors.
Overview: As with any sufficiently complex system, the Internet provides many avenues for acquiring information that is otherwise sensitive. While reading this paper, it must be understood that most of the techniques mentioned are easy to comprehend, although some would be extremely difficult to implement. It is also not suggested that any of these techniques actually be implemented. This paper is designed only to explain how data mining can provide information that could prove useful in market decisions. Let's start with some assumptions.
Assumptions: First, we must assume that the company in question is publicly traded and is subject to semi-volatile issues, like market swings, public perception and large scale attrition. It must also be assumed that, for these concepts to work, the company is involved in at least some level of high-tech enterprise. For some of the following techniques to be truly effective, the company must be susceptible to changes in human capital.
The Techniques: It can be argued that in many companies the most important commodity is the employees. Of course, intellectual property, client base and mind-share are all critical components as well, but many of these can be subverted by the human capital. For instance, intellectual capital, while iron clad in some states due to non-compete agreements, is far less protected in states like California, where such agreements are unenforceable due to "right to work" clauses. Clients and customers can leave due to allegiance to individual salesmen, and mind-share can evaporate due to messy disclosures of internal affairs a la internalmemos.com.
While understanding the health of the company is often left to annual reports, understanding the morale and mental health landscape may prove just as valuable. Let's look at a few interesting places that can give huge insights into the mental well being of the people working at the company.
LinkedIn is a social networking site for working professionals. It's a place to keep in contact with people, including those who work at other companies. However, it is also used to network when looking for work, as well as to help others get jobs through recommendations. Graphing the rate of growth of new connections inside a company, as well as any sudden spurt of changes to information within profiles, can signal a shift in employee satisfaction. This particular technique is bound by the types of people who use the site. Because the site is semi-region and demographic specific, and because account abandonment rates are unknown, it should be used only as a positive indicator of change, not a negative one.
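As a rough illustration of the graphing idea above, the sketch below flags weeks in which the number of profile changes jumps well above its recent average. The function name, the four-week window, the threshold and the sample counts are all assumptions made up for the example; none of them come from a real site or data set. In keeping with the caveat above, it only flags increases and ignores drops.

```python
from statistics import mean, stdev


def flag_spikes(weekly_counts, window=4, threshold=2.0):
    """Return the indices of weeks whose count exceeds the trailing mean
    by more than `threshold` trailing standard deviations.

    Only upward deviations are flagged, so a quiet week is never
    treated as a signal.
    """
    flagged = []
    for i in range(window, len(weekly_counts)):
        history = weekly_counts[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        # A perfectly flat history has no spread to compare against; skip it.
        if sigma > 0 and weekly_counts[i] > mu + threshold * sigma:
            flagged.append(i)
    return flagged


# Hypothetical weekly counts of profile edits at one company.
counts = [3, 4, 3, 5, 4, 3, 15, 4]
print(flag_spikes(counts))  # prints [6]
```

A trailing window is used rather than a global average so that slow, organic growth in the user base is not mistaken for a sudden shift.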
Knowing this information is more valuable than it may first appear. Job titles, and occasionally specific information about which accounts an employee handles (in the case of sales), may be available both through the profile and through the people they are networked with. It can also signal large scale attrition of critical functions within the company, as entire organizations can become disgruntled with poor management, uncompetitive wages, or a host of other business issues.
Job sites like Monster.com are another prime place to derive information about a company's health. For a fee, companies can look for people who have relevant work experience, as well as for people who are currently employed at the company in question. Using nothing more than text based data mining, it is possible to derive which departments are seeing unnaturally high attrition, as well as what particular intellectual mind share may be walking out the door. Also, key information about projects is often leaked through people's resumes. This technique is employed and expanded upon in industrial espionage, where a fake recruiter will call the candidate and question them about project specifics, pulling as much sensitive information out of them as possible. While the latter form of this technique is illegal, it is highly successful in lifting the veil of internal corporate strategy.
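The text based mining described above can be sketched very simply: count how many resumes mention each department. The keyword map, function names and sample resume snippets below are hypothetical, chosen only to show the shape of the tally. A real classifier would need a far richer vocabulary.

```python
import re
from collections import Counter

# Hypothetical mapping from department to words that suggest it.
DEPARTMENT_KEYWORDS = {
    "engineering": {"engineer", "developer", "architect"},
    "sales": {"sales", "account"},
    "support": {"support", "helpdesk"},
}


def departments_in(text):
    """Return the set of departments whose keywords appear in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {dept for dept, keywords in DEPARTMENT_KEYWORDS.items()
            if words & keywords}


def tally_departments(resumes):
    """Count how many resumes mention each department at least once."""
    tally = Counter()
    for text in resumes:
        tally.update(departments_in(text))
    return tally


# Made-up resume snippets for illustration only.
resumes = [
    "Senior Software Engineer at ExampleCorp",
    "Account manager, enterprise sales",
    "Developer on the billing platform",
]
print(tally_departments(resumes).most_common())
```

Comparing this tally from one month to the next, rather than reading it in isolation, is what would hint at unusually high attrition in one department.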
Likewise, human resources can also provide a fantastic wealth of information about open requisitions: which departments are being expanded and which are understaffed, either of which may signal shifts in the corporate strategy. The more particular the requisition, the more information can be derived about the organization doing the hiring.
If the company is critically dependent on keyword phrases in search engines for its success, it can be useful to track the health of those particular keywords using Google Trends. In this way, you can see mass fluctuations in people's perception of the company or its products. Also, tracking where the company ranks for the individual keywords that it monetizes can be highly useful in understanding the economic future of the company.
Alexa, while mostly spyware, can also help in researching trending information about company traffic patterns. This information isn't statistically relevant when comparing two companies against one another, because it is based only on users who have the toolbar installed; however, it can be highly useful in knowing the health of any one company month over month. Also, because the Alexa toolbar is used mostly by webmasters and search engine optimization experts, it can be especially useful if the company monetizes that particular form of traffic.
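The month over month comparison is simple arithmetic, sketched below. The function name and the sample figures are assumptions for the example; the input is any series of monthly traffic estimates for a single company, not a real Alexa export.

```python
def month_over_month(values):
    """Return the percent change between each pair of consecutive months.

    A month that follows a zero reading has no defined percent change,
    so None is returned in that position.
    """
    changes = []
    for previous, current in zip(values, values[1:]):
        if previous == 0:
            changes.append(None)
        else:
            changes.append((current - previous) * 100.0 / previous)
    return changes


# Hypothetical monthly traffic estimates for one company.
traffic = [100, 110, 99]
print(month_over_month(traffic))  # prints [10.0, -10.0]
```

Because the sample behind the numbers is biased, only the direction and size of the change for one company is meaningful here, not the absolute values.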
Knowing the specific patterns of key executives is also highly predictive, in the case of public figureheads. For instance, knowing that an executive is buying another home may not be interesting if it is located within ski country. However, it is highly unlikely that an executive living in Los Angeles would buy a summer home in San Jose, California. Such a purchase could indicate a future position there, with the executive looking to live there while they work. Mining public county records may be highly useful for this type of analytics. While this is a huge undertaking, it may be enough to focus on the most economically relevant regions. For instance, if a large company is located in a particular region, it may be useful to know that three or more executives in that area put their houses up for sale within the same month.
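The "three or more executives in the same month" check amounts to grouping records by region and month. The sketch below assumes the county records have already been reduced to (executive, region, month) tuples; that record layout, the function name and the sample names are all hypothetical.

```python
from collections import defaultdict


def listing_clusters(listings, minimum=3):
    """Group home listings by (region, month) and keep the groups in which
    at least `minimum` distinct executives listed a house."""
    groups = defaultdict(set)
    for executive, region, month in listings:
        # A set ensures an executive with two listings is counted once.
        groups[(region, month)].add(executive)
    return {key: sorted(names)
            for key, names in groups.items()
            if len(names) >= minimum}


# Made-up records for illustration only.
listings = [
    ("Exec A", "Region 1", "2007-03"),
    ("Exec B", "Region 1", "2007-03"),
    ("Exec C", "Region 1", "2007-03"),
    ("Exec D", "Region 2", "2007-03"),
]
print(listing_clusters(listings))
```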
Companies often release new products in alpha phases prior to making them widely known. Also, it is very common for large companies to assign new network allocations for their online products. Running network scanners across the company domain on a regular basis can isolate and identify new swaths of networks prior to use. Because of the slow propagation of internet name servers, administrators realistically must assign publicly facing hostnames before the sites become usable by the public. Taking snapshots of the network can help uncover new product launches prior to any public announcements. Also, depending on the network layout, it can signal large and ongoing costs associated with the build out of additional infrastructure. Knowing the corporate bottom line can help in future forecasting of budgets and new/expanded cost centers.
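The snapshot comparison itself can be sketched as a set difference between two lists of hostnames. How the snapshots are gathered is left out on purpose; the function name and the example.com hostnames below are placeholders, not observations of any real network.

```python
def diff_snapshots(previous, current):
    """Compare two hostname snapshots and return (added, removed).

    Hostnames are lowercased first, since DNS names are case-insensitive.
    """
    before = {name.lower() for name in previous}
    after = {name.lower() for name in current}
    return sorted(after - before), sorted(before - after)


# Placeholder snapshots taken a month apart.
last_month = ["www.example.com", "mail.example.com"]
this_month = ["www.example.com", "mail.example.com", "beta.example.com"]

added, removed = diff_snapshots(last_month, this_month)
print("new hostnames:", added)
print("retired hostnames:", removed)
```

Keeping every snapshot, rather than only the latest pair, would also show how quickly the footprint grows over time, which bears on the infrastructure cost point above.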
Conclusions: While this paper is a brief smattering of different techniques, there are perhaps hundreds of other ways to mine data beyond this. Some are more legal/ethical than others, and while this is nowhere near an exhaustive list, it is meant only as a way to get people thinking about the vast amount of information that leaks out of companies. This information can be monetized if identified and classified. The results can be huge predictors of future events, and can actually be used for additional monetary gains.
For instance, if it can be proven that any one technique is a 100% indicator, an investor can invest and then publicly disclose that information. In turn the public can make huge swings in the market, adding to the velocity of the market shift, and allowing the gains to become multiplicative. Ultimately this information can be used to an aggressive investor's advantage in a number of ways, whether it be shorting stock, buying into competitors, or other carefully planned investment strategies.
Thanks: Special thanks must go to Vasil Nadzakov and Erik Wu for inspiration and allowing me to think through my ideas.