Wallstreetbets Quantified
HERALD: Hype Early Recognition through Analysis of Language Demographics
THE BEQUANT r/wallstreetbets WEB SCRAPER and NLP MODEL
BEQUANT’s first in-house quantitative tool, HERALD, is a Python program that combs through the /r/wallstreetbets subreddit, catalogues the quantity of posts and comments referring to each stock, runs them through a natural language processor to determine sentiment, and analyzes the results.
Seeing as the subreddit is built on social proof persuasion, with an idea taking off once a critical mass of people adopt it, this should give users of HERALD an edge. Day by day, one can see the ideas surfacing as they rise through the rankings.
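As a rough illustration of the mention ranking described above, here is a minimal sketch in Python. It is not HERALD's actual code: the watchlist, the regular expression, and the rule that each post counts a ticker at most once are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical watchlist; the source does not say how HERALD decides
# which strings count as tickers.
KNOWN_TICKERS = {"TSLA", "SPY", "BB", "AMC", "CLOV"}

# An optional "$" followed by one to five capital letters as a whole word.
TICKER_PATTERN = re.compile(r"\$?\b([A-Z]{1,5})\b")


def count_mentions(texts):
    """Count how many posts or comments mention each known ticker."""
    counts = Counter()
    for text in texts:
        # Use a set so one post repeating a ticker is counted once
        # (an assumption, not a documented HERALD rule).
        found = {m for m in TICKER_PATTERN.findall(text) if m in KNOWN_TICKERS}
        counts.update(found)
    return counts


def rank_tickers(counts):
    """Return (ticker, mentions) pairs, most mentioned first."""
    return counts.most_common()


posts = [
    "$TSLA calls printing",
    "TSLA and SPY both green",
    "BB is back, BB BB",
]
mention_counts = count_mentions(posts)
ranking = rank_tickers(mention_counts)
```

Running the same count each day and comparing the rankings is what would let a user watch a ticker climb.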
The second element of the algorithm is its built-in NLP (Natural Language Processing) capabilities. Each stock mention in posts and comments is run through VADER Sentiment, an open-source, MIT-licensed NLP tool, to determine the percentage of bulls, neutrals, and bears for the stock.
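The bull/neutral/bear split can be sketched as follows. This is an illustration under assumptions, not HERALD's implementation: it takes VADER compound scores (each between -1 and 1) as input and buckets them with the ±0.05 cutoffs that the VADER documentation suggests as a convention. Whether HERALD uses those cutoffs is not stated in the source.

```python
def classify(compound, threshold=0.05):
    """Label one VADER compound score as bull, bear, or neutral.

    The 0.05 threshold is the conventional VADER cutoff and is an
    assumption here, not a documented HERALD setting.
    """
    if compound >= threshold:
        return "bull"
    if compound <= -threshold:
        return "bear"
    return "neutral"


def sentiment_breakdown(compound_scores):
    """Return the percentage of bull, neutral, and bear mentions."""
    labels = [classify(score) for score in compound_scores]
    total = len(labels)
    if total == 0:
        return {"bull": 0.0, "neutral": 0.0, "bear": 0.0}
    return {
        name: 100.0 * labels.count(name) / total
        for name in ("bull", "neutral", "bear")
    }


# In practice each score would come from the vaderSentiment package, e.g.
# SentimentIntensityAnalyzer().polarity_scores(text)["compound"].
# The scores below are made up for the example.
breakdown = sentiment_breakdown([0.6, 0.2, 0.0, -0.4])
```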
As of April 30th, 2021, the algorithm is fully automated, releasing new analysis every night at 9:30 pm ET through its Twitter account (@BEQUANT_HERALD). Due to Twitter character limits, only the bare stats are displayed. The program returns the complete analysis to me if something strikes me as worth a closer look. An example of this complete analysis can be seen to the right.
HERALD’s Twitter account is public because, as the Wall Street Journal pointed out during the GameStop stock pump, WallStreetBets is an inherently destabilizing force on the market. Individual investors and hedge funds alike can benefit from knowing which stocks HERALD reports as most affected, which can help prevent panic and bad decisions.
To discuss the algorithm, please contact me through the contact page or reach out through Twitter. BEQUANT’s handle is @BEQUANTAnalysis.
See below for the prospectus, which includes elements of the investing strategy used when trading on HERALD’s recommendations.
RECENT HERALD OUTPUT:
- Sentiment/Interest #WallStreetBets A BEQUANT Model | Top stocks on WSB: $TSLA | [23 mntns] 30% $SPY | [16 mntns] 3… https://t.co/eBhDzVRD5i
- Sentiment/Interest #WallStreetBets A BEQUANT Model | Top stocks on WSB: $TSLA | [19 mntns] 36% $SPY | [13 mntns] 3… https://t.co/K4F9Dzl7y0
HERALD
Hype Early Recognition through Analysis of Language Demographics
Prospectus | Aidan Slovinski | June 20, 2021
HERALD (Hype Early Recognition through Analysis of Language Demographics) is a Python-based algorithm that combs through the /r/wallstreetbets subreddit, catalogues the quantity of posts and comments referring to each stock, runs them through a natural language processor to determine sentiment, and analyzes the results.
WallStreetBets is built on social proof persuasion, with an idea taking off once a critical mass of people adopts it. HERALD tries to detect when that critical mass is being reached, before the stock’s price reflects WallStreetBets’ influence.
The second element of the algorithm is its NLP (Natural Language Processing) capabilities. Each stock mention in posts and comments is run through VADER Sentiment, an open-source, MIT-licensed NLP tool, to determine the percentage of bulls, neutrals, and bears for the stock.
As of April 30th, 2021, the algorithm is fully automated, releasing new analysis twice daily, at 9:20 am ET and 10:20 pm ET, through its private Twitter account. Due to Twitter character limits, only the ranking and sentiment of the top eight tickers are displayed. The program returns the complete analysis to me if I determine that the analysis needs a closer look. To track historical data, HERALD also keeps a full record of its complete analysis in a text document.
Effectiveness
In testing, HERALD has proved highly effective and offers the potential for very high returns. The most effective strategy is to focus on options trading and to buy calls when HERALD identifies a stock that has not already jumped. Examples from the testing phase include $BB, $AMC, and $CLOV. Since historical options data is not public information, I will instead calculate returns in the “Performance” section as if one had bought the stock at the opening bell after HERALD identified it. Had one instead bought call options on all three the trading day after HERALD identified them, with expiration dates roughly two weeks out, the returns would have been significantly higher, albeit with greater risk.
Methodology
A ticker is identified when HERALD reports both that interest in it has reached a certain threshold and that interest is still increasing. HERALD discounts tickers that suddenly jump to the top of the interest charts, as I have historically found that in those cases outside factors (such as media attention) almost always play a role, which makes the outcome unpredictable.
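The screening rule above might look something like the sketch below. The source gives neither the threshold nor how a sudden jump is defined, so `min_mentions`, `max_jump`, and the three-run window are hypothetical values chosen only to make the logic concrete.

```python
def is_candidate(history, min_mentions=10, max_jump=3.0):
    """Decide whether a ticker passes the screen.

    history: mention counts per HERALD run, oldest first.
    min_mentions and max_jump are hypothetical parameters; the source
    does not give HERALD's real thresholds.
    """
    if len(history) < 3:
        return False  # not enough runs to judge a trend
    earlier, previous, latest = history[-3:]
    if latest < min_mentions:
        return False  # interest has not reached the threshold
    if not (earlier < previous < latest):
        return False  # interest is not steadily increasing
    if latest > max_jump * previous:
        return False  # sudden spike, likely driven by outside factors
    return True


steady_rise = is_candidate([8, 12, 17])
sudden_spike = is_candidate([1, 2, 40])
fading = is_candidate([20, 18, 15])
too_quiet = is_candidate([2, 3, 4])
```

Here only the steady rise passes. The spike is rejected even though its latest count is the highest of the four.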
Performance
HERALD identified $SPCE on May 24 at 1:22 pm and continued to identify interest and positive sentiment until June 2. If one had bought the stock on May 24 at the day’s close of $26.89 and sold it at any time after May 26, they would have made anywhere from $4.20 per share (15.6%) to $10.60 per share (39.4%). This may be due in part to the news of Virgin Galactic’s first completed human spaceflight on May 22.
HERALD identified $BB on May 26 at 10:20 pm as being strongly influenced, given its steady but rapid rise in interest and its high sentiment score. HERALD continued to identify interest and positive sentiment until June 7. Had one bought BlackBerry stock at the open on May 27 at a price of $9.70 and sold at the stock’s high of $15.88 on June 3, they would have made $6.18 per share, or 64%. Even if they had sold the next day at $13.86, after a slight drop (the stock rose again to $15.80 the day after that), they would have made $4.16 per share, or 42.9%.
HERALD identified $AMC on May 27 at 10:20 pm as similarly strongly influenced, as the ticker displayed similar symptoms. Had one bought AMC stock at the open on May 28 ($31.89 per share) and sold at the stock’s peak on June 2 at a price of $62.55 per share, they would have made a profit of $30.66 per share, or 96%. Even if one had held too long, they would have made anywhere from $10.92 per share (34%) to $28.84 per share (90%), as $AMC’s stock price has hardly dropped to this day.
HERALD identified $SNDL on May 28 at 2:51 pm as rising in interest with solid sentiment. If one had bought at the day’s close at $0.97 and sold at the stock’s peak of $1.29 on June 3, they would have made $0.32 per share (32.9%). Even if they had sold late, at the next day’s low of $1.09, they would have made $0.12 per share (12%).
HERALD identified $CLOV for the second time on June 4 at 2:57 pm. Had one bought $CLOV at the day’s close ($9.00 per share), they would have made over 146%, or $13.15 per share, had they sold at $CLOV’s peak of $22.15 on June 8. Even if they had held far too long and sold two days later at $15.03, they would have made 67%, or $6.03 per share.
These are all examples of extremely profitable predictions by HERALD, but it is also important to consider the stocks HERALD misjudged.
HERALD identified $UWMC as a ticker of interest on May 18 at 10:11 pm. UWM stock rose from $8.14 at the open on May 19 to $9.17 on May 27 and $10.22 on June 11, a gain of $2.08 per share (25.5%) at its peak. However, HERALD abruptly stopped identifying interest in $UWMC on May 24. This may have been a glitch with Reddit’s API (as many other tickers abruptly left the list), but it likely would have led the trader to sell the stock on May 24 or May 25 at prices ranging between $8.13 and $8.43. Had the trader bought stock, they would have been fine, but had they sold to close call options, they would have lost money.
HERALD identified $RIDE as a ticker of interest on May 24 at 1:22 pm. Unusually, however, interest steadily declined from there. Had the trader bought $RIDE stock at the May 24 closing price of $9.67 and sold at any point before the stock dropped off HERALD’s radar completely on May 27 at 10:22 pm, they would have sold at a price between $7.88 and $10.63. The result would have ranged from a loss of $1.79 per share (-18.5%) to a gain of $0.96 per share (9.9%).
Had one made all of these trades and put $1,000 into each, for a total of $7,000 invested, they would have gained between $1,523.80 and $3,918.65, a return of between 22% and 56% over a 6-week period. This evaluation does not incorporate reinvestment of gains. As such, it is best understood as showing that HERALD averaged a gain of between 22% and 56% per trade, with a midpoint of 39% per trade.
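The upper bound of that range can be checked with a few lines of Python. This is only a sketch of the arithmetic, not the attached spreadsheet: it uses the best-case buy and sell prices quoted in the Performance section, and the $SPCE sell price is derived here as the $26.89 buy price plus the $10.60 best-case gain.

```python
STAKE = 1000.0  # dollars put into each trade

# (buy price, best-case sell price) per share, taken from the text above.
BEST_CASE_TRADES = {
    "SPCE": (26.89, 37.49),  # sell price derived: 26.89 + 10.60
    "BB": (9.70, 15.88),
    "AMC": (31.89, 62.55),
    "SNDL": (0.97, 1.29),
    "CLOV": (9.00, 22.15),
    "UWMC": (8.14, 8.43),
    "RIDE": (9.67, 10.63),
}


def dollar_gain(stake, buy, sell):
    """Gain in dollars from putting `stake` into a stock at `buy`, exiting at `sell`."""
    return stake * (sell - buy) / buy


gains = {
    ticker: dollar_gain(STAKE, buy, sell)
    for ticker, (buy, sell) in BEST_CASE_TRADES.items()
}
total_gain = sum(gains.values())
total_invested = STAKE * len(BEST_CASE_TRADES)
percent_return = 100.0 * total_gain / total_invested

print(f"Total gain: ${total_gain:,.2f} on ${total_invested:,.0f} ({percent_return:.0f}%)")
```

With these inputs the total comes to about $3,918.65 on $7,000, or roughly 56%. The lower bound would be computed the same way from the worst-case sell prices.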
Alternatively, if one reinvests the gains after each trade in chronological order, starting with capital of $1000, the projected overall return is between 235% for a total gain of $2,349.77, and 584% for a total gain of $5,841.91 over a 6-week period from just $1000 invested. For a visual, please see figure one below.
To put this in perspective, the S&P 500 returned just over 1.8% from the time HERALD launched to the week of June 13.
These calculations are available for perusal in the attached spreadsheet, with page 1 covering the first evaluation method and page 2 the second.
Obviously, hypothetical trading data is far less compelling than actual results. To that end, on June 18, 2021, I liquidated $633 from my previous holdings to begin monetizing the algorithm.
Please email or text with any questions, comments, or proposals. If interested in algorithm updates, please request to follow @BEQUANTAnalysis on Twitter. This private account serves as a virtual notebook for ideas for continued development as well as a record of version updates.