Paper Breakdown: Three Machine Learning Methods for Cryptocurrency Trading Strategies

The following is a summary of the paper, “Application of Three Different Machine Learning Methods on Strategy Creation for Profitable Trades on Cryptocurrency Markets”, authored by Mohsen Asgari and Seyed Hossein Khasteh.  The full text can be viewed on arXiv.

Introduction:

I have long been interested in machine learning and its wide variety of applications.  When I reflect on how my understanding of the field has evolved over time, I find that I’ve become increasingly skeptical about any source that uses the traditional “artificial intelligence” and “machine learning” buzzwords.  I often find myself asking “Why would you use machine learning for that?” or “How could that possibly work?” – partly out of frustration that the field has become so overhyped, but mostly out of jealousy that I didn’t think of the approach first.

When it comes to analysis of cryptocurrency markets, an accurate machine learning trading solution has the potential to make a lot of money for its creators.  When I came across the subject of this article – the paper “Application of Three Different Machine Learning Methods on Strategy Creation for Profitable Trades on Cryptocurrency Markets” – I approached the material with my typical amount of healthy skepticism.  However, despite the aspects of this paper that leave me skeptical, there is an abundant amount of positive information that can be discerned from the authors’ work.

Quick Look:

Let’s start with a “too long; didn’t read” on the model used.  Then, I will write up some thoughts for any who are interested.  The high-level details of the paper are as follows:

  • Data:
    • The input space includes Open, High, Low, Close, and Volume (OHLCV) data for the cryptocurrency pairs ETH-USDT, LTC-BTC, and ZEC-BTC. The data is also augmented with financial technical indicators, including Commodity Channel Index (CCI), Relative Strength Index (RSI), Directional Movement Index (DMI), Moving Average Convergence Divergence (MACD), and Bollinger Bands.
    • A single input from the input space includes 19 parameters across 60 timesteps (current timestep as well as the 59 prior timesteps). Each flattened feature vector contains 1140 parameters total.
    • The output (label) at each timestep indicates whether, in hindsight, it would have been beneficial to enter the market, judged by the asset price four hours later. A fee of 0.15% is assumed. If the price of an asset has increased by 0.15% or more after 4 hours, the label for that timestep is “1”; otherwise the label is “0”.
  • Models:
    • The three methods used are k-Nearest Neighbors (kNN), Random Forest, and eXtreme Gradient Boosting (XGBoost).
    • Scikit-learn is used to implement kNN and Random Forest. XGBoost is used for the Gradient Boosting method.  Both packages trained the models in under 5 minutes.
    • The problem of focus is Market Direction Prediction, meaning that the “main goal of the methods described in this article are to determine if the price of the [analyzed] cryptocurrencies will move [higher (less transaction fees)] or lower in the coming four hours.”
    • (Model hyperparameters are listed at the end of this post for interested readers)
  • Implementation / Results:
    • For each model, decisions are made based on the last 60 minutes of trading data. After 4 hours, the decision is evaluated and profits/losses are calculated.  Then, another decision is made and the process repeats.
      • It is indicated that on transition from a “0” to a “1” prediction the simulation “enters the market” and on a transition from a “1” to a “0” prediction, the model “exits the market”.
      • It is indicated that in this simulation, entering and exiting the market is equivalent to buying and selling 1 unit of the asset in question (as denominated in the base asset of the pair).
      • Net profits are calculated to be accumulated base asset funds in addition to the value of 1 unit of the traded asset (i.e. market growth or loss experienced if the asset was bought and never sold).
    • Models are evaluated on traditional accuracy of prediction as well as profit achieved under simulation.
      • 9 models are evaluated in total (3 machine learning approaches for each of the 3 pairs of assets).
      • Testing Accuracy ranges from 0.467312 to 0.585956, with 8 of the 9 models performing above 0.5.
      • Percent of Profitable Trades ranges from 48.42% to 57.14% with 6 of the 9 models performing above 50%.
      • Profit Factor was above 1 for all models
      • Net Profit was positive for all models (i.e. profit on top of market gains)
    • The authors conclude that this model is a great starting point, but much more work is needed. The following ideas are specifically mentioned:
      • The current model is simple and has a high degree of investment risk
      • The model could include additional fundamentals of financial analysis
      • The model could ensemble additional machine learning approaches
      • Social network data streams can be leveraged as a market indicator
      • Deep Reinforcement Learning could be leveraged to develop a more sophisticated trading strategy
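
The labeling rule and the flattened input described under “Data” can be sketched as follows.  This is only an illustration of the summary above, not the authors’ code: the function names are hypothetical, and the look-ahead of 240 bars assumes one-minute bars (60 timesteps covering the last 60 minutes), which the summary does not state outright.

```python
FEE = 0.0015        # 0.15% fee threshold from the summary
HORIZON_BARS = 240  # ASSUMPTION: one-minute bars, so 4 hours = 240 bars
WINDOW = 60         # timesteps per sample (current bar plus 59 prior bars)
N_FEATURES = 19     # parameters per timestep


def make_labels(close, horizon=HORIZON_BARS, fee=FEE):
    """Label 1 if the close `horizon` bars later is at least `fee` higher, else 0.

    The last `horizon` bars get no label because their future price is unknown.
    """
    labels = []
    for t in range(len(close) - horizon):
        future_return = close[t + horizon] / close[t] - 1.0
        labels.append(1 if future_return >= fee else 0)
    return labels


def flatten_window(rows, t, window=WINDOW):
    """Concatenate the feature rows for timesteps t-window+1 .. t into one vector."""
    if t < window - 1:
        raise ValueError("not enough history for a full window")
    vector = []
    for row in rows[t - window + 1 : t + 1]:
        vector.extend(row)
    return vector


# Toy data; a horizon of 4 bars is used here only to keep the example short.
close = [100.0, 100.0, 100.0, 100.0, 100.2, 100.1, 99.0, 100.0]
labels = make_labels(close, horizon=4)

rows = [[float(t)] * N_FEATURES for t in range(WINDOW)]
vector = flatten_window(rows, WINDOW - 1)
print(labels, len(vector))
```

With 19 parameters over 60 timesteps, the flattened vector has the 1140 entries mentioned above.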

Discussion:

The authors certainly pose an interesting model – not just for trading cryptocurrencies, but also for trading assets in general.  One of the more fascinating ideas exhibited in this article is that of using different cryptocurrency pairs when evaluating model performance.  This is something specific to currency markets that could not be accomplished on a stock exchange, since currencies can be exchanged with one another directly, without necessarily passing through an intermediate asset.  (It should be noted that exchange methods vary depending on the exchange service.  It is also true that stocks could be denominated against other stocks, and trades could be executed between the stocks almost instantaneously even with fiat currency as an intermediate.)

This idea is interesting, however, because many models are evaluated using a pair of cryptocurrencies that includes a U.S. dollar stablecoin.  This results in datasets that exhibit similar trends, as cryptocurrency values tend to increase and decrease against the dollar in similar patterns.  This paper exhibited an interesting opportunity in terms of trading Bitcoin against a base currency of Zcash, which follows a very different trend than that of most dollar-cryptocurrency pairs.  What is most interesting about the results from the ZEC-BTC simulation is that all models appear to have performed similarly, whereas in the other simulations, performance varied more significantly between the different machine learning approaches.  This is likely related to the specific dataset rather than being a property of the ZEC-BTC pair; however, it is interesting to note.

Probably the most interesting aspect of this paper is the overwhelmingly positive results.  I am – as I am with every machine learning paper – skeptical of this outcome.  I am unaware of any attempts to reproduce these results since the paper was published in May 2021. 

The dataset used encompasses cryptocurrency prices from July 2017 through April 2021.  I am curious if the extremely positive performance is directly related to the time period used for the “test” data.  It is specifically mentioned that the train/test split for these simulations is 95%/5%.  As a result, the “test” portion of the simulation is conducted from approximately (judging from graphs included in the paper) March 16th, 2021 to April 30th, 2021.  It should be noted that this time period also coincides with the most recent cryptocurrency bull run, the peak of which occurred on approximately May 11th, 2021 – only 3 days before this paper was published. 
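
For readers unfamiliar with how a split like this works on time series, here is a minimal sketch.  It assumes the split is chronological (no shuffling), which is what the test-period dates above suggest; the function name is hypothetical and not taken from the paper.

```python
def chronological_split(samples, train_fraction=0.95):
    """Split time-ordered samples without shuffling.

    The earliest `train_fraction` of the data trains the model and the
    most recent remainder is held out for testing.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = round(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


# Toy example: 1000 time-ordered samples.
samples = list(range(1000))
train, test = chronological_split(samples)
print(len(train), len(test))
```

Because the held-out block is always the most recent slice, whatever market regime happened to occur in that slice drives the reported test results.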

It appears that the authors have not published any additional work to arXiv; however, I am extremely interested in seeing them test this model against more recent datasets.  On the one hand, the positive results in this work are emphasized by stating that all “Net Profits” are calculated as additional gains on top of the value gained by the market.  As such, it is arguable that the positive results are not directly attributable to the positive performance of the market at the time of the “test” period.  On the other hand, I think that it could be argued that, since the market was in an overwhelmingly positive up-trend at the time of testing, the profitability of the model could be overly inflated (especially since the models are boasting only approximately 47%-59% prediction accuracy across the board).

My only other criticism is that the model is overly simplified for a trading model, another fact that the authors specifically acknowledged.  The trading model used in this paper only makes decisions at specified 4 hour intervals, creating the potential to miss opportunities that could be exploited from price movement within the 4 hour gaps.  This is especially naive in the case of consecutive “buy” signals, in which it appears that no trading would occur and that the model would only be holding its market position (if I am interpreting the paper correctly).  There is also no variability in the amount traded.  Including a factor of magnitude for each trade would drastically increase model complexity; however, I believe that this notion is necessary for a trading model to be robust.  Finally, the 0.15% fee does not seem especially conservative given my limited experience with cryptocurrency markets.  That said, this fee is grounded in reality: it was established based on the Binance platform, which I have not used.
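
The entry and exit rule discussed above can be sketched roughly as follows.  This is one reading of the paper’s simulation, not the authors’ code: the function names are hypothetical, the fee is assumed to apply to both the buy and the sell, and a position still open at the end of the data is simply ignored.

```python
FEE = 0.0015  # 0.15% per trade, as assumed in the paper


def simulate(prices, signals, fee=FEE):
    """Return per-trade profits for a one-unit, transition-based strategy.

    Enter on a 0 -> 1 prediction, exit on a 1 -> 0 prediction.
    Consecutive identical predictions cause no trade (the position is held).
    ASSUMPTION: the simulation starts out of the market.
    """
    trades = []
    entry_cost = None  # cost of the open position, or None when flat
    for price, signal in zip(prices, signals):
        if signal == 1 and entry_cost is None:
            entry_cost = price * (1 + fee)                # buy 1 unit
        elif signal == 0 and entry_cost is not None:
            trades.append(price * (1 - fee) - entry_cost)  # sell 1 unit
            entry_cost = None
    return trades


def profit_factor(trades):
    """Gross profit divided by gross loss (infinite if there are no losses)."""
    gross_profit = sum(p for p in trades if p > 0)
    gross_loss = -sum(p for p in trades if p < 0)
    return float("inf") if gross_loss == 0 else gross_profit / gross_loss


def win_rate(trades):
    """Fraction of closed trades that made money."""
    return sum(1 for p in trades if p > 0) / len(trades) if trades else 0.0


# Toy example: one price and one prediction per decision point.
prices = [100.0, 102.0, 105.0, 104.0, 103.0, 101.0]
signals = [1, 1, 0, 1, 0, 0]
trades = simulate(prices, signals)
print(trades, profit_factor(trades), win_rate(trades))
```

Note how the second “1” in a row produces no trade at all, which is the behavior questioned above, and how a model can win only half of its trades while still showing a profit factor above 1.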

In summary, the models presented in this paper are very encouraging.  Despite model simplicity, the results boast significant profits over the market when simulated on a “test” time period of less than two months.  Though I am skeptical, the positive performance and drastic room for model growth have captured my interest.  I think this article is a fantastic starting point for anybody looking to get into the cryptocurrency prediction space.

Appendix - Model Hyperparameters:

Hyperparameters for each model are listed here for interested readers:

  • kNN:
    • Number of Neighbors: k=5 for ETH-USDT; k=20 for LTC-BTC; k=100 for ZEC-BTC
    • Weight function = Distance
    • Algorithm = Auto (scikit-learn chooses among BallTree, KDTree, and brute-force search)
    • Leaf Size = 30
    • Distance Metric for Tree = Minkowski (Power Parameter: 2)
  • Random Forest:
    • Number of Trees: n=700 for ETH-USDT; n=700 for ZEC-BTC; n=1000 for LTC-BTC
    • Split Quality Function = gini
    • Tree Depth = expand until all leaves are pure
    • Samples to Split a Node = 2
  • Gradient Boosting:
    • Booster = gbtree
    • Learning Rate = 0.3
    • Loss Reduction for a Partition = 0
    • Tree Depth = 6
    • L2 Regularization Factor = 1
    • L1 Regularization Factor = 0
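
For anyone who wants to experiment, the settings above map onto scikit-learn and XGBoost keyword arguments roughly as shown below.  The helper names and the dictionary layout are my own; the mapping from the list to parameter names is my reading, not something taken from the authors’ code.

```python
# Per-pair values from the list above.
KNN_NEIGHBORS = {"ETH-USDT": 5, "LTC-BTC": 20, "ZEC-BTC": 100}
RF_TREES = {"ETH-USDT": 700, "ZEC-BTC": 700, "LTC-BTC": 1000}


def knn_params(pair):
    """Keyword arguments for sklearn.neighbors.KNeighborsClassifier."""
    return {
        "n_neighbors": KNN_NEIGHBORS[pair],
        "weights": "distance",
        "algorithm": "auto",
        "leaf_size": 30,
        "metric": "minkowski",
        "p": 2,  # power parameter 2 = Euclidean distance
    }


def random_forest_params(pair):
    """Keyword arguments for sklearn.ensemble.RandomForestClassifier."""
    return {
        "n_estimators": RF_TREES[pair],
        "criterion": "gini",
        "max_depth": None,       # expand until all leaves are pure
        "min_samples_split": 2,
    }


def xgboost_params():
    """Keyword arguments for xgboost.XGBClassifier."""
    return {
        "booster": "gbtree",
        "learning_rate": 0.3,
        "gamma": 0,       # loss reduction required for a partition
        "max_depth": 6,
        "reg_lambda": 1,  # L2 regularization
        "reg_alpha": 0,   # L1 regularization
    }


# Usage, assuming scikit-learn and xgboost are installed:
#   from sklearn.neighbors import KNeighborsClassifier
#   model = KNeighborsClassifier(**knn_params("ETH-USDT"))
eth_knn = knn_params("ETH-USDT")
ltc_rf = random_forest_params("LTC-BTC")
xgb = xgboost_params()
```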
