How to improve your NFT Trading strategy with Data Science — Case Study
NFTs are unique digital assets, often pieces of digital art, traded pseudonymously on several blockchains. In this article, we’ll explore how data analysis can improve the quality of your purchases, using a dataset developed by Astrid Network for Ocean Protocol and DataWhale.
What is an NFT?
Non-fungible tokens (NFTs) have emerged as a prominent player in the world of digital assets, capturing the imagination of artists, collectors, and investors alike. As a unique blend of technology and creativity, NFTs have transformed the way we perceive and interact with digital media, opening up new possibilities for creators and collectors.
The history of NFTs can be traced back to 2012, with the advent of colored coins on the Bitcoin blockchain. These coins were initially used to represent small quantities of digital assets such as artwork and domain names. However, it was not until the launch of the Ethereum blockchain in 2015 that NFTs truly began to flourish.
Ethereum’s smart contract capabilities allowed for the development of more sophisticated NFT protocols, such as ERC-721, which paved the way for the creation of unique digital assets. The popular CryptoKitties game in 2017 marked a pivotal moment in NFT history, as it showcased the potential of these tokens to represent digital collectibles. Since then, NFTs have rapidly evolved, expanding into various industries and use cases. Some notable examples include: Digital Art, Collectibles, Virtual Real Estate and Tokenized Physical Assets. Essentially, NFTs are being used to represent ownership of digital or real-world assets.
Data Science in NFT Trading
NFT trading, or the exchange of non-fungible tokens, has emerged as a revolutionary aspect of the digital asset ecosystem, empowering artists, creators, and collectors alike to buy, sell, and trade unique digital items. To navigate this space effectively, traders and collectors often rely on data-driven approaches, using several techniques to make informed decisions, predict trends, and optimize their trading strategies.
In short, data science has the potential to greatly enhance NFT trading by providing actionable insights, predicting trends, and personalizing the discovery process. By harnessing machine learning and advanced analytics, traders and collectors can navigate the complex and ever-evolving world of non-fungible tokens with greater confidence and success.
In this article, we will explore the work that the Astrid Network team has done on a dataset for Ocean Protocol in collaboration with DataWhale, one of the most important startups in the Ocean ecosystem. The software we developed allows users to identify patterns between a given trait and its profitability, or between a given trait and an NFT's worth. We do this using data science, matrices, correlation concepts, and specific packages. To keep things simple, we won’t share the original code.
After this overview, we’ll introduce some potential challenges and updates that can give even more useful insights and more valuable data.
Code Overview
DataWhale asked us to develop dynamic, scalable software capable of understanding fundamental patterns between NFT traits and their prices. This software must extract, store, process, and display data in a way that is useful and easy to understand for both advanced traders and NFT enthusiasts. The output of the software covers the following collections:
- BAYC
- MAYC
- Otherdeed
- Doodles
- Exosama
- Clonex
The analysis could be extended to any other collection. The choice of these six was entirely at the client's discretion.
Data Extraction
The first stage in developing the software is data extraction. With an OpenSea API key at our disposal, gathering the data was quite simple. For storage, we created a SQLite database with two tables: one to keep track of the items in each collection, the other to record the sales history of each collection. The first table contains the following fields:
- Token ID
- Image URL
- Traits List
- Rarity Score. The formula used is the one shared by Rarity.Tools
The second table contains the following fields:
- Token ID
- Sale Timestamp
- Token Exchanged Name
- Token Exchanged Amount
- Seller Address
- Buyer Address
- Transaction Hash
The script responsible for extracting and storing data is dynamic: it can gather data in near real time, at a frequency chosen by the customer. That frequency depends on the accuracy required for the analysis, the hosting plan purchased, and similar constraints.
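The two tables and the rarity score described above could be laid out roughly as follows. This is only a sketch: the table names (`items`, `sales`), the column names, and the choice to store traits as JSON are assumptions made for illustration, not the original schema. The rarity function follows the formula as publicly described by Rarity.Tools (a trait value scores total items divided by items with that value, and a token's score is the sum over its traits), ignoring refinements such as trait-count weighting.

```python
import json
import sqlite3
from collections import Counter

# Hypothetical schema mirroring the fields listed in the article.
SCHEMA = """
CREATE TABLE IF NOT EXISTS items (
    collection   TEXT NOT NULL,
    token_id     TEXT NOT NULL,
    image_url    TEXT,
    traits       TEXT NOT NULL,      -- JSON list of [trait_type, value] pairs
    rarity_score REAL,
    PRIMARY KEY (collection, token_id)
);
CREATE TABLE IF NOT EXISTS sales (
    collection     TEXT NOT NULL,
    token_id       TEXT NOT NULL,
    sale_timestamp INTEGER NOT NULL, -- Unix seconds
    payment_token  TEXT NOT NULL,    -- e.g. 'ETH', 'WETH', 'APE'
    payment_amount REAL NOT NULL,
    seller_address TEXT NOT NULL,
    buyer_address  TEXT NOT NULL,
    tx_hash        TEXT NOT NULL
);
"""


def rarity_scores(items):
    """items: {token_id: [(trait_type, value), ...]} -> {token_id: score}."""
    total = len(items)
    # How many tokens carry each (trait_type, value) pair.
    counts = Counter(trait for traits in items.values() for trait in traits)
    return {
        token_id: sum(total / counts[trait] for trait in traits)
        for token_id, traits in items.items()
    }


def store_items(conn, collection, items):
    """Insert or refresh the items of one collection, with their rarity score."""
    scores = rarity_scores(items)
    rows = [
        (collection, token_id, None, json.dumps(traits), scores[token_id])
        for token_id, traits in items.items()
    ]
    conn.executemany("INSERT OR REPLACE INTO items VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()


# Tiny made-up collection, stored in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
demo_items = {
    "1": [("Fur", "Gold"), ("Eyes", "Bored")],
    "2": [("Fur", "Brown"), ("Eyes", "Bored")],
    "3": [("Fur", "Brown"), ("Eyes", "Sleepy")],
    "4": [("Fur", "Brown"), ("Eyes", "Bored")],
}
store_items(conn, "demo", demo_items)
demo_scores = rarity_scores(demo_items)
```

In this toy collection, token 1 scores higher than token 2 because its gold fur appears only once among four items.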
Most Valuable Traits
The category of ‘Most Valuable Traits’ aims to highlight the correlation between the price of a given NFT and the traits of that collection. Basically, we are calculating how much on average a user is willing to spend to get an NFT with that trait.
Since each collection analyzed accepts several tokens as payment, the first step is to unify the transaction history into a single currency, in our case Ether (ETH). So, we constructed a function that takes a timestamp and a token as input and outputs the token's value in ETH at that specific datetime. By applying this function to each sale and filtering the data we are interested in, we obtain an N x 2 matrix, where N is the total number of sales, the first column is the Token ID, and the second is the sale price. This matrix allows us to build a very common chart in statistics: the probability distribution of sale prices. A probability distribution is a statistical function that describes the likelihood of each possible value that a random variable can take; in our case, the likelihood that an NFT from a specific collection is sold at price X. The chart, produced with dedicated plotting libraries, shows this distribution curve for each collection.
Based on the distribution curve, we identified three categories:
- Tier One. NFTs sold at an exceptionally high price, i.e., at or above the 90th percentile of the distribution curve.
- Tier Three. NFTs sold at an exceptionally low price, i.e., at or below the 10th percentile of the curve.
- Tier Two. All sales not included in Tier One or Tier Three.
To make the analysis more realistic, we restricted it to the most recent sales, within a timeframe in which volatility was lower.
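The currency unification and the tiering steps can be sketched as below. The `RATES` table is a hypothetical stand-in for whatever historical price source is used, its numbers are invented, and the function names are illustrative. The percentile cut-offs rely on `statistics.quantiles` from the standard library, and the `since` argument plays the role of the "most recent sales" filter.

```python
import bisect
import statistics

# Hypothetical historical rates: token -> sorted list of (unix_ts, price_in_eth).
RATES = {
    "ETH": [(0, 1.0)],
    "WETH": [(0, 1.0)],
    "APE": [(1_650_000_000, 0.0040), (1_660_000_000, 0.0035)],
}


def to_eth(timestamp, token, amount):
    """Convert `amount` of `token` to ETH using the latest rate at or before `timestamp`."""
    series = RATES[token]
    idx = bisect.bisect_right(series, (timestamp, float("inf"))) - 1
    if idx < 0:
        raise ValueError(f"no {token} rate known before {timestamp}")
    return amount * series[idx][1]


def price_matrix(sales, since=0):
    """sales: iterable of (token_id, ts, token, amount).

    Returns the N x 2 matrix as a list of (token_id, price_in_eth),
    keeping only sales at or after `since`.
    """
    return [
        (token_id, to_eth(ts, token, amount))
        for token_id, ts, token, amount in sales
        if ts >= since
    ]


def assign_tiers(matrix):
    """Split sales into three tiers at the 10th and 90th price percentiles."""
    prices = [price for _, price in matrix]
    deciles = statistics.quantiles(prices, n=10, method="inclusive")
    p10, p90 = deciles[0], deciles[-1]
    tiers = {1: [], 2: [], 3: []}
    for token_id, price in matrix:
        tier = 1 if price >= p90 else 3 if price <= p10 else 2
        tiers[tier].append((token_id, price))
    return tiers


# Eleven made-up sales priced 1 to 11 ETH.
demo_sales = [(str(i), 1_655_000_000, "ETH", float(i)) for i in range(1, 12)]
demo_matrix = price_matrix(demo_sales, since=1_650_000_000)
demo_tiers = assign_tiers(demo_matrix)
```

Note that at least two sales are needed for the percentile computation, and that ties at the cut-off values fall into Tier One or Tier Three under this particular convention.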
The last step was to iterate over the sales in each category and calculate the frequency of each trait within it. The output? A series of histograms in which we ranked the traits according to their frequency in Tier One.
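A minimal sketch of this frequency count is shown below. It assumes a tier is a list of (token ID, price) pairs and that traits are available as a hypothetical `traits_by_token` mapping; frequency here means the share of sales in the tier whose NFT carries the trait, which is one reasonable reading of the step described above.

```python
from collections import Counter


def trait_frequencies(tier_sales, traits_by_token):
    """Share of the sales in a tier whose NFT carries each trait."""
    counts = Counter()
    for token_id, _price in tier_sales:
        # set() guards against a trait listed twice on the same token.
        counts.update(set(traits_by_token[token_id]))
    n = len(tier_sales)
    return {trait: c / n for trait, c in counts.items()} if n else {}


def rank_traits(tier_sales, traits_by_token):
    """Traits sorted from most to least frequent (ties broken alphabetically)."""
    freqs = trait_frequencies(tier_sales, traits_by_token)
    return sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up traits and Tier One sales; token "1" was sold twice.
demo_traits = {
    "1": [("Fur", "Gold"), ("Eyes", "Laser")],
    "2": [("Fur", "Gold"), ("Eyes", "Bored")],
    "3": [("Fur", "Brown"), ("Eyes", "Bored")],
}
demo_tier_one = [("1", 90.0), ("2", 85.0), ("1", 95.0)]
demo_ranking = rank_traits(demo_tier_one, demo_traits)
```

The ranked list can be fed directly to any plotting library to draw the histogram.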
Most Profitable Traits
The second analysis seeks to highlight the correlation between a given trait and its profitability. Basically, we are answering the questions: how profitable, statistically speaking, was that trait? Can we rank the most profitable traits? To answer them, we analyzed the sales history of the collections listed above.
The first step in this analysis was similar to the previous one: we again converted all transactions to a single currency (ETH), but sorted the matrix by the datetime of the sale. We then identified the most profitable tokens.
By iterating over the transactions, it is possible to identify the trades made on a single NFT (each purchase is matched with the subsequent sale). By comparing the purchase price with the sale price, we computed the ROI of each trade and then calculated the weighted average per token. The output of this second step, therefore, was an average ROI associated with each token in the collection.
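The pairing of purchases and sales might look like the sketch below. Two assumptions are made here that the article does not spell out: a trade is only counted when the buyer of one sale is the seller of the next sale of the same token, and the average is weighted by the purchase price (the capital invested). The tuple layout and function names are illustrative.

```python
from collections import defaultdict


def trades_per_token(sales):
    """sales: iterable of (token_id, ts, price_eth, seller, buyer).

    Returns {token_id: [(buy_price, sell_price), ...]}, pairing each purchase
    with the next sale of the same token by the same wallet.
    """
    by_token = defaultdict(list)
    for sale in sorted(sales, key=lambda s: s[1]):  # chronological order
        by_token[sale[0]].append(sale)

    trades = defaultdict(list)
    for token_id, history in by_token.items():
        for prev, nxt in zip(history, history[1:]):
            # Skip pairs where the token changed hands off-market in between.
            if prev[4] == nxt[3] and prev[2] > 0:
                trades[token_id].append((prev[2], nxt[2]))
    return dict(trades)


def weighted_roi(trades):
    """Average ROI of (buy, sell) pairs, weighted by the purchase price."""
    invested = sum(buy for buy, _ in trades)
    return sum(sell - buy for buy, sell in trades) / invested


# Made-up sales: token "7" changes hands three times, "8" once,
# and "9" twice without a matching buyer/seller chain.
demo_history = [
    ("7", 1, 10.0, "A", "B"),
    ("7", 2, 15.0, "B", "C"),
    ("7", 3, 12.0, "C", "D"),
    ("8", 1, 5.0, "A", "B"),
    ("9", 1, 2.0, "X", "Y"),
    ("9", 2, 4.0, "Z", "W"),
]
demo_trades = trades_per_token(demo_history)
demo_roi_by_token = {tid: weighted_roi(t) for tid, t in demo_trades.items()}
```

With this weighting, the ROI of a token is simply total profit divided by total capital invested across its trades.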
The third and final step was to correlate the NFTs with their traits. We created an array in which the first column holds the name and category of the trait, and the second holds another array storing all the ROIs associated with that trait. To fill it, we iterated over each NFT and over each of its traits. When a trait was not yet in the array, a new row was generated and the corresponding ROI inserted. When the trait was already present, the ROI was appended to the array in the second column of its row. Ultimately, we obtained a histogram in which we rank the traits in a collection according to the average ROI associated with them.
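As a rough code illustration of this grouping step (not the original implementation or chart), a dictionary keyed by trait can play the role of the two-column array. The input mappings `roi_by_token` and `traits_by_token` are hypothetical names, and the numbers are invented.

```python
from collections import defaultdict
from statistics import mean


def roi_by_trait(roi_by_token, traits_by_token):
    """Collect, for every trait, the ROIs of all tokens that carry it."""
    rois = defaultdict(list)  # a new trait gets a fresh list on first sight
    for token_id, roi in roi_by_token.items():
        for trait in traits_by_token.get(token_id, []):
            rois[trait].append(roi)
    return dict(rois)


def rank_by_mean_roi(rois):
    """Traits sorted by the average ROI of the tokens carrying them, best first."""
    averaged = ((trait, mean(values)) for trait, values in rois.items())
    return sorted(averaged, key=lambda kv: kv[1], reverse=True)


# Made-up per-token ROIs and traits.
demo_roi = {"1": 0.5, "2": 0.1, "3": -0.2}
demo_token_traits = {
    "1": [("Fur", "Gold"), ("Eyes", "Laser")],
    "2": [("Fur", "Gold"), ("Eyes", "Bored")],
    "3": [("Fur", "Brown"), ("Eyes", "Bored")],
}
demo_trait_rois = roi_by_trait(demo_roi, demo_token_traits)
demo_trait_ranking = rank_by_mean_roi(demo_trait_rois)
```

Keeping the full list of ROIs per trait, rather than a running average, leaves room to plot spread or medians later.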
Potential Challenges
Understanding which characteristics are most valuable or profitable can be an excellent support in the decision-making process. However, this tool can be further improved in terms of performance, quality, and types of statistics calculated. We have identified some potential challenges:
- NFT Price Estimation. By using historical sales data, floor price history, global NFT market data, and some specific on-chain data, it is possible to estimate a fair price for an NFT through regression, a well-suited approach for this type of task. This information can be extremely valuable for arbitrage opportunities on NFT prices (i.e., estimating the intrinsic value of the NFT through the model and purchasing it at a lower price) or for any decentralized application (dApp) that needs to estimate the value of an NFT through an analytical method (such as platforms specialized in NFT fractionalization).
- Wash Trading Detector. The statistics generated by the software have enormous potential. However, their reliability could be compromised by wash trading, which distorts the statistics and artificially inflates the volume of NFT collections for various types of benefits. Applying an algorithm like Naive Bayes to estimate the probability that a sale is a wash trade could help filter out malicious transactions and increase the quality of the output.
- Building a User Interface. The extracted data and statistics can be very useful, but often their potential is underutilized due to poor User Experience. Creating a platform that displays and updates this data in real-time (using the Ocean marketplace as a secure gateway to access the data) would allow even users with less experience in Web3 and IT to access a valuable source of data.
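To make the price estimation idea more concrete, here is a deliberately small sketch of a regression-based estimate. It fits a one-variable least-squares line of sale price against rarity score; the choice of rarity as the only feature, the 10% margin, and all numbers are assumptions for illustration. A realistic model would use the richer inputs listed above.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def underpriced(listings, intercept, slope, margin=0.10):
    """listings: iterable of (token_id, rarity_score, ask_price_eth).

    Returns the ids whose asking price is more than `margin` below the estimate.
    """
    flagged = []
    for token_id, rarity, ask in listings:
        estimate = intercept + slope * rarity
        if ask < estimate * (1 - margin):
            flagged.append(token_id)
    return flagged


# Made-up historical sales: rarity score versus price in ETH.
demo_rarity = [100.0, 150.0, 200.0, 250.0]
demo_price = [10.0, 15.0, 20.0, 25.0]
demo_intercept, demo_slope = fit_line(demo_rarity, demo_price)

# Made-up current listings: "a" asks 15 ETH against an estimate of about 18.
demo_listings = [("a", 180.0, 15.0), ("b", 120.0, 11.9)]
demo_flagged = underpriced(demo_listings, demo_intercept, demo_slope)
```

The margin acts as a safety buffer, so that only listings clearly below the model's estimate are flagged.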
By addressing these challenges and expanding on the current tools and techniques, the process of making informed decisions in the NFT market can be greatly enhanced. Improved price estimation, detection of wash trading, and a user-friendly interface can contribute to a more transparent and accessible ecosystem for both experienced and novice traders, ultimately fostering growth and innovation in the NFT space.
Conclusions
In this article, we have analyzed the advantages of Data Science in NFT trading. We have examined a real case study and proposed improvements to make it more efficient and comprehensive. If you are interested in obtaining these analytics, you can purchase them on the Ocean Marketplace, or if you have a specific request of this kind, you can contact us. At Astrid, we build hybrid solutions that include Blockchain and Data Science.