Why stablecoin data is harder to understand than it seems.

Author: Sebastian Melendez; Source: Artemis; Translation: Shan Ouba, Golden Finance

Introduction

Stablecoins are currently the focus of the market, and there is significant news almost every day. Last week, Stripe announced it will acquire the wallet service company Privy, while PayPal announced it will natively mint PYUSD on Stellar. The news is coming so thick and fast that it is almost overwhelming. As more companies enter this field, the demand for tracking and acquiring stablecoin data is growing. In our conversations with clients, however, the same four questions keep coming up:

  1. What are stablecoins actually used for?
  2. Who is using stablecoins?
  3. What opportunities exist?
  4. In which countries or regions are stablecoins used?

My job at Artemis is to collect, organize, and summarize stablecoin data every day to answer these questions. Today, we are going to debunk some “seemingly simple” data myths and see how difficult these questions really are to answer.

Myth 1: Stablecoin data is open, transparent, and readily accessible to everyone

The cost of independently accessing on-chain data is astonishingly high, and so are the technical barriers. Although raw blockchain data has become more accessible over the past five years, many hurdles remain. Mainstream data service providers such as Dune, Flipside, Allium, and Goldsky each have their advantages, but none covers all key blockchains.

Actual Situation:

Nowadays, almost every company is launching its own blockchain, each with its own peculiar characteristics, making data analysis extremely complex.

If you want a comprehensive understanding of how your stablecoin is used, and want to discover potential opportunities, you need to be able to analyze all relevant chains together, not just the chain you are currently deployed on. As multi-chain strategies develop and analytical needs deepen, the data infrastructure becomes more complex as well.

Taking PYUSD as an example:

Once you have integrated LayerZero’s OFT cross-chain protocol, to truly see the whole picture, you must master:

  • Ethereum’s account and smart contract model
  • Solana’s account model
  • LayerZero’s cross-chain logic
  • as well as the structures of emerging chains like Berachain and Flow

Worse still, users may bridge tokens to even more platforms, which multiplies the data complexity.

The problem is not just the chains you are currently live on, but the continuous expansion of the entire ecosystem, with new chains emerging one after another. This leads to the second problem: architectural fragmentation.

The data architecture and format of each chain are different

Think back to the early 2000s, when sending someone a file did not mean they could open it. PowerPoint files would not open, videos lacked codecs, and every system operated independently, so nothing worked together seamlessly. Even elementary school students were tormented by these issues.

The current blockchain world is just as chaotic as it was back then.

The most active chains today (Solana, Tron, Ethereum, TON, Stellar, Aptos) have vastly different data architectures.

For example:

  • Solana: You need to understand the concepts of token accounts and owner accounts.
  • Ethereum: You need to understand smart contracts, EOAs, and the ERC-20 standard.
  • Aptos, Sui: These chains use an object-oriented model in which assets are programmable objects.
  • Stellar, TON: The architectures are different again, yet stablecoin usage on these chains is striking.

Understanding these on-chain activities means that you have to dismantle an increasingly complex web of technologies.
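To make the Solana versus Ethereum difference above concrete, here is a minimal sketch of normalizing transfers from both chains into one record type. The `Transfer` schema, the input dictionaries, and the `token_account_owners` lookup are all hypothetical and exist only for illustration; real decoded logs and instructions carry many more fields, and the 6-decimal setting is an assumption of the example.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Transfer:
    """Chain-agnostic transfer record (hypothetical schema for illustration)."""
    chain: str
    sender: str      # the wallet that controls the funds
    receiver: str
    amount: Decimal  # human-readable units, after applying token decimals


def normalize_evm(log: dict, decimals: int) -> Transfer:
    """ERC-20 Transfer events name the holder addresses directly."""
    return Transfer(
        chain="ethereum",
        sender=log["from"],
        receiver=log["to"],
        amount=Decimal(log["value"]) / (Decimal(10) ** decimals),
    )


def normalize_solana(ix: dict, owners: dict[str, str], decimals: int) -> Transfer:
    """Solana transfers move funds between token accounts, so each token
    account is mapped back to its owner wallet to make records comparable."""
    return Transfer(
        chain="solana",
        sender=owners[ix["source"]],
        receiver=owners[ix["destination"]],
        amount=Decimal(ix["amount"]) / (Decimal(10) ** decimals),
    )


# Made-up sample inputs; addresses and field names are placeholders.
evm_log = {"from": "0xAlice", "to": "0xBob", "value": 2_500_000}
sol_ix = {"source": "TokAcc1", "destination": "TokAcc2", "amount": 1_000_000}
token_account_owners = {"TokAcc1": "WalletA", "TokAcc2": "WalletB"}

transfers = [
    normalize_evm(evm_log, decimals=6),
    normalize_solana(sol_ix, token_account_owners, decimals=6),
]
```

The point of the sketch is that the Solana path needs an extra lookup table before its records mean the same thing as the Ethereum ones; every additional chain tends to add an adapter of this kind.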

Look at PYUSD again:

Previously, you only needed to understand the architecture of Ethereum, Solana, and LayerZero. But now that it has landed on Stellar, you also need to understand:

  • Stellar’s smart contract platform Soroban
  • Soroban’s virtual machine model
  • Transfer and balance management logic that is completely different from Ethereum’s

In other words, you have to become an expert in each chain just to access and parse the data, let alone extract insights from it.

Myth 2: As long as you obtain blockchain data, insights will naturally emerge

Many people think that once the data access problem is solved, user insights will follow easily. Suppose you have sorted out access and captured balance and transaction datasets across every chain: what have you actually obtained?

The answer is: a pile of noise.

On-chain addresses are merely strings of letters and numbers, and wallet balances are often inaccurate or misleading. Raw blockchain data does not equate to insights; it is just a messy pile of data that requires extremely complex cleaning and processing to become valuable.

The reality is: understanding what happens on-chain requires context and off-chain data

Even if you have gone to great lengths to collect on-chain data, you still cannot answer the key questions: Who is using your stablecoin? Where are they?

All you can say is: “My stablecoin has been used.” This is not actionable, and it does not help you understand user behavior, market penetration, or growth opportunities. To reach these insights, you must rely on off-chain context. And the real question is: what off-chain data do you need, and how do you obtain it?

  • Application and protocol labels: There is no single source of truth for labeling on-chain activity. Flipside, Dune, Open Label Initiative, block explorers, and Arkham all provide some information, but each has its own schema and limited coverage. To answer questions such as “What application does this address belong to?” or “What kinds of use cases are we seeing?”, you need to unify these fragmented label sources and manually label important wallet addresses. Otherwise you are left with raw transaction data, which says nothing about actual usage patterns.

  • Geographic location: This is the key issue, and perhaps the question I am asked most often: where are my users? We use timezone heuristics and other techniques to infer geographic distribution. More importantly, we work with data partners to obtain proprietary off-chain geographic data, which helps us identify the country each wallet most likely belongs to.
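As an illustration of what a timezone heuristic can look like, the sketch below guesses a wallet's UTC offset from the hours in which it is least active. This is not a description of Artemis's method: the assumption that the quietest 8-hour window is sleep centered on 03:00 local time is ours, and the function name and parameters are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone


def infer_utc_offset(timestamps: list[int], quiet_hours: int = 8,
                     assumed_quiet_midpoint: int = 3) -> int:
    """Guess a wallet's UTC offset from when it is least active.

    Assumption (ours): the quietest `quiet_hours`-long window of the day is
    sleep, centered on `assumed_quiet_midpoint` o'clock local time.
    """
    by_hour = Counter(
        datetime.fromtimestamp(ts, tz=timezone.utc).hour for ts in timestamps
    )

    def window_count(start: int) -> int:
        # Transactions in the window of `quiet_hours` hours starting at `start` UTC.
        return sum(by_hour[(start + i) % 24] for i in range(quiet_hours))

    quiet_start = min(range(24), key=window_count)
    quiet_mid_utc = (quiet_start + quiet_hours // 2) % 24
    # local = utc + offset, so offset = local - utc, wrapped into [-12, 11].
    return (assumed_quiet_midpoint - quiet_mid_utc + 12) % 24 - 12


base = int(datetime(2025, 1, 1, tzinfo=timezone.utc).timestamp())
# Synthetic wallet: active 07:00-22:59 local time at UTC+8, over five days.
sample = [
    base + day * 86400 + ((local_hour - 8) % 24) * 3600
    for day in range(5)
    for local_hour in range(7, 23)
]
offset = infer_utc_offset(sample)
```

An offset is only a coarse signal: many countries share one timezone, and bots or exchange wallets have no sleep pattern at all, which is one reason off-chain geographic data matters.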

The reality is that solving this labeling problem takes significant resources and industry connections. You need partnerships with major L1s and protocols to build a comprehensive labeling dataset. Most teams lack the bandwidth or connections to do this manually, which is why many analytics efforts stall after acquiring raw blockchain data. The context layer is where the real work begins.
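The label unification described above can be sketched as a priority merge. The source names and their ordering below are hypothetical placeholders, not real provider schemas; in practice each provider would need its own adapter before a merge like this is possible.

```python
# Hypothetical sources, highest priority first. Manual labels win because
# they were checked by hand.
SOURCE_PRIORITY = ["manual", "provider_a", "provider_b"]


def normalize_address(address: str) -> str:
    """Lowercase hex (0x...) addresses only; base58 addresses such as
    Solana's are case-sensitive and must be left untouched."""
    return address.lower() if address.startswith("0x") else address


def unify_labels(sources: dict[str, dict[str, str]]) -> dict[str, dict[str, str]]:
    """Merge per-source {address: label} maps, keeping the highest-priority
    label for each address and recording which source it came from."""
    unified: dict[str, dict[str, str]] = {}
    # Walk from lowest to highest priority so later writes overwrite earlier ones.
    for source in reversed(SOURCE_PRIORITY):
        for address, label in sources.get(source, {}).items():
            unified[normalize_address(address)] = {"label": label, "source": source}
    return unified


# Made-up labels for illustration.
labels = unify_labels({
    "provider_a": {"0xABC": "DEX"},
    "provider_b": {"0xabc": "Exchange", "0xdef": "Bridge"},
    "manual": {"0xDEF": "Payments app"},
})
```

Keeping the `source` field alongside each label makes conflicts auditable later, which matters when two providers disagree about what an address is.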

Myth 3: Blockchain data is intuitive and consistent

Blockchains are far more complex than they seem. While the industry has begun to standardize around specific design patterns for token transfers over the past few years, this has not always been the case. When bridging technology first became popular, there was no community standard for tracking cross-chain activity. This makes it hard to track balances and transfers accurately, especially for tokens that existed before these standards were introduced. You need to understand the specific history and characteristics of each chain to get accurate data.

Reality: The “database model” of blockchains keeps changing, so you must become a “blockchain historian” to obtain accurate data

We easily forget that these ecosystems are constantly changing. Take Solana as an example; its architecture (how the blockchain operates) and token program (the way tokens are created and transferred) have undergone significant upgrades.

  1. Architecture upgrade: When Solana first launched, the chain did not keep timestamps in long-term storage. This causes significant issues when calculating historical balances over time. Solana fixed this in 2020, but the damage was already done: how do you accurately reconstruct historical balances without timestamps?
  2. Token program upgrade: Last year, Solana launched the Token-2022 program to address fragmentation in the original design, but this means you need to understand the nuances of both the old and new token programs to accurately track fungible tokens.
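One possible way to work around missing timestamps is to estimate them from neighboring slots whose times are known. The sketch below uses plain linear interpolation; this is a technique we are swapping in for illustration, not a method the article describes, and it assumes block production was roughly uniform between the two known points, which is not guaranteed. The function name and the `anchors` input are hypothetical.

```python
from bisect import bisect_left


def estimate_block_time(slot: int, anchors: list[tuple[int, int]]) -> float:
    """Estimate a Unix timestamp for `slot` by linear interpolation between
    the nearest slots with known timestamps.

    `anchors` is a list of (slot, unix_time) pairs sorted by slot.
    """
    slots = [s for s, _ in anchors]
    i = bisect_left(slots, slot)
    if i < len(slots) and slots[i] == slot:
        return float(anchors[i][1])  # exact match, nothing to estimate
    if i == 0 or i == len(slots):
        raise ValueError("slot is outside the range of known timestamps")
    (s0, t0), (s1, t1) = anchors[i - 1], anchors[i]
    return t0 + (t1 - t0) * (slot - s0) / (s1 - s0)


# Made-up anchors: two slots whose timestamps are known.
known = [(1000, 1_600_000_000), (2000, 1_600_000_400)]
estimate = estimate_block_time(1500, known)
```

Any balance series built on estimated times inherits their error, so it helps to flag interpolated rows rather than mix them silently with recorded ones.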

People often say that a blockchain is an immutable, public, append-only database. While this is generally true now, it was not always the case in the early days. Optimism is a good example: it did not simply launch from a single genesis event. In fact, the chain was completely relaunched a few months later.

What is the result? There is no complete dataset on all token transfers on the original Optimism chain.

Why is this important? The missing data is crucial for understanding the current and historical activity of major stablecoins on OP Mainnet, including USDC, USDT, and DAI. Without it, you cannot assemble a complete dataset or calculate accurate wallet balances.
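To see why missing transfers break balance calculations, consider a minimal sketch that derives balances by replaying a transfer log. The addresses and amounts are made up, and the helper names are hypothetical; the only outside convention used is that ERC-20 mints appear as transfers from the zero address.

```python
from collections import defaultdict
from decimal import Decimal

ZERO = "0x0000000000000000000000000000000000000000"  # ERC-20 mint convention


def replay_balances(transfers):
    """Derive balances by applying (sender, receiver, amount) in order."""
    balances = defaultdict(Decimal)  # Decimal() defaults to 0
    for sender, receiver, amount in transfers:
        balances[sender] -= amount
        balances[receiver] += amount
    return dict(balances)


def find_gaps(balances, mint_address):
    """A negative balance on a normal address means some inbound
    transfer is missing from the log."""
    return sorted(a for a, b in balances.items() if b < 0 and a != mint_address)


full_history = [
    (ZERO, "alice", Decimal("100")),
    ("alice", "bob", Decimal("40")),
]
truncated_history = full_history[1:]  # the early mint is missing

complete = replay_balances(full_history)
broken = replay_balances(truncated_history)
gaps = find_gaps(broken, ZERO)
```

With the full log, alice ends at 60; with the first transfer missing, she ends at -40, which is impossible on-chain. Negative balances are a cheap signal that a dataset is incomplete, although a gap that leaves every balance non-negative would go unnoticed by this check.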

Building an accurate dataset requires becoming a blockchain historian. Understanding the subtle evolution of each chain and explaining all these historical differences takes years of effort.

Conclusion

Blockchain data poses unique challenges that simply do not exist in other industries. Even though it is nominally “open and transparent”, extracting meaningful insights requires obtaining off-chain data, integrating dozens of data service providers, reading context scattered across crypto Twitter and official documentation, and staffing a team of more than 10 engineers. Otherwise, you are like the blind men feeling the elephant, chasing a phantom market that changes at the speed of light.
