Learning about and working with web3 data is challenging. This is true even if you already have experience working with data in other domains.
I know this because I have been facing this problem for years.
In this post, I share my experience working with web3 data. I also summarize the challenges you might face when building on it.
My Experience
My interest in web3 data began when I was doing my Ph.D. on financial technologies. I studied economics, theories of value, and currencies as an ethnographer. This was interesting, but I wanted to learn more. I needed data.
Web3 was a logical area to enter because blockchain transactions are public. I thought, “I will do web3 data science,” expecting to jump straight into analysis. I retrained myself and started to freelance. When I became confident in my skills, I turned to blockchain analytics.
This proved to be hard. I realized that blockchain data is complex. Even pre-processed data needed extensive preparation before I could use it [@novoszath2019]. I had many questions: “What data is available?”, “How do I access it?”, “How should I interpret it?”, “What is relevant and what is noise?”.
It was hard to find answers. Blockchain data was a niche topic discussed in only a few places. I did not know many people working in web3 and blockchain, let alone with web3 data.
I continued to learn. I looked for gigs requiring work with on-chain data. I volunteered on web3 projects. I joined the data & intel squad of Aragon DAO. I applied for blockchain data jobs to practice their take-home assignments.
In each case, I found that accessing and processing web3 data is a challenge in itself. I also realized that web3 provides opportunities on the data layer. I turned my focus to data engineering, joined a company to learn the skills, and continued to practice.
Web3 vs Web2 Data Engineering
When I compared my web3 data projects with my day-to-day work, I recognized something. The most time-consuming tasks were unlike the ones I faced in my ‘normal’ job. Sometimes I spent days or even weeks on things like the following:
- Decoding block and transaction data on blockchain explorers.
- Reading smart contracts and white papers of web3 platforms and protocols.
- Researching web3 data providers, interpreting their data, and testing their APIs.
- Hunting down information in Discord and Telegram channels on a case-by-case basis.
I had to solve these challenges without courses, tutorials, or communities. And I am still facing these issues every day.
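To give a flavor of the decoding work in the first item, here is a minimal Python sketch of reading the `input` field of an ERC-20 `transfer(address,uint256)` call by hand. The function name and the sample calldata are made up for illustration. Only the `0xa9059cbb` selector and the 32-byte word layout come from the ERC-20 and ABI conventions. Treat it as a sketch under those assumptions, not a general ABI decoder.

```python
# Function selector: first 4 bytes of keccak256("transfer(address,uint256)").
TRANSFER_SELECTOR = "a9059cbb"


def decode_erc20_transfer(input_hex: str) -> dict:
    """Decode the calldata of an ERC-20 transfer(address,uint256) call.

    Hypothetical helper for illustration; it handles only this one function.
    """
    data = input_hex[2:] if input_hex.lower().startswith("0x") else input_hex
    data = data.lower()

    # 4-byte selector (8 hex chars) plus two 32-byte words (64 hex chars each).
    if len(data) != 8 + 64 + 64:
        raise ValueError("expected a 4-byte selector and two 32-byte arguments")
    if data[:8] != TRANSFER_SELECTOR:
        raise ValueError("not an ERC-20 transfer call")

    recipient_word = data[8:72]
    amount_word = data[72:136]

    return {
        # An address is 20 bytes, left-padded with zeros to fill the 32-byte word.
        "to": "0x" + recipient_word[24:],
        # The amount is an unsigned integer in the token's smallest unit.
        "amount_raw": int(amount_word, 16),
    }


# Made-up calldata: send 1_500_000 base units to a dummy address.
sample_recipient = "11" * 20
sample_input = (
    "0x"
    + TRANSFER_SELECTOR
    + sample_recipient.rjust(64, "0")
    + format(1_500_000, "064x")
)

decoded = decode_erc20_transfer(sample_input)
print(decoded)
```

The decoded amount is in the token's smallest unit. Turning it into a human-readable number requires the token's `decimals` value, which is not part of the calldata and has to be looked up from the token contract.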
Challenges with Web3 Data
You will face many difficulties when working with web3 data. The following quote from Allium illustrates this well:
> Answering a simple question like “Who are the biggest Ethereum token holders over time?” requires an engineering team to run their own RPC nodes, ingest the full history of the blockchain, clean the data, transform the data and finally summon a wizard to cast a complex SQL query [@allium2025].
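Even the final step of that quote hides real work. As a rough illustration, here is a minimal Python sketch that replays already-decoded transfer records to find the largest holders at a given block. The record layout (`block`, `from`, `to`, `value`), the function names, and the toy data are my assumptions for illustration. The sketch also assumes that mints and burns show up as transfers from or to the zero address, which many but not all tokens follow, and it skips everything upstream (nodes, ingestion, cleaning).

```python
from collections import defaultdict

# Assumed convention: mints come from, and burns go to, the zero address.
ZERO_ADDRESS = "0x" + "00" * 20


def balances_at(transfers, block):
    """Replay transfers up to and including `block`; return balance per address."""
    balances = defaultdict(int)
    for t in sorted(transfers, key=lambda t: t["block"]):
        if t["block"] > block:
            break
        if t["from"] != ZERO_ADDRESS:
            balances[t["from"]] -= t["value"]
        if t["to"] != ZERO_ADDRESS:
            balances[t["to"]] += t["value"]
    return dict(balances)


def top_holders(transfers, block, n=3):
    """Return the n largest positive balances at `block`, biggest first."""
    balances = balances_at(transfers, block)
    positive = {addr: value for addr, value in balances.items() if value > 0}
    return sorted(positive.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Toy, hand-written transfer records (not real chain data).
transfers = [
    {"block": 1, "from": ZERO_ADDRESS, "to": "0xaaa", "value": 100},
    {"block": 2, "from": "0xaaa", "to": "0xbbb", "value": 30},
    {"block": 3, "from": "0xaaa", "to": "0xccc", "value": 60},
    {"block": 5, "from": "0xbbb", "to": "0xccc", "value": 10},
]

for block in (2, 5):
    print(block, top_holders(transfers, block))
```

Answering the question "over time" means repeating this replay for many blocks, and on a real token the input is the full event history rather than four rows, which is where the engineering effort described in the quote comes in.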
Let’s break it down further:
- The web3 data domain has a steep learning curve. You need to understand both blockchain and data engineering concepts. Each is a rabbit hole on its own.
- The web3 ecosystem is diverse. Different chains have different data structures. Protocols have different smart contracts and business logic. Each time you start a new project, you have to learn its ins and outs.
- Access remains difficult despite improvements in tooling. Running nodes is resource-intensive and requires serious skills. Third-party data providers are costly and take lots of time to compare.
- There is no established tooling, and there are no best practices to follow.
- The divide between data and blockchain communities means expertise rarely crosses over. Many people in web3 are also wary of data due to privacy concerns.
- Learning resources are scarce.
The last problem, the scarcity of learning resources, amplifies the rest. If you are trying to learn and build alone, you can end up as I did: spending weeks sifting through information collected piece by piece, and months finding out which tool or data provider to use, without ever knowing whether you are on the right track.
This is the reason I decided to create content and a community around web3 data.
Conclusion
I created the data3 community on Skool to find other people interested in web3 data. I hope that this will help us learn faster by sharing our experiences and failures.
It is a free community where I plan to share what I learn and the resources I find. Currently, I am developing a Web3 Data guide.
If you want to learn more about web3 data, check it out!
References
- @allium2025: Allium. (2025). Allium Raises $21.5M – Theory Ventures, Kleiner Perkins, Amplify Partners
- @novoszath2019: Novoszath, A. (2019). Getting started with Bitcoin data on Kaggle with Python and BigQuery