

Opinion by: Michael O’Rourke, founder of Pocket
Network and CEO of Grove
Open data is a major contributor to the emerging global tech
economy, with an estimated market of over $350 billion. Open data
sources often rely on centralized infrastructure, however, which
runs contrary to the philosophy of autonomy and censorship
resistance.
To realize its potential, open data must shift to decentralized
infrastructure. Once open data channels start using a decentralized
and open infrastructure, multiple vulnerabilities for user
applications will be solved.
Open infrastructure has many use cases, from hosting a
decentralized application (DApp) or a trading bot to sharing
research data to training and inference of large language models
(LLMs). Looking closely into each helps us better understand why
leveraging decentralized infrastructure for open data is more
utilitarian than centralized infrastructure.
Affordable LLM training and inference
The launch of the
open-source AI DeepSeek, which wiped out $1 trillion from the
US tech markets, demonstrates the power of open-source protocols.
It’s a wake-up call to focus on the new world economy of open
data.
To begin with, closed-source, centralized AI models are expensive
to train and expensive to run for accurate results. By contrast,
the final stage of training DeepSeek R1 cost just about $5.5
million, compared to over $100 million for OpenAI’s GPT-4. Yet the
emerging AI industry still relies on centralized infrastructure
platforms like LLM API providers, which are essentially at odds
with emerging open-source innovations.
Hosting open-source LLMs like Llama 2 and DeepSeek R1 is simple
and inexpensive. Unlike stateful blockchains requiring constant
syncing, LLMs are stateless and only need periodic
updates.
Despite that simplicity, running inference on open-source models
is computationally expensive because node runners need GPUs. The
models still save costs elsewhere, as they don’t require
continuous real-time syncing.
The rise of generalizable base models like GPT-4 has enabled the
development of new products through contextual inference.
Centralized companies like OpenAI, however, won’t allow just any
network to host or run inference on their trained models.
On the contrary, decentralized node runners can support the
development of open-source LLMs by serving as AI endpoints to
provide deterministic data to clients. Decentralized networks lower
entry barriers by empowering operators to launch their gateway on
top of the network.
These decentralized infrastructure protocols serve millions of
requests on their permissionless networks by open-sourcing the core
gateway and service infrastructure. Consequently, any entrepreneur
or operator can deploy their gateway and tap into an emerging
market.
For example, someone can train an LLM with decentralized
computing resources on the permissionless protocol Akash, which
enables customized computing services at 85% lower prices than
centralized cloud providers.
The AI training and inference market has immense potential. AI
companies spend approximately $1 million daily on infrastructure
maintenance to run LLM inference. This takes the serviceable
addressable market, or SAM, to roughly $365 million annually.
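The annual figure follows directly from the daily one. As a rough sketch, the short Python snippet below reproduces it and then applies the 85% discount quoted for Akash to the whole amount. That second step is purely illustrative: it assumes the discount would hold uniformly across all inference spending, which is a hypothetical scenario and not a claim made here. The variable and function names are likewise just placeholders.

```python
# Figures taken from the text: ~$1 million per day on inference
# infrastructure, and a quoted 85% price reduction on Akash.
DAILY_INFRA_SPEND_USD = 1_000_000
DAYS_PER_YEAR = 365
QUOTED_DISCOUNT_PERCENT = 85  # assumption: applied uniformly, for illustration only


def annualize(daily_usd: int, days: int = DAYS_PER_YEAR) -> int:
    """Scale a daily dollar amount to a yearly one."""
    return daily_usd * days


def apply_discount(amount_usd: int, discount_percent: int) -> int:
    """Return the amount left after a percentage discount (integer math)."""
    return amount_usd * (100 - discount_percent) // 100


annual_sam_usd = annualize(DAILY_INFRA_SPEND_USD)
discounted_usd = apply_discount(annual_sam_usd, QUOTED_DISCOUNT_PERCENT)

print(f"Annual spend: ${annual_sam_usd:,}")
print(f"Same spend at an {QUOTED_DISCOUNT_PERCENT}% discount: ${discounted_usd:,}")
```

Under those assumptions, the same workload would cost about $54.75 million a year instead of $365 million.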
As the data suggests, the market conditions indicate a massive
growth potential for decentralized infrastructure.
Accessible research data sharing
In the scientific and research domain, data sharing combined
with machine learning and LLMs can potentially accelerate research
and improve human lives. Access to that data has been walled in by
the high-cost journal system, which selectively publishes the
research its boards approve of and keeps it behind expensive
subscriptions.
With the rise of blockchain-based zero-knowledge ML models, data
can now be shared and computed trustlessly, and privacy can be
preserved without revealing sensitive data. Thus, researchers and
scientists can share and access research data without
de-anonymizing potentially restricted personally identifiable
information.
To sustainably share open research data, researchers need access
to a decentralized infrastructure that rewards them for access to
that data, cutting out the middleman. An incentivized open data
network can ensure that scientific data remains accessible outside
the walled garden of expensive journals and private
corporations.
Unstoppable DApp hosting
Centralized data hosting platforms such as Amazon Web Services,
Google Cloud and Microsoft Azure are popular among app developers.
Despite their easy accessibility, centralized platforms suffer from
a single point of failure, affecting reliability and leading to
rare but plausible outages.
There are various instances in tech history when
Infrastructure-as-a-Service platforms have failed to provide
uninterrupted services.
For example, in 2022, MetaMask temporarily denied access to
users from specific geographical regions because Infura blocked
them following US sanctions. Although MetaMask is decentralized,
its default connections and endpoints depend on centralized tech
like Infura to access Ethereum.
This wasn’t an isolated incident, either. Infura clients also
faced an interruption in 2020, while Solana and Polygon experienced
overloading of centralized remote procedure call (RPC) endpoints
during peak traffic.
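One way to picture the alternative is a client that does not depend on any single RPC endpoint. The Python sketch below is a minimal illustration of that idea, not the method of any particular wallet or network: it sends a standard JSON-RPC 2.0 request to a list of endpoints in order and returns the first successful result. The endpoint URLs are hypothetical placeholders, and the function names are invented for this example.

```python
import json
import urllib.request
from typing import Callable, Sequence

# Hypothetical placeholder URLs; substitute real, independent providers.
ENDPOINTS = [
    "https://rpc-primary.example/v1",
    "https://rpc-backup.example/v1",
]


def http_post(url: str, payload: dict, timeout: float = 5.0) -> dict:
    """Send one JSON-RPC request over HTTP and return the decoded reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def call_with_failover(
    method: str,
    params: list,
    endpoints: Sequence[str],
    send: Callable[[str, dict], dict] = http_post,
):
    """Try each endpoint in order and return the first successful result."""
    payload = {"jsonrpc": "2.0", "id": 1, "method": method, "params": params}
    errors = []
    for url in endpoints:
        try:
            reply = send(url, payload)
        except (OSError, ValueError) as exc:  # network failure or bad JSON
            errors.append(f"{url}: {exc}")
            continue
        if "result" in reply:
            return reply["result"]
        errors.append(f"{url}: {reply.get('error')}")
    raise RuntimeError("all endpoints failed: " + "; ".join(errors))
```

A call such as `call_with_failover("eth_blockNumber", [], ENDPOINTS)` would fall through to the backup if the primary is unreachable. A decentralized gateway network generalizes the same pattern: the list of endpoints is a permissionless set of node runners instead of two URLs chosen by one company.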
It is difficult for one company to handle diverse developer
needs in a thriving open-source ecosystem. There are thousands of
layer 1s, rollups, indexing, storage and other middleware protocols
with niche use cases.
Most centralized platforms, like RPC providers, keep building
the same infrastructure, which creates friction, slows growth
metrics, and affects scalability because protocols focus on
rebuilding the foundation instead of adding new features.
By contrast, the massive success of decentralized social
networking projects like Bluesky and its underlying AT Protocol
signals users’ appetite for decentralized protocols. Moving past
centralized RPCs into accessing open data, such protocols remind
us of the need to build and work on decentralized infrastructure.
For example, a decentralized finance protocol can source onchain
price data from Chainlink to stop depending on centralized APIs for
price feeds and real-time market data.
There are roughly 100 billion serviceable RPC requests in the
Web3 market, costing $3–$6 per million requests. Thus, the total
addressable market size of Web3 RPC is $100 million–$200 million
annually. With the steady growth of new data availability layers,
there can be over 1 trillion RPC requests daily.
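The market-size range can be checked with a few lines of arithmetic. The Python sketch below assumes the 100 billion figure is a daily request volume, which the text does not state outright but which is the reading that yields an annual total in the quoted range. The names are illustrative only.

```python
# Assumption: the ~100 billion serviceable requests are per day.
REQUESTS_PER_DAY = 100_000_000_000
PRICE_PER_MILLION_USD = (3, 6)  # low and high price per million requests
DAYS_PER_YEAR = 365


def annual_market_usd(requests_per_day: int, price_per_million_usd: int,
                      days: int = DAYS_PER_YEAR) -> int:
    """Yearly revenue if every request is billed at the given rate."""
    millions_of_requests = requests_per_day // 1_000_000
    return millions_of_requests * price_per_million_usd * days


low, high = (annual_market_usd(REQUESTS_PER_DAY, p) for p in PRICE_PER_MILLION_USD)
print(f"${low / 1e6:.1f}M to ${high / 1e6:.1f}M per year")
# prints "$109.5M to $219.0M per year"
```

That lands close to the $100 million–$200 million range. If volume did reach 1 trillion requests a day and prices stayed the same, which is a further assumption, the same function gives roughly $1.1 billion to $2.2 billion a year.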
It is imperative to pivot toward decentralized infrastructure to
stay in sync with open data transfers and tap into the open-source
data market.
Open data requires decentralized infrastructure
We’ll see generalized blockchain clients offloading storage and
networking to specialized middleware protocols in the long
term.
For example, Solana led the decentralization movement when it
first started to store its data on chains such as Arweave. No
wonder Solana and Phantom were once again the primary tools for
handling the massive TRUMP presidential memecoin traffic, a key
moment in financial and cultural history.
In the future, we’ll see more data flow through infrastructure
protocols, creating dependencies on middleware platforms. As
protocols become more modular and scalable, it’ll make space for
open-source, decentralized middleware to integrate at the protocol
level.
It is unfeasible to have centralized companies function as
intermediaries for light client headers.
Decentralized infrastructure is trustless, distributed,
cost-effective and censorship-resistant. As a result, decentralized
infrastructure will be the default choice for app developers and
companies alike, leading to a mutually beneficial growth
narrative.
This article is for
general information purposes and is not intended to be and should
not be taken as legal or investment advice. The views, thoughts,
and opinions expressed here are the author’s alone and do not
necessarily reflect or represent the views and opinions of
Cointelegraph.
The post Centralized data infrastructure violates Web3’s core of
decentralization appeared first on Cointelegraph.