
Web3 X AI: Where Does Decentralization Come Into Play?

Main Takeaways:

  • The growing fascination with artificial intelligence (AI) and excitement for its potential synergy with Web3 is hard to ignore. Nevertheless, the current reality of this nascent integration reveals a disconnect between AI's infrastructure needs and the existing blockchain frameworks.

  • In this series, we’ll be exploring the relationship between AI and Web3, the challenges, opportunities and vertical applications in Web3.

  • This first part of the series dives into the developments of Web3 infrastructure for AI, the current challenges of computational requirements and opportunity areas.

Artificial Intelligence (AI) and blockchain are two of the most innovative technologies to capture the public imagination over the last decade. AI’s progress in Web2 has been unmistakable, as reflected in the accelerating pace of VC investment this year: Inflection AI raised $1.3 billion in June 2023 with backing from Microsoft and Nvidia, while OpenAI’s competitor Anthropic raised $1.25 billion from Amazon in September 2023.

However, Web3’s role at this intersection is still met with skepticism. Does Web3 play a part in the development of AI? If so, how and why do we need blockchain in AI? One narrative we’re seeing is that Web3 has the potential to revolutionize productive relationships, while AI has the power to transform productivity itself. Bringing these technologies together, however, is proving complex, revealing both challenges and opportunities in infrastructure requirements.

AI Infrastructure & The GPU Crunch

The main bottleneck we currently see in AI is the GPU crunch. Large language models (LLMs) such as OpenAI’s GPT-3.5 have unlocked the first killer app we see today, ChatGPT. It became the fastest application ever to reach 100M monthly active users, doing so in a matter of weeks, whereas YouTube and Facebook each took roughly four years. This has opened the floodgates for new applications leveraging LLMs, from image generators such as Midjourney and Stable Diffusion to PaLM 2 powering Google’s Bard, its APIs, MakerSuite and Workspace features.

Deep learning is a lengthy and computationally intensive process at massive scale: the more parameters an LLM has, the more GPU memory it requires. Each parameter in the model is stored in GPU memory, and the model must load these parameters during inference; once the model size exceeds the available GPU memory, the model simply cannot run. Leading players like OpenAI are also experiencing GPU shortages, resulting in difficulties deploying multi-modal models and longer-sequence-length variants (8k vs. 32k). With significant chip supply shortages, large-scale applications have reached the threshold of what’s possible with LLMs, leaving AI startups competing for GPU power to gain first-mover advantage.
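
As a back-of-the-envelope illustration of why parameter count dictates GPU requirements, the memory needed just to hold the weights can be sketched as follows. This is a minimal estimate assuming fp16 weights and a hypothetical 20% overhead factor; real requirements vary by framework, batch size and context length:

```python
def gpu_memory_gb(n_params: float, bytes_per_param: int = 2, overhead: float = 1.2) -> float:
    """Rough GPU memory needed to hold model weights for inference.

    Assumes fp16/bf16 weights (2 bytes each) plus ~20% overhead for
    activations and KV cache; an illustrative rule of thumb, not a spec.
    """
    return n_params * bytes_per_param * overhead / 1024**3

# A hypothetical 175B-parameter model in fp16:
print(round(gpu_memory_gb(175e9), 1))  # roughly 391 GB, far beyond a single GPU
```

A single high-end GPU typically carries 80 GB of memory, which is why models of this size must be sharded across many devices, or why inference stops working once that budget is exhausted.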

GPU Solutions: Centralized & Decentralized Approaches

In the near term, centralized solutions are expected to ease GPU constraints: Nvidia’s August 2023 release of TensorRT-LLM offers optimized inference and increased performance, and the Nvidia H200 is anticipated to launch in Q2 2024. In addition, traditional mining companies such as CoreWeave and Lambda Labs are pivoting toward GPU-focused cloud computing, with rental fees ranging from $2 to $2.25 per hour for Nvidia H100s. For mining itself, these companies use ASICs (application-specific integrated circuits), which offer significant advantages over general-purpose computers or GPUs thanks to algorithm-specific design and specialized hardware architectures that maximize hash power.

On the Web3 side, the idea of an Airbnb-style marketplace for GPUs has been a popular concept, and a couple of projects are attempting to build one. Blockchain incentives are ideal for bootstrapping networks, offering an effective mechanism to attract participants or entities with idle GPUs in a decentralized way. By contrast, getting access to GPUs typically involves signing long-term contracts with cloud providers, and applications may not utilize the GPUs for the entire contract period.
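
The incentive mechanism described above can be sketched as a toy token-reward ledger. The reward rate and provider names are purely illustrative, and a real network would verify the contributed work (for example with proofs of compute) before paying out:

```python
from collections import defaultdict

REWARD_PER_GPU_HOUR = 10  # hypothetical token emission rate

def distribute_rewards(contributions: dict) -> dict:
    """Credit tokens to providers proportional to idle GPU-hours supplied.

    `contributions` maps provider -> GPU-hours contributed this epoch.
    A production network would verify the work before crediting balances.
    """
    balances = defaultdict(float)
    for provider, hours in contributions.items():
        balances[provider] += hours * REWARD_PER_GPU_HOUR
    return dict(balances)

print(distribute_rewards({"alice": 8, "bob": 2.5}))
```

This is the bootstrapping loop in miniature: idle hardware earns tokens, and token rewards attract more idle hardware.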

Another approach, called Petals, involves splitting an LLM into several layers hosted on different servers, similar in concept to sharding. It was developed as part of the BigScience collaboration by engineers and researchers from Hugging Face, the University of Washington, and Yandex, among others. Any user can connect to the network in a decentralized way as a client and apply the model to their data.
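
The layer-splitting idea behind Petals can be illustrated with a toy pipeline in which each "server" hosts a slice of the model's layers. In the real system the layers are transformer blocks and activations travel over the network; here each layer is just a function, but the control flow is the same in spirit:

```python
# Toy illustration of Petals-style pipeline splitting: a client runs
# inference by passing activations through each server's layers in turn.

def make_server(layers):
    """A 'server' hosting a contiguous slice of the model's layers."""
    def forward(x):
        for layer in layers:
            x = layer(x)
        return x
    return forward

# A hypothetical 4-layer "model" split across two servers.
model_layers = [lambda x: x + 1, lambda x: x * 2, lambda x: x + 3, lambda x: x * 4]
server_a = make_server(model_layers[:2])  # hosts layers 0-1
server_b = make_server(model_layers[2:])  # hosts layers 2-3

def client_inference(x):
    # The client chains servers, producing the same result as the full model.
    return server_b(server_a(x))

print(client_inference(1))  # identical to running all 4 layers locally
```

Because no single server holds the whole model, participants with modest hardware can collectively serve a model none of them could host alone.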

Opportunities for AI X Web3 Infrastructure Applications

While there are still some drawbacks, Web3 infrastructure holds the potential to tackle the challenges posed by AI integration and presents opportunities for innovative solutions, as we will explore below.

Decentralized AI Computing Networks

Decentralized compute networks link individuals in need of computing resources with systems that have unused computational capacity. Because individuals and organizations can contribute their idle resources to the network without incurring additional expenses, the network can offer more cost-effective pricing than centralized providers.
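
A minimal sketch of such a matching step, assuming providers publish (name, idle GPUs, hourly price) listings, might greedily fill a job from the cheapest idle capacity first. All names and prices here are hypothetical:

```python
def match_job(providers, gpus_needed):
    """Greedily fill a job's GPU requirement from the cheapest providers.

    `providers` is a list of (name, idle_gpus, price_per_gpu_hour) tuples,
    a simplified stand-in for on-chain listings in a decentralized market.
    Returns (allocation, hourly_cost) or None if supply is insufficient.
    """
    allocation, cost, remaining = [], 0.0, gpus_needed
    for name, idle, price in sorted(providers, key=lambda p: p[2]):
        take = min(idle, remaining)
        if take:
            allocation.append((name, take))
            cost += take * price
            remaining -= take
        if remaining == 0:
            return allocation, cost
    return None  # not enough idle capacity on the network

providers = [("dc-1", 4, 2.25), ("hobbyist", 2, 1.50), ("dc-2", 8, 2.00)]
print(match_job(providers, 6))
```

A real network would also weigh latency and reliability, which is exactly where the communication-overhead drawback noted below comes in.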

There are possibilities in decentralized GPU rendering facilitated by blockchain-based peer-to-peer networks to scale AI-powered 3D content creation in Web3 gaming. However, a significant drawback for decentralized computing networks lies in the potential slowdown during machine learning training due to the communication overhead between diverse computing devices.

Decentralized AI Data

Training data serves as the initial dataset used to teach machine learning applications to recognize patterns or meet specific criteria. Testing or validation data, on the other hand, is employed to assess the accuracy of the model; a separate dataset is necessary for validation because the model is already familiar with the training data.
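
The split described above can be sketched in a few lines; holding out data the model never trains on is what makes the validation measurement honest:

```python
import random

def train_val_split(data, val_fraction=0.2, seed=0):
    """Shuffle a dataset and hold out a validation set.

    The held-out portion is never shown during training; measuring
    accuracy on data the model has already seen would overstate
    real-world performance.
    """
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]

train, val = train_val_split(list(range(10)))
print(len(train), len(val))  # 8 2
```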

There are ongoing efforts to create marketplaces for AI data sources and AI data labeling where blockchain serves as an incentive layer for large companies and institutions to improve efficiency. However, at its current early-stage development, these verticals face obstacles such as the need for human review and concerns surrounding blockchain-enabled data. 

For instance, there are special-purpose (SP) compute networks designed specifically for ML model training. SP compute networks are tailored to specific use cases, typically adopting an architecture that consolidates compute resources into a unified pool resembling a supercomputer, and they determine cost through a gas mechanism or a parameter controlled by the community.
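
A gas-style cost mechanism can be sketched as a simple product of work performed, a community-set gas parameter, and the current gas price. The function and parameter names here are illustrative, not any specific network's API:

```python
def job_cost(compute_units: int, gas_per_unit: int, gas_price: float) -> float:
    """Price a training job under a gas-like mechanism.

    `gas_per_unit` is a community-governed parameter converting compute
    work into gas; `gas_price` is the token price per unit of gas.
    Both names are hypothetical, for illustration only.
    """
    return compute_units * gas_per_unit * gas_price

# A hypothetical job consuming 1000 compute units:
print(job_cost(1000, 5, 0.02))
```

Governance over `gas_per_unit` is what lets the community, rather than a single operator, steer pricing on the network.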

Decentralized Prompts

While fully decentralizing LLMs presents challenges, projects are exploring ways to decentralize prompts by encouraging contributions of self-trained techniques. This approach incentivizes creators to generate content, providing economic incentive structures for more participants in the landscape. 

Early examples include AI-powered chatbot platforms with tokenized incentives for content creators and AI model creators to train chatbots, which can subsequently become tradable NFTs granting access to user-permissioned data for model training and fine-tuning. Meanwhile, decentralized prompt marketplaces aim to incentivize prompt creators by enabling ownership of their data and prompts, which can then be traded on the marketplace.

Zero-Knowledge Machine Learning (ZKML)

2023 has truly been the year in which LLMs have demonstrated their power. In order for blockchain projects to realize the full potential of AI, it is essential for these models to be run on-chain. However, the challenges of gas limits and computational costs still present complexities for AI integration. 

What if LLMs could run off-chain, with their outputs used to drive decisions and activities on-chain, all while generating proof that those outputs were produced by the specified ML model rather than by arbitrary values? This is essentially what ZKML is. With the anticipated launches of OpenAI’s GPT-5 and Meta’s Llama 3, LLMs are growing larger and more capable. Because a primary goal of ZK systems is to keep proofs small, ZK-proofs are a natural fit for verifying AI computations. For example, ZK-proofs could be applied to verify models in decentralized ML inference or training, where users contribute to training by submitting data to a public model on an on-chain network.
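
To make the off-chain/on-chain workflow concrete, here is a deliberately simplified sketch using a plain hash commitment. This is not a zero-knowledge proof (a real ZKML system proves the computation itself without revealing the model weights), but it shows the shape of the flow: run inference off-chain, post a small commitment on-chain, verify later:

```python
import hashlib
import json

def commitment(model_id: str, prompt: str, output: str) -> str:
    """Hash-commit to the (model, input, output) of an off-chain inference.

    NOT a zero-knowledge proof: a verifier here must see the full triple.
    It only illustrates the workflow of posting a small digest on-chain
    instead of the full computation.
    """
    payload = json.dumps({"model": model_id, "in": prompt, "out": output}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

# Off-chain: run the model, then post only this small digest on-chain.
digest = commitment("llm-v1", "2+2?", "4")

# Later: anyone holding the same triple can re-derive and check the digest.
assert commitment("llm-v1", "2+2?", "4") == digest
assert commitment("llm-v1", "2+2?", "5") != digest
print(len(digest))  # 64 hex characters, regardless of model size
```

The point ZKML adds on top of this sketch is succinct verification: the proof convinces the chain the model really produced the output, without revealing or re-running the model.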

We are currently in the nascent stages of what is computationally practical to verify using zero-knowledge proofs on-chain. However, advancements in algorithms are broadening the scope of what can be achieved. One use case being explored is Model Integrity, whereby ZK-proofs prove that the same ML algorithm is run identically on different users’ data, guarding against bias. Similarly, with the rise of algorithmically generated portraits and deepfakes, ZK-proofs could be applied in Proof of Personhood to verify a unique person without compromising an individual’s private information.

In conclusion, the integration of Web3 infrastructure and AI represents an exciting frontier of technological innovation, with tokenized incentives encouraging broader participation. While Web2 has witnessed significant advancements in AI, the intersection of Web3 and AI is still a subject of exploration.

As we move forward, the synergy between Web3 and AI holds great potential, promising to reshape the landscape of technology and the way we approach AI infrastructure. Stay tuned for the next part of the AI X Web3 series where we dive into AI use cases in Web3 gaming.

