Sam Williams: Making Arweave the Best Place to Publish

Web3 is booming, and Arweave is becoming a popular infrastructure choice for developers. PermaDAO is a community where everyone can contribute to the Arweave ecosystem. It's a place to propose and tackle tasks related to Arweave, with the support and feedback of the entire community. Join PermaDAO and help shape Web3!

Author: Sam Williams @ Arweave Founder, Forward Research CEO

Reviewer: Xiaosong HU @ Contributor of PermaDAO

Hey everyone, welcome to Arweave Day. Thank you so much to the everVision team and the whole crew for putting this event together. I'm sure it'll be great. So today, I want to speak to you about making Arweave the best place to publish.

Before I do, a small disclaimer: I speak only for Forward Research here today. One of the strengths of the Arweave ecosystem is that it has many different leaders and institutions, all attempting to make the network successful. So what I'm talking about here is really just our view of the world at Forward Research, how we see things developing, and what we're focused on pursuing in order to get the permaweb adopted. But this is one view of the world. You'll hear from other people like Community Labs, for example, later today who have different views, and that's the strength of the ecosystem. It has a wide variety of views, opinions, and directions people are pursuing in order to get this technology adopted.

Alright, so why are we here? Well, we want to build the next Library of Alexandria, but this time in perpetuity. In the face of fire and flood and government damage, we want to create a store of knowledge and history that is so well replicated and so resilient that it replicates data for eons to come.

We got the idea for this essentially from Bitcoin. In Bitcoin's first block, Satoshi embedded a headline from The Times in the United Kingdom, and in doing so made a record of history that is provably unaltered and is now replicated in hundreds of thousands of different places on Earth. We want to do the same, except for large amounts of data, and we're well on track now. The network has been live for five years and has accrued, I think, 1.163 billion pieces of information so far, which is really incredible.

In order to do this, we had to build a new system of blockchain mining that rewards people for contributing hard drive space and replicating pieces of the network's data set. As well as this, we also had to create a new form of paying for it because, of course, if you have a permanent data storage system, you have to have some way to pay for the storage of data over time. The only way that we think this is possible is through a storage endowment structure.

So you put 200 years’ worth of storage cost into the network when you put your data inside it, and then as the cost of storage declines over time, that 200 years expands out. This makes the system extremely resilient and robust over long periods, but it does also mean that storage on Arweave costs money. Now we have two competing problems. We need the storage to cost a reasonable amount of money. If it didn't cost a reasonable amount of money, it wouldn't work. So that's one side, but we also need to onboard all of the world's knowledge and history.
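To make the "200 years expands out" idea concrete, here is a minimal sketch of the arithmetic. It is not the network's actual pricing code: the function name, the constant yearly decline rate, and the unit (the endowment is measured in multiples of the first year's storage cost) are all assumptions chosen for illustration.

```python
def years_funded(endowment_years: float, annual_decline: float, max_years: int = 10_000) -> int:
    """Count the whole years an endowment covers when storage cost falls every year.

    endowment_years: size of the endowment, in units of the first year's cost
                     (200 means "200 years at today's price").
    annual_decline:  hypothetical fraction by which the yearly cost drops, e.g. 0.01.
    max_years:       cap on the simulation, since the endowment may never run out.
    """
    if not 0.0 <= annual_decline < 1.0:
        raise ValueError("annual_decline must be in [0, 1)")
    remaining = endowment_years
    cost = 1.0  # cost of the current year, relative to year one
    years = 0
    while years < max_years and remaining >= cost:
        remaining -= cost
        cost *= 1.0 - annual_decline
        years += 1
    return years


if __name__ == "__main__":
    for rate in (0.0, 0.001, 0.01):
        print(f"decline {rate:.1%}: funded for {years_funded(200, rate)} years (capped at 10,000)")
```

With a constant decline rate r, the total cost over all future years is bounded by 1/r times the first year's cost, so under this simplified model any sustained decline above 0.5% per year means a 200-year endowment is never exhausted.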

Okay, so we have a bit of a juxtaposition. We think the solution is to incentivize primary publishing on Arweave directly. The idea is, if we can make it the place where people want to publish their data in the first instance, then we can essentially get the archive for free over time. If this is where people want to go to publish their information, they will pay to do so, and then over time, the archive will grow with current knowledge and culture, which will, of course, later become the history of the world.

So how can we possibly do this? Well, we've been approaching this in a number of different ways at Forward Research. I'm excited to tell you about a few of them now.

So first, this year, we introduced the idea of the Universal Data License (UDL). A Universal Data License is essentially a human contract that you can attach to your data on the Arweave network that allows you to say, 'Hey, anyone can openly use and remix this work as long as they pay a royalty fee of some kind,' or to choose from many different versions of monetization for content. The idea is that you can then create an open data lake of content that is remixable as long as you pay.

From a developer's perspective, this is extremely exciting because instead of having the cold start problem every time I build a new application, I can now build on top of a big open data lake, essentially, of licensable information that can fill my application. And that's a very appealing proposition when you compare it with its alternatives. From the user's perspective, when you upload to the system, you're essentially uploading to all of the apps on the web in one go, rather than just one application. And that, it stands to reason, is more effective because it means that you're going to have a larger volume of royalty streams: everyone that could potentially make use of your data is going to be able to access it, rather than just one company at a time, as is the way on the traditional web.

Now, there's a lot more to get into about UDL, but that's the high level and why we think it'll be so powerful in incentivizing people to upload data to the Permaweb.

Of course, on Arweave, everything is an atomic asset. So when you tag your data with the Universal Data License, the license becomes, shall we say, immutably and intrinsically associated with the data. So whenever you have an identifier for that data, even if you're just embedding an image in a web page, you cannot dissociate that identifier from its licensing information, from other metadata, or even from a token inside that dictates where royalty streams should go. So now, you have a piece of data that can flow between applications trivially, that has a license that tells you what the royalty information is, and a token inside that allows you to distribute those royalties to whoever has a financial interest in the asset in a trivial fashion. And this is how we think we're going to break down the resistance to uploading to the network: by making it just an amazing place to publish, the best place to publish.
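As a rough illustration of what "distribute those royalties to whoever has a financial interest in the asset" could look like, the sketch below splits a payment pro rata across token holders. It is not how any Arweave contract actually does it: the function name, the holder names, and the rounding rule are assumptions made for this example.

```python
def split_royalty(payment: int, balances: dict[str, int]) -> dict[str, int]:
    """Split `payment` (in indivisible units) across holders in proportion to balance.

    Each holder first receives the rounded-down pro-rata share. The few units
    lost to rounding are then handed out one at a time, largest fractional
    remainder first (ties broken by holder name), so the shares always sum
    to exactly `payment`.
    """
    total = sum(balances.values())
    if total <= 0:
        raise ValueError("the asset needs at least one token in circulation")
    shares = {holder: payment * bal // total for holder, bal in balances.items()}
    leftover = payment - sum(shares.values())
    by_remainder = sorted(balances, key=lambda h: (-(payment * balances[h] % total), h))
    for holder in by_remainder[:leftover]:
        shares[holder] += 1
    return shares


if __name__ == "__main__":
    # Hypothetical ownership of one atomic asset's fungible token.
    owners = {"alice": 60, "bob": 30, "carol": 10}
    print(split_royalty(1_000, owners))
```

Integer units are used throughout so that no value is lost or created by floating-point rounding.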

So using the UDL is actually extremely easy. You just need to add a few tags. This is information that's available on the Arweave Wiki. You can look it up if you just go to ArWiki and then look for the UDL part of the Wiki structure.
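For a sense of what "adding a few tags" might look like in practice, here is a small sketch that assembles a tag list for an upload. The talk does not list the tags, so the tag names, the values, and the placeholder license transaction ID below are illustrative assumptions only; the authoritative list is the UDL page on ArWiki.

```python
# Placeholder: the real value is the ID of the transaction that holds the UDL text.
UDL_LICENSE_TX_ID = "<udl-license-transaction-id>"


def build_udl_tags(license_tx_id: str, options: dict[str, str]) -> list[dict[str, str]]:
    """Return name/value tag pairs to attach to a data upload.

    The first tag points at the license itself; the rest carry the terms the
    uploader picked. Every tag name here is hypothetical.
    """
    if not license_tx_id:
        raise ValueError("a license transaction ID is required")
    tags = [{"name": "License", "value": license_tx_id}]
    for name, value in options.items():
        tags.append({"name": name, "value": str(value)})
    return tags


if __name__ == "__main__":
    tags = build_udl_tags(
        UDL_LICENSE_TX_ID,
        {
            "Commercial-Use": "Allowed",          # assumed tag name and value
            "Derivation": "Allowed-With-Credit",  # assumed tag name and value
            "License-Fee": "One-Time-5",          # assumed tag name and value
        },
    )
    for tag in tags:
        print(f'{tag["name"]}: {tag["value"]}')
```

The resulting list would be passed to whatever upload tool you use; how tags are attached depends on that tool and is not shown here.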

So, okay, we have all of these assets now that can be licensed essentially. You have this open lake of licensed data and you have royalty streams in those assets that are tradable. Well, you need somewhere to trade it. That's why we built the Universal Content Marketplace. So the UCM looks a little bit like a traditional NFT marketplace, but under the hood, it's very different. For starters, of course, it's a fully decentralized permaweb application. This runs entirely on the network. You can just swap out the gateway and get that same application back anywhere in the world with pluggable infrastructure, essentially.

But when you look at an asset, these aren't just normal NFTs. You can see in the bottom left-hand corner here that the user interface highlights for you the data rights that you are buying when you buy into the contract on a piece of data. And you can also see here in this pie chart of owners. We're not particularly focused on non-fungible tokens, one-of-ones. We're actually focused on fungible tokens that allow you to gain access to rights in the data. And so you can just buy access to the financial part of this, just like you were using Uniswap, essentially a decentralized exchange for these types of assets.

But just by the nature of using this interface, so you see this button here, this is a Stamp button that's kind of like a universal protocol for the permaweb that we've been incubating. It actually has a fair launch token, a zero founder reward token inside it distributed to the people whose content is stamped in a day. But when you stamp it, you're essentially laying a trail across the permaweb of what you find interesting, what other people find interesting. And then this, you can see here, is a user interface called. It's essentially like Reddit for the permaweb except instead of having normal likes, we have Stamps. And when you stamp something in the Bazaar, in that user interface that I just showed you, it shows up here. It affects this web application. So these two web applications are built on the same set of data and they can share all of the royalty information and ownership information of that data trivially. And by the way, you can pull back those applications on any number of gateways in the network.

One thing that we're seeing about the composable data ecosystem on top of Arweave, which is relatively novel, is this idea of related transactions. So you can see here one of the tools that the Bazaar gives you is what we call scope. Basically, it's a microscope for atomic assets. You can look at what the related pieces of data are and you can explore that graph. And that means that essentially, just as one example, every single thing on the permaweb has a comments page now. Here's a comment that someone made on top of someone else's comment. There's no limit to the levels of depth to it. Or you could add a Permafacts market or really anything you want on top. You can annotate all of the information in an open way and explore the graph.
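The related-transactions graph can be pictured as a simple parent/child index. The sketch below works on an in-memory list of records and assumes each related transaction names its parent in a field called `related_to`; that field name and the record shape are hypothetical, and no network queries are made.

```python
from collections import defaultdict
from typing import Iterator


def build_children_index(transactions: list[dict]) -> dict[str, list[str]]:
    """Map each transaction ID to the IDs of transactions that point at it."""
    children: dict[str, list[str]] = defaultdict(list)
    for tx in transactions:
        parent = tx.get("related_to")  # hypothetical field naming the parent
        if parent is not None:
            children[parent].append(tx["id"])
    return children


def walk_thread(root_id: str, children: dict[str, list[str]], depth: int = 0) -> Iterator[tuple[int, str]]:
    """Yield (depth, id) pairs depth-first, so comments on comments nest naturally."""
    yield depth, root_id
    for child_id in children.get(root_id, []):
        yield from walk_thread(child_id, children, depth + 1)


if __name__ == "__main__":
    txs = [
        {"id": "asset"},
        {"id": "comment-1", "related_to": "asset"},
        {"id": "reply-1a", "related_to": "comment-1"},
        {"id": "comment-2", "related_to": "asset"},
    ]
    index = build_children_index(txs)
    for depth, tx_id in walk_thread("asset", index):
        print("  " * depth + tx_id)
```

Because a comment is just another transaction that can itself be pointed at, the same walk handles any depth of nesting.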

People are already using this to build, for example, video applications. So imagine YouTube, but when you upload a video, it's actually an atomic asset, which is spread across all of the applications in the ecosystem but licensed such that you as the content creator get paid when people make use of it. Now that's where we are with the Atomic Asset ecosystem, which is growing at a real rate at the moment. It's very exciting, but I also want to tell you very, very briefly a little bit about what we've been building with Arweave 2.7.

Arweave 2.7

So Arweave is a complete, useful, and scalable protocol. It does just what it says on the tin now, and we care so much about making sure that the purpose of the network remains unchanged and is resilient that we've encoded this in what are called the principles of the Arweave network, which are immutable, kind of I guess, representations of our social contract as a group. What is it that we're trying to do here? And this is built in such a fashion that even we as the people who founded the network in the first place can't touch it. No one, including me, can ever suggest an edit to the network that breaks these principles and have it still be called Arweave. It would not be Arweave; it would be something else.

That said, Arweave 2.7, the latest version, still comes with some nice hard fork tweaks, which I'll tell you about now:

Data root Merkle Tree Rebasing - So this is a modification to the protocol that allows you to have sub-Merkle trees inside your data entries. This will allow us to create bundling services that don't have to see the data that they are bundling. This is useful for two reasons. The first is that it is way more performant. Instead of having to upload all of your data to the bundler, which potentially uploads all of that data with other people's data to another bundler and then eventually to the network, you can instead send a Merkle root of the data that you want to commit to the bundler, which then adds it to other Merkle roots and sends that to the network, and then you can seed your data directly to the network. That's helpful for a second reason, which is that it increases censorship resistance. So now the bundler doesn't see the data that is being uploaded to the network. This is very important, I think, long-term.
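The core idea, that a bundler can commit to many users' data by combining their Merkle roots without ever seeing the data, can be sketched with a plain SHA-256 binary Merkle tree. This is a simplified stand-in and not Arweave's actual data root format; the function names and the leaf/node prefixes are assumptions for illustration.

```python
import hashlib


def _hash(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def root_from_nodes(nodes: list[bytes]) -> bytes:
    """Fold a list of node hashes pairwise until a single root remains."""
    if not nodes:
        raise ValueError("at least one node is required")
    level = list(nodes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        level = [_hash(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def merkle_root(chunks: list[bytes]) -> bytes:
    """User side: hash each chunk into a leaf, then fold the leaves into one root."""
    return root_from_nodes([_hash(b"\x00" + chunk) for chunk in chunks])


if __name__ == "__main__":
    # Each user computes a root locally and sends only that root to the bundler.
    alice_root = merkle_root([b"alice chunk 1", b"alice chunk 2"])
    bob_root = merkle_root([b"bob chunk 1", b"bob chunk 2"])

    # Bundler side: combine opaque roots; no chunk data is needed.
    bundle_root = root_from_nodes([alice_root, bob_root])
    print(bundle_root.hex())
```

In this toy layout, the bundle root over two equally sized sub-trees is identical to the root of one tree built over all four chunks, which is what lets each user later prove their chunks against the bundle's commitment.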

Cooperative mining support in nodes - So this will actually come in 2.7.1. 2.7.0 will be the hard fork release, and 2.7.1 will have a few other extra features. The hard fork enables the feature that I just mentioned, which is cooperative mining in nodes. As you know, from 2.6 onwards Arweave incentivizes you to replicate full versions of the data set. You don't have to do that alone if you don't want to: because of this cooperative mining support, you can essentially work with friends or with other peers to create full replicas of the data set. And this is very important because it is a huge step towards one-hop routing to any piece of data. The biggest problem in decentralized data system design is routing. You'll know that there are other systems that call themselves data storage systems that say you can plug in a piece of data and pin it on a node, but you'll find that in practice, on average, it takes seven minutes to route to that piece of data if you are not accessing it on the node on which it is primarily stored. That is obviously unworkable; you can't have a web where it takes seven minutes to resolve a single piece of data. So with Arweave, what we've done is incentivize people to create full replicas of the data set themselves, working potentially with other peers, and then be able to route to any other piece of data from the data that they're storing. Essentially, it's incentivized routing, which is very, very powerful and will mean that ultimately each participant in the network will be at maximum one hop away from another participant, through a predefined route, that has the piece of data that you're looking for. That is great for finding data over time, at extremely large scales.
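A toy model helps show why a cooperating group that jointly holds a full replica gives one-hop lookups. The sketch below treats the data set as numbered partitions and each peer as a set of partition numbers; the partition count, peer names, and function names are hypothetical, and this is not the node's actual cooperative mining logic.

```python
def missing_partitions(total_partitions: int, peers: dict[str, set[int]]) -> set[int]:
    """Return the partitions that no peer in the group currently stores."""
    covered: set[int] = set()
    for held in peers.values():
        covered |= held
    return set(range(total_partitions)) - covered


def holders_of(partition: int, peers: dict[str, set[int]]) -> list[str]:
    """Return the peers (sorted by name) that can serve a partition directly."""
    return sorted(name for name, held in peers.items() if partition in held)


if __name__ == "__main__":
    group = {"alice": {0, 1}, "bob": {2}}
    print("missing:", missing_partitions(4, group))

    group["carol"] = {3}  # a third peer joins and fills the gap
    print("missing:", missing_partitions(4, group))
    print("partition 2 is one hop away at:", holders_of(2, group))
```

Once the missing set is empty, any member can answer "who has partition N?" from the group's own table, with no multi-hop search.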

Pay-to-Pack - So in the Arweave network, you have to pack the data that you are mining on first, which is a little bit of CPU work. Now, obviously, there are some people that just want to run their operations on a Raspberry Pi or something like this, or a very, very low-power machine. One of the things that this version 2.7.1 will allow you to do is to off-board the packing to someone else, to pay them using this thing called the Permaweb Payment Protocol (P3) to pack the data for you and stream it back to you. Just very helpful if you want to mine on really small machines.

Alright, thank you so much for your time, everyone. Have a great Arweave day. I'm excited for the rest of the talks. See you!
