On-chain Secure Track Scanning: Data Persistence Implications for Arweave
Web3 is booming, and Arweave is becoming a popular infrastructure choice for developers. PermaDAO is a community where everyone can contribute to the Arweave ecosystem. It's a place to propose and tackle tasks related to Arweave, with the support and feedback of the entire community. Join PermaDAO and help shape Web3!
Translator: Spike @ Contributor of PermaDAO
Reviewer: Xiaosong HU @ Contributor of PermaDAO
The openness of blockchain data has fueled a boom in data analysis. Whether it is the active data mining done on Dune and Nansen or the ups and downs of the Memao Party, it is clear that the data itself has great economic value. How to keep that data secure will therefore become a focus of the industry. This is especially true now that the underlying public chains cannot carry the load and activity has moved to L2s or centrally hosted servers, where the data may not last long and the security it is supposed to represent comes into doubt. In the traditional development paradigm, data is stored only on a company's own servers or with cloud computing providers. In the on-chain world, everyone is responsible for data security, even the most ordinary users, because damaging the credibility of the data can cause a token's price to fall sharply while the vandals often profit. This mismatch between power and responsibility is the fundamental reason for frequent on-chain incidents. Arweave's data permanence therefore has a special meaning: it is equivalent to maintaining, for the whole industry, a server that never goes down and a snapshot that cannot be tampered with. For the on-chain security industry, whose security requirements are especially high, this significance has long been underestimated.
Crypto-asset Risk Classification
For every party involved in crypto-asset transactions, policy regulation is a force majeure. The trend observable at present is that regulators are strengthening their oversight capacity on the premise of legalization, so as to preserve financial innovation while reducing risk to the financial system.
The subjects directly involved in trading crypto assets can be divided into centralized institutions and on-chain assets. The latter is pure DeFi, but centralized institutions are the underlying anchor and the source of liquidity for the entire crypto world, and the status of their assets is not transparent.
Only on-chain activity can be fully reasoned about and monitored in real time. It can be divided into three stages: ex-ante defense, on-chain activity tracking, and post-event emergency handling, which also form the main sub-tracks of the security field.
Centralization: CEX and CeFi
On-chain asset security
Ex ante: audit security, code security, scams
On-chain: attacks, rug pulls, cross-chain bridges, MEV
Ex post: money laundering, privacy coins, fiat deposits and withdrawals
For CEX and CeFi, the security controls that can be introduced lie mostly in regulation, audits, and multi-signature wallets, with the aim of understanding as much as possible of what happens inside the black box. The main entities involved are regulators such as the SEC and CFTC, mainstream wallets such as Gnosis Safe, and the recently popular Merkle tree proofs of exchange assets. For on-chain asset security, ex-ante defense comes in two main types, code audit and active defense; this article uses SlowMist and BlockSec as examples. The handling of funds on-chain can be further subdivided into wallet security and interaction control. Wallets have gradually become the place where personal and institutional assets settle, making them the second-largest concentration of assets after CEXs. Interaction control relies on continuously accumulating data about dApps and addresses in order to judge their behavior and reduce risk; this article uses WalletGuard and Go+ Security as examples. Post-event tracing refers to the remediation that takes place after on-chain activity. In general it is not a sub-track of its own but a security measure that requires full cooperation among the project team, exchanges, and security firms.
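To make the Merkle tree proof of exchange assets mentioned above more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not any exchange's actual scheme: the "account:balance" leaf encoding, the use of SHA-256, and the rule of duplicating the last node on odd levels are all choices made here for brevity. Real proof-of-reserves designs often also carry balance sums in each node, which this sketch omits.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 digest, used for both leaves and internal nodes (assumed choice)."""
    return hashlib.sha256(data).digest()


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Return every level of the tree, from hashed leaves up to the root."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # odd level: pair the last node with itself
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Collect (sibling_hash, sibling_is_right) pairs from leaf to root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # the node was paired with itself
        proof.append((level[sibling], index % 2 == 0))
        index //= 2
    return proof


def verify(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from one leaf and its proof, then compare."""
    node = h(leaf)
    for sibling, sibling_is_right in proof:
        node = h(node + sibling) if sibling_is_right else h(sibling + node)
    return node == root


if __name__ == "__main__":
    # Hypothetical user balances; the encoding is an assumption of this sketch.
    leaves = [b"alice:10", b"bob:25", b"carol:7"]
    levels = build_levels(leaves)
    root = levels[-1][0]
    print(verify(b"bob:25", merkle_proof(levels, 1), root))
    print(verify(b"bob:26", merkle_proof(levels, 1), root))
```

The point of the structure is that a user only needs their own leaf, a short list of sibling hashes, and the published root to check that their balance was included, without seeing anyone else's data.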
In terms of product form, security products typically start from a single-point function as a breakthrough and then trend toward comprehensive products, gradually becoming integrated SaaS solutions.
Audit platform CertiK has also launched Skynet, an automated monitoring SaaS platform that runs 24/7 after a project goes on-chain to defend against security threats.
In addition to pre-deployment security audits, BlockSec also provides real-time security monitoring for projects after they go on-chain.
SlowMist not only provides audit services but also offers early-warning products and hacker tracking services. It is the most typical one-stop service platform in the industry.
In terms of target users, three main delivery methods have emerged: plug-ins (for browsers and wallets), C-end products, and B-end middleware or SDK integration. The industry consensus is that a security product is a "function" embedded in other products rather than something that faces the public directly.
MetaShield and WalletGuard both aim to integrate as browser or wallet plug-ins, keeping promotion and development costs as low as possible while taking advantage of traffic from existing mature products;
Exponential DeFi can run a security assessment of a wallet's portfolio, giving retail investors authoritative security reports without requiring them to understand the technology;
The B-end SDK or middleware approach is typified by Go+ Security, which provides reusable APIs that help applications avoid on-chain risks as far as possible and keep their products running normally.
In terms of business models, the industry is still at an early stage of construction. Code audit is relatively mature but its players are fragmented; on-chain activity monitoring relies on continuously building labels and databases, which is time-consuming and laborious; post-event tracking and on-chain insurance are still at a manual, embryonic stage.
Code auditing: automated verification is still under development, and formal verification cannot guarantee 100% safety.
On-chain activities: database construction is highly homogeneous, so neither meaningful C-end traffic nor B-end purchasing power has formed.
On-chain insurance: the TVL of the entire sector is currently less than 300 million US dollars, so there is still a long way to go.
The market size of the security track can be estimated from the amount lost and the financing raised by related projects. In the first half of 2022, the industry as a whole recorded $3.6 billion due to the loss of more than $2 billion.
From 2016 to 2021, the figures were 900 million, 1.2 billion, 2.1 billion, 5.5 billion, 4.3 billion, and 9.7 billion, which track the bull and bear cycles closely. From this, the overall market size can be estimated at about $5 billion to $10 billion. On the whole, it is a "relatively segmented, narrow market with high technical difficulty, serving mainly B-end customers." Judging by the funding of the better-known projects, BlockSec raised 50 million yuan (about 7 million US dollars) and CertiK raised a total of 148 million US dollars in 2022. Financing on this scale is comparable to that of GameFi and public chain projects, which could raise institutions' expected market size to more than 10 billion US dollars.
In general, though, the security track is not a direct traffic entry point but a functional service scenario. Its future is tightly bound to the overall crypto market, and its market value has to be inferred from that of the market as a whole.
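As a rough sanity check on the $5 billion to $10 billion estimate, the yearly figures quoted above can be averaged. The article does not say how its estimate was derived, so the following is only one plausible reading: it assumes the six figures map in order to the years 2016 through 2021 and are denominated in US dollars, and it compares the six-year mean with the mean of the three most recent years.

```python
# Yearly figures quoted in the text, in millions.
# Assumptions of this sketch: values are in USD and map in order to 2016..2021.
FIGURES_MUSD = {
    2016: 900,
    2017: 1200,
    2018: 2100,
    2019: 5500,
    2020: 4300,
    2021: 9700,
}


def mean_for(years: list[int]) -> float:
    """Arithmetic mean of the quoted figures for the given years, in millions."""
    values = [FIGURES_MUSD[y] for y in years]
    return sum(values) / len(values)


if __name__ == "__main__":
    all_years = sorted(FIGURES_MUSD)
    print("total:", sum(FIGURES_MUSD.values()))         # 23700
    print("mean 2016-2021:", mean_for(all_years))        # 3950.0
    print("mean 2019-2021:", mean_for(all_years[-3:]))   # 6500.0
```

On these assumptions the six-year mean is about $3.95 billion, while the mean of the last three years is $6.5 billion, which sits inside the quoted $5 billion to $10 billion range.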
Competitive Product Analysis
Focusing on the core of the security track, the on-chain address, we can compare how well each major competitor judges behavior from on-chain addresses. This can be read as a rating of each competitor's on-chain analysis capability, which depends on having enough label data accumulated over enough time.
Number of addresses: supports Ethereum, BSC, Polygon, Arbitrum, Avalanche, Heco, Fantom, OKC and other public blockchains, and covers 99,036 malicious token addresses and 6.5 million malicious addresses.
Product form: SaaS, providing a variety of API development kits
Main functions: token, NFT, and malicious address detection, plus smart contract, signature, and dApp security services
User-facing: B-end project teams, such as BitKeep and zkSync
Charging mode: free quota, with charges for special API calls
OKLink On-chain Skyeye 2.0
Number of addresses: It supports BTC, ETH, EVM compatible chains, OKC and other mainstream public chains, currently including 91,900 traced and analyzed addresses, 391 million address tags, and 2500+ tag types.
Product form: Web page and platform
Main functions: address analysis, transaction graph, NFT traceability, address health, and on-chain monitoring: exchange address, amount threshold, transfer, zero clearance, etc
User-facing: C-end and G-end (public security departments)
Charging mode: basic web features are free; the Pro version must be purchased
Number of addresses: supports more than 10 mainstream public chains such as BTC, ETH, EOS, XRP, and TRX, including more than 1,000 entities, 200 million address tags, and 90 million risk addresses.
Product form: Web page, system platform and API
Main functions: AML Risk Score, address label, transaction analysis, time analysis, visualization, address collection, monitoring and early warning, Investigations
For users: retail investors, whales, institutions, and government departments
Charging mode: basic package 99U/M, standard package 299U/M, commercial version customized on demand
Number of addresses: supports BTC, EVM chains such as ETH, Polygon, and BSC, and other mainstream public chains such as Solana and Cosmos, including 84 million entities, 1.4 million tag types, 640,000 tokens, 5 million address behavior tags, 11,000 protocols, 17 million risk address tags, 51 million contract analyses, and 31 million transaction and access records.
Product form: Web
Exploration mode: Wallet analysis, Token analysis, project exploration, NFT Browser;
Due-diligence mode: VC observation, whale tracking, entity discovery, entity tracking, high-risk entity identification;
Investigations mode: address clustering, capital flow, coin holding address analysis, early warning
For users: free and paid users, as well as commercial solutions
Number of addresses: supports 23 public chains, covering the vast majority of current public chains, with more than 1 million risk address records and 80 risk data types
Product form: one-stop system, SaaS
Forensics: Address and transaction tracking, supports graphical display
Know-Your-VASP: Virtual asset service provider information query
Transaction Monitoring: Comprehensive coverage of US government-prohibited addresses and entities.
For users: Financial institutions, Crypto enterprises, government departments
Charging model: customized on demand
Number of addresses: millions, mostly Ethereum chains.
Product form: Browser plug-in and B-end solution
Main functions: monitors the security level of addresses, protects users while they browse and interact with dApps, and supports DeFi and NFTs
User-facing: C-end users and B-end project teams
Charging mode: free for C-end, customized for B-end
Product form: API, SaaS
Malicious Transaction API: Helps project parties identify malicious addresses
Eagle RPC Router: Wallet RPC that provides an early warning service similar to bank fraud notifications
For users: B-end institutions and project parties, not for individuals
Charging mode: the current promotional price is 0.01 ETH, down from the original 7% of recovered assets. Subsequent commercial services are customized on demand.
Number of addresses: 207 million transactions scanned, 18,000 frauds prevented; supports Ethereum, Polygon, and Solana
Product form: API
Main functions: Fraud address detection and protection
For users: project parties, especially wallets
Charging mode: charges B-end institutions; services for individuals are under development
Number of addresses: not disclosed, currently relies on users to upload phishing sites to keep the list updated
Product form: Browser plug-in incubated by Buidler DAO
High-risk behavior monitoring
Black and white list check
Full information transparency
User-facing: C-end retail investors
Charging model: Free
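Several of the products above reduce, at their simplest, to a black and white list check against labeled addresses. The following Python sketch shows that idea only; it does not reflect any listed product's implementation. The `Verdict` names, the blacklist-first ordering, and the lowercase normalization (which suits EVM-style hex addresses but not case-sensitive formats such as Solana's) are all assumptions made for illustration.

```python
from enum import Enum


class Verdict(Enum):
    BLOCK = "block"      # address is on the blacklist
    ALLOW = "allow"      # address is on the whitelist
    UNKNOWN = "unknown"  # no label yet; a real product would score it further


def normalize(address: str) -> str:
    """Lowercase and trim; assumes EVM-style hex addresses."""
    return address.strip().lower()


def build_list(addresses: list[str]) -> frozenset[str]:
    """Normalize labeled addresses once so lookups are cheap set membership tests."""
    return frozenset(normalize(a) for a in addresses)


def check_address(address: str, blacklist: frozenset[str], whitelist: frozenset[str]) -> Verdict:
    """Blacklist wins over whitelist, so a conflicting label fails safe."""
    addr = normalize(address)
    if addr in blacklist:
        return Verdict.BLOCK
    if addr in whitelist:
        return Verdict.ALLOW
    return Verdict.UNKNOWN


if __name__ == "__main__":
    # Placeholder addresses, not real labels.
    blacklist = build_list(["0xBAD0000000000000000000000000000000000001"])
    whitelist = build_list(["0x600D000000000000000000000000000000000002"])
    print(check_address("0xbad0000000000000000000000000000000000001", blacklist, whitelist))
    print(check_address("0x0000000000000000000000000000000000000003", blacklist, whitelist))
```

The lookup itself is trivial; as the comparison above suggests, the competitive difference lies in how many labeled addresses a provider has accumulated and how current they are.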
The Decentralized Security Track
At this point, we can summarize the future trends of the security track in three points:
Cross-border. Not just DeFi, but also asset types such as NFTs, together with full cross-chain capabilities.
Entity attribution. Analysis starts from on-chain addresses and ends at the related entities in order to determine their behavior.
Synthesis. Address behavior analysis is combined with integrated security data tools to sell the ability to judge, not the data itself.
In fact, the core competitiveness of the security track is the ability to store and use data, yet its operations revolve around centralized service platforms or entities. Put another way, decentralized security is currently built on centralized recognition.
Consistent with the view we have long held, this reflects how early and weak decentralization still is. If this data could be migrated seamlessly to Arweave for storage, it would effectively break centralized institutions' monopoly on data and truly unleash the organizational capabilities of decentralized security.
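One way to picture what a tamper-evident snapshot of label data could look like is to serialize the labels deterministically and publish their hash alongside the stored copy. The Python sketch below covers only that fingerprinting step. The record fields (`address`, `label`), the one-record-per-address assumption, and the canonical JSON encoding are all hypothetical, and the upload to Arweave is deliberately left out because no particular client API is assumed here.

```python
import hashlib
import json


def canonical_bytes(labels: list[dict]) -> bytes:
    """Serialize label records deterministically: records sorted by address,
    keys sorted, compact separators. Assumes one record per address."""
    ordered = sorted(labels, key=lambda record: record["address"])
    return json.dumps(ordered, sort_keys=True, separators=(",", ":")).encode("utf-8")


def snapshot_digest(labels: list[dict]) -> str:
    """SHA-256 hex digest of the canonical snapshot."""
    return hashlib.sha256(canonical_bytes(labels)).hexdigest()


if __name__ == "__main__":
    # Hypothetical label records for illustration only.
    snapshot = [
        {"address": "0xbbb", "label": "phishing"},
        {"address": "0xaaa", "label": "exchange"},
    ]
    reordered = [
        {"label": "exchange", "address": "0xaaa"},
        {"label": "phishing", "address": "0xbbb"},
    ]
    tampered = [
        {"address": "0xbbb", "label": "exchange"},
        {"address": "0xaaa", "label": "exchange"},
    ]
    print(snapshot_digest(snapshot) == snapshot_digest(reordered))  # same content
    print(snapshot_digest(snapshot) == snapshot_digest(tampered))   # altered label
```

If a snapshot and its digest were stored permanently, anyone could later recompute the digest from the stored bytes and detect whether a label had been changed, without having to trust the institution that produced it.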
Dune can be said to have lowered the threshold for ordinary people to access on-chain data, provided they can understand and write SQL. Institutions such as SlowMist have shown through long practice that data can serve as the oil of security, provided it has gone through sustained labeling. The decentralization of data itself, however, is still a work in progress.