IBM, MIT and Elliptic release world’s largest labeled dataset of bitcoin transactions

Blockchain forensics startup Elliptic has teamed up with researchers at MIT’s IBM-funded AI Lab to produce the world’s largest set of labeled bitcoin transaction data. The labeling highlights unique transaction characteristics, and can be used to identify illicit actors in the crypto space.

Elliptic's dataset comprises 200,000 bitcoin transactions, with a total value of $6 billion. It will be publicly available from today, and can now be used by open-source developers and other researchers to train machine learning algorithms to spot characteristics that are unique to illicit or legitimate transactions. By helping to keep crypto within the law, the dataset should boost the industry's legitimacy in the eyes of governments around the world.

“Based on our own research we have labeled those transactions made by illicit actors (dark marketplaces, ransomware operators, fraudsters) and those made by legitimate actors (regulated exchanges, merchants, wallet services, etc.),” Elliptic Co-Founder Tom Robinson told Decrypt in an email.

“When applied to new data [the software] can pick out any transactions that match these patterns,” he further explained.
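To make the idea concrete, here is a minimal sketch of learning from labeled transactions and then labeling a new one. It is not Elliptic's or the researchers' method: their paper experiments with graph convolutional networks, while this example swaps in a far simpler nearest-centroid classifier. The two features (input count, output count), the sample values and the function names are all hypothetical and invented for illustration.

```python
from statistics import mean

# Hypothetical toy features per transaction: (number of inputs, number of outputs).
# Values and labels are invented for illustration; they are NOT from the Elliptic dataset.
labeled = [
    ((1.0, 2.0), "licit"),
    ((2.0, 2.0), "licit"),
    ((9.0, 40.0), "illicit"),
    ((11.0, 36.0), "illicit"),
]


def fit_centroids(samples):
    """Average the feature vectors of each label to get one centroid per label."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


centroids = fit_centroids(labeled)
new_transaction = (10.0, 35.0)  # hypothetical unseen transaction
print(predict(centroids, new_transaction))
```

A real system would use many more features and, as the paper's title suggests, the structure of the transaction graph itself rather than two hand-picked numbers.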

The same techniques could be used on a range of cryptocurrencies and blockchain-based assets, from Ethereum to Libra, according to the company’s statement.

Elliptic aims to help its clients better identify illicit transactions, reducing compliance costs and driving criminal activity out of the industry. Billions of dollars are laundered through cryptocurrencies each year. But while such advancements in deep learning for graph- or network-structured data show great promise in identifying bad actors in complex money laundering schemes, they have also raised concerns among proponents of privacy.

It's about privacy, stupid

In response, the company has claimed that the data it receives from exchanges and financial service providers does not include any personally identifiable information about users, such as names, addresses or social security numbers. However, it can still be used to connect multiple transactions to the same customer ID—one of the main techniques used to prevent financial crime.
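Linking transactions through a shared customer ID can be pictured as a simple grouping step. The sketch below assumes each record carries only a transaction ID and a pseudonymous customer ID, with no names or addresses. The record layout, the sample IDs and the `group_by_customer` helper are hypothetical; the article does not describe how Elliptic actually stores or joins this data.

```python
from collections import defaultdict

# Hypothetical records: (transaction id, pseudonymous customer id).
# No personally identifiable information is present, only an opaque ID.
records = [
    ("tx-a", "cust-1"),
    ("tx-b", "cust-2"),
    ("tx-c", "cust-1"),
]


def group_by_customer(rows):
    """Collect transaction ids under the customer id they share."""
    groups = defaultdict(list)
    for tx_id, customer_id in rows:
        groups[customer_id].append(tx_id)
    return dict(groups)


linked = group_by_customer(records)
print(linked)
```

Even without names attached, the grouping shows which transactions belong together, which is the property privacy advocates are concerned about.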

To complement their dataset, the researchers have published a paper, “Anti-Money Laundering in Bitcoin: Experiments with Graph Convolutional Networks for Financial Forensics.” It will be presented at the Knowledge Discovery and Data Mining Conference on August 5, 2019.

Elliptic has previously worked with the FBI and DEA to investigate illicit blockchain activity, and recently highlighted the use of bitcoin as a method of fundraising by Palestinian militant group Hamas.

 
