- Coinbase (NASDAQ: COIN) recently tested artificial intelligence chatbot ChatGPT to determine the accuracy of its token security review.
- The tests found that ChatGPT was not fit for the task because it could not reliably identify high-risk assets.
American crypto exchange Coinbase recently tested ChatGPT, the artificial intelligence chatbot that has grown popular for its AI-powered solutions. Coinbase wanted to determine the accuracy of the chatbot's token security reviews. The firm did this by comparing ChatGPT's results with those of a blockchain security engineer.
ChatGPT seemed beneficial at improving productivity
According to a blog post by Coinbase, the exchange’s blockchain security team experimented with ChatGPT given the hype around its ability to detect security vulnerabilities. The security team’s job involves researching the most efficient and effective ways to review token contracts. The team also decides whether assets should be listed on the exchange.
The exchange’s blockchain security team ran several prompts on ChatGPT. It ultimately concluded that the AI chatbot was not ready to be integrated into its security review process. The team compared the risk scores that ChatGPT and a manual security review assigned to 20 smart contracts and found that the AI bot matched the manual review 12 times.
The blog post read:
“Of the 8 misses, 5 of them were cases where ChatGPT incorrectly labeled a high-risk asset as low risk, which is the worst-case failure: underestimating a risk score is far more detrimental than overestimating.”
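To put the reported figures in proportion, the arithmetic can be sketched in a few lines of Python. This is only an illustration built from the three numbers quoted in the article (20 contracts, 12 matches, 5 high-risk assets labeled low risk). The function name `summarize_review` and its parameter names are hypothetical and do not come from Coinbase's blog post.

```python
def summarize_review(total_contracts, matches, high_risk_labeled_low):
    """Derive simple rates from the counts reported in the article.

    All names here are illustrative assumptions, not Coinbase's own code.
    """
    misses = total_contracts - matches
    return {
        "misses": misses,
        # Share of contracts where ChatGPT matched the manual review
        "agreement_rate": matches / total_contracts,
        # Share of misses that were the worst-case failure
        "worst_case_share_of_misses": high_risk_labeled_low / misses,
        # Share of all contracts where a high-risk asset was labeled low risk
        "worst_case_share_of_total": high_risk_labeled_low / total_contracts,
    }


# Counts taken from the article: 20 contracts, 12 matches, 5 worst-case misses
summary = summarize_review(total_contracts=20, matches=12, high_risk_labeled_low=5)

print(f"Misses: {summary['misses']}")
print(f"Agreement rate: {summary['agreement_rate']:.1%}")
print(f"Worst-case share of misses: {summary['worst_case_share_of_misses']:.1%}")
print(f"Worst-case share of all contracts: {summary['worst_case_share_of_total']:.1%}")
```

On these numbers, ChatGPT agreed with the manual review on 60% of the contracts, and a quarter of all the contracts reviewed were high-risk assets that it rated as low risk.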
The blockchain security team concluded that ChatGPT was not capable of recognizing when it lacked context to perform a complete security analysis. This resulted in coverage gaps, where additional dependencies went unreviewed. The Coinbase team agreed that the AI tool could not be solely relied upon for performing a security review.
However, Coinbase acknowledged that ChatGPT showed promise for its ability to quickly assess smart contract risks. This was enough potential for them to continue investigating its use cases.