Post 51 - The Return

01 Dec 2023

Task 7, Day 01, `Machine learning` Chatbot, tell me, if you’re really safe

Just like MST3K on Turkey Day, it’s time for my annual return to THM.

Prompt injection is like social engineering an AI chatbot.

The root of the issue often lies in the training datasets. A bot trained on corporate data might inadvertantly leak sensitive information.

NLP involves predicting the next possible word based on preceding context. It analyses the given data for relationships and tries to guess what should be next. Since retraining is rarely feasible, security measures are needed.

Prompt-Assisted Measures

Program information types that should have a hard no.

Information can still be requested to add together and tweek until sensitive information is revealed. For instance, enough questions may reveal where the programmed boundary of not telling is.

AI-Assisted Measures

Using nested AI with the first line being trained on malicious inputs. If it doesn’t detect malicious input it passes the the request to the main chatbot. The more attacks it sees, the better it gets at detecting malicious inputs.

As usual, combining both measures increases security, but doesn’t make it fool proof.

Careful prodding and prompting may be able to bypass the security AI.

Flags gotten, that’s a wrap on day one. Now, will this still update and publish or did I mess it up?…

Post 51 - The Return

Task 7, Day 01, Machine learning Chatbot, tell me, if you’re really safe

Prompt-Assisted Measures

AI-Assisted Measures

Task 7, Day 01, `Machine learning` Chatbot, tell me, if you’re really safe