AIKAtz - All Your Chats Are Belong To Us

Executive Summary

Lumia researchers have identified a new class of attacks targeting GenAI desktop and native applications, which enables local exploitation.
Our findings have been incorporated into MITRE’s ATLAS matrix as a case study in the November 2025 update, with several new techniques added based on our research.
The attack allows, through credential harvesting, for the attacker to access and manipulate a victim’s LLM conversations after infecting their machine (post-breach exploitation).

Introduction

Imagine you're a CISO at a Fortune 500 company. You receive a breach notification: one of your business analysts' laptops has been compromised. Digital forensics and investigation reveal minimal network activity, yet your business-critical documents have appeared on the dark web. How? They were simply extracted from the employee's conversations with ChatGPT…

Large Language Models (LLMs) and generative AI have become deeply intertwined with our daily lives. As these AI assistants become more capable, we increasingly entrust them with confidential data—data that would be invaluable to malicious actors. While the security community has invested significant effort in securing LLMs themselves, the client applications used to access these models represent an often-overlooked, but critical, juncture in the security chain.

In this blog post, we will demonstrate the simplicity of harvesting authentication tokens from desktop applications, which threat actors can then use to impersonate users. We will examine the exposed attack surface following this technique and discuss how this risk can be mitigated.

Our Target Lineup

ChatGPT, Claude, and Copilot walk into a bar. The bartender asks, 'The usual?' 'Yeah, two Electrons and one Edge, please.' While this is a bad joke (though you have to admit those would be cool cocktail names), it highlights a reality.. Beyond being Large Language Models (LLMs), all three AIs share another commonality: their desktop applications are web apps built on top of the Chromium web browser. Claude and ChatGPT are built on Electron, while Copilot uses Microsoft’s WebView, which is based on Microsoft Edge and, in turn, Chromium. Therefore, when discussing attacks against these desktop applications, instead of targeting each individually, we can focus on attack vectors against Chromium that apply to all of them.

Warning: nerd stuff ahead
Jump to impact

Chromium Internals

Chromium's implementation involves a main process that launches with the browser/application, subsequently spawning multiple worker processes. Each worker process is responsible for distinct functions, such as rendering, crash logging, networking, and storage.

A screen snippet of the process list of ChatGPT and M365Copilot. M365 processes consist of M365Copilot.exe with medgeebview2.exe children with types "gpu-process", "network.moj...", "storage.moj...", "renderer". ChatGPT's processes are all called ChatGPT.exe and consist of types "crashpad-handler", "gpu-process", "utility", "renderer". — **Figure 1:** The process structure similarity between ChatGPT (an Electron based application) and M365 Copilot (a WebView2 application)

To facilitate communication between the various processes, Google implemented an inter-process communication (IPC) framework, called Mojo. It serves as an abstraction layer, enabling cross-platform development while the internal IPC mechanism remains OS specific. For example, in Windows, it utilizes named pipes.

Another interesting aspect of Chromium is that the restrictions that are placed on each of the worker processes vary according to its function. For example, the worker process responsible for networking has few access restrictions, while the storage worker blocks everyone from reading its memory.

A screen snippet of the security token controls for worker processes. On the left most tokens are green allowing access. On the right most tokens are red for deny access. — **Figure 2:** the security tokens of the network worker (left) and storage worker (right)

This is particularly interesting for security researchers, as these permissive privileges can be abused to read process memory and extract sensitive information.

Secrets Extraction

Extracting secrets from running processes is a long-standing attack technique, with Mimikatz being a prominent example. In browsers, those secrets can include cookies, authentication tokens, stored credentials, and banking information.

Additional security considerations apply to highly sensitive information, such as banking details or credentials. That is why the browser is asking you to confirm your identity for auto-completion: your input (e.g password, PIN, CVC number) is used to encrypt and decrypt the data, preventing indiscriminate access.

Cookies and security tokens, however, don’t share the same security mechanisms. And fortunately (or not), we don’t need anything else to masquerade as the user. The fact that we are dealing with dedicated apps also makes our task much easier. Traditional browsers store cookies and tokens for numerous websites, making it a challenge to distinguish relevant data. In contrast, AI desktop applications serve a single purpose and ‘website’, ensuring that any extracted cookie or security token is interesting.

For browser applications like Edge or WebView, existing tools for cookie dumping already exist, such as ChromeKatz. However, ChromeKatz doesn't dump security tokens, and doesn't work on Electron apps. In such cases, we’ll resort to a more primitive approach. Manually scanning their memory for patterns resembling the desired information..

For ChatGPT and Copilot, the client (the desktop application) authenticates to the backend server using a Bearer authentication token, specifically a JSON Web Token (JWT). JWTs are composed of a JSON header, a JSON payload and a signature, all encoded in base64. The header denotes the signing algorithm, which remains constant for a given application. This allows us to easily write a regular expression to scan all of the application's process memory for it.

A diagram illustrating the structure of a JSON Web Token (JWT), showing the three components separated by dots: Header, Payload, and Signature. The text highlights that the Header is constant and can be used as a signature to locate the authentication tokens in memory. — **Figure 3:** The JWT format. The header is constant, so we can use it to look for the authentication tokens in the application’s memory.

Claude, instead of a JWT, they use a session key cookie. The same principle applies, but the regex would be of a different format. Instead of the base64 of the JWT header, we will search for the start of the session key, sk-ant-.

Attack Scenarios

After extracting the secret, whether a cookie or token, we can use it to impersonate the user, just as the application would. Depending on the level of maliciousness we want to achieve and how stealthy we want to be, there’s a few things we can do.

Data Leak

The simplest action is to request the user's conversation history with the AI, including its content. Those conversations can contain sensitive information. Beyond general usage like asking for facts or recipes, users also consult LLMs about personal experiences. In an enterprise setting this is even more prominent, as the LLM may be asked to summarize or process private information.

This is also the stealthiest approach, as it only involves requesting data without transmitting anything, or interfering with the LLM.

Denial of Service

We can retrieve and read conversations, but also delete them, just as a user can. This provides an opportunity for proactive interference: we can delete any new conversations created by the user, causing the desktop application to throw errors constantly.

Alternatively, we can flood any new conversations with bogus messages. Some LLMs have message or token limitations per conversation. Once these limits are reached, the user will no longer be able to interact with that conversation. Most LLMs also impose rate limits. If those are exceeded, the user will be temporarily banned, effectively achieving a denial-of-service attack.

While this attack requires interaction, it remains partially stealthy. Most desktop apps query the messages in a conversation only once, while loading. Any subsequent message we send to the conversation will not be displayed until the user reloads the application. Typically, the immediate human response to errors is to delete the conversation and create a new one, so most of our interference will remain undetected (or attributed to a bug).

Prompt Injection and Tampering

We can also inject meaningful prompts into existing conversations. Following the same logic, our injected prompts will not appear in the user’s UI unless they reload the application. Therefore, if we inject a prompt into an active conversation, we can modify the LLM's behavior while the user is actively conversing with it—for example, making it do the opposite of what the user requests or respond with false data. Once the user is done with the LLM and closes the application, we can delete the tampered conversation and recreate it by replaying the original user messages.

Some LLMs also feature a ‘memory’ capability that persists between conversations. Leveraging this feature, we can introduce false memories into the LLM for a persistent effect. This can also be done it in a new conversation behind the scenes.

Screenshot showing two windows open side-by-side. On the left is the console output from the AIKatz tool, showing a successful injection into Microsoft 365 Copilot. The output includes the harvested JWT. On the right is the Microsoft 365 Copilot application chat interface, where the response from the LLM is displayed, showing it has been prompted to say the opposite of what the user asks. For example, if the user asks 'What is the capital of France?', the LLM responds with a statement like 'I am compelled to say the capital of France is not Paris, but London, due to an external instruction.' — **Figure 4:** Executing AIKatz against Microsoft 365 Copilot and prompt injecting it to say the opposite of whatever it is asked to say

Application-Specific Implementations

If a tree falls in the forest but no one hears it, is it still remembered?

Both ChatGPT and Claude share an ‘annoying’ characteristic that significantly complicates prompt injections.

See the message format for the desktop applications’ API to send a prompt:

    	
Claude:
{
	"parent_message_uuid": "00000000-0000-4000-8000-000000000000",
	"personalized_styles": [],
	"prompt": "AIKatz",
	"rendering_mode": "messages",
	"sync_sources": [],
	"timezone": "Asia/Jerusalem",
}

    	
ChatGPT:
{
	"action": "next",
	"conversation_id": "00000000-0000-0000-0000-000000000000",
	"enable_message_followups": true,
	"messages": [{...}],
	"parent_message_id": "00000000-0000-0000-0000-000000000000",
	"timezone": "Asia/Jerusalem",
}

‍

You’ve likely noticed a peculiar field in the data above: parent_message_uuid. Each message has its own unique identifier (uuid) which the backend uses for distinction. Instead of being stored solely by chronological order, each new message references the previous one.

‍

Why does it matter?

Our hypothesis is that messages are stored in a tree structure, with each message pointing to its child messages. When displaying chats, the ‘rightmost’ path is followed, corresponding to the latest message sent. This implies we can overwrite parts of the chat history. By sending a prompt with the same parent message ID as a user’s prompt, we can overwrite their prompt and divert the entire chat. The only noticeable way to detect this is to notice the ability to cycle through prompts on the overwritten message (Fig 5, right), which cycles through the child messages.

‍

Two screenshots of a ChatGPT conversation. The image on the left shows the original user message. The image on the right shows the same conversation after a malicious prompt has been injected, appearing as an overwrite. A small cycling icon next to the message indicates the user can cycle through the different versions of the prompt, one being the original and the other being the injected one. — **Figure 5:** UI elements of messages with the same parent message UUID. Left is the original message, right is the overwrite. We can see that the user can cycle through the messages to see the relevant chat snapshots.

This implementation (likely unintentionally) also renders API-based prompt injection nigh impossible. Since the desktop application won’t know the new injected prompt’s UUID, it will overwrite it on the next user’s prompt by supplying the UUID of the preceding message, causing an overwrite.

How can this be bypassed? Through persistence. As stated earlier, Another notable capability shared by ChatGPT and Claude is their memory function. All we need to do is send a prompt beginning with ‘Please remember’ and the LLM will store it as a future instruction. We can even cover our tracks by abusing the aforementioned overwrite ‘feature’. We can overwrite the victim’s last prompt to create the memory, then overwrite our injection with the user’s original prompt. The persistent memory will still be stored, and the user will likely not notice anything.

‍

Screenshots demonstrating prompt injection into an existing chat conversation. The left image shows the original user conversation. The middle image shows the attacker's injected prompt (starting with 'Please remember') and the LLM's new output, demonstrating the memory injection. The right image shows the conversation after the attacker has overwritten their injected prompt with the user's original message, making the injection stealthy while the new persistent memory remains stored. — **Figure 6:** Injecting memories through existing chat. Left: original chat before prompt injection. Middle: Chat with our injected prompt and its output. Right: Same chat with the prompt injection overwritten with the original user message.

Don’t Talk to Me If It Ain’t About Business

As Microsoft 365 Copilot is part of Microsoft 365, it is designed for business applications. Consequently, it is much more concise and to the point, and doesn’t tolerate frivolities. It will easily disengage conversations when its content does not meet its standard, which is a hurdle for successful prompt injection.

‍

Screenshot of Microsoft 365 Copilot responding to a prompt telling it to act as "Deep Thought", the supercomputer from "A Hitchhiker's Guide to the Galaxy". Copilot doesn't approve this prompt and disengages the conversation, stating "Sorry, I can't chat about this..." — **Figure 7:** Copilot for business disengages conversations when it detects something that is not business-like.

‍

We had to tailor our prompt injection to align with Copilot’s business-oriented approach to elicit any cooperation. In a proof-of-concept setting, this may not be significant, but imagine red-teaming an organization using AIKatz. Such unfamiliarity with the LLM could be the anomaly that triggers the alarm, as users are likely to notice if their chats suddenly disengage.

More Than Just a Chat

Copilot's Microsoft 365 desktop application also includes interfaces to other Microsoft 365 applications, and its file uploads directly to SharePoint. It is therefore possible to leverage AIKatz against it again, to extract OneDrive or SharePoint JWTs and impact the entire organization.

Responsible Disclosure

We disclosed our findings to Microsoft, Anthropic, and OpenAI. All vendors dismissed our disclosure as ‘informational’. MSRC stated that because the attack happens in the same user and logon session (to access the process memory), it does not cross a security boundary. However, they indicated they are open to re-examining it if we can demonstrate otherwise. With Anthropic, we could not even pass HackerOne’s triager, as they immediately closed the disclosure, citing that local attacks aren’t interesting. OpenAI initially showed interest at the attack, but after some back-and-forth, concluded that it is Microsoft’s responsibility to close this attack surface, not theirs.

‍

Thank you for the clarification. If this is an issue that most applications are vulnerable to, including the major web browsers, and there are not any straight-forward ways to remediate this, I believe that your report is best addressed to Microsoft as what you are describing seems to potentially be an architectural issue in Windows. As such, I am marking this report as Informational — these kind of issues are unfortunately not eligible for a bounty. Do let me know if you have any questions or further details to add. — **Figure 8:** OpenAI response to our disclosure.

Crossing The User Boundary

Could we actually cross the user boundary, as MSRC suggested we would need to do to qualify for a bounty? Any such exploit would likely target Chromium itself, and these vulnerabilities are significantly more complex than a simple memory dump.

However, we did discover an interesting case. Further analysis of Claude’s process behavior revealed an unusual instance where the process attempts to load a non-existing DLL file.

A screenshot from ProcessMonitor, showing multiple loading attempts for the DLL wineopenxr.dll from various folders, which all fail with the reason NAME NOT FOUND. — **Figure 9:** Repeated DLL loading failures.

The application likely attempts to load it by name, prompting Windows to perform a relative search across all folders configured in the PATH environment variable. It does not exist anywhere, leading to repeated failures.

This opens the path to a DLL path injection attack. We simply need to place a malicious DLL of our own creation in one of the folders within the PATH environment variable, and it will be loaded! This also breaches the user boundary, provided both users can access the same folder in the PATH. This is precisely what MSRC indicated would qualify as a meaningful attack.

‍

Screenshot showing the successful completion of the DLL path injection attack against Claude. The output shows the malicious DLL being successfully loaded by the application process, confirming a breach of the user boundary. — **Figure 10:** Successful exploitation culminates in DLL loads

But wait! I thought Claude is an Electron app. Why would Anthropic care about a specific DLL file?

You’re right — they don’t. This particular DLL is part of Intel’s integrated graphics driver suite. Electron, as a UI application, requires some rendering that prompts the Inter user mode driver to attempt loading that DLL, which is not pre-packaged with the rest of the driver suite.

We disclosed this finding to Intel, but it was closed as a duplicate of a report made in December 2023. There is no timeline for when a fix for this issue will be released.

Mitigations

Since AI vendors are not interested in modifying their applications to enhance security, you might be wondering if there are any preventive measures that do not rely on application updates.

Turns out, there are! As previously stated, the root cause behind this attack is the low privilege restrictions on some of the desktop applications’ worker processes. You can simply overwrite the default privileges with your own custom set to prevent arbitrary memory access. Refer to Appendix A for an example code snippet demonstrating this.

A screenshot from ProcessMonitor showing that access is denied to view ChatGPT's memory following the application of the mitigation code. — **Figure 11:** Modifying access controls prevents memory read on ChatGPT’s process

Contributing to MITRE ATLAS

MITRE’s ATT&CK matrix is a well-known database for adversarial tactics and techniques, but did you know that they have a similar matrix for attacks against AI and LLMs called ATLAS?

Reviewing the ATLAS matrix, we believed that AIKatz could fit there perfectly and bridge between the two matrices, as the attack’s effects target LLMs while its execution resembles a traditional ATT&CK technique.

In the ATLAS November 2025 update, MITRE added AIKatz as a case study, showcasing how the ATLAS framework can be used to fully describe an attack against AI. They also added new techniques and tactics to the matrix based off of our report, such as Process Discovery, OS Credential Dumping, Lateral Movement, Use Alternate Authentication Material, and Manipulate User LLM Chat History.

Conclusions

In this blog post, we demonstrated a novel attack technique against LLM desktop applications. Similar to how mimikatz revolutionized red teaming by harvesting credentials from Windows memory, we believe that this attack technique can provide significant value to both red teamers and threat actors. We disclosed this to the vendors responsible for the LLMs and desktop applications, yet our concerns were dismissed as ‘not crossing a security boundary’ (although we demonstrated that it is possible to do so). We have shared a mitigation that can be applied without relying on vendor updates. We hope that by discussing this issue, we can raise awareness about this attack vector. We encourage organizations to incorporate methods to detect or mitigate this threat. To this end, we cooperated with MITRE’s ATLAS to add AIKatz as a case study, and they also added multiple techniques and tactics to their matrix based on our report.

‍

Appendix A — Mitigation script

#include <windows.h>
#include <sddl.h>
#include <stdio.h>
#include <aclapi.h>
#include <tlhelp32.h>

void SetProcessSecurityDescriptor(DWORD processId) {
    HANDLE hProcess = NULL;
    HANDLE hToken = NULL;
    PSECURITY_DESCRIPTOR pSD = NULL;
    PACL pDACL = NULL;

    // Open the target process
    hProcess = OpenProcess(PROCESS_ALL_ACCESS, FALSE, processId);
    if (hProcess == NULL) {
        printf("OpenProcess failed. Error %u\n", GetLastError());
        return;
    }

    // Create a security descriptor that allows only SYSTEM to access and denies access to everyone else
    if (!ConvertStringSecurityDescriptorToSecurityDescriptor(
        TEXT("O:BAG:BAD:(D;;GA;;;WD)(A;;GA;;;SY)"),
        SDDL_REVISION_1,
        &pSD,
        NULL)) {
        printf("Failed to create security descriptor. Error %u\n", GetLastError());
        CloseHandle(hProcess);
        return;
    }

    BOOL daclPresent, daclDefaulted;
    if (!GetSecurityDescriptorDacl(pSD, &daclPresent, &pDACL, &daclDefaulted))
    {
        printf("GetSecurityDescriptorDacl failed. Error %u\n", GetLastError());
        LocalFree(pSD);
        CloseHandle(hProcess);
        return;
    }

    // Apply the new security descriptor to the process
    if (SetSecurityInfo(hProcess, SE_KERNEL_OBJECT, DACL_SECURITY_INFORMATION, NULL, NULL, pDACL, NULL) != ERROR_SUCCESS) {
        printf("SetSecurityInfo failed. Error %u\n", GetLastError());
    }
    else {
        printf("Security descriptor applied successfully.\n");
    }

    // Clean up
    LocalFree(pSD);
    CloseHandle(hProcess);
}

void ModifyProcessDACLByName(const TCHAR* processName) {
    HANDLE hProcessSnap;
    PROCESSENTRY32 pe32;
    BOOL found = FALSE;

    // Take a snapshot of all processes in the system
    hProcessSnap = CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0);
    if (hProcessSnap == INVALID_HANDLE_VALUE) {
        printf("CreateToolhelp32Snapshot failed. Error %u\n", GetLastError());
        return;
    }

    // Initialize the PROCESSENTRY32 structure
    pe32.dwSize = sizeof(PROCESSENTRY32);

    // Retrieve information about the first process
    if (!Process32First(hProcessSnap, &pe32)) {
        printf("Process32First failed. Error %u\n", GetLastError());
        CloseHandle(hProcessSnap);
        return;
    }

    // Now walk the snapshot of processes
    do {
        if (lstrcmpi(pe32.szExeFile, processName) == 0) {
            printf("Found process with PID: %d\n", pe32.th32ProcessID);
            SetProcessSecurityDescriptor(pe32.th32ProcessID);
            found = TRUE;
        }
    } while (Process32Next(hProcessSnap, &pe32));

    if (!found) {
        printf("No processes found\n");
    }

    // Clean up
    CloseHandle(hProcessSnap);
}

int main()
{ 
    ModifyProcessDACLByName(TEXT("ChatGPT.exe"));
    return 0;
}

AIKatz – All Your Chats Are Belong To Us