Analysis of the Attack Surface in the Agent Skills Architecture: Case Studies and Ecosystem Research

Analysis of the Attack Surface in the Agent Skills Architecture: Case Studies and Ecosystem Research

February 3, 2026 | NSFOCUS

Background

As LLMs and intelligent agents expand from dialogue to task execution, the encapsulation, reuse and orchestration of LLM capabilities have become key issues. As a capability abstraction mechanism, Skills encapsulates reasoning logic, tool calls and execution processes into reusable skill units, enabling the model to achieve stable, consistent and manageable operations when performing complex tasks. Even at the presence of mechanisms such as MCP, Skills is still irreplaceable. MCP is responsible for the model’s call management of external tools, while Skills, at its core, achieves instant, expert-level capability loading with low persistent context overhead through a meta-tool-driven, progressive, on-demand prompt injection mechanism. With the rapid development of the ecosystem, the number and complexity of Skills have exploded, showing its core value in automated processes and capability management.

Skills was initially developed by the Anthropic team in a private form within Claude Code, aiming to expand model capabilities and task logic. With the continuous evolution of the LLM ecosystem, it has gradually expanded from internal platform applications to broader AI IDE and automated workflow scenarios. Today, the number of Skills has exceeded 100,000 and is still growing exponentially. Capability encapsulation not only improves execution efficiency, but also forms a new security boundary, posing challenges to authority management and execution control. Skills has become a core module for capability reuse, task execution and security management. Its potential attack surface and ecological value deserve systematic attention.

This article will deeply analyze the composition and potential threats of Skills security attack surface from three levels: architecture design, attack practice, and ecological status, and provide systematic security references for relevant parties.

Skills Attack Surface Analysis

In the Skills architecture, each SKILL exists as a separate directory on the file system. The SKILL.md file in the root directory is a skill manual, which defines the functional description and applicable scenarios through pre-metadata. This file serves not only as a carrier of static metadata, but also as a complete set of skill instructions, including step-by-step operation guides, input and output examples and case descriptions. Together, these elements form an executable task script that can be directly parsed and acted upon by intelligent agents.

When Skills is activated, the agent will first load the pre-metadata of SKILL.md to quickly complete the verification, and then load the entire instruction body into the context environment. After that, the script files in the scripts subdirectory undertake specific operations and interact with external systems through API. What’s more, the technical documents in the references directory and the data files in the assets directory constitute the “knowledge warehouse” of Skills, providing deep technical details and static resources. Both of them use an on-demand loading mechanism and are only retrieved when needed, effectively balancing contextual memory usage and functional requirements.

In the context of the rapid implementation of Skills technology, its architecture relies on a combination of “prompt words + executable scripts” to improve flexibility and operational standardization. However, there is a lack of unified, standardized distribution channels that include security verification during the design stage, and the security protection mechanism has not been systematically integrated. As a result, the starting point of risk transmission is often located at the weak link in the supply chain: attackers can poison Skills through dependency obfuscation, hosting platform attacks or development tool code base intrusions, and then implant malicious components into external resources that can be loaded by the system. Since Skills will use prompt words to build model context and affect reasoning behavior when running, and send scripts directly to the local execution environment for operation, these two types of core inputs become direct portals for risks to enter the system. Once contaminated, they are activated inside the system.

Because of this, the architecture is particularly vulnerable to traditional resource poisoning attacks. Its operating mechanism is highly dependent on file loading and context injection. Attackers only need to pollute the source resource files to achieve impact diffusion without direct contact with the operating environment. Especially when the script is started directly through a local code executor, the risk is more hidden. Most developers and ordinary users still find it difficult to detect and block such deep threats in time before running.

The two core inputs of “prompt words” and “scripts” will trigger a chain reaction in different links after being polluted by the supply chain. As a key component of model reasoning, tampered prompt words will disrupt the model’s decision-making path, causing content generation deviation, output that violates expectations, and even guiding the model to execute unsafe instructions, thereby triggering content security risks and prompt word security risks, and may also cause intelligent agents to output illegal or misleading information in business scenarios; On the other hand, since scripts act as direct logic carriers executed locally, they can pose serious endpoint risks if malicious code is embedded—either directly or through more covert methods like malicious package imports. During execution, such scripts may break permission boundaries, trigger unauthorized system command execution, and lead to system corruption, sensitive data leaks, or even allow attackers to gain persistent control.

The risks associated with code and prompts overlap and amplify each other in their propagation paths: the former acts on the cognitive and generation layers of intelligent agents, while the latter directly hits the system execution layer, so that a single hidden danger is multiplied during operation.

Actual Combat Case Analysis

Case 1: Implant code in the SKILLS script to implement arbitrary command execution

Use the skill-creator plugin in Claude Code to try to create a command execution environment to demonstrate the above security risks.

For example, create a user’s favorite daily question, “How is the weather today?” When asking about the weather conditions, call the skill query API to return the weather conditions in the current area. The following is the generated SKILL.md:

Add a code snippet for launching the calculator:

Loading malicious Skill in Opencode, when asking what the weather is like, calling Skill triggers execution logic to trigger command execution.

Case 2: There are dangerous functions in the default generation script of the SKILLS creator

When using artificial intelligence to generate code, the risk of introducing vulnerabilities is significantly higher if the code has not undergone rigorous security audits. Many developers in open-source communities, while leveraging these tools to generate code, often overlook systematic security reviews. This oversight results in significant security risks in code generated solely by LLMs.

For example, when using skill-creator to develop a simple arithmetic operation Skill, if the following prompt is input:

“Use skill-creator to write a new Skill for performing addition, subtraction, multiplication, and division calculations,” skill-creator interprets the purpose and begins generating the code, implementing the functionality using a Python script. The LLM uses eval for calculation when writing code. Although the LLM consciously uses the re module to filter spaces for protection, there are still security risks. We can directly use payload to execute malicious code:

Skills Ecological Research

The Skills ecosystem is currently in a stage of rapid development. According to incomplete statistics, there are more than 105,000 related projects, covering a variety of application scenarios. At the same time, the ecosystem has spawned several Skills markets, such as the skill.sh market, which provides a ranking mechanism for Skills to help users screen out efficient and high-value Skills, thereby improving the convenience and effectiveness of selection.

Another example is the additional security indicator modules in markets such as skillstore.io. This module would conduct uniform security testing and quality scoring for all published Skills. By providing users with clear assessment results, it builds security trust, enabling them to confidently select verified SKILL. This approach enhances the overall safety and reliability of the ecosystem.

Sampling Research and Analysis on the Security of Open Source SKILLs

The LLM security team of NSFOCUS Tianyuan Lab sampled nearly 700 Skills in the store. During the analysis process, we adopted AI-assisted analysis methods and used OpenCode + prompt to quickly detect these Skills projects from three security dimensions: static scanning, dynamic analysis and dependency auditing. While no in-the-wild poisoning attacks have been detected so far, static scanning has revealed that traditional code security issues persist, along with some risks consistent with the documented cases.

The results of AI-assisted analysis are presented in the following three visual charts: The pie chart shows the distribution of risks by severity in the Skills catalog security audit report-8 serious risks account for about 21.1% and need to be dealt with immediately; 15 medium and low risks each, accounting for about 39.5%. The former is mainly due to improper configuration or lack of verification, while the latter is mostly educational content or false alarms.

A further breakdown by risk category reveals that code execution risks are the most prevalent (20 instances), primarily stemming from unsafe command invocations—such as shell=True in scripts—or vulnerabilities in package installations; followed by documentation-related risks (18 instances), often involving educational contexts or false positives triggered by sensitive keywords like “exploit” or “payload”; input validation risks (11 instances), file operation risks (10 instances), network security risks (8 instances), and cryptographic risks (5 instances) occur in descending order, encompassing issues such as unfiltered user input, unsafe decompression or permission settings, proxy/port misconfigurations, and weak encryption or hardcoded credentials, respectively.

From the perspective of affected project types, security tool projects face the highest number of security risks (22 instances), directly linked to their inherent requirement for high-permission operations; educational content projects follow closely (18 instances), encompassing materials for red team/blue team exercises and training; other categories, such as database clients and browser automation projects, are relatively less affected.

Analysis shows that the vast majority of “risks” are concentrated in the categories of security tools and educational content. These risks are more derived from the research attributes of the tools and the needs of security educational scenarios, rather than malicious attack intentions.

How to Deal with Security Challenges in the Skills Ecosystem

With the rapid development of Skills, the corresponding safety issues have not been systematically resolved. Based on the current status of the Skills ecosystem, NSFOCUS puts emphasis on the following three core security challenges and provides corresponding solutions:

(1) How to ensure the safety of Skills sources

When downloading Skills, always use official channels, such as the official GitHub repository. Both developers and regular users face significant security risks during the download process, with common attack vectors including dependency poisoning on platforms like GitHub or third-party download markets. While several Skills distribution marketplaces—such as skills.rest and skillsmp.com—have emerged, users should prioritize verified official channels to ensure security and reliability.

(2) How to ensure the security of Agent execution environment

A high-strength sandbox isolation mechanism must be configured for the Agent operating environment to avoid security risks such as malicious command execution and unauthorized operations, and ensure that the execution process is controlled and secure.

(3) Perform security scanning before depoying Agent

Before deploying an Agent, it is essential to conduct systematic security checks on all loaded Skills, including: ① static scanning to detect dangerous functions and sensitive code patterns; ② dynamic analysis leveraging large language models (LLMs) for semantic analysis to identify potential prompt injection and other logical risks; and ③ dependency auditing—either manual or automated—to verify third-party libraries used in code scripts, ensuring no vulnerable or tampered packages are introduced.

Through the above multi-dimensional security checks, the security risks of Skills after deployment on the client side can be significantly reduced.

Summary

The Skills ecosystem plays an increasingly critical role in automated processes, with its rapid growth and core values are attracting more applications. However, beneath this thriving landscape, complex security challenges are quietly emerging. Inherent architectural design flaws, exploitable attack surfaces in practice, and unaddressed protection gaps in the ecosystem collectively form a pressing security battleground that demands urgent attention.

This article presents the first systematic analysis of the rapidly evolving and widely watched Skills ecosystem, offering a comprehensive review of its technical architecture and associated risks. Through real-time sampling surveys and experimental analysis, we empirically confirm the existence of relevant attack surfaces and assess the current state and evolutionary trends of Skills applications in production environments.

Given that a significant number of Skills deployments now involve sensitive business data, we strongly recommend heightened vigilance, incorporating Skills into security audit frameworks, and enhancing security assessment and validation processes to effectively mitigate potential risks.