SkillTotal

Is nltk safe?

nltk-3.9.4 is a Python package analyzed by SkillTotal's deterministic static scanner. The scan found no malicious indicators, but it reports 6 findings that cover risky constructs for review. The code is capable of dynamic code execution, filesystem reads, filesystem writes, network egress, and shell execution. Capabilities describe what the code can do; they are not a verdict on intent. Risk score: 20/100 (low).

nltk 3.9.4

python_package · pypi:nltk
LOW · 20 / 100 risk score
Snapshot · scanned Jul 2, 2026 · nltk-3.9.4@3.9.4 · engine 0.24.0 / ruleset 25
No malicious indicators - review capabilities before installing
Notable — review in context (capabilities are not malware):
  • Python shell/command execution
  • Python dynamic code execution
  • Unsafe deserialization

No malicious indicators found by static analysis.

Automated static-analysis result. It can contain false positives and false negatives, and is not a claim about the intent of nltk's authors.

Capabilities — what this component can do (not a risk score):
dynamic code execution · filesystem read · filesystem write · network egress · shell execution

Findings (6)

HIGH · Unsafe deserialization · ST-DESERIALIZE-PY

It loads data with a format that can rebuild arbitrary objects (e.g. pickle, or unsafe YAML). The shelve module flagged here stores its values with pickle.

db_out = shelve.open(db, "n")
db_in = shelve.open(db)

Why it matters: Feeding such a loader untrusted data can execute code hidden inside that data.

Fix: Deserialize untrusted data with a safe format/loader: JSON, or yaml.safe_load / Loader=SafeLoader. Reserve pickle/marshal for data you fully control.
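
The fix above can be sketched as follows. This is a minimal illustration, not code from nltk: the helper name `load_untrusted_mapping` and the sample record are hypothetical. It assumes the data is plain values (strings, numbers, lists, dicts) that JSON can represent.

```python
import json


def load_untrusted_mapping(text: str) -> dict:
    """Parse untrusted text as JSON (hypothetical helper).

    json.loads only ever produces plain data types (dict, list, str,
    int, float, bool, None), so it cannot rebuild arbitrary objects
    the way pickle-backed formats such as shelve can.
    """
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return data


# Example input that might arrive from outside the process.
record = load_untrusted_mapping('{"word": "cat", "count": 3}')
print(record["word"], record["count"])
```

A pickle-backed store such as shelve remains reasonable for files the application itself wrote and fully controls.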

HIGH · Python dynamic code execution · ST-DYN-PY

The code turns strings into live code at runtime (here via Python's eval and exec).

funcopy = eval(src, dict(_wrapper_=wrapper))
dec_func = eval(src, dict(_func_=func, _call_=caller))
return eval(s[start_position : match.end()]), match.end()
exec("import %s as model" % options.model)
w = eval("numpy." + window + "(window_len)")

Why it matters: If those strings aren't fixed and trusted, they become a way to run arbitrary code.

Fix: Avoid evaluating dynamically constructed code; if unavoidable, ensure the input is a trusted constant and never derived from external data.
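
Where a name or literal has to come from a string, a lookup against a fixed allowlist can often stand in for eval or exec. The sketch below is an assumption-laden illustration, not nltk's code: `ALLOWED_MODULES`, `load_module`, `call_named`, and `parse_literal` are hypothetical names, and the `math` module is only a stand-in for whatever module a caller would select.

```python
import ast
import importlib

# Hypothetical allowlist: the only module names a caller may request.
ALLOWED_MODULES = {"json", "math"}


def load_module(name: str):
    """Import a module by name without exec("import ...")."""
    if name not in ALLOWED_MODULES:
        raise ValueError(f"module {name!r} is not allowed")
    return importlib.import_module(name)


def call_named(module, func_name: str, allowed: set, *args):
    """Call module.func_name(*args) without eval("module." + name)."""
    if func_name not in allowed:
        raise ValueError(f"function {func_name!r} is not allowed")
    return getattr(module, func_name)(*args)


def parse_literal(text: str):
    """Parse a Python literal (string, number, tuple, ...) without eval.

    ast.literal_eval rejects anything that is not a plain literal,
    including function calls and attribute access.
    """
    return ast.literal_eval(text)


math_mod = load_module("math")
result = call_named(math_mod, "sqrt", {"sqrt", "floor"}, 16.0)
quoted = parse_literal("'abc'")
print(result, quoted)
```

The design choice is that the set of reachable names is fixed in the source, so an unexpected string fails with an error instead of running.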

HIGH · Python shell/command execution · ST-SHELL-PY

The component can run operating-system commands or spawn processes.

p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
p = Popen(_senna_cmd, stdin=PIPE, stdout=PIPE, stderr=PIPE)
p = subprocess.Popen(cmd, stdout=sys.stdout)
p = subprocess.Popen(
    ["svn", "status", "-v", filename],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
p = subprocess.Popen(
    cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, stdin=subprocess.PIPE
)
p = subprocess.Popen(cmd, stdin=stdin, stdout=stdout, stderr=stderr)
p = subprocess.Popen(
    ["which", alternative],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
proc = subprocess.run(
    ["dot", "-T%s" % t],
    capture_output=True,
    input=dot_string,
    text=True,
)
proc = subprocess.run(
    ["dot", "-T%s" % t],
    input=bytes(dot_string, encoding="utf8"),
)
p = subprocess.Popen(cmd, stdout=output, stderr=output)
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
p = subprocess.Popen(
    cmd,
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
self._hunpos = Popen(
    [self._hunpos_bin, self._hunpos_model],
    shell=False,
    stdin=PIPE,
    stdout=PIPE,
    stderr=PIPE,
)
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
process = subprocess.Popen(
    ["dot", "-T%s" % output_format],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)

Why it matters: Powerful and often legitimate — confirm the commands aren't built from untrusted input.

Fix: Confirm the command and its arguments are fully controlled and not derived from untrusted input; avoid shell=True.
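
One way to follow this advice is to pass the command as an argument list, so no shell ever parses it, and to locate the executable with `shutil.which` rather than spawning `which`. The helper `run_tool` below is a hypothetical sketch, and the Python interpreter stands in for an external tool such as `dot` so that the example runs anywhere.

```python
import shutil
import subprocess
import sys


def run_tool(executable: str, args: list, stdin_text: str = "") -> str:
    """Run an external tool without a shell and return its stdout.

    Hypothetical helper: the command is a list, so arguments are passed
    to the process verbatim and shell metacharacters have no effect.
    """
    path = shutil.which(executable)
    if path is None:
        raise FileNotFoundError(f"{executable!r} not found")
    proc = subprocess.run(
        [path, *args],
        input=stdin_text,
        capture_output=True,
        text=True,
        check=True,   # raise CalledProcessError on a non-zero exit code
        timeout=30,   # do not wait forever on a hung tool
    )
    return proc.stdout


# The current interpreter is used as a stand-in for a real tool.
output = run_tool(
    sys.executable,
    ["-c", "import sys; print(sys.stdin.read().upper())"],
    "digraph",
)
print(output.strip())
```

An argument list does not make untrusted arguments safe by itself. A value such as a filename can still be read as an option by the target tool, so arguments derived from external input need their own validation.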

MEDIUM · Python filesystem read · ST-FS-PY-READ

The component reads files from disk.

with open(version_file) as infile:
with open(filename, "rb") as infile:
with open(filename, "rb") as infile:
with open(filename, "rb") as infile:
with open(filename) as infile:
with open(filename) as infile:
with open(usp) as infile:
with open(annfile) as infile:
with open(textfile) as infile:
"text": _ieer_read_text(m.group("text"), root_label),
"headline": _ieer_read_text(m.group("headline"), root_label),
return _ieer_read_text(s, root_label)
with open(weightfile_name) as weightfile:
with open(mapper_file, encoding="utf-8") as raw:
with open(ngram_file, encoding="utf-8") as f:
with open(path) as lin_file:
with self.abspath(framefile).open() as fp:
with self.abspath(framefile).open() as fp:
with open(self._textids) as fp:
with self.abspath(framefile).open() as fp:
with self.abspath(framefile).open() as fp:
open(self._fileid, "rb"), self._encoding
self._stream = open(self._fileid, "rb")

Why it matters: Usually legitimate, but worth confirming it can't be steered into reading sensitive files.

Fix: Confirm which files are read and that paths cannot be influenced by untrusted input to reach sensitive locations.
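
A common way to keep reads inside an expected directory is to resolve the requested path and check that it still lies under a base directory. The sketch below assumes Python 3.9 or later (for `Path.is_relative_to`). The name `resolve_inside` and the file `corpus.txt` are hypothetical.

```python
import tempfile
from pathlib import Path


def resolve_inside(base_dir, user_path) -> Path:
    """Return user_path resolved under base_dir, or raise ValueError.

    Resolving first collapses ".." segments and follows symlinks, so
    the containment check is made on the real target location.
    """
    base = Path(base_dir).resolve()
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"{user_path!r} escapes {str(base)!r}")
    return candidate


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as base:
    (Path(base) / "corpus.txt").write_text("hello", encoding="utf-8")
    safe = resolve_inside(base, "corpus.txt")
    text = safe.read_text(encoding="utf-8")

print(text)
```

With this check, a request such as `../secret` or an absolute path outside the base directory raises `ValueError` before any file is opened.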

MEDIUM · Python filesystem write/delete · ST-FS-PY-WRITE

The component writes or deletes files on disk.

with open(filename, "wb") as outfile:
with open(filename, "wb") as outfile:
with open(filename, "wb") as outfile:
with open(filename, "w") as outfile:
with open(filename, "w") as outfile:
with open(filename, "w") as outfile:
logfile = open(logfilename, "a", 1)  # 1 means 'line buffering'
with open(trainfile_name, "w") as trainfile:
os.remove(trainfile_name)
os.remove(trainfile_name)
os.remove(weightfile_name)
with open(f"{tab_dir}/weights.txt", "w") as f:
with open(f"{tab_dir}/mapping.tab", "w") as f:
with open(f"{tab_dir}/labels.txt", "w") as f:
with open(f"{tab_dir}/alwayson.tab", "w") as f:
os.remove(os.path.join(temp_dir, f))
os.remove(os.path.join(temp_dir, f))
outfile = open(outfile, "w")
os.remove(self.write_file.name)
with open(filename, "wb") as outfile:
with open(filepath, "wb") as outfile:
with open(filename, "wb") as f:
os.remove(input_file.name)
os.remove(output_file.name)

Why it matters: Usually legitimate, but worth confirming the paths can't be controlled by untrusted input.

Fix: Confirm which files are written/deleted and that paths cannot be influenced by untrusted input.
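
For scratch files that are written and then deleted, one option is to let `tempfile` choose the location and handle cleanup, so that no externally supplied path is involved. This is a sketch under that assumption, not the package's own approach. `train_with_scratch_file` and the sample lines are hypothetical.

```python
import os
import tempfile


def train_with_scratch_file(samples):
    """Write samples to a scratch file, read them back, then clean up.

    Hypothetical helper: the directory is created by tempfile with a
    random name, and it is removed when the with-block exits, so there
    is no manual os.remove call to get wrong.
    """
    with tempfile.TemporaryDirectory(prefix="scan-demo-") as scratch:
        train_path = os.path.join(scratch, "train.txt")
        with open(train_path, "w", encoding="utf-8") as f:
            f.write("\n".join(samples))
        with open(train_path, encoding="utf-8") as f:
            line_count = sum(1 for _ in f)
    # After the block, the directory and the file inside it are gone.
    return line_count, os.path.exists(train_path)


line_count, still_exists = train_with_scratch_file(["a b", "c d"])
print(line_count, still_exists)
```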

MEDIUM · Python network egress · ST-NET-PY

The component makes outbound network requests.

from urllib.parse import unquote_plus
if unquote_plus(sp) == "SHUTDOWN THE SERVER":
import urllib.request
from urllib.request import url2pathname
p = os.path.join(path_, url2pathname(resource_name))
pkg_dir = os.path.join(path_, url2pathname(pkg))
pkg_zip = os.path.join(path_, url2pathname(pkg + ".zip"))
p = os.path.join(path_, url2pathname(zipfile))
local_path = url2pathname(path_)
from urllib.error import HTTPError, URLError
response = requests.get(requests.compat.urljoin(self.url, "live"))
response = requests.get(requests.compat.urljoin(self.url, "ready"))
self.session = requests.Session()
import urllib.request
from urllib.parse import unquote, urlparse
parsed = urlparse(raw)
raw = unquote(parsed.path)
parsed = urlparse(str(url_input))
validate_path(unquote(parsed.path), context=f"{context}.file_scheme")
opener = urllib.request.build_opener(_ValidatingRedirectHandler())

Why it matters: Usually legitimate, but confirm the destinations are expected and no sensitive data leaves.

Fix: Confirm the destination hosts are expected and that no sensitive data is sent off-host.
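
A destination check of this kind can be sketched with `urllib.parse.urlparse` and a fixed allowlist. The hosts and the helper name `is_allowed_url` below are assumptions for illustration; a real deployment would list the hosts it actually expects to contact.

```python
from urllib.parse import urlparse

# Hypothetical allowlists for illustration only.
ALLOWED_SCHEMES = {"https"}
ALLOWED_HOSTS = {"example.org"}


def is_allowed_url(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Return True only for https URLs whose host is on the allowlist.

    The comparison uses the parsed hostname, not a substring test, so
    "example.org.evil.test" does not pass as "example.org".
    """
    parsed = urlparse(url)
    return parsed.scheme in ALLOWED_SCHEMES and parsed.hostname in allowed_hosts


checks = {
    url: is_allowed_url(url)
    for url in (
        "https://example.org/data.zip",
        "https://example.org.evil.test/data.zip",
        "file:///etc/passwd",
    )
}
print(checks)
```

A check like this only covers the first request. Redirects need to be validated again at each hop, which appears to be the purpose of the `_ValidatingRedirectHandler` seen in the evidence above.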

How we determine this: deterministic static analysis (regex + AST), evidence-anchored, no code execution.