Attack post-mortem

PyTorch torchtriton: a December 2022 dependency-confusion incident

In late December 2022, the PyTorch project disclosed that anyone who installed the PyTorch nightly build on Linux via pip between December 25 and December 30 may have downloaded a malicious package named torchtriton. It was a textbook dependency-confusion attack: a package name that PyTorch served only from its own package index, registered on the public PyPI registry by an attacker, with a higher version number than anything PyTorch itself was using.

This is a walk-through of what happened, why dependency confusion still works, and what defenses sit at the layer where the attack happens.

The attack mechanism

PyTorch's nightly build had a transitive dependency on torchtriton, a package that until then existed only in PyTorch's own infrastructure and was installed from PyTorch's own index. When pip resolves torchtriton, it consults every configured index. If both an internal and a public source are configured (the typical setup for pip with --extra-index-url), pip merges the candidates from all of them and picks the highest version, regardless of which source it came from.

An attacker registered torchtriton on the public PyPI, published a version higher than any internal release, and waited. Anyone installing the PyTorch nightly build with pip while both PyPI and PyTorch's own index were configured pulled the attacker-controlled public package instead of the legitimate one.
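
The resolution behavior described above can be sketched in a few lines. This is a simplified model, not pip's actual resolver: the Candidate type, the pick_best function, and the version numbers are all hypothetical and chosen only to show that the index plays no part in the comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    version: tuple[int, ...]  # simplified; real pip orders versions per PEP 440
    index: str                # which registry served this candidate


def pick_best(candidates: list[Candidate]) -> Candidate:
    """Merge candidates from all indexes and return the highest version.

    The index a candidate came from is never consulted.
    """
    return max(candidates, key=lambda c: c.version)


# Hypothetical version numbers, used only to illustrate the ordering.
internal = Candidate("torchtriton", (2, 0, 0), "internal")
public = Candidate("torchtriton", (3, 0, 0), "pypi")

winner = pick_best([internal, public])
print(winner.index)  # pypi
```

The attacker only has to win the version comparison; nothing in this model asks which registry is supposed to own the name.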

What the malicious code did

The published torchtriton package included a binary that, when executed, exfiltrated information about the host: the Python environment, environment variables, hostname, and certain readable files such as ~/.gitconfig, ~/.ssh/*, and /etc/hosts. The exfiltration target was a fixed remote endpoint.

The attacker did not appear to have a goal beyond information collection. The package author contacted the PyTorch team shortly after publication and later stated that the name had been registered to demonstrate the vulnerability rather than to exploit it. Regardless of intent, the technique works.

Why dependency confusion is structurally hard to fix

Dependency confusion is not a flaw in a single package manager. It is what happens when the resolver consults multiple registries and treats them as a single namespace. The resolver picks a version; it does not pick a registry.

The fix on the consumer side is to constrain which registry a given package name can come from. With pip, the blunt version of this is to use --index-url (a single source) instead of --extra-index-url (additive sources). Newer Python tooling such as uv and Poetry supports pinning individual packages to a specific index. npm has scoped packages (@company/pkg) for the same reason: a scope binds a name to an organization.
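
As a rough illustration of what binding a name to a registry means, the sketch below filters candidates by source before any version comparison happens. The PINNED_INDEX policy, the index labels, and the version numbers are assumptions made for the example; real tools express this through their own configuration rather than code like this.

```python
# Hypothetical policy: package name -> the only index allowed to serve it.
PINNED_INDEX = {"torchtriton": "internal"}
DEFAULT_INDEX = "pypi"


def allowed_candidates(name, candidates):
    """Keep only (version, index) candidates from the index bound to `name`."""
    required = PINNED_INDEX.get(name, DEFAULT_INDEX)
    return [(version, index) for version, index in candidates if index == required]


# Illustrative candidates: the public one has the higher version.
candidates = [((2, 0, 0), "internal"), ((3, 0, 0), "pypi")]

best = max(allowed_candidates("torchtriton", candidates))
print(best)  # ((2, 0, 0), 'internal')
```

Because the public candidate is discarded before versions are compared, a higher version number on the public registry no longer wins.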

The fix on the publishing side is to claim your internal names on the public registry. PyTorch and other projects now routinely register placeholder packages on PyPI to prevent name squatting.
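
One way to audit this is to ask the public registry whether each internal name exists there. The sketch below relies on PyPI's JSON endpoint (https://pypi.org/pypi/NAME/json), which returns 404 for a project that does not exist; the function names and the injectable status_of parameter are hypothetical conveniences for the example.

```python
import urllib.error
import urllib.request


def pypi_status(name: str) -> int:
    """Return the HTTP status of PyPI's JSON endpoint for a project name."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


def unclaimed_names(internal_names, status_of=pypi_status):
    """Return internal names that nobody has registered publicly (404)."""
    return [name for name in internal_names if status_of(name) == 404]
```

A 404 means the name is still open for anyone to register. A 200 only means somebody holds it, so a name you did not register yourself deserves a closer look.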

What the standard defenses do here

Lockfiles do not help. A lockfile protects a package only once it has been pinned. A new dependency added to the project has no entry yet, so it resolves freshly against whatever indexes are configured.
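
The gap is easy to see if the lockfile is modeled as a mapping from name to pinned version. The data below is hypothetical and the function is a minimal sketch, not how any particular tool reads its lockfile.

```python
def freshly_resolved(declared, locked):
    """Names the project declares that have no pinned entry in the lockfile."""
    return sorted(set(declared) - set(locked))


# Hypothetical lock entries and declared dependencies.
locked = {"numpy": "1.24.1", "requests": "2.28.1"}
declared = ["numpy", "requests", "torchtriton"]  # torchtriton was just added

print(freshly_resolved(declared, locked))  # ['torchtriton']
```

Every name in that output is resolved against the live indexes, which is exactly where a confusion attack lands.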

Threat feeds do not help quickly enough. The malicious torchtriton was a brand-new package. By the time it was reported, the install window had been open for days.

Provenance helps only if you trust the source. PyTorch's internal torchtriton did not have a public provenance attestation; the attacker's package could have had one. Provenance proves who built a package, not which package is the legitimate one for your project.

How a cooling and consensus gate changes it

A package that has just been published, with no community observations and no prior installs from inside your organization, is the exact profile of both a legitimate first-time dependency and a freshly registered confusion attack. A cooling gate that holds installs of brand-new packages with no community history forces the install through a review path.

That review path is where an engineer looks and asks: is this the package we expect? If the answer is no, the install is rejected and the project's dependency configuration is fixed to scope the name to the correct internal registry.
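
A minimal sketch of such a gate, under stated assumptions: the seven-day cooling period, the function name, and the three input signals (first publication time, community observation count, prior installs inside the organization) are all hypothetical, and the dates in the usage lines are illustrative only.

```python
from datetime import datetime, timedelta, timezone

COOLING_PERIOD = timedelta(days=7)  # assumed threshold, tune to taste


def gate_decision(first_published, community_observations,
                  prior_org_installs, now=None):
    """Return "hold" for a freshly published package nobody has seen, else "allow"."""
    now = now or datetime.now(timezone.utc)
    is_new = now - first_published < COOLING_PERIOD
    unseen = community_observations == 0 and prior_org_installs == 0
    return "hold" if is_new and unseen else "allow"


# Illustrative dates: a package published five days before the install attempt.
published = datetime(2022, 12, 25, tzinfo=timezone.utc)
now = datetime(2022, 12, 30, tzinfo=timezone.utc)

print(gate_decision(published, 0, 0, now=now))  # hold
```

A "hold" result does not block the package forever. It routes the install to the human review described above, where the registry mismatch can be spotted.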

Takeaway

Dependency confusion will keep working as long as resolvers treat multiple registries as one namespace. Internal package owners should claim their names on public registries and scope dependencies explicitly. Consumers benefit from a hold-on-first-install gate that turns a silent install into a review, particularly for packages your organization has never seen before.