Locket: Robust Feature-Locking Technique for Language Models
Abstract
Chatbot providers (e.g., OpenAI) rely on tiered subscription schemes to generate revenue, offering basic models to free users and advanced models to paying subscribers. However, a finer-grained pay-to-unlock scheme for premium features (e.g., math, coding) is thought to be more economically viable for providers. Such a scheme requires a feature-locking technique (FLoTE) which is (i) effective in refusing locked features, (ii) utility-preserving for unlocked features, (iii) robust against evasion or unauthorized credential sharing, and (iv) scalable to multiple features and users. However, existing FLoTEs (e.g., password-locked models) are not robust or scalable. We present Locket, the first robust and scalable FLoTE to enable pay-to-unlock schemes. Locket uses a novel merging approach to attach adapters to an LLM for refusing unauthorized features. Our comprehensive evaluation shows that Locket is effective (100% refusal on locked features), utility-preserving (≤7% utility degradation on unlocked features), robust (≤5% attack success rate), and scales to multiple features and clients.
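The abstract only names the adapter-merging mechanism, so the following is a minimal sketch of how per-feature refusal adapters could be attached and merged, assuming standard LoRA tooling (Hugging Face transformers + peft). The base model, the adapter repositories (provider/locket-refuse-*), and the sequential merging loop are illustrative assumptions, not the paper's actual recipe.

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

BASE_MODEL = "meta-llama/Llama-2-7b-hf"         # placeholder base model
REFUSAL_ADAPTERS = {                             # hypothetical adapter checkpoints
    "math":   "provider/locket-refuse-math",
    "coding": "provider/locket-refuse-coding",
}

def build_client_model(unlocked: set[str]):
    """Return a copy of the base LLM in which every feature the client has
    NOT unlocked carries a merged refusal adapter; unlocked features are
    left untouched to preserve utility."""
    model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)
    for feature, adapter_repo in REFUSAL_ADAPTERS.items():
        if feature in unlocked:
            continue                             # paid feature: no refusal adapter
        # Attach the refusal adapter for this locked feature and fold its
        # weights into the base model so it cannot simply be detached.
        model = PeftModel.from_pretrained(model, adapter_repo).merge_and_unload()
    return model

# Example: a client who paid for coding but not math gets a model that
# refuses math queries while coding ability is preserved.
coding_only_model = build_client_model(unlocked={"coding"})
```

Merging the refusal adapters into the base weights (rather than serving them as detachable adapters) is one plausible reading of the robustness goal: the client receives a single set of weights with no obvious component to strip off.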
Community
The widespread adoption of #LLM chatbot services has created a large and diverse user base, driving up computing and operational costs. Providers rely on tiered subscription plans to generate revenue 💰, offering black-box access to basic models for free users and advanced models to paying subscribers.
However, this all-or-nothing approach is unprofitable for providers and inflexible for users.
A pay-to-unlock scheme for premium features (e.g., math, coding) and specific model capabilities (e.g., medical diagnosis, age-gating) offers a more sustainable alternative (a minimal serving-side sketch follows this post). In this work, we present a feature-locking technique (FLoTE) that is:
- Effective in refusing locked features,
- Utility-preserving for unlocked features,
- Robust against evasion or unauthorized credential sharing, and
- Scalable to multiple features and clients.
This work represents an initial step towards more fine-grained control of generative model behaviour, potentially enabling many future applications.
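To make the pay-to-unlock idea concrete, here is a hypothetical serving-side gating sketch; it is not from the paper. The API keys, the ENTITLEMENTS table, and build_client_model (the adapter-merging sketch shown earlier) are all illustrative assumptions.

```python
from functools import lru_cache
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")  # same placeholder base

ENTITLEMENTS = {                        # illustrative billing records, not a real API
    "key-free-user":   frozenset(),
    "key-coding-plan": frozenset({"coding"}),
    "key-full-plan":   frozenset({"math", "coding"}),
}

@lru_cache(maxsize=8)
def model_for(unlocked: frozenset):
    # build_client_model is the hypothetical adapter-merging sketch shown above
    return build_client_model(set(unlocked))

def handle_request(api_key: str, prompt: str) -> str:
    """Serve a prompt with the model variant matching the caller's paid features."""
    unlocked = ENTITLEMENTS.get(api_key, frozenset())
    model = model_for(unlocked)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=256)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

In this sketch the provider caches one merged model per entitlement set, so adding a client only adds a lookup, while adding a feature only adds one refusal adapter.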
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Fine-Tuning Jailbreaks under Highly Constrained Black-Box Settings: A Three-Pronged Approach (2025)
- Unlocking the Effectiveness of LoRA-FP for Seamless Transfer Implantation of Fingerprints in Downstream Models (2025)
- Model Unmerging: Making Your Models Unmergeable for Secure Model Sharing (2025)
- CTCC: A Robust and Stealthy Fingerprinting Framework for Large Language Models via Cross-Turn Contextual Correlation Backdoor (2025)
- CBP-Tuning: Efficient Local Customization for Black-box Large Language Models (2025)
- EverTracer: Hunting Stolen Large Language Models via Stealthy and Robust Probabilistic Fingerprint (2025)
- ASGuard: Activation-Scaling Guard to Mitigate Targeted Jailbreaking Attack (2025)
Please give a thumbs up to this comment if you found it helpful!
If you want recommendations for any paper on Hugging Face, check out this Space.
You can directly ask Librarian Bot for paper recommendations by tagging it in a comment:
@librarian-bot recommend