MIA methods × model type × data domain (LLM / MLLM)

Method category / subclass | Core signal / feature | Access / threat model | Model / modality | Data domain | Public evidence / notes ("--" repeats the value from the category's first row)
(1a) Logit / loss-based | Basic shadow / reference loss or log-prob; target-vs-reference gap | Needs logits / likelihood (gray/white-box, or limited black-box plus a reference model) | LLM (text-only) | General data | Classic score-based / reference-based MIA; a large-scale evaluation on pretrained LLMs shows many settings are near-random, especially the "big data, few epochs" regime (Do Membership Inference Attacks Work on Large Language Models?).
(1a) | -- | -- | LLM (text-only) | Clinical data | Strong evidence: a masked LM on medical notes reaches AUC ≈ 0.90 under a likelihood-ratio attack (Quantifying Privacy Risks of Masked Language Models Using Membership Inference Attacks); clinical LMs (ClinicalBERT, GPT-2, etc.) leak ~7% under white/black-box MIA (Membership Inference Attack Susceptibility of Clinical Language Models); work in progress on an EHR QA LLM with canonical loss plus paraphrasing MIA (Exploring Membership Inference Vulnerabilities in Clinical Large Language Models).
(1a) | -- | -- | MLLM (multimodal) | General data | Multimodal score-based MIA uses text-image similarity / confidence: a cosine-similarity threshold on CLIP (Practical Membership Inference Attacks Against Large-Scale Multi-Modal Models: A Pilot Study); a NeurIPS 2024 benchmark for VLLMs with the MaxRényi-K% confidence metric (Membership Inference Attacks against Large Vision-Language Models).
(1a) | -- | -- | MLLM (multimodal) | Clinical data | No published systematic loss / log-prob MIA on "medical image + clinical text" multimodal models; a clear gap.
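
To ground (1a): a minimal sketch of loss / log-prob scoring, assuming gray-box access to token-level likelihoods via Hugging Face transformers. The GPT-2 checkpoint and the optional reference-model calibration are illustrative stand-ins, not the cited papers' exact setups.

```python
# Hedged sketch of (1a): per-example NLL under the target model, optionally
# calibrated by a reference model (likelihood-ratio style). GPT-2 is a
# stand-in target; the decision threshold must be tuned on known non-members.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
target = AutoModelForCausalLM.from_pretrained("gpt2").eval()

@torch.no_grad()
def nll(text: str, model) -> float:
    """Mean per-token negative log-likelihood of `text` under `model`."""
    ids = tok(text, return_tensors="pt").input_ids
    return model(ids, labels=ids).loss.item()  # HF shifts labels internally

def mia_score(text: str, reference=None) -> float:
    """Higher score -> more likely a training member."""
    score = -nll(text, target)
    if reference is not None:                  # target-vs-reference gap
        score += nll(text, reference)
    return score
```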
(1b) Quantile / score-distribution regression | Model the non-member score distribution via quantile regression | Needs a confidence score (softmax probability, logit, etc.) | LLM | General data | NeurIPS 2023 work uses quantile regression to approximate non-member scores, avoiding multiple shadow models; it matches classic shadow attacks with far less compute (Scalable Membership Inference Attacks via Quantile Regression).
(1b) | -- | -- | LLM | Clinical data | No specific reports on clinical LLMs / EHR; clinical work mostly uses simple threshold or likelihood-ratio attacks.
(1b) | -- | -- | MLLM | General data | In principle extendable to multimodal confidence / similarity scores, but published experiments focus on single-modal classification/regression; no explicit MLLM results yet.
(1b) | -- | -- | MLLM | Clinical data | No public evidence.
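
A hedged sketch of the quantile-regression idea in (1b): fit a regressor that predicts a high quantile of the non-member score distribution conditioned on per-example features, then flag examples whose observed target-model score exceeds that per-example threshold. The feature extraction and toy data below are placeholders, not the NeurIPS 2023 pipeline.

```python
# Hedged sketch of (1b): learn a per-example threshold as the conditional
# alpha-quantile of *non-member* scores, replacing shadow-model ensembles.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def fit_quantile_threshold(features_nm, scores_nm, alpha=0.95):
    """Fit the conditional alpha-quantile of non-member scores."""
    qr = GradientBoostingRegressor(loss="quantile", alpha=alpha)
    qr.fit(features_nm, scores_nm)
    return qr

def predict_membership(qr, features, scores):
    """Member iff the observed score exceeds the predicted quantile."""
    return scores > qr.predict(features)

# Toy usage with random placeholder features and scores.
rng = np.random.default_rng(0)
X_nm, s_nm = rng.normal(size=(500, 8)), rng.normal(size=500)
qr = fit_quantile_threshold(X_nm, s_nm)
print(predict_membership(qr, X_nm[:5], s_nm[:5]))
```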
(1c) Self-calibrated / reference-free (SPV-MIA) | Model self-generates a reference via self-prompting; compare loss / probabilistic variation | Needs token-level probabilities / NLL (gray-box, or a black-box API that returns per-token scores) | LLM | General data | Self-Prompt Calibration (SPV-MIA, NeurIPS 2024) reaches AUC ≈ 0.9 on fine-tuned LLMs without real shadow data (Practical Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration).
(1c) | -- | -- | LLM | Clinical data | No open replication of SPV-MIA on clinical LLMs / EHR; clinical setups still use canonical loss-based attacks.
(1c) | -- | -- | MLLM | General data | Could be extended if the API exposes token-level probabilities, but VLLM work mostly uses confidence / similarity; no direct experiments reported.
(1c) | -- | -- | MLLM | Clinical data | No public evidence.
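
For (1c), a partial sketch of the self-prompt idea, assuming an open-weights target: sample a synthetic corpus from the target itself, fine-tune a copy on it to obtain a reference model (fine-tuning elided here), and score by the calibrated loss gap. The GPT-2 stand-in and sampling parameters are illustrative, not SPV-MIA's recipe, which additionally uses a probabilistic-variation score.

```python
# Hedged, partial sketch of (1c): self-prompt a reference corpus from the
# target, then use a calibrated target-vs-reference loss gap as the score.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
target = AutoModelForCausalLM.from_pretrained("gpt2").eval()

@torch.no_grad()
def self_prompt_corpus(n: int = 100, max_new_tokens: int = 128) -> list[str]:
    """Sample reference-training text from the target model itself."""
    ids = tok(tok.bos_token, return_tensors="pt").input_ids
    texts = []
    for _ in range(n):
        out = target.generate(ids, do_sample=True, top_p=0.95,
                              max_new_tokens=max_new_tokens,
                              pad_token_id=tok.eos_token_id)
        texts.append(tok.decode(out[0], skip_special_tokens=True))
    return texts

def calibrated_score(nll_target, nll_reference, text: str) -> float:
    """`nll_reference` comes from a copy fine-tuned on self_prompt_corpus()
    (fine-tuning elided); members show a larger likelihood gap."""
    return nll_reference(text) - nll_target(text)
```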
(1d) Neighbourhood / neighbour-text | Generate paraphrased / locally perturbed neighbours and compare score gaps | Needs logits / probabilities, or at least a loss / score (black-box plus scores) | LLM | General data | Neighbourhood comparison (ACL 2023) replaces shadow data with synthetic neighbours; it is reference-free and matches or exceeds reference-based MIA on benchmarks (Membership Inference Attacks against Language Models via Neighbourhood Comparison).
(1d) | -- | -- | LLM | Clinical data | Clinical LLMs have not used neighbourhood MIA systematically; Nemecek et al. tried paraphrasing-based perturbation on a clinical QA LLM (still canonical loss, not a full neighbourhood attack) (Exploring Membership Inference Vulnerabilities in Clinical Large Language Models).
(1d) | -- | -- | MLLM | General data | One could form multiple neighbours via augmentations/prompts of the same image/text and combine them with cross-modal consistency (see (3) and VL-MIA); most multimodal work still relies on cosine similarity / MaxRényi-K%, not an explicit neighbourhood framework.
(1d) | -- | -- | MLLM | Clinical data | No public evidence.
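
A minimal sketch of (1d), assuming score access: the membership signal is the gap between the mean loss of perturbed neighbours and the loss of the original text. The crude word-drop neighbour generator below is a placeholder; the ACL 2023 attack generates semantically close neighbours with a masked-LM substitution model.

```python
# Hedged sketch of (1d): members fit the model unusually well relative to
# their own perturbed neighbours, so score = mean(neighbour NLL) - NLL.
import statistics
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

@torch.no_grad()
def nll(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    return model(ids, labels=ids).loss.item()

def make_neighbours(text: str, k: int = 10) -> list[str]:
    """Placeholder neighbour generator (crude word-drop); swap in a
    masked-LM or paraphrase model for realistic neighbours."""
    words = text.split()
    return [" ".join(words[:i] + words[i + 1:])
            for i in range(min(k, len(words)))]

def neighbourhood_score(text: str) -> float:
    """Large positive gap -> likely member."""
    return statistics.mean(nll(n) for n in make_neighbours(text)) - nll(text)
```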
(2) Representation-based (embedding / activation) | Geometry of embeddings/activations (distance, density, clusters, layer patterns) | Needs internal features (gray/white-box) | LLM | General data | A standard class in MIA surveys, also covered in recent large-model surveys (Membership Inference Attacks on Large-Scale Models: A Survey); useful when logit access is defended.
(2) | -- | -- | LLM | Clinical data | A ClinicalBERT privacy evaluation shows state-of-the-art reference-based MIA had limited ability to distinguish pseudonymized from non-pseudonymized text, implying embedding/reference MIA may understate clinical PII leakage (Using Membership Inference Attacks to Evaluate Privacy-Preserving Language Modeling Fails for Pseudonymizing Data; see also End-to-end pseudonymization of fine-tuned clinical BERT models).
(2) | -- | -- | MLLM | General data | LUMIA uses layer-wise linear probes on internal states across unimodal and multimodal tasks: average single-modal AUC gain of +15.7%, and AUC > 60% in 85.9% of multimodal settings (LUMIA: Linear Probing for Unimodal and MultiModal Membership Inference Attacks leveraging internal LLM states).
(2) | -- | -- | MLLM | Clinical data | No LUMIA-like probing published for medical image+text models; if internal layers are accessible in medical VLMs, this is a natural attack path.
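
A minimal sketch of a (2)-style attack, assuming white-box feature access: mean-pool one hidden layer per example and train a binary member/non-member classifier on examples with known labels. The layer choice, pooling, and logistic probe are illustrative.

```python
# Hedged sketch of (2): a classifier over hidden-state geometry instead of
# logits, useful when score outputs are defended but features are exposed.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tok = AutoTokenizer.from_pretrained("gpt2")
enc = AutoModel.from_pretrained("gpt2", output_hidden_states=True).eval()

@torch.no_grad()
def embed(text: str, layer: int = -1):
    """Mean-pooled hidden state of one layer as the example's feature."""
    ids = tok(text, return_tensors="pt").input_ids
    h = enc(ids).hidden_states[layer]        # (1, seq_len, dim)
    return h.mean(dim=1).squeeze(0).numpy()

def train_probe(member_texts, nonmember_texts):
    """Binary probe over embeddings of examples with known membership."""
    X = [embed(t) for t in member_texts + nonmember_texts]
    y = [1] * len(member_texts) + [0] * len(nonmember_texts)
    return LogisticRegression(max_iter=1000).fit(X, y)
```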
(3) Cross-modal / cross-query consistency | Multiple queries / modalities / augmentations of the same content; check output consistency, similarity, stability | Black-box; can rely on generated output alone, no logits required | LLM | General data | For APIs that return only text, attackers can use multi-paraphrase / prompt-robustness gaps to infer membership; statistical analyses warn that outcomes are highly sensitive to thresholds and query design (Do Membership Inference Attacks Work on Large Language Models?).
(3) | -- | -- | LLM | Clinical data | Nemecek et al. added a paraphrasing-based MIA for a clinical QA LLM, observing limited but measurable leakage, an early exploration of multi-query consistency in clinical settings (Exploring Membership Inference Vulnerabilities in Clinical Large Language Models).
(3) | -- | -- | MLLM | General data | VLLM benchmarks emphasize text/image consistency and confidence (the VL-MIA pipeline plus the MaxRényi-K% metric) (Membership Inference Attacks against Large Vision-Language Models); some papers note that text log-probs alone may fail in multimodal settings because of modality interaction (e.g., LLaVA analyses).
(3) | -- | -- | MLLM | Clinical data | No system-level studies; medical multimodal data (imaging report plus the image itself) would be well suited to cross-modal consistency MIA, but this remains open.
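
For (3) in the text-only API case, a hedged sketch of a multi-query consistency signal: ask the same question several ways and measure answer stability. `ask_model` and the paraphrase list are hypothetical stand-ins for the deployed API and a paraphrase source, and, per the caveats above, the decision threshold is the fragile part.

```python
# Hedged sketch of (3): answer stability across paraphrases as a weak,
# purely black-box membership signal (stdlib only).
from difflib import SequenceMatcher
from itertools import combinations
from statistics import mean

def consistency_score(ask_model, paraphrases: list[str]) -> float:
    """Mean pairwise similarity of answers; higher stability is the
    (threshold-sensitive) membership signal."""
    answers = [ask_model(p) for p in paraphrases]
    return mean(SequenceMatcher(None, a, b).ratio()
                for a, b in combinations(answers, 2))
```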
(4) Decoding / perplexity / token dynamics | Token-level loss / perplexity / decoding trajectory (top-k flips, entropy, stepwise change) | Needs token-level output (may not need full logits) | LLM | General data | If token NLL / perplexity is exposed, many classic MIAs can be approximated without full logit access; recent statistics show AUC often barely above random in realistic LLM settings and highly threshold-sensitive (Do Membership Inference Attacks Work on Large Language Models? and the related SoK).
(4) | -- | -- | LLM | Clinical data | Clinical LLMs: a token-NLL / perplexity black-box MIA on an EHR QA model (Llemr) with canonical loss plus paraphrasing shows limited but detectable leakage (Exploring Membership Inference Vulnerabilities in Clinical Large Language Models); the masked-LM setting also aligns with token-level likelihood statistics (Quantifying Privacy Risks of Masked Language Models Using Membership Inference Attacks).
(4) | -- | -- | MLLM | General data | If text-side token probabilities are exposed (captioning/QA), LLM MIAs apply; VL-MIA includes token-level MIA on LLaVA pretraining data (Membership Inference Attacks against Large Vision-Language Models). Few works study "multimodal token dynamics" explicitly.
(4) | -- | -- | MLLM | Clinical data | No public evidence.
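
A hedged sketch for (4), assuming token-level log-probs are exposed: average the lowest-K% token log-probs of a text, a Min-K%-style statistic; VL-MIA's MaxRényi-K% is a related token-level confidence metric. The model and K are illustrative.

```python
# Hedged sketch of (4): a Min-K%-style token statistic. Members tend to
# have fewer extremely surprising tokens, lifting this score.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

@torch.no_grad()
def min_k_score(text: str, k: float = 0.2) -> float:
    """Mean of the lowest k-fraction of token log-probs (multi-token text)."""
    ids = tok(text, return_tensors="pt").input_ids
    logits = model(ids).logits[0, :-1]               # next-token predictions
    logp = F.log_softmax(logits, dim=-1)
    token_lp = logp.gather(1, ids[0, 1:, None]).squeeze(1)
    n = max(1, int(k * token_lp.numel()))
    return token_lp.topk(n, largest=False).values.mean().item()
```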
(5) Label-only MIA | Only generated tokens (semantic similarity, rewritten-perplexity proxy, output stability) | Pure black-box (text output only, no logits) | LLM | General data | PETAL (USENIX Security 2025) uses per-token semantic similarity to approximate perplexity; label-only MIA can match some logit-based attacks on pretrained LLMs (Towards Label-Only Membership Inference Attack against Pre-trained Large Language Models); the SoK notes gains are limited in many realistic settings.
(5) | -- | -- | LLM | Clinical data | Clinical LLM MIA mostly relies on token NLL (which needs probabilities); truly text-only, probability-free label-only MIA is nearly absent from the clinical literature.
(5) | -- | -- | MLLM | General data | In VLLMs one can query multiple prompts/views and use answer consistency or semantic distance; published work usually still needs some confidence/similarity signal (cosine or MaxRényi-K%), so strong label-only evidence is scarce.
(5) | -- | -- | MLLM | Clinical data | No public evidence.
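
For (5), a hedged sketch in the spirit of PETAL's proxy idea: with no probabilities available, prompt the black-box model with a prefix and use the semantic similarity between the generated and true continuations as a perplexity proxy. `generate_continuation` is a hypothetical API wrapper and the sentence-transformers checkpoint is illustrative; PETAL's actual per-token construction differs.

```python
# Hedged sketch of (5): semantic similarity of generated vs. true
# continuation as a label-only stand-in for likelihood.
from sentence_transformers import SentenceTransformer, util

emb = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative checkpoint

def label_only_score(generate_continuation, prefix: str,
                     true_suffix: str) -> float:
    """Higher similarity between what the model continues with and the
    true continuation suggests the text was seen in training."""
    gen = generate_continuation(prefix)        # hypothetical black-box call
    a, b = emb.encode([gen, true_suffix], convert_to_tensor=True)
    return util.cos_sim(a, b).item()
```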
(6) Multimodal / vision-language MIA | Image+text similarity, multimodal logits, token-level image detection, internal-feature combinations | Needs access to multimodal encoder/decoder logits / features, or to generated output (black/gray-box) | LLM | General data | Not applicable (requires multimodality).
(6) | -- | -- | LLM | Clinical data | Not applicable.
(6) | -- | -- | MLLM | General data | The main line of multimodal MIA: ICCV 2023 CLIP cosine similarity with augmentation aggregation (Practical Membership Inference Attacks Against Large-Scale Multi-Modal Models: A Pilot Study); the NeurIPS 2024 VLLM benchmark with a token-level image-detection pipeline and the unified MaxRényi-K% metric (Membership Inference Attacks against Large Vision-Language Models).
(6) | -- | -- | MLLM | Clinical data | No published system-level MIA on medical image+text multimodal models; a major gap for healthcare multimodal privacy.
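
A minimal sketch of the (6) CLIP cosine signal, following the ICCV 2023 pilot study's core idea: embedding similarity of the image-text pair, averaged over augmented views, with members tending to score higher. The checkpoint, augmentations, and view count are illustrative.

```python
# Hedged sketch of (6): mean image-text cosine similarity over augmented
# views of the image; threshold tuned on known non-member pairs.
import torch
from PIL import Image
from torchvision import transforms
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
augment = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),
    transforms.RandomHorizontalFlip(),
])

@torch.no_grad()
def clip_mia_score(image: Image.Image, caption: str, n_aug: int = 8) -> float:
    views = [image] + [augment(image) for _ in range(n_aug)]
    inputs = proc(text=[caption], images=views,
                  return_tensors="pt", padding=True)
    out = model(**inputs)
    img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
    txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
    return (img @ txt.T).mean().item()   # mean cosine across views
```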
(7) Distillation-based MIA | Use knowledge distillation to build a reference model plus attack features (loss / entropy / embedding, etc.) | Needs a trainable reference model (attacker has compute and approximate data) | LLM | General data | Recent work brings distillation to MIA on LLM-based recommenders, fusing multi-source signals to boost the attack, showing that "distill + MIA" is viable for complex systems (Distillation-based Membership Inference Attacks for LLM-based Recommender Systems).
(7) | -- | -- | LLM | Clinical data | No public reports in clinical LLM settings.
(7) | -- | -- | MLLM | General data | One could distill a multimodal encoder/decoder into a small model for probing, but current multimodal privacy work targets the large model directly; no explicit "distillation-based MLLM MIA" papers yet.
(7) | -- | -- | MLLM | Clinical data | No public evidence.
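
For (7), a hedged scaffold of the generic distill-then-probe step, not the recommender paper's pipeline: train a small student to match the target's token distributions on attacker-available text, after which student-vs-target loss and entropy gaps can serve as attack features. Models, temperature, and data are placeholders.

```python
# Hedged sketch of (7): one distillation step; after training, the student
# acts as the attacker's reference model for gap-based features.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
teacher = AutoModelForCausalLM.from_pretrained("gpt2-medium").eval()  # target
student = AutoModelForCausalLM.from_pretrained("gpt2")                # probe
opt = torch.optim.AdamW(student.parameters(), lr=1e-5)

def distill_step(text: str, T: float = 2.0) -> float:
    """Match student to teacher token distributions (KL at temperature T)."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        t_logits = teacher(ids).logits
    s_logits = student(ids).logits
    loss = F.kl_div(F.log_softmax(s_logits / T, dim=-1),
                    F.softmax(t_logits / T, dim=-1),
                    reduction="batchmean") * T * T
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```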
(8) Internal-states / layer-wise probing | Train a linear probe/classifier on intermediate activations to separate members from non-members | Needs internal states (gray/white-box; feasible for open LLMs/MLLMs) | LLM | General data | LUMIA trains layer-wise probes and gains notable AUC (+15.7% on average) across many LLMs, analyzing which layers leak most (LUMIA: Linear Probing for Unimodal and MultiModal Membership Inference Attacks leveraging internal LLM states).
(8) | -- | -- | LLM | Clinical data | No public LUMIA-style probing of clinically tuned LLMs; technically feasible for open clinical-adapted LLMs (ClinicalBERT, ClinicalGPT, etc.).
(8) | -- | -- | MLLM | General data | LUMIA shows vision inputs can amplify leakage: ~85.9% of vision-related settings reach AUC > 60%, indicating the vision channel can act as a leakage amplifier (LUMIA: Linear Probing for Unimodal and MultiModal Membership Inference Attacks leveraging internal LLM states).
(8) | -- | -- | MLLM | Clinical data | No published LUMIA-style probing of medical vision+text models; if hospital-internal diagnostic VLMs expose intermediate layers this attack would be highly relevant, but it remains theoretically feasible only.
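
Finally, a hedged sketch of an (8)/LUMIA-style layer sweep, assuming open weights: fit one linear probe per hidden layer on known member/non-member examples and compare per-layer AUC to find where leakage concentrates. The pooling, probe, and in-sample evaluation below are simplifications of LUMIA's setup.

```python
# Hedged sketch of (8): per-layer linear probes over pooled activations,
# scored by AUC to locate the leakiest layers.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

tok = AutoTokenizer.from_pretrained("gpt2")
enc = AutoModel.from_pretrained("gpt2", output_hidden_states=True).eval()

@torch.no_grad()
def layer_features(text: str) -> list[np.ndarray]:
    """Mean-pooled activation vector from every layer (incl. embeddings)."""
    ids = tok(text, return_tensors="pt").input_ids
    hs = enc(ids).hidden_states                   # tuple of (1, seq, dim)
    return [h.mean(dim=1).squeeze(0).numpy() for h in hs]

def probe_per_layer(member_texts, nonmember_texts) -> list[float]:
    feats = [layer_features(t) for t in member_texts + nonmember_texts]
    y = np.array([1] * len(member_texts) + [0] * len(nonmember_texts))
    aucs = []
    for layer in range(len(feats[0])):
        X = np.stack([f[layer] for f in feats])
        p = LogisticRegression(max_iter=1000).fit(X, y).predict_proba(X)[:, 1]
        aucs.append(roc_auc_score(y, p))          # in-sample; use a held-out
    return aucs                                   # split in a real attack
```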

Ranking (clinical focus)

Rank | Most effective attack type (clinical) | Rationale
S1 | Loss / NLL-based MIA | Medical text is structured and datasets are small, so overfitting is detectable
S2 | Token-level perplexity / decoding MIA | Clinical models often expose token-level scores
A | Representation-based | Has evidence, but weaker separability
B | Consistency-based / label-only | The clinical text space is narrow, so the signal is weaker
C | Quantile / SPV / neighbourhood / distillation | No clinical evidence
D | Multimodal MIA (medical) | A major gap; unstudied