IEEE Transactions on Information Forensics and Security, April 2026
Hanrui Wang, Shuo Wang, Chun-Shien Lu, and Isao Echizen
Face recognition poses serious privacy risks due to its reliance on sensitive and immutable biometric data. While modern systems mitigate privacy risks by mapping facial images to embeddings (commonly regarded as privacy-preserving), model inversion attacks reveal that identity information can still be recovered, exposing critical vulnerabilities. However, existing attacks are often computationally expensive and lack generalization, especially those requiring target-specific training. Even training-free approaches suffer from limited identity controllability, hindering faithful reconstruction of nuanced or unseen identities. In this work, we propose DiffMI, the first diffusion-driven, training-free model inversion attack. DiffMI introduces a novel pipeline combining robust latent code initialization, a ranked adversarial refinement strategy, and a statistically grounded, confidence-aware optimization objective. DiffMI applies directly to unseen target identities and face recognition models, offering greater adaptability than training-dependent approaches while significantly reducing computational overhead. Our method achieves 84.42%–92.87% attack success rates against inversion-resilient systems and outperforms the best prior training-free GAN-based approach by 4.01%–9.82%. The implementation is available at https://github.com/azrealwang/DiffMI.

ACS OMEGA, April 2026
Chung-Yen Lin, Wen-Chih Cheng, U-Lin Chen, Tzu-Tang Lin, Li-Hang Hsu, Yang-Hsin Shih, I-Hsuan Lu, Ying-Lien Chen, Shu-Hwa Chen
The increasing prevalence of fungal infections represents a growing threat to human health, driven in part by the misuse of antibiotics and the rising incidence of resistance to conventional antifungal agents. Antifungal peptides (AFPs) have emerged as promising alternatives due to their diverse mechanisms of action and their relatively low propensity to develop resistance. To facilitate the systematic discovery of AFPs, we developed AI4AFP. This computational framework integrates curated antifungal peptide resources with advanced machine learning approaches to predict antifungal potential directly from peptide sequences. Using a comprehensive dataset, we constructed a seven-model ensemble that combines multiple sequence encoding strategies, including ProtBERT-BFD, PC6, and Doc2Vec, with diverse learning algorithms, including random forests, support vector machines, convolutional neural networks, and fine-tuned BERT models. This ensemble demonstrated robust performance on an independent test set, achieving 0.94 in accuracy and 0.89 in Matthews correlation coefficient, outperforming existing AFP prediction methods. Importantly, the predicted AFP score is intended to reflect the general antifungal potential rather than species-specific potency. Experimental validation against representative fungal pathogens, including Candida albicans, Candida glabrata, and Cryptococcus neoformans, revealed that peptides with high predicted AFP scores exhibited context-dependent antifungal activity. Several candidates displayed pronounced inhibitory effects against specific species, despite limited activity against others, highlighting the inherent species-dependence of antifungal efficacy and supporting the role of AI4AFP as a prioritization tool rather than a species-specific predictor.
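For readers unfamiliar with the Matthews correlation coefficient (MCC) reported alongside accuracy above, the following is a minimal sketch of how both metrics are computed from a binary confusion matrix. The function names and the example counts are hypothetical illustrations and are not taken from the AI4AFP evaluation.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC for a binary classifier; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, chosen only to illustrate the calculation.
example = dict(tp=90, tn=95, fp=5, fn=10)
print(f"accuracy = {accuracy(**example):.3f}")
print(f"MCC      = {matthews_corrcoef(**example):.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is often reported next to accuracy for classifiers trained on imbalanced positive and negative sets.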
To complement antifungal prediction, we further developed a hemolysis classifier that incorporates both peptide sequence and applied concentration as continuous inputs, enabling explicit modeling of the dose-dependent nature of hemolytic toxicity. Experimental determination of the minimum concentration inducing 10% hemolysis (MHC₁₀) provided an empirical safety reference, allowing antifungal activity to be interpreted alongside concentration-dependent toxicity. All models and validation results are implemented on a user-friendly web server, AI4AFP (https://axp.iis.sinica.edu.tw/AI4AFP), providing an accessible platform for the discovery and prioritization of antifungal peptides, with consideration of both efficacy and safety.

The 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), Main Conference, July 2026
Arthur Amalvy, Vincent Labatut, Xavier Bost, and Hen-Hsen Huang
While annotated corpora are crucial in the field of natural language processing (NLP), those containing copyrighted material are difficult to exchange among researchers. Yet, such corpora are necessary to fully represent the diversity of data found in the wild in the context of NLP tasks. We tackle this issue by proposing a method to lawfully and publicly share the annotations of copyrighted literary texts. The corpus creator shares the annotations in the clear, along with a non-reversible hashed version of the source material. The corpus user must own the source material and apply the same hash function to their own tokens in order to match them to the shared annotations. Crucially, our method is robust to reasonable divergences in the version of the copyrighted data owned by the user. As an illustration, we present alignment experiments on different editions of novels. Our results show that our method is able to correctly align 98.7% to 99.79% of tokens depending on the novel, provided the user version is sufficiently close to the corpus creator's version.
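The sharing scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the novelshare implementation: the choice of SHA-256, the per-token hashing, the use of `difflib.SequenceMatcher` for alignment, and the names `hash_token`, `share`, and `align` are all assumptions made for this example.

```python
import hashlib
from difflib import SequenceMatcher
from typing import List, Optional, Tuple


def hash_token(token: str) -> str:
    """Hash one token (assumption: SHA-256 over its UTF-8 bytes)."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def share(tokens: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Creator side: publish (token hash, annotation) pairs, never the text."""
    return [(hash_token(t), lab) for t, lab in zip(tokens, labels)]


def align(shared: List[Tuple[str, str]],
          user_tokens: List[str]) -> List[Tuple[str, Optional[str]]]:
    """User side: hash own tokens and align them to the shared hashes.

    Tokens that fall outside every matching block keep the label None,
    which tolerates small differences between editions.
    """
    shared_hashes = [h for h, _ in shared]
    user_hashes = [hash_token(t) for t in user_tokens]
    matcher = SequenceMatcher(a=shared_hashes, b=user_hashes, autojunk=False)
    result: List[Tuple[str, Optional[str]]] = [(t, None) for t in user_tokens]
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            result[block.b + k] = (user_tokens[block.b + k],
                                   shared[block.a + k][1])
    return result


# Hypothetical example: the user's edition contains one extra token.
shared = share(["Call", "me", "Ishmael", "."], ["O", "O", "B-PER", "O"])
print(align(shared, ["Call", "me", "now", "Ishmael", "."]))
```

In this sketch the extra token in the user's edition stays unlabeled while the surrounding tokens recover their annotations, which mirrors the robustness to edition differences that the abstract reports.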
We publicly release novelshare, a Python implementation of our method.

Forty-third International Conference on Machine Learning (ICML), July 2026
Cheng-Yi Lee, Yichi Zhang, Yuchen Yang, Chun-Shien Lu, and Jun-Cheng Chen
Recent studies have shown that semantic watermarks, which embed information into the initial noise of latent diffusion models (LDMs), are vulnerable to black-box forgery attacks. However, existing methods primarily rely on empirical evidence and lack a rigorous theoretical understanding of the conditions under which such attacks succeed or fail. To bridge this gap, we rethink the nature of such attacks through the lens of rate-distortion in the latent space. Our analysis identifies an irreducible distortion floor due to structural mismatches between proxy and target models, which fundamentally limits the fidelity of forged watermarks. We further characterize this distortion as structured geometric deviations on the latent manifold, in the form of global drift and local deformation rather than stochastic noise. Leveraging these insights, we propose a scheme-agnostic detection method that distinguishes forged samples before watermark verification. Extensive experiments demonstrate the effectiveness of our method across diverse black-box scenarios, while preserving robustness to common distortions.

Forty-third International Conference on Machine Learning (ICML), July 2026
Ching-Chia Kao, Chia-Mu Yu, Chun-Shien Lu, and Chu-Song Chen
Safety alignment of large language models is fragile: even small fine-tuning perturbations elastically revert behaviors toward those of the pretraining, with degradation inversely proportional to the size of the alignment set. We ask how to achieve safety alignment with minimal augmentation. To this end, we model augmentation as a set of group actions on sequences and formalize robustness gains as a normalized, monotone submodular function over transformations. We then leverage submodular optimization to select minimal augmentations that provably improve robustness.
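As a rough illustration of selecting a small set of transformations under a monotone submodular objective, the sketch below runs the standard greedy algorithm on a coverage function. It is not the paper's procedure: the coverage objective, the transformation names, and the "failure mode" indices are hypothetical stand-ins chosen for this example. For a normalized, monotone submodular function under a cardinality budget, greedy selection is known to reach at least a (1 - 1/e) fraction of the optimal value.

```python
from typing import Dict, List, Set


def greedy_select(covers: Dict[str, Set[int]], budget: int) -> List[str]:
    """Greedily pick up to `budget` transformations by marginal coverage gain.

    `covers` maps each candidate transformation to the set of (hypothetical)
    failure modes it addresses. Coverage is monotone and submodular, so each
    step simply takes the candidate that adds the most uncovered items.
    """
    covered: Set[int] = set()
    selected: List[str] = []
    while len(selected) < budget:
        remaining = [name for name in sorted(covers) if name not in selected]
        if not remaining:
            break
        # max() keeps the first maximal element, so ties break alphabetically.
        best = max(remaining, key=lambda name: len(covers[name] - covered))
        if not covers[best] - covered:
            break  # no candidate adds anything: stop early with a minimal set
        selected.append(best)
        covered |= covers[best]
    return selected


# Hypothetical transformations and the failure modes each one covers.
candidates = {
    "paraphrase": {0, 1, 2},
    "roleplay": {4},
    "translate": {2, 3},
    "typo": {1, 2},
}
print(greedy_select(candidates, budget=2))
print(greedy_select(candidates, budget=4))
```

The early stop matters for minimality: once every failure mode is covered, adding further transformations would only increase augmentation overhead without improving the objective.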
Experiments confirm that our approach efficiently restores safety alignment while minimizing the overhead of augmentation.

IEEE Computational Intelligence Magazine, May 2026
Ammarah Hashmi, Sahibzada Adil Shahzad, Chia-Wen Lin, Yu Tsao, and Hsin-Min Wang
Deep learning has been successfully applied in various fields, and its impact on deepfake detection is no exception. Deepfakes are fake yet realistic synthetic content that can be used deceitfully for political impersonation, phishing, slander, or the spread of misinformation. Despite extensive research on unimodal deepfake detection, the identification of complex deepfakes through joint analysis of audio and visual streams remains relatively unexplored. To fill this gap, this survey first provides an overview of audiovisual deepfake generation techniques, applications, and their consequences, and then provides a comprehensive review of state-of-the-art methods that combine audio and visual modalities to increase detection accuracy, summarizing and critically analyzing their strengths and limitations. Furthermore, we discuss existing open-source datasets for a deeper understanding, which can contribute to the research community and provide necessary information for beginners who want to analyze deep learning-based audiovisual methods for video forensics. By bridging the gap between unimodal and multimodal approaches, this paper aims to improve the effectiveness of deepfake detection strategies and guide future research on cybersecurity and media integrity.

Scientific Data, May 2026
Ching-Huei Huang, Po-Cheng Hsu, San-Tzu Hsieh, Fu-Shen Tseng, Chung-Yen Lin
Taiwan Hard Clam (Meretrix taiwanica) is an economically important aquaculture species in Taiwan, yet genomic resources for this species have remained fragmented. We present a telomere-to-telomere (T2T), haplotype-resolved, chromosome-level genome assembly for M. taiwanica, generated using PacBio HiFi long reads and Hi-C sequencing.
The two haploid assemblies (hap1 and hap2) span 1,006.48 Mb and 1,007.28 Mb, comprising 126 and 66 sequences, respectively, and each containing 19 chromosomes. Hap1 and hap2 exhibit sequence N50 values of 53.87 Mb and 51.57 Mb, with average scaffold lengths of 7.99 Mb and 15.26 Mb, and contain 0.0176% and 0.1313% ambiguous bases. Comparative analyses revealed 81.59% and 83.78% syntenic regions between haplotypes and identified 10,175 structural variations. Repetitive elements constitute 47.06% and 47.02% of the hap1 and hap2 genomes. We annotated 23,320 and 23,598 protein-coding gene models, with median gene lengths of 7,721 bp and 7,657.5 bp, respectively. The mitochondrial genome was assembled at 21,164 bp and encodes 13 protein-coding genes, 22 tRNAs, and 2 rRNAs. Functional annotation covered 16.23% and 16.33% of the nuclear and mitochondrial gene sets. BUSCO analysis indicated genome completeness of 92.4% and 92.5%, and proteome completeness of 95.4% and 94.5% for hap1 and hap2. By providing the first T2T-level reference, this dataset enables precise identification of trait-associated markers for marker-assisted selection (MAS), thereby facilitating genetic improvement of growth and stress-resistance traits. Furthermore, it serves as a robust genomic framework for conservation genomics to assess the genetic diversity of both wild and hatchery populations of this economically vital species.

The Fourteenth International Conference on Learning Representations (ICLR), April 2026
Yun-Jui Tsai, Wei-Yu Chen, Yan-Ru Ju, Yu-Hung Chang, Ti-Rong Wu
Reinforcement learning (RL) agents achieve remarkable performance but remain far less learning-efficient than humans. While RL agents require extensive self-play games to extract useful signals, humans often need only a few games, improving rapidly by repeatedly revisiting states where mistakes occurred. This idea, known as search control, aims to restart from valuable states rather than always from the initial state.
In AlphaZero, the prior work Go-Exploit applies this idea by sampling past states from self-play or search trees, but it treats all states equally, regardless of their learning potential. We propose Regret-Guided Search Control (RGSC), which extends AlphaZero with a regret network that learns to identify high-regret states, where the agent's evaluation diverges most from the actual outcome. These states are collected from both self-play trajectories and MCTS nodes, stored in a prioritized regret buffer, and reused as new starting positions. Across 9x9 Go, 10x10 Othello, and 11x11 Hex, RGSC outperforms AlphaZero and Go-Exploit by an average of 77 and 89 Elo, respectively. When training on a well-trained 9x9 Go model, RGSC further improves the win rate against KataGo from 69.3% to 78.2%, while both baselines show no improvement. These results demonstrate that RGSC provides an effective mechanism for search control, improving both efficiency and robustness of AlphaZero training. Our code is available at https://rlg.iis.sinica.edu.tw/papers/rgsc.

Proceedings of the 15th ACM SIGPLAN International Conference on Certified Programs and Proofs (CPP '26), January 2026
Liang-Ting Chen, Fredrik Nordvall Forsberg, Tzu-Chun Tsai
We present an intrinsic representation of type theory in the proof assistant Cubical Agda, inspired by Awodey’s natural models of type theory. The initial natural model is defined as quotient inductive-inductive-recursive types, leading us to a syntax accepted by Cubical Agda without using any transports, postulates, or custom rewrite rules. We formalise some meta-properties such as the standard model, normalisation by evaluation for typed terms, and strictification constructions. Since our formalisation is carried out using Cubical Agda's native support for quotient inductive types, all our constructions compute at a reasonable speed. When we try to develop more sophisticated metatheory, however, the 'transport hell' problem reappears.
Ultimately, it remains a considerable struggle to develop the metatheory of type theory using an intrinsic representation that lacks strict equations. The effort required is about the same whether or not the notion of natural model is used.

Journal of Systems Architecture (JSA), March 2026
Chi-Wei Chu, Ding-Yong Hong, Jan-Jan Wu
In deep learning frameworks, weight pruning is a widely used technique for improving computational efficiency by reducing the size of large models. This is especially critical for convolutional operators, which often act as performance bottlenecks in convolutional neural networks (CNNs). However, the effectiveness of pruning heavily depends on how it is implemented, as different methods can significantly impact both computational performance and memory footprint. In this work, we propose a column-wise N:M pruning strategy applied at the tile level and modify XNNPACK to enable efficient execution of pruned models on the RISC-V vector architecture. Additionally, we propose fusing the operations of im2col and data packing to minimize redundant memory accesses and memory overhead. To further optimize performance, we incorporate AITemplate’s profiling technique to identify the optimal implementation for each convolutional operator. Our proposed approach increases ResNet inference throughput by as much as 4× and preserves ImageNet top-1 accuracy within 2.1% of the dense baseline.

IEEE Transactions on Cognitive and Developmental Systems, December 2025
Ammarah Hashmi, Sahibzada Adil Shahzad, Chia-Wen Lin, Yu Tsao, and Hsin-Min Wang
The recent proliferation of hyper-realistic deepfake videos has drawn attention to the threat of audio and visual forgeries. Most previous studies on detecting artificial intelligence-generated fake videos utilize only the visual modality or only the audio modality.
While some methods exploit audio and visual modalities to detect forged videos, they have not been comprehensively evaluated on multimodal datasets of deepfake videos involving acoustic and visual manipulations, and are mostly based on convolutional neural networks with low detection accuracy. Considering that human cognition instinctively integrates multisensory information, including audio and visual cues, to perceive and interpret content, and given the success of transformers in various fields, this study introduces the audio-visual transformer-based ensemble network (AVTENet). This innovative framework tackles the complexities of deepfake technology by integrating both acoustic and visual manipulations to enhance the accuracy of video forgery detection. Specifically, the proposed model integrates several purely transformer-based variants that capture video, audio, and audio-visual salient cues to reach a consensus in prediction. For evaluation, we use the recently released benchmark multimodal audio-video FakeAVCeleb dataset. For a detailed analysis, we evaluate AVTENet, its variants, and several existing methods on multiple test sets of the FakeAVCeleb dataset. Experimental results show that the proposed model outperforms all existing methods and achieves state-of-the-art performance on Testset-I and Testset-II of the FakeAVCeleb dataset. We also compare AVTENet against humans in detecting video forgery. The results show that AVTENet significantly outperforms humans.

GigaScience, November 2025
Yu-Hsin Chen, Chien-Fu Liu, Jun-Yi Leu*, and Huai-Kuang Tsai*
Co-fractionation coupled with mass spectrometry (CF-MS) is a powerful strategy for mapping protein-protein interactions (PPIs) under near-physiological conditions. Despite recent progress, existing analysis pipelines remain constrained by reliance on handcrafted features, sensitivity to experimental noise, and an inherent focus on pairwise interactions, which limit their scalability and generalizability.
To address these difficulties, we introduce FREEPII (Feature Representation Enhancement End-to-End Protein Interaction Inference), a unified deep learning framework that integrates CF-MS data with sequence-derived features to learn biologically meaningful protein-level representations for accurate and efficient inference of PPIs and protein complexes. FREEPII employs a convolutional neural network (CNN) architecture to learn protein-level representations directly from raw data, enabling feature sharing across interaction pairs and reducing computational complexity. To enhance robustness against CF-MS noise, protein sequences are introduced as auxiliary input to enrich the feature space with complementary biological cues. The supervised protein embeddings further encode network-level context derived from complex annotations, allowing the model to capture higher-order interactions and enhance the expressive power of protein representations. Extensive benchmarking demonstrates that FREEPII consistently outperforms state-of-the-art CF-MS analysis tools, capturing more biologically coherent and discriminative protein features. Cross-dataset evaluations further reveal that integrating multi-modal data from diverse experimental contexts substantially improves the generalization and sensitivity of data-driven models, offering a scalable, cross-species strategy for reliable protein interaction inference.

IEEE Transactions on Information Forensics and Security, November 2025
Hanrui Wang, Ching-Chun Chang, Chun-Shien Lu, Christopher Leckie, and Isao Echizen
Deep neural networks are highly vulnerable to adversarial examples, which are inputs with small, carefully crafted perturbations that cause misclassification, making adversarial attacks a critical tool for evaluating robustness. Existing black-box methods typically entail a trade-off between precision and flexibility: pixel-sparse attacks (e.g., single- or few-pixel attacks) provide fine-grained control but lack adaptability, whereas patch- or frequency-based attacks improve efficiency or transferability, but at the cost of producing larger and less precise perturbations. We present GreedyPixel, a fine-grained black-box attack method that performs brute-force-style, per-pixel greedy optimization guided by a surrogate-derived priority map and refined by means of query feedback. It evaluates each coordinate directly without any gradient information, guaranteeing monotonic loss reduction and convergence to a coordinate-wise optimum, while also yielding near white-box-level precision and pixel-wise sparsity and perceptual quality. On the CIFAR-10 and ImageNet datasets, spanning convolutional neural networks (CNNs) and Transformer models, GreedyPixel achieved state-of-the-art success rates with visually imperceptible perturbations, effectively bridging the gap between black-box practicality and white-box performance. The implementation is available at https://github.com/azrealwang/greedypixel.

DIffUMI: Training-Free Universal Model Inversion via Unconditional Diffusion for Face Recognition
Harnessing Sequence Embedding and Ensemble Learning to Identify Antifungal Peptides with Low Hemolytic Risk
Overcoming Copyright Barriers in Corpus Distribution Through Non-Reversible Hashing
Rethinking Forgery Attacks on Semantic Watermarks in Black-Box Settings: A Geometric Distortion Perspective
Submodular Optimization for Minimal Augmentation in Robust Language Model Alignment
Understanding Audiovisual Deepfake Detection: Techniques, Challenges, Human Factors, and Perceptual Insights
Telomere-to-Telomere, Haplotype-Resolved Chromosome-Level Genome Assembly and Annotation of Taiwan Hard Clam (Meretrix taiwanica)
Regret-Guided Search Control for Efficient Learning in AlphaZero
Can We Formalise Type Theory Intrinsically without Any Compromise? A Case Study in Cubical Agda
Efficient Column-Wise N:M Pruning on RISC-V CPU
AVTENet: A Human-Cognition-Inspired Audio-Visual Transformer-Based Ensemble Network for Video Deepfake Detection
Complete end-to-end learning from protein feature representation to protein interactome inference
GreedyPixel: Fine-Grained Black-Box Adversarial Attack Via Greedy Algorithm