[2602.23699] HiDrop: Hierarchical Vision Token Reduction in MLLMs via ...
via an inter-layer similarity measure and a differentiable top-k operator. To ensure practical efficiency, HiDrop further incorporates persistent positional encoding, FlashAttention-compatible token sele...
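The two ingredients named in the snippet, an inter-layer similarity measure and a differentiable top-k operator, can be illustrated with a minimal sketch. The snippet does not say how HiDrop defines either one, so everything below is an assumption made for illustration: the similarity is taken to be the mean cosine similarity between each token's hidden state at two layers, and the top-k is replaced by a simple sigmoid relaxation around the k-th score. The function names (`inter_layer_similarity`, `soft_top_k`, `hard_top_k_indices`) and the `temperature` parameter are hypothetical.

```python
import math


def _sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def _cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def inter_layer_similarity(layer_a, layer_b):
    """Mean cosine similarity between the same tokens at two layers.

    Assumed definition, not taken from the source. Each argument is a
    list of per-token hidden-state vectors of equal length.
    """
    sims = [_cosine(u, v) for u, v in zip(layer_a, layer_b)]
    return sum(sims) / len(sims)


def soft_top_k(scores, k, temperature=0.1):
    """Sigmoid relaxation of a top-k mask (an illustrative stand-in).

    The threshold sits midway between the k-th and (k+1)-th largest
    scores, so the k highest-scoring tokens get mask values above 0.5.
    A lower temperature pushes the mask closer to a hard 0/1 selection.
    """
    if not 0 < k < len(scores):
        raise ValueError("k must satisfy 0 < k < len(scores)")
    ranked = sorted(scores, reverse=True)
    threshold = 0.5 * (ranked[k - 1] + ranked[k])
    return [_sigmoid((s - threshold) / temperature) for s in scores]


def hard_top_k_indices(scores, k):
    """Indices of the k largest scores, returned in original token order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:k])


if __name__ == "__main__":
    token_scores = [0.9, 0.1, 0.5, 0.7]
    print(soft_top_k(token_scores, k=2))
    print(hard_top_k_indices(token_scores, k=2))
```

In a sketch like this, the soft mask would be used while training, where gradients need to reach the scores, and the hard index selection at inference. Returning the kept indices in their original order means each surviving token can keep the position it had before reduction, which is one plausible reading of "persistent positional encoding", though the snippet does not confirm that interpretation.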