AI Summary
Conventional vision systems are inherently limited in privacy preservation and energy efficiency because they rely on fixed, regular pixel grids. Method: This paper proposes a minimalist visual representation paradigm that abandons rigid pixel lattices and introduces *learnable freeform pixels*: non-uniform, topology-free, geometrically adaptive primitives that enable joint structural and semantic modeling. We develop an end-to-end trainable framework that integrates differentiable rasterization, implicit shape optimization, a neural radiance field (NeRF)-inspired representation, and discrete topological regularization. Contribution/Results: Evaluated on ImageNet-1K classification and COCO object detection, our model achieves accuracy comparable to ViT while significantly reducing parameter count and cutting GPU memory consumption by 42%. These results indicate that freeform pixel representations are effective for privacy-preserving and low-power vision applications.
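To make the idea of a freeform pixel concrete, the following is a minimal sketch and not the paper's implementation. The summary does not say how a freeform pixel is parameterized, so this sketch assumes each pixel is a soft disk-shaped mask with a sigmoid edge, and that its reading is the mask-weighted average of scene intensity. The names `soft_disk_mask` and `pixel_reading`, and the `sharpness` parameter, are hypothetical.

```python
import math

def soft_disk_mask(height, width, cx, cy, radius, sharpness=4.0):
    """Assumed shape model: a soft disk, close to 1 inside and close to 0 outside.

    The sigmoid edge keeps the mask smooth in (cx, cy, radius), which is what
    would let a gradient-based optimizer adjust the shape.
    """
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            dist = math.hypot(x - cx, y - cy)
            # Sigmoid of the signed distance to the disk boundary.
            row.append(1.0 / (1.0 + math.exp(sharpness * (dist - radius))))
        mask.append(row)
    return mask

def pixel_reading(image, mask):
    """Mask-weighted average intensity: one scalar per freeform pixel."""
    total = sum(m * v for mrow, irow in zip(mask, image) for m, v in zip(mrow, irow))
    weight = sum(m for mrow in mask for m in mrow)
    return total / weight

# Toy 8x8 scene: dark left half, bright right half.
image = [[1.0 if x >= 4 else 0.0 for x in range(8)] for y in range(8)]

# Two hypothetical freeform pixels, one over each half of the scene.
left = soft_disk_mask(8, 8, cx=1.5, cy=3.5, radius=1.5)
right = soft_disk_mask(8, 8, cx=5.5, cy=3.5, radius=1.5)

# The whole scene is summarized by two scalars instead of 64 grid values.
readings = [pixel_reading(image, left), pixel_reading(image, right)]
```

Under these assumptions, the left pixel reads close to 0 and the right pixel close to 1. A downstream model would only ever see these few scalars, which is the intuition behind the privacy and low-power claims: far less of the scene is captured than with a dense grid.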