The Technology Innovation Institute (TII), which is backed by the UAE government, has announced the launch of Falcon 3, a family of open-source small language models (SLMs) designed to run efficiently on lightweight infrastructure with a single GPU. Trained on 14 trillion tokens, the Falcon 3 family employs a decoder-only architecture with grouped query attention, in which groups of query heads share a single set of key and value heads. Because fewer key and value heads have to be stored, the key-value (KV) cache takes up less memory during inference, which makes the models faster and more efficient when handling diverse text-based tasks.
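To give a sense of why sharing key and value heads matters, here is a rough back-of-the-envelope sketch of KV cache size under standard multi-head attention versus grouped query attention. The layer count, head counts, head dimension and sequence length below are hypothetical values chosen for illustration and are not Falcon 3's published configuration; the function name `kv_cache_bytes` is likewise made up for this example.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_value=2):
    """Approximate KV cache size in bytes for a decoder-only transformer.

    bytes_per_value=2 assumes 16-bit storage (e.g. fp16 or bf16).
    """
    # Factor of 2: each layer caches one key tensor and one value tensor.
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_value)


# Hypothetical configuration, NOT Falcon 3's actual settings.
NUM_LAYERS = 32
NUM_QUERY_HEADS = 32   # multi-head attention keeps one KV head per query head
NUM_KV_HEADS = 8       # grouped query attention: 4 query heads share each KV head
HEAD_DIM = 128
SEQ_LEN = 8192

mha_bytes = kv_cache_bytes(NUM_LAYERS, NUM_QUERY_HEADS, HEAD_DIM, SEQ_LEN)
gqa_bytes = kv_cache_bytes(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, SEQ_LEN)

GIB = 1024 ** 3
print(f"Multi-head attention KV cache:    {mha_bytes / GIB:.2f} GiB")
print(f"Grouped query attention KV cache: {gqa_bytes / GIB:.2f} GiB")
print(f"Reduction: {mha_bytes // gqa_bytes}x")
```

Under these assumed numbers the cache shrinks from 4 GiB to 1 GiB for a single 8,192-token sequence, a saving that scales directly with the ratio of query heads to key-value heads.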