While aerial vehicles offer enormous benefits in several application domains, multidrone localization and control in uncertain environments with limited onboard sensing capabilities remain an active research field. A formation control solution that does not rely on external infrastructure such as GPS or motion capture systems must be built on onboard perception feedback. We address the integration of onboard perception and decision layers in a distributed formation control architecture for three-drone systems. The proposed algorithm fuses two sensing modalities, distance and vision, to estimate the relative positions between the drones. In particular, we combine the omnidirectional sensing property of ultrawideband distance sensors with a deep learning-based bearing detection algorithm in a filter. The overall system forms a closed-loop perception-decision framework, whose stability and convergence properties are analyzed by exploiting its modular structure. Notably, the drones do not use a common reference frame. We verified the framework through extensive simulations in a realistic environment. Furthermore, we conducted real-world experiments using two drones and demonstrated the applicability of the proposed framework. We conjecture that our solution will prove useful in the realization of future drone swarms.
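As an illustration of the distance-plus-bearing idea described above, the following minimal sketch shows how a range measurement and a bearing angle could be combined into a relative position expressed in the observing drone's own frame, so that no common reference frame is needed. It is a planar simplification under stated assumptions and is not the filter used in this work: the function names `relative_position` and `blend`, the fixed `gain`, and the first-order smoothing step are all hypothetical choices made only for illustration.

```python
import math


def relative_position(distance_m, bearing_rad):
    """Planar position of a neighbor in the observer's own body frame.

    distance_m: range to the neighbor (e.g. from a UWB sensor), in meters.
    bearing_rad: direction to the neighbor (e.g. from a vision-based
        detector), measured counterclockwise from the observer's x-axis.
    """
    return (distance_m * math.cos(bearing_rad),
            distance_m * math.sin(bearing_rad))


def blend(previous, measurement, gain=0.5):
    """Hypothetical first-order smoothing step standing in for a real filter.

    Moves the previous estimate toward the new measurement by `gain`
    (0 keeps the old estimate, 1 accepts the measurement outright).
    """
    return tuple(p + gain * (m - p) for p, m in zip(previous, measurement))


if __name__ == "__main__":
    estimate = (0.0, 0.0)
    # Illustrative (distance, bearing) pairs; not data from the experiments.
    for distance_m, bearing_rad in [(2.0, 0.0), (2.0, 0.1), (2.1, 0.1)]:
        measurement = relative_position(distance_m, bearing_rad)
        estimate = blend(estimate, measurement)
    print(estimate)
```

Because both inputs are measured onboard and the result is expressed in the observer's body frame, each drone can run such a computation independently. A practical filter would additionally model measurement noise and the relative motion between the drones.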