Local Flexibility Markets (LFMs) are considered a promising framework for resolving voltage and congestion issues in power distribution systems in an economically efficient manner. However, the need for location-specific flexibility services renders LFMs inherently imperfectly competitive, and market efficiency is severely challenged by strategic participants that exploit their locally monopolistic power. Previous works have considered either non-strategic participants, or strategic participants with perfect information (e.g., about the network characteristics) that can readily compute their payoff-maximizing bidding strategy. In this paper, we address the problem of designing an efficient LFM in the more realistic case where market participants do not possess this information and, instead, learn to improve their bidding policies through experience. To that end, we develop a multi-agent reinforcement learning algorithm to model the participants' learning-to-bid process. In this framework, we first consider two popular LFM pricing schemes, namely pay-as-bid and distribution locational marginal pricing (DLMP), and show that learning agents can discover ways to exploit them, resulting in severe dispatch inefficiency. We then present a game-theoretic pricing scheme that theoretically incentivizes truthful bidding, and empirically demonstrate that this property improves the efficiency of the resulting dispatch even in the presence of learning agents. In particular, the proposed scheme outperforms the popular DLMP scheme in terms of efficiency by 15-23%.