Deploying airborne small cells alongside an existing terrestrial network equipped with multiple-input multiple-output (MIMO) technology is a promising way to meet the throughput and coverage demands of future networks. In this study, we examine the potential benefits of deploying a large number of antenna elements in downlink vertical heterogeneous cellular networks (VHCNets). The network model consists of terrestrial base stations (TBSs), each equipped with a massive antenna array, and single-antenna aerial base stations (ABSs), which together provide connectivity to ground user equipment (GUEs). The locations of the TBSs and ABSs are modeled as two-dimensional and three-dimensional Poisson point processes, respectively. The air-to-ground channel model accounts for both line-of-sight (LOS) and non-line-of-sight (NLOS) links between GUEs and ABSs. Using tools from stochastic geometry, we analyze the performance of the VHCNet in terms of coverage probability, area spectral efficiency, and average ergodic rate. The approximate expressions derived for these metrics are validated with Monte Carlo simulations, which closely match the analytical results. The results indicate that using a massive number of antennas at the TBSs in conjunction with ABSs can improve the overall performance of the VHCNet.
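To make the Monte Carlo setup concrete, the following minimal sketch estimates the coverage probability of a typical GUE at the origin. It is an illustration under stated assumptions, not the system model or simulator of this study. The function names, densities, path-loss exponents, and noise level are hypothetical. The LOS probability is assumed to follow a sigmoid in the elevation angle, with illustrative urban-type parameters. The serving TBS is assumed to use maximum-ratio transmission toward a single user under Rayleigh fading, so its channel gain is Gamma-distributed with shape equal to the number of antennas. Association with the strongest mean received power is also an assumption.

```python
import math
import random


def sample_poisson(rng, mean):
    """Draw a Poisson(mean) count by counting unit-rate arrivals in [0, mean]."""
    count = 0
    t = rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def p_los(elevation_deg, a=9.61, b=0.16):
    """Assumed sigmoid LOS probability versus elevation angle (degrees)."""
    return 1.0 / (1.0 + a * math.exp(-b * (elevation_deg - a)))


def sinr_sample(rng, lam_tbs, lam_abs, radius, h_min, h_max, n_ant,
                alpha_t=3.5, alpha_los=2.2, alpha_nlos=3.5, noise=1e-13):
    """One SINR draw for a GUE at the origin (unit transmit powers assumed)."""
    links = []  # entries are (mean received power, is_tbs)
    # TBSs: 2D PPP on a disc; only the distance to the origin matters.
    for _ in range(sample_poisson(rng, lam_tbs * math.pi * radius ** 2)):
        r = radius * math.sqrt(rng.random())
        links.append((max(r, 1.0) ** -alpha_t, True))
    # ABSs: 3D PPP in a cylinder between heights h_min and h_max.
    volume = math.pi * radius ** 2 * (h_max - h_min)
    for _ in range(sample_poisson(rng, lam_abs * volume)):
        r = radius * math.sqrt(rng.random())
        h = rng.uniform(h_min, h_max)
        elevation = math.degrees(math.atan2(h, r))
        alpha = alpha_los if rng.random() < p_los(elevation) else alpha_nlos
        links.append((math.hypot(r, h) ** -alpha, False))
    if not links:
        return 0.0  # no base station in the window: treated as outage
    # Assumed association rule: strongest mean power, with array gain at TBSs.
    best = max(range(len(links)),
               key=lambda i: links[i][0] * (n_ant if links[i][1] else 1))
    signal = interference = 0.0
    for i, (mean_power, is_tbs) in enumerate(links):
        if i == best:
            # Gamma(n_ant, 1) gain for the serving TBS, Exp(1) for an ABS.
            fading = (rng.gammavariate(n_ant, 1.0) if is_tbs
                      else rng.expovariate(1.0))
            signal = mean_power * fading
        else:
            interference += mean_power * rng.expovariate(1.0)
    return signal / (interference + noise)


def coverage_probability(threshold_db, n_ant, trials=2000, seed=1):
    """Fraction of trials whose SINR exceeds the threshold (in dB)."""
    rng = random.Random(seed)
    threshold = 10 ** (threshold_db / 10)
    hits = sum(
        sinr_sample(rng, lam_tbs=1e-6, lam_abs=2e-9, radius=3000.0,
                    h_min=100.0, h_max=300.0, n_ant=n_ant) > threshold
        for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for n_ant in (1, 16, 64):
        print(n_ant, coverage_probability(threshold_db=0.0, n_ant=n_ant))
```

Sweeping `n_ant` and `threshold_db` in such a sketch yields empirical coverage curves of the kind against which analytical expressions are typically checked. The window radius should be large enough that truncating the point processes has a negligible effect on the aggregate interference.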