The most common mass transit modes in metropolitan cities include buses, subways and taxicabs, each of which contributes to a complex, interconnected network that delivers urban dwellers to their destinations. Understanding how the usage of these three transit modes intertwines across places and times allows for better sensing of urban mobility and the built environment. In this article, we leverage a comprehensive collection of bus, metro and taxi ridership data from Shenzhen, China to unveil the spatio-temporal interplay between different mass transit modes. To achieve this goal, we develop a novel spectral clustering framework that imposes spatio-temporal similarities between mass transit mode usage in urban space and differentiates urban spaces associated with distinct ridership patterns of mass transit modes. The five resulting categories of urban spaces are identified and interpreted with auxiliary knowledge of the city’s metro network and land-use functionality. In general, different mass transit modes cooperate or compete depending on the demographic and socioeconomic attributes of the underlying urban environments. Our proposed analytical framework provides a novel and effective way of exploring the mass transit system and the functional heterogeneity in cities. It demonstrates great potential for assisting policymakers and municipal managers in optimizing the allocation of public transportation facilities and the city-wide distribution of daily commuting.