Finite element methods (FEM) are increasingly used to simulate the acoustics of the vocal tract. For vowel production, the irreducible wave equation for the acoustic pressure is typically solved. Diphthongs, however, require moving vocal tract geometries, so the wave equation has to be expressed in an Arbitrary Lagrangian-Eulerian (ALE) framework. It then becomes more convenient to work directly with the wave equation in its mixed form, which involves not only the acoustic pressure but also the acoustic velocity. The mixed form in turn introduces numerical difficulties that call for stabilized FEM approaches. In this work, FEM simulations of the wave equation in mixed form are carried out to produce several diphthongs. The simulations use tuned two-dimensional vocal tracts that mimic the behavior of three-dimensional vocal tracts with circular cross-sections.
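
For orientation, the irreducible and mixed forms mentioned above can be written as follows. This is a minimal sketch in standard notation, not necessarily the exact formulation used in this work: the symbols $p$ (acoustic pressure), $\boldsymbol{u}$ (acoustic velocity), $\rho_0$ (air density), $c_0$ (speed of sound) and $\boldsymbol{u}_{\mathrm{dom}}$ (velocity of the moving mesh) are assumed here, and source terms and boundary conditions are omitted.

```latex
% Irreducible wave equation for the acoustic pressure (fixed geometry)
\partial_{tt} p - c_0^2 \nabla^2 p = 0

% Mixed form: first-order system for pressure and velocity
\frac{1}{\rho_0 c_0^2}\,\partial_t p + \nabla \cdot \boldsymbol{u} = 0,
\qquad
\rho_0\,\partial_t \boldsymbol{u} + \nabla p = \boldsymbol{0}

% Mixed form in an ALE frame: time derivatives are taken following the mesh,
% which adds a convective term with the (assumed) mesh velocity u_dom
\frac{1}{\rho_0 c_0^2}\left(\partial_t p - \boldsymbol{u}_{\mathrm{dom}} \cdot \nabla p\right)
  + \nabla \cdot \boldsymbol{u} = 0,
\qquad
\rho_0\left(\partial_t \boldsymbol{u} - \boldsymbol{u}_{\mathrm{dom}} \cdot \nabla \boldsymbol{u}\right)
  + \nabla p = \boldsymbol{0}
```

Eliminating $\boldsymbol{u}$ from the fixed-geometry mixed system recovers the irreducible equation. Once the mesh moves, each equation carries only a single first-order convective term in the mixed form, which is one way to see why it is the more convenient starting point.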