Sound event recognition is a crucial aspect of human auditory perception and must therefore be taken into account when studying how humans perceive soundscapes. Both unsupervised and supervised learning techniques can be applied to this task. This paper takes the latter approach to recognize sound events typically encountered in urban environments: sound signals are described by a set of auditory-based features, and recognition is performed with multi-class Support Vector Machines. In addition, a combined approach is introduced in which unsupervised learning (specifically, Self-Organizing Maps) is used to cluster and collect real-world samples and supervised learning is used to label them. Finally, listening tests are carried out to compare the accuracy of the proposed system with that of human listeners.