Interacting with the environment through a brain-computer interface requires mapping a decoded user command to a desired real-world action, which is typically achieved via screen-based user interfaces. This indirection can increase the user's cognitive workload. Recently, we proposed a screen-free interaction approach that uses visual in-the-scene stimuli: sequentially highlighting object surfaces in the user's environment with a laser allows the user, for example, to select objects to be fetched by an assistive robot. In this paper, we investigate the influence of stimulus subclasses (differing surfaces between objects as well as stimulus position within a sequence) on the electrophysiological response and on the decodability of visual event-related responses. We find that the evoked responses differ between subclasses. Additionally, we show that, given ample data, subclass-specific classifiers are a feasible approach to addressing this heterogeneity of responses.
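The subclass-specific classification idea mentioned above can be illustrated with a minimal sketch: train one classifier per stimulus subclass and route each test epoch to the model matching its subclass. The code below is purely illustrative, using synthetic data and shrinkage LDA (a common ERP decoding baseline); it is not the paper's actual pipeline, and all variable names are assumptions.

```python
# Hypothetical sketch: one classifier per stimulus subclass.
# Data are synthetic; this is NOT the pipeline used in the paper.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)

# Synthetic epochs: n_trials x n_features (e.g., flattened ERP amplitudes).
n_trials, n_features = 600, 32
X = rng.normal(size=(n_trials, n_features))
y = rng.integers(0, 2, size=n_trials)          # target vs. non-target label
subclass = rng.integers(0, 3, size=n_trials)   # e.g., surface type or sequence position

# Shift target responses differently per subclass to mimic heterogeneous ERPs.
for s in range(3):
    X[(y == 1) & (subclass == s), :8] += 0.5 + 0.3 * s

# Fit one shrinkage-LDA per subclass on that subclass's epochs only.
models = {}
for s in range(3):
    mask = subclass == s
    clf = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")
    clf.fit(X[mask], y[mask])
    models[s] = clf

# At decoding time, each epoch is scored by its subclass-specific model.
accs = {s: models[s].score(X[subclass == s], y[subclass == s]) for s in models}
print(accs)
```

Because each model sees only its own subclass's data, this approach trades sample size for homogeneity, which is why the abstract's caveat about ample data matters: with few epochs per subclass, a pooled classifier may generalize better.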