Description: Web-based multimodal interaction refers to interaction models that use web technologies to combine several modes of communication, such as text, voice, images, and gestures, within a single interface. By drawing on multiple sensory channels, these models let users engage with systems and applications in a more natural and efficient way. Their main characteristics are the ability to recognize and process different types of input; the integration of technologies such as speech recognition, natural language processing, and computer vision; and adaptation to user preferences and context. Multimodal interaction matters because it can improve the accessibility and usability of web applications, allowing more people, including people with disabilities, to use them effectively. It also supports a richer user experience, because users can choose whichever mode is most comfortable or effective in a given situation. As more everyday activity moves online, web-based multimodal interaction offers a way to meet users' varied communication needs.
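
To make the three characteristics above more concrete, the following is a minimal sketch of how a web application might normalize inputs from different modalities into one shared representation and then honor a user's preferred modality. Every name here (ModalInput, Intent, GESTURE_ACTIONS, toIntent, resolve) and the 0.6 confidence threshold are hypothetical and chosen only for illustration; they do not come from any standard or library, and the recognizers that would produce the inputs (speech, gesture, and so on) are assumed to exist elsewhere.

```typescript
// Hypothetical types for illustration; not part of any real API.
type Modality = "text" | "voice" | "gesture";

interface ModalInput {
  modality: Modality;
  payload: string;    // typed text, a speech transcript, or a gesture name
  confidence: number; // recognizer confidence in [0, 1]; 1 for typed text
}

interface Intent {
  action: string;
  source: Modality;
}

// Assumed mapping from gesture names to application actions.
const GESTURE_ACTIONS: Record<string, string> = {
  "swipe-left": "next",
  "swipe-right": "previous",
};

// Convert one input into a modality-independent intent.
// Returns null for low-confidence input or an unknown gesture.
function toIntent(input: ModalInput, minConfidence = 0.6): Intent | null {
  if (input.confidence < minConfidence) return null;
  const action =
    input.modality === "gesture"
      ? GESTURE_ACTIONS[input.payload]
      : input.payload.trim().toLowerCase();
  return action ? { action, source: input.modality } : null;
}

// Pick a single intent from inputs that arrive together: use the
// user's preferred modality when it produced a usable intent,
// otherwise fall back to the most confident usable input.
function resolve(inputs: ModalInput[], preferred?: Modality): Intent | null {
  const intents = inputs
    .slice()
    .sort((a, b) => b.confidence - a.confidence)
    .map((input) => toIntent(input))
    .filter((intent): intent is Intent => intent !== null);
  return intents.find((intent) => intent.source === preferred) ?? intents[0] ?? null;
}
```

In this sketch the rest of the application only ever sees an Intent, so a new modality could be supported by extending toIntent without changing the code that acts on intents. A real system would likely need more than this, for example fusing inputs that complement each other rather than simply choosing one.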