Introduction: [Toy Browser] Introduction
Previous article: [Toy Browser] HTTP Request and Response Parse
**Keywords:** HTML, CSS, DOM Tree, Finite-state machine
In the previous article, the toy browser learned to send a request to the server, receive the response, and parse it. Now that the HTML file has been extracted, the next step is to build a DOM tree by parsing that HTML. It’s the trickiest and most interesting part of the toy browser. The code of the HTML parser will help you understand it. In general, the HTML parser does the following things in order:
As with parsing the response and the chunked response body, our old friend the finite-state machine will help us parse HTML. Unlike those earlier tasks, though, HTML is much more complicated. Fortunately, whatwg.org has defined all the states for us (see the tokenization section of the HTML Standard). Unfortunately, there are 80 states! To keep the toy browser simple, I implemented only some of them.
Finite-state machine for parsing HTML in the toy browser
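
To make the state-machine idea concrete, here is a minimal sketch of a tokenizer in which every state is a function that consumes one character and returns the next state. It is not the toy browser's actual code and covers only a handful of states loosely modeled on the WHATWG names. The names `tokenize`, `Token`, `TagToken`, and `State` are hypothetical, and attributes are skipped rather than parsed.

```typescript
// Hypothetical token shapes for this sketch (not taken from the article's code).
type TagToken = { type: "startTag" | "endTag"; tagName: string };
type Token = { type: "text"; content: string } | TagToken | { type: "EOF" };

// A state consumes one character (null means end of input) and returns the next state.
type State = (c: string | null) => State;

function tokenize(html: string): Token[] {
  const tokens: Token[] = [];
  let current: TagToken = { type: "startTag", tagName: "" };

  function emit(token: Token): void {
    // Merge consecutive characters into a single text token.
    const last = tokens[tokens.length - 1];
    if (token.type === "text" && last !== undefined && last.type === "text") {
      last.content += token.content;
    } else {
      tokens.push(token);
    }
  }

  function data(c: string | null): State {
    if (c === "<") return tagOpen;
    if (c === null) {
      emit({ type: "EOF" });
      return data;
    }
    emit({ type: "text", content: c });
    return data;
  }

  function tagOpen(c: string | null): State {
    if (c === "/") return endTagOpen;
    if (c !== null && /[a-zA-Z]/.test(c)) {
      current = { type: "startTag", tagName: "" };
      return tagName(c); // reconsume the letter in the tagName state
    }
    // Not a tag after all: keep "<" as text and reconsume c as data.
    emit({ type: "text", content: "<" });
    return data(c);
  }

  function endTagOpen(c: string | null): State {
    if (c !== null && /[a-zA-Z]/.test(c)) {
      current = { type: "endTag", tagName: "" };
      return tagName(c);
    }
    // Malformed input such as "</>" is simply dropped in this sketch.
    return c === null ? data(c) : data;
  }

  function tagName(c: string | null): State {
    if (c === null) return data(c); // unterminated tag: drop it, emit EOF
    if (c === ">") {
      emit(current);
      return data;
    }
    if (c === "/" || /\s/.test(c)) return afterTagName;
    current.tagName += c.toLowerCase();
    return tagName;
  }

  // Simplification: everything up to ">" (attributes, the "/" of a
  // self-closing tag) is skipped, so a ">" inside an attribute value
  // would end the tag early.
  function afterTagName(c: string | null): State {
    if (c === null) return data(c);
    if (c === ">") {
      emit(current);
      return data;
    }
    return afterTagName;
  }

  let state: State = data;
  for (let i = 0; i < html.length; i++) {
    state = state(html.charAt(i));
  }
  state(null); // signal end of input
  return tokens;
}
```

With this sketch, `tokenize("<p>Hi</p>")` yields a start tag `p`, a text token `Hi`, an end tag `p`, and an `EOF` token. Because each state is a plain function, adding one of the states left out here (attribute names and values, for example) only means adding another function and returning it from the right place.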