ladybird

mirror of https://github.com/LadybirdBrowser/ladybird.git synced 2026-06-19 08:11:58 +00:00

Author	SHA1	Message	Date
Andreas Kling	ccf5a278ab	LibWeb: Keep deferred document.close cleanup on its parser document.close() can defer script-created parser cleanup while a parser-blocking script is pending. If document.open() installs a new parser before the old parser resumes, the deferred action must clean up the parser that scheduled it instead of the document's current parser. Capture that parser before installing the deferred action. This keeps the parked cleanup from affecting a parser installed by a later document.open() call.	2026-05-17 15:35:56 +02:00
Andreas Kling	29784ea397	LibWeb: Remove the C++ HTML tree builder Delete the old C++ tree-construction implementation and helper classes that became unused once the Rust parser is unconditional. Remove the C++ stack of open elements, active formatting elements, speculative mock element, and tree-builder-only token storage. Keep the C++ parser entry points that still own LibWeb DOM integration, encoding detection, tokenizer bridging, incremental parsing, and the speculative parser support used by resource discovery.	2026-05-17 15:35:56 +02:00
Andreas Kling	a7ece4b062	LibWeb: Make the Rust HTML parser unconditional Remove the runtime selector between the old C++ tree builder and the new Rust implementation. Always construct HTML documents and fragments with the Rust parser now that it matches the existing tests. Simplify dump-html-tree by dropping the backend option that only made sense while both parser implementations were available.	2026-05-17 15:35:56 +02:00
Andreas Kling	54879bc916	LibWeb: Complete Rust HTML tree construction Finish the Rust implementation of the spec tree-construction algorithms needed by the LibWeb test suite. Add the remaining table modes, foster parenting, scope helpers, adoption agency handling, ruby/list/form and select cases, frameset state, foreign-content edge cases, and parser host callbacks. Preserve behavior that depends on the C++ DOM integration, including parser-created custom element reactions, fragment quirks mode, arbitrary fragment namespaces, template fragment mode, fragment form ownership, MathML annotation-xml boundaries, contextual fragment scripts, parser script source positions, document.close() parser state, void-element insertion, and duplicate attribute tracking. Add focused tests for the parser edge cases that are easy to regress at the boundary between the Rust tree builder and the C++ DOM host.	2026-05-17 15:35:56 +02:00
Andreas Kling	de12062515	LibWeb: Wire Rust parser scripts and fragments Preserve Rust parser state across tokenizer runs and stop cleanly when a parser-blocking script has to execute. Thread the pending script back through the existing C++ parser entry point so document.write(), input insertion points, and script bookkeeping continue to use the normal LibWeb machinery. Add the fragment parser setup needed by innerHTML and contextual fragment parsing, including context elements, form ownership, tokenizer state selection, text coalescing, and foreign-content integration.	2026-05-17 15:35:56 +02:00
Andreas Kling	09296315c2	LibWeb: Add Rust HTML parser host plumbing Add the C++ and Rust scaffolding that lets the tree builder live in Rust while the DOM remains owned by LibWeb. Keep the exported surface small: Rust stores parser state, and C++ provides node creation, insertion, script, template, and GC hooks. Route dump-html-tree through the selectable parser backend so the new implementation can be exercised beside the existing parser while it is being brought up.	2026-05-17 15:35:56 +02:00
Sam Atkins	73c4b77f68	LibWeb/HTML: Support `align` attributes on table sections and rows thead, tbody, tfoot, tr, td, and th all have an `align` presentational attribute with identical definitions. We previously only supported it for td and th, and also allowed arbitrary text-align values instead of the 4 dictated by the spec.	2026-04-30 15:20:22 +02:00
Aliaksandr Kalenik	4762c4fa5c	LibWeb: Add incremental HTML parsing Introduce IncrementalDocumentParser, which streams the response body through a TextCodec::StreamingDecoder into the HTMLTokenizer one chunk at a time. The tokenizer pauses when it runs out of input and resumes once the next chunk is appended; when the body closes we close the tokenizer's input stream so it can finish the parse. DocumentLoading routes HTML responses through the new parser instead of buffering the full body before handing it to HTMLParser.	2026-04-29 04:12:44 +02:00
Aliaksandr Kalenik	01fa8a27ac	LibWeb: Extract HTMLParser::run_until_completion() Pull the post-parse-action setup, run loop, and post-parse invocation out of HTMLParser::run(URL, ...) into a new run_until_completion() method. The URL overload still calls it; behavior is unchanged. The incremental parser will use this entry point directly without going through the URL-setting overload.	2026-04-29 04:12:44 +02:00
Aliaksandr Kalenik	f499edefae	LibWeb: Track whether HTMLParser is script-created Add a ScriptCreatedParser flag plumbed through HTMLParser's constructor and create_for_scripting(). Only document.open()'s parser sets it to Yes. Document::close() step 3 now checks is_script_created() so it correctly skips parsers that weren't created via document.open(), matching the spec. Previously the check was just `if (!m_parser)`, which incorrectly let document.close() insert an EOF into a network-driven parser. The bug was mostly latent because the network parser used to finish quickly, but it matters once the network parser stays alive for the duration of a streamed parse.	2026-04-29 04:12:44 +02:00
Aliaksandr Kalenik	70ac025eff	LibWeb: Implement the speculative HTML parser When the HTML parser blocks on a synchronous external script, run a separate tokenizer over the unparsed input and issue speculative fetches for the resources it finds (script src, link rel=stylesheet|preload, img src), with <base href> tracking and template/foreign-content skipping. Also fills in the previously-stubbed "consume a preloaded resource" algorithm and the document's "map of preloaded resources", so that <link rel="preload"> followed by a matching consumer deduplicates to a single fetch.	2026-04-26 18:48:29 +02:00
Aliaksandr Kalenik	b1ccab81ad	LibWeb: Replace spin_until in HTMLParser::handle_text with async resume Spinning a nested event loop to wait for a parser-blocking script blocks the calling thread, can deadlock, and creates reentrancy hazards. Switch to an event-driven pause/resume model, mirroring the prior HTMLParserEndState refactor (`df96b69e7a`). Three WPT document.write tests flip from Fail to Pass and are rebaselined: all write an external script via document.write() followed by inline content. With spin_until, control did not return to the caller of document.write() between writing the script and observing its effects so the test's order assertions saw a different sequence than the spec mandates.	2026-04-26 10:44:45 +02:00
Shannon Booth	8642801889	LibWeb: Set fragment scripting mode from the context document This corresponds with the editorial change to the HTML standard introducing the parsing mode enum of: `01c45cede` And a follow up normative change of: `508706c80` Making fragment parsing derive its scripting mode from the context document.	2026-04-14 23:01:36 +02:00
Aliaksandr Kalenik	df96b69e7a	LibWeb: Replace spin_until in HTMLParser::the_end() with state machine HTMLParser::the_end() had three spin_until calls that blocked the event loop: step 5 (deferred scripts), step 7 (ASAP scripts), and step 8 (load event delay). This replaces them with an HTMLParserEndState state machine that progresses asynchronously via callbacks. The state machine has three phases matching the three spin_until calls: - WaitingForDeferredScripts: loops executing ready deferred scripts - WaitingForASAPScripts: waits for ASAP script lists to empty - WaitingForLoadEventDelay: waits for nothing to delay the load event Notification triggers re-evaluate the state machine when conditions change: HTMLScriptElement::mark_as_ready, stylesheet unblocking in StyleElementBase/HTMLLinkElement, did_stop_being_active_document, and DocumentLoadEventDelayer decrements. NavigableContainer state changes (session history readiness, content navigable cleared, lazy load flag) also trigger re-evaluation of the load event delay check. Key design decisions and why: 1. Microtask checkpoint in schedule_progress_check(): The old spin_until called perform_a_microtask_checkpoint() before checking conditions. This is critical because HTMLImageElement::update_the_image_data step 8 queues a microtask that creates the DocumentLoadEventDelayer. Without the checkpoint, check_progress() would see zero delayers and complete before images start delaying the load event. 2. deferred_invoke in schedule_progress_check(): I tried Core::Timer (0ms), queue_global_task, and synchronous calls. Timers caused non-deterministic ordering with the HTML event loop's task processing timer, leading to image layout tests failing (wrong subtest pass/fail patterns). Synchronous calls fired too early during image load processing before dimensions were set, causing 0-height images in layout tests. queue_global_task had task ordering issues with the session history traversal queue. deferred_invoke runs after the current callback returns but within the same event loop pump, giving the right balance. 3. Navigation load event guard (m_navigation_load_event_guard): During cross-document navigation, finalize_a_cross_document_navigation step 2 calls set_delaying_load_events(false) before the session history traversal activates the new document. This creates a transient state where the parent's load event delay check sees the about:blank (which has ready_for_post_load_tasks=true) as the active document and completes prematurely.	2026-03-28 23:14:55 +01:00
Andreas Kling	e87f889e31	Everywhere: Abandon Swift adoption After making no progress on this for a very long time, let's acknowledge it's not going anywhere and remove it from the codebase.	2026-02-17 10:48:09 -05:00
Aliaksandr Kalenik	30e4779acb	AK+LibWeb: Reduce recompilation impact of DOM/Node.h Remove includes from Node.h that are only needed for forward declarations (AccessibilityTreeNode.h, XMLSerializer.h, JsonObjectSerializer.h). Extract StyleInvalidationReason and FragmentSerializationMode enums into standalone lightweight headers so downstream headers (CSSStyleSheet.h, CSSStyleProperties.h, HTMLParser.h) can include just the enum they need instead of all of Node.h. Replace Node.h with forward declarations in headers that only use Node by pointer/reference. This breaks the circular dependency between Node.h and AccessibilityTreeNode.h, reducing AccessibilityTreeNode.h's recompilation footprint from ~1399 to ~25 files.	2026-02-11 20:02:28 +01:00
Feng Yu	b58fcaeecf	LibWeb: Add HTMLSelectedContentElement for customizable select Introduce the HTMLSelectedContentElement and integrate it into <select>, <option> and HTMLParser. See whatwg/html#10548. Two bugs in the WPT tests cause the third subtest in selectedcontent.html and selectedcontent-mutations.html to fail. See whatwg/html#11882, web-platform-tests/wpt#55849.	2025-12-12 12:06:24 +00:00
Feng Yu	d2029b1814	LibWeb: Relax HTML parser to allow more tags inside <select> This implements the parsing part of the customizable <select> spec update. See whatwg/html PR #10548. Two failing subtests in `html5lib_innerHTML_tests_innerHTML_1.html` and `customizable-select/select-parsing.html` are due to the spec still disallowing `<input>` inside `<select>`, even though Chrome has already implemented this behavior (see whatwg/html#11288).	2025-12-04 17:17:01 +00:00
mikiubo	0b715b20a2	LibWeb: Make HTML fragment parsing return ExceptionOr Update Element::parse_fragment and Node::unsafely_set_html to propagate exceptions. This refactor is needed as a prerequisite for implementing the XML fragment parser, which requires consistent error handling in fragment parsing.	2025-10-23 11:06:39 +01:00
Luke Wilde	b17783bb10	Everywhere: Change west consts caught by clang-format-21 to east consts	2025-08-29 18:18:55 +01:00
ayeteadoe	3df8e00d91	LibWeb: Enable EXPLICIT_SYMBOL_EXPORT	2025-08-23 16:04:36 -06:00
Sam Atkins	c57975c9fd	LibWeb: Move and rename CSSStyleValue to StyleValues/StyleValue.{h,cpp} This reverts `0e3487b9ab`. Back when I made that change, I thought we could make our StyleValue classes match the typed-om definitions directly. However, they have different requirements. Typed-om types need to be mutable and GCed, whereas StyleValues are immutable and ideally wouldn't require a JS VM. While I was already making such a cataclysmic change, I've moved it into the StyleValues directory, because it not being there has bothered me for a long time. 😅	2025-08-08 15:19:03 +01:00
Timothy Flynn	8b6e3cb735	LibWeb+LibUnicode+WebContent: Port DOM::CharacterData to UTF-16 This replaces the underlying storage of CharacterData with Utf16String and deals with the fallout.	2025-07-24 19:00:20 +02:00
Timothy Flynn	7280ed6312	Meta: Enforce newlines around namespaces This has come up several times during code review, so let's just enforce it using a new clang-format 20 option.	2025-05-14 02:01:59 -06:00
Andrew Kaster	6d11414957	LibWeb: Make storage of CSS::StyleValues const-correct Now we consistently use `RefPtr<StyleValue const>` for all StyleValues.	2025-04-16 10:41:44 -06:00
Andrew Kaster	8cfac6ed71	LibWeb: Store a SpeculativeHTMLParser on the HTML Parser The parser was previously added, but unused. Actually attaching one to the HTML Parser will let us test the limits of Swift interop.	2025-04-16 09:02:27 -06:00
Andrew Kaster	9ee2473aa4	LibWeb+LibGC: Import GC swift module into LibWeb and an initial user Start work on a speculative HTML Parser in Swift. This component will walk ahead of the normal HTML parser looking for fetch() requests to make while the normal parser is blocked. This work exposed many holes in the Swift C++ interop component, which have been reported upstream.	2025-04-03 16:47:48 -06:00
Andreas Kling	550613e526	LibWeb: Remember when HTML parser should ignore next line feed character There's a quirk in HTML where the parser should ignore any line feed character immediately following a `pre` or `textarea` start tag. This was working fine when we could peek ahead in the input stream and see the next token, but didn't work in character-at-a-time parsing with document.write(). This commit adds the "can ignore next line feed character" as a parser flag that is maintained across invocations, making it work in this parsing mode as well. 20 new passes in WPT/html/syntax/parsing/ :^)	2025-02-20 14:32:13 +01:00
Shannon Booth	f87041bf3a	LibGC+Everywhere: Factor out a LibGC from LibJS Resulting in a massive rename across almost everywhere! Alongside the namespace change, we now have the following names: * JS::NonnullGCPtr -> GC::Ref * JS::GCPtr -> GC::Ptr * JS::HeapFunction -> GC::Function * JS::CellImpl -> GC::Cell * JS::Handle -> GC::Root	2024-11-15 14:49:20 +01:00
Timothy Flynn	93712b24bf	Everywhere: Hoist the Libraries folder to the top-level	2024-11-10 12:50:45 +01:00

30 commits