Commit graph

9 commits

Author SHA1 Message Date
Jelle Raaijmakers
b3c186ce64 LibWeb: Add spec steps to prescan a byte stream algorithm
Fix up the implementation as we go. Also change `ByteBuffer const&` into
`ReadonlyBytes` everywhere, so we don't accidentally copy bytes.
2025-11-21 17:43:08 +01:00
Jelle Raaijmakers
f52632d48a LibWeb: Skip right amount of characters during encoding detection
When detecting an element's opening tag, the spec asks us to skip ahead
to the first whitespace or end chevron character before trying to read
attributes. Instead, we were always skipping 2 positions ahead and then
ignoring all whitespace characters and slashes, which was clearly wrong.

Theoretically this could have caused some weird behaviors if part of the
opening tag matched an expected attribute name, but it's very unlikely
to see that in the wild.
2025-11-21 17:43:08 +01:00
Jelle Raaijmakers
4bcf988e46 LibWeb: Use FlyString attribute name comparisons in encoding detection
Prevent some unnecessary work by performing pointer comparisons.
2025-11-21 17:43:08 +01:00
Jelle Raaijmakers
4ebbbdfef1 LibWeb: Advance position at attribute end quote in encoding detection
This did not cause any immediate issues except generating instances of
`Attr` with useless values which caused some unnecessary work during
encoding detection.
2025-11-21 17:43:08 +01:00
Sam Atkins
423cdd447d LibWeb+LibGfx: Apply editorial punctuation/whitespace/markup fixes
Corresponds to d426109ea1
and fd08f81d06
2025-06-25 03:12:19 +12:00
Sam Atkins
9dbeecb73d LibWeb: Correct some spec typos
Corresponds to 285a58bf30
2025-04-10 04:01:37 +02:00
Jaycadox
f672c57ca7 LibWeb: Check charset UTF-16LE/BE separately for UTF-8 conversion
Previously, the charset of name "UTF-16BE/LE" would be checked against
when following standards to convert the charset to UTF-8, but in
reality, the charsets "UTF-16BE" and "UTF-16LE" should be checked
separately.

Co-authored-by: Jelle Raaijmakers <jelle@ladybird.org>
2025-02-24 14:51:41 +01:00
Shannon Booth
f87041bf3a LibGC+Everywhere: Factor out a LibGC from LibJS
Resulting in a massive rename across almost everywhere! Alongside the
namespace change, we now have the following names:

 * JS::NonnullGCPtr -> GC::Ref
 * JS::GCPtr -> GC::Ptr
 * JS::HeapFunction -> GC::Function
 * JS::CellImpl -> GC::Cell
 * JS::Handle -> GC::Root
2024-11-15 14:49:20 +01:00
Timothy Flynn
93712b24bf Everywhere: Hoist the Libraries folder to the top-level 2024-11-10 12:50:45 +01:00
Renamed from Userland/Libraries/LibWeb/HTML/Parser/HTMLEncodingDetection.cpp (Browse further)