ladybird/Libraries/LibWeb/HTML/Parser/HTMLEncodingDetection.h
Jelle Raaijmakers f52632d48a LibWeb: Skip right amount of characters during encoding detection
When detecting an element's opening tag, the spec asks us to skip ahead
to the first whitespace or end chevron character before trying to read
attributes. Instead, we were always skipping 2 positions ahead and then
ignoring all whitespace characters and slashes, which was clearly wrong.

Theoretically this could have caused some weird behaviors if part of the
opening tag matched an expected attribute name, but it's very unlikely
to see that in the wild.
2025-11-21 17:43:08 +01:00

22 lines
732 B
C++

/*
* Copyright (c) 2021, Max Wipfli <mail@maxwipfli.ch>
*
* SPDX-License-Identifier: BSD-2-Clause
*/
#pragma once
#include <AK/ByteString.h>
#include <AK/Optional.h>
#include <LibGC/Ptr.h>
#include <LibWeb/Forward.h>
namespace Web::HTML {
Optional<StringView> extract_character_encoding_from_meta_element(ByteString const&);
GC::Ptr<DOM::Attr> prescan_get_attribute(DOM::Document&, ByteBuffer const& input, size_t& position);
Optional<ByteString> run_prescan_byte_stream_algorithm(DOM::Document&, ByteBuffer const& input);
Optional<ByteString> run_bom_sniff(ByteBuffer const& input);
ByteString run_encoding_sniffing_algorithm(DOM::Document&, ByteBuffer const& input, Optional<MimeSniff::MimeType> maybe_mime_type = {});
}