Current File : //opt/hc_python/lib/python3.12/site-packages/lxml/html/__pycache__/html5parser.cpython-312.pyc

"""
An interface to html5lib that mimics the lxml.html interface.
"""
import sys
import string

from html5lib import HTMLParser as _HTMLParser
from html5lib.treebuilders.etree_lxml import TreeBuilder
from lxml import etree
from lxml.html import Element, XHTML_NAMESPACE, _contains_block_level_tag

# Python 2/3 compatibility fallbacks.
try:
    _strings = basestring
except NameError:
    _strings = (bytes, str)
try:
    from urllib2 import urlopen
except ImportError:
    from urllib.request import urlopen
try:
    from urlparse import urlparse
except ImportError:
    from urllib.parse import urlparse


class HTMLParser(_HTMLParser):
    """An html5lib HTML parser with lxml as tree."""

    def __init__(self, strict=False, **kwargs):
        _HTMLParser.__init__(self, strict=strict, tree=TreeBuilder, **kwargs)


try:
    from html5lib import XHTMLParser as _XHTMLParser
except ImportError:
    pass
else:
    class XHTMLParser(_XHTMLParser):
        """An html5lib XHTML Parser with lxml as tree."""

        def __init__(self, strict=False, **kwargs):
            _XHTMLParser.__init__(self, strict=strict, tree=TreeBuilder, **kwargs)

    xhtml_parser = XHTMLParser()


def _find_tag(tree, tag):
    elem = tree.find(tag)
    if elem is not None:
        return elem
    # Fall back to the XHTML-namespaced tag name.
    return tree.find(f'{{{XHTML_NAMESPACE}}}{tag}')


def document_fromstring(html, guess_charset=None, parser=None):
    """
    Parse a string into a whole document and return its root element.

    If `guess_charset` is true, or if the input is not Unicode but a
    byte string, the `chardet` library will perform charset guessing
    on the string.
    """
    if not isinstance(html, _strings):
        raise TypeError('string required')

    if parser is None:
        parser = html_parser

    options = {}
    if guess_charset is None and isinstance(html, bytes):
        guess_charset = True
    if guess_charset is not None:
        options['useChardet'] = guess_charset
    return parser.parse(html, **options).getroot()


def fragments_fromstring(html, no_leading_text=False,
                         guess_charset=None, parser=None):
    """Parses several HTML elements, returning a list of elements.

    The first item in the list may be a string.  If no_leading_text is true,
    then it will be an error if there is leading text, and it will always be
    a list of only elements.

    If `guess_charset` is true, the `chardet` library will perform charset
    guessing on the string.
    """
    if not isinstance(html, _strings):
        raise TypeError('string required')

    if parser is None:
        parser = html_parser

    options = {}
    if guess_charset is None and isinstance(html, bytes):
        guess_charset = False
    if guess_charset is not None:
        options['useChardet'] = guess_charset
    children = parser.parseFragment(html, 'div', **options)
    if children and isinstance(children[0], _strings):
        if no_leading_text:
            if children[0].strip():
                raise etree.ParserError('There is leading text: %r' %
                                        children[0])
            del children[0]
    return children


def fragment_fromstring(html, create_parent=False,
                        guess_charset=None, parser=None):
    """Parses a single HTML element; it is an error if there is more than
    one element, or if anything but whitespace precedes or follows the
    element.

    If 'create_parent' is true (or is a tag name) then a parent node
    will be created to encapsulate the HTML in a single element.  In
    this case, leading or trailing text is allowed.

    If `guess_charset` is true, the `chardet` library will perform charset
    guessing on the string.
    """
    if not isinstance(html, _strings):
        raise TypeError('string required')

    accept_leading_text = bool(create_parent)

    elements = fragments_fromstring(
        html, guess_charset=guess_charset, parser=parser,
        no_leading_text=not accept_leading_text)

    if create_parent:
        if not isinstance(create_parent, _strings):
            create_parent = 'div'
        new_root = Element(create_parent)
        if elements:
            if isinstance(elements[0], _strings):
                new_root.text = elements[0]
                del elements[0]
            new_root.extend(elements)
        return new_root

    if not elements:
        raise etree.ParserError('No elements found')
    if len(elements) > 1:
        raise etree.ParserError('Multiple elements found')
    result = elements[0]
    if result.tail and result.tail.strip():
        raise etree.ParserError('Element followed by text: %r' % result.tail)
    result.tail = None
    return result


def fromstring(html, guess_charset=None, parser=None):
    """Parse the html, returning a single element/document.

    This tries to minimally parse the chunk of text, without knowing if it
    is a fragment or a document.

    'base_url' will set the document's base_url attribute (and the tree's
    docinfo.URL)

    If `guess_charset` is true, or if the input is not Unicode but a
    byte string, the `chardet` library will perform charset guessing
    on the string.
    """
    if not isinstance(html, _strings):
        raise TypeError('string required')
    doc = document_fromstring(html, parser=parser,
                              guess_charset=guess_charset)

    # Input that starts with <html> or a doctype is a full document.
    start = html[:50]
    if isinstance(start, bytes):
        start = start.decode('ascii', 'replace')

    start = start.lstrip().lower()
    if start.startswith('<html') or start.startswith('<!doctype'):
        return doc

    head = _find_tag(doc, 'head')

    # A non-empty head also means a full document.
    if len(head):
        return doc

    body = _find_tag(doc, 'body')

    # A body holding exactly one element and no surrounding text was
    # probably a single element to begin with.
    if (len(body) == 1 and (not body.text or not body.text.strip())
            and (not body[-1].tail or not body[-1].tail.strip())):
        return body[0]

    # Otherwise reuse the body as a container under a more neutral tag.
    if _contains_block_level_tag(body):
        body.tag = 'div'
    else:
        body.tag = 'span'
    return body


def parse(filename_url_or_file, guess_charset=None, parser=None):
    """Parse a filename, URL, or file-like object into an HTML document
    tree.  Note: this returns a tree, not an element.  Use
    ``parse(...).getroot()`` to get the document root.

    If ``guess_charset`` is true, the ``useChardet`` option is passed into
    html5lib to enable character detection.  This option is on by default
    when parsing from URLs, off by default when parsing from file(-like)
    objects (which tend to return Unicode more often than not), and on by
    default when parsing from a file path (which is read in binary mode).
    """
    if parser is None:
        parser = html_parser
    if not isinstance(filename_url_or_file, _strings):
        fp = filename_url_or_file
        if guess_charset is None:
            guess_charset = False
    elif _looks_like_url(filename_url_or_file):
        fp = urlopen(filename_url_or_file)
        if guess_charset is None:
            guess_charset = True
    else:
        fp = open(filename_url_or_file, 'rb')
        if guess_charset is None:
            guess_charset = True

    options = {}
    if guess_charset:
        options['useChardet'] = guess_charset
    return parser.parse(fp, **options)


def _looks_like_url(str):
    scheme = urlparse(str)[0]
    if not scheme:
        return False
    elif (sys.platform == 'win32' and
            scheme in string.ascii_letters
            and len(scheme) == 1):
        # A single-letter "scheme" on Windows is a drive letter, not a URL.
        return False
    else:
        return True


html_parser = HTMLParser()
