php-doc-en/reference/url/functions/parse-url.xml
2021-11-11 13:01:03 +01:00

<?xml version="1.0" encoding="utf-8"?>
<!-- $Revision$ -->
<refentry xml:id="function.parse-url" xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink">
<refnamediv>
<refname>parse_url</refname>
<refpurpose>Parse a URL and return its components</refpurpose>
</refnamediv>
<refsect1 role="description">
&reftitle.description;
<methodsynopsis>
<type>mixed</type><methodname>parse_url</methodname>
<methodparam><type>string</type><parameter>url</parameter></methodparam>
<methodparam choice="opt"><type>int</type><parameter>component</parameter><initializer>-1</initializer></methodparam>
</methodsynopsis>
<para>
This function parses a URL and returns an associative array containing any
of the various components of the URL that are present.
The values of the array elements are <emphasis>not</emphasis> URL decoded.
</para>
<para>
This function is <emphasis role="strong">not</emphasis> meant to validate
the given URL; it only breaks it up into the parts listed below. Partial and
invalid URLs are also accepted, and <function>parse_url</function> tries its
best to parse them correctly.
</para>
</refsect1>
<refsect1 role="parameters">
&reftitle.parameters;
<para>
<variablelist>
<varlistentry>
<term><parameter>url</parameter></term>
<listitem>
<para>
The URL to parse.
</para>
</listitem>
</varlistentry>
</variablelist>
<variablelist>
<varlistentry>
<term><parameter>component</parameter></term>
<listitem>
<para>
Specify one of <constant>PHP_URL_SCHEME</constant>,
<constant>PHP_URL_HOST</constant>, <constant>PHP_URL_PORT</constant>,
<constant>PHP_URL_USER</constant>, <constant>PHP_URL_PASS</constant>,
<constant>PHP_URL_PATH</constant>, <constant>PHP_URL_QUERY</constant>
or <constant>PHP_URL_FRAGMENT</constant> to retrieve just a specific
URL component as a <type>string</type> (except when
<constant>PHP_URL_PORT</constant> is given, in which case the return
value will be an <type>int</type>).
</para>
</listitem>
</varlistentry>
</variablelist>
</para>
</refsect1>
<refsect1 role="returnvalues">
&reftitle.returnvalues;
<para>
On seriously malformed URLs, <function>parse_url</function> may return
&false;.
</para>
<para>
If the <parameter>component</parameter> parameter is omitted, an
associative <type>array</type> is returned. At least one element will be
present within the array. Potential keys within this array are:
<itemizedlist>
<listitem>
<simpara>
<varname remap="structfield">scheme</varname> - e.g. http
</simpara>
</listitem>
<listitem>
<simpara>
<varname remap="structfield">host</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname remap="structfield">port</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname remap="structfield">user</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname remap="structfield">pass</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname remap="structfield">path</varname>
</simpara>
</listitem>
<listitem>
<simpara>
<varname remap="structfield">query</varname> - after the question mark <literal>?</literal>
</simpara>
</listitem>
<listitem>
<simpara>
<varname remap="structfield">fragment</varname> - after the hashmark <literal>#</literal>
</simpara>
</listitem>
</itemizedlist>
</para>
<para>
If the <parameter>component</parameter> parameter is specified,
<function>parse_url</function> returns a <type>string</type> (or an
<type>int</type>, in the case of <constant>PHP_URL_PORT</constant>)
instead of an <type>array</type>. If the requested component doesn't exist
within the given URL, &null; will be returned.
As of PHP 8.0.0, <function>parse_url</function> distinguishes absent and empty
queries and fragments:
</para>
<para>
<informalexample>
<screen>
<![CDATA[
http://example.com/foo → query = null, fragment = null
http://example.com/foo? → query = "", fragment = null
http://example.com/foo# → query = null, fragment = ""
http://example.com/foo?# → query = "", fragment = ""
]]>
</screen>
</informalexample>
</para>
<para>
Previously all cases resulted in query and fragment being &null;.
</para>
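<para>
The distinction can be tested with a strict comparison against &null;.
The following is a minimal sketch assuming PHP 8.0.0 or later; the
<literal>describe_query()</literal> helper is hypothetical and exists only
for illustration. On earlier versions the second call would also report
<literal>absent</literal>.
<informalexample>
<programlisting role="php">
<![CDATA[
<?php
// Hypothetical helper: reports whether the query is absent, empty or present.
function describe_query(string $url): string
{
    $query = parse_url($url, PHP_URL_QUERY);
    if ($query === false) {
        return 'invalid'; // seriously malformed URL
    }
    if ($query === null) {
        return 'absent';  // no "?" in the URL
    }
    return $query === '' ? 'empty' : 'present';
}

echo describe_query('http://example.com/foo'), "\n";
echo describe_query('http://example.com/foo?'), "\n";
echo describe_query('http://example.com/foo?a=1'), "\n";
?>
]]>
</programlisting>
&example.outputs;
<screen>
<![CDATA[
absent
empty
present
]]>
</screen>
</informalexample>
</para>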
<para>
Note that control characters (cf. <function>ctype_cntrl</function>) in the
components are replaced with underscores (<literal>_</literal>).
</para>
</refsect1>
<refsect1 role="changelog">
&reftitle.changelog;
<informaltable>
<tgroup cols="2">
<thead>
<row>
<entry>&Version;</entry>
<entry>&Description;</entry>
</row>
</thead>
<tbody>
<row>
<entry>8.0.0</entry>
<entry>
<function>parse_url</function> will now distinguish absent and empty queries
and fragments.
</entry>
</row>
</tbody>
</tgroup>
</informaltable>
</refsect1>
<refsect1 role="examples">
&reftitle.examples;
<para>
<example>
<title>A <function>parse_url</function> example</title>
<programlisting role="php">
<![CDATA[
<?php
$url = 'http://username:password@hostname:9090/path?arg=value#anchor';
var_dump(parse_url($url));
var_dump(parse_url($url, PHP_URL_SCHEME));
var_dump(parse_url($url, PHP_URL_USER));
var_dump(parse_url($url, PHP_URL_PASS));
var_dump(parse_url($url, PHP_URL_HOST));
var_dump(parse_url($url, PHP_URL_PORT));
var_dump(parse_url($url, PHP_URL_PATH));
var_dump(parse_url($url, PHP_URL_QUERY));
var_dump(parse_url($url, PHP_URL_FRAGMENT));
?>
]]>
</programlisting>
&example.outputs;
<screen>
<![CDATA[
array(8) {
["scheme"]=>
string(4) "http"
["host"]=>
string(8) "hostname"
["port"]=>
int(9090)
["user"]=>
string(8) "username"
["pass"]=>
string(8) "password"
["path"]=>
string(5) "/path"
["query"]=>
string(9) "arg=value"
["fragment"]=>
string(6) "anchor"
}
string(4) "http"
string(8) "username"
string(8) "password"
string(8) "hostname"
int(9090)
string(5) "/path"
string(9) "arg=value"
string(6) "anchor"
]]>
</screen>
</example>
</para>
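<para>
<example>
<title>Decoding components returned by <function>parse_url</function></title>
<para>
Because the returned values are not URL decoded, the caller has to decode
them where needed. The sketch below uses <function>rawurldecode</function>
for the path and <function>parse_str</function> for the query; the URL is a
placeholder chosen for illustration.
</para>
<programlisting role="php">
<![CDATA[
<?php
$url = 'http://example.com/my%20docs/file.txt?name=John%20Doe&tag=a%2Bb';

$path  = parse_url($url, PHP_URL_PATH);
$query = parse_url($url, PHP_URL_QUERY);

var_dump($path);               // returned as-is, still percent-encoded
var_dump(rawurldecode($path)); // decoded by the caller

// parse_str() decodes names and values while splitting the query string
parse_str($query, $params);
var_dump($params['name'], $params['tag']);
?>
]]>
</programlisting>
&example.outputs;
<screen>
<![CDATA[
string(19) "/my%20docs/file.txt"
string(17) "/my docs/file.txt"
string(8) "John Doe"
string(3) "a+b"
]]>
</screen>
</example>
</para>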
<para>
<example>
<title>A <function>parse_url</function> example with missing scheme</title>
<programlisting role="php">
<![CDATA[
<?php
$url = '//www.example.com/path?googleguy=googley';
// Prior to PHP 5.4.7 this would show the path as "//www.example.com/path"
var_dump(parse_url($url));
?>
]]>
</programlisting>
&example.outputs;
<screen>
<![CDATA[
array(3) {
["host"]=>
string(15) "www.example.com"
["path"]=>
string(5) "/path"
["query"]=>
string(17) "googleguy=googley"
}
]]>
</screen>
</example>
</para>
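<para>
<example>
<title>Validating an untrusted URL before parsing it</title>
<para>
Since <function>parse_url</function> does not validate its input, untrusted
URLs can be checked with <function>filter_var</function> first. This is a
minimal sketch, not a complete defense: the <literal>extract_host()</literal>
helper and the scheme allow-list are assumptions made for illustration.
</para>
<programlisting role="php">
<![CDATA[
<?php
// Hypothetical helper: returns the host of an untrusted URL, or null if the
// URL fails validation, uses an unexpected scheme, or has no host.
function extract_host(string $url): ?string
{
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return null;
    }

    $parts = parse_url($url);
    if ($parts === false) { // seriously malformed URL
        return null;
    }

    // Assumed allow-list; adjust to the schemes the application expects.
    $scheme = strtolower($parts['scheme'] ?? '');
    if (!in_array($scheme, ['http', 'https'], true)) {
        return null;
    }

    return $parts['host'] ?? null;
}

var_dump(extract_host('https://www.example.com/path?x=1'));
var_dump(extract_host('//www.example.com/path'));
var_dump(extract_host('javascript:alert(1)'));
?>
]]>
</programlisting>
&example.outputs;
<screen>
<![CDATA[
string(15) "www.example.com"
NULL
NULL
]]>
</screen>
</example>
</para>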
</refsect1>
<refsect1 role="notes">
&reftitle.notes;
<caution>
<para>
This function may not give correct results for relative or invalid URLs,
and the results may not even match common behavior of HTTP clients.
If URLs from untrusted input need to be parsed, extra validation is
required, e.g. by using <function>filter_var</function> with the
<constant>FILTER_VALIDATE_URL</constant> filter.
</para>
</caution>
<note>
<para>
This function is intended specifically for parsing URLs, not URIs. However,
to comply with PHP's backwards compatibility requirements, it makes an
exception for the <literal>file://</literal> scheme, where triple slashes
(<literal>file:///...</literal>) are allowed. For any other scheme this is
invalid.
</para>
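<para>
A minimal sketch of this exception follows. The path and host name are
placeholders chosen for illustration.
<informalexample>
<programlisting role="php">
<![CDATA[
<?php
// Triple slashes are accepted for the file scheme: the host is empty,
// so only the scheme and the path are reported.
var_dump(parse_url('file:///etc/hosts', PHP_URL_PATH));

// For other schemes an empty host makes the URL seriously malformed.
var_dump(parse_url('http:///example.com'));
?>
]]>
</programlisting>
&example.outputs;
<screen>
<![CDATA[
string(10) "/etc/hosts"
bool(false)
]]>
</screen>
</informalexample>
</para>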
</note>
</refsect1>
<refsect1 role="seealso">
&reftitle.seealso;
<para>
<simplelist>
<member><function>pathinfo</function></member>
<member><function>parse_str</function></member>
<member><function>http_build_query</function></member>
<member><function>dirname</function></member>
<member><function>basename</function></member>
<member><link xlink:href="&url.rfc;3986">RFC 3986</link></member>
</simplelist>
</para>
</refsect1>
</refentry>
<!-- Keep this comment at the end of the file
Local variables:
mode: sgml
sgml-omittag:t
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
indent-tabs-mode:nil
sgml-parent-document:nil
sgml-default-dtd-file:"~/.phpdoc/manual.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:nil
sgml-local-ecat-files:nil
End:
vim600: syn=xml fen fdm=syntax fdl=2 si
vim: et tw=78 syn=sgml
vi: ts=1 sw=1
-->