{"id":427,"date":"2020-09-16T00:53:00","date_gmt":"2020-09-16T04:53:00","guid":{"rendered":"https:\/\/molecularsciences.org\/content\/?p=427"},"modified":"2020-12-27T01:11:18","modified_gmt":"2020-12-27T06:11:18","slug":"how-to-parse-a-sentence-in-php","status":"publish","type":"post","link":"https:\/\/molecularsciences.org\/content\/how-to-parse-a-sentence-in-php\/","title":{"rendered":"How to parse a sentence in PHP"},"content":{"rendered":"\n<p>A sentence can be parsed in many different ways in PHP. A few methods are presented here along with their analysis. Regardless of the language you use for parsing a sentence, you can match the characters of interest, match the characters you wish to exclude and split on them, or use a ready-made function if one is available.<\/p>\n\n\n\n<p>Consider the following code:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$sentence = \"Hey, Carol's 18th birthday is 15 days from today.\";\n$words = preg_split('\/\\W+\/', $sentence);\nprint_r($words);<\/code><\/pre>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>Array\n(\n &#91;0] =&gt; Hey\n &#91;1] =&gt; Carol\n &#91;2] =&gt; s\n &#91;3] =&gt; 18th\n &#91;4] =&gt; birthday\n &#91;5] =&gt; is\n &#91;6] =&gt; 15\n &#91;7] =&gt; days\n &#91;8] =&gt; from\n &#91;9] =&gt; today\n &#91;10] =&gt;\n)<\/code><\/pre>\n\n\n\n<p>The first example splits on \\W+, which matches runs of non-word characters. Non-word characters include the space, apostrophe ('), comma (,) and period (.), so Carol&#8217;s is split into two elements, Carol and s. Also note the empty element at the end, which appears because the sentence ends with a period. 
Empty elements can be removed with the PREG_SPLIT_NO_EMPTY flag as follows:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$sentence = \"Hey, Carol's 18th birthday is 15 days from today.\";\n$words = preg_split('\/\\W+\/', $sentence, -1, PREG_SPLIT_NO_EMPTY);\nprint_r($words);<\/code><\/pre>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>Array\n(\n&#91;0] =&gt; Hey\n&#91;1] =&gt; Carol\n&#91;2] =&gt; s\n&#91;3] =&gt; 18th\n&#91;4] =&gt; birthday\n&#91;5] =&gt; is\n&#91;6] =&gt; 15\n&#91;7] =&gt; days\n&#91;8] =&gt; from\n&#91;9] =&gt; today\n)<\/code><\/pre>\n\n\n\n<p>The following examples apply the other approaches to the same $sentence:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>print \"\\nExample 2\\n\";\n\/\/ Split on runs of characters that are neither word characters nor apostrophes.\n$words = preg_split(\"\/&#91;^\\w']+\/\", $sentence);\nprint_r($words);\n\nprint \"\\nExample 3\\n\";\n\/\/ Split on runs of whitespace, periods and commas.\n$words = preg_split(\"\/&#91;\\s.,]+\/\", $sentence);\nprint_r($words);\n\nprint \"\\nExample 5\\n\";\n\/\/ Same as Example 2, without empty elements.\n$words = preg_split(\"\/&#91;^\\w']+\/\", $sentence, -1, PREG_SPLIT_NO_EMPTY);\nprint_r($words);\n\nprint \"\\nExample 6\\n\";\n\/\/ Same as Example 3, without empty elements.\n$words = preg_split(\"\/&#91;\\s.,]+\/\", $sentence, -1, PREG_SPLIT_NO_EMPTY);\nprint_r($words);\n\nprint \"\\nExample 7\\n\";\n\/\/ Ready-made function: format 1 returns an array of words; digits are added as word characters.\n$words = str_word_count($sentence, 1, '0123456789');\nprint_r($words);\n\nprint \"\\nExample 8\\n\";\n\/\/ Same character class with # as the delimiter and no + quantifier;\n\/\/ PREG_SPLIT_NO_EMPTY drops the empty strings between adjacent separators.\n$words = preg_split('#&#91;\\\\s.,]#', $sentence, -1, PREG_SPLIT_NO_EMPTY);\nprint_r($words);\n<\/code><\/pre>\n\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>Example 2\nArray\n(\n&#91;0] =&gt; Hey\n&#91;1] =&gt; Carol's\n&#91;2] =&gt; 18th\n&#91;3] =&gt; birthday\n&#91;4] =&gt; is\n&#91;5] =&gt; 15\n&#91;6] =&gt; days\n&#91;7] =&gt; from\n&#91;8] =&gt; today\n&#91;9] =&gt;\n)\n\nExample 3\nArray\n(\n&#91;0] =&gt; Hey\n&#91;1] =&gt; Carol's\n&#91;2] =&gt; 18th\n&#91;3] =&gt; birthday\n&#91;4] =&gt; is\n&#91;5] =&gt; 15\n&#91;6] =&gt; days\n&#91;7] =&gt; from\n&#91;8] =&gt; today\n&#91;9] =&gt;\n)\n\nExample 5\nArray\n(\n&#91;0] =&gt; Hey\n&#91;1] =&gt; Carol's\n&#91;2] =&gt; 
18th\n&#91;3] =&gt; birthday\n&#91;4] =&gt; is\n&#91;5] =&gt; 15\n&#91;6] =&gt; days\n&#91;7] =&gt; from\n&#91;8] =&gt; today\n)\n\nExample 6\nArray \n(\n&#91;0] =&gt; Hey\n&#91;1] =&gt; Carol's\n&#91;2] =&gt; 18th\n&#91;3] =&gt; birthday\n&#91;4] =&gt; is\n&#91;5] =&gt; 15\n&#91;6] =&gt; days\n&#91;7] =&gt; from\n&#91;8] =&gt; today\n)\n\nExample 7\nArray\n(\n&#91;0] =&gt; Hey\n&#91;1] =&gt; Carol's\n&#91;2] =&gt; 18th\n&#91;3] =&gt; birthday\n&#91;4] =&gt; is\n&#91;5] =&gt; 15\n&#91;6] =&gt; days\n&#91;7] =&gt; from\n&#91;8] =&gt; today\n)\n\nExample 8\nArray\n(\n&#91;0] =&gt; Hey\n&#91;1] =&gt; Carol's\n&#91;2] =&gt; 18th\n&#91;3] =&gt; birthday\n&#91;4] =&gt; is\n&#91;5] =&gt; 15\n&#91;6] =&gt; days\n&#91;7] =&gt; from\n&#91;8] =&gt; today\n)<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>A sentence can be parsed in many different ways in PHP. A few methods are presented here along with their analysis. Regardless of the language you use for parsing a sentence, you can either match the characters of interest, or match the characters you wish to exclude and split on them or you can use 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":430,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[23],"tags":[24],"class_list":["post-427","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-php","tag-php"],"_links":{"self":[{"href":"https:\/\/molecularsciences.org\/content\/wp-json\/wp\/v2\/posts\/427","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/molecularsciences.org\/content\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/molecularsciences.org\/content\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/molecularsciences.org\/content\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/molecularsciences.org\/content\/wp-json\/wp\/v2\/comments?post=427"}],"version-history":[{"count":2,"href":"https:\/\/molecularsciences.org\/content\/wp-json\/wp\/v2\/posts\/427\/revisions"}],"predecessor-version":[{"id":431,"href":"https:\/\/molecularsciences.org\/content\/wp-json\/wp\/v2\/posts\/427\/revisions\/431"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/molecularsciences.org\/content\/wp-json\/wp\/v2\/media\/430"}],"wp:attachment":[{"href":"https:\/\/molecularsciences.org\/content\/wp-json\/wp\/v2\/media?parent=427"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/molecularsciences.org\/content\/wp-json\/wp\/v2\/categories?post=427"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/molecularsciences.org\/content\/wp-json\/wp\/v2\/tags?post=427"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}